
Discovery and characterization of stable introns in yeast

by

Jeffrey T. Morgan

B.S., (2011) University of Michigan

SUBMITTED TO THE DEPARTMENT OF BIOLOGY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY

SEPTEMBER 2018

© 2018 Massachusetts Institute of Technology. All rights reserved.

Signature of Author: Jeffrey T. Morgan, Department of Biology, August 2, 2018

Certified by: David P. Bartel, Professor of Biology, Thesis Supervisor

Accepted by: Amy E. Keating, Professor of Biology, Co-Chair, Biology Graduate Committee

Discovery and characterization of stable introns in yeast

by

Jeffrey T. Morgan

Submitted to the Department of Biology on August 2, 2018 In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Abstract

Spliceosomal introns are a defining feature of eukaryotes; they are present in all known eukaryotic genomes, absent from all known non-eukaryotic genomes, and their accurate removal is essential for mRNA maturation. Although smaller ncRNAs can be processed from introns, the introns themselves are considered biologically inert byproducts of splicing; their collective fate post-splicing is to be de-branched and rapidly degraded.

This dissertation details the first described instance of a regulated fate and function for excised and de-branched introns in eukaryotes. We observed a set of introns in the transcriptome of the budding yeast Saccharomyces cerevisiae that, although rapidly degraded during log-phase growth as expected, accumulate as linear RNAs under saturated-growth conditions and during inhibition of TORC1, a key integrator of growth signaling. At least 34 introns (11% of the introns in S. cerevisiae) show this change in stability. We find no evidence that this stability can be attributed to retention in the mature transcript. Instead, introns that become stabilized remain associated with components of the post-splicing spliceosome, likely resulting in their protection from degradation. Compared to other yeast introns, these stable introns have no enriched sequence motifs but do share a short distance between their lariat branch point and the 3' splice site. Indeed, by manipulating this distance, we are able to show a causal relationship between branch-point position and stable-intron formation.

To test for cellular functions of stable introns, we created strains with precise intron deletions. We created 20 strains with combinations of up to five introns deleted, with the quintuple mutant eliminating >60% of the stable-intron molecules in the transcriptome. When these strains are challenged with the TORC1 inhibitor rapamycin, their growth exceeds that of the parental strain, with a striking relationship (R² = 0.9) between the fraction of stable-intron molecules removed from the transcriptome and the rate of growth under TORC1 inhibition. Overexpression of native or engineered stable introns suppresses this aberrant rapamycin response. These results indicate that stable introns function within the TOR-mediated growth-signaling pathway of S. cerevisiae and, more broadly, that excised introns can be stabilized and co-opted to perform biological functions in eukaryotic cells.

Thesis Advisor: David P. Bartel Title: Professor

Acknowledgements

This work was possible because of the mentorship and trust of my advisor Dave Bartel. Dave has an endless ability to think critically and rigorously about the diverse projects in the lab. He taught me how to ask the right questions, how to conclusively answer them, when to focus on one experiment until you crack it, and when to seek outside input on a problem. He also tried to teach me a great deal about very specific aspects of grammar; I have assuredly erred in these aspects in the writing to follow. I have been fortunate to land with great mentors since I started cold e-mailing PIs before my sophomore year of college: Steve Ragsdale and Li Yi at the University of Michigan, and Richard Leapman and Alioscka Sousa at the NIH. I did not fully appreciate them at the time, but I would not have ended up at MIT without their support and example.

I am grateful to many faculty members (and one fellow) of the MIT and Whitehead communities for their insight over the years: Chris Burge, Gerry Fink, Dennis Kim, David Pincus, David Sabatini, and Phil Sharp. Gerry and Phil have been instrumental in the success of this project since its inception and have been constant sources of guidance during thesis committee meetings. I additionally thank Gerry for his advice as I looked for postdocs, convincing me success lies foremost in looking where others aren't already looking. I thank members of the Fink lab for input and discussion on my work during Fink group meetings, and a special additional thanks to David Pincus for a great deal of input and suggestions over the past few years.

I knew effectively nothing about what the Bartel lab studies when I joined. Because of the creative and generous members of the lab, that quickly changed. I have overlapped with many amazing scientists in the lab: Weinberg, Igor, Vincent (during his brief return), Alex, Olivia, DK, Vikram, Sue-Jean, Katrin, Stephen, Junjie, Grace, Asia, Ben, Wenwen, Coffee, Jamie, Xuebing, Namita, Dan, Jarrett, Matt, Sean, Tim, Charlie, Danny, Kathy, Elena, Justin, Glenn, Thy, and Emir. I am additionally grateful to Laura, the lab's administrative manager, for processing my 657 (and counting) orders, and for being an endless source of positivity in the lab. I must point to a few lab members in particular: Stephen for advice, both big- and small-picture; Alex for being an endless fount of timeworn impressions and timeless experimental designs; Olivia for guidance when I was starting out in the lab, and for the brief period when I was super into the Tour de France; Grace for always having the big soccer matches streaming on her computer for quick check-ins; Thy for many unsolicited pictures of her cat; and Sean for bringing his light-hearted presence, thoughtful opinions, and so much of the cafeteria's flatware to our bay.

I still believe a basic tenet of MIT Biology's recruitment pitch: it is special because of the focus on one's cohort. Mine contained many amazing people. I especially want to thank Eric, Erik, Ian, Julie, Kevin, and Sahin for their friendship and support.

Finally, I thank my parents for supporting me and my education, and for pushing my siblings and me to follow our interests, even if those interests led us far from home. Many aren't lucky enough to do something they enjoy for a living, and I don't take that for granted. I thank my partner, Zo , for her support, putting up with my inability to accurately predict the time needed to finish up in lab, her love of Jeopardy! and national parks, making sure I eat decent food, and generally for making all aspects of my life much richer. Last but not least, I would like to thank our cat, Ollie, for being a good boy.

Table of Contents

Abstract

Acknowledgements

Table of Contents

Chapter 1. Introduction
    Part 1. Pre-mRNA processing
        Intron recognition
        Chemistry of splicing
        Figure 1. Two-step mechanism of pre-mRNA splicing
        The spliceosome: a dynamic ribonucleoprotein machine
        Figure 2. Schematic view of the spliceosome cycle in S. cerevisiae
        Spliceosome disassembly and lariat-intron degradation
    Part 2. Introns qua introns
        Evolution of introns
        Loss and gain of introns
        Figure 3. Intron density of eukaryotes
        Function of introns in
        Other functions of introns in modern eukaryotes
        Notable post-splicing fates of intact introns
    Part 3. S. cerevisiae outside of log-phase growth
        Growth phases
        Figure 4. Growth phases of S. cerevisiae
    References

Chapter 2. Excised linear introns regulate growth in yeast

Chapter 3. Future Directions

Curriculum vitae

Chapter 1. Introduction

Jacques Monod's oft-cited axiom¹, "anything found to be true of E. coli must also be true of elephants," (Monod and Jacob, 1961) speaks to the long-discussed concept of biochemical unity of life on Earth: although organismal scale and complexity vary, the same elementary molecules and reactions underlie and unify all living things (Rubner, 1909; Kluyver and Donker, 1926; Friedmann, 2004). As science moved into the age of molecular biology, this unity could be investigated in higher-order reactions. Seminal studies of the order and logic for transfer of genetic information from DNA to RNA to protein (Avery et al., 1944; Boivin and Vendrely, 1947; Watson and Crick, 1953; Mazia, 1956; Crick, 1958) found that they operate, in a general sense, in bacteria as they do in eukaryotes. Importantly, analogous machinery (DNA-dependent RNA polymerase and messenger RNA-dependent ribosome) was identified as performing these operations in both bacteria and animals (Weiss and Gladstone, 1959; Hurwitz et al., 1960; Stevens, 1960; Brenner et al., 1961; Gros et al., 1961; Gierer, 1963; Warner et al., 1963; Wettstein et al., 1963). Although features of the 5' and 3' ends of eukaryotic messenger RNAs (mRNAs) indicated that there might be complexity of and pre-mRNA processing not found in bacteria (Darnell et al., 1973; Brawerman, 1976; Shatkin, 1976), biochemical unity seemed as if it might apply to molecular biology: the details would differ, but the same elemental processes of gene expression would be shared by E. coli and elephants.

Against this backdrop, the discovery of mRNA splicing disrupted the notion of a unified molecular biology more than anything before it. First observed in adenovirus type 2 mRNAs produced late in the virus's infection cycle (Berget et al., 1977; Chow et al., 1977), this discovery revealed that the organization of genes in eukaryotes was profoundly different from that in bacteria. During the transfer of genetic information from DNA to mRNA, only select, physically disconnected pieces ("exons") of a pre-mRNA are spliced together to form the mature mRNA while intervening pieces ("introns") are discarded. What was initially found for these viral mRNAs was shown to be characteristic of a diversity of eukaryotic organisms and mRNAs (Brack and Tonegawa, 1977; Breathnach et al., 1977; Jeffreys and Flavell, 1977; Gilmore-Hebert and Wall, 1978; Tilghman et al., 1978; Tonegawa et al., 1978; Weinstock et al., 1978).

¹ Or Albert Jan Kluyver's less oft-cited version: "From the elephant to butyric acid bacterium-it is all the same!"

Since the discovery of mRNA splicing 41 years ago, it has become clear that spliceosomal introns (and the spliceosomal machinery to remove them) are a defining feature of eukaryotes: they are present in all known eukaryotic genomes and absent from all known non-eukaryotic genomes (Koonin, 2006; Irimia and Roy, 2014). Although a great deal has been learned subsequently about the sequence determinants of splicing, the chemistry of splicing, the spliceosome, alternative splicing, and more, much is left to be discovered about this critical and ubiquitous step in eukaryotic gene expression. This dissertation describes the discovery, characterization, and regulation of stable introns in the budding yeast S. cerevisiae. Although introns are generally viewed as inert and ephemeral by-products of splicing, the findings from this study add an unexpected dimension to the possible fates and functions of spliceosomal introns within eukaryotic biology. Further, this dissertation emphasizes the benefit of considering the myriad environments relevant to the survival and evolution of a given species, as opposed to focusing on only the most experimentally standardized and tractable conditions suitable for a laboratory. Our understanding of fundamental biological processes will remain incomplete if not considered in this light. This chapter introduces knowledge that contextualizes the research advances found in the rest of the dissertation. Therefore, the focus is on eukaryotic biology, with a further emphasis on S. cerevisiae biology where appropriate².

² For instance, there is no discussion of the minor spliceosome, which is not extant in S. cerevisiae (Mewes et al., 1997; Burge et al., 1998).

Part 1. Pre-mRNA processing

In eukaryotes, mRNA maturation requires three major steps. Upon the onset of RNA polymerase II transcription, a 7-methylguanosine cap³ is added to the 5' end, introns (if present) are removed by splicing, and after transcription is complete, a string of adenosines is added to the 3' end⁴ (Proudfoot et al., 2002). Transcription termination and freeing of an mRNA's 3' end for polyadenylation involves sequence elements in the nascent RNA transcript recognized by trans-acting cleavage and polyadenylation factors. The stringency and specificity of this cleavage position varies greatly by organism and can be regulated to produce multiple mRNA isoforms through alternative polyadenylation (Proudfoot, 2011; Tian and Manley, 2013). These 5' and 3' modifications function to imbue stability and translational capacity to an mRNA during its life.

Splicing, instead, is required to create an mRNA whose will generate the intended protein. Fidelity is absolutely vital; a skipped intron or inaccurate splice-site selection will result in aberrant protein products. This precision must be maintained despite the fact that intron lengths can vary by three orders of magnitude in a single organism (Garber et al., 1983; Hawkin, 1988; Michael and Manyuan, 1999). This section discusses in detail the recognition of an intron within pre-mRNA, the chemistry of splicing, the machinery necessary to perform splicing, and the assembly and disassembly of this machinery throughout the splicing cycle.

Intron recognition

Only three sequences of a spliceosomal intron are universally required for its definition and recognition by the spliceosome: the 5' splice site (5'SS), the 3' splice site (3'SS), and the lariat branch point (BP) (Figure 1). Each of these three sequences has its own consensus motif generally shared between all spliceosomal introns, suggesting a single evolutionary origin (discussed below). These sequences vary greatly in the strength of their conservation, being more strongly conserved in yeast and more weakly conserved in mammals⁵. In yeast, the 5'SS hexamer is GUAYGU (Y being either pyrimidine ribonucleotide), the BP heptamer is UACUAAC (the branch point adenosine is the 3'-most A), and the 3'SS trimer is YAG (Langford and Gallwitz, 1983; Pikielny et al., 1983; Langford et al., 1984; Teem et al., 1984; Spingola et al., 1999; Davis et al., 2000). There are some additional features that aid in intron processing in yeast, such as a pyrimidine-rich tract upstream of the 3'SS and an adequate distance between 5'SS and BP (Thompson-Jäger and Domdey, 1987; Patterson and Guthrie, 1991). However, presence of the three motifs above is generally sufficient for splicing to occur.

³ A noted exception: the nematode C. elegans, where most messages receive 2,2,7-trimethylguanosine caps through trans-splicing of short splice-leader sequences (Blumenthal and Steward, 1997). Similar mechanisms are also present in less well-studied organisms (Hastings, 2005).

⁴ A noted exception: replication-dependent histones, which terminate in short stem-loops after endonucleolytic cleavage (Dominski and Marzluff, 2007).
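As a rough illustration of how the three yeast consensus motifs described above delimit an intron, the following Python sketch scans an RNA string for the first 5'SS, the first downstream BP, and the first 3'SS after that BP. It is a minimal sketch under stated assumptions, not a splice-site predictor: the function name, the returned field names, and the example sequence are hypothetical, and real introns deviate from the consensus and depend on context that this naive left-to-right search ignores.

```python
import re

# Yeast consensus motifs quoted in the text (RNA alphabet; Y = C or U).
FIVE_SS = re.compile(r"GUA[CU]GU")
BRANCH_POINT = re.compile(r"UACUAAC")
THREE_SS = re.compile(r"[CU]AG")


def find_candidate_intron(rna):
    """Return coordinates of a candidate intron, or None if a motif is missing.

    Hypothetical helper: takes the first 5'SS, the first BP downstream of it,
    and the first 3'SS downstream of that BP. Coordinates are 0-based.
    """
    five = FIVE_SS.search(rna)
    if five is None:
        return None
    bp = BRANCH_POINT.search(rna, five.end())
    if bp is None:
        return None
    three = THREE_SS.search(rna, bp.end())
    if three is None:
        return None
    branch_a = bp.start() + 5      # the 3'-most A of UACUAAC
    intron_end = three.end()       # exclusive: index just past the 3'SS G
    return {
        "intron_start": five.start(),
        "branch_a": branch_a,
        "intron_end": intron_end,
        # nucleotides after the branch A up to and including the 3'SS G
        "bp_to_3ss": intron_end - 1 - branch_a,
    }


if __name__ == "__main__":
    # Made-up pre-mRNA: 5' exon, consensus intron, 3' exon.
    example = "AUGGCC" + "GUAUGU" + "A" * 10 + "UACUAAC" + "U" * 8 + "CAG" + "GGCUAA"
    print(find_candidate_intron(example))
```

The `bp_to_3ss` value is included because the distance between the branch point and the 3' splice site is the feature that the abstract identifies as shared by stable introns.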

In addition to consensus sequences being much more degenerate in mammals than in yeast, a second fundamental difference in intron recognition between these eukaryotes is that mammalian introns are often much longer than mammalian exons, whereas the inverse is true in yeast. Therefore, mammalian intron definition is in actuality often driven by the definition of flanking exons (Robberson et al., 1990; Nakai and Sakamoto, 1994; Berget, 1995; Sterner et al., 1996). Myriad cis-regulatory elements are present in both exons and introns to promote recruitment of spliceosome components to legitimate splice sites, which would otherwise be difficult to find using an intron-centric mechanism due to the degeneracy of splice-site sequences and widely varied intron lengths in mammals (Schaal and Maniatis, 1999; Fairbrother and Chasin, 2000; Sun and Chasin, 2000; Fairbrother et al., 2002). The basic mechanism by which exon definition occurs is through initial recruitment of spliceosome components (U1 snRNP [small nuclear ribonucleoprotein]) to the 5'SS of one intron, which in turn promotes recognition of the upstream intron's 3'SS by a distinct set of spliceosome components (U2 auxiliary factor [U2AF]) (Hoffman and Grabowski, 1992). In practice, this process is much more complex, being especially sensitive to intron and exon lengths, as well as which cis-regulatory elements are being actively utilized in a given at a given time at a given locus (Sterner et al., 1996; Fox-Walsh et al., 2005). This regulation is central to alternative splicing, the prevalence and importance of which is discussed below.

⁵ Weak conservation could imply coupling between a given intron's motifs, i.e. intron-specific requirements for particular 5'SS-BP-3'SS combinations. However, hybrid introns containing 5' and 3' splice junctions from separate genes can be processed as if endogenous introns (Chu and Sharp, 1981).

Chemistry of splicing

The fundamental sequence determinants of introns were by-and-large elucidated based on early in vivo experimentation. However, development of a splicing-competent in vitro system, as well as a system to produce sufficient quantities of desired, splicing-competent pre-mRNA substrates (Butler and Chamberlin, 1982; Kassavetis et al., 1982; Green et al., 1983; Melton et al., 1984), was necessary for understanding how a splicing reaction proceeds from start to finish, and what steps in the reaction are blocked by particular .

Splicing-competent extracts from HeLa cells (Hernandez and Keller, 1983; Hardy et al., 1984; Krainer et al., 1984) and yeast (Lin et al., 1985) did indeed enable rapid illumination of how a pre-mRNA is spliced. Splicing consists of two sequential transesterification reactions (Figure 1). In the first step, the 2'-OH of the branch point adenosine attacks the 5'SS, which ligates the 5'SS to the BP through a 2'-5' phosphodiester linkage⁶ and frees the 3'-OH on the 5' exon (Figure 1a). In the second step, the free 3'-OH on the 5' exon attacks the 3'SS, which ligates the two exons together and releases the intron as a lariat with a free 3'-OH (Figure 1b) (Grabowski et al., 1984; Krainer et al., 1984; Padgett et al., 1984; Ruskin et al., 1984). The two reactions are each SN2, and are not catalyzed as simply forward and reverse reactions in a single active site (Moore and Sharp, 1993). Intriguingly, both spliceosomal introns and group II self-splicing introns (a more ancient type of highly structured intron found in all three domains of life) were noted to proceed through the same intermediates (Sharp, 1985; Cech, 1986), and more recently have been shown to act through a common mechanism of RNA-mediated positioning of divalent metals at catalytic sites (Sontheimer et al., 1997; Gordon et al., 2000; Fica et al., 2013). The potential of group II-like elements as the progenitors of spliceosomal introns is discussed below.

⁶ This peculiar branch structure (a nucleotide with both a 2'-5' and 3'-5' phosphodiester linkage) had been previously identified in bulk nuclear RNA (Wallace and Edmonds, 1983).

[Figure 1 schematic not reproduced.]

Figure 1. Two-step mechanism of pre-mRNA splicing. a, First step of spliceosomal splicing, resulting in the 5'SS guanosine (G) ligated to the BP adenosine (A) through a 2'-5' phosphodiester linkage and a free 3'-OH on the 5' exon. b, Second step of spliceosomal splicing, resulting in the 5' exon ligated to the 3' exon and a lariat intron with a free 3'-OH. X and Z indicate the most 3' nucleotide of the 5' exon and the most 5' nucleotide of the 3' exon, respectively. (Modified from Moore et al., 1993).
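The bookkeeping of the two transesterification steps can be traced with a toy string model. The Python sketch below is illustrative only and models no chemistry: the class name, function name, coordinate convention (0-based, exclusive intron end), and example sequence are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class SplicingProducts:
    mrna: str            # 5' exon ligated to 3' exon
    lariat_intron: str   # excised intron, written 5' to 3'
    branch_index: int    # position of the branch point A within the intron


def splice(pre_mrna, intron_start, intron_end, branch_a):
    """Toy model of the two steps; intron_end is exclusive, indices are 0-based."""
    if not intron_start < branch_a < intron_end:
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch_a] != "A":
        raise ValueError("branch point nucleotide must be an adenosine")

    # Step 1: the 2'-OH of the branch A attacks the 5'SS. The 5' exon is
    # freed with a 3'-OH; the intron stays attached to the 3' exon as a
    # lariat intermediate.
    five_exon = pre_mrna[:intron_start]
    lariat_intermediate = pre_mrna[intron_start:]

    # Step 2: the 3'-OH of the 5' exon attacks the 3'SS. The exons are
    # joined and the lariat intron is released.
    intron_length = intron_end - intron_start
    lariat_intron = lariat_intermediate[:intron_length]
    three_exon = lariat_intermediate[intron_length:]

    return SplicingProducts(
        mrna=five_exon + three_exon,
        lariat_intron=lariat_intron,
        branch_index=branch_a - intron_start,
    )


if __name__ == "__main__":
    # Made-up pre-mRNA: 5' exon, consensus yeast intron, 3' exon.
    pre = "AUGGCC" + "GUAUGU" + "A" * 10 + "UACUAAC" + "U" * 8 + "CAG" + "GGCUAA"
    print(splice(pre, intron_start=6, intron_end=40, branch_a=27))
```

Keeping the branch index alongside the intron sequence preserves the one piece of information that distinguishes a lariat from a linear intron in this representation: where the 5' end is joined.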

The spliceosome: a dynamic ribonucleoprotein machine

Although the chemistry of splicing is relatively simple, the spliceosome is complex. It is a ribonucleoprotein complex made up of five small RNAs and approximately 100 proteins⁷. These components must assemble on each intron de novo and undergo a series of ATP-dependent rearrangements to orient the requisite portions of intronic RNA such that the reactions described above can occur (Figure 2). During these rearrangements, individual and multipartite must dock and undock from the spliceosome in a specific order to drive the correct reaction to completion. Once complete, the spliceosome must release its two products (joined exons and lariat intron) and be disassembled so that its components may once again assemble de novo on and catalyze removal of other introns.

Five snRNPs (U1, U2, U4, U5 and U6 snRNPs), named after their respective small nuclear RNA (snRNA), are the major building blocks of the spliceosome⁸ (Brody and Abelson, 1985; Frendewey and Keller, 1985; Grabowski et al., 1985; Grabowski and Sharp, 1986). The discovery of snRNAs predates the discovery of splicing, although at the time these RNAs were only recognized for their distinguishing characteristics as abundant, stable, small, and nuclear RNAs (Muramatsu and Busch, 1965; Hodnett and Busch, 1968; Weinberg and Penman, 1968; Weinberg and Penman, 1969; Ro-Choi and Busch, 1974; Zieve and Penman, 1976; Hellung-Larsen and Frederiksen, 1977). It was subsequently found that antibodies produced by individuals with the autoimmune disease lupus react with snRNA-containing ribonucleoprotein complexes, allowing their immunoprecipitation and study (Mattioli and Reichlin, 1971; Northway and Tan, 1972; Lerner and Steitz, 1979). These antibodies recognize similarly sized snRNPs in eukaryotes ranging from humans to fall armyworms (Lerner et al., 1980). These findings were extended to yeast by a concerted effort largely on the part of Christine Guthrie's lab because evolutionary distance precluded the use of human-derived antibodies to detect snRNPs, and minimal sequence conservation made 1:1 relationships with mammalian snRNA counterparts difficult to establish (Tollervey et al., 1983; Wise et al., 1983; Parker and Guthrie, 1985; Ares Jr, 1986; Kretzner et al., 1987; Parker et al., 1987; Patterson and Guthrie, 1987; Siliciano et al., 1987; Brow and Guthrie, 1988; Shuster and Guthrie, 1990)⁹.

⁷ The number is closer to 200 proteins in humans.

⁸ Along with the Prp19-centric NineTeen Complex (Chan et al., 2003).

The connection between snRNAs and splicing was initially proposed based on sequence complementarity between the very 5' end of U1 and the 5'SS consensus sequence in mammalian introns (Lerner et al., 1980; Rogers and Wall, 1980)¹⁰. Experimental evidence followed, showing that immunodepletion of snRNPs, snRNAs, and RNase H-mediated removal of U1's 5' end are all sufficient to inhibit splicing (Padgett et al., 1983; Kramer et al., 1984; Rinke et al., 1984). Similar depletion studies targeted at U2, U4, and U6 showed the essentiality of these components for splicing and spliceosome formation¹¹ (Kramer et al., 1984; Black et al., 1985; Chabot et al., 1985; Berget and Robberson, 1986; Black and Steitz, 1986).

Spliceosome function can be broken down into three major steps: recognition and assembly, catalysis, and release and disassembly¹². As discussed above, the U1 snRNP first recognizes the 5'SS (E complex); this is mediated by direct RNA-RNA interactions between U1 and the intron, as demonstrated genetically through compensatory base changes (Zhuang and Weiner, 1986; Seraphin et al., 1988; Siliciano and Guthrie, 1988). Similarly, the U2 snRNP is recruited via U2-mediated RNA-RNA recognition of the BP (Parker et al., 1987; Wu and Manley, 1989; Zhuang and Weiner, 1989). U1 recruitment is not dependent on ATP, while U2 recruitment depends on both ATP and the presence of U1 (Bindereif and Green, 1987; Ruby and Abelson, 1988; Seraphin and Rosbash, 1989; O'Day et al., 1996)¹³,¹⁴. This complex (A complex) is next joined by the pre-associated tri-snRNP of U4/U5/U6 (Bringmann et al., 1984; Hashimoto and Steitz, 1984; Bindereif and Green, 1987; Cheng and Abelson, 1987; Konarska and Sharp, 1987; Behrens and Lührmann, 1991; Stevens and Abelson, 1999). The ATP-dependent helicase activity of Prp28 disrupts U1-5'SS base pairing, resulting in U1 snRNP eviction (B complex) (Staley and Guthrie, 1999). The final pre-catalytic step requires the ATP-dependent helicase Brr2 to unwind U4-U6 pairing, which leads to dissociation of U4 and allows U6 to make catalytically relevant interactions with U2 (Konarska and Sharp, 1987; Brow and Guthrie, 1988; Lamond et al., 1988; Madhani and Guthrie, 1992; Sun and Manley, 1995; Laggerbauer et al., 1998; Raghunathan and Guthrie, 1998). The catalytically active complex (Bact complex) is further stabilized by the NineTeen Complex (Ohi and Gould, 2002; Chan et al., 2003; Chan and Cheng, 2005).

⁹ The interested reader should see (Guthrie, 2010) for a more complete story of this effort.

¹⁰ Initially, a "cross-over" model was proposed, where U1 would pair with both the 5'SS and 3'SS to align the two ends of an intron. However, while the 5'SS-complementary sequence of U1 is highly conserved, the 3'SS-complementary sequence is not (Mount and Steitz, 1981).

¹¹ U5 is largely resistant to RNase H-mediated cleavage (Black and Pinto, 1989).

¹² The disassembly step, as it most directly relates to intron stability and the fate of introns post-splicing, is discussed in a subsequent stand-alone section.

¹³ In mammalian studies, U2 recruitment did not depend on an intact 5'SS (Ruskin and Green, 1985b). However, U1-then-U2 has been found to be the order of operations on intact introns (Bindereif and Green, 1987).

¹⁴ In most eukaryotes, U2 recruitment to the BP is preceded by U2AF association with the 3'SS and upstream pyrimidine-rich tract (Ruskin et al., 1988; Wu et al., 1999). However, this is not the case in S. cerevisiae. The difference likely lies in the relative stringency of its BP sequence (UACUAAC in S. cerevisiae compared to YURAY [Y being a pyrimidine and R being a purine ribonucleotide] in humans) and the inability of mammalian U2 snRNP to locate a degenerate BP motif without additional context clues.

[Figure 2 schematic not reproduced. Labels recoverable from the original include the intron motifs (5'SS GUAUGU, BP UACUAAC, 3'SS YAG), the complexes E, A, pre-B, B, Bact, B*, and C*, and the factors Prp28, Brr2, Prp2, Prp22, Prp43, and Dbr1.]

Figure 2. Schematic view of the spliceosome cycle in S. cerevisiae. Pre-mRNA (top) enters the splicing reaction, which produces ligated mRNA (left) and lariat intron (top left) as products. The snRNP particles (U1, U2, U4, U5, U6) assemble on the pre-mRNA in an ordered manner. Solid arrows indicate the paths of the mRNA and intron products. Dotted arrows indicate the paths of recycled snRNPs. Spliceosome assembly and fidelitous catalysis of the splicing cycle requires 5' splice-site, branch-point, and 3' splice-site sequences in the intron, as indicated on the pre-mRNA as 5'SS, BP, and 3'SS, respectively. The branch-point adenosine is additionally indicated (red). For simplicity, the NineTeen Complex has been omitted. (Modified from Ruby and Abelson, 1991; Moore et al., 1993).

Catalysis occurs while retaining the remaining snRNAs (U2, U5, and U6) throughout the process. First, Bact is converted to the transient B* complex through juxtaposition of the 5'SS and the BP for branching, which then catalyzes step I of splicing (Warkocki et al., 2009). After step I is complete (C complex), another ATP-dependent remodeling enables docking of the 3'SS into the active site (Schwer and Guthrie, 1992; James et al., 2002; Tseng et al., 2011). The 5' and 3' exons are aligned by U5, and the resulting C* complex performs step II of splicing, joining the exons (Newman and Norman, 1992; Sontheimer and Steitz, 1993). At this stage, the P-complex spliceosome still retains its two products: joined exons and lariat intron. The mRNA is released through rearrangements driven by the Prp22 helicase (Arenas and Abelson, 1991; Schwer and Gross, 1998; Schwer, 2008). Release of the lariat intron is discussed below.

More than 30 years of genetic and biochemical study of spliceosome function led to the model described here. Recently, it has become possible to refine this model in light of high-resolution, stage-specific cryo-EM structures of both human (Agafonov et al., 2016; Bertram et al., 2017a; Bertram et al., 2017b; Zhang et al., 2017; Haselbach et al., 2018; Zhan et al., 2018) and yeast (Nguyen et al., 2015; Yan et al., 2015; Galej et al., 2016; Nguyen et al., 2016; Rauhut et al., 2016; Wan et al., 2016a; Wan et al., 2016b; Yan et al., 2016a, b; Bai et al., 2017; Fica et al., 2017; Li et al., 2017; Liu et al., 2017; Plaschka et al., 2017; Wan et al., 2017; Wilkinson et al., 2017) spliceosomes. These data will shape the future of research on this dynamic machine.

Spliceosome disassembly and lariat-intron degradation

Each newly assembled spliceosome is a single-turnover enzyme. Of course, a spliceosome that can only catalyze one splicing reaction is of little use for a cell. Just as the spliceosome undergoes dynamic changes during the splicing cycle, it must undergo a final set of changes in order to dismantle its final form, the intron-lariat spliceosome (ILS), into constituent parts. The DEAH-box NTPase Prp43, along with associated factors Ntr1 and Ntr2, controls ILS disassembly (Figure 2) (Arenas and Abelson, 1997; Martin et al., 2002; Tsai et al., 2005; Boon et al., 2006; Pandit et al., 2006; Tsai et al., 2007; Fourmann et al., 2013). In vitro, incubation of purified ILS complexes with Prp43, Ntr1, Ntr2, and any NTP is sufficient to generate defined dissociation products: the intron-lariat, U6 snRNA, U2 snRNP containing SF3a/b, U5 snRNP, and the NTC (Fourmann et al., 2013). Some studies also implicate Brr2 and its associated GTPase Snu114 in ILS disassembly (Small et al., 2006; Tsai et al., 2007; Hahn and Beggs, 2010), but their activities are not required for in vitro disassembly, as any NTP, not just ATP or GTP, supports disassembly (Fourmann et al., 2013). Abortive splicing events (such as spliceosomes stalled on suboptimal pre-mRNA substrates) are also rescued by this set of factors (Koodathingal et al., 2010; Mayas et al., 2010; Semlow and Staley, 2012).

Introns are generally degraded very rapidly after the completion of splicing (Ruskin and Green, 1985a; Arenas and Hurwitz, 1987; Sharp et al., 1987; Chapman and Boeke, 1991). In order for the intron lariat to be degraded, it must be debranched by the debranchase Dbr1 (Chapman and Boeke, 1991; Khalid et al., 2005). To be debranched, an intron lariat must first be released from the ILS, indicating that ILS disassembly is required for intron degradation (Martin et al., 2002)¹⁵. Without debranchase activity, intron lariats accumulate in S. cerevisiae, resulting in a moderate reduction of growth rate. More severe phenotypes associated with loss of Dbr1, often embryonic lethality, are found in more intron-rich eukaryotes (Nam et al., 1997; Wang et al., 2004; Dickinson et al., 2016). So, in addition to recycling snRNPs, ILS disassembly is also vital for intron turnover, the loss of which is otherwise lethal.

¹⁵ This was rigorously shown for 1T's intron in log-phase yeast extract. Relevant for work presented below, debranching could occur before ILS dissociation in intron-dependent or environment-dependent ways.

Part 2. Introns qua introns

Why are our genes in pieces? How and when did introns originate? What was their selective advantage in our ancestors, or is that not the right question to be asking at all? These questions have driven a great deal of experimental, computational, and philosophical work since the discovery of splicing. Although these questions are impossible to answer definitively, this section first describes the current state of thought on the evolution of introns, their loss and gain over time, and their function in past and present eukaryotes. It concludes with a survey of the known diversity of intronic fates post-splicing, including examples of non-coding RNAs that are harbored within introns across eukaryota.

Evolution of introns

The question of when introns arose in evolution (later termed "introns-early" vs. "introns-late" (Doolittle, 1987)) was debated for many years. Prominent early observers often favored an "introns-early" model wherein introns were relics of a primordial, pre-cellular gene structure. In this model, primordial exons were each independently functional; the function of interspersed introns was to allow exons to be easily joined and re-shuffled to produce new proteins without disrupting coding sequence (Gilbert, 1978, 1987). In turn, introns and the potential for "exon shuffling" were entirely lost in all modern bacterial lineages through extreme genome streamlining (Darnell, 1978; Doolittle, 1978). However, two major findings have largely pushed this model out of favor. First, the discovery of the archaeal domain of life, together with the determination that an archaeal origin of the last eukaryotic common ancestor (LECA) is most parsimonious, requires under the "introns-early" model that all introns were lost independently at least twice: once in all known lineages of bacteria and once in all known lineages of archaea (Woese and Fox, 1977; Williams et al., 2013). Second, the "exon shuffling" aspect of the "introns-early" model suggests that introns would tend to fall at the boundaries between protein domains, as it supposes pre-cellular exons would each encode a foldable, functional protein module. This is not convincingly borne out across the many sequenced eukaryotic genomes and transcriptomes (Doolittle, 2014).

What of the "introns-late" model, which suggests spliceosomal introns are not primordial but instead arose in eukaryotes alone (Cavalier-Smith, 1985, 1987)? This model took longer to fully mature, being especially supported by later findings of similarities between spliceosomal splicing and group II self-splicing (Sharp, 1985; Cech, 1986). Self-splicing introns serve as a much more concrete source for the origins of spliceosomal introns than a diverse collection of transposable elements, especially given the uniformity of splice-site sequences (Cavalier-Smith, 1978; Borst and Grivell, 1981; Cavalier-Smith, 1985). In this model, spliceosomal introns are the result of a massive invasion and proliferation of group II-like introns introduced into proto-eukaryotes, potentially via the genome of the α-proteobacterium that became the mitochondrion.

How could this have happened? How would those invaded individuals pass on their genomes when one could imagine this irrevocably damaging their reproductive fitness? Selfish, efficiently transposable genomic elements do not have to be beneficial or neutral to continue spreading in a population, and, as their frequency in the population increases, they can become more and more detrimental to offspring viability¹⁶ (Hickey, 1982). Therefore, this scenario is compatible with the view that early introns lacked beneficial functions for eukaryotes, as one would expect given their evolutionary origin as selfish genomic elements. In all, the "introns-late" model as understood today proposes that early eukaryotes did not acquire introns so that in a billion years we could have alternative splicing. Instead, introns were likely forced upon the genome of the LECA and rapidly spread through the population. Some of the features that make eukaryotes what they are today (compact chromatin, a physical barrier between cytosol and genome, and RNAi, to name a few) may have evolved in response to this invasion or ones like it (Madhani, 2013).

¹⁶ This is only true for sexually reproducing organisms, as selfish genomic elements only able to spread within a clonal asexual lineage would certainly put that lineage at a unique disadvantage relative to the population (Cavalier-Smith, 1980). Intriguingly, this suggests such elements could have played a role in the evolution of sexual reproduction itself.

The evolution of the spliceosome likely resulted from the extreme pressure to maintain extensive sequence conservation across all of these newly acquired self-splicing introns. Unlike spliceosomal introns, which have minimal sequence requirements, group I self-splicing introns have extensive RNA secondary structures that must be maintained for splicing to occur (Michel et al., 1990). If too many mutations are introduced into a self-splicing intron intercalated between exons of an essential gene, then the intron is retained in the mature mRNA, disrupts translation, and leads to an organism that is no longer viable. Expand this to the thousands of self-splicing introns littered throughout the LECA genome and it is easy to see where the evolutionary pressure to develop trans-acting splicing machinery came from. An interesting possibility in the genesis of trans-acting machinery is that the five snRNAs present in modern eukaryotes arose from the fragmentation of self-splicing intron structures into "five easy pieces" (Sharp, 1991). These RNAs, along with the many proteins that became the spliceosome, could then act in trans to catalyze removal of all introns as long as enough sequence remained to define each intron's boundaries. From there, each intron's sequence could rapidly diverge from the uniform precursor, allowing for the evolution of the diverse intron functions we see today.

Loss and gain of introns

Introns have been lost and gained in various lineages since their initial proliferation. Although how these events occurred in the past cannot be stated with certainty, hypotheses about both events have been put forward. Models based on modern eukaryotes estimate that the LECA contained between 4 and 6 introns per gene (Figure 3) (Koonin, 2006; Csuros et al., 2011). Therefore, while some eukaryotic lineages have doubled the number of introns per gene, intron loss is the more dramatic and prominent event, with the S. cerevisiae lineage losing 99% of its primordial introns.

[Figure 3: bar chart of introns per gene (axis from 0 to 9) for S. cerevisiae, S. pombe, N. crassa, D. melanogaster, A. thaliana, C. elegans, H. sapiens, microsporidia, T. gondii, and Giardia, with the last eukaryotic common ancestor indicated.]

Figure 3. Intron density of eukaryotes. Shown are the introns per gene for example modern eukaryotes. These values range from ~0.005 introns per gene (or 1 intron per 200 genes) in microsporidia and Giardia to ~8.5 introns per gene in humans. S. cerevisiae (left) has ~0.05 introns per gene. The estimated intron density in the last eukaryotic common ancestor (middle, red) is 4.4-6.3 introns per gene (Modified from Csuros et al., 2011; additional data from Irimia and Roy, 2014).
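The intron densities quoted in this section can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original analysis. It assumes the round figures used in this chapter (roughly 300 introns across roughly 6,000 genes in S. cerevisiae, and 4.4-6.3 introns per gene in the LECA). The function names are hypothetical and chosen only for illustration.

```python
def introns_per_gene(n_introns, n_genes):
    """Average intron density: total introns divided by total genes."""
    return n_introns / n_genes


def fraction_lost(modern_density, ancestral_density):
    """Fraction of ancestral introns missing today, assuming no net gain."""
    return 1.0 - modern_density / ancestral_density


# Round numbers taken from the text (assumptions, not new data).
yeast_density = introns_per_gene(300, 6000)

# Bracket the loss using the low and high LECA density estimates.
loss_low = fraction_lost(yeast_density, 4.4)
loss_high = fraction_lost(yeast_density, 6.3)

print(f"S. cerevisiae: {yeast_density:.2f} introns per gene")
print(f"Estimated loss since the LECA: {loss_low:.1%} to {loss_high:.1%}")
```

Under these assumptions the estimated loss is close to 99%, consistent with the value given for the S. cerevisiae lineage. The calculation treats density per gene as a proxy for intron number and ignores both changes in gene count and any intron gain.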

A mechanism-driven hypothesis for intron loss in S. cerevisiae followed from the discovery that genetic Ty elements transpose through an RNA intermediate, necessitating an endogenous reverse transcriptase (Boeke et al., 1985; Garfinkel et al., 1985). If this activity exists in the cell, then intronless mature mRNA could be inadvertently reverse transcribed into cDNA; this cDNA could recombine into the genome, replacing the original, intron-containing genomic sequence (Fink, 1987). Because crossover events between genome and cDNA need to take place on either side of the intron to remove it, introns very close to the 5' end of the transcript would be refractory to removal. Supporting this hypothesis, introns in S. cerevisiae and many other intron-poor eukaryotes are enriched at the very 5' end of transcripts (Fink, 1987; Mourier and Jeffares, 2003)¹⁷. Although we cannot know how this process happened throughout evolution, it has been shown that this mechanism is a viable pathway for intron loss in modern S. cerevisiae (Derr et al., 1991; Derr, 1998).

Intron gain events are thought to be lineage-specific, dramatic, and rare (Fedorov et al., 2003; Babenko et al., 2004; Coghlan and Wolfe, 2004). As with intron loss, mechanisms for intron gain have been proposed (Yenerall and Zhou, 2012; Huff et al., 2016). A particularly intriguing finding related to intron gain is that both steps of spliceosomal splicing are reversible in vitro with appropriate salt and divalent cation conditions (Tseng and Cheng, 2008). In principle, this allows for the reverse splicing of intron lariats into new mRNA substrates. How reverse splicing would result in events of massive intron gain is unclear, and genomic intron gain by this route would also require reverse transcription and recombination. That said, it has recently been shown experimentally that S. cerevisiae (Lee and Stevens, 2016) and single-celled algae (Worden et al., 2009; Huff et al., 2016) can gain new introns, providing insight into these poorly understood phylogenetic events.

Function of introns in alternative splicing

S. cerevisiae contains only 300 spliceosomal introns (and only 14 multi-intronic genes) across its 6,000 genes (Spingola et al., 1999; Davis et al., 2000; Juneau et al., 2007; Zhang et al., 2007), making alternative splicing events few and far between (Juneau et al., 2009; Hossain et al., 2011; Hossain et al., 2016). However, the function and utility of splicing in eukaryotes are often viewed through the lens of alternative splicing, warranting discussion of alternative splicing outside of yeast.

¹⁷ Introns are not enriched at the 3' end of transcripts. It could be that these have been selected against for reasons beyond the limitations of homologous recombination.

The first alternative splicing events were found relatively quickly after the discovery of splicing itself (Alt et al., 1980; Early et al., 1980). Today, we recognize that >95% of pre-mRNAs in humans have alternatively spliced isoforms (Pan et al., 2008; Wang et al., 2008). Alternative splicing serves as one of the major sources of transcriptome and proteome diversity in multicellular eukaryotes. The utility of alternative splicing is particularly seen in developmental (Sanchez, 2004; Demir and Dickson, 2005) and differentiation-specific (Boutz et al., 2007; Makeyev et al., 2007) expression of specific isoforms.

Although there are instances in which the molecular mechanisms underlying the production of different isoforms have been elucidated (Siebel et al., 1992; Valcárcel et al., 1993; Zuo and Maniatis, 1996; Sharma et al., 2008), in general, the totality of factors impinging on a given splicing event, in a given cell, with a given cellular history, at a given point in time, and in a given environment has made universal predictions of isoform usage challenging (Wang and Burge, 2008). Nonetheless, it is difficult to overstate both the importance and complexity of alternative splicing in crafting the transcriptome of multicellular eukaryotes.

Other functions of introns in modern eukaryotes

Besides their role in alternative splicing, introns have other functions that manifest before, during, or after active splicing. In many multicellular eukaryotes, introns have been found to have prominent functions related to gene expression, translational yield of the mature mRNA, and ncRNA production. In many cases, the underlying mechanism is unclear, but examples of these general functions are still illuminating for thinking about the roles introns play in modern eukaryotes.

There is a wealth of literature demonstrating coupling between transcription and introns (Moore and Proudfoot, 2009). In particular, 5' introns (those introns nearest the transcription start site) often contain regulatory elements that increase Pol II's initiation rate (Bornstein et al., 1988; Vasil et al., 1989; Palmiter et al., 1991; Furger et al., 2002). In extreme cases, a 5' intron is required to produce detectable levels of a transcript at all (Buchman and Berg, 1988). Introns near the site of 3' processing may also modulate that process (Rigo and Martinson, 2008; Proudfoot, 2011). In metazoans, splicing results in the deposition of a complex at each exon-exon junction, which is appropriately named the exon-junction complex (EJC) (Le Hir et al., 2000). The EJC remains on the mRNA until the pioneer round of translation in the cytoplasm, and is thought to serve as an indication that a given mRNA was processed correctly and should therefore be translated efficiently (Wiegand et al., 2003; Moore, 2005).

Introns often harbor smaller ncRNAs within their sequence, resulting in their incomplete degradation post-splicing as these processed remnants go on to outlast their ephemeral host RNA¹⁸. The most prominent classes are small nucleolar RNAs (snoRNAs) and microRNAs (miRNAs). snoRNAs are required for ribosome biogenesis and direct two types of RNA modification, 2'-O-methylation and pseudouridylation, to 100-200 sites per ribosome as well as directing other aspects of rRNA trimming and maturation (Venema and Tollervey, 1999). miRNAs are a class of small (~22 nt) RNAs that pair to partially complementary sequences within an mRNA and direct post-transcriptional repression of these target messages in diverse eukaryotic lineages (Bartel, 2018). miRNAs are processed from hairpin substrates, generally embedded within much longer primary Pol II transcripts (Lee et al., 2002; Lee et al., 2003; Cai et al., 2004; Lee et al., 2004). Approximately half of all human miRNAs are processed from introns through the canonical Drosha-mediated pathway (Baskerville and Bartel, 2005; Chiang et al., 2010). In some cases, a full-length debranched intron will both be the correct length and have the correct, extensive base-pairing to resemble a Drosha-processed precursor miRNA. These "mirtrons" are recognized by the downstream processing enzyme Dicer, bypassing Drosha (Ruby et al., 2007). Even in these cases, more than half of the excised intron will be catabolized during miRNA maturation. Other examples of functional ncRNAs processed from introns have been found in mammalian immunoglobulin class-switch recombination (Zheng et al., 2015) and fly embryogenesis (Tay and Pek, 2017), with more likely awaiting discovery.

¹⁸ Examples of intact lariats and linear introns are discussed in a separate section below.

The function of individual introns has been examined thoroughly in S. cerevisiae, where, because of the small number of introns and few alternative-splicing events, the potential functions of introns outside of alternative splicing can be more readily recognized. Additionally, because 95% of S. cerevisiae genes do not contain an intron, it is unlikely for introns to have general functions related to mRNA quality control as they do in metazoans. For most introns, no growth phenotypes are detected when they are removed from the genome (Ng et al., 1985; Parenteau et al., 2008; Hooks et al., 2016). In a couple of cases, the presence of an intron at a given locus appears to have a function. In one case, the intron regulates expression of paralogous ribosomal-protein genes (Parenteau et al., 2011), and in another case the intron counteracts deleterious R-loop formation during transcription (Bonnet et al., 2017). As in other eukaryotes, some S. cerevisiae introns are further processed into functional ncRNAs, such as snoRNAs, in which case the flanking portions of the intron are rapidly catabolized (Qu et al., 1995; Petfalski et al., 1998).

Notable post-splicing fates of intact introns

There are very few known examples of full-length introns that persist post-splicing, and even fewer with known functions or regulation. Broadly, this category could include either branched or linear introns. The most notable examples are the latency-associated transcripts (LATs): lariats produced during latent stages of herpes simplex virus 1 infections that may play a role in the maintenance of viral latency (Fraser et al., 1992; Wu et al., 1996; Cliffe et al., 2009). Lariats spliced from the T cell receptor-β pre-mRNA in mouse and human T cells have developmentally regulated stability, although no function has been found (Qian et al., 1992). Trimmed circles (lariats with 3' tails removed) spliced from ANKRD52 may remain associated with their genomic locus and modulate transcription (Zhang et al., 2013). The sole example of a debranched, seemingly full-length intron is one produced by the Epstein-Barr virus in infected human B cells (Moss and Steitz, 2013). How this intron is protected from degradation is unknown.

Part 3. S. cerevisiae outside of log-phase growth

Cells, both within multicellular organisms and as single-celled microorganisms, are not continually growing and dividing. In fact, most eukaryotic cells likely spend the majority of their time in a resting, quiescent state exemplified by stem cells, neurons, eggs, and spores (Werner-Washburne et al., 1996; Gray et al., 2004). This is especially true for organisms, such as S. cerevisiae, that must respond to sub-optimal environmental changes without the benefit of locomotion. Waxing and waning of optimal and sub-optimal environments has likely been a consistent evolutionary pressure on S. cerevisiae's lineage. We see this reflected today in the species: yeast cells can remain viable in their quiescent state for months (equivalent to thousands of generation times for exponentially doubling cultures) (Lillie and Pringle, 1980; Granot and Snyder, 1993; Fuge et al., 1994), and likely for orders of magnitude longer under the right conditions (Prokesch, 1991).

Despite S. cerevisiae being among the most thoroughly studied eukaryotic organisms (Botstein and Fink, 2011), our knowledge of the molecular mechanisms of its environmental responses, state changes, and state maintenance is fragmented and incomplete. In turn, this lack of knowledge often leads to unfounded assumptions, based on the more experimentally tractable and standardized log-phase growth, that are presumed to hold for other growth conditions. Of course, yeasts in this growth phase are in their optimal environment, where the name of the game is presumably to produce as many offspring as possible. However, a culture of yeast cells is a dynamic entity with all growth phases able to provide relevant insights, especially in the case of phase-specific phenomena, into the evolution and survival of the species. This section provides an overview of how S. cerevisiae responds to changes in its environment. Focus is given to the growth phases transited by an otherwise unperturbed yeast culture, and what these growth phases can potentially teach us about S. cerevisiae's evolutionary interactions with its natural environments.

Growth phases

Yeast cultures are most often initiated by diluting a confluent overnight culture into fresh medium¹⁹. First, the culture experiences a lag phase where yeast become increasingly biochemically active, but do not yet divide (Figure 4) (Forsburg and Nurse, 1991). Next, they enter an extended state of exponential growth termed log phase. Log phase occurs while the yeast are sufficiently dilute such that their metabolism has little influence on overall nutrient availability in the medium. In either of the two standard media for S. cerevisiae growth (rich, yeast extract-based medium [YPD] or synthetic complete medium [YSC]), the majority of cell division and increase in culture density occurs during log phase, when each doubling requires only 1.5-2 hours (under standard conditions). Since essential nutrients are finite in either medium, the yeast will at some point sense some change in their environment and cease rapid divisions. Surprisingly, this sensing can happen early in the rapid-doubling period. In particular, the ribosome-synthesis rate drops 50% early in log phase, even though multiple subsequent doublings occur before the growth rate begins to decline (Ju and Warner, 1994; Warner, 1999). So, despite the apparent "full-throttle" nature of log phase, yeast in this phase are already beginning to estimate the future potential for growth in their environment.

¹⁹ This style of culture is termed a "batch culture" because all nutrients ever provided to the culture are provided in one batch at the time of seeding. This is in contrast to the less commonly used "continuous culture," which utilizes a chemostat to replenish medium and maintain a constant growth rate.
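The doubling time quoted above translates directly into how long a freshly diluted culture can remain in log phase. The following minimal Python sketch assumes idealized exponential growth, N(t) = N0 * 2^(t / Td), with no lag phase and no slowdown before saturation. The 1,000-fold dilution is a hypothetical example, and the function names are illustrative only.

```python
import math


def doublings_needed(fold_increase):
    """Number of doublings required for a given fold increase in density."""
    return math.log2(fold_increase)


def hours_in_log_phase(fold_increase, doubling_time_h):
    """Time for an idealized culture to grow by fold_increase."""
    return doublings_needed(fold_increase) * doubling_time_h


def density_after(n0, hours, doubling_time_h):
    """Density after `hours` of unrestricted doubling from density n0."""
    return n0 * 2 ** (hours / doubling_time_h)


# Hypothetical example: a 1,000-fold dilution of a saturated culture.
fold = 1000
for td in (1.5, 2.0):  # doubling times from the text, in hours
    t = hours_in_log_phase(fold, td)
    print(f"Td = {td} h: {doublings_needed(fold):.1f} doublings, {t:.1f} h")
```

Under these idealized assumptions, a 1,000-fold dilution needs about ten doublings to return to its starting density, or roughly 15-20 hours at the doubling times quoted above. Real cultures will take longer because of the lag phase and the gradual slowdown that precedes saturation.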

[Figure 4: schematic growth curve of relative cell density (y-axis) versus time (x-axis), with the x-axis labeled by growth phase.]

Figure 4. Growth phases of S. cerevisiae. Shown are the growth phases that a newly seeded S. cerevisiae culture will transit if left unperturbed (x-axis labels) as well as the relative cell density in each phase (y-axis). The time axis is qualitative, as the length of each phase will depend on media used and initial cell density.

The type of medium used to grow yeast will affect its growth rate as well as which essential nutrients are first to become limiting. In YPD medium, the limiting nutrient is most often glucose (Lillie and Pringle, 1980). When glucose becomes exhausted, the culture undergoes a diauxic shift to begin utilizing other, non-fermentable carbon sources via respiration. This shift first results in a transient lag period with no growth while necessary enzymes are produced to utilize the new carbon source, not dissimilar to the lag phase discussed earlier (Monod, 1949). The post-diauxic phase encompasses a slight increase in cell density until limiting nutrients are once again exhausted. It is worth noting that when yeast are grown in YSC, the limiting factor for growth is likely lipids and/or lipid precursors (Hanscho et al., 2012). Glucose, on the other hand, does not appear to be limiting; roughly half of the standard 20 g/L glucose is not catabolized by the time log phase ends (Ju and Warner, 1994; Hanscho et al., 2012). How much of a classical diauxic shift exists under these conditions is unclear.

Stationary phase is characterized by neither a marked increase nor decrease in culture density (Werner-Washburne et al., 1996). It is worth noting that stationary phase is distinctly a property of cultures, not individual cells. Stationary cultures are composed of two primary populations: quiescent cells and non-quiescent cells (Allen et al., 2006; Aragon et al., 2008). Quiescent cells maintain viability, genome stability, ROS repression, and reproductive competency for much longer than non-quiescent cells. However, it is the non-quiescent population that continues to reproduce in the short term, and may be the source of new, advantageous mutations in an altered environment (Longo et al., 1996; Allen et al., 2006). As the classification of distinct stationary-phase populations is relatively recent, there are still many unanswered questions about the interplay between quiescent and non-quiescent cells, and whether there are quorum-like decisions that control the relative ratio of these populations during stationary-phase growth (Werner-Washburne et al., 2011).

Although some cells survive in stationary-phase cultures for extended periods of time (discussed above), the overall viability of the culture decreases over time. The study of chronological life span (CLS) and aging in yeast is often focused on this phase of growth. Of particular interest are genetic mutations that confer increased CLS. Genetic approaches have implicated a number of factors in regulating CLS: asymmetric inheritance of extrachromosomal DNA and damaged mitochondria, oxidative stress, cytosolic acidification, caloric restriction, and TOR signaling are among the most prominent (Fabrizio et al., 2001; Powers et al., 2006; Kennedy et al., 2007; Guarente, 2008; Seo et al., 2010; Hughes and Gottschling, 2012; Longo et al., 2012; Gottschling and Nystrom, 2017). In all, it is clear that diverse cell-intrinsic and cell-extrinsic factors lead to the decline in a given cell's viability as well as the overall viability of a culture.

The question that initiated this dissertation was simply: Are there unknown, post-transcriptional gene regulatory regimes hidden outside of well-trodden environmental contexts? This dissertation describes one of many possible paths that led directly from that question: the discovery of regulated intron stability post-splicing and post-debranching in S. cerevisiae, and a function for these ncRNAs in TOR-mediated growth control. This regulated stability is not limited to one or two introns, but is characteristic of at least 34 introns, which is greater than 10% of all the spliceosomal introns in the species. Importantly, these introns, like all other spliceosomal introns in S. cerevisiae, are rapidly degraded in log-phase cultures, as expected based on many previous studies. Instead, stable introns are found in a variety of saturated-growth conditions, and specifically as a result of TOR inhibition. This work exemplifies a broader belief that focusing on a single environmental state can obfuscate fundamental aspects of biology, and serves as evidence that the post-splicing lives of introns may be much more complex than currently appreciated.

References

Agafonov, D.E., Kastner, B., Dybkov, O., Hofele, R.V., Liu, W.-T., Urlaub, H., Lührmann, R., and Stark, H. (2016). Molecular architecture of the human U4/U6.U5 tri-snRNP. Science, aad2085.
Allen, C., Buttner, S., Aragon, A.D., Thomas, J.A., Meirelles, O., Jaetao, J.E., Benn, D., Ruby, S.W., Veenhuis, M., Madeo, F., et al. (2006). Isolation of quiescent and nonquiescent cells from yeast stationary-phase cultures. The Journal of cell biology 174, 89-100.
Alt, F.W., Bothwell, A.L., Knapp, M., Siden, E., Mather, E., Koshland, M., and Baltimore, D. (1980). Synthesis of secreted and membrane-bound immunoglobulin mu heavy chains is directed by mRNAs that differ at their 3' ends. Cell 20, 293-301.
Aragon, A.D., Rodriguez, A.L., Meirelles, O., Roy, S., Davidson, G.S., Tapia, P.H., Allen, C., Joe, R., Benn, D., and Werner-Washburne, M. (2008). Characterization of differentiated quiescent and nonquiescent cells in yeast stationary-phase cultures. Molecular biology of the cell 19, 1271-1280.
Arenas, J., and Abelson, J. (1991). Requirement of the RNA helicase-like protein PRP22 for release of messenger RNA from spliceosomes. Nature 349, 487.
Arenas, J., and Hurwitz, J. (1987). Purification of a RNA debranching activity from HeLa cells. Journal of Biological Chemistry 262, 4274-4279.
Arenas, J.E., and Abelson, J.N. (1997). Prp43: An RNA helicase-like factor involved in spliceosome disassembly. Proceedings of the National Academy of Sciences 94, 11798-11802.
Ares Jr, M. (1986). U2 RNA from yeast is unexpectedly large and contains homology to vertebrate U4, U5, and U6 small nuclear RNAs. Cell 47, 49-59.
Avery, O., Macleod, C., and McCarty, M. (1944). Studies on the chemical nature of the substance inducing transformation of pneumococcal types: induction of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III. The Journal of experimental medicine 79, 137-158.
Babenko, V.N., Rogozin, I.B., Mekhedov, S.L., and Koonin, E.V. (2004). Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic acids research 32, 3724-3733.
Bai, R., Yan, C., Wan, R., Lei, J., and Shi, Y. (2017). Structure of the Post-catalytic Spliceosome from Saccharomyces cerevisiae. Cell 171, 1589-1598.e1588.
Bartel, D.P. (2018). Metazoan MicroRNAs. Cell 173, 20-51.
Baskerville, S., and Bartel, D.P. (2005). Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA 11, 241-247.
Behrens, S.-E., and Lührmann, R. (1991). Immunoaffinity purification of a [U4/U6.U5] tri-snRNP from human cells. Genes & development 5, 1439-1452.
Berget, S.M. (1995). Exon recognition in vertebrate splicing. Journal of Biological Chemistry 270, 2411-2414.
Berget, S.M., Moore, C., and Sharp, P.A. (1977). Spliced segments at the 5' terminus of adenovirus 2 late mRNA. Proceedings of the National Academy of Sciences 74, 3171-3175.
Berget, S.M., and Robberson, B.L. (1986). U1, U2, and U4/U6 small nuclear ribonucleoproteins are required for in vitro splicing but not polyadenylation. Cell 46, 691-696.

Bertram, K., Agafonov, D.E., Dybkov, O., Haselbach, D., Leelaram, M.N., Will, C.L., Urlaub, H., Kastner, B., Lührmann, R., and Stark, H. (2017a). Cryo-EM structure of a pre-catalytic human spliceosome primed for activation. Cell 170, 701-713.e711.
Bertram, K., Agafonov, D.E., Liu, W.-T., Dybkov, O., Will, C.L., Hartmuth, K., Urlaub, H., Kastner, B., Stark, H., and Lührmann, R. (2017b). Cryo-EM structure of a human spliceosome activated for step 2 of splicing. Nature 542, 318.
Bindereif, A., and Green, M.R. (1987). An ordered pathway of snRNP binding during mammalian pre-mRNA splicing complex assembly. The EMBO journal 6, 2415-2424.
Black, D.L., Chabot, B., and Steitz, J.A. (1985). U2 as well as U1 small nuclear ribonucleoproteins are involved in premessenger RNA splicing. Cell 42, 737-750.
Black, D.L., and Pinto, A. (1989). U5 small nuclear ribonucleoprotein: RNA structure analysis and ATP-dependent interaction with U4/U6. Molecular and cellular biology 9, 3350-3359.
Black, D.L., and Steitz, J.A. (1986). Pre-mRNA splicing in vitro requires intact U4/U6 small nuclear ribonucleoprotein. Cell 46, 697-704.
Blumenthal, T., and Steward, K. (1997). RNA processing and gene structure. In C. elegans II, D.L. Riddle, ed. (Cold Spring Harbor: Cold Spring Harbor Laboratory Press), pp. 117-145.
Boeke, J.D., Garfinkel, D.J., Styles, C.A., and Fink, G.R. (1985). Ty elements transpose through an RNA intermediate. Cell 40, 491-500.
Boivin, A., and Vendrely, R. (1947). Sur le role possible des deux acides nucleiques dans la cellule vivante. Experientia 3, 32-34.
Bonnet, A., Grosso, A.R., Elkaoutari, A., Coleno, E., Presle, A., Sridhara, S.C., Janbon, G., Geli, V., de Almeida, S.F., and Palancade, B. (2017). Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability. Molecular cell 67, 608-621.e606.
Boon, K.-L., Auchynnikava, T., Edwalds-Gilbert, G., Barrass, J.D., Droop, A.P., Dez, C., and Beggs, J.D. (2006). Yeast ntr1/spp382 mediates prp43 function in postspliceosomes. Molecular and cellular biology 26, 6016-6023.
Bornstein, P., McKay, J., Liska, D., Apone, S., and Devarayalu, S. (1988). Interactions between the promoter and first intron are involved in transcriptional control of alpha 1(I) collagen gene expression. Molecular and cellular biology 8, 4851-4857.
Borst, P., and Grivell, L.A. (1981). One gene's intron is another gene's exon. Nature 289, 439.
Botstein, D., and Fink, G.R. (2011). Yeast: an experimental organism for 21st Century biology. Genetics 189, 695-704.
Boutz, P.L., Stoilov, P., Li, Q., Lin, C.-H., Chawla, G., Ostrow, K., Shiue, L., Ares, M., and Black, D.L. (2007). A post-transcriptional regulatory switch in polypyrimidine tract-binding proteins reprograms alternative splicing in developing neurons. Genes & development 21, 1636-1652.
Brack, C., and Tonegawa, S. (1977). Variable and constant parts of the immunoglobulin light chain gene of a mouse myeloma cell are 1250 nontranslated bases apart. Proceedings of the National Academy of Sciences 74, 5652-5656.
Brawerman, G. (1976). Characteristics and significance of the polyadenylate sequence in mammalian messenger RNA. In Progress in nucleic acid research and molecular biology (Elsevier), pp. 117-148.
Breathnach, R., Mandel, J.-L., and Chambon, P. (1977). Ovalbumin gene is split in chicken DNA. Nature 270, 314.

Brenner, S., Jacob, F., and Meselson, M. (1961). An unstable intermediate carrying information from genes to ribosomes for protein synthesis. Nature 190, 576-581.
Bringmann, P., Appel, B., Rinke, J., Reuter, R., Theissen, H., and Lührmann, R. (1984). Evidence for the existence of snRNAs U4 and U6 in a single ribonucleoprotein complex and for their association by intermolecular base pairing. The EMBO journal 3, 1357-1363.
Brody, E., and Abelson, J. (1985). The "spliceosome": yeast pre-messenger RNA associates with a 40S complex in a splicing-dependent reaction. Science 228, 963-967.
Brow, D.A., and Guthrie, C. (1988). Spliceosomal RNA U6 is remarkably conserved from yeast to mammals. Nature 334, 213.
Buchman, A.R., and Berg, P. (1988). Comparison of intron-dependent and intron-independent gene expression. Molecular and cellular biology 8, 4395-4405.
Burge, C.B., Padgett, R.A., and Sharp, P.A. (1998). Evolutionary fates and origins of U12-type introns. Molecular cell 2, 773-785.
Butler, E.T., and Chamberlin, M. (1982). Bacteriophage SP6-specific RNA polymerase. I. Isolation and characterization of the enzyme. Journal of Biological Chemistry 257, 5772-5778.
Cai, X., Hagedorn, C.H., and Cullen, B.R. (2004). Human microRNAs are processed from capped, polyadenylated transcripts that can also function as mRNAs. RNA 10, 1957-1966.
Cavalier-Smith, T. (1978). Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. Journal of cell science 34, 247-278.
Cavalier-Smith, T. (1980). How selfish is DNA? Nature 285, 617.
Cavalier-Smith, T. (1985). Selfish DNA and the origin of introns. Nature 315, 283.
Cavalier-Smith, T. (1987). The origin of eukaryotic and archaebacterial cells. Annals of the New York Academy of Sciences 503, 17.
Cech, T.R. (1986). The generality of self-splicing RNA: relationship to nuclear mRNA splicing. Cell 44, 207-210.
Chabot, B., Black, D.L., LeMaster, D.M., and Steitz, J.A. (1985). The 3' splice site of pre-messenger RNA is recognized by a small nuclear ribonucleoprotein. Science 230, 1344-1349.
Chan, S.-P., and Cheng, S.-C. (2005). The Prp19-associated complex is required for specifying interactions of U5 and U6 with pre-mRNA during spliceosome activation. Journal of Biological Chemistry 280, 31190-31199.
Chan, S.-P., Kao, D.-I., Tsai, W.-Y., and Cheng, S.-C. (2003). The Prp19p-associated complex in spliceosome activation. Science 302, 279-282.
Chapman, K.B., and Boeke, J.D. (1991). Isolation and characterization of the gene encoding yeast debranching enzyme. Cell 65, 483-492.
Cheng, S., and Abelson, J. (1987). Spliceosome assembly in yeast. Genes & development 1, 1014-1027.
Chiang, H.R., Schoenfeld, L.W., Ruby, J.G., Auyeung, V.C., Spies, N., Baek, D., Johnston, W.K., Russ, C., Luo, S., Babiarz, J.E., et al. (2010). Mammalian microRNAs: experimental evaluation of novel and previously annotated genes. Genes & development 24, 992-1009.
Chow, L.T., Gelinas, R.E., Broker, T.R., and Roberts, R.J. (1977). An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA. Cell 12, 1-8.

Chu, G., and Sharp, P.A. (1981). A gene chimaera of SV40 and mouse β-globin is transcribed and properly spliced. Nature 289, 378.
Cliffe, A.R., Garber, D.A., and Knipe, D.M. (2009). Transcription of the herpes simplex virus latency-associated transcript promotes the formation of facultative heterochromatin on lytic promoters. Journal of virology 83, 8182-8190.
Coghlan, A., and Wolfe, K.H. (2004). Origins of recently gained introns in Caenorhabditis. Proceedings of the National Academy of Sciences of the United States of America 101, 11362-11367.
Crick, F.H. (1958). On protein synthesis. Paper presented at: Symp Soc Exp Biol.
Csuros, M., Rogozin, I.B., and Koonin, E.V. (2011). A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes. PLoS computational biology 7, e1002150.
Darnell, J.E. (1978). Implications of RNA-RNA splicing in evolution of eukaryotic cells. Science 202, 1257-1260.
Darnell, J.E., Jelinek, W.R., and Molloy, G.R. (1973). Biogenesis of mRNA: genetic regulation in mammalian cells. Science 181, 1215-1221.
Davis, C.A., Grate, L., Spingola, M., and Ares Jr, M. (2000). Test of intron predictions reveals novel splice sites, alternatively spliced mRNAs and new introns in meiotically regulated genes of yeast. Nucleic acids research 28, 1700-1706.
Demir, E., and Dickson, B.J. (2005). fruitless splicing specifies male courtship behavior in Drosophila. Cell 121, 785-794.
Derr, L.K. (1998). The involvement of cellular recombination and repair genes in RNA-mediated recombination in Saccharomyces cerevisiae. Genetics 148, 937-945.
Derr, L.K., Strathern, J.N., and Garfinkel, D.J. (1991). RNA-mediated recombination in S. cerevisiae. Cell 67, 355-364.
Dickinson, M.E., Flenniken, A.M., Ji, X., Teboul, L., Wong, M.D., White, J.K., Meehan, T.F., Weninger, W.J., Westerberg, H., and Adissu, H. (2016). High-throughput discovery of novel developmental phenotypes. Nature 537, 508.
Dominski, Z., and Marzluff, W.F. (2007). Formation of the 3' end of histone mRNA: getting closer to the end. Gene 396, 373-390.
Doolittle, W.F. (1978). Genes in pieces: were they ever together? Nature 272, 581.
Doolittle, W.F. (1987). The origin and function of intervening sequences in DNA: a review. The American Naturalist 130, 915-928.
Doolittle, W.F. (2014). The trouble with (group II) introns. Proceedings of the National Academy of Sciences 111, 6536-6537.
Early, P., Rogers, J., Davis, M., Calame, K., Bond, M., Wall, R., and Hood, L. (1980). Two mRNAs can be produced from a single immunoglobulin μ gene by alternative RNA processing pathways. Cell 20, 313-319.
Fabrizio, P., Pozza, F., Pletcher, S.D., Gendron, C.M., and Longo, V.D. (2001). Regulation of longevity and stress resistance by Sch9 in yeast. Science 292, 288-290.
Fairbrother, W.G., and Chasin, L.A. (2000). Human genomic sequences that inhibit splicing. Molecular and cellular biology 20, 6816-6825.
Fairbrother, W.G., Yeh, R.-F., Sharp, P.A., and Burge, C.B. (2002). Predictive identification of exonic splicing enhancers in human genes. Science 297, 1007-1013.
Fedorov, A., Roy, S., Fedorova, L., and Gilbert, W. (2003). Mystery of intron gain. Genome research 13, 2236-2241.

Fica, S.M., Oubridge, C., Galej, W.P., Wilkinson, M.E., Bai, X.-C., Newman, A.J., and Nagai, K. (2017). Structure of a spliceosome remodelled for exon ligation. Nature 542, 377.
Fica, S.M., Tuttle, N., Novak, T., Li, N.-S., Lu, J., Koodathingal, P., Dai, Q., Staley, J.P., and Piccirilli, J.A. (2013). RNA catalyses nuclear pre-mRNA splicing. Nature 503, 229.
Fink, G.R. (1987). Pseudogenes in yeast? Cell 49, 5-6.
Forsburg, S.L., and Nurse, P. (1991). Cell cycle regulation in the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe. Annual review of cell biology 7, 227-256.
Fourmann, J.B., Schmitzova, J., Christian, H., Urlaub, H., Ficner, R., Boon, K.L., Fabrizio, P., and Lührmann, R. (2013). Dissection of the factor requirements for spliceosome disassembly and the elucidation of its dissociation products using a purified splicing system. Genes & development 27, 413-428.
Fox-Walsh, K.L., Dou, Y., Lam, B.J., Hung, S.-p., Baldi, P.F., and Hertel, K.J. (2005). The architecture of pre-mRNAs affects mechanisms of splice-site pairing. Proceedings of the National Academy of Sciences of the United States of America 102, 16176-16181.
Fraser, N.W., Block, T.M., and Spivack, J.G. (1992). The latency-associated transcripts of herpes simplex virus: RNA in search of function. Virology 191, 1-8.
Frendewey, D., and Keller, W. (1985). Stepwise assembly of a pre-mRNA splicing complex requires U-snRNPs and specific intron sequences. Cell 42, 355-367.
Friedmann, H.C. (2004). From butyribacterium to E. coli: an essay on unity in biochemistry. Perspectives in biology and medicine 47, 47-66.
Fuge, E.K., Braun, E.L., and Werner-Washburne, M. (1994). Protein synthesis in long-term stationary-phase cultures of Saccharomyces cerevisiae. Journal of bacteriology 176, 5802-5813.
Furger, A., O'Sullivan, J.M., Binnie, A., Lee, B.A., and Proudfoot, N.J. (2002). Promoter proximal splice sites enhance transcription. Genes & development 16, 2792-2799.
Galej, W.P., Wilkinson, M.E., Fica, S.M., Oubridge, C., Newman, A.J., and Nagai, K. (2016). Cryo-EM structure of the spliceosome immediately after branching. Nature 537, 197.
Garber, R., Kuroiwa, A., and Gehring, W.J. (1983). Genomic and cDNA clones of the homeotic locus Antennapedia in Drosophila. The EMBO journal 2, 2027-2036.
Garfinkel, D.J., Boeke, J.D., and Fink, G.R. (1985). Ty element transposition: reverse transcriptase and virus-like particles. Cell 42, 507-517.
Gierer, A. (1963). Function of aggregated reticulocyte ribosomes in protein synthesis. Journal of molecular biology 6, 148-IN 148.
Gilbert, W. (1978). Why genes in pieces? Nature 271, 501.
Gilbert, W. (1987). The exon theory of genes. Paper presented at: Cold Spring Harbor symposia on quantitative biology (Cold Spring Harbor Laboratory Press).
Gilmore-Hebert, M., and Wall, R. (1978). Immunoglobulin light chain mRNA is processed from large nuclear RNA. Proceedings of the National Academy of Sciences 75, 342-345.
Gordon, P.M., Sontheimer, E.J., and Piccirilli, J.A. (2000). Metal ion catalysis during the exon-ligation step of nuclear pre-mRNA splicing: extending the parallels between the spliceosome and group II introns. Rna 6, 199-205.
Gottschling, D.E., and Nystrom, T. (2017). The Upsides and Downsides of Organelle Interconnectivity. Cell 169, 24-34.
Grabowski, P.J., Padgett, R.A., and Sharp, P.A. (1984). Messenger RNA splicing in vitro: an excised intervening sequence and a potential intermediate. Cell 37, 415-427.

Grabowski, P.J., Seiler, S.R., and Sharp, P.A. (1985). A multicomponent complex is involved in the splicing of messenger RNA precursors. Cell 42, 345-353.
Grabowski, P.J., and Sharp, P.A. (1986). Affinity chromatography of splicing complexes: U2, U5, and U4 + U6 small nuclear ribonucleoprotein particles in the spliceosome. Science 233, 1294-1299.
Granot, D., and Snyder, M. (1993). Carbon source induces growth of stationary phase yeast cells, independent of carbon source metabolism. Yeast 9, 465-479.
Gray, J.V., Petsko, G.A., Johnston, G.C., Ringe, D., Singer, R.A., and Werner-Washburne, M. (2004). "Sleeping beauty": quiescence in Saccharomyces cerevisiae. Microbiol Mol Biol Rev 68, 187-206.
Green, M.R., Maniatis, T., and Melton, D. (1983). Human β-globin pre-mRNA synthesized in vitro is accurately spliced in Xenopus oocyte nuclei. Cell 32, 681-694.
Gros, F., Hiatt, H., Gilbert, W., Kurland, C.G., Risebrough, R., and Watson, J.D. (1961). Unstable ribonucleic acid revealed by pulse labelling of Escherichia coli. Nature 190, 581.
Guarente, L. (2008). Mitochondria-a nexus for aging, calorie restriction, and sirtuins? Cell 132, 171-176.
Guthrie, C. (2010). From the ribosome to the spliceosome and back again. Journal of Biological Chemistry 285, 1-12.
Hahn, D., and Beggs, J.D. (2010). Brr2p RNA helicase with a split personality: insights into structure and function (Portland Press Limited).
Hanscho, M., Ruckerbauer, D.E., Chauhan, N., Hofbauer, H.F., Krahulec, S., Nidetzky, B., Kohlwein, S.D., Zanghellini, J., and Natter, K. (2012). Nutritional requirements of the BY series of Saccharomyces cerevisiae strains for optimum growth. FEMS Yeast Res 12, 796-808.
Hardy, S.F., Grabowski, P.J., Padgett, R.A., and Sharp, P.A. (1984). Cofactor requirements of splicing of purified messenger RNA precursors. Nature 308, 375.
Haselbach, D., Komarov, I., Agafonov, D.E., Hartmuth, K., Graf, B., Dybkov, O., Urlaub, H., Kastner, B., Lührmann, R., and Stark, H. (2018). Structure and Conformational Dynamics of the Human Spliceosomal Bact Complex. Cell.
Hashimoto, C., and Steitz, J.A. (1984). U4 and U6 RNAs coexist in a single small nuclear ribonucleoprotein particle. Nucleic acids research 12, 3283-3293.
Hastings, K.E. (2005). SL trans-splicing: easy come or easy go? Trends in genetics 21, 240-247.
Hawkin, J.D. (1988). A survey on intron and exon lengths. Nucleic acids research 16, 9893-9908.
Hellung-Larsen, P., and Frederiksen, S. (1977). Occurrence and properties of low molecular weight RNA components from cells at different taxonomic levels. Comparative Biochemistry and Physiology Part B: Comparative Biochemistry 58, 273-281.
Hernandez, N., and Keller, W. (1983). Splicing of in vitro synthesized messenger RNA precursors in HeLa cell extracts. Cell 35, 89-99.
Hickey, D.A. (1982). Selfish DNA: a sexually-transmitted nuclear parasite. Genetics 101, 519-531.
Hodnett, J.L., and Busch, H. (1968). Isolation and characterization of uridylic acid-rich 7 S ribonucleic acid of rat liver nuclei. Journal of Biological Chemistry 243, 6334-6342.

Hoffman, B.E., and Grabowski, P.J. (1992). U1 snRNP targets an essential splicing factor, U2AF65, to the 3' splice site by a network of interactions spanning the exon. Genes & development 6, 2554-2568.
Hooks, K.B., Naseeb, S., Parker, S., Griffiths-Jones, S., and Delneri, D. (2016). Novel intronic RNA structures contribute to maintenance of phenotype in Saccharomyces cerevisiae. Genetics 203, 1469-1481.
Hossain, M.A., Claggett, J.M., Edwards, S.R., Shi, A., Pennebaker, S.L., Cheng, M.Y., Hasty, J., and Johnson, T.L. (2016). Posttranscriptional regulation of Gcr1 expression and activity is crucial for metabolic adjustment in response to glucose availability. Molecular cell 62, 346-358.
Hossain, M.A., Rodriguez, C.M., and Johnson, T.L. (2011). Key features of the two-intron Saccharomyces cerevisiae gene SUS1 contribute to its alternative splicing. Nucleic acids research 39, 8612-8627.
Huff, J.T., Zilberman, D., and Roy, S.W. (2016). Mechanism for DNA transposons to generate introns on genomic scales. Nature 538, 533.
Hughes, A.L., and Gottschling, D.E. (2012). An early age increase in vacuolar pH limits mitochondrial function and lifespan in yeast. Nature 492, 261.
Hurwitz, J., Bresler, A., and Diringer, R. (1960). The enzymic incorporation of ribonucleotides into polyribonucleotides and the effect of DNA. Biochemical and biophysical research communications 3, 15-19.
Irimia, M., and Roy, S.W. (2014). Origin of spliceosomal introns and alternative splicing. Cold Spring Harbor perspectives in biology 6.
James, S.-A., Turner, W., and Schwer, B. (2002). How Slu7 and Prp18 cooperate in the second step of yeast pre-mRNA splicing. Rna 8, 1068-1077.
Jeffreys, A.J., and Flavell, R. (1977). The rabbit β-globin gene contains a large insert in the coding sequence. Cell 12, 1097-1108.
Ju, Q., and Warner, J.R. (1994). Ribosome synthesis during the growth cycle of Saccharomyces cerevisiae. Yeast 10, 151-157.
Juneau, K., Nislow, C., and Davis, R.W. (2009). Alternative splicing of PTC7 in Saccharomyces cerevisiae determines protein localization. Genetics 183, 185-194.
Juneau, K., Palm, C., Miranda, M., and Davis, R.W. (2007). High-density yeast-tiling array reveals previously undiscovered introns and extensive regulation of meiotic splicing. Proceedings of the National Academy of Sciences of the United States of America 104, 1522-1527.
Kassavetis, G.A., Butler, E., Roulland, D., and Chamberlin, M. (1982). Bacteriophage SP6-specific RNA polymerase. II. Mapping of SP6 DNA and selective in vitro transcription. Journal of Biological Chemistry 257, 5779-5788.
Kennedy, B., Steffen, K., and Kaeberlein, M. (2007). Ruminations on dietary restriction and aging. Cellular and molecular life sciences 64, 1323-1328.
Khalid, M.F., Damha, M.J., Shuman, S., and Schwer, B. (2005). Structure-function analysis of yeast RNA debranching enzyme (Dbr1), a manganese-dependent phosphodiesterase. Nucleic acids research 33, 6349-6360.
Kluyver, A.J., and Donker, H.J. (1926). Die einheit in der biochemie (Borntraeger).
Konarska, M.M., and Sharp, P.A. (1987). Interactions between small nuclear ribonucleoprotein particles in formation of spliceosomes. Cell 49, 763-774.

Koodathingal, P., Novak, T., Piccirilli, J.A., and Staley, J.P. (2010). The DEAH box ATPases Prp16 and Prp43 cooperate to proofread 5' splice site cleavage during pre-mRNA splicing. Molecular cell 39, 385-395.
Koonin, E.V. (2006). The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate? Biol Direct 1, 22.
Krainer, A.R., Maniatis, T., Ruskin, B., and Green, M.R. (1984). Normal and mutant human β-globin pre-mRNAs are faithfully and efficiently spliced in vitro. Cell 36, 993-1005.
Krämer, A., Keller, W., Appel, B., and Lührmann, R. (1984). The 5' terminus of the RNA moiety of U1 small nuclear ribonucleoprotein particles is required for the splicing of messenger RNA precursors. Cell 38, 299-307.
Kretzner, L., Rymond, B.C., and Rosbash, M. (1987). S. cerevisiae U1 RNA is large and has limited primary sequence homology to metazoan U1 snRNA. Cell 50, 593-602.
Laggerbauer, B., Achsel, T., and Lührmann, R. (1998). The human U5-200kD DEXH-box protein unwinds U4/U6 RNA duplices in vitro. Proceedings of the National Academy of Sciences 95, 4188-4192.
Lamond, A.I., Konarska, M.M., Grabowski, P.J., and Sharp, P.A. (1988). Spliceosome assembly involves the binding and release of U4 small nuclear ribonucleoprotein. Proceedings of the National Academy of Sciences 85, 411-415.
Langford, C.J., and Gallwitz, D. (1983). Evidence for an intron-contained sequence required for the splicing of yeast RNA polymerase II transcripts. Cell 33, 519-527.
Langford, C.J., Klinz, F.-J., Donath, C., and Gallwitz, D. (1984). Point mutations identify the conserved, intron-contained TACTAAC box as an essential splicing signal sequence in yeast. Cell 36, 645-653.
Le Hir, H., Izaurralde, E., Maquat, L.E., and Moore, M.J. (2000). The spliceosome deposits multiple proteins 20-24 nucleotides upstream of mRNA exon-exon junctions. The EMBO journal 19, 6860-6869.
Lee, S., and Stevens, S.W. (2016). Spliceosomal intronogenesis. Proceedings of the National Academy of Sciences 113, 6514-6519.
Lee, Y., Ahn, C., Han, J., Choi, H., Kim, J., Yim, J., Lee, J., Provost, P., Radmark, O., and Kim, S. (2003). The nuclear RNase III Drosha initiates microRNA processing. Nature 425, 415.
Lee, Y., Jeon, K., Lee, J.T., Kim, S., and Kim, V.N. (2002). MicroRNA maturation: stepwise processing and subcellular localization. The EMBO journal 21, 4663-4670.
Lee, Y., Kim, M., Han, J., Yeom, K.H., Lee, S., Baek, S.H., and Kim, V.N. (2004). MicroRNA genes are transcribed by RNA polymerase II. The EMBO journal 23, 4051-4060.
Lerner, M.R., Boyle, J.A., Mount, S.M., Wolin, S.L., and Steitz, J.A. (1980). Are snRNPs involved in splicing? Nature 283, 220.
Lerner, M.R., and Steitz, J.A. (1979). Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus erythematosus. Proceedings of the National Academy of Sciences 76, 5495-5499.
Li, X., Liu, S., Jiang, J., Zhang, L., Espinosa, S., Hill, R.C., Hansen, K.C., Zhou, Z.H., and Zhao, R. (2017). CryoEM structure of Saccharomyces cerevisiae U1 snRNP offers insight into alternative splicing. Nature Communications 8, 1035.
Lillie, S.H., and Pringle, J.R. (1980). Reserve carbohydrate metabolism in Saccharomyces cerevisiae: responses to nutrient limitation. Journal of bacteriology 143, 1384-1394.

Lin, R., Newman, A., Cheng, S.-C., and Abelson, J. (1985). Yeast mRNA splicing in vitro. Journal of Biological Chemistry 260, 14780-14792.
Liu, S., Li, X., Zhang, L., Jiang, J., Hill, R.C., Cui, Y., Hansen, K.C., Zhou, Z.H., and Zhao, R. (2017). Structure of the yeast spliceosomal postcatalytic P complex. Science, eaar3462.
Longo, V.D., Gralla, E.B., and Valentine, J.S. (1996). Superoxide dismutase activity is essential for stationary phase survival in Saccharomyces cerevisiae: Mitochondrial production of toxic oxygen species in vivo. Journal of Biological Chemistry 271, 12275-12280.
Longo, V.D., Shadel, G.S., Kaeberlein, M., and Kennedy, B. (2012). Replicative and chronological aging in Saccharomyces cerevisiae. Cell Metab 16, 18-31.
Madhani, H.D. (2013). The frustrated gene: origins of eukaryotic gene expression. Cell 155, 744-749.
Madhani, H.D., and Guthrie, C. (1992). A novel base-pairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome. Cell 71, 803-817.
Makeyev, E.V., Zhang, J., Carrasco, M.A., and Maniatis, T. (2007). The MicroRNA miR-124 promotes neuronal differentiation by triggering brain-specific alternative pre-mRNA splicing. Molecular cell 27, 435-448.
Martin, A., Schneider, S., and Schwer, B. (2002). Prp43 is an essential RNA-dependent ATPase required for release of lariat-intron from the spliceosome. Journal of Biological Chemistry 277, 17743-17750.
Mattioli, M., and Reichlin, M. (1971). Characterization of a soluble nuclear ribonucleoprotein antigen reactive with SLE sera. The Journal of Immunology 107, 1281-1290.
Mayas, R.M., Maita, H., Semlow, D.R., and Staley, J.P. (2010). Spliceosome discards intermediates via the DEAH box ATPase Prp43p. Proceedings of the National Academy of Sciences 107, 10020-10025.
Mazia, D. (1956). Nuclear products and nuclear reproduction. In Enzymes: Units of biological structure and function, O.H. Gaebler, ed. (New York: Academic Press Inc.), pp. 261-278.
Melton, D., Krieg, P., Rebagliati, M., Maniatis, T., Zinn, K., and Green, M. (1984). Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing a bacteriophage SP6 promoter. Nucleic acids research 12, 7035-7056.
Mewes, H., Albermann, K., Bähr, M., Frishman, D., Gleissner, A., Hani, J., Heumann, K., Kleine, K., Maierl, A., and Oliver, S. (1997). Overview of the yeast genome. Nature 387, 7-8.
Michael, D., and Manyuan, L. (1999). Intron-exon structures of eukaryotic model organisms. Nucleic acids research 27, 3219-3228.
Michel, F., Umesono, K., and Ozeki, H. (1990). Comparative and functional anatomy of group II catalytic introns-a review. In RNA: Catalysis, Splicing, Evolution (Elsevier), pp. 5-30.
Monod, J. (1949). The growth of bacterial cultures. Annual Reviews in Microbiology 3, 371-394.
Monod, J., and Jacob, F. (1961). General conclusions: teleonomic mechanisms in cellular metabolism, growth, and differentiation. Paper presented at: Cold Spring Harbor symposia on quantitative biology (Cold Spring Harbor Laboratory Press).
Moore, M.J. (2005). From birth to death: the complex lives of eukaryotic mRNAs. Science 309, 1514-1518.
Moore, M.J., and Proudfoot, N.J. (2009). Pre-mRNA processing reaches back to transcription and ahead to translation. Cell 136, 688-700.

Moore, M.J., Query, C.C., and Sharp, P.A. (1993). Splicing of precursors to mRNAs by the spliceosome. Cold Spring Harbor Monograph Series 24, 303-303.
Moore, M.J., and Sharp, P.A. (1993). Evidence for two active sites in the spliceosome provided by stereochemistry of pre-mRNA splicing. Nature 365, 364.
Moss, W.N., and Steitz, J.A. (2013). Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA. BMC genomics 14, 543.
Mount, S.M., and Steitz, J.A. (1981). Sequence of U1 RNA from Drosophila melanogaster: implications for U1 secondary structure and possible involvement in splicing. Nucleic acids research 9, 6351-6368.
Mourier, T., and Jeffares, D.C. (2003). Eukaryotic intron loss. Science 300, 1393-1393.
Muramatsu, M., and Busch, H. (1965). Studies on the nuclear and nucleolar ribonucleic acid of regenerating rat liver. Journal of Biological Chemistry 240, 3960-3966.
Nakai, K., and Sakamoto, H. (1994). Construction of a novel database containing aberrant splicing mutations of mammalian genes. Gene 141, 171-177.
Nam, K., Lee, G., Trambley, J., Devine, S.E., and Boeke, J.D. (1997). Severe growth defect in a Schizosaccharomyces pombe mutant defective in intron lariat degradation. Molecular and cellular biology 17, 809-818.
Newman, A., and Norman, C. (1992). U5 snRNA interacts with exon sequences at 5' and 3' splice sites. Cell 68, 743-754.
Ng, R., Domdey, H., Larson, G., Rossi, J., and Abelson, J. (1985). A test for intron function in the yeast actin gene. Nature 314, 183-184.
Nguyen, T.H.D., Galej, W.P., Bai, X.-c., Oubridge, C., Newman, A.J., Scheres, S.H., and Nagai, K. (2016). Cryo-EM structure of the yeast U4/U6.U5 tri-snRNP at 3.7 Å resolution. Nature 530, 298.
Nguyen, T.H.D., Galej, W.P., Bai, X.-c., Savva, C.G., Newman, A.J., Scheres, S.H., and Nagai, K. (2015). The architecture of the spliceosomal U4/U6.U5 tri-snRNP. Nature 523, 47.
Northway, J., and Tan, E. (1972). Differentiation of antinuclear antibodies giving speckled staining patterns in immunofluorescence. Clinical Immunology and Immunopathology 1, 140-154.
O'Day, C.L., Dalbadie-McFarland, G., and Abelson, J. (1996). The Saccharomyces cerevisiae Prp5 protein has RNA-dependent ATPase activity with specificity for U2 small nuclear RNA. Journal of Biological Chemistry 271, 33261-33267.
Ohi, M.D., and Gould, K.L. (2002). Characterization of interactions among the Cef1p-Prp19p-associated splicing complex. Rna 8, 798-815.
Padgett, R.A., Konarska, M.M., Grabowski, P.J., Hardy, S.F., and Sharp, P.A. (1984). Lariat RNA's as intermediates and products in the splicing of messenger RNA precursors. Science 225, 898-904.
Padgett, R.A., Mount, S.M., Steitz, J.A., and Sharp, P.A. (1983). Splicing of messenger RNA precursors is inhibited by antisera to small nuclear ribonucleoprotein. Cell 35, 101-107.
Palmiter, R.D., Sandgren, E.P., Avarbock, M.R., Allen, D.D., and Brinster, R.L. (1991). Heterologous introns can enhance expression of transgenes in mice. Proceedings of the National Academy of Sciences 88, 478-482.
Pan, Q., Shai, O., Lee, L.J., Frey, B.J., and Blencowe, B.J. (2008). Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nature genetics 40, 1413.

Pandit, S., Lynn, B., and Rymond, B.C. (2006). Inhibition of a spliceosome turnover pathway suppresses splicing defects. Proceedings of the National Academy of Sciences 103, 13700-13705.
Parenteau, J., Durand, M., Morin, G., Gagnon, J., Lucier, J.F., Wellinger, R.J., Chabot, B., and Elela, S.A. (2011). Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell 147, 320-331.
Parenteau, J., Durand, M., Veronneau, S., Lacombe, A.-A., Morin, G., Guerin, V., Cecez, B., Gervais-Bird, J., Koh, C.-S., and Brunelle, D. (2008). Deletion of many yeast introns reveals a minority of genes that require splicing for function. Molecular biology of the cell 19, 1932-1941.
Parker, R., and Guthrie, C. (1985). A point mutation in the conserved hexanucleotide at a yeast 5' splice junction uncouples recognition, cleavage, and ligation. Cell 41, 107-118.
Parker, R., Siliciano, P.G., and Guthrie, C. (1987). Recognition of the TACTAAC box during mRNA splicing in yeast involves base pairing to the U2-like snRNA. Cell 49, 229-239.
Patterson, B., and Guthrie, C. (1987). An essential yeast snRNA with a U5-like domain is required for splicing in vivo. Cell 49, 613-624.
Patterson, B., and Guthrie, C. (1991). A U-rich tract enhances usage of an alternative 3' splice site in yeast. Cell 64, 181-187.
Petfalski, E., Dandekar, T., Henry, Y., and Tollervey, D. (1998). Processing of the precursors to small nucleolar RNAs and rRNAs requires common components. Molecular and cellular biology 18, 1181-1189.
Pikielny, C.W., Teem, J.L., and Rosbash, M. (1983). Evidence for the biochemical role of an internal sequence in yeast nuclear mRNA introns: implications for U1 RNA and metazoan mRNA splicing. Cell 34, 395-403.
Plaschka, C., Lin, P.-C., and Nagai, K. (2017). Structure of a pre-catalytic spliceosome. Nature.
Powers, R.W., Kaeberlein, M., Caldwell, S.D., Kennedy, B.K., and Fields, S. (2006). Extension of chronological life span in yeast by decreased TOR pathway signaling. Genes & development 20, 174-184.
Prokesch, S. (1991). Small British brewers make a dent. New York Times November 28.
Proudfoot, N.J. (2011). Ending the message: poly (A) signals then and now. Genes & development 25, 1770-1782.
Proudfoot, N.J., Furger, A., and Dye, M.J. (2002). Integrating mRNA processing with transcription. Cell 108, 501-512.
Qian, L., Vu, M.N., Carter, M., and Wilkinson, M.F. (1992). A spliced intron accumulates as a lariat in the nucleus of T cells. Nucleic acids research 20, 5345-5350.
Qu, L.-H., Henry, Y., Nicoloso, M., Michot, B., Azum, M.-C., Renalier, M.-H., Caizergues-Ferrer, M., and Bachellerie, J.-P. (1995). U24, a novel intron-encoded small nucleolar RNA with two 12 nt long, phylogenetically conserved complementarities to 28S rRNA. Nucleic acids research 23, 2669-2676.
Raghunathan, P.L., and Guthrie, C. (1998). RNA unwinding in U4/U6 snRNPs requires ATP hydrolysis and the DEIH-box splicing factor Brr2. Current Biology 8, 847-855.
Rauhut, R., Fabrizio, P., Dybkov, O., Hartmuth, K., Pena, V., Chari, A., Kumar, V., Lee, C.-T., Urlaub, H., and Kastner, B. (2016). Molecular architecture of the Saccharomyces cerevisiae activated spliceosome. Science, aag1906.

Rigo, F., and Martinson, H.G. (2008). Functional coupling of last-intron splicing and 3'-end processing to transcription in vitro: the poly (A) signal couples to splicing before committing to cleavage. Molecular and cellular biology 28, 849-862.
Rinke, J., Appel, B., Blöcker, H., Frank, R., and Lührmann, R. (1984). The 5'-terminal sequence of U1 RNA complementary to the consensus 5' splice site of hnRNA is single-stranded in intact U1 snRNP particles. Nucleic acids research 12, 4111-4126.
Ro-Choi, T.S., and Busch, H. (1974). Low-molecular-weight nuclear RNA's. The cell nucleus 3, 151-208.
Robberson, B.L., Cote, G.J., and Berget, S.M. (1990). Exon definition may facilitate splice site selection in RNAs with multiple exons. Molecular and cellular biology 10, 84-94.
Rogers, J., and Wall, R. (1980). A mechanism for RNA splicing. Proceedings of the National Academy of Sciences 77, 1877-1879.
Rubner, M. (1909). Kraft und stoff im haushalte der natur (Akademische verlagsgesellschaft mbh).
Ruby, J.G., Jan, C.H., and Bartel, D.P. (2007). Intronic microRNA precursors that bypass Drosha processing. Nature 448, 83.
Ruby, S.W., and Abelson, J. (1988). An early hierarchic role of U1 small nuclear ribonucleoprotein in spliceosome assembly. Science 242, 1028-1035.
Ruby, S.W., and Abelson, J. (1991). Pre-mRNA splicing in yeast. Trends in Genetics 7, 79-85.
Ruskin, B., and Green, M.R. (1985a). An RNA processing activity that debranches RNA lariats. Science 229, 135-140.
Ruskin, B., and Green, M.R. (1985b). Specific and stable intron-factor interactions are established early during in vitro pre-mRNA splicing. Cell 43, 131-142.
Ruskin, B., Krainer, A.R., Maniatis, T., and Green, M.R. (1984). Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro. Cell 38, 317-331.
Ruskin, B., Zamore, P.D., and Green, M.R. (1988). A factor, U2AF, is required for U2 snRNP binding and splicing complex assembly. Cell 52, 207-219.
Sánchez, L. (2004). Sex-determining mechanisms in insects. International journal of developmental biology 52, 837-856.
Schaal, T.D., and Maniatis, T. (1999). Multiple distinct splicing enhancers in the protein-coding sequences of a constitutively spliced pre-mRNA. Molecular and cellular biology 19, 261-273.
Schwer, B. (2008). A conformational rearrangement in the spliceosome sets the stage for Prp22-dependent mRNA release. Molecular cell 30, 743-754.
Schwer, B., and Gross, C.H. (1998). Prp22, a DExH-box RNA helicase, plays two distinct roles in yeast pre-mRNA splicing. The EMBO journal 17, 2086-2094.
Schwer, B., and Guthrie, C. (1992). A conformational rearrangement in the spliceosome is dependent on PRP16 and ATP hydrolysis. The EMBO journal 11, 5033-5039.
Semlow, D.R., and Staley, J.P. (2012). Staying on message: ensuring fidelity in pre-mRNA splicing. Trends in biochemical sciences 37, 263-273.
Seo, A.Y., Joseph, A.-M., Dutta, D., Hwang, J.C., Aris, J.P., and Leeuwenburgh, C. (2010). New insights into the role of mitochondria in aging: mitochondrial dynamics and more. Journal of cell science 123, 2533-2542.
Seraphin, B., Kretzner, L., and Rosbash, M. (1988). A U1 snRNA:pre-mRNA base pairing interaction is required early in yeast spliceosome assembly but does not uniquely define the 5' cleavage site. The EMBO journal 7, 2533-2538.

Seraphin, B., and Rosbash, M. (1989). Identification of functional U1 snRNA-pre-mRNA complexes committed to spliceosome assembly and splicing. Cell 59, 349-358.
Sharma, S., Kohlstaedt, L.A., Damianov, A., Rio, D.C., and Black, D.L. (2008). Polypyrimidine tract binding protein controls the transition from exon definition to an intron defined spliceosome. Nature Structural and Molecular Biology 15, 183.
Sharp, P. (1991). Five easy pieces. Science 254, 663-663.
Sharp, P.A. (1985). On the origin of RNA splicing and introns. Cell 42, 397-400.
Sharp, P.A., Konarska, M., Grabowski, P., Lamond, A., Marciniak, R., and Seiler, S. (1987). Splicing of messenger RNA precursors. Paper presented at: Cold Spring Harbor symposia on quantitative biology (Cold Spring Harbor Laboratory Press).
Shatkin, A. (1976). Capping of eucaryotic mRNAs. Cell 9, 645-653.
Shuster, E., and Guthrie, C. (1990). Human U2 snRNA can function in pre-mRNA splicing in yeast. Nature 345, 270.
Siebel, C., Fresco, L., and Rio, D. (1992). The mechanism of somatic inhibition of Drosophila P-element pre-mRNA splicing: multiprotein complexes at an exon pseudo-5' splice site control U1 snRNP binding. Genes & development 6, 1386-1401.
Siliciano, P.G., and Guthrie, C. (1988). 5' splice site selection in yeast: genetic alterations in base-pairing with U1 reveal additional requirements. Genes & development 2, 1258-1267.
Siliciano, P.G., Jones, M.H., and Guthrie, C. (1987). Saccharomyces cerevisiae has a U1-like small nuclear RNA with unexpected properties. Science 237, 1484-1487.
Small, E.C., Leggett, S.R., Winans, A.A., and Staley, J.P. (2006). The EF-G-like GTPase Snu114p regulates spliceosome dynamics mediated by Brr2p, a DExD/H box ATPase. Molecular cell 23, 389-399.
Sontheimer, E.J., and Steitz, J.A. (1993). The U5 and U6 small nuclear RNAs as active site components of the spliceosome. Science 262, 1989-1996.
Sontheimer, E.J., Sun, S., and Piccirilli, J.A. (1997). Metal ion catalysis during splicing of premessenger RNA. Nature 388, 801.
Spingola, M., Grate, L., Haussler, D., and Ares Jr, M. (1999). Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. Rna 5, 221-234.
Staley, J.P., and Guthrie, C. (1999). An RNA switch at the 5' splice site requires ATP and the DEAD box protein Prp28p. Molecular cell 3, 55-64.
Sterner, D.A., Carlo, T., and Berget, S.M. (1996). Architectural limits on split genes. Proceedings of the National Academy of Sciences 93, 15081-15085.
Stevens, A. (1960). Incorporation of the adenine ribonucleotide into RNA by cell fractions from E. coli B. Biochemical and biophysical research communications 3, 92-96.
Stevens, S.W., and Abelson, J. (1999). Purification of the yeast U4/U6.U5 small nuclear ribonucleoprotein particle and identification of its proteins. Proceedings of the National Academy of Sciences 96, 7226-7231.
Sun, H., and Chasin, L.A. (2000). Multiple splicing defects in an intronic false exon. Molecular and cellular biology 20, 6414-6425.
Sun, J.-S., and Manley, J.L. (1995). A novel U2-U6 snRNA structure is necessary for mammalian mRNA splicing. Genes & development 9, 843-854.
Tay, M.L.-I., and Pek, J.W. (2017). Maternally inherited stable intronic sequence RNA triggers a self-reinforcing loop during development. Current Biology 27, 1062-1067.

Teem, J.L., Abovich, N., Kaufer, N.F., Schwindinger, W.F., Warner, J.R., Levy, A., Woolford, J., Leer, R., Raamsdonk-Duin, M.v., and Mager, W. (1984). A comparison of yeast ribosomal protein gene DNA sequences. Nucleic acids research 12, 8295-8312.
Thompson-Jäger, S., and Domdey, H. (1987). Yeast pre-mRNA splicing requires a minimum distance between the 5' splice site and the internal branch acceptor site. Molecular and cellular biology 7, 4010-4016.
Tian, B., and Manley, J.L. (2013). Alternative cleavage and polyadenylation: the long and short of it. Trends in biochemical sciences 38, 312-320.
Tilghman, S.M., Tiemeier, D.C., Seidman, J., Peterlin, B.M., Sullivan, M., Maizel, J.V., and Leder, P. (1978). Intervening sequence of DNA identified in the structural portion of a mouse beta-globin gene. Proceedings of the National Academy of Sciences 75, 725-729.
Tollervey, D., Wise, J.A., and Guthrie, C. (1983). A U4-like small nuclear RNA is dispensable in yeast. Cell 35, 753-762.
Tonegawa, S., Maxam, A.M., Tizard, R., Bernard, O., and Gilbert, W. (1978). Sequence of a mouse germ-line gene for a variable region of an immunoglobulin light chain. Proceedings of the National Academy of Sciences 75, 1485-1489.
Tsai, R.-T., Fu, R.-H., Yeh, F.-L., Tseng, C.-K., Lin, Y.-C., Huang, Y.-h., and Cheng, S.-C. (2005). Spliceosome disassembly catalyzed by Prp43 and its associated components Ntr1 and Ntr2. Genes & development 19, 2991-3003.
Tsai, R.-T., Tseng, C.-K., Lee, P.-J., Chen, H.-C., Fu, R.-H., Chang, K.-j., Yeh, F.-L., and Cheng, S.-C. (2007). Dynamic interactions of Ntr1-Ntr2 with Prp43 and with U5 govern the recruitment of Prp43 to mediate spliceosome disassembly. Molecular and cellular biology 27, 8027-8037.
Tseng, C.-K., and Cheng, S.-C. (2008). Both catalytic steps of nuclear pre-mRNA splicing are reversible. Science 320, 1782-1784.
Tseng, C.-K., Liu, H.-L., and Cheng, S.-C. (2011). DEAH-box ATPase Prp16 has dual roles in remodeling of the spliceosome in catalytic steps. Rna 17, 145-154.
Valcárcel, J., Singh, R., Zamore, P.D., and Green, M.R. (1993). The protein Sex-lethal antagonizes the splicing factor U2AF to regulate alternative splicing of transformer pre-mRNA. Nature 362, 171.
Vasil, V., Clancy, M., Ferl, R.J., Vasil, I.K., and Hannah, L.C. (1989). Increased gene expression by the first intron of maize shrunken-1 locus in grass species. Plant physiology 91, 1575-1579.
Venema, J., and Tollervey, D. (1999). Ribosome synthesis in Saccharomyces cerevisiae. Annual review of genetics 33, 261-311.
Wallace, J.C., and Edmonds, M. (1983). Polyadenylylated nuclear RNA contains branches. Proceedings of the National Academy of Sciences 80, 950-954.
Wan, R., Yan, C., Bai, R., Huang, G., and Shi, Y. (2016a). Structure of a yeast catalytic step I spliceosome at 3.4 Å resolution. Science 353, 895-904.
Wan, R., Yan, C., Bai, R., Lei, J., and Shi, Y. (2017). Structure of an intron lariat spliceosome from Saccharomyces cerevisiae. Cell 171, 120-132.e12.
Wan, R., Yan, C., Bai, R., Wang, L., Huang, M., Wong, C.C., and Shi, Y. (2016b). The 3.8 Å structure of the U4/U6.U5 tri-snRNP: Insights into spliceosome assembly and catalysis. Science, aad6466.

Wang, E.T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Kingsmore, S.F., Schroth, G.P., and Burge, C.B. (2008). Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470.
Wang, H., Hill, K., and Perry, S.E. (2004). An Arabidopsis RNA lariat debranching enzyme is essential for embryogenesis. Journal of Biological Chemistry 279, 1468-1473.
Wang, Z., and Burge, C.B. (2008). Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. Rna 14, 802-813.
Warkocki, Z., Odenwälder, P., Schmitzová, J., Platzmann, F., Stark, H., Urlaub, H., Ficner, R., Fabrizio, P., and Lührmann, R. (2009). Reconstitution of both steps of Saccharomyces cerevisiae splicing with purified spliceosomal components. Nature Structural and Molecular Biology 16, 1237.
Warner, J.R. (1999). The economics of ribosome biosynthesis in yeast. Trends in biochemical sciences 24, 437-440.
Warner, J.R., Knopf, P.M., and Rich, A. (1963). A multiple ribosomal structure in protein synthesis. Proceedings of the National Academy of Sciences 49, 122-129.
Watson, J.D., and Crick, F.H. (1953). Molecular structure of nucleic acids. Nature 171, 737-738.
Weinberg, R., and Penman, S. (1969). Metabolism of small molecular weight monodisperse nuclear RNA. Biochimica et Biophysica Acta (BBA)-Nucleic Acids and Protein Synthesis 190, 10-29.
Weinberg, R.A., and Penman, S. (1968). Small molecular weight monodisperse nuclear RNA. Journal of molecular biology 38, 289-304.
Weinstock, R., Sweet, R., Weiss, M., Cedar, H., and Axel, R. (1978). Intragenic DNA spacers interrupt the ovalbumin gene. Proceedings of the National Academy of Sciences 75, 1299-1303.
Weiss, S.B., and Gladstone, L. (1959). A mammalian system for the incorporation of cytidine triphosphate into ribonucleic acid. Journal of the American Chemical Society 81, 4118-4119.
Werner-Washburne, M., Braun, E.L., Crawford, M.E., and Peck, V.M. (1996). Stationary phase in Saccharomyces cerevisiae. Molecular microbiology 19, 1159-1166.
Werner-Washburne, M., Roy, S., and Davidson, G.S. (2011). Aging and the survival of quiescent and non-quiescent cells in yeast stationary-phase cultures. In Aging research in yeast (Springer), pp. 123-143.
Wettstein, F., Staehelin, T., and Noll, H. (1963). Ribosomal aggregate engaged in protein synthesis: characterization of the ergosome. Nature 197, 430.
Wiegand, H.L., Lu, S., and Cullen, B.R. (2003). Exon junction complexes mediate the enhancing effect of splicing on mRNA expression. Proceedings of the National Academy of Sciences 100, 11327-11332.
Wilkinson, M.E., Fica, S.M., Galej, W.P., Norman, C.M., Newman, A.J., and Nagai, K. (2017). Postcatalytic spliceosome structure reveals mechanism of 3'-splice site selection. Science 358, 1283-1288.
Williams, T.A., Foster, P.G., Cox, C.J., and Embley, T.M. (2013). An archaeal origin of eukaryotes supports only two primary domains of life. Nature 504, 231.
Wise, J.A., Tollervey, D., Maloney, D., Swerdlow, H., Dunn, E.J., and Guthrie, C. (1983). Yeast contains small nuclear RNAs encoded by single copy genes. Cell 35, 743-751.
Woese, C.R., and Fox, G.E. (1977). Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proceedings of the National Academy of Sciences 74, 5088-5090.

Worden, A.Z., Lee, J.-H., Mock, T., Rouzé, P., Simmons, M.P., Aerts, A.L., Allen, A.E., Cuvelier, M.L., Derelle, E., and Everett, M.V. (2009). Green evolution and dynamic revealed by genomes of the marine picoeukaryotes Micromonas. Science 324, 268-272.
Wu, J., and Manley, J.L. (1989). Mammalian pre-mRNA branch site selection by U2 snRNP involves base pairing. Genes & development 3, 1553-1561.
Wu, S., Romfo, C.M., Nilsen, T.W., and Green, M.R. (1999). Functional recognition of the 3' splice site AG by the splicing factor U2AF35. Nature 402, 832.
Wu, T.-T., Su, Y.-H., Block, T.M., and Taylor, J.M. (1996). Evidence that two latency-associated transcripts of herpes simplex virus type 1 are nonlinear. Journal of virology 70, 5962-5967.
Yan, C., Hang, J., Wan, R., Huang, M., Wong, C.C., and Shi, Y. (2015). Structure of a yeast spliceosome at 3.6-angstrom resolution. Science 349, 1182-1191.
Yan, C., Wan, R., Bai, R., Huang, G., and Shi, Y. (2016a). Structure of a yeast activated spliceosome at 3.5 Å resolution. Science 353, 904-911.
Yan, C., Wan, R., Bai, R., Huang, G., and Shi, Y. (2016b). Structure of a yeast step II catalytically activated spliceosome. Science, aak9979.
Yenerall, P., and Zhou, L. (2012). Identifying the mechanisms of intron gain: progress and trends. Biology direct 7, 29.
Zhan, X., Yan, C., Zhang, X., Lei, J., and Shi, Y. (2018). Structure of a human catalytic step I spliceosome. Science, eaar6401.
Zhang, X., Yan, C., Hang, J., Finci, L.I., Lei, J., and Shi, Y. (2017). An atomic structure of the human spliceosome. Cell 169, 918-929.e914.
Zhang, Y., Zhang, X.-O., Chen, T., Xiang, J.-F., Yin, Q.-F., Xing, Y.-H., Zhu, S., Yang, L., and Chen, L.-L. (2013). Circular intronic long noncoding RNAs. Molecular cell 51, 792-806.
Zhang, Z., Hesselberth, J.R., and Fields, S. (2007). Genome-wide identification of spliced introns using a tiling microarray. Genome research 17, 503-509.
Zheng, S., Vuong, B.Q., Vaidyanathan, B., Lin, J.Y., Huang, F.T., and Chaudhuri, J. (2015). Non-coding RNA Generated following Lariat Debranching Mediates Targeting of AID to DNA. Cell 161, 762-773.
Zhuang, Y., and Weiner, A.M. (1986). A compensatory base change in U1 snRNA suppresses a 5' splice site mutation. Cell 46, 827-835.
Zhuang, Y., and Weiner, A.M. (1989). A compensatory base change in human U2 snRNA can suppress a branch site mutation. Genes & development 3, 1545-1552.
Zieve, G., and Penman, S. (1976). Small RNA species of the HeLa cell: metabolism and subcellular localization. Cell 8, 19-31.
Zuo, P., and Maniatis, T. (1996). The splicing factor U2AF35 mediates critical protein-protein interactions in constitutive and enhancer-dependent splicing. Genes & development 10, 1356-1368.

Chapter 2.

Excised linear introns regulate growth in yeast

Jeffrey T. Morgan1,2,3, Gerald R. Fink2,3, David P. Bartel1,2,3

1Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

2Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA

3Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Apart from tetrad dissections performed by G.R.F., J.T.M. performed all experiments and analyses. D.P.B. supervised with help from G.R.F. All authors contributed to the design of the study and preparation of the manuscript.

Work in this chapter is in revision.

Abstract

Spliceosomal introns are ubiquitous non-coding RNAs typically viewed as inert byproducts of splicing, destined for rapid debranching and degradation. Here, we describe a subset of excised Saccharomyces cerevisiae introns that, although rapidly degraded in log-phase growth, accumulate as linear RNAs under either saturated-growth conditions or inhibition of TORC1, a key integrator of growth signaling. Introns that become stabilized remain associated with components of the spliceosome and differ from the other spliceosomal introns in having a short distance between their lariat branch point and 3' splice site, which is necessary and sufficient for their stabilization. Deletion of these unusual introns causes aberrantly high growth rates of yeast challenged with the TORC1 inhibitor rapamycin, and reintroduction of native or engineered stable introns suppresses this aberrant rapamycin response. Thus, excised introns function within the TOR growth-signaling pathway of S. cerevisiae, and more generally, excised spliceosomal introns can have biological functions.

Introduction

Spliceosomal introns are a defining feature of eukaryotic life; they are present in all known eukaryotic genomes and absent from all known non-eukaryotic genomes (Koonin, 2006; Irimia and Roy, 2014). Every splicing event produces two products: ligated exons and an excised lariat intron (Domdey et al., 1984; Grabowski et al., 1984; Padgett et al., 1984; Rodriguez et al., 1984; Ruskin et al., 1984). Because their production is an obligate result of gene expression and mRNA maturation, introns could be a fertile source of functional ncRNAs across eukaryota. However, although produced abundantly, excised lariat introns are debranched and degraded within seconds (Ruskin and Green, 1985; Arenas and Hurwitz, 1987; Sharp et al., 1987; Chapman and Boeke, 1991). So, although introns have broad roles in essential alternative splicing events during pre-mRNA processing (Black, 2003), excised introns are generally viewed not as products of splicing but instead as inactive byproducts of exon ligation (Hesselberth, 2013).

Although not reported to accumulate post-splicing, introns can play important roles either before or after splicing. Function for individual introns has been examined most thoroughly in the intron-poor budding yeast S. cerevisiae, which contains approximately 300 spliceosomal introns, with only 14 multi-intronic genes and only a few annotated cases of alternative-splicing events (Spingola et al., 1999; Davis et al., 2000; Juneau et al., 2007; Zhang et al., 2007). Thus, in yeast, potential functions of introns after splicing can be more readily separated from their functions during pre-mRNA processing. For most introns tested, no growth phenotypes are detected upon intron removal (Ng et al., 1985; Parenteau et al., 2008; Hooks et al., 2016). In a few cases, however, functions have been observed. These functions include regulating expression of duplicated ribosomal protein genes (Parenteau et al., 2011) and counteracting R-loop formation during transcription (Bonnet et al., 2017). In these cases, the function manifests entirely during pre-mRNA production and processing and thus before the intron exists as a separate RNA molecule. With respect to functions post-splicing, some introns are processed to produce noncoding RNAs, such as small nucleolar RNAs (snoRNAs), although in these cases the flanking portions of the intron are still rapidly catabolized (Qu et al., 1995; Petfalski et al., 1998). Thus, functional analyses in S. cerevisiae support the prevailing view that the collective fate of introns post-splicing is solely to be debranched and at least partially degraded.

Although hundreds of individual yeast introns have been assayed for function, relatively few experimental conditions have been explored. Specifically, most experiments use cells in the exponential growth phase, which provides consistent measurements in a standardized system sensitized to detect differences in growth and metabolism. However, outside of the laboratory setting, yeast cells are unlikely to spend many consecutive generations rapidly dividing and more often face limiting nutrients or other stresses (Werner-Washburne et al., 1996; Gray et al., 2004). Because the ability of cells to appropriately respond to these suboptimal conditions would have been important for survival, the findings from more optimal environments might not reflect biological phenomena present during non-exponential phases of growth that more adequately reflect growth in natural habitats. Accordingly, we set out to examine gene regulation of S. cerevisiae outside of the context of log-phase growth.

Results

Accumulation of excised, linear introns

We performed RNA sequencing (RNA-seq) on two S. cerevisiae samples: one taken from a culture in log-phase growth and the other from a saturated culture in which cell density was minimally increasing, presumably because of nutrient limitation or other stresses. For most intron-containing genes, such as that of actin (ACT1), very few intron-mapping RNA-seq reads were observed from either culture condition (Fig. 1a), as expected if introns were rapidly degraded post-splicing (Sharp et al., 1987). However, for a subset of genes, exemplified by ECM33, many intron-mapping reads were observed specifically from the saturated culture (Fig. 1b). Indeed, the much higher density of reads for the ECM33 intron compared to that for its exons suggested that in this growth condition, the intron accumulated to much greater levels (10-fold) than its corresponding mature mRNA. Despite hundreds of reads mapping to the ECM33 intron, coverage dropped abruptly at the 5' and 3' boundaries of the intron (Fig. 1b), indicating the intron was not being retained in the mature mRNA.
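The density comparison above can be illustrated with a minimal Python sketch that divides intron read density (reads per nucleotide) by flanking-exon read density. The function names and the read counts in the usage example are hypothetical and are not taken from the sequencing data of this study; the 330-nt lengths follow the intron length reported for ECM33.

```python
def read_density(read_count: float, feature_length_nt: int) -> float:
    """Reads per nucleotide for one feature (intron or exon window)."""
    if feature_length_nt <= 0:
        raise ValueError("feature length must be positive")
    return read_count / feature_length_nt


def intron_to_exon_fold(intron_reads: float, intron_len: int,
                        exon_reads: float, exon_len: int) -> float:
    """Fold difference of intron read density over exon read density."""
    exon_density = read_density(exon_reads, exon_len)
    if exon_density == 0:
        raise ValueError("exon density is zero; fold difference is undefined")
    return read_density(intron_reads, intron_len) / exon_density


# Hypothetical counts (illustration only): a 330-nt intron with 1000 reads
# compared with 330 nt of flanking exon sequence with 100 reads.
fold = intron_to_exon_fold(1000, 330, 100, 330)
print(f"intron/exon density ratio: {fold:.1f}")
```

Comparing equal-length windows, as in this example, means the ratio of densities reduces to the ratio of read counts.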

[Figure 1, panels a-f: RNA-seq profiles of ACT1, ECM33, and SAC6 in saturated and log-phase cultures (0.5-kb scale bars); RNA blots of log-phase (LP) and saturated (S) cultures probed for the ECM33 intron, the SAC6 intron, and 5.8S rRNA, including a comparison of WT and dbr1Δ strains.]

Figure 1. Some excised linear introns accumulate in yeast. a, Undetectable accumulation of the intron from the ACT1 gene. Shown are RNA-seq profiles for ACT1 (thick box, CDS; thin box, UTR; line, intron; closed circle on intron, BP) in both log-phase and saturated yeast cultures. Plotted for each nucleotide of the locus is the number of reads per million combined exon- and intron-mapping reads of the library. b, Accumulation of the ECM33 intron in saturated but not log-phase culture, otherwise as in a. c, Accumulation of intact ECM33 intron in saturated but not log-phase culture. Shown is an RNA blot that resolved total RNA from both log-phase (LP) and saturated (S) cultures and was probed for the ECM33 intron. Migration of markers with lengths indicated (nucleotides) is at the left. For a loading control, the blot was reprobed for 5.8S rRNA. d, Accumulation of the SAC6 intron in saturated but not log-phase culture, otherwise as in a. e, Reprobing of RNA blot in c for the SAC6 intron; otherwise, as in c. f, Accumulation of the debranched form of the ECM33 intron in saturated culture. Analysis is as in c, except the RNA blot included samples from both WT and dbr1Δ cultures. Migration of the linear isoform is indicated (diagram on the left) as is migration of both the lariat and the circle isoforms (diagrams on the right).

To examine whether our RNA-seq analysis was indeed detecting accumulation of excised introns of a defined size, we probed RNA blots for the inferred RNA species. Probes to the ECM33 intron detected a single major species running at the position expected for the full-length, 330-nt intron (Fig. 1c). Likewise, a probe for SAC6, another intron inferred by RNA-seq to accumulate in the saturated culture (Fig. 1d), detected a defined RNA of the size expected for the full-length excised intron (Fig. 1e).

Introns can be protected from degradation if they persist post-splicing as either lariat RNAs or circular lariat derivatives in which the lariat tail is missing (Chapman and Boeke, 1991; Qian et al., 1992; Wu et al., 1996). However, these nonlinear species have either branched RNA or a 2'-5' phosphodiester linkage, which would impede reverse transcriptase (Ruskin et al., 1984; Lorsch et al., 1995) and thereby cause RNA-seq reads to be depleted in the region of the branch point, a pattern that we did not observe in the RNA-seq profiles (Fig. 1b, Extended Data Fig. 1). To test further whether ECM33 intronic RNA accumulated as either a lariat RNA or its circular derivative, we harvested RNA from yeast lacking Dbr1, the enzyme required to debranch intron lariats (Chapman and Boeke, 1991), and compared the ECM33 intronic RNA that accumulated in the dbr1Δ strain with RNA that accumulated in wild-type saturated cultures. As expected, in dbr1Δ log-phase culture, the ECM33 intron was detected as two abundant species, which corresponded to the branched lariat and circle, and the same species accumulated in dbr1Δ saturated culture (Fig. 1f). Importantly, neither of these non-linear species co-migrated with the linear intron identified in wild-type saturated culture. These results confirmed that Dbr1 is necessary to form linear introns outside of log phase and showed that the previously known mode of decreased intron turnover cannot explain the observed accumulation of excised introns in a saturated S. cerevisiae culture.

Another mechanism that might protect these introns from degradation in saturated culture is incorporation into a ribonucleoprotein complex. Indeed, the ECM33 intron predominantly co-sedimented with complexes about the size of ribosomal subunits (35-50 Svedberg units, Extended Data Fig. 2a). To identify proteins associating with the intron, we performed pull-down experiments from gradient fractions containing the intron and used quantitative mass spectrometry to identify the co-purifying proteins. For these experiments, we took advantage of the observation that tagged versions of the ECM33 intron excised from expression constructs retained the behavior of the endogenous ECM33 intron, i.e., rapid degradation in log-phase culture and accumulation as excised linear intronic RNA in saturated culture (Extended Data Fig. 2b, c). The top 10 proteins consistently co-purifying with MS2-tagged versions of the ECM33 intron (average enrichment = 4.9-fold) were all spliceosomal proteins, the identities of which indicated that the excised and debranched ECM33 intron resided in a specific complex that resembled the intron-lariat spliceosome (ILS) complex (Extended Data Table 1) (Fourmann et al., 2013).

Studies in log-phase extracts indicate that debranching occurs after spliceosome disassembly (Martin et al., 2002). Although we cannot rule out the possibility that these protected introns are debranched outside the spliceosome and subsequently re-associate with spliceosome components as linear RNAs, a simpler model is continued association of the linear RNAs with ILS components post-splicing. In this favored scenario, the relationship between ILS disassembly and debranching presumably varies, depending on growth condition and/or intron identity. In either scenario, our findings indicate that in saturated cultures accumulating introns are bound to and presumably protected by a complex resembling the ILS.

Defining features of stable introns

We performed a systematic search for all introns that undergo a switch in stability and accumulate as linear RNAs in saturated cultures, hereafter referred to as "stable introns." In this search, RNA-seq reads of each intron-containing gene were analyzed for a preponderance of reads mapping to introns, particularly those mapping precisely to the edges of the excised introns (consistent with a post-splicing intron) relative to those mapping across splice sites and splice junctions (signatures of intron retention and mature mRNA expression, respectively) (Katz et al., 2010) (Fig. 2a). Inspection of RNA-seq reads that mapped to the 3' edges of stable introns identified many that were extended by one or more untemplated adenosine residues (Fig. 2b). This frequent addition of untemplated adenosines was not observed on either reads with 3' ends mapping to the interior of stable introns (Fig. 2b) or reads mapping to introns in log phase, although very few reads were obtained for this latter case. Short 3'-terminal oligo(A) tails are added by the TRAMP complex to mark nuclear RNAs for degradation (LaCava et al., 2005; Jia et al., 2011) and have been observed on lariat introns isolated from dbr1Δ yeast (Qin et al., 2016). Our finding that many stable-intron molecules had these tails suggests that these molecules might have been targeted for exosomal decay yet were somehow protected from this decay. Regardless of their function, these untemplated adenosine residues provided an additional criterion for the annotation of stable introns, which helped us confidently identify another 28 introns whose stable form accumulated in yeast grown in saturated cultures (Extended Data Fig. 3, Extended Data Table 2).
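The read classes used in this search (intron-edge reads, intron-interior reads, reads crossing a splice site, and spliced junction reads) can be sketched as a simple coordinate comparison. The Python sketch below is an illustration under stated assumptions, not the analysis pipeline used here: the `Read` type, the category names, the half-open coordinate convention, and the example coordinates are all hypothetical, and a spliced read is assumed to have been flagged by the aligner as skipping exactly this intron.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    start: int             # 0-based, inclusive
    end: int               # 0-based, exclusive
    spliced: bool = False  # assumed: alignment skips this intron (exon-exon junction)


def classify_read(read: Read, intron_start: int, intron_end: int) -> str:
    """Assign a read to one class relative to an intron [intron_start, intron_end)."""
    if read.spliced:
        return "spliced_junction"          # signature of mature mRNA
    if read.end <= intron_start or read.start >= intron_end:
        return "exon"                      # no overlap with the intron
    if read.start < intron_start or read.end > intron_end:
        return "retained_or_pre_mrna"      # unspliced read crossing a splice site
    if read.start == intron_start or read.end == intron_end:
        return "intron_edge"               # begins or ends exactly at an intron terminus
    return "intron_interior"


# Hypothetical 330-nt intron at [1000, 1330) and a handful of example reads.
reads = [
    Read(900, 950), Read(980, 1020), Read(1000, 1040),
    Read(1100, 1140), Read(1290, 1330), Read(980, 1350, spliced=True),
]
counts = Counter(classify_read(r, 1000, 1330) for r in reads)
print(dict(counts))
```

A preponderance of `intron_edge` and `intron_interior` reads over `retained_or_pre_mrna` and `spliced_junction` reads would be the pattern expected for an excised, accumulating intron.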

58 165 nt 330 nt 165 nt ECM33 reads a Log phase Saturated C

Flanking exons ______676 95 0.8 Spliced .-----...... 37 4 Retained 0.6 or pre-MRN 2 Stable, retained. w-M. I 0.4- or pre-mRNA - Stable introns n = 29 n -271 0.2 -- Remaining introns P< 104 3'SS 0.0 3' end of intron 1 exon 0 25 50 75 100 125 150 aatc"a rcuuuUcuua~C"aacuaacu3 BP-3'SS distance (nt) C AAUACUAACACUACUUUUCUUUAUCUAAGCAG a UAACACUACUUUUCUUUAUCUAAGCAGAA E JUACUAACACUACUUUUCUUUAUCUAAGCAG UAACACUACUUUUCUUUAUCUAAGCAGAA d ECM33%h" (WT) - 330 nt ACACUACUUUUCUUUAUCUAAGCAGA 350 nt o UACUAACACUACUUUUCUUUAUCUAAGCAG ECM33 -9 (Mut) CACUACUUUUCUUUAUCUAAGCAGA ACT1- (WT) - 309 nt CACUACUUUUCUUUAUCUAAGCAGAA A CT1IS" (Mut) - 290 nt AACACUACUUUUCUUUAUCUAAGCAGA UAACACUACUUUUCUUUAUCUAAGCAGAAA WT ECM33L- ACT1&o" LP S LP S LP S 500- ECM33 intron 400- _rLong : 350 nI b 300- Shorl : 330 nt 200- 400- ACTI intron Long: 309 nt 300- 4 150- Short: 290 nt AA r: 200- 100 *e 5.8S rRNA WAA AAA 0 0 10 20 -10 0 10 -20 -10 0 5SS BP 3 SS Position relative to indicated intron feature

Figure 2. Stable introns have oligo(A) tails and short BP-3'SS distances. a, Example of RNA-seq support for intron accumulation and terminal adenylylation. The diagram (top left) classifies the RNA-seq reads deriving from the possible intron states of ECM33 transcripts (black lines, reads; red dashed lines, intron boundaries). Read counts from log-phase and saturated cultures (normalized for library depth) are listed for each class of reads (top right). For convenient comparison to intron accumulation, exon reads are only counted if they map within 165 nucleotides of either splice site (thereby encompassing 330 nucleotides, the length of the intron). The alignment (bottom) shows representative reads mapping to the intron 3' terminus, aligned below the sequence of the 3' intron-exon junction. Many of these reads had untemplated terminal adenosine residues (blue). b, Composition of untemplated tailing nucleotides observed in saturated culture. Reads that still had at least one terminal untemplated nucleotide after trimming the 3'-adapter sequence were collected, and the position of this tail was annotated as that of the last templated nucleotide. Counts of reads with tails added at positions 0 to +20 relative to the 5'SS, -10 to +10 relative to the BP, and -20 to 0 relative to the 3'SS are plotted, binning counts for tails of only adenosines (An, teal) separately from those of all other tails (other, purple). For An tails mapping to the 3'-terminal nucleotide of introns, the fraction with each indicated length is plotted (right). The relative abundance of An at positions -2 and -1 relative to the 3'SS was ambiguous because most introns have an A at position -1 (3'SS consensus sequence, YAG), which causes tails of length N at position -1 to be indistinguishable from tails of length N+1 at position -2. c, The shorter BP-3'SS distance of stable introns compared to that of most other introns. Plotted are cumulative distributions of BP-3'SS distances (P < 10⁻⁸, one-tailed Kolmogorov-Smirnov test). d, A causal relationship between BP-3'SS distance and stable-intron formation. The diagram at the top shows WT and mutant (Mut) introns, in which the WT ECM33 BP-3'SS distance was extended from 25 nt (short, red) to 45 nt (long, blue), and the WT ACT1 BP-3'SS distance was shortened from 44 nt (long, blue) to 25 nt (short, red). Below the diagram are results of an RNA blot that resolved samples from the indicated strains and was probed sequentially for the ECM33 intron (top), the ACT1 intron (middle), and 5.8S rRNA. Probes were complementary to portions of introns common to both short and long isoforms. Migration of markers with lengths indicated (nucleotides) is at the left. Expected migration of long and short linear isoforms of each intron is at the right. The asterisks (*) mark the detection of long-isoform degradation products, which each migrated even faster than did the short isoform.
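The tail annotation described for panel b, and the ambiguity between positions -1 and -2, can be illustrated with a short Python sketch. It assumes a simplified model in which a read is compared base by base with the templated intron 3' end and everything after the first mismatch is called an untemplated tail; the function name and the example sequence are hypothetical, and real reads would first require adapter trimming and alignment.

```python
def call_tail(read: str, template: str) -> tuple[int, str]:
    """Return (position, tail) for a read aligned at the start of `template`.

    `position` is that of the last templated nucleotide relative to the
    3'-terminal nucleotide of the intron (0 = terminal nucleotide, -1 = the
    nucleotide before it). `tail` is the remaining, untemplated sequence.
    """
    matched = 0
    while (matched < len(read) and matched < len(template)
           and read[matched] == template[matched]):
        matched += 1
    return matched - len(template), read[matched:]


# Hypothetical intron 3' end ending in the consensus YAG (here CAG).
template = "CUUUAUCUAAGCAG"

# Full-length intron end with two untemplated adenosines.
print(call_tail(template + "AA", template))   # (0, 'AA')

# Molecule ending at position -2 (the C of CAG) with a three-adenosine tail:
# the first tail A matches the templated A at position -1, so the read is
# called as a two-adenosine tail at position -1.
print(call_tail("CUUUAUCUAAGC" + "AAA", template))  # (-1, 'AA')
```

The second call shows why a tail of length N+1 at position -2 cannot be distinguished from a tail of length N at position -1 whenever position -1 is an A.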

In several cases, one of these stable introns derived from one of the few yeast genes with more than one intron (e.g., EFM5, Extended Data Fig. 3). The differential stability of one intron but not of the other intron from the same gene suggested that intron-intrinsic characteristics drive stability and accumulation. Consequently, we searched for common features among the 30 stable introns, which the cellular machinery might use to differentiate stable introns from the majority of introns that are still rapidly degraded in saturated cultures. This search found that stable introns were indistinguishable from other introns in nearly every respect. Compared to other introns, they had similar strengths of canonical splicing motifs (Extended Data Fig. 4a), similar length distributions (Extended Data Fig. 4b, P > 0.05), no common predicted structures or enriched sequence motifs (Extended Data Fig. 4c and d), and no enriched functional ontologies of their host genes. Of the features examined, the only one that differed was the distance between the lariat branch point (BP) and 3' splice site (3'SS), which tended to be shorter for stable introns (Fig. 2c, P < 10⁻⁸).
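The comparison of cumulative distributions can be sketched with a one-sided two-sample Kolmogorov-Smirnov statistic computed in plain Python. This is an illustration only: the distances in the usage example are hypothetical, the function names are not from this work, and the P value uses the leading-order asymptotic approximation exp(-2·D²·nm/(n+m)), which is an assumption of this sketch and may differ from the procedure used for Fig. 2c.

```python
import math
from bisect import bisect_right


def ecdf(sorted_values: list[float], x: float) -> float:
    """Empirical cumulative distribution: fraction of values <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def one_sided_ks(sample_a: list[float], sample_b: list[float]) -> tuple[float, float]:
    """One-sided KS test that sample_a is shifted toward smaller values.

    Returns (D_plus, approximate P), where D_plus is the largest amount by
    which the ECDF of sample_a exceeds that of sample_b.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    # The ECDFs are step functions, so the supremum is reached at a data point.
    d_plus = max(0.0, max(ecdf(a, x) - ecdf(b, x) for x in a + b))
    n_eff = len(a) * len(b) / (len(a) + len(b))
    p_approx = math.exp(-2.0 * n_eff * d_plus ** 2)  # asymptotic, leading order
    return d_plus, p_approx


# Hypothetical BP-3'SS distances (nt), for illustration only.
stable = [20, 22, 25, 25]
other = [30, 40, 44, 50]
d, p = one_sided_ks(stable, other)
print(f"D+ = {d:.2f}, approximate P = {p:.3g}")
```

With only four values per group the asymptotic P value is crude; it becomes more reasonable for sample sizes like those compared in Fig. 2c.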

To investigate a potential role of BP position in influencing intron stability, we made mutations that changed endogenous BP-3'SS distances and examined the effects of these mutations on intron accumulation. Lengthening the short BP-3'SS distance of the normally stable ECM33 intron from 25 to 45 nt abrogated accumulation of the full-length excised intron, indicating that a short BP-3'SS distance is required for stability of this intron (Fig. 2d, compare ECM33^short and ECM33^long). Moreover, shortening the BP-3'SS distance of the normally unstable ACT1 intron from 44 to 25 nt conferred stability to this intron in saturated culture, which suggested that a short BP-3'SS distance is not only necessary but also sufficient for the stability of introns in saturated culture (Fig. 2d, compare ACT1^short and ACT1^long).

The notion that a short BP-3'SS distance is sufficient for stability seemed at odds with the observation that some introns with BP-3'SS distances of 20-25 nt were not annotated as stable introns (Fig. 2c). One possibility was that some introns with short BP-3'SS distances were not annotated as stable introns simply because their genes were not actively transcribed during the period at which stable introns were protected from degradation. To investigate this possibility, we placed introns that had not been identified as stable introns into the previously used expression construct, choosing two introns with a short (20- and 25-nucleotide) and two with a long (37- and 44-nucleotide) BP-3'SS distance. The two test introns with short BP-3'SS distances accumulated specifically in saturated culture, whereas the two test introns with long BP-3'SS distances were unstable in both conditions (Extended Data Fig. 5).

We conclude that the two defining features of stable introns are 1) a short BP-3'SS distance and 2) expression within an environmental context in which introns are stabilized. After being modified to satisfy these criteria, all tested non-stable introns became stable introns (Fig. 2d, Extended Data Fig. 5), which suggested that no other sequence or structural features of either the intron or the host gene are required to achieve stability.

Regulation of stable introns

We next examined when, during the interval between log-phase and saturated conditions, the switch in intron stability occurs. To do so, we harvested samples through 72 h of culture and monitored the levels of the ECM33 stable intron and its host mRNA (Fig. 3a). The ECM33 mRNA increased 2 h after culture seeding, remained steady through 12 h, and then decreased to low levels through the remainder of the time course (Fig. 3a). The Ecm33 protein is a cell-wall-related protein (Terashima et al., 2003), which explained its high expression during phases of rapid cell division. The ECM33 intron had a very different pattern of accumulation. Intron levels were low through 8 h of growth, and then, as cells exited the rapid-growth phase, intron levels dramatically increased (Fig. 3a). Intron abundance remained high for at least the next 62 h, even as the levels of mature mRNA decreased 23-fold. Although the contrasting dynamics of the ECM33 intron and mRNA illustrated the different behaviors sometimes observed for stable introns and their host mRNAs, the large decrease in the levels of the mature mRNA in saturated culture was not observed for all mRNAs that hosted stable introns. Indeed, as a class, these mRNAs had no significant trend in expression between log-phase and saturated cultures (Extended Data Fig. 4e).

[Figure 3 image omitted: panels a-e, showing a growth curve and RNA blots over time after dilution (ECM33 mRNA, ECM33 intron, SAC6 intron, 5.8S rRNA), and RNA-seq tracks for ECM33 in log-phase, saturated-liquid, saturated-lawn, and rapamycin-treated cultures.]

Figure 3. Nutritional state regulates stable-intron formation. a, The distinct accumulation pattern of the ECM33 stable intron and its host mRNA. Culture was seeded at OD600 (optical density at 600 nm wavelength) = 0.2 from an overnight culture, and at the indicated time points growth was monitored using OD600 measurements (left) and aliquots were harvested for RNA-blot analysis (right). The RNA blot was as in Fig. 1c, except it utilized a gel designed to also resolve longer RNA. The asterisk (*) shows the migration of the ECM33 pre-mRNA, which is also detected by the probe for the ECM33 intron. b, Induced accumulation of the ECM33 stable intron in cells cultured in the presence of rapamycin. To prevent contamination of starting cultures with stable introns contributed by inoculum, cultures were seeded at OD600 = 0.2 from an overnight culture, allowed to grow to early log phase, collected by vacuum filtration, and resuspended in fresh media without (lanes 1 and 3) or with (lane 2) 100 nM rapamycin. Cultures were harvested after indicated times and analyzed as in Fig. 1c. c, Requirement of rapamycin-sensitive TORC1 for induction of stable introns by rapamycin. Compared are the results of a W303a control strain and a W303a strain containing a rapamycin-insensitive allele of TOR1 (TOR1-1 [S19721]) (Zheng et al., 1995). Otherwise, this panel is as in b, except the RNA blot was additionally reprobed for the SAC6 intron. d, Accumulation of the ECM33 intron in rapamycin-treated, saturated-lawn, and saturated-liquid cultures, detected using RNA-seq. Results showing accumulation of this intron in saturated-liquid but not log-phase liquid culture are from a different biological replicate (samples from b, lanes 1 and 3) than those shown in Fig. 1a and b; otherwise, this panel is as in Fig. 1a. e, Overlap between stable introns identified in saturated-liquid, saturated-lawn, and rapamycin-treated cultures.

The timing of stable-intron accumulation suggested that nutrient limitation might trigger their formation. With this in mind, we inhibited the target of rapamycin complex 1 (TORC1), a broadly conserved master integrator of nutritional and other environmental signals (Loewith and Hall, 2011), using the small molecule rapamycin. In yeast, rapamycin-mediated TORC1 inhibition leads to repression of anabolic processes, stimulation of catabolic processes, and ultimately a greatly reduced growth rate (Loewith et al., 2002; Saxton and Sabatini, 2017). When a log-phase culture was resuspended in fresh media with or without 100 nM rapamycin, the ECM33 intron accumulated after 4 h of rapamycin treatment, despite the presence of abundant nutrients in the media (Fig. 3b). Repeating this experiment using a strain with a rapamycin-resistant allele of TOR1 (TOR1-1) yielded no accelerated intron accumulation upon treatment with rapamycin (Fig. 3c), and inhibiting protein synthesis in a TORC1-independent way did not cause stable-intron formation (Extended Data Fig. 6a). These results showed that rapamycin-induced intron accumulation was specifically due to inhibition of TORC1.

To extend this investigation transcriptome-wide, we prepared RNA-seq libraries from yeast treated for 4 h with rapamycin. We also investigated stable-intron formation in a metabolically different (Barnett and Entian, 2005), but experimentally common, nutrient-limited scenario: a lawn of yeast grown to saturation over 2 days in an aerobic environment. When compared to the saturated liquid culture, both rapamycin-treated cells and cells from the saturated lawn showed similar accumulation not only of the ECM33 intron (Fig. 3d) but also of other stable introns (Fig. 3e). In total, 34 introns (11% of S. cerevisiae introns) were classified as stable introns in at least one of the three conditions (Extended Data Fig. 3, Extended Data Table 2). A somewhat greater yield was obtained from the saturated liquid culture (Fig. 3e), presumably because deeper sequencing of this sample enabled confident identification of more lowly expressed stable introns, implying that with even deeper sequencing, additional stable introns would be confidently identified.

We tested several genetic and environmental perturbations for their effects on stable-intron accumulation. These experiments showed that neither loss-of-function mutations in SCH9 or TAP42 (the two major effector branches of TORC1) nor depletion of carbon or nitrogen (known to rapidly inhibit TORC1) induced premature stable-intron formation (Extended Data Fig. 6b-d). In addition, stable introns were induced only after an extended period of TORC1 inhibition, with no detectable induction after 1 h of rapamycin treatment (Extended Data Fig. 6e). Nonetheless, our results with saturated cultures and TORC1 inhibition established a clear link between environment and stable-intron regulation. In particular, they showed that a TORC1-dependent pathway prevents accumulation of stable introns, such that when this pathway is inactive a select class of introns evades degradation and accumulates.

Biological function of stable introns

The discovery of stable introns and their link to the TOR growth-signaling pathway brought to the fore the question of their function. To test for loss-of-function phenotypes, we created strains with genomic deletions of stable introns, using a CRISPR-Cas9 system adapted for S. cerevisiae (Vyas et al., 2015; Vyas et al., 2018) to precisely remove introns without affecting exonic sequences. Reasoning that stable introns might contribute collectively to function, and thus a phenotype might be detected only after multiple introns were deleted from the same strain, we generated 18 strains of single, double, triple, quadruple, and quintuple intron deletions (Fig. 4a). Based on abundances estimated from RNA-seq data, these deletions were predicted to reduce the number of stable-intron molecules in the cell by up to 55% (Fig. 4a). This percentage of depletion was verified for the quintuple-deletion strain, with no sign of a compensatory increase in the remaining stable introns (Extended Data Fig. 7). As controls, we also generated strains in which non-stable introns were removed, either singly or simultaneously, from three highly expressed genes.
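The depletion estimate described above can be sketched as a simple ratio: the summed abundance of the deleted stable introns divided by the summed abundance of all stable introns in wild-type saturated culture. The function name, the use of TPM as the abundance unit, and the example values below are assumptions made for illustration; they are not data or code from this study.

```python
def estimated_depletion(stable_intron_tpm, deleted):
    """Percent of stable-intron molecules removed by deleting the named introns.

    stable_intron_tpm: dict mapping intron name -> abundance in WT saturated culture.
    deleted: iterable of intron names deleted in the strain.
    """
    total = sum(stable_intron_tpm.values())
    if total <= 0:
        raise ValueError("total stable-intron abundance must be positive")
    removed = sum(stable_intron_tpm[name] for name in deleted)
    return 100.0 * removed / total


# Hypothetical abundances, for illustration only (not measurements from this work).
example_tpm = {"intronA": 30.0, "intronB": 20.0, "intronC": 50.0}
print(estimated_depletion(example_tpm, ["intronA", "intronB"]))
```

Because the estimate is additive, the contribution of each deleted intron to a multi-deletion strain can be read off directly, which is how the stacked bars in Fig. 4a are described.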

Because of the connection between TORC1 inhibition and stable-intron formation (Fig. 3b-e), we tested the effect of rapamycin on growth of mutant strains producing varied amounts of stable introns. In the absence of rapamycin, the strains all grew at equivalent rates (Fig. 4b). In the presence of rapamycin, growth was inhibited for all strains, but this growth inhibition was attenuated for strains lacking one or more stable introns, with mutant strains lacking more stable introns growing at faster rates (Fig. 4b). Indeed, across all the strains assayed, we observed a striking correlation between the estimated fraction of stable-intron molecules depleted from the transcriptome and attenuation of the rapamycin response (Fig. 4c, Pearson R² = 0.96). In contrast, the rapamycin response of the control strains that lacked non-stable introns was indistinguishable from that of wild type (Fig. 4c, inset), which showed that the observed phenotype was specific to depletion of stable introns and not a generic consequence of decreased splicing load.

[Figure 4 image omitted: panels a-f, showing estimated depletion of stable-intron molecules for each deletion strain, growth curves (time in h) with and without rapamycin, relative growth versus estimated depletion of stable-intron molecules (%) with an inset for unstable-intron controls, rescue by overexpressed ECM33 and ACT1 introns in WT and EUHSR strains, and a schematic of the TORC1 pathway.]

Figure 4. Stable introns regulate yeast growth under TORC1 inhibition. a, Eighteen strains with genomic deletions that deplete stable-intron molecules. In each strain, introns from 1-5 of the genes listed in the key were deleted. Deletion strains are named using single-letter abbreviations of the genes with deleted introns. The bar for each strain plots its estimated depletion of stable-intron molecules, indicating the contribution of each deleted intron (color key) based on its contribution to the total number of stable-intron molecules in WT saturated culture. b, Attenuated rapamycin response of strains with fewer stable-intron molecules. Shown are growth curves of the indicated deletion strains (color key) when cultured in the presence of either 0 nM, 100 nM, or 250 nM rapamycin. Each curve shows the average for technical triplicates from a representative 96-well plate assaying eight strains. c, Relationship between the depletion of stable-intron molecules and increased growth in 250 nM rapamycin. Estimated depletion of stable-intron molecules is as in a. Relative growth rates are averages of biological replicates (n = 12 for WT, n = 3 for all other strains; error bars, s.e.m.). Also shown are results for control strains in which 1-3 unstable introns were deleted in the indicated genes (inset). d, Rescue of the rapamycin-response defect by ectopically expressing the ECM33 stable intron. Shown are representative growth curves of WT and EUHSR strains overexpressing either the native stable ECM33 intron (red) or an unstable ECM33 intron (blue) cultured in the presence of either 0 nM or 250 nM rapamycin. The unstable version of the intron (ECM33-Unstable) is described in Fig. 3d (ECM33long). Each curve plots the averages of technical triplicates. e, Rescue by an engineered ACT1 stable intron. As in d, but ectopically expressing either the native unstable ACT1 intron (blue) or an engineered stable ACT1 intron (red). The stable version of the intron (ACT1-Stable) is described in Fig. 3d (ACT1short). f, Placement of stable introns in the TORC1-mediated environmental-response pathway. TORC1, which is stimulated by nutrients and inhibited by environmental stress and rapamycin, represses accumulation of diverse stable introns (colored lines), which in turn repress cell growth. This repression of a repressor leads to growth stimulation, as do previously characterized branches of the pathway, which do not involve stable introns (arrow bypassing stable introns).

The attenuated response to TORC1 inhibition seemed to depend solely on the number of stable-intron molecules that were removed from the cell and not on the identities of either the removed introns or their mutated host genes (Fig. 4c). This result provided evidence against the idea that the phenotype we observed might have been the result of subtle changes in host-gene expression or other secondary effects of intron removal or strain construction (such as off-target effects of the gene editing).

To further test the conclusion that the attenuated response to TORC1 inhibition depended solely on the aggregate number of stable-intron molecules produced and not on other factors, we examined the ability of ectopically expressed introns to rescue the phenotype. We first overexpressed the ECM33 stable intron or, as a control, its unstable derivative that had a lengthened BP-3'SS distance (ECM33-stable and ECM33-unstable, respectively). The stability of this overexpressed intron had no detectable effect on either wild-type (WT) or quintuple-mutant (EUHSR) growth in the absence of rapamycin, and it had little effect on growth of the WT strain in rapamycin (Fig. 4d). Importantly, however, the stability of this ectopically expressed intron dramatically influenced the growth of the quintuple mutant in the presence of rapamycin, nearly completely rescuing its defective response to rapamycin (Fig. 4d). Moreover, analogous results were observed when ectopically expressing the stabilized and normal versions of the ACT1 intron (ACT1-stable and ACT1-unstable, respectively) (Fig. 4e). The ability of a stable version of the ACT1 intron, which does not exist naturally in S. cerevisiae, to rescue the quintuple-mutant phenotype confirmed that this phenotype was a primary consequence of reduced stable-intron accumulation, with no dependence on the identity of the intron or its host gene.

Taken together, our results show that stable introns function within the TORC1-mediated environment-response pathway of S. cerevisiae (Fig. 4f). TORC1 activity prevents stable-intron formation, as shown by the accumulation of these introns when cells undergo TORC1 inhibition by rapamycin (Fig. 3b-e), and the stable introns inhibit growth of cells with decreased TORC1 activity, as shown by the increased growth observed in TORC1-inhibited strains with fewer stable introns (Fig. 4). Thus, this double-negative regulation, in which TORC1 inhibits stable introns, which in turn inhibit growth, forms a previously unknown node of the TOR regulatory network, which works in concert with other TORC1-dependent and TORC1-independent pathways to control growth in S. cerevisiae (Fig. 4f).

Discussion

The ease with which we were able to observe accumulation of stable introns raised the question of why they had not been detected earlier, especially when considering that S. cerevisiae has been subjected to countless molecular analyses. The answer to this question lies with two features of our analysis. First, we examined cells in saturated culture, whereas most analyses examine cells in log phase, a stage at which all excised introns are rapidly degraded. Second, we avoided mRNA poly(A) selection, whereas many analyses perform poly(A) selection prior to RNA-seq, which depletes excised introns because they have no poly(A) tail. In addition, we used an in-house RNA-seq protocol in which RNA was fragmented and 27- to 40-nucleotide fragments were isolated for sequencing (with the goal of providing a suitable comparison to 27- to 33-nucleotide ribosome-protected fragments in order to accurately measure translational efficiency (Ingolia et al., 2009)), whereas most analyses use commercial RNA-seq kits, which deplete RNAs shorter than a few hundred nucleotides, the size of the excised yeast introns. Although perhaps not essential for detecting longer stable introns, this RNA-seq protocol enabled reliable and quantitative detection of stable introns regardless of their lengths. In principle, studies that used splicing-specific microarrays to assay intron retention during stress, including rapamycin treatment, might have detected stable introns (Pleiss et al., 2007; Bergkessel et al., 2011; Munding et al., 2013). However, those studies focused on measurements through 40 min, with no measurements taken beyond 2 h of treatment, which might not have been enough time for detection of stable introns.

Although specific inhibition of TORC1 with rapamycin induced stable introns, other means known to affect TORC1 activity and signaling were insufficient to do the same. For instance, genetic bypass of the Sch9 and Tap42 effector branches of TORC1 signaling were individually insufficient to broadly affect stable-intron regulation (Extended Data Fig. 6b, c), which suggests that another branch of the TORC1 pathway, such as Mpk1-mediated proteasome regulation, Sfp1-mediated ribosome production, retrograde signaling, or autophagy, might instead regulate stable-intron formation. Additionally, the rapid effects on TORC1 activity of both rapamycin treatment and carbon or nitrogen depletion did not induce stable-intron accumulation to detectable levels (Extended Data Fig. 6d, e). As the effects of long-term TORC1 inhibition are less thoroughly studied than are those of short-term inhibition, we have fewer clues as to how this time-dependent phenotype is regulated and which changes in nutritional and/or environmental signals ultimately control stable-intron formation in natural and rapamycin-treated settings.

The discovery of a previously unknown node of the TOR regulatory network raises mechanistic questions, including that of how stable-intron accumulation might inhibit growth. Our results imply that the stoichiometry of stable introns relative to some cellular component underlies this function. One possibility is that stable introns interact with and sequester spliceosomes to reduce the splicing activity needed for ribosome biogenesis. Supporting this idea, substrate pre-mRNAs are known to compete for limited splicing machinery in S. cerevisiae (Munding et al., 2013), and we found that stable introns are associated with spliceosome components (Extended Data Table 1). Moreover, in budding yeast, sequestering spliceosomes would disproportionately affect mRNAs of ribosomal protein genes (RPGs); because of their high expression levels and frequent possession of an intron, RPG mRNAs are substrates for 90% of all splicing events in log-phase S. cerevisiae (Warner, 1999). Perhaps stable introns function as part of a larger regulatory network that uses inefficient splicing to help keep ribosome production low during periods of limited nutrients (Broach, 2012). In this scenario, a large pool of stored spliceosome components would be available to be released when yeast are re-fed, allowing for rapid induction of ribosome biogenesis.

To evaluate the plausibility of this idea, we revisited our RNA-seq data from rapamycin-treated yeast to determine whether stable introns can reach levels sufficient to have an effect on available spliceosome components. This analysis showed that in aggregate stable-intron transcripts reached 40% of the number of U5 snRNA molecules (Extended Data Fig. 8a). In principle, essential splicing proteins could be even more limiting than these abundant RNAs. Thus, stable-intron accumulation was well within a regime in which depleting 55% of the stable-intron molecules (as in our EUHSR strain) could substantially alter spliceosome availability. Furthermore, overexpressing a stable intron (ECM33) in the EUHSR strain grown in rapamycin resulted in more intron retention and less RPG mRNA accumulation as compared with overexpression of a non-stable intron (ACT1) (Extended Data Fig. 8b and c). Future characterization of the stable-intron complex will provide more detailed information on the cellular machinery that is sequestered and might also provide insight into other mechanistic questions, such as how stable introns are protected from degradation, and how this protection is biochemically coupled to TORC1 inhibition.

Although our ideas regarding the mechanism of stable-intron function require additional experimental validation, one interesting aspect of this mechanism is that it operates regardless of the stable-intron sequence or genomic origin. This observation provides an example in which the history of a noncoding RNA, i.e., how the molecule is born, can be sufficient to imbue a cellular function, independent of any primary-sequence considerations. Similar principles might also apply to other noncoding RNAs, especially those that lack primary-sequence conservation.

Our results add an unexpected dimension to possible fates and functions of spliceosomal introns within eukaryotic biology. Although noteworthy examples of intron lariats and trimmed linear introns with lives post-splicing have been reported (Hesselberth, 2013), the introns described here are unique in that they persist as excised, full-length, debranched RNAs. Moreover, their stability is specifically and dramatically regulated in response to environmental changes, and they collectively act as a newly defined type of functional ncRNA. Our focus has been in the contexts of TORC1 inhibition and saturated growth, but stable introns might also be induced and have similar functions in other conditions, such as during meiosis, a time at which ribosome biogenesis and splicing competition are known to be dynamically regulated (Chu et al., 1998; Juneau et al., 2007; Munding et al., 2013). Although intron-rich eukaryotes cannot as obviously leverage global spliceosome availability to manipulate production of specific subsets of proteins, some might still use excised linear introns to perform biological functions, perhaps in environmental conditions not extensively profiled to date. At this point, we know that excised linear introns accumulate and function in at least one eukaryotic lineage, and we would be surprised if it is the only one.

Acknowledgements We thank C. Burge, D. Pincus, D. Sabatini, P. Sharp, and members of the Bartel and Fink labs for comments and discussion, G. Li, T. Pham, and A. Symbor-Nagrabska for experimental assistance, R. Loewith, D. Pincus, and V. Vyas for reagents, the Whitehead Institute Genome Technology Core for sequencing, and the Whitehead Institute Proteomics Core Facility for mass spectrometry. This work was supported by NIH grant GM118135 (D.P.B.). D.P.B. is an investigator of the Howard Hughes Medical Institute.

Author Contributions Apart from tetrad dissections performed by G.R.F., J.T.M. performed all experiments and analyses. D.P.B. supervised with help from G.R.F. All authors contributed to the design of the study and preparation of the manuscript.

Author Information Sequencing data and the processed data for each gene will be available at the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo). The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to D.P.B. ([email protected]).

Methods

Yeast strains and genetic manipulations. S. cerevisiae strains used in this study are listed in Supplementary Table 1. All strains were constructed in the BY4741 background unless otherwise noted. A TAP42 heterozygous diploid knockout strain (Horizon Discovery) was transformed with plasmids encoding either wild-type (TAP42) or mutant (tap42-11) alleles before sporulation and tetrad dissection. Transformations were performed using standard methods. Intron deletions and endogenous BP manipulations were made using a CRISPR-Cas9 system adapted for use in S. cerevisiae (Vyas et al., 2015; Vyas et al., 2018) and were confirmed by colony PCR and Sanger sequencing of relevant genomic loci. Clones with correct deletions were counter-selected for the Cas9 plasmid by plating onto 5-FOA media before their use in experiments. This process was iterated to generate strains with multiple deleted introns. Strains that underwent 5-FOA counter-selection were confirmed to not be petite by patching to YP-glycerol plates. Plasmids used in this study are listed in Supplementary Table 2. Oligonucleotides used for guide RNAs and repair templates are listed in Supplementary Table 3 (IDT).

Growth conditions and harvesting. Yeast were grown at 30 °C on standard synthetic complete (SC) plate and liquid media, except strains in the tap42::KanMX background, which were grown at 25 °C due to temperature sensitivity. SC-Ura and SC-Trp were used in some experiments to maintain selection for URA3- and TRP1-expressing plasmids. Rapamycin (LC Laboratories) was from fresh stock solutions of 10 mM rapamycin dissolved in DMSO, diluted to 10 µM in water immediately before use. Cultures were seeded at OD600 = 0.2 from overnight cultures (typically OD600 ~6) for growth to log phase or saturation. Log-phase cultures were harvested during early log phase, typically at an OD600 of 0.5, reached 4-5 h after seeding. Unless otherwise indicated, saturated cultures were harvested 18-20 h after seeding. In acute nutrient depletion experiments, cultures were grown to mid-log phase (OD600 -), filtered, washed two times with water, and resuspended in appropriate media (SC-glucose, SC-ammonium sulfate, SC-leucine, or SC-uracil). All cultures were rapidly harvested by vacuum filtration and flash frozen in liquid nitrogen as previously described (Weinberg et al., 2016). Frozen pellets were mechanically lysed using a Sample Prep 6870 Freezer/Mill (Spex SamplePrep; 10 cycles of 2 min on, 2 min off at setting 10). Lysate powder was aliquoted and stored at -80 °C.

RNA-Seq. Total RNA was extracted using TRI Reagent (Ambion) according to the manufacturer's protocol. rRNA was depleted from 5 µg of total RNA using the Ribo-Zero Gold Yeast rRNA Removal Kit (Illumina) according to the manufacturer's protocol. RNA-seq libraries were prepared as described (Subtelny et al., 2014) and sequenced on the Illumina HiSeq platform using 40 bp single reads. A detailed protocol is available at http://bartellab.wi.mit.edu/protocols.html.

RNA-Seq Analyses. Reads were trimmed of adaptor sequence using cutadapt (Martin, 2011). Trimmed reads were aligned to the S. cerevisiae genome (R64-1-1, downloaded from www.yeastgenome.org) using STAR (v2.4) (Dobin et al., 2013) with the parameters "--alignIntronMax 1000 --sjdbOverhang 31 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts" and with "--sjdbGTFfile" supplied with transcript annotations. Intron annotations were constructed from the UCSC Genome browser and the Ares lab yeast intron database (http://intron.ucsc.edu/yeast4.3/), with the Ares lab database (Spingola et al., 1999; Davis et al., 2000) providing branch point annotations used in subsequent analyses. RNA-seq visualization was performed using IGV (v.2.3.57) (Robinson et al., 2011; Thorvaldsdóttir et al., 2013).
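As a rough illustration of the alignment step, the STAR parameters quoted above can be assembled into a single argument list. This is only a sketch: the function name and all paths are hypothetical, and the `--genomeDir`, `--readFilesIn`, and `--outFileNamePrefix` options are standard STAR options added here as assumptions, since the text does not state how the index, input, and output locations were specified.

```python
def star_align_command(genome_dir, reads_fastq, gtf_path, out_prefix):
    """Build (but do not run) a STAR argument list using the quoted parameters.

    genome_dir, reads_fastq, gtf_path, and out_prefix are caller-supplied paths;
    none of them come from the original text.
    """
    return [
        "STAR",
        "--genomeDir", genome_dir,          # assumed: pre-built R64-1-1 index
        "--readFilesIn", reads_fastq,       # assumed: adaptor-trimmed reads
        "--sjdbGTFfile", gtf_path,          # transcript annotations
        "--alignIntronMax", "1000",
        "--sjdbOverhang", "31",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--quantMode", "GeneCounts",
        "--outFileNamePrefix", out_prefix,  # assumed output location
    ]


# Hypothetical paths, for illustration only.
cmd = star_align_command("index/R64-1-1", "trimmed.fastq", "annotations.gtf", "out/sample_")
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where STAR is installed, which avoids shell-quoting problems with paths.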

Untemplated nucleotides were found and cataloged by extracting reads from BAM files that uniquely mapped to the sense strand of an intron but which also contained soft-clipped, non-mapping nucleotides at the 3' end of the read. The Mixture-of-Isoforms (MISO, v.0.5.4) framework (Katz et al., 2010) was used to quantify relative expression of three potential isoforms for every intron: spliced (intron degraded), spliced (intron stable), and intron retained. MISO requires uniform read length as input. As such, 3' adaptor sequence was removed by trimming a constant 8 nt from every read. Based on the distribution of prior cutadapt-mediated trimming, this removed all 3' adaptor sequence from >98% of reads. Retained intron GFF events were constructed using "gff_make_annotation" from rnaseqlib (http://rnaseqlib.readthedocs.io/en/clip/). Stable intron GFF events were made by directly modifying the retained intron GFF events to instead include "intron only" as a potential outcome of splicing. Although reads that supported a stable-intron splicing event necessarily could also support a retained-intron splicing event, in practice, the greater abundance of most stable introns relative to flanking exons enabled MISO to identify stable intron as the predominant isoform for many introns in stable-intron-inducing conditions.
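The search for untemplated 3' nucleotides can be illustrated with a small helper that reads the CIGAR string of an aligned read and returns any soft-clipped bases at the 3' end of the original read. This is a minimal sketch operating on SAM text fields, not the pipeline used here; the function name is hypothetical, and the strand-specific filtering of reads to the sense strand of an intron is assumed to happen elsewhere.

```python
import re

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def three_prime_soft_clip(cigar, seq, is_reverse):
    """Return soft-clipped bases at the 3' end of the original read (5'->3').

    cigar, seq: the SAM CIGAR and SEQ fields of one alignment.
    is_reverse: True if SAM flag 0x10 is set (SEQ is stored reverse-complemented).
    Returns "" when the 3' end of the read is not soft-clipped.
    """
    # Hard clips (H) are absent from SEQ, so drop them before inspecting the ends.
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return ""
    if is_reverse:
        # For reverse-strand alignments the read's 3' end is the left end of SEQ.
        n, op = ops[0]
        if op != "S" or n == 0:
            return ""
        return seq[:n].translate(_COMPLEMENT)[::-1]
    n, op = ops[-1]
    if op != "S" or n == 0:
        return ""
    return seq[-n:]


# Toy reads for illustration: 30 aligned bases followed by three untemplated A's.
print(three_prime_soft_clip("30M3S", "C" * 30 + "AAA", False))
print(three_prime_soft_clip("3S30M", "TTT" + "G" * 30, True))
```

Reporting the clipped bases in the orientation of the original read keeps forward- and reverse-strand alignments directly comparable when tallying tail composition.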

To be identified as a stable intron expressed in a given condition, accumulation of the intron had to exceed thresholds calibrated on experimentally validated cases. First, intron accumulation (transcripts per million, TPM) in the stable-intron-inducing condition had to be greater than 50% of exon accumulation (TPM). Second, the intron accumulation (TPM) in the stable-intron-inducing condition had to be more than twice that of its accumulation (TPM) in log phase (assigning a pseudocount of 0.1 reads to introns with 0 reads). Third, the ratio of intron:exon accumulation in the stable-intron-inducing condition had to be greater than 4-fold that of the intron:exon ratio in log phase. Imposing these thresholds eliminated many false positives, including those attributed to intronic snoRNA expression or constitutively poor splicing. For introns exceeding these accumulation thresholds, at least one of the following two additional criteria was required for annotation as a stable intron: 1) > 2 terminal adenylylated reads mapping to the 3' terminus of the intron, or 2) MISO-based support for preferential accumulation of the stable-intron isoform in the stable-intron-inducing condition (Bayes factor > 30 when compared to log phase). These criteria identified conservative sets of stable introns expressed in a given condition, erring towards reducing false-positive identifications.
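The annotation criteria above can be summarized as a short decision function. This is a sketch under stated assumptions rather than the code used for the analysis: the class and field names are hypothetical, and the pseudocount (described in the text in units of reads) is applied here as a floor on the log-phase intron value, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class IntronMeasurements:
    intron_tpm_cond: float      # intron TPM in the stable-intron-inducing condition
    exon_tpm_cond: float        # exon TPM in the same condition
    intron_tpm_log: float       # intron TPM in log phase
    exon_tpm_log: float         # exon TPM in log phase
    adenylylated_3p_reads: int  # terminal adenylylated reads at the intron 3' terminus
    miso_bayes_factor: float    # stable-intron isoform, condition versus log phase


def is_stable_intron(m, log_floor=0.1):
    """Apply the three accumulation thresholds plus one of two supporting criteria."""
    # Assumption: a zero log-phase value is replaced by a small floor.
    intron_log = m.intron_tpm_log if m.intron_tpm_log > 0 else log_floor
    # 1) intron accumulation greater than 50% of exon accumulation
    passes_exon = m.intron_tpm_cond > 0.5 * m.exon_tpm_cond
    # 2) intron accumulation more than twice its log-phase accumulation
    passes_induction = m.intron_tpm_cond > 2 * intron_log
    # 3) intron:exon ratio more than 4-fold the log-phase ratio
    #    (cross-multiplied to avoid dividing by zero)
    passes_ratio = (m.intron_tpm_cond * m.exon_tpm_log
                    > 4 * m.intron_tpm_log * m.exon_tpm_cond)
    if not (passes_exon and passes_induction and passes_ratio):
        return False
    # At least one supporting line of evidence is required.
    return m.adenylylated_3p_reads > 2 or m.miso_bayes_factor > 30


# Hypothetical values, for illustration only.
candidate = IntronMeasurements(100.0, 50.0, 5.0, 60.0, 3, 0.0)
print(is_stable_intron(candidate))
```

Keeping the accumulation thresholds separate from the supporting evidence mirrors the two-stage structure of the criteria: the first stage removes snoRNA-hosting and poorly spliced introns, and the second demands independent evidence of an excised linear species.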

A search for motifs enriched in stable introns was performed using the MEME Suite (Bailey et al., 2009) with remaining introns as the background set. k-mer frequencies were generated with the "fasta-get-markov" program from the MEME Suite. In addition, a search for enrichment of position-specific motifs was performed using kpLogo (Wu and Bartel, 2017) with remaining introns as the background set.

Growth curves. Growth curves were collected using Nunc Edge 2.0 96-well plates in a Multiskan GO Microplate Spectrophotometer. Wells were seeded at OD600 = 0.2 from overnight cultures, and plates were incubated at 30 °C. Absorbance was read every 5 min with shaking on the "background" setting, cycling between 1 min on and 1 min off. Every strain tested on each plate was run in technical triplicate. Single wells were censored if artefactual spikes in OD600 attributable to bubbles or condensation were observed. Strains were censored if more than one well was censored. Biological replicates for each strain included at least two independently derived transformants. A replicate of parental BY4741 was included on every plate as a control strain, and SC media was included on every plate as a control condition for every strain being assayed. Growth rate was calculated from the log-linear portion of each growth curve. Growth curves were analyzed with SkanIt (v3.2).
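One common way to obtain a growth rate from the log-linear portion of a curve is a least-squares fit of ln(OD600) against time. The sketch below illustrates that idea only; the analysis here was done in SkanIt, and the function name and the choice to define the log-linear portion by a fixed OD window are assumptions.

```python
import math


def growth_rate(times_h, od, od_min, od_max):
    """Least-squares slope of ln(OD) versus time (per hour) within an OD window.

    od_min and od_max bound the region treated as log-linear; choosing them is
    left to the user and is an assumption of this sketch.
    """
    pts = [(t, math.log(y)) for t, y in zip(times_h, od)
           if y > 0 and od_min <= y <= od_max]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the OD window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points inside the window are all identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return sxy / sxx


# Synthetic exponential curve for illustration: OD = 0.2 * exp(0.4 * t).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ods = [0.2 * math.exp(0.4 * t) for t in times]
print(growth_rate(times, ods, 0.2, 2.0))
```

A relative growth rate of the kind plotted in Fig. 4c would then be the ratio of a strain's slope to that of the control strain measured on the same plate.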

RNA blots. 10 µg of total RNA for each sample was resolved on a denaturing polyacrylamide gel and transferred onto a Hybond membrane (GE Healthcare) using a semi-dry transfer cell. Because UV crosslinking is biased against shorter RNAs, EDC (N-(3-dimethylaminopropyl)-N'-ethylcarbodiimide; Sigma-Aldrich) was used to chemically crosslink 5' phosphates to the membrane (Pall et al., 2007). Blots were hybridized to radio-labeled DNA probes. Probe oligonucleotides are listed in Supplementary Table 4. More details of this protocol are available at http://bartellab.wi.mit.edu/protocols.html. For experiments in Fig. 3a, RNAs were instead resolved on a glyoxal agarose gel and transferred overnight onto a Nytran SuPerCharge Turboblotter membrane (GE Healthcare). RNA blot data were analyzed with ImageQuant TL (v8.1.0.0).

Sedimentation velocity. Crude lysates were prepared by re-suspending an aliquot of thawed lysate powder (500-800 µL of loosely packed powder) in 1 mL of lysis buffer (10 mM Tris-HCl [pH 7.4], 5 mM MgCl2, 100 mM KCl, 1% Triton X-100, 1% sodium deoxycholate, 2 mM DTT, 20 U/ml SUPERase-In [Ambion], cOmplete EDTA-free Protease Inhibitor Cocktail [Roche]). The lysates were placed on a rotator mixer at 4 °C for 5 min to allow for re-suspension. Following brief vortexing, lysates were centrifuged at 1,300 x g for 10 min, and the supernatant was loaded onto a 12.5 mL linear 10-30% (w/v) sucrose gradient (20 mM HEPES-KOH [pH 7.4], 5 mM MgCl2, 100 mM KCl, 2 mM DTT, 20 U/ml SUPERase-In). Gradients were centrifuged in a pre-chilled SW-41 Ti rotor at 38,000 rpm for 4 hr at 4 °C. Gradients were fractionated using the Piston Gradient Fractionator (Biocomp) in 1 mL fractions. For RNA analysis, total RNA was extracted from a portion of each fraction using TRI Reagent. RNA blots were performed as described above, except RNA loading was normalized by percent of gradient fraction rather than by RNA concentration. For pull-downs, fractions were flash frozen and stored at -80 °C.
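For readers adapting the gradient spin to a different rotor, the relation between rotor speed and relative centrifugal force can be sketched as follows. The formula is the standard one (RCF = 1.118 x 10^-5 x radius in cm x rpm squared). The maximum radius used for the SW 41 Ti rotor (15.31 cm) is an assumption that should be checked against the rotor manual, and the function names are hypothetical.

```python
def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2

def rpm_for_rcf(target_rcf, radius_cm):
    """Rotor speed needed to reach a target RCF at a given radius."""
    return (target_rcf / (1.118e-5 * radius_cm)) ** 0.5

# Assumed maximum radius for the SW 41 Ti rotor; verify against the rotor manual.
SW41_RMAX_CM = 15.31

# Approximate force at the bottom of the tube for the 38,000 rpm spin.
force_at_rmax = rcf(38_000, SW41_RMAX_CM)
```

Matching the force at the maximum radius is only one possible convention; matching the average radius or the clearing factor of the rotor are alternatives.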

Ectopic intron expression, affinity purification, and mass spectrometry. Intron overexpression constructs were based on SC-Ura-selectable pRS416 (CEN) or pRS426 (2µ) vectors (Chee and Haase, 2012) (Extended Data Fig. 3c, Supplementary Table 2). Introns were inserted 49 nucleotides downstream of the URA3 start codon by Gibson assembly (Gibson et al., 2009). Proper splicing was confirmed by RNA blot and cell viability, as cells unable to produce Ura3p through intron removal could not grow on SC-Ura media.

Pull-down experiments were performed in the ECM33Δintron strain. Introns were purified utilizing two MS2 hairpins (2xMS2) inserted in various positions within the ECM33 intron (Supplementary Table 5). To minimize potential effects on splicing, sequence within 60 nt of the 5' splice site and 80 nt of the branch point was kept constant across all constructs. The 2xMS2 hairpin sequence was based on CRISPR RNA scaffold designs (Zalatan et al., 2015). 3xFLAG-tagged MS2 coat protein with a C-terminal nuclear localization signal (FLAG-MCP) was co-expressed from the same construct as MS2-tagged introns.

We were unable to purify intact introns from the supernatant of saturated cultures because of increased endogenous RNase activity in these cultures (McFarlane, 1980). To circumvent this, sucrose gradient fractions containing the intron of interest (typically fractions 6-9, determined by RNA blot) were pooled and used as the starting material for purification. ANTI-FLAG M2 Magnetic Beads (Sigma-Aldrich) (20 µL of packed bead volume) were equilibrated and washed twice in 10 volumes of buffer 1 (100 mM KCl, 20 mM HEPES-KOH [pH 7.9], 1% Triton X-100, 20 U/ml SUPERase-In, and cOmplete EDTA-free Protease Inhibitor Cocktail). Pooled fractions (400 µL) were added to the beads and incubated on a rotator mixer for 30 min at 4 °C. Remaining at 4 °C, the beads were washed twice in 10 volumes of buffer 1 and twice in 10 volumes of buffer 2 (200 mM KCl, 20 mM HEPES-KOH [pH 7.9], 1% Triton X-100, 20 U/ml SUPERase-In, and cOmplete EDTA-free Protease Inhibitor Cocktail). Bound FLAG-MCP was eluted from beads with 30 µL of 150 ng/µL 3x FLAG peptide (Sigma-Aldrich) on a rotator mixer for 30 min at 4 °C. The eluates were precipitated with TCA, digested with trypsin, and labeled with TMTs to allow for quantitative comparisons between 6 total samples (3 control and 3 test samples). Peptides were analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) using an Orbitrap Elite (Thermo Fisher) coupled with a NanoAcquity UPLC system (Waters). Peptides were identified using SEQUEST, and data were analyzed using PEAKS Studio (Bioinformatics Solutions).

Experimental design and reproducibility. No statistical methods were used to predetermine sample size. Growth curve cultures were randomized by permutation of strain placement on 96-well plates across experiments. Edge effects were not significant within the time over which measurements were taken. The investigators were not blinded to allocation during experiments and outcome assessment. RNA-seq results for biological replicates correlated well (log-phase culture, R² = 0.98 [mRNA, n = 5898]; saturated culture, R² = 0.90 [mRNA, n = 6211] and 0.78 [stable introns, n = 29]).
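The replicate correlations reported above can be illustrated with a minimal sketch that computes R² as the squared Pearson correlation between two replicate expression vectors. Whether values were log-transformed before correlation is not stated here, so any transformation is left to the caller; the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy ** 2 / (sxx * syy)
```

In practice the two inputs would be per-gene (or per-intron) abundance estimates from the two biological replicates, restricted to features passing the expression cutoff.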

References

Arenas, J., and Hurwitz, J. (1987). Purification of a RNA debranching activity from HeLa cells. Journal of Biological Chemistry 262, 4274-4279.

Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren, J., Li, W.W., and Noble, W.S. (2009). MEME SUITE: tools for motif discovery and searching. Nucleic acids research 37, W202-W208.

Barnett, J.A., and Entian, K.D. (2005). A history of research on yeasts 9: regulation of sugar metabolism. Yeast 22, 835-894.

Bergkessel, M., Whitworth, G.B., and Guthrie, C. (2011). Diverse environmental stresses elicit distinct responses at the level of pre-mRNA processing in yeast. RNA 17, 1461-1478.

Black, D.L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annual review of biochemistry 72, 291-336.

Bonnet, A., Grosso, A.R., Elkaoutari, A., Coleno, E., Presle, A., Sridhara, S.C., Janbon, G., Geli, V., de Almeida, S.F., and Palancade, B. (2017). Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability. Molecular cell 67, 608-621.e606.

Broach, J.R. (2012). Nutritional control of growth and development in yeast. Genetics 192, 73-105.

Chapman, K.B., and Boeke, J.D. (1991). Isolation and characterization of the gene encoding yeast debranching enzyme. Cell 65, 483-492.

Chee, M.K., and Haase, S.B. (2012). New and Redesigned pRS Plasmid Shuttle Vectors for Genetic Manipulation of Saccharomyces cerevisiae. G3 (Bethesda) 2, 515-526.

Chu, S., DeRisi, J., Eisen, M., Mulholland, J., Botstein, D., Brown, P.O., and Herskowitz, I. (1998). The transcriptional program of sporulation in budding yeast. Science 282, 699-705.

Davis, C.A., Grate, L., Spingola, M., and Ares Jr, M. (2000). Test of intron predictions reveals novel splice sites, alternatively spliced mRNAs and new introns in meiotically regulated genes of yeast. Nucleic acids research 28, 1700-1706.

Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21.

Domdey, H., Apostol, B., Lin, R.-J., Newman, A., Brody, E., and Abelson, J. (1984). Lariat structures are in vivo intermediates in yeast pre-mRNA splicing. Cell 39, 611-621.

Fourmann, J.B., Schmitzova, J., Christian, H., Urlaub, H., Ficner, R., Boon, K.L., Fabrizio, P., and Luhrmann, R. (2013). Dissection of the factor requirements for spliceosome disassembly and the elucidation of its dissociation products using a purified splicing system. Genes & development 27, 413-428.

Gibson, D.G., Young, L., Chuang, R.-Y., Venter, J.C., Hutchison, C.A., and Smith, H.O. (2009). Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature methods 6, 343-345.

Grabowski, P.J., Padgett, R.A., and Sharp, P.A. (1984). Messenger RNA splicing in vitro: an excised intervening sequence and a potential intermediate. Cell 37, 415-427.

Gray, J.V., Petsko, G.A., Johnston, G.C., Ringe, D., Singer, R.A., and Werner-Washburne, M. (2004). "Sleeping beauty": quiescence in Saccharomyces cerevisiae. Microbiol Mol Biol Rev 68, 187-206.

Hesselberth, J.R. (2013). Lives that introns lead after splicing. Wiley interdisciplinary reviews RNA 4, 677-691.

Hooks, K.B., Naseeb, S., Parker, S., Griffiths-Jones, S., and Delneri, D. (2016). Novel intronic RNA structures contribute to maintenance of phenotype in Saccharomyces cerevisiae. Genetics 203, 1469-1481.

Ingolia, N.T., Ghaemmaghami, S., Newman, J.R., and Weissman, J.S. (2009). Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218-223.

Irimia, M., and Roy, S.W. (2014). Origin of spliceosomal introns and alternative splicing. Cold Spring Harbor perspectives in biology 6.

Jia, H., Wang, X., Liu, F., Guenther, U.P., Srinivasan, S., Anderson, J.T., and Jankowsky, E. (2011). The RNA helicase Mtr4p modulates polyadenylation in the TRAMP complex. Cell 145, 890-901.

Juneau, K., Palm, C., Miranda, M., and Davis, R.W. (2007). High-density yeast-tiling array reveals previously undiscovered introns and extensive regulation of meiotic splicing. Proceedings of the National Academy of Sciences of the United States of America 104, 1522-1527.

Katz, Y., Wang, E.T., Airoldi, E.M., and Burge, C.B. (2010). Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nature methods 7, 1009-1015.

Koonin, E.V. (2006). The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate? Biol Direct 1, 22.

LaCava, J., Houseley, J., Saveanu, C., Petfalski, E., Thompson, E., Jacquier, A., and Tollervey, D. (2005). RNA degradation by the exosome is promoted by a nuclear polyadenylation complex. Cell 121, 713-724.

Loewith, R., and Hall, M.N. (2011). Target of rapamycin (TOR) in nutrient signaling and growth control. Genetics 189, 1177-1201.

Loewith, R., Jacinto, E., Wullschleger, S., Lorberg, A., Crespo, J.L., Bonenfant, D., Oppliger, W., Jenoe, P., and Hall, M.N. (2002). Two TOR complexes, only one of which is rapamycin sensitive, have distinct roles in cell growth control. Molecular cell 10, 457-468.

Lorsch, J.R., Bartel, D.P., and Szostak, J.W. (1995). Reverse transcriptase reads through a 2'-5' linkage and a 2'-thiophosphate in a template. Nucleic acids research 23, 2811-2814.

Martin, A., Schneider, S., and Schwer, B. (2002). Prp43 is an essential RNA-dependent ATPase required for release of lariat-intron from the spliceosome. Journal of Biological Chemistry 277, 17743-17750.

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet journal 17, pp. 10-12.

McFarlane, E.S. (1980). Ribonuclease activity during G1 arrest of the yeast Saccharomyces cerevisiae. Archives of microbiology 124, 243-247.

Munding, E.M., Shiue, L., Katzman, S., Donohue, J.P., and Ares Jr, M. (2013). Competition between pre-mRNAs for the splicing machinery drives global regulation of splicing. Molecular cell 51, 338-348.

Ng, R., Domdey, H., Larson, G., Rossi, J., and Abelson, J. (1985). A test for intron function in the yeast actin gene. Nature 314, 183-184.

Padgett, R.A., Konarska, M.M., Grabowski, P.J., Hardy, S.F., and Sharp, P.A. (1984). Lariat RNA's as intermediates and products in the splicing of messenger RNA precursors. Science 225, 898-904.

Pall, G.S., Codony-Servat, C., Byrne, J., Ritchie, L., and Hamilton, A. (2007). Carbodiimide-mediated cross-linking of RNA to nylon membranes improves the detection of siRNA, miRNA and piRNA by northern blot. Nucleic acids research 35, e60.

Parenteau, J., Durand, M., Morin, G., Gagnon, J., Lucier, J.F., Wellinger, R.J., Chabot, B., and Elela, S.A. (2011). Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell 147, 320-331.

Parenteau, J., Durand, M., Veronneau, S., Lacombe, A.-A., Morin, G., Guérin, V., Cecez, B., Gervais-Bird, J., Koh, C.-S., and Brunelle, D. (2008). Deletion of many yeast introns reveals a minority of genes that require splicing for function. Molecular biology of the cell 19, 1932-1941.

Petfalski, E., Dandekar, T., Henry, Y., and Tollervey, D. (1998). Processing of the precursors to small nucleolar RNAs and rRNAs requires common components. Molecular and cellular biology 18, 1181-1189.

Pleiss, J.A., Whitworth, G.B., Bergkessel, M., and Guthrie, C. (2007). Rapid, transcript-specific changes in splicing in response to environmental stress. Molecular cell 27, 928-937.

Qian, L., Vu, M.N., Carter, M., and Wilkinson, M.F. (1992). A spliced intron accumulates as a lariat in the nucleus of T cells. Nucleic acids research 20, 5345-5350.

Qin, D., Huang, L., Wlodaver, A., Andrade, J., and Staley, J.P. (2016). Sequencing of lariat termini in S. cerevisiae reveals 5' splice sites, branch points, and novel splicing events. RNA 22, 237-253.

Qu, L.-H., Henry, Y., Nicoloso, M., Michot, B., Azum, M.-C., Renalier, M.-H., Caizergues-Ferrer, M., and Bachellerie, J.-P. (1995). U24, a novel intron-encoded small nucleolar RNA with two 12 nt long, phylogenetically conserved complementarities to 28S rRNA. Nucleic acids research 23, 2669-2676.

Robinson, J.T., Thorvaldsdóttir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nature biotechnology 29, 24-26.

Rodriguez, J.R., Pikielny, C.W., and Rosbash, M. (1984). In vivo characterization of yeast mRNA processing intermediates. Cell 39, 603-610.

Ruskin, B., and Green, M.R. (1985). Specific and stable intron-factor interactions are established early during in vitro pre-mRNA splicing. Cell 43, 131-142.

Ruskin, B., Krainer, A.R., Maniatis, T., and Green, M.R. (1984). Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro. Cell 38, 317-331.

Saxton, R.A., and Sabatini, D.M. (2017). mTOR signaling in growth, metabolism, and disease. Cell 168, 960-976.
Sharp, P.A., Konarska, M., Grabowski, P., Lamond, A., Marciniak, R., and Seiler, S. (1987). Splicing of messenger RNA precursors. Paper presented at: Cold Spring Harbor symposia on quantitative biology (Cold Spring Harbor Laboratory Press).

Spingola, M., Grate, L., Haussler, D., and Ares Jr, M. (1999). Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA 5, 221-234.

Subtelny, A.O., Eichhorn, S.W., Chen, G.R., Sive, H., and Bartel, D.P. (2014). Poly(A)-tail profiling reveals an embryonic switch in translational control. Nature 508, 66-71.

Terashima, H., Hamada, K., and Kitada, K. (2003). The localization change of Ybr078w/Ecm33, a yeast GPI-associated protein, from the plasma membrane to the cell wall, affecting the cellular function. FEMS microbiology letters 218, 175-180.

Thorvaldsdóttir, H., Robinson, J.T., and Mesirov, J.P. (2013). Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in bioinformatics 14, 178-192.

Vyas, V.K., Barrasa, M.I., and Fink, G.R. (2015). A Candida albicans CRISPR system permits genetic engineering of essential genes and gene families. Science advances 1, e1500248.

Vyas, V.K., Bushkin, G.G., Bernstein, D.A., Getz, M.A., Sewastianik, M., Barrasa, M.I., Bartel, D.P., and Fink, G.R. (2018). New CRISPR Mutagenesis Strategies Reveal Variation in Repair Mechanisms among Fungi. mSphere 3.

Wan, R., Yan, C., Bai, R., Lei, J., and Shi, Y. (2017). Structure of an intron lariat spliceosome from Saccharomyces cerevisiae. Cell 171, 120-132.e12.

Warner, J.R. (1999). The economics of ribosome biosynthesis in yeast. Trends in biochemical sciences 24, 437-440.

Weinberg, D.E., Shah, P., Eichhorn, S.W., Hussmann, J.A., Plotkin, J.B., and Bartel, D.P. (2016). Improved Ribosome-Footprint and mRNA Measurements Provide Insights into Dynamics and Regulation of Yeast Translation. Cell reports 14, 1787-1799.

Werner-Washburne, M., Braun, E.L., Crawford, M.E., and Peck, V.M. (1996). Stationary phase in Saccharomyces cerevisiae. Molecular microbiology 19, 1159-1166.

Wu, T.-T., Su, Y.-H., Block, T.M., and Taylor, J.M. (1996). Evidence that two latency-associated transcripts of herpes simplex virus type 1 are nonlinear. Journal of virology 70, 5962-5967.

Wu, X., and Bartel, D.P. (2017). kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences. Nucleic acids research.

Zalatan, J.G., Lee, M.E., Almeida, R., Gilbert, L.A., Whitehead, E.H., La Russa, M., Tsai, J.C., Weissman, J.S., Dueber, J.E., Qi, L.S., et al. (2015). Engineering complex synthetic transcriptional programs with CRISPR RNA scaffolds. Cell 160, 339-350.

Zhang, Z., Hesselberth, J.R., and Fields, S. (2007). Genome-wide identification of spliced introns using a tiling microarray. Genome research 17, 503-509.

Zheng, X.-F., Fiorentino, D., Chen, J., Crabtree, G.R., and Schreiber, S.L. (1995). TOR kinase domains are required for two distinct functions, only one of which is inhibited by rapamycin. Cell 82, 121-130.

Extended Data

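[Figure image: RNA-seq coverage at the ECM33 locus in saturated culture (scale bar, 0.5 kb), with the read pileup across the region surrounding the BP. Sequence shown beneath the pileup: AATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAAGCAGCTAACT]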

Extended Data Figure 1. RNA-seq coverage across the BP and to the 3'SS of the ECM33 intron. Shown is the pileup of reads mapping to the 61-nucleotide region centered on the ECM33 BP, replotted from Fig. 1b (red, top). The position of the BP (closed circle with flanking dashed lines) is indicated on the intron (thick gray line) and relative to 3' exon (gray box). Below, all reads mapping uniquely to this region are shown (thin gray lines). Reads mapping across the exon-exon junction are colored blue in the region of the excised intron and are shown above the other reads. Mismatched nucleotides within reads are indicated with colored bars, with color of the bar indicating the identity of the mismatch. Terminal untemplated nucleotides have been clipped from reads.

[Figure image, panels a-c: RNA blots of log-phase (LP) and saturated (S) cultures probed for the ECM33 intron. a, Gradient fractions, with the positions of 40S and 60S complexes marked. b, WT and ECM33Δintron strains with or without ectopic expression of the ECM33 intron from the CEN/ARS URA3 plasmid. c, ECM33Δintron strain carrying plasmids A-E (CEN/ARS, PTEF1, FLAG-MCP, URA3). Each intron variant is schematized with its 5'SS (GTATGT), BP (TACTAAC), and 3'SS (CAG); the position of the 2xMS2 insertion differs among variants C-E.]

Intron encoded by plasmid:

A. pRS416-PTEF1-MCP-nointron

B. pRS416-PTEF1-MCP-ECM33_intron

C. pRS416-PTEF1-MCP-5P-ECM33_intron

D. pRS416-PTEF1-MCP-5M-ECM33_intron

E. pRS416-PTEF1-MCP-3P-ECM33_intron

Extended Data Figure 2. Stable-intron sedimentation and expression constructs for pull-down and mass spectrometry. a, Co-sedimentation of the ECM33 stable intron with large ribonucleoprotein complexes. Cleared lysate from a saturated yeast culture was fractionated by sedimentation into a 10-30% sucrose gradient, and RNA was extracted from each fraction. Shown is an RNA blot that resolved 25% of the RNA from each fraction and was probed for the ECM33 intron. Fractions are oriented with increasing sedimentation left-to-right. RNA was also extracted from 12.5% of the total lysate before sedimentation, of which 50% was loaded for comparison (left lane). Sedimentation of 40S and 60S complexes is marked based on sedimentation of the ribosome subunits. Migration of markers with lengths indicated (nucleotides) is shown at the left. b, Endogenous behavior of stable introns ectopically expressed from the URA3 gene. Shown is an RNA blot probed for the ECM33 intron after resolving total RNA from cultures expressing the ECM33 intron from the endogenous ECM33 locus (WT, left lanes), cultures from a strain in which the intron had been deleted (ECM33Δintron, middle lanes), and cultures of this deletion strain ectopically expressing the ECM33 intron spliced from the plasmid-borne URA3 reporter gene (right lanes; pRS416-ECM33_intron plasmid shown on right; PURA3, URA3 promoter; CEN/ARS, low-copy origin of replication). Interior lanes not relevant to this experiment were removed from this image (vertical line). Otherwise, this panel is as in Fig. 1c. c, Endogenous behavior of stable introns with MS2 hairpins inserted to be used as affinity tags for pull-downs. Five different plasmids with a common backbone (right; PTEF1, TEF1 promoter; FLAG-MCP, coding region of FLAG-tagged MS2 coat protein) each expressed URA3 with a different variant of the ECM33 intron (variants A-E, schematized below). These plasmids were each expressed in the strain that lacked an endogenous ECM33 intron (ECM33Δintron). The RNA blot resolved total RNA from the indicated cultures and was probed for a sequence common to the intron variants; otherwise as in Fig. 1c. The 2xMS2 hairpin region is 90 nucleotides long, and the expected linear-intron sizes were: A, no intron; B, 330 nucleotides; C, 420 nucleotides; D, 300 nucleotides; E, 420 nucleotides.
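The loading fractions in panel a and the expected intron sizes in panel c can be checked with simple arithmetic, sketched here. The helper name is hypothetical. The 120-nucleotide replacement used for variant D is inferred only from the stated sizes (330 - 120 + 90 = 300) and is not stated in this legend.

```python
from fractions import Fraction

# Panel a: RNA was extracted from 12.5% of the lysate, and 50% of that was loaded.
extracted_share = Fraction(125, 1000)
loaded_share = Fraction(1, 2)
input_lane_share = extracted_share * loaded_share  # share of total lysate in the input lane

NATIVE_INTRON_NT = 330   # ECM33 intron (variant B)
MS2_INSERT_NT = 90       # 2xMS2 hairpin region

def tagged_length(native_nt, insert_nt, replaced_nt=0):
    """Expected linear-intron length after inserting a tag, optionally replacing native sequence."""
    return native_nt - replaced_nt + insert_nt

variant_c = tagged_length(NATIVE_INTRON_NT, MS2_INSERT_NT)
# replaced_nt=120 is an inference from the stated 300-nt size, not a value given in the legend.
variant_d = tagged_length(NATIVE_INTRON_NT, MS2_INSERT_NT, replaced_nt=120)
```

Under these numbers the input lane represents 6.25% of the lysate, whereas each gradient lane represents 25% of the RNA in its fraction.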

[Figure image: RNA-seq profiles for genes with stable introns. Panel titles recoverable from the figure: AIM11, ARF2, ARP2, CMC2, CNB1, DID4, DYN2, EFM5, GLC7, GPI15, HNT1, KEI1, MPT5, NBL1, QCR10, POP8, RFA2, RIM1, RPO26, RPS23A, RPS29A, RPS6B, RPS8B, SAR1, TDA5, SUN4, UBC4, URA2, USV1, VMA9, YCL012C.]

Extended Data Figure 3. RNA-seq profiles of genes with stable introns. Profiles from the stable-intron-inducing condition (red) and log-phase culture (blue) are shown for the indicated 32 genes with confidently identified stable introns not already depicted in Figure 1a. For all but four of these, the profile of the stable-intron-inducing condition is from RNA-seq of the saturated-liquid sample. The four exceptions were not confidently identified in saturated liquid (Extended Data Table 2); for these, the profile of the stable-intron-inducing condition is from either the rapamycin-treated sample (DYN2_2 and TDA5) or the saturated lawn (RPO26 and RPS8B). Scale bars, 100 nucleotides. Otherwise, this panel is as in Fig. 1a.

[Figure image, panels a-e: a, Sequence logos of the 5'SS, BP, and 3'SS motifs for stable introns (n = 29) and remaining introns (n = 271). b, Cumulative distributions of intron length (nt) for the same two groups. c, Top discovered motifs, either controlled for intron-nucleotide frequencies or with no control. d, Positional k-mer logos relative to the 5'SS, BP, and 3'SS. e, mRNA in log-phase culture (TPM) plotted against saturated culture, and cumulative distributions of mRNA fold change (log2), for stable-intron genes (n = 29) and remaining genes (n = 6137).]

Extended Data Figure 4. Stable introns are indistinguishable from other introns in many respects. a, Similar splicing motifs compared to other introns. Plotted are information-content logos of splicing motifs (6-mer 5'SS, 8-mer BP, and 3-mer 3'SS) for stable introns (bottom) and other introns (top). b, Similar length distribution compared to other introns. Plotted are cumulative distributions of intron lengths (P > 0.05, one-tailed Kolmogorov-Smirnov test). c, No significantly enriched motifs within stable introns. Plotted are stable-intron sequence motifs discovered by MEME (Bailey et al., 2009) either controlling for non-stable intron k-mer frequencies (top) or without controlling (bottom). No significant motifs were discovered in stable introns when k ≥ 6. The motifs discovered without controlling for k-mer frequencies matched the canonical BP and 5'SS motifs (see a). BP and 5'SS motifs were also the only significantly enriched motifs discovered when k ≤ 5. d, No enriched positional k-mer motifs detected by kpLogo (Wu and Bartel, 2017) in stable introns. Plotted are the most enriched k-mers at positions relative to 5'SS (top), BP (middle), and 3'SS (bottom) comparing stable introns to unstable introns. Stacked nucleotides at a position represent the most significant motif starting at that position. The height is scaled relative to the significance of the motif, as determined by the one-sided binomial test statistic (y axes). Black numbers indicate invariant nucleotides occurring > 95% of the time at the position. No k-mers were significantly enriched when using a Bonferroni-corrected P of 0.01. e, Expression of mRNAs from genes containing stable introns. Left, relationship between the expression in log-phase and saturated cultures (expression cutoff, 1 transcript per million [TPM]). Points for genes expressing stable introns are indicated (orange), and the one for ECM33 is labeled. Right, comparison of the expression results of genes with stable introns to those of the remaining genes. Plotted are the cumulative distributions of log2-fold-changes in expression observed between log-phase and saturated cultures, which show no significant difference between stable-intron genes and other genes (P > 0.05, two-tailed Kolmogorov-Smirnov test).
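The distribution comparisons in panels b and e rely on the two-sample Kolmogorov-Smirnov test. The sketch below shows the two-sided statistic and its asymptotic P value in plain Python; it is an illustration of the test, not the software used for this figure (which is not named here), and the function names are hypothetical. The one-tailed variant used in panel b is not shown.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The supremum of the gap between two step functions is reached at a data point.
    for v in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, v) / len(a_sorted)
        fb = bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d

def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Two-sided asymptotic P value from the Kolmogorov distribution (large-sample approximation)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0:
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam) for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2 * s))
```

For example, `ks_statistic(stable_lengths, other_lengths)` followed by `ks_pvalue_asymptotic(d, 29, 271)` would compare the two intron-length distributions, where both list names are placeholders for the measured lengths.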

[Figure image: RNA blot of log-phase (LP) and saturated (S) cultures overexpressing the indicated introns, with the BP-3'SS distance (nt) of each intron listed above the lanes. Probed bands: ECM33 intron (330 nt), ACT1 intron (309 nt), HNT2 intron (89 nt), RPL27B intron (384 nt), APS3 intron (77 nt).]

Extended Data Figure 5. Support for a role of BP-3'SS distance in specifying stable-intron formation. The indicated introns were ectopically expressed from the URA3 splicing construct (Extended Data Fig. 2b). Shown are results from an RNA blot that resolved total RNA from cultures overexpressing the indicated introns and was probed for the indicated introns (length of excised intron in parentheses). The ECM33 and RPL27B introns were probed sequentially, and then the ACT1, HNT2, and APS3 introns were probed concurrently. The ACT1 probe was validated on synthetic transcripts resembling the ACT1 intron, which were produced by in vitro transcription (not shown). Migration of markers with lengths indicated (nucleotides) is at the left.

[Figure image, panels a-e: RNA blots probed for stable introns and 5.8S rRNA, with time after dilution or depletion (h), drug treatment, and OD600 at harvest listed above each lane. a, Rapamycin or cycloheximide (0.25 mg/L or 25 mg/L) treatment, probed for the ECM33 intron. b, sch9Δ strain carrying the SCH9 or SCH9DE plasmid, with or without rapamycin. c, tap42Δ strain carrying the TAP42 or tap42-11 plasmid, probed for the ECM33 and SAC6 introns, with a bar graph of intron signal uncorrected and corrected for loading. d, Cultures depleted of the indicated nutrient (mock, carbon, nitrogen, leucine, uracil), probed for the RPS29A, ECM33, SAC6, and USV1 introns. e, RNA-seq profiles of ECM33 after 1 h and 4 h of rapamycin treatment (scale bar, 0.5 kb).]

Extended Data Figure 6. Assessing aspects of TORC1 regulation on stable-intron expression. a, Inability of TORC1-independent inhibition of protein synthesis to prematurely induce stable-intron accumulation. The left lanes show a replicate of Fig. 3b, and the right lanes show results after treatment with either low (0.25 mg/L) or high (25 mg/L) concentrations of cycloheximide. As indicated by OD600 at harvest, the mild cycloheximide treatment allowed the culture to reach an OD600 of 3.4 after 24 h, which was equivalent to the OD600 reached after 10 h without cycloheximide (Fig. 3a). b, Dispensability of TORC1-responsive Sch9 for stable-intron formation. Samples were grown in SC-Trp media to maintain plasmids that either rescued (SCH9) or did not rescue (SCH9DE) Sch9 activity. Otherwise, this panel is as in Fig. 3b. c, Dispensability of TORC1-responsive Tap42 for stable-intron formation. Samples were grown at 25 °C due to temperature sensitivity of the tap42-11 allele, which led to delay in rapamycin-mediated stable-intron formation (compare 4 h +rapamycin sample to same condition in Fig. 3b, c, Extended Data Fig. 6a, b). Otherwise, the left side of this panel is as in Fig. 3b. At the right is a bar graph showing the average signal for the ECM33 and SAC6 introns before (light blue) and after (dark blue) correction for loading, based on the 5.8S rRNA signals. d, Inability of acute deprivation of select nutrients to induce accumulation of stable introns. To prevent contamination of starting cultures with stable introns contributed by inoculum, cultures were seeded at OD600 = 0.2 from an overnight culture that was allowed to grow to mid-log phase, collected by vacuum filtration, washed in water, and resuspended in fresh media lacking the indicated nutrients. Cultures were harvested after the indicated times and analyzed as in Fig. 1c, except the RNA blot was sequentially reprobed for the RPS29A, ECM33, SAC6, and USV1 introns. As indicated by OD600 at harvest, the sample deprived of ammonium sulfate, the main nitrogen source, was still able to reach a high density after 24 h. e, Undetectable accumulation of the ECM33 intron after 1 h of rapamycin treatment. For comparison, results showing accumulation of this intron in a culture treated with rapamycin for 4 h are re-plotted from Fig. 3c. Otherwise, this panel is as in Fig. 3c.
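The loading correction described for panel c can be sketched as a simple rescaling of each lane by its loading-control signal. This is a minimal illustration that assumes a linear correction against the 5.8S rRNA band and normalization to a chosen reference lane; the function name and the example numbers are hypothetical and are not measurements from this blot.

```python
def loading_corrected(signal, loading_control, reference_control=None):
    """Scale each lane's signal by its loading-control signal relative to a reference lane."""
    if len(signal) != len(loading_control):
        raise ValueError("one loading-control value is needed per lane")
    # By default, normalize to the loading control of the first lane.
    ref = loading_control[0] if reference_control is None else reference_control
    return [s * ref / c for s, c in zip(signal, loading_control)]

# Hypothetical band intensities for two lanes (arbitrary units).
intron_signal = [100.0, 100.0]
rrna_signal = [50.0, 100.0]   # second lane was loaded twice as heavily
corrected = loading_corrected(intron_signal, rrna_signal)
```

In this toy case the second lane's intron signal is halved after correction, because its loading-control band was twice as strong.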

[Figure image: scatter plot of intron accumulation in rapamycin-treated EUHSR culture against accumulation in WT culture (TPM), on logarithmic axes; stable introns, n = 26.]

Extended Data Figure 7. Assessing stable-intron expression in a EUHSR culture. No evidence for compensatory stable-intron expression after genomic deletion of five stable introns. The scatter plot shows the relationship between intron accumulation (TPM, transcripts per million) in rapamycin-treated WT culture and in rapamycin-treated EUHSR culture. The dashed line is placed at x = y. Stable introns are indicated (closed orange circles). Points for introns deleted from the EUHSR genome (ECM33, UBC4, HNT1, SAC6, and RFA2) are labeled. All introns were pseudo-counted at 0.5 TPM.
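The effect of the pseudo-count on a log-scaled comparison can be sketched as follows. The sketch assumes that 0.5 TPM is added to every value before taking the ratio; an alternative reading, in which values are floored at 0.5 TPM, would differ for low-abundance introns. The function names are hypothetical.

```python
import math

PSEUDOCOUNT_TPM = 0.5  # value stated in the legend

def with_pseudocount(tpm, pseudocount=PSEUDOCOUNT_TPM):
    """Add the pseudo-count so that undetected introns can be placed on a log axis."""
    return tpm + pseudocount

def log2_ratio(tpm_mutant, tpm_wt, pseudocount=PSEUDOCOUNT_TPM):
    """log2 of mutant over WT accumulation after pseudo-counting both values."""
    return math.log2(with_pseudocount(tpm_mutant, pseudocount) /
                     with_pseudocount(tpm_wt, pseudocount))
```

An intron that is undetected in both cultures then falls on the x = y line, and an intron deleted from the mutant genome plots at the pseudo-count on the mutant axis.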

[Figure image, panels a-c: a, Scatter plot of accumulation in log-phase culture against accumulation in rapamycin-treated culture (TPM), on logarithmic axes; stable introns, n = 26. b, Cumulative distributions of mRNA fold change (log2) for RPGs (n = 132) and remaining genes (n = 5808). c, Venn diagram of retained introns (103 of 300 total introns) and RPG introns.]

Extended Data Figure 8. Evidence for spliceosome sequestration and control of ribosome production by stable introns. a, Aggregate stable-intron accumulation approaching that of spliceosomal RNAs. Plotted are levels of stable introns and spliceosomal RNAs (labeled closed blue circles), comparing levels in log-phase WT culture to those in rapamycin-treated WT culture. Also plotted is the aggregate stable-intron abundance (closed orange circle, "stable intron total"). Otherwise, as in Extended Data Fig. 7. b, Reduced RPG mRNA expression when overexpressing a stable intron in rapamycin-treated EUHSR culture. Plotted are the cumulative distributions of log2-fold-changes in mRNA expression observed between a EUHSR culture with stable-intron (ECM33) ectopic expression and a EUHSR culture with control-intron (ACT1) ectopic expression. The distribution of changes for mRNAs of RPGs (green) differed from that of other genes (black), with generally lower expression of RPGs in the culture overexpressing the stable intron (P < 10⁻¹⁵, one-tailed Kolmogorov-Smirnov test; expression cutoff, 1 TPM in both samples). c, Less efficient splicing, as detected by increased intron retention, when overexpressing a stable intron in rapamycin-treated EUHSR culture. When analyzing the data set of b, 103 genes had significantly more intron retention when ectopically expressing the stable intron compared to when expressing the control intron (MISO, Bayes factor > 100). The Venn diagram shows the overlap between these genes with increased intron retention and intron-containing RPGs (P < 10~ 1, hypergeometric test).
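The overlap test in panel c is a hypergeometric test, which can be sketched with exact integer arithmetic. The function name is hypothetical, and the numbers in the example call are placeholders chosen for illustration; they are not the counts from this figure.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for the overlap when `draws` items are sampled without replacement
    from `population` items, of which `successes` belong to the category of interest."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)

# Placeholder example: 20 genes in a category among 100 genes, 10 genes drawn, 6 overlapping.
p_example = hypergeom_sf(6, population=100, successes=20, draws=10)
```

For the comparison in this panel, `population` would be the number of intron-containing genes tested, `successes` the number of intron-containing RPGs among them, `draws` the number of genes with increased intron retention, and `k` the observed overlap.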

Enrichment values

Protein name   Control 1   Control 2   Control 3   Test 1   Test 2   Test 3   Mean control   Mean test   Fold difference
Prp9           0.00        1.00        0.57        4.84     6.34     4.83     0.52           5.34        10.20
Hsh49          0.60        1.00        1.20        5.67     5.15     5.24     0.93           5.35        5.74
Cef1           0.93        1.00        0.99        3.45     6.04     5.79     0.97           5.09        5.23
Rse1           0.66        1.00        0.90        3.17     4.16     5.80     0.85           4.38        5.13
Bud31          0.68        1.00        0.96        3.96     3.96     4.16     0.88           4.03        4.58
Prp45          0.88        1.00        0.71        3.12     3.77     3.86     0.86           3.58        4.15
Prp19          0.87        1.00        1.05        3.28     4.03     4.20     0.97           3.84        3.94
Brr2           0.78        1.00        0.69        2.22     3.21     3.66     0.82           3.03        3.68
Prp8           0.75        1.00        0.73        2.00     2.63     3.46     0.83           2.70        3.26
Smb1           0.67        1.00        1.10        2.49     2.80     2.69     0.92           2.66        2.88

Extended Data Table 1. Proteins associated with stable introns. Shown are results for the ten proteins consistently enriched > 2-fold in stable-intron pull-down eluates. Three control samples without a tagged intron and three test samples, each with a unique tagged intron (Extended Data Fig. 2, Supplementary Table 5), were simultaneously analyzed by quantitative mass spectrometry. These ten proteins were the only proteins enriched > 2-fold in each of the nine possible pairwise comparisons between test and control samples. The identities of these proteins were consistent with the excised and debranched introns being part of a complex resembling the ILS complex, in that all ten are known components of the ILS identified through biochemical studies (Fourmann et al., 2013), and most (all but Brr2, Hsh49, Prp9, and Rse1) have also been identified in a cryo-EM structure of the ILS complex (Wan et al., 2017). Moreover, several early spliceosome components (Luc7, Prp3, Snp1, and Snu13) as well as an essential disassembly factor (Prp43) were identified across all samples but not enriched in tagged-intron eluates. Enrichment values were those reported by PEAKS Studio, which are reported relative to values of control 2.
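The summary columns of the table and the pairwise-enrichment criterion can be reproduced with a short sketch. The enrichment values are those listed for Prp9 and Smb1; the function names are hypothetical. The threshold is treated here as "at least 2-fold", an assumption made because rounded values can sit exactly at the threshold, and the comparison is written as a product so that a control value of 0.00 does not cause a division by zero.

```python
from statistics import mean

def fold_difference(controls, tests):
    """Ratio of the mean test enrichment to the mean control enrichment."""
    return mean(tests) / mean(controls)

def enriched_in_all_pairs(controls, tests, threshold=2.0):
    """True if every test sample exceeds every control sample by the threshold (3 x 3 = 9 pairs)."""
    # Assumption: 'at least threshold-fold'; written as a product to tolerate control values of 0.
    return all(t >= threshold * c for t in tests for c in controls)

# Values from the table (relative to control 2).
prp9_controls, prp9_tests = [0.00, 1.00, 0.57], [4.84, 6.34, 4.83]
smb1_controls, smb1_tests = [0.67, 1.00, 1.10], [2.49, 2.80, 2.69]
```

Rounded to two decimals, `fold_difference` returns 10.2 for Prp9 and 2.88 for Smb1, matching the last column of the table.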

Stable intron  Liquid Rapamycin Lawn  BP-3'SS distance (nt)  Intron length (nt)  Description of gene function
AIM11          0                      20                     75                  Protein of unknown function
ARF2           . *                    18                     332                 ADP-ribosylation factor
ARP2           0                      15                     123                 Essential component of the Arp2/3 complex
CMC2           . .                    29                     85                  Protein involved in respiratory chain complex assembly or maintenance
CNB1           0 0                    19                     76                  Calcineurin B
DID4           0                      23                     68                  Class E Vps protein of the ESCRT-III complex
DYN2_2                                17                     80                  Cytoplasmic light chain dynein, microtubule motor protein
ECM33          . .                    25                     330                 GPI-anchored protein of unknown function
EFM5_2         .. 0                   25                     93                  S-adenosylmethionine-dependent lysine methyltransferase
GLC7           . .                    31                     525                 Type 1 S/T protein phosphatase (PP1) catalytic subunit
GPI15          0 0                    14                     74                  Protein involved in the synthesis of GlcNAc-PI
HNT1           0 0                    23                     111                 Adenosine 5'-monophosphoramidase
KEI1           0                      31                     101                 Component of inositol phosphorylceramide (IPC) synthase
MPT5           . . 0                  25                     640                 mRNA-binding protein of the PUF family
NBL1           . . 0                  12                     67                  Subunit of the conserved chromosomal passenger complex (CPC)
POP8           . . 0                  20                     75                  Subunit of both RNase MRP and nuclear RNase P
QCR10          .                      15                     63                  Subunit of the ubiquinol-cytochrome c oxidoreductase complex
RFA2           . .                    24                     108                 Subunit of heterotrimeric Replication Protein A (RPA)
RIM1           . .                    26                     83                  ssDNA-binding protein essential for mitochondrial genome maintenance
RPO26          0                      21                     76                  RNA polymerase subunit ABC23
RPS23A         0                      23                     319                 Ribosomal protein 28 (rp28) of the small (40S) ribosomal subunit
RPS29A         . 0                    22                     488                 Protein component of the small (40S) ribosomal subunit
RPS6B          .                      23                     352                 Protein component of the small (40S) ribosomal subunit
RPS8B          0                      24                     360                 Protein component of the small (40S) ribosomal subunit
SAC6           . . 0                  20                     111                 Fimbrin, actin-bundling protein
SAR1           * .                    19                     139                 ARF family GTPase
SUN4           0                      22                     346                 Cell wall protein related to glucanases
TDA5           0                      19                     71                  Putative protein of unknown function
UBC4           .. 0                   26                     95                  Ubiquitin-conjugating enzyme (E2)
URA2           . . 0                  22                     320                 Bifunctional carbamoylphosphate synthetase/aspartate transcarbamylase
USV1           * 0                    24                     75                  Putative transcription factor containing a C2H2 zinc finger
VMA9_1         .                      22                     77                  Vacuolar H+ ATPase subunit e of the V-ATPase V0 subcomplex
YCL012C        . .                    13                     67                  Protein of unknown function
YOS1_2                                24                     111                 Integral membrane protein required for ER to Golgi transport

Extended Data Table 2. Stable introns identified in saturated cultures. Description of gene function is from YeastMine (https://yeastmine.yeastgenome.org/). Liquid, saturated-liquid sample; Lawn, saturated-lawn sample; Rapamycin, rapamycin-treated sample.

Supplementary Information

Supplementary Table 1: S. cerevisiae strains used and generated in this study.

Strain   Name in Fig. 4a  Genotype                                                                    Reference / Source
BY4741   WT               MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0                                           Brachmann et al. (1998)
DPB050   R                BY4741 rfa2Δintron                                                          This study
DPB051   M                BY4741 mpt5Δintron                                                          This study
DPB052   S                BY4741 sac6Δintron                                                          This study
DPB053   H                BY4741 hnt1Δintron                                                          This study
DPB054   U                BY4741 ubc4Δintron                                                          This study
DPB055   E                BY4741 ecm33Δintron                                                         This study
DPB056   HR               BY4741 hnt1Δintron rfa2Δintron                                              This study
DPB057   HS               BY4741 hnt1Δintron sac6Δintron                                              This study
DPB058   ES               BY4741 ecm33Δintron sac6Δintron                                             This study
DPB059   UH               BY4741 ubc4Δintron hnt1Δintron                                              This study
DPB060   EU               BY4741 ecm33Δintron ubc4Δintron                                             This study
DPB061   HSR              BY4741 hnt1Δintron sac6Δintron rfa2Δintron                                  This study
DPB062   UHS              BY4741 ubc4Δintron hnt1Δintron sac6Δintron                                  This study
DPB063   EUS              BY4741 ecm33Δintron ubc4Δintron sac6Δintron                                 This study
DPB064   EUH              BY4741 ecm33Δintron ubc4Δintron hnt1Δintron                                 This study
DPB065   EUHR             BY4741 ecm33Δintron ubc4Δintron hnt1Δintron rfa2Δintron                     This study
DPB066   EUHM             BY4741 ecm33Δintron ubc4Δintron hnt1Δintron mpt5Δintron                     This study
DPB067   EUHSR            BY4741 ecm33Δintron ubc4Δintron hnt1Δintron sac6Δintron rfa2Δintron         This study
DPB068   C                BY4741 cof1Δintron                                                          This study
DPB069   A                BY4741 act1Δintron                                                          This study
DPB070   CA               BY4741 cof1Δintron act1Δintron                                              This study
DPB071   CM               BY4741 cof1Δintron mms2Δintron                                              This study
DPB072   CAM              BY4741 cof1Δintron act1Δintron mms2Δintron                                  This study

Strain      Figure       Genotype                                                         Reference / Source
DPB073      2d           BY4741 ecm33-longBP                                              This study
DPB074      2d           BY4741 act1-shortBP                                              This study
W303a       3c           MATa leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15         Yeast Genetic Stock Center (Berkeley, CA)
TOR1-1      3c           W303a TOR1-1 (S1972I)                                            Zheng et al. (1995)
SCH9-del    Extended 6b  BY4741 sch9Δ                                                     This study
TAP42-del   Extended 6c  BY4743 TAP42/tap42::kanMX                                        HetDip KO Collection (Horizon Discovery)

Supplementary Table 2: Plasmids used and generated in this study.

CRISPR plasmids
See refs. 40, 41 for detailed protocol of plasmid design and construction
Entry plasmid is pV1382 (ref. 41), based on pRS416 vector
See Supplementary Table 3 for sgDNA sequences

Pulldown plasmids (Extended Data Fig. 2)
Plasmid                              Description
pRS416-ECM33_intron                  low-copy URA-selectable plasmid, ECM33 intron inserted into URA3
pRS416-PTEF1-MCP-nointron            low-copy URA-selectable plasmid, 3xFLAG-MCP under TEF1 promoter
pRS416-PTEF1-MCP-ECM33_intron        low-copy URA-selectable plasmid, 3xFLAG-MCP under TEF1 promoter, ECM33 intron inserted into URA3
pRS416-PTEF1-MCP-5P-ECM33_intron     low-copy URA-selectable plasmid, 3xFLAG-MCP under TEF1 promoter, 5P-ECM33 intron inserted into URA3
pRS416-PTEF1-MCP-5M-ECM33_intron     low-copy URA-selectable plasmid, 3xFLAG-MCP under TEF1 promoter, 5M-ECM33 intron inserted into URA3
pRS416-PTEF1-MCP-3P-ECM33_intron     low-copy URA-selectable plasmid, 3xFLAG-MCP under TEF1 promoter, 3P-ECM33 intron inserted into URA3

Ectopic intron expression plasmids (Fig. 4d,e, Extended Data Fig. 5)
pRS426-ECMstable        high-copy URA-selectable plasmid, WT ECM33 intron (stable) inserted into URA3
pRS426-ECMunstable      high-copy URA-selectable plasmid, Mut ECM33 intron (unstable) inserted into URA3
pRS426-ACTunstable      high-copy URA-selectable plasmid, WT ACT1 intron (unstable) inserted into URA3
pRS426-ACTstable        high-copy URA-selectable plasmid, Mut ACT1 intron (stable) inserted into URA3
pRS416-ECM33_intron     low-copy URA-selectable plasmid, ECM33 intron inserted into URA3
pRS416-ACT1_intron      low-copy URA-selectable plasmid, ACT1 intron inserted into URA3
pRS416-HNT2_intron      low-copy URA-selectable plasmid, HNT2 intron inserted into URA3
pRS416-RPL27B_intron    low-copy URA-selectable plasmid, RPL27B intron inserted into URA3
pRS416-APS3_intron      low-copy URA-selectable plasmid, APS3 intron inserted into URA3

SCH9 and TAP42 expression plasmids (Extended Data Fig. 6b,c)
Plasmids gifts of R. Loewith. Reference: Huber et al. (2009)
pRS414::SCH9        low-copy, TRP-selectable plasmid, WT SCH9 expression
pRS414::SCH9DE      low-copy, TRP-selectable plasmid, mutant SCH9 (T723D, S726D, T737E, S758E, S765E) expression
pRS415::TAP42       low-copy, LEU-selectable plasmid, WT TAP42 expression
pRS415::tap42-11    low-copy, LEU-selectable plasmid, mutant TAP42 (tap42-11 allele) expression

Supplementary Table 3: CRISPR guide and repair oligos.

Intron deletion guide oligos
ecm33_sgT    gatcgTCAATATTTCTGCCTGTCATG
ecm33_sgB    AAAACATGACAGGCAGAAATATTGAC
ubc4_sgT     gatcgAGAAAGGTATGTCTAAAGTTAG
ubc4_sgB     AAAACTAACTTTAGACATACCTTTCTc
hnt1_sgT     gatcgCAATCGATCCGCTATGCAACG
hnt1_sgB     AAAACGTTGCATAGCGGATCGATTGc
rfa2_sgT     gatcgGTATTAGTGCTAGGAATTGGG
rfa2_sgB     AAAACCCAATTCCTAGCACTAATACc
sac6_sgT     gatcgTTTCACTGTGCTCAGCAGTGG
sac6_sgB     AAAACCACTGCTGAGCACAGTGAAAc
mpt5_sgT     gatcgTCTTGATTCTCACGCATCTCG
mpt5_sgB     AAAACGAGATGCGTGAGAATCAAGAC
act1_sgT     gatcgTACAGATCAGTCAATATAGGG
act1_sgB     AAAACCCTATATTGACTGATCTGTAc
cof1_sgT     gatcgACAGAAACTTCACATTTTCCG
cof1_sgB     AAAACGGAAAATGTGAAGTTTCTGTc
mms2_sgT     gatcgGATTTACAATAGGACAGTGAG
mms2_sgB     AAAACTCACTGTCCTATTGTAAATCc
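As a consistency check on the guide oligo pairs listed above, the sketch below tests whether a T (top) and B (bottom) oligo would anneal to leave 4-nt 5' overhangs. That design is an inference from the sequences themselves, not something stated in the table, and the function names are hypothetical. Only the ecm33 and ubc4 pairs are included here.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def anneals_with_overhangs(top, bottom, overhang=4):
    """True if the oligos are complementary apart from a 5' overhang on each.

    Assumption (inferred, not stated in the source): the first `overhang`
    nucleotides of each oligo stay single-stranded after annealing.
    """
    return bottom[overhang:].upper() == reverse_complement(top[overhang:])

# Sequences copied from Supplementary Table 3.
guide_pairs = {
    "ecm33": ("gatcgTCAATATTTCTGCCTGTCATG", "AAAACATGACAGGCAGAAATATTGAC"),
    "ubc4": ("gatcgAGAAAGGTATGTCTAAAGTTAG", "AAAACTAACTTTAGACATACCTTTCTc"),
}

for name, (top, bottom) in guide_pairs.items():
    print(name, anneals_with_overhangs(top, bottom))
```

A check of this kind is a cheap way to catch transcription errors in oligo tables before ordering, since a single miscopied base breaks the complementarity.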

Intron deletion repair oligos
ecm33_repairT    CAAGAACGCTTTGACTGCTACTGCTATTCTAAGTGCCTCCGCTCTAGCTGCTAACTCAAC
ecm33_repairB    GGCAGAAGTACCAATACTACATGAAGATGGAATAGAAGTAGTTGAGTTAGCAGCTAGAGC
ubc4_repairT     AACATGTCTTCTTCTAAACGTATTGCTAAAGAACTAAGTGATCTAGAAAGAGATCCACCT
ubc4_repairB     GATATAGATCATCGCCGACTGGACCGGCTGAACATGAAGTAGGTGGATCTCTTTCTAGAT
hnt1_repairT     TGCTCCTGCTACGCTTGATGCTGCCTGTATTTTTTGCAAGATTATTAAAAGCGAAATTCC
hnt1_repairB     GAAAGCATACGAGTACTTTGTTTCAATCAATTTGAAGGATGGAATTTCGCTTTTAATAAT
rfa2_repairT     CGATAGCGACTATCTAGAACAGGCTAGTTTAAGCATATACATAATGGCAACCTATCAACC
rfa2_repairB     CTCAAAGCCACCGCCCGTTACTGATGAATATTCGTTATATGGTTGATAGGTTGCCATTAT
sac6_repairT     ATTAGCCCTAAGGAGTACACCAAAACACAATGAATATTGTCAAATTACAAAGAAAATTTC
sac6_repairB     TTTTCAATTGTGGAAAAAAGATCCTCTTGAGTCAAAATTGGAAATTTTCTTTGTAATTTG
mpt5_repairT     ATTCTACGCAAATTTATAAATCAATTACGATTTTTCCAGTTTCTCTTATGATCAATAACG
mpt5_repairB     GTAGTTAAAATCGATGCTGAGTCGGCAGATGGAAATGGTTCGTTATTGATCATAAGAGAA
act1_repairT     GCTTTTTTCTTCCCAAGATCGAAAATTTACTGAATTAACAATGGATTCTGAGGTTGCTGC
act1_repairB     ACCGGCTTTACACATACCAGAACCGTTATCAATAACCAAAGCAGCAACCTCAGAATCCAT
cof1_repairT     TCAAAACATACATAAACAAAAAACTAACAAAAGAAGATGTCTAGATCTGGTGTTGCTGTT
cof1_repairB     CCAATTTCAAGTCATTGAAAGCGGTAAGGGATTCATCAGCAACAGCAACACCAGATCTAG
mms2_repairT     TGTATATGCAACGTAGAAGAAAGCAGCGTTTACACAAAAATGTCAAAAGTACCAAGAAAT
mms2_repairB     ACCCTTTTTCACCCTTTTCTAATTCTTCTAACAACCTAAAATTTCTTGGTACTTTTGACA

BP manipulation guide oligos
ecm33_sgT      same as above
ecm33_sgB      same as above
act1_bp_sgT    gatcgATGTTTAGAGGTTGCTGCTTG
act1_bp_sgB    AAAACAAGCAGCAACCTCTAAACATc

BP manipulation repair oligos
ecm-bp-repairT     TTTGGAATATTCTTCAATATTTCTGCCTGTCATAGCATTAGGGCGAGTAATGAAACAGAATAATACATGTATAAATCGATCGGGAATACT
ecm-bp-repairB     ATGGAATAGAAGTAGTTGAGTTAGCTGCTTAGATAAAGAAAAGTAGTGTTAGATAAAGAAAAGTAGTGTTAGTATTCCCGATCGATTTAT
act-bp1-repairT    CTACTGTTACTAAGTCTCATGTACTAACATGTTGCTATATTATATGTTTAGAGGTTGCTG
act-bp1-repairB    ACATACCAGAACCGTTATCAATAACTAAAGCAGCAACCTCTAAACATATA

SCH9 deletion guide oligos
sch9_sgT    gatcgTCCGTCTCCGAGACTAGGTGG
sch9_sgB    AAAACCACCTAGTCTCGGAGACGGAc

SCH9 deletion repair oligos
sch9_repairT    TCTGAGAATTATACTCGTATAAGCAAGAAATAAAGATACGAATATACAATTTTCTCAATC
sch9_repairB    ATAAAAAGAAAAGGAAAAGAAGAGGAAGGGCAAGAGGAGCGATTGAGAAAATTGTATATT

Supplementary Table 4: RNA-blot probes.

ECM33_intron          5' TCCTCACGAGATCTCGAAACCCGT 3'
SAC6_intron           5' AAACTTTTTCACTGTGCTCAGCAGTG 3'
ACT1_intron           5' GGACCGTGCAATTCTTCTTACAGTTAAATGGG 3'
HNT2_intron           5' ACGCAATGGTGCGAATGGGGTACAAAAAAACAT 3'
RPL27B_intron         5' CGACACGATTGGTCGTGAATGTGGTGCTCCCC 3'
APS3_intron           5' ATGTATAATTTTGGCAGAAGAAAAGACCCTTGAGAAATCTT 3'
RPS29A_intron         5' CAGCCAGCATACGCAATGATAATAAGCGCTATGTTGTTATGTTATTC 3'
USV1_intron           5' GTTAGTAAAAAAAGCTAAGCCAGTAAGTCGCTCC 3'
MS2_aptamer_intron    5' CCGATCGATTTATACATGTATTATTCTGTTTCATTAC 3'
ECM33_mRNA            5' TGGAATAGAAGTAGTTGAGTTAGCAGCTAGAGCGGAG 3'
5.8S_rRNA             5' GCGTTCTTCATCGATGCGAGAACCAAG 3'

Supplementary Table 5: Mutant intron sequences.

Name / Sequence

2x MS2 aptamer
gggattttgacgtcgcACATGAGGATCACCCATGTgcgacgtcttttcacgagcgACATGAGGATCACCCATGTcgctcgtgttccc

ECM33
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAAATCTCGAGTAGATTCGTGGTCCTCTTCAT
TCTTTTCTTTTTTCTTTGTCGATTACTGGGCTTTTTTTCATAGGTCTCGATTGACGCGGACGGACAATGCGAAAAAAAAAAATTTCCAA
AAGAGGAAACGGGTTTCGAGATCTCGTGAGGATGGTTTTGGAATATTCTTCAATATTTCTGCCTGTCATAGGATTAGGGCGAGTAATGA
AACAGAATAATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAAGCAG

ECM33_5P
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAgggattttgacgtcgcACATGAGGATCACC
CATGTgcgacgtcttttcacgagcgACATGAGGATCACCCATGTcgctcgtgttcccAATCTCGAGTAGATTCGTGGTCCTCTTCATTC
TTTTCTTTTTTCTTTGTCGATTACTGGGCTTTTTTTCATAGGTCTCGATTGACGCGGACGGACAATGCGAAAAAAAAAAATTTCCAAAA
GAGGAAACGGGTTTCGAGATCTCGTGAGGATGGTTTTGGAATATTCTTCAATATTTCTGCCTGTCATAGGATTAGGGCGAGTAATGAAA
CAGAATAATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAAGCAG

ECM33_5M
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAgggattttgacgtcgcACATGAGGATCACC
CATGTgcgacgtcttttcacgagcgACATGAGGATCACCCATGTcgctcgtgttcccAGAGGAAACGGGTTTCGAGATCTCGTGAGGAT
GGTTTTGGAATATTCTTCAATATTTCTGCCTGTCATAGGATTAGGGCGAGTAATGAAACAGAATAATACATGTATAAATCGATCGGGAA
TACTAACACTACTTTTCTTTATCTAAGCAG

ECM33_3P
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAAATCTCGAGTAGATTCGTGGTCCTCTTCAT
TCTTTTCTTTTTTCTTTGTCGATTACTGGGCTTTTTTTCATAGGTCTCGATTGACGCGGACGGACAATGCGAAAAAAAAAAATTTCCAA
AAGAGGAAACGGGTTTCGAGATCTCGTGAGGATGGTTTTGGAATATTCgggattttgacgtcgcACATGAGGATCACCCATGTgcgacg
tcttttcacgagcgACATGAGGATCACCCATGTcgctcgtgttcccTTCAATATTTCTGCCTGTCATAGGATTAGGGCGAGTAATGAAA
CAGAATAATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAAGCAG

Underlined indicates portions of hairpin recognized by MS2 coat protein
Bold indicates aptamer within the ECM33 intron

Name / Sequence

ECMstable (WT)
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAAATCTCGAGTAGATTCGTGGTCCTCTTCAT
TCTTTTCTTTTTTCTTTGTCGATTACTGGGCTTTTTTTCATAGGTCTCGATTGACGCGGACGGACAATGCGAAAAAAAAAAATTTCCAA
AAGAGGAAACGGGTTTCGAGATCTCGTGAGGATGGTTTTGGAATATTCTTCAATATTTCTGCCTGTCATAGGATTAGGGCGAGTAATGA
AACAGAATAATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAAGCAG

ECMunstable (Mut)
GTATGTACACATTCTCCTTTTTTTTCATCTTTTTTTCTTTATTTGCCTCTTTCCTCACAAATCTCGAGTAGATTCGTGGTCCTCTTCAT
TCTTTTCTTTTTTCTTTGTCGATTACTGGGCTTTTTTTCATAGGTCTCGATTGACGCGGACGGACAATGCGAAAAAAAAAAATTTCCAA
AAGAGGAAACGGGTTTCGAGATCTCGTGAGGATGGTTTTGGAATATTCTTCAATATTTCTGCCTGTCATACCATTAGGGCGAGTAATGA
AACAGAATAATACATGTATAAATCGATCGGGAATACTAACACTACTTTTCTTTATCTAACACTACTTTTCTTTATCTAAGCAG

ACTunstable (WT)
GTATGTTCTAGCGCTTGCACCATCCCATTTAACTGTAAGAAGAATTGCACGGTCCCAATTGCTCGAGAGATTTCTCTTTTACCTTTTTT
TACTATTTTTCACTCTCCCATAACCTCCTATATTGACTGATCTGTAATAACCACGATATTATTGGAATAAATAGGGGCTTGAAATTTGG
AAAAAAAAAAAAAACTGAAATATTTTCGTGATAAGTGATAGTGATATTCTTCTTTTATTTGCTACTGTTACTAAGTCTCATGTACTAAC
ATCGATTGCTTCATTCTTTTTGTTGCTATATTATATGTTTAG

ACTstable (Mut)
GTATGTTCTAGCGCTTGCACCATCCCATTTAACTGTAAGAAGAATTGCACGGTCCCAATTGCTCGAGAGATTTCTCTTTTACCTTTTTT
TACTATTTTTCACTCTCCCATAACCTCCTATATTGACTGATCTGTAATAACCACGATATTATTGGAATAAATAGGGGCTTGAAATTTGG
AAAAAAAAAAAAAACTGAAATATTTTCGTGATAAGTGATAGTGATATTCTTCTTTTATTTGCTACTGTTACTAAGTCTCATGTACTAAC
ATGTTGCTATATTATATGTTTAG

Underlined indicates BP 7-mer consensus sequence
Bold indicates sequence deleted or inserted to manipulate BP-3'SS distance

Chapter 3. Future Directions

Directions of future passed

A brief explanation of the rationale for the original design and reasoning for this project precedes future directions of stable-intron research. The discovery of stable introns in S. cerevisiae was the serendipitous result of an interest in gene expression outside of log phase.

Although none of the work related to ribosome-footprint profiling and measurement of translational efficiency (TE) in non-log-phase cultures is presented above, these were the experiments on which we expected to focus before it became clear that something very strange was happening with introns in saturated cultures.

Initial transcriptome-wide studies utilizing RNA-seq and ribosome profiling in yeast show a relatively wide 10-fold range of TE (mRNA RPKM vs. ribosome-protected fragments RPKM: Spearman R (Rs) = 0.75 in log-phase (Ingolia et al., 2009), Rs = 0.5 - 0.7 in sporulation (Brar et al., 2012)). In these datasets, hundreds of individual genes are translated with significantly higher or lower efficiencies than average. These results imply that gene-specific regulation of TE could play a large role in controlling and altering protein production in yeast.
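To make the quantities in this comparison concrete, the sketch below computes TE as the ratio of ribosome-protected-fragment RPKM to mRNA RPKM and a Spearman rank correlation between the two measurements. It is a minimal illustration only: the five genes and all of their values are hypothetical, the function names are not from any cited pipeline, and it is not the analysis used in the studies cited above.

```python
def translational_efficiency(rpf_rpkm, mrna_rpkm):
    """TE as ribosome-protected-fragment RPKM divided by mRNA RPKM."""
    return rpf_rpkm / mrna_rpkm

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical RPKM values for five genes; gene_E mimics a poorly
# translated transcript (well expressed, few ribosome footprints).
genes = ["gene_A", "gene_B", "gene_C", "gene_D", "gene_E"]
mrna = [10.0, 50.0, 200.0, 800.0, 30.0]
rpf = [12.0, 40.0, 260.0, 700.0, 3.0]

for gene, m, r in zip(genes, mrna, rpf):
    print(gene, round(translational_efficiency(r, m), 2))
print("Spearman Rs:", round(spearman(mrna, rpf), 2))
```

In this toy set a single gene with low TE is enough to pull Rs below 1, which is why a transcriptome-wide Rs near 1 leaves little room for widespread gene-specific TE regulation.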

Recent protocol improvements, which more accurately capture endogenous levels of both mRNAs and ribosome-protected fragments (Weinberg et al., 2016), have dramatically updated the picture of TE in yeast. We now find that almost all of the 4,500+ genes expressed in log-phase cultures show little variation in how efficiently they are translated (Rs = 0.92). In other words, effectively all regulation of protein level in log-phase yeast is controlled through mRNA transcription and degradation, with a striking 1:1 relationship between gene expression and translation (as read out by ribosome-footprint profiling). In fact, there are only two genes that are both well-expressed and poorly translated in this dataset: GCN4 and HAC1. Strikingly, these two genes were discovered to be poorly translated long before the advent of high-throughput sequencing (Hinnebusch and Fink, 1983; Cox and Walter, 1996). They have been the subjects of significant research over the past three decades to fully understand the mechanisms that repress their translation (Hinnebusch, 1984; Mueller and Hinnebusch, 1986; Dever et al., 1992; Kawahara et al., 1997; Rüegsegger et al., 2001; Di Santo et al., 2016; Cherry et al., 2018). So, although our improved high-throughput assay easily identified the two known examples, it also suggests we have exhaustively identified all genes with gene-specific TE regulation in log-phase yeast.

No other growth conditions had been examined with these protocols, leaving an incredibly wide swath of global TE regulation currently unexplored. Under less ideal conditions (e.g. stress, stationary phase) or conditions where temporal control of protein production may be beneficial (e.g. mitotic and/or meiotic cell cycle), could there be other examples of gene-specific regulation of TE? If so, could we find the underlying mechanism, as has been done with GCN4 and HAC1? Of course, finding even one more example would increase the known examples of TE regulation in S. cerevisiae by 50%. Therefore, this study was designed to substantially expand the limited repertoire of such mechanisms known in this model eukaryote. The experiments did indeed yield a number of interesting candidates with clear indications of post-transcriptional regulation of protein abundance. How are these mRNAs translationally repressed? Is there a phenotypic consequence if regulation of their TE is disrupted? However, as mentioned, we have not followed up on these hits to date due to prioritization of understanding stable-intron biology and function. Sooner or later, everything old is new again, and it will be interesting to elucidate the mechanism(s) of these candidates' TE regulation when the time comes.

Sequestration model

Earlier, we suggested that the function of stable introns might lie in sequestering spliceosomal factors in order to reduce the production of intron-rich ribosomal protein genes. This raises a few obvious questions: where are the stable-intron RNPs being sequestered, and what is the complete list of factors being sequestered? Our assumption has been that stable introns remain nuclear post-splicing. This assumption is supported by the known presence of membraneless nuclear organelles called nuclear speckles, which store mRNA splicing factors (Mao et al., 2011; Zhu and Brangwynne, 2015). These speckles are the very same as those described in the introduction as recognized by snRNP antibodies (Lerner and Steitz, 1979; Lerner et al., 1981). However, though nuclear speckles have been identified in mammals, flies (Segalat and Lepesant, 1992), and frogs (Gall et al., 1999), they have not been identified in yeast, where snRNAs are instead localized to the nucleolus (Potashkin et al., 1990). But there is a familiar caveat: these experiments in yeast were only performed on cells in mid-log phase. Could the nuclear localization of snRNPs be regulated by growth conditions? And might it turn out that they do reside in their own phase-separated speckles under said conditions (Banani et al., 2017)?

Although nuclear localization is our working model, it has not been directly observed, partially because nuclear vs. cytoplasmic fractionation is not effective in yeast. Additionally, there are reports that discarded splicing events are exported to the cytoplasm for turnover (Hilleren and Parker, 2003; Mayas et al., 2010; Zeng and Staley, 2017), suggesting perhaps stable-intron RNPs have the same initial fate. To address this question, we could attempt to observe individual introns with fluorescence in situ hybridization (FISH), but they are too short to be compatible with current technologies, which require tiling dozens of probes across an RNA of interest (Raj et al., 2008; Trcek et al., 2012). However, this could be circumvented by either amplifying the in situ signal of a few initial hybridization events (Rouhanifard et al., 2018), or engineering a longer stable intron capable of accommodating the requisite number of probes for single-molecule FISH. It will be interesting to know both the gross localization of stable introns and whether the individual RNPs remain distinct or coalesce into a larger complex or body. Though not necessarily conducive to FISH, it is of further interest to know the dynamics of RNP disassembly after returning a culture to an optimal environment.

To appreciate the consequences of sequestration, we need a more complete picture of what snRNAs and proteins are being sequestered. What factors are likely to become limiting for splicing in this model? It is not a new concept that the spliceosome is generally limiting in yeast (Munding et al., 2013). In agreement with the spliceosome not being in large excess to splicing events, we estimate that stable introns could sequester ~40% of spliceosomes assuming 1:1 stoichiometry. However, that was based on their levels relative to snRNAs, which are themselves quite abundant. Protein factors could easily be much more limiting if also sequestered 1:1 with stable introns.

To address the consequences of sequestration, we would first want to re-visit the pull-down and mass spectrometry experiments already performed. These experiments required an extended period of troubleshooting centered around the eventual realization that lysates from saturated cultures contain high levels of endogenous RNase activity. To circumvent this, we learned to first fractionate the lysate on a sucrose gradient and work with only the stable-intron-containing fractions, which are relatively free of RNase issues. However, due to the extended troubleshooting, we have not yet optimized any downstream steps of the pull-down protocol. The mass spectrometry results indicated a fair amount of non-specific protein in all samples, both negative controls and test samples. This does not alter our interpretations, but necessarily limits the number of informative peptides we can interrogate. It is likely we could get a more pure stable-intron RNP using our extant MS2 hairpin-based constructs and further optimization. Alternatively, a number of techniques have been described recently that would allow for a massive reduction of non-RNA-bound background contaminants before performing mass spectrometry (Queiroz et al., 2018; Trendel et al., 2018; Urdaneta et al., 2018).

Mechanistic insights into spliceosome disassembly

With a more complete list of stable-intron-bound proteins, we could re-visit comparisons with cryo-EM- (Wan et al., 2017) and biochemistry- (Fourmann et al., 2013) defined components of the ILS and be able to say conclusively whether it is the ILS in its entirety or instead an ILS-like sub-complex that is bound to stable introns. If factors are missing, that may inform us as to why stable introns are not being released, why Prp43 is apparently unable to disassemble the ILS, and how the intron can be debranched while remaining in-complex. This is a particularly interesting point to address, as it has been shown for ACT1's intron (not a stable intron) in log-phase extract that Prp43 activity and disassembly of the ILS are necessary before Dbr1 can debranch lariat introns (Martin et al., 2002). Are these requirements different for all introns in saturated cultures? Are these requirements different for different introns independent of growth condition? Or is it a mixture of these two possibilities?

We defined the BP-3'SS distance as discriminating between stable and unstable introns, but we do not understand how this distance is recognized by the spliceosome or how this leads to the cell's inability to disassemble stable-intron complexes. One possibility is that the spliceosome protects a static distance of intron downstream of the BP. In this case, longer BP-3'SS sequences would have nucleotides extending beyond this protection, allowing disassembly machinery to grab hold. However, this possibility still requires a difference between log-phase and stationary-phase spliceosomes, because introns with both long and short BP-3'SS distances can be disassembled in log phase. It may be that ILS complexes containing introns with short BP-3'SS distances are more difficult to disassemble in both log and stationary phase compared to introns with long BP-3'SS distances. The switch in stable-intron stability could possibly result from a change in the competition for ILS disassembly machinery due to greatly decreased levels of Prp43 and/or other disassembly factors. We would need to carefully assess levels of these factors across growth conditions to understand if they are limiting in stationary phase. If so, could simply overexpressing disassembly machinery abrogate stable-intron formation? Finally, in cryo-EM structures of the log-phase ILS (Wan et al., 2017), Prp43 was found to bind far from the intron's 3' end, which would require the involvement of additional factors in discriminating BP-3'SS distances. In all, there are many questions about the potential for substrate-specific and environment-specific complexities of splicing, intron release, and debranching that are raised by this work. Given the similarities in core spliceosome function between yeast and humans, this stable-intron RNP-centric direction could provide very broad insight into eukaryotic splicing biology.

Determinants of intron stability

We found that a reasonably large subset of S. cerevisiae introns (~11%) are stabilized during saturated growth, and that the only feature that unifies these introns is their short BP-3'SS distance. This finding leaves three major questions, discussed below. The first question, related to the above section: what controls the switch in intron stability between log-phase and saturated growth? Stable introns are degraded just as other introns in log-phase cultures, so we are interested in understanding what changes are necessary to allow the rapid switch between instability and stability. One obvious way this could be regulated is a change in the constituents of the spliceosome: a wholesale change in recruited factors, post-translational modifications, or potentially post-transcriptional modifications to introns or snRNAs. Changes to proteins could be detected through mass spectrometry, as described. Changes to RNAs could be detected through direct interrogation of known modifications to snRNAs (Wu et al., 2011; Wu et al., 2016a; Wu et al., 2016b; van der Feltz et al., 2017), or through performing transcriptome-wide measurements of pseudouridine, N6-methyladenosine, or other base modifications (Schwartz et al., 2013; Carlile et al., 2014; Schwartz et al., 2014). Alternatively, a stable-intron sensor could be designed that would permit screening for factors whose presence or absence regulates stable-intron formation.

The second question: is the "stabilizable" period transient? We know that active transcription is required, but we do not know the period of active transcription during which stable introns can form. There are two ways to address this. On a single-intron level, one could create an inducible system of stable-intron transcription and processing. Or, on a transcriptome-wide level, one could add a metabolic label to the culture to label all introns (as well as related mature mRNAs) transcribed from a certain point onward (Chan et al., 2017). Working within the simple system of a culture transiting from lag to log to saturation, is there only a brief window of time in which introns can be imbued with stability? Or is it that after a certain time point all transcription is equivalent in forming stable introns ad infinitum? These results would greatly inform how we think about the switch-like mechanism of stable-intron formation.

The third question: are all sequences with a given BP-3'SS distance truly equivalent? Although we do not detect any primary-sequence features enriched in stable introns, the endogenous 300 introns in S. cerevisiae do not come close to exhaustively spanning the potential sequence space. Our lab has previously examined a similar question, the role of 3'-end sequences on mammalian mRNA cleavage and processing, using a randomized library placed between two sequences of interest (polyadenylation signal and cleavage site in this instance) (Wu and Bartel, 2017). With a randomized cassette, one can query the effects of millions of BP-3'SS sequences and all reasonable BP-3'SS distances on intron stability in a single assay. From this experiment, we could learn if there is a clear cutoff for what counts as a short BP-3'SS distance with single-nucleotide resolution, as well as discover any primary or secondary RNA sequences that affect intron stability.

Homogeneity of stable-intron expression

As discussed in the introduction, stationary-phase cultures are not homogeneous populations of cells, but are instead made up of at least two subpopulations: quiescent and non-quiescent cells (Allen et al., 2006). These populations are readily separable due to differences in their cell walls and therefore their buoyancies in Percoll gradients. The obvious future direction, then, is to isolate these populations and profile stable introns. Are they evenly distributed between populations, or are they enriched in one or the other? Does the relative ratio of quiescent and non-quiescent cells vary depending on the amount of stable-intron expression? If, hypothetically, stable introns are only expressed in quiescent cells, what are the consequences of inducing stable-intron expression in an isolated, non-quiescent population? Or, perhaps non-quiescent cells are not competent for stable-intron formation. Could we identify what is different about the spliceosomes or other factors between these populations that allows or prevents stable-intron formation? These experiments could lead to very interesting comparisons between quiescent and non-quiescent cells and an understanding of stable introns as a cause or a consequence of cell state.

Upstream and downstream effectors of stable introns

What causes stable introns to form? How does the cell read these inputs, and how do they lead to differential recognition and stabilization of introns based solely on their BP-3'SS distance? Although we have tested a number of genes and a number of starvations for their role in stable-intron regulation, thus far we have been largely stymied in efforts to move beyond the TORC1 node. What we have come to appreciate is that the TOR biology we are studying is not currently addressed in much of the literature as a distinct phenomenon. Specifically, we find that stable-intron induction upon rapamycin treatment takes a few hours to manifest. Nutrient-related TORC1 inhibition begins to manifest after only a few minutes, suggesting stable introns are not related to this better understood aspect of TOR regulation (a conclusion also suggested directly by our own starvation experiments).

Previous studies have shown that starvation inhibits TORC1 signaling differently from treatment with rapamycin. In particular, rapamycin-resistant TOR1-1 strains are still responsive to starvation (Jorgensen et al., 2004). Conversely, starvation-resistant mutants are still sensitive to rapamycin (Neklesa and Davis, 2009). Even variations on the same types of starvation elicit qualitatively different responses downstream of TORC1 (Crespo et al., 2002; Tate and Cooper, 2013). What is clear: not all TOR inhibitions are created equal. This can explain why four hours of rapamycin induce stable introns but four hours of carbon or nitrogen depletion do not. Few studies related to long-term rapamycin treatment in yeast exist, but those that do show strikingly different TORC1 responses compared to short-term treatment or starvation (Zaragoza et al., 1998; Lempiäinen et al., 2009; Mülleder et al., 2016; Rousseau and Bertolotti, 2016). Given that four hours is quite long on a S. cerevisiae time-scale, we think some other type of stress, perhaps related to cell division or cell-wall integrity, could be the causative endogenous stress, but this is still to be tested. In all, we hoped this future direction would have yielded more readily given the vast knowledge of TOR signaling in yeast. However, it now appears we will have to endeavor to answer this with a more limited road map.

Stable introns in other eukaryotes

Will stable introns be found in other yeast species, other intron-poor species, or intron-rich species? We believe the answer is likely yes to at least one of these questions. Although evidence of stable introns in other species may already exist in published RNA-seq data, a few points limit this possibility. First, poly(A)-selected RNA-seq data are blind to excised introns. Second, for species like S. cerevisiae with short introns (only a few hundred nucleotides long on average), most mRNA-oriented RNA-seq protocols select against RNAs in their size range. Third, and potentially most important, if intron stability is regulated by environmental changes in other eukaryotes, then the likelihood that it has been examined under the correct experimental conditions is reduced. As in yeast, baseline experimental conditions for other eukaryotic model organisms are optimized for robust and reproducible growth, not necessarily to mimic natural environments or challenges. So, perhaps stable introns in a given species are found only in environmental conditions not extensively profiled to date. Taking the above points together, new experiments designed to interrogate the lives of introns post-splicing in both intron-poor and intron-rich eukaryotes will likely be needed to elucidate the breadth of stable-intron existence.

Thinking about potential functions of stable introns in intron-rich eukaryotes, it is unlikely that the model proposed for S. cerevisiae, namely that stable introns help regulate expression of intron-rich ribosomal protein genes, will hold, as all classes of genes will have introns that need removal. However, the general function of transiently manipulating spliceosome availability to modulate global splicing or push spliceosomes to particular splicing substrates has the potential to exist broadly. As a related observation, many human cancers are addicted to splicing due to their greatly increased transcriptional rates and need for pre-mRNA processing (Dvinge et al., 2016; Lee and Abdel-Wahab, 2016). The spliceosome has therefore become an effective target for therapeutic inhibition, as decreasing spliceosome function has a marked effect on cancerous cells while having limited toxicity to non-cancerous ones (Hsu et al., 2015; Salton and Misteli, 2016).

If stable introns were to be found in human cells (perhaps under a particular environmental condition, or in a particular cellular state, etc.), would it be possible to induce their formation based on what we have learned and could learn from the future directions above? If so, could this prove an effective way of modulating spliceosome availability for therapeutic purposes? This is, of course, a hypothetical built on a hypothetical. Currently, we are left to wonder about other contexts in which introns (a defining feature of the eukaryotic genome and a ubiquitously expressed type of ncRNA) have been co-opted to perform functions.

References

Allen, C., Buttner, S., Aragon, A.D., Thomas, J.A., Meirelles, O., Jaetao, J.E., Benn, D., Ruby, S.W., Veenhuis, M., Madeo, F., et al. (2006). Isolation of quiescent and nonquiescent cells from yeast stationary-phase cultures. The Journal of cell biology 174, 89-100.

Banani, S.F., Lee, H.O., Hyman, A.A., and Rosen, M.K. (2017). Biomolecular condensates: organizers of cellular biochemistry. Nature reviews Molecular cell biology 18, 285.

Brar, G.A., Yassour, M., Friedman, N., Regev, A., Ingolia, N.T., and Weissman, J.S. (2012). High-resolution view of the yeast meiotic program revealed by ribosome profiling. Science 335, 552-557.

Carlile, T.M., Rojas-Duran, M.F., Zinshteyn, B., Shin, H., Bartoli, K.M., and Gilbert, W.V. (2014). Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells. Nature 515, 143.

Chan, L.Y., Mugler, C.F., Heinrich, S., Vallotton, P., and Weis, K. (2017). Non-invasive measurement of mRNA decay reveals translation initiation as the major determinant of mRNA stability. bioRxiv.

Cherry, P.D., White, L.K., York, K., and Hesselberth, J.R. (2018). Genetic bypass of essential RNA repair enzymes in budding yeast. RNA 24, 313-323.

Cox, J.S., and Walter, P. (1996). A novel mechanism for regulating activity of a transcription factor that controls the unfolded protein response. Cell 87, 391-404.

Crespo, J.L., Powers, T., Fowler, B., and Hall, M.N. (2002). The TOR-controlled transcription activators GLN3, RTG1, and RTG3 are regulated in response to intracellular levels of glutamine. Proceedings of the National Academy of Sciences 99, 6784-6789.

Dever, T.E., Feng, L., Wek, R.C., Cigan, A.M., Donahue, T.F., and Hinnebusch, A.G. (1992). Phosphorylation of initiation factor 2α by protein kinase GCN2 mediates gene-specific translational control of GCN4 in yeast. Cell 68, 585-596.

Di Santo, R., Aboulhouda, S., and Weinberg, D.E. (2016). The fail-safe mechanism of post-transcriptional silencing of unspliced HAC1 mRNA. Elife 5.

Dvinge, H., Kim, E., Abdel-Wahab, O., and Bradley, R.K. (2016). RNA splicing factors as oncoproteins and tumour suppressors. Nature Reviews Cancer 16, 413.

Fourmann, J.B., Schmitzova, J., Christian, H., Urlaub, H., Ficner, R., Boon, K.L., Fabrizio, P., and Luhrmann, R. (2013). Dissection of the factor requirements for spliceosome disassembly and the elucidation of its dissociation products using a purified splicing system. Genes & development 27, 413-428.

Gall, J.G., Bellini, M., Wu, Z.a., and Murphy, C. (1999). Assembly of the nuclear transcription and processing machinery: Cajal bodies (coiled bodies) and transcriptosomes. Molecular biology of the cell 10, 4385-4402.

Hilleren, P.J., and Parker, R. (2003). Cytoplasmic degradation of splice-defective pre-mRNAs and intermediates. Molecular cell 12, 1453-1465.

Hinnebusch, A.G. (1984). Evidence for translational regulation of the activator of general amino acid control in yeast. Proceedings of the National Academy of Sciences 81, 6442-6446.

Hinnebusch, A.G., and Fink, G.R. (1983). Positive regulation in the general amino acid control of Saccharomyces cerevisiae. Proceedings of the National Academy of Sciences 80, 5374-5378.

Hsu, T.Y.-T., Simon, L.M., Neill, N.J., Marcotte, R., Sayad, A., Bland, C.S., Echeverria, G.V., Sun, T., Kurley, S.J., and Tyagi, S. (2015). The spliceosome is a therapeutic vulnerability in MYC-driven cancer. Nature 525, 384.

Ingolia, N.T., Ghaemmaghami, S., Newman, J.R., and Weissman, J.S. (2009). Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218-223.

Jorgensen, P., Rupes, I., Sharom, J.R., Schneper, L., Broach, J.R., and Tyers, M. (2004). A dynamic transcriptional network communicates growth potential to ribosome synthesis and critical cell size. Genes & development 18, 2491-2505.

Kawahara, T., Yanagi, H., Yura, T., and Mori, K. (1997). Endoplasmic reticulum stress-induced mRNA splicing permits synthesis of transcription factor Hac1p/Ern4p that activates the unfolded protein response. Molecular biology of the cell 8, 1845-1862.

Lee, S.C.-W., and Abdel-Wahab, O. (2016). Therapeutic targeting of splicing in cancer. Nature medicine 22, 976.

Lempiäinen, H., Uotila, A., Urban, J., Dohnal, I., Ammerer, G., Loewith, R., and Shore, D. (2009). Sfp1 interaction with TORC1 and Mrs6 reveals feedback regulation on TOR signaling. Molecular cell 33, 704-716.

Lerner, E.A., Lerner, M.R., Janeway, C.A., and Steitz, J.A. (1981). Monoclonal antibodies to nucleic acid-containing cellular constituents: probes for molecular biology and autoimmune disease. Proceedings of the National Academy of Sciences 78, 2737-2741.

Lerner, M.R., and Steitz, J.A. (1979). Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus erythematosus. Proceedings of the National Academy of Sciences 76, 5495-5499.

M. L. Queiroz, R., Smith, T., Villanueva, E., Monti, M., Pizzinga, M., Marti-Solano, M., Mirea, D.-M., Ramakrishna, M., F. Harvey, R., Dezi, V., et al. (2018). Unbiased dynamic characterization of RNA-protein interactions by OOPS. bioRxiv.

Mao, Y.S., Zhang, B., and Spector, D.L. (2011). Biogenesis and function of nuclear bodies. Trends in Genetics 27, 295-306.

Martin, A., Schneider, S., and Schwer, B. (2002). Prp43 is an essential RNA-dependent ATPase required for release of lariat-intron from the spliceosome. Journal of Biological Chemistry 277, 17743-17750.

Mayas, R.M., Maita, H., Semlow, D.R., and Staley, J.P. (2010). Spliceosome discards intermediates via the DEAH box ATPase Prp43p. Proceedings of the National Academy of Sciences 107, 10020-10025.

Mueller, P.P., and Hinnebusch, A.G. (1986). Multiple upstream AUG codons mediate translational control of GCN4. Cell 45, 201-207.

Mülleder, M., Calvani, E., Alam, M.T., Wang, R.K., Eckerstorfer, F., Zelezniak, A., and Ralser, M. (2016). Functional Metabolomics Describes the Yeast Biosynthetic Regulome. Cell 167, 553-565.e512.

Munding, E.M., Shiue, L., Katzman, S., Donohue, J.P., and Ares Jr, M. (2013). Competition between pre-mRNAs for the splicing machinery drives global regulation of splicing. Molecular cell 51, 338-348.

Neklesa, T.K., and Davis, R.W. (2009). A genome-wide screen for regulators of TORC1 in response to amino acid starvation reveals a conserved Npr2/3 complex. PLoS genetics 5, e1000515.

Potashkin, J.A., Derby, R., and Spector, D. (1990). Differential distribution of factors involved in pre-mRNA processing in the yeast cell nucleus. Molecular and cellular biology 10, 3524-3534.

Raj, A., Van Den Bogaard, P., Rifkin, S.A., Van Oudenaarden, A., and Tyagi, S. (2008). Imaging individual mRNA molecules using multiple singly labeled probes. Nature methods 5, 877.

Rouhanifard, S.H., Mellis, I.A., Dunagin, M., Bayatpour, S., Symmons, O., Cote, A., and Raj, A. (2018). Exponential fluorescent amplification of individual RNAs using clampFISH probes. bioRxiv.

Rousseau, A., and Bertolotti, A. (2016). An evolutionarily conserved pathway controls proteasome homeostasis. Nature 536, 184-189.

Rüegsegger, U., Leber, J.H., and Walter, P. (2001). Block of HAC1 mRNA translation by long-range base pairing is released by cytoplasmic splicing upon induction of the unfolded protein response. Cell 107, 103-114.

Salton, M., and Misteli, T. (2016). Small molecule modulators of pre-mRNA splicing in cancer therapy. Trends in molecular medicine 22, 28-37.

Schwartz, S., Agarwala, S.D., Mumbach, M.R., Jovanovic, M., Mertins, P., Shishkin, A., Tabach, Y., Mikkelsen, T.S., Satija, R., Ruvkun, G., et al. (2013). High-resolution mapping reveals a conserved, widespread, dynamic mRNA methylation program in yeast meiosis. Cell 155, 1409-1421.

Schwartz, S., Bernstein, Douglas A., Mumbach, Maxwell R., Jovanovic, M., Herbst, Rebecca H., León-Ricardo, Brian X., Engreitz, Jesse M., Guttman, M., Satija, R., Lander, Eric S., et al. (2014). Transcriptome-wide Mapping Reveals Widespread Dynamic-Regulated Pseudouridylation of ncRNA and mRNA. Cell 159, 148-162.

Segalat, L., and Lepesant, J. (1992). Spatial distribution of the Sm antigen in Drosophila early embryos. Biology of the Cell 75, 181-185.

Tate, J.J., and Cooper, T.G. (2013). Five conditions commonly used to down-regulate Tor complex 1 generate different physiological situations exhibiting distinct requirements and outcomes. Journal of Biological Chemistry 288, 27243-27262.

Trcek, T., Chao, J.A., Larson, D.R., Park, H.Y., Zenklusen, D., Shenoy, S.M., and Singer, R.H. (2012). Single-mRNA counting using fluorescent in situ hybridization in budding yeast. Nature protocols 7, 408.

Trendel, J., Schwarzl, T., Prakash, A., Bateman, A., Hentze, M.W., and Krijgsveld, J. (2018). The Human RNA-Binding Proteome and Its Dynamics During Arsenite-Induced Translational Arrest. bioRxiv.

Urdaneta, E.C., Vieira-Vieira, C.H., Hick, T., Wessels, H.-H., Figini, D., Moschall, R., Medenbach, J., Ohler, U., Granneman, S., Selbach, M., et al. (2018). Purification of Cross-linked RNA-Protein Complexes by Phenol-Toluol Extraction. bioRxiv.

van der Feltz, C., DeHaven, A.C., and Hoskins, A.A. (2017). Stress-induced Pseudouridylation Alters the Structural Equilibrium of Yeast U2 snRNA Stem II. Journal of molecular biology.

Wan, R., Yan, C., Bai, R., Lei, J., and Shi, Y. (2017). Structure of an intron lariat spliceosome from Saccharomyces cerevisiae. Cell 171, 120-132.e112.

Weinberg, D.E., Shah, P., Eichhorn, S.W., Hussmann, J.A., Plotkin, J.B., and Bartel, D.P. (2016). Improved Ribosome-Footprint and mRNA Measurements Provide Insights into Dynamics and Regulation of Yeast Translation. Cell reports 14, 1787-1799.

Wu, G., Adachi, H., Ge, J., Stephenson, D., Query, C.C., and Yu, Y.T. (2016a). Pseudouridines in U2 snRNA stimulate the ATPase activity of Prp5 during spliceosome assembly. The EMBO journal 35, 654-667.

Wu, G., Radwan, M.K., Xiao, M., Adachi, H., Fan, J., and Yu, Y.T. (2016b). The TOR signaling pathway regulates starvation-induced pseudouridylation of yeast U2 snRNA. RNA 22, 1146-1152.

Wu, G., Xiao, M., Yang, C., and Yu, Y.T. (2011). U2 snRNA is inducibly pseudouridylated at novel sites by Pus7p and snR81 RNP. The EMBO journal 30, 79-89.

Wu, X., and Bartel, D.P. (2017). Widespread influence of 3'-end structures on mammalian mRNA processing and stability. Cell 169, 905-917.e911.

Zaragoza, D., Ghavidel, A., Heitman, J., and Schultz, M.C. (1998). Rapamycin induces the G0 program of transcriptional repression in yeast by interfering with the TOR signaling pathway. Molecular and cellular biology 18, 4463-4470.

Zeng, Y., and Staley, J. (2017). Export of discarded, splicing intermediates provides insight into mRNA export. The FASEB Journal 31, 596.596-596.596.

Zhu, L., and Brangwynne, C.P. (2015). Nuclear bodies: the emerging biophysics of nucleoplasmic phases. Current opinion in cell biology 34, 23-30.

Curriculum vitae

Jeff Morgan Whitehead Institute, 455 Main St., Cambridge, MA 02142 [email protected] - (586) 770-2335

EDUCATION

2012 - 2018 Massachusetts Institute of Technology
Ph.D., Department of Biology, September 2018
Thesis advisor: David P. Bartel
"Discovery and characterization of stable introns in yeast"

2007 - 2011 University of Michigan, Ann Arbor
B.S., Biochemistry with high honors, May 2011
Thesis advisor: Stephen W. Ragsdale
"Characterization of the interaction between human heme oxygenase-2 and BK channel"

RESEARCH EXPERIENCE

2013 - 2018 Graduate Research Assistant
Whitehead Institute, Cambridge, MA
David P. Bartel, Principal Investigator

2011 - 2012 Postbaccalaureate Fellow
National Institute of Biomedical Imaging and Bioengineering, NIH, Bethesda, MD
Richard D. Leapman, Principal Investigator

2008 - 2011 Undergraduate Researcher
University of Michigan Department of Biological Chemistry, Ann Arbor, MI
Stephen W. Ragsdale, Principal Investigator

PUBLICATIONS

4. Morgan, J.T., Fink, G.R., & Bartel, D.P. Excised linear introns regulate growth in yeast. In revision.

3. Sousa, A.A., Morgan, J.T., Brown, P.H., Adams, A., Jayasekara, S., Zhang, G., Ackerson, C.J., Kruhlak, M.J., & Leapman, R.D. Synthesis, characterization, and direct intracellular imaging of ultrasmall and uniform glutathione-coated gold nanoparticles. Small 8, 2277-2288 (2012).

2. Leapman, R.D., Sousa, A.A., Morgan, J.T., Adams, A., Zhang, G., Aronova, M.A., Bryant, L., & Frank, J.A. Characterization of hybrid nanoparticles by EFTEM and STEM. Microscopy and Microanalysis 18, 1596-1597 (2012).

1. Yi, L., Morgan, J.T., & Ragsdale, S.W. Identification of a thiol/disulfide redox switch in the human BK channel that controls its affinity for heme and CO. J. Biol. Chem. 285, 20117-20127 (2010).

ORAL PRESENTATIONS

2. Morgan, J.T. & Bartel, D.P. Excised linear introns regulate growth in yeast. Whitehead Institute, Whitehead forum, October 2017.

1. Morgan, J.T. & Bartel, D.P. Excised, stable introns function in yeast to help mediate growth regulation by TOR. Cold Spring Harbor Laboratory, Eukaryotic mRNA processing meeting, August 2017.

TEACHING

Spring 2016 TA, Quantitative Biology for Graduate Students (7.57), MIT

Fall 2013 TA, Intro to Experimental Biology and Communication (7.02J), MIT

HONORS AND AWARDS

2011 Postbaccalaureate Intramural Research Training Award

2007-2011 Thrall Enterprises Honors Scholarship

2007-2011 Robert Byrd Honors Scholarship

Fall 2007-Fall 2010 University Honors
