The intron-mediated regulation in Saccharomyces cerevisiae

A thesis presented for the degree of Doctor of Philosophy by Shih-Ching Eva Chen

School of Biotechnology and Biomolecular Sciences, University of New South Wales 2011

Table of contents

Abstract
Abbreviations
Acknowledgements
Publications

Chapter 1 Introduction
1.1 The hidden layer of gene regulation, non-coding RNAs
1.1.1 Small ncRNAs
1.1.2 Long ncRNAs
1.1.3 Complex genomic organization
1.2 Nuclear introns
1.2.1 Significance of introns
1.2.2 Function of introns
1.2.3 Introns in Saccharomyces cerevisiae
1.3 Splicing and the spliceosome
1.3.1 Basal machinery
1.3.2 Spliceosome assembly and splicing reactions
1.3.3 Catalytic centre of spliceosome
1.4 The Sm family
1.4.1 Sm
1.4.2 Lsm proteins
1.4.3 Other roles of Sm/Lsm complexes
1.4.4 Regulation of expression of LSM
1.5 Aims

Chapter 2 Materials and methods
2.1 General materials and methods
2.1.1 Materials
2.1.2 Sterilization procedures and preparation of materials
2.2 Escherichia coli and Saccharomyces cerevisiae strains, media and growth conditions
2.2.1 Escherichia coli strains
2.2.2 Escherichia coli media and growth conditions
2.2.3 Saccharomyces cerevisiae strains
2.2.4 Yeast media and growth conditions
2.3 General DNA methods
2.3.1 General methods and reagents
2.3.2 DNA isolation from E. coli or S. cerevisiae
2.3.3 Polymerase chain reaction
2.3.4 Southern blot analysis
2.3.5 DNA sequencing
2.4 Generation of strains and plasmids
2.4.1 E. coli transformation
2.4.2 Yeast transformation
2.4.3 Production of mutant constructs and plasmids
2.4.4 Construction of yeast strains
2.5 RNA based methods
2.5.1 Culture harvest conditions for RNA abundance measurement
2.5.2 RNA preparation
2.5.3 Quantitative real-time PCR (qRT-PCR)
2.5.4 Affymetrix gene chip expression microarray analysis
2.6 Mating-type based methods
2.6.1 Mating efficiency analysis
2.6.2 Measurement of pheromone production
2.6.3 Sensitivity test of opposite pheromone
2.7 Cell growth assay

Chapter 3 Regulation of LSM genes and the function of LSM7 intron on the expression of LSM genes
3.1 Introduction and aims
3.2 Production and verification of LSM7 mutant strains
3.3 Expression level of LSM genes changes in response to different carbon sources
3.4 The requirement for LSM7 coding sequence and intron in maintaining the expression levels of the LSM genes in response to a poor carbon source
3.4.1 LSM7 is required for normal expression of LSM genes in response to growth on acetate
3.4.2 The LSM7 intron is required for normal expression of LSM genes in response to growth on acetate
3.4.3 The LSM7 intron alone is involved in fine-tuning the transcription level of LSM genes in response to growth on acetate
3.4.4 Expression of the LSM7 intron improves the growth rate of cells lacking Lsm7 protein
3.5 Does the LSM7 intron act as an independent trans-regulator in LSM gene expression?
3.5.1 Expression of LSM7 intron from ADE1 locus alters the expression levels of LSM genes
3.5.2 Normal LSM gene expression in cells grown on acetate was not restored with the LSM7 intron expressed from the ADE1 locus in lsm7Δ and Δi mutants
3.6 Discussion

Chapter 4 Regulatory elements of the LSM7 intron
4.1 Introduction and aims
4.2 Conservation of the LSM7 intron across yeast species
4.3 Linker-scanning mutagenesis of the LSM7 intron
4.3.1 Essential sequences of the LSM7 intron in regulating the level of mature LSM7 transcript
4.3.2 Expression patterns of LSM genes in response to mutations in the intron under different growth conditions
4.3.3 Essential sequences of the LSM7 intron in regulation of LSM genes
4.4 Secondary structure prediction of the LSM7 intron
4.5 Discussion

Chapter 5 Global transcriptional profiling in response to deletion of LSM7 intron
5.1 Introduction and aims
5.2 Principal component analysis (PCA) of microarray expression data
5.3 Differentially expressed genes in response to the LSM7 intron deletion
5.3.1 Differentially expressed genes in response to the LSM7 intron deletion under both media conditions
5.3.2 Genes involved in mating-type regulation are altered by the LSM7 intron deletion under both media conditions
5.4 The LSM7 intron is required for efficient mating in MATD-cells
5.5 Discussion

Chapter 6 Summary and perspectives
References
Key to appendices

Abstract

Sm-like (Lsm) proteins are critically involved in a variety of RNA-processing events, including splicing, post-transcriptional modification and RNA degradation, in organisms that range from bacteria and archaea to yeast and humans. In Saccharomyces cerevisiae, the proteins exist in at least two heteroheptameric ring complexes: Lsm1-Lsm7, which promotes mRNA degradation via decapping in the cytoplasm; and Lsm2-Lsm8, which is required for mRNA splicing, where it facilitates U4/U6 snRNP formation in the nucleus. Despite extensive understanding of their function, little is known about the mechanisms that regulate expression of the LSM genes.

Using a set of mutants that lacked the LSM7 gene or its intron, or that expressed the intron but not the exons, the LSM7 intron was shown to modulate expression of other LSM genes in trans. The intron located at a separate locus in the ADE1 gene was able to affect expression of some LSM genes, although the patterns of expression in cells grown on different carbon sources were not the same as those in the wild type. Sequences within the intron that regulate LSM genes were identified through sequential targeted mutagenesis. The data indicated that the splicing elements are required for full function of the intron. Moreover, a 24 nt region of the intron was found to be important in controlling the level of expression of the mature LSM7 transcript and other LSM genes in response to growth on different carbon sources.

Microarray analysis was also employed to determine the full extent of the regulatory effect of the intron. Deletion of just the LSM7 intron affected a large group of genes involved in mating in haploids, and a mating efficiency assay provided strong evidence of the requirement for the LSM7 intron in mating regulation. However, this regulation was not exerted through control of pheromone production or sensitivity towards the pheromone of the opposite mating type.

These data not only constitute a step towards understanding the regulation of LSM genes but also provide an example of a novel mechanism for gene regulation driven by an intron that can act in trans in a relatively simple eukaryote that lacks the machinery for gene silencing.


Abbreviations

5’SS    5’-splice site
3’SS    3’-splice site
aa    amino acid
BPS    branch point sequence
bp    base pairs
d    day
h    hour
kb    kilobase
LB    Luria-Bertani broth
Lsm    like-Sm
min    minute
miRNA    micro RNA
mRNA    messenger RNA
ncRNA    non-coding RNA
nt    nucleotides
OD600    optical density measured at 600 nm wavelength
ORF    open reading frame
PCR    polymerase chain reaction
pre-mRNA    precursor mRNA
qRT-PCR    quantitative real-time polymerase chain reaction
rRNA    ribosomal RNA
RNAi    RNA-mediated gene silencing
s    second
snRNP    small nuclear ribonucleoprotein
snRNA    small nuclear RNA
snoRNA    small nucleolar RNA
SNP    single nucleotide polymorphism
SC    synthetic complete (minimal) medium
SD    synthetic defined (minimal) medium
S. cerevisiae    Saccharomyces cerevisiae
tRNA    transfer RNA
UTR    untranslated region
WT    wild-type
YEPD    rich glucose medium
YEPA    rich acetate medium

Gene names for dominant alleles are designated by the standard italicized capitalized three-letter mnemonic, followed by a number (e.g. LSM7). Recessive mutant alleles are denoted in italicized lower case (e.g. lsm7). Deletion is indicated by Δ. The protein product of the gene is designated by Roman type, with the first letter capitalized (e.g. Lsm7). Open reading frames are designated with three letters, followed by a three-digit numerical code and ending in either “c” or “w”. The nomenclature indicates the chromosomal position of the open reading frame (e.g. YNL147W).


Acknowledgments

It has been a privilege to conduct my PhD study in this laboratory. First of all, I would like to pass my greatest thanks to Prof. Ian Dawes for not only the opportunity to work in his lab but also his support, guidance, advice and understanding throughout the years of my study. Without his positive outlook and encouragement, this project would not have been possible. I will always remember your dry ice trick that always works and the moves on the dance floor.

My appreciation also goes to Geoff Kornfeld for all his support and guidance, from small things like making me a special RNA esky to, more importantly, supporting me in the project in many ways. Also, I would like to pass my special thanks to Dr. Joyce Chu and Dr. Ruby Lin, who have not only kindly shared their knowledge and experience for most of the RNA work, but also encouraged me throughout my study. To all the members of “the splicing” team, thank you for all the hard work contributed to the project, especially Michael Dengler for his nice work that supported me in believing in my own.

A huge “thank you” must go to every past and present member of the Dawes lab, Cristy, Suresh, Radhika, Anita, Gab, Duncan, Shixiong, Chong Han, Nadia, Bonny, Abraham and too many more to list, for filling the lab with so much fun, as well as for all the advice and help of all sorts. Thanks also to Monica and May for their caring and for sharing all the highs and lows in life. Thank you all for the time we have shared. You have been very good friends and colleagues.

Finally, I would like to thank my wonderful family, Mum, Dad and my brother Shawn, as well as my partner Boon, for their unconditional love, support and encouragement throughout the years. Your emotional, intellectual and financial support is very much appreciated and cherished.


Publications

Loughlin F. E., Mansfield R. E., Vaz P. M., McGrath A. P., Setiyaputra S., Gamsjaeger R., Chen E. S., Morris B. J., Guss J. M., Mackay J. P. (2009) The zinc fingers of the SR-like protein ZRANB2 are single-stranded RNA-binding domains that recognize 5' splice site-like sequences. Proc Natl Acad Sci U S A 106: 5581-5586

Chen E. S., Kornfeld G. D., Mabbutt B. C., Dawes I. W. (2011) The LSM7 intron-mediated gene regulation in Saccharomyces cerevisiae. (Manuscript in preparation)


Introduction

Chapter 1 Introduction

It has been more than 50 years since Francis Crick first articulated “the central dogma of molecular biology” (Crick, 1970; Crick, 1958), proposing that genetic information flows from DNA to RNA and is irreversibly translated into proteins. The theory rapidly gained acceptance as one of the intellectual foundations of the life sciences and has been widely recognized as a keystone of molecular biology. However, as a consequence it has generally been assumed that proteins fulfill most cellular functions and that RNA transcripts serve only as genetic intermediates between DNA sequences (genes) and their encoded proteins. Contrary to that assumption, recent work has shown that RNA is much more extensively involved in cell regulation than was previously thought.

Although the human genome was found to be twenty-five times larger than any previously sequenced genome, the human genome projects led to a striking estimate that the number of protein-coding genes in the human (~25,000-30,000) is much lower than expected from the increase in developmental complexity relative to lower eukaryotes (Lander et al., 2001; Venter et al., 2001). Only 1.1% of the genome is spanned by exons, whereas introns and intergenic regions constitute 24% and 75% of the genome, respectively (Venter et al., 2001). It is now known that humans and other vertebrates such as the mouse (Mus musculus) and chicken (Gallus gallus) have approximately the same number of protein-coding genes as the nematode worm (Caenorhabditis elegans; made up of only 10³ cells), most of which share strong homology with their orthologues (Taft et al., 2007). Furthermore, only 1% of the single nucleotide polymorphisms (SNPs) between individual humans account for the variation between them, suggesting that the protein-coding genes are not involved in the majority of phenotypic variation between individuals, as well as between species (Mattick, 2001; Venter et al., 2001).

Extensive studies of the mechanisms of genetic regulation involved in phenotypic variation and complexity of higher eukaryotes have focused on the protein isoforms generated by alternative splicing (Castle et al., 2008; House & Lynch, 2008; Lareau et al., 2004; Levine & Tjian, 2003; Nagasaki et al., 2005; Nilsen & Graveley, 2010). It is generally accepted that the expanded variety of combinatorial interactions between isoforms of sequence-specific regulatory proteins and cis-regulatory elements, including enhancers, promoters and transcripts, modulates messenger RNA (mRNA) expression and processing in complex organisms (Levine & Tjian, 2003; Mattick et al., 2010; Nilsen & Graveley, 2010). In addition, a range of other post-transcriptional and post-translational modifications also contribute to the protein regulatory repertoire (Gingeras, 2007; Yang, 2005). On the other hand, more recent work has shown a positive correlation between the proportion of non-protein-coding intronic and intergenic sequences and the complexity of an organism, ranging from 0.25% of a prokaryote’s genome to 98.8% of the human genome, suggesting that these sequences may contain increasingly elaborate regulatory information (Ahnert et al., 2008; Taft et al., 2007).

High-resolution genome-wide transcriptomic analyses, such as full-length cDNA sequencing and the use of genome tiling arrays, have confirmed that eukaryotic genomes are almost entirely transcribed (Carninci, 2006; Cheng et al., 2005; Johnson et al., 2005; Kapranov et al., 2007b; Mattick, 2009; Mattick & Makunin, 2006). Examples include 90% of the human genome from the study of 44 diverse regions (~1%) of the genome by the Encyclopedia of DNA Elements (ENCODE) Consortium (Birney et al., 2007), 85% of the fruit fly (Drosophila melanogaster) genome during early embryogenesis (Manak et al., 2006) and 85% of the unicellular yeast (Saccharomyces cerevisiae) genome in rich media (David et al., 2006). Most RNA species are transcripts with low protein-coding potential, termed non-coding RNAs (ncRNAs).

While some ncRNAs are ubiquitously expressed, many appear to be regulated differentially. For example, analysis of human chromosomes 21 and 22 has found ncRNA transcripts responsive to retinoic acid stimulation (Cawley et al., 2004). Likewise, the Functional Annotation of Mouse (FANTOM) consortium has shown that a significant proportion of ncRNA transcripts exhibit tissue-specific expression patterns in the mouse (Ravasi et al., 2006). In several cases, ncRNAs were also found to undergo post-transcriptional modifications such as alternative splicing and polyadenylation (Carninci et al., 2005; Cawley et al., 2004; Hüttenhofer, 2006; Johnson et al., 2005). Thus, the increase in non-coding sequences is accompanied by a concomitant increase in the transcription of ncRNAs whose expression is regulated in order to modulate correct cellular functions.

Determining the function of ncRNAs is relatively difficult owing to their transient nature, variation in length and lower sequence conservation, which make them more resistant to conventional genetic screens. Evidence of ncRNAs with disparate properties and regulatory functions is growing (Carninci, 2010; Mattick, 2009; Mattick et al., 2010; Prasanth & Spector, 2007). Several short and long ncRNAs have been found to be associated with complex diseases, indicating the importance of this extra tier of regulation in coordination with proteins as functional units (Kapranov, 2009; Mercer et al., 2009; Prasanth & Spector, 2007; Taft et al., 2010a).

The aim of this thesis was to study the regulation of Like-Sm (LSM) genes that are critically involved in several aspects of RNA metabolism, including pre-mRNA splicing.


This chapter presents a brief review of current classifications and examples of regulatory ncRNAs in eukaryotes, in particular those derived from introns. An overview is given of the spliceosome complex, its role in splicing, and the important roles of the core splicing proteins, Sm and Lsm, in RNA biogenesis, to provide the background to this study, which sought to identify the role of an intron in the yeast LSM7 gene.


1.1 The hidden layer of gene regulation, non-coding RNAs

The first RNA molecules capable of catalyzing reactions (ribozymes) were discovered in the early 1980s, leading to the theory of “the RNA world” proposing that RNA fulfilled the roles of DNA (genes) and proteins (enzymes) early in evolution as part of the origin of life (Gilbert, 1986; Guerrier-Takada et al., 1983; Kruger et al., 1982).

However, for many years, it was believed that there were only a few functional ncRNAs, including transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nucleolar RNAs (snoRNAs) and spliceosomal small nuclear RNAs (snRNAs), that are required for general house-keeping mRNA processing and translation (Eddy, 2001).

In the late 1990s, the discovery of small RNA-mediated gene silencing (RNAi) in C. elegans led to a paradigm shift and uncovered a hidden layer of gene regulation that integrates the transcriptome with the proteome (Carninci, 2010; Taft et al., 2010a).

Publications on regulatory ncRNAs covering almost every level of gene regulation, including chromatin architecture, epigenetic memory, transcription, RNA splicing, editing, translation and turnover, have increased almost exponentially over the last decade. Different criteria have been used to classify ncRNAs, which range from small (< 200 nt) to long (> 200 nt) and comprise various classes that differ in function, genomic proximity to protein-coding genes, sequence characteristics, protein binding partners and biogenesis pathways (Table 1.1) (Amaral et al., 2008; Amaral & Mattick, 2008; Carninci, 2010; Mattick et al., 2010; Taft et al., 2010a).


Table 1.1 Summary of ncRNA classes and their functions (Amaral & Mattick, 2008; Carninci, 2010; Kapranov et al., 2010; Taft et al., 2010a; Taft et al., 2010b)

ncRNA class | Full name of ncRNA class | Feature/Function
rRNAs | ribosomal RNAs | Protein synthesis
tRNAs | transfer RNAs | Protein synthesis
snRNAs | small nuclear RNAs | Components of snRNPs, involved in splicing and other functions
snoRNAs | small nucleolar RNAs | 60-300 nt long, involved in tRNA, rRNA and snRNA biogenesis, guide chemical modification of other ncRNAs and gene regulation (“orphan” snoRNAs)
miRNAs | micro RNAs | 21-24 nt long small RNAs, produced by Dicer and Drosha two-step cleavage of imperfect RNA hairpins, involved in post-transcriptional regulation of gene expression
siRNAs | small interfering RNAs | 20-25 nt long small RNAs, produced by Dicer cleavage of dsRNA duplexes or internal hairpins, involved in gene regulation, transposon control and viral defense
piRNAs | PIWI-interacting RNAs | Dicer-independent 26-31 nt long small RNAs, involved in germline-restricted gene silencing and regulation
sdRNAs | sno-derived RNAs | Small ncRNAs processed from snoRNAs, some are Dicer independent; involved in miRNA-like gene regulation
tdRNAs | tRNA-derived RNAs | Small ncRNAs processed from tRNAs by RNases, able to induce translational repression
PARs | promoter-associated RNAs | Long and short ncRNAs derived from promoter regions, including 5'-capped promoter-associated small RNAs (PASRs), transcription initiation RNAs (tiRNAs) and transcription start site antisense RNAs (TSSa RNAs)
spliRNAs | splice sites-associated RNAs | ~17-18 nt long small RNAs, derived from splicing donor site of internal exons, may be involved in regulation of highly expressed genes
TASRs | termini-associated short RNAs | 20-70 nt long, derived from regions around transcription termination site
aTASRs | antisense TASRs | Derived from the very 3’ end of an mRNA, with a non-genomically encoded 5’ poly(U) tail
moRNAs | microRNA-offset RNAs | ~20 nt long small RNAs derived from regions adjacent to pre-miRNAs
MSY-RNAs | MSY2-associated RNAs | 26-30 nt long pi-like small RNAs associated with germline-specific protein MSY2
tel-sRNAs | telomere small RNAs | Dicer-dependent 24 nt long small RNAs, derived from G-rich strand of telomeric repeats
crasiRNAs | centromere repeat-associated small interacting RNAs | 34-42 nt long small RNAs, derived from centromeres
xiRNAs | X-inactivation RNAs | 25-42 nt Dicer-dependent small RNAs processed from duplexes of lncRNAs Xist and Tsix, involved in X-inactivation in placental mammals
lncRNAs | long non-coding RNAs | All ncRNAs more than 200 nt, including X-inactivation lncRNAs (Xist and Tsix), promoter-associated long RNAs (PALRs), and long intergenic RNAs (lincRNAs)

1.1.1 Small ncRNAs

Smaller RNA species are often processed from large RNA precursors (Amaral & Mattick, 2008); the best examples are microRNAs (miRNAs) and small interfering RNAs (siRNAs), which are produced from RNA hairpins and dsRNAs by RNase III (Drosha and Dicer) cleavage (Carthew & Sontheimer, 2009; Ghildiyal & Zamore, 2009). Based on sequence-specific recognition, mature miRNAs and siRNAs modulate the modification of target loci and the translation or degradation of various transcripts in animals and plants by associating with effector proteins of the Argonaute family to form RNA-induced silencing complexes (RISC) that execute their functions. On the other hand, the most abundant small RNAs, PIWI-interacting RNAs (piRNAs), which act in association with PIWI-specific Argonaute proteins in germline-specific transposon defense and gene regulation, were found to be derived from long ssRNAs through a Dicer-independent process (Siomi et al., 2011).

Regulatory ncRNAs can also be processed further into smaller functional RNA species. These include the recently identified snoRNA-derived RNAs (sdRNAs), tRNA-derived RNAs (tdRNAs) and X-inactivation RNAs (xiRNAs) (Carninci, 2010; Taft et al., 2009b). Interestingly, tdRNAs identified from various organisms have all been shown to regulate cell proliferation (Ardelt et al., 2003; Williams, 2009). For example, tdRNAs produced by the ribonuclease angiogenin have been shown to promote translational arrest in stressed human tissue. Similarly, cleavage of tRNAs by Rny1 in yeast promotes cell death under oxidative stress (Thompson & Parker, 2009; Yamasaki et al., 2009).

1.1.2 Long ncRNAs

In comparison with small ncRNAs, long ncRNAs (lncRNAs) are not as well studied and classified. Many have often been assumed to be transcriptional noise or precursors of smaller RNAs; however, lncRNAs are primarily known to have a role in silencing of gene expression in processes such as X chromosome inactivation and epigenetic imprinting (Mercer et al., 2009). For example, the iconic lncRNA Xist recruits the Polycomb chromatin remodeling complex PRC2 through its internal locus RepA to silence one X-chromosome in female mammalian cells, whereas PRC2 is blocked by the antisense transcript Tsix from the remaining active X-chromosome (Panning, 2008). Interestingly, Dicer-dependent xiRNAs derived from the Xist and Tsix duplex have recently been recognized and are also required in X-inactivation, demonstrating an integrated regulatory network of small and long ncRNA pathways in gene regulation (Ogawa et al., 2008). Furthermore, as with small ncRNAs, lncRNAs have also been implicated in various forms of transcriptional regulation. More recent studies on the human genome have identified a class of long intergenic ncRNAs (lincRNAs) with enhancer-like function, depletion of which leads to decreased expression of neighboring genes (Ørom et al., 2010).

With the potential to recognize RNA, DNA and proteins not only through sequence but also through specific structures, lncRNAs are believed to exert more sophisticated control in complex eukaryotes (Mattick et al., 2010; Spirin, 2002). In fact, several lncRNAs have been found to be bifunctional, not only playing roles as trans-regulatory elements but also encoding a protein (Dinger et al., 2008). For example, the human steroid receptor RNA activator (SRA) transcript, which co-activates steroid hormone receptors, was later found to code for the functional protein SRAP, which acts antagonistically (Chooniedass-Kothari et al., 2004; Chooniedass-Kothari et al., 2010; Lanz et al., 1999). Furthermore, it was predicted that around 5% of yeast mRNAs encode at least one RNA secondary structure which may function independently of the transcript level (Warden et al., 2008).

1.1.3 Complex genomic organization

In parallel to the discovery of ncRNAs, mapping of the expressed sequences in the genome has revealed an extraordinarily sophisticated organization of the transcriptome (Carninci, 2010; Kapranov, 2009; Mattick et al., 2010). Multifunctional usage of the same genomic space is common in eukaryotes. Many transcripts have been found to overlap with others from the same or opposite strands, and these often contain regions of protein-coding sequences including exons and introns. Several newly discovered classes of small and long ncRNAs, including promoter-associated small RNAs (PASRs), termini-associated short RNAs (TASRs), splice sites-associated RNAs (spliRNAs) and transcription initiation RNAs (tiRNAs), with various possible biogenesis pathways and roles, have been mapped to regulatory regions of protein-coding genes (Kapranov et al., 2007a; Kapranov et al., 2010; Taft et al., 2009a; Taft et al., 2009c; Taft et al., 2010b). In addition, recent observations of capped exon-derived RNAs have revealed a conserved secondary capping of cleaved transcripts regulated in a developmental-stage and tissue-specific manner (Carninci et al., 2006; Fejes-Toth et al., 2009). This post-transcriptional RNA cleavage is believed to play a significant part in the diversity of both the non-coding and coding transcriptional repertoire of eukaryotic genomes (Mercer et al., 2010).

It is now clear that ncRNAs represent an important functional expression of the genome, especially in complex organisms. The evolution of developmental complexity may have been dependent on a much larger set of RNA-mediated regulation rather than expansion of the protein repertoire. However, with new ncRNA species continuously emerging, detailed experiments are still required to distinguish ncRNAs observed in high-throughput screens from experimental artifacts and degradation products, and to further understand their functions and biogenesis. On the other hand, aberrant regulation of both small and long ncRNAs has been linked to various diseases, including cancers, central nervous system disorders and cardiovascular diseases (Taft et al., 2010a; Zhang, 2009). Increased understanding of ncRNA biogenesis would therefore not only provide a bigger picture of global dynamics in gene regulation but also offer better insights into disease-causing mechanisms and ultimately lead to the development of diagnostic and therapeutic tools.


1.2 Nuclear introns

Introns are the non-coding intervening regions of precursor mRNA (pre-mRNA) or other RNA species that are spliced out before the mature RNA is formed. Since introns do not code for proteins, their presence in genomes has been considered to reflect evolutionary remnants from the early assembly of genes (intron-early hypothesis) and/or the accumulation of evolutionary debris from transposons and other sources in higher organisms (intron-late hypothesis) (Fedorova & Fedorov, 2003; Jeffares et al., 2006; Roy & Gilbert, 2006). However, precise removal of introns is critical for fundamental cellular functions, and this requires highly regulated machinery and a significant commitment of time and metabolic energy.

1.2.1 Significance of introns

Because of similarities in their splicing pathways, modern nuclear introns, which are removed by a sophisticated complex of proteins and RNAs termed the spliceosome (see Section 1.3), are believed to have descended from group II self-splicing introns (Mattick, 1994; Roy & Gilbert, 2006). It has been speculated that the devolution of cis-acting catalytic RNAs into trans-acting spliceosomal RNAs and the recruitment of accessory proteins reduced the internal sequence constraints on introns and allowed them to drift, expand and evolve. Any sequences that acquired a function would have a certain selective value and form part of regulatory networks (Mattick, 1994).

Although introns are less conserved than their associated protein-coding exons, studies across species from yeast to vertebrates have found highly conserved sequences enriched in introns of genes related to development and transcriptional control, mostly in complex organisms (Bejerano et al., 2004; Glazov et al., 2005; Siepel et al., 2005; Sironi et al., 2005). Intron density correlates well with developmental complexity, ranging from 10-20% of primary transcripts in fungal species, with an average intron size of less than 100 base pairs, to over 95% of human primary transcripts, with an average intron size of up to six thousand base pairs (Mattick & Gagen, 2001; Taft et al., 2007). Accumulation of intronic sequences has been linked to transcriptional delay, which contributes to development, where timing and dynamic patterns of expression are important (Swinburne & Silver, 2008). Furthermore, the distribution of introns in complex organisms was also found to be non-random. Genes rich in intronic sequences have been found to be enriched amongst genes that are highly expressed in the nervous system, and amongst genes down-regulated in embryonic stem cells and cancers (Taft et al., 2007), suggesting that important regulatory information is retained in these sequences. The fact that most introns and ncRNAs are less conserved than their associated protein-coding exons does not mean that they lack function, but rather that they are subject to fewer constraints.

1.2.2 Function of introns

Given the characteristic dispersion of introns between protein-coding exon sequences, it is known that introns play a role in exon shuffling and alternative splicing and also serve as a source of cis-regulatory elements, increasing proteome complexity in higher organisms (Fedorova & Fedorov, 2003; Roy & Gilbert, 2006). Exons can be rearranged by recombination or duplication to create mosaic proteins, and this is believed to be an important source of protein evolution. It is estimated that about 19% of exons in eukaryotic genes have been formed by exon shuffling (Long et al., 2003). On the other hand, specific protein isoforms are generated by alternative splicing in particular cell types, at particular stages of development, or under specific biological stimuli. By including and/or excluding different sequences, alternative splicing not only generates segments of mRNA variability but also affects gene expression by removing or inserting cis-regulatory elements controlling translation, mRNA stability, or localization. Up to 59% of human genes generate multiple mRNAs by alternative splicing (Lander et al., 2001), and approximately 80% of alternative splicing results in changes in the encoded protein (Modrek & Lee, 2002).

Apart from their role in increasing proteomic complexity, introns were previously assumed to be simply degraded once processed from precursor transcripts. However, the possibility that intronic RNAs may be stable and functional has been shown by intron-specific subcellular localization and stability experiments (Clement et al., 2001; Clement et al., 1999). Given the extensive discovery of functional ncRNAs, it is now evident that introns are important sources of functional ncRNAs, including snoRNAs, miRNAs, endogenous siRNAs and lncRNAs (Brown et al., 2008; Louro et al., 2009; Mattick & Makunin, 2006; Rearick et al., 2010). This organization provides coordinated expression of intronic ncRNAs with their host mRNAs (Baskerville & Bartel, 2005; Brown et al., 2008; Rodriguez et al., 2004; Vincenti et al., 2007). Most of the intronic snoRNA genes identified are involved in ribosome biogenesis and nucleolar function, which require snoRNAs to guide site-specific methylation and pseudouridylation of rRNAs, tRNAs and snRNAs (Bachellerie et al., 2002; Brown et al., 2008). In yeast, altering the position of intronic snoRNAs has been found to reduce the expression of both the snoRNA and the host mRNA at a post-transcriptional level (Vincenti et al., 2007).

The majority of snoRNAs and miRNAs in complex eukaryotes are encoded in introns of both protein-coding and non-protein-coding genes (Brown et al., 2008; Rearick et al., 2010). Processing of snoRNAs is commonly splicing-dependent and occurs by the exonucleolytic processing of debranched introns after their excision from the pre-mRNA (Brown et al., 2008). On the other hand, polycistronic snoRNAs found in plants and yeast are released by specific splicing-independent endonucleolytic cleavage followed by an exonucleolytic reaction. Similarly, in contrast to monocistronic snoRNAs, animal intronic miRNAs commonly exist in clusters (Brown et al., 2008; Ying & Lin, 2006). Apart from being liberated from excised and linearized introns, miRNAs can be processed from excised intron lariats or from spliced pre-mRNAs through Drosha-mediated endonucleolytic cleavage. As an exception, miRNAs derived from a class of short introns (mirtrons) identified in D. melanogaster and C. elegans can be released by a splicing reaction and bypass Drosha processing (Ruby et al., 2007). However, it is currently unclear to what extent functional ncRNAs exist in the introns of any particular genome.

1.2.3 Introns in Saccharomyces cerevisiae

In comparison with other eukaryotes, including Schizosaccharomyces pombe and other ancient ascomycetes, hemiascomycetous yeasts have experienced a massive loss of introns and of numerous genes involved in splicing (Aravind et al., 2000; Bon et al., 2003; Fabrizio et al., 2009). Only around 5% of the roughly 6000 genes in the budding yeast S. cerevisiae contain an intron (Davis et al., 2000; Juneau et al., 2007; Miura et al., 2006; Spingola et al., 1999). Most of these genes are mono-intronic, with only ten genes containing two introns, and alternative splicing is rarely required. The majority of introns are located at the 5’ end of the gene, with an average size of around 100 to 400 nt, and they contain more conserved consensus splicing signals than introns in higher eukaryotes (Bon et al., 2003).

Although introns appear to be on the way out of the yeast genome (Fink, 1987) and studies have shown that deletion of most introns has no major phenotypic consequences, at least in the conditions tested (Ng et al., 1985; Parenteau et al., 2008), S. cerevisiae introns are not randomly distributed throughout the genome, but are mostly found in highly expressed ribosomal protein genes and account for nearly one-third of total cellular transcription (Ares et al., 1999). At least 102 of the 139 ribosomal protein genes are interrupted by an intron. Among non-ribosomal protein genes, introns are also over-represented in genes related to secretion and meiosis (Juneau et al., 2007) and in essential genes (Skelly et al., 2009).

While only a few snoRNAs are contained in yeast introns, several examples of intron-dependent gene regulation have been demonstrated, including autoregulated introns of RPL30 and YRA1 (Li et al., 1996; Meyer & Vilardell, 2009; Preker & Guthrie, 2006), transcriptional and translational enhancement (Juneau et al., 2006) and transcriptional response to environmental stresses (Pleiss et al., 2007). Splicing of several constitutively transcribed meiosis genes has also been found to be activated only during meiosis (Davis et al., 2000; Engebrecht et al., 1991; Juneau et al., 2007). Therefore, introns may have been retained or gained for a sophisticated level of regulation in gene expression through splicing in a system with relatively simple machinery.


1.3 Pre-mRNA splicing and the major spliceosome

Pre-mRNA splicing is a fundamental biological process in higher organisms, involving the precise removal of intervening introns and the rejoining of flanking coding exons to form mature transcripts prior to nuclear export and translation. Since more than 90% of mammalian genes contain introns (Lander et al., 2001) and any splicing error that adds or removes even one nucleotide will disrupt the open reading frame of an mRNA, precise cut-and-paste reactions are critical to the viability of the cell. In fact, aberrant splicing has been implicated in several disorders including cystic fibrosis and some breast cancers due to exon skipping (Liu et al., 2001; Ward & Cooper, 2010). It has been estimated that up to 60% of disease-causing mutations in humans affect RNA splicing (Lopez-Bigas et al., 2005; Wang & Cooper, 2007).

1.3.1 Basal machinery

The splicing reaction is catalyzed by a large RNA-protein macromolecule, termed the spliceosome, which comprises five small nuclear ribonucleoproteins (snRNPs) and more than 100 accessory proteins that ensure the accuracy of the process (Staley & Woolford Jr, 2009; Wahl et al., 2009). In the most abundant (“major” or U2-dependent) class of spliceosome, each snRNP consists of a single-stranded uridine-rich small nuclear RNA (U snRNA; U1, U2, U4, U5 or U6) and seven Sm or Lsm proteins (see Section 1.4) as well as other proteins specific for each snRNP (U-specific proteins). Although a recent mass spectrometry study showed that the yeast spliceosome contains fewer accessory proteins than those of higher eukaryotes, the basic compositional dynamics of the splicing machinery are evolutionarily conserved from yeast to humans (Fabrizio et al., 2009).

In order to remove an intron precisely, both the protein and RNA components of the snRNPs recognize a number of short consensus sequences in the pre-mRNA, including the 5’-splice site (5’SS) and the 3’-splice site (3’SS) at the exon-intron boundaries, as well as the branch point sequence (BPS) within the intron (Figure 1.1) (Wang & Burge, 2008). Other non-splice site regulatory elements such as splicing enhancers and splicing repressors can also be found both in the introns (ISE and ISS) and exons (ESE and ESS) of a pre-mRNA. These elements modulate both constitutive and alternative splicing by interacting with regulatory proteins that either stimulate or repress the assembly of a spliceosome at an adjacent splice site.
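As a rough illustration of how the three consensus elements delimit an intron, the following Python sketch scans a DNA sequence for a candidate 5’SS, BPS and 3’SS in that order. The motif strings (GTATGT, TACTAAC and YAG, the commonly cited S. cerevisiae consensus sequences written in the DNA alphabet), the function name and the toy sequence are assumptions made for illustration and are not taken from this thesis; taking the first YAG downstream of the branch point is a simplification, and real splice-site prediction must tolerate degenerate matches.

```python
import re

# Assumed consensus motifs (DNA alphabet); illustrative only.
FIVE_SS = "GTATGT"                 # candidate 5' splice site
BPS = "TACTAAC"                    # candidate branch point sequence
THREE_SS = re.compile(r"[CT]AG")   # candidate 3' splice site (YAG)


def find_candidate_intron(seq):
    """Return (start, end) of the first candidate intron, end exclusive, or None."""
    seq = seq.upper()
    start = seq.find(FIVE_SS)
    if start == -1:
        return None
    # The branch point must lie downstream of the 5' splice site.
    bp = seq.find(BPS, start + len(FIVE_SS))
    if bp == -1:
        return None
    # Simplification: take the first YAG downstream of the branch point.
    match = THREE_SS.search(seq, bp + len(BPS))
    if match is None:
        return None
    return start, match.end()


# Hypothetical toy gene: exon 1 + intron + exon 2.
toy = "ATGGCT" + "GTATGTTCGATCGTACTAACATTTTTCTTTAG" + "GGTACCTAA"
span = find_candidate_intron(toy)
```

Here `span` marks the boundaries of the toy intron, so `toy[:span[0]] + toy[span[1]:]` gives the spliced sequence.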

Besides the “major” spliceosome, a functionally and structurally analogous low-abundance (“minor” or U12-dependent) spliceosome exists in parallel in most multi-cellular eukaryotes (Patel & Steitz, 2003). This particular spliceosome catalyzes the removal of non-canonical introns that represent <1% of introns in mammals (Sheth et al., 2006). In a number of lower eukaryotes, a third class of spliceosome (“trans”) has also been found catalyzing the splicing from separately transcribed precursor RNAs (Bonen, 1993; Liang et al., 2003). Each of these three classes of spliceosome requires a different combination of snRNPs and recognizes different splicing signals in precursor RNAs (Tycowski et al., 2006). Since the work presented here focuses on the major class of spliceosome, the other two types of splicing will not be described further.

1.3.2 Spliceosome assembly and splicing reactions

The process of splicing is very dynamic, requiring a number of highly regulated conformational and compositional rearrangements of the spliceosome complex to bring together the consensus sequences in order to catalyze the two consecutive trans-esterification reactions (Figure 1.1). The well-accepted model for spliceosome assembly involves five ordered interactions of the snRNPs and protein splicing factors on the pre-mRNA substrate through various RNA-RNA, RNA-protein and protein-protein recognitions (Figure 1.1) (Wahl et al., 2009).

The assembly begins with recognition of the 5’SS by the U1 snRNP via base-pairing with the U1 snRNA. This earliest defined complex in spliceosome assembly is known as the E (early) complex. Subsequently, the E complex associates with U2 snRNP in a sequence-specific manner at the BPS to form the A complex. Consensus splicing sequences are recognized multiple times through interactions with different components during the course of assembly. The addition of preformed U4/U6.U5 tri-snRNP to the A complex gives rise to the B complex, which undergoes another rearrangement of its RNA network and overall structure to form the catalytically active B* complex. In this remodeling, the U1 snRNP at the 5’SS is displaced by the U6 snRNP and the U1 and U4 snRNPs dissociate from the complex. The activated spliceosome then undergoes the first catalytic step of splicing, in which the adenosine at the BPS attacks the 5’SS, generating a cleaved 5’ exon and an intron-3’ exon intermediate. The resulting C complex catalyzes the second reaction to generate ligated 5’ and 3’ exons and a cleaved intron in lariat form. The dissociated snRNPs are recycled for the next round of splicing.

Several observations have indicated that the spliceosomal snRNPs may also exist as a large preformed complex prior to the engagement of pre-mRNA. Evidence includes the penta-snRNP and a large 200S RNP complex isolated from S. cerevisiae (Stevens, 2002) and HeLa nuclear extracts (Malca et al., 2003), respectively. Both consist of all five snRNPs and exhibit splicing activity to different extents. Nevertheless, the stepwise assembly model gives a clear picture of the recruitment of each snRNP to the pre-mRNA, and the remodeling events are still required for the formation of a catalytically active spliceosome (Brow, 2002).

Figure 1.1 Pre-mRNA splicing by the major spliceosome. Two exons (blue boxes) separated by an intron (blue line) are shown, with the consensus splice-site sequences (5’ splice site, branch point and 3’ splice site) of a major-class pre-mRNA indicated. Interactions of the U snRNPs at discrete stages of splicing are as labelled. For simplicity, interactions of non-snRNP splicing factors are not shown. The two trans-esterification reactions in the active spliceosome complexes B* and C are described in red. In the first reaction, the phosphate at the 5’SS is attacked by the 2’-hydroxyl (2’OH-) group of a conserved adenosine nucleotide (BPS) bulged from an RNA–RNA duplex created by base-pairing with U2 snRNA. This leads to cleavage of the 5’ exon from the intron and ligation of the intron 5’ end to the branch point 2’-hydroxyl. The 3’ splice site is then ligated with the 5’ exon in the second reaction, where the phosphate at the intron 3’ end is attacked by the 3’-hydroxyl (3’OH-) of the detached 5’ exon. Finally, the intron is released in the form of a lariat. Figure adapted from (Krummel et al., 2010).


1.3.3 Catalytic centre of the spliceosome

Unlike conventional enzymes, the spliceosome lacks a preformed active site, and the catalytic core must be reassembled through multiple conformational rearrangements in each round of splicing. Much work has been done to elucidate not only the reaction pathways but also the proteins and RNAs involved in catalysis (Wachtel & Manley, 2009). The spliceosome is considered to be a ribozyme, as the mechanism and structure of pre-mRNA splicing resemble the self-splicing of group II introns (Collins & Guthrie, 2000; Valadkhan, 2007). In particular, there are strong similarities between the catalytically active structure (domain V) of group II introns and the stem-loop structure of U6 snRNA in the U2-U6 duplex (Sashital et al., 2004; Toor et al., 2008); both bind a divalent metal ion that is required for the splicing reactions (Sontheimer et al., 1997; Yean et al., 2000). These observations also support the theory that the spliceosome and nuclear introns have evolved in tandem from these self-splicing introns (Pyle & Lambowitz, 2006; Roy & Gilbert, 2006). Nonetheless, numerous proteins are still required to assemble the spliceosome into a catalytically active structure, to promote effective splicing and to couple it to other cellular processes (Bessonov et al., 2008; Fabrizio et al., 2009; Warkocki et al., 2009). Among the enormous number of proteins involved, the largest and most conserved splicing factor, Prp8 (62% identity between yeast and humans), resides at the heart of the spliceosome and has been implicated in splicing catalysis (Collins & Guthrie, 2000; Grainger & Beggs, 2005). Owing to its position in the spliceosome, Prp8 not only interacts with several splicing factors but also cross-links to the consensus splice sites and almost all snRNAs, which stabilizes the structure of the spliceosome (Newman & Nagai, 2010). It is also involved in the transition between the two trans-esterification steps by changing the spliceosome conformation to promote one step while inhibiting the other (Liu et al., 2007; Query & Konarska, 2004). Interestingly, recent crystal structure studies of Prp8 have revealed a conserved domain for RNase H-like catalysis (Abelson, 2008; Pena et al., 2008; Ritchie et al., 2008; Yang et al., 2008). However, the precise function of the RNase H-like domain is yet to be clarified, as the domain has not been found to coordinate any metal ions critical for RNA cleavage.


1.4 The Sm protein family

As mentioned previously, highly regulated pre-mRNA splicing requires five snRNPs for important structural and catalytic functions. In addition to the various U-specific proteins involved in the formation of each snRNP, seven Sm or Lsm proteins associate with a specific snRNA to form the core of the corresponding snRNP (Will & Lührmann, 2001). These Sm and Lsm proteins are members of a highly conserved protein family (Sm-protein family) of ancient origin, found in various organisms ranging from bacteria and archaea to yeast and humans (Salgado-Garrido et al., 1999; Scofield & Lynch, 2008; Seraphin, 1995).

1.4.1 Sm proteins

The founding members of the family, the Sm proteins, were first discovered as autoimmune targets in patients suffering from systemic lupus erythematosus (Beggs, 2005; Lerner & Steitz, 1979). The seven Sm proteins identified (SmB/B’, SmD1, SmD2, SmD3, SmE, SmF and SmG; the two isoforms B/B’ differ by 11 residues and are jointly referred to as B in the following) have been implicated as necessary for maturation of snRNPs for splicing. With the exception of U6 snRNA, the U1, U2, U4 and U5 snRNAs are modified with N7-methyl-guanosine (m7G) caps at their 5’ ends after they are synthesized by RNA polymerase II in the nucleus (Beggs, 2005; Will & Lührmann, 2001). Capped snRNAs are exported to the cytoplasm, where they associate with the seven Sm proteins through their conserved Sm site (AU(4-6)G) in single-stranded regions. None of the individual Sm proteins binds stably to snRNA. In the absence of snRNA, the Sm proteins exist as three subcomplexes, D1D2, D3B and EFG. At a minimum, interaction of the EFG and D1D2 subcomplexes is required for formation of a stable intermediate snRNP core. The assembly is completed by the final association of the D3B/B’ heterodimer (Raker et al., 1996). Hypermethylation of the 5’ cap of the snRNAs to 2,2,7-trimethylguanosine (m3G) is then triggered for nuclear import of the assembled core snRNPs. This allows maturation of snRNPs by association with U-specific proteins upon the formation of the spliceosome.

All Sm proteins share a conserved Sm domain consisting of two motifs rich in hydrophobic residues (Sm1 and Sm2) separated by a variable region (Hermann et al., 1995). X-ray crystallography of two Sm heterodimers (B/D3, D1/D2) has revealed a highly conserved structure of Sm proteins containing an N-terminal α-helix followed by a five-stranded anti-parallel β sheet determined by the Sm motifs (Figure 1.2 A) (Kambach et al., 1999). Based on these structures, a model of the Sm complex has been proposed that is consistent with biochemical and genetic interaction assays. Each of the seven Sm proteins interacts with its two neighboring Sm proteins through β4-β5 pairing, arranging the proteins into a heteroheptameric ring with a small central pore (Figure 1.2 B) (Camasses et al., 1998; Fury et al., 1997; Kambach et al., 1999). This model has been supported by electron microscopy studies of human U1, U2 and U5 snRNPs, where the round core domains, 80 Å in diameter, are in good agreement with the donut-shaped structure of the Sm complex (Figure 1.2 C) (Kastner et al., 1990; Stark et al., 2001; Stark & Lührmann, 2006; Will & Lührmann, 2001). Recent crystal structures of human U1 snRNP solved by two groups have also verified the arrangement of the heptameric ring in the order E, G, D3, B, D1, D2 and F (Pomeranz Krummel et al., 2009; Weber et al., 2010). Each Sm protein binds a single nucleotide of the Sm site on U1 snRNA, which threads through the Sm pore in three concentric circles with bases radiating outwards to lie in pockets of the Sm proteins (Figure 1.2 D).
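The relationship between the ring order and the three assembly subcomplexes can be checked with a short Python sketch. It treats the ring order E, G, D3, B, D1, D2, F given above as a circular list and tests whether each subcomplex (D1D2, D3B and EFG) occupies adjacent positions. The function names and the data representation are hypothetical and serve only as an illustration.

```python
# Ring order and subcomplexes as stated in the text; the representation is illustrative.
RING = ["E", "G", "D3", "B", "D1", "D2", "F"]
SUBCOMPLEXES = [("D1", "D2"), ("D3", "B"), ("E", "F", "G")]


def neighbours(ring, name):
    """Return the two proteins flanking `name` in the circular ring."""
    i = ring.index(name)
    return ring[i - 1], ring[(i + 1) % len(ring)]


def is_contiguous(ring, members):
    """True if `members` occupy consecutive positions around the circular ring."""
    n = len(ring)
    wanted = set(members)
    for start in range(n):
        window = {ring[(start + k) % n] for k in range(len(members))}
        if window == wanted:
            return True
    return False


contiguity = {"".join(sub): is_contiguous(RING, sub) for sub in SUBCOMPLEXES}
```

Under this ordering each subcomplex forms a contiguous sector of the ring, with E flanked by F and G across the point where the list wraps around.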



Figure 1.2 Structures of eukaryotic Sm protein and Sm-bound U1 snRNP. (A) Ribbon representation of the Sm D3 crystal structure. The β1-3 strands formed by the Sm1 motif (blue), the β4-5 strands formed by the Sm2 motif (yellow) and the N-terminal α-helix (red) are shown as indicated. (B) Ribbon representation of the Sm complex model arranged in a heptameric ring. The seven-membered Sm ring proposed from two crystal structures of Sm heterodimers is shown with each Sm protein arranged in the order determined by biochemical and genetic experiments (Kambach et al., 1999). (C) Model arrangement of RNA and proteins in the U1 snRNP determined by cryoelectron microscopy. The donut-shaped structure of the Sm proteins accommodated in the core of U1 snRNP is shown in yellow. U-specific proteins U1-A, 70K and B+C are indicated. Stem-loops of U1 snRNA are colored in blue, red, and orange (Stark et al., 2001). (D) Crystal structure of U1 snRNA in the Sm ring. The seven Sm site nucleotides (bases, gold; sugar units, beige; phosphates, orange) of U1 snRNA and the Sm proteins (grey semitransparent surface with ribbons) are as labeled. The outer circle indicates sectors corresponding to the building blocks of Sm proteins during snRNP assembly (blue, D1D2; green, FEG; red, D3B). The inner circle indicates sectors corresponding to the Sm proteins with non-canonical Sm pockets (light grey), canonical Sm pockets (grey) and special pockets that form the buckle at the Sm site termini (black) (Weber et al., 2010).


1.4.2 Lsm proteins

Unlike the other spliceosomal snRNAs, the U6 snRNA is transcribed by RNA polymerase III and acquires a 5’ γ-monomethyl phosphate cap. Instead of binding to Sm proteins, it associates with seven Lsm proteins (Lsm2-8) through its uridine-rich 3’ end and is retained in the nucleus throughout the process of U6 snRNP biogenesis (Beggs, 2005). Interestingly, the Lsm2-Lsm8 proteins exist stably as a heteroheptameric ring similar in shape to the Sm core RNP prior to nuclear import and association with U6 snRNA (Figure 1.3 A) (Achsel et al., 1999; Will & Lührmann, 2001). This interaction is required to stably accumulate U6 snRNP and to facilitate formation of the U4/U6 di-snRNP and U4/U6.U5 tri-snRNP for pre-mRNA splicing (Achsel et al., 1999; Karaduman et al., 2006; Mayes et al., 1999; Pannone et al., 2001). Formation of the catalytic centre of the spliceosome is further dependent on the dissociation of the Lsm complex from U6 snRNA, indicating a role of Lsm proteins as RNA chaperones (Chan et al., 2003). In addition, the retention of U6 snRNA and regeneration of U6 snRNP in the nucleus are both dependent upon the presence of the complete Lsm2-Lsm8 complex (Spiller et al., 2007a; Verdone et al., 2004).

Lsm proteins were identified based on sequence homology to the Sm proteins (Achsel et al., 1999; Salgado-Garrido et al., 1999; Seraphin, 1995). In all eukaryotes, Lsm2, Lsm3, Lsm4, Lsm5, Lsm6 and Lsm7 are homologous to the six Sm proteins D1, D2, D3, E, F and G, respectively, whereas the other two Lsm proteins, Lsm1 and Lsm8, are weakly related to the SmB protein. Based on this, the order of the Lsm proteins in the ring complex has been proposed to be similar in arrangement to the Sm ring (Figure 1.3 B) (Beggs, 2005). This arrangement has been partially supported by various studies including yeast two-hybrid screens, crystal structure analysis of Lsm3 and electron microscopy of U6 snRNP (Karaduman et al., 2008; Naidoo et al., 2008; Pannone et al., 2001). In addition, co-expression assays of the Lsm heterodimer and heterotrimer that are analogous to Sm subcomplexes have recapitulated the assembly pathways of Sm proteins in vitro, and more recently this has been achieved with engineered Lsm polyproteins (Sobti et al., 2010; Zaric et al., 2005). However, the associations of Lsm proteins in the complex have been found to be more flexible with RNA substrates, and the precise organization of the Lsm complex at higher resolution is yet to be uncovered.


Figure 1.3 Structure of the Lsm spliceosomal complex. (A) Electron microscopy image of Lsm complex. Stable ring-shaped complexes of purified human Lsm2-8 (Achsel et al., 1999). (B) Model of Lsm1-7 and Lsm2-8 complexes. Arrangement of Lsm proteins in heteroheptameric ring based on sequence homology to Sm proteins (Beggs, 2005).

1.4.3 Other roles of the Sm/Lsm complex

Besides being involved in pre-mRNA splicing, proteins of the Sm family are involved in various other RNA metabolic processes including histone mRNA 3’ end formation, post-transcriptional modifications and mRNA degradation (Beggs, 2005; Scofield & Lynch, 2008; Tharun, 2009). A common theme amongst the Sm/Lsm proteins is their association with RNAs and other proteins in complexes. Twelve Lsm proteins (Lsm1-9, Lsm12, Lsm13 and Lsm16) have been identified in budding yeast to date. While the function of four of them remains unclear, Lsm1-Lsm8 have been suggested to form several multi-unit complexes that function in a variety of RNA-processing events (Figure 1.4).

Figure 1.4 Schematic diagram of Sm/Lsm complexes, their cellular localizations and their associated RNA species in S. cerevisiae. The Sm proteins (green) bind to a conserved Sm site that is often found between two helices in the spliceosomal U snRNAs. The Lsm2–Lsm8 proteins bind the uridine-rich tract at the 3’ end of U6 snRNA and may also be involved in nuclear mRNA decay. On the other hand, Lsm1-Lsm7 proteins interact with mRNA and other protein factors to promote mRNA decay in the cytoplasm. Lsm8 and Lsm1 both associate with Lsm2-Lsm7 to form complexes with diverse functions, which are indicated in purple and grey, respectively. Lsm7 is involved in all Lsm complexes and is shown in red. Lsm2-Lsm7 complexes have also been implicated in various RNA processing events; however, the actual composition of this complex is not clear. Figure modified from (Zaric et al., 2005).

Genetic interaction screens of Lsm proteins have implicated them in cytoplasmic mRNA decay in association with mRNA decapping enzymes (Dcp1, Dcp2 and Pat1) and the 5’-3’ exoribonuclease Xrn1 (Fromont-Racine et al., 2000). Further studies have revealed that a second Lsm heteroheptameric complex comprised of Lsm1-Lsm7 is required for the 5’ decapping of mRNA, thereby facilitating 5’ to 3’ degradation in the deadenylation-dependent pathway (Bouveret et al., 2000; He & Parker, 2000; Tharun, 2009; Tharun et al., 2000). Mutations in Lsm proteins as well as Pat1 resulted in the accumulation of 5’-capped deadenylated mRNAs and increased stability of mRNAs, suggesting that the complex targets mRNAs for degradation after deadenylation. In fact, the Lsm1-Lsm7-Pat1 complex has been found to interact preferentially with oligoadenylated mRNPs and is able to distinguish between oligoadenylated and polyadenylated RNAs (Chowdhury et al., 2007; Chowdhury & Tharun, 2008). Consistent with the role in mRNA degradation, Lsm1-Lsm7 colocalize with decapping factors and Xrn1 in cytoplasmic foci called processing bodies (P-bodies), which contain mRNA decay intermediates and represent sites of mRNA degradation (Ingelfinger et al., 2002; Sheth & Parker, 2003). Although the structure of the Lsm1-Lsm7 complex is undefined, it is likely that Lsm1, in replacing Lsm8, also forms a ring with Lsm2-Lsm7 (Zaric et al., 2005).

In a role analogous to that of Lsm1-Lsm7 in cytoplasmic mRNA decay, the Lsm2-Lsm8 complex was also found to play a role in mRNA degradation within the nucleus that is separate from splicing (Kufel et al., 2004). Nucleus-restricted mRNA and unspliced pre-mRNA degradation intermediates were found to accumulate in cells lacking any of Lsm2-8 but not Lsm1. The RNA species accumulated in Lsm-depleted strains are polyadenylated and 5’-capped, suggesting that Lsm2–8 may facilitate 5’ decapping and promote 5’ to 3’ RNA decay through a deadenylation-independent pathway.

Besides the Lsm2-Lsm8 and Lsm1-Lsm7 complexes, studies on the human U7 snRNP core have revealed a modified heptameric complex of five Sm (B, D3, E, F and G) and two Lsm (10 and 11) proteins interacting with U7 snRNA, which functions in histone 3’ end formation (Pillai et al., 2001). The presence of the Lsm10 and Lsm11 was found to be U7 specific, since they provide the binding specificity to favor the unique Sm site of U7 RNA. Furthermore, Lsm11 with an N-terminal extension has been found to contribute to the function of the U7 snRNP in processing the 3’ ends of histone mRNAs (Pillai et al., 2003).

Lsm proteins are also involved in various RNA processing pathways in heteromeric compositions. In yeast, Lsm2-Lsm7 associate with snR5, a snoRNA that guides site-specific pseudouridylation of rRNA in the nucleolus, and with low levels of pre-RNase P RNA, which functions in pre-tRNA processing (Fernandez et al., 2004; Salgado-Garrido et al., 1999). The role of Lsm proteins in the biogenesis and/or function of both snR5 and RNase P remains unclear. However, depletion of any of the essential proteins Lsm2-5 and Lsm8 resulted in the accumulation of aberrant precursor and mature species of tRNA, rRNA or U3 snoRNA (Kufel et al., 2003a; Kufel et al., 2003b; Kufel et al., 2002). Lsm3 was also found to immunoprecipitate with a small fraction of pre-tRNAs and some pre-rRNAs, demonstrating that disrupted maturation of these RNA species is related to an Lsm protein. Similarly, the U8 snoRNA in Xenopus has been found to be associated with another six-unit Lsm complex (Lsm2–4 and Lsm6–8), although the precise role of this interaction is not clear (Tomasevic & Peculis, 2002).

1.4.4 Regulation of expression of LSM genes

While it is unclear how different combinations of Lsm complexes promote different roles in a variety of RNA metabolic systems, an important question in this regard is how cells maintain the integrity of the different Lsm complexes. Recent studies have revealed that the Lsm2–Lsm7 subunits can be exchanged between the nuclear and cytoplasmic Lsm2–8 and Lsm1–7 complexes depending on the levels of Lsm1 and Lsm8 (Luhtala & Parker, 2009; Spiller et al., 2007b). It is proposed that the N-terminal regions of Lsm1 and Lsm8 contribute to nuclear exclusion and nuclear accumulation, respectively (Reijns et al., 2009).

On the other hand, previous work in the group reported that expression of a set of LSM genes is co-regulated in response to growth on different carbon sources (Palmisano, 2006). With the exception of LSM1, LSM3 and LSM6, an apparent coordinate reduction in transcript levels of LSM2, LSM4, LSM5, LSM7 and LSM8 was observed in cells grown on poorer carbon sources (Figure 1.5).


Figure 1.5 Transcription levels of LSM genes in cells grown on different carbon sources. (A) Coordinate regulation of LSM2, LSM4, LSM5, LSM7, and LSM8 genes in response to different carbon sources. (B) Transcripts of LSM1, LSM3 and LSM6 exhibit different patterns of expression. Figures adapted from (Palmisano, 2006).


Further analysis revealed an LSM7-dependent regulation of the set of LSM genes; not only did the stability of LSM7 transcripts vary in cells grown on different carbon sources, but down-regulation of the LSM genes in response to a poor carbon source was also disrupted in an lsm7 deletion mutant (lsm7Δ). The transcript levels of LSM2, LSM4, LSM5 and LSM8 in the lsm7Δ strain grown on lactate resembled those in cells grown on glucose (Figure 1.6).

Figure 1.6 Transcript levels of LSM genes in lsm6Δ and lsm7Δ mutants in response to different carbon sources. Relative transcription levels of LSM genes were measured from wild-type, lsm6Δ and lsm7Δ grown on glucose and lactate. Figures adapted from (Palmisano, 2006).

Unlike most of the genes in budding yeast, the coding sequence of LSM7 is interrupted by an intron. The possibility of this intron being involved in the coordinated regulation of LSM genes was further investigated with intronless and intron-only mutants. Transcription levels of LSM2, LSM4, LSM5 and LSM8 in these mutants grown on glucose and lactate were measured in the same manner as for the wild-type and lsm7Δ strains. The pattern of co-regulation was restored in the intron-only strain but not the intronless strain, suggesting that the LSM7 intron mediates the coordinate regulation of the LSM genes in response to different carbon sources, possibly via a repressive mechanism (Figure 1.7). In addition, disruption of the regulation of LSM genes also affected splicing capacity. A reduction in splicing of the RPL25 transcript in response to a poorer carbon source was shown to be disrupted in the lsm7Δ and intronless strains, and less affected in the intron-only strain.

Figure 1.7 Expression levels of LSM genes in LSM7 mutant strains in response to different carbon sources. Relative transcription levels of LSM genes were measured from wild-type (black), lsm7Δ (LSM7 replaced by kanMX4 and URA3; green and purple, respectively), intronless (LSM7 with intron removed; blue), and intron-only (LSM7 intron retained but all exons deleted; red) strains grown on glucose and lactate. Figures adapted from (Palmisano, 2006).


1.5 Aims

Lsm proteins form several multi-unit complexes that function critically in a variety of RNA-processing events in organisms that range from bacteria and archaea to yeast and humans. While extensive studies have focused on their protein structures and interactions in relation to their functions, little is known about the mechanisms that regulate expression of the LSM genes for cells to maintain the integrity of the different Lsm complexes.

Previous work suggested a possible role for the LSM7 intron in mediating the coordinated expression of a set of LSM genes in the intron-poor unicellular eukaryote Saccharomyces cerevisiae as it adapts to growth on various carbon sources (Palmisano, 2006). These data indicated that the LSM7 intron may regulate pre-mRNA splicing as a function of cell growth rates in different growth environments. However, subsequent work indicated that this interpretation may not be correct. Therefore, the first aim of this thesis was to re-investigate the specific properties of LSM7, and more specifically its intron, in regulating the expression of the set of LSM genes.

To address the initial aim, a series of lsm7 mutants that lacked the LSM7 intron, LSM7 exons, or both of the elements, were constructed to study the involvement of these sequences of LSM7 in expression of LSM genes. Since it is difficult to determine the conditions under which splicing may be required to be regulated, rich media containing either glucose or an alternative carbon source, acetate, were used to study changes in transcript levels of LSM genes as these provided data in the Palmisano’s study (2006).

The patterns of LSM gene expression in glucose-grown cells were compared to those in acetate-grown cells as a potential indicator of regulation. This investigation is presented in Chapter 3, which highlights that the way in which expression of the LSM genes is regulated differs from the previous interpretation.

This observation immediately led to the second aim of this work, which was to identify the possible mechanism whereby the LSM7 intron may modulate the expression of LSM genes in trans. First, this was addressed by expressing the LSM7 intron inserted at another locus in the genome. The specific aim was to introduce the LSM7 intron into the ADE1 locus and determine whether it affected the transcript levels of LSM genes.

The second aim was also approached in Chapter 4 using sequential targeted mutagenesis of the LSM7 intron to identify sequences or regions of the intron that are important in mediating regulation of LSM genes. The levels of expression of LSM genes were analyzed in a series of mutants, each carrying a 6 bp substitution, that together covered the whole sequence of the intron. This analysis was aimed at determining whether splicing of the intron is required for its full function, and at identifying sequences of the intron that are important in modulating the expression level of LSM7 and other LSM genes in response to growth on different carbon sources.

In the last part of the thesis, the second aim was further explored by carrying out a genome-wide transcription study in response to the deletion of the LSM7 intron. Since deletion of the LSM7 intron did not appear to have an effect on production of functional Lsm7 protein, genome-wide expression data of an intronless LSM7 mutant grown on rich media containing either glucose or acetate were compared with those from the WT strain under the same growth conditions. This led to the discovery of an unexpected role of the intron in regulation of mating in yeast and further indicated a connection between the pathways of splicing and mating.


Materials and Methods

Chapter 2 Materials and methods

2.1 General materials and methods

2.1.1 Materials

Water used was purified with the Milli-Q® system (Millipore, NSW). Unless otherwise specified, all reagents were of analytical grade (AR). The sources of materials not noted in the description of the methods are given in Table 2.1 below.

Table 2.1 Source of materials and reagents used in this study

Material/reagent | Source
1,10-phenanthroline | Sigma-Aldrich (NSW)
3-(N-morpholino)propanesulfonic acid (MOPS) | Sigma-Aldrich (NSW)
5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside (X-gal) | Progen Industries (QLD)
Agarose (DNA grade) | Progen Industries (QLD)
Amino acids and nucleobases | Sigma-Aldrich (NSW)
Ammonium sulfate | Ajax FineChem (NSW)
Ampicillin (sodium salt) | ICN Biomedicals (NSW)
Bacteriological agar | Oxoid Ltd. (NSW)
Chloroform | APS Chemicals Ltd. (NSW)
Deoxyribonucleotides | Roche Diagnostics (NSW)
D-glucose | Ajax FineChem (NSW)
Diethyl pyrocarbonate (DEPC) | Sigma-Aldrich (NSW)
Dimethyl sulfoxide (DMSO) | Sigma-Aldrich (NSW)
Ethanol (EtOH) | Ajax FineChem (NSW)
Ethidium bromide | Sigma-Aldrich (NSW)
Ethylenediaminetetra-acetic acid disodium salt (EDTA) | APS Chemicals Ltd. (NSW)
Formaldehyde | Ajax FineChem (NSW)
Formamide | Progen Industries (QLD)
GeneRuler™ 1 kb DNA ladder plus | MBI Fermentas (QLD)
Geneticin (G418) | Progen Industries (QLD)
Galactose | Sigma-Aldrich (NSW)
Glycerol | APS Chemicals Ltd. (NSW)
Hydrochloric acid | Ajax FineChem (NSW)
Isoamyl alcohol (3-methyl-1-butanol) | Sigma-Aldrich (NSW)
Isopropanol (propan-2-ol) | Ajax FineChem (NSW)
Kanamycin | ICN Biomedicals (NSW)
Lithium acetate | Sigma-Aldrich (NSW)
Peptone | Oxoid Ltd. (NSW)
Phenol | ICN Biomedicals (NSW)
Polyethylene glycol 3350 (PEG) | Sigma-Aldrich (NSW)
Potassium acetate | ICN Biomedicals (NSW)
Potassium chloride | APS Chemicals Ltd. (NSW)
Sodium acetate | Ajax FineChem (NSW)
Sodium chloride | Ajax FineChem (NSW)
Sodium dodecyl sulfate (SDS) | Sigma-Aldrich (NSW)
Sodium hydroxide | APS Chemicals Ltd. (NSW)
Salmon sperm DNA | Sigma-Aldrich (NSW)
Tris base | Merck (NSW)
Triton X-100 | Sigma-Aldrich (NSW)
Trisodium citrate | Ajax FineChem (NSW)
Tryptone | Oxoid Ltd. (NSW)
Yeast extract | Oxoid Ltd. (NSW)
Yeast Nitrogen Base (without amino acids or ammonium sulfate) | Difco (NSW)

All other standard chemicals and reagents were obtained from Ajax Chemicals (NSW) and Sigma Aldrich (NSW).

2.1.2 Sterilization procedures and preparation of materials

Heat-labile solutions were sterilized by filtration through a sterile 0.2 μm filter (Millipore, Australia). Heat-stable solutions, glassware and media were sterilized by autoclaving for 15 min at 120°C (125 kPa).

DNases were inactivated by autoclaving as above. To inactivate RNases, solutions were treated with DEPC (0.1% v/v) overnight. Unreacted DEPC was removed by autoclaving as above for two 15 min cycles. General glassware was baked at 180°C overnight and heat-labile plasticware was cleaned with 10% (w/v) SDS or RNaseZAP (Sigma-Aldrich, NSW).

2.2 Escherichia coli and Saccharomyces cerevisiae strains, media and growth conditions

2.2.1 E. coli strains

The E. coli α-select (DH5α equivalent) strain [F- deoR endA1 recA1 relA1 gyrA96 hsdR17(rk-, mk+) supE44 thi-1 phoA Δ(lacZYA-argF)U169 Φ80lacZΔM15 λ-] was obtained from Bioline (Australia) and used for plasmid propagation.

2.2.2 E. coli media and growth conditions

E. coli strains were grown at 37°C with shaking (500 rpm) on Luria-Bertani (LB) media (1% tryptone, 0.5% yeast extract, 1% NaCl), supplemented with ampicillin (100 μg/ml) or kanamycin (50 μg/ml) for selective maintenance of ampicillin- or kanamycin-resistant plasmids, respectively. For short-term storage (less than a month), E. coli strains were kept on LB media solidified with 2% (w/v) agar at 4°C, and for long-term storage cells were stored in 50% (v/v) glycerol at -80°C.

2.2.3 S. cerevisiae strains

The strains used in this study are listed below in Table 2.3. Strains from the S. cerevisiae Genome Deletion Project (Winzeler et al., 1999) collection were obtained from Open Biosystems (NSW). Unless otherwise stated, strains in the BY4741 background were used in Chapter 5 and all the other mutant strains constructed were in the BY4742 background. The genotypes of all strains used were confirmed by PCR (Section 2.3.3) and their phenotypic characteristics. Single integrations were confirmed by Southern blot (Section 2.3.4) and/or sequencing (Section 2.3.5). Details of the cloning procedure are described in Section 2.3.

Table 2.3 Strains used in this study

Strain | Genotype | Source/Reference
BY4741 | MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 | EUROSCARF (Brachmann et al., 1998)
BY4741 bar1Δ | MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 bar1::kanMX4 | EUROSCARF (Winzeler et al., 1999)
BY4741 ade1Δ | MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 ade1::kanMX4 | EUROSCARF (Winzeler et al., 1999)
BY4741 lsm7Δ | lsm7::URA3 in BY4741 | This study
BY4741 LSM7 WT | LSM7_LEU2 in BY4741 | This study
BY4741 Intronless (Δi) | lsm7 intronΔ_LEU2 in BY4741 | This study
BY4741 Intron-only (iO) | lsm7 exon1Δ exon2Δ_LEU2 in BY4741 | This study
Y341a tester | MATa ade5 | I.W. Dawes
Y341α tester | MATα ade5 | I.W. Dawes
BY4742 | MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 | EUROSCARF (Brachmann et al., 1998)
lsm7Δ | lsm7::URA3 in BY4742 | L. Palmisano (PhD thesis, 2006)
LSM7 WT | LSM7_LEU2 in BY4742 | This study
Intronless (Δi) | lsm7 intronΔ_LEU2 in BY4742 | This study
Intron-only (iO) | lsm7 exon1Δ exon2Δ_LEU2 in BY4742 | This study
LSM7 WT_ade1Δ | LSM7::LSM7_LEU2 ade1::KanMX4 in BY4742 | This study
lsm7Δ_ade1Δ | lsm7::URA3 ADE1::ade1::KanMX4 in BY4742 | This study
Δi_ade1Δ | lsm7 intronΔ_LEU2 ade1::KanMX4 in BY4742 | This study
LSM7 WT_ADE1LSM7i | LSM7::LSM7_LEU2 ADE1::ADE1LSM7i in BY4742 | This study
lsm7Δ_ADE1LSM7i | lsm7::URA3 ADE1::ADE1LSM7i in BY4742 | This study
Δi_ADE1LSM7i | lsm7 intronΔ_LEU2 ADE1::ADE1LSM7i in BY4742 | This study
LSM7 i01~i19* | lsm7 i01~i19_LEU2 in BY4742 | This study

*These 19 mutants are named according to the site of mutation in the LSM7 intron; for details refer to Chapter 4.

2.2.4 Yeast media and growth conditions

Yeast cells were routinely propagated in YEPD medium containing 2% (w/v) D-glucose, 2% (w/v) bacteriological peptone, and 1% (w/v) yeast extract. When carbon sources other than D-glucose were used, cells were grown on medium containing 2% (w/v) potassium acetate or 3% (v/v) glycerol in place of D-glucose. Synthetic defined (SD) medium refers to synthetic minimal medium (2% (w/v) D-glucose, 0.5% (w/v) ammonium sulfate and 0.17% (w/v) Yeast Nitrogen Base) containing only the amino acids or nucleobases required by the auxotrophies of particular strains (at the individual concentrations indicated in Table 2.4). SD medium was used for propagation and selection of yeast transformants. Solid media were made as above, but contained 2% (w/v) bacteriological agar. For selection of strains carrying the kanMX4 marker, G418 (200 μg/ml) was added to the medium. Selection against strains containing the URA3 gene was achieved by adding 0.1% (w/v) 5-fluoro-orotic acid (5-FOA) and a minimal amount of uracil (2% (w/v)) to the medium.
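The percentage compositions above translate directly into weighed amounts for a batch of medium. The following is a minimal sketch of that arithmetic, assuming the usual convention that 1% (w/v) is 1 g per 100 ml; the function names and the 500 ml batch size are illustrative choices and are not taken from the thesis.

```python
# YEPD composition as stated in the text; % (w/v) is taken as g per 100 ml.
YEPD_PERCENT_WV = {
    "D-glucose": 2.0,
    "bacteriological peptone": 2.0,
    "yeast extract": 1.0,
}


def grams_needed(percent_wv, volume_ml):
    """Grams of each solid component for a batch of the given volume."""
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


def additive_mg(concentration_ug_per_ml, volume_ml):
    """Milligrams of an additive (e.g. G418) for the given final concentration."""
    return concentration_ug_per_ml * volume_ml / 1000.0


if __name__ == "__main__":
    batch_ml = 500  # hypothetical batch size
    for component, grams in grams_needed(YEPD_PERCENT_WV, batch_ml).items():
        print(f"{component}: {grams:g} g")
    print(f"G418 at 200 ug/ml: {additive_mg(200, batch_ml):g} mg")
```

Solid medium would add 2% (w/v) bacteriological agar to the same dictionary.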

Table 2.4 Concentration of individual supplements added to SD medium as described in Methods in Yeast Genetics (Adams et al., 1997)

Supplement | Final concentration (mg/l)
Adenine sulfate | 10
L-Methionine | 20
L-Histidine | 20
L-Leucine | 100
L-Lysine | 30
Uracil | 20

Unless otherwise stated, yeast strains were grown at 30°C with shaking (500 rpm) using a flask volume : culture volume ratio of 5:1 to allow adequate aeration. For short-term storage (up to one month), yeast strains were stored on solid medium at 4°C. For long-term storage, yeast strains were stored in 15% (v/v) glycerol at -80°C.

Sporulation of diploid cells from the BY474X background was induced by the method developed by the Saccharomyces genome deletion project (http://www-sequence.stanford.edu/group/yeast_deletion_project/spo_riles). Diploid cells were patched onto GNA pre-sporulation plates (5% w/v glucose, 3% w/v Difco nutrient broth, 1% yeast extract, 2% w/v agar), grown overnight at 30°C and the procedure repeated. The cells were then directly resuspended in liquid sporulation medium (1% w/v potassium acetate, 0.005% w/v zinc acetate) and incubated at room temperature with end-over-end shaking, to provide aeration, until sufficient spores were observed (3-6 d).

2.3 General DNA methods

2.3.1 General methods and reagents

Composition of buffers, as well as methods for agarose gel electrophoresis, ethanol precipitation, DNA isolation and quantification, were as described by Sambrook et al. (1989). Fermentas GeneRuler™ 1 kb Plus DNA Ladder was used for DNA size estimation and was obtained from Genesearch. Type II restriction endonucleases (SnaBI, SacI, HindIII, SphI, BsmI, DpnI), T4 DNA ligase and associated buffers were purchased from New England Biolabs (NSW) and used according to the supplied instructions. Recovery of DNA fragments from agarose gels was performed using the Gel Extraction Kit (QIAGEN, VIC), according to the manufacturer's instructions. Ligations were performed in 10-25 μl at room temperature overnight. Conditions were optimized for each ligation reaction.


2.3.2 DNA isolation from E. coli or S. cerevisiae

Plasmids were isolated from E. coli (5 ml culture) using the alkaline lysis method described previously (Sambrook et al., 1989) or using the mini- or midi- preparation plasmid kit (QIAGEN, VIC) according to the manufacturer’s instructions.

Yeast genomic DNA was prepared from 10 ml cultures grown to stationary phase in YEPD according to the glass bead disruption phenol extraction method (Hoffman & Winston, 1987).

2.3.3 Polymerase chain reaction

High-fidelity polymerases, iProof™ high-fidelity DNA polymerase (BioRad, NSW) or Phusion™ high-fidelity DNA polymerase (Finnzymes, NSW), were used for polymerase chain reaction (PCR) amplification of DNA according to the manufacturer’s instructions. Cycling conditions, annealing temperatures and magnesium concentrations were optimized for each primer pair and template.

PCR for yeast colony screening typically used 200-400 ng of genomic DNA in a 20 μl reaction containing 1x supplied reaction buffer, 200 μM of each dNTP, 0.5 μM of each primer, 1 unit of DNA polymerase and 1.25 mM additional MgCl2. Typical cycling conditions were: 98°C for 5 min, followed by 35 cycles of denaturation at 98°C for 10 s, annealing at 55-72°C (depending on the melting temperature of the oligonucleotides) for 10 s and extension at 72°C for 0.5 to 1 min per 1 kb of product, and finally a 5 min extension at 72°C.
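The typical cycling conditions can be written down as a small program description, which makes the dependence of the extension step on product length explicit. This is only a sketch under stated assumptions: the function names are hypothetical, the extension rate defaults to the lower bound of 0.5 min per kb, and ramping time between temperatures is ignored.

```python
def colony_pcr_program(product_kb, annealing_c, extension_s_per_kb=30):
    """Return cycling steps as (label, temperature in C, seconds, repeats)."""
    if not 55 <= annealing_c <= 72:
        raise ValueError("annealing temperature outside the 55-72 C range")
    if not 30 <= extension_s_per_kb <= 60:
        raise ValueError("extension rate outside 0.5-1 min per kb")
    extension_s = round(product_kb * extension_s_per_kb)
    cycles = 35
    return [
        ("initial denaturation", 98, 300, 1),
        ("denaturation", 98, 10, cycles),
        ("annealing", annealing_c, 10, cycles),
        ("extension", 72, extension_s, cycles),
        ("final extension", 72, 300, 1),
    ]


def hold_time_seconds(program):
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(seconds * repeats for _, _, seconds, repeats in program)


if __name__ == "__main__":
    # Hypothetical 1.5 kb product with a 60 C annealing temperature.
    program = colony_pcr_program(product_kb=1.5, annealing_c=60)
    for step in program:
        print(step)
    print("total hold time (min):", hold_time_seconds(program) / 60)
```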

DNA oligonucleotides for generating strains and strain confirmation were obtained from either Invitrogen or GeneWorks (Australia). All oligonucleotides used for these purposes in this study are listed in Table 2.5.


Table 2.5 Primers used in this study for strain generation

Primer | Sequence (5’-3’) | Application
JR16 F | TTGTCGACCAGGTCTTTTGATCTCTTTTC |
JR19 R | GTAGGCAAGTAACTGTAGAGATCTCTGT |
LSM7UP962 F | GAATTTCCACAGCCCGCGGACG | Cloning and verification of LSM7 constructs by targeting upstream or downstream locations as indicated relative to the LSM7 start codon
LSM7UP825 F | TCCAACGGTAGTGGTCCTACGTACC |
LSM7UP704 F | GCTTTGCCAGCGCAGCGCGC |
LSM7UP430 F | TATCTCCTTGTTGTCGTACTATCTTTTCAC |
LSM7UP430 R | GTGAAAAGATAGTACGACAACAAGGAGATA |
LSM7Down810 R | CTTAAAAAATTATGGGCGTCCTATGCATGC |
LSM7Down842 R | GTCATGTAAATTTCCCGAATAGGTAAATACATGC |
LSM7Down855 R | CCTCTGACATGCATGTTTCATTTTCGTTCC |
LSM7Down996 R | GCTCGCTGGTGTCTCTAGTGATGTC |
URA3int107 F | GGATGTTCGTACCACCAAGGAATTACTG | Verification of URA3 genes, also used in generating Southern blot probe
URA3int398 F | GAAGCAGGCGGCGGAAGAAGTAAC |
URA3int398 R | GTTACTTCTTCCGCCGCCTGCTTC |
LSM7int F | GGTGTCCTAAAAGGCTATGATCAACTGATG | Verification of LSM7 ORF
LSM7int R | CATCAGTTGATCATAGCCTTTTAGGACACC |
LSM7start F | ATGCATCAGCAACACTCCGTATGTT |
LSM7exon2 F | CAAAGGAAAAAATTCGAAGGCCCTAAAAGAG | Verification of LSM7 exon2, also used in generating Southern blot probe
LSM7exon2 R | CTATTTTTGCATATATAGTACATCAGAACCTTCG |
LSM7intron005 F | GTTTCACTTCTTATTTTCTTCC | Verification of LSM7 intron, also used in generating Southern blot probe
LSM7intron096 R | TGTTTCCTTTTTGAACTAAG |
LSM7_pFL26 F | AGTGCATGCAAGCTTGGCGTAATCATG | Verification of the ligation boundary of LSM7 and pFL26, in particular the LEU2 marker on pFL26
LSM7_pFL26 R | CATGATTACGCCAAGCTTGCATGCACT |
LEU2start R | AACGACGATCTTCTTAGGGGCAGACAT |
LEU2start F | ATGTCTGCCCCTAAGAAGATCGTCGTT |
LEU2int287 R | AAGTTGGCGTACAATTGAAGTTCTTTACGG |
LEU2LSM7Down50 R | TTGTTTTCAACTGTAAGGAAGGGAGTTTATATGAGATTATATTATTAAACTTAAGCAAGGATTTTCTTAACTTCTTCGGC | Pairing with LSM7UP704 F/LSM7UP430 F for amplification of LSM7_LEU2 cassette to construct all LSM7 strains
JR22 F | GTACAATCGGTCCTTGAACTGCC | Sequencing of LSM7 inserts in pFL26 and all derivatives including genomic integration
JR12 F | TCAGCAACACTCCGTTTGCTTCACTTCTT | Construction of “Read-through (RT)” mutant
JR13 R | AAGAAGTGAAGCAAACGGAGTGTTGCTGA |
JR32 F | TCAGCAACACTCCGTATGTAACACTTCTTAT | Construction of “Intronic-stop site (IS)” mutant
JR33 R | ATAAGAAGTGTTACATACGGAGTGTTGCTGA |
LSM7i SacI01 F | CAGCAACACTCCGAGCTCTTCACTTC | Construction of LSM7i01 mutant
LSM7i SacI02 F | CTCCGTATGTGAGCTCTCTTATTTTCTTCC | Construction of LSM7i02 mutant
LSM7i SacI02 R | GGAAGAAAATAAGAGAGCTCACATACGGAG |
LSM7i SacI03 F | CCGTATGTTTCACTGAGCTCTTTCTTCCG | Construction of LSM7i03 mutant
LSM7i SacI04 F | TTTCACTTCTTATGAGCTCCCGTGGCAAT | Construction of LSM7i04 mutant
LSM7i SacI04 R | ATTGCCACGGGAGCTCATAAGAAGTGAAA |
LSM7i SacI05 F | CTTCTTATTTTCTTGAGCTCCAATAACCTTCC | Construction of LSM7i05 mutant
LSM7i SacI05 R | GGAAGGTTATTGGAGCTCAAGAAAATAAGAAG |
LSM7i SacI06 F | CGTGGGAGCTCCCTTCCTTTTGACT | Construction of LSM7i06 mutant
LSM7i SacI07 F | CTTCCGTGGCAATAAGAGCTCTTTTGACTT | Construction of LSM7i07 mutant
LSM7i SacI08 F | TGGCAATAACCTTCCGAGCTCCTTATTTATA | Construction of LSM7i08 mutant
LSM7i SacI09 F | CAATAACCTTCCTTTTGAGAGCTCTATACTAAC | Construction of LSM7i09 mutant
LSM7i SacI10 F | CCTTTTGACTTATTGAGCTCAACATTATAATAACTATG | Construction of LSM7i10 mutant
LSM7i SacI11 F | GACTTATTTATACTGAGCTCATAATAACTATGTTTCC | Construction of LSM7i11 mutant
LSM7i SacI12 F | TACTAACATTGAGCTCACTATGTTTCCTTTTTGA | Construction of LSM7i12 mutant
LSM7i SacI12 R | TCAAAAAGGAAACATAGTGAGCTCAATGTTAGTA |
LSM7i SacI13 F | CTAACATTATAATAGAGCTCTTTCCTTTTTGAACTAAG | Construction of LSM7i13 mutant
LSM7i SacI13 R | CTTAGTTCAAAAAGGAAAGAGCTCTATTATAATGTTAG |
LSM7i SacI14 F | AATAACTATGGAGCTCTTTTGAACTAAGAAATCAGA | Construction of LSM7i14 mutant
LSM7i SacI14 R | TCTGATTTCTTAGTTCAAAAGAGCTCCATAGTTATT |
LSM7i SacI15 F | TATGTTTCCTGAGCTCACTAAGAAATCAGAGAA | Construction of LSM7i15 mutant
LSM7i SacI15 R | TTCTCTGATTTCTTAGTGAGCTCAGGAAACATA |
LSM7i SacI16 F | GTTTCCTTTTTGAGAGCTCAAATCAGAGAACA | Construction of LSM7i16 mutant
LSM7i SacI16 R | TGTTCTCTGATTTGAGCTCTCAAAAAGGAAAC |
LSM7i SacI17 F | GAACTAAGGAGCTCGAGAACAAACCACAA | Construction of LSM7i17 mutant
LSM7i SacI17 R | TTGTGGTTTGTTCTCGAGCTCCTTAGTTC |
LSM7i SacI18 F | CTAAGAAATCAGAGCTCAAACCACAACAGC | Construction of LSM7i18 mutant
LSM7i SacI18 R | GCTGTTGTGGTTTGAGCTCTGATTTCTTAG |
LSM7i SacI19 F | AAATCAGAGAACGAGCTCCAACAGCAAAG | Construction of LSM7i19 mutant
LSM7i SacI19 R | CTTTGCTGTTGGAGCTCGTTCTCTGATTT |
ADE1LSM7i F | TAATGTCAATTACGAAGACTGTATGTTTCACTTCTTATTTT | Construction of the first half of the ADE1LSM7i cassette, generating an LSM7 intron fragment with two ends complementary to ADE1
LSM7iADE1 R | CAATATACCGTCCAGTTCCTTAGTTCAAAAAGGAAACA |
ADE1UP100 F | AATTTGCTCTGAGAACATTTATACATTAATACATACGG | Construction of the first half of the ADE1LSM7i cassette, generating a fragment containing the first 18 bp of ADE1 ORF
ADE1LSM7i R | AAAATAAGAAGTGAAACATACAGTCTTCGTAATTGACATTA |
LSM7iADE1 F | TGTTTCCTTTTTGAACTAAGGAACTGGACGGTATATTG | Construction of the second half of the ADE1LSM7i cassette
ADE1Down100 R | GTAAGACGGTTGGGTTTTATCTTTTGCAGT |
ADE1UP80 F | CTTTTGCAGTTGGTACTATTAAGAACAATCGAATC | Amplification of the ADE1LSM7i and ade1::KanMX4 cassette, also used in sequencing
ADE1Down80 R | ATACATTAATACATACGGGTATGTATGAATCATATTC |
ADE1UP200 F | GAGGATGTAATAATACTAATCTCGAAGATGCC | Verification of ade1::ADE1LSM7i
ADE1Down200 R | TAGCGGTAAGAAAGTTGGTAAGGTTCCAGA |

2.3.4 Southern blot analysis

Single-copy integration of LSM7 mutant strains was detected by Southern blot analysis. Restriction-digested (BsmI and SphI) genomic DNA (10 μg) from each strain was separated on a 1% (w/v) agarose gel and denatured in 1.5 M NaCl, 0.5 M NaOH for 1 h. The gel was transferred into neutralizing solution (1 M Tris, 1.5 M NaCl, pH 7.5) for 1 h, rinsed with MQ water and soaked in 2x SSC buffer (0.3 M NaCl and 30 mM tri-sodium citrate, pH 7.0) for 10-20 min. The DNA was transferred by unidirectional blotting to a Hybond-N+ nylon membrane (Amersham Biosciences, Sweden) for at least 15 h as described by Sambrook et al. (1989). After air-drying, DNA was fixed to the membrane by baking at 120°C for 30 min.

A digoxigenin (DIG) based system supplied by Roche Diagnostics (NSW) was used to probe the membranes according to the manufacturer’s instructions. In brief, blots were pre-hybridized with DIG Easy-Hyb at 50°C for more than 2 h, followed by hybridization for at least 15 h at 55°C with DIG Easy-Hyb containing DIG-labeled probe generated using the PCR DIG probe synthesis kit. Membranes were washed once with low stringency buffer (2x SSC, 0.1% w/v SDS) at room temperature for 30 min and twice with high stringency buffer (0.5x SSC, 0.1% w/v SDS) at 65°C for 15 min. After rinsing with washing buffer, the membrane was blocked for at least 30 min with freshly prepared blocking buffer containing maleic acid prior to treatment with anti-DIG-AP antibody solution for a further 30 min. The membrane was washed to remove any free antibody and treated with detection buffer. Chemiluminescent signals were visualized after the addition of CDP-Star substrate by exposure to X-ray film (Kodak, Australia). Stripping solution (0.2 M NaOH, 0.1% w/v SDS) was applied to the membrane when rehybridization was required.

2.3.5 DNA sequencing

DNA samples were sequenced using the ABI Prism™ BigDye Terminator system (Applied Biosystems, NSW) according to the manufacturer’s instructions. PCR products were treated with ExoSAP-IT (USB®, VIC) before sequencing. Samples were sequenced and analyzed by the Ramaciotti Centre for Gene Function Analysis (UNSW, Australia).

2.4 Generation of strains and plasmids

2.4.1 E. coli transformation

E. coli competent cells were purchased from Bioline (NSW) or prepared by the rubidium chloride method as described (Hanahan et al., 1991). Competent cells were stored at -80°C and thawed on ice when required. Transformation was performed by mixing the cells with DNA, incubating on ice for 5 min, heat-shocking for 45 s at 42°C, and finally recovering in LB for 1 h at 37°C. Transformants were plated on solid selective LB medium, pre-spread with X-gal (100 μg in DMSO) if required for blue-white colony selection, and incubated for 1 d at 37°C.

2.4.2 Yeast transformation

Yeast strains were transformed using the lithium acetate method as described (Gietz & Woods, 2002). In general, cells were pre-grown on YEPD medium and sheared salmon-sperm DNA was used as carrier DNA. Purified PCR products (1-5 μg) generated with the appropriate sequences for genomic integration were used in transformations. After recovering in YEPD liquid medium at 30°C for 1-2 h, transformants were plated onto the appropriate selective solid media and incubated for 3 d at 30°C. Single colonies were patched onto positive or negative selective solid media for confirmation of gene integrations or knockouts, respectively.

2.4.3 Production of mutant constructs and plasmids

All of the LSM7 mutant constructs containing the 705 bp upstream (SnaBI) and 743 bp downstream (SphI) regions from the LSM7 locus were cloned into the yeast integrative plasmid pFL26 (Bonneaud et al., 1991) using the SmaI and SphI restriction sites.

In order to obtain all the mutants efficiently, four cloning methods were applied:

1) lsm7iO and lsm7i∆ were designed by Geoff Kornfeld (UNSW), then custom synthesized and subcloned by GENEART (Regensburg, Germany) into the pMA and pMK plasmids, respectively, with the desired restriction enzyme sites (SnaBI at the 5’ end and SphI at the 3’ end).

2) The Megaprimer PCR-based site-directed mutagenesis method (Kammann et al., 1989; Sarkar & Sommer, 1990; Sarkar & Sommer, 1992) was modified with an ExoSAP-IT treatment between the two rounds of PCR to remove any residual primers. The PCR product containing the mutation site from the initial amplification was used as a primer in the second PCR for amplification of the full-length product without any intermediate purification.

3) Standard 3-step PCR-based site-directed mutagenesis was performed for generating LSM7 mutant cassettes. ExoSAP-IT was used between each PCR amplification.

4) PCR-based site-directed mutagenesis using double-stranded plasmid DNA as template (GeneArt (Invitrogen, NSW) or QuikChange (Stratagene, CA)) was performed with the pFL26 LSM7 WT plasmid. Complementary primers containing the mutation were used to introduce the mutation during replication of the whole plasmid. The resulting DNA product was treated with DpnI endonuclease to digest the methylated plasmid template, leaving the newly synthesized unmethylated product, which was then transformed into E. coli.
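Method 4 depends on the two mutagenic primers being exact reverse complements of each other, which is easy to check before ordering oligonucleotides. The following is a small illustrative sketch, not part of the original workflow, using the JR12 F/JR13 R pair and the LSM7i SacI02 pair from Table 2.5; the helper names are hypothetical, and GAGCTC is the SacI recognition sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

SACI_SITE = "GAGCTC"  # SacI recognition sequence


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def are_complementary(forward, reverse):
    """True if the reverse primer is the exact reverse complement of the forward."""
    return reverse_complement(forward) == reverse.upper()


# Primer pairs copied from Table 2.5.
JR12_F = "TCAGCAACACTCCGTTTGCTTCACTTCTT"
JR13_R = "AAGAAGTGAAGCAAACGGAGTGTTGCTGA"
SACI02_F = "CTCCGTATGTGAGCTCTCTTATTTTCTTCC"
SACI02_R = "GGAAGAAAATAAGAGAGCTCACATACGGAG"

if __name__ == "__main__":
    print("JR12/JR13 complementary:", are_complementary(JR12_F, JR13_R))
    print("SacI02 pair complementary:", are_complementary(SACI02_F, SACI02_R))
    print("SacI site in SacI02 F:", SACI_SITE in SACI02_F)
```

Because the SacI site is palindromic, it reads identically in both primers of a complementary pair.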

The ADE1LSM7i mutant cassette was constructed with multi-step PCR to generate a fragment containing the LSM7 intron cloned into the ADE1 gene, separating the 921 bp long sequence into an 18 bp exon and a 903 bp exon to mimic the molecular structure of the wild-type LSM7 gene (the ADE1LSM7i primers in Table 2.5 place the intron after the first 18 bp of the ADE1 ORF). Primers described in Table 2.5 were designed to create amplicons with flanking sequences that are complementary to the neighbouring amplicons. In the first PCR reaction, sequences containing the 5’ exon and LSM7 intron were generated, forming an intermediate fragment of exon one-LSM7 intron for use in the second PCR. The amplicon containing the 3’ exon was generated in a similar manner. The two overlapping amplicons were then used as templates to generate the full-length cassette. Transformation of the appropriate mutant strains to generate double gene mutations is described in the following section.


2.4.4 Construction of yeast strains

LSM7 mutant strains were constructed by genomic integration through homologous recombination using BY4742 lsm7Δ as the parent strain. Due to problems of petite generation arising from the use of 5-fluoroorotic acid (5-FOA) for selection of URA3- strains, the amplicon encompassing the LSM7 promoter region (~500 bp upstream), downstream region (743 bp downstream) and the LEU2 marker was generated from the pFL26 lsm7 mutant plasmid and used for transformation of yeast cells. As a result, the fragment was integrated at the LSM7 locus such that the disrupted promoter and downstream regions were reconstituted by the native copy and vice versa. Double selection for LEU2 and URA3- was used to select candidate colonies. Single-copy integration was verified by PCR and Southern blot analysis (Section 2.3.4).

BY4742 strains incorporating the ade1Δ locus were also constructed by homologous recombination of the ade1::KanMX4 cassette from the BY4741 ade1Δ strain. Pink pigmented colonies were selected as candidates for the knockout of ADE1. Strains containing two modified LSM7 constructs at different gene loci were generated using homologous recombination of the PCR-generated cassette (ADE1LSM7i) targeting the second gene locus (ADE1 in this study) of an LSM7 WT or mutant strain (LSM7 WT_ade1Δ, lsm7Δ_ade1Δ and Δi_ade1Δ). After selection on appropriate solid medium, the presence of the dual gene mutations was confirmed by PCR and sequencing.

For mating-type assays, several mutant strains of opposite mating types were generated by sporulation. Haploid single lsm7 mutants were mated with the wild-type parental BY4741 and diploids were selected based on the markers present. Sporulation was induced as described in Section 2.2.4. Asci were digested with zymolyase (ICN Biomedicals, NSW) solution (0.5 mg/ml zymolyase in 1 M sorbitol) for 10 min, incubated on ice for 30 min and dissected onto YEPD plates using a Singer micromanipulator (Singer MSM systems). The genotype of the candidate haploids was tested using selective plates and mating types were tested with the tester strains Y341α and Y341a. PCR and sequencing were used for final confirmation.

2.5 RNA based methods

2.5.1 Culture harvest conditions for RNA abundance measurement

For gene expression analyses using qRT-PCR and microarray, yeast cultures were grown to mid-exponential phase (optical density at 600 nm (OD600) of 0.5 ± 0.05) in media containing D-glucose or potassium acetate as the carbon source. Cells (50 ml) were harvested by centrifugation at 4,000 rpm at 4°C for 2 min in pre-chilled (-80°C) 50 ml tubes containing 20 ml of ice and immediately stored at -80°C until further processing.

2.5.2 RNA preparation

RNA was extracted with AE-equilibrated phenol and further cleaned up with an RNA purification column (RNAspin Mini RNA isolation kit, GE Healthcare) as described (Mutiu & Brandl, 2005). The quality and quantity of the total RNA were determined using RNA denaturing gels containing 1.2% (w/v) agarose and 37% (w/w) formaldehyde, and using a Nanodrop spectrophotometer for A260/280 and A260/230 ratios. RNA integrity was further confirmed using the Bioanalyzer (Agilent Technologies) at the Ramaciotti Centre for Gene Function Analysis (UNSW, Australia).


2.5.3 Quantitative real-time PCR (qRT-PCR)

Primers used in qRT-PCR analysis were designed using the web-based program Primer3 (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi) according to the guidelines below where possible:

• Approximately 50% GC content

• 18 to 24 nucleotides in length

• Primers forming secondary structures and primer-dimers were avoided

• Melting temperature of primers was approximately 50°C

• Amplicons between 90 and 120 bases in length

• BLAST searches were performed against public databases to ensure target-specific sequences were used
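Two of the guidelines above (length and GC content) and the amplicon size range are simple enough to check programmatically. The following is a minimal sketch, assuming a tolerance of ±10 percentage points around 50% GC; that tolerance and the function names are illustrative assumptions, since the thesis only says "approximately". Melting temperature, secondary structure and BLAST specificity are left to Primer3 and BLAST.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def check_primer(seq, gc_tolerance=10.0):
    """Check a primer against the length and GC-content guidelines.

    gc_tolerance is an assumed reading of "approximately 50%".
    """
    return {
        "length_ok": 18 <= len(seq) <= 24,
        "gc_ok": abs(gc_percent(seq) - 50.0) <= gc_tolerance,
    }


def amplicon_ok(length_bases):
    """Check an amplicon length against the 90-120 base guideline."""
    return 90 <= length_bases <= 120


if __name__ == "__main__":
    lsm1_forward = "CTTCGTTCTTTTGCGTGATG"  # LSM1-162 F from Table 2.7
    print(len(lsm1_forward), gc_percent(lsm1_forward), check_primer(lsm1_forward))
```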

All primers used for qRT-PCR in this study were synthesised by GeneWorks and are listed in Table 2.7.

Table 2.7 Primers used for qRT-PCR

Primer | Sequence (5’-3’) | Transcript to determine
LSM1-162 F | CTTCGTTCTTTTGCGTGATG | LSM1
LSM1-256 R | TTCTCTCCACGCAATCTTGA |
LSM2-279 F | ATCCACACTTGGGTTCCGTA | LSM2
LSM2-375 R | TCCCTTCTGGTCGCGTCTTG |
LSM3-176 F | CCGAAAGACGATGTGAAATG | LSM3
LSM3-269 R | TATATCTCCACTGCGCCATC |
LSM4-420 F | CCGTCAATACAACAACAGCAA | LSM4
LSM4-518 R | ACGGACCCACCTAAACCATT |
LSM5-92 F | AGGGCACGTTAGTTGGTTTC | LSM5
LSM5-184 R | CATTTCTGCTCTCGTCCTCA |
LSM6-123 F | TGATGGTTTTATGAATGTTGCAC | LSM6
LSM6-225 R | GCCCCTCAAAAAGACATCAC |
LSM7-003 F | GCATCAGCAACACTCCAAAT | LSM7
LSM7-095 R | TTCGCTAAATCCAGAATAGCTTC |
LSM8-78 F | CCTAAACGGCTTCGACAAAA | LSM8
LSM8-182 R | ATCTCGCTGCCTCGAAGTAA |
SIR3-1025 F | CTGGGTCAAAGCCAGAGAAG | SIR3
SIR3-1128 R | CTCTTAAGCCCACCATCAT |
ADE1total291 F | GTACAAAACGCAACTAGAAG | Total ADE1 transcripts for ADE1lsm7i strains
ADE1total405 R | TTTTACGTACTCTTTCCAAG |
ADE1spliced005 F | CAATTACGAAGACTGAACTG | Spliced ADE1 transcripts for ADE1lsm7i strains
ADE1spliced107 R | GTAGCAACAAACAGCAAC |
ADE1unspliced010 F | ACGAAGACTGTATGTTTCAC | Unspliced ADE1 transcripts for ADE1lsm7i strains
ADE1unspliced129 R | TATACCGTCCAGTTCCTTA |

Prior to qRT-PCR set-up, cDNA was synthesized from 1 μg of each RNA sample using the SuperScript™ III First-strand Synthesis System for RT-PCR (Invitrogen, NSW) and Oligo(dT)20 primer according to the manufacturer’s instructions. For each 25 μl real-time PCR reaction, 1 μl of cDNA sample was added to a master mix containing SYBR Green dsDNA binding dye, Taq DNA polymerase and 100 μg of each primer as described (Platinum® SYBR® Green qPCR SuperMix-UDG User Manual; Invitrogen, NSW). Unless otherwise stated, expression data were obtained from duplicate PCRs on up to five biological replicates and normalized to the level of the SIR3 transcript as an internal standard.

Standard real-time PCR was performed using a RotorGene 3000 (Qiagen, VIC) followed by melting curve analysis: 2 min at 50°C, 10 min at 95°C, then 40 cycles of 10 s at 95°C, 20 s at 60°C and 20 s at 72°C, completed by a final ramp from 55°C to 95°C.


The amplification efficiency of each primer pair was analyzed with standard curves of ten-fold serial dilutions to ensure that all the amplicons were amplified at a similar and optimum rate (amplification efficiency = 2 ± 0.2) for the data to be comparable. In all cases the expression levels measured by each primer pair were corrected for the efficiency of that primer pair. The presence of only one product was confirmed by melting curve analysis in every run of real-time PCR, and confirmed by 1% agarose gel electrophoresis if necessary. Template-minus and reverse transcriptase-minus (-RT) controls were included to eliminate errors caused by primer-dimer formation and DNA contamination from the RNA samples or other sources.
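The thesis does not spell out the formulas used, so the following is only a sketch of one common way to carry out these two steps. It assumes that the amplification factor E is estimated from the slope of the standard curve (Ct against log10 of template amount) as E = 10^(-1/slope), and that the efficiency-corrected level of a target relative to the reference (here SIR3) is E_target^(-Ct_target) / E_ref^(-Ct_ref). All function names are hypothetical.

```python
import math


def standard_curve_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct against log10(template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx


def amplification_factor(slope):
    """Fold increase per cycle; 2.0 means perfect doubling."""
    return 10 ** (-1.0 / slope)


def efficiency_acceptable(factor, target=2.0, tolerance=0.2):
    """Check against the 2 +/- 0.2 criterion given in the text."""
    return abs(factor - target) <= tolerance


def normalized_expression(e_target, ct_target, e_ref, ct_ref):
    """Efficiency-corrected target level relative to the reference transcript."""
    return (e_target ** -ct_target) / (e_ref ** -ct_ref)


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series with ideal doubling per cycle.
    dilutions = [0.0, -1.0, -2.0, -3.0]
    cts = [20.0 - math.log2(10) * x for x in dilutions]
    factor = amplification_factor(standard_curve_slope(dilutions, cts))
    print(round(factor, 3), efficiency_acceptable(factor))
    print(normalized_expression(2.0, 22.0, 2.0, 20.0))
```

With equal efficiencies of 2, a target that crosses threshold two cycles after the reference comes out at one quarter of the reference level.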

2.5.4 Affymetrix gene chip expression microarray analysis

Microarray experiments were performed using the Affymetrix Yeast 2.0 Genechip® oligonucleotide array (Affymetrix Inc.) according to the manufacturer’s instructions (http://www.affymetrix.com/support/technical/manuals.affx) by the Ramaciotti Centre for Gene Function Analysis (UNSW, Australia). Gene expression analysis of the LSM7 WT and intronless (Δi) strains on each medium was carried out using the Partek® Genomics Suite™, version 6.4 (Partek Inc.) by calculating the simple log2 intensity after normalization with the Robust Multichip Average (RMA) algorithm (Irizarry et al., 2003). To assess reproducibility, biological duplicate experiments were carried out. P-values, ratios and fold-changes for each factor and comparison were calculated using Analysis of Variance (ANOVA) upon normalization. Clustering and analysis of significantly differentially expressed genes (p-value <0.01) were carried out using MultiExperiment Viewer (MeV, v4.6.1) of the TM4 microarray software suite (Saeed et al., 2006) and FunSpec (Robinson et al., 2002), respectively.
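For orientation, the relationship between normalized log2 intensities, fold-changes and the p-value cut-off can be sketched as below. This is not the Partek or MeV implementation: the signed fold-change convention, the function names and the example gene identifiers and values are all assumptions made for illustration, and the p-values are taken as already computed.

```python
def signed_fold_change(log2_mutant, log2_wt):
    """Convert a difference of log2 intensities to a signed fold change.

    Assumed convention: increases are reported as +ratio, decreases as
    -1/ratio, so a halving is reported as -2 rather than 0.5.
    """
    diff = log2_mutant - log2_wt
    ratio = 2.0 ** diff
    return ratio if diff >= 0 else -1.0 / ratio


def significant_genes(results, p_cutoff=0.01):
    """Return gene names whose p-value is below the cut-off, sorted by name.

    `results` maps a gene name to a (p_value, log2_difference) tuple.
    """
    return sorted(gene for gene, (p_value, _) in results.items()
                  if p_value < p_cutoff)


# Hypothetical results for three placeholder genes.
example = {
    "GENE_A": (0.002, 1.0),
    "GENE_B": (0.200, 0.1),
    "GENE_C": (0.009, -2.0),
}
selected = significant_genes(example)
```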

2.6 Mating-type based methods

2.6.1 Mating efficiency analysis

The mating efficiencies of lsm7 mutant strains were analyzed by measuring diploid production from mating versus the total mutant haploid number used (modified from Hartwell, 1980), as represented in the formula below.

$$\text{Mating efficiency (\%)} = \frac{\text{No. of diploid colonies on selective plate}}{\text{No. of total haploid colonies on YEPD plate}} \times 100$$

Stationary phase cultures of each of the LSM7 WT, lsm7Δ, Δi and iO strains in the BY4742 background were washed twice with sterilized water and resuspended to an OD600 of 0.05. Ten-fold serial dilutions (10⁻¹, 10⁻² and 10⁻³; 100 ml each) were mated with a fixed number of washed BY4741 cells (OD600 = 0.1) in an equal volume (100 ml). Cells were plated immediately after mixing to prevent aggregation, and diploids were plated onto selective medium. Haploids of each serial dilution were also plated onto rich medium for total cell counts. Data were obtained from technical triplicates of up to three biological replicates from each dilution after incubation at 30°C for 3 d.
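The mating efficiency formula can be applied to plate counts as in the following minimal sketch. The colony counts are invented for illustration, and averaging across a technical triplicate is an assumption about how the replicate data were combined.

```python
from statistics import mean


def mating_efficiency_percent(diploid_colonies, haploid_colonies):
    """Diploid colonies on selective plate / haploid colonies on YEPD x 100."""
    if haploid_colonies <= 0:
        raise ValueError("haploid colony count must be positive")
    return 100.0 * diploid_colonies / haploid_colonies


# Hypothetical technical triplicate for one dilution:
# (diploid colonies on selective plate, haploid colonies on YEPD plate).
plates = [(45, 150), (52, 160), (39, 130)]
efficiencies = [mating_efficiency_percent(d, h) for d, h in plates]
mean_efficiency = mean(efficiencies)
```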

2.6.2 Measurement of pheromone production

Production of pheromone of lsm7 mutant strains was measured by halo formation on a lawn of α-mating factor supersensitive strain (bar1Δ) of the opposite mating type.

Stationary phase BY4741 bar1Δ cells (OD600 = 0.1, 1 ml) were mixed with 2 ml of pre-warmed 0.8% agar and overlaid onto pre-made YEPD plates. Ten-fold serial dilutions (OD600 = 7 and 0.7) of the LSM7 WT and each lsm7 mutant strain were spotted (2 μl) onto the plate in triplicate. Plates were incubated at 30°C for 3 d. Relative halo diameter was calculated by subtraction of the diameter of the spotted colony from that of the halo formed around it and was used as a representation of the relative amount of α-factor released. Data were obtained from up to three biological replicates from each strain and analysed with ImageJ (Abramoff et al., 2004).

2.6.3 Sensitivity test of opposite pheromone

Sensitivity of the lsm7 mutant strains towards α-factor (Zymo Research, CA) was measured by halo formation on the lawn of each lsm7 mutant strain in the BY4741 background. Stationary phase cells of each strain (OD600 = 0.1, 1 ml) were mixed with 2 ml of pre-warmed 0.8% agar and overlaid onto pre-made YEPD plates. Ten-fold serial dilutions (1 mM, 0.1 mM) of α-factor were applied onto sterilized filter discs (5 μl capacity) on the surface of the plate in triplicate. Plates were incubated at 30°C for 3 d.

Relative halo diameter was calculated by subtraction of the diameter of the α-factor disc from that of the halo formed around it and was used as a representation of the pheromone sensitivity of each strain. Data were obtained from up to three biological replicates from each strain and analysed with ImageJ (Abramoff et al., 2004).
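The relative halo diameter used in Sections 2.6.2 and 2.6.3 is a simple difference, sketched below. The measurements, the millimetre unit and the summary by mean and standard deviation are assumptions for illustration; the thesis states only that diameters were measured with ImageJ.

```python
from statistics import mean, stdev


def relative_halo_diameter(halo_diameter, centre_diameter):
    """Halo diameter minus the diameter of the spotted colony or filter disc."""
    if halo_diameter < centre_diameter:
        raise ValueError("halo cannot be smaller than the colony or disc")
    return halo_diameter - centre_diameter


# Hypothetical triplicate: (halo diameter, colony or disc diameter) in mm.
measurements = [(14.5, 6.0), (15.0, 6.0), (13.5, 6.0)]
relative = [relative_halo_diameter(h, c) for h, c in measurements]
summary = (mean(relative), stdev(relative))
```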

2.7 Cell growth assay

A Bioscreen C (MTX Lab System, Inc.) machine was used for growth assays by optical density measurement. Stationary phase cultures were inoculated into 200 μl of the corresponding medium in a Bioscreen Honey-comb 100 well plate (cat.9502550) with a starting OD600 of 0.001 and incubated at 30°C. Absorbance readings at 600 nm (wideband range) were taken every 15 min for up to 72 h. Shaking commenced 15 s prior to each measurement. OD data were normalized for background by subtracting the control OD from each subsequent reading.
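The background correction can be sketched as below, together with one common way of summarizing such curves. The doubling-time estimate from a log-linear fit is not described in this section and is included only as an assumed illustration; the blank value and readings are synthetic.

```python
import math


def subtract_background(od_readings, control_od):
    """Subtract the control (blank) OD from every reading."""
    return [od - control_od for od in od_readings]


def doubling_time_minutes(times_min, od_values):
    """Doubling time from a least-squares fit of ln(OD) against time.

    Only meaningful for positive, background-corrected readings taken
    during exponential growth.
    """
    logs = [math.log(od) for od in od_values]
    n = len(logs)
    mean_t = sum(times_min) / n
    mean_l = sum(logs) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times_min, logs))
             / sum((t - mean_t) ** 2 for t in times_min))
    return math.log(2) / slope


# Synthetic data: readings every 15 min, blank OD 0.08, true doubling 90 min.
times = [15 * i for i in range(8)]
raw = [0.08 + 0.01 * 2 ** (t / 90) for t in times]
corrected = subtract_background(raw, 0.08)
doubling = doubling_time_minutes(times, corrected)
```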

The function of LSM7 intron on the expression of LSM genes

Chapter 3. Regulation of LSM genes and the function of LSM7 intron on the expression of LSM genes

3.1 Introduction and aims

In order to adapt to a wide variety of growth conditions in its natural environment, the unicellular budding yeast Saccharomyces cerevisiae has evolved flexible controls to adjust its metabolic and regulatory systems rapidly and appropriately. One obvious and important aspect of this flexibility is the capability to utilize a variety of carbon sources.

Yeast has evolved to use fermentative metabolism when the preferred carbon source, glucose, is present, resulting in production of ethanol. When glucose is unavailable or limiting, cells are able to utilize alternative carbon sources including non-fermentable ethanol, glycerol or acetate and thereby undergo respiratory metabolism (Turcotte et al., 2010). The switch from fermentative growth to respiratory growth, termed the diauxic shift, involves massive reprogramming in the expression of over 1,000 genes (DeRisi et al., 1997; Turcotte et al., 2010).

Genes encoding proteins of macromolecular complexes are often required to be co-expressed over a variety of conditions in order for the protein subunits to exist in stoichiometric amounts for assembly into the functional complex. In budding yeast, the Lsm1-8 proteins have been shown to exist in at least two heptameric complexes associating with other protein factors, in different subcellular compartments, and to carry out diverse RNA processing functions (Scofield & Lynch, 2008; Tharun, 2009).

While many studies have focused on understanding the structural interactions and functions of the Lsm proteins in relation to the nuclear Lsm2-Lsm8 spliceosomal complex and the cytoplasmic Lsm1-Lsm7 RNA decay complex, little is known about the transcriptional regulation of these genes that cells require to maintain the integrity of the different Lsm complexes.

From the extensive discovery of functional ncRNAs, it is now evident that introns are important sources of functional ncRNAs, including snoRNAs, miRNAs, endogenous siRNAs and lncRNAs, particularly in higher eukaryotes (Brown et al., 2008; Louro et al., 2009; Mattick & Makunin, 2006; Rearick et al., 2010). Although only 5% of S. cerevisiae genes contain an intron (Davis et al., 2000; Juneau et al., 2007; Miura et al., 2006; Spingola et al., 1999), intron-containing genes in yeast are over-represented among the highly expressed ribosomal genes (Ares et al., 1999), genes related to secretion and meiosis (Juneau et al., 2007), as well as essential non-ribosomal genes (Skelly et al., 2009). Examples of intron-dependent gene regulation have also been shown in individual studies in yeast (Juneau et al., 2006; Li et al., 1996; Meyer & Vilardell, 2009; Pleiss et al., 2007; Preker & Guthrie, 2006); however, the extent of functional ncRNAs contained in the introns of particular genes still awaits discovery.

Previous work has shown a coordinated down-regulation of the LSM2, 4, 5, 7 and 8 genes in response to poor carbon sources in relation to slower cellular growth rate in S. cerevisiae. This work provided an insight into a possible functional role of the LSM7 intron in trans-regulation of this set of spliceosomal LSM genes (Figure 3.1), which was also extended to the splicing capacity of the cells driving this environmental adaptation (Palmisano, 2006). Unfortunately, the strains generated by Palmisano could not be recovered from the laboratory stocks. Furthermore, a correction to the annotation of the LSM7 intron made subsequent to her work (Miura et al., 2006) led to the addition of 8 amino acids at the beginning of exon 2 and a shortening of the intron. This eliminated a gap in the alignment between Lsm7 and its fungal orthologs. Based on this, the initial work described in this chapter aimed to re-investigate the properties of the LSM7 gene and specifically its intron in mediating the expression of LSM genes under two types of carbon metabolism using reconstructed LSM7 mutant strains.

Figure 3.1 Model of LSM7 intron-mediated regulation of a subset of LSM genes proposed by Palmisano (Palmisano, 2006). The LSM7 intron sequence was found to play a role as a repressor in controlling the expression levels of LSM2, 4, 5, 8 and LSM7 itself, particularly in response to poor carbon sources in relation to slower growth rates.

A different technique of mutant construction from that of the previous study was employed since there was considerable difficulty in obtaining a number of the mutants using the method used by Palmisano (Palmisano, 2006). Mutant strains that lacked the LSM7 gene (lsm7Δ), or its intron (intronless, Δi), or which expressed the intron but not the exons (intron-only, iO) (Figure 3.3 A) were constructed to study the essentiality of the Lsm7 protein, the intron or other elements which may be involved in the regulation of the expression of the LSM genes. It is not easy to determine a priori which conditions may require splicing systems to be regulated, but previously obtained data indicated that growth rate would be a good candidate. Media containing different carbon sources, which might also reveal differential regulation in response to metabolic changes, were used. Rich media containing either glucose or a less favorable carbon source, acetate, were used to study changes in transcript levels of the LSM genes of cells grown under each condition. Comparison of the patterns of LSM gene expression in glucose-grown cells with those in acetate-grown cells might be an indicator of regulation occurring.

3.2 Production and verification of LSM7 mutant strains

The substrate analogue 5-fluoroorotic acid (5-FOA) is converted to toxic 5-fluorouracil (5-FU) in yeast strains expressing a functional URA3 gene and this has been commonly used for counter-selection of strains for displacement of URA3 by a targeted gene, in this case a mutated LSM7 PCR product (Boeke et al., 1984). In this study, BY4742 lsm7::URA3 (lsm7Δ) (Palmisano, 2006) was initially used as the parent strain for yeast transformations to reconstruct lsm7 mutants with 5-FOA. However, a high level of background growth on the plates for both the haploid (BY4742 lsm7::URA3) and the heterozygous diploid (BY4741 × BY4742 lsm7::URA3/LSM7) strains was observed using this selection technique. Sporulation of potential diploid colonies to obtain the desired haploid mutant was unachievable since the majority of the colonies were found to be petite, indicating that 5-FOA had a mutagenic effect impairing the mitochondrial genome of the strain. Only a small number of colonies (less than 5%) were found to be grande, but all of these had point mutations in their URA3 genes, allowing the cells to survive in the presence of 5-FOA.

To avoid this problem, several variations of this method were attempted, including reducing the amount of 5-FOA and reducing the length of time and the temperature of the heat-shock treatment used in the transformation. However, it was found that the amount of 5-FOA (0.1% w/v) used initially was the minimal amount needed to inhibit background growth of Ura+ cells, especially in diploids. Skipping or reducing the temperature of the heat-shock treatment during transformation did not eliminate the impairment of mitochondrial functions or mutations in the URA3 gene in the presence of 5-FOA.

Resistance to 5-FOA can arise spontaneously and results in colonies that grow rapidly in the presence of 5-FOA. Under these conditions, it may be necessary to screen more than 10⁴ transformants to have a reasonable chance of obtaining a desired mutant (Frohlich et al., 1992). Therefore a modified method of the directed integration of a linearized plasmid (Adams et al., 1997) was employed and successfully replaced the original cloning method. The lsm7 mutant constructs were subcloned into the yeast integrative plasmid pFL26, which contains a LEU2 auxotrophic marker (Bonneaud et al., 1991). BY4742 lsm7::URA3 (lsm7Δ) was transformed with each lsm7_LEU2 cassette generated by PCR, allowing integration of the construct into the LSM7 locus under the control of its own promoter. This strategy also minimizes any potential complication in genomic structure and interference by extra foreign sequences derived from the plasmid (Figure 3.2).

Putative integrants were selected by growth on leucine drop-out media and simultaneously tested for lack of growth on uracil drop-out media to confirm loss of the URA3 gene insertion at the native LSM7 locus. A wild-type strain containing the wild-type LSM7 construct was also generated in the same manner. No background growth was observed and single integration of each construct was confirmed by Southern blot (Figure 3.3), PCR and sequencing. This method led to the construction of the wild-type (WT) strain, the intronless (Δi), intron-only (iO) and the (i01~i19) intron mutants (used in Chapter 4).

Figure 3.2 Schematic diagram of cloning procedures for the LSM7 wild-type and mutant constructs at the LSM7 locus. The LSM7 wild-type or mutant construct is shown in green, including the coding region (box), the 5’ UTR and the 3’ UTR (line). The URA3 cassette at the LSM7 locus of the BY4742 lsm7::URA3 strain is indicated by a grey box. A blue line indicates the 5’ and 3’ UTR of LSM7 on yeast chromosome XIV. PCR primers and product are shown in the colors of the corresponding sequences. The positions of homologous recombination are represented with red crosses. As a result, the lsm7_LEU2 cassettes are integrated into the native LSM7 locus with intact 5’ UTR and 3’ UTR.

Figure 3.3 Production and verification of mutant strains described in this chapter. (A) Schematic diagram of molecular structure of the LSM7 gene and the LSM7 wild-type and mutant constructs. Sequences of splicing elements of the LSM7 intron are indicated in red with the two exons represented in open boxes. In lsm7Δ, the entire 444 bp long LSM7 gene is replaced by the URA3 sequence plus the 5’ untranslated region (5’ UTR, 793 bp) and the 3’ untranslated region (3’ UTR, 143 bp) to generate a complete knock-out of LSM7; in intronless (Δi), the intron is removed and exon 1 (1-19 bp) and exon 2 (115-444 bp) are joined; in intron-only (iO), only the intron is present with start and stop codons remaining for transcription. (B) Southern blot analysis of genomic DNA isolated from strains carrying single integrated LSM7 wild-type and mutant constructs. Genomic DNA isolated from putative single integrants was analyzed by Southern blot as described in Section 2.3.4. The hybridization signals from the LSM7-specific probe are shown above. No signals were detected from lsm7Δ. A single 3.1 kb band was detected in the parent strain BY4742 and in wild-type (WT), both of which contain a single copy of the wild-type LSM7 gene. Smaller bands of appropriate sizes of around 3.0 kb and 2.7 kb were detected in Δi and iO, respectively. Transformants with multiple bands detected were considered as multiple integrations and were excluded.

3.3 Expression level of LSM genes changes in response to different carbon sources

In order to determine the transcriptional response of the LSM genes to growth on different carbon sources, transcript levels of LSM1-8 from cells grown on rich media containing either glucose or acetate (representing rich and poor carbon sources respectively) were analyzed by quantitative real-time PCR (qRT-PCR). RNA samples were extracted from each strain harvested in mid-log phase (OD600 = 0.5 ± 0.05) for both medium conditions as described in Section 2.5. The transcript level of the SIR3 gene in each sample was determined and used as a reference control to normalize the expression levels of all data. SIR3 exhibits constitutive expression across a wide variety of growth conditions and is present in similar abundance to the LSM transcripts (DeRisi et al., 1997; Hibbs et al., 2007; Ronen & Botstein, 2006) (also microarray data in this study).

The expression pattern of the LSM genes in BY4742 cells grown on glucose was significantly different from that in cells grown on acetate (Figure 3.4 A). A comparable result was also observed in the wild-type (WT) strain carrying a wild-type LSM7_LEU2 construct at the LSM7 locus of the parent strain BY4742 (Figure 3.4 B). These data indicated that the disruption of the sequence 743 bp downstream of LSM7 with ~2 kb of LEU2 sequence had no significant effect on the normal transcription of LSM7 and other LSM genes. Therefore, all lsm7 mutant strains were constructed using the same approach, and the WT strain was used for comparison in the following sections for consistency.

Figure 3.4 Expression levels of LSM genes in wild-type conditions on glucose and acetate. Relative levels of LSM transcripts of BY4742 (A) and WT (B) were measured using SIR3 as a reference control. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates.

Significant differences in LSM gene expression in response to a poorer carbon source were observed from the relative expression levels of each LSM gene under different growth conditions (Figure 3.5). This shows that although several of the LSM genes do exhibit transcriptional regulation in response to the carbon source, coordinated down-regulation of the LSM2, 4, 5, 7 and 8 genes on acetate (as described by Palmisano, 2006) was not observed. Under growth on acetate, expression of LSM2, 3, 4, and 8 was up-regulated in comparison to their expression in cells undergoing glucose fermentation, especially for LSM8, which showed a more than 1-fold increase. The LSM1 and LSM7 genes showed no significant change in their expression, and only LSM5 and LSM6 exhibited a reduction in expression in response to acetate. These results were consistent with those obtained in a collaboration with a praktikum student who assessed the regulation of the LSM genes under 3 different nitrogen sources and 3 carbon sources, including glucose and acetate (Dengler, 2008). That study also found no evidence for the coordinated down-regulation of any LSM gene in cells grown on poor nitrogen sources or carbon sources which led to slower growth rates, and found that the expression profile of the LSM genes was unique to each condition. This indicated that the transcription levels of LSM genes are finely regulated in response to different nutrient environments in order to maintain normal cellular function, rather than in response to growth rate.

Figure 3.5 Changes in expression levels of LSM genes of wild-type strains in response to acetate. Differential expression of LSM genes of BY4742 (A) and WT (B) on acetate relative to glucose is presented with data normalized to glucose-grown cells. Cells were grown to exponential phase on rich media containing either glucose (YEPD) or acetate (YEPA) and transcript levels for the LSM genes were determined by qRT-PCR using the SIR3 gene transcript as a control. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates.
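The normalization to glucose-grown cells amounts to an expression ratio, which can be sketched as follows. The values are hypothetical SIR3-normalized transcript levels, and taking the ratio of replicate means (rather than the mean of per-replicate ratios) is an assumption.

```python
from statistics import mean


def acetate_to_glucose_ratio(acetate_levels, glucose_levels):
    """Mean transcript level on acetate divided by the mean on glucose."""
    return mean(acetate_levels) / mean(glucose_levels)


def direction(ratio):
    """Label a ratio as up-regulated (> 1), down-regulated (< 1) or unchanged."""
    if ratio > 1:
        return "up"
    if ratio < 1:
        return "down"
    return "unchanged"


# Hypothetical SIR3-normalized levels from three biological replicates.
acetate = [2.2, 2.0, 1.8]
glucose = [1.0, 1.1, 0.9]
ratio = acetate_to_glucose_ratio(acetate, glucose)
```

In practice a label of "up" or "down" would only be assigned where the difference is also statistically significant.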

3.4 The requirement for LSM7 coding sequence and intron in maintaining the expression levels of the LSM genes in response to a poor carbon source.

3.4.1 LSM7 is required for normal expression of LSM genes in response to growth on acetate.

Unlike most of the LSM genes, LSM1, LSM6 and LSM7 are not essential for viability in S. cerevisiae. However, given that their products occur in at least two Lsm complexes, strains lacking Lsm6 or Lsm7 are heat sensitive and have mild defects in growth and RNA splicing (Beggs, 2005; Mayes et al., 1999). Having observed the changes in expression of LSM genes in response to different carbon metabolism, and given the implication of LSM7-dependent LSM gene regulation from previous work (Palmisano, 2006), the effect of deletion of the entire LSM7 coding sequence (lsm7Δ) on the expression of LSM1-8 in cells grown on different carbon sources was examined.

The change in transcription of the LSM genes in response to deletion of LSM7 was shown to be carbon-source dependent (Figure 3.6). Deletion of LSM7 led to a statistically significant increase in expression of the LSM2, LSM3, LSM5 and LSM8 genes when the cells were grown on glucose (Figure 3.6 A). In acetate, expression of LSM3, LSM5 and LSM6 was significantly increased whereas the level of LSM8 transcript was reduced in comparison to that measured for the WT (Figure 3.6 B). LSM2 expression was unchanged in cells growing on acetate. One hypothesis is that expression of LSM3 and LSM5 was up-regulated as a common response in cells grown on both carbon sources in order to compensate for the loss of Lsm7 protein, with Lsm3 and Lsm5 taking the place of the non-essential Lsm7 in the complexes (Beggs, 2005). However, Lsm3 and Lsm5 are not more similar to Lsm7 than the other Lsm proteins are, based on the sequence similarity in their Sm domains (Salgado-Garrido et al., 1999). Further analysis of the levels of these proteins in each Lsm complex in the strain is required to test this hypothesis. On the other hand, the differential response of LSM2, 6 and 8 under the different carbon sources was more likely a specific response to the different metabolic requirements of each specific condition. The change in expression levels for the LSM genes between cells grown on glucose and acetate was also expressed as a ratio for both strains (Figure 3.6 C). This shows that complete knockout of LSM7 led to changes in the normal regulation of LSM3, LSM5, LSM6 and LSM8, particularly LSM8.

Figure 3.6 Expression levels of LSM genes in the lsm7Δ mutant grown under different carbon sources. Relative levels of LSM transcripts in cells grown to exponential phase on glucose (A) and acetate (B) were measured using SIR3 as a reference. Changes in transcript levels of LSM genes in response to growth on acetate are presented with data normalized to glucose (C). Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. Significant differences between the mutant and WT, determined by two-tailed t-test, are indicated with asterisks (*).
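The kind of two-tailed comparison between mutant and WT referred to in the caption can be sketched as below. The thesis does not specify the t-test variant or significance threshold, so a pooled-variance (Student's) test at the 5% level is assumed; the transcript levels are hypothetical, and the critical value 2.776 is the standard two-tailed 5% value for 4 degrees of freedom.

```python
import math
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    na, nb = len(sample_a), len(sample_b)
    pooled = ((na - 1) * variance(sample_a)
              + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / standard_error, na + nb - 2


# Two-tailed 5% critical value of Student's t for 4 degrees of freedom.
CRITICAL_T_DF4 = 2.776

# Hypothetical SIR3-normalized transcript levels, three replicates each.
wild_type = [1.0, 1.2, 1.1]
mutant = [2.0, 2.2, 2.1]
t_stat, dof = students_t(mutant, wild_type)
significant = abs(t_stat) > CRITICAL_T_DF4
```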

3.4.2 The LSM7 intron is required for normal expression of LSM genes in response to growth on acetate.

The coding sequence of LSM7 is interrupted by an intron, a feature found in only 5% of genes in budding yeast. To determine whether changes in the expression of the LSM genes were due to loss of the Lsm7 protein, the LSM7 intron or any other encoded elements, the transcript levels of the LSM genes were analyzed in an lsm7 intronless mutant (Δi), which contains a functional LSM7 gene with the intron removed.

In glucose-grown cells expressing a functional LSM7 gene without the intron there was a significant increase in the expression level of most of the LSM genes, with the exception of LSM1 and LSM7 (Figure 3.7 A). Since there was no significant alteration in the level of the LSM7 transcript, the increase in the expression level of LSM2, 3, 5 and 8 observed in both lsm7Δ and Δi mutants was likely to be a direct response to deletion of the LSM7 intron. However, transcription of these genes was affected to different extents when the intronless LSM7 was expressed, indicating that changes in the expression of these genes may have been due to a combination of loss of the Lsm7 protein from the complex (indirect effect) and loss of the control potentially mediated by the LSM7 intron. On the other hand, increased expression of LSM4 and LSM6 observed only in the Δi mutant was probably not due to any direct effect of the intron since the intron is also lacking in the lsm7Δ strain.

Deletion of the LSM7 intron had a much smaller effect on the transcript levels of LSM genes when the cells were grown on acetate (Figure 3.7 B). Transcription of most LSM genes was not significantly affected and only LSM5 showed a statistically significant increase in expression, indicating that any regulation of LSM genes is more dependent on the Lsm7 protein than on the LSM7 intron in this condition. It was noteworthy that the level of LSM7 transcript was not significantly changed in Δi under either media condition, indicating that the LSM7 gene is regulated independently of its intron and that the alteration in expression of LSM genes specific to the deletion of the intron was not mediated through control of expression of the mature LSM7 transcript.

Despite the variations in LSM transcript levels in Δi cells grown on different carbon sources, the data show a consistent, albeit not always statistically significant, reduction in the relative response in expression of all the LSM genes between acetate and glucose (Figure 3.7 C). This profile of change strongly suggests that the LSM7 intron is required for the normal cellular response to this environmental change. As indicated above, the expression ratio of LSM7 was not significantly affected.

Figure 3.7 Expression levels of LSM genes under different carbon sources in response to deletion of the LSM7 intron. Relative levels of LSM transcripts in cells grown on glucose (A) and acetate (B) were measured using SIR3 as a reference. Changes in transcript levels of LSM genes in response to growth on acetate were presented with data normalized to glucose (C). Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression is indicated with asterisks (*).

3.4.3 The LSM7 intron alone is involved in fine-tuning the transcription level of LSM genes in response to growth on acetate.

The LSM7 intron was found to be implicated in maintaining the normal transcription levels of a group of LSM genes in response to acetate metabolism; however, it was not clear if the observation involved both the Lsm7 protein and its intron. In order to further assess whether any of the previous changes in the transcriptional regulation of these genes were due to intron-dependent regulation, the expression level of LSM genes was analyzed under both media conditions in the lsm7 intron-only (iO) mutant strain.

This strain has the LSM7 intron encoded at the LSM7 locus with native start and stop codons for transcription and contains no functional protein-coding sequence.

In this mutant there were changes in expression levels of LSM1, LSM3 and LSM5 in cells grown on glucose (Figure 3.8 A). The significant alteration in the level of LSM8 transcript observed in lsm7Δ and Δi mutants was not observed, suggesting a direct involvement of the intron alone in regulation of this gene, likely as a repressor. On the other hand, up-regulation of LSM3 and LSM5 was detected in all lsm7 mutant strains, suggesting that transcriptional control of these genes was dependent to some extent on both the LSM7 exons and the intron. Moreover, induction of LSM1 transcript was observed only in the iO strain grown on glucose and this may have been an indirect response through the alteration of LSM8 transcript, whose product is involved in diverse RNA processing functions.

Transcript levels of the LSM genes in cells grown on acetate were affected more by the absence of Lsm7 protein (iO) than by loss of the intron (Δi). Expression levels of LSM2, LSM3, LSM5 and LSM6 in iO cells grown on acetate were significantly increased (Figure 3.8 B). Similar changes in the level of LSM3, LSM5 and LSM6 transcripts were observed in lsm7Δ, indicating a direct response to the absence of the Lsm7 protein. However, a small but significant increase in the level of LSM5 transcript was also observed in Δi, indicating a possible involvement of the intron in regulating this gene. Notably, up-regulation of LSM5 was observed in all lsm7 mutants grown on either carbon source, indicating that both the intron and exons of LSM7 are involved in regulation of LSM5, whose product is located adjacent to Lsm7 in Lsm protein complexes. The up-regulation of LSM2 that was observed only in iO under acetate growth was likely to be an indirect response through alteration of the levels of other transcripts.

Although the LSM7 intron appears to exhibit a minor independent involvement in the regulation of LSM genes under either carbon source, the expression ratios comparing expression on acetate to that on glucose for all LSM genes in the iO mutant resembled those obtained from the WT (Figure 3.8 C). No significant variations in the expression levels of LSM3, 5, 6 and 8 in lsm7Δ and LSM2, 3, 4, 6 and 8 in Δi between the two growth conditions were observed, suggesting that the intron alone was involved in fine-tuning the expression level of some LSM genes, at least LSM3, 6 and 8, in response to a different carbon metabolism.

Figure 3.8 Expression levels of LSM genes under different carbon sources in response to deletion of the LSM7 coding regions. Relative levels of LSM transcripts in cells grown on glucose (A) and acetate (B) were measured using SIR3 as a reference. Changes in transcript levels of LSM genes in response to growth on acetate were presented with data normalized to glucose (C). Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression is indicated with asterisks (*).

In summary, a regular, consistent pattern of significant changes in LSM transcript levels was not observed for the removal of coding and/or non-coding sequences of the LSM7 gene in cells grown on either carbon source (Figure 3.9). In addition, the response of LSM genes to a specific mutation of the LSM7 gene exhibited different patterns in the two growth conditions. The changes in transcript levels of LSM genes were broadly classified into four types: 1) LSM7 ORF-specific, requiring both the intron and the coding exons (LSM3 and LSM5 in glucose; LSM5 in acetate); 2) Lsm7 protein-specific (LSM3 and LSM6 in acetate); 3) LSM7 intron-specific (LSM2 and LSM8 in glucose); and 4) non-specific change observed only in the lsm7Δ strain, the Δi strain or the iO strain (LSM1, LSM4 and LSM6 in glucose; LSM2 and LSM8 in acetate).
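The logic behind this four-way classification can be made explicit with a short sketch. The strain labels and function name are hypothetical, and the rule set is an interpretation of the text: a change seen in all three mutants is treated as ORF-specific, one seen only in the two strains lacking Lsm7 protein as protein-specific, one seen only in the two strains lacking the intron as intron-specific, and one seen in a single strain as non-specific. The glucose data are those reported in Sections 3.4.1 to 3.4.3.

```python
ALL_MUTANTS = {"lsm7_delta", "intronless", "intron_only"}
LACKING_PROTEIN = {"lsm7_delta", "intron_only"}   # no Lsm7 protein made
LACKING_INTRON = {"lsm7_delta", "intronless"}     # no LSM7 intron present


def classify_change(changed_in):
    """Classify a gene by the set of mutants in which it changed significantly."""
    changed_in = set(changed_in)
    if not changed_in:
        return "unchanged"
    if changed_in == ALL_MUTANTS:
        return "ORF-specific"
    if changed_in == LACKING_PROTEIN:
        return "protein-specific"
    if changed_in == LACKING_INTRON:
        return "intron-specific"
    if len(changed_in) == 1:
        return "non-specific"
    return "unclassified"  # combination not covered by the four types


# Genes with significant changes on glucose, as reported in the text.
glucose_changes = {
    "lsm7_delta": {"LSM2", "LSM3", "LSM5", "LSM8"},
    "intronless": {"LSM2", "LSM3", "LSM4", "LSM5", "LSM6", "LSM8"},
    "intron_only": {"LSM1", "LSM3", "LSM5"},
}

genes = ["LSM%d" % i for i in range(1, 9)]
glucose_classes = {
    gene: classify_change(
        strain for strain, changed in glucose_changes.items() if gene in changed
    )
    for gene in genes
}
```

Applied to the glucose data, this reproduces the groupings listed above: LSM3 and LSM5 as ORF-specific, LSM2 and LSM8 as intron-specific, and LSM1, LSM4 and LSM6 as non-specific.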

In both growth conditions, the level of the mature LSM7 transcript was not significantly affected by the deletion of its intron. Transcription profiles of LSM genes of the lsm7Δ strain and the iO strain are more similar to each other than to those measured from the strain lacking the LSM7 intron, in particular for cells grown on acetate. Nevertheless, deletion of the LSM7 intron led to changes in expression of the LSM genes to different extents in different conditions. Both the LSM7 intron and the protein-coding exons are required for normal expression of LSM genes. The intron plays a role, at least in part, in regulation of the expression of the LSM2, LSM3, LSM5 and LSM8 genes in glucose-grown cells, and contributes to minor control of the expression of LSM5 in acetate-grown cells. In addition, the overall increase in transcript levels of most of the LSM genes in the Δi mutant grown on glucose suggests that the LSM7 intron may act as a repressor.


Figure 3.9 Expression levels of LSM transcripts in response to each mutation on the LSM7 gene in cells grown on glucose and acetate. Transcript levels of LSM genes of each lsm7 mutant were normalized to those of WT cells grown on glucose and acetate. The relative expression levels of LSM transcripts in each strain are indicated as a two-color heat-map.

As previously mentioned, the response of LSM transcript levels to growth on acetate could be broadly divided into three groups in wild-type cells: unchanged (LSM1 and LSM7), up-regulated (LSM2, LSM3, LSM4 and LSM8) and down-regulated (LSM5 and LSM6) (Figure 3.5). Interestingly, the expression ratios within each of these gene sets changed coordinately in response to each specific mutation of LSM7 (Figure 3.10). While the expression ratios of LSM1 and LSM7 in all lsm7 mutants remained identical to those of WT, the ratios of genes in the up-regulated set (LSM2, LSM3, LSM4 and LSM8) were decreased in both lsm7Δ and Δi, and the transcript ratios of the down-regulated set (LSM5 and LSM6) showed opposite patterns in lsm7Δ and Δi. In each case, the altered response was restored when the LSM7 intron was expressed alone, in particular for LSM8, which exhibited the most significant changes in transcript level in response to deletion of the intron sequence.


Figure 3.10 Groups of coordinately expressed LSM genes based on their response to carbon source and LSM7 construct. Changes in expression of LSM genes between growth on acetate and growth on glucose are presented for all the lsm7 strains. The three groups of coordinately regulated LSM gene sets were plotted separately. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates.

3.4.4 Expression of the LSM7 intron improves the growth rate of cells lacking Lsm7 protein

The importance of the LSM7 intron in mediating expression of the LSM genes has also been illustrated by the differences in growth rate of the WT and lsm7 mutants. The growth of each strain on different carbon sources was monitored for 60 h by measuring the optical density (OD) of the cell cultures at 600 nm using a Bioscreen C spectrophotometer as described in Section 2.7. The doubling times (which are inversely proportional to the growth rate) of each strain on both media were determined from the exponential phase of the corresponding growth curve.
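As an illustration of how a doubling time can be derived from the exponential phase of an OD600 curve, the sketch below fits a straight line to ln(OD600) against time and converts the slope (the specific growth rate) into a doubling time. It is a sketch under stated assumptions, not the procedure of Section 2.7: the function name is hypothetical, the data are synthetic, and the caller is assumed to have already selected the exponential-phase points.

```python
import math


def doubling_time(times_h, od600):
    """Doubling time (h) from exponential-phase readings.

    Fits ln(OD600) = intercept + rate * t by ordinary least squares and
    returns ln(2) / rate, so a faster growth rate gives a shorter doubling time.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD readings")
    logs = [math.log(value) for value in od600]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    rate = sxy / sxx  # specific growth rate, per hour
    return math.log(2) / rate


# Synthetic exponential-phase data: a culture that doubles every 2 h.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
ods = [0.05 * 2 ** (t / 2.0) for t in times]

td = doubling_time(times, ods)
print(f"doubling time: {td:.2f} h")
```

Fitting on the log scale uses every exponential-phase reading rather than only two time points, which makes the estimate less sensitive to noise in any single OD measurement.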

As described previously, strains lacking Lsm7 are heat sensitive with mild growth defects (Mayes et al., 1999) (Figure 3.11). The presence of the Δi construct relieved the slow-growth phenotype of lsm7Δ on both carbon sources, indicating that a functional Lsm7 protein is made in this strain. This also indicates that the lack of the intron has no observable effect on growth rate if the protein is otherwise functional. As would be expected, the iO strain, which has only the intron sequence of the LSM7 gene, grew more slowly than WT and Δi. However, the presence of the intron alone also appears to have an effect, especially on acetate, where it leads to a higher growth rate than in the mutant lacking all LSM7 sequences (lsm7Δ). The increased growth rate of the iO mutant may reflect improved transcript levels of LSM genes that support proper function in RNA processing events.


Figure 3.11 Growth curves and growth rates of lsm7 mutants in media containing glucose or acetate. Growth of the WT and each lsm7 mutant strain was monitored for 60 h based on the optical density at 600 nm (OD600) of cultures containing either glucose (A) or acetate (B). The doubling time, representing the growth rate of each strain on both media, was determined from the exponential phase of the corresponding growth curve (C). Error bars represent the standard deviation of the mean growth rate of each culture from at least three biological replicates. The experiment was repeated and a similar pattern of growth curves was observed.


3.5 Does the LSM7 intron act as an independent trans-regulator in LSM gene expression?

From the results shown above, the LSM7 intron appears to play a role in fine-tuning the expression of some of the LSM genes. This raises the question of the mechanism by which this regulation occurs. Since there was no significant alteration in the expression of the mature LSM7 transcript in response to intron deletion, the intron could act independently as a trans-regulator of the LSM gene loci, which are scattered across the genome. To test this, the LSM7 intron was cloned into the ADE1 gene locus, separating the 921 bp sequence into 8 bp and 913 bp exons to mimic the sequence arrangement of LSM7. Strains lacking a functional ADE1 gene generate a red pigment derived from the polymerization of the intermediate phosphoribosylamino-imidazole in the adenine biosynthetic pathway.

Therefore, ADE1 was knocked out in the WT strain and the lsm7 mutants, and the double mutants were transformed with an ADE1LSM7i cassette generated by multi-step PCR as described in Section 2.4.3. Putative white integrants were selected from mixtures of red and white colonies on adenine drop-out plates and correct insertion of the construct was verified by PCR, resulting in three new strains: WT_ADE1LSM7i, lsm7Δ_ADE1LSM7i and Δi_ADE1LSM7i. Each confirmed double mutant was re-streaked on adenine drop-out plates and no red colonies were observed (data not shown), indicating that the LSM7 intron was properly spliced from the ADE1 pre-mRNA and that functional Ade1 protein was produced in all strains, including cells lacking a functional LSM7 gene (lsm7Δ_ADE1LSM7i).


3.5.1 Expression of the LSM7 intron from the ADE1 locus alters the expression levels of LSM genes.

Assuming that none of the species in mRNA splicing (unspliced pre-mRNA → spliced mRNA + intron lariat) is rapidly degraded, the level of mature mRNA produced equals the level of intron spliced for any pre-mRNA with a single intron. Therefore, to assess the level of LSM7 intron expressed from the ADE1 locus, spliced ADE1 transcript was measured in all ADE1LSM7i mutants. The ratio of unspliced to total ADE1 transcripts was also analyzed to determine the splicing capacity of each strain. As illustrated in the mathematical model of Patel et al. (2002), the concentration of unspliced transcript at steady state depends on the ratio of transcription rate to splicing rate (Kts/Ksp). Therefore, splicing rate is inversely proportional to the ratio of unspliced to total transcripts, assuming that the rate of transcription is proportional to the total amount of transcript produced at the time examined. RNA samples were isolated from cells grown on glucose or acetate as described previously (Section 3.3) and the level of each ADE1 transcript was determined by qRT-PCR using primers specific to each species (Section 2.5.3 and Figure 3.12A).
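The reasoning behind this inverse proportionality can be written out as a short worked derivation. It is a sketch under simplifying assumptions and not necessarily the exact formulation of Patel et al. (2002): transcription is treated as a constant input at rate K_ts, splicing as a first-order process with rate constant K_sp, [U] denotes the concentration of unspliced transcript, [T] the total transcript, and c is a hypothetical proportionality constant introduced only for illustration.

```latex
\begin{align}
\frac{d[U]}{dt} &= K_{ts} - K_{sp}\,[U] = 0
  \quad\Longrightarrow\quad
  [U]_{ss} = \frac{K_{ts}}{K_{sp}} \\
K_{ts} &= c\,[T]
  \quad\Longrightarrow\quad
  \frac{[U]_{ss}}{[T]} = \frac{c}{K_{sp}}
  \quad\Longrightarrow\quad
  K_{sp} \propto \frac{[T]}{[U]_{ss}}
\end{align}
```

Under these assumptions, a lower percentage of unspliced transcript relative to total transcript corresponds to a higher splicing rate.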

With the exception of Δi_ADE1LSM7i cells grown on glucose, the mature ADE1 mRNA of all ADE1LSM7i mutants was expressed at levels comparable to those of the mature LSM7 transcript under both media conditions (Figure 3.12B; ~6 relative to SIR3), indicating that the LSM7 intron was produced from the ADE1 locus at a level similar to that from native LSM7. Therefore, considering the intron produced from both the LSM7 and ADE1 loci, the intron was produced at a similar level in lsm7Δ_ADE1LSM7i cells as in WT cells grown on both carbon sources, whereas almost twice the native level of LSM7 intron was produced in WT_ADE1LSM7i. Interestingly, around double the native level of the intron was generated in Δi_ADE1LSM7i cells grown on glucose, compared with the level produced on acetate.

Although changes in the expression of LSM genes may also affect splicing regulation in these strains, the splicing capacity of each strain did not vary greatly (Figure 3.12C). Splicing of ADE1 was significantly better in Δi_ADE1LSM7i cells than in the WT_ADE1LSM7i or lsm7Δ_ADE1LSM7i strains. Lack of LSM7 did not lead to a large defect in splicing of ADE1, but did result in more unspliced ADE1 in cells growing on glucose and, curiously, less in cells growing on acetate. The level of LSM7 intron produced from the LSM7 and/or ADE1 loci was used to interpret changes in expression of LSM genes in each strain in the following sections.


Figure 3.12 Splicing of ADE1 pre-mRNA from ADE1LSM7i strains grown under two carbon sources. Schematic diagram of targeting sequences of primer pairs used to measure all mRNA species (Total), pre-mRNA (Unspliced) and mature mRNA (Spliced) of ADE1 (A). Relative levels of spliced ADE1 transcripts in cells grown on glucose and acetate representative of the amount of LSM7 intron produced (B). Splicing capacity was reciprocally represented by the percentage (%) of unspliced ADE1 transcript relative to the total ADE1 mRNA species (C). Error bars represent the standard deviation of the mean transcript level of each gene from at least three biological replicates. Two-tailed t-test was performed and significant changes in the expression of each transcript in ADE1LSM7i mutants to WT_ADE1LSM7i are indicated with asterisks (*).

With additional LSM7 intron expressed from the ADE1 locus, the transcript levels of most of the LSM genes in WT_ADE1LSM7i were significantly affected relative to those of the WT under both media conditions (Figure 3.13A and B). The majority of affected genes, including LSM7, were down-regulated and only the transcript level of LSM5 was induced, suggesting that a higher level of LSM7 intron generally altered transcription of the genes through repression. More LSM genes showed statistically significant changes in cells grown on acetate than on glucose. The response in expression level of most of the LSM genes to growth on acetate was also significantly altered (Figure 3.13C).


Figure 3.13 Expression levels of the LSM genes in WT cells in response to LSM7 intron expressed within the ADE1 gene. Relative levels of LSM transcripts in cells grown on glucose (A) and acetate (B) were measured using SIR3 as a reference. Changes in expression of LSM genes in response to growth on acetate were presented with data normalized to glucose (C). Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression ratio of ADE1LSM7i to WT is indicated with asterisks (*).


3.5.2 Normal LSM gene expression in cells grown on acetate was not restored with the LSM7 intron expressed from the ADE1 locus in lsm7Δ and Δi mutants.

The expression of the majority of the LSM genes in WT cells was affected by the extra copy of the LSM7 intron expressed with the ADE1 gene, and the response of LSM genes to growth on acetate was also altered. To examine whether this intron-dependent effect could also reverse the changes in the expression of LSM genes caused by deletion of LSM7 or its intron, the expression levels of LSM genes were analyzed in the lsm7Δ_ADE1LSM7i and Δi_ADE1LSM7i strains in both media conditions.

In the lsm7Δ_ADE1LSM7i strain, where the intron is expressed only from the ADE1 construct and at approximately the same level as in the WT, the expression pattern of the LSM genes was very similar in glucose and acetate (Figure 3.14A and B). While LSM1, LSM3 and LSM5 showed similar responses, down-regulation of LSM6 and LSM8 was observed only under glucose and acetate, respectively, and the pattern of expression of the LSM genes did not resemble those in iO or lsm7Δ in either growth condition (Figure 3.14C and D).


Figure 3.14 Expression levels of the LSM genes of lsm7Δ cells in response to expression of the LSM7 intron at the ADE1 locus. Relative levels of LSM transcripts in cells grown on glucose (A and C) and acetate (B and D) were measured using SIR3 as a reference. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression ratio of ADE1LSM7i mutants to WT or lsm7 is indicated with asterisks (*).

If the LSM7 intron is able to act as a transcriptional enhancer or repressor in trans, ectopic expression of the LSM7 intron from the ADE1 gene should mimic the WT strain when expressed at the same level. This, however, was not the case (Figure 3.15). A copy of the LSM7 intron expressed from the ADE1 locus was unable to restore the expression pattern of LSM genes in the Δi_ADE1LSM7i strain to that of the WT on either carbon source (Figure 3.15A and B). Moreover, expression levels of all of the LSM genes were significantly altered on both carbon sources (Figure 3.15C and D), most of which were decreased at least 1-fold. These observations indicate that the ectopically expressed intron has a function that led to a different expression profile of the LSM genes in the Δi_ADE1LSM7i strain from that of the Δi strain. Curiously, LSM7 was down-regulated to a similar extent as in the WT_ADE1LSM7i strain. In fact, the expression patterns of the LSM genes in the Δi_ADE1LSM7i and WT_ADE1LSM7i strains were almost identical in either growth condition (Figure 3.15E and F), indicating that in the presence of Lsm7 protein the expression of LSM genes was generally suppressed in comparison to lsm7Δ_ADE1LSM7i, and that the effects were not proportional to the amount of intron present in the cells.


Figure 3.15 Expression levels of LSM genes of Δi cells in response to expression of the LSM7 intron at the ADE1 locus. Relative levels of LSM transcripts in cells grown on glucose (A, C and E) and acetate (B, D and F) were measured using SIR3 as a reference. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression ratio of ADE1LSM7i mutants to WT or lsm7 is indicated with asterisks (*).


As shown previously, the WT response in expression of LSM genes to growth on acetate was restored in the strain expressing only the LSM7 intron. The response was also altered in WT cells with an additional amount of LSM7 intron expressed in the cells. However, the normal response was not restored when the intron was expressed at the ADE1 locus; no significant differences in LSM transcript ratios were observed between lsm7Δ_ADE1LSM7i and lsm7Δ (Figure 3.16A and B). Similarly, most of the LSM transcript ratios in Δi_ADE1LSM7i did not differ significantly from those of Δi (Figure 3.16C and D). While only the expression ratios of LSM5 and LSM6 in Δi_ADE1LSM7i resembled those of WT, the ratio of LSM7 was significantly changed. In all cases, the responses of the LSM genes did not resemble those in the strains where the LSM7 intron was expressed ectopically.

These results indicate that expression of the LSM7 intron from another locus does affect expression of different sets of the LSM genes in cells grown on different carbon sources, but does not recover the normal response in expression of LSM genes that is mediated by the LSM7 intron expressed from the LSM7 locus.


Figure 3.16 Expression levels of LSM genes of ADE1LSM7i mutants in response to growth on acetate. Changes in expression of LSM genes in response to growth on acetate were presented with data normalized to glucose. Expression ratios of LSM genes of WT represent the normal response to growth on acetate relative to glucose. Ratios of lsm7Δ_ADE1LSM7i (A, B and E), Δi_ADE1LSM7i (C, D and E) and relevant strains were plotted on separate graphs for better presentation. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate PCR on at least three biological replicates. The outcome of two-tailed t-tests to determine significant differences in expression ratio of ADE1LSM7i mutants to WT or lsm7 is indicated with asterisks (*).


3.6 Discussion

Studies of Lsm proteins have indicated that alteration of the level of one Lsm protein may affect the expression of its proposed partner in an Lsm ring complex (Spiller et al., 2007b) and that shared subunits of two complexes may be involved in regulation of both complexes (Mazzoni et al., 2007; Reijns et al., 2008). Moreover, the formation of Lsm1-Lsm7 (mRNA degradation) and Lsm2-Lsm8 (mRNA splicing) complexes in different subcellular compartments was found to depend on the levels of Lsm1 and Lsm8 produced (Luhtala & Parker, 2009; Reijns et al., 2009; Spiller et al., 2007b). Given that Lsm proteins are involved in mRNA degradation, pre-mRNA splicing and several other diverged RNA processing functions, precise regulation of LSM genes is required and may involve several elements and factors at different levels.

Previous data indicated that LSM2, 4, 5, 7 and 8 were coordinately down-regulated in response to poor carbon sources in relation to slower cellular growth rate, and that this regulation was mediated by the LSM7 intron (Palmisano, 2006). In this study, however, transcriptional control of LSM genes was found to be more complex than previously believed. The patterns of expression of the LSM gene sets differed under different growth conditions and these changes did not simply reflect cellular growth rate or type of carbon source. Since the Lsm proteins are involved in various RNA processing functions, these differential changes in expression patterns of LSM genes under different growth conditions are likely to be important for meeting the cellular requirements of those functions. However, further analysis of Lsm protein localization or assembly into different complexes under different growth conditions is required to test this speculation.

The data presented in this chapter further demonstrate that deletion of the non-essential LSM7 gene led to a range of alterations in the expression pattern of LSM genes in cells grown on different carbon sources. The relative response of LSM genes to growth on acetate compared to their expression in cells grown on glucose was also disrupted when LSM7 was completely deleted, and the response was not restored in cells expressing a functional LSM7 lacking the intron. Interestingly, the data did further confirm that the intron of LSM7 alone can mediate fine-tuning of the expression level of the LSM genes as cells undergo physiological changes from growth on glucose to acetate.

A previous study has suggested that the majority of introns in yeast can be removed without affecting cellular growth on rich media (Parenteau et al., 2008). Consistent with this, the growth rate was not affected in cells with the LSM7 intron removed in either growth medium. However, expressing the intron alone did increase cellular growth rate on acetate, further strengthening the view that the intron has retained or gained some function during the evolution of S. cerevisiae. Although the mechanism of intron-mediated regulation of LSM genes is still under investigation, the expression level of LSM7 itself was not significantly changed on deletion of the LSM7 intron, despite transcription levels of several other LSM genes being affected to various extents under the two growth conditions used in this study. Therefore, it is highly unlikely that the intron codes for a cis-regulatory element that controls transcription or splicing of LSM7, at least in the two tested conditions.

Transcription levels of LSM genes under both growth conditions were also altered when the LSM7 intron was expressed from a separate locus in the genome. However, changes in expression of LSM genes in cells producing the LSM7 intron from an ectopic locus were different from those of cells expressing a comparable level of the intron from the native LSM7 locus. The extents of the changes were not proportional to the amount of intron present in the cells and differed between strains with and without a functional Lsm7. More importantly, the response of LSM genes to growth on acetate was not restored, strengthening the hypothesis that the spliced intron lariat, or something derived from it, functions as a trans-regulator in the control of LSM genes, yet also requires the expression of LSM7 from its native locus and possibly other factors.

Several instances of intron-dependent gene regulation have been noted in S. cerevisiae (Meyer & Vilardell, 2009); most of these involve controlling production of the mature transcript in cis through splicing. Apart from coding for cis-regulatory elements such as splicing enhancers or repressors, the regulatory sequence of an intron can also form a specific structure with an exon for autoregulation of the protein product (Li et al., 1996). These processes do not depend on the canonical RNA interference mechanisms, since these are lacking in S. cerevisiae (Drinnenberg et al., 2009).

Further insights into the functions of the intron of LSM7 are clearly needed to help understand its mode of action and specificity. This can be gained through mutagenesis analysis of the intron and the genome-wide transcriptional responses to deletion of the intron which are described in the next two chapters.


Chapter 4. Regulatory elements of the LSM7 intron

4.1 Introduction and aims

The data presented in Chapter 3 provide evidence of a function for the LSM7 intron in the regulation of the LSM genes. Although introns are less conserved than their associated protein-coding exons, studies across species from yeast to vertebrates have found highly conserved sequences enriched in introns of genes related to development and transcriptional control (Bejerano et al., 2004; Glazov et al., 2005; Siepel et al., 2005; Sironi et al., 2005). Reduced sequence conservation across species does not necessarily mean that these regions lack sequence-dependent functions, but rather that they have been subject to fewer constraints during evolution. The work presented in this chapter aims to identify sequences or regions of the LSM7 intron that are important in mediating the regulation of the LSM genes and to further understand the possible mechanism by which this regulation is achieved. Site-directed linker-scanning mutagenesis of the LSM7 intron was conducted to assess the contribution of intronic regions to specific changes in the transcriptional regulation of the LSM genes during growth on acetate or glucose. Since expression levels of the LSM genes can be affected by both the LSM7 intron and the Lsm7 protein, the levels of mature LSM7 mRNA were also examined to distinguish the causes of altered regulation of the LSM genes. A region of the LSM7 intron was found to be important in this control, and the sequence and predicted secondary structure of the LSM7 intron were analyzed further.


4.2 Conservation of the LSM7 intron across yeast species

In order to identify potential sequence motifs conserved in this mode of regulation, phylogenetic analysis was carried out on the LSM7 intron sequences. Intron sequences of five closely related Saccharomyces species (sensu stricto) and four other more distantly related yeast species were obtained from the SGD database (fungal genome search) using WuBlast2. Information on genome locations as well as the homologous sequences of the Lsm7 protein and coding sequences in each yeast species were also obtained using the Yeast Gene Order Browser (version 5) (Byrne & Wolfe, 2005) for confirmation of intron sequences.

All the LSM7 homologues in the species examined were found to have retained an intron near the 5’ end of the coding sequence. To assess conservation, the sequences were aligned with ClustalW (v2.1) (Thompson et al., 1994). Consistent with previous observations (Palmisano, 2006), blocks of highly conserved sequence were distributed throughout the introns of closely related yeast species, in particular at the well-defined splicing signals (Figure 4.1). However, this sequence conservation was not observed in more diverged species. The blocks of conserved sequence decline from the intron of S. castellii onwards, while the splicing signals are retained in all species.


Figure 4.1 Conservation of the LSM7 intron in yeast species. Schematic diagram of phylogenetic relationship between each yeast species (A). Species indicated in blue were considered in the study. Alignment of LSM7 intron sequences from five closely related sensu stricto yeast species and four distantly related yeast species (B). Conserved regions of at least 3 sequences are indicated by grey shading and highly conserved regions are represented with asterisks (*). Splicing elements of conserved sequences in closely related species are indicated in red. Intron sequences of LSM7 homologues were obtained from SGD using WuBlast2 and aligned with ClustalW (2.1) (Thompson et al., 1994).
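The shading rule in this caption (columns conserved in at least 3 sequences, with asterisks for highly conserved positions) can be illustrated with a small sketch. This is not the ClustalW implementation: the function name, the '+' marker and the toy sequences are hypothetical, and treating '*' as "identical in every sequence" is an assumption made for the example.

```python
from collections import Counter


def conservation_marks(aligned, min_count=3):
    """Mark each alignment column.

    '*' : the same base in every sequence (no gaps)
    '+' : the most common base occurs in at least min_count sequences
    ' ' : otherwise
    Sequences must already be aligned (equal length, '-' for gaps).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != "-")
        top = max(counts.values(), default=0)
        if top == len(aligned):
            marks.append("*")
        elif top >= min_count:
            marks.append("+")
        else:
            marks.append(" ")
    return "".join(marks)


# Toy alignment of four hypothetical sequences (not real intron data).
toy = ["GTATGT-A", "GTATGTCA", "GTACGTCA", "GTTTGAC-"]
marks = conservation_marks(toy)
for seq in toy:
    print(seq)
print(marks)
```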


4.3 Linker-scanning mutagenesis of the LSM7 intron

The sequences of the LSM7 intron were highly conserved in closely related yeast species. To determine important sequences and potential regulatory elements within the LSM7 intron, PCR-mediated site-directed mutagenesis was conducted to create 6 bp substitutions covering the intron and the first 18 bp of exon 2 as described in Section 2.4.3 (Table 4.1 and Figure 4.3). The GC-rich SacI restriction site sequence (GAGCTC) was chosen to replace the relatively AT-rich sequence of the LSM7 intron without introducing any potential stop codons that would terminate the translation of LSM7. Mutant strains containing each of the 6 bp substitutions were constructed and confirmed as for the lsm7 mutants described in Chapter 3. The expression profiles of the LSM genes in each strain carrying a defined alteration in the LSM7 intron, grown on rich media containing either glucose or acetate, were analyzed by qRT-PCR.

Table 4.1 LSM7 intron mutant constructs

Strain      Altered sequence (position in LSM7 ORF)   Known sequence properties
LSM7 i01    GTATGT (19-24)                            5’ splice site
LSM7 i02    TTCACT (25-30)
LSM7 i03    TCTTAT (31-36)
LSM7 i04    TTTCTT (37-42)
LSM7 i05    CCGTGG (43-48)
LSM7 i06    CAATAA (49-54)
LSM7 i07    CCTTCC (55-60)
LSM7 i08    TTTTGA (61-66)
LSM7 i09    CTTATT (67-72)
LSM7 i10    TATACT (73-78)                            branch point (5’ side)
LSM7 i11    AACATT (79-84)                            branch point (3’ side)
LSM7 i12    ATAATA (85-90)
LSM7 i13    ACTATG (91-96)
LSM7 i14    TTTCCT (97-102)
LSM7 i15    TTTTGA (103-108)
LSM7 i16    ACTAAG (109-114)                          3’ splice site
LSM7 i17    AAATCA (115-120)                          exon 2 (codes for 7th and 8th amino acids of Lsm7)
LSM7 i18    GAGAAC (121-126)                          exon 2 (codes for 9th and 10th amino acids of Lsm7)
LSM7 i19    AAACCA (127-132)                          exon 2 (codes for 11th and 12th amino acids of Lsm7)
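The design in Table 4.1 can be sketched in code: each mutant replaces one consecutive 6 bp window of the region (positions 19-132) with the SacI site. This is an illustrative reconstruction only, not the PCR procedure of Section 2.4.3. The region is assembled from the altered sequences listed in the table, GAGCTC is the SacI recognition sequence, and the function and variable names are hypothetical.

```python
SACI_SITE = "GAGCTC"  # SacI recognition sequence used as the 6 bp linker

# 6 bp windows i01..i19 as listed in Table 4.1 (positions 19-132 of LSM7).
WINDOWS = [
    "GTATGT", "TTCACT", "TCTTAT", "TTTCTT", "CCGTGG", "CAATAA", "CCTTCC",
    "TTTTGA", "CTTATT", "TATACT", "AACATT", "ATAATA", "ACTATG", "TTTCCT",
    "TTTTGA", "ACTAAG", "AAATCA", "GAGAAC", "AAACCA",
]
FIRST_POSITION = 19  # position of the first intron base in the LSM7 ORF


def linker_scan(region, linker=SACI_SITE, first_position=FIRST_POSITION):
    """Return {name: (start, end, mutated_region)} for consecutive windows."""
    step = len(linker)
    if len(region) % step:
        raise ValueError("region length must be a multiple of the linker length")
    mutants = {}
    for index in range(len(region) // step):
        start = index * step
        name = f"i{index + 1:02d}"
        mutated = region[:start] + linker + region[start + step:]
        mutants[name] = (first_position + start,
                         first_position + start + step - 1,
                         mutated)
    return mutants


region = "".join(WINDOWS)
mutants = linker_scan(region)

for name in ("i01", "i10", "i11", "i16"):
    start, end, seq = mutants[name]
    print(name, f"{start}-{end}", "branch point intact:", "TACTAAC" in seq)
```

In this reconstruction, the i11 substitution converts the branch point sequence TACTAAC to TACTGAG, whereas i10 replaces its 5’ portion.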

4.3.1 Essential sequences of the LSM7 intron in regulating the level of mature LSM7 transcript

Assuming none of the species in mRNA splicing (unspliced mRNA → spliced mRNA + intron lariat) is rapidly degraded, the amount of mature mRNA produced equals the amount of intron spliced and the amount of template for protein synthesis. Since expression levels of the LSM genes can be affected by both the LSM7 intron and the Lsm7 protein, the levels of mature LSM7 transcripts in all lsm7 intronic mutants were measured. This was achieved using specific primers that probed for spliced transcripts of LSM7 to monitor the level of LSM7 intron produced and the potential amount of Lsm7 protein synthesized.

As expected, production of mature LSM7 transcript was affected in the mutants with mutations of the intronic sequences under both growth conditions. Expression of mature LSM7 transcript was almost undetectable in strains with mutations of the branch point (i10), the 3’ splice site (i16), and the first 6 bp of exon 2 (i17), and was completely eliminated in the strain with mutation of the 5’ splice site (i01) (Figure 4.2 and Figure 4.3). Interestingly, mutation of the branch point from TACTAAC to TACTGAG (i11) produced a less severe phenotype, indicating either that the 3’ section of this motif is less important in branch point formation or that the mutation had less effect on the cells.

The level of mature LSM7 mRNA was not strongly affected by mutations in the region surrounding the branch point (i08, i09 and i12). However, it was repressed in cells with mutations in the region between the branch point and the 3’ splice site (i13 and i15) to a level similar to that of the branch point mutant i11, indicating that sequences in this region are likely required for proper splicing and/or transcription of LSM7 pre-mRNA. Production of the mature LSM7 transcript was found to be not only critically dependent on the first 6 bp of exon 2 immediately next to the intron-exon junction (i17), but also moderately dependent on the following 12 bp (i18 and i19) of exon 2.

There is a striking difference in the effect seen for mutations in the 5’ region of the intron under different growth conditions. In particular, alteration of the sequences by mutations covered by i03 to i06 led to an induced level of mature LSM7 transcript under glucose and a decreased level under acetate. The level of mature LSM7 transcript was also differentially affected in cells containing mutations on the surrounding regions of the branch point (i09 and i12) and the exon 2 (i18 and i19). The least important sequence in the regulation of LSM7 was found in the central region of the intron (i08 and to a lesser extent i09). This region shows relatively low sequence conservation across the most closely related species.


Figure 4.2 Essential sequences of the LSM7 intron in regulating the level of mature LSM7 transcript. Expression levels of mature LSM7 transcripts in mutants with 6 bp substitutions in the LSM7 intron during exponential growth under glucose and acetate were measured with qRT-PCR using SIR3 as a reference control. Error bars represent the standard deviation of the mean transcript level of each gene from duplicate qRT-PCR on at least three biological replicates. The schematic diagram represents the sequence of LSM7 recognized by primers specific to the spliced transcript of LSM7 in qRT-PCR.

4.3.2 Expression patterns of LSM genes in response to mutations in the intron under different growth conditions

To determine whether mutations in the LSM7 intron had wider transcriptional effects, the levels of the other LSM gene transcripts were quantified by qRT-PCR. Complete disruption of the synthesis of mature LSM7 transcript by mutation of the 5’ splice site (i01) resulted in expression patterns of the other LSM genes that were almost identical to those produced in the lsm7Δ and iO strains (Figure 4.3 and Figure 4.4 A and B). Similar transcription profiles of the LSM genes were observed in the branch point mutant (i10) in both growth conditions and in the i11, i13 and i15 strains grown on acetate (Figure 4.3 and Figure 4.4 C and D), which carry mutations that also reduced the level of mature LSM7 transcript. These observations indicate that expression of the LSM genes in these strains was more likely to have been affected through failure to generate sufficient Lsm7 protein. These findings are consistent with the data from lsm7 mutants, in which transcript levels of LSM genes were more dependent on the Lsm7 protein than on the intron for cells grown on acetate.

Interestingly, expression profiles of the LSM genes were affected to different extents by mutations in the different splicing elements. Expression of the LSM genes was consistently down-regulated in cells with mutations of the 3’ splice site (i16) and the start of exon 2 (i17, i18 and i19) under both growth conditions (Figure 4.3 and Figure 4.4 G and H). The effect was more intense than those caused by deletion of LSM7 (lsm7∆) or deletion of the coding exons (iO), indicating that the response was not solely due to the absence of a functional Lsm7 protein in the cell.

As was observed for lsm7 mutants in the previous chapter, regulation of the LSM genes depended not only on the LSM7 coding exons and the intron but also on the nature of the carbon source for growth of the cells. In this study, expression patterns of the LSM genes in several of the LSM7 intron mutants varied significantly under different growth conditions. In particular, mutations of the region between the 5’ splice site and the branch point (i03, i04, i05 and i06) led to a strong induction of all LSM genes when the cells were grown on glucose and a repression of LSM genes under acetate supply in comparison with those of WT (Figure 4.3 and Figure 4.4 E and F). Although expression of LSM7 itself was affected by the same mutations, whereas it was shown to be unaffected by complete deletion of the intron, the reversed changes in the expression patterns of LSM genes indicated a possible mechanism by which the LSM genes are regulated through the LSM7 intron in response to different growth conditions. Furthermore, the repression of LSM genes observed in cells grown on acetate was very different from that caused by deletion of LSM7 (lsm7∆) or the coding exons (iO), indicating that the response was not solely a result of a decrease in Lsm7 protein.

Interestingly, mutations in the regions immediately surrounding the branch point showed changes in LSM gene expression that were independent of the synthesis of Lsm7 protein. The expression of several LSM genes was slightly but significantly altered in i08 and i09 grown on glucose and in i08 and i12 grown on acetate, while the level of mature LSM7 transcript was not significantly affected (Figure 4.3 and Figure 4.4 I and J), indicating that these regions may be important for the intron in the control of other LSM genes without a change in expression of LSM7 under different growth conditions. However, the expression patterns of the LSM genes in these mutants did not resemble those of ∆i, indicating that these mutations have less effect than other factors of the intron in the regulation of LSM gene expression.


Figure 4.3 Expression levels of the LSM genes of each intron mutant and the corresponding 6 bp substitutions. Expression levels of LSM genes in LSM7 intron mutants grown on glucose (upper panel) and acetate (lower panel) were measured by qRT-PCR using SIR3 as a reference control. Undetectable levels of transcript are shown in grey, while the expression levels of LSM transcripts in each strain are indicated with the two-color heat-map shown at the top of the figure. The 96 bp LSM7 intron sequence and part of the flanking coding sequence (highlighted in grey) are shown at the top of the mutation sequence series. Splicing elements are indicated in red. The substituted nucleotides (SacI linker) are shown in green and the unchanged nucleotides are indicated in black.
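The linker-scanning design described in this caption, in which successive 6 bp windows are replaced by a SacI linker, can be sketched as follows. This is an illustration only: the window names, the function name and the choice to tile from the first base are hypothetical, and the example uses the DNA form of the 24 nt region discussed in the text rather than the full 96 bp intron. The 6 bp SacI recognition site is GAGCTC.

```python
SACI_LINKER = "GAGCTC"  # SacI recognition site (6 bp)


def linker_scan(sequence, linker=SACI_LINKER):
    """Return {window name: mutant sequence}, replacing successive windows with the linker."""
    step = len(linker)
    mutants = {}
    # Tile the sequence in non-overlapping windows of the linker length
    for index, pos in enumerate(range(0, len(sequence) - step + 1, step), start=1):
        name = f"win{index:02d}"  # hypothetical label, not the thesis strain name
        mutants[name] = sequence[:pos] + linker + sequence[pos + step:]
    return mutants


# DNA form of the 24 nt region between the 5' splice site and the branch point
region = "TCTTATTTTCTTCCGTGGCAATAA"
mutants = linker_scan(region)

for name, seq in mutants.items():
    print(name, seq)
```

Each mutant keeps the original length, so the spacing between the splicing elements is preserved while the local sequence is altered.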


Figure 4.4 Effect of mutations in the LSM7 intron on expression of the other LSM genes in cells grown on glucose or acetate. Expression levels of the LSM genes in the LSM7 intron mutants grown on glucose and acetate were measured by qRT-PCR using SIR3 as a reference. Expression profiles of LSM genes in strains with 6 bp substitutions of the conserved splice elements (A and B), the regions required for generating mature LSM7 transcript (C and D), the regions important for regulation of LSM genes in response to different growth conditions (E and F), the first 18 bp region of exon 2 (G and H) and the regions showing less effect on the level of mature LSM7 transcript (I and J) are plotted in separate graphs for clarity.

4.3.3 Essential sequences of the LSM7 intron in regulation of LSM genes

Expression of the LSM genes was affected by the LSM7 intronic mutations to different extents in the two growth conditions. Accordingly, the relative expression pattern of the LSM genes of each intron mutant in response to growth on acetate was found to be distinct from those of the lsm7 mutants (Figure 4.5). Only the patterns of LSM genes in the i01 strain (5’ splice-site) and the i11 strain (branch point) showed any similarity in their expression profiles to that of ∆lsm7. This is to be expected since removal of the 5’ splice-site or removal of the branch point will lead to a message that cannot be translated into a functional Lsm7 protein. However, these also varied from the profile of the iO strain (Figure 4.6), indicating that the loss of normal regulation in response to different metabolic states in these mutants was not only due to the failure to generate a functional Lsm7 protein by removal of the intron but also due to the lack of regulation mediated by the intron itself.

The requirement for the spliced LSM7 intron in this model of intron-mediated regulation is also supported by the results from the i10 (branch point) and i16 (3’ splice site) mutants as well as other mutants with alterations of the sequences important for generating mature LSM7 transcript (i13 and i15). Each mutation led to an alteration in the pattern of LSM transcripts to a different degree, including those with a similar level of mature LSM7. Specifically, the expression of LSM genes differed in response to each mutation of the splicing elements, although none of them resembled the full function of the intron, suggesting that the loss of normal regulation of LSM genes in these mutants was not completely due to a lack of Lsm7 protein, and may have involved failure in intron lariat formation and/or complete or efficient splicing.

The most profound differences in carbon-source regulation of the LSM genes were seen in the strains harboring the i03, i04, i05 and i06 mutations. These changes in regulation were more intense than those caused by complete deletion of the intron. Although it was unclear whether the changes were controlled through regulation of the splicing of LSM7, these data indicate that a sequence encoded in this region (UCUUAUUUUCUUCCGUGGCAAUAA) is important for the response to a different growth condition. This sequence is unlike the sequence at the equivalent locations in the other intron-containing spliceosomal genes LSM2, SMD2, MUD1 and YSF3.

In contrast, the strong response observed in strains with mutations in the sequences of exon 2 (i17, i18 and i19) was due to repression of LSM genes in both carbon sources, indicating that the alterations were more likely related to defects in making functional Lsm7. For these mutations, the relative expression patterns of LSM genes were quite distinct from that of the iO strain, indicating that the sequences were also required for normal regulation of LSM genes. From amino acid sequence alignments between the yeast Lsm3 (the only yeast Lsm protein with a solved crystal structure) and the Lsm7 protein, as well as sequences of other Lsm and Sm proteins from organisms ranging from human to archaebacteria (Naidoo et al., 2008), the region covered by the i17, i18 and i19 strains contributes to the N-terminal region of the Lsm7 protein, which is not known to form any α-helix or β-sheet structures. The N- and C-termini of the Lsm1 and Lsm8 proteins have been shown to play important roles in protein localization (Reijns et al., 2008). The C-terminal asparagine-rich domain of the Lsm4 protein is also required for localization of Lsm1–7 to the P-body (Reijns et al., 2008) and for P-body assembly (Decker et al., 2007). In addition, complex formation of Lsm proteins is essential for correct localization (Spiller et al., 2007b; Tharun et al., 2005). The altered response of LSM genes in these strains may be due to production of Lsm7 protein with a disrupted N-terminus that is required for facilitating protein complex formation or localization in combination with parts of other Lsm proteins. Confirmation requires studies of the Lsm7 protein structure and protein-protein interactions with other Lsm proteins in relation to Lsm complex localization.


Figure 4.5 Expression ratios of LSM genes in each 6 bp linker-scanning mutant strain of the LSM7 intron in response to growth on acetate. The 96 bp LSM7 intron sequence of S. cerevisiae and part of the flanking coding sequence (highlighted in grey) are shown at the top of the mutation series. Splicing elements are indicated in red. The substituted nucleotides (SacI linker) are shown in green and the unchanged nucleotides are indicated in black.

Figure 4.6 Hierarchical clustering of relative expression levels of the LSM genes in LSM7 intron linker-scanning mutants in response to acetate. Expression ratios of LSM transcript levels of each mutant in response to acetate were clustered with MeV (v4.6.1) of the TM4 microarray software suite (Saeed et al., 2006).
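To illustrate what hierarchical clustering of expression ratios does, the following is a minimal sketch of agglomerative clustering. The distance metric (1 minus Pearson correlation) and the average linkage rule are assumptions chosen for illustration; the settings used in MeV for this figure are not stated here. The strain labels and ratio values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """Distance between two profiles as 1 - Pearson correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / norm


def average_linkage(profiles):
    """Merge the two closest clusters repeatedly; return the list of merges."""
    clusters = {name: (name,) for name in profiles}  # label -> member names
    merges = []
    while len(clusters) > 1:
        def dist(pair):
            # Average of all pairwise distances between members of two clusters
            a, b = pair
            ds = [pearson_distance(profiles[x], profiles[y])
                  for x in clusters[a] for y in clusters[b]]
            return sum(ds) / len(ds)

        a, b = min(combinations(sorted(clusters), 2), key=dist)
        merges.append((clusters[a], clusters[b], dist((a, b))))
        merged = clusters.pop(a) + clusters.pop(b)
        clusters["+".join(merged)] = merged
    return merges


# Hypothetical log2 expression ratios (acetate vs glucose) across four LSM genes
profiles = {
    "strain_A": [1.0, 2.0, 3.0, 4.0],
    "strain_B": [1.1, 2.1, 2.9, 4.2],
    "strain_C": [4.0, 3.0, 2.0, 1.0],
}
merges = average_linkage(profiles)
```

Strains whose profiles rise and fall together are joined first, which is why mutants with similar regulatory defects are expected to appear on neighbouring branches of such a tree.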


4.4 Secondary structure prediction of the LSM7 intron

Studies on functional intronic RNAs from various organisms including plants and yeast have revealed that functional RNA species can be released from the intron lariat or the debranched intron through various post-splicing cleavage events (Brown et al., 2008; Rearick et al., 2010). Non-coding miRNAs derived from short intron species from D. melanogaster and C. elegans could also be released by a splicing reaction that bypasses enzyme-dependent processing (Ruby et al., 2007). Although the machinery of RNA interference is absent from S. cerevisiae while retained in other yeast species (Drinnenberg et al., 2009), the function of the LSM7 intron in LSM gene regulation was found likely to be splicing-dependent since the splicing elements are required for full function of the intron. To further explore the possibility of a functional RNA involved in such regulation, the secondary structure of the LSM7 intron was predicted using the RNAfold program of the Vienna RNA package (Gruber et al., 2008). The minimum free energy (MFE) secondary structure of single-stranded sequences was predicted using the dynamic programming algorithm (Zuker & Stiegler, 1981).
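The dynamic programming idea behind such predictions can be sketched with a much simpler stand-in. The example below is not the Zuker and Stiegler energy minimisation used by RNAfold: it is the Nussinov base-pair maximisation recurrence, which ignores stacking and loop energies and only counts the largest number of nested pairs. The allowed pairs (including G-U wobble), the minimum hairpin loop length of three and the function name are assumptions made for illustration.

```python
# Watson-Crick pairs plus G-U wobble pairs
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs (Nussinov recurrence, no energy model)."""
    n = len(rna)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]  # best[i][j]: max pairs within rna[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):
                if (rna[k], rna[j]) in PAIRS:  # case 2: base k pairs with base j
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# The 24 nt region between the 5' splice site and the branch point, from the text
region = "UCUUAUUUUCUUCCGUGGCAAUAA"
pairs_in_region = max_base_pairs(region)

# A toy hairpin: three G-C pairs closing an AAA loop
toy_hairpin_pairs = max_base_pairs("GGGAAACCC")
```

An energy-based method fills a table of the same shape but scores each candidate structure by free energy contributions rather than by a simple pair count.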

Although it was unclear whether further processing is targeted to the intron lariat, the full-length RNA sequence of the LSM7 intron was predicted to form a structure with several loops (Figure 4.7). Interestingly, one loop structure appeared to be formed by the sequences that were important for regulation of LSM genes in response to different growth conditions, indicating a possible site or structure that could interact with other regulatory elements (RNA or protein) to mediate such control. This structure is unique among the introns of other intron-containing spliceosomal genes, including LSM2, SMD2, MUD1 and YSF3, predicted using the same program. Note that no in silico structure prediction tools are currently available for intron lariats and prediction can only be made for linear RNA; the prediction was therefore restricted to the debranched intron and pre-mRNA. In addition, the predictive power of the program is limited and does not account for other factors in vivo. Further empirical studies are required to confirm these structure predictions.

Figure 4.7 Secondary structure of the LSM7 intron. The minimum free energy structure was predicted with RNAfold of the Vienna RNA package (Gruber et al., 2008). The free energy of the thermodynamic ensemble is -6.82 kcal/mol at 30°C. Conserved splicing signals are represented in red. The sequence shown to be important for the regulation of the LSM genes is indicated in blue.


4.5 Discussion

Hemiascomycetous yeast have experienced a massive reduction in introns and in numerous genes involved in splicing (Aravind et al., 2000; Bon et al., 2003; Fabrizio et al., 2009). Only around 5% of the roughly 6000 genes in S. cerevisiae contain an intron and only ten genes contain two introns (Davis et al., 2000; Juneau et al., 2007; Miura et al., 2006; Spingola et al., 1999). However, LSM7 homologues of several distantly related yeast species have consistently retained an intron near the 5’ end of the open reading frame, consistent with a study of evolutionarily conserved genes across 19 eukaryotes indicating that highly conserved genes preferentially retain intron sequences (Carmel et al., 2007). The sequence of the intron was highly conserved in yeast species closely related to Saccharomyces cerevisiae, indicating that the intron was retained for a selective advantage.

In this study, site-directed mutagenesis of the LSM7 intron was used to elucidate the sequences or regions of the LSM7 intron that are important in mediating the regulation of LSM genes observed in the previous chapter. Consistent with the well-studied mechanism of the splicing reaction (Krummel et al., 2010; Wang & Burge, 2008), the conserved splicing elements were found to be essential for generating mature LSM7 transcripts. The sequences between the branch point and the 3’ splice site as well as the intron-exon junction region of exon 2 were also required for generating mature LSM7 transcripts and may be involved in controlling splicing efficiency or transcription of the LSM7 gene itself. Further splicing assays are required to determine whether the decreased level of mature LSM7 transcript in these strains was due to splicing and/or transcription.

Mutants disrupted in the splicing elements of the LSM7 intron were found to be unable to restore the normal expression pattern of LSM genes that appears to be mediated by the LSM7 intron. Transcript levels of LSM genes were also altered in other intronic mutants affecting the level of mature LSM7 transcript to different extents. Interestingly, each of the mutations produced a different alteration to the expression profile of the LSM genes, indicating that the regulation was intron sequence-dependent and more complex than expected. The region of the LSM7 intron does not overlap with any promoter or intergenic region of any other known open reading frame on the sense or antisense strand, and therefore the mutations were unlikely to have their effect directly through other genes.

A region between the 5’ splice site and the branch point (UCUUAUUUUCUUCCGUGGCAAUAA) was shown to be important in regulating the expression of LSM7 and the other LSM genes under the two different growth conditions used in this study. In cells containing mutations in this sequence, the LSM genes were up-regulated when glucose was used as the carbon source and suppressed in cells grown on acetate, suggesting that the sequences were not only required for the transcriptional regulation of LSM7 but also other LSM genes in adaptation to different growth conditions.

Many introns in S. cerevisiae have been shown to play a role in transcriptional regulation under specific conditions through control of splicing, by encoding specific sequences that are recognized by specific splicing elements or by sequences of adjacent exons to form specific structures (Meyer & Vilardell, 2009). It is unclear whether the LSM genes were affected through the 24 nt sequence independently of its effect on the regulation of expression of the LSM7 gene. However, the sequence was predicted to form a loop in the secondary structure of the debranched intron, suggesting a possible mechanism of interaction with other regulatory elements or proteins.


Many functional RNA species from yeast to human have been shown to be derived from intron lariats or debranched introns (Brown et al., 2008; Louro et al., 2009; Mattick & Makunin, 2006; Rearick et al., 2010). Although only a minority of snoRNAs are contained in yeast introns and the machinery of RNA interference is absent from S. cerevisiae (Drinnenberg et al., 2009), non-coding miRNAs derived from short intron species from D. melanogaster and C. elegans were found to be released by a splicing reaction that bypasses enzyme-dependent processing (Ruby et al., 2007).

In addition to the observations from Chapter 3, various other experiments are required to elucidate the mechanism of the LSM7 intron-mediated regulation, including:

• Analysis of LSM gene expression in other LSM gene deletants, such as the essential LSM2 gene, which also contains an intron, to verify whether the regulation is specific to LSM7.

• Creation of mutations within the intron of the iO construct to confirm that the regulation is intron sequence-dependent and to further verify regulatory elements that are independent of Lsm7 protein.

• Transformation of the lsm7∆ and ∆i mutants with a plasmid containing the iO construct expressed under the native LSM7 promoter or an inducible promoter, to verify whether the trans-regulation of LSM genes requires native promoter elements and/or correct dosage of the LSM7 intron.

• Assays of the splicing capacity of other intron-containing genes in the ∆i and intronic mutants to verify whether alteration in expression of the LSM genes affects the splicing function of the cells.

• Pull-down assays to find potential binding targets of the intron (protein or RNA).

• Northern blots or deep sequencing of transcripts to assess potential regulatory ncRNA species from the LSM7 locus, especially from the intron.

• Assay of the capacity of the LSM7 gene to be spliced in the LSM7 intronic mutants, especially those with mutations affecting levels of mature LSM7 transcript, to verify whether the changes in transcript levels of LSM genes were due to a change in splicing efficiency or transcription rate.


Global transcriptional profiling in response to deletion of the LSM7 intron

Chapter 5. Global transcriptional profiling in response to deletion of the LSM7 intron

5.1 Introduction and aims

The data presented in the previous chapters established that the LSM7 intron has some role in the regulation of LSM gene expression. While the mechanism is still unclear, it is possible that many other genes may be affected in response to the loss of regulation of the LSM genes, particularly spliceosomal genes and intron-containing genes, as a result of disturbing the mechanism and efficiency of the splicing reaction. On the other hand, there are many other genes whose products may affect the regulation of LSM genes.

Therefore the work presented in this chapter aimed to characterize the genome-wide transcriptional response of yeast to the deletion of the LSM7 intron in order to help elucidate the function of the intron and the potential mechanisms involved.

Genome-wide expression data of the ∆i mutant grown on rich media containing either glucose or acetate were compared with those of the WT strain under the same growth conditions. Since the only difference between these strains was the presence or absence of the LSM7 intron, both would be producing functional Lsm7 protein. The resulting transcriptional profiles provided a panoramic view of the cellular functions affected by the presence of the intron in LSM7 and differences in its activity, even though not necessarily in a direct way. Since a group of mating-type genes was found to be affected in the course of the analysis, the mating-related functions of the lsm7 mutant strains were investigated further.


5.2 Principal component analysis (PCA) of microarray expression data

In order to investigate the genome-wide transcriptional response to the deletion of the LSM7 intron in cells grown on different carbon sources, Affymetrix DNA microarrays were used to examine transcript levels of wild-type and ∆i strains. Cells were grown on rich media containing either glucose or acetate as the carbon source and RNA samples were harvested at mid-exponential phase (OD600 = 0.5 ± 0.05). Samples were processed for microarray analysis at the Ramaciotti Centre for Gene Function Analysis (UNSW, Australia). The raw data were processed, normalized and analyzed as described in Section 2.5.4 and later in this chapter. All data are provided in the Appendices on the attached CD.

Principal components analysis (PCA) of expression data was performed using the Partek® Genomics Suite™ (version 6.4) to determine major effects influencing the global expression values in the study. PCA is a linear projection that transforms expression data into a coordinate system in which the greatest variance of the data lies on the first coordinate (PCA 1, the first principal component) and the other variances on the second (PCA 2) or third coordinates according to the degree of contribution to the variance in the data.
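The projection described above can be made concrete with a minimal sketch that finds the first principal component of a small data set and the fraction of total variance it explains. It uses power iteration on the sample covariance matrix, which is an illustrative choice and not necessarily how the commercial package computes PCA. The expression values and the function name are hypothetical.

```python
from math import sqrt


def first_principal_component(rows, iterations=200):
    """Return (unit direction, fraction of total variance) for the first PC."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    centred = [[r[d] - means[d] for d in range(dims)] for r in rows]
    # Sample covariance matrix of the variables (columns)
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(dims)]
           for a in range(dims)]
    vec = [1.0] * dims  # starting guess; must not be orthogonal to the first PC
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(dims)) for a in range(dims)]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the variance along the converged direction
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(dims))
                     for a in range(dims))
    total = sum(cov[d][d] for d in range(dims))
    return vec, eigenvalue / total


# Hypothetical expression values: four samples (rows) by three genes (columns)
samples = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]
direction, explained = first_principal_component(samples)
```

Here `explained` plays the role of the percentage of variation attributed to PCA 1; subsequent components would be obtained by repeating the procedure after removing the variance along the first direction.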

In this analysis, 69.6% of the variation in the samples occurred in the first three principal components (Figure 5.1). Samples from glucose (pink) and acetate (yellow) formed distinct clusters on different dimensions. The type of culture condition was the most significant effect, represented by PCA 1 on the X-axis and contributing 39.3% of the variation. A smaller yet clear separation of the biological duplicates of wild-type (blue) and ∆i (red) under each culture condition contributed 18.3% of the variation (PCA 2), represented by the Y-axis. Moreover, the analysis indicated that none of the samples were extreme outliers, which would be indicative of poor quality RNA or a flawed hybridization of one or more of the samples.

Figure 5.1 Principal Components Analysis (PCA) plots of microarray data. Variation of expression data of biological duplicates of wild-type (blue) and ∆i (red) grown on glucose (pink ellipse) and acetate (yellow ellipse) media was analyzed by PCA. 69.6% of the variation in samples was captured by the first three principal components, two of which are represented by the X-axis (PCA 1, 39.3%) and the Y-axis (PCA 2, 18.3%).

117

Global transcriptional profiling in response to deletion of the LSM7 intron

5.3 Differentially expressed genes in response to the LSM7 intron deletion.

Having determined that the data were of adequate quality and identified the principal determinants contributing to variance, two-way ANOVA (Analysis of Variance) was performed with Partek® Genomics Suite™ (version 6.4) in order to identify genes that showed significant changes in expression. The sources of variation identified in the ANOVA model were in good agreement with the PCA plot (Figure 5.2), in which the culture condition (media) had the most significant effect on the expression data. A smaller, yet significant contribution to variation was from cell types (strains). There also appeared to be a significant contribution from the strain-by-media interaction (strains*media), representing the difference between the strains that depended on the media.

Figure 5.2 Source of variation plot of two-way ANOVA. Each bar indicates the average ratio to error (noise) for each factor in the ANOVA model used in this study. The media effect was by far the largest source of variation in the expression data (blue). The difference between the strains contributed the second largest source of variation (red). The difference between the strains depending on the media (strain-by-media interaction), represented by the green bar, was the smallest significant source of variance in the ANOVA model.
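The ratio to error shown for each factor corresponds to an F ratio, that is, the mean square of a factor divided by the mean square of the residual noise. A minimal sketch for a single gene in a balanced two-strain by two-medium design with duplicates is given below. The expression values, the strain label `delta_i` and the function name are hypothetical, and the model is a standard fixed-effects two-way ANOVA with interaction, assumed here for illustration.

```python
from statistics import mean


def two_way_anova(cells):
    """cells[(strain, medium)] -> replicate values (balanced design).

    Returns the F ratio for strain, medium and the strain-by-medium interaction.
    """
    strains = sorted({s for s, _ in cells})
    media = sorted({m for _, m in cells})
    reps = len(next(iter(cells.values())))
    grand = mean(v for vals in cells.values() for v in vals)
    strain_mean = {s: mean(v for m in media for v in cells[(s, m)]) for s in strains}
    medium_mean = {m: mean(v for s in strains for v in cells[(s, m)]) for m in media}
    cell_mean = {key: mean(vals) for key, vals in cells.items()}

    # Sums of squares for main effects, interaction and residual error
    ss_strain = reps * len(media) * sum((strain_mean[s] - grand) ** 2 for s in strains)
    ss_medium = reps * len(strains) * sum((medium_mean[m] - grand) ** 2 for m in media)
    ss_inter = reps * sum(
        (cell_mean[(s, m)] - strain_mean[s] - medium_mean[m] + grand) ** 2
        for s in strains for m in media
    )
    ss_error = sum((v - cell_mean[key]) ** 2 for key, vals in cells.items() for v in vals)

    ms_error = ss_error / (len(strains) * len(media) * (reps - 1))
    return {
        "strain": ss_strain / (len(strains) - 1) / ms_error,
        "medium": ss_medium / (len(media) - 1) / ms_error,
        "strain*medium": ss_inter / ((len(strains) - 1) * (len(media) - 1)) / ms_error,
    }


# Hypothetical log2 expression values for one gene, two biological replicates per cell
cells = {
    ("WT", "glucose"): [8.0, 8.2],
    ("WT", "acetate"): [10.0, 10.2],
    ("delta_i", "glucose"): [8.5, 8.7],
    ("delta_i", "acetate"): [10.5, 10.7],
}
f_ratios = two_way_anova(cells)
```

Averaging such F ratios over all genes gives one bar per factor in a source of variation plot; in this toy example the medium effect dominates, the strain effect is smaller, and the interaction is negligible.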


Genes that were significantly differentially expressed (p-value < 0.01) in response to deletion of the LSM7 intron were determined using linear contrast analysis between expression data of ∆i and WT under each growth condition using ANOVA. Deletion of the intron significantly affected the expression of 186 genes in glucose-grown cells, whereas 230 genes were affected in acetate-grown cells. Genes that were affected by deletion of the LSM7 intron under only one growth condition and those affected under both conditions are listed in the Appendix, and the Venn diagram (Figure 5.3) shows the overlap. Seventy-four genes were affected by deletion of the LSM7 intron under both culture conditions. On the other hand, expression of 112 and 156 genes was uniquely altered in cells grown on glucose and acetate, respectively.

Figure 5.3 Genes influenced by the deletion of the LSM7 intron under glucose and acetate growth. The Venn diagram represents the number of genes showing significant expression variation (p<0.01) between ∆i and WT strains under both media conditions.
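The counts in the Venn diagram follow from simple set arithmetic on the two lists of significant genes, as the following sketch shows. The gene identifiers are placeholders sized to match the numbers reported in the text (186 genes on glucose, 230 on acetate, 74 in common); the function name is hypothetical.

```python
def venn_counts(set_a, set_b):
    """Sizes of the three regions of a two-set Venn diagram."""
    return {
        "only_a": len(set_a - set_b),
        "both": len(set_a & set_b),
        "only_b": len(set_b - set_a),
    }


# Placeholder identifiers, not real gene names
shared = {f"shared_{i}" for i in range(74)}
glucose = shared | {f"glu_{i}" for i in range(112)}
acetate = shared | {f"ace_{i}" for i in range(156)}

counts = venn_counts(glucose, acetate)
print(len(glucose), len(acetate), counts)
```

With real data the two input sets would be the genes passing the p < 0.01 threshold in each growth condition.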

5.3.1 Differentially expressed genes in response to the LSM7 intron deletion in both carbon sources.

In order to characterize potentially co-regulated genes in response to deletion of the LSM7 intron under the two growth conditions, hierarchical clustering of the 74 commonly affected genes was performed using the MultiExperiment Viewer (MeV, v4.6.1) (Saeed et al., 2006), which indicates the extent of expression differences (Figure 5.4). From this it can be seen that the majority of the genes (41 including two gene pairs that could not be resolved) were up-regulated as a consequence of deletion of the intron, while less than half of the genes (23 out of 74) were down-regulated under both growth conditions. Thirteen genes were affected differentially by the absence of the LSM7 intron under the two growth conditions, indicating that the changes in expression of these genes were also dependent on culture conditions.

Gene expression differences for wild-type and ∆i cells grown on acetate compared to those grown on glucose were also plotted in Figure 5.4 as a control and to identify changes due to growth on different media. Although the majority of the 74 genes in WT cells responded to different culture conditions, it should be noted that the apparent changes in expression were not necessarily statistically significant (Appendix 1). For example, in the case of the down-regulated gene IME4 and the up-regulated gene FUS3, the p-values for the data between acetate and glucose were 0.032 and 0.016 respectively and were therefore not statistically significant. Several genes that significantly responded to deletion of the intron in both growth conditions also showed a change in response to the different growth conditions compared to WT. For instance, genes such as MFα1 and STE5 were significantly down-regulated in response to growth on acetate in WT cells but less affected in cells lacking the LSM7 intron (Appendix 1).

To further characterize cellular functions affected by the deletion of the LSM7 intron, the 74 commonly affected genes were grouped according to their expression patterns and analyzed with the web-based FunSpec program (Robinson et al., 2002) (Table 5.1). Genes were classified on the basis of their molecular function, subcellular localization and biological process based on data derived mainly from the MIPS and GO (Gene Ontology) databases and further analyzed using the data from SGD. Genes with products involved in multiple cellular functions, subcellular locations or biological functions may appear more than once in the functional classification. Of particular interest, this analysis highlighted a group of genes involved in mating-type regulation that was enriched in the data set (Table 5.2). More than half of the up-regulated genes were found to function in mating, indicating that the LSM7 intron may play a role in the mating programming of the haploid cell.
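Whether a functional category is enriched in a gene list is commonly judged with a hypergeometric test, which gives the probability of seeing at least the observed number of category members when drawing the list at random from the genome. The sketch below assumes this model for illustration. Only the approximate genome size (roughly 6000 genes) and the size of the up-regulated set (41) come from the text; the category size, the number of hits and the function name are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` belong to the category."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical category of 150 mating-related genes with 21 hits among 41 genes
p_value = hypergeom_enrichment_p(population=6000, annotated=150, sample=41, hits=21)
```

A very small probability indicates that the category is over-represented in the list far beyond what random sampling would be expected to produce.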

Contrary to expectation, the data indicated that splicing genes were not significantly affected at the genome-wide transcriptional level. The absence of LSM genes from the gene set may also be due to their very low expression levels, which would render them below the level of statistically significant detection by the microarrays. Almost none of the affected genes require splicing, and only one intron-containing gene, HMRa1, was down-regulated. On the other hand, expression of a spliceosomal gene, PRP8, was slightly repressed (-1.12 on acetate and -1.06 on glucose) due to deletion of the LSM7 intron. As mentioned earlier (Chapter 1), Prp8 is the largest and the most highly conserved protein in the spliceosome (62% identity between yeast and human). It plays an important role in splicing by interacting with the U4, U5 and U6 snRNAs and splicing elements in the pre-mRNA at the heart of the spliceosomal complex (Grainger & Beggs, 2005; Wachtel & Manley, 2009) to promote either of the two transesterification steps in splicing while inhibiting the other by changing the conformation of the spliceosome (Liu et al., 2007; Query & Konarska, 2004). While the changes in expression of PRP8 were relatively small, the LSM7 intron may be required for the regulation of PRP8 expression; however, this is unlikely to have any critical effect on the splicing reactions.


Figure 5.4 Hierarchical clustering of genes that were significantly differentially expressed in response to deletion of the LSM7 intron in cells grown on acetate and glucose. Fold-changes of the expression levels of the genes in ∆i relative to those of wild-type were clustered with MeV (v4.6.1) of the TM4 microarray software suite (Saeed et al., 2006) under both growth conditions. Signals that are recognized by more than one gene are indicated with both gene names. The gene expression differences for wild-type cells grown on acetate compared to those grown on glucose are also presented for these genes as a control to identify changes due to growth on different media. Insignificant changes are indicated in grey. The same data are also presented for the ∆i cells to identify changes in the normal response to growth on different media due to deletion of the LSM7 intron. Information such as p-values and the extent of expression changes of each gene under different growth conditions is presented in Appendix 1.

Table 5.1 Function categories of significantly up-regulated (A), down-regulated (B) and differentially regulated (C) genes in response to deletion of the LSM7 intron in both carbon sources. Genes were grouped according to their GO biological processes and molecular functions from FunSpec analysis. Genes involved in multiple functions were manually correlated with related biological processes derived from the SGD database. Other information and the extent of expression changes of each gene under different growth conditions are presented in Appendices 1 and 4.

(A) Up-regulated genes under both glucose and acetate growth

Function | Gene
pheromone-dependent signal transduction during conjugation with cellular fusion; cell cycle arrest; cell division; mating-type specific regulation; meiosis | FAR1 AGA1 AFR1 FUS1 FUS3 STE3 STE4 STE5 STE18 MFα1 MFα2 GPA1 PRM5 SAG1 RME1 SPR28 HUG1 AMN1 HO HMLα1 MATα1
metabolic process; oxidation reduction | AAD16 AAD4 AAD10 RIB4 PUT1 YKL107W
translation regulation | ANB1
transmembrane transport | HXT10 AUS1
chitin catabolic process | CDA2
anti-apoptosis | SNO1
response to pH | YFR012W
stress response; DNA damage responsive | DDR2 DDR48
unknown function | PHM7 YDR018C YGR201C YOR152C YKR104W

(B) Down-regulated genes under both glucose and acetate growth

Function | Gene
pheromone-dependent signal transduction during conjugation with cellular fusion; mating-type specific regulation; cell division; meiosis | HMRa1 BUD9 PCL7 CDC9 IME4
metabolic process; oxidation reduction | ALD5 LYS2 POS5 LEU1
cell wall organization and biogenesis | DSE4 HPF1
RNA modification; RNA polymerase III subunit C37 | RPC37
ribosome biogenesis; rRNA processing | UTP13
DNA metabolic process | CST6
nuclear mRNA 3'-splice site recognition; assembly of spliceosomal tri-snRNP | PRP8
phosphatidylinositol metabolic process | MSS4
chemotaxis | YIL169C
amino acid transport | BAP2 MMP1
secretory granule organization and biogenesis | GEA2
potassium ion import | QDR2
mitochondrial transport | OAC1
unknown function | YJL213W

(C) Differentially regulated genes under both glucose and acetate growth

Function | Gene
invasive growth in response to glucose limitation | DIA1
cell wall organization and biogenesis | ECM27 RCR1 YLR194C VAN1
translation regulation | RPL9A
mitochondrial transport | ODC1
fatty acid elongation | ELO1
unknown function | PNS1 TMS1 YLR108C YOR390W YPL279C

5.3.2 Genes involved in mating-type regulation are altered by the LSM7 intron deletion under both media conditions.

From the functional enrichment analysis, genes of mating-type regulation and pheromone-dependent signaling were highly over-represented in the set of genes whose expression was altered due to the absence of the LSM7 intron (Table 5.2). In S. cerevisiae, cells can stably exist as either diploids or haploids. The mating type of a haploid cell is determined by the a or α allele present at the mating-type locus, MAT (Bardwell, 2005). Each mating-type allele directs an allele-specific transcriptional program, including responses to the opposite mating-type pheromone through the cell-type specific pheromone receptor Ste2 or Ste3 (Galgoczy et al., 2004; Johnson, 1995). In addition, haploid cells of both mating types share a haploid transcriptional program which activates haploid-specific genes and represses diploid-specific genes. The experiments carried out here were done in haploid strains with the MATα background, and hence show an increased expression of the MFα1, MFα2 and other MATα-specific transcripts.

Mating in S. cerevisiae is initiated when cells of opposite mating types are in close proximity and secrete sufficient levels of mating-type specific pheromones. The pheromone-dependent signal transduction is coupled by a set of G proteins and a mitogen-activated protein kinase (MAPK) cascade (Figure 5.5) (Bardwell, 2005; Breitkreutz & Tyers, 2002; Chen et al., 2007; Roberts & Fink, 1994). In brief, transduction of a mating signal by the Fus3-dependent MAPK pathway leads to activation of the transcription factor Ste12, which promotes transcription of mating-specific genes. As a result, cells undergo several physiological changes in response to the opposite mating-type pheromone, including cell-cycle arrest at G1, formation of projections, and agglutination for cell-cell fusion.


Table 5.2 Functional enrichment of genes differentially expressed in response to deletion of the LSM7 intron under glucose and acetate growth conditions. Genes were grouped according to their MIPS functional classification (as indicated by the numbers in square brackets) derived from the FunSpec program, with their changes in expression relative to the WT. Genes appearing in multiple enrichments were grouped with related functions.

Function [MIPS category] | p-value*
pheromone response, mating-type determination, sex-specific proteins [34.11.03.07] | 3.85 x 10^-11
transmembrane signal transduction [30.05] | 4.87 x 10^-10
mating (fertilization) [41.01.01] | 1.23 x 10^-06
cell cycle arrest [10.03.01.02] | 1.81 x 10^-03
fermentation [02.16] | 1.6 x 10^-04
mRNA modification [11.06.03] | 1.181 x 10^-02

Gene | Fold-change (acetate) | Fold-change (glucose)
MFα1 | 2.19 | 1.40
MFα2 | 6.48 | 2.34
HMLα1//MATα1 | 3.01 | 1.80
STE3 | 3.03 | 1.84
STE4 | 1.98 | 1.55
STE5 | 2.09 | 1.46
STE18 | 3.12 | 1.74
GPA1 | 2.28 | 1.86
FUS1 | 2.77 | 2.36
FUS3 | 2.79 | 2.29
FAR1 | 2.62 | 2.06
HO | 2.41 | 1.56
AFR1 | 1.63 | 1.24
SAG1 | 3.63 | 2.33
AGA1 | 3.15 | 3.50
PRM5 | 1.42 | 2.12
RME1 | 1.52 | 1.51
SPR28 | 1.51 | 1.53
AMN1 | 1.42 | 1.61
HUG1 | 1.85 | 1.98
HMRa1 | -11.77 | -48.33
CDC9 | -1.15 | -1.11
BUD9 | -1.45 | -1.24
PCL7 | -1.32 | -1.19
AAD4//AAD16 | 1.23 | 1.22
AAD10 | 1.39 | 1.33
ALD5 | -2.01 | -1.49
IME4 | -10.62 | -12.89

* The p-value represents the probability that the enrichment of the 74 commonly affected genes in a given functional category occurred by chance.
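The meaning of such an enrichment p-value can be illustrated with a short calculation. The following is a minimal Python sketch, assuming the p-value is a hypergeometric tail probability, that is, the chance of finding at least the observed number of category members in a gene list drawn at random from the genome. The function name and all numbers in the example are hypothetical and are not taken from the analysis in this thesis.

```python
from math import comb


def enrichment_p_value(genome_size, category_size, list_size, overlap):
    """Hypergeometric tail probability P(X >= overlap).

    genome_size   -- total number of annotated genes (assumed background)
    category_size -- genes in the background that belong to the category
    list_size     -- number of genes in the list being tested
    overlap       -- genes in the list that belong to the category
    """
    upper = min(category_size, list_size)
    total = comb(genome_size, list_size)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(category_size, k) * comb(genome_size - category_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical illustration: 6000 background genes, 40 in the category,
# a list of 74 genes of which 8 fall in the category.
p = enrichment_p_value(6000, 40, 74, 8)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value under this model indicates that the overlap between the gene list and the category is unlikely to have arisen from random sampling alone.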


Figure 5.5 Mating pathway of S. cerevisiae and expression profile of mating-type genes in response to deletion of the LSM7 intron. Schematic representation of the mitogen-activated protein kinase (MAPK) cascade in yeast mating and the key components involved. MATα-specific genes such as MFα1, MFα2, HMLα1/MATα1 and STE3 as well as haploid-specific genes were induced (red box) in the Δi strain. Figure modified from Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/).

Many key components of the mating pathway were up-regulated in MATα cells lacking the LSM7 intron (Figure 5.5). MATα-specific genes, including the α-factor pheromone-encoding genes MFα1 and MFα2 as well as the gene for the a-factor pheromone receptor STE3, were co-regulated together with genes involved in the pheromone-dependent MAPK pathway (GPA1, STE4, STE18, STE5, FUS1, FUS3, FAR1) in response to deletion of the LSM7 intron. Expression of the mating-type specific genes AFR1, AGA1, SAG1 and PRM5, which are required for pheromone-induced projection, agglutination and plasma membrane fusion, was also induced, probably through the up-regulation of FUS1 (Bharucha et al., 2008; Doi et al., 1989; Heiman & Walter, 2000; Roy et al., 1991). In addition, as a result of induction of FAR1, expression of several cell-cycle regulated genes involved in meiosis (RME1, SPR28, IME4), mitosis (AMN1), DNA ligation (CDC9) or DNA repair at the cell-cycle checkpoint (HUG1) was also affected in Δi cells (Basrai et al., 1999; Covitz et al., 1991; De Virgilio et al., 1996; Shah & Clancy, 1992; Wang et al., 2003; Willer et al., 1999). In contrast, the silenced copy of the MATa-specific gene HMRa1 (one of only ten genes in S. cerevisiae that contain two introns) was the most down-regulated gene among all of the commonly affected genes (-11.77 fold on acetate and -48.33 fold on glucose).
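To compare up- and down-regulated genes on a common scale, signed fold-changes can be converted to log2 ratios. The sketch below assumes the common microarray convention that a negative fold-change of -x denotes an x-fold decrease, i.e. a mutant/wild-type ratio of 1/x; this convention is an assumption and is not stated explicitly in the text. The function names are hypothetical, and the example values are the fold-changes listed in Table 5.2.

```python
import math


def signed_fold_change_to_ratio(fold_change):
    """Convert a signed fold-change to a mutant/wild-type expression ratio.

    Assumes -x means an x-fold decrease (ratio 1/x) and +x an x-fold increase.
    """
    if abs(fold_change) < 1:
        raise ValueError("signed fold-changes are expected to have magnitude >= 1")
    return fold_change if fold_change > 0 else 1.0 / abs(fold_change)


def signed_fold_change_to_log2(fold_change):
    """Return the log2 expression ratio for a signed fold-change."""
    return math.log2(signed_fold_change_to_ratio(fold_change))


# (gene, fold-change on acetate, fold-change on glucose) from Table 5.2
examples = [("HMRa1", -11.77, -48.33), ("IME4", -10.62, -12.89), ("MFα2", 6.48, 2.34)]
for gene, acetate, glucose in examples:
    print(
        gene,
        round(signed_fold_change_to_log2(acetate), 2),
        round(signed_fold_change_to_log2(glucose), 2),
    )
```

On the log2 scale, repression and induction of the same magnitude are symmetric around zero, which makes the strong repression of HMRa1 and IME4 directly comparable with the induction of the MATα-specific genes.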

The mechanism whereby the LSM7 intron is involved in mating is unclear. One possibility is that the alteration in LSM gene transcript levels in the Δi strain affects the splicing of the two introns of the HMRa1 gene, thereby promoting the expression of MATα-specific genes in MATα haploid cells. However, it is not clear how a change in expression of a silent copy of the MATa1 gene in MATα cells could have this effect. Moreover, splicing was not significantly affected in the Δi strain, as previously discussed. On the other hand, suppression of HMRa1 may be a result of up-regulation of MATα-specific genes.

Interestingly, the gene encoding the HO endonuclease, which is required for initiation of mating-type conversion at the MAT locus, was slightly up-regulated (2.4-fold in acetate and 1.6-fold in glucose). The mating-type switch occurs exclusively in haploid mother cells at late G1 by replacing sequences at the MAT locus with sequences from either HML or HMR encoding the opposite mating allele. This is initiated through a double-stranded break generated by the HO endonuclease (Haber, 1998; Nasmyth, 1993). Transcription of HO is tightly regulated by several cis-regulatory elements at the promoter and by transcription factors such as Swi5 and the Swi4-Swi6 SBF complex (Andrews & Herskowitz, 1989). Up-regulation of HO was likely an indirect response to induction of FAR1, which promotes G1 cell-cycle arrest.

Besides HMRa1, expression of IME4 was strongly repressed in Δi cells under both growth conditions (-10.62 fold in acetate and -12.89 fold in glucose). Under nutrient-limiting conditions Ime4 activates meiosis in diploid cells for sporulation, and the expression of IME4 is repressed by its own antisense transcript in haploid cells (Hongay et al., 2006; Shah & Clancy, 1992). Although the down-regulation of IME4 may be an indirect result of up-regulation of haploid-specific genes, it should be noted that Ime4 exerts its mRNA [N6-adenosine]-methyltransferase (m6A) activity (including modification of the IME4 transcript itself) during sporulation and is important for regulating the initiation of meiosis (Bodi et al., 2010; Clancy et al., 2002). The LSM7 intron may be required to control Ime4 methylation activity during the early stages of sporulation.

Several components of the MAPK pathway (STE20, STE11, STE7, and STE12) are also required for pseudohyphal growth in diploid yeast cells, and invasive growth in haploid cells, under nutrient-limiting conditions (Breitkreutz & Tyers, 2002; Erdman & Snyder, 2001; Roberts & Fink, 1994). The phosphorylation cascade of filamentous growth is dependent on the MAP kinase Kss1, whereas the cellular response to pheromones in mating is mediated specifically by the MAP kinase Fus3. Therefore, given the significant involvement of the mating-type specific genes, the LSM7 intron is less likely to be involved in filamentous growth in response to nutrient limitation and more likely to be required for the mating program. Further investigations of mating-related functions in relation to the LSM7 intron are described in the following sections.

5.4 The LSM7 intron is required for efficient mating in MATα cells.

Since mating-type regulatory genes comprised the largest functional group in the microarray analysis, the question arose of whether mating functions are altered in cells without the LSM7 intron. To understand the function of the LSM7 intron in relation to mating, the mating efficiency of each lsm7 mutant was analyzed by measuring diploid formation from mating of each lsm7 mutant (MATα) with WT cells of the opposite mating type (MATa), as described in Section 2.6. Mating efficiency was defined as the ratio of diploid cells formed to the total number of lsm7 haploid cells used in mating, expressed as a percentage. Interestingly, mating efficiency was significantly decreased in the lsm7Δ and Δi strains and restored in a strain expressing the LSM7 intron without the Lsm7 protein-coding exons (Figure 5.6 A), indicating that the Lsm7 protein is less important for efficient cellular mating whereas the intron is required for normal mating function.
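The definition of mating efficiency given above can be written as a small calculation. This is a minimal Python sketch of that definition only; the function name and the colony counts are hypothetical and do not come from the experiments reported here.

```python
def mating_efficiency(diploid_count, haploid_count):
    """Mating efficiency as a percentage.

    diploid_count -- number of diploid cells (colonies) formed after mating
    haploid_count -- total number of lsm7 haploid cells used in the mating
    """
    if haploid_count <= 0:
        raise ValueError("haploid_count must be positive")
    if diploid_count < 0:
        raise ValueError("diploid_count must not be negative")
    return 100.0 * diploid_count / haploid_count


# Hypothetical counts for illustration only.
print(mating_efficiency(diploid_count=120, haploid_count=400))
```

Expressing the result relative to the number of input haploid cells allows strains plated at slightly different densities to be compared directly.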

Almost all of the key components in the yeast mating pathway were up-regulated as a result of deletion of the LSM7 intron, including the α-factor pheromone-encoding genes MFα1 and MFα2. To further explore the function of the LSM7 intron in regulating mating, α-factor production of each strain was tested on a lawn of the α-factor supersensitive strain (bar1Δ) of the opposite mating type. MATa cells lacking the Bar1 protease are unable to cleave the α-factor released from MATα cells and do not recover from α-factor induced cell-cycle arrest (Ballensiefen & Schmitt, 1997). The amount of α-factor released was represented by the diameter of the halo around the colony.

Consistent with the microarray data, production of α-factor pheromone was significantly increased, by ~25%, in cells lacking the LSM7 intron compared to the WT (Figure 5.6 B). However, production of α-factor pheromone was also induced to the same level in mutant cells lacking the coding exons or both the intron and the exons. This observation indicates that not only the intron but also the exons of LSM7 are required for regulation of pheromone production. However, the mating efficiency of the lsm7Δ or Δi strain was not enhanced by the overproduction of pheromone.

To further investigate the possible mechanism whereby the LSM7 intron was involved in decreased mating efficiency in Δi cells, the sensitivity of the lsm7 mutant strains towards the opposite mating-type pheromone was analyzed. Due to the unavailability of a-factor pheromone (which is difficult to synthesize since the 12-mer peptide is post-translationally prenylated and methylated (Bardwell, 2005; Chen et al., 1997; Michaelis et al., 1992) and the modifications are essential for its activity (Michaelis & Herskowitz, 1988)), lsm7 mutant strains were constructed in a BY4741 background (MATa) in order to measure cellular responses towards α-factor pheromone. Halo formation on a lawn of each lsm7 mutant strain from α-factor applied to a disc was measured as an indication of the pheromone sensitivity of each strain.

Contrary to expectation, cells lacking the Lsm7 protein but not the LSM7 intron were slightly more sensitive to α-factor pheromone (Figure 5.6 C), indicating that the intron is not required for the response to the opposite mating-type pheromone even though up-regulation of the pheromone receptor was observed in the Δi strain. Alternatively, the increased sensitivity to α-factor pheromone in cells lacking the Lsm7 protein may be a result of slower recovery from pheromone-dependent cell-cycle arrest, indicating that Lsm7 may be involved in mating regulation. However, the possibility of mating-type specific regulation by the LSM7 intron cannot be excluded.

Figure 5.6 Mating assays of lsm7 mutants. (A) Mating efficiency test. (B) Production of α-factor pheromone. (C) Response to α-factor pheromone. Error bars represent the standard deviation of the mean activity level of each strain from at least three biological replicates. A two-tailed t-test was performed; asterisks (*) indicate significant differences in the tested activity between lsm7 mutants and the WT or another mutant, as indicated.
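The comparison of replicate measurements between two strains can be sketched as follows. The text does not state which form of the t-test was used, so this minimal Python example assumes an unpaired test with unequal variances (Welch's t-test) and computes only the test statistic and its degrees of freedom using the standard library. The function name and the replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unpaired t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two replicates")
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical mating efficiencies (%) from three biological replicates each.
wild_type = [30.1, 28.4, 31.2]
intron_deletion = [12.3, 14.0, 11.1]
t_stat, df = welch_t(wild_type, intron_deletion)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

The two-tailed p-value follows from the t distribution with the computed degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(wild_type, intron_deletion, equal_var=False)` returns the same statistic together with that p-value.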


5.5 Discussion

Although studies have shown no major consequences for phenotype or growth under several stresses following deletion of most of the introns in yeast (Ng et al., 1985; Parenteau et al., 2008), genome-wide analysis of splicing and global surveillance of intron expression have suggested that introns are required to improve and regulate gene expression in yeast (Juneau et al., 2006; Juneau et al., 2007). Several meiosis genes and intron-rich ribosomal genes are regulated at the level of splicing during meiosis and for rapid response to environmental stresses, respectively (Davis et al., 2000; Juneau et al., 2007; Pleiss et al., 2007).

In this study, deletion of the LSM7 intron was shown to affect the transcript levels of a group of mating-type genes in cells grown on both glucose and acetate. Many components of the pheromone-dependent signaling pathway were up-regulated, including all of the genes involved in the MAPK cascade. Although the same MAPK pathway is required for filamentous growth under nutrient limitation in both haploids and diploids, expression of filamentous growth-specific genes was not significantly affected, while several mating-specific genes for cell projection, agglutination and cell-cell fusion were induced. The data from the mating efficiency assay of lsm7 mutants also strongly supported the requirement of the LSM7 intron for mating regulation. However, the involvement of the LSM7 intron in mating was not through control of pheromone production or of sensitivity towards the opposite mating-type pheromone.

The genes most affected by deletion of the LSM7 intron were HMRa1 and IME4; both encode products not required in MATα haploid cells and both were strongly repressed in comparison to other genes. Interestingly, HMRa1 contains two introns and was likely affected by an alteration in spliceosomal LSM genes caused by the absence of the LSM7 intron in the cells. However, the transcript levels of LSM genes, which are transcribed at a very low level, were not significantly affected, nor was any change seen in any of the other genes that contain two introns. Only one splicing factor gene, PRP8, was slightly down-regulated in response to deletion of the intron. Further studies on the splicing capacity of the Δi strain in relation to changes in the expression levels of HMRa1 and PRP8 are required to understand these observations.

IME4 is a key regulator in the initiation of meiosis and its expression is inhibited by its own antisense transcript in haploid cells (Hongay et al., 2006; Shah & Clancy, 1992). During sporulation, Ime4 acts as an [N6-adenosine]-methyltransferase (m6A) to methylate mRNA, including its own transcript, and is important in the regulation of meiosis (Bodi et al., 2010; Clancy et al., 2002). The function of m6A is not completely understood, but several studies have observed that the modification is required for efficient mRNA splicing and translation (Carroll et al., 1990; Clancy et al., 2002; Tuck et al., 1999). Many genes involved in meiosis are regulated at the level of splicing (Davis et al., 2000; Juneau et al., 2007). The relationship between the LSM7 intron and splicing regulation through control of the mRNA methyltransferase would be of great interest in determining the role and mechanism of the LSM7 intron.

These data provide insights into the possible mechanisms of cross-pathway regulation between splicing and mating through an intron of a spliceosomal gene. However, further studies are required to address the identity of the direct regulatory sensor in the process.


Chapter 6 Summary and perspectives

The aim of this thesis was to study the function and properties of the intron in the evolutionarily conserved gene LSM7 in the intron-poor yeast Saccharomyces cerevisiae. This work was based on a previous report that the intron may be involved in regulating other LSM genes in response to a different growth condition (Palmisano, 2006). Analysis of the expression of LSM genes under two growth conditions showed that the regulation of these genes was more complex than previously believed. Different sets of LSM genes were found to respond differently to the nutrient conditions, which may be important for the cell to meet the requirements of the various diverged RNA processing functions driven by the Lsm proteins. This suggestion can be further explored by analyzing the localization of Lsm proteins and the complexes they form in cells under different conditions.

The data from analysis of various lsm7 mutants that are affected in the LSM7 gene and/or its intron indicated that both the protein and the intron influence the expression levels of some other LSM genes. Interestingly, the LSM7 intron alone was able to affect the transcript levels of LSM genes in response to different growth conditions. Further studies on other LSM genes are now needed to confirm whether the observations were LSM7-dependent. A clear candidate for such study is the LSM2 gene, which also contains an intron. It is particularly striking that the intron was able to affect expression of LSM genes when it was inserted in a different gene (ADE1) at another locus. However, at this other locus the intron did not lead to the same patterns of gene expression in cells grown in glucose and acetate as seen in the wild-type strain.

The expression patterns of LSM genes were also different in cells with and without a functional Lsm7 protein. Experiments such as expressing the intron under the LSM7 promoter or an inducible promoter on a plasmid in cells lacking the LSM7 intron are now needed to further address the requirement for the native locus and the dosage response of such control. Nonetheless, the data presented here clearly show that the LSM7 intron can affect expression of other LSM genes in trans.

While many lines of evidence from various organisms have revealed that introns are an important source of functional ncRNAs (Brown et al., 2008; Louro et al., 2009; Mattick & Makunin, 2006; Rearick et al., 2010), little is known about intronic trans-regulatory elements in S. cerevisiae. Only a few snoRNAs that are involved in site-specific modification of rRNA and snRNA are derived from introns in yeast. Various factors including the lariat-debranching enzyme (Dbr1) and endo- and exonucleases are required to release snoRNA from the intron lariat or unspliced pre-mRNA (Bachellerie et al., 2002; Brown et al., 2008; Fatica et al., 2000; Ooi et al., 1998). Although many other ncRNAs are processed from introns in various organisms, the miRNA machinery is absent from S. cerevisiae but retained in other yeast species (Drinnenberg et al., 2009). Some non-coding miRNAs derived from small intron species in D. melanogaster and C. elegans can be released by a splicing reaction without further enzyme-dependent cleavages (Ruby et al., 2007). Therefore, the possibility that functional ncRNAs are retained in yeast introns cannot be excluded on the basis of a lack of processing machinery. Further studies are needed to assess transcribed species and unknown ncRNAs from the LSM7 locus; this can be achieved by Northern blotting or deep sequencing of transcripts.

Although introns appear to be being lost from the yeast genome (Fink, 1987) and studies have shown no major phenotypic consequences of deleting most of the introns under laboratory conditions (Ng et al., 1985; Parenteau et al., 2008), several examples of intron-dependent gene regulation have been demonstrated; most of these involve controlling production of the mature transcript through splicing. For instance, the product of the ribosomal gene RPL30 was found to autoregulate its expression by interacting with a specific structure formed by its pre-mRNA through the intron, and with the spliced transcript, to tightly regulate splicing and translation, respectively (Li et al., 1996; Meyer & Vilardell, 2009; Preker & Guthrie, 2006). On the other hand, splicing of several constitutively transcribed meiosis genes is activated only during meiosis by the meiosis-specific splicing factor Mer1 (Davis et al., 2000; Engebrecht et al., 1991; Juneau et al., 2007).

Sequential mutation analysis of the LSM7 intron in this thesis has identified the sequences and regions required for the intron-mediated regulation of LSM genes and confirmed that the control is intron sequence-dependent. More significantly, a region of the intron was found to be important in consistently regulating the level of the mature LSM7 transcript and other LSM transcripts, depending on growth conditions. While the region was not known to bind any protein, the sequences were predicted to form a loop in the secondary structure of the intron, providing a possible mechanism as a “regulator” of such control. Future mutagenesis of the intron in the iO strain is required to confirm the regulatory elements that act independently of the Lsm7 protein. A quantitative assay of the unspliced and total transcripts of the LSM7 gene in the LSM7 intron mutants can also be done to verify whether the regulation is splicing-dependent.

Since the splicing elements are required for the full function of the LSM7 intron and most of the LSM genes are essential for splicing, it is suspected that the LSM7 intron works as part of an autoregulatory control of splicing. This possibility can be studied by analyzing the capacity of the Δi and relevant intron mutation strains to splice other intron-containing genes, and confirmed by studying the expression levels of other splicing factors, such as Sm proteins and U6 snRNA.

The work presented in this thesis also included genome-wide transcript profiling of cells in response to deletion of the intron. The data provided strong evidence that the LSM7 intron has some role in mating regulation. Genes encoding many key components of the mating pathway, as well as several mating-specific genes for cell projection, agglutination and cell-cell fusion, were up-regulated in MATα cells lacking the LSM7 intron. An analysis of the mating efficiency of lsm7 mutants also strongly supported the requirement of the LSM7 intron for mating regulation and provided evidence for a phenotype resulting from deletion of the intron from the gene. However, the involvement of the LSM7 intron in mating was not through control of pheromone production or of the sensitivity of a cell towards the opposite mating-type pheromone.

While many genes involved in mating were up-regulated, the MATa-specific gene HMRa1 and the diploid-specific gene IME4, which both encode products not required in MATα haploid cells, were strongly repressed in response to deletion of the LSM7 intron. It is not clear if there is any requirement for the expression of these genes in MATα haploid cells, and the response may be due to up-regulation of MATα-specific genes and other haploid-specific genes. HMRa1 is a silent copy of MATa1; both genes contain two introns and require a specific splicing factor, Aar2, for splicing of their pre-mRNAs (Nakazawa et al., 1991). Interestingly, Aar2 is a component of the U5 snRNP and is able to interact directly with Prp8 during splicing (Boon et al., 2006; Gottschalk et al., 2001). Although the expression of one spliceosomal gene, PRP8, was found to be slightly repressed, splicing functions were not significantly affected at the genome-wide transcriptional level by the deletion of the LSM7 intron.


Although the microarray analysis did not show any significant variation in expression of LSM genes as observed in the qRT-PCR study, the results have clearly demonstrated the power of microarray analysis for identifying potential functions of an intron in yeast. These data provide insight into the possible mechanisms of cross-pathway regulation between splicing and mating through an intron of a spliceosomal gene. However, it should be mentioned that the genes regulated by the LSM7 intron, including LSM genes, were not found to contain any sequence complementary to the intron that could support a potential siRNA-like mechanism for such control. Future studies are required to address the identity of the direct regulatory sensor involved.

There have been frequent reports of disagreement between the results from the two powerful techniques (qRT-PCR and microarrays) used to study regulation of transcript levels, often related to the low precision of the microarray technique (Kothapalli et al., 2002; Morey et al., 2006). In this thesis, qRT-PCR data did show small but significant effects of mutating the LSM7 intron on the transcript levels of other LSM genes; it is likely that these changes were not detectable in the microarray analysis. Further analysis of the microarray observations is required, including confirmation of the expression of the most significantly changed genes by qRT-PCR, and/or a more thorough transcriptomic screen using a more advanced tool, RNA sequencing, for quantifying transcripts and their isoforms at higher resolution.

In summary, this study represents an advance in our understanding of a sophisticated level of regulation of gene expression involving a non-coding intron sequence acting in trans, in a system with relatively simple splicing machinery and genome complexity. It provides an insight into an additional level of gene regulation and provides a reason for the retention of some introns in S. cerevisiae, which during its evolution has lost most of the introns found in the genes of other organisms. The results also indicate that a further level of intron-dependent regulation may exist in higher eukaryotes.


References

Abelson J. (2008) Is the spliceosome a ribonucleoprotein enzyme? Nat Struct Mol Biol 15: 1235-1237

Abramoff M. D., Magelhaes P. J., Ram S. J. (2004) Image Processing with ImageJ. Biophotonics International 11: 36-42

Achsel T., Brahms H., Kastner B., Bachi A., Wilm M., Luhrmann R. (1999) A doughnut-shaped heteromer of human Sm-like proteins binds to the 3'-end of U6 snRNA, thereby facilitating U4/U6 duplex formation in vitro. EMBO J 18: 5789-5802

Adams A., Gottschling D. E., Kaiser C. A., Stearns T. (1997) Methods in yeast genetics: A Cold Spring Harbor Laboratory Course Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Ahnert S. E., Fink T. M. A., Zinovyev A. (2008) How much non-coding DNA do eukaryotes require? Journal of Theoretical Biology 252: 587-592

Amaral P. P., Dinger M. E., Mercer T. R., Mattick J. S. (2008) The eukaryotic genome as an RNA machine. Science 319: 1787-1789

Amaral P. P., Mattick J. S. (2008) Noncoding RNA in development. Mamm Genome 19: 454-492

Andrews B. J., Herskowitz I. (1989) Identification of a DNA binding factor involved in cell-cycle control of the yeast HO gene. Cell 57: 21-29

Aravind L., Watanabe H., Lipman D. J., Koonin E. V. (2000) Lineage-specific loss and divergence of functionally linked genes in eukaryotes. Proc Natl Acad Sci U S A 97: 11319-11324

Ardelt B., Ardelt W., Darzynkiewicz Z. (2003) Cytotoxic ribonucleases and RNA interference (RNAi). Cell Cycle 2: 22-24

Ares M., Jr., Grate L., Pauling M. H. (1999) A handful of intron-containing genes produces the lion's share of yeast mRNA. RNA 5: 1138-1139

Bachellerie J. P., Cavaille J., Huttenhofer A. (2002) The expanding snoRNA world. Biochimie 84: 775-790

Ballensiefen W., Schmitt H. D. (1997) Periplasmic Bar1 protease of Saccharomyces cerevisiae is active before reaching its extracellular destination. Eur J Biochem 247: 142-147

Bardwell L. (2005) A walk-through of the yeast mating pheromone response pathway. Peptides 26: 339-350

Baskerville S., Bartel D. P. (2005) Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. Rna 11: 241-247

Basrai M. A., Velculescu V. E., Kinzler K. W.,Hieter P. (1999) NORF5/HUG1 is a component of the MEC1-mediated checkpoint response to DNA damage and replication arrest in Saccharomyces cerevisiae. Mol Cell Biol 19: 7041-7049

Beggs J. D. (2005) Lsm proteins and RNA processing. Biochem Soc Trans 33: 433-438

Bejerano G., Pheasant M., Makunin I., Stephen S., Kent W. J., Mattick J. S., Haussler D. (2004) Ultraconserved elements in the human genome. Science 304: 1321-1325

Bessonov S., Anokhina M., Will C. L., Urlaub H.,Luhrmann R. (2008) Isolation of an active step I spliceosome and composition of its RNP core. Nature 452: 846-850

Bharucha J. P., Larson J. R., Konopka J. B.,Tatchell K. (2008) Saccharomyces cerevisiae Afr1 protein is a protein phosphatase 1/Glc7-targeting subunit that regulates the septin cytoskeleton during mating. Eukaryotic Cell 7: 1246-1255

Birney E., Stamatoyannopoulos J. A., Dutta A., Guigo R., Gingeras T. R., Margulies E. H., . . . de Jong P. J. (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799-816

Bodi Z., Button J. D., Grierson D., Fray R. G. (2010) Yeast targets for mRNA methylation. Nucleic Acids Res 38: 5327-5335

Boeke J. D., LaCroute F., Fink G. R. (1984) A positive selection for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoro-orotic acid resistance. Mol Gen Genet 197: 345-346

Bon E., Casaregola S., Blandin G., Llorente B., Neuveglise C., Munsterkotter M., . . . Gaillardin C. (2003) Molecular evolution of eukaryotic genomes: hemiascomycetous yeast spliceosomal introns. Nucleic Acids Res 31: 1121-1135

Bonen L. (1993) Trans-splicing of pre-mRNA in plants, animals, and protists. FASEB J 7: 40-46

Bonneaud N., Ozier-Kalogeropoulos O., Li G., Labouesse M., Minvielle-Sebastia L., Lacroute F. (1991) A family of low and high copy replicative, integrative and single-stranded S. cerevisiae/E. coli shuttle vectors. Yeast 7: 609-615

Boon K. L., Norman C. M., Grainger R. J., Newman A. J., Beggs J. D. (2006) Prp8p dissection reveals domain structure and protein interaction sites. RNA 12: 198-205

Bouveret E., Rigaut G., Shevchenko A., Wilm M., Seraphin B. (2000) A Sm-like protein complex that participates in mRNA degradation. EMBO J 19: 1661-1671

Brachmann C. B., Davies A., Cost G. J., Caputo E., Li J., Hieter P., Boeke J. D. (1998) Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14: 115-132

Breitkreutz A., Tyers M. (2002) MAPK signaling specificity: it takes two to tango. Trends Cell Biol 12: 254-257

Brow D. A. (2002) Allosteric cascade of spliceosome activation. Annu Rev Genet 36: 333-360

Brown J. W., Marshall D. F., Echeverria M. (2008) Intronic noncoding RNAs and splicing. Trends Plant Sci 13: 335-342

Byrne K. P., Wolfe K. H. (2005) The Yeast Gene Order Browser: combining curated homology and syntenic context reveals gene fate in polyploid species. Genome Res 15: 1456-1461

Camasses A., Bragado-Nilsson E., Martin R., Seraphin B., Bordonne R. (1998) Interactions within the yeast Sm core complex: from proteins to amino acids. Mol Cell Biol 18: 1956-1966

Carninci P. (2006) Tagging mammalian transcription complexity. Trends Genet 22: 501-510

Carninci P. (2010) RNA dust: where are the genes? DNA Res 17: 51-59

Carninci P., Kasukawa T., Katayama S., Gough J., Frith M. C., Maeda N., . . . Hayashizaki Y. (2005) The transcriptional landscape of the mammalian genome. Science 309: 1559-1563

Carninci P., Sandelin A., Lenhard B., Katayama S., Shimokawa K., Ponjavic J., . . . Hayashizaki Y. (2006) Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet 38: 626-635

Carmel L., Rogozin I. B., Wolf Y. I., Koonin E. V. (2007) Evolutionarily conserved genes preferentially accumulate introns. Genome Res 17: 1045-1050

Carroll S. M., Narayan P., Rottman F. M. (1990) N6-methyladenosine residues in an intron-specific region of prolactin pre-mRNA. Mol Cell Biol 10: 4456-4465

Carthew R. W., Sontheimer E. J. (2009) Origins and mechanisms of miRNAs and siRNAs. Cell 136: 642-655

Castle J. C., Zhang C., Shah J. K., Kulkarni A. V., Kalsotra A., Cooper T. A., Johnson J. M. (2008) Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nat Genet 40: 1416-1425

Cawley S., Bekiranov S., Ng H. H., Kapranov P., Sekinger E. A., Kampa D., . . . Gingeras T. R. (2004) Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116: 499-509

Chan S. P., Kao D. I., Tsai W. Y., Cheng S. C. (2003) The Prp19p-associated complex in spliceosome activation. Science 302: 279-282

Chen E. H., Grote E., Mohler W., Vignery A. (2007) Cell-cell fusion. FEBS Lett 581: 2181-2193

Chen P., Sapperstein S. K., Choi J. D., Michaelis S. (1997) Biogenesis of the Saccharomyces cerevisiae mating pheromone a-factor. J Cell Biol 136: 251-269

Cheng J., Kapranov P., Drenkow J., Dike S., Brubaker S., Patel S., . . . Gingeras T. R. (2005) Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308: 1149-1154

Chooniedass-Kothari S., Emberley E., Hamedani M. K., Troup S., Wang X., Czosnek A., . . . Leygue E. (2004) The steroid receptor RNA activator is the first functional RNA encoding a protein. FEBS Lett 566: 43-47

Chooniedass-Kothari S., Hamedani M. K., Auge C., Wang X., Carascossa S., Yan Y., . . . Leygue E. (2010) The steroid receptor RNA activator protein is recruited to promoter regions and acts as a transcriptional repressor. FEBS Lett 584: 2218-2224

Chowdhury A., Mukhopadhyay J., Tharun S. (2007) The decapping activator Lsm1p-7p-Pat1p complex has the intrinsic ability to distinguish between oligoadenylated and polyadenylated RNAs. RNA 13: 998-1016

Chowdhury A., Tharun S. (2008) lsm1 mutations impairing the ability of the Lsm1p-7p-Pat1p complex to preferentially bind to oligoadenylated RNA affect mRNA decay in vivo. RNA 14: 2149-2158

Clancy M. J., Shambaugh M. E., Timpte C. S., Bokar J. A. (2002) Induction of sporulation in Saccharomyces cerevisiae leads to the formation of N6-methyladenosine in mRNA: a potential mechanism for the activity of the IME4 gene. Nucleic Acids Res 30: 4509-4518

Clement J. Q., Maiti S., Wilkinson M. F. (2001) Localization and stability of introns spliced from the Pem homeobox gene. J Biol Chem 276: 16919-16930

Clement J. Q., Qian L., Kaplinsky N., Wilkinson M. F. (1999) The stability and fate of a spliced intron from vertebrate cells. RNA 5: 206-220

Collins C. A., Guthrie C. (2000) The question remains: is the spliceosome a ribozyme? Nat Struct Biol 7: 850-854

Covitz P. A., Herskowitz I., Mitchell A. P. (1991) The yeast RME1 gene encodes a putative zinc finger protein that is directly repressed by a1-alpha 2. Genes Dev 5: 1982-1989

Crick F. (1970) Central dogma of molecular biology. Nature 227: 561-563

Crick F. H. (1958) On protein synthesis. Symp Soc Exp Biol 12: 138-163

David L., Huber W., Granovskaia M., Toedling J., Palm C. J., Bofkin L., Jones T., Davis R. W., Steinmetz L. M. (2006) A high-resolution map of transcription in the yeast genome. Proc Natl Acad Sci USA 103: 5320-5325

Davis C. A., Grate L., Spingola M., Ares M., Jr. (2000) Test of intron predictions reveals novel splice sites, alternatively spliced mRNAs and new introns in meiotically regulated genes of yeast. Nucleic Acids Res 28: 1700-1706

De Virgilio C., DeMarini D. J., Pringle J. R. (1996) SPR28, a sixth member of the septin gene family in Saccharomyces cerevisiae that is expressed specifically in sporulating cells. Microbiology 142: 2897-2905

Decker C. J., Teixeira D., Parker R. (2007) Edc3p and a glutamine/asparagine-rich domain of Lsm4p function in processing body assembly in Saccharomyces cerevisiae. J Cell Biol 179: 437-449

Dengler M. (2008) Intron-mediated regulation of the spliceosomal LSM genes in Saccharomyces cerevisiae. Prakticum Thesis, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney Institute of Biochemistry & University of Stuttgart, Stuttgart, Germany

DeRisi J. L., Iyer V. R., Brown P. O. (1997) Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278: 680-686

Dinger M. E., Pang K. C., Mercer T. R., Mattick J. S. (2008) Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Comput Biol 4: e1000176

Doi S., Tanabe K., Watanabe M., Yamaguchi M., Yoshimura M. (1989) An alpha-specific gene, SAG1 is required for sexual agglutination in Saccharomyces cerevisiae. Curr Genet 15: 393-398

Drinnenberg I. A., Weinberg D. E., Xie K. T., Mower J. P., Wolfe K. H., Fink G. R., Bartel D. P. (2009) RNAi in budding yeast. Science 326: 544-550

Eddy S. R. (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet 2: 919-929

Engebrecht J. A., Voelkel-Meiman K., Roeder G. S. (1991) Meiosis-specific RNA splicing in yeast. Cell 66: 1257-1268

Erdman S., Snyder M. (2001) A filamentous growth response mediated by the yeast mating pathway. Genetics 159: 919-928

Fabrizio P., Dannenberg J., Dube P., Kastner B., Stark H., Urlaub H., Luhrmann R. (2009) The evolutionarily conserved core design of the catalytic activation step of the yeast spliceosome. Mol Cell 36: 593-608

Fatica A., Morlando M., Bozzoni I. (2000) Yeast snoRNA accumulation relies on a cleavage-dependent/polyadenylation-independent 3'-processing apparatus. EMBO J 19: 6218-6229

Fedorova L., Fedorov A. (2003) Introns in gene evolution. Genetica 118: 123-131

Fejes-Toth K., Sotirova V., Sachidanandam R., Assaf G., Hannon G. J., Kapranov P., . . . Gingeras T. R. (2009) Post-transcriptional processing generates a diversity of 5'-modified long and short RNAs. Nature 457: 1028-1032

Fernandez C. F., Pannone B. K., Chen X., Fuchs G., Wolin S. L. (2004) An Lsm2-Lsm7 complex in Saccharomyces cerevisiae associates with the small nucleolar RNA snR5. Mol Biol Cell 15: 2842-2852

Fink G. R. (1987) Pseudogenes in yeast? Cell 49: 5-6

Frohlich K. U., Rudiger M., Eberhardt D., Mecke D. (1992) An easy and fast alternative to plasmid shuffling for the identification of in vitro mutagenized alleles of essential genes of Saccharomyces cerevisiae. Nucleic Acids Res 20: 6113-6114

Fromont-Racine M., Mayes A. E., Brunet-Simon A., Rain J. C., Colley A., Dix I., . . . Legrain P. (2000) Genome-wide protein interaction screens reveal functional networks involving Sm-like proteins. Yeast 17: 95-110

Fury M. G., Zhang W., Christodoulopoulos I., Zieve G. W. (1997) Multiple protein: protein interactions between the snRNP common core proteins. Exp Cell Res 237: 63-69

Galgoczy D. J., Cassidy-Stone A., Llinas M., O'Rourke S. M., Herskowitz I., DeRisi J. L., Johnson A. D. (2004) Genomic dissection of the cell-type-specification circuit in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 101: 18069-18074

Ghildiyal M., Zamore P. D. (2009) Small silencing RNAs: an expanding universe. Nat Rev Genet 10: 94-108

Gietz R. D., Woods R. A. (2002) Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol 350: 87-96

Gilbert W. (1986) Origin of life: The RNA world. Nature 319: 618

Gingeras T. R. (2007) Origin of phenotypes: genes and transcripts. Genome Res 17: 682-690

Glazov E. A., Pheasant M., McGraw E. A., Bejerano G., Mattick J. S. (2005) Ultraconserved elements in insect genomes: a highly conserved intronic sequence implicated in the control of homothorax mRNA splicing. Genome Res 15: 800-808

Gottschalk A., Kastner B., Luhrmann R., Fabrizio P. (2001) The yeast U5 snRNP coisolated with the U1 snRNP has an unexpected protein composition and includes the splicing factor Aar2p. RNA 7: 1554-1565

Grainger R. J., Beggs J. D. (2005) Prp8 protein: at the heart of the spliceosome. RNA 11: 533-557

Gruber A. R., Lorenz R., Bernhart S. H., Neubock R., Hofacker I. L. (2008) The Vienna RNA websuite. Nucleic Acids Res 36: W70-W74

Guerrier-Takada C., Gardiner K., Marsh T., Pace N., Altman S. (1983) The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35: 849-857

Hüttenhofer A. (2006) RNomics: identification and function of small non-protein-coding RNAs in model organisms. Cold Spring Harb Sym 71: 135-140

Haber J. E. (1998) Mating-type gene switching in Saccharomyces cerevisiae. Annu Rev Genet 32: 561-599

Hanahan D., Jessee J., Bloom F. R. (1991) Plasmid transformation of Escherichia coli and other bacteria. Methods Enzymol 204: 63-113

Hartwell L. H. (1980) Mutants of Saccharomyces cerevisiae unresponsive to cell division control by polypeptide mating hormone. J Cell Biol 85: 811-822

He W., Parker R. (2000) Functions of Lsm proteins in mRNA degradation and splicing. Curr Opin Cell Biol 12: 346-350

Heiman M. G., Walter P. (2000) Prm1p, a pheromone-regulated multispanning membrane protein, facilitates plasma membrane fusion during yeast mating. J Cell Biol 151: 719-730

Hermann H., Fabrizio P., Raker V. A., Foulaki K., Hornig H., Brahms H., Luhrmann R. (1995) snRNP Sm proteins share two evolutionarily conserved sequence motifs which are involved in Sm protein-protein interactions. EMBO J 14: 2076-2088

Hibbs M. A., Hess D. C., Myers C. L., Huttenhower C., Li K., Troyanskaya O. G. (2007) Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23: 2692-2699

Hoffman C. S., Winston F. (1987) A ten-minute DNA preparation from yeast efficiently releases autonomous plasmids for transformation of Escherichia coli. Gene 57: 267-272

Hongay C. F., Grisafi P. L., Galitski T., Fink G. R. (2006) Antisense transcription controls cell fate in Saccharomyces cerevisiae. Cell 127: 735-745

House A. E., Lynch K. W. (2008) Regulation of alternative splicing: more than just the ABCs. J Biol Chem 283: 1217-1221

Ingelfinger D., Arndt-Jovin D. J., Luhrmann R., Achsel T. (2002) The human LSm1-7 proteins colocalize with the mRNA-degrading enzymes Dcp1/2 and Xrn1 in distinct cytoplasmic foci. RNA 8: 1489-1501

Irizarry R. A., Bolstad B. M., Collin F., Cope L. M., Hobbs B., Speed T. P. (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31: e15

Jeffares D. C., Mourier T., Penny D. (2006) The biology of intron gain and loss. Trends Genet 22: 16-22

Johnson A. D. (1995) Molecular mechanisms of cell-type determination in budding yeast. Curr Opin Genet Dev 5: 552-558

Johnson J. M., Edwards S., Shoemaker D., Schadt E. E. (2005) Dark matter in the genome: evidence of widespread transcription detected by microarray tiling experiments. Trends Genet 21: 93-102

Juneau K., Miranda M., Hillenmeyer M. E., Nislow C., Davis R. W. (2006) Introns regulate RNA and protein abundance in yeast. Genetics: genetics.106.058560

Juneau K., Palm C., Miranda M., Davis R. W. (2007) High-density yeast-tiling array reveals previously undiscovered introns and extensive regulation of meiotic splicing. Proc Natl Acad Sci USA 104: 1522-1527

Kambach C., Walke S., Young R., Avis J. M., de la Fortelle E., Raker V. A., . . . Nagai K. (1999) Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell 96: 375-387

Kammann M., Laufs J., Schell J., Gronenborn B. (1989) Rapid insertional mutagenesis of DNA by polymerase chain reaction (PCR). Nucleic Acids Res 17: 5404

Kapranov P. (2009) Studying chromosome-wide transcriptional networks: new insights into disease? Genome Med 1: 50

Kapranov P., Cheng J., Dike S., Nix D. A., Duttagupta R., Willingham A. T., . . . Gingeras T. R. (2007a) RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science 316: 1484-1488

Kapranov P., Ozsolak F., Kim S. W., Foissac S., Lipson D., Hart C., . . . Milos P. M. (2010) New class of gene-termini-associated human RNAs suggests a novel RNA copying mechanism. Nature 466: 642-646

Kapranov P., Willingham A. T., Gingeras T. R. (2007b) Genome-wide transcription and the implications for genomic organization. Nat Rev Genet 8: 413-423

Karaduman R., Dube P., Stark H., Fabrizio P., Kastner B., Luhrmann R. (2008) Structure of yeast U6 snRNPs: arrangement of Prp24p and the Lsm complex as revealed by electron microscopy. RNA 14: 2528-2537

Karaduman R., Fabrizio P., Hartmuth K., Urlaub H., Lührmann R. (2006) RNA structure and RNA-protein interactions in purified yeast U6 snRNPs. J Mol Biol 356: 1248-1262

Kastner B., Bach M., Luhrmann R. (1990) Electron microscopy of small nuclear ribonucleoprotein (snRNP) particles U2 and U5: evidence for a common structure-determining principle in the major U snRNP family. Proc Natl Acad Sci USA 87: 1710-1714

Kothapalli R., Yoder S., Mane S., Loughran T. (2002) Microarray results: how accurate are they? BMC Bioinformatics 3: 22

Kruger K., Grabowski P. J., Zaug A. J., Sands J., Gottschling D. E., Cech T. R. (1982) Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31: 147-157

Krummel D. A., Nagai K., Oubridge C. (2010) Structure of spliceosomal ribonucleoproteins. F1000 Biol Rep 2: 39

Kufel J., Allmang C., Petfalski E., Beggs J., Tollervey D. (2003a) Lsm proteins are required for normal processing and stability of ribosomal RNAs. J Biol Chem 278: 2147-2156

Kufel J., Allmang C., Verdone L., Beggs J., Tollervey D. (2003b) A complex pathway for 3' processing of the yeast U3 snoRNA. Nucleic Acids Res 31: 6788-6797

Kufel J., Allmang C., Verdone L., Beggs J. D., Tollervey D. (2002) Lsm proteins are required for normal processing of pre-tRNAs and their efficient association with La-homologous protein Lhp1p. Mol Cell Biol 22: 5248-5256

Kufel J., Bousquet-Antonelli C., Beggs J. D., Tollervey D. (2004) Nuclear pre-mRNA decapping and 5' degradation in yeast require the Lsm2-8p complex. Mol Cell Biol 24: 9646-9657

Lander E. S., Linton L. M., Birren B., Nusbaum C., Zody M. C., Baldwin J., . . . Chen Y. J. (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921

Lanz R. B., McKenna N. J., Onate S. A., Albrecht U., Wong J., Tsai S. Y., . . . O'Malley B. W. (1999) A steroid receptor coactivator, SRA, functions as an RNA and is present in an SRC-1 complex. Cell 97: 17-27

Lareau L. F., Green R. E., Bhatnagar R. S., Brenner S. E. (2004) The evolving roles of alternative splicing. Curr Opin Struct Biol 14: 273-282

Lerner M. R., Steitz J. A. (1979) Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus-erythematosus. Proc Natl Acad Sci USA 76: 5495-5499

Levine M., Tjian R. (2003) Transcription regulation and animal diversity. Nature 424: 147-151

Li B., Vilardell J., Warner J. R. (1996) An RNA structure involved in feedback regulation of splicing and of translation is critical for biological fitness. Proc Natl Acad Sci USA 93: 1596-1600

Liang X. H., Haritan A., Uliel S., Michaeli S. (2003) trans and cis splicing in trypanosomatids: mechanism, factors, and regulation. Eukaryot Cell 2: 830-840

Liu H. X., Cartegni L., Zhang M. Q., Krainer A. R. (2001) A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes. Nat Genet 27: 55-58

Liu L., Query C. C., Konarska M. M. (2007) Opposing classes of prp8 alleles modulate the transition between the catalytic steps of pre-mRNA splicing. Nat Struct Mol Biol 14: 519-526

Long M., Deutsch M., Wang W., Betran E., Brunet F. G., Zhang J. (2003) Origin of new genes: evidence from experimental and computational analyses. Genetica 118: 171-182

Lopez-Bigas N., Audit B., Ouzounis C., Parra G., Guigo R. (2005) Are splicing mutations the most frequent cause of hereditary disease? FEBS Lett 579: 1900-1903

Louro R., Smirnova A. S., Verjovski-Almeida S. (2009) Long intronic noncoding RNA transcription: Expression noise or expression choice? Genomics 93: 291-298

Luhtala N., Parker R. (2009) LSM1 over-expression in Saccharomyces cerevisiae depletes U6 snRNA levels. Nucleic Acids Res 37: 5529-5536

Malca H., Shomron N., Ast G. (2003) The U1 snRNP base pairs with the 5' splice site within a penta-snRNP complex. Mol Cell Biol 23: 3442-3455

Manak J. R., Dike S., Sementchenko V., Kapranov P., Biemar F., Long J., . . . Gingeras T. R. (2006) Biological function of unannotated transcription during the early development of Drosophila melanogaster. Nat Genet 38: 1151-1158

Mattick J. S. (1994) Introns: evolution and function. Curr Opin Genet Dev 4: 823-831

Mattick J. S. (2001) Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep 2: 986-991

Mattick J. S. (2009) The genetic signatures of noncoding RNAs. PLoS Genet 5: e1000459

Mattick J. S., Gagen M. J. (2001) The evolution of controlled multitasked gene networks: the role of introns and other noncoding RNAs in the development of complex organisms. Mol Biol Evol 18: 1611-1630

Mattick J. S., Makunin I. V. (2006) Non-coding RNA. Hum Mol Genet 15: R17-R29

Mattick J. S., Taft R. J., Faulkner G. J. (2010) A global view of genomic information--moving beyond the gene and the master regulator. Trends Genet 26: 21-28

Mayes A. E., Verdone L., Legrain P., Beggs J. D. (1999) Characterization of Sm-like proteins in yeast and their association with U6 snRNA. EMBO J 18: 4321-4331

Mazzoni C., D'Addario I., Falcone C. (2007) The C-terminus of the yeast Lsm4p is required for the association to P-bodies. FEBS Lett 581: 4836-4840

Mercer T. R., Dinger M. E., Bracken C. P., Kolle G., Szubert J. M., Korbie D. J., . . . Mattick J. S. (2010) Regulated post-transcriptional RNA cleavage diversifies the eukaryotic transcriptome. Genome Res 20: 1639-1650

Mercer T. R., Dinger M. E., Mattick J. S. (2009) Long non-coding RNAs: insights into functions. Nat Rev Genet 10: 155-159

Meyer M., Vilardell J. (2009) The quest for a message: budding yeast, a model organism to study the control of pre-mRNA splicing. Brief Funct Genomic Proteomic 8: 60-67

Michaelis S., Chen P., Berkower C., Sapperstein S., Kistler A. (1992) Biogenesis of yeast a-factor involves prenylation, methylation and a novel export mechanism. Antonie Van Leeuwenhoek 61: 115-117

Michaelis S., Herskowitz I. (1988) The a-factor pheromone of Saccharomyces cerevisiae is essential for mating. Mol Cell Biol 8: 1309-1318

Miura F., Kawaguchi N., Sese J., Toyoda A., Hattori M., Morishita S., Ito T. (2006) A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proc Natl Acad Sci U S A 103: 17846-17851

Modrek B., Lee C. (2002) A genomic view of alternative splicing. Nat Genet 30: 13-19

Morey J. S., Ryan J. C., Van Dolah F. M. (2006) Microarray validation: factors influencing correlation between oligonucleotide microarrays and real-time PCR. Biol Proced Online 8: 175-193

Mutiu A. I., Brandl C. J. (2005) RNA isolation from yeast using silica matrices. J Biomol Tech 16: 316-317

Nagasaki H., Arita M., Nishizawa T., Suwa M., Gotoh O. (2005) Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes. Gene 364: 53-62

Naidoo N., Harrop S. J., Sobti M., Haynes P. A., Szymczyna B. R., Williamson J. R., . . . Mabbutt B. C. (2008) Crystal structure of Lsm3 octamer from Saccharomyces cerevisiae: implications for Lsm ring organisation and recruitment. J Mol Biol 377: 1357-1371

Nakazawa N., Harashima S., Oshima Y. (1991) AAR2, a gene for splicing pre-mRNA of the MATa1 cistron in cell type control of Saccharomyces cerevisiae. Mol Cell Biol 11: 5693-5700

Nasmyth K. (1993) Regulating the HO endonuclease in yeast. Curr Opin Genet Dev 3: 286-294

Newman A. J., Nagai K. (2010) Structural studies of the spliceosome: blind men and an elephant. Curr Opin Struct Biol 20: 82-89

Ng R., Domdey H., Larson G., Rossi J. J., Abelson J. (1985) A test for intron function in the yeast actin gene. Nature 314: 183-184

Nilsen T. W., Graveley B. R. (2010) Expansion of the eukaryotic proteome by alternative splicing. Nature 463: 457-463

Ogawa Y., Sun B. K., Lee J. T. (2008) Intersection of the RNA interference and X-inactivation pathways. Science 320: 1336-1341

Ooi S. L., Samarsky D. A., Fournier M. J., Boeke J. D. (1998) Intronic snoRNA biosynthesis in Saccharomyces cerevisiae depends on the lariat-debranching enzyme: Intron length effects and activity of a precursor snoRNA. RNA 4: 1096-1110

Ørom U. A., Derrien T., Beringer M., Gumireddy K., Gardini A., Bussotti G., . . . Shiekhattar R. (2010) Long noncoding RNAs with enhancer-like function in human cells. Cell 143: 46-58

Palmisano L. (2006) Intron-mediated regulation of LSMs and splicing. PhD thesis, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney

Panning B. (2008) X-chromosome inactivation: the molecular basis of silencing. J Biol 7: 30

Pannone B. K., Kim S. D., Noe D. A., Wolin S. L. (2001) Multiple functional interactions between components of the Lsm2-Lsm8 complex, U6 snRNA, and the yeast La protein. Genetics 158: 187-196

Parenteau J., Durand M., Veronneau S., Lacombe A. A., Morin G., Guerin V., . . . Abou Elela S. (2008) Deletion of many yeast introns reveals a minority of genes that require splicing for function. Mol Biol Cell 19: 1932-1941

Patel A. A., McCarthy M., Steitz J. A. (2002) The splicing of U12-type introns can be a rate-limiting step in gene expression. EMBO J 21: 3804-3815

Patel A. A., Steitz J. A. (2003) Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol 4: 960-970

Pena V., Rozov A., Fabrizio P., Luhrmann R., Wahl M. C. (2008) Structure and function of an RNase H domain at the heart of the spliceosome. EMBO J 27: 2929-2940

Pillai R. S., Grimmler M., Meister G., Will C. L., Luhrmann R., Fischer U., Schumperli D. (2003) Unique Sm core structure of U7 snRNPs: assembly by a specialized SMN complex and the role of a new component, Lsm11, in histone RNA processing. Genes Dev 17: 2321-2333

Pillai R. S., Will C. L., Luhrmann R., Schumperli D., Muller B. (2001) Purified U7 snRNPs lack the Sm proteins D1 and D2 but contain Lsm10, a new 14 kDa Sm D1-like protein. EMBO J 20: 5470-5479

Pleiss J. A., Whitworth G. B., Bergkessel M., Guthrie C. (2007) Rapid, transcript-specific changes in splicing in response to environmental stress. Mol Cell 27: 928-937

Pomeranz Krummel D. A., Oubridge C., Leung A. K. W., Li J., Nagai K. (2009) Crystal structure of human spliceosomal U1 snRNP at 5.5 Å resolution. Nature 458: 475-480

Prasanth K. V., Spector D. L. (2007) Eukaryotic regulatory RNAs: an answer to the ‘genome complexity’ conundrum. Genes Dev 21: 11-42

Preker P. J., Guthrie C. (2006) Autoregulation of the mRNA export factor Yra1p requires inefficient splicing of its pre-mRNA. RNA 12: 994-1006

Pyle A. M., Lambowitz A. M. (2006) 17 Group II Introns: Ribozymes That Splice RNA and Invade DNA, Vol. 43 3rd edn. North America: Cold Spring Harbor Monograph Archive.

Query C. C., Konarska M. M. (2004) Suppression of multiple substrate mutations by spliceosomal prp8 alleles suggests functional correlations with ribosomal ambiguity mutants. Mol Cell 14: 343-354

Raker V. A., Plessel G., Luhrmann R. (1996) The snRNP core assembly pathway: identification of stable core protein heteromeric complexes and an snRNP subcore particle in vitro. EMBO J 15: 2256-2269

Ravasi T., Suzuki H., Pang K. C., Katayama S., Furuno M., Okunishi R., . . . Mattick J. S. (2006) Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome Res 16: 11-19

Rearick D., Prakash A., McSweeny A., Shepard S. S., Fedorova L., Fedorov A. (2010) Critical association of ncRNA with introns. Nucleic Acids Res 39: 2357-2366

Reijns M. A., Alexander R. D., Spiller M. P., Beggs J. D. (2008) A role for Q/N-rich aggregation-prone regions in P-body localization. J Cell Sci 121: 2463-2472

Reijns M. A., Auchynnikava T., Beggs J. D. (2009) Analysis of Lsm1p and Lsm8p domains in the cellular localization of Lsm complexes in budding yeast. FEBS J 276: 3602-3617

Ritchie D. B., Schellenberg M. J., Gesner E. M., Raithatha S. A., Stuart D. T., Macmillan A. M. (2008) Structural elucidation of a PRP8 core domain from the heart of the spliceosome. Nat Struct Mol Biol 15: 1199-1205

Roberts R. L., Fink G. R. (1994) Elements of a single MAP kinase cascade in Saccharomyces cerevisiae mediate two developmental programs in the same cell type: mating and invasive growth. Genes Dev 8: 2974-2985

Robinson M. D., Grigull J., Mohammad N., Hughes T. R. (2002) FunSpec: a web-based cluster interpreter for yeast. BMC Bioinformatics 3: 35

Rodriguez A., Griffiths-Jones S., Ashurst J. L., Bradley A. (2004) Identification of mammalian microRNA host genes and transcription units. Genome Res 14: 1902-1910

Ronen M., Botstein D. (2006) Transcriptional response of steady-state yeast cultures to transient perturbations in carbon source. Proc Natl Acad Sci U S A 103: 389-394

Roy A., Lu C. F., Marykwas D. L., Lipke P. N., Kurjan J. (1991) The AGA1 product is involved in cell surface attachment of the Saccharomyces cerevisiae cell adhesion glycoprotein a-agglutinin. Mol Cell Biol 11: 4196-4206

Roy S. W., Gilbert W. (2006) The evolution of spliceosomal introns: patterns, puzzles and progress. Nat Rev Genet 7: 211-221

Ruby J. G., Jan C. H., Bartel D. P. (2007) Intronic microRNA precursors that bypass Drosha processing. Nature 448: 83-86

Saeed A. I., Bhagabati N. K., Braisted J. C., Liang W., Sharov V., Howe E. A., . . . Quackenbush J. (2006) TM4 microarray software suite. Methods Enzymol 411: 134-193

Salgado-Garrido J., Bragado-Nilsson E., Kandels-Lewis S., Seraphin B. (1999) Sm and Sm-like proteins assemble in two related complexes of deep evolutionary origin. EMBO J 18: 3451-3462

Sambrook J., Fritsch E. F., Maniatis T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

Sarkar G., Sommer S. S. (1990) The "megaprimer" method of site-directed mutagenesis. Biotechniques 8: 404-407

Sarkar G., Sommer S. S. (1992) Double-stranded DNA segments can efficiently prime the amplification of human genomic DNA. Nucleic Acids Res 20: 4937-4938

Sashital D. G., Cornilescu G., McManus C. J., Brow D. A., Butcher S. E. (2004) U2-U6 RNA folding reveals a group II intron-like domain and a four-helix junction. Nat Struct Mol Biol 11: 1237-1242

Scofield D. G., Lynch M. (2008) Evolutionary diversification of the Sm family of RNA-associated proteins. Mol Biol Evol 25: 2255-2267

Seraphin B. (1995) Sm and Sm-Like proteins belong to a large family - Identification of proteins of the U6 as well as the U1, U2, U4 and U5 snRNPs. EMBO J 14: 2089-2098

Shah J. C., Clancy M. J. (1992) IME4, a gene that mediates MAT and nutritional control of meiosis in Saccharomyces cerevisiae. Mol Cell Biol 12: 1078-1086

Sheth N., Roca X., Hastings M. L., Roeder T., Krainer A. R., Sachidanandam R. (2006) Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res 34: 3955-3967

Sheth U., Parker R. (2003) Decapping and decay of messenger RNA occur in cytoplasmic processing bodies. Science 300: 805-808

Siepel A., Bejerano G., Pedersen J. S., Hinrichs A. S., Hou M., Rosenbloom K., . . . Haussler D. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15: 1034-1050

Siomi M. C., Sato K., Pezic D., Aravin A. A. (2011) PIWI-interacting small RNAs: the vanguard of genome defence. Nat Rev Mol Cell Biol 12: 246-258

Sironi M., Menozzi G., Comi G. P., Cagliani R., Bresolin N., Pozzoli U. (2005) Analysis of intronic conserved elements indicates that functional complexity might represent a major source of negative selection on non-coding sequences. Hum Mol Genet 14: 2533-2546

Skelly D. A., Ronald J., Connelly C. F., Akey J. M. (2009) Population genomics of intron splicing in 38 Saccharomyces cerevisiae genome sequences. Genome Biol Evol 1: 466-478

Sobti M., Cubeddu L., Haynes P. A., Mabbutt B. C. (2010) Engineered rings of mixed yeast Lsm proteins show differential interactions with translation factors and U-rich RNA. Biochemistry 49: 2335-2345

Sontheimer E. J., Sun S., Piccirilli J. A. (1997) Metal ion catalysis during splicing of premessenger RNA. Nature 388: 801-805

Spiller M. P., Boon K. L., Reijns M. A., Beggs J. D. (2007a) The Lsm2-8 complex determines nuclear localization of the spliceosomal U6 snRNA. Nucleic Acids Res 35: 923-929

Spiller M. P., Reijns M. A., Beggs J. D. (2007b) Requirements for nuclear localization of the Lsm2-8p complex and competition between nuclear and cytoplasmic Lsm complexes. J Cell Sci 120: 4310-4320

Spingola M., Grate L., Haussler D., Ares M. (1999) Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA 5: 221-234

Spirin A. S. (2002) Omnipotent RNA. FEBS Lett 530: 4-8

Staley J. P., Woolford Jr J. L. (2009) Assembly of ribosomes and spliceosomes: complex ribonucleoprotein machines. Curr Opin Cell Biol 21: 109-118

Stark H., Dube P., Luhrmann R., Kastner B. (2001) Arrangement of RNA and proteins in the spliceosomal U1 small nuclear ribonucleoprotein particle. Nature 409: 539-542

Stark H., Luhrmann R. (2006) Cryo-electron microscopy of spliceosomal components. Annu Rev Biophys Biomol Struct 35: 435-457

Stevens S. W., Ryan D. E., Ge H. Y., Moore R. E., Young M. K., Lee T. D., Abelson J. (2002) Composition and functional characterization of the yeast spliceosomal penta-snRNP. Mol Cell 9: 31-44

Swinburne I. A., Silver P. A. (2008) Intron delays and transcriptional timing during development. Dev Cell 14: 324-330

Taft R. J., Glazov E. A., Cloonan N., Simons C., Stephen S., Faulkner G. J., . . . Mattick J. S. (2009a) Tiny RNAs associated with transcription start sites in animals. Nat Genet 41: 572-578

Taft R. J., Glazov E. A., Lassmann T., Hayashizaki Y., Carninci P., Mattick J. S. (2009b) Small RNAs derived from snoRNAs. RNA 15: 1233-1240

Taft R. J., Kaplan C. D., Simons C., Mattick J. S. (2009c) Evolution, biogenesis and function of promoter-associated RNAs. Cell Cycle 8: 2332-2338

Taft R. J., Pang K. C., Mercer T. R., Dinger M., Mattick J. S. (2010a) Non-coding RNAs: regulators of disease. J Pathol 220: 126-139

Taft R. J., Pheasant M., Mattick J. S. (2007) The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays 29: 288-299

Taft R. J., Simons C., Nahkuri S., Oey H., Korbie D. J., Mercer T. R., . . . Mattick J. S. (2010b) Nuclear-localized tiny RNAs are associated with transcription initiation and splice sites in metazoans. Nat Struct Mol Biol 17: 1030-1034

Tharun S. (2009) Roles of eukaryotic Lsm proteins in the regulation of mRNA function. Int Rev Cell Mol Biol 272: 149-189

Tharun S., He W., Mayes A. E., Lennertz P., Beggs J. D., Parker R. (2000) Yeast Sm-like proteins function in mRNA decapping and decay. Nature 404: 515-518

Tharun S., Muhlrad D., Chowdhury A., Parker R. (2005) Mutations in the Saccharomyces cerevisiae LSM1 gene that affect mRNA decapping and 3' end protection. Genetics 170: 33-46

Thompson J. D., Higgins D. G., Gibson T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680

Thompson D. M., Parker R. (2009) The RNase Rny1p cleaves tRNAs and promotes cell death during oxidative stress in Saccharomyces cerevisiae. J Cell Biol 185: 43-50

Tomasevic N., Peculis B. A. (2002) Xenopus LSm proteins bind U8 snoRNA via an internal evolutionarily conserved octamer sequence. Mol Cell Biol 22: 4101-4112

Toor N., Keating K. S., Taylor S. D., Pyle A. M. (2008) Crystal structure of a self-spliced group II intron. Science 320: 77-82

Tuck M. T., Wiehl P. E., Pan T. (1999) Inhibition of 6-methyladenine formation decreases the translation efficiency of dihydrofolate reductase transcripts. Int J Biochem Cell Biol 31: 837-851

Turcotte B., Liang X. B., Robert F., Soontorngun N. (2010) Transcriptional regulation of nonfermentable carbon utilization in budding yeast. FEMS Yeast Res 10: 2-13

Tycowski K. T., Kolev N. G., Conrad N. K., Fok V., Steitz J. A. (2006) The ever-growing world of small nuclear ribonucleoproteins (Chapter 12). Cold Spring Harbor Monograph Archive, Vol. 43, North America

Valadkhan S. (2007) The spliceosome: a ribozyme at heart? Biol Chem 388: 693-697

Venter J. C., Adams M. D., Myers E. W., Li P. W., Mural R. J., Sutton G. G., . . . Zhu X. (2001) The sequence of the human genome. Science 291: 1304-1351

Verdone L., Galardi S., Page D., Beggs J. D. (2004) Lsm proteins promote regeneration of pre-mRNA splicing activity. Curr Biol 14: 1487-1491

Vincenti S., De Chiara V., Bozzoni I., Presutti C. (2007) The position of yeast snoRNA-coding regions within host introns is essential for their biosynthesis and for efficient splicing of the host pre-mRNA. RNA 13: 138-150

Wachtel C., Manley J. L. (2009) Splicing of mRNA precursors: the role of RNAs and proteins in catalysis. Mol BioSystems 5: 311-316

Wahl M. C., Will C. L., Luhrmann R. (2009) The spliceosome: design principles of a dynamic RNP machine. Cell 136: 701-718

Wang G.-S., Cooper T. A. (2007) Splicing in disease: disruption of the splicing code and the decoding machinery. Nat Rev Genet 8: 749-761

Wang Y., Shirogane T., Liu D., Harper J. W., Elledge S. J. (2003) Exit from exit: resetting the cell cycle through Amn1 inhibition of G protein signaling. Cell 112: 697-709

Wang Z., Burge C. B. (2008) Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14: 802-813

Ward A. J., Cooper T. A. (2010) The pathobiology of splicing. J Pathol 220: 152-163

Warden C. D., Kim S. H., Yi S. V. (2008) Predicted functional RNAs within coding regions constrain evolutionary rates of yeast proteins. PLoS One 3: e1559

Warkocki Z., Odenwalder P., Schmitzova J., Platzmann F., Stark H., Urlaub H., . . . Luhrmann R. (2009) Reconstitution of both steps of Saccharomyces cerevisiae splicing with purified spliceosomal components. Nat Struct Mol Biol 16: 1237-1243

Weber G., Trowitzsch S., Kastner B., Luhrmann R., Wahl M. C. (2010) Functional organization of the Sm core in the crystal structure of human U1 snRNP. EMBO J 29: 4172-4184

Will C. L., Lührmann R. (2001) Spliceosomal UsnRNP biogenesis, structure and function. Curr Opin Cell Biol 13: 290-301

Willer M., Rainey M., Pullen T., Stirling C. J. (1999) The yeast CDC9 gene encodes both a nuclear and a mitochondrial form of DNA ligase I. Curr Biol 9: 1085

Williams R. (2009) Blood, stress, and tiRNAs. J Cell Biol 185: 3

Winzeler E. A., Shoemaker D. D., Astromoff A., Liang H., Anderson K., Andre B., . . . Davis R. W. (1999) Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285: 901-906

Yamasaki S., Ivanov P., Hu G.-F., Anderson P. (2009) Angiogenin cleaves tRNA and promotes stress-induced translational repression. J Cell Biol 185: 35-42

Yang K., Zhang L., Xu T., Heroux A., Zhao R. (2008) Crystal structure of the beta-finger domain of Prp8 reveals analogy to ribosomal proteins. Proc Natl Acad Sci U S A 105: 13817-13822

Yang X. J. (2005) Multisite protein modification and intramolecular signaling. Oncogene 24: 1653-1662

Yean S. L., Wuenschell G., Termini J., Lin R. J. (2000) Metal-ion coordination by U6 small nuclear RNA contributes to catalysis in the spliceosome. Nature 408: 881-884

Ying S. Y., Lin S. L. (2006) Current perspectives in intronic micro RNAs (miRNAs). J Biomed Sci 13: 5-15

Zaric B., Chami M., Rémigy H., Engel A., Ballmer-Hofer K., Winkler F. K., Kambach C. (2005) Reconstitution of two recombinant Lsm protein complexes reveals aspects of their architecture, assembly, and function. J Biol Chem 280: 16066-16075

Zhang C. (2009) Novel functions for small RNA molecules. Curr Opin Mol Ther 11: 641-651

Zuker M., Stiegler P. (1981) Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res 9: 133-148

Keys to Appendices

Appendices are files on the enclosed compact disc.

Appendix 1 Genes with significantly changed transcript levels in response to deletion of the LSM7 intron under both growth conditions.

Normalized log2 intensities (log2 ratios of experimental treatment RNA sample to reference control RNA sample) of all significantly differentially expressed genes in each biological duplicate RNA sample are presented in the first spreadsheet in the enclosed file. The data are also shown in the heat-map below. Statistical data (p-values, fold-changes and ratios) for all significantly changed genes (p-value < 0.01) from the linear contrast of the ANOVA analysis are included in the second spreadsheet in the file.

Appendix 2 Genes with significantly changed transcript levels in response to deletion of the LSM7 intron under glucose only.

Normalized log2 intensities (log2 ratios of experimental treatment RNA sample to reference control RNA sample) of all significantly differentially expressed genes in each biological duplicate RNA sample are presented in the first spreadsheet in the enclosed file. Statistical data (p-values, fold-changes and ratios) for all significantly changed genes (p-value < 0.01) from the linear contrast of the ANOVA analysis are included in the second spreadsheet in the file.

Appendix 3 Genes with significantly changed transcript levels in response to deletion of the LSM7 intron under acetate only.

Normalized log2 intensities (log2 ratios of experimental treatment RNA sample to reference control RNA sample) of all significantly differentially expressed genes in each biological duplicate RNA sample are presented in the first spreadsheet in the enclosed file. Statistical data (p-values, fold-changes and ratios) for all significantly changed genes (p-value < 0.01) from the linear contrast of the ANOVA analysis are included in the second spreadsheet in the file.

Appendix 4 GO biological processes of genes with significantly changed transcript levels in response to deletion of the LSM7 intron under both growth conditions.

GO biological processes and relevant fold-changes of significantly differentially expressed genes in response to deletion of the LSM7 intron are presented in the spreadsheet in the second file.
