The Fission Yeast Non-Coding Transcriptome

The Fission Yeast Non-Coding Transcriptome Sophie Radha Atkinson A thesis submitted for the degree of Doctor of Philosophy University College London September 2014 1 Declaration I, Sophie Radha Atkinson, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. 2 Abstract Long non-coding RNAs (lncRNAs) are emerging as important regulators of gene expression, although it remains unclear to what extent they contribute overall to the information flow from genotype to phenotype. Using strand-specific RNA- sequencing, I identify thousands of novel unstable, or cryptic, lncRNAs in Schizosaccharomyces pombe. The nuclear exosome, the RNAi pathway and the cytoplasmic exonuclease Exo2 represent three key pathways regulating lncRNAs in S. pombe, defining the overlapping classes of CUTs, RUTs and XUTs, respectively. The nuclear exosome and the RNAi pathway act cooperatively to control nuclear lncRNA expression, while the cytoplasmic Exo2 pathway is more distinct. Impairing both the nuclear exosome and the cytoplasmic exonuclease Exo2 is lethal in S. pombe. Importantly, I show that CUTs, RUTs and XUTs are stabilised under physiologically relevant growth conditions, with three key groups emerging: late meiotic RUTs/XUTs, early meiotic CUTs and quiescent CUTs. Late meiotic RUTs/XUTs tend to be antisense to protein-coding genes, and anti-correlate in expression with their sense loci. In contrast, early meiotic and quiescent CUTs tend to be transcribed divergently from protein-coding genes and positively correlate in expression with their mRNA partners. The current study provides an in-depth survey of the lncRNA repertoire of S. pombe, and the pathways that regulate their expression. It seems likely that any regulatory functions mediated by most of these lncRNAs are in cis, nuclear and co- transcriptional. The current study provides a rich and comprehensive resource for future studies of lncRNA function. 3 Acknowledgements I would like to thank Jürg Bähler for being an excellent supervisor and for giving me the opportunity to work in his lab. I have greatly benefited from his vision and insight, and constant willingness to offer his time and support throughout my PhD. I would also like to thank Samuel Marguerat for his invaluable input and guidance, especially during the initial stages of my PhD, and for teaching me everything I need to know about R. Thank you to all past and present members of the Bähler group for all their help and advice. I’m grateful to Xavi for his patience in teaching me how to do tetrad dissections, and to Babis, Maria and Michal for always being so willing to help me troubleshoot wet-lab experiments. Thank you to Danny for providing countless discussions on the intricacies of analysing RNA-seq data, and to Sandra for managing the smooth running of the lab. In particular, I’d like to thank Maria, Michal, Danny and Dave for their friendship and support – I hope the climbing and pub trips will continue. Thank you to all my friends for keeping me sane throughout this whole process, providing valuable distractions when necessary, and helping me keep my eye on the bigger picture. Thank you to my brother and sister-in-law, Stephen and Karen, and my beautiful niece, India, for always being there. Most importantly, thank you to my Mum and Dad for their constant love and support. 4 Publications arising from work done in this thesis Atkinson SR, Marguerat S, Bähler J. 2012. Exploring long non-coding RNAs through sequencing. Semin Cell Dev Biol 23: 200-205. Atkinson SR, Marguerat S, Lemay JF, Bachand F, Bähler J (manuscript in preparation). The fission yeast non-coding transcriptome. Bitton DA, Atkinson SR, Rallis C, Smith GC, Ellis DA, Jeffares DC et al. (manuscript in preparation). Skip and bin: pervasive alternative splicing is degraded by nuclear RNA surveillance in fission yeast. Lemay JF, Larochelle M, Marguerat S, Atkinson SR, Bähler J, Bachand F. 2014 (in press). The RNA exosome promotes transcription termination of backtracked RNA polymerase II. Nat Struct Mol Biol. 5 Abbreviations AS Antisense bp Base-pair CAGE Cap analysis of gene expression CC Correlation coefficient cds Coding sequence cDTA Comparative dynamic transcriptome analysis ceRNAs Competitive endogenous RNAs CESR Core environmental stress response CFU Colony forming units ChIP-seq Chromatin immunoprecipitation with sequencing ChIRP-seq Chromatin isolation by RNA purification circRNAs Circular RNA CUT Cryptic unstable transcript dChIRP Domain specific chromatin isolation by RNA purification DE Differentially expressed DSR Determinant of selective removal 6 eCUT Elongated CUT EMM Edinburgh minimal media eRNA Enhancer RNA GO Gene ontology GRO-seq Global run-on sequencing GTF General transcription factor GWAS Genome-wide association studies lincRNA Long intergenic non-coding RNA lncRNA Long non-coding RNA MEA Malt extract agar miRNA MicroRNA mRNA Messenger RNA mRNP Messenger ribonucleoprotein particle ncRNA Non-coding RNA NFR Nucleosome free region NGD No-go decay NGS Next-generation sequencing 7 NH4Cl Ammonium chloride NMD Nonsense mediated decay NNS Nrd1-Nab3-Sen1 NPD Non parental ditype NSD Non-stop decay nt Nucleotide NUT Nrd1-unterminated transcript OD Optical density ORF Open reading frame PD Parental ditype PIC Pre-initiation complex piRNA Piwi interacting RNA Pol II RNA polymerase II rDNA Ribosomal DNA RIP RNA immunoprecipitation RNA-seq RNA sequencing RNAi Interference RNA RPKM Reads per kilobase per million 8 rpm Revolutions per minute siRNA Small-interfering RNA snoRNA Small nucleolar RNA snRNA Small nuclear RNA snRNP Small nuclear ribonucleoprotein SUT Stable unannotated transcript T Tetratype ts Temperature sensitive TSS Transcription start site TTS Transcription termination site UTR Untranslated region WT Wild-type XUT Xrn1-dependent unstable transcript YES Rich media 9 Table of Contents Declaration .................................................................................................................. 2! Abstract .................................................................................................................. 3! Acknowledgements ..................................................................................................... 4! Publications arising from work done in this thesis ................................................. 5! Abbreviations .............................................................................................................. 6! Table of Contents ..................................................................................................... 10! Table of Figures ........................................................................................................ 16! Table of Tables ......................................................................................................... 18! Chapter 1! Introduction .......................................................................................... 19! 1.1! Overview ....................................................................................................... 19! 1.2! Genomes are pervasively transcribed ............................................................ 19! 1.2.1! Transcriptomics reveals pervasive lncRNA transcripts ......................... 19! 1.2.2! General features of lncRNAs ................................................................. 20! 1.3! Genomic origins of lncRNA transcription .................................................... 22! 1.3.1! Antisense lncRNAs ................................................................................ 22! 1.3.2! Bidirectional lncRNAs ........................................................................... 24! 1.3.3! Enhancer associated lncRNAs ............................................................... 26! 10 1.3.4! Long intergenic ncRNAs (lincRNAs) .................................................... 27! 1.3.5! Repetitive element-associated lncRNAs ................................................ 28! 1.4! Regulation of pervasive transcription by RNA degradation ......................... 29! 1.4.1! Overview of RNA degradation pathways .............................................. 29! 1.4.2! Themes of RNA degradation pathways ................................................. 33! 1.4.3! RNA degradation pathways regulating lncRNAs .................................. 34! 1.5! Emerging themes of lncRNA functional mechanisms .................................. 37! 1.5.1! Guides for chromatin modifying complexes .......................................... 38! 1.5.2! Evictors .................................................................................................. 39! 1.5.3! Sponges and decoys ............................................................................... 40! 1.5.4! Co-transcriptional functions ................................................................... 41! 1.6! Aims of this thesis ......................................................................................... 43! Chapter 2! Materials and Methods ........................................................................ 45! 2.1! S. pombe strains and growth conditions ........................................................ 45! 2.2! Transcriptomics ............................................................................................. 57! 2.3! Segmentation of sequencing data .................................................................. 58! 2.4!

The Fission Yeast Non-Coding Transcriptome

PLATFORM ABSTRACTS Abstract Abstract Numbers Numbers Tuesday, November 6 41

Supplementary Materials

Interpreting Human Genetic Variation with in Vivo Zebrafish Assays Erica E

Biocuration 2016 - Posters

From Genome to Phenome

The 2014 Version of the Gene Table of Monogenic Neuromuscular Disorders (Nuclear Genome)

How Abnormal RNA Metabolism Results in Childhood-Onset Neurological Diseases

Immune Gene Variation and Susceptibility to Upper Respiratory

Identification and Functional Characterization of Novel Genes Implicated in Congenital Myopathies Xavière Lornage

Prediction of Four Novel Snps V17M, R11H, A66T, and F57S in the SBDS

Molecular Insights to Crustacean Phylogeny

GENS DIANA DE LES PROANTOCIANIDINES Sabina Diaz Martinez