Control of Divergent Noncoding Transcription in

Saccharomyces cerevisiae

Chun Kit Andrew Wu

University College London and The Francis Crick Institute

PhD Supervisor: Folkert van Werven

A thesis submitted for the degree of Doctor of Philosophy

University College London

April 2020

Declaration

I, Chun Kit Andrew Wu, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Abstract

The regulation of gene expression underlies all cellular processes and fundamentally enables the complexity of eukaryotic organisms. Aberrant expression of noncoding RNAs can compromise normal gene expression. Gene promoters are inherently bidirectional and generate divergent noncoding RNAs along with protein-coding messenger RNAs. Chromatin and RNA turnover pathways limit expression of noncoding RNAs, but how sequence-specific transcription factors regulate divergent noncoding transcription and promoter directionality is not well understood. Here, I investigate how divergent transcription is repressed at highly expressed genes in Saccharomyces cerevisiae. I find that the sequence-specific transcription factor Rap1 limits divergent noncoding transcription at a large fraction of its target genes. Rap1 safeguards normal gene expression by limiting aberrant transcription that overlaps with neighbouring loci. Divergent RNAs initiate at or extremely close to Rap1 binding sites, indicating that Rap1 limits initiation of transcription from divergent core promoters. Stable binding of Rap1 near cryptic promoters is required and sufficient to suppress divergent transcription. Silencing cofactors or transcriptional coactivators associated with Rap1 are not required for repression of noncoding RNAs at promoters. In contrast, a small region within the Rap1 carboxy-terminal domain is required for repression of divergent transcription and affects interaction between Rap1 and the RSC chromatin remodeller. RSC and Rap1 regulate divergent transcription at Rap1-regulated gene promoters in distinct ways. Promoter output shifts from unidirectional to bidirectional transcription in the absence of Rap1, which is partially suppressed after co-depletion of Rap1 and RSC. RSC activity is not strictly required for divergent transcription, suggesting that additional regulators also play important roles. I propose that certain sequence-specific transcription factors limit the access of transcription machinery and coactivators to divergent core promoters by steric hindrance, thereby providing directionality towards productive transcription of coding genes.

Impact Statement

In this thesis, I discovered that the sequence-specific transcription factor Rap1 prevents initiation of divergent noncoding transcription near its binding sites at gene promoters, thereby conferring directionality towards productive transcription of coding genes. This work highlights the molecular basis by which a DNA-binding regulatory protein interprets the information encoded within cis-regulatory element DNA to produce a transcriptional output.

These findings have been disseminated through two peer-reviewed articles published in academic journals (Wu et al., 2018b; Wu and Van Werven, 2019). These open access articles are freely available to the scientific community and the wider public, and may be of interest to scientists investigating fundamental aspects of gene regulation in various systems including humans.

Within this thesis, I also demonstrated that CRISPR technology can be used to specifically inhibit expression of divergent long noncoding RNAs by targeting their core promoters. This fundamental information will help to inform design of CRISPR interference (CRISPRi) screens to identify genes involved in diverse cellular processes, and understand how they interact. Noncoding RNAs are associated with a wide range of human diseases, and may offer attractive therapeutic targets. The novel information regarding principles of gene regulation presented in this thesis may help to guide future therapies that exploit CRISPRi to interfere with transcription of coding genes or long noncoding RNAs in a clinical setting.

The RNA and transcription start site sequencing data sets generated within this thesis have been deposited and archived in the NCBI Gene Expression Omnibus (GEO) database repository (GEO: GSE110004). These data sets are publicly available and may help to improve future S. cerevisiae genome annotation through community-based resources such as the Saccharomyces Genome Database or Ensembl.

Acknowledgement

I must start by acknowledging my PhD supervisor, Folkert van Werven. Folkert, thank you for being the best PhD supervisor I could ever have. I have had the opportunity to see first-hand your tireless enthusiasm for science, pragmatic but fearless approach, and astute intuitions that are quite often correct. I’m very grateful for the opportunity to do a PhD in your laboratory where I grew up, learned how to be a scientist, and had the chance to contribute to something meaningful that is much bigger than myself. However, these things cannot compare to the selfless dedication and genuine care you have for all your lab members. I appreciate that you always made the time to discuss my experiments, data, career, and anything else I needed your advice on. I could not have done this PhD without your support, and I will always be grateful for your guidance and mentorship. Thank you for making these past four years a hugely enjoyable experience. I know that your work will shape the future of science and the careers of those who have the privilege of working with you, and I can’t wait to see your amazing discoveries and achievements.

Minghao Chia, thank you for being a great collaborator and colleague, but most of all a great friend. I am lucky that I can always count on your amazing expertise and knowledge, wealth of experience, and willingness to help others. I always enjoy laughing at silly things together with you, in science and in life. It has been an amazing four years working together, and I’ve had the chance to see you develop, do amazing work, and even get married! I wish you and Joy all the best as you begin a new chapter of your lives together in Singapore. I’ll be sure to visit you often so we can try some more delicious food together!

Janis Tam, what a journey we’ve taken together over the past four years. Thank you for always bringing a laugh and smile to all of us, each and every day in the lab. As with Minghao, it is always fun to share a laugh with you about everything and nothing. It was always reassuring to talk to each other and offer advice or a tasty snack as we took each daunting step together throughout our PhDs. Good luck to you as you complete your thesis and embark on an exciting new journey as well; my best wishes to you and Ryan. Because I know we have very similar tastes, I’m really looking forward to meeting up and discovering new restaurants together. 繼續加油!

Radhika Warrier, it has been an absolute pleasure to work together and get to know you. Thank you for always being caring, wise, and sensible with your advice, whether it was scientific or personal. Your encyclopaedic knowledge, quiet confidence, and astute judgement helped to keep all our crazy experiments on track. I wish you and your family good luck in the future, wherever you go and whatever you do.

Dora Sideri, thank you for your heroic efforts to maintain the smooth running of the lab, and for making our move from the LRI to the Crick in the middle of our PhDs as seamless as possible. Your calm approach and pragmatic advice always helped us keep our heads on our shoulders when things got hectic. Thanks for all your help, and good luck as you continue to make fantastic contributions.

Fabien Moretto, thank you for always taking the time to teach us everything you know. From proper yeast genetics to countless northern blots, your meticulous approach and expertise ensured that we learned to do things the right way. We certainly had quite the journey from the LRI to the Crick! I wish you and your family all the best as you start a new chapter together.

Imke Ensinck, it’s been wonderful to get to know you and see your amazing talents continue to develop over the first year of your PhD. Your warmth and sense of humour brightened up each day inside and outside the lab, and I hope we managed to pass some of our knowledge and not too many bad habits on to you. I hope you will continue to take part in London’s amazing theatre scene, and maybe I will see you in a West End production one day! Good luck to you as you continue to do amazing science together with Folkert; I can’t wait to see what you achieve.

Alice Rossi, it was fantastic to work with you in our group and get to know you. You are one of the brightest, most enthusiastic, and most Italian people I know. Quite frankly, you’re so smart and motivated, it’s unbelievable. I always enjoyed reminiscing about the Pacific Northwest together, and I look forward to seeing you grow and accomplish amazing things. And thank you for sharing your delicious baked goods and your tiramisu recipe, which I will always keep near and dear to my heart! All the best to you, as you embark on the next phase of your scientific career during your PhD at the Crick.

Luc Costello Heaven and Jessie Beck, it was great to have you as valuable members of our group and I wish you the best of luck in your future careers.

Thank you to Harshil Patel (Bioinformatics and Biostatistics, the Francis Crick Institute) for your significant contributions towards the design, analysis, and interpretation of our genome-wide data. It has been an absolute pleasure to work with you over these past four years, and you have been the most dedicated, knowledgeable, and genuine collaborator I could have asked for. I look forward to seeing you shape the future of every topic you work on.

I thank Bram Snijders and David Frith (Protein Analysis and Proteomics Platform, the Francis Crick Institute) for their help in generating and analysing our proteomics data, which produced unexpected but fruitful findings.

I acknowledge and thank the following Science Technology Platforms and facilities at the Francis Crick Institute for their experimental and organisational support: the Advanced Sequencing Facility, Bioinformatics and Biostatistics, Protein Analysis and Proteomics Platform, Peptide Chemistry, Fermentation, Glasswash, Media Preparation, and all our administrators, laboratory operations staff, and quadrant managers from the LRI and the Crick. I would especially like to acknowledge and thank the dedicated staff of the Genomics Equipment Park for always providing excellent, quick, and reliable service for our countless experiments. Our work would not be possible without all your contributions.

I thank the amazing Academic Training Team at the Crick for their guidance and support throughout my PhD programme.

To my thesis committee, comprising Frank Uhlmann and Caroline Hill at the Crick and Jürg Bähler at UCL: thank you for your wise guidance, generous mentorship, and insightful discussions that helped to shape my PhD.

I thank the following people for their generous gifts of strains and reagents that contributed to work in this thesis: Peter Thorpe (Queen Mary University of London), Amanda Johnson and Tony Weil (Vanderbilt University), Cynthia Wolberger (Johns Hopkins University), Sebastian Marquardt (University of Copenhagen), Frank Uhlmann (The Francis Crick Institute), and Jesper Svejstrup (The Francis Crick Institute).

This work was supported by the Francis Crick Institute (FC001203), which receives its core funding from Cancer Research UK (FC001203), the UK Medical Research Council (FC001203), and the Wellcome Trust (FC001203). Minghao Chia, who performed some of the work included in this thesis, was supported by a fellowship from the Agency for Science, Technology and Research (A*STAR) of Singapore.

I am grateful to the anonymous peer reviewers and colleagues at meetings who provided insightful and helpful feedback that helped to clarify and improve our work.

I would particularly like to thank Celine Bouchoux, Lea Gregersen, and the members of the Uhlmann and Svejstrup labs for their scientific advice and guidance.

To the members of the Crick/LIF 3rd floor cake club: Janis, Hon, and Tiff – thank you for your delicious baked goods that brought joy and delicious moments throughout these four years.

I thank Drice Challal and Domenico Libri from the Institut Jacques Monod (CNRS/Université Paris Diderot) for sharing unpublished data and for being considerate while coordinating our publication of complementary work together.

To my scientific mentor Alan Cheung at the ISMB (UCL/Birkbeck), thank you for allowing me to take my first steps in research five years ago. I will always be grateful for your dedicated training and continued mentorship.

Thank you as well to Cara Vaughan, Chris Taylorson, Amanda Cain, and my previous mentors at UCL, for their selfless support, advice, and care.

Thank you to my mentor and former basketball coach, Aaron Mitchell, for teaching me important life lessons that I’ll always carry with me.

To Alice Carty, Louise Blair, Glen Gronland, Patrik Eickhoff, Emily Hardman, Stephanie Nofal, Emir Aciyan, Shomon Miah, Josh Wort, Jake Wilkins, and all my friends I’ve met in London: I cannot express to you how much your love, support, and friendship have meant to me. You have become my “family” here in the UK and I can always count on you. I have always enjoyed talking about science, life, sport, food, and nonsense with you. I look forward to sharing more happy memories together and seeing every one of you grow and do amazing things. I wish you nothing but the best.

Thank you as well to my basketball team the Warthogs, the Thursday Basketball group, and the University of London Fencing Club for keeping me distracted and happy whenever my experiments were not working.

Last but not least, I thank my friends and family in Vancouver, London, and Hong Kong for their constant love and support. In particular: my parents, Olivia Chan, Siu Lung 舅父 and 舅母 Deborah, Karen and Kelly Yeung, Raveena Mahal, Kim Go, Alex Assumption, Patrick Savage, Christian Samson, Corbin Castres, Ben Hieltjes, and Dario Brzovic.

“The one important thing I have learned over the years is the difference between taking one's work seriously and taking one's self seriously. The first is imperative and the second is disastrous.” – Margot Fonteyn

"It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat." – Theodore Roosevelt

The work presented in this thesis is my original work completed without assistance, apart from the contributions from others that are clearly stated in the “Acknowledgement” preceding each Chapter and in the relevant figure legends.

Table of Contents

Abstract
Impact Statement
Acknowledgement
Table of Contents
Table of figures
List of tables
Abbreviations
Chapter 1. Introduction
1.1 Acknowledgement
1.2 Coding and noncoding transcription in eukaryotes
1.2.1 Gene expression and the transcription cycle
1.2.2 Types of noncoding RNAs in eukaryotes and their functions
1.2.3 Gene regulation by noncoding RNAs in cis and trans
1.3 Regulation of noncoding RNA expression
1.3.1 Why limit noncoding transcription?
1.3.2 Pathways that control expression of noncoding RNAs in yeast and mammalian cells
1.3.3 Chromatin-based pathways that control expression of noncoding RNAs
1.3.4 RNA degradation based pathways that control expression of noncoding RNAs
1.4 Control of promoter directionality
1.4.1 Eukaryotic promoters are bidirectional
1.4.2 Factors that regulate expression of divergent transcripts
1.4.3 Evolution of promoter directionality – enhancers and promoters
1.5 Ribosomal protein genes in Saccharomyces cerevisiae
1.5.1 Regulation of ribosomal protein gene expression
1.5.2 Additional functions of Rap1 in S. cerevisiae
1.5.3 Ribosomal protein genes as a model to study divergent noncoding transcription
1.6 Aims of this thesis
Chapter 2. Materials & Methods
2.1 Acknowledgement
2.2 Construction of yeast strains
2.2.1 Yeast strain genotypes
2.2.2 Transformation of yeast
2.2.3 Genetic crossing of yeast
2.2.4 Replica plating
2.2.5 Auxin-inducible degron (AID) system
2.2.6 Cre-LoxP system
2.3 Yeast culture conditions
2.3.1 Yeast culture conditions in liquid media
2.3.2 Growth on agar medium plates
2.3.3 Storage of yeast strains
2.4 Cloning, plasmids, and oligonucleotides
2.4.1 Rap1 mutants
2.4.2 Fluorescent reporter system for divergent promoter activity
2.4.3 Plasmid amplification and minipreps
2.4.4 Table of plasmids used in this study
2.4.5 Table of oligonucleotides used in this study
2.5 Experimental methods
2.5.1 Fluorescence microscopy
2.5.2 Spot growth assay
2.5.3 RNA extraction
2.5.4 Northern blot
2.5.5 Western blot
2.5.6 Antibodies
2.5.7 Cycloheximide protein stability assay
2.5.8 Chromatin immunoprecipitation (ChIP)
2.5.9 CRISPR interference (CRISPRi)
2.5.10 Single molecule RNA fluorescence in situ hybridisation (FISH)
2.5.11 RNA sequencing (RNA-seq)
2.5.12 Transcription start site sequencing (TSS-seq)
2.5.13 Nascent RNA sequencing (Nascent RNA-seq)
2.5.14 Chromatin proteomics mass spectrometry
2.6 Bioinformatic analysis
2.6.1 Differential expression analysis
2.6.2 TSS-seq analysis
2.6.3 ChIP-seq and MNase-seq analysis
2.6.4 Promoter directionality score analysis
2.6.5 Data plotting and visualisation
2.6.6 Quantification and statistical analysis
2.7 Data and Software Availability
Chapter 3. Identification of Rap1 as a Key Repressor of Divergent Noncoding Transcription
3.1 Acknowledgement
3.2 Abstract
3.3 Introduction
3.4 Results
3.4.1 Generation of the auxin-inducible degron (AID) system to deplete essential proteins in yeast
3.4.2 Divergent transcripts at ribosomal protein genes are regulated by Rap1
3.4.3 Aberrant expression of Rap1-regulated divergent transcripts mis-regulates neighbouring genes
3.4.4 Analysis of RNA sequencing experiments to measure changes in transcript expression after global Rap1 depletion
3.4.5 Rap1 represses noncoding transcription at hundreds of sites across the yeast genome
3.4.6 Rap1 represses divergent noncoding transcripts at the majority of Rap1-regulated gene promoters
3.4.7 Rap1 is not redundant with noncoding RNA surveillance pathways in yeast
3.4.8 Rap1 represses noncoding transcription in a distinct manner to previously described chromatin regulatory pathways
3.4.9 Genome-wide analysis of noncoding transcription in chromatin regulatory factor mutants
3.4.10 Rap1 is not redundant with other chromatin regulatory pathways in repression of divergent transcription
3.4.11 Rap1 and other chromatin regulatory pathways control divergent and antisense transcription in distinct genomic locations
3.5 Discussion
3.5.1 Summary
3.5.2 Evaluation of the AID system to study essential transcription factors
3.5.3 Rap1 specifically limits divergent transcription
3.5.4 Consequences of aberrant divergent transcription
3.5.5 Rap1 controls expression of noncoding transcripts to a large extent
3.5.6 Functional redundancy between Rap1 and other pathways that limit noncoding RNAs
3.5.7 Conclusion
Chapter 4. Mechanism and Key Regulatory Principles for Regulation of Divergent Transcription by Rap1
4.1 Acknowledgement
4.2 Abstract
4.3 Introduction
4.4 Results
4.4.1 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at the RPL43B locus
4.4.2 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at an independent promoter
4.4.3 Repression of divergent noncoding transcription is independent of Rap1 motif orientation
4.4.4 Transcription start site sequencing (TSS-seq) identifies transcript initiation sites at single nucleotide resolution
4.4.5 Validation of TSS-seq data
4.4.6 Rap1 represses divergent TSSs near its binding sites genome-wide
4.4.7 Investigation of candidate transcription factors that may control divergent transcription
4.4.8 The Rap1 silencing function is not crucial for repression of divergent noncoding transcripts
4.4.9 The Rap1 C-terminal domain is important for repression of divergent noncoding transcription
4.4.10 Tethering of the Rap1 C-terminal domain to DNA partly suppresses divergent transcription
4.4.11 A small region within the Rap1 CTD comprising residues 631-696 is important for repression of divergent noncoding transcription
4.4.12 Tethering a bulky protein to RP gene promoters via the Rap1 DNA-binding domain is not sufficient to repress divergent transcript expression
4.4.13 Investigation of divergent TSS usage and regulation using the CRISPR interference system
4.5 Discussion
4.5.1 Summary
4.5.2 Evaluation of TSS-seq approach
4.5.3 Evaluation of a “transcriptional roadblock” mechanism
4.5.4 Decoupling of coding and divergent core promoters
4.5.5 Regulation of divergent transcript expression by Rap1 silencing cofactors
4.5.6 Transcriptional repression of divergent core promoters by steric hindrance
4.5.7 Conclusion
Chapter 5. Regulatory Interplay between Rap1 and the RSC Chromatin Remodeller
5.1 Acknowledgement
5.2 Abstract
5.3 Introduction
5.4 Results
5.4.1 Identification of Rap1 chromatin protein interactome using proteomics mass spectrometry
5.4.2 Rap1 and RSC regulate chromatin organisation at Rap1-regulated genes
5.4.3 RSC promotes divergent transcription in the absence of Rap1 at several model loci
5.4.4 Development of nascent RNA-seq method to measure nascent transcription genome-wide
5.4.5 Validation of nascent RNA-seq using conditional depletion mutants for Rap1
5.4.6 Nascent RNA-seq enriches for nascent RNA polymerase II transcripts
5.4.7 RSC regulates relative expression of a large fraction of coding genes and noncoding RNAs in budding yeast
5.4.8 Rap1 and RSC have distinct chromatin organisation functions at gene promoters
5.4.9 RSC activity is required for divergent transcription in some instances
5.4.10 Rap1 and RSC control noncoding transcription around Rap1 sites genome-wide
5.4.11 Quantification of promoter directionality using nascent RNA sequencing
5.4.12 Contribution of Rap1 and RSC towards promoter directionality at Rap1-regulated genes
5.4.13 RSC controls promoter directionality at hundreds of yeast gene promoters
5.5 Discussion
5.5.1 Summary
5.5.2 Evaluation of nascent RNA-seq approach
5.5.3 Regulation of divergent transcription across eukaryotic species
5.5.4 Contribution of ATP-dependent chromatin remodellers towards promoter directionality
5.5.5 Regulatory interplay between Rap1 and RSC
5.5.6 Conclusion
Chapter 6. Discussion
6.1 Acknowledgement
6.2 Summary of key findings
6.3 A model for regulation of divergent noncoding transcription in Saccharomyces cerevisiae
6.3.1 Control of divergent noncoding transcription at Rap1-regulated genes
6.3.2 Control of divergent noncoding transcription by RSC chromatin remodeller
6.3.3 Requirements and limitations of the steric hindrance model for regulation of divergent noncoding transcription
6.4 Alternative hypotheses to steric hindrance model
6.4.1 Regulation by pausing of RNA polymerase II
6.4.2 Regulation of TBP activity
6.4.3 Regulation of divergent transcription by gene looping and chromatin conformation
6.4.4 Negative supercoiling and DNA accessibility at divergently oriented core promoters
6.4.5 Regulation of divergent transcript expression by gene positioning in the nucleus
6.5 Relevance to higher eukaryotes
6.5.1 Bidirectional promoters and enhancers
6.5.2 Regulation of divergent core promoters and promoter directionality across the domains of life
6.6 Resources for scientific community
6.7 Future directions
6.7.1 What are the consequences if Rap1 does not repress noncoding transcription?
6.7.2 Can other sequence-specific transcription factors also control divergent transcription in a similar manner to Rap1?
6.7.3 Which additional factors regulate divergent promoter activity?
Chapter 7. Appendix
7.1 Copyright Permissions
Chapter 8. References

Table of figures

Figure 1.1 Structure of a typical protein-coding gene and messenger RNA
Figure 1.2 Key steps of the transcription cycle
Figure 1.3 Common sources of noncoding RNAs in mammalian cells
Figure 1.4 Examples of transcriptional interference and R-loop formation during transcription-replication conflict
Figure 1.5 Intragenic histone deacetylation through the Set1/Set2 and Set3/Rpd3S pathways
Figure 1.6 Structure of a bidirectional gene promoter
Figure 1.7 Control of promoter directionality by termination and degradation of divergent noncoding RNAs
Figure 1.8 Structural and functional organisation of a ribosomal protein gene promoter in S. cerevisiae
Figure 1.9 Additional functions of S. cerevisiae Rap1 at telomeres and the hidden mating (HM) type loci
Figure 3.1 Generation of auxin-inducible degron (AID) system to deplete essential transcription factors in yeast
Figure 3.2 Divergent transcripts at ribosomal protein genes are regulated by Rap1
Figure 3.3 Aberrant expression of Rap1-regulated divergent transcripts mis-regulates neighbouring genes
Figure 3.4 Analysis of RNA sequencing experiments to measure changes in transcript expression after global Rap1 depletion
Figure 3.5 Rap1 represses noncoding transcription at hundreds of sites across the yeast genome
Figure 3.6 Rap1 represses divergent noncoding transcripts at the majority of Rap1-regulated gene promoters
Figure 3.7 Rap1 is not redundant with noncoding RNA surveillance pathways in yeast
Figure 3.8 Rap1 represses noncoding transcription in a distinct manner to previously described chromatin regulatory pathways
Figure 3.9 Genome-wide analysis of noncoding transcription in chromatin regulatory factor mutants
Figure 3.10 Rap1 is not redundant with other chromatin regulatory pathways in repression of divergent transcription
Figure 3.11 Rap1 and other chromatin regulators control divergent and antisense transcription in distinct genomic locations
Figure 4.1 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at the RPL43B locus
Figure 4.2 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at an independent promoter
Figure 4.3 Repression of divergent noncoding transcription is independent of Rap1 motif orientation
Figure 4.4 Transcription start site sequencing (TSS-seq) identifies transcription initiation sites at single nucleotide resolution
Figure 4.5 Validation of TSS-seq data
Figure 4.6 Rap1 represses divergent TSSs near its binding sites genome-wide
Figure 4.7 Investigation of candidate transcription factors in yeast that may control divergent transcription
Figure 4.8 The Rap1 silencing function is not crucial for repression of divergent noncoding transcripts
Figure 4.9 The Rap1 C-terminal domain is important for repression of divergent noncoding transcription
Figure 4.10 Tethering of the Rap1 C-terminal domain to DNA partly suppresses divergent transcription
Figure 4.11 A small region within the Rap1 CTD comprising residues 631-696 is important for repression of divergent noncoding transcription
Figure 4.12 Tethering a bulky protein to RP gene promoters via the Rap1 DNA-binding domain is not sufficient to repress divergent transcript expression
Figure 4.13 Investigation of divergent TSS usage and regulation using the CRISPR interference system
Figure 4.14 Requirements for repression of divergent transcription initiation by Rap1
Figure 5.1 Identification of Rap1 chromatin protein interactome using proteomics mass spectrometry
Figure 5.2 Rap1 and RSC regulate chromatin organisation at Rap1-regulated genes
Figure 5.3 RSC promotes divergent transcription in the absence of Rap1 at several model loci
Figure 5.4 Development of nascent RNA-seq method to measure nascent transcription genome-wide
Figure 5.5 Validation of nascent RNA-seq using conditional depletion mutants for Rap1
Figure 5.6 Nascent RNA-seq enriches for nascent RNA polymerase II transcripts
Figure 5.7 RSC regulates relative expression of a large fraction of coding genes
Figure 5.8 RSC regulates relative expression of many noncoding RNAs
Figure 5.9 Rap1 and RSC have distinct chromatin organisation functions at gene promoters
Figure 5.10 Rap1 and RSC function independently at RPL43B and RPL40B
Figure 5.11 RSC activity is required for divergent transcription in some instances
Figure 5.12 Rap1 and RSC control noncoding transcription around Rap1 sites genome-wide
Figure 5.13 Quantification of promoter directionality using nascent RNA sequencing
Figure 5.14 Contribution of Rap1 and RSC towards promoter directionality at Rap1-regulated genes
Figure 5.15 RSC controls promoter directionality at hundreds of yeast gene promoters
Figure 6.1 Control of divergent noncoding and coding transcription at Rap1-regulated genes
Figure 6.2 Control of divergent noncoding transcription by RSC chromatin remodeller
Figure 6.3 Steric hindrance model for regulation of divergent noncoding transcription

List of tables...

Table 1.1 Mechanisms of gene regulation by long noncoding RNAs
Table 2.1 Table of yeast strain genotypes
Table 2.2 Table of plasmids used in this study
Table 2.3 Table of oligonucleotides used in this study
Table 2.4 Table of antibodies used for western blotting
Table 3.1 Table summarising targeted chromatin regulator screen
Table 4.1 Table summarising biochemical and functional properties of candidate transcription factors
Table 6.1 Examples of transcriptional repression by steric hindrance in different organisms and systems

Abbreviations

AD: Activation domain
AID: Auxin-inducible degron
AS: Antisense
ATP: Adenosine triphosphate
ATPase: Adenosine triphosphatase
bp: Base pair
C: Crick (strand)
cDNA: Complementary DNA
CF: Cleavage factor
ChIP: Chromatin immunoprecipitation
ChIP-seq: Chromatin immunoprecipitation sequencing
CHX: Cycloheximide
CPF: Cleavage and polyadenylation factor
CRAC: Cross-linking and analysis of cDNAs
CRISPR: Clustered regularly interspaced short palindromic repeats
CRISPRi: CRISPR interference
CTD: Carboxy-terminal domain
CUT: Cryptic unstable transcripts
DBD: DNA-binding domain
dCas9: Nuclease-inactivated Cas9
DIC: Differential interference contrast
DMSO: Dimethyl sulfoxide
DNA: Deoxyribonucleic acid
DNase: Deoxyribonuclease
ECL: Enhanced chemiluminescence
EDTA: Ethylenediaminetetraacetic acid
eRNA: Enhancer RNA
ES cells: Embryonic stem cells
EV: Empty vector
FACT: Facilitates chromatin transcription
FDR: False discovery rate
FISH: Fluorescence in situ hybridisation
FL: Full length
GEO: Gene Expression Omnibus
GFP: Green fluorescent protein
GO: Gene Ontology
GRO-seq: Global run-on sequencing
H3K36: Histone H3 lysine 36
H3K4: Histone H3 lysine 4
HM: Hidden mating type
IAA: 3-indole-acetic acid
IGV: Integrative Genomics Viewer
IP: Immunoprecipitation
KAN: Kanamycin
kDa: Kilodalton
LexO: Lex operator
LFQ: Label-free quantification
lncRNA: Long noncoding RNA
M: Molar
MAT: Mating type
Mb: Megabases
MNase: Micrococcal nuclease
MNase-seq: Micrococcal nuclease sequencing
mRNA: Messenger RNA
NAT: Nourseothricin
NDR: Nucleosome-depleted region
NET-seq: Native elongating transcript sequencing
NLS: Nuclear localisation signal
NMD: Nonsense-mediated decay
NNS: Nrd1-Nab3-Sen1
NPC: Nuclear pore complex
Nt: Nucleotide
NTD: Amino-terminal domain
NUT: Nrd1-unterminated transcripts
OD: Optical density
ORF: Open reading frame
PAGE: Polyacrylamide gel electrophoresis
PAS: Polyadenylation signal
PBS: Phosphate buffered saline
PCR: Polymerase chain reaction
PE: Paired-end
Pi: Phosphate
pI: Isoelectric point
PIC: Pre-initiation complex
PKA: Protein kinase A
polyA: Polyadenylated
PROMPT: Promoter upstream transcript
PRO-seq: Precision run-on sequencing
qPCR: Quantitative PCR
Rap1: Repressor-activator protein 1
RNA: Ribonucleic acid
RNase: Ribonuclease
RNA-seq: RNA sequencing
RNP: Ribonucleoprotein
RP: Ribosomal protein
rRNA: Ribosomal RNA
RSC: Remodels the Structure of Chromatin
RT: Reverse transcription
S: Sense
SDS: Sodium dodecyl sulfate
SE: Single-end
SEM: Standard error of the mean
Ser2: Serine 2
Ser5: Serine 5
SGD: Saccharomyces Genome Database
sgRNA: Single guide RNA
snRNP: Small nuclear ribonucleoprotein
SPO: Sporulation media
SSC: Saline-sodium citrate
SUT: Stable unannotated transcripts
SWI/SNF: Switch/Sucrose non-fermentable
TBP: TATA-binding protein
TCA: Trichloroacetic acid
TES: Transcription end site
TF: Transcription factor
TFIIA: Transcription factor II A
TFIID: Transcription factor II D
TOR: Target of rapamycin
Tox: Toxicity (domain)
TPM: Transcripts per million
tRNA: Transfer RNA
TSS: Transcription start site
TSS-seq: Transcription start site sequencing
TTS: Transcription termination site
UTR: Untranslated region
W: Watson (strand)
WT: Wild-type
XUT: Xrn1-sensitive unstable transcripts
YFP: Yellow fluorescent protein
YPD: Yeast extract peptone dextrose
Δ: Deletion

Chapter 1. Introduction

1.1 Acknowledgement

Some of the content in this chapter has been published in a Point-of-View article in Transcription (Wu and Van Werven, 2019), and has been modified to present within this chapter.

1.2 Coding and noncoding transcription in eukaryotes

1.2.1 Gene expression and the transcription cycle

Gene expression underlies all life on Earth. In order to produce the macromolecular machinery required to carry out essential cellular processes, the central dogma of molecular biology stipulates that genetic information must be converted from a sequence of nucleotides (G, A, T, and C) on DNA into a gene product – a protein or functional RNA. Genes comprise a unit of nucleotides on DNA or RNA that encode a functional molecule. The sequence of nucleotides within a gene must be transcribed from double-stranded DNA into an RNA molecule, which is then further processed to become functional RNAs or translated by the ribosome to produce proteins, which are composed of linear polypeptide chains. The amino acid composition and sequence of proteins are dictated by the sequence of triplet nucleotide codons on DNA and corresponding RNA. In addition to sequences encoding proteins, DNA also contains regulatory information that is read and interpreted by different regulatory proteins. A gene typically comprises a promoter containing regulatory information to dictate activation of transcription, a core promoter where the transcription start site (TSS) is located, a gene body containing the start codon, coding exons, introns, and stop codon, a transcript end site (TES), and a transcription termination site (TTS) (Mellor et al., 2016) (Figure 1.1). Genes are transcribed by the RNA polymerase enzyme, which processively travels along the DNA from the TSS to the transcription termination site and synthesises an RNA molecule in a templated fashion.
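As a rough illustration of the information flow described above, the following Python sketch transcribes a short DNA template strand into RNA and then translates the message one triplet codon at a time. The function names, the example sequence, and the deliberately partial codon table are assumptions made purely for illustration and are not taken from this thesis.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
# The codon table is deliberately partial (the full genetic code has 64 codons).
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}

# Base-pairing rules for copying a DNA template strand into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') templated by a DNA strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon, one triplet at a time."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein product in this toy model
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # "X" marks codons that are missing from the partial table above.
        residue = CODON_TABLE.get(mrna[i:i + 3], "X")
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


template = "TACCGATTTACCATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna, translate(mrna))
```

The sketch ignores promoters, RNA processing, and regulation entirely; it only shows how the order of triplet codons on the template determines the order of amino acids.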

Figure 1.1 Structure of a typical protein-coding gene and messenger RNA A typical eukaryotic protein-coding gene comprises a gene promoter containing a transcription start site (TSS), one or more coding exons, (possibly) introns that are spliced out of pre-messenger RNAs (pre-mRNAs), start codons and stop codons at the 5’ end and 3’ end of the open reading frame (ORF) respectively, a transcript end site (TES), and a transcription termination site (TTS). The transcribed pre-mRNA is capped at its 5’ end by the addition of a 7-methyl-guanosine cap (me7Gppp), and a poly(A) tail is added to the 3’ end of the transcript at the TES. The 5’ untranslated region (UTR) comprises the transcribed region between the TSS and start codon of the first exon, and the 3’ UTR comprises the transcribed region between the stop codon of the last exon and the TES. Introns are removed from the pre-mRNA and the intron-flanking exons are joined in the process of splicing prior to mRNA export.

In human cells, ~2 metres of linear DNA (~4 metres after S-phase) must be packaged into a small nucleus approximately 10 μm in diameter. In eukaryotes, DNA does not exist as a naked polymer, but is packaged into chromatin – a dense complex of proteins, DNA, and RNA (Kornberg, 1974; Paulson and Laemmli, 1977). The basic structural unit of chromatin is the nucleosome, comprising a length of DNA (approx. 146 bp, or 1.7 turns of DNA) wrapped around a core octamer of histone proteins (Luger et al., 1997). Nucleosomes are regularly arranged as repetitive structures and are connected by stretches of linker DNA of variable length (Clark, 2010). Chromatin and nucleosomes are generally repressive towards gene transcription (Han and Grunstein, 1988; Knezetic and Luse, 1986; Lorch et al., 1987). Regulation of DNA accessibility represents a major mechanism of gene regulation. In addition to the information encoded in the DNA sequence, additional heritable epigenetic information is conveyed in the form of covalent modifications to histones (especially histone tails) and DNA bases (Allis and Jenuwein, 2016).

Regulation of gene expression is essential for all cellular processes and underlies the complexity of eukaryotic systems. To produce distinct cell identities, different combinations of genes must be activated or repressed. This regulation occurs at various stages during gene expression, especially during transcription. Prokaryotes utilise simpler RNA polymerases comprising fewer subunits and, like archaea, possess only one type of RNA polymerase enzyme (Cramer, 2002). However, eukaryotic cells possess multiple RNA polymerases that transcribe distinct subsets of RNAs (Carter and Drouin, 2009). RNA polymerase II transcribes precursor messenger RNAs (mRNAs), long noncoding RNAs (lncRNAs), small nuclear RNAs (snRNAs), and precursors to microRNAs. RNA polymerase I transcribes precursors of ribosomal RNA (rRNA), and RNA polymerase III transcribes tRNAs, small rRNAs, and other small RNAs. RNA polymerases IV and V are plant-specific and are involved in siRNA-mediated gene silencing (Zhou and Law, 2015). In this thesis, I will focus primarily on the regulation of RNA polymerase II. The introduction to transcriptional regulation below mainly describes the principles and mechanisms best understood in S. cerevisiae, which may differ from the corresponding aspects found within other eukaryotic cells, including humans.

Transcription typically involves three key phases: initiation, elongation, and termination (Figure 1.2) (Hahn and Young, 2011; Svejstrup, 2004). These phases require the coordinated action of multiple proteins and complexes, and are subject to extensive regulation. Initiation of transcription begins when a sequence-specific transcription factor locates and binds to its target motif on DNA, typically within an enhancer or gene promoter. Transcription factors recruit coactivator complexes (for example SAGA, NuA4, and Mediator in yeast) to facilitate recruitment of general transcription factors and stimulate assembly of the pre-initiation complex (PIC) at the core promoter (Hahn and Young, 2011). Coactivators can stimulate transcription by adding or removing post-translational modifications on promoter-proximal nucleosomes, or recruiting basal transcription factors. In budding yeast, PIC assembly is generally dependent on recruitment and action of SAGA or TFIID (Baptista et al., 2017; Warfield et al., 2017). During PIC assembly, the TATA-binding protein (TBP) binds to the gene promoter, and imposes a nearly 90˚ bend in the double-stranded DNA which facilitates template positioning in the PIC (Sainsbury et al., 2015). TBP localisation and activity are subject to extensive regulation by additional factors in eukaryotic cells including yeast, mice, and humans (Pugh, 2000; van Werven et al., 2008; Xue et al., 2017). The PIC is assembled in a stepwise fashion from RNA polymerase II and general transcription factors such as TFIIA, TFIIB, TFIIF, TFIIE, and TFIIH (Sainsbury et al., 2015). RNA polymerase II is a protein complex of twelve subunits (Rpb1 – Rpb12 in S. cerevisiae), and the carboxy-terminal domain (CTD) of the largest subunit (Rpb1 in S. cerevisiae) comprises multiple heptad repeats of the amino acid sequence YSPTSPS (Eick and Geyer, 2013). The number of repeats varies between species: there are 26 repeats in S. cerevisiae, whereas there are 52 repeats in humans. Phosphorylation of different tyrosine, serine, and threonine residues within each CTD repeat is associated with different stages in the transcription cycle (Buratowski, 2009; Eick and Geyer, 2013). The CTD is a key regulatory hub and assembly platform for other proteins involved in initiation, elongation, and termination. During initiation, the promoter DNA “melts” locally to form a transcription bubble, and the DNA template strand is situated within the RNA polymerase active site to provide a template for RNA synthesis (Sainsbury et al., 2015). Phosphorylation of the Ser5 residue of the RNA polymerase II CTD by CDK7 (within TFIIH) is required for promoter clearance and promoter-proximal pausing, which is prominent in mammalian cells (Buratowski, 2009; Eick and Geyer, 2013). Transcription initiates at the TSS, and the 5’ end of the RNA is modified with a 7-methylguanylate cap (me7Gppp) by a capping enzyme recruited by the RNA polymerase II CTD (McCracken et al., 1997; Rasmussen and Lis, 1993).
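To make the repeat structure of the CTD concrete, the short Python sketch below builds an idealised CTD from the consensus heptad and locates the Ser2 and Ser5 positions within it. It assumes that every repeat matches the consensus YSPTSPS exactly, which is a simplification because repeats in real CTDs can deviate from the consensus; the function and variable names are hypothetical.

```python
# Illustrative sketch only: assumes every repeat matches the consensus heptad.
HEPTAD = "YSPTSPS"  # consensus repeat: Y1 S2 P3 T4 S5 P6 S7

# Repeat numbers as stated in the text above.
REPEAT_COUNTS = {"S. cerevisiae": 26, "human": 52}


def idealised_ctd(n_repeats: int) -> str:
    """Return an idealised CTD sequence made of n identical consensus heptads."""
    return HEPTAD * n_repeats


def heptad_positions(n_repeats: int, position_in_heptad: int) -> list[int]:
    """Return 1-based CTD coordinates of one heptad position across all repeats."""
    return [len(HEPTAD) * i + position_in_heptad for i in range(n_repeats)]


for species, n in REPEAT_COUNTS.items():
    ctd = idealised_ctd(n)
    ser2 = heptad_positions(n, 2)
    ser5 = heptad_positions(n, 5)
    print(species, len(ctd), len(ser2), len(ser5))
```

Under this idealisation, each species has one Ser2 and one Ser5 per repeat, so the number of potential phosphorylation sites at each position scales directly with repeat number.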

In mammalian cells, after transcription initiation, inhibitory factors such as NELF and DSIF stabilise pausing of engaged RNA polymerase within ~60 nucleotides of the TSS (Adelman and Lis, 2012; Core and Adelman, 2019; Tome et al., 2018). A complement of elongation factors stimulates the transition from paused to elongating RNA polymerase, which is associated with phosphorylation of the Ser2 residues of the CTD by CDK9 within P-TEFb. In higher eukaryotes, pause-release constitutes a key regulatory step in transcription (Adelman and Lis, 2012; Core and Adelman, 2019). It is important to note that transcription is a dynamic process and occurs in bursts – more frequent initiation and pause-release events lead to higher overall expression of each gene (Raj et al., 2006; Raj and van Oudenaarden, 2008). Splicing factors also remove transcribed introns from pre-mRNAs and splice exons together co-transcriptionally (Herzel et al., 2017). Stalling or backtracking of elongation complexes can be relieved by TFIIS (encoded by DST1 in yeast), which stimulates endonucleolytic cleavage of backtracked RNA blocking the RNA polymerase active site (Cheung and Cramer, 2011; Churchman and Weissman, 2011; Sheridan et al., 2019).

Elongation occurs until the RNA polymerase enzyme encounters a polyadenylation signal (PAS) demarcating where the transcript end site (TES) is located (Proudfoot, 2011; Proudfoot and Brownlee, 1976). In budding yeast, the site of transcription termination is not well defined by the polyadenylation signal, in contrast to human cells. The CPSF-CPF complex recognises the PAS on the RNA and endonucleolytically cleaves the transcript at the 3’ end of the gene, after which a poly(A) tail is added to the 3’ end of the RNA by poly(A) polymerase (Porrua and Libri, 2015). In budding yeast, transcription termination for noncoding RNAs occurs through another pathway, involving the action of Nrd1, Nab3, and Sen1 (NNS) (Arigo et al., 2006a; Arigo et al., 2006b; Steinmetz et al., 2001; Thiebaut et al., 2006). RNA polymerase can transcribe for several hundred to several thousand nucleotides after the transcript end site, and eventually disengages from the DNA template through a mechanism that is not yet completely understood (Porrua and Libri, 2015). The pre-mRNA molecule is processed co-transcriptionally and post-transcriptionally to generate a messenger RNA (mRNA), containing a 5’ cap, a poly(A) tail of several hundred nucleotides, and a coding sequence (CDS) comprising one or more exons which is flanked by 5’ and 3’ untranslated regions (UTRs). Further details of the steps and key factors involved in transcription initiation, elongation, and termination have been covered in many comprehensive reviews (Buratowski, 2009; Core and Adelman, 2019; Eick and Geyer, 2013; Porrua and Libri, 2015; Sainsbury et al., 2015).

Figure 1.2 Key steps of the transcription cycle The key steps of the transcription cycle are initiation, elongation, and termination. Important events that occur are listed for each stage (black text), along with the key factors that mediate these events (grey text). The regions where the key steps of the transcription cycle respectively occur are stated at the bottom of the schematic diagram (grey text, italicised). The phosphorylation status (P) of two key regulatory residues, serine 2 (S2) and serine 5 (S5) of each heptad repeat within the RNA polymerase II (Pol II) C-terminal domain (CTD), varies across the steps of the transcription cycle. TSS, transcription start site.

1.2.2 Types of noncoding RNAs in eukaryotes and their functions

Protein-coding genes only constitute a small portion of the total genomic content in different eukaryotic species (Alexander et al., 2010). Significant advances in RNA sequencing technology and bioinformatic analysis have revealed that much more of the genome is transcribed besides coding genes. For example, 73.5% of the 12 million bp budding yeast genome encodes protein, and approximately 85% of its genome is transcribed (David et al., 2006). In humans, only 2.8% of the genome encodes protein but up to 76% is transcribed in various cell types (Alexander et al., 2010; ENCODE Project Consortium, 2012). Noncoding transcription can initiate in noncoding regions, generating a variety of diverse noncoding RNAs. In particular, long noncoding RNAs (lncRNAs) comprise RNA transcripts at least 200 nucleotides in length without obvious protein-coding function. LncRNA transcripts are often capped, spliced, and polyadenylated (Derrien et al., 2012; Hon et al., 2017). In mammalian cells, noncoding RNAs can arise from various locations, but these classes are particularly notable: bidirectional enhancer RNAs (eRNAs), divergent promoter lncRNAs and short promoter-associated transcripts, terminal antisense lncRNAs, intronic lncRNAs, nested antisense lncRNAs, and stand-alone lncRNAs, also known as long intergenic noncoding RNAs (lincRNAs) (Figure 1.3) (Kung et al., 2013). With more sensitive RNA sequencing technology coupled with epigenomics and bioinformatic prediction, the number of annotated lncRNA loci has outgrown the number of protein-coding genes in humans and continues to increase (Hon et al., 2017; Uszczynska-Ratajczak et al., 2018). In addition, transposable elements represent at least 45% of the human genome and are a major source of noncoding RNAs due to residual regulatory sequences, for example in long terminal repeats (LTRs) (Alexander et al., 2010; Hoekstra et al., 2013). Here, I will primarily focus on noncoding RNAs transcribed by RNA polymerase II, but other functional RNA species include transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), and mitochondrial RNAs.

Figure 1.3 Common sources of noncoding RNAs in mammalian cells Noncoding RNAs often originate as divergent transcripts from bidirectional gene promoters, as antisense transcripts from nucleosome-depleted regions (NDRs) at the 3’ ends of protein-coding genes, as long intergenic noncoding RNAs (lincRNAs), as short bidirectional enhancer RNAs (eRNAs), and as lncRNAs associated with transcriptional enhancers in mammalian cells. These species constitute a major fraction of noncoding RNAs transcribed by RNA polymerase II in mammalian cells.

It has been debated whether most transcriptional activity of RNA polymerase II represents functional transcription or transcriptional noise due to low fidelity of transcription initiation. Some estimates have proposed that ~90% of RNA polymerase II initiation events in budding yeast represent transcriptional noise (Struhl, 2007). A widely quoted figure from the ENCODE project assigns “biochemical activity” to 80% of DNA in the human genome, implying that much of the noncoding DNA and RNA in the cell is in fact functional (ENCODE Project Consortium, 2012). LncRNAs fall into three general categories: lncRNAs that represent “transcriptional noise”, lncRNAs whose act of transcription, but not the transcripts themselves, mediates their function, and functional lncRNAs that regulate gene expression in cis or in trans. To conclusively determine whether a specific lncRNA has specific functions, experimental perturbation is required. For example, targeting 16,401 lncRNAs using a CRISPR interference screening strategy in 7 human cell lines identified that only a small percentage (3-8%) of lncRNAs impacted growth in any one cell type (Liu et al., 2017). Those lncRNAs whose knockdown impacted growth did so in a highly specific manner – of the 499 lncRNA loci required for robust cellular growth, 89% displayed growth phenotypes exclusively in one cell type. In another study, CRISPR gene editing was used to dissect 12 lncRNA loci in murine cell lines (Engreitz et al., 2016). In this case, only 5 lncRNAs modulated expression of a nearby gene in cis within 10 Mb of the locus, and the RNAs themselves were not required to regulate expression – rather, processes involved in lncRNA expression such as enhancer-like promoter activity, the act of transcription, or splicing were important. In zebrafish embryos, knockout of 25 individual lncRNAs resulted in no overt developmental phenotypes for embryogenesis, viability, and fertility, implying that many lncRNAs are dispensable for cellular function (Goudarzi et al., 2019). It has been proposed that splicing activity may differentiate functional from non-functional lncRNAs in human cells (Churchman, 2017; Mele et al., 2017; Schlackow et al., 2017). However, functional testing on a case-by-case basis is required to conclusively determine whether individual lncRNAs are functional or non-functional products of transcription.

1.2.3 Gene regulation by noncoding RNAs in cis and trans

Transcription of noncoding RNAs, and the noncoding RNAs themselves, can regulate gene expression through various mechanisms (Table 1.1) (Kung et al., 2013). Britten and Davidson first proposed in 1969 that noncoding RNAs could act as regulatory intermediates to convey signals in gene regulatory networks (Britten and Davidson, 1969). Numerous studies over several decades have highlighted different pathways through which noncoding RNAs and noncoding transcription can affect gene expression in cis or in trans (Table 1.1). Here, I describe some notable examples in budding yeast and higher eukaryotes to highlight the fact that noncoding transcription can affect or facilitate normal programmes of gene expression.

Nucleus                                | Cytoplasm
Decoy for transcription factor         | Disruption of translation
Nuclear body formation & function      | Transcription factor localisation
Disruption of transcription machinery  | Competition for miRNA binding
Chromatin looping                      | Decrease in mRNA stability
Transcriptional coactivator            | Increase in mRNA stability
Tether                                 | miRNA site masking
Scaffold                               |
Transcriptional collision              |
Transcriptional read-through           |
Splice site masking                    |

Table 1.1 Mechanisms of gene regulation by long noncoding RNAs Some examples of gene regulation events mediated by long noncoding RNAs are listed, under the cellular compartments in which they occur. Adapted from (Kung et al., 2013). Some of these pathways are specific to higher eukaryotes and are not present in S. cerevisiae (e.g. miRNA competition and masking).

Noncoding RNAs can act in trans by recruiting chromatin regulatory complexes to specific target loci. In mammals, gene expression at the Hox loci (HOXA to HOXD) must be carefully coordinated, as expression of the Hox transcription factors specifies positional cell identity (Mallo and Alonso, 2013). In particular, differential histone methylation is important to demarcate regions of gene silencing from gene activation (Schuettengruber et al., 2007). The noncoding RNA HOTAIR is transcribed from the HOXC locus, and interacts with the Polycomb Repressive Complex 2 (PRC2) (Rinn et al., 2007). HOTAIR is required for recruitment of PRC2 to the HOXD locus, which leads to H3K27 trimethylation (me3) and gene repression. Other specialised cellular RNAs such as Xist and Kcnq1ot1 also associate with PRC2 and modulate recruitment of repressive complexes to target genes (Pandey et al., 2008; Silva et al., 2003). In fact, short promoter RNAs transcribed at Polycomb target and other active genes can also recruit PRC2 (Kanhere et al., 2010; Khalil et al., 2009; Zhao et al., 2010). The interaction between PRC2 and RNA or chromatin is mutually antagonistic in vitro and in vivo, which highlights a regulatory mechanism for PRC2 activity (Beltran et al., 2016). Another lncRNA, HOTTIP, is transcribed from the 5’ end of the HOXA locus and interacts with the adaptor protein WDR5. HOTTIP and WDR5 serve to recruit the MLL H3K4 methyltransferase complex to the HOXA gene cluster via long-range chromosomal looping (Wang et al., 2011).

Transcription factors can act in concert with noncoding RNAs to mediate their effects in trans. The spreading of the Xist transcript across the inactive X (Xi) in human and mouse cells is facilitated by interactions with the transcription factor YY1, which tethers Xist RNA to the inactive X nucleation center and aids loading of Xist onto Xi (Jeon and Lee, 2011). Short enhancer RNAs (eRNAs) are transcribed bidirectionally and interact with the histone acetyltransferase domain of the coactivator CBP, which stimulates its histone acetyltransferase activity in an RNA-dependent manner (Bose et al., 2017). This activity likely contributes to H3K27 acetylation, which demarcates active enhancers that stimulate target gene expression. Noncoding RNAs can also act in concert with other chromatin regulatory pathways to regulate gene expression. This is epitomised at the protocadherin (Pcdh) loci, where stochastic promoter activation depends on expression of antisense lncRNAs (Canzio et al., 2019; Mountoufaris et al., 2018). At the Pcdhα locus, antisense lncRNAs are expressed from divergent promoters for each alternative 5’ exon. Antisense lncRNA expression leads to DNA demethylation of CTCF binding sites, which allows binding of CTCF to a selected alternative promoter. The selected promoter interacts with a distal enhancer in cis through the action of CTCF and likely by loop extrusion. This mechanism allows stochastic promoter choice independent of enhancer-promoter distance and demonstrates the coordinated actions of lncRNA transcription, epigenetic regulation, and 3D genome topology towards regulation of gene expression. A class of circular RNAs (circRNAs) arises from back-splicing of introns by the spliceosome machinery. Some circRNAs have been implicated in gene regulation by titrating miRNAs, regulating transcriptional activity of RNA polymerase II, and competing with other transcripts for host splicing machinery (Wu et al., 2017).

In yeast, noncoding RNAs tend to function in cis due to the dense arrangement of genes within the compact genome. Transcriptional interference is mediated by overlapping transcription of noncoding RNAs within gene bodies or regulatory regions (Ard et al., 2017; Shearwin et al., 2005). Other examples of activating RNAs have been characterised individually or within larger regulatory circuits involving several transcripts. At the fbp1(+) locus in S. pombe, overlapping transcription of three upstream noncoding RNAs within the fbp1(+) promoter progressively establishes open chromatin required for activation of the downstream coding gene (Hirota et al., 2008). In contrast, an antisense transcript is expressed at low levels within the PHO5 gene and overlaps the PHO5 promoter region (Uhler et al., 2007). In response to low phosphate conditions, transcription of the promoter-overlapping transcript is associated with more rapid histone remodelling of the PHO5 promoter, faster recruitment of RNA polymerase, and faster kinetics of gene expression. As the authors reported little effect on histone modification or overall chromatin accessibility at the PHO5 promoter, it was suggested that nucleosome displacement or turnover associated with promoter-overlapping noncoding transcription facilitates “chromatin plasticity” and more rapid gene activation.

Antisense noncoding transcription can also tune coding gene expression. At the well characterised GAL locus in S. cerevisiae, transcription of a lncRNA antisense to GAL10 is essential to prevent transcriptional leakage of the GAL10 and GAL1 genes under glucose-repressive conditions (Houseley et al., 2008; Lenstra et al., 2015). Notably, transcription of this antisense ncRNA does not inhibit GAL10 transcription under galactose-activating conditions, and this transcript switches from being spurious to functional in response to different environmental nutrient cues (Lenstra et al., 2015). Another example wherein antisense transcription controls a cellular response to extracellular cues can be found at the IME4 locus. Entry into meiosis, also known as gametogenesis or sporulation in yeast, is a key cell fate decision point dictated by multiple cues from nutrient signalling pathways and the mating type status of each cell (van Werven and Amon, 2011). IME4 is a master regulator gene whose expression is required for entry into sporulation. In haploid cells, expression of IME4 is repressed by an antisense transcript that initiates downstream of its coding sequence (Hongay et al., 2006). However, repression of IME4 is overcome in MATa/α diploid cells as the a1/α2 heterodimer transcription factor represses the antisense transcript, ensuring that only diploid cells can undergo gametogenesis in response to nutrient starvation. Antisense noncoding RNAs can mediate gene silencing through recruitment and action of histone deacetylases (Camblong et al., 2007). Inactivation of the nuclear exosome component Rrp6 stabilises two antisense transcripts at the PHO84 gene, concomitant with repression of PHO84 expression. Histone deacetylation is limited to the PHO84 gene itself, implying that stabilisation of the antisense PHO84 noncoding RNAs may stimulate local epigenetic modification in an RNA-dependent manner. In budding yeast, there is only a modest correlation between antisense and sense transcription at the same loci (Churchman and Weissman, 2011). Overlapping transcription can lead to different outcomes on gene expression at different loci (e.g. activation or repression), and further investigation will be required on a case-by-case basis and at a genome-wide level to understand the nature of antisense transcription. Dissection of individual antisense stable unannotated transcripts (SUTs) demonstrated that only a fraction of yeast genes are sensitive to regulation by antisense SUT transcription (Huber et al., 2016). It has been proposed that antisense RNA polymerase II elongation leads to enrichment of different epigenetic modifications, which could affect the histone turnover and regulation of overlapping regulatory regions or gene bodies (Murray et al., 2015).

Noncoding RNAs can also be arranged within more complex regulatory circuits. At the FLO11 locus, a combination of tandem and convergent transcripts creates a regulatory circuit that toggles between two stable states, resulting in variegated expression of FLO11 (Bumgarner et al., 2009). ICR1 is expressed as a tandem noncoding RNA upstream of FLO11, and the PWR1 ncRNA is situated convergent to ICR1. ICR1 expression inhibits FLO11 through transcriptional interference. PWR1 transcription represses ICR1 in cis, in a mechanism dependent on the histone deacetylase Rpd3L, to allow FLO11 expression. Competition between the transcription factors that activate PWR1 and ICR1 (Flo8 and Sfl1, respectively) determines which ncRNA is expressed in a mutually exclusive fashion. Another example of a double-negative regulatory circuit can be found at the IME1 locus, which encodes a master regulator of entry into gametogenesis in yeast (Moretto et al., 2018; van Werven et al., 2012). In haploid cells, transcription of the lncRNA IRT1 (IME1-regulatory transcript 1) within the IME1 promoter recruits the Set2 histone methyltransferase and Set3 histone deacetylase complex to establish a repressive chromatin environment, allowing for mating-type control of entry into sporulation in parallel with IME4 (Hongay et al., 2006; van Werven et al., 2012). IRT1 is activated by the transcription factor Rme1, which is only significantly expressed in haploid cells. A noncoding RNA called IRT2 (IME1-regulatory transcript 2) is located upstream of IRT1, and mediates feedback of IME1 to its own promoter. Upon induction of gametogenesis, the Ime1 protein activates IRT2 transcription which interferes with Rme1 recruitment – leading to lower expression of IRT1 and higher expression of IME1 in a cascading feedback mechanism (Moretto et al., 2018). These examples highlight that noncoding RNAs can be used as versatile tools to regulate gene expression within complex circuits. Finally, noncoding RNA expression can be exploited to coordinate protein-level changes in a pervasive manner during key developmental processes. Long undecoded transcript isoforms (lutis) are 5’ extended alternate isoforms of mRNAs that do not generate proteins due to the presence of inhibitory upstream ORFs within the 5’ transcript leader (Chen et al., 2017; Chia et al., 2017). In budding yeast, transcription of the 5’ extended luti at the NDC80 locus from an upstream TSS represses the canonical NDC80 TSS through co-transcriptional interference and chromatin modification (Chia et al., 2017). Besides NDC80, switching between mRNA and long undecoded transcript isoforms also occurs at hundreds of genes in yeast during meiosis, resulting in temporal regulation of gene expression through isoform toggling (Cheng et al., 2018). In conclusion, noncoding transcription has widespread potential to regulate gene expression.
Whether and how expression of noncoding RNAs affects the expression of specific genes depends strongly on genomic context, especially the arrangement of coding and noncoding transcription units within each locus.
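
The double-negative logic of the FLO11 circuit described above can be sketched as a small Boolean model. This is an illustrative toy under stated assumptions, not a model taken from the cited studies: the function name `flo11_circuit` is hypothetical, each transcript is reduced to an on/off state, and the competition between Flo8 and Sfl1 is collapsed into a single flag.

```python
def flo11_circuit(flo8_wins: bool) -> dict:
    """Toy Boolean sketch of the PWR1/ICR1/FLO11 circuit.

    Assumption: Flo8 and Sfl1 compete, so exactly one of them acts
    at the locus at any given time (flo8_wins selects which one).
    """
    # Flo8 activates PWR1 transcription.
    pwr1 = flo8_wins
    # Sfl1 activates ICR1, and PWR1 transcription represses ICR1 in cis.
    icr1 = (not flo8_wins) and (not pwr1)
    # ICR1 transcription interferes with FLO11; without ICR1, FLO11 is
    # permitted (other FLO11 activators are ignored in this sketch).
    flo11 = not icr1
    return {"PWR1": pwr1, "ICR1": icr1, "FLO11": flo11}


if __name__ == "__main__":
    for flo8_wins in (True, False):
        print(flo8_wins, flo11_circuit(flo8_wins))
```

The two inputs give the two mutually exclusive states of the toggle: PWR1 on with FLO11 permitted, or ICR1 on with FLO11 repressed.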

1.3 Regulation of noncoding RNA expression

1.3.1 Why limit noncoding transcription?

Mis-expression of divergent and noncoding RNAs can compromise cellular fitness through different pathways. The intergenic distance is relatively short in
organisms such as S. cerevisiae (typically a few hundred nucleotides). For example, if two neighbouring genes are oriented in tandem, divergent transcripts from the downstream gene can overlap with the upstream gene in the antisense direction. As a consequence, noncoding transcripts can overlap with regulatory or coding regions of neighbouring loci and cause transcriptional interference in cis (Ard et al., 2017; Shearwin et al., 2005) (Figure 1.4A). For example, failure to terminate transcription of the cryptic unstable transcript CUT60 leads to transcriptional read-through and interference at the downstream ATP16 gene promoter (du Mee et al., 2018). In this scenario, defective noncoding RNA termination eventually triggers a drastic loss of mitochondrial genomes from yeast cells. As described above, gene regulation by trans-acting lncRNAs is also possible, and aberrant expression of noncoding transcripts may have unintended consequences on crucial programmes of gene expression.
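
The tandem-gene scenario above can be made concrete with a short interval check. The sketch below is a minimal illustration, assuming 1-based inclusive coordinates; the `Feature` class, the function names, and all coordinates are hypothetical and do not correspond to annotated loci.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_bp(a: Feature, b: Feature) -> int:
    """Number of base pairs shared by two features, ignoring strand."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def antisense_overlap(ncrna: Feature, gene: Feature) -> int:
    """Base pairs of the ncRNA that overlap the gene on the opposite strand."""
    if ncrna.strand == gene.strand:
        return 0
    return overlap_bp(ncrna, gene)


if __name__ == "__main__":
    # Hypothetical tandem pair on the + strand, separated by 300 bp.
    upstream_gene = Feature("geneA", 1000, 2000, "+")
    downstream_gene = Feature("geneB", 2300, 3500, "+")
    # Hypothetical divergent transcript from the geneB promoter, on the - strand.
    divergent = Feature("divergent_ncRNA", 1800, 2250, "-")
    print(antisense_overlap(divergent, upstream_gene))    # 201
    print(antisense_overlap(divergent, downstream_gene))  # 0
```

With a short intergenic distance, even a divergent transcript of a few hundred nucleotides reaches into the 3’ end of the upstream gene in the antisense direction.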

Figure 1.4 Examples of transcriptional interference and R-loop formation during transcription-replication conflict (A) An example of transcriptional interference involving a divergent noncoding RNA. Divergent transcription occurs from a bidirectional gene promoter, and overlaps with the promoter region of a neighbouring gene due to genomic proximity. Transcriptional interference can involve displacement of the RNA polymerase (Pol II) at the transcription start site (TSS, black arrow) of a neighbouring gene promoter. (B) An example of an R-loop structure formed by aberrant noncoding transcription, wherein the nascent RNA hybridises with the DNA template strand to form an RNA:DNA hybrid. The single-stranded non-template DNA strand is exposed and more susceptible to DNA damage. Pervasive R-loop formation can lead to head-on collisions between elongating RNA polymerase (Pol II) and progressing DNA replication complexes (Replisome) and inhibition of DNA replication. Double-stranded DNA, dsDNA.

The act of transcription can also lead to mutation of DNA, particularly on the coding (non-template) strand. Mutagenesis on single-stranded DNA can occur via strand cleavage, deamination, and depurination (Gates, 2009). Within the transcription complex, the coding (non-template) DNA strand is exposed as single-stranded DNA, whereas the noncoding (template) DNA strand is temporarily hybridised to the nascent RNA chain conferring some protection (Kettenberger et al., 2004). Noncoding transcription can also generate extensive R-loops, structures formed by hybridisation between nascent RNA and template DNA which also exposes the non-template strand as single-stranded DNA (Aguilera and Garcia-Muse, 2012) (Figure 1.4B). In mammalian cells, various splicing and pre-mRNA processing factors, such as ASF/SF2 and THO/TREX, are implicated in suppression of R-loop formation and DNA damage (Huertas and Aguilera, 2003; Li and Manley, 2005). Divergent promoter transcripts are vulnerable to R-loop formation due to depletion of splicing signals in the divergent direction (Li and Manley, 2005, 2006; Wu and Sharp, 2013). Through mutation, coding genes can accumulate G and T base content – and G-T-rich U1 snRNP binding sites may emerge which strengthens the U1-PAS splicing axis in mammalian cells (Wu and Sharp, 2013). Convergent noncoding and coding transcription can cause insurmountable transcriptional collisions that must be removed by ubiquitylation and degradation of the RNA polymerase enzyme (Hobson et al., 2012; Prescott and Proudfoot, 2002). Head-on or co-directional conflicts can also occur between elongating RNA polymerase and replication forks (Figure 1.4B).
R-loops generate head-on collisions that lead to increased DNA damage and genome instability, and the frequency of head-on collisions increases with deregulated replication origin firing and higher levels of aberrant noncoding transcription (Aguilera and Garcia-Muse, 2012; Hamperl et al., 2017; Nojima et al., 2018; Sankar et al., 2016). Noncoding transcription can also generate excessive positive supercoiling and G-quadruplex structures in R-loops, both of which can interfere with progression of the DNA replisome (Aguilera and Garcia-Muse, 2012). Aberrant divergent transcription can be wasteful for cells in terms of ATP energy and nutrient resources (Lynch and Marinov, 2015). In order to transcribe a gene or noncoding
RNA, cells must produce, assemble, and recruit macromolecular machines to DNA. Finally, there is evidence that aberrant transcription from alternative or internal cryptic promoters can generate extended or truncated mRNA and protein isoforms in yeast (Chen et al., 2017; Cheng et al., 2018; Chia et al., 2017; Pelechano et al., 2013; Wei et al., 2019). Therefore, eukaryotic cells have robust mechanisms in place to limit the inappropriate expression of divergent and noncoding transcripts.

1.3.2 Pathways that control expression of noncoding RNAs in yeast and mammalian cells

Here, I will describe specific pathways that limit expression of noncoding RNAs by highlighting examples from yeast and mammalian cells. For the most part, expression of noncoding RNAs is controlled at the level of transcription by chromatin integrity and RNA termination pathways, while the aberrant noncoding RNA products themselves are removed by RNA degradation machinery. This section focuses on pathways that repress aberrant noncoding transcription within euchromatin, which are distinct from the more indiscriminate pathways within heterochromatin that suppress both coding and noncoding transcription. Many of these pathways are also exploited to specifically repress divergent noncoding RNAs in different eukaryotic systems, which will be described in detail in a separate section (Introduction 1.4.2).

1.3.3 Chromatin-based pathways that control expression of noncoding RNAs

Nucleosomes and chromatin limit transcription initiation by blocking access to regulatory and core promoter elements (Han and Grunstein, 1988; Knezetic and Luse, 1986; Lorch et al., 1987). Cryptic promoter sequences present within intragenic regions are repressed by pathways involved in chromatin assembly, co-transcriptional nucleosome remodelling, and histone modification (Rando and Winston, 2012). In yeast, numerous chromatin and transcription-related factors that repress cryptic initiation within gene bodies have been identified – in particular: histones, histone gene regulators (e.g. Spt10, Spt21), histone chaperones (e.g.
Asf1, HIR complex, Rtt106), chromatin assembly and remodelling factors, Rpd3S-mediated and general histone deacetylation enzymes, elongation factors, and Mediator components (Cheung et al., 2008; Silva et al., 2012). The activity of ATP-dependent chromatin remodellers (e.g. RSC, Ino80, Swr1, Isw2, Chd1) also limits noncoding transcription, particularly from NDRs at promoters and 3’ gene ends (Alcid and Tsukiyama, 2014; Hennig et al., 2012; Whitehouse et al., 2007; Yadon et al., 2010). Transcription elongation factors also play an important role in limiting cryptic transcription through chromatin integrity. Nucleosomes form a barrier to elongating RNA polymerase (Churchman and Weissman, 2011; Lorch et al., 1987). During elongation, nucleosomes ahead of the transcription complex are acetylated, the histone octamer is evicted from DNA, and nucleosomes are re-deposited behind RNA polymerase by elongation factors such as Spt6 and FACT (facilitates chromatin transcription) (Petesch and Lis, 2012). Spt6 associates with the phosphorylated RNA polymerase II CTD, and both Spt6 and FACT travel along with the transcribing RNA polymerase in vivo (Fischl et al., 2017; Mason and Struhl, 2003; Saunders et al., 2003; Sun et al., 2010; Yoh et al., 2007). Inactivation of Spt6 and Spt16 (part of the FACT complex, along with Pob3) compromises the re-establishment of normal chromatin structure in the wake of transcription elongation complexes, leading to exposure of cryptic promoters and aberrant noncoding transcription (Belotserkovskaya et al., 2003; DeGennaro et al., 2013; Doris et al., 2018; Fischl et al., 2017; Kaplan et al., 2003). Finally, NET-seq and RNA-seq analysis revealed that inactivation of Spt5, a key RNA polymerase II elongation factor, leads to widespread divergent and antisense transcription in S. pombe (Shetty et al., 2017).
The act of transcription can disrupt chromatin structure and these pathways function to re-establish nucleosome organisation, limiting cryptic and noncoding RNA expression.

Histone modifications also significantly contribute to repression of intragenic and cryptic transcription – particularly through the Set1/Set3C pathway and the Set2/Rpd3S pathway (Figure 1.5). Set1 is the sole H3K4 methyltransferase enzyme in budding yeast (Briggs et al., 2001; Shilatifard, 2012), and forms part of the COMPASS/Set1C complex. Set1C associates with RNA polymerase II and establishes a “gradient” of H3K4 tri-, di-, and mono-methylation across gene bodies, which is dictated by transcription frequency and rate (Ruthenburg et al.,
2007; Soares et al., 2017). H3K4 dimethylation (H3K4me2) is found across gene bodies and is mechanistically linked to histone deacetylation, as the H3K4me2 epigenetic mark recruits the histone deacetylase complex Set3C (comprising Set3 and HDACs Hos2 and Hst1) (Kim and Buratowski, 2009; Pijnappel et al., 2001). Histone deacetylation represses frequency of cryptic transcription, likely due to higher DNA affinity for histones, lower rates of nucleosome turnover, and increased nucleosome density (Buratowski and Kim, 2010; Zentner and Henikoff, 2013). The Set2 protein is the sole H3K36 methyltransferase in yeast, and associates with RNA polymerase II within gene bodies via the Ser2 phosphorylated CTD (Kizer et al., 2005; Krogan et al., 2003; Strahl et al., 2002). The gradient of H3K36 mono-, di-, and tri-methylation increases in the 5’ to 3’ direction across a gene body (Figure 1.5). H3K36me2 and H3K36me3 marks are co-transcriptionally deposited and recognised by components of the Rpd3S deacetylase complex (Carrozza et al., 2005; Joshi and Struhl, 2005; Keogh et al., 2005). Rpd3S can also be recruited to gene bodies by transcribing RNA polymerase independently of H3K36 methylation (Govind et al., 2010). Deacetylation of the core histone H4 by Rpd3S suppresses spurious initiation of intragenic transcription (Churchman and Weissman, 2011; Venkatesh et al., 2016). Rpd3S also suppresses antisense transcripts from NDRs associated with transcription termination sites, which are distinct from the divergent transcripts commonly found at promoter NDRs (Tan-Wong et al., 2012). H3K36 methylation also mediates recruitment of FACT in human cells (Carvalho et al., 2013) and the KDM5B H3K4 demethylase in mouse embryonic stem (ES) cells (Xie et al., 2011), both of which suppress intragenic transcription initiation. 
In mouse ES cells, deposition of H3K36me by Setd2 (the mammalian homolog of yeast Set2) facilitates the recruitment of the “de novo” DNA methyltransferase Dnmt3b (Neri et al., 2017). Repression of transcription by cytosine DNA methylation does not occur in S. cerevisiae due to lack of a DNA methyltransferase enzyme (Ponger and Li, 2005). In mammalian cells, H3K4me2 may recruit the homolog of Set3C in Drosophila, UpSET, which can deacetylate histones for repression at coding gene promoters – but it remains unclear whether this pathway plays an extensive role in repressing noncoding transcription (Ali et al., 2013; Rincon-Arano et al., 2012). However, chromatin integrity and nucleosome organisation are clearly important for limiting the expression of cryptic and noncoding transcripts.

Figure 1.5 Intragenic histone deacetylation through the Set1/Set3C and Set2/Rpd3S pathways Gradients of H3K4 (histone H3 lysine 4) and H3K36 (histone H3 lysine 36) methylation are established across a typical protein-coding gene in budding yeast. The H3K4 gradient is established co-transcriptionally by the Set1C/COMPASS methyltransferase complex, wherein H3K4 trimethylation (me3) is highest at the 5’ end of a gene and H3K4 monomethylation (me1) is highest towards the 3’ end of a gene. H3K36 methylation is deposited by the Set2 histone methyltransferase enzyme. Set2 associates with Ser2-phosphorylated RNA polymerase II, which typically marks later stages of transcription elongation, and thus H3K36 methylation is enriched towards the 3’ end of a gene body. The H3K4me2 mark is recognised by the Set3C histone deacetylase complex, and H3K36me2 and me3 marks are recognised by components of the Rpd3S histone deacetylase complex. Local histone deacetylation on intragenic nucleosomes (grey globes) suppresses histone turnover and maintains a repressive chromatin environment that suppresses cryptic transcription initiation from intragenic transcription start sites (TSS, grey arrows). Main coding TSS, black arrow on DNA.

1.3.4 RNA degradation based pathways that control expression of noncoding RNAs

In addition to the pathways described above which limit aberrant and noncoding transcription, the resultant noncoding RNA products themselves must also be dealt with. Co-transcriptional and post-transcriptional RNA decay pathways degrade products of aberrant transcription, particularly using factors coupled to
transcription termination. For example, inactivation of the nuclear exosome component Rrp6 in budding yeast leads to accumulation of aberrant noncoding RNAs termed “cryptic unstable transcripts” or CUTs (Davis and Ares, 2006; Houalla et al., 2006; Wyers et al., 2005). Genome-wide mapping using tiling arrays and sequencing approaches revealed that bidirectional gene promoters are a major source of CUTs in S. cerevisiae (Neil et al., 2009; Xu et al., 2009). CUTs and other stable unannotated transcripts (SUTs) frequently initiate from NDRs associated with gene promoters and 3’ gene ends. The nuclear exosome has also been implicated in degradation of meiotic unstable transcripts (MUTs) prior to the yeast gametogenesis programme (Frenk et al., 2014; Lardenois et al., 2011). Noncoding RNAs that manage to escape nuclear RNA surveillance and reach the cytosol are also degraded by the 5’-3’ exonuclease Xrn1. These Xrn1-sensitive unstable transcripts (XUTs) accumulate after inactivation of Xrn1, are polyadenylated, and are mostly antisense to coding genes (van Dijk et al., 2011). Antisense XUTs that possess single-stranded 3’ extensions are also destabilised by the nonsense-mediated decay (NMD) pathway, which can be inhibited by dsRNA structures (Wery et al., 2016). Finally, inactivation of a 5’-3’ exonuclease, Rat1, stabilises a class of telomere repeat-containing RNA (TERRA) in budding yeast (Luke et al., 2008). TERRAs form an integral part of telomeric heterochromatin, have been implicated in telomere function and DNA replication, and can also be found in mammalian cells (Azzalin et al., 2007; Luke and Lingner, 2009).

In budding yeast, expression of noncoding RNAs is often limited through premature termination of transcription coupled to RNA degradation (Figure 1.7B). Canonical termination at protein-coding genes in yeast involves recruitment of the CPF-CF complex at the polyadenylation signal (PAS) (Proudfoot and Brownlee, 1976). Pcf11, a component of the CPF-CF complex, interacts with the Ser2 phosphorylated CTD of RNA polymerase II, and the pre-mRNA is cleaved at the PAS by the CPF complex endonuclease Ysh1 (CPSF-73 in mammals) (Chanfreau et al., 1996; Ryan et al., 2004). A poly(A) tail is added to the mRNA by Pap1, and the RNA polymerase that has already transcribed past the PAS eventually terminates transcription through a mechanism that remains to be fully understood (Porrua and Libri, 2015). In contrast, termination at noncoding genes in yeast (e.g. CUTs) is mediated by a complex of Nrd1, Nab3, and Sen1 (NNS) (Arigo
et al., 2006a; Arigo et al., 2006b; Steinmetz et al., 2001; Thiebaut et al., 2006). Nrd1 and Nab3 recognise specific motifs within the nascent noncoding RNA (Creamer et al., 2011; Porrua et al., 2012; Wlotzka et al., 2011). This pathway couples termination of noncoding transcripts directly to degradation pathways that rapidly eliminate unwanted RNA byproducts of aberrant transcription (Arigo et al., 2006a; Arigo et al., 2006b; Thiebaut et al., 2006; Vasiljeva and Buratowski, 2006). The CTD-interaction domain (CID) of Nrd1 facilitates interaction with phosphorylated Ser5 on the RNA polymerase II CTD (Gudipati et al., 2008; Kubicek et al., 2012; Vasiljeva et al., 2008). Sen1 is an ATP-dependent RNA and DNA helicase, and is loaded onto the RNA. Sen1 unwinds the RNA/DNA hybrid and its activity is strictly required to elicit termination of transcription by RNA polymerase II (Porrua and Libri, 2013). The Nrd1-Nab3 complex bound to the nascent RNA also interacts with the TRAMP (Trf4-Air2-Mtr4) complex, which catalyses addition of a short poly(A) tail to the noncoding transcript. This coordination and physical association of NNS, TRAMP, and the exosome effectively couples termination to RNA degradation (Vasiljeva and Buratowski, 2006). Sen1 expression varies across the cell cycle, leading to higher NNS termination activity during S-phase and G2 (Mischo et al., 2018). Divergent and noncoding RNAs in yeast are particularly enriched in Nrd1 and Nab3 binding site motifs, which promote premature termination and degradation. Genome-wide studies wherein Nrd1 was selectively depleted from the nucleus revealed that Nrd1-unterminated transcripts (NUTs) are particularly common at NDRs found at gene promoters and 3’ ends, which respectively produce divergent and antisense noncoding RNAs if not limited by the NNS pathway (Schulz et al., 2013).
Due to the dense arrangement of transcription units within the yeast genome and the pervasive pathways of transcript termination and degradation, there is significant overlap between SUTs, CUTs, XUTs, and NUTs. In addition, transcriptome-wide mapping of 13 RNA regulatory factors in budding yeast revealed that there is also significant overlap between coding mRNAs and ncRNAs with regard to ribonucleoprotein (RNP) composition and the nuclear RNA surveillance machinery in particular (Tuck and Tollervey, 2013).

Pervasive noncoding transcription has also been observed in mammalian cells (Carninci et al., 2005; Kapranov et al., 2007; Katayama et al., 2005).
Promoter-associated small RNAs (PASRs) and terminator-associated small RNAs (TASRs) (Kapranov et al., 2007), TSS-associated RNAs (Seila et al., 2008), and tiny RNAs (Taft et al., 2009) are short transcripts primarily associated with promoter-proximal stalled RNA polymerase (Valen et al., 2011). Divergent transcripts from gene promoters are also a major source of noncoding RNAs in mammalian cells (Core et al., 2008; Ntini et al., 2013; Seila et al., 2008; Sigova et al., 2013). Divergent transcription is rapidly terminated due to enrichment of premature polyadenylation signals (PAS) in the antisense direction upstream of gene promoters in mammalian cells (Almada et al., 2013; Ntini et al., 2013) (Figure 1.7A). In contrast, the PAS sites are less frequent in the coding direction, and the remaining sites are also suppressed by binding of U1 snRNP to its target sites, allowing effective elongation (Almada et al., 2013; Ashe et al., 1997; Chiu et al., 2018; Kaida et al., 2010). The Integrator complex is also involved in termination of noncoding RNA expression (Nojima et al., 2018). These divergent or upstream-antisense transcripts are also degraded by the nuclear exosome in mouse ES and HeLa cells, limiting the extent of their accumulation (Andersson et al., 2014; Flynn et al., 2011; Ntini et al., 2013; Preker et al., 2008). Bidirectional enhancer RNAs in mammalian cells are also terminated by the premature PAS pathway and degraded by the exosome (Andersson et al., 2014). In concert with the Ars2 protein and the nuclear exosome targeting (NEXT) complex, transcription termination is similarly coupled to RNA degradation at premature PAS near divergent TSSs in mammalian cells (Andersen et al., 2013; Hallais et al., 2013). A PAS-independent pathway that functions post-transcriptionally has been described in human HEK293T and HeLa cell lines, which involves recognition of lncRNAs via poly(A) binding protein 1 (PABP1) and exosome targeting (Beaulieu et al., 2012).
These versatile pathways act in concert to effectively limit the expression of noncoding RNAs generated by pervasive transcription in yeast and mammalian cells.
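
The asymmetry in PAS content between the coding and divergent directions described above can be illustrated by counting the canonical AATAAA hexamer in the RNA that would be produced in each direction. This is a minimal sketch, not the analysis used in the cited studies: the function names are hypothetical, only the canonical hexamer is scored, U1 snRNP sites are ignored, and the input sequences are invented.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(seq: str, motif: str = "AATAAA") -> int:
    """Count (possibly overlapping) occurrences of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    return sum(seq[i:i + len(motif)] == motif
               for i in range(len(seq) - len(motif) + 1))


def pas_counts(upstream: str, downstream: str) -> tuple[int, int]:
    """Return (divergent, coding) PAS hexamer counts around a promoter.

    Both inputs are written 5'->3' on the coding strand. The divergent
    transcript is antisense to the upstream region, so its sequence is
    the reverse complement of that region.
    """
    divergent = count_motif(reverse_complement(upstream))
    coding = count_motif(downstream)
    return divergent, coding


if __name__ == "__main__":
    # Invented sequences: the upstream region encodes one PAS hexamer
    # on the divergent transcript, the downstream region encodes none.
    print(pas_counts("GGTTTATTGG", "GGTAAGTCC"))  # (1, 0)
```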

1.4 Control of promoter directionality

1.4.1 Eukaryotic promoters are bidirectional

Figure 1.6 Structure of a bidirectional gene promoter A bidirectional gene promoter contains separate core promoters (blue boxes) arranged in opposite orientations, at which transcription of the coding mRNA and divergent RNA (red wavy lines) initiates (transcription start sites, TSSs, depicted with black arrows). Coding and divergent core promoters are usually located at the edges of the nucleosome-depleted region (NDR) which can vary in width, flanked by the +1 and -1 nucleosomes (grey globes) respectively. The relative balance of coding and divergent transcription determines the overall promoter “directionality”. The directional output of eukaryotic gene promoters can vary; some promoters are more unidirectional whereas others display nearly bidirectional transcription (depicted by the slider bar). This figure is adapted from my point-of-view article in Transcription (Wu and Van Werven, 2019).

Most gene promoters in eukaryotic cells are inherently bidirectional (Figure 1.6) (Andersson et al., 2015; Duttke et al., 2015; Jin et al., 2017). Within a bidirectional gene promoter, separate unidirectional core promoters allow transcription in the coding and divergent directions to occur, and are located at the flanking edges of the NDR. High-resolution mapping of pre-initiation complex (PIC) components in yeast by ChIP-exo identified that distinct PICs are recruited to each core promoter (Rhee and Pugh, 2012). In mouse macrophages, divergent TSSs at bidirectional promoters are associated with distinct hubs of chromatin organisation and transcription factor binding (Scruggs et al., 2015). Bidirectional transcription in higher eukaryotes also initiates from two divergent PICs, and is associated with more transcription factor binding, a wider NDR, and is commonly found at highly
expressed genes (Scruggs et al., 2015). Through a functional evolutionary approach wherein foreign yeast DNA was introduced to S. cerevisiae, the Churchman and Struhl groups found that eukaryotic promoters are inherently bidirectional and become more unidirectional through evolution (Jin et al., 2017). Heterologous promoters lose their directionality in a new foreign environment, and fortuitous promoter regions that occur generate equal bidirectional transcription. The evolutionary implications of this work will be discussed in a separate section below.
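
Promoter directionality, as introduced in Figure 1.6, can be summarised numerically as the balance of coding to divergent signal. The sketch below uses a log2 ratio with a pseudocount. This is one common convention and is an assumption here, not necessarily the measure used in this thesis or in the cited studies; the function name and the pseudocount value are hypothetical.

```python
import math


def directionality(coding: float, divergent: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of coding to divergent signal at one promoter.

    Positive values indicate a promoter biased towards the coding
    direction, values near zero indicate bidirectional output, and
    negative values indicate a bias towards the divergent direction.
    The pseudocount (an assumption) avoids division by zero.
    """
    if coding < 0 or divergent < 0:
        raise ValueError("signal values must be non-negative")
    return math.log2((coding + pseudocount) / (divergent + pseudocount))


if __name__ == "__main__":
    print(directionality(100, 100))  # 0.0, equal output in both directions
    print(directionality(7, 1))      # 2.0, four-fold bias towards coding
```

On this scale, a shift from unidirectional towards bidirectional transcription appears as a decrease in the score towards zero.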

Early studies of exosome-sensitive promoter upstream transcripts (PROMPTs) by tiling microarray analysis revealed both antisense (divergent) and sense (pausing-related) short RNAs emanating from gene promoters in human HeLa cells (Preker et al., 2008). Sequencing of short RNAs confirmed these findings in mouse embryonic stem (ES) cells (Seila et al., 2008). Antisense PROMPT expression was reported to be either correlated (Preker et al., 2008) or anti-correlated (Preker et al., 2011) with coding gene expression. In addition to RNA polymerase II genes, RNA polymerase I (rDNA) and RNA polymerase III (U6) genes also contain divergent PROMPTs (Preker et al., 2011). Transcription in both the sense and antisense directions at promoters is associated with peaks of “active” histone post-translational modifications (e.g. H3 acetylation and H3K4me3) (Seila et al., 2008). However, epigenetic marks associated with transcription elongation (e.g. H3K79me2, H3K4me2, H3K27Ac) are mainly correlated with transcription in the coding direction and truly bidirectional promoters. These marks are not enriched upstream of highly “unidirectional” promoters (Duttke et al., 2015; Seila et al., 2008). These data indicate that in mammalian cells, transcription initiation occurs in both the coding and divergent direction from inherently bidirectional promoters, but pause-release and elongation effectively occur only in the coding direction. Mapping nascent transcription at single nucleotide resolution (NET-seq) verified that divergent transcription is a common phenomenon, and that promoter-proximal pausing is not a major axis for regulation of coding and divergent transcripts in S. cerevisiae (Churchman and Weissman, 2011). Subsequent studies involving high-resolution mapping of nascent transcription and initiation also revealed widespread divergent initiation and pausing of RNA polymerase in mammalian cells (Andersson et al., 2015; Andersson et al., 2014; Core et al., 2014;
Core et al., 2008; Duttke et al., 2015; Tome et al., 2018). These studies also confirmed that bidirectional promoters contain distinct core promoters oriented in opposing directions at the edges of the NDR.

Bidirectional gene promoters are a major source of long noncoding RNAs in yeast and higher eukaryotes. Through inactivation of the nuclear exosome component Rrp6, numerous noncoding and divergent transcripts have been identified (SUTs and CUTs) that are mainly associated with NDRs at promoters and 3’ gene ends in yeast (Neil et al., 2009; Xu et al., 2009). In human ES cells, 60% of lncRNAs originate from divergent transcription at coding gene promoters, and their expression is often coordinately regulated (Sigova et al., 2013). Particularly in organisms with gene-dense genomes like budding yeast, it has been proposed that overlap between divergent, noncoding, and coding transcripts may convey regulatory signals between loci – for example, by co-transcriptional interference or histone modification at neighbouring gene promoters.

1.4.2 Factors that regulate expression of divergent transcripts

As outlined above, aberrant noncoding transcription can have detrimental consequences for cellular functions. Divergent transcription from bidirectional gene promoters is a major source of noncoding transcription and long noncoding RNAs, and thus eukaryotic cells require multiple pathways to limit the expression of divergent transcripts. Here, I outline how chromatin, transcription-associated processes, promoter sequence, and trans-acting factors all contribute to repression of divergent transcription and therefore control overall promoter directionality. Some of these pathways are exploited more broadly to limit expression of pervasive noncoding transcription beyond divergent transcripts, and have been described in more detail above (Introduction 1.3).

Histone modifications and variants can dictate antisense and divergent transcription. In S. pombe, deletion of the histone variant H2A.Z leads to higher levels of antisense transcription, implying that H2A.Z plays a repressive role (Zofall et al., 2009). H2A.Z lowers the barrier imposed by the first nucleosome
encountered after the TSS, facilitating progression of RNA polymerase (Weber et al., 2014). In S. cerevisiae, incorporation of H2A.Z at 3’ gene ends demarcates and promotes units of overlapping antisense transcription (Gu et al., 2015). The histone chaperones Spt6 and FACT limit incorporation of H2A.Z within gene bodies to prevent licensing of cryptic transcription (Jeronimo et al., 2015). In mouse ES cells, incorporation of H2A.Z or H3K56 acetylation (H3K56ac) at promoter NDR-flanking nucleosomes influences the degradation of divergent RNAs by the nuclear exosome (Rege et al., 2015). This study found that H2A.Z was required for expression of divergent promoter-proximal transcripts that accumulate after exosome inactivation. In addition, it was proposed that H2A.Z functions in concert with H3K56ac to facilitate formation of “chromosome interaction domains” comprising higher-order chromatin structures that may influence transcriptional repression (Rege et al., 2015). Inactivation of the CAF-I chromatin assembly pathway increases divergent noncoding transcription at most gene promoters, and CAF-I functions distinctly from premature termination and RNA degradation pathways (Marquardt et al., 2014). Incorporation of H3K56ac by CAF-I also targets promoter nucleosomes for remodelling and rapid turnover, which promotes divergent transcription. Subsequently, H3K56ac at the -1 nucleosome likely promotes eviction or remodelling by the SWI/SNF complex and facilitates divergent transcription. The Set2-Rpd3S pathway (described in more detail above) mainly limits initiation of intragenic transcripts, some of which can overlap convergently with promoters in the antisense direction (Churchman and Weissman, 2011; Venkatesh et al., 2016; Venkatesh et al., 2012). Finally, depletion of the SIRT6 family histone deacetylases Hst2 and Hst3 leads to widespread derepression of CUTs and divergent promoter transcripts at thousands of gene promoters in S. cerevisiae (Feldman and Peterson, 2019).
These chromatin-based pathways all contribute in parallel to limit the expression of divergent transcripts from gene promoters.

ATP-dependent nucleosome remodelling complexes organise chromatin in the nucleus (Clapier et al., 2017; Narlikar et al., 2013). In eukaryotic cells, they are broadly categorised into general families based on structural and functional similarities: ISWI, CHD, SWI/SNF, and INO80. The following discussion will focus on budding yeast, wherein the regulation and mechanisms of action are best
understood for these remodellers. ATP-dependent chromatin remodellers translocate DNA using different mechanisms to slide or eject nucleosomes, or incorporate histone variants into chromatin. These chromatin remodelling enzymes show different positional specificity across gene bodies, and different classes of remodellers act at gene promoters where divergent transcription occurs: ISW1a, ISW2, RSC, INO80, SWI/SNF, and CHD1 (Narlikar et al., 2013). Proper formation of NDRs and phasing of flanking nucleosomes depends on the activity of multiple ATP-dependent chromatin remodellers in concert, along with a set of “general regulatory factors” in budding yeast (e.g. Abf1, Reb1, Rap1) which may help to direct remodeller activity (Krietenstein et al., 2016). The action of multiple chromatin remodellers converges to regulate the position of the +1 nucleosome, which tunes the usage of the coding direction core promoter located adjacent (Challal et al., 2018; Klein-Brill et al., 2019; Kubik et al., 2019; Kubik et al., 2018). Inactivation of several chromatin remodellers in yeast – including INO80, RSC, ISW2, and SWR1 – leads to higher levels of antisense transcripts from NDRs and intragenic regions. Some of these aberrant transcripts overlap with promoters in the antisense convergent orientation (Alcid and Tsukiyama, 2014; Yadon et al., 2010). In addition, remodelling activity of ISW2 at promoter NDRs represses divergent transcription that overlaps with tandem upstream genes in an antisense orientation (Whitehouse et al., 2007). At a subset of genes regulated by the “general regulatory factor” Reb1, depletion of RSC leads to a small shift in -1 nucleosome positioning which may influence divergent core promoter accessibility and divergent transcription (Gutin et al., 2018). 
The activity and action of chromatin remodellers such as RSC is directed to the -1, +1, and “fragile” nucleosomes at promoters by specific GC- and AT-rich sequence motifs (Brahma and Henikoff, 2019; Krietenstein et al., 2016; Kubik et al., 2015; Kubik et al., 2018; Skene and Henikoff, 2017). Further investigation is required to understand the complex and overlapping interplay between sequence-specific transcription factors, ATP-dependent chromatin remodellers, chromatin organisation, and initiation of divergent transcription at gene promoters.

Chromatin conformation enhances promoter directionality through gene looping (Tan-Wong et al., 2012). Ssu72 is an RNA polymerase II CTD phosphatase that also interacts with polyadenylation complex factors. Inactivation of gene loop formation by mutation of Ssu72 in yeast results in the production of aberrant divergent transcripts from bidirectional promoters that are stabilised after inactivation of the nuclear exosome. These divergent transcripts are distinct from the antisense intragenic transcripts repressed by the Rpd3S pathway. PAS mutation also disrupts gene looping and enhances divergent transcription in budding yeast and human HEK293 cells, indicating that gene looping helps to control promoter directionality.

In addition to chromatin-related pathways, factors involved in transcription also significantly regulate divergent transcript expression. Regulation of TBP binding and activity is a common mechanism of gene regulation (Goppelt et al., 1996; Gumbs et al., 2003; Teves et al., 2018; van Werven et al., 2008; Xue et al., 2017). In particular, Mot1, NC2, and INO80 bind to TBP at gene boundaries (e.g. promoters), and reportedly suppress euchromatic divergent transcripts in yeast and mice at a significant fraction of gene promoters (Xue et al., 2017). Genes enriched in paused RNA polymerase II (with S2P) at their 5’ ends display higher levels of divergent transcription in RNA polymerase mutants with slower elongation rates (Fong et al., 2017). In S. pombe, depletion of the elongation factor Spt5 leads to divergent transcription at several hundred gene promoters, and elicits convergent antisense transcription that can overlap with coding TSSs at a subset of genes (Shetty et al., 2017). The Integrator complex is also involved in termination of noncoding RNA transcription (Nojima et al., 2018).

In mammalian cells, premature polyadenylation and transcript degradation is a pervasive mechanism that limits divergent RNA expression to a large extent (Almada et al., 2013; Ntini et al., 2013). The mechanistic details of the U1-PAS axis have been discussed earlier (Section 1.3.4), but differential regulation of divergent and coding transcripts essentially stems from enrichment of polyadenylation signals in the upstream antisense direction that trigger premature transcript termination, and enrichment of U1 snRNP binding sites in the coding direction that suppress premature termination (Figure 1.7A). In yeast, the NNS pathway plays a similar role, albeit using a different complement of RNA-binding proteins, to limit expression of divergent CUTs and NUTs (Neil et al., 2009; Sohrabi-Jahromi et al., 2019; Xu et al., 2009) (Figure 1.7B). These prematurely terminated noncoding RNAs are degraded by Rrp6 within the nuclear exosome. Despite the pervasive activity of these RNA surveillance pathways, some noncoding RNAs manage to escape the nucleus and enter the cytosol, where they are degraded by Xrn1 in an NMD-dependent manner (van Dijk et al., 2011) unless blocked by specific double-stranded RNA structures (Wery et al., 2016).

Figure 1.7 Control of promoter directionality by termination and degradation of divergent noncoding RNAs (A) Premature termination and degradation of divergent noncoding RNAs in mammalian cells is dictated by the U1-PAS (polyadenylation signal) axis. PAS are enriched in the divergent direction upstream of a bidirectional gene promoter, and trigger premature termination of the transcript by the CPSF-CF pathway. The divergent noncoding RNA is targeted to the nuclear exosome by the nuclear exosome targeting (NEXT) complex and other factors. U1 snRNP binding sites are enriched in the coding direction, and are recognised by the U1 snRNP to suppress premature PAS and allow productive transcription elongation at coding genes. (B) Premature termination and degradation of divergent noncoding RNAs in budding yeast is dictated by the NNS (Nrd1-Nab3-Sen1) pathway. Binding site motifs for Nrd1 and Nab3 are enriched in the antisense direction upstream of a bidirectional gene promoter, which are recognised co-transcriptionally by the NNS complex associated with RNA polymerase II. Helicase activity of Sen1 triggers dissociation of the divergently transcribing RNA polymerase complex from DNA, and the divergent transcript is directed for degradation by the nuclear exosome via the TRAMP complex. Nrd1 and Nab3 binding sites are depleted in the coding direction, allowing productive transcription elongation to occur.

Promoter activity is determined by the presence of cis-regulatory elements in the DNA sequence, together with trans-acting factors that interpret regulatory information and produce a unidirectional or bidirectional transcriptional output. Core promoter elements including TATA boxes can influence promoter directionality. In humans, TATA boxes are enriched at unidirectional promoters – they occur on the forward strand at 28% of unidirectional promoters but only 8% of bidirectional ones on either strand (Trinklein et al., 2004). Furthermore, one study found that 77% of bidirectional promoters, but only 38% of non-bidirectional promoters, are located within a CpG island (Trinklein et al., 2004). In vivo, reversal of yeast TATA box orientation can inactivate coding gene expression (Nagawa and Fink, 1985), and mutation of the TATA box also leads to higher expression of a divergent CUT in some instances (Neil et al., 2009). Although the majority of promoters in mammalian cells are bidirectional, directional promoters are slightly enriched in downstream promoter element (DPE) and initiator (Inr) motifs, whereas the CCAAT box is slightly enriched at bidirectional promoters (Yang and Elnitski, 2008). In addition to specific core promoter elements, DNA sequence content may also influence promoter directionality. Bidirectional promoters are enriched in GC nucleotides, and CpG islands are present in 90% of bidirectional gene promoters in humans (Yang and Elnitski, 2008).

These DNA sequence elements are recognised by trans-acting factors, mainly sequence-specific transcription factors, to regulate expression of the coding and divergent transcripts. In multicellular organisms, transcription factor expression is highly cell-type specific (Vaquerizas et al., 2009). Genome-wide studies of different human cancers revealed that promoters can vary in unidirectional or bidirectional output in different cell types (Balbin et al., 2015). When a set of random and bidirectional human promoters was introduced as foreign DNA to one mouse and three human cell lines, strong cell-type specific differences were observed (Trinklein et al., 2004). Of the bidirectional promoters assayed, 22% showed bidirectional activity in just half of the cell lines. In addition, deletion of one TSS within a bidirectional promoter pair increased activity of the opposing paired TSS (Trinklein et al., 2004) – indicating that divergently oriented TSSs compete for transcription factors to some extent. More recently, experiments involving introduction of foreign yeast DNA to S. cerevisiae also found that promoter directionality is lost in a foreign environment when trans-acting factors are unable to recognise specific cis-regulatory sequence elements (Jin et al., 2017). Sequence-specific transcription factors appear to either promote or limit divergent transcription. Certain motifs (e.g. GABPA, MYC, E2F1, E2F4, NRF-1, CCAAT, YYA, and ACTACAnnTCC) are enriched at bidirectional promoters in humans, and introduction of GABP motifs into unidirectional promoters can stimulate bidirectional transcription in some cases (Lin et al., 2007). Enrichment for certain sequence-specific transcription factor motifs in yeast promoters is associated with fortuitous promoter regions that produce bidirectional output (Jin et al., 2017).
In budding yeast, depletion of some “general regulatory factors” (Tbf1, Abf1, Rsc3, and Rap1) leads to expression of cryptic divergent promoter transcripts enriched at bidirectional gene promoters (van Bakel et al., 2013). Chromatin and RNA-based pathways regulate promoter directionality in parallel, and their activities appear to be additive in some cases and non-redundant in others. Eukaryotic cells integrate many factors and pathways to repress initiation, regulate elongation, promote termination, and stimulate degradation of divergent noncoding transcripts at all stages of gene expression.

1.4.3 Evolution of promoter directionality – enhancers and promoters

Enhancers and promoters are inherently bidirectional. Introduction of foreign yeast (K. lactis or D. hansenii) DNA into S. cerevisiae on yeast artificial chromosomes leads to loss of promoter directionality, and bidirectional transcription occurs at fortuitous promoter regions enriched in transcription factor binding motifs (Jin et al., 2017). Evolutionary analysis of newly evolved promoter regions in yeast and humans showed that bidirectional transcription tends to become more directional over time, through mutation of the promoter DNA sequence and surrounding cis-regulatory elements (Jin et al., 2017). In mammalian cells, it has been proposed that bidirectional promoters evolved from enhancers, wherein new coding genes originate by purging regulatory elements (e.g. PAS) that promote premature termination of eRNAs, and acquiring U1 sites to suppress remaining PAS and allow productive elongation (Wu and Sharp, 2013). Some functional lncRNAs are derived from eRNAs that could serve as a reservoir of RNAs that can be subsequently shaped by evolutionary pressures to perform cellular functions. Supporting this hypothesis, analysis of 12 lncRNAs in mice reveals that lncRNA promoters contain highly conserved DNA sequences that, in human cells, have chromatin modifications associated with enhancers (e.g. H3K4me) (Engreitz et al., 2016). Various studies in human and mouse cells have also identified that mammalian promoters and enhancers share common architectural principles, particularly core promoter element frequency, nucleosome and transcription factor organisation, and divergent TSS spacing (Core et al., 2014; Scruggs et al., 2015). H3K9 trimethylation and splicing activity may distinguish mRNAs from lncRNAs in human cells, but the RNAs themselves are not always required for regulation of gene expression (Churchman, 2017; Mele et al., 2017; Schlackow et al., 2017).
In fact, directionality of eRNA transcription output reflects the inherent enhancer or promoter activity of cis-regulatory elements in Drosophila (Mikhaylichenko et al., 2018). Some Drosophila enhancers (mainly directional) can be coerced to function as weak promoters. In Drosophila, while bidirectional promoters can also function as enhancers, unidirectional promoters generally cannot. Emerging evidence from a wide range of eukaryotic systems indicates that control of transcription directionality may be ingrained in the evolution of genes and their regulation.

1.5 Ribosomal protein genes in Saccharomyces cerevisiae

1.5.1 Regulation of ribosomal protein gene expression

Ribosomal protein genes encode the essential protein subunits of the ribosome, a macromolecular machine that translates genetic information in mRNA into linear polypeptide chains. Ribosomes comprise numerous ribosomal proteins (RPs) and ribosomal RNAs (rRNAs), and ribosome biogenesis is a complex process that is carefully coordinated to meet cellular requirements (Baßler and Hurt, 2019). The ribosomal proteins are encoded on separate genes, whose expression must be coordinately regulated to produce ribosomal proteins in the correct stoichiometric ratio (Hu and Li, 2007). In E. coli, the coordinated regulation of RP gene expression is simple, as RP genes are transcribed from one locus as a polycistronic mRNA and therefore expressed in equal ratios (Nomura et al., 1984). Negative feedback of E. coli ribosomal protein translation occurs because RPs can directly bind to and repress the polycistronic RP mRNA. However, coordinated regulation of RP gene expression is more complicated in eukaryotes, because RP genes are scattered throughout different chromosomes at different loci (Zerbino et al., 2018). Eukaryotic RP gene expression has been best studied in budding yeast, but the factors responsible for coordinated regulation of RP gene expression have not been identified or characterised in higher eukaryotes (Hu and Li, 2007). Bioinformatic analysis has shown that specific DNA motifs are enriched at RP gene promoters within one species or between closely related species (Li et al., 2005), but are not similar across distant relations – suggesting that different transcription factors have been adopted to drive RP gene expression during evolution and speciation. This model is supported by evidence that different transcription factors drive RP gene expression in different fungi, through substitution of common motifs at promoters (Hogues et al., 2008). 
The “TCT” motif is prevalent at Drosophila and human RP gene promoters, but it is unclear whether this is recognised by specific factors (Parry et al., 2010). In Drosophila, the sequence-specific transcription factor M1BP binds to core promoters and recruits TRF2 to allow coordinated transcription of RP genes (Baumann and Gilmour, 2017). Much work remains to be done to fully elucidate the regulation of RP gene expression in higher eukaryotes.

Figure 1.8 Structural and functional organisation of a ribosomal protein gene promoter in S. cerevisiae Ribosomal protein (RP) genes in S. cerevisiae contain a wide nucleosome-depleted region (NDR) between the +1 and -1 nucleosomes (grey globes). Most RP genes are regulated by the sequence-specific transcription factor Rap1. Rap1 target motifs (red box) are located at the 5’ (upstream) edge of the NDR, flanked by the -1 nucleosome. Additional coactivators (e.g. Fhl1, Ifh1, Sfp1, and Hmo1) of RP genes bind directly to target motifs or indirectly via other coactivators, and mediate regulation of RP gene activity by upstream TORC and PKA signalling in response to nutrient conditions. RNA polymerase II is recruited by Rap1 and general transcription factors (TFIID and TFIIA, not depicted), resulting in high levels of coding gene expression. Black arrow, transcription start site (TSS).

RP gene expression has been well characterised in S. cerevisiae. There are 138 RP genes in budding yeast, nearly all of which are regulated by the key transcription factor Rap1 (Lieb et al., 2001; Reja et al., 2015; Shore and Nasmyth, 1987). 59 RP genes have been duplicated through whole genome duplication, generating paralogs (Wapinski et al., 2010; Wolfe and Shields, 1997). RP genes are among the most highly expressed coding genes in S. cerevisiae, and most RP
genes in yeast have introns (Velculescu et al., 1997; Warner, 1999). 10 RP genes are regulated by the “general regulatory factor” Abf1 instead of Rap1; Abf1 also controls a larger set of genes involved in ribosome biogenesis (Fermi et al., 2016; Knight et al., 2014; Reja et al., 2015). The key transcription factors that regulate expression of most RP genes are Rap1, Fhl1, Ifh1, Sfp1, Hmo1, and Crf1 (Lieb et al., 2001; Reja et al., 2015) (Figure 1.8). RP gene expression is sensitive to nutrient sensing and signalling pathways, particularly TORC and PKA (Gasch et al., 2000). Rap1 and Fhl1 remain bound to the promoter DNA when RP genes are shut down, indicating that the other coactivators play important roles in transcriptional regulation. Rap1 is a pioneer transcription factor that facilitates recruitment of the other transcription factors to the promoters of RP genes (Knight et al., 2014). One or two Rap1 motifs are found in forward or reverse orientations (approximately 5-15 bp apart) at the upstream edge of the NDR. Rap1 also plays a direct role in activation of RP gene transcription. Rap1 directly interacts with TFIIA and TFIID, and recruits these general transcription factors to RP gene promoters in vivo through its activation domain (Garbett et al., 2007; Johnson and Weil, 2017; Layer et al., 2010; Layer and Weil, 2013; Mencia et al., 2002; Papai et al., 2010). Most RP gene promoters lack TATA elements and therefore recruitment of TFIID and TBP by Rap1 is essential for RP gene transcription (Layer et al., 2010; Mencia et al., 2002). SUMOylation of Rap1 appears to be important for TFIID recruitment (Chymkowitch et al., 2015). The turnover of DNA-bound Rap1 on chromatin is also linked to more potent transcriptional activation (Lickwar et al., 2012).

Approximately half of the RP genes are regulated by the HMG protein Hmo1 (Hall et al., 2006), which mediates regulation by the TOR pathway (Berger et al., 2007) and helps to tune the position of the +1 nucleosome (Kasahara et al., 2011). Fhl1, Ifh1, and Sfp1 activate RP gene transcription, which is sensitive to TORC and PKA signalling activity (Berger et al., 2007; Downey et al., 2013; Hall et al., 2006; Jorgensen et al., 2004; Marion et al., 2004; Martin et al., 2004; Rudra et al., 2005; Schawalder et al., 2004; Wade et al., 2004; Zhao et al., 2006). During stress or nutrient starvation, a negative regulatory protein called Crf1 binds to Fhl1 and displaces the activator Ifh1 (Martin et al., 2004) and all factors except for Rap1 and Fhl1 dissociate from the RP gene promoter (Reja et al., 2015; Rudra et al., 2005; Wade et al., 2004). Sfp1 is a transcription factor that putatively binds to RP gene
promoters under conditions with optimal nutrients, but under nutrient starvation or stress conditions, Sfp1 relocates to the cytosol (Jorgensen et al., 2004; Marion et al., 2004). Sfp1 associates with RP genes via Ifh1 (Albert et al., 2019) and promotes RP gene expression. The expression of RP genes and rRNA is also tightly coupled through Ifh1 and a component of the CURI complex, Utp22 (Albert et al., 2016; Rudra et al., 2007). Inhibition of TORC signalling by rapamycin treatment leads to dissociation of Ifh1 from RP gene promoters, while the RNA processing factor Utp22 titrates away Ifh1 from RP gene promoters. Activity of RNA polymerase I (which transcribes rRNA) interferes with the ability of Utp22 to titrate away Ifh1 from RP gene promoters within the CURI complex, effectively coupling rRNA transcription by RNA polymerase I and RP gene transcription by RNA polymerase II. RP genes may also be regulated by mRNA stability dictated by differences in promoter sequence (Bregman et al., 2011). RP gene expression is carefully coordinated by a diverse complement of regulatory factors and pathways to tune the biosynthetic capability of yeast cells in response to extracellular cues and availability of nutrients.

1.5.2 Additional functions of Rap1 in S. cerevisiae

Figure 1.9 Additional functions of S. cerevisiae Rap1 at telomeres and the hidden mating type (HM) loci (A) Domain map of the S. cerevisiae Rap1 protein. NTD, N-terminal domain; DBD, DNA-binding domain; Tox, toxicity domain; AD, activation domain; RCT, Rap1 C-terminal interaction domain. The amino acid residues demarcating the edges of each annotated domain are listed below, and the main functions associated with the DBD, AD, and RCT are stated. (B) Schematic diagram of the S. cerevisiae telomeric silencing complex containing Rap1. Rap1 binds directly to the TG1-3 repeats and recruits additional silencing proteins (Rif1, Rif2, Sir3, Sir4, and the histone deacetylase Sir2) to mediate transcriptional silencing of telomeres. To date, the exact arrangement and stoichiometry of these telomeric silencing proteins in vivo remains unclear. (C) Schematic diagram of the silencing complex containing Rap1 at the HMR locus in S. cerevisiae. Rap1 binds directly to the HMR-E silencer adjacent to the silenced HMR gene, along with Abf1 and origin recognition complex (ORC) proteins. Multiple silencing proteins (e.g. Sir1, Sir3, Sir4, and Sir2) are recruited to establish and maintain repressive heterochromatin at HMR and HML.

Along with activation of RP genes and an additional subset of glycolytic genes (Lieb et al., 2001), Rap1 mediates transcriptional silencing at telomeres and the hidden mating type (HM) loci in budding yeast (Figure 1.9). Various modular domains of the Rap1 protein recruit cofactors that allow Rap1 to perform different functions at various regulatory DNA elements and loci (Figure 1.9A). The function of Rap1 in transcriptional activation is not conserved in higher eukaryotes, but Rap1 constitutes a key component of the telomeric silencing complex in S. cerevisiae, S. pombe, and H. sapiens (de Lange, 2018) (Figure 1.9B). David Shore and Kim Nasmyth first described “repressor-activator protein 1” (Rap1), a factor that bound to both silencer and activator elements (Shore and Nasmyth, 1987). Telomeres comprise repetitive nucleotide sequences at the ends of chromosomes, which protect chromosome ends from deterioration or aberrant fusion. Telomeres are extended by the telomerase enzyme (Shay and Wright, 2019) and are protected by the shelterin complex, whose components vary across species (de Lange, 2018). In budding yeast, Rap1 recognises the double-stranded TG1-3 repeats at telomeres directly (Longtine et al., 1989), and recruits silencing proteins Rif1, Rif2, and Sir4 through its C-terminal domain (Hardy et al., 1992a; Hardy et al., 1992b; Wotton and Shore, 1997). Through interactions with Rif2 and Sir4, Rap1 blocks non-homologous end joining (NHEJ) of the telomere ends and mediates heterochromatin silencing at telomeric and subtelomeric regions (Kyrion et al., 1993; Marcand et al., 2008; Shi et al., 2013). In mammalian cells, Rap1 does not bind telomeric DNA directly – this role is taken over by TRF2 – but plays essential roles in regulating telomere silencing, telomere length, and suppression of DNA repair mechanisms (de Lange, 2018).

The hidden mating type (HM) loci in budding yeast, HML and HMR, contain silenced copies of the MATα and MATa alleles. These loci allow specialised mating type switching by gene conversion, through the HO endonuclease (Haber, 2012). The HM loci comprise transcriptionally silenced heterochromatin, which requires the binding of the histone deacetylase Sir2, silencing proteins Sir3 and Sir4, and several DNA-binding proteins, including Rap1, Abf1, and origin recognition complex (ORC) proteins (Figure 1.9C). Competition exists between telomeres and HM loci for Rap1-mediated silencing in yeast (Buck and Shore, 1995). Interaction surfaces within the C-terminal domain (CTD) of Rap1 are crucial for regulation of silencing at the HM loci and telomeres (Feeser and Wolberger, 2008; Graham et al., 1999; Shi et al., 2013; Sussel and Shore, 1991). Separation-of-function mutants for the activation and silencing activities of Rap1 have been identified within the CTD (Sussel and Shore, 1991). In addition, the Rap1 protein possesses a BRCA1 C-Terminus (BRCT) domain within its N-terminal region, which has not been well characterised (Miyake et al., 2000). The N-terminal domain may facilitate the binding of another sequence-specific transcription factor, Gcr1, to glycolytic genes co-regulated by Rap1 and Gcr1 (Mizuno et al., 2004). Deletion of a small region (toxicity domain) adjacent to the activation domain of Rap1 relieves the detrimental or toxic effects of Rap1 over-expression towards cellular growth (Freeman et al., 1995).

The DNA-binding domain (DBD, residues 361 to 597) comprises two Myb-like DNA-binding domains followed by a “wrapping loop”, and allows Rap1 to bind to specific sequence motifs on double-stranded DNA. The extended DBD “wrapping loop” interacts with the DNA on the opposite helical face to the Myb-like domains, enclosing the DNA helix in a closed conformation (PDB 3UKG) (Matot et al., 2012). This property may contribute to the relative stability of DNA binding (Lickwar et al., 2012). In addition, the DBD of Rap1 can bind to double-stranded DNA in higher stoichiometries in vitro, and the wrapping loop and CTD likely influence protein conformation to modulate the DNA-binding properties of Rap1 (Feldmann et al., 2015; Feldmann and Galletto, 2014; Matot et al., 2012). Rap1 does not display cooperative binding when present in multiple copies (Gilson et al., 1993), but arrays of just a few TG1-3 Rap1 binding sites are already sufficient to confer telomere capping properties in vivo (Ribeyre and Shore, 2012). It remains unclear exactly how apparently identical Rap1 proteins simultaneously function as activators at gene promoters and silencers at telomeres and the HM loci.

Rap1 has well documented roles in regulation of nucleosome organisation, particularly in the determination of NDR width at promoters. Various studies have shown that Rap1 is a “pioneer” transcription factor that can displace nucleosomes, even when its binding sites are protected within DNA wound around the core histone octamer (Mivelaz et al., 2019; Yan et al., 2018; Yarragudi et al., 2004; Yu and Morse, 1999). Rap1 binding is important for maintenance of the NDR at promoters in vivo (Ganapathi et al., 2011; van Bakel et al., 2013). The N- and C- terminal domains of Rap1 are apparently expendable for nucleosome displacement at the HIS4 promoter (Yu et al., 2001), which is supported by genome-wide evidence showing that expression of the Rap1 DBD alone is already sufficient to partially rescue the defects in chromatin organisation and repression of aberrant transcription initiation after depletion of full-length Rap1 (Challal et al., 2018). Rap1 also functions as a “transcriptional roadblock”, first described to terminate transcription within a mutated Ty1 transposon (Yarrington et al., 2012) and within a screen for downstream transcriptional interference (Briand, 2015). High-resolution mapping of RNA polymerase II using UV cross-linking (CRAC) revealed that Rap1 limits the extent of transcriptional interference by tandem upstream transcription at many genomic locations (Candelli et al., 2018). In parallel with my own work, Domenico Libri and colleagues identified that Rap1 and other general regulatory transcription factors (GRFs, Abf1 and Reb1) regulate ectopic transcription initiation within NDRs (Challal et al., 2018; Wu et al., 2018b). Specifically, depletion of Rap1 leads to de-repression of divergent TSSs near its binding sites, and an upstream shift in distal TSS usage near the +1 nucleosome in the coding direction. 
This extensive work undertaken by many groups over several decades highlights the many diverse functions of the multifaceted transcription factor Rap1 in S. cerevisiae.

1.5.3 Ribosomal protein genes as a model to study divergent noncoding transcription

To understand the mechanisms that control divergent noncoding transcription, I selected the ribosomal protein (RP) genes in S. cerevisiae for my initial investigations. RP genes are very highly expressed in exponentially growing cells, and account for much of the transcriptional activity by RNA polymerase II (Warner, 1999). The factors that regulate RP gene expression are well characterised, and respond to different nutrient and environmental cues which can be used as experimental tools (Section 1.5.1). Most RP genes also contain introns, a rare occurrence in budding yeast (Zerbino et al., 2018). Given that splicing factors are involved in regulation of noncoding RNA expression and suppression of R-loops in higher eukaryotes (Almada et al., 2013; Huertas and Aguilera, 2003; Li and Manley, 2005), it may be important to consider the contribution of splicing towards promoter directionality. Public data sets (e.g. MNase-seq) are available to complement genome-wide studies of the transcriptome, including experiments performed after depletion of Rap1 and RSC – two key regulators of RP gene expression (Kubik et al., 2015; Kubik et al., 2018). RP gene promoters are also particularly depleted of nucleosomes, and may possess additional chromatin-independent mechanisms that also limit divergent noncoding transcription. In line with this, only 16 of 138 RP genes in S. cerevisiae possess a divergent noncoding transcript (CUT or SUT) emanating from the promoter NDR (Neil et al., 2009; Xu et al., 2009). This investigation relating to the regulation of promoter directionality at these highly expressed gene promoters should be widely applicable to other highly expressed and bidirectional promoters in many eukaryotic species, especially humans.

1.6 Aims of this thesis

Genome-wide approaches have identified divergent noncoding RNAs in a wide range of eukaryotic organisms. While various regulatory mechanisms and pathways have been characterised in yeast and human cells, our understanding of the regulation and function of divergent noncoding RNAs remains incomplete. Divergent transcripts represent a major source of noncoding RNAs in eukaryotic
cells, and can regulate gene expression through different mechanisms. It is unclear why a large number of highly expressed RP genes in budding yeast do not exhibit the common phenomenon of divergent noncoding transcription. As outlined above, enrichment of cis-regulatory elements and transcription factor motifs within promoters biases them towards directional output. However, the exact mechanisms by which sequence-specific transcription factors control divergent RNAs and promoter directionality remain to be identified. In particular, it is unclear whether ubiquitous chromatin- and RNA degradation-based pathways are sufficient to repress divergent noncoding transcripts at ribosomal protein genes, or whether additional complementary mechanisms exist. Several ATP-dependent chromatin remodellers have been implicated in regulating divergent noncoding transcription, but the extent to which they interact with sequence-specific transcription factors in this regard is not known.

I selected Saccharomyces cerevisiae, also known as budding yeast or baker’s yeast, to investigate these questions in cells. It is an experimentally tractable model organism which is easy to grow in the laboratory. Many genetic tools are commonly used to perturb genes or proteins and study their functions, such as gene deletion libraries and inducible protein depletion alleles (Nishimura et al., 2009; Winzeler et al., 1999). Budding yeast can be stably propagated as haploid or diploid cells, and their sexual life cycle allows for straightforward genetic crossing and back-crossing. Genome-wide sequencing and bioinformatic analysis are simplified because the yeast genome is small (~12 Mb) and very well annotated (Zerbino et al., 2018). Finally, many genes are conserved in sequence and function in higher eukaryotes (e.g. humans), particularly those involved in gene expression.

In Chapter 3, I identify that the sequence-specific transcription factor Rap1 specifically limits expression of divergent noncoding RNAs throughout the genome. In addition, I demonstrate that Rap1 is not redundant with ubiquitous chromatin assembly and RNA surveillance pathways that also limit expression of noncoding transcripts. Transcriptional repression by Rap1 is spatially limited to divergent noncoding RNAs, and Rap1 functions in concert with other pathways to ensure transcriptional fidelity. In Chapter 4, I determine that Rap1 specifically represses divergent core promoters in close proximity at hundreds of its target genes through genome-wide mapping of transcript start sites at single nucleotide resolution. I demonstrate that Rap1 limits expression of divergent RNAs at initiation, not elongation, using mutagenesis and reporter assays for bidirectional promoters. In addition, I test additional transcription factors for the ability to repress divergent RNAs, and highlight several attractive candidates for further investigation. I establish that a small region within the Rap1 C-terminal domain comprising residues 631-696, but not the heterochromatin silencing function of Rap1, is required for control of divergent noncoding transcription. Finally, I exploit a range of endogenous, mutant, and synthetic transcription factors to demonstrate that divergent core promoters can be regulated by sequence-specific DNA-binding proteins. In Chapter 5, I explore the relationship between the sequence-specific transcription factor Rap1 and the ATP-dependent chromatin remodeller RSC with regard to regulation of divergent noncoding transcription. I identify that the nucleosome remodelling activity of RSC is only required for divergent transcription at a small number of Rap1-regulated genes. I implement a nascent RNA sequencing approach to quantify promoter directionality genome-wide, and identify the ATP-dependent chromatin remodeller RSC as a regulator of divergent transcription at a substantial number of coding gene promoters in budding yeast. Finally, I propose a unified model for control of divergent noncoding transcription, discuss alternative hypotheses and broader implications of these findings, and propose future directions for investigation. Together, these studies help to clarify the overlapping contributions of transcription factors and chromatin regulatory pathways towards control of divergent noncoding transcription in S. cerevisiae.

Chapter 2 Materials and Methods

Chapter 2. Materials & Methods

2.1 Acknowledgement

Parts of the Materials & Methods have been published in Molecular Cell, and have been modified to present within this chapter (Wu et al., 2018b).

The description of the methods used for bioinformatic analysis was written with the help of Harshil Patel (Bioinformatics and Biostatistics Science Technology Platform, The Francis Crick Institute).

The description of the methods used for proteomics mass spectrometry analysis was written with the help of David Frith and Bram Snijders (Protein Analysis and Proteomics Platform, The Francis Crick Institute).

The description of the methods used for TSS-seq was written with the help of Minghao Chia (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute).

The description of the methods used for single molecule RNA FISH was written with the help of Fabien Moretto (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute).

The Sanger sequencing used to validate plasmids was performed with the help of the Genomics Equipment Park Science Technology Platform (The Francis Crick Institute).

2.2 Construction of yeast strains

2.2.1 Yeast strain genotypes

Strains isogenic to the Saccharomyces cerevisiae BY4741 strain background (derived from S288C) were used for this investigation. The gene deletion strains used to study mis-regulation of IRT2 and iMLP1 expression were obtained from a strain library belonging to Peter Thorpe (Queen Mary University of London) and have been previously described (Winzeler et al., 1999).

Strain Genotype Number FW627 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW629 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 fhl1::FHL1-V5-IAA7::KANMX6 FW4200 his3::pGPD1-osTIR1::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 ifh1::IFH1-V5-IAA7::KANMX6 FW4202 his3::pGPD1-osTIR1::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 sfp1::SFP1-V5-IAA7::KANMX6 FW4204 his3::pGPD1-osTIR1::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW3877 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 FW4136 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 crf1::NATMX FW4132 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 hmo1::NATMX FW3443 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL43B(280-308)::loxP FW6030 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 mlp1::NATMX FW4141 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL40B(233-246)::LoxP MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL40B(233-246)::LoxP FW6029 mlp1::NATMX FW4122 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 mlp1::MLP1-V5::KANMX6 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL40B(233-246)::LoxP FW4120 mlp1::MLP1-V5::KANMX6 p110 pGAL1-CRE HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW631 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL43B(280-308)::loxP FW6139 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL43B(280-308)::loxP MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 pRPL43B(280-308)::loxP FW3440 p109 pGAL1-CRE URA3

MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW3920 pRPL43B(272-308)::loxP_400bp p110 pGAL1-CRE HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW3922 pRPL43B(272-275)::loxP_400bp p110 pGAL1-CRE HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW7451 pRPL43B(313)::LoxP_400bp p109 pGAL1-CRE URA3 FW4732 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5::KANMX6 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5::KANMX6 FW4734 pRPL43B(280-308)::loxP MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5::KANMX6 FW4737 pRPL43B(272-308)::loxP_400bp MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5::KANMX6 FW4735 pRPL43B(272-275)::loxP_400bp MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5::KANMX6 FW6228 pRPL40B(233-246)::LoxP FW5543 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 spt10::NATMX FW5547 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 spt21::NATMX FW5609 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rlf2::NATMX MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 spt6::SPT6-V5-IAA7::KANMX6 FW5555 leu2::pGPD1-osTIR1::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5559 spt16::SPT16-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5129 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 Myc-NLS-Rap1 (1-827)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5133 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 Myc-NLS-Rap1 (339-827)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5138 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 Myc-NLS-Rap1 (1-599)::HIS3

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5141 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 Myc-NLS-Rap1 (339-599)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5145 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4948 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (1-827)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4950 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔDBD 362-597)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4952 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔTox 597-662)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4954 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔAD 631-678)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4958 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (Δ764-827)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4960 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (Δ631-696)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5420 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (1-827)-V5::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5393 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔDBD 362-597)-V5::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5394 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔTox 597-662)-V5::HIS3

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5424 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (ΔAD 631-678)-V5::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5395 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (Δ764-827)-V5::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5396 leu2::pGPD1-osTIR1::LEU2 his3::pNH603 HA-NLS-Rap1 (Δ631-696)-V5::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-AID-Myc::KANMX FW5399 leu2::pGPD1-osTIR1::LEU2 his3::pNH603::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 sth1::STH1-V5-IAA7::KANMX6 FW6032 leu2::pGPD1-osTIR1::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6231 rap1::RAP1-V5-IAA7::KANMX6 sth1::STH1-AID-FLAG::hphNT his3::pGPD1-osTIR1::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4821 nrd1::NRD1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 FW6715 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 ada2::KANMX FW6707 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 arp8::KANMX FW6722 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 bre1::KANMX FW6683 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 cdc40::KANMX FW6725 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 cmr1::KANMX FW6682 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 gcn4::KANMX FW6717 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 gcn5::KANMX FW6698 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 gcr2::KANMX FW6721 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 hst1::KANMX FW6691 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 hst2::KANMX FW6678 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 hst3::KANMX FW6686 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 hst4::KANMX FW6681 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 isw1::KANMX FW6679 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 isw2::KANMX

FW6700 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 mga2::KANMX FW6687 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 ngg1::KANMX FW6719 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 nhp6a::KANMX FW6701 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 opi3::KANMX FW6706 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 paf1::KANMX FW6729 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rif1::KANMX FW6704 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rif2::KANMX FW6703 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rlf2::KANMX FW6694 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rpb9::KANMX FW6689 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rpd3::KANMX FW6708 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rrd1::KANMX FW6680 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rrp6::KANMX FW6685 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rsc1::KANMX FW6718 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rsc2::KANMX FW6699 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rtt106::KANMX FW6677 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 rtt109::KANMX FW6728 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 set2::KANMX FW6709 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 set3::KANMX FW6726 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sgf29::KANMX FW6695 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sgf73::KANMX FW6716 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sin4::KANMX FW6705 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sir1::KANMX FW6711 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sir2::KANMX FW6713 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sir3::KANMX FW6690 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sir4::KANMX FW6724 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 snf2::KANMX FW6723 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 snf5::KANMX FW6676 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 spt21::KANMX FW6684 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 spt3::KANMX FW6710 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 spt4::KANMX FW6714 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 spt7::KANMX FW6675 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 spt8::KANMX FW6702 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 srb2::KANMX

FW6693 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 ssn3::KANMX FW6696 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 sum1::KANMX FW6688 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 swi3::KANMX FW6697 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 swr1::KANMX FW6720 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 trf4::KANMX FW6712 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 ubp3::KANMX FW6692 MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0 vps16::KANMX FW4817 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 bur2::NATMX FW4756 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 ctk1::NATMX FW4757 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 est2::NATMX FW4819 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 ino80::NATMX FW4820 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 sch9::NATMX FW4758 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 spt23::NATMX FW4759 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 xrn1::NATMX MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6776 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(H709A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6777 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R747A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6778 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(M763A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6779 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(M817A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6780 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(M763A M817A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6781 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R804A M817A)::HIS3 FW6782 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0

rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(T700A D701A R747A K748A N749A S753A Patch 2B)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6783 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S725A D727A E729A Patch 4)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6784 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(K796A R804A T812A Patch 6)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6785 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(Δ672-827)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6786 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D689A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6787 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(K696A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6788 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D701A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6789 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D701R)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6790 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(Q715A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6791 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D701A H789A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6792 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D727A)::HIS3

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6793 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(E729A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6794 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S731A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6795 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S731Y)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6796 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(E734A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6797 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(E743R)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6798 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S753Y)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6799 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(N782R)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6800 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(H789A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6801 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D790A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6802 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(K796A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6803 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(N798A)::HIS3

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6804 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(Q800A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6805 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(E801A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6806 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R804A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6807 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(M817R)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6808 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(M817Y)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6809 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R820A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6810 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D701A R747A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6811 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(N798A D799A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6812 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R804A T812A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6813 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S731Y M763A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6814 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(S731Y R820E)::HIS3

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6815 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(N679A I682A N782A Patch 1)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 FW6816 his3::pNH603 Myc-NLS-Rap1(T700A D701A R747A K748A N749A Patch 2B)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6817 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(H709A D742A E743A Patch 3A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 FW6818 his3::pNH603 Myc-NLS-Rap1(L706A H709A D742A E743A Patch 3B)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6819 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(D761A M763A M817A Patch 5A)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 FW6820 his3::pNH603 Myc-NLS-Rap1(D761A M763A R814A M817A Patch 5B)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6821 rap1::RAP1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR::LEU2 his3::pNH603 Myc-NLS-Rap1(R747S)::HIS3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6407 ppt1::p592-YFP-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6895 ppt1::p593-YFP-R1p(SspI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW7253 ppt1::p618-YFP-R1d(XmnI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6208 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 ppt1::p592-YFP-pPPT1-mCherry::NATMX6

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6206 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 ppt1::p593-YFP-R1p(SspI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6204 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 ppt1::p595-YFP-R1prv(SspI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6408 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 ppt1::p618-YFP-R1d(XmnI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 leu2::pGPD1-osTIR1::LEU2 FW6218 sth1::STH1-AID-FLAG::hphNT ppt1::p593-YFP-R1p(SspI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 sth1::STH1-AID-FLAG::hphNT FW6433 his3::pGPD1-osTIR1::HIS3 ppt1::p593-YFP-R1p(SspI)-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6404 ppt1::p615-YFP-Cbf1_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 cbf1::KANMX FW6306 ppt1::p615-YFP-Cbf1_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6401 ppt1::p612-YFP-Gcn4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 gcn4::KANMX FW6300 ppt1::p612-YFP-Gcn4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6406 ppt1::p617-YFP-Hap4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 hap4::KANMX FW6304 ppt1::p617-YFP-Hap4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6402 ppt1::p613-YFP-Cat8_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 cat8::KANMX FW6302 ppt1::p613-YFP-Cat8_bs-pPPT1-mCherry::NATMX6

MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6403 ppt1::p614-YFP-Gal4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 gal4::KANMX FW6424 ppt1::p614-YFP-Gal4_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6405 ppt1::p616-YFP-Gcr1_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 gcr1::KANMX FW6315 ppt1::p616-YFP-Gcr1_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 abf1::ABF1-V5-IAA7::KANMX6 FW6415 leu2::pGPD1-osTIR1::LEU2 ppt1::p619-YFP-Abf1_bs-pPPT1-mCherry::NATMX6 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW6411 reb1::REB1-V5-IAA7::KANMX6 leu2::pGPD1-osTIR1::LEU2 ppt1::p620-YFP-Reb1_bs-pPPT1-mCherry::NATMX6 FW7228 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rpb3::RPB3-3xFLAG::NATMX MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rpb3::RPB3-3xFLAG::NATMX FW7220 leu2::pGPD1-osTIR1::LEU2 sth1::STH1-V5-AID::KANMX6 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rpb3::RPB3-3xFLAG::NATMX FW7232 leu2::pGPD1-osTIR1::LEU2 rap1::RAP1-V5-AID::KANMX6 sth1::STH1-V5-AID::KANMX6 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rpb3::RPB3-3xFLAG::NATMX FW7238 leu2::pGPD1-osTIR1::LEU2 rap1::RAP1-V5-AID::KANMX6 MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4714 p446 (pADH1-LexA_DBD-SV40NLS-ADH1terminator::HIS3) MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4715 p457 (pADH1-LexA_DBD-SV40NLS-Rap1_CTD-V5- ADH1terminator::HIS3) MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW4716 pRPL43B(280-308)::5xLexO-LoxP p446 (pADH1-LexA_DBD-SV40NLS-ADH1terminator::HIS3) FW4717 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0

pRPL43B(280-308)::5xLexO-LoxP p457 (pADH1-LexA_DBD-SV40NLS-Rap1_CTD-V5- ADH1terminator::HIS3) MATα his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5086 p294 (pADH1-LexA_DBD-SV40NLS-TAP-ADH1terminator::HIS3) MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW5087 pRPL43B(280-308)::5xLexO-LoxP p294 (pADH1-LexA_DBD-SV40NLS-TAP-ADH1terminator::HIS3) MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 FW8477 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 ura3::p375::C.a. URA3 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 FW8523 ura3::p704::pTDH3-dCas9-3xFLAG-ADH1term::C.a. URA3 p692::pRS305-pSNR52-sgIRT2_A::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 FW8531 ura3::p703::pTDH3-dCas9-Mxi1-3xFLAG-ADH1term::C.a. URA3 p692::pRS305-pSNR52-sgIRT2_A::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 FW8533 ura3::p703::pTDH3-dCas9-Mxi1-3xFLAG-ADH1term::C.a. URA3 p705::pRS305-pSNR52-sgiMLP1_I::LEU2 MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 rap1::RAP1-V5-IAA7::KANMX6 his3::pGPD1-osTIR1::HIS3 FW8535 ura3::p703::pTDH3-dCas9-Mxi1-3xFLAG-ADH1term::C.a. URA3 p352::pRS305-pSNR52-sgTEF1::LEU2

Table 2.1 Table of yeast strain genotypes

2.2.2 Transformation of yeast

Yeast strains were transformed using the polyethylene glycol (PEG)-lithium acetate (LiAc) method, and gene deletions were performed using the one-step disruption protocol by homologous recombination as previously described (Longtine et al., 1998). Fragments of DNA containing selection markers were amplified by polymerase chain reaction (PCR) using primers that each carried 40 nucleotides at the 5’ end homologous to the sequences flanking the genomic site of integration. Briefly, cells were grown to exponential growth phase in yeast extract peptone dextrose (YPD) media (see below for recipe), harvested by centrifugation, washed in 0.1 M LiAc, and incubated at 30 °C for 40 minutes with 30% w/v PEG-4000, 0.1 M LiAc, carrier DNA (single-stranded sonicated salmon sperm DNA) and the plasmid or linear DNA for transformation. Cells were subjected to heat shock at 42 °C for 20 minutes in a water bath, and subsequently plated onto auxotrophic selection agar plates or, for transformations with drug resistance markers, onto YPD agar plates to allow recovery before selection. For selection of drug-resistant colonies, lawns of cells were grown overnight at 30 °C on YPD agar plates and subsequently replica plated to YPD agar plates supplemented with drugs for selection as specified below (Materials and Methods 2.2.4). Wherever possible, transformed strains were verified for the correct gene deletion or integration by PCR using specific primers to generate an amplicon spanning one flanking edge of the integration site.
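The primer layout described above can be illustrated with a short Python sketch. It is not the procedure used to design the primers in this study: the function names, the toy flanking sequences, and the 20-nucleotide cassette-annealing sequences are hypothetical placeholders, chosen only to show how a 40-nucleotide homology arm is placed at the 5’ end of each primer.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_integration_primers(upstream: str, downstream: str,
                               anneal_fwd: str, anneal_rev: str,
                               arm_length: int = 40) -> tuple:
    """Build forward and reverse primers carrying 5' homology arms.

    upstream / downstream: top-strand genomic sequence immediately
    flanking the integration site (each at least arm_length nt long).
    anneal_fwd / anneal_rev: 3' portions that anneal to the marker
    cassette, both written 5'->3' as they would be ordered.
    """
    if len(upstream) < arm_length or len(downstream) < arm_length:
        raise ValueError("flanking sequences must be at least arm_length nt")
    # Forward arm: the last arm_length nt before the integration site.
    fwd_arm = upstream[-arm_length:]
    # Reverse arm: reverse complement of the first arm_length nt after it.
    rev_arm = reverse_complement(downstream[:arm_length])
    return fwd_arm + anneal_fwd, rev_arm + anneal_rev


# Toy example with made-up sequences (placeholders, not real loci or cassettes).
toy_upstream = "A" * 10 + "ACGT" * 10      # 50 nt upstream of the site
toy_downstream = "TTGCA" * 10              # 50 nt downstream of the site
toy_anneal_fwd = "GATTACAGATTACAGATTAC"    # hypothetical cassette-annealing part
toy_anneal_rev = "CTGACTGACTGACTGACTGA"    # hypothetical cassette-annealing part

fwd, rev = design_integration_primers(toy_upstream, toy_downstream,
                                      toy_anneal_fwd, toy_anneal_rev)
print(len(fwd), len(rev))  # each primer: 40 nt arm + 20 nt annealing region
```

In practice the annealing portions would be taken from the cassette plasmid being amplified, and the flanking sequences from the annotated genome.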

2.2.3 Genetic crossing of yeast

Yeast strains of opposite mating types (MATa or MATα) were grown on YPD agar plates and incubated overnight at 30 °C. Patches of yeast cells were mixed in an approximate 1:1 ratio and incubated for at least 6 hours at 30 °C on YPD agar plates. Cell suspensions (made in sterile water) were streaked onto YPD agar plates, and individual diploid zygotes were picked using a dissection microscope (Singer MSM 400). YPD agar plates with picked diploid zygote cells were incubated at 30 °C (for approx. 2 days) until colonies were visible. Colonies were picked and spread in larger patches on fresh YPD agar plates. The cells were then incubated for 1 day at 30 °C and 1 subsequent day at room temperature (approx. 18-20 °C), before they were transferred to sporulation media (SPO) agar plates (see below for recipe) in small patches. Cells were incubated on SPO plates for 1 week at room temperature, and examined for the presence of complete tetrads. Samples of cells taken from each colony were incubated with Zymolyase (1 mg/mL, in 1 M sorbitol solution) at 37 °C for 5 to 7 minutes to digest the spore wall and to allow individual asci to be manually dissected. Zymolyase-digested cells were streaked onto YPD agar plates, tetrads were manually dissected (Singer MSM 400), and the plates were incubated at 30 °C until individual colonies appeared.

2.2.4 Replica plating

Colonies were screened for the ability to sustain respiratory growth by replica plating on YP media supplemented with glycerol instead of glucose. For phenotypic selection using drug resistance markers, colonies were replica plated using sterile velvets onto YPD agar plates supplemented with drugs at the following concentrations: geneticin sulphate (G418, for selection of KANMX markers) at 0.2 mg/mL active concentration, nourseothricin (NAT) at 0.1 mg/mL. For phenotypic selection using auxotrophic markers, colonies were replica plated on synthetic complete medium (MP Biomedicals) “dropout” agar plates supplemented with glucose (2% w/v), lacking individual selected amino acids or supplements (e.g. histidine, leucine, or uracil). To determine the mating type of colonies derived from haploid spores, colonies were replica plated on yeast nitrogen base (YNB, 6.7 mg/mL) agar plates supplemented with glucose (2% w/v), on which MATa or MATα mating tester strains (full prototrophs) were spread. For strains containing auxotrophic markers, only cells that were able to mate with the prototrophic mating tester strain of the opposite mating type were able to survive on YNB-glucose medium.

2.2.5 Auxin-inducible degron (AID) system

A one-step tagging procedure was used to generate endogenous carboxy (C)-terminally tagged auxin-inducible degron (AID) alleles (Nishimura et al., 2009). The standard AID tag used for most experiments contains three copies of the V5 epitope and the IAA7 degron (Nishimura et al., 2009). The RAP1-AID-MYC allele contains nine copies of the Myc epitope and amino acid residues 71-114 of IAA17 (AID-MYC), whereas the STH1-AID-FLAG allele contains six copies of the FLAG epitope and amino acid residues 71-114 of IAA17 (AID-FLAG) (Morawska and Ulrich, 2013).

To reconstitute the functional AID system in yeast cells, single copy integration plasmids for expression of Oryza sativa TIR1 (osTIR1) ubiquitin E3 ligase (gift from Leon Chan, UC Berkeley) were transformed into yeast strains. To integrate the osTIR1 plasmids at either the HIS3 or LEU2 loci, plasmids were linearised by digestion with PmeI.

2.2.6 Cre-LoxP system

To delete endogenous Rap1 binding sites at the RPL43B and RPL40B loci, DNA sequences containing selection markers flanked by loxP sites (“floxed”) in tandem orientation were amplified by PCR (Gueldener et al., 2002), using specific primers containing 5’ homology arms directed to sequences flanking the target Rap1 binding sites. After transformation, selection, and isolation of positive colonies for successful integrations, the “floxed” markers were excised by transforming the yeast cells with non-integrating expression vectors for Cre recombinase. Individual colonies were isolated and screened; colonies that were positive for the Cre recombinase expression vector marker and negative for the integrated selection marker replacing the Rap1 binding sites were screened for successful excision of the “floxed” marker by PCR using convergently oriented primers flanking the target motifs. One residual LoxP sequence is retained within the integrated DNA.
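The outcome of marker excision can be illustrated with a minimal string-level sketch in Python. It assumes the commonly cited 34 bp loxP sequence and uses made-up flanking and marker sequences; the function name is hypothetical, and the sketch only models the end result for two loxP sites in tandem (same) orientation, not the recombination reaction itself.

```python
# Commonly cited 34 bp loxP site (13 bp repeat, 8 bp spacer, 13 bp repeat);
# treated here as an assumption for illustration.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_between_loxp(sequence: str, site: str = LOXP) -> str:
    """Remove the DNA between two tandem loxP sites, keeping one site.

    Models only the product of Cre-mediated excision: everything from the
    end of the first site to the end of the second site is dropped, so a
    single residual loxP remains in the locus.
    """
    first = sequence.find(site)
    if first == -1:
        raise ValueError("no loxP site found")
    second = sequence.find(site, first + len(site))
    if second == -1:
        raise ValueError("a second loxP site in the same orientation is required")
    return sequence[:first + len(site)] + sequence[second + len(site):]


# Toy locus: left flank, floxed placeholder "marker", right flank.
left_flank = "ACGTACGT"
marker = "G" * 30            # stand-in for the selection marker cassette
right_flank = "TTTTCCCC"
locus = left_flank + LOXP + marker + LOXP + right_flank

excised = excise_between_loxp(locus)
print(excised == left_flank + LOXP + right_flank)
print(excised.count(LOXP))   # number of loxP sites left after excision
```

The same logic explains why a PCR across the target motif with flanking primers yields a shorter product after excision: the marker is lost while one loxP site stays behind.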

The Cre-LoxP system was also used to introduce a “spacer” DNA sequence of 400 bp at the RPL43B promoter. The 400 bp spacer DNA derived from the pUG27 vector was included within the integrated DNA fragment by shifting the annealing site of the reverse PCR primer for amplification of the “floxed” marker by 400 bp. After integration and excision of the sequence between LoxP sites, the spacer sequence and one residual LoxP sequence are retained within the integrated DNA. A similar approach was used to replace the Rap1 binding site motifs at the RPL43B promoter with 5 copies of the LexO operator for the LexA DNA-binding protein.

2.3 Yeast culture conditions

2.3.1 Yeast culture conditions in liquid media

Cells were grown in yeast extract peptone dextrose (YPD) media (1% w/v yeast extract, 2% w/v bacto peptone, 2% w/v glucose, supplemented with tryptophan (96 mg/L), uracil (24 mg/L) and adenine (12 mg/L)). Yeast cells were grown in small batch cultures with shaking (300 RPM) in conical flasks at 30 °C.

A small amount of cells for each yeast strain was taken from patches of cells grown on YPD agar plates, and used to inoculate pre-cultures in YPD medium (25 to 100 mL). Pre-cultures were grown overnight to the saturated phase, diluted to OD600 0.2 in fresh YPD medium, and cultured until the exponential growth phase (approx. OD600 = 0.8), at which point cells were harvested (unless otherwise specified).
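As a worked illustration of the dilution step, the following Python sketch computes how much saturated pre-culture is needed to reach a starting OD600 of 0.2, and roughly how long exponential growth to OD600 0.8 would take. The pre-culture OD600 of 8.0, the 50 mL culture volume, and the 90-minute doubling time are illustrative assumptions, not values reported here, and the calculation assumes OD600 scales linearly with cell density over the range used.

```python
import math


def preculture_volume_ml(preculture_od: float, target_od: float,
                         final_volume_ml: float) -> float:
    """Volume of pre-culture to dilute into final_volume_ml (C1*V1 = C2*V2)."""
    if preculture_od <= target_od:
        raise ValueError("pre-culture must be denser than the target OD600")
    return target_od * final_volume_ml / preculture_od


def minutes_to_reach(start_od: float, harvest_od: float,
                     doubling_time_min: float) -> float:
    """Time for exponential growth from start_od to harvest_od."""
    return doubling_time_min * math.log2(harvest_od / start_od)


# Assumed example values (hypothetical): pre-culture at OD600 8.0, 50 mL
# culture, and a 90-minute doubling time in rich medium.
volume = preculture_volume_ml(preculture_od=8.0, target_od=0.2,
                              final_volume_ml=50.0)
duration = minutes_to_reach(start_od=0.2, harvest_od=0.8,
                            doubling_time_min=90.0)
print(f"{volume:.2f} mL pre-culture, about {duration:.0f} min of growth")
```

Going from OD600 0.2 to 0.8 corresponds to two doublings, so the estimated time is simply twice the assumed doubling time; any lag phase after dilution would lengthen it.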

For the single molecule RNA FISH experiments described in Figure 3.3, diploid cells were cultured until the saturated phase in nutrient-rich YPD media and transferred to sporulation media (SPO, 0.3% w/v potassium acetate and 0.02% w/v raffinose) to an OD600 of 1.8. Cells were collected immediately after resuspension in SPO media and fixed with formaldehyde (3% v/v) prior to sample processing.

For the LexA promoter tethering experiments described in Figure 4.10, yeast strains were grown in synthetic complete medium without histidine (SC-HIS, comprising 6.7 g/L yeast nitrogen base, YNB, supplemented with 2% w/v glucose, and 0.1 mg/mL of the following: methionine, adenine, leucine, tryptophan, uracil, and other essential amino acids) to retain the non-integrating LexA expression plasmids with HIS selection markers.

For auxin-inducible degron (AID) depletion experiments, depletion of AID-tagged proteins was induced with 3-indole-acetic acid (IAA) (Sigma-Aldrich) during the exponential growth phase (approx. OD600 0.8). Stock solutions of IAA were prepared in dimethyl sulfoxide (DMSO) at 1 M, and added directly to cultures to a final concentration of 500 μM.
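The IAA addition is a 1:2000 dilution of the 1 M stock, which the short Python sketch below makes explicit. The 50 mL culture volume is an assumed example, the function names are hypothetical, and the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ul(stock_molar: float, final_molar: float,
                    culture_volume_ml: float) -> float:
    """Microlitres of stock needed to reach final_molar in the culture."""
    return final_molar * culture_volume_ml / stock_molar * 1000.0


def solvent_percent_v_v(stock_ul: float, culture_volume_ml: float) -> float:
    """Approximate final solvent (DMSO) concentration in % v/v."""
    return stock_ul / (culture_volume_ml * 1000.0) * 100.0


# 1 M IAA stock, 500 uM final concentration, assumed 50 mL culture.
iaa_ul = stock_volume_ul(stock_molar=1.0, final_molar=500e-6,
                         culture_volume_ml=50.0)
dmso_pct = solvent_percent_v_v(iaa_ul, culture_volume_ml=50.0)
print(f"add {iaa_ul:.1f} uL of stock; DMSO is about {dmso_pct:.2f}% v/v")
```

Because the dilution factor is fixed by the stock and final concentrations, the final DMSO concentration is the same for any culture volume, which is convenient when setting up a matched vehicle-only control.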

2.3.2 Growth on agar medium plates

Yeast cells were grown or incubated on YPD, SC-HIS, or SPO agar plates at 30 °C or room temperature (composition identical to liquid YPD, SC-HIS, or SPO medium, except with the addition of agar to 2% w/v).

2.3.3 Storage of yeast strains

Yeast strains were stored as glycerol stocks. A small amount of cells was obtained from fresh patches of cells grown overnight on YPD agar plates, and resuspended in 15% v/v glycerol solution (sterilised through 0.22 μm filter unit). Strains were stored at -80 °C, and small amounts of cryo-stored cells were taken as required and incubated on YPD agar plates at 30 °C overnight to allow cells to recover and multiply.

2.4 Cloning, plasmids, and oligonucleotides

2.4.1 Rap1 mutants

All of the Rap1 mutant expression constructs were cloned into single copy integration plasmids (pNH603, gift from Wendell Lim, UCSF), transformed, and integrated into the RAP1-AID or RAP1-AID-MYC strain backgrounds.

The Rap1 mutants with large N- or C-terminal truncations (plasmids 471-474) in Figure 4.9, and deletions of specific functional domains (plasmids 477-483) in Figure 4.11 were sub-cloned from pRS415 CEN/ARS RAP1 plasmids (gift from Amanda Meyer and Tony Weil, Vanderbilt University) by restriction enzyme digestion into a single copy integration plasmid (plasmid 372) (Garbett et al., 2007; Layer et al., 2010). DNA encoding three tandem copies of green fluorescent protein (GFP) was introduced into the Rap1 DBD expression plasmid (p474) by Gibson-style cloning, using DNA from a plasmid template (p171).

Three copies of the V5 epitope tag were cloned in-frame at the C-terminus of Rap1 constructs by Gibson-style cloning (NEBuilder HiFi, NEB) in plasmids 477-483 to generate plasmids 558, 559, 561, 562, 566, and 568 for chromatin immunoprecipitation experiments (Figure 4.11).

The RAP1 mutants containing a set of 46 point and patch mutations within the C-terminal domain were re-cloned from plasmids described previously (gift from Cynthia Wolberger, Johns Hopkins University) (Feeser and Wolberger, 2008) into a single copy expression plasmid encoding full-length RAP1 (plasmid 471) by Gibson-style cloning. Each plasmid was verified by Sanger sequencing. The library of individual Rap1 mutant plasmids was linearised by digestion with PmeI prior to integration at the HIS3 locus in the RAP1-AID strain background.

The LexA and LexA-TAP plasmids (p446, p294) used for the LexA promoter tethering experiments described in Figure 4.10 have been previously described (van Werven and Timmers, 2006). The Rap1 C-terminal domain and three copies of the V5 epitope were cloned in-frame with the LexA DNA-binding domain by Gibson-style cloning, using a genomic DNA template from a strain with Rap1 endogenously tagged with the V5 epitope (FW4732).

2.4.2 Fluorescent reporter system for divergent promoter activity

The pPPT1-pSUT129 (pPS, mCherry-YFP) fluorescent reporter plasmid was described previously (gift from Sebastian Marquardt, University of Copenhagen) (Marquardt et al., 2014). The target motifs for each transcription factor, obtained from the YeTFaSCo database (de Boer and Hughes, 2012) (Figure 4.2, Figure 4.3, and Figure 4.7), were introduced by blunt-end cloning into unique restriction sites proximal (SspI) or distal (XmnI) to the SUT129 transcription start site and verified by Sanger sequencing. The tandem ribosomal protein gene target motifs for Rap1 were identical to those in the S. cerevisiae RPL43B promoter. After linearisation by digestion with EcoRI, the reporter construct was transformed and integrated to replace the endogenous PPT1-SUT129 locus.

2.4.3 Plasmid amplification and minipreps

To amplify plasmids, E. coli DH5α cells were transformed with plasmid DNA encoding the ampicillin resistance (ampR) marker, plated onto Luria Broth (LB) agar plates supplemented with 100 μg/mL ampicillin, and incubated overnight at 37 °C to select for positive colonies. Individual colonies were picked into 7 mL of liquid LB media supplemented with 100 μg/mL ampicillin, and cultured overnight with shaking at 37 °C before cell pellets were collected by centrifugation. Amplified plasmids were purified using a commercial miniprep kit (Macherey-Nagel NucleoSpin Plasmid) according to the manufacturer’s instructions, and plasmid DNA concentration was determined using a NanoDrop spectrophotometer.

2.4.4 Table of plasmids used in this study

Plasmid number | Plasmid name
255 | pFA6A-V5::KANMX6
252 | pFA6A-V5-IAA7::KANMX6
547 | pKAN-IAA17 (71-114)-Myc::KANMX
546 | pHYG-IAA17 (71-114)-FLAG::HPHNT
250 | pNH603 pGPD1-osTIR1 HIS3
247 | pNH605 pGPD1-osTIR1 LEU2
227 | NATMX gene deletion
471 | pNH603 Myc-NLS-Rap1 (1-827)::HIS3
472 | pNH603 Myc-NLS-Rap1 (339-827)::HIS3
473 | pNH603 Myc-NLS-Rap1 (1-599)::HIS3
474 | pNH603 Myc-NLS-Rap1 (339-599)::HIS3
372 | pNH603::HIS3 single copy integration vector
477 | pNH603 HA-NLS-Rap1 (1-827)::HIS3
478 | pNH603 HA-NLS-Rap1 (ΔDBD 362-597)::HIS3
479 | pNH603 HA-NLS-Rap1 (ΔTox 597-662)::HIS3
480 | pNH603 HA-NLS-Rap1 (ΔAD 631-678)::HIS3
482 | pNH603 HA-NLS-Rap1 (Δ764-827)::HIS3
483 | pNH603 HA-NLS-Rap1 (Δ631-696)::HIS3
566 | pNH603 HA-NLS-Rap1 (1-827)-V5::HIS3
558 | pNH603 HA-NLS-Rap1 (ΔDBD 362-597)-V5::HIS3
559 | pNH603 HA-NLS-Rap1 (ΔTox 597-662)-V5::HIS3
568 | pNH603 HA-NLS-Rap1 (ΔAD 631-678)-V5::HIS3
561 | pNH603 HA-NLS-Rap1 (Δ764-827)-V5::HIS3
562 | pNH603 HA-NLS-Rap1 (Δ631-696)-V5::HIS3
171 | pFA6a-3xeGFP-KANMX6
105 | LoxP-KANMX5-LoxP
106 | LoxP-HIS5MX4-LoxP
108 | LoxP-KlURA3MX4-LoxP
295 | LoxP-5xLacO-5xLexO-LoxP
109 | pGAL1-CRE::URA3
110 | pGAL1-CRE::HIS3
592 | YFP-pPPT1-mCherry::NATMX6
593 | YFP-R1p(SspI)-pPPT1-mCherry::NATMX6
595 | YFP-R1prv(SspI)-pPPT1-mCherry::NATMX6
618 | YFP-R1d(XmnI)-pPPT1-mCherry::NATMX6
615 | YFP-Cbf1_bs-pPPT1-mCherry::NATMX6
612 | YFP-Gcn4_bs-pPPT1-mCherry::NATMX6
617 | YFP-Hap4_bs-pPPT1-mCherry::NATMX6
613 | YFP-Cat8_bs-pPPT1-mCherry::NATMX6
614 | YFP-Gal4_bs-pPPT1-mCherry::NATMX6
616 | YFP-Gcr1_bs-pPPT1-mCherry::NATMX6
619 | YFP-Abf1_bs-pPPT1-mCherry::NATMX6
620 | YFP-Reb1_bs-pPPT1-mCherry::NATMX6
446 | pADH1-LexA_DBD-SV40NLS-ADH1terminator::HIS3
457 | pADH1-LexA_DBD-SV40NLS-Rap1_CTD-V5-ADH1terminator::HIS3
294 | pADH1-LexA_DBD-SV40NLS-TAP-ADH1terminator::HIS3
627 | 3xFLAG-NATMX
704 | pTDH3-dCas9-3xFLAG-ADH1term::C.a. URA3
703 | pTDH3-dCas9-Mxi1-3xFLAG-ADH1term::C.a. URA3
692 | pRS305-pSNR52-sgIRT2_A::LEU2
705 | pRS305-pSNR52-sgiMLP1_I::LEU2
352 | pRS305-pSNR52-sgTEF1::LEU2
492-501, 510-545 | pNH603 Myc-NLS-Rap1 (point or patch mutant)::HIS3, see below
492 | H709A
493 | R747A
494 | M763A
495 | M817A
496 | M763A M817A
497 | R804A M817A
498 | T700A D701A R747A K748A N749A S753A (Patch 2B)
499 | S725A D727A E729A (Patch 4)
500 | K796A R804A T812A (Patch 6)
501 | Δ672-827
510 | D689A
511 | K696A
512 | D701A
513 | D701R
514 | Q715A
515 | D701A H789A
516 | D727A
517 | E729A
518 | S731A
519 | S731Y
520 | E734A
521 | E743R
522 | S753Y
523 | N782R
524 | H789A
525 | D790A
526 | K796A
527 | N798A
528 | Q800A
529 | E801A
530 | R804A
531 | M817R
532 | M817Y
533 | R820A
534 | D701A R747A
535 | N798A D799A
536 | R804A T812A
537 | S731Y M763A
538 | S731Y R820E
539 | N679A I682A N782A (Patch 1)
540 | T700A D701A R747A K748A N749A (Patch 2A)
541 | H709A D742A E743A (Patch 3A)
542 | L706A H709A D742A E743A (Patch 3B)
543 | D761A M763A M817A (Patch 5A)
544 | D761A M763A R814A M817A (Patch 5B)
545 | R747S

Table 2.2 Table of plasmids used in this study

2.4.5 Table of oligonucleotides used in this study

No. | Sequence (5' - 3') | Name | Notes
N/A | CACTCTrGrArGrCrArArUrArCrC | TSS-seq RNA adapter oligonucleotide | 5' RNA adapter ligated to 5' ends of decapped RNA fragments (TSS sequencing protocol). RNA nucleotides are listed with the prefix “r”.
N/A | GCAC[iBiodT]GCACTCTGAGCAATACC | TSS-seq 2nd strand synthesis primer | Primer for 2nd strand synthesis in TSS sequencing protocol (internally biotinylated)
489 | ATGCAACGCCTACTTGTTTT | IME1 -2400 REV | Oligos to amplify IRT2 northern blot probe DNA template from genomic DNA
493 | GATGGAGGGTTGGCATAAAA | IME1 UME6Δ check FW | Oligos to amplify IRT2 northern blot probe DNA template from genomic DNA
1130 | TGCACCCAGACAACTACACA | AW10_RPL40B_ncRNA_probef | Oligos to amplify iMLP1 northern blot probe DNA template from genomic DNA
1131 | CGCCGTAAGACTCAATGGAC | AW11_RPL40B_ncRNA_prober | Oligos to amplify iMLP1 northern blot probe DNA template from genomic DNA
2111 | GGCCCTGATGATAATG | AW411_SNR190_NBprobe_fwd | Oligos to amplify SNR190 northern blot probe DNA template from genomic DNA
2112 | GGCTCAGATCTGCATG | AW412_SNR190_NBprobe_rev | Oligos to amplify SNR190 northern blot probe DNA template from genomic DNA
1701 | TGCGGCTGGTATGGTATTGTAAGG | AW255_pRPL43B_Rap1_ChIP_A_fwd | Oligos to amplify region adjacent to Rap1 binding sites at pRPL43B, for ChIP-qPCR
1702 | AAAGGCAGAAGATGGGCGGC | AW256_pRPL43B_Rap1_ChIP_B_rev | Oligos to amplify region adjacent to Rap1 binding sites at pRPL43B, for ChIP-qPCR
2170 | GCTTTACCTCTTGCTGAACGGGA | AW434_pRPL40B_ChIP_A_fwd | Oligos to amplify region adjacent to Rap1 binding sites at pRPL40B, for ChIP-qPCR
2171 | TCCGCCATATGATCCGCCTC | AW435_pRPL40B_ChIP_A_rev | Oligos to amplify region adjacent to Rap1 binding sites at pRPL40B, for ChIP-qPCR
106 | GTACCACCATGTTCCCAGGTATT | FvW_ACTFrt | Oligos to amplify region at 3' end of ACT1 ORF, for ChIP-qPCR
268 | AGATGGACCACTTTCGTCGT | FvW_ACT1rt | Oligos to amplify region at 3' end of ACT1 ORF, for ChIP-qPCR

Table 2.3 Table of oligonucleotides used in this study All oligonucleotides were obtained from Integrated DNA Technologies (IDT).

2.5 Experimental methods

2.5.1 Fluorescence microscopy

Yeast cells were grown in YPD media (small batch cultures) to the exponential growth phase, and fixed with formaldehyde (3.7% w/v) for 15 minutes at room temperature. Fixed cells were washed with phosphate-sorbitol buffer (0.1 M KPi (pH 7), 0.05 M MgCl2, 1.2 M sorbitol), and resuspended in the same buffer prior to imaging. Images were acquired using a Nikon Eclipse Ti-E inverted microscope imaging system (Nikon) equipped with a 100x oil objective (NA 1.4), SOLA SE light engine (Lumencor), ORCA-FLASH 4.0 camera (Hamamatsu) and NIS-Elements AR software (Nikon). An exposure time of 500 ms was used, and GFP and mCherry filters were used to detect YFP and mCherry signals, respectively.

To quantify whole cell fluorescence signals, measurements were performed using ImageJ (version 1.52i, NIH) (Schneider et al., 2012) for the YFP and mCherry channels. Regions of interest (ROIs) were manually drawn around the border of each cell. Mean signal represents the mean intensity in each channel per cell multiplied by the cell area, and the signal for YFP and mCherry was corrected for cell-free background fluorescence in a similar manner. Wild-type cells without integrated fluorescent reporter plasmids were also measured to determine autofluorescence signal. 50 cells were quantified for each sample.
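The whole cell signal calculation can be illustrated with a short sketch. It assumes that the background correction subtracts the mean intensity of a cell-free region scaled to the same area as the cell ROI, which is one reading of "in a similar manner"; the function name and the example values are hypothetical.

```python
def corrected_signal(cell_mean, cell_area, background_mean):
    """Background-corrected whole cell signal for one channel.

    cell_mean:        mean pixel intensity inside the cell ROI
    cell_area:        ROI area (pixels)
    background_mean:  mean pixel intensity of a cell-free region
    """
    integrated = cell_mean * cell_area        # mean intensity x cell area
    background = background_mean * cell_area  # assumed: background scaled to the same area
    return integrated - background


# Hypothetical measurements for one cell in the YFP channel
yfp_signal = corrected_signal(cell_mean=420.0, cell_area=250.0, background_mean=100.0)
print(yfp_signal)
```

The same function would be applied separately to the mCherry channel and to reporter-free wild-type cells to estimate autofluorescence.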

2.5.2 Spot growth assay

To perform spot assays for cellular growth, yeast cells were grown overnight to saturated phase in YPD small batch cultures, then diluted to OD600 0.4 in sterile water. Serial 5-fold dilutions were made and 3 μL of each dilution was spotted onto YPD agar plates in the presence of IAA or DMSO, and allowed to dry. Cells were incubated at 30 °C for 2 days before imaging.
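As an illustration of the dilution series, the sketch below lists the OD600 of each spot in a 5-fold series starting at OD600 0.4. The number of spots (five) is an assumption, as it is not stated above.

```python
def serial_dilution(start_od, fold, n_spots):
    """OD600 of each spot in a serial dilution, starting with the undiluted sample."""
    return [start_od / fold**i for i in range(n_spots)]


# Assumed five spots per strain
spots = serial_dilution(start_od=0.4, fold=5, n_spots=5)
print([f"{od:.5f}" for od in spots])
```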

2.5.3 RNA extraction

Yeast cells were harvested from cultures for RNA extraction by centrifugation, then washed once with sterile water prior to snap-freezing in liquid nitrogen. RNA was extracted from frozen yeast cell pellets using Acid Phenol:Chloroform:Isoamyl alcohol (125:24:1, Ambion) and Tris-EDTA-SDS (TES) buffer (0.01 M Tris-HCl pH 7.5, 0.01 M EDTA, 0.5% w/v SDS), by rapid agitation (1400 RPM, 65 °C for 45 minutes). After centrifugation, the aqueous phase was obtained and RNA was precipitated at -20 °C overnight in ethanol with 0.3 M sodium acetate. After centrifugation and washing with 80% (v/v) ethanol solution, dried RNA pellets were resuspended in DEPC-treated sterile water and subsequently stored at -80 °C.

2.5.4 Northern blot

Northern blots were performed as described previously (Chia et al., 2017; Wu et al., 2018b). Briefly, RNA samples (10 μg per lane) were incubated in sample denaturation buffer (1 M deionised glyoxal, 50% v/v DMSO, 10 mM sodium phosphate (NaPi) buffer (pH 6.8)) at 70 °C for 10 minutes, loading buffer (10% v/v glycerol, 2 mM NaPi buffer, 0.4% w/v bromophenol blue) was added, and RNA samples were subjected to electrophoresis (2 hours at 80 V) on an agarose gel (1.1% v/v agarose, 0.01 M NaPi buffer). Capillary transfer was used to transfer total RNA onto positively charged nylon membranes (GE Amersham Hybond N+). Bands corresponding to mature rRNA were visualised by staining with methylene blue solution (0.02% w/v methylene blue, 0.3 M sodium acetate).

The nylon membranes were incubated for at least 3 hours at 42 °C in hybridisation buffer (1% w/v SDS, 40% v/v deionised formamide, 25% w/v dextran sulfate, 58 g/L NaCl, 200 mg/L sonicated salmon sperm DNA (Agilent), 2 g/L BSA, 2 g/L polyvinylpyrrolidone, 2 g/L Ficoll 400, 1.7 g/L pyrophosphate, 50 mM Tris pH 7.5) or ULTRAhyb Ultrasensitive Hybridization Buffer (Thermo Fisher Scientific) to minimise non-specific probe hybridisation. Probes were synthesised using a Prime-It II Random Primer Labeling Kit (Agilent), 25 ng of target-specific DNA template, and radioactively labelled with dATP [α-32P] (Perkin-Elmer or Hartmann Analytic). The oligonucleotides used to amplify target-specific DNA templates for IRT2, iMLP1, and SNR190 northern blot probes by PCR can be found in Table 2.3.

Blots were hybridised overnight with radioactively labelled probes at 42 °C, and then washed at 65 °C for 30 minutes each with the following buffers: 2X saline-sodium citrate (SSC) buffer, 2X SSC with 1% w/v SDS, 1X SSC with 1% SDS, and 0.5X SSC with 1% SDS. For image acquisition, membranes were exposed to storage phosphor screens before scanning on the Typhoon 9400, FLA 9500, or FLA 7000 instruments (GE Healthcare Life Sciences). To strip membranes prior to re-probing for different transcripts, membranes were washed with stripping buffer (1 mM Tris, 0.1 mM EDTA, 0.5% w/v SDS) at 85 °C until negligible residual signal remained on the blots.

IRT2, iMLP1, and SNR190 levels were estimated from northern blots using ImageJ (version 1.52i, NIH) (Schneider et al., 2012). For each band of interest, a rectangular ROI encompassing the band was drawn, and the net intensity was obtained by subtracting the mean background intensity of the areas immediately above and below the ROI from the ROI signal. Signals were first normalised to SNR190 levels, and then further normalised to a specific band on the same membrane (signal assigned an arbitrary value of 1).
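The quantification steps above can be sketched as follows. The function names and intensity values are hypothetical, and the background is assumed to be the simple average of the regions above and below the band.

```python
def net_intensity(roi, above, below):
    """ROI signal minus the mean background of the areas above and below it."""
    return roi - (above + below) / 2


def normalise_lanes(target, loading, reference_lane=0):
    """Normalise target signals to the loading control, then to one reference lane."""
    ratios = [t / l for t, l in zip(target, loading)]
    reference = ratios[reference_lane]
    return [r / reference for r in ratios]


# Hypothetical two-lane blot: transcript of interest and SNR190 loading control
target = [net_intensity(150.0, 50.0, 50.0), net_intensity(260.0, 60.0, 60.0)]
snr190 = [net_intensity(450.0, 50.0, 50.0), net_intensity(460.0, 60.0, 60.0)]

relative = normalise_lanes(target, snr190, reference_lane=0)
print(relative)
```

The reference lane is assigned a value of 1 by construction, so the remaining values read directly as fold changes relative to that lane.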

2.5.5 Western blot

Western blots were performed as described previously (Chia et al., 2017; Wu et al., 2018b). Protein extracts were prepared from whole cells after fixation with trichloroacetic acid (TCA). Samples were pelleted by centrifugation and incubated with 5% w/v TCA solution at 4 °C for at least 10 minutes. Samples were washed with acetone, pelleted, and dried. Samples were then resuspended in protein breakage buffer (50 mM Tris (pH 7.5), 1 mM EDTA, 2.75 mM dithiothreitol (DTT)) and subjected to disruption using a Mini Beadbeater (Biospec) and 0.5 mm glass beads. Protein extract samples were mixed with SDS-PAGE sample buffer (187.5 mM Tris (pH 6.8), 6.0% v/v β-mercaptoethanol, 30% v/v glycerol, 9.0% v/v SDS, 0.05% w/v Bromophenol blue) in a 2:1 ratio by volume, and protein samples were denatured at 95 °C for 5 minutes.

SDS-PAGE (polyacrylamide gel electrophoresis) was performed using 4-20% gradient gels (Bio-Rad TGX) and samples were then transferred onto PVDF membranes by electrophoresis (wet transfer in cold transfer buffer: 3.35% w/v Tris, 14.9% w/v glycine, 20% v/v methanol). Membranes were incubated in blocking buffer (1% w/v BSA, 1% w/v non-fat powdered milk in phosphate buffered saline with 0.01% v/v Tween-20 (PBST) buffer) before primary antibodies were added to blocking buffer for overnight incubation at 4 °C. For probing with secondary antibodies, membranes were washed in PBST and anti-mouse or anti-rabbit IgG horseradish peroxidase (HRP)-linked antibodies were used for incubation in blocking buffer (1 hour, room temperature). Signals corresponding to protein levels were detected using Amersham ECL Prime detection reagent and an Amersham Imager 600 instrument (GE Healthcare).

2.5.6 Antibodies

The following antibodies were used for western blotting.

Antibody | Dilution | Source | Identifier
Anti-V5 mouse monoclonal IgG2A | 1:2000 | Thermo Fisher Scientific | R96025, previously Invitrogen 46-0705
Anti-hexokinase rabbit IgG | 1:8000 | US Biological | H2035
Anti-Myc tag mouse monoclonal IgG, clone 4A6 | 1:2000 | Merck | CAT 05-724 Lot DAM1764400
Anti-HA tag mouse IgG, clone 12CA5 | 1:2000 | Cell Services Facility, The Francis Crick Institute | Clone 12CA5
Anti-FLAG mouse monoclonal IgG1, clone M2 | 1:2000 | Sigma-Aldrich (Merck) | F3165
Amersham ECL anti-mouse IgG, HRP-linked whole antibody (from sheep) | 1:10000 | GE Life Sciences | NA931V5
Amersham ECL anti-rabbit IgG, HRP-linked whole antibody (from donkey) | 1:10000 | GE Life Sciences | NA934V

Table 2.4 Table of antibodies used for western blotting

2.5.7 Cycloheximide protein stability assay

To measure the stability of Rap1 mutant proteins, cycloheximide was used to inhibit translation. Yeast cells were grown in YPD media (small batch cultures) until mid-logarithmic growth, and treated with IAA (500 μM) to induce depletion of endogenously tagged Rap1-AID protein while leaving expression of auxin-insensitive Rap1 mutant constructs unaffected. Cycloheximide (final concentration 0.2 mg/mL) was added 1 hour after treatment with auxin, and samples of cells were taken and fixed in TCA (see above) at specified time points. Samples were processed for SDS-PAGE, transferred to western blot membranes, and probed using anti-HA or anti-MYC primary antibodies, and subsequently HRP-linked secondary antibodies (ECL). Signals were quantified using ImageJ (version 1.52i, NIH) (Schneider et al., 2012) as follows: To obtain the normalised net intensity of each ROI, the mean background intensity of the areas immediately above and below the ROI was subtracted from the main signal comprising a rectangular ROI encompassing the specific band of interest. Signals for each mutant protein were normalised individually to the corresponding Hxk1 loading control, then further normalised to the respective signal at the 0 min time point (prior to addition of cycloheximide) for each blot (signal assigned arbitrary value of 1). Values from three biological replicate experiments were analysed and plotted.
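A minimal sketch of the time course normalisation is shown below, assuming background-subtracted band intensities are already available. The intensity values, the number of time points, and the use of the arithmetic mean across replicates are illustrative assumptions.

```python
from statistics import mean


def relative_levels(protein, hxk1):
    """Normalise each time point to Hxk1, then to the 0 min time point (set to 1)."""
    ratios = [p / h for p, h in zip(protein, hxk1)]
    return [r / ratios[0] for r in ratios]


# Hypothetical net intensities at three time points for three biological replicates
replicates = [
    relative_levels([100.0, 80.0, 50.0], [200.0, 200.0, 200.0]),
    relative_levels([90.0, 63.0, 45.0], [180.0, 180.0, 180.0]),
    relative_levels([120.0, 108.0, 60.0], [240.0, 240.0, 240.0]),
]

# Average the replicates at each time point
mean_levels = [mean(time_point) for time_point in zip(*replicates)]
print(mean_levels)
```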

2.5.8 Chromatin immunoprecipitation (ChIP)

Chromatin immunoprecipitation (ChIP) experiments were performed as previously described (Chia et al., 2017; Wu et al., 2018b). Cells were fixed by adding formaldehyde (1% v/v) and incubating the samples at room temperature for 20 minutes. Glycine (100 mM) was added to quench the fixation reactions. Samples were washed once using FA lysis buffer (0.05 M HEPES-KOH (pH 7.5), 0.15 M NaCl, 0.001 M EDTA (pH 8), 1% v/v Triton-X-100, 0.1% w/v sodium deoxycholate, 0.1% w/v SDS) and samples were frozen in liquid nitrogen. Frozen pellets were subjected to cell breakage using a Mini Beadbeater (Biospec) and zirconia/silica beads (0.5 mm, Biospec). Chromatin extracts were sheared by sonication using a Bioruptor (Diagenode, 9 cycles of 30 s on/off, high intensity). To immunoprecipitate V5-tagged proteins, clarified extracts were incubated for 2 hours at room temperature with 20 μL of anti-V5 agarose affinity gel (Sigma-Aldrich A7345). Agarose beads were subjected to extensive washing: four times with FA lysis buffer, three times with Wash 1 buffer (FA Lysis Buffer with 260 mM NaCl), and twice with Wash 2 buffer (10 mM Tris pH 8, 250 mM LiCl, 0.05% v/v IGEPAL CA-630, 0.05% w/v sodium deoxycholate, 0.1 mM EDTA). The anti-V5 resin was incubated in TE-SDS buffer (10 mM Tris (pH 8), 1 mM EDTA, 1.0% w/v SDS) at 65 °C overnight to reverse formaldehyde cross-links. Samples were treated with Proteinase K (Thermo Fisher Scientific) at 37 °C for at least 2 hours to digest de-crosslinked protein, and DNA fragments were purified by spin column (Macherey-Nagel). ChIP signals corresponding to target protein binding at the RPL43B and RPL40B promoters were determined by quantitative PCR (qPCR) using PowerUp SYBR Green Master Mix (Thermo Fisher Scientific) on a QuantStudio 3 instrument (Applied Biosystems). The signals at RPL43B and RPL40B were normalised to a region at the 3’ end of the ACT1 gene (negative control). The oligonucleotide sequences used for ChIP experiments are listed in Table 2.3.
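The text above does not state how qPCR signals were converted to relative enrichment. One common approach, shown here purely as an assumption, is the ΔCt method with an idealised amplification efficiency of 2 (product doubling each cycle); the function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_target, ct_control, efficiency=2.0):
    """Enrichment of the target region over the control region in a ChIP sample.

    Assumes the delta-Ct method: a lower Ct at the target than at the
    control region means more template was recovered at the target.
    """
    return efficiency ** (ct_control - ct_target)


# Hypothetical Ct values: RPL43B promoter amplicon vs ACT1 3' end amplicon
rpl43b_over_act1 = fold_enrichment(ct_target=24.0, ct_control=29.0)
print(rpl43b_over_act1)
```

If primer efficiencies were measured with standard curves, the measured value would replace the default of 2.0 for each primer pair.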

2.5.9 CRISPR interference (CRISPRi)

For CRISPR interference (CRISPRi) experiments, yeast expression constructs for nuclease-inactivated Cas9 (D10A/H840A mutations) from Streptococcus pyogenes – dCas9 (Addgene #46920) and dCas9-Mxi1 (Addgene #46921) (Gilbert et al., 2013; Qi et al., 2013) – were sub-cloned into single copy integration plasmids by restriction cloning. The 3xFLAG epitope was introduced by Gibson-style cloning (NEBuilder HiFi, NEB) at the C-terminus of each construct, in-frame with the ORF. dCas9-3xFLAG and dCas9-Mxi1-3xFLAG expression constructs on single copy integration plasmids were transformed into yeast after linearisation with PmeI. Single guide RNA (sgRNA) plasmids (gift from Elçin Ünal, UC Berkeley) (Chen et al., 2017) were generated by site-directed mutagenesis (Q5 Site-Directed Mutagenesis Kit, NEB), using specific primers containing the sgRNA 20-mer target sequence (split in half across the two primers). sgRNA target sequence selection was guided by availability of “NGG” protospacer adjacent motif (PAM) sites and transcription start-site sequencing (TSS-seq) data presented in Chapter 4. sgRNA expression plasmids were integrated at the SNR52 locus after linearisation by XbaI digestion.
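The PAM constraint on sgRNA target selection can be illustrated with a simple scan for 20-mers immediately followed by "NGG" on either strand. This sketch covers only the PAM criterion; it does not model the TSS-seq data that also guided selection, and the function names and test sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pam_targets(seq, guide_len=20):
    """Return (strand, start, protospacer, PAM) for every 20-mer followed by NGG.

    Start positions on the '-' strand are indices into the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The last valid start leaves room for the protospacer plus a 3 nt PAM
        for i in range(len(s) - guide_len - 2):
            pam = s[i + guide_len : i + guide_len + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, s[i : i + guide_len], pam))
    return hits


# Hypothetical 23 nt sequence containing a single target on the + strand
hits = find_pam_targets("A" * 20 + "TGG")
print(hits)
```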

2.5.10 Single molecule RNA fluorescence in situ hybridisation (FISH)

Single molecule RNA fluorescence in situ hybridisation (FISH) was performed as described previously (Moretto et al., 2018). Formaldehyde (3% w/v) was added to fix cells overnight. Samples were then resuspended in buffer B (1.5 M sorbitol, 0.1 M potassium phosphate dibasic pH 7.5, 0.2% v/v β-mercaptoethanol, 2 mM vanadyl-ribonucleoside complex), treated with Zymolyase to spheroplast the cells, and fixed in 80% v/v ethanol. Fluorophore-labelled FISH probes (Biosearch Technologies) targeting IME1 (AF594) and ACT1 (Cy5, internal control) (Dyes, Thermo Fisher Scientific) were hybridised (30 °C, overnight) in hybridisation buffer (10% dextran sulphate, 2 mM vanadyl-ribonucleoside complex, 0.02% w/v BSA, 1 mg/mL E. coli tRNA, 2X SSC, 10% formamide) to target transcripts within the samples. Samples were washed with wash buffer (10% formamide, 0.3 M sodium chloride, 0.03 M sodium citrate), wash buffer with 5 μg/mL DAPI (4’,6-diamidino-2-phenylindole), and 2X SSC before imaging.

Cells were imaged using a Nikon Eclipse Ti-E imaging system (Nikon) equipped with a 100x oil objective (NA 1.4), SOLA SE light engine (Lumencor), and an ORCA-FLASH 4.0 camera (Hamamatsu). Images were collected in the DIC, DAPI, AF594 (IME1), and Cy5 (ACT1) channels every 0.3 microns (each stack comprising 20 images) using NIS-Elements AR software (Nikon). Maximum intensity Z projections were generated for quantification using ImageJ (version 1.52i, NIH) (Schneider et al., 2012). Subsequently, StarSearch software (Arjun Raj laboratory, University of Pennsylvania, http://rajlab.seas.upenn.edu/StarSearch/launch.html) was used for single cell transcript quantification. 139 cells were quantified for each sample using comparable thresholds in StarSearch, and only cells positive for the internal control ACT1 transcript were analysed.
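A maximum intensity Z projection keeps, for each pixel position, the brightest value found across all slices of the stack. The sketch below illustrates the operation conceptually on nested lists; it is not the ImageJ implementation, and the function name and toy stack are hypothetical.

```python
def max_projection(stack):
    """Maximum intensity projection of a Z stack.

    stack: list of 2D images (lists of rows), all with identical dimensions.
    Returns one 2D image holding the per-pixel maximum across slices.
    """
    return [
        [max(pixels) for pixels in zip(*rows)]  # same (y, x) position in every slice
        for rows in zip(*stack)                 # same row y in every slice
    ]


# Toy stack of two 2 x 2 slices (the stacks described above contain 20 slices)
stack = [
    [[1, 5], [0, 2]],
    [[3, 4], [7, 1]],
]
projection = max_projection(stack)
print(projection)
```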

2.5.11 RNA sequencing (RNA-seq)

Total RNA was extracted from yeast as described above, treated with rDNase in solution (Macherey-Nagel), and purified by spin column (Macherey-Nagel) prior to preparation of sequencing libraries. 1 μg of intact yeast total RNA was treated with the Illumina RiboZero Gold rRNA Removal Kit (Yeast) to remove rRNA prior to total RNA sequencing, and 500 ng of RNA input material was used for polyadenylated (polyA) RNA sequencing. Libraries were prepared using the TruSeq Stranded Total RNA kit or TruSeq Stranded mRNA kit (Illumina) according to the manufacturer’s instructions (10 or 13 PCR cycles). The libraries were multiplexed and sequenced on either the HiSeq 2500 or 4000 platform (Illumina), and generated ~45 million 101 bp strand-specific paired-end reads per sample on average.

2.5.12 Transcription start site sequencing (TSS-seq)

Transcription start site sequencing (TSS-seq) libraries were prepared to obtain sequencing reads representing the 5’ ends of polyadenylated and capped transcripts. Approximately 7-9 μg of polyA purified RNA was fragmented by incubation at 70 °C during zinc ion-mediated fragmentation (Ambion). Fragmented RNA was then purified using RNeasy MinElute Cleanup spin columns (QIAGEN), to isolate RNA fragments with a mode length of ~200 nucleotides. To remove the 5’ phosphate groups of non-capped RNA fragments, fragmented purified RNA was incubated with alkaline phosphatase (rSAP, NEB), and the reaction mixture was subjected to acid phenol:chloroform RNA extraction and ethanol precipitation as described above. Dephosphorylated RNA fragments were incubated with Cap-Clip™ Acid Pyrophosphatase (CellScript) to remove the 5’ me7Gppp caps from fragments representing the genuine 5’ ends of transcripts, leaving a ligation-competent 5’ terminal phosphate group. For the “no decapping” control sample, an aliquot was taken from the wild-type (WT, FW629) sample and treatment with Cap-Clip™ was not performed. After one round of acid phenol:chloroform RNA extraction and ethanol precipitation, a custom 5’ adapter DNA/RNA oligonucleotide was ligated to 5’ decapped ends of fragments using T4 RNA ligase 1 (NEB) (see Table 2.3 listing oligonucleotides). Excess adapters were removed with an additional MinElute column purification step. First strand cDNA synthesis was performed using Superscript IV Reverse Transcriptase (Thermo Fisher Scientific), and the RNA strand of the RNA:DNA hybrid was digested with RNase H (NEB) and an RNase cocktail (Ambion). The KAPA HiFi HotStart ReadyMix PCR Kit was used to perform second strand cDNA synthesis with a biotinylated primer complementary to the 5’ adapter oligonucleotide sequence. Double-stranded cDNA was purified using streptavidin-coupled magnetic beads and quantified by Qubit fluorometric quantitation (Thermo Fisher Scientific). Libraries were prepared from cDNA input material using the KAPA Hyper Prep Kit (KAPA Biosystems) and KAPA Single-Indexed Adapters for Illumina platforms (KAPA Biosystems). Final libraries were subjected to 1X bead-based cleanup (HighPrep PCR, MagBio Genomics). Libraries were quantified by Qubit and sequenced on the HiSeq 4000 platform (Illumina), typically generating ~39 million 76 bp strand-specific single-end reads per sample. The “no decapping” control library generated ~16 million single-end reads.

2.5.13 Nascent RNA sequencing (Nascent RNA-seq)

For nascent RNA sequencing (nascent RNA-seq), RNA fragments associated with RNA polymerase II subunit Rpb3 endogenously tagged with 3xFLAG epitope at the C-terminus were isolated by affinity purification as described previously (Churchman and Weissman, 2011, 2012). Small batch cultures of yeast cells grown in YPD media were collected by centrifugation, the supernatant was aspirated, and cell pellets were immediately snap-frozen by submerging in liquid nitrogen to minimise changes in nascent transcription activity (e.g. in response to cell resuspension in cold lysis buffer with high concentrations of salts and detergents). Frozen cell pellets were dislodged from centrifuge tubes and stored at -80 °C. Cells were subjected to cryogenic lysis by freezer mill grinding under liquid nitrogen (SPEX 6875D Freezer/Mill, standard program: 15 cps for 6 cycles of 2 minutes grinding and 2 minutes cooling each). Yeast “grindate” powder was stored at -80 °C. 2 g of yeast grindate was resuspended in 10 mL of 1X cold lysis buffer (20 mM HEPES pH 7.4, 110 mM potassium acetate, 0.5% v/v Triton-X-100, 1% v/v Tween-20) supplemented with 10 mM MnCl2, 1X Roche cOmplete EDTA-free protease inhibitor, and 50 U/mL SUPERase.In RNase Inhibitor. Chromatin-bound proteins were solubilised by incubation with 1320 U of DNase I (RQ1 RNase-free DNase I, Promega) on ice for 20 minutes. Lysates containing solubilised chromatin proteins were clarified by centrifugation at 20,000 x g for 10 minutes (at 4 °C), and the supernatant was taken as input for immunoprecipitation using 500 μL of anti-FLAG M2 affinity gel suspension (A2220, Sigma-Aldrich) per sample (2.5 hours at 4 °C). After immunoprecipitation, the supernatant was removed and beads were washed 4 times with 10 mL of cold wash buffer each time (1X lysis buffer with 50 U/mL SUPERase.In RNase inhibitor and 1 mM EDTA). After the last wash, agarose beads were transferred to small chromatography spin columns (Pierce Spin Columns, Thermo Fisher), and competitive elution of protein complexes containing Rpb3-3xFLAG protein from the resin was performed by incubating beads with 300 μL of elution buffer (1X cold lysis buffer with 2 mg/mL 3xFLAG peptide) for 30 minutes at 4 °C (3xFLAG peptide provided by the Peptide Chemistry Science Technology Platform, The Francis Crick Institute). Elution was performed twice and 600 μL of eluate was subjected to acid phenol:chloroform RNA extraction and ethanol precipitation. A significant amount of 3xFLAG peptide co-precipitates with the RNA as a contaminant, and is later removed by spin column purification.

Purified RNA was fragmented to a mode length of ~200 nucleotides using zinc ion-mediated fragmentation (Ambion AM870, 70 °C for 4 minutes). Fragmented RNA was purified using miRNeasy spin columns (miRNeasy mini kit, QIAGEN), which retain RNAs approximately 18 nucleotides or more in length. Purified RNA was quantified by Qubit (Thermo Fisher) and approximately 150 ng of RNA was subjected to rRNA depletion using the Ribo-Zero Gold rRNA Removal Kit (Yeast) (Illumina MRZY1324, now discontinued). Libraries were prepared using the TruSeq Stranded Total RNA kit (Illumina) according to the manufacturer’s instructions (14 PCR cycles). The libraries were multiplexed and sequenced on the HiSeq 4000 platform (Illumina), and generated ~28 million 101 bp strand-specific paired-end reads per sample on average.

2.5.14 Chromatin proteomics mass spectrometry

To perform proteomics mass spectrometry on chromatin-bound proteins, chromatin extracts were prepared as previously described (van Werven et al., 2008). Yeast cell pellets (approximately 700 – 1000 OD600.mL units) were collected by centrifugation, and snap-frozen in liquid nitrogen. Cells were resuspended in nuclear isolation buffer (NIB: 250 mM sucrose, 10 mM MgCl2, 20 mM HEPES (pH 7.8), 0.1% v/v Triton X-100, 5 mM β-mercaptoethanol, 1X cOmplete protease inhibitor (Roche)) and lysed by vortexing with 0.5 mm glass beads. The insoluble fraction of the lysate was collected by centrifugation (27000 x g, 15 min, 4 °C), resuspended and washed once in NIB, collected again by centrifugation, and resuspended in 4.5 mL NIB supplemented with CaCl2 (2 mM, essential cofactor for MNase). Samples were treated with 3000 U of micrococcal nuclease (MNase, NEB) for 4 minutes at 30 °C, and reactions were stopped by transferring to ice and adding EDTA to 10 mM. The concentration of NaCl in the buffer was increased to 150 mM, and samples were clarified by centrifugation (16000 x g, 10 min, 4 °C). The supernatant containing chromatin-bound proteins released from the pellet by MNase digestion was treated as chromatin extract for immunoprecipitation. Immunoprecipitation was performed with approximately 15 mg of each chromatin extract sample using 100 μL of anti-V5 agarose affinity gel (Sigma-Aldrich, A7345) for 4 hours at 4 °C. The beads were washed 5 times with 1 mL NIB wash buffer (NIB with 350 mM NaCl), the residual supernatant was aspirated, and bound proteins were eluted by heating at 95 °C for 5 minutes in SDS-PAGE sample buffer.

SDS-PAGE was performed and the eluted protein samples were migrated 1 cm into a polyacrylamide gel (12% NuPAGE, Invitrogen), prior to staining with InstantBlue Protein Stain (Expedeon). In-gel protein digestion was performed using trypsin, and peptides were reduced and alkylated prior to clean-up and analysis using an Orbitrap-Fusion Lumos mass spectrometer coupled to an Ultimate3000 high performance liquid chromatography (HPLC) system equipped with an EASY-Spray nanosource (Thermo Fisher Scientific).

MaxQuant software (v1.6.01) (Cox and Mann, 2008) was used to perform label-free quantification (LFQ) of the protein samples, and the proteinGroups.txt output table was further statistically processed with Perseus (version 1.4.0.2) (Tyanova et al., 2016). A log2 transformation was applied to LFQ protein intensities, and the data set was filtered for proteins with at least 3 measured values in one sample group (comprising triplicate injections). Missing values were imputed using default settings in Perseus by drawing from a simulated noise distribution with a down-shift of 1.8 and a width of 0.3 compared with the log2 LFQ intensity distribution.
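
The imputation step can be illustrated with a minimal sketch. It assumes that the down-shift and width are expressed in units of the standard deviation of the observed log2 LFQ intensities within one sample; the function name and the example intensities are hypothetical, and the code is not the Perseus implementation itself.

```python
import random
import statistics


def impute_missing(log2_values, downshift=1.8, width=0.3, seed=0):
    """Replace None entries with draws from a down-shifted, narrowed normal distribution.

    Assumption: downshift and width are multiples of the standard deviation
    of the observed (non-missing) values in this sample.
    """
    observed = [v for v in log2_values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    noise_mean = mean - downshift * sd
    noise_sd = width * sd
    return [v if v is not None else rng.gauss(noise_mean, noise_sd)
            for v in log2_values]


# Hypothetical log2 LFQ intensities for one sample; None marks a missing value.
sample = [25.1, 27.3, None, 24.8, 26.0, None, 28.2]
imputed = impute_missing(sample)
```

Measured values pass through unchanged, and only the missing entries are replaced by low-intensity values that mimic signal near the detection limit.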

To determine differential enrichment of proteins between samples, two-sample t-tests were performed with a permutation-based false discovery rate (FDR) set at 0.05. Proteins that were significantly enriched (> 2-fold enrichment, p < 0.05, unpaired two-sample t-test) in the full-length (FL) Rap1-V5 sample compared to the empty vector (EV) sample (n = 289 proteins) were subjected to Gene Ontology (GO) analysis (SGD Gene Ontology Slim Mapper Process Analysis, https://www.yeastgenome.org/cgi-bin/GO/goSlimMapper.pl). Proteins that were ambiguously assigned to genes (n = 13, e.g. paralogs) were excluded from the analysis. For the proteins that showed differential enrichment (> 2-fold enrichment, p < 0.05, unpaired two-sample t-test) between the Δ631-696 sample versus full-length Rap1 sample (n = 33), GO over-representation analysis of cellular components was performed using the PANTHER classification system (Mi et al., 2019).

2.6 Bioinformatic analysis

2.6.1 Differential expression analysis

Cutadapt (version 1.9.1) was used to perform adapter trimming for RNA sequencing reads with parameters “--minimum-length=25 --quality-cutoff=20 -a AGATCGGAAGAGC -A AGATCGGAAGAGC” (Martin, 2011). The RSEM package (version 1.3.0) (Li and Dewey, 2011) in conjunction with the STAR alignment algorithm (version 2.5.2a) (Dobin et al., 2013) was used for the mapping and subsequent gene-level counting of the sequenced reads using the parameters “--star-output-genome-bam --forward-prob 0” (all other parameters kept as default). Analysis was performed with respect to all S. cerevisiae genes obtained from the Ensembl genome browser (assembly R64-1-1, release 90) (Zerbino et al., 2018). The DESeq2 package (version 1.12.3) (Love et al., 2014) was used to perform differential expression analysis within the R programming environment (version 3.3.1).

To compile a precise annotation of Rap1 binding sites genome-wide, a list of Rap1 binding sites at single nucleotide resolution was obtained from an experimental ChIP-exo dataset (Rhee and Pugh, 2011). To facilitate differential expression analysis with intervals of different sizes (e.g. +50 bp to +500 bp, see below for details), sites within 500 bp of chromosome ends were removed. To annotate the list of 141 high-confidence Rap1-regulated genes and their Rap1 binding sites, a list of ribosomal protein (RP) genes (Reja et al., 2015) was merged with a list of previously identified Rap1-regulated glycolytic pathway genes (Lieb et al., 2001), generating a total of 154 genes. The RP genes regulated by Abf1 (instead of Rap1) were removed, as were glycolytic pathway genes without a Rap1 ChIP-exo peak or annotated binding site motif within the promoter region (13 genes removed, resulting in 141 genes) (de Boer and Hughes, 2012; Fermi et al., 2016). The correct promoter Rap1 binding site was manually assigned to each Rap1-regulated gene from the ChIP-exo dataset, as Rap1 motifs are found in both forward and reverse orientations relative to target genes (Knight et al., 2014; Rhee and Pugh, 2011). For promoters containing two Rap1 binding site motifs, the furthest upstream site was assigned. If ChIP-exo coordinates were missing, the Rap1 motif coordinate identified from a comprehensive published ChIP-chip study was assigned instead (de Boer and Hughes, 2012; Lieb et al., 2001).
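
The removal of binding sites close to chromosome ends can be sketched as follows. The function name, the 1-based coordinate convention, and the toy chromosome are assumptions made for illustration; they are not taken from the analysis scripts used in this study.

```python
def filter_sites_near_ends(sites, chrom_lengths, margin=500):
    """Keep sites that lie more than `margin` bp from both chromosome ends.

    sites: list of (chromosome, 1-based position) tuples.
    chrom_lengths: dict mapping chromosome name to its length in bp.
    """
    kept = []
    for chrom, pos in sites:
        length = chrom_lengths[chrom]
        if margin < pos <= length - margin:
            kept.append((chrom, pos))
    return kept


# Hypothetical toy chromosome of 10,000 bp with three candidate sites.
toy_lengths = {"chrToy": 10000}
toy_sites = [("chrToy", 200), ("chrToy", 5000), ("chrToy", 9800)]
kept_sites = filter_sites_near_ends(toy_sites, toy_lengths)
```

Only the central site survives, so that a window of up to 500 bp on either side never runs off the end of the chromosome.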

STAR genomic alignments were filtered to include only unspliced, primary, uniquely mapped, and properly paired alignments with a maximum insert size of 500 bp. The featureCounts tool from the Subread package (version 1.5.1) (Liao et al., 2014) was used to obtain insert fragment counts within defined intervals (e.g. +100 bp) around the 564 Rap1 binding sites (separate intervals for Watson and Crick strand alignments, 1128 intervals total) using the parameters “-O --minOverlap 1 --nonSplitOnly --primary -s 2 -p -B -P -d 0 -D 600 -C”. Strand-specific reads were only counted if they overlapped with the interval on the corresponding strand. DESeq2 was used to perform differential expression analysis for intervals around Rap1 binding sites as described above, and the DESeq2 size factors calculated with respect to the transcriptome were used for normalisation of the per-sample counts. A similar strategy was used to analyse promoter regions of Ume6-regulated genes (McKnight et al., 2016). Ume6 sites were determined approximately as -250 nucleotides (upstream) relative to the annotated start codon of Ume6-regulated genes. Genomic annotations of CUTs, SUTs, XUTs, and NUTs (various noncoding RNA species) were obtained from Wery et al. (2016) (http://vm-gb.curi.e.fr/mw2/), and filtered for transcripts greater than 200 nt in length to match the insert size distribution of RNA-seq libraries.
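
The construction of strand-specific counting intervals can be sketched as below. The sketch writes intervals in the tab-delimited SAF layout that featureCounts accepts (GeneID, Chr, Start, End, Strand) and assumes that each window is symmetric around the binding site, which this section does not state explicitly; the function names, site names, and coordinates are hypothetical.

```python
def site_intervals(sites, half_width=100):
    """Return one Watson (+) and one Crick (-) interval per binding site.

    sites: list of (name, chromosome, 1-based position) tuples.
    Assumption: the window extends half_width bp on each side of the site.
    """
    rows = []
    for name, chrom, pos in sites:
        start = max(1, pos - half_width)
        end = pos + half_width
        for strand, label in (("+", "W"), ("-", "C")):
            rows.append((f"{name}_{label}", chrom, start, end, strand))
    return rows


def to_saf(rows):
    """Format interval rows as SAF text with a header line."""
    header = "GeneID\tChr\tStart\tEnd\tStrand"
    lines = ["\t".join(str(field) for field in row) for row in rows]
    return "\n".join([header] + lines)


# Hypothetical placeholder coordinates standing in for the 564 annotated sites.
example_sites = [(f"site{i}", "chrI", 1000 + 10 * i) for i in range(564)]
example_rows = site_intervals(example_sites)
saf_text = to_saf(example_rows)
```

Keeping the two strands as separate features means that coding-direction and divergent-direction signal at the same site are tested independently.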

2.6.2 TSS-seq analysis

Cutadapt (version 1.9.1) was used to perform adapter trimming with parameters “--minimum-length=20 --quality-cutoff=20 -a AGATCGGAAGAGC”. To retain only the reads containing the custom 5’ adapter sequence specific to the TSS-seq protocol for further analysis, cutadapt was re-run with the parameters “--minimum-length=20 --quality-cutoff=20 -g GCACTCTGAGCAATACC” (Martin, 2011). Mapping of reads to the S. cerevisiae genome (assembly R64-1-1, release 90) (Zerbino et al., 2018) was performed using BWA (version 0.5.9-r16) (Li and Durbin, 2009) using the default parameters. The flags “-q 1 -F 20” and “-q 1 -f 16” were specified in SAMtools view (version 1.3.1) to generate uniquely mapped genome alignments corresponding to sense and antisense strands, respectively (Li et al., 2009). The genomeCoverageBed function within BEDTools (version 2.26.0) (Quinlan and Hall, 2010) was used with the parameters “-bg -5 -scale ” to generate BedGraph coverage tracks for TSS-seq data, representing the TSS-seq signal per million mapped reads. BedGraph files were converted to bigWig using the wigToBigWig binary available from the UCSC genome browser with the "-clip" parameter (Kent et al., 2010). The bigWig coverage tracks from three biological replicate samples were merged for visualisation and plotting in IGV (Robinson et al., 2011).
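
The effect of the two SAMtools flag combinations can be made explicit with a short sketch. In the SAM specification, bit 4 marks an unmapped read and bit 16 marks alignment to the reverse strand, so “-F 20” excludes unmapped and reverse-strand reads, whereas “-f 16” requires reverse-strand reads. The function below is a hypothetical stand-in that mimics this filtering logic for a single alignment; it does not call SAMtools.

```python
UNMAPPED = 0x4   # SAM flag bit: read is unmapped
REVERSE = 0x10   # SAM flag bit: read aligns to the reverse strand


def passes_view_filter(flag, mapq, min_mapq=1, require=0, exclude=0):
    """Mimic `samtools view -q MIN -f REQUIRE -F EXCLUDE` for one alignment."""
    if mapq < min_mapq:
        return False
    if (flag & require) != require:  # every required bit must be set
        return False
    if flag & exclude:               # no excluded bit may be set
        return False
    return True


def is_forward_track(flag, mapq):
    """Equivalent of the “-q 1 -F 20” filter."""
    return passes_view_filter(flag, mapq, exclude=UNMAPPED | REVERSE)


def is_reverse_track(flag, mapq):
    """Equivalent of the “-q 1 -f 16” filter."""
    return passes_view_filter(flag, mapq, require=REVERSE)
```

For example, a forward-strand alignment with flag 0 and mapping quality 30 is kept by the first filter only, and any alignment with mapping quality 0 is discarded by both.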

An updated TSS annotation for each gene was obtained from a published SMORE-seq TSS data set, and any missing values were replaced with the default gene TSS from Ensembl assembly R64-1-1, release 90 (Park et al., 2014; Zerbino et al., 2018). TSS-seq signals were quantified by counting the number of TSS-seq reads with the 1st transcribed (5’) nucleotide within a +75 bp window of annotated TSSs on the same strand as the corresponding coding gene, and transcripts per million (TPM) values were calculated for each TSS. For the differential expression analysis in Figure 4.6, the nearest TSS to the promoter Rap1 binding site was manually annotated and the genomic distance was measured from the Rap1 binding site to the mode peak of each TSS cluster. DESeq2 was used to perform differential expression analysis for TSSs as described above, and samples were normalised by sequencing depth. A log2(fold change) value greater than 1 (fold change > 2) comparing between the RAP1-AID +IAA and wild-type samples was considered an increase.
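
The quantification described above can be sketched in a few lines. The sketch assumes that the 75 bp window is symmetric around the annotated TSS and that the per-million value is the window count divided by the total count and multiplied by one million; it also replaces the model-based DESeq2 estimate with a plain ratio. All names and numbers are hypothetical.

```python
import math


def count_in_window(five_prime_positions, tss, half_width=75):
    """Count reads whose 5' nucleotide lies within half_width bp of the TSS."""
    return sum(1 for pos in five_prime_positions if abs(pos - tss) <= half_width)


def per_million(counts):
    """Scale raw counts per TSS to a total of one million."""
    total = sum(counts.values())
    return {tss: n * 1e6 / total for tss, n in counts.items()}


def is_increased(treated, control, threshold=1.0):
    """True if log2(treated / control) exceeds the threshold (fold change > 2)."""
    return math.log2(treated / control) > threshold


# Hypothetical 5' end positions of reads on one strand near a TSS at position 110.
window_count = count_in_window([100, 120, 180, 300], tss=110)
scaled = per_million({"tssA": 3, "tssB": 1})
```

With these toy values, three of the four reads fall inside the window, and a threefold rise in signal would be scored as an increase whereas a 1.5-fold rise would not.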

2.6.3 ChIP-seq and MNase-seq analysis

Publicly available datasets were obtained from NCBI Gene Expression Omnibus (GEO) for Sth1 ChIP-seq (GEO:GSE56994) (Lopez-Serra et al., 2014), Sth1 MNase ChIP-seq (GEO:GSE65594) (Parnell et al., 2015), and MNase-seq (GEO:GSE73337, GEO:GSE98260) (Kubik et al., 2015; Kubik et al., 2018).

Adapters were trimmed from ChIP-seq and MNase-seq reads using cutadapt as described above. Adapter-trimmed reads were mapped to the S. cerevisiae genome (Ensembl assembly R64-1-1, release 90) (Zerbino et al., 2018) with BWA (version 0.5.9-r16) (Li and Durbin, 2009) using default parameters. Duplicate and multi-mapped reads were removed from single-end ChIP-seq alignments, and only uniquely mapped and properly paired alignments that had no more than two mismatches in either read of the pair and an insert size of 120 – 200 bp were kept for paired-end MNase-seq alignments. To generate genome-wide coverage tracks for nucleosome occupancy from MNase-seq data, the DANPOS2 dpos command (version 2.2.2) (Chen et al., 2013) was used with parameters “--span 1 --smooth_width 20 --width 40 --count 1000000”.

2.6.4 Promoter directionality score analysis

To calculate directionality scores for coding gene promoters, a curated list of coding gene TSSs was obtained from published TSS sequencing data as described (Park et al., 2014). Any missing TSS coordinates for coding genes were supplemented with the TSS annotation from Ensembl (assembly R64-1-1, release 90) (Zerbino et al., 2018), generating a list of 6,646 S. cerevisiae coding gene TSSs. To avoid quantification of divergent direction transcription that constituted coding transcription for an upstream divergent gene, overlapping and divergent gene pairs were removed from the analysis resulting in 2,609 non-overlapping and tandem genes. To simplify the counting analysis, the coverage for each paired-end read from nascent RNA-seq was reduced to the single 3’ terminal nucleotide of the strand-specific read using genomeCoverageBed function within BEDTools (version 2.26.0) (Quinlan and Hall, 2010) with parameters “-ibam stdin -bg -5 -scale %s -strand %s”. “Sense” direction windows encompassed nucleotide positions +1 to +500 in the coding direction relative to the TSS, and “antisense” direction windows encompassed nucleotide positions -1 to -500 in the divergent direction relative to the TSS. The total number of reads with 3’ end positions falling within the sense and antisense direction windows was quantified for each gene using the computeMatrix tool within deepTools (version 2.5.3) (Ramirez et al., 2016) using parameters “reference-point --referencePoint center --upstream 500 --downstream 500 --binSize 1 --scale 1”. A value of 1 was added to all counts to avoid dividing by zero. For each window, the mean read count was calculated from three biological replicate experiments, and used for subsequent calculation and plotting.
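
The final calculation can be illustrated with a minimal sketch. It assumes that the directionality score is the log2 ratio of the mean sense count to the mean antisense count after the pseudocount is added; the formula is not defined in this section, so both the formula and the function name should be read as assumptions, and the counts are hypothetical.

```python
import math
import statistics


def directionality_score(sense_counts, antisense_counts, pseudocount=1):
    """log2 ratio of mean sense to mean antisense counts across replicates.

    Assumption: the pseudocount is added to each replicate count before
    averaging, and the score is log2(sense / antisense).
    """
    sense = statistics.mean([c + pseudocount for c in sense_counts])
    antisense = statistics.mean([c + pseudocount for c in antisense_counts])
    return math.log2(sense / antisense)


# Hypothetical read counts from three replicates for one promoter.
score = directionality_score([99, 127, 155], [15, 15, 15])
```

Under this definition, a positive score indicates transcription biased towards the coding direction, a score of zero indicates equal output in both directions, and the pseudocount keeps the ratio finite when the antisense window contains no reads.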

2.6.5 Data plotting and visualisation

Bar plots, scatter plots, and volcano plots were generated using GraphPad Prism (version 7 or 8). Screenshots of sequencing data were captured using the Integrative Genomics Viewer (IGV, Broad Institute, version 2.4.15) (Robinson et al., 2011). The RStudio integrated development environment (version 1.0.143) was used within the R statistical computing environment (version 3.4.0) for data analysis and visualisation. Software packages within the tidyverse (version 1.2.1) collection were used for data analysis and plotting. The following functions within ggplot2 (version 3.0.0) were used for plotting: violin plots, geom_violin with parameters “scale = count”; box-and-whisker plots, geom_boxplot with parameters “outlier.colour = NA”; smoothed density plots, geom_density with default parameters; scatter plots, geom_point with default parameters; marginal density histogram plots, ggMarginal with parameters “type = "histogram", bins = 40, size = 8”. The Cairo graphics library (version 1.17.2) was used to generate heat map plots for TSS-seq and RNA-seq data after fold-change values were calculated for bins of 5 nt within defined intervals, comparing between two samples. Images and figures were prepared using Adobe Photoshop CC (version 19.0) and Adobe Illustrator CC (version 22.0.1).

2.6.6 Quantification and statistical analysis

Information regarding any statistical tests used, number of samples, or number of biological replicate experiments is stated in the corresponding figure legends. For Student’s t-tests, calculated p values less than 0.05 were considered significant. Plotted error bars in individual figures are stated in the figure legend, as either standard error of the mean (SEM) or 95% confidence intervals (CI). Box-and-whisker plots were used to illustrate changes in differential RNA expression, and the following statistics are depicted for each comparison: median value (horizontal line), lower and upper quartiles (lower and upper hinges), and lowest and highest values (whiskers, within 1.5 times interquartile range).
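
The box-and-whisker statistics listed above can be reproduced with a short sketch. It assumes linearly interpolated quartiles (the "inclusive" method of the Python standard library) and whiskers that extend to the most extreme data points lying within 1.5 times the interquartile range of the hinges; the function name and example values are hypothetical.

```python
import statistics


def box_stats(values):
    """Return the hinges, median, and whisker ends for a box-and-whisker plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers stop at the most extreme observations inside the 1.5 x IQR fences.
    lower_whisker = min(v for v in values if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {
        "lower_whisker": lower_whisker,
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": upper_whisker,
    }


# Hypothetical values containing one extreme point.
example_stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this example the value 100 lies beyond the upper fence, so the upper whisker ends at 9 and the extreme point would be drawn as an outlier, or hidden when “outlier.colour = NA” is set.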

2.7 Data and Software Availability

The RNA sequencing and TSS sequencing data generated in this study have been deposited in the NCBI Gene Expression Omnibus (GEO) under accession code GSE110004.

The following published data sets were used for analysis:

Data | Source | Identifier
Total, poly(A), and TSS RNA sequencing | This work (Wu et al., 2018b) | GEO: GSE110004
CUTs, SUTs, NUTs, XUTs annotation | (Neil et al., 2009; Schulz et al., 2013; van Dijk et al., 2011; Xu et al., 2009) | http://vm-gb.curi.e.fr/mw2/
SMORE-seq (TSS annotation) | (Park et al., 2014) | GEO: GSE49026
Rap1 ChIP-exo (Rap1 binding sites annotation) | (Rhee and Pugh, 2011) | NCBI: SRA044886
Rap1 motif annotation (Rap1 binding sites annotation) | (de Boer and Hughes, 2012; Lieb et al., 2001) | Accessed through YeTFaSCo database, http://yetfasco.ccbr.utoronto.ca/
Ume6 binding site annotation | (McKnight et al., 2016) | GEO: GSE72572
Sth1 ChIP-seq | (Lopez-Serra et al., 2014) | GEO: GSE56994
Sth1 MNase ChIP-seq | (Parnell et al., 2015) | GEO: GSE65594
MNase-seq | (Kubik et al., 2015); (Kubik et al., 2018) | GEO: GSE73337; GEO: GSE98260

Chapter 3 Results

Chapter 3. Identification of Rap1 as a Key Repressor of Divergent Noncoding Transcription

3.1 Acknowledgement

This research has been published in Molecular Cell (Wu et al., 2018b), and has been modified for presentation within this chapter.

The diploid strains used for Figure 3.3C and D were constructed with the help of Fabien Moretto (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute). Fabien Moretto performed the RNA-FISH experiment and analysed the data. I constructed the haploid parent RPL43B-bsΔ strains, designed the experiment, and interpreted the data.

The RNA sequencing libraries were prepared and sequenced with the help of the Advanced Sequencing Facility Science Technology Platform (The Francis Crick Institute). I designed the experiments, generated the yeast strains, collected and processed the samples prior to library preparation and sequencing, analysed and interpreted the data.

The RNA sequencing data in Figure 3.4 - Figure 3.7 and Figure 3.9 - Figure 3.11 were analysed with the help of Harshil Patel (Bioinformatics and Biostatistics Science Technology Platform, The Francis Crick Institute). Harshil performed the read trimming, filtering, and mapping, differential expression analysis, processing of data to generate coverage tracks, and plotting for heat maps. I designed the bioinformatic analysis strategy with the help of Harshil, compiled and annotated the Rap1 binding sites, and performed analysis, visualisation, and interpretation of the processed data.

3.2 Abstract

The regulation of gene expression underlies all cellular processes and is subject to extensive regulation in eukaryotes. This complex regulation is encoded within cis-regulatory elements like promoters and enhancers, comprising genomic regions depleted of nucleosomes and enriched in transcription factor motifs. However, promoters and enhancers are a major source of noncoding transcripts in eukaryotes. Most eukaryotic gene promoters are bidirectional, and many active gene promoters exhibit divergent noncoding transcription as a consequence. The functions of the noncoding RNAs produced and the mechanisms that limit expression of divergent transcripts are not well understood.

Here, I identify that the sequence-specific transcription factor Rap1 in Saccharomyces cerevisiae confers promoter directionality by specifically repressing divergent noncoding transcription. Rap1 locally insulates genes from transcriptional interference originating from neighbouring gene promoters – safeguarding normal programmes of gene expression. Without Rap1, aberrant expression of noncoding RNAs otherwise interferes with normal gene expression programmes in yeast. Rap1 limits expression of noncoding RNAs at a large fraction of its binding sites across the genome, and represses divergent transcription at the majority of Rap1-regulated gene promoters. Divergent noncoding RNAs, normally repressed by Rap1, evade degradation by transcriptome surveillance pathways and are polyadenylated. Rap1 and other chromatin regulatory pathways act in concert, not redundantly, to repress noncoding transcription originating from different places across the genome. These findings uncover a new role for the pioneer transcription factor Rap1, namely to control promoter directionality and ensure transcriptional fidelity.

3.3 Introduction

Noncoding RNAs are generated along with coding mRNAs, often from the same genomic locations. Some noncoding RNAs have clear regulatory functions (Bumgarner et al., 2009; du Mee et al., 2018) but others likely represent transcriptional noise (Struhl, 2007). It remains to be fully elucidated whether most transcription is functional or non-functional, across different eukaryotic organisms. Recent advances in genome-wide RNA sequencing techniques that capture nascent and unstable transcripts (e.g. GRO-seq, PRO-seq, and NET-seq) have highlighted the diverse transcriptomes of different organisms in ever greater detail. From these data, it is clear that divergent promoter transcripts are a major source of noncoding RNAs in eukaryotes (see Introduction 1.4.1). Eukaryotic gene promoters are inherently bidirectional, and divergent or bidirectional transcription generates upstream transcripts in the opposite direction to the protein-coding gene from a distinct core promoter. The separate core promoters are unidirectional elements (Andersson et al., 2015; Duttke et al., 2015), but the balance of transcriptional output between paired coding and divergent core promoters dictates overall promoter directionality. These core promoters share a nucleosome-depleted region, and distinct transcriptional pre-initiation complexes (PICs) likely compete for the same pool of general transcription factors and RNA polymerase recruited by activator proteins (Rhee and Pugh, 2012). Activator binding sites in promoters and enhancers can function bidirectionally, but the PIC is intrinsically asymmetric and therefore allows transcription in only one direction. It is important to understand how the information encoded within promoter cis-regulatory elements is interpreted by trans-acting factors, resulting in primarily unidirectional expression of coding mRNAs. Regulatory proteins such as sequence-specific transcription factors appear to promote or limit divergent transcription and control promoter directionality, but the mechanistic details are not clear.

Bidirectional gene promoters are a major source of noncoding RNAs in S. cerevisiae (Neil et al., 2009; Xu et al., 2009). In budding yeast, it is not clear whether the activities of paired divergent and coding core promoters are strictly coupled (Rhee and Pugh, 2012). In murine and human embryonic stem (ES) cells, at least 60% of lncRNAs originate from bidirectional transcription (Sigova et al., 2013). At human and mouse divergent lncRNA – mRNA pairs, the divergent core promoters show coordinated changes in expression in response to extracellular cues (Scruggs et al., 2015; Sigova et al., 2013). The amount of divergent transcription varies significantly between eukaryotic species. For example, divergent transcripts are rare or very lowly expressed in Drosophila and Arabidopsis, but more common in nematodes (C. elegans), humans (H. sapiens), and yeasts (S. cerevisiae and S. pombe) (Hetzel et al., 2016; Ibrahim et al., 2018; Jin et al., 2017; Meers et al., 2018; Nechaev et al., 2010; Zhu et al., 2018). The functions of the divergent noncoding RNAs arising from gene promoters, and the mechanisms that limit their expression, are not fully understood. Noncoding transcripts can affect gene expression in cis or in trans (see Introduction 1.2.3 for examples in yeast and other organisms). Pervasive transcription of RNAs can also create transcription-replication conflicts and generate R-loops that lead to DNA damage and genome instability (Hamperl et al., 2017; Nojima et al., 2018). Therefore, it is important to fully comprehend how, where, and why noncoding RNAs are generated to understand the processes that underlie health and disease.

How do cells limit expression of divergent noncoding RNAs? Bidirectional transcription is an inherent feature of gene promoters, and the regulatory information encoded within promoter sequences must be interpreted by trans-acting factors to generate a directional transcriptional output (Jin et al., 2017). In eukaryotes, divergent transcripts emanating from gene promoters are limited by various mechanisms, including: RNA termination and degradation, chromatin assembly pathways, gene looping, RNA polymerase speed, TBP regulatory factors, and promoter sequence. The factors involved in these pathways, and the mechanisms through which they work, are discussed in detail in the Introduction of this thesis. Despite this previous work, it remains unclear how sequence-specific transcription factors also control promoter directionality. For example, at highly expressed genes, are there robust mechanisms in place that limit divergent RNA expression? If so, what are the factors responsible and how do they work? To what extent do they control divergent transcription and promoter directionality, genome-wide? Furthermore, it is unclear whether these mechanisms are redundant with other known regulatory pathways that control divergent and noncoding RNA expression, or whether they function cooperatively. Finally, it is essential to identify and characterise any consequences on gene expression and cellular fitness when noncoding transcription is mis-regulated.

Here, I identify that Rap1 specifically represses expression of divergent transcripts, in contrast to the other co-regulators of ribosomal protein genes. In the absence of Rap1 binding, constitutive expression of noncoding RNAs leads to mis-regulation of neighbouring gene expression. I also determine the extent to which Rap1 controls noncoding and divergent RNA expression across the yeast genome, and compare these transcripts to other known classes of noncoding RNAs found in yeast. Finally, I demonstrate that Rap1 works in concert with other chromatin regulatory pathways to limit noncoding transcription initiating from distinct locations across the genome. Together, these data uncover a new role for the well-characterised transcription factor Rap1 in S. cerevisiae.

3.4 Results

3.4.1 Generation of the auxin-inducible degron (AID) system to deplete essential proteins in yeast

I first aimed to determine whether highly expressed genes have robust mechanisms in place to limit expression of divergent transcripts. To answer this question, I selected the 138 ribosomal protein (RP) genes in S. cerevisiae to investigate. Of the 138 RP gene promoters, only 16 contain an annotated divergent noncoding RNA (CUT or SUT) (Xu et al., 2009), suggesting that a robust mechanism acts to effectively limit divergent RNA expression and control promoter directionality. RP genes are among the most highly expressed genes in yeast, and account for approximately half of all actively transcribing RNA polymerase II in exponentially growing cells according to some estimates (Warner, 1999). Their properties and regulation have been well characterised by many groups over several decades, providing a clear starting point for my investigation (see Introduction 1.5.1 for a summary of RP gene regulation).

Multiple transcription factors (TFs) regulate RP gene expression (Figure 3.1A). I initially asked whether the factors that regulate RP gene expression also control divergent transcription. Given that most of these transcription factors are required for cellular growth and fitness, an inducible system to deplete individual proteins was required. Inducible protein depletion systems avoid the indirect effects and compensatory mutations that can occur when using gene deletion mutants. Inducible protein depletion systems are widely used in budding yeast, for example the anchor-away (AA) system (Haruki et al., 2008). The AA system depletes the nucleus of a specific protein of interest that has been tagged with the FKBP12-rapamycin-binding (FRB) domain of human mTOR (target of rapamycin). An abundant cytoplasmic protein (usually a ribosomal protein) must also be fused to human FK506 binding protein (FKBP12) to provide the “anchor” to tether away the target protein and retain it in the cytosol. However, this system relies on protein dimerization induced by the addition of rapamycin, a small molecule inhibitor of TOR which is toxic to cells. To overcome this, rapamycin-resistant strains containing mutant alleles for TOR1 (tor1-1) and FPR1 (fpr1Δ) must be used. However, TOR signalling directly regulates RP gene expression (Martin et al., 2004). In addition, FPR1 interacts with the Hmo1 protein (a RP gene regulator) and may regulate its function (Berger et al., 2007). These secondary effects may interfere with the regulatory factors present at RP genes. Therefore, to assess the role of each transcription factor, I utilised the auxin-inducible degron (AID) system instead (Nishimura et al., 2009).

Figure 3.1 Generation of auxin-inducible degron (AID) system to deplete essential transcription factors in yeast. (A) Schematic diagram of transcription factors that regulate ribosomal protein (RP) gene expression. Red box depicts Rap1 motif within the RP gene promoter. Pol II, RNA polymerase II. NB: a small subset of RP genes are dependent on Abf1, not Rap1, for expression. (B) Schematic diagram of auxin-inducible degradation system generated in S. cerevisiae. AID, auxin-inducible degron tag; E2, E2 ubiquitin-conjugating enzyme; TIR1, E3 ubiquitin ligase from Oryza sativa; SCF, Skp, Cullin, F-box containing complex (with TIR1); IAA, indole-3-acetic acid (auxin). (C) Auxin-induced depletion (AID) of transcription factors involved in RP gene regulation, detected by western blot for V5 epitope fused to the AID tag. FHL1-AID (FW4200), IFH1-AID (FW4202), SFP1-AID (FW4204), and RAP1-AID (FW3877) cells were treated with 3-indole-acetic acid (IAA, +) or DMSO (-) and samples were taken 2 hours after treatment. Blot for Hxk1 protein used as a loading control. (D) Spot growth assay of strains harbouring different components of the AID system for inducible depletion of an essential protein, Rap1. Serial dilutions (5-fold) of yeast cells were spotted onto YPD agar plates supplemented with IAA or DMSO, and plates were incubated for 2 days at 30 °C prior to imaging. S288C background strains were used with the following genotypes: Wild-type (WT, FW627), TIR1 allele only (FW3427), RAP1-AID only (FW3422), and RAP1-AID and TIR1 (FW3418).

The AID system relies on an orthogonal protein depletion system adapted from plants, in this case Oryza sativa (Asian rice) (Nishimura et al., 2009). Normally, the F-box protein transport inhibitor response 1 (TIR1) is incorporated into the plant SCF-TIR1 E3 ubiquitin ligase complex. Auxin-family hormones (e.g. indole-3-acetic acid, IAA) promote interaction of TIR1 and the auxin/IAA transcriptional repressors that results in polyubiquitylation and rapid degradation of the target protein (Dharmasiri et al., 2005; Kepinski and Leyser, 2005; Teale et al., 2006). The O. sativa TIR1 F-box protein must be expressed in yeast cells to recreate the plant-specific SCF-TIR1 complex. Auxin-inducible degron (AID) tags are generated to target proteins of interest in yeast by introducing the DNA encoding either the IAA17 or IAA7 protein (targets of TIR1) by genomic integration. The resulting fusion proteins are tagged on their C-termini with the AID tag (25 kDa). After the addition of auxin, the target protein fused to the AID tag interacts with the TIR1 E3 ligase and is polyubiquitylated and rapidly degraded (Figure 3.1B). This system offers remarkable specificity because the TIR1 protein targets (AID tags) are specific to the plant kingdom and the auxin response is not present in fungi or animals. This system has been successfully used to rapidly and reversibly induce specific protein depletion in yeast and mammalian cells (Holland et al., 2012; Nishimura et al., 2009; Weidberg et al., 2016).

I generated a suitable AID system in haploid S. cerevisiae strains from the S288C genetic background by integrating a plasmid to drive expression of TIR1 protein and by tagging specific target proteins (e.g. Rap1) with the IAA7 degron at their C-termini. The transformed cells were viable and the alleles were stably integrated. To demonstrate the effectiveness of the AID system, I targeted four key RP gene transcription factors (Fhl1, Ifh1, Sfp1, and Rap1) and cultured the cells to exponential growth phase in rich medium (YPD), where RP gene activity is high. The individual cultures were split and treated with DMSO (vehicle, mock treatment) or 500 μM indole-3-acetic acid (IAA, auxin) for 2 hours. Addition of IAA to AID-tagged strains induced specific protein depletion within 2 hours of treatment, whereas AID-tagged proteins were still detected in mock treated samples (Figure 3.1C). I also assessed the viability of strains harbouring individual components of the AID system. The growth and viability of cells containing individual components or the entire AID system was comparable for mock treatment (Figure 3.1D). With the addition of IAA to the rich growth media on agar plates, the strain containing Rap1-AID with TIR1 did not grow, but the strain containing Rap1-AID only showed comparable growth to that under mock (DMSO) treatment. In conclusion, the AID system allows rapid and inducible depletion of transcription factors in S. cerevisiae.

3.4.2 Divergent transcripts at ribosomal protein genes are regulated by Rap1

These experimental tools then enabled me to ask: is there a robust mechanism present at highly expressed ribosomal protein gene promoters that limits expression of divergent transcripts? I selected two RP gene promoters with divergent transcripts, RPL43B and RPL40B, as model examples to study (Figure 3.2A). At the RPL43B locus, the promoter Rap1 binding sites are adjacent to a noncoding RNA called IRT2 (IME1 regulatory transcript 2). IRT2 is part of a larger regulatory circuit of two tandem lncRNAs (IRT2 and IME1 regulatory transcript 1, IRT1) that control expression of IME1 (Inducer of Meiosis 1) (Moretto et al., 2018). IME1 is the master regulator gene for entry into sporulation (also known as gametogenesis) in budding yeast, a crucial cell fate decision (van Werven and Amon, 2011). IME1 expression is tightly regulated by multiple inputs, and aberrant IRT2 expression compromises timely induction of IME1 (Moretto et al., 2018). At the RPL40B locus, the Rap1 binding site is beside an annotated divergent transcript called SUT242, an intergenic long noncoding RNA situated between the RPL40B promoter and the adjacent MLP1 gene. The function of SUT242 is unknown, but because of its tandem upstream position relative to the MLP1 gene, I hypothesised that transcription-coupled chromatin changes could affect the regulation of the MLP1 promoter and subsequently Mlp1 expression.

Figure 3.2 Divergent transcripts at ribosomal protein genes are regulated by Rap1 (A) Schematic diagram of two model ribosomal protein (RP) genes, RPL43B and RPL40B, with divergent noncoding RNAs indicated above. Red boxes depict Rap1 target motifs, and black arrows represent transcription start sites (TSSs). Approximate position of sequences used for northern blot probes also shown. (B) Northern blot showing expression of IRT2 and iMLP1 divergent transcripts after depletion or deletion of RP gene regulators. Cells were grown and treated as described in Figure 3.1C, along with hmo1Δ (FW4132) and crf1Δ (FW4136) strains. 32P-labelled probes targeting IRT2 or SUT242/iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples. (C) Northern blot demonstrating accumulation of IRT2 and iMLP1 transcripts after auxin-induced depletion of Rap1-AID (FW3877). Blots for SNR190 transcript and Hxk1 protein shown as loading controls, respectively.

To characterise these divergent noncoding RNAs and their expression, I selected the exponential growth phase in nutrient-rich media where RP genes are highly expressed. I then depleted or deleted the transcription factors important for RP gene expression, and collected total RNA samples for analysis. To detect expression of the selected divergent transcripts, I performed northern blotting using probes directed against the annotated noncoding RNAs (Figure 3.2B). Although less quantitative than techniques like RT-qPCR due to variation in hybridisation and radionuclide probe labelling, northern blotting offers more complete information on the repertoire of transcript isoforms present. IRT2 was not detected in wild-type cells (WT), and IRT2 expression was only observed after depletion of Rap1. When I examined the RPL40B locus, I did not detect expression of the shorter SUT242 transcript (640 nt) in any condition. Several larger transcripts were detected after depletion of Rap1 only, and the main band was a high molecular weight species that approximated the size of the neighbouring MLP1 gene (5627 nt). I hypothesised that this transcript was a 5’ extended isoform of MLP1, and will refer to it as “isoform of MLP1” or “iMLP1” (see further verification of transcript identity below in Figure 3.3E). There is some cross-hybridisation with either multiple shorter iMLP1 isoforms or the 18S and 26S rRNA species with the SUT242/iMLP1 northern blot probe. I conclude that of the transcription factors present at these RP gene promoters, only Rap1 is crucial for control of divergent transcript expression.

Next, I characterised the kinetics of divergent transcript induction after depletion of Rap1 protein (Figure 3.2C). In RAP1-AID cells, IRT2 and iMLP1 were not expressed during exponential growth without Rap1 depletion. After addition of auxin (IAA), the level of detectable Rap1-AID protein was notably reduced within 30 minutes and barely detectable within 60 minutes of induction. Concurrently, expression of IRT2 and iMLP1 was induced after just 15 minutes of Rap1 depletion, and transcript levels steadily accumulated up to 4 hours after Rap1-AID induction. These data indicate that there is little delay between depletion of Rap1 and expression of IRT2 and iMLP1, suggesting that Rap1 is directly involved in regulation of divergent noncoding transcription. Thus, Rap1 is required to prevent aberrant expression of divergent transcripts from RP gene promoters.

3.4.3 Aberrant expression of Rap1-regulated divergent transcripts mis-regulates neighbouring genes

Global depletion of Rap1 protein indicates that Rap1 is the key factor responsible for limiting expression of divergent transcripts at RPL43B and RPL40B. I next investigated the physiological importance of this function: in other words, what happens when Rap1 does not effectively limit aberrant noncoding transcription? For example, what are the consequences for neighbouring gene expression? In budding yeast and other eukaryotes, many examples of promoter-situated noncoding RNAs that affect coding gene expression have been characterised (see Introduction 1.2.3 for examples). To induce constitutive divergent transcript expression at selected loci without compromising the other essential roles of Rap1 globally, I deleted the Rap1 binding sites within the RPL43B and RPL40B promoters. Specifically, I used the Cre-LoxP system (Gueldener et al., 2002) to integrate a selection marker flanked by LoxP sequences to replace the Rap1 motifs, and subsequently excised the marker by transiently expressing the Cre recombinase. In a strain background containing Rap1 protein endogenously tagged with a V5 epitope (Rap1-V5) to enable affinity purification, I then performed chromatin immunoprecipitation (ChIP) and quantitative PCR (qPCR) to assess binding of Rap1 at its target sites (Figure 3.3A). Rap1-V5 was enriched at the RPL43B and RPL40B promoters in cells with wild-type (WT) promoters, and the enrichment was abrogated when the Rap1 binding sites were deleted (bsΔ). Again using northern blotting, I confirmed that deletion of the Rap1 binding sites at RPL43B induced constitutive expression of IRT2 in exponential growth, to a level comparable to that observed 2 hours after Rap1 depletion (Figure 3.3B).

Next, I studied the consequences of constitutive IRT2 expression on the regulation of IME1 expression. In diploid cells, the Ime1 protein itself activates IRT2 transcription, and IRT2 limits expression of the downstream IRT1 transcript (which is a negative regulator of IME1) in a feedback loop (Moretto et al., 2018; van Werven et al., 2012). I speculated that constitutive IRT2 expression, due to Rap1 binding site deletion, would down-regulate IRT1 and result in inappropriately high expression of IME1. To test this hypothesis, diploid strains were constructed to relieve haploid mating-type repression of entry into sporulation. Diploid S288C cells were grown to saturated growth phase in nutrient-rich media (YPD), then transferred to nutrient-deficient media (SPO) to induce sporulation through nutrient sensing cues. The level of IME1 expression was measured using single-molecule RNA FISH in diploid cells, with the Rap1 binding sites at the RPL43B promoter deleted or left intact (Figure 3.3C-D). The median number of IME1 transcripts detected increased from 5 per cell in wild-type diploid cells (WT) to 16 per cell in the RPL43B-bsΔ mutant strain. Thus, Rap1 ensures that aberrant expression of divergent transcripts does not compromise expression of a neighbouring master regulator gene for an important cell fate decision.

Figure 3.3 Aberrant expression of Rap1-regulated divergent transcripts mis-regulates neighbouring genes (A) Bar plot of ChIP-qPCR data confirming loss of Rap1 binding after deletion of Rap1 binding sites at RPL43B and RPL40B promoters using the Cre-LoxP system. Cells harbouring Rap1-V5 (FW4732), RPL43B-bsΔ Rap1-V5 (FW4734), and RPL40B-bsΔ Rap1-V5 (FW6228) were grown to exponential phase. Cells were crosslinked with formaldehyde, chromatin extracts were prepared, and anti-V5 antibodies were used to immunoprecipitate Rap1-V5 bound DNA fragments. Rap1 binding at RPL43B and RPL40B promoters was measured by qPCR, and the signals were normalised over a region at the ACT1 gene 3’ end. The mean fold enrichment from three independent experiments plus the standard error of the mean (+SEM) is plotted. (B) Northern blot showing IRT2 expression in RPL43B-bsΔ cells (FW3443) and RAP1-AID cells (FW3877) after IAA treatment, but not in wild-type control cells (FW629). (C) Scatter plot showing increase in IME1 transcript expression in single diploid yeast cells after deletion of the pRPL43B Rap1 binding sites, RPL43B-WT (FW631) or RPL43B-bsΔ (FW6139). Each triangle represents the transcript count for one cell; black lines indicate median values. n = 139 cells; *p < 0.0001 (unpaired Student’s t-test). Fabien Moretto performed this experiment. (D) Representative single-molecule RNA fluorescence in-situ hybridization (RNA FISH) images corresponding to Figure 3.3C. Single spots corresponding to individual IME1 (channel: AF594) or ACT1 (channel: Cy5) mRNA transcripts were counted in diploid wild-type (FW631) or RPL43B-bsΔ (FW6139) cells immediately after shifting to SPO medium. Fabien Moretto performed this experiment. (E) Northern blot showing induction of iMLP1 expression after deletion of RPL40B Rap1 binding site, and confirmation that iMLP1 is a 5’ extended isoform of MLP1. Wild-type (WT, FW629), mlp1Δ (FW6030), RPL40B-bsΔ (FW4141), and RPL40B-bsΔ mlp1Δ (FW6029) cells were used. SNR190 transcript shown as loading control. NAT represents the nourseothricin marker integrated at the MLP1 gene. (F) Western blot showing reduction in MLP1-V5 expression and northern blot showing increase in iMLP1 expression after deletion of RPL40B Rap1 binding site (Fig. S1D). Mlp1-V5 expression in WT (FW629), RPL40B-bsΔ (FW4141), MLP1-V5 (FW4122) and MLP1-V5 RPL40B-bsΔ (FW4120) cells. Blots for SNR190 transcript and Hxk1 protein are shown as loading controls, respectively.
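
The normalisation and summary statistics described for panel (A) can be illustrated with a minimal sketch. It assumes that fold enrichment is simply the ratio of the qPCR signal at the promoter to the signal at the ACT1 3’ end control region, and that the error bar is the sample standard deviation divided by the square root of the number of replicates. The function names and input values are hypothetical and are not data from this thesis.

```python
from math import sqrt
from statistics import mean, stdev


def fold_enrichment(target_signal, control_signal):
    """Ratio of the qPCR signal at the target to that at the control region."""
    return target_signal / control_signal


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical signals in arbitrary units, one (promoter, ACT1 3' end)
# pair per independent experiment. These are illustrative values only.
replicates = [(40.0, 2.0), (36.0, 2.0), (44.0, 2.0)]

enrichments = [fold_enrichment(t, c) for t, c in replicates]
avg, sem = mean_and_sem(enrichments)
print(f"fold enrichment = {avg:.1f} +/- {sem:.2f} (SEM, n = {len(enrichments)})")
```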

Having established this role of Rap1 at one RP gene promoter, I then investigated the implications of Rap1-regulated divergent transcription at the RPL40B locus. As described, the northern blot probe targeting SUT242 did not identify a short intergenic transcript, but rather a high molecular weight species approximating the size of the neighbouring MLP1 mRNA. I speculated that this was a 5’ extended isoform of MLP1, which I named iMLP1. To verify that iMLP1 was directly regulated by Rap1, I performed northern blotting using the SUT242/iMLP1 probe in a strain where the Rap1 binding site at RPL40B was deleted. The iMLP1 transcript was not detected in wild-type cells during exponential growth, and was de-repressed in RPL40B-bsΔ (Figure 3.3E). To verify that iMLP1 was a 5’ extended isoform of MLP1, I also deleted the MLP1 open reading frame (ORF) in the RPL40B-bsΔ strain using a nourseothricin resistance marker gene (NAT, 1123 bp, vs. MLP1, 5627 nt). The higher molecular weight species disappeared and shorter transcripts appeared after MLP1 deletion, validating the existence of the iMLP1 transcript isoform. Next, I combined the deletion of the RPL40B Rap1 binding site (RPL40B-bsΔ) with an MLP1 allele endogenously tagged with the V5 epitope, to measure the effects on MLP1 protein expression (Figure 3.3F). When the Rap1 binding sites were deleted and the iMLP1 transcript was de-repressed, MLP1-V5 protein levels were markedly reduced during exponential growth. The 5’ extended sequence of iMLP1 harbours 15 upstream AUG codons, 10 of which are out of frame with the coding ORF, which may limit translation of iMLP1 similarly to other 5’ extended transcript isoforms (Chen et al., 2017; Cheng et al., 2018; Chia et al., 2017). In conclusion, mis-regulation of Rap1-repressed divergent transcripts affects neighbouring gene expression. These examples highlight a new, underappreciated role of Rap1: to control aberrant expression of noncoding RNAs and safeguard normal programmes of gene expression.
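
The count of upstream AUG codons and their reading frames can be reproduced for any 5’ leader with a short script. The sketch below assumes the leader is supplied as a DNA-sense string ending immediately before the main ORF start codon. The function name and the example sequence are hypothetical, and the example is not the iMLP1 sequence.

```python
def upstream_augs(leader):
    """Return (position, in_frame) for every ATG in a 5' leader sequence.

    `leader` is the DNA-sense sequence from the transcript 5' end up to, but
    not including, the A of the main ORF start codon. An upstream ATG is in
    frame when its distance to the main start codon is a multiple of 3.
    """
    leader = leader.upper()
    hits = []
    for i in range(len(leader) - 2):
        if leader[i:i + 3] == "ATG":
            distance_to_start = len(leader) - i
            hits.append((i, distance_to_start % 3 == 0))
    return hits


# Hypothetical 16-nt leader used purely for illustration.
leader = "CCATGAAATGCATGCC"
hits = upstream_augs(leader)
out_of_frame = sum(1 for _, in_frame in hits if not in_frame)
print(f"{len(hits)} upstream ATGs, {out_of_frame} out of frame")
```

Overlapping ATGs are handled because the scan advances one nucleotide at a time.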

3.4.4 Analysis of RNA sequencing experiments to measure changes in transcript expression after global Rap1 depletion

Having established this novel role of Rap1 at two loci, I then investigated the extent to which Rap1 regulates noncoding and divergent transcription across the yeast genome. I used the tractable Rap1-AID system to deplete the essential Rap1 protein in nutrient-rich conditions. Deep RNA sequencing was performed at several time points after auxin-induced Rap1 depletion. Samples were collected from wild-type cells during exponential growth, Rap1-AID cells 30 minutes after addition of DMSO (vehicle, mock treatment), and Rap1-AID cells 30 minutes and 2 hours after addition of IAA. The extracted total RNA was depleted of ribosomal RNAs using a commercial bead-based hybridisation capture method (Illumina Ribo-Zero Gold Yeast, also see Materials and Methods), to enrich for non-polyadenylated (non-polyA) transcripts and other species of noncoding RNAs. In addition, matched samples were enriched for polyA RNAs using oligo-dT-coupled beads to capture the polyA transcriptome. 100 bp paired-end (PE) libraries were successfully generated and sequenced to a depth of ~50 million reads per sample, allowing highly sensitive detection of both coding genes and noncoding RNAs. Generation of strand-specific libraries was essential to delineate the crowded, and in many cases overlapping, transcripts in yeast. I observed good agreement between matched biological replicate samples in the experiment (Figure 3.4A). The polyA and total (rRNA-) libraries also clustered separately from each other as expected, indicating that the respective RNA enrichment approaches were successful. To verify the global depletion of Rap1, I examined the expression level of 141 high-confidence Rap1-regulated genes (Figure 3.4B). In both total and polyA RNA, the Rap1-regulated genes were down-regulated after Rap1 depletion. The implementation of the AID system did not interfere with expression of Rap1-regulated genes, as the abundance of Rap1-regulated transcripts was similar between wild-type (WT) cells and Rap1-AID cells under mock treatment (DMSO). To obtain a more comprehensive picture of the differentially expressed genes after Rap1 depletion, I plotted their changes in RNA expression comparing Rap1-AID cells 2 hours after induction with IAA, and 30 minutes after mock treatment with DMSO (Figure 3.4C).

Twenty genes were significantly up-regulated (Fold Change > 2, padj < 0.05), and 205 genes were significantly down-regulated (Fold Change < -2, padj < 0.05) after Rap1 depletion. I used strict criteria to annotate Rap1-regulated genes with high confidence (see Materials and Methods 2.6.1), and the additional subset of down-regulated genes may correspond to previously unannotated or indirect targets of Rap1. From these data, I conclude that the AID system and strand-specific RNA sequencing to high depth are appropriate for global transcriptome analysis in budding yeast.
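
The thresholds above can be expressed as a simple classification rule. The sketch below is an illustration only: it assumes that "Fold Change < -2" denotes a more than 2-fold decrease, equivalent to a log2 fold change below -1, and it operates on a hypothetical table of results rather than on the output of the differential expression software used for this thesis.

```python
from math import log2


def classify(log2_fc, padj, min_fold=2.0, alpha=0.05):
    """Label a gene 'up', 'down', or 'ns' (not significant)."""
    if padj is None or padj >= alpha:
        return "ns"
    threshold = log2(min_fold)  # 2-fold corresponds to |log2FC| > 1
    if log2_fc > threshold:
        return "up"
    if log2_fc < -threshold:
        return "down"
    return "ns"


# Hypothetical results: gene -> (log2 fold change, adjusted p-value).
results = {
    "geneA": (2.3, 0.001),
    "geneB": (-1.8, 0.01),
    "geneC": (-3.0, 0.2),    # large change, but not significant
    "geneD": (0.4, 0.0001),  # significant, but below the fold threshold
}

counts = {"up": 0, "down": 0, "ns": 0}
for log2_fc, padj in results.values():
    counts[classify(log2_fc, padj)] += 1
print(counts)
```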

Figure 3.4 Analysis of RNA sequencing experiments to measure changes in transcript expression after global Rap1 depletion (A) Correlation matrix heat map showing agreement between RNA-seq biological replicate samples from wild-type (FW629) and RAP1-AID (FW3877) strains. Polyadenylated (polyA) and total RNA samples depleted of ribosomal RNA (rRNA) were taken from cells during exponential growth, or after 30 min to 2 hours of treatment with DMSO or IAA. Colour scale corresponds to the Euclidean distance between the samples, based on all genes. Harshil Patel filtered, mapped, and clustered the RNA-seq data, and generated this heat map. (B) Scatter plots showing distribution of expression for Rap1 regulated genes (n = 141) in wild-type (FW629), and RAP1-AID (FW3877) cells after DMSO or IAA treatment. Separate plots shown for total (rRNA depleted) and polyadenylated (polyA) RNA-seq. Each dot represents transcripts per million (TPM) count for each gene, shown on the y-axis (exponential scale). Grey lines indicate mean values for each group, from 3 biological replicate experiments. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (C) Volcano plot showing that expression of Rap1-regulated genes is decreased upon Rap1 depletion. On the y-axis the false discovery rate adjusted p-value (-Log10(padj)) is plotted, and on the x-axis the fold change is displayed (Log2(Fold change)). Samples were compared from RAP1-AID (FW3877) cells after 2 hours of IAA or DMSO treatment. Rap1 regulated genes (n = 141) are highlighted in red. Data are calculated using three independent experiments. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.
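
For readers unfamiliar with the TPM unit used in panel (B), the following minimal sketch shows the standard definition: read counts are first divided by transcript length, and the resulting rates are scaled to sum to one million. How the values in this figure were actually computed is not specified here, so the function name and inputs below are assumptions for illustration.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and transcript lengths."""
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


# Hypothetical counts and lengths for three transcripts.
counts = [100, 200, 300]
lengths_bp = [1000, 1000, 3000]
values = tpm(counts, lengths_bp)
print([round(v) for v in values])
```

Because the third transcript is three times longer, its 300 reads correspond to the same TPM as the 100 reads of the first transcript.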

3.4.5 Rap1 represses noncoding transcription at hundreds of sites across the yeast genome

I then examined whether divergent transcripts, regulated by Rap1, could be detected in the RNA-seq data. I observed examples of divergent noncoding RNAs at many Rap1-regulated gene promoters. For example, at the RPL43B locus, the divergent IRT2 transcript was only expressed after depletion of Rap1 protein (Figure 3.5A, top left). The transcription start site (TSS) of IRT2 was located near the Rap1 binding sites within the RPL43B promoter. At the RPL40B locus, the intergenic region corresponding to the 5’ extension of the iMLP1 isoform was up-regulated upon Rap1 depletion. In contrast, the expression level of the MLP1 coding sequence alone was reduced 1.6-fold after Rap1 depletion, in agreement with the significant reduction in Mlp1 protein expression in the RPL40B-bsΔ mutant (Figure 3.5A, bottom). At RPL8A, another Rap1-regulated RP gene promoter, I observed a transcript that initiated near the promoter Rap1 binding site in the divergent direction (Figure 3.5A, top right). This transcript spans ~2.2 kb antisense to the neighbouring GUT1 gene, and concurrently sense GUT1 expression was reduced 1.7-fold. Transcription of this antisense long noncoding RNA could potentially lead to transcriptional interference through co-transcriptional chromatin modification (Carrozza et al., 2005; Venkatesh et al., 2016; Venkatesh et al., 2012; Venkatesh and Workman, 2015) or even RNA polymerase collision (Hobson et al., 2012; Prescott and Proudfoot, 2002). These data demonstrate that Rap1-regulated divergent transcripts are not homogeneous, and can comprise intergenic transcripts, transcript isoforms, and antisense long noncoding RNAs. Despite this diversity, total RNA sequencing is consistently able to identify novel Rap1-regulated divergent transcripts.

Figure 3.5 Rap1 represses noncoding transcription at hundreds of sites across the yeast genome (A) Examples of divergent noncoding RNAs repressed by Rap1. Samples from RAP1-AID (FW3877) cells treated with IAA or DMSO were processed for total RNA-seq. Normalised read coverage shown separately (y-axis) for the Watson (W, blue) and Crick (C, red) strands. Examples of Rap1-regulated divergent noncoding RNAs are shown for the RPL43B, RPL40B, and RPL8A loci. Harshil Patel performed read filtering, mapping, and generated the read coverage tracks for data visualisation in this figure. (B) Schematic diagram depicting sense and antisense expression windows to determine RNA-seq signals around Rap1 binding sites. Reads were counted that overlapped with these defined windows. (C) Violin and box-and-whisker plots showing change in total RNA expression after Rap1 depletion (FW3877), using windows of different sizes around Rap1 sites across the genome (n = 564 sites, signals for W and C strands computed separately to generate 1128 measurements). Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (D) Similar to Figure 3.5C, comparing polyadenylated (polyA) and total (rRNA-) RNA-seq data. As a control the expression changes in the RAP1-AID (FW3877) +DMSO control condition, compared to wild-type (WT, FW629), are displayed. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

To provide a less biased view of noncoding RNA expression across the genome near Rap1 binding sites, I examined a medium-confidence set of annotated Rap1 sites from sensitive ChIP-exo data (Rhee and Pugh, 2011). Genomic “windows” of different sizes centred around Rap1 binding sites were defined, where reads mapping to each strand are counted separately (Figure 3.5B). For the smaller window sizes (50 and 100 bp), approximately 40% of Rap1 binding site windows show more than 2-fold higher RNA expression after depletion of Rap1 (Figure 3.5C). For the larger window sizes, the number of Rap1 sites displaying more than 2-fold higher RNA expression decreased to 30% for windows of 200 bp and 16% for windows of 500 bp. This distance-dependent effect suggests that noncoding transcription regulated by Rap1 is spatially limited to regions immediately surrounding Rap1 binding sites. I also observed similar changes in RNA expression around Rap1 binding site windows genome-wide when comparing polyA enriched RNA and total RNA depleted of rRNA (Figure 3.5D). These data indicate that a large proportion of Rap1-regulated noncoding transcripts is polyadenylated. In conclusion, Rap1 represses RNA expression near hundreds of Rap1 binding sites across the genome.
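
The window-based counting described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used to generate Figure 3.5: reads are represented as hypothetical (chromosome, start, end, strand) tuples with half-open coordinates, a read is counted when it overlaps the window by at least one base, and the function names are invented for this example.

```python
def window(center, size):
    """Half-open interval [start, end) of `size` bp centred on `center`."""
    start = center - size // 2
    return start, start + size


def count_reads(reads, chrom, center, size):
    """Count reads overlapping the window, separately for each strand."""
    w_start, w_end = window(center, size)
    counts = {"+": 0, "-": 0}
    for r_chrom, r_start, r_end, strand in reads:
        if r_chrom == chrom and r_start < w_end and r_end > w_start:
            counts[strand] += 1
    return counts


def fraction_induced(fold_changes, min_fold=2.0):
    """Fraction of windows with more than `min_fold` higher expression."""
    return sum(fc > min_fold for fc in fold_changes) / len(fold_changes)


# Hypothetical reads around a binding site at chrI:1150.
reads = [
    ("chrI", 1100, 1200, "+"),
    ("chrI", 1140, 1240, "-"),
    ("chrI", 1350, 1450, "+"),
    ("chrII", 1100, 1200, "+"),
]
narrow = count_reads(reads, "chrI", 1150, 100)
wide = count_reads(reads, "chrI", 1150, 500)
print(narrow, wide)
```

Counting each strand separately is what allows sense and antisense signals at the same site to be compared, which is why each site contributes two measurements.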

3.4.6 Rap1 represses divergent noncoding transcripts at the majority of Rap1-regulated gene promoters

To investigate whether the noncoding transcripts regulated by Rap1 at gene promoters show a bias in directionality, I selected 141 high-confidence Rap1 binding sites at well annotated Rap1-regulated gene promoters (mostly RP genes) for further analysis (Reja et al., 2015; Rhee and Pugh, 2011). The correct annotation of coding and divergent RNAs was critical for this analysis. Among the other Rap1 binding sites, many are located within intragenic and subtelomeric regions. For the Rap1 sites within intergenic regions, it is also unclear which neighbouring gene is regulated by Rap1, and whether proximal genes are directly or indirectly regulated. Using differential expression analysis on windows centred around Rap1 sites, I found that RNA expression around promoter Rap1 binding sites increased in both the sense and antisense directions after Rap1 depletion. The fold change increase in expression was greater in the antisense direction (Figure 3.6A). This bias towards antisense transcription is consistent with the individual examples highlighted (Figure 3.5A), showing that divergent transcript expression was increased upon Rap1 depletion. To confirm that the changes in noncoding transcription within gene promoters were specific to Rap1-regulated genes, instead of protein-coding genes in general, I also examined the changes in RNA expression at the promoters of genes regulated by a repressive transcription factor, Ume6 (Figure 3.6A). Ume6 is a key regulator of early meiotic genes and represses transcription by recruiting the histone deacetylase Rpd3 complex and the ATP-dependent chromatin remodeller ISW2 (Fazzio et al., 2001; Kadosh and Struhl, 1997). In contrast to Rap1-regulated promoters, noncoding transcription remained unaffected within Ume6-regulated gene promoters after Rap1 depletion. I conclude that the changes in noncoding transcription within Rap1-regulated gene promoters are a direct result of Rap1 depletion, indicating that Rap1 plays a specific role in limiting divergent and noncoding transcription at its target genes.

To examine these transcriptional changes in further detail, I analysed the data using heat maps to display the changes in RNA expression after Rap1 depletion, at a resolution of 5 nt, for the corresponding windows around promoter Rap1 sites. By plotting and clustering the data from the antisense strand separately, I identified two groups of genes (ASc1 and ASc2, n = 59 and 47, respectively) where divergent transcription was induced in the antisense direction after depletion of Rap1 (Figure 3.6B, top panels). Divergent transcription initiated especially near the Rap1 binding sites themselves. In contrast, a smaller group of genes (ASc3, n = 35) did not show increased divergent transcription after Rap1 depletion. When I clustered the data based on the sense strand signals instead, I observed groups of genes where noncoding transcription increased downstream (Sc1) or upstream (Sc2) of the promoter Rap1 sites (Figure 3.6B, bottom panels). For each strand, a subset of promoters also did not respond to Rap1 depletion (ASc3 and Sc3). However, I did not observe any clear relationships between antisense and sense transcription after depletion of Rap1 (Figure 3.6D), suggesting that they are not strictly coupled. These changes in expression reflect the steady-state levels of divergent and coding RNAs after depletion of Rap1 measured with total RNA-seq. I conclude that Rap1 limits noncoding transcription in both the divergent and coding directions at a large subset of its target gene promoters.
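
The two steps behind these heat maps, averaging the signal in 5 nt bins and grouping promoters by k-means clustering (k = 3), can be sketched in plain Python. This is not the software used for Figure 3.6B. It is a minimal implementation of Lloyd's algorithm with a deterministic, hypothetical seeding choice (the first k rows), applied to invented values.

```python
def bin_means(signal, bin_size=5):
    """Average a per-nucleotide signal in consecutive bins of `bin_size`."""
    bins = [signal[i:i + bin_size] for i in range(0, len(signal), bin_size)]
    return [sum(b) / len(b) for b in bins]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k=3, n_iter=50):
    """Plain Lloyd's k-means; the first k rows seed the centroids."""
    centroids = [list(row) for row in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical binned log2 fold changes for six promoters (two bins each).
rows = [[2.0, 2.2], [0.0, 0.1], [1.0, 0.9], [2.1, 1.9], [0.1, 0.0], [0.9, 1.1]]
labels = kmeans(rows, k=3)
print(labels)
```

In practice the cluster assignments depend on the seeding, so a real analysis would typically use several random initialisations.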

One important consideration was to control for the possible effects of the AID-tag on endogenous Rap1 protein function. Consistent with the expression levels of Rap1-regulated coding genes (Figure 3.4B), I did not observe significant changes in noncoding RNA expression around promoter Rap1 sites when comparing data from a Rap1-AID strain after mock treatment (DMSO) and a wild-type strain (WT) (Figure 3.6C). I subsequently examined whether the promoters dependent on an additional transcriptional regulator, Hmo1, were more likely to have divergent transcription. However, promoters regulated by Hmo1 displayed a comparable increase in antisense and sense RNA expression to Hmo1-independent promoters (Figure 3.6E). Finally, I asked whether there was a relationship between the expression level of a Rap1-regulated gene and its likelihood of displaying divergent transcription. The mean level of RNA expression was very similar across the three groups of Rap1-regulated genes (ASc1 to ASc3), and no clear bias was observed (Figure 3.6F). From these results, I conclude that Rap1 represses transcription near its promoter binding sites mainly in the antisense direction, and to a lesser extent in the sense direction.

Figure 3.6 Rap1 represses divergent noncoding transcripts at the majority of Rap1-regulated gene promoters (A) Scatter plots showing changes in total RNA expression at Rap1 and Ume6-regulated promoters after Rap1 depletion. Expression changes for antisense (AS) and sense (S) direction windows relative to the coding gene for Rap1 (n = 141) and Ume6 (n = 87) regulated promoters. Each point represents one strand-specific interval at one promoter and horizontal red and blue lines represent mean values. (B) Heat maps showing changes in total RNA expression on AS or S strands, for data described in Figure 3.6A. Promoters were clustered based on AS (ASc1-ASc3) or S (Sc1-Sc3) signals using k-means clustering (k = 3). Colour scale corresponds to magnitude and direction of change, calculated in bins of 5 bp throughout intervals centred on Rap1 sites at promoters. Number of promoters in each cluster: ASc1 (n = 59), ASc2 (n = 47), ASc3 (n = 35), Sc1 (n = 46), Sc2 (n = 43), Sc3 (n = 52). (C) Similar to Figure 3.6B, except that control samples from RAP1-AID +DMSO (FW3877) were compared to WT (FW629). Promoters were clustered and ordered according to Figure 3.6B, AS strand. (D) Scatter plot showing changes in RNA expression around Rap1 sites at gene promoters after depletion of Rap1, separated by antisense or sense strand clusters generated in Figure 3.6B. Fold change values were calculated for RAP1-AID (FW3877) +IAA or +DMSO treated cells from three independent experiments. Each point represents one strand-specific interval at one promoter and horizontal red and blue lines represent mean values. (E) Scatter plot showing changes in RNA expression around Rap1 sites at gene promoters after depletion of Rap1, separated by Hmo1 dependence. Each point represents one strand-specific interval at one promoter and horizontal red and blue lines represent mean values. (F) Scatter plot showing that the expression level of Rap1-regulated genes within each cluster (ASc1 to ASc3) is similar. Each point represents the expression of one gene, expressed in transcripts per million +1 (TPM +1). Horizontal grey lines depict mean values. (A-F) Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure, and generated the heat maps.

3.4.7 Rap1 is not redundant with noncoding RNA surveillance pathways in yeast

Given their prevalence, I aimed to examine whether these Rap1-limited noncoding transcripts are also regulated by RNA degradation pathways in yeast. In organisms ranging from yeast to humans, expression of noncoding RNAs is limited by premature transcriptional termination coupled to degradation by exosome machinery (Almada et al., 2013; Arigo et al., 2006b; Neil et al., 2009; Ntini et al., 2013; Schulz et al., 2013; Xu et al., 2009). Many of these noncoding RNAs are unstable, and only detectable after inactivating specific components of these termination and RNA degradation pathways (Figure 3.7A). Common classes of noncoding RNAs in budding yeast include CUTs, SUTs, NUTs, and XUTs (see Introduction 1.3.4). I then measured the changes in expression for these classes of noncoding RNAs after Rap1 depletion. If the Rap1-regulated noncoding transcripts overlap significantly with these known classes of noncoding RNAs, I would expect a large fraction of CUTs, SUTs, NUTs, or XUTs to be significantly up-regulated after depletion of Rap1. In contrast, I found that most noncoding RNAs did not significantly change in expression after depletion of Rap1, whereas ~40% of windows around Rap1 sites showed more than 2-fold higher RNA expression in response (Figure 3.7B). In most cases, the Rap1-regulated divergent transcripts were not previously annotated as CUTs, SUTs, XUTs, or NUTs. In conclusion, Rap1 is not redundant with pathways that limit expression of noncoding RNAs through RNA termination and degradation mechanisms.

Figure 3.7 Rap1 is not redundant with noncoding RNA surveillance pathways in yeast (A) Overview of common classes of noncoding RNAs found in S. cerevisiae, defined by their sensitivity to genetic perturbations of RNA surveillance or exosome machinery. There is significant overlap between these categories of noncoding RNAs and one transcript may be annotated in more than one category. CUTs, n = 760; SUTs, n = 797; XUTs, n = 1783; NUTs, n = 1518. (B) Volcano plots showing change in RNA expression at windows around Rap1 binding sites genome-wide and for other classes of noncoding RNAs in yeast after depletion of Rap1-AID, highlighting the lack of significant overlap between Rap1-regulated noncoding transcripts and other unstable RNAs. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

3.4.8 Rap1 represses noncoding transcription in a distinct manner to previously described chromatin regulatory pathways

In addition to pathways that limit stability of noncoding RNAs, chromatin-based pathways also repress noncoding transcription (see Introduction 1.3.3). I aimed to determine whether other chromatin regulators mediate repression of divergent transcription by Rap1. To identify additional co-repressors of noncoding transcription, I designed a targeted screen for genes that regulate divergent transcription, using aberrant expression of the IRT2 or iMLP1 transcripts as a read-out. I selected genes involved in limiting cryptic or divergent transcription (e.g. SET2, SET3, SPT16), known Rap1 interactors (e.g. SIR3, RIF1, RIF2), and general regulators of chromatin and transcription. Of 66 mutants, 14 displayed higher expression of iMLP1. Only depletion of Spt16 led to higher expression of IRT2 (Figure 3.8, Table 3.1). To cross-validate these candidate repressive factors, I examined a published dataset in which yeast strains carrying conditional depletion alleles or gene deletions for genes involved in transcriptional regulation, nucleosome function, chromatin modification, and chromatin remodelling were subjected to transcriptome analysis using whole-genome tiling microarrays (van Bakel et al., 2013). When I compared iMLP1 expression patterns in the northern blot data to these published data (van Bakel et al., 2013), I identified five mutants that overlapped. These overlapping genes were selected for further investigation: (1) putative histone acetyltransferase Spt10, (2) transcription factor Spt21, (3) CAF-1 chromatin assembly component Rlf2, (4) chromatin remodeller and elongation factor Spt6, and (5) FACT (facilitates chromatin transcription) complex component Spt16. All of the candidates I identified have roles in repressing noncoding or divergent transcription (Chang and Winston, 2011; DeGennaro et al., 2013; Doris et al., 2018; Feng et al., 2016; Kaplan et al., 2003; Marquardt et al., 2014; Nojima et al., 2018).

Chapter 3 Results

Figure 3.8 Rap1 represses noncoding transcription in a distinct manner to previously described chromatin regulatory pathways Northern blots detecting expression of Rap1-dependent IRT2 and iMLP1 transcripts in 66 mutants for various chromatin and transcription regulatory pathways. Mature rRNA bands stained using methylene blue are shown as loading controls. Gene deletion strains were obtained from a strain collection belonging to Peter Thorpe (Queen Mary University of London).

                   Data from Figure 3.8      van Bakel et al., PMID 23658529
gene    strain     mutant    IRT2    iMLP1   mutant          IRT2    iMLP1
        reference  type      levels  levels  type            levels  levels

ADA2    FW6715     deletion  -       -       NA              NA      NA
ARP8    FW6707     deletion  -       -       NA              NA      NA
BRE1    FW6722     deletion  -       -       deletion        -       -
BUR2    FW4817     deletion  -       -       NA              NA      NA
CDC40   FW6683     deletion  -       +       NA              NA      NA
CMR1    FW6725     deletion  -       -       NA              NA      NA
CTK1    FW4756     deletion  -       -       NA              NA      NA
EST2    FW4757     deletion  -       -       NA              NA      NA
GCN4    FW6682     deletion  -       +       NA              NA      NA
GCN5    FW6717     deletion  -       -       NA              NA      NA
GCR2    FW6698     deletion  -       -       NA              NA      NA
HST1    FW6721     deletion  -       -       NA              NA      NA
HST2    FW6691     deletion  -       -       NA              NA      NA
HST3    FW6678     deletion  -       +       NA              NA      NA
HST4    FW6686     deletion  -       -       NA              NA      NA
INO80   FW4819     deletion  -       -       deletion        -       -
ISW1    FW6681     deletion  -       +       deletion        -       -
ISW2    FW6679     deletion  -       -       deletion        -       -
MGA2    FW6700     deletion  -       -       NA              NA      NA
NGG1    FW6687     deletion  -       -       NA              NA      NA
NHP6A   FW6719     deletion  -       -       NA              NA      NA
NRD1    FW4821     AID       -       -       NA              NA      NA
OPI3    FW6701     deletion  -       -       NA              NA      NA
PAF1    FW6706     deletion  -       -       deletion        -       +
RAP1    FW3877     AID       ++      ++      ts and tet-off  ++      ++
RIF1    FW6729     deletion  -       +       NA              NA      NA
RIF2    FW6704     deletion  -       -       NA              NA      NA
RLF2    FW6703     deletion  -       ++      deletion        -       +
RPB9    FW6694     deletion  -       +       NA              NA      NA
RPD3    FW6689     deletion  -       -       deletion        -       -
RRD1    FW6708     deletion  -       -       NA              NA      NA
RRP6    FW6680     deletion  -       -       NA              NA      NA
RSC1    FW6685     deletion  -       +       NA              NA      NA
RSC2    FW6718     deletion  -       -       NA              NA      NA
RTT106  FW6699     deletion  -       +       NA              NA      NA
RTT109  FW6677     deletion  -       +       NA              NA      NA
SCH9    FW4820     deletion  -       -       NA              NA      NA
SET2    FW6728     deletion  -       -       deletion        -       -
SET3    FW6709     deletion  -       -       NA              NA      NA
SGF29   FW6726     deletion  -       -       NA              NA      NA
SGF73   FW6695     deletion  -       -       NA              NA      NA
SIN4    FW6716     deletion  -       -       NA              NA      NA
SIR1    FW6705     deletion  -       -       NA              NA      NA
SIR2    FW6711     deletion  -       -       deletion        -       -
SIR3    FW6713     deletion  -       -       NA              NA      NA
SIR4    FW6690     deletion  -       -       NA              NA      NA
SNF2    FW6724     deletion  -       -       deletion        -       -
SNF5    FW6723     deletion  -       -       NA              NA      NA
SPT10   FW5543     deletion  -       ++      deletion        -       ++
SPT16   FW5559     AID       +       ++      NA              +       ++
SPT21   FW6676     deletion  -       ++      deletion        -       ++
SPT23   FW4758     deletion  -       -       NA              NA      NA
SPT3    FW6684     deletion  -       -       NA              NA      NA
SPT4    FW6710     deletion  -       -       NA              NA      NA
SPT6    FW5555     AID       -       +       ts              +       ++
SPT7    FW6714     deletion  -       -       NA              NA      NA
SPT8    FW6675     deletion  -       -       NA              NA      NA
SRB2    FW6702     deletion  -       -       NA              NA      NA
SSN3    FW6693     deletion  -       -       NA              NA      NA
SUM1    FW6696     deletion  -       -       NA              NA      NA
SWI3    FW6688     deletion  -       -       NA              NA      NA
SWR1    FW6697     deletion  -       -       deletion        -       -
TRF4    FW6720     deletion  -       -       NA              NA      NA
UBP3    FW6712     deletion  -       -       NA              NA      NA
VPS16   FW6692     deletion  -       -       NA              NA      NA
XRN1    FW4759     deletion  -       -       NA              NA      NA

Table 3.1 Table summarising targeted chromatin regulator screen Mutants for selected chromatin regulator genes were scored for aberrant expression of IRT2 and iMLP1 transcripts, using data from the northern blot assay in Figure 3.8 and the published tiling microarray expression data set (van Bakel et al., 2013). Scoring for expression of IRT2 and iMLP1: -, no expression increase in mutant strain; +, some expression increase in mutant strain; ++, notable expression increase in mutant strain. Gene deletion strains were obtained from a strain collection belonging to Peter Thorpe (Queen Mary University of London).
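
The cross-validation summarised in Table 3.1 amounts to intersecting two sets of hits. The sketch below is illustrative only and is not the procedure used for the thesis: it copies the iMLP1 scores from Table 3.1 for the mutants scored in both data sets and lists those with increased iMLP1 in both. The names IMLP1_SCORES and is_increased are hypothetical, and treating both + and ++ as a hit is an assumption.

```python
# iMLP1 scores copied from Table 3.1, restricted to mutants scored in both
# data sets: gene -> (northern blot score, van Bakel et al. 2013 score).
IMLP1_SCORES = {
    "BRE1": ("-", "-"),
    "INO80": ("-", "-"),
    "ISW1": ("+", "-"),
    "ISW2": ("-", "-"),
    "PAF1": ("-", "+"),
    "RAP1": ("++", "++"),
    "RLF2": ("++", "+"),
    "RPD3": ("-", "-"),
    "SET2": ("-", "-"),
    "SIR2": ("-", "-"),
    "SNF2": ("-", "-"),
    "SPT10": ("++", "++"),
    "SPT16": ("++", "++"),
    "SPT21": ("++", "++"),
    "SPT6": ("+", "++"),
    "SWR1": ("-", "-"),
}


def is_increased(score):
    """Assumption: both '+' and '++' count as increased expression."""
    return score in {"+", "++"}


# Keep mutants with increased iMLP1 in both data sets; RAP1 itself is the
# positive control of the screen, so it is excluded from the candidate list.
overlap = sorted(
    gene
    for gene, (northern, microarray) in IMLP1_SCORES.items()
    if gene != "RAP1" and is_increased(northern) and is_increased(microarray)
)
print(overlap)
```

Under these assumptions the intersection contains RLF2, SPT10, SPT16, SPT21, and SPT6, matching the five candidates taken forward.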

3.4.9 Genome-wide analysis of noncoding transcription in chromatin regulatory factor mutants

To investigate their wider role in control of noncoding transcription across the genome, I performed RNA sequencing on the gene deletion or auxin-inducible depletion mutants for the chromatin regulatory factors identified in Figure 3.8. I successfully implemented the AID system to enable inducible and effective depletion of the essential proteins Spt6 and Spt16. Expression of AID-tagged Rap1, Spt6, and Spt16 proteins was detected during exponential growth, and these proteins were depleted after addition of auxin to the media (Figure 3.9A). The genes encoding Rlf2, Spt10, and Spt21 were deleted. Similar to my previous experiments, I performed total RNA sequencing with ribosomal RNA depletion (Ribo-Zero) to detect both coding and noncoding RNAs with high sensitivity. 100 bp paired-end (PE) libraries were generated and sequenced to a depth of ~50 million reads per sample. The libraries showed good agreement between biological duplicate samples (Figure 3.9B). I then compared the changes in RNA expression around Rap1 binding sites genome-wide, to investigate the redundancy of transcriptional repression by Rap1 and other chromatin regulatory pathways. Depletion or deletion of these chromatin regulatory factors generally increased noncoding transcription within +100 bp windows around Rap1 binding sites, although some mutants displayed larger changes in expression than others (Figure 3.9C). From these data, I conclude that Rlf2, Spt10, Spt21, Spt6, and Spt16 repress noncoding transcription around Rap1 binding sites genome-wide.
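
As a rough illustration of the window-based comparison, the sketch below builds two 100 bp windows per binding site and computes a log2 fold change per window. It is not the pipeline used to generate Figure 3.9C. The window layout (one window on each side of the site) is an assumption that is merely consistent with 1128 windows for 564 sites; the coordinates, the pseudocount, and the function names flanking_windows and log2_fold_change are hypothetical.

```python
import math


def flanking_windows(site_start, site_end, size=100):
    """Return the upstream and downstream windows flanking one site.

    Windows are half-open (start, end) intervals. The two-window layout is
    an assumption made for this sketch.
    """
    upstream = (max(0, site_start - size), site_start)
    downstream = (site_end, site_end + size)
    return [upstream, downstream]


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount avoids log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical, evenly spaced site coordinates standing in for 564 real sites.
sites = [(1000 + 500 * i, 1013 + 500 * i) for i in range(564)]
windows = [w for start, end in sites for w in flanking_windows(start, end)]
print(len(windows))  # two windows per site

# Example: expression of 7 after treatment versus 1 in the mock control.
print(log2_fold_change(7.0, 1.0))
```

For the deletion mutants the same ratio would be taken against the wild-type control instead of the mock treatment, as described in the legend of Figure 3.9C.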

Figure 3.9 Genome-wide analysis of noncoding transcription in chromatin regulatory factor mutants (A) Western blot showing successful generation of auxin-inducible degron strains for the Spt6 and Spt16 proteins. RAP1-AID (FW3877), SPT6-AID (FW5555), and SPT16-AID (FW5559) cells were either treated with DMSO (-) or IAA (+) for 2 hours, and V5-AID-tagged proteins were detected using an anti-V5 antibody. Hxk1 blot shown as a loading control. (B) Correlation matrix heat map showing agreement between RNA-seq biological replicate samples. Total RNA samples depleted of rRNA were taken from cells during exponential growth or after 2 hours of treatment with DMSO or IAA. Colour scale corresponds to the Euclidean distance between the samples, based on all genes. Strains used: Wild-type (WT, FW629), rlf2Δ (FW5609), spt10Δ (FW5543), spt21Δ (FW5547), SPT6-AID (FW5555), SPT16-AID (FW5559), and RAP1-AID (FW3877). Harshil Patel performed read filtering and mapping, clustered the RNA-seq data, and generated this heat map. (C) Violin and box-and-whisker plots showing the distribution of changes in RNA expression around Rap1 sites genome-wide. Calculated expression values for rlf2Δ (FW5609), spt10Δ (FW5543), and spt21Δ (FW5547) were compared to a wild-type control (WT, FW629), whereas for RAP1-AID (FW3877), SPT6-AID (FW5555), and SPT16-AID (FW5559) fold change values were obtained by comparing IAA treatment (+IAA) to mock treatment (+DMSO). The distribution of RNA expression changes for +100 bp windows (n = 1128) around Rap1 binding sites (n = 564) is shown on the y-axis, calculated from two independent experiments. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

3.4.10 Rap1 is not redundant with other chromatin regulatory pathways in repression of divergent transcription

I focused on individual loci to further delineate the types of noncoding transcripts regulated by Rap1 and other chromatin regulatory pathways. I found that Rap1-repressed divergent transcripts and other noncoding transcripts repressed by Rlf2, Spt10, Spt21, Spt6, and Spt16 overlapped very little. For example, I observed antisense transcription downstream of the Rap1 binding site, closer to or within the coding gene, at the RPL24B and RPL40B promoters (Figure 3.10, right panels). At other loci such as RPL43B and RPL25, no divergent transcription was observed from gene promoters after depletion or deletion of the chromatin regulators, but a clear divergent transcript initiating near the Rap1 binding site appeared after Rap1 depletion (Figure 3.10, left panels). In other words, these factors limit expression of distinct antisense transcripts within gene bodies. These examples illustrate that Rap1 is not redundant with these chromatin assembly and remodelling pathways in repression of divergent noncoding transcription.

Figure 3.10 Rap1 is not redundant with other chromatin regulatory pathways in repression of divergent transcription Screenshots of total RNA-seq data, showing divergent and cryptic transcription at the RPL43B, RPL40B, RPL25, and RPL24B loci. Separate tracks shown for WT, Rap1-depleted, and chromatin regulator mutant samples. Normalised reads shown on the y-axis for the Watson (W, blue) and Crick (C, red) strands. Rap1 binding sites are shown as red boxes. Transcription start site sequencing (TSS-seq) data corresponding to WT and RAP1-AID +IAA conditions are also shown for reference, and depict positions of transcription initiation at single nucleotide resolution. These TSS-seq data are fully described later (Chapter 4.4.4). Harshil Patel performed read filtering, mapping, and generated the read coverage tracks for data visualisation in this figure.

3.4.11 Rap1 and other chromatin regulatory pathways control divergent and antisense transcription in distinct genomic locations

To gain a more comprehensive picture of noncoding transcription within gene promoters, I re-examined the set of high-confidence Rap1-regulated genes. I measured the change in RNA expression in windows centred on promoter Rap1 binding sites after depletion or deletion of Rap1 or other chromatin regulatory proteins. In all mutants I observed increased RNA expression within intergenic regions at Rap1-regulated gene promoters (Figure 3.11A). Consistently across all mutants, the change in RNA expression was higher in the antisense direction compared to the sense direction, relative to the coding gene. However, data from individual loci (Figure 3.10) suggested that there was positional variation in the types of noncoding transcripts controlled by Rap1 and other chromatin regulators. To test this hypothesis, the data were plotted at a resolution of 5 nt using heat maps centred on promoter Rap1 sites. In agreement with the hypothesis, the five chromatin regulatory protein mutants showed higher antisense RNA expression initiating downstream of the Rap1 binding sites, further within the gene promoters or gene bodies (Figure 3.11B). In most cases, this antisense transcription overlaps the Rap1 binding sites located upstream, and terminates at the upstream edge of the promoter nucleosome-depleted region (NDR). In contrast, Rap1 depletion induced divergent transcription at the majority of Rap1-regulated promoters initiating at or near the Rap1 binding sites located at the 5’ border of the NDR. The patterns of noncoding sense transcription around promoter Rap1 sites induced after inactivation of these chromatin regulatory pathways were also distinct from those induced by Rap1 depletion (Figure 3.11C). Together, these data indicate that Rap1 acts in concert with chromatin regulators to repress divergent and antisense transcription initiating in different locations, but in a distinct manner that is spatially limited.
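
The binning step behind such heat maps can be sketched as follows. This is a minimal illustration, not the analysis code used for Figure 3.11: it averages per-nucleotide coverage in 5 nt bins across an interval centred on a binding site and reports a log2 change per bin. The pseudocount, the toy coverage vectors, and the function names bin_signal and binned_log2_change are assumptions.

```python
import math


def bin_signal(per_nt, bin_size=5):
    """Average a per-nucleotide coverage vector in consecutive bins."""
    if len(per_nt) % bin_size:
        raise ValueError("vector length must be a multiple of bin_size")
    return [
        sum(per_nt[i:i + bin_size]) / bin_size
        for i in range(0, len(per_nt), bin_size)
    ]


def binned_log2_change(mutant, control, bin_size=5, pseudocount=1.0):
    """One heat map row: log2(mutant / control) for each bin."""
    mutant_bins = bin_signal(mutant, bin_size)
    control_bins = bin_signal(control, bin_size)
    return [
        math.log2((m + pseudocount) / (c + pseudocount))
        for m, c in zip(mutant_bins, control_bins)
    ]


# Toy 20 nt interval: no signal in the control, and signal appearing only in
# the second half of the interval in the mutant.
control = [0.0] * 20
mutant = [0.0] * 10 + [3.0] * 10
row = binned_log2_change(mutant, control)
print(row)
```

Stacking one such row per promoter, with every interval oriented relative to the coding gene, gives a matrix that can be clustered and drawn as a heat map.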

Figure 3.11 Rap1 and other chromatin regulators control divergent and antisense transcription in distinct genomic locations (A) Scatter plot showing changes in total RNA expression around Rap1 sites at gene promoters for mutants described in Figure 3.9C, separated by signals on antisense (AS) and sense (S) strands. Each point represents one strand-specific interval at one promoter and horizontal red and blue lines represent mean values. (B) Heat maps showing changes in total RNA expression around Rap1 sites on antisense (AS) strand, for mutants described in Figure 3.9C. Promoters were clustered and ordered based on AS strand signals as described in Figure 3.6B. Colour scale corresponds to magnitude and direction of change, calculated in bins of 5 bp throughout intervals centred on Rap1 sites at promoters. (C) Similar to Figure 3.11B, except that signals on the sense (S) strand are shown. (A-C) Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure, and generated the heat maps.

3.5 Discussion

3.5.1 Summary

In this chapter, I identified that the sequence-specific activator Rap1 represses divergent noncoding transcription at highly expressed gene promoters in S. cerevisiae. Using Rap1-regulated ribosomal protein genes as a model, I demonstrated that the control of divergent noncoding RNAs is specific to Rap1. The other coactivator proteins recruited to RP gene promoters are important for regulation of coding gene expression, but are not required for suppression of divergent transcription. If Rap1-regulated noncoding RNAs are inappropriately expressed, they locally interfere with important programmes of gene expression. This novel function of Rap1 is widespread and substantial. Rap1 represses noncoding transcription at hundreds of sites across the yeast genome, and at the majority of its target gene promoters. The Rap1-regulated transcripts are polyadenylated, and Rap1 is not redundant with RNA surveillance and degradation pathways that limit expression of other noncoding transcripts. Finally, I identified that Rap1 represses noncoding transcription in a distinct manner to previously described pathways that regulate chromatin assembly and organisation. Rap1 and other chromatin regulators control divergent and antisense transcription arising from distinct genomic locations. These pathways are not redundant, but act in concert to ensure transcriptional fidelity at coding genes.

3.5.2 Evaluation of the AID system to study essential transcription factors

To study the RP gene transcription factors, I used the auxin-inducible degron (AID) system to deplete essential proteins. This avoids harsh cellular perturbations such as rapamycin treatment or heat shock (the latter requires temperature-sensitive alleles). Heat shock and TOR signalling have large effects on RP gene expression (Martin et al., 2004; Reja et al., 2015). The AID system enables rapid and inducible protein depletion without perturbing important regulatory pathways upstream of Rap1 and its coactivators. However, addition of the AID tag can sometimes de-stabilise endogenously tagged proteins and interfere with their functions. In addition, auxin-induced depletion is inherently not 100% efficient, as new copies of the AID-tagged target protein are continuously synthesised and degraded. For more complete depletion, the AID system could be combined with transcriptional repression. Strains that were endogenously tagged with Rap1-AID and did not express the TIR1 ubiquitin ligase did not show a fitness defect compared to wild-type strains (Figure 3.1D). Depletion of Rap1-AID with the addition of auxin (IAA) was lethal in a spot growth assay. In addition, the expression of Rap1-regulated genes was comparable between wild-type cells and Rap1-AID cells without auxin induction. Finally, Rap1-AID cells without auxin treatment did not exhibit significant differences in noncoding transcription around Rap1 binding sites, compared to wild-type cells (Figure 3.6C). A recent study also reported that treatment of wild-type S. cerevisiae cells with auxin did not lead to global changes in coding gene expression (Tye et al., 2019). These data indicate that the AID system is appropriate for investigation of transcription factors that regulate RP gene expression in yeast.
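
The point that depletion cannot be complete while synthesis continues can be illustrated with a toy steady-state model. This is a sketch under stated assumptions and was not measured or fitted in this work: protein is made at a constant rate and lost by first-order decay, auxin adds a second first-order degradation term, and transcriptional repression scales the synthesis rate. All rate values and the function name steady_state_fraction are hypothetical.

```python
def steady_state_fraction(k_deg, k_aid, synthesis_fold=1.0):
    """Fraction of the untreated steady-state protein level that remains.

    Toy model (assumption): dP/dt = k_syn - k_deg * P without auxin, so the
    untreated level is k_syn / k_deg. With auxin, degradation becomes
    k_deg + k_aid, and repression scales k_syn by synthesis_fold. The ratio
    of the two steady states is independent of k_syn.
    """
    if k_deg <= 0 or k_aid < 0 or synthesis_fold < 0:
        raise ValueError("rates must be non-negative and k_deg positive")
    return synthesis_fold * k_deg / (k_deg + k_aid)


# Hypothetical rates (per minute): auxin raises total degradation 20-fold.
auxin_only = steady_state_fraction(k_deg=0.01, k_aid=0.19)

# Same degradation, plus a hypothetical 10-fold drop in synthesis.
auxin_plus_repression = steady_state_fraction(
    k_deg=0.01, k_aid=0.19, synthesis_fold=0.1
)
print(auxin_only, auxin_plus_repression)
```

In this toy model the residual protein level only reaches zero if synthesis stops, which is the rationale for combining the degron with transcriptional repression.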

3.5.3 Rap1 specifically limits divergent transcription

Of the transcription factors that regulate RP gene transcription, only depletion of Rap1 induced aberrant expression of divergent RNAs. At RP genes, the two independent core promoters (coding and divergent, respectively) in the promoter nucleosome-depleted region (NDR) may compete for the same pool of general transcription machinery and RNA polymerase II (Rhee and Pugh, 2012). The high transcriptional activity of the coding RP gene could titrate away RNA polymerase from the divergent TSS, limiting the rate of transcription initiation at the divergent core promoter. However, depleting the other RP gene coactivators did not have the same effect as Rap1, and therefore the limited divergent transcription in wild-type cells is not just a byproduct of RP gene activation. It is important to note that the RNA samples for northern blot and RNA sequencing experiments were obtained from a population of cells in small batch culture, and single-cell analysis techniques (for example, with fluorescent reporters) would be required to determine whether the coding and divergent transcription events occur simultaneously.

3.5.4 Consequences of aberrant divergent transcription

What are the consequences when divergent transcription is mis-regulated? I identified two independent examples, at the RPL43B and RPL40B loci, where aberrant divergent transcripts perturb neighbouring gene expression. At the RPL43B locus, constitutive expression of IRT2 leads to abnormally high expression of a cell fate master regulator gene, IME1 (Figure 3.3C) (Moretto et al., 2018). Within the population of cells exposed to environmental nutrient stress upon shifting to nutrient-depleted sporulation media, the median expression of IME1 is higher in the absence of Rap1 binding sites. IME1 expression is tightly regulated, and aberrant IRT2 expression may bias this crucial cell fate decision in a population of diploid cells upon starvation. At the RPL40B locus, the proximity of the divergent TSS to the neighbouring MLP1 gene leads to expression of a 5’ extended transcript isoform, iMLP1, in the absence of Rap1 or its binding site motifs (Figure 3.3E). Aberrant iMLP1 expression leads to down-regulation of Mlp1 protein expression (Figure 3.3F). Mlp1 is a component of the nuclear pore-associated basket and regulates export of unspliced pre-mRNAs. Therefore, aberrant expression of the iMLP1 transcript would likely affect widespread mRNA quality control by affecting Mlp1 function. There are 15 upstream start (AUG) codons within the 5’ extended sequence of iMLP1, 10 of which are out of frame with the coding open reading frame (ORF). These upstream ORFs (uORFs) may render iMLP1 translationally inert and interfere with MLP1 expression in a similar manner to long undecoded transcript isoforms (lutis), found at NDC80 and other loci (Chen et al., 2017; Cheng et al., 2018; Hinnebusch et al., 2016).
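
Counting upstream start codons and assigning their reading frame relative to the main ORF is simple to do computationally. The sketch below shows one way this could be done, and is not necessarily how the iMLP1 numbers above were obtained. The input is assumed to be the DNA sequence of the 5’ extension, ending immediately before the annotated start codon; the toy sequence is hypothetical and is not the iMLP1 sequence, and the function name classify_upstream_augs is an assumption.

```python
def classify_upstream_augs(extension, codon="ATG"):
    """Split upstream start codons by frame relative to the main ORF.

    `extension` is the sequence from the extended TSS up to, but not
    including, the main ORF start codon. A start codon at index i is in
    frame when its distance to the main start is a multiple of three.
    Returns two lists of indices: (in_frame, out_of_frame).
    """
    extension = extension.upper()
    length = len(extension)
    in_frame, out_of_frame = [], []
    for i in range(length - 2):
        if extension[i:i + 3] == codon:
            if (length - i) % 3 == 0:
                in_frame.append(i)
            else:
                out_of_frame.append(i)
    return in_frame, out_of_frame


# Hypothetical 15 nt extension with start codons at positions 0, 5, and 10.
toy_extension = "ATGCCATGAAATGCC"
in_frame, out_of_frame = classify_upstream_augs(toy_extension)
print(in_frame, out_of_frame)
```

A fuller analysis would also record where each upstream ORF terminates, since an in-frame start codon with no intervening stop codon would extend the protein rather than form a separate uORF.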

At the RPL8A locus, a Rap1-regulated divergent transcript runs antisense to the neighbouring GUT1 gene, across the entire gene body. The close spacing of tandemly oriented genes on compact yeast chromosomes provides ample opportunity for transcriptional interference by antisense transcription (Camblong et al., 2007), and potentially even collision of convergent RNA polymerases (Hobson et al., 2012; Prescott and Proudfoot, 2002). Antisense transcription can affect gene expression in a variety of ways, such as regulation of the chromatin state at gene promoters (Murray et al., 2015). Opposing RNA polymerases represent insurmountable obstacles for each other, and collided RNA polymerases must be removed through ubiquitylation and proteolytic degradation. Finally, aberrant divergent transcription generates R-loops that cause DNA damage and genome instability, in particular when they are encountered by replisomes (Hamperl et al., 2017; Nojima et al., 2018). I observed that more genes, beyond the high-confidence Rap1 target genes, were differentially expressed after Rap1 depletion. These genes could represent previously unannotated target genes activated or repressed by Rap1 directly. Alternatively, their expression could be indirectly affected through a variety of mechanisms, some of which may include those discussed above.

3.5.5 Rap1 controls expression of noncoding transcripts to a large extent

Genome-wide RNA sequencing approaches revealed that Rap1 controls divergent transcription and promoter directionality to a large extent. Approximately 2/3 of Rap1-regulated genes display increased divergent transcription after depletion of Rap1, but the remaining 1/3 of genes do not. The gene promoters that do not display divergent RNA expression show no clear trends in promoter architecture, motif usage, motif orientation, expression level, or regulation by additional factors. I speculate that subtle differences in DNA sequence at and around the Rap1 binding sites affect their propensity to act as cryptic core promoters. In addition, the local chromatin context conferred by histone modifications, nucleosome positioning, and their regulatory factors affects the likelihood of divergent transcription. Currently, not much is known about RP gene regulation in higher eukaryotes and mammals, and comparative evolutionary analysis of promoter sequence and identification of the regulatory factors would help identify common principles of control (Hu and Li, 2007). To obtain insight into the molecular factors that control promoter directionality, future work could use single locus proteomics strategies to identify novel regulators of divergent transcription in an unbiased way (Myers et al., 2018; Tsui et al., 2018). Alternatively, in vitro transcription experiments using representative divergent promoters as DNA templates could elucidate the contributions of key factors supplied individually. Finally, it is possible that the divergent RNA species at ~1/3 of Rap1-regulated gene promoters are too short for detection by typical RNA-seq library preparation protocols, as smaller fragments will be removed in size selection and clean-up steps. If they exist, these species could be interrogated using a dedicated small RNA sequencing strategy, nascent sequencing approaches that capture small fragments (e.g. Start-seq, NET-seq, and CRAC) (Churchman and Weissman, 2011; Granneman et al., 2009; Nechaev et al., 2010), or northern blotting as demonstrated here.

3.5.6 Functional redundancy between Rap1 and other pathways that limit noncoding RNAs

Rap1 is not redundant with other previously characterised pathways that control expression of divergent and noncoding transcripts. The factors and mechanisms that control chromatin assembly and RNA turnover are widely conserved within eukaryotic species, but the data presented indicate an additional role for sequence-specific transcription factors. In mammals, most long noncoding RNAs (lncRNAs) are believed to be non-polyadenylated, although for lncRNAs detectable by sequencing of polyadenylated RNA, the proportion of polyadenylated versus non-polyadenylated transcripts is unclear (Schlackow et al., 2017). In yeast, divergent and noncoding transcripts tend to lack full-length polyA tails and are only transiently polyadenylated in the process of termination and degradation (Porrua and Libri, 2015). Many unstable transcript species are only detectable when exosome and other RNA degradation pathways are inactivated (Neil et al., 2009; Schulz et al., 2013; van Dijk et al., 2011; Xu et al., 2009). In contrast, Rap1-regulated divergent transcripts are stably expressed and polyadenylated in strain backgrounds where RNA surveillance and degradation machinery is functional, indicating that these pathways are not entirely redundant with Rap1. It remains unclear how these divergent transcripts escape premature termination and degradation, for example by the NNS pathway. Many divergent CUTs and SUTs are enriched in motifs that recruit NNS machinery, but Rap1-regulated transcripts may escape this evolutionary pressure as they are not expressed to a significant extent in wild-type cells. Rap1-regulated noncoding transcripts display little overlap with CUTs and SUTs (Figure 3.7). There is evidence that some ectopic TSSs regulated by Rap1 are sensitive to deletion of UPF1, a key component of the nonsense-mediated decay (NMD) pathway in the cytosol (Challal et al., 2018; Malabat et al., 2015). Further work is needed to fully understand the fate of Rap1-repressed divergent and noncoding transcripts, in particular whether a significant proportion reach the cytosol and are effectively mis-translated into polypeptides.

The data presented from RNA sequencing and individual examples show that Rap1 is not redundant with previously characterised pathways that limit divergent and noncoding transcription through chromatin assembly. Mutants for other chromatin regulatory pathways show only minor phenotypes of de-repression for the representative IRT2 and iMLP1 transcripts. It is possible that recruitment of factors such as SPT10, SPT21, RLF2, SPT6, and SPT16 is mediated by Rap1, thereby allowing them to function locally and repress divergent transcription. However, upon closer examination with RNA-seq, the antisense transcripts induced in these mutants do not appear to be phased around the promoter Rap1 binding sites, in contrast to the divergent transcripts observed after Rap1 depletion. Most of these chromatin regulators limit expression of similar types of transcripts: in general, antisense and sense direction transcripts originating within the nucleosome-dense bodies of coding genes, rather than the relatively nucleosome-free promoters. RLF2, which acts in the CAF-I chromatin assembly pathway, has already been described to repress divergent transcript expression, likely by limiting nucleosome turnover. Expression of several histone genes in the SPT10 and SPT21 mutants was significantly reduced, suggesting that lower levels of nucleosome assembly or occupancy led to de-repression of cryptic transcripts. SPT6 and SPT16 are required for co-transcriptional chromatin remodelling and nucleosome re-deposition in the wake of transcription (DeGennaro et al., 2013; Jeronimo et al., 2015; Winkler and Luger, 2011). Without these factors, cryptic TSSs become accessible and drive noncoding transcription. The RNA-seq data presented are consistent with a recent report that employed ChIP-nexus, TSS-seq, and NET-seq approaches to investigate the role of Spt6 in transcriptional fidelity (Doris et al., 2018).

Finally, perturbation of the various silencing proteins that cooperate with Rap1 to repress transcription at telomeres and HM loci did not result in aberrant divergent transcription. Recent work reported that histone deacetylases from the SIRT6 family, Hst3 and Hst4, repress transcription of divergent antisense and cryptic unstable transcripts (CUTs) (Feldman and Peterson, 2019). However, there is limited evidence to suggest that silencing factors, like sirtuins, are enriched at highly expressed RP gene promoters, and it seems counterintuitive to recruit silencing proteins to very active genes to generate heterochromatin. In contrast, the Rap1 transcription factor is ideally placed to limit expression of divergent noncoding RNAs at nucleosome-depleted gene promoters due to its stable occupancy on DNA. Therefore, Rap1 and other pathways function in concert, not redundantly, to control divergent and noncoding transcription originating from different places across the genome.

3.5.7 Conclusion

In conclusion, these data highlight a new role for the conserved transcription factor Rap1, distinct from its known functions in gene activation, transcriptional silencing, and nucleosome organisation. Rap1 represses noncoding transcription at hundreds of sites genome-wide and at 2/3 of its target gene promoters. Rap1 does not function redundantly with known RNA surveillance and chromatin-based pathways that also limit expression of noncoding transcripts. Without Rap1, aberrant expression of noncoding transcripts compromises important programmes of gene expression. Therefore, the sequence-specific transcription factor Rap1 controls promoter directionality and ensures transcriptional fidelity at highly expressed gene promoters.

Chapter 4 Results

Chapter 4. Mechanism and Key Regulatory Principles for Regulation of Divergent Transcription by Rap1

4.1 Acknowledgement

This research was published in Molecular Cell, and has been modified to present within this chapter (Wu et al., 2018b). Parts of the Discussion section were published in a Point-of-View article in Transcription, and have been modified or adapted for this chapter (Wu and Van Werven, 2019).

The northern blot in Figure 4.1B was performed with the help of Fabien Moretto (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute). I designed the experiment, generated the strains, collected the samples, and prepared the northern blot membrane for Fabien who performed the probe labelling, hybridisation, and imaging for the IRT2 blot. I subsequently performed data analysis and interpretation.

The fluorescent reporter plasmids used in Figure 4.2, Figure 4.3, and Figure 4.7 were constructed with the help of Folkert van Werven (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute), who generated the plasmids with transcription factor binding site motifs. Folkert also assisted with microscopy data collection and analysis. I designed the experiments, constructed the yeast strains, and performed data collection and analysis.

Minghao Chia (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute) developed the TSS-seq protocol and assisted with preparation of the TSS-seq libraries in Figure 4.4 - Figure 4.6. This protocol will be included in a future publication by Minghao and his co-authors, currently in preparation. I designed the experiments, constructed the yeast strains, and collected and processed the samples prior to TSS-seq library preparation by Minghao.

The TSS-seq libraries were sequenced by the Advanced Sequencing Facility Science Technology Platform (The Francis Crick Institute).

The Bioanalyzer analysis of the TSS-seq libraries was performed with the help of the Genomics Equipment Park Facility (The Francis Crick Institute).

The bioinformatic analyses in Figure 4.4 - Figure 4.6 were performed with the help of Harshil Patel (Bioinformatics and Biostatistics Science Technology Platform, The Francis Crick Institute). Harshil performed the read trimming, filtering, and mapping, differential expression analysis, processing of data to generate coverage tracks, and plotting for heat maps. I designed the bioinformatic analysis strategy with the help of Harshil, and performed analysis, visualisation, and interpretation of the processed data.

4.2 Abstract

In Chapter 3, I identified that Rap1 controls divergent transcription and confers directionality at a large fraction of its target gene promoters. When Rap1 is depleted, divergent RNAs initiate from cryptic transcription start sites (TSSs) located near Rap1 binding sites in promoters. However, the mechanism wherein Rap1 controls expression of divergent transcripts is not known. Here, I demonstrate that the position of the Rap1 binding site and its proximity to the divergent TSS are crucial determinants of transcriptional repression by Rap1. Genome-wide mapping of coding and cryptic TSSs identifies that many divergent core promoters overlap with, or are positioned extremely close to, Rap1 binding sites. Surprisingly, interaction between Rap1 and its known silencing cofactors is not required for repression of divergent transcription at promoters. However, a small region within the C-terminal domain (CTD) of Rap1 comprising residues 631-696 is required to limit expression of divergent noncoding RNAs. Other transcription factors in budding yeast may also repress divergent promoter activity in a similar fashion to Rap1. Finally, I demonstrate that tethering a nuclease-inactivated Cas9 protein to divergent core promoters is sufficient to confer repression of divergent RNA expression. These data demonstrate key regulatory principles of divergent
transcript regulation by Rap1. I propose a model wherein divergent core promoter activity is limited by sequence-specific transcription factors through steric hindrance.

4.3 Introduction

My investigations in Chapter 3 using single locus and genome-wide approaches demonstrated that Rap1 is a novel regulator of divergent noncoding transcription, and consequently promoter directionality. I identified that Rap1 functions independently of, rather than redundantly with, ubiquitous RNA surveillance pathways. In addition, I assessed the extent to which representative Rap1-regulated divergent transcripts are suppressed by chromatin regulatory pathways. Rap1 and known chromatin assembly pathways function in concert to limit noncoding transcription within spatially distinct regions of the yeast genome. These data point towards a direct role for the Rap1 protein itself in regulation of divergent promoter activity.

In general, sequence-specific transcription factors recruit cofactors – for example, chromatin remodellers, histone modifying enzymes, or general transcription machinery – to perform their functions. Many of the macromolecular complexes that play key roles in these processes have been identified in yeast and other eukaryotic cells. However, transcription factors can also directly regulate gene expression, for example by directly blocking recruitment of RNA polymerase to core promoters (Browning and Busby, 2004). It is not known whether Rap1 exploits these existing mechanisms to regulate divergent promoter activity, or instead performs this function using a different process or complement of factors. In this Chapter, I aim to understand how Rap1 controls divergent transcription and promoter directionality.

Many divergent noncoding RNAs appear to initiate near promoter Rap1 binding sites (Figure 3.6B). It is unclear whether repression of divergent promoters depends on the position or proximity of the Rap1 binding sites. The position of activator binding can affect coding gene expression output (Huang et al., 2012), but the consequences for divergent transcripts are not known. For example, where are the transcription start sites (TSSs) of divergent transcripts located? Can Rap1 limit expression of divergent transcripts when bound to DNA in any orientation? In addition, it is necessary to address these questions with high resolution approaches at a genome-wide scale to understand this novel role for Rap1. Although bidirectional transcription is a well-documented phenomenon in budding yeast and higher eukaryotes (Ibrahim et al., 2018; Jin et al., 2017; Lacadie et al., 2016; Wei et al., 2011), it is not known whether other sequence-specific transcription factors in yeast also regulate divergent transcription at promoters. Is this a widespread but underappreciated phenomenon? If certain transcription factors share structural or functional characteristics with Rap1, do they function in a similar way?

Most transcription factors possess modular functional domains with structural homology to analogous regions in other proteins (Frankel and Kim, 1991). Functional domains of Rap1 required for gene activation and silencing of heterochromatin have been characterised (Feeser and Wolberger, 2008; Garbett et al., 2007; Johnson and Weil, 2017; Layer et al., 2010; Shi et al., 2013; Sussel and Shore, 1991). However, it is unclear whether specific domains of Rap1 also confer the ability to repress divergent transcription. It is also essential to understand whether there are inherent functional differences between divergent and coding direction core promoters. Systematic investigation of their properties using synthetic transcription factors with unique properties may generate useful insights into principles of gene regulation in eukaryotes.

Here, I demonstrate that the distance and position of transcription factor binding sites, relative to transcript start sites, strongly affect activity of divergent core promoters. I expand on these key findings from model loci by mapping Rap1-regulated TSSs at a genome-wide scale and measuring their response to depletion of Rap1. I then examine whether other sequence-specific transcription factors in yeast can also regulate divergent transcription in a similar manner. Furthermore, I identify that the C-terminal domain (CTD) of Rap1 contributes to repression of divergent RNAs, independently of interaction with known silencing cofactors. Analysis of functional Rap1 domains by mutation demonstrates that a small region in the Rap1 CTD comprising residues 631-696 is required to limit divergent transcription. Finally, I use synthetic transcription factors and CRISPR interference technology to evaluate the properties of Rap1-regulated divergent promoters and elucidate the key regulatory principles. These data identify that Rap1 directly regulates core promoters in close proximity, likely by interfering with recruitment or activity of activator proteins and transcriptional machinery.

4.4 Results

4.4.1 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at the RPL43B locus

From the local increases in divergent RNA expression at Rap1-regulated genes, I observed that many divergent transcripts initiate very close to promoter Rap1 binding sites (Figure 3.6). I hypothesised that the distance between the Rap1 binding site and the divergent TSS is critical for transcriptional repression. For example, what would happen if the distance between the Rap1 site and the divergent TSS were increased?

I generated several mutant versions of the RPL43B promoter at the endogenous locus to test the positional effects of Rap1 binding on IRT2 expression (Figure 4.1A). The Cre-LoxP system was used to introduce spacer sequences of DNA, by integrating LoxP-flanked selectable markers and spacer DNA into target sites and subsequently excising the “floxed” markers with Cre recombinase as described in Chapter 3 (Figure 3.3) (Gueldener et al., 2002). I first deleted the Rap1 binding sites at RPL43B (bsΔ), to reproduce the loss of Rap1 binding at the gene promoter. Then, I deleted the Rap1 binding sites and introduced a spacer (S) sequence of 400 bp sourced from the pUG27 vector (bsΔS) (Gueldener et al., 2002). This mutated promoter sequence should generate a longer transcript, encompassing the Spacer (S) sequence and IRT2. To evaluate whether the repressive effect of Rap1 binding is dependent on proximity, I integrated the spacer sequence (S) while keeping the Rap1 binding sites adjacent to the IRT2 TSS (SU), or repositioned the Rap1 binding sites in a distal position (SD) 400 bp downstream
of the IRT2 TSS. If the location of the Rap1 binding site relative to the TSS is critical, the divergent transcript should be repressed when the Rap1 site is proximal (SU) but not when the Rap1 site is distal (SD) to the TSS. The transcripts were detected using a northern blot probe targeting the IRT2 sequence. First, I observed that IRT2 was de-repressed in bsΔ to a similar level as RAP1-AID +IAA (Figure 4.1B). These data confirmed that Rap1 binding to DNA is required for transcriptional repression. The IRT2 transcript in bsΔ appeared slightly larger than the IRT2 transcript expressed in RAP1-AID +IAA, likely because the residual 105 bp LoxP and flanking sequence was also transcribed (compared to the original Rap1 binding sites, which comprise 28 bp of DNA). Next, I observed that a longer transcript was detected in bsΔS. I expected that the transcript originating from the IRT2 core promoter would include the sequence of the 400 bp spacer and IRT2 itself (S + IRT2). Comparing the migration distance to the ladder of RNA molecular weight markers indicated that the difference in length was approximately 400 bp. In the SU mutant wherein the Rap1 binding sites were integrated near the core promoter of IRT2 upstream of the spacer sequence, repression of the S + IRT2 transcript was reinstated. In the SD construct, the Rap1 binding sites are positioned between the spacer (S) and IRT2 sequences in a potential “roadblock” position. However, repression of S + IRT2 expression was ineffective when the Rap1 sites were positioned distal to the IRT2 core promoter (SD). To verify this result, I confirmed that endogenous Rap1 protein tagged with the V5 epitope was still able to bind to the distal Rap1 sites in SD (Figure 4.1C). Rap1 and other general regulatory factors like Abf1 and Reb1 are able to terminate upstream transcription as transcriptional “roadblocks” in certain contexts within the genome (Candelli et al., 2018; Colin et al., 2014; Yarrington et al., 2012). 
However, these data suggest that Rap1 limits divergent transcript expression at gene promoters by interfering with transcription initiation instead of elongation.

Figure 4.1 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at the RPL43B locus (A) Schematic diagram of RPL43B promoter mutants used to test positional requirements of Rap1 binding site for repression of IRT2. WT (wild type, FW629), bsΔ (FW3440), bsΔS (FW3920), SU (FW7451), SD (FW3922). Blue bar, 400 bp spacer sequence; red boxes, Rap1 motifs; arrowheads, TSSs; S, spacer sequence. (B) Northern blot showing IRT2 expression in mutants described in A. MW, RNA molecular weight marker run on the same membrane. SNR190 transcript shown as loading control. This experiment was performed with the help of Fabien Moretto. (C) Rap1 binding to the RPL43B promoter measured by ChIP (FW4732, FW4734, FW4735, and FW629). Data normalised over region at 3’ end of ACT1 and plotted as mean + SEM (n = 3).

4.4.2 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at an independent promoter

At highly expressed RP gene promoters, the position and proximity of Rap1 binding sites to divergent transcript TSSs clearly influence transcriptional output. Outside the context of these highly specialised genes, I aimed to determine whether Rap1 binding was sufficient to repress divergent transcription in general. To study this, I took advantage of the PPT1-SUT129 promoter (pPS) fluorescent reporter construct, previously generated and characterised by the Buratowski lab (Marquardt et al., 2014). At the endogenous PPT1 gene promoter, the SUT129 noncoding RNA is expressed as a divergent transcript from the shared nucleosome-depleted region (NDR). To facilitate screening, the coding sequence of the PPT1 gene was replaced with a sequence encoding the mCherry fluorescent protein, and the SUT129 sequence was replaced with a sequence encoding the yellow fluorescent protein (YFP) (Marquardt et al., 2014; Nagai et al., 2002; Shaner et al., 2004). By measuring the corresponding signals for each reporter protein using fluorescence microscopy, the YFP and mCherry signals can be exploited as readouts for divergent and coding direction transcription, respectively. Three different reporter constructs were designed: pPS, comprising the endogenous PPT1/SUT129 promoter with YFP and mCherry reporter sequences; R1p, wherein Rap1 motifs from RPL43B were introduced adjacent to the TSS of SUT129 (20 bp, proximal); and R1d, wherein the Rap1 motifs were introduced further away from the SUT129 TSS (104 bp, distal) (Figure 4.2A). These constructs were stably integrated into haploid cells, replacing the endogenous PPT1/SUT129 genes.

During exponential growth in rich medium, cells with the pPS reporter (no Rap1 motifs) displayed higher YFP and mCherry signals than cells without the reporter (Figure 4.2B-C). Introduction of Rap1 binding sites 20 bp away from the SUT129 TSS (R1p) notably lowered YFP signals, while PPT1 (mCherry) activity increased. This demonstrates that Rap1 is able to repress divergent transcription outside of its native sequence and chromatin context at another gene promoter. However, when the Rap1 binding sites were introduced 104 bp from the SUT129 TSS (R1d), the YFP signal matched the signal obtained from the pPS reporter. To confirm that these effects were dependent on the Rap1 protein, I repeated these experiments in cells containing the RAP1-AID inducible protein depletion system. Again, the presence of proximal Rap1 sites (R1p) strongly repressed SUT129 promoter activity, but the YFP signals increased and were comparable to control plasmid (pPS) levels upon Rap1 depletion (RAP1-AID +IAA). This Rap1-dependent repression was not observed in RAP1-AID cells with Rap1 sites distal to the SUT129 TSS (R1d). Therefore, Rap1 binding is sufficient to repress divergent transcription when located near cryptic core promoter sequences.
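The comparison between reporter strains rests on background-corrected mean signals and their 95% confidence intervals. As an illustration only, the following Python sketch shows one way such a summary could be computed from per-cell measurements. The function names, the use of a normal approximation (z = 1.96), and the example values are assumptions made for this sketch and are not taken from the analysis pipeline used in this thesis.

```python
import math
import statistics

Z_95 = 1.96  # normal approximation for a 95% interval; an assumption of this sketch


def mean_with_ci(raw_signals, background, z=Z_95):
    """Return (mean, lower, upper) for background-corrected per-cell signals."""
    corrected = [s - background for s in raw_signals]
    mean = statistics.fmean(corrected)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(corrected) / math.sqrt(len(corrected))
    return mean, mean - z * sem, mean + z * sem


def intervals_overlap(a, b):
    """True if two (mean, lower, upper) summaries have overlapping intervals."""
    return a[1] <= b[2] and b[1] <= a[2]


# Hypothetical per-cell YFP intensities (arbitrary units), not real data.
control = mean_with_ci([120, 131, 118, 125, 129], background=20)
proximal = mean_with_ci([45, 52, 48, 50, 47], background=20)
print(control, proximal, intervals_overlap(control, proximal))
```

With n = 50 cells per sample, a t-distribution critical value (about 2.01) would give a slightly wider interval than the normal approximation used here.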

Figure 4.2 A proximal Rap1 binding site is required and sufficient to repress divergent transcription at an independent promoter. (A) Schematic diagram illustrating fluorescent reporter constructs. pPS has been described previously (Marquardt et al., 2014). The pPS construct contains the divergent promoter for the PPT1 gene and associated divergent transcript SUT129, where the PPT1 sequence is replaced by an mCherry reporter gene and the SUT129 sequence is replaced by a yellow fluorescent protein (YFP) sequence. The Rap1 sites from the RPL43B promoter were integrated at a proximal (R1p, 20 bp) or distal (R1d, 104 bp) position to the TSS of the divergent transcript SUT129. (B) Plots displaying normalised YFP and mCherry signal, quantified in cells harbouring fluorescent reporter constructs described in A. WT (FW629), pPS, R1p, and R1d in WT or RAP1-AID (FW6407; FW6895; FW7253; FW6208; FW6206;
FW6408) cells were not treated (NT) or treated with auxin (+IAA) for 4 hours, fixed, and imaged. Mean signals corrected for background (AU, arbitrary units) plotted + 95% confidence intervals (n = 50 cells per sample). (C) Representative images showing SUT129 promoter activity (YFP, noncoding), as described in B. The following cells were fixed and imaged: Wild-type cells harbouring no reporter (FW629), control reporter (pPS, FW6407), reporter with proximal Rap1 motifs (R1p, FW6895), reporter with distal Rap1 motifs (R1d, FW7253), RAP1-AID cells harbouring control reporter (pPS, FW6208), reporter with proximal Rap1 motifs (R1p, FW6206), reporter with proximal Rap1 motifs in reverse orientation (R1prv, FW6204), reporter with distal Rap1 motifs (R1d, FW6408). (A-C) Folkert van Werven generated the plasmids containing Rap1 motifs, and assisted with collection and analysis of fluorescence microscopy data.

4.4.3 Repression of divergent noncoding transcription is independent of Rap1 motif orientation

Certain sequence-specific transcription factors function in a motif orientation-dependent manner, depending on genomic context (Guo et al., 2015; Lis and Walther, 2016; Merkenschlager and Nora, 2016). For instance, the Rap1 protein can act as a “roadblock” to elongating RNA polymerase when its binding site motifs are in the forward orientation within a transcription unit (Briand, 2015; Candelli et al., 2018; Yarrington et al., 2012). Given that the divergent transcripts present at Rap1-regulated genes initiate extremely close to the Rap1 binding sites, it is possible that the Rap1 motifs themselves act as cryptic, directional core promoters. Therefore, I examined whether the orientation of the Rap1 motifs at gene promoters affects the likelihood of divergent transcription. Rap1 motifs at RP genes are present in five possible permutations depending on the orientation of the tandem or solo sequence motifs (Knight et al., 2014). However, variation in orientation or number of Rap1 motifs did not lead to differences in divergent transcription at gene promoters (Figure 4.3). I also tested this hypothesis outside the native sequence context of Rap1-regulated RP genes using the PPT1/SUT129 fluorescent reporter system. Rap1 motifs from the RPL43B locus were cloned into the PPT1/SUT129 promoter plasmid in a proximal location (20 bp from SUT129 TSS), and the plasmids were integrated into the RAP1-AID strain background. Rap1 motifs in forward (R1p) and reverse (R1prv) orientations both showed effective repression of divergent promoter activity (SUT129, YFP) in a Rap1-dependent manner (Figure 4.3). Therefore, the location, but not the orientation, of Rap1 binding site motifs at gene promoters is a key determinant of divergent transcript expression.

Figure 4.3 Repression of divergent noncoding transcription is independent of Rap1 motif orientation (A) Schematic diagram (left) showing different classes of RP gene promoters based on orientation of the Rap1 motif (red boxes), and the corresponding scatter plots of RNA expression changes in promoter Rap1 site windows on each strand described in Figure 3.6D, separated by class. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (B) Plots of normalised SUT129 (YFP, noncoding) and PPT1 (mCherry, coding) promoter activity, quantified in RAP1-AID cells harbouring the control reporter plasmid (pPS, FW6208), plasmid with R1p (FW6206), or R1prv (proximal Rap1 motifs in reverse orientation, FW6204). Mean signals have been corrected for background (AU, arbitrary units) and are plotted + 95% confidence intervals (error bars). N = 50 cells were quantified for each sample. NT, no treatment; +IAA, treated with auxin. Folkert van Werven generated the plasmids containing Rap1 motifs, and assisted with collection and analysis of fluorescence microscopy data.

4.4.4 Transcription start site sequencing (TSS-seq) identifies transcript initiation sites at single nucleotide resolution

The distance between the Rap1 binding site and the core promoter is a crucial factor in regulation of divergent transcription at the RPL43B and PPT1 promoters. Whether this principle applies to all genes at which Rap1 represses divergent noncoding transcription is not clear. I aimed to identify where Rap1-regulated transcripts initiate and simultaneously measure their expression levels in a genome-wide experiment. However, standard RNA sequencing library preparation workflows are designed to capture RNA fragments which map across gene bodies, and the coverage obtained drops significantly at 5’ and 3’ gene ends. Inherently, they are not designed to accurately identify transcription start sites (TSSs) at single nucleotide resolution and measure their expression, which is crucial for this investigation. Therefore, I implemented a transcription start site sequencing (TSS-seq) technique developed by my colleague Minghao Chia (Cell Fate and Gene Regulation Laboratory, manuscript in preparation). Various methods have been developed recently to measure TSS usage, several of which would be equally valid approaches for this purpose (Adjalley et al., 2016; Arribere and Gilbert, 2013; Malabat et al., 2015). However, techniques that require isolation of intact and biochemically active nuclei (e.g. GRO-cap and PRO-seq) are challenging in yeast because of difficulty in obtaining clear subcellular fractions (Core et al., 2008; Kwak et al., 2013). In this TSS-seq protocol, the transcription start sites of 5’-capped and polyadenylated transcripts are identified and quantified in a strand-specific manner. My previous RNA-seq analysis identified that Rap1-regulated noncoding transcripts are polyadenylated, and therefore the use of this protocol is appropriate. A brief summary of the TSS-seq protocol is included here (Figure 4.4A), and the full details can be found within the Methods section of this thesis.

From a pool of total cellular RNA extracted from yeast cells, polyadenylated (polyA) RNA was isolated using oligo-dT hybridisation-based capture on magnetic beads. The polyA RNA was fragmented to a mode size of ~200 nt, by incubation in a hot alkaline solution containing zinc ions. This process generates RNA fragments with 5’ hydroxyl (-OH) groups and 2’,3’ cyclic phosphate (P) ends. The fragments were then treated with alkaline phosphatase to remove the remaining 5’ phosphate (P) groups, leaving 5’ hydroxyl (-OH) groups incompetent for ligation. The alkaline phosphatase treatment also de-phosphorylated 2’,3’ cyclic phosphate ends. The genuine 5’ ends of polyadenylated transcripts are protected by 7-methyl-guanosine caps (m7Gppp, also known as the 5’ cap), which were then removed in a crucial step using a decapping enzyme (Cap-Clip™). The 5’ terminal phosphate (P) groups of previously capped transcripts were exposed and ligated to the 3’ hydroxyl group (-OH) of a custom DNA-RNA adapter oligonucleotide (5’ adapter) to mark the location of the first transcribed nucleotide. After reverse transcription into cDNA using random hexamer primers, the RNA strand was digested using RNases. Second strand synthesis was performed using a biotinylated oligonucleotide complementary to the 5’ adapter for priming, generating biotinylated double-stranded cDNA. These biotinylated cDNA fragments were purified using streptavidin affinity capture and incorporated as input material for end repair, A-tailing, and PCR amplification reactions as with standard RNA-seq library preparation protocols.

Single-end 75 bp Illumina sequencing was performed, at a depth of ~40 million reads per library, to allow sensitive detection of both canonical and cryptic TSSs. The single-end reads were filtered to analyse only fragments containing the 5’ adapter ligated to a cDNA fragment representing a genuine 5’ transcript end. By mapping the cDNA inserts and truncating the read coverage to the location of the 1st nucleotide immediately following the 5’ adapter, TSSs genome-wide can be readily identified and quantified (Figure 4.4B). I processed three biological replicate samples each from wild-type cells during exponential growth in nutrient-rich media, and from RAP1-AID cells two hours after auxin-induced Rap1 protein depletion. For one wild-type sample, half of the RNA was aliquoted separately and the key decapping enzyme treatment was left out in this “no decapping” sample, as a control for the non-specific background adapter ligation events. TSS-seq libraries were successfully generated from these samples and further analysed.
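The read-processing logic described above (keep only adapter-containing reads, then reduce each mapped insert to its first transcribed nucleotide) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used to generate the data: the adapter sequence, the tuple layout of an alignment, and all function names are hypothetical, and coordinates are assumed to be 0-based and half-open.

```python
from collections import Counter

# Placeholder adapter sequence for illustration only; the real 5' adapter
# sequence is given in the Methods and is NOT reproduced here.
HYPOTHETICAL_ADAPTER = "ACGT"


def strip_adapter(read, adapter=HYPOTHETICAL_ADAPTER):
    """Return the cDNA insert if the read starts with the adapter, else None."""
    if read.startswith(adapter):
        return read[len(adapter):]
    return None


def five_prime_position(start, end, strand):
    """Position of the first transcribed base for a 0-based, half-open alignment."""
    return start if strand == "+" else end - 1


def tss_counts(alignments):
    """Count reads per (chromosome, 5' position, strand)."""
    counts = Counter()
    for chrom, start, end, strand in alignments:
        counts[(chrom, five_prime_position(start, end, strand), strand)] += 1
    return counts


def counts_per_million(counts):
    """Scale raw counts to signal per million mapped 5' ends."""
    total = sum(counts.values())
    return {key: n * 1e6 / total for key, n in counts.items()}
```

Reducing each read to a single position is what distinguishes the TSS-seq coverage from RNA-seq coverage, where the whole aligned read contributes to the signal.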

Figure 4.4 Transcription start site sequencing (TSS-seq) identifies transcription initiation sites at single nucleotide resolution (A) Schematic diagram of TSS-seq method. Briefly, 3.5 – 6 μg of fragmented polyA RNA (red lines) was treated with alkaline phosphatase to remove 5’ phosphate groups (P) on uncapped or internal RNA fragments. CAP-CLIP Acid Pyrophosphatase was used to remove the 5’ m7GpppG “cap” (green circles) of transcripts, generating a 5’ monophosphate group. A 5’ DNA-RNA adapter oligonucleotide (blue lines) was ligated to the decapped, ligation-competent 5’ ends corresponding to genuinely capped transcripts. Random primers were used for
reverse transcription, followed by 2nd strand synthesis using a primer specific to the 5’ adapter. Sequencing libraries were then prepared for Illumina sequencing (single-end 75 bp, ~40 M reads per sample). See Materials and Methods section for full details. (B) Schematic diagram of extracted transcript 5’ end coverage from TSS-seq data, compared to typical coverage from RNA-seq data. Instead of plotting the coverage from the whole read as in RNA-seq plots (right), for TSS-seq the sequencing reads containing the adapter sequence are trimmed to plot coverage for only the 1st transcribed base after the adapter sequence corresponding to the genuine TSS of the transcript (left). TSS-seq maps genuine 5’ ends of polyA RNA at single nucleotide resolution in a strand-specific manner, and quantifies their expression level. The blue arrow below corresponds to a gene, and the black arrowhead corresponds to the main TSS. (C) Representative Bioanalyzer electropherogram traces showing the size distribution of RNA and DNA material at various stages of TSS-seq library preparation, for a representative sample (WT R1). The Bioanalyzer analysis was performed with the help of the Genomics Equipment Park Facility. (A-C) Minghao Chia developed the TSS-seq protocol and assisted with preparation of the TSS-seq libraries.

4.4.5 Validation of TSS-seq data

To assess the validity of the data generated by the TSS-seq method, I compared the TSS-seq signals obtained from wild-type (WT) cells to annotated coding gene TSSs in S. cerevisiae. The default TSS annotation in the Saccharomyces Genome Database (Cherry et al., 2012) and Ensembl reference genome (R64-1-1, release 90) (Zerbino et al., 2018) identifies the first nucleotide of the first coding exon, instead of the genuine transcript TSS upstream of the 5’ untranslated region (UTR). Therefore, an improved budding yeast TSS annotation was obtained from a published report describing a technique similar to TSS-seq (Park et al., 2014), and any missing TSSs were supplemented with the Ensembl annotations. The strand-specific TSS-seq data for wild-type (WT) and “No Decapping” samples were plotted for each gene, in a 1 kb window centred on the true annotated TSS (Figure 4.5A). The TSS-seq signals clearly accumulated at TSSs and the coding genes showed a large dynamic range in expression as expected for nutrient-rich conditions. The “No Decapping” sample did not show an enrichment of signal at TSSs. The biological replicates for each sample also showed good agreement across the TSS-seq experiment (Figure 4.5B). Next, I assessed whether the TSS-seq method could be used as a quantitative read-out of corresponding transcript expression. Expression values from TSS-seq and RNA-seq data were obtained for each gene and plotted against each other for cross-comparison (Figure 4.5C). The measurements from TSS-seq data correlated well with the RNA-seq data over a wide range of transcript expression, for each biological replicate sample. These data demonstrate that TSS-seq can be used to identify and quantitatively measure expression of transcription start sites genome-wide.
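The gene-level comparison between TSS-seq and RNA-seq can be illustrated with a short Python sketch: reads are counted in a window around each annotated TSS, scaled per million, and compared to RNA-seq values by rank correlation. The sketch assumes a symmetric 75 bp window on either side of the TSS and treats per-million scaling as the TPM conversion for 5’ end tags; both are assumptions of this example, as are the function names.

```python
import math


def window_count(position_counts, tss_pos, half_window=75):
    """Sum 5' end counts within half_window bp of an annotated TSS (one strand)."""
    return sum(n for pos, n in position_counts.items()
               if abs(pos - tss_pos) <= half_window)


def to_per_million(gene_counts):
    """Scale per-gene counts so that they sum to one million."""
    total = sum(gene_counts.values())
    return {gene: n * 1e6 / total for gene, n in gene_counts.items()}


def average_ranks(values):
    """Rank values from 1, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

A rank-based coefficient is a natural choice here because the two assays measure expression on different scales, and only a monotonic relationship between them is expected.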

Figure 4.5 Validation of TSS-seq data (A) Heat map of TSS-seq signal from wild-type cells (WT, FW629), showing 5’ coverage on the coding direction (sense) strand aligned at gene TSSs (left) (Park et al., 2014; Zerbino et al., 2018). As a control, TSS-seq data generated from a sample which was not treated with CAP-CLIP decapping enzyme (No Decapping) shows no enrichment for genuine capped 5’ transcript ends. Colour scale corresponds to log2(normalised signal). (B) Correlation matrix heat map for TSS-seq biological replicate samples from wild-type (WT, FW629) and RAP1-AID +IAA (FW3877) strains. Correlations are based on +75 bp windows centred on annotated TSSs for all genes. Colour scale corresponds to the Euclidean distance between the samples, based on all genes. (C) Scatter plots showing correlation between TSS-seq and RNA-seq data. TSS-seq measurements were calculated for each gene by counting the reads with the 1st transcribed nucleotide located within +75 bp of TSSs, on the coding gene strand. These were converted to TPM values and plotted (TSS-seq, y-axis) against RNA-seq TPM values (x-axis) for the corresponding gene. Individual replicate-level comparisons are shown. rs, Spearman’s correlation coefficient. (A-C) Harshil Patel performed read filtering, mapping, clustering, differential expression analysis, and correlation analysis to generate the processed data included in this figure, and generated the heat maps and scatter plots.

4.4.6 Rap1 represses divergent TSSs near its binding sites genome-wide

With the high resolution of the TSS-seq data, I could then investigate where Rap1-regulated transcripts initiated, relative to the Rap1 binding sites at gene promoters. I initially focused on the divergent noncoding RNAs characterised at the RPL43B and RPL40B genes, IRT2 and iMLP1. In wild-type cells, TSSs are present in clusters upstream of both the RPL43B and IRT2 transcripts (Figure 4.6A, top left). These data are consistent with published reports from both yeast and higher eukaryotes showing that transcription initiation at core promoters is heterogeneous (Adjalley et al., 2016; Arribere and Gilbert, 2013; Kwak et al., 2013; Lam et al., 2013; Malabat et al., 2015; Park et al., 2014; Tome et al., 2018). The TSS-seq data also confirmed that coding and divergent promoter transcripts initiate at separate core promoter elements oriented in opposing directions. The coding RPL43B TSS is approximately 275 nt away from the divergent IRT2 TSS, and the coding RPL40B TSS is approximately 191 nt away from the divergent iMLP1/SUT242 TSS (Figure 4.6A, top panels). These data support the interpretations from high-resolution ChIP-seq and ChIP-nexus data indicating that coding and divergent transcription events are driven by assembly of two independent pre-initiation complexes (PICs) (He et al., 2015; Rhee and Pugh, 2012).

Upon depletion of endogenously tagged Rap1-AID protein, the IRT2 TSS was up-regulated, whereas the expression of the RPL43B TSS decreased. I observed a similar response at the RPL40B locus. The TSS corresponding to the iMLP1 transcript isoform was up-regulated upon Rap1 depletion, while the RPL40B TSS was down-regulated (Figure 4.6A, top right). The expression of the distinct MLP1 mRNA TSS also decreased significantly upon Rap1 depletion, consistent with the reduction in Mlp1-V5 protein expression observed when the RPL40B promoter Rap1 sites were deleted (Figure 3.3F). However, examining the locations of the TSSs at high resolution uncovered a surprising finding. At the RPL43B and RPL40B promoters, and many more instances, the TSSs are located either at or extremely close to promoter Rap1 binding sites (Figure 4.6A, bottom panels). Typically, the divergent TSSs in each cluster are 0 – 50 bp from the Rap1 binding sites themselves. As the TSS-seq procedure isolates polyadenylated and capped RNAs, these TSS-seq signals are unlikely to originate from abortive initiation of RNA polymerase II.

Next, the changes in TSS signals between wild-type and Rap1-depleted cells were calculated. At promoters where divergent transcription was observed in total RNA-seq data (ASc1 and ASc2), antisense strand TSS-seq signals increased near the Rap1 binding sites (Figure 4.6B, Antisense). In contrast, there were fewer genes where TSS-seq signal increased near the Rap1 sites in ASc3. As expected, in the sense direction TSS signals decreased downstream of the Rap1 binding sites after Rap1 depletion, as coding gene TSS usage and expression were reduced (Figure 4.6B, Sense). However, sequences upstream of the canonical coding transcript core promoter within the NDR showed higher usage on the sense strand, indicating that Rap1 is also important for fidelity of TSS selection in the coding direction. To examine this relationship at all Rap1-regulated gene promoters, I measured the distance from each promoter Rap1 binding site to the closest TSS, and calculated the changes in TSS expression after Rap1 depletion. At 82% of Rap1-regulated promoters, the antisense TSS was nearest to the Rap1 binding site, consistent with the location of the Rap1 motifs in the 5’ (upstream) edge of the nucleosome-depleted region. Most cryptic transcription initiation sites were within 50 to 100 bp of Rap1 binding sites and directly repressed by Rap1 (Figure 4.6C). Approximately half of Rap1-regulated promoters displayed more than 2-fold higher TSS-seq signals within 50 bp of the Rap1 binding site after Rap1 depletion (Figure 4.6D). In summary, Rap1 represses initiation of divergent transcription near its cis-regulatory elements at gene promoters.
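The distance and fold-change measurements summarised above can be sketched in Python as follows. The example assumes single-coordinate positions for binding sites and TSSs on one chromosome, 50 bp distance bins, and a pseudocount of 1 to avoid division by zero. These choices and the function names are illustrative assumptions, and the fold changes reported in this chapter came from a dedicated differential expression analysis rather than from this calculation.

```python
import math


def nearest_tss(site_pos, tss_positions):
    """Return the TSS position closest to a binding site on the same chromosome."""
    return min(tss_positions, key=lambda pos: abs(pos - site_pos))


def distance_bin(distance, bin_size=50):
    """Assign an absolute distance to a half-open bin, e.g. 30 -> (0, 50)."""
    lower = (abs(distance) // bin_size) * bin_size
    return (lower, lower + bin_size)


def log2_fold_change(depleted, control, pseudocount=1.0):
    """log2 ratio of depleted over control signal; the pseudocount is an assumption."""
    return math.log2((depleted + pseudocount) / (control + pseudocount))


# Hypothetical coordinates and signals, for illustration only.
site = 1000
tss = nearest_tss(site, [900, 1030, 1200])
print(tss, distance_bin(tss - site), log2_fold_change(depleted=7, control=1))
```

On this scale, a more than 2-fold increase in TSS-seq signal after Rap1 depletion corresponds to a log2 fold change greater than 1.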

Figure 4.6 Rap1 represses divergent TSSs near its binding sites genome-wide (A) TSS-seq data plotted for the RPL43B and RPL40B loci in wild-type (WT, FW629) and RAP1-AID +IAA (FW3877) cells. TSS-seq signal per million reads and normalised total RNA-seq coverage are plotted for both strands. Bottom panels show the same data, zoomed in to show the divergent transcript TSSs located at the promoter Rap1 binding sites (note the scale). (B) Difference heat map showing changes in TSS-seq signals near 141 promoter Rap1 binding sites, clustered and ordered as in Figure 3.6B. Pink regions represent higher TSS-seq signal in RAP1-AID +IAA compared to WT, cyan regions represent lower TSS-seq signal (5 bp bin size). (C) Histogram showing distribution of TSS distance to promoter Rap1 binding sites, from TSS-seq data. The distance from the Rap1 binding sites (n = 141) to the nearest TSS was measured, and counts are plotted in 50 bp bins. (D) Scatter plots showing changes in expression of cryptic TSSs near promoter Rap1 sites, comparing RAP1-AID +IAA versus wild-type (WT) control samples. TSSs were classified into bins of 50 bp, increasing in distance to the promoter Rap1 binding site (bs). Fold change values were calculated from three independent experiments. Black horizontal lines show mean values in each bin. (A-D) Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure, generated the TSS-seq and RNA-seq coverage tracks, and generated the heat maps.

4.4.7 Investigation of candidate transcription factors that may control divergent transcription

Rap1 is able to control divergent transcription outside of its native genomic context (Figure 4.2). I asked whether other sequence-specific transcription factors could also control promoter directionality in a similar way. I therefore decided to exploit the PPT1/SUT129 fluorescent reporter system for divergent transcription (pPS plasmid) to investigate the properties of candidate transcription factors. Several transcription factors in yeast share similar characteristics with Rap1, including Abf1, Reb1, and Cbf1 (Bosio et al., 2017; Hogues et al., 2008). Abf1 regulates a small subset (~10) of RP genes in S. cerevisiae (Fermi et al., 2016). Like Abf1, Reb1, and Rap1, Cbf1 possesses the ability to displace nucleosomes (Yan et al., 2018). Gcn4, Hap4, Cat8, and Gal4 represent well-characterised sequence-specific transcription factors that are active in different metabolic states (Forsburg and Guarente, 1989; Haurie et al., 2001; Hedges et al., 1995; Laughon and Gesteland, 1982; Natarajan et al., 2001; Rawal et al., 2018). The DNA binding site sequence motif for each transcription factor (obtained from YeTFaSCo) was introduced 20 bp away from the SUT129 TSS in the pPS reporter plasmid (de Boer and Hughes, 2012). The motifs were oriented in the “forward” direction relative to the mCherry sequence, i.e. the “coding” direction of the promoter (Figure 4.7A). These plasmids containing transcription factor binding sites were then introduced into a wild-type strain background, or strains containing gene deletion or auxin-inducible depletion alleles for each candidate. The cells were grown in rich medium until exponential growth phase, and samples were collected to compare fluorescence signals after deletion or depletion of each transcription factor. Strains containing plasmids with Cbf1, Gcr1, Abf1, and Reb1 motifs showed noticeably higher YFP signal after deletion or depletion of the transcription factor, but those for Gcn4, Hap4, Cat8, and Gal4 did not (Figure 4.7B). For Gcr1, Abf1, and Reb1, the coding direction promoter activity (mCherry signal) did not decrease with transcription factor deletion or depletion, similar to the behaviour of Rap1 previously observed using this assay (Figure 4.7C). This evidence suggests that at some gene promoters in yeast, transcription from the coding and divergent core promoters is uncoupled. In fact, the sequence-specific transcription factors Tbf1 and Mcm1 can decouple regulation of divergently oriented coding gene pairs in yeast (Yan et al., 2015). These data indicate that diverse sequence-specific transcription factors can repress divergent transcription outside of their native genomic context. Whether these factors control promoter directionality at their bona fide target genes throughout the genome remains a subject for further investigation.

Figure 4.7 Investigation of candidate transcription factors in yeast that may control divergent transcription (A) Schematic diagram of fluorescent reporter constructs used to assess the ability of transcription factors (TFs) to repress divergent transcription from a position proximal to the divergent TSS. Short DNA sequences containing transcription factor target motifs (TFBS, transcription factor binding site) for each TF were cloned at a position 20 bp from the TSS of the divergent transcript SUT129. Constructs were transformed into wild-type and TF mutant cells, and YFP signal was measured as a readout for divergent promoter activity. The pPS reporter plasmid was described previously (Marquardt et al., 2014). (B) Plots displaying normalised YFP and mCherry signal, quantified in cells harbouring fluorescent reporter constructs described in A. Cells were grown to exponential phase in nutrient-rich media, and for AID-tagged strains, treated with auxin (+IAA) for 4 hours prior to fixation and imaging. N = 50 cells measured per condition, measurements taken from one representative experiment. Background-corrected signal is plotted in arbitrary units (AU), showing mean + 95% confidence intervals (CI) for each group. See Table 2.1 in Materials and Methods section for full list of strains. (A-B) Folkert van Werven generated the plasmids containing transcription factor motifs, and assisted with collection and analysis of fluorescence microscopy data.
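As an illustration of the summary statistics reported in this legend, the sketch below computes a background-corrected mean and a 95% confidence interval for a group of single-cell measurements. The legend does not state how the interval was calculated, so the normal approximation (1.96 × SEM) used here is an assumption, as are the function names and the idea of a single background value per image.

```python
import math
import statistics


def background_corrected(signals, background):
    """Subtract one background value (assumed per image) from each cell signal."""
    return [s - background for s in signals]


def mean_ci95(values):
    """Return the mean and the half-width of a 95% confidence interval.

    Uses the normal approximation 1.96 * SEM, which is an assumption;
    a t-distribution based interval would be slightly wider for small n.
    """
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, 1.96 * sem


# Toy example with made-up fluorescence values (arbitrary units).
corrected = background_corrected([11.0, 12.0, 13.0, 14.0, 15.0], 10.0)
mean, half_width = mean_ci95(corrected)
```

The interval would then be plotted as mean ± half_width for each strain and condition.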

4.4.8 The Rap1 silencing function is not crucial for repression of divergent noncoding transcripts

Rap1 offers a unique opportunity to identify and understand the mechanisms through which sequence-specific transcription factors control promoter directionality. Notably, Rap1 acts as a transcriptional activator at RP and glycolytic genes by recruiting general transcription machinery and RNA polymerase II (Garbett et al., 2007; Johnson and Weil, 2017; Layer et al., 2010; Papai et al., 2010). However, Rap1 also mediates transcriptional silencing and telomere regulation at the hidden mating type (HM) loci and chromosome ends (Kurtz and Shore, 1991; Moretti et al., 1994; Sussel and Shore, 1991). It performs these repressive functions through recruitment of silencing proteins such as Sir3, Sir4, Rif1, and Rif2 (Hardy et al., 1992b; Shi et al., 2013). A recent report identified that Hst3 and Hst4, members of the Sirt6 family of histone deacetylases, repress divergent transcription and R-loop formation, and maintain genomic stability (Feldman and Peterson, 2019). Although there is little evidence for recruitment of silencing proteins to these highly expressed gene promoters, I wondered whether the interactions between Rap1 and its known transcriptional co-repressors might contribute to repression of divergent transcription. To assess this, I performed a screen using a set of 46 surface mutations generated in the Rap1 C-terminal domain, a region required for interaction with silencing factors and telomeric regulators. These mutants have already been characterised for defects in HM silencing, telomeric silencing, and telomere length regulation (Feeser and Wolberger, 2008). I sub-cloned the mutant library into single copy integration plasmids expressing AID-insensitive full-length Rap1 protein, and introduced these constructs into the RAP1-AID strain background to enable screening. Cells were grown in nutrient-rich medium to exponential phase, and endogenously tagged Rap1-AID protein was then depleted. Total RNA samples were extracted and processed for northern blotting, to detect aberrant expression of the characterised IRT2 divergent transcript (Figure 4.8A). Mutation of surface residues in the Rap1 C-terminal domain resulted in only mild de-repression of IRT2, with the exception of the Δ672-827 mutant in which the entire C-terminal domain was deleted. The experiment was repeated for any mutants that showed a minor phenotype, but the levels of IRT2 expression were comparable to the full-length Rap1 control (Figure 4.8B). These data suggest that Rap1 does not require its complement of silencing proteins present at HM loci and telomeres to mediate repression of divergent noncoding transcription.

Figure 4.8 The Rap1 silencing function is not crucial for repression of divergent noncoding transcripts

(A) Northern blots showing expression of IRT2 transcript in Rap1 point and patch mutants, previously characterised for telomere and hidden mating type locus silencing (Feeser and Wolberger, 2008). Mature rRNA bands stained using methylene blue are shown as loading controls. Rap1 expression constructs containing point or patch mutations were generated using Gibson-style cloning and transformed into RAP1-AID cells. RNA samples were extracted from RAP1-AID cells expressing Rap1 mutant proteins 2 hours after addition of IAA. (B) For any mutants in A that showed any mild de-repression of IRT2, the experiment was repeated, with an additional northern blot for SNR190 to control for loading. (A-B) The original RAP1 plasmids containing mutations within the CTD were obtained from Cynthia Wolberger (Johns Hopkins University), and fragments of DNA encoding the Rap1 CTD were cloned into single copy integration expression vectors.

4.4.9 The Rap1 C-terminal domain is important for repression of divergent noncoding transcription

Several functional domains of Rap1 have been well characterised and are important in gene activation and repression (see Introduction 1.5.1 for details). However, the regions of the Rap1 protein that are essential for repression of divergent transcription have not been identified. Annotation of these crucial regions of Rap1 may uncover the mechanisms underlying its transcriptional repression function at gene promoters. Therefore, I examined whether specific domains of Rap1 were required for control of promoter divergent transcripts. Expression constructs containing deletions in the amino- (N-) and carboxy- (C-) terminal domains of Rap1 were sub-cloned into single copy integration plasmids, and the truncated proteins were expressed in RAP1-AID cells (Figure 4.9A) (Garbett et al., 2007). As Rap1 is an essential protein, I first assessed the viability of these mutants (Figure 4.9B). The C-terminal domain of Rap1 is crucial for cell viability, consistent with its important roles in recruiting transcriptional machinery and silencing proteins to various loci. In contrast, deletion of the N-terminal domain did not compromise cell viability. The mutant constructs were stably expressed from integrated plasmid constructs in RAP1-AID cells, and were not sensitive to auxin-induced protein depletion (Figure 4.9C). Next, I collected RNA samples before and after depletion of Rap1-AID during exponential growth, and performed northern blotting for IRT2 and iMLP1. As expected, IRT2 and iMLP1 were de-repressed after Rap1-AID depletion in a strain containing an empty expression vector (EV), but expression of full-length (FL, 1-827) Rap1 was able to rescue the transcriptional de-repression (Figure 4.9D-E). Deletion of the C-terminal domain (ΔC) or the N- and C-terminal domains (ΔN ΔC) together induced aberrant IRT2 and iMLP1 expression, but deletion of the N-terminal domain (ΔN) alone did not. It is possible that such large truncations of Rap1 could de-stabilise the mutant proteins, leading to lower DNA binding occupancy. To address this, I measured the bulk protein stability of the mutant Rap1 proteins by inhibiting translation with cycloheximide (+CHX), 1 hour after depletion of Rap1-AID, and assessing Rap1 protein abundance. However, the mutant Rap1 fragments were as stable as full-length Rap1 (Figure 4.9F). These data reveal that the C-terminal domain of Rap1 is important for repression of divergent noncoding transcription.

Figure 4.9 The Rap1 C-terminal domain is important for repression of divergent noncoding transcription

(A) Schematic diagram of domains in Rap1 and mutants with amino (N)-terminal or carboxy (C)-terminal truncations. The residues corresponding to the DNA-binding domain (DBD), toxicity domain (Tox), activation domain (AD), and Rap1 C-terminal interacting domain (RCT) are displayed. (B) Growth spot assay showing viability of co-expressed Rap1 mutants in RAP1-AID strain background, with (+IAA) and without (+DMSO) the addition of auxin. Cells were grown to saturation in YPD media overnight, then adjusted to equivalent optical density (OD600 = 0.4). Serial 5-fold dilutions were spotted onto YPD agar plates with IAA or DMSO. Plates were incubated at 30 °C for 2 days prior to imaging. Strains: RAP1-AID (FW3877) cells also expressing full length Rap1 (FL) (FW4869), ΔN (FW4871), ΔC (FW4874), ΔN ΔC (FW4876), or containing an empty expression vector (EV) (FW4878). (C) Western blot showing co-expression of Rap1 large truncation mutants and insensitivity to auxin-inducible depletion of Rap1-AID. The strains from (B) were grown to exponential phase and treated with either DMSO (-) or IAA (+) for 2 hours. V5-AID-tagged proteins were detected using an anti-V5 antibody, and auxin-insensitive Rap1 mutant proteins were detected using an anti-Myc tag antibody. Hxk1 blot shown as loading control. (D) Northern blot showing expression of IRT2 and iMLP1 transcripts in rescue experiment using Rap1 mutant proteins, after depletion of Rap1-AID. Cells were grown and treated as described in (C). 32P-labelled probes directed to IRT2 or SUT242/iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples. (E) Bar plot showing quantification of IRT2 and iMLP1 expression described in (D). Samples from two independent experiments were processed and mean values plus standard error of the mean are displayed (+SEM). Signals for IRT2 and iMLP1 were normalised over SNR190 to control for total amount of RNA loaded.
The normalised signal for RAP1-AID +IAA containing empty vector (EV) was set to 1 to control for technical variation between experiments and blots. (F) Representative western blot showing that the Rap1 large truncation mutant proteins are as stable as full-length Rap1 protein. Cells were grown and treated as described in (C), however cycloheximide (CHX, 200 μg/mL) was added 1 hour after addition of IAA to inhibit protein translation and assess stability of the existing cellular pool of proteins. V5-tagged proteins were detected using an anti-V5 antibody, and auxin-insensitive Rap1 mutant proteins were detected using an anti-Myc tag antibody. Two exposures of different durations are shown for the anti-Myc tag blot. Hxk1 blot shown as loading control. (A-F) The original RAP1 plasmids containing truncation mutations were obtained from Amanda Johnson and Tony Weil (Vanderbilt University), and DNA fragments encoding Rap1 mutants were cloned into single copy integration expression vectors.

4.4.10 Tethering of the Rap1 C-terminal domain to DNA partly suppresses divergent transcription

To validate these findings, I investigated whether the Rap1 C-terminal domain itself was sufficient to confer transcriptional repression when tethered near a divergent TSS. I designed a system to bypass DNA recruitment via the Rap1 DNA binding domain. The Rap1 motifs in the RPL43B promoter were replaced with 5 copies of the LexA operator (LexO) motif using the Cre-LoxP system (Figure 4.10A) (Gueldener et al., 2002). I then fused the ~35 kDa Rap1 C-terminal domain (CTD) to the LexA DNA-binding protein from E. coli, and expressed this fusion construct in strains with wild-type or LexO mutant promoters (Figure 4.10B). In addition, I transformed the wild-type and mutant promoter strains with plasmids containing expression constructs for LexA fused to a ~25 kDa tandem affinity purification (TAP) tag, to separate the effects of increased DNA-binding protein size and fusion domain function. Introduction of the LexO sites did not interfere with IRT2 expression. Tethering the Rap1 CTD to the RPL43B promoter was sufficient to partially repress IRT2 expression when the Rap1 binding sites were replaced (Figure 4.10C-D). However, the sample corresponding to the LexA-TAP fusion protein showed a similar level of IRT2 expression to the strain expressing LexA only, showing that down-regulation of IRT2 is specific to the Rap1 CTD. In conclusion, these data show that the Rap1 CTD is required but only partly sufficient to repress IRT2 expression when tethered to DNA near a divergent core promoter.

Figure 4.10 Tethering of the Rap1 C-terminal domain to DNA partly suppresses divergent transcription (A) Schematic diagram describing use of LexO-LexA system to tether the Rap1 CTD to the RPL43B/IRT2 promoter, via the LexA DNA-binding protein.

(B) Western blot showing expression of LexA-Rap1 CTD-V5 fusion protein, in strains with Rap1 binding sites at the RPL43B promoter replaced with 5 copies of the LexA Operator (LexO: +) sequence, or with the Rap1 sites intact (LexO: -). The expression construct for LexA alone does not contain a V5 epitope tag. V5-tagged proteins were detected using an anti-V5 antibody. Cells were grown to exponential phase in synthetic complete medium without histidine (SC-HIS) to retain the non-integrated expression plasmids under selection using an auxotrophic selection marker. Strains: Rap1 sites intact with LexA protein (FW4714), Rap1 sites intact with LexA-Rap1 CTD-V5 fusion protein (FW4715), Rap1 sites replaced by LexO with LexA protein (FW4716), Rap1 sites replaced by LexO with LexA-Rap1 CTD-V5 fusion protein (FW4717). (C) Representative northern blot showing partial repression of IRT2 when the CTD of Rap1 is tethered to the RPL43B promoter via the LexA DNA-binding protein. Cells were cultured and treated as described in (B), with the addition of strains (lanes 5 and 6) which harbour expression constructs for LexA fused to a tandem affinity purification (TAP) tag (21 kDa) in cells with intact Rap1 sites (FW5086) or Rap1 sites replaced with LexO sequences (FW5087). 32P-labelled probes targeting IRT2 or SUT242/iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples. (D) Bar plot showing quantification of IRT2 expression, from three biological replicate experiments as described in (C). Samples were processed and mean values plus standard error of the mean are displayed (+SEM). Signals for IRT2 were normalised over SNR190 to control for the total amount of RNA loaded. The normalised signal in lane 3 (Rap1 sites replaced with LexO, LexA protein only expressed, FW4716) was set to 1 to control for technical variation between experiments and blots.
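The quantification described in panel (D) follows a two-step normalisation: each transcript signal is divided by its loading control, and the resulting ratio is scaled to a reference lane set to 1. The Python sketch below illustrates this arithmetic only. The lane labels, band intensities, and function names are hypothetical, and the sketch makes no claim about the image quantification software that was actually used.

```python
import math
import statistics


def normalise_blot(target, loading, reference_lane):
    """Scale loading-normalised band signals so the reference lane equals 1.

    target and loading map lane labels to band intensities for the
    transcript of interest (e.g. IRT2) and the loading control
    (e.g. SNR190), respectively.
    """
    ratios = {lane: target[lane] / loading[lane] for lane in target}
    reference = ratios[reference_lane]
    return {lane: ratio / reference for lane, ratio in ratios.items()}


def mean_sem(values):
    """Mean and standard error of the mean across replicate experiments."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Toy example: one blot with made-up intensities, lane 3 as reference.
relative = normalise_blot(
    target={"lane3": 200.0, "lane4": 50.0},
    loading={"lane3": 100.0, "lane4": 50.0},
    reference_lane="lane3",
)
# Made-up relative values for one lane across three replicate blots.
lane4_mean, lane4_sem = mean_sem([0.4, 0.5, 0.6])
```

Setting the reference lane to 1 within each blot before averaging removes blot-to-blot differences in exposure, which is why the reference lane itself carries no error bar.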

4.4.11 A small region within the Rap1 CTD comprising residues 631-696 is important for repression of divergent noncoding transcription

The data described above indicate that the Rap1 CTD is important for control of divergent noncoding transcription. To identify which of the annotated regions of Rap1 contribute to this function, I generated a set of Rap1 domain deletion expression constructs derived from previously published mutants, by sub-cloning them into single copy integration plasmids (Figure 4.11A, top) (Layer et al., 2010). The contribution of each Rap1 domain towards cellular viability was assessed using a growth spot assay under auxin treatment (Figure 4.11A). Deletion of the Rap1 DBD and regions within the CTD (Δ764-827, Δ631-696) resulted in severely compromised cellular viability. These mutant proteins were successfully expressed in the RAP1-AID strain background, and remained insensitive to auxin-induced protein depletion as expected (Figure 4.11B). Next, I characterised the expression of the divergent transcripts IRT2 and iMLP1 after auxin-induced depletion of Rap1-AID, leaving the cells with just the mutant Rap1 proteins (Figure 4.11C). The Toxicity (Tox) and Activation (AD) domains were not important for repression of IRT2 and iMLP1. However, deletion of the DNA-binding domain (ΔDBD) and regions within the C-terminus of Rap1 (Δ764-827, Δ631-696) resulted in aberrant expression of IRT2 and iMLP1.

To understand how these mutations compromise transcriptional repression, I then assessed whether the Rap1 mutants could still bind to Rap1 target motifs at the endogenous RPL43B (IRT2) and RPL40B (iMLP1) promoters. I generated expression constructs of the Rap1 mutants tagged with the V5 epitope for chromatin immunoprecipitation (ChIP), and successfully introduced these into the RAP1-AID strain background (Figure 4.11E). Of the mutants that showed de-repression of IRT2 and iMLP1 (ΔDBD, Δ764-827, and Δ631-696), stable DNA binding could only be detected for the Rap1 Δ631-696 mutant (Figure 4.11D). Finally, I asked whether the mutations introduced had compromised Rap1 protein stability, leading to reduced occupancy on promoter DNA. I assessed the stability of the Δ631-696 mutant after translational inhibition with cycloheximide, in comparison with the repression-competent full length (FL) and activation domain deletion (ΔAD) Rap1 proteins (Figure 4.11F). The Rap1 Δ631-696 protein, which was unable to repress divergent transcription, was very unstable compared to the FL and ΔAD proteins (Figure 4.11G). This finding supports the “steric hindrance” hypothesis, in which stable Rap1 binding to DNA is required to confer repression of proximal core promoters that drive divergent transcription. In conclusion, a small region comprising residues 631-696 in the Rap1 CTD is required for regulation of divergent transcription. Rap1 likely must bind stably to DNA to confer promoter directionality in this model.
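For orientation, the sketch below shows one common way to express ChIP-qPCR signal at a target promoter relative to a control region, such as the ACT1 3’ end used for normalisation in Figure 4.11D. The exact calculation used for the figure is not stated in this section, so the ΔΔCt formulation, the assumption of 100% primer efficiency, and the function name are all assumptions for illustration.

```python
def fold_enrichment(ct_ip_target, ct_input_target,
                    ct_ip_control, ct_input_control):
    """Fold enrichment of IP signal at a target over a control region.

    Assumes perfect primer efficiency (signal doubles each cycle), so
    IP/input at a locus is proportional to 2 ** (Ct_input - Ct_IP).
    The ratio of target to control then gives the fold enrichment.
    """
    delta_target = ct_input_target - ct_ip_target
    delta_control = ct_input_control - ct_ip_control
    return 2.0 ** (delta_target - delta_control)


# Toy example with made-up Ct values.
enriched = fold_enrichment(24.0, 26.0, 29.0, 26.0)
not_enriched = fold_enrichment(28.0, 26.0, 28.0, 26.0)
```

Under these assumptions a value near 1 indicates no enrichment over the control region, which is the expected outcome for a mutant that cannot bind DNA.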

Figure 4.11 A small region within the Rap1 CTD comprising residues 631-696 is important for repression of divergent noncoding transcription (A) Growth spot assay showing viability of co-expressed Rap1 mutants in RAP1-AID strain background, with (+IAA) and without (+DMSO) the addition of auxin. Cells were grown to saturation in YPD media overnight, then adjusted to equivalent optical density (OD600 = 0.4). Serial 5-fold dilutions were spotted onto YPD agar plates with IAA or DMSO. Plates were incubated at 30 °C for 2 days prior to imaging. Strains: RAP1-AID (FW3877) cells containing single copy integration vectors expressing full-length Rap1 (FL, FW4948), DNA-binding domain deletion (ΔDBD, FW4950), toxicity domain deletion (ΔTox, FW4952), activation domain deletion (ΔAD, FW4954), residues 764 to 827 deletion (Δ764-827, FW4958), residues 631 to 696 deletion (Δ631-696, FW4960), or containing an empty vector (EV, FW5145). (B) Western blot showing co-expression of Rap1 domain deletion mutants and insensitivity to auxin-inducible depletion of Rap1-AID. The strains from (A) were grown to exponential phase and treated with either DMSO (-) or IAA (+) for 2 hours. V5-AID-tagged proteins were detected using an anti-V5 antibody, and auxin-insensitive Rap1 mutant proteins were detected using an anti-HA (haemagglutinin) tag antibody. Hxk1 blot shown as loading control. (C) Northern blot showing expression of IRT2 and iMLP1 transcripts in rescue experiment using Rap1 mutant proteins, after depletion of Rap1-AID. Cells were grown and treated as described in (B). 32P-labelled probes targeting IRT2 or SUT242/iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples. (D) Bar plot of ChIP-qPCR data assessing ability of Rap1 domain deletion mutant proteins to bind target sites at IRT2 (RPL43B) and iMLP1 (RPL40B) promoters.
To facilitate use of the V5 antibody for ChIP, mutant Rap1 constructs were also tagged with V5-epitope and expressed in a strain background containing a Myc tag fused to Rap1-AID instead of a duplicate V5 tag. Anti-V5 antibodies were used to immunoprecipitate mutant Rap1-V5 bound DNA fragments. Rap1 binding at IRT2 and iMLP1 promoters was measured by qPCR, and the signals were normalised over a region at the ACT1 gene 3’ end. The mean fold enrichment value from three biological replicates plus the standard error of the mean (+SEM) is plotted. Strains: RAP1-AID-MYC cells expressing full length Rap1 (FL, FW5420), DNA-binding domain deletion (ΔDBD, FW5393), toxicity domain deletion (ΔTox, FW5394), activation domain deletion (ΔAD, FW5424), residues 764 to 827 deletion (Δ764-827, FW5395), residues 631-696 deletion (Δ631-696, FW5396), or containing an empty vector (EV, FW5399). (E) Western blot showing co-expression of Rap1 domain deletion mutants and insensitivity to auxin-inducible depletion of Rap1-AID, for the ChIP experiment described in (D). (F) Plot of western blot quantification showing the relative protein abundance of the Rap1 Δ631-696 mutant protein compared to full-length Rap1 or Rap1 ΔAD. Cells were grown and treated as described in (B), however cycloheximide (CHX, 200 μg/mL) was added 1 hour after addition of IAA to inhibit protein translation and assess stability of the existing cellular pool of proteins. Samples from three independent experiments were processed and mean values +SEM are displayed. Signals for mutant Rap1 proteins were first normalised to the Hxk1 loading control, and to control for the variation in starting protein expression levels, the normalised signal for each mutant protein at 0 min after addition of CHX was set to 1. NB: error bars are plotted but too small to visualise for Δ631-696 mutant. (G) Representative western blot showing that the Δ631-696 Rap1 mutant protein is unstable compared to full-length Rap1 or Rap1 ΔAD. Cells were grown and treated as described in (B), however cycloheximide (CHX, 200 μg/mL) was added 1 hour after addition of IAA to inhibit protein translation and assess stability of the existing cellular pool of proteins. V5-tagged proteins were detected using an anti-V5 antibody, and auxin-insensitive Rap1 mutant proteins were detected using an anti-HA tag antibody. Hxk1 blot shown as loading control. (A-G) The original RAP1 plasmids containing domain deletion mutations were obtained from Amanda Johnson and Tony Weil (Vanderbilt University), and DNA fragments encoding Rap1 mutants were cloned into single copy integration expression vectors.
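The relative abundance values in panel (F) are described as Hxk1-normalised signals scaled to the 0 min time point. A minimal Python sketch of that scaling is given below; the band intensities are made up and the function name is hypothetical.

```python
def relative_abundance(protein, loading):
    """Loading-normalised protein signal over a chase, scaled to t = 0.

    protein and loading are band intensities ordered by time after CHX
    addition, with the first entry being the 0 min sample.
    """
    normalised = [p / l for p, l in zip(protein, loading)]
    return [n / normalised[0] for n in normalised]


# Toy example: made-up intensities at three chase time points.
chase = relative_abundance([100.0, 50.0, 20.0], [10.0, 10.0, 8.0])
```

Scaling each protein to its own starting level allows decay to be compared between mutants that are expressed at different steady-state levels.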

4.4.12 Tethering a bulky protein to RP gene promoters via the Rap1 DNA-binding domain is not sufficient to repress divergent transcript expression

The extremely proximal location of the Rap1 binding sites to divergent core promoters suggests that stable physical association between Rap1 and its target sequences is important for transcriptional repression. Rap1 may specifically limit or reduce the recruitment of transcriptional activators and general transcription factors to divergent core promoters by steric hindrance. Therefore, I examined whether the size of the DNA-bound protein near the core promoter is a crucial factor for this regulation. To test this, I generated an additional construct to augment the large Rap1 domain deletion mutants already characterised, in which the DNA-binding domain was fused in frame to 3 tandem copies of a sequence encoding green fluorescent protein (GFP) (Figure 4.12A). These additional fusion domains should increase the molecular weight to 121.5 kDa, which is slightly larger than the full length Rap1 protein. After SDS-PAGE, the full-length (FL) Rap1 protein normally migrates a shorter distance than expected, approximately corresponding to an apparent molecular weight of 150 kDa despite its actual molecular weight of 102.5 kDa. Expression constructs for full-length (FL) Rap1, the Rap1 DNA-binding domain (DBD), and DBD fused to three GFPs in tandem (DBD-3x-GFP) were introduced into the RAP1-AID strain background successfully. These constructs were insensitive to auxin-inducible depletion (Figure 4.12B). I then measured IRT2 and iMLP1 expression in samples collected during exponential growth, with and without depletion of Rap1-AID (Figure 4.12C). As previously observed, expression of the full-length Rap1 protein was able to limit aberrant expression of IRT2 and iMLP1, whereas expression of the DBD alone was not. The levels of IRT2 and iMLP1 expression in samples corresponding to the DBD-3x-GFP construct were similar to those for DBD and the empty vector (EV) control. In vivo, the N- and C-terminal domains of Rap1 are reportedly dispensable for chromatin opening (Yu et al., 2001), and expression of the Rap1 DBD is sufficient to partly restore nucleosome positioning and ectopic transcription initiation induced after Rap1 depletion (Challal et al., 2018). Taken together, these data suggest that the identity and other inherent properties of DNA-binding proteins are important for control of divergent transcription. However, the size of the DNA-binding entity does not appear to be a crucial determinant of the ability to repress nearby core promoters.

Figure 4.12 Tethering a bulky protein to RP gene promoters via the Rap1 DNA-binding domain is not sufficient to repress divergent transcript expression (A) Table listing the molecular weights of full length Rap1 protein, Rap1 DNA-binding domain (DBD), or DBD-green fluorescent protein (GFP) fusion constructs.

(B) Western blot showing co-expression of full-length (FL) Rap1, Rap1 DNA-binding domain (DBD), and Rap1 DBD fused to three green fluorescent proteins in tandem (3x-GFP), and their insensitivity to auxin-inducible depletion of Rap1-AID. These strains were grown to exponential phase and treated with either DMSO (-) or IAA (+) for 2 hours. Myc-tagged and GFP-tagged proteins were detected using α-Myc or α-GFP antibodies, respectively, and Hxk1 blots are shown as loading controls. Strains: RAP1-AID background cells containing an empty expression vector (FW5145), or Rap1 Full Length (FW5129), Rap1 DBD (FW5141), or Rap1 DBD-3x-GFP (FW7156) expression vectors. MW, molecular weight. Two sets of non-specific background bands are visible across all samples for the α-GFP blot, due to some cross-hybridisation from the α-GFP antibody. The full-length (FL) Rap1 protein normally migrates a shorter distance than expected during SDS-PAGE, approximately corresponding to an apparent molecular weight of 150 kDa despite its actual molecular weight of 102.5 kDa. (C) Northern blot showing expression of IRT2 and iMLP1 transcripts in rescue experiment using Rap1 mutant proteins, after depletion of Rap1-AID. Cells were grown and treated as described in (B). 32P-labelled probes targeting IRT2 or iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples.

4.4.13 Investigation of divergent TSS usage and regulation using the CRISPR interference system

The discovery of the CRISPR (clustered regularly interspaced short palindromic repeats) bacterial immunity system has generated revolutionary experimental tools for genome editing and transcriptional regulation (Cong et al., 2013; Gilbert et al., 2013; Gu et al., 2018; Jinek et al., 2012; Myers et al., 2018; Qi et al., 2013; Shariati et al., 2019; Tsui et al., 2018). The Cas9 RNA-guided endonuclease protein from Streptococcus pyogenes can be “programmed” and recruited to specific DNA motifs in vivo when reconstituted in other organisms, including yeast (Cong et al., 2013). The CRISPR interference (CRISPRi) system exploits this by targeting a catalytically inactivated form of the Cas9 endonuclease (dead Cas9, dCas9) to core promoters of coding genes. This recruitment effectively down-regulates the expression of targeted genes (Gilbert et al., 2013; Qi et al., 2013). Therefore, I decided to introduce the CRISPRi system to target the core promoters of Rap1-regulated divergent transcripts, and test the “steric hindrance” model of transcriptional repression with an exogenous protein (Figure 4.13A).

At Rap1-regulated gene promoters with divergent transcripts, I demonstrated that Rap1 binds to its target motifs at or near the divergent TSS. This
binding is required for effective repression of divergent RNA expression. In the RAP1-AID strain background that allows inducible depletion of endogenous Rap1 protein, I introduced expression constructs for dCas9 and dCas9 fused to a transcriptional repressor domain, Mxi1 (dCas9-Mxi1) (Gilbert et al., 2013; Schreiber-Agus et al., 1995). Inherently, Rap1 and dCas9 use very different binding modes to target DNA (Jones et al., 2017; Konig et al., 1996). Sequence-specific transcription factors interrogate DNA motifs largely through protein interactions with the minor and major grooves of the double-stranded DNA helix (Pabo and Sauer, 1992). In contrast, Cas9-guide RNA (gRNA) complex binding to target DNA requires melting of double-stranded DNA and formation of an RNA:DNA hybrid (Figure 4.13B). The DNA binding domain of the S. cerevisiae Rap1 protein contains a C-terminal “wrapping loop”, which wraps around the major groove of the DNA helix on the opposite face to the main Myb-like DNA interaction domains to form a closed complex on DNA (Feldmann et al., 2015; Feldmann and Galletto, 2014; Matot et al., 2012). This “wrapping loop” likely contributes to the stability of Rap1 binding to its target motifs at telomeres and promoters. When co-expressed, the S. pyogenes Cas9 protein forms a complex with a single “programmable” guide RNA (sgRNA) and binds to the complementary DNA target sequence in a strand-specific manner (Figure 4.13B-C). The sequence targeted by the S. pyogenes Cas9 complex must be immediately followed by an “NGG” protospacer adjacent motif (PAM). I previously showed that increasing the size of the Rap1 DBD alone to ~120 kDa was not sufficient to repress divergent transcription (Figure 4.12C). However, I hypothesised that targeting the dCas9 protein (163 kDa) to the divergent core promoter could effectively limit divergent RNA expression, due to differences in size, DNA-binding mode, and DNA-binding kinetics (Figure 4.13D).
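
The PAM constraint described above can be illustrated with a minimal Python sketch. It scans both strands of a sequence for candidate target sites that are immediately followed by an “NGG” motif. The function names, the default protospacer length of 20 nt, and the example sequence are assumptions made for illustration only; this is not the sgRNA design procedure used for the experiments in this chapter.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq: str, length: int = 20):
    """List candidate targets that are immediately followed by an NGG PAM.

    Returns (strand, start, protospacer, pam) tuples. `start` is the 0-based
    index of the protospacer on the strand being scanned, so minus-strand
    indices refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping candidates are all reported.
        pattern = rf"(?=([ACGT]{{{length}}})([ACGT]GG))"
        for m in re.finditer(pattern, s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Hypothetical example sequence (not a real promoter sequence).
example = "A" * 20 + "TGG"
candidates = find_protospacers(example)
```

A real design workflow would additionally filter candidates for genome-wide uniqueness and for position relative to the TSS, neither of which is attempted here.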


Figure 4.13 Investigation of divergent TSS usage and regulation using the CRISPR interference system (A) Schematic diagram depicting a RP gene promoter, containing a coding direction TSS and divergent TSS (black arrows), located at the edges of the nucleosome-depleted region. The Rap1 binding site (red box) is located at or near the divergent TSS. Normally, Rap1 (blue polygon) transcription factor binding near the divergent TSS is sufficient to repress divergent transcription. In this experiment, catalytically inactivated (“dead” Cas9, dCas9, purple polygon) and dCas9 fused to an Mxi1 repressor domain (green circle) are targeted to the divergent core promoter in place of Rap1. Grey circles depict nucleosomes. (B) Surface representations of experimentally obtained crystal structures, showing the Rap1 DNA-binding domain (DBD, top) in a complex with its target DNA (red). Note the thin “wrapping loop” shown that wraps around the major groove of the DNA helix. The “wrapping loop” interacts with the N-terminal Myb-like domain and forms a closed complex on DNA. For comparison, the structure of Cas9 from S. pyogenes in complex with a sgRNA and its target is shown. Crystal structures not to scale. PDB: 3UKG (Matot et al., 2012); PDB: 4OO8 (Nishimasu et al., 2014). (C) Schematic diagram of the CRISPR-Cas9 system and how it binds to target DNA. The single guide RNA (sgRNA), comprising a fused tracr scaffold RNA and guide RNA, forms a complex with the catalytically inactive dCas9 protein. The dCas9-sgRNA complex interrogates DNA sequences on chromatin and forms a stable complex with the correct target DNA complementary to the sgRNA sequence. The sequence targeted by the S. pyogenes Cas9 complex must be immediately followed by an “NGG” protospacer adjacent motif (PAM). (D) Schematic diagram depicting the location of the TSSs at the IRT2 transcript divergent core promoter, the Rap1 motifs located adjacent, and the target sequence for the sgRNA targeting the IRT2 core promoter. 
(E) Western blots showing co-expression of dCas9 and dCas9-Mxi1 proteins in the RAP1-AID strain background, and insensitivity to auxin-inducible protein depletion. Cells were grown to exponential growth phase and samples were collected prior to (-IAA) or 2 hours after (+IAA) treatment with 3-IAA. V5-AID proteins were detected using an anti-V5 antibody, and auxin-insensitive dCas9 proteins were detected using an anti-FLAG tag antibody. Hxk1 blot shown as loading control. Strains: RAP1-AID containing empty vectors (EV) for dCas9 expression and sgRNA (FW8477), RAP1-AID expressing dCas9 and sgRNA targeting IRT2 core promoter (FW8523), RAP1-AID expressing dCas9-Mxi1 and sgRNA targeting IRT2 (FW8531), iMLP1 (FW8533), and TEF1 (FW8535) core promoters. (F) Northern blots showing expression of IRT2 and iMLP1 transcripts in rescue experiment as described in (E) using dCas9 proteins, after depletion of Rap1-AID. 32P-labelled probes targeting IRT2 or SUT242/iMLP1 and SNR190 (loading control) were used to detect transcript expression in total RNA samples.

The dCas9 and dCas9-Mxi1 proteins were successfully expressed in the RAP1-AID strain background, and were not sensitive to auxin-induced depletion during exponential growth in rich medium (Figure 4.13E). Single guide RNAs (sgRNAs) targeting IRT2, iMLP1, and TEF1 were expressed to specifically recruit the dCas9 complex to core promoters. The IRT2 transcript was expressed when Rap1-AID was depleted. When dCas9 was co-expressed and targeted to the IRT2 core promoter, IRT2 expression was completely repressed (Figure 4.13F). Co-expression of the dCas9-Mxi1 protein also effectively limited IRT2 expression when targeted to the IRT2 core promoter, but not with sgRNAs targeting the core promoters of iMLP1 or TEF1. Likewise, dCas9-Mxi1 repressed iMLP1 expression when the iMLP1 sgRNA was expressed, but not with the sgRNAs targeting IRT2 or TEF1. These results demonstrate that the CRISPRi system can be used to effectively limit divergent or noncoding RNA expression in S. cerevisiae. The divergent core promoters at highly expressed genes are sensitive to programmable recruitment of large, exogenous DNA-binding proteins like dCas9 and dCas9-Mxi1. In addition, dCas9 and dCas9-Mxi1 from S. pyogenes should not recruit coactivators to the gene promoters in S. cerevisiae. Given that targeting these proteins to the divergent core promoter is already sufficient to limit IRT2 expression, these data indicate that factors involved in activation of RP gene expression downstream of Rap1 are not required for repression of divergent transcription.

4.5 Discussion

4.5.1 Summary

The work presented in Chapter 3 illustrated that Rap1 represses divergent transcription to a large extent across the genome. In this chapter, I explored in more detail the mechanism by which Rap1 represses divergent transcription and controls promoter directionality. I performed experiments using model loci, reporter assays, and genome-wide approaches to identify the key regulatory principles. These data identified the proximity and position of Rap1 binding as important determinants of transcriptional output for divergent core promoters. Rap1 appears to repress initiation of divergent transcription instead of acting as a “roadblock” to transcriptional elongation, and functions regardless of the orientation of its binding site motifs. Outside of its native genomic context at RP genes, this unique transcription factor can also control divergent transcription and promoter directionality at independently regulated promoters where Rap1 binding is reconstituted. Other transcription factors in yeast, some of which possess “pioneer factor” activity, may also control promoter directionality in a similar manner.

Interaction with heterochromatin silencing factors is not required for Rap1 to limit divergent noncoding transcription. However, the C-terminal domain (CTD) of Rap1 is required for efficient repression of divergent transcripts. Tethering the CTD of Rap1 artificially to the RPL43B promoter, bypassing the Rap1 DNA-binding domain with the LexA system, was sufficient to partly repress expression of IRT2. In addition, I identified a small region in the Rap1 CTD, comprising residues 631-696, that is required for repression of divergent RNAs. The Rap1 CTD may confer stable DNA-binding properties to the protein, or recruit interacting partners to divergent core promoters for transcriptional repression. Experimental evidence collected from tethering different protein constructs to divergent core promoters suggests that Rap1 and other DNA-binding proteins repress divergent transcription directly by steric hindrance. These findings help to explain why this unusual mechanism of transcriptional repression is prevalent at highly expressed gene promoters depleted of nucleosomes. Other sequence-specific transcription factors and DNA-binding proteins may perform similar roles in different eukaryotic organisms. Here, I discuss these findings and their wider implications, and evaluate alternative hypotheses regarding the mechanism by which Rap1 controls divergent transcription. The discussion below is limited to topics relating directly to experiments performed in this chapter. A wider discussion of other mechanisms that control promoter directionality is presented in Chapter 6 (Discussion 6.4).

4.5.2 Evaluation of TSS-seq approach

From detailed examination of the pattern of antisense transcripts induced after depletion of Rap1, I observed that many divergent transcripts present at Rap1-regulated promoters initiate near Rap1 binding sites. I obtained a high resolution map of Rap1 binding sites in the S. cerevisiae genome from published ChIP-exo data. However, typical RNA-seq library generation protocols are not designed to identify transcription initiation sites with high resolution; coverage is typically lower at the start and end of transcripts due to fragmentation and other
biases. Therefore, I decided to implement the TSS-seq method to identify transcription start sites at single nucleotide resolution in a strand-specific manner, and quantify their expression. The decapping and oligo-capping strategy central to the TSS-seq method has been previously used to map transcription initiation sites in various species, including budding yeast and Plasmodium (Adjalley et al., 2016; Arribere and Gilbert, 2013; Malabat et al., 2015). Other methods that map 5’ RNA ends at a genome-wide scale would also have been appropriate – for example protocols based on 5’ cap biotinylation (e.g. CAGE) or template switching (e.g. RAMPAGE, STRT, NanoCAGE XL) (Adiconis et al., 2018). Strategies which map nascent transcripts such as GRO-cap or 5’ GRO-seq can be challenging to implement in yeast because of difficulty in obtaining pure and biochemically active nuclei through cellular fractionation (Core et al., 2008; Kwak et al., 2013).
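
The principle of mapping TSSs from 5’ read ends can be sketched in a few lines of Python. The sketch assumes that each aligned read begins at the capped 5’ nucleotide of the original transcript (as expected for an oligo-capping library) and that alignments are given as 0-based, half-open intervals. The function names and the input format are hypothetical and do not describe the analysis pipeline used for the data in this thesis.

```python
from collections import Counter


def five_prime_end(start: int, end: int, strand: str) -> int:
    """Return the 0-based position of the 5' end of a read aligned to [start, end)."""
    # On the minus strand the 5' end of the read is its rightmost aligned base.
    return start if strand == "+" else end - 1


def count_tss(reads):
    """Count read 5' ends per (chromosome, strand, position).

    `reads` is an iterable of (chrom, start, end, strand) tuples.
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(chrom, strand, five_prime_end(start, end, strand))] += 1
    return counts


def counts_per_million(counts):
    """Scale raw 5' end counts to counts per million mapped reads."""
    total = sum(counts.values())
    return {key: value * 1e6 / total for key, value in counts.items()}


# Hypothetical alignments: two plus-strand reads and one minus-strand read.
reads = [("chrI", 100, 150, "+"), ("chrI", 100, 140, "+"), ("chrI", 60, 101, "-")]
tss_counts = count_tss(reads)
```

Keeping the strand in the key is what makes the counts strand-specific: here the plus-strand and minus-strand signals at position 100 are stored separately, so a divergent TSS is not merged with a coding-direction TSS at the same coordinate.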

The TSS-seq data identified TSSs for coding and noncoding transcripts across a wide range of RNA expression. It is important to note that the polyadenylated fraction of RNA was used as the input material for TSS-seq. The changes in RNA expression I observed in polyA RNA fractions vs. total RNA fractions were similar for Rap1-regulated noncoding transcripts in the RNA-seq data (Figure 3.5D). TSS-seq using polyA RNA was able to measure changes in divergent RNA expression for Rap1-regulated transcripts. In the future, it should be possible to adapt this TSS-seq protocol to use purified nascent RNA, for example through affinity purification of RNA polymerase II (Churchman and Weissman, 2011; Nojima et al., 2015) or metabolic labelling of nascent transcripts using 4-thiouracil (Schwalb et al., 2016).

It is possible that very short divergent transcripts remain undetected due to fragment size selection and mapping in typical RNA sequencing library preparations. Accumulation of short divergent transcripts at promoter binding sites could suggest that antisense transcription initiates within the promoter nucleosome-depleted region, and is attenuated by a Rap1 “roadblock” in the divergent direction. However, the data outlined in Figure 4.1, Figure 4.2, and Figure 4.6 indicate that this is not the case. Further experimental investigation using an RNA sequencing strategy designed to capture small RNAs, or examination of NET-seq data that capture smaller nascent RNA fragments associated with RNA polymerase II, should validate these findings.

4.5.3 Evaluation of a “transcriptional roadblock” mechanism

The TSS-seq data identified that Rap1-regulated divergent transcripts usually initiate at the 5’ or upstream edge of the promoter nucleosome-depleted region (NDR), where the Rap1 binding sites are also located. Typically, chromatin-based pathways significantly contribute to limiting initiation of noncoding and antisense transcription. However, Rap1-regulated gene promoters are depleted of nucleosomes, and therefore require other robust mechanisms to limit aberrant transcription. Because Rap1 is a stable DNA-binding protein in close proximity to the divergent core promoters, I speculated that proximal Rap1 binding is required for suppression of divergent transcription (Figure 4.14A).

Rap1 and Reb1 can function as roadblocks to terminate elongating RNA polymerase, preventing upstream read-through transcription from interfering with downstream transcription of coding genes (Candelli et al., 2018; Colin et al., 2014; Yarrington et al., 2012). Given the extreme proximity (0 – 50 bp) between promoter Rap1 binding sites and divergent TSSs, Rap1 likely interferes with formation of transcription pre-initiation complexes instead (Figure 4.14B). In addition, Rap1 does not pose a potential roadblock for most divergent transcripts, because most divergent TSSs are either overlapping with or situated upstream, not downstream, of promoter Rap1 binding sites.
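
The positional argument above can be made concrete with a small Python sketch that classifies a binding site relative to a TSS. It reports the distance, whether the site lies upstream or downstream in the direction of transcription, and whether it falls within the 0 – 50 bp window discussed in the text. The function name, the coordinate convention (0-based, half-open) and the example coordinates are assumptions for illustration, not the analysis code used in this thesis.

```python
def classify_site(tss: int, strand: str, site_start: int, site_end: int,
                  window: int = 50):
    """Classify a binding site [site_start, site_end) relative to a TSS.

    Returns (distance_bp, position, is_proximal), where position is
    'overlapping', 'upstream' or 'downstream' relative to the direction of
    transcription from the TSS, and is_proximal is True if the distance is
    within `window` bp.
    """
    if site_start <= tss < site_end:
        return 0, "overlapping", True
    if tss < site_start:
        # Site lies at higher coordinates than the TSS.
        distance = site_start - tss
        position = "downstream" if strand == "+" else "upstream"
    else:
        # Site lies at lower coordinates than the TSS.
        distance = tss - (site_end - 1)
        position = "upstream" if strand == "+" else "downstream"
    return distance, position, distance <= window


# Hypothetical coordinates: a site 10 bp downstream of a plus-strand TSS.
result = classify_site(tss=1000, strand="+", site_start=1010, site_end=1023)
```

Only a site classified as downstream could in principle act as a roadblock to a polymerase that has already initiated, whereas sites that are overlapping or upstream of the TSS cannot, which is the distinction drawn in the paragraph above.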


Figure 4.14 Requirements for repression of divergent transcription initiation by Rap1 (A) Rap1 binding sites (red boxes) must be proximal (0-50 bp) to divergent transcription start sites (TSSs) to repress transcription; repression does not depend on the orientation of the Rap1 motif. (B) When Rap1 is bound to DNA at distal locations upstream or downstream of the divergent TSS, repression of divergent transcription cannot occur. This figure is adapted from my point-of-view article in Transcription (Wu and Van Werven, 2019).

I tested this “roadblock” model directly by integrating a spacer sequence of 400 bp into the endogenous RPL43B/IRT2 promoter, while keeping the Rap1 binding sites adjacent to the IRT2 TSS or moving them 400 bp downstream of the initiation site (Figure 4.1). Clear repression was observed only when the Rap1 binding sites were located near the TSS of IRT2. When the Rap1 binding sites were situated 400 bp downstream of the IRT2 TSS, in between the exogenous spacer sequence and the DNA encoding IRT2, the longer transcript containing the spacer and IRT2 sequence was not expressed. No transcripts of intermediate size were detected, indicating that Rap1 was unable to attenuate RNA polymerase elongation by acting as a transcriptional “roadblock” at this locus. In addition, Rap1 binding sites introduced proximal to and upstream of the SUT129 divergent TSS could already repress divergent promoter activity (Figure 4.2). Thus, Rap1 limits expression of divergent ncRNAs by regulating transcription initiation, not elongation. I propose that the stable binding of Rap1 at its target sequences within promoters can simply repress divergent core promoters. As a consequence, Rap1 may sterically block or reduce the association of transcriptional activators and general transcription machinery with proximal core promoters.

4.5.4 Decoupling of coding and divergent core promoters

At bidirectional gene promoters, sense and antisense TSSs flank the same DNA regulatory information encoded within the promoter NDR. For example, in mouse macrophages coding and divergent core promoter activity are coupled – they respond in a coordinated manner to lipopolysaccharide stimulation and activator binding (Scruggs et al., 2015). However, experimental evidence suggests that the coding and divergent core promoters are uncoupled to some extent in yeast. High-resolution mapping of transcription pre-initiation complex (PIC) components using ChIP-exo identified that coding and divergent PICs are divergently oriented and generally separated by several hundred nucleotides of DNA (Rhee and Pugh, 2012). At most RP genes, Rap1 is constitutively bound to promoter DNA and functions as a transcriptional activator. However, when Rap1 is depleted the expression of Rap1-regulated coding genes decreases and divergent transcript expression increases – the coding and divergent core promoters do not respond in the same direction. Furthermore, evidence from fluorescent reporter experiments using the pPS construct supports the hypothesis that the core promoters are uncoupled. When exogenous Rap1 binding sites were situated in a proximal position, divergent promoter activity (YFP signal) increased after Rap1


depletion – whereas coding promoter activity (mCherry signal) was not directly affected (R1p, Figure 4.2A-B). The orientation of the Rap1 binding site motifs did not affect its ability to limit divergent transcription (Figure 4.3). These examples suggest that core promoter elements are functionally uncoupled at some bidirectional gene promoters in yeast.

Preliminary experiments using the PPT1/SUT129 fluorescent reporter system also suggest that several other transcription factors can limit divergent transcript expression in a similar way: notably Cbf1, Gcr1, Abf1, and Reb1. For three of these proteins (Gcr1, Abf1, and Reb1), divergent and coding core promoter activities are not coupled. Indeed, Domenico Libri and colleagues recently used high resolution mapping of RNA polymerase II occupancy (cross-linking and analysis of cDNAs, CRAC) to demonstrate that Abf1 and Reb1 also limit ectopic usage of cryptic TSSs within promoter NDRs (Challal et al., 2018). However, the data for Cat8, Gcn4, and Gal4 must be interpreted more cautiously. Cat8 and Gcn4 are not active until after the diauxic shift (Haurie et al., 2001; Natarajan et al., 2001), and Gal4 expression is repressed 4 to 7-fold in the presence of glucose (Griggs and Johnston, 1991). Given that the fluorescent reporter experiments were performed during exponential growth in glucose and nutrient-rich media, the results for Cat8, Gcn4, and Gal4 must be validated in other cellular states. Many “pioneer” transcription factors with nucleosome-displacing activity have low isoelectric points (pI) that result in a net negative charge at physiological pH (Table 4.1) (Donovan et al., 2018a; Gasteiger et al., 2003). Exposed interaction surfaces with net negative charge may facilitate interaction with positively charged histone tails and affect transcription factor association or dissociation rates. Although this hypothesis requires comprehensive experimental investigation, I speculate that a low isoelectric point may constitute a property that enhances the ability of transcription factors to repress divergent transcription.


Protein   Isoelectric Point (pI)   Represses Divergent Promoter Activity   Strong Nucleosome Displacement
Cbf1      4.93                     Yes                                     Yes
Gcn4      5.08                     Yes                                     No
Hap4      5.23                     No                                      No
Cat8      9.13                     No                                      No
Gal4      6.79                     No                                      No
Abf1      4.86                     Yes                                     Yes
Reb1      4.96                     Yes                                     Yes
Rap1      4.83                     Yes                                     Yes

Table 4.1 Table summarising biochemical and functional properties of candidate transcription factors Candidate transcription factors were tested for their ability to repress divergent promoter activity in the PPT1/SUT129 fluorescent reporter system (Figure 4.7). Theoretical isoelectric points (pI) of proteins were calculated using the ExPASy Compute pI/MW tool provided by the SIB Swiss Institute of Bioinformatics, using information from the UniProt Knowledgebase (Gasteiger et al., 2003) (https://web.expasy.org/compute_pi/pi_tool-doc.html). The nucleosome displacement activity of these factors was determined using a sequencing-based approach by Lu Bai and colleagues (Yan et al., 2018).
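
For readers unfamiliar with how a theoretical pI is obtained, the following Python sketch shows the general approach: the net charge of a sequence is computed from the Henderson-Hasselbalch relationship for each ionisable group, and the pH at which the net charge is zero is found by bisection. The pKa values below are generic textbook approximations chosen for illustration (an assumption), not the parameter set used by the ExPASy tool, so this sketch is not expected to reproduce the values in Table 4.1 exactly. The function names and example peptides are hypothetical.

```python
# Approximate pKa values for ionisable groups (assumption: generic textbook values).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(seq: str, ph: float) -> float:
    """Estimate the net charge of a protein sequence at a given pH."""
    counts = {aa: seq.count(aa) for aa in "KRHDECY"}
    counts["Nterm"] = counts["Cterm"] = 1
    # Fraction of each basic group that is protonated (charge +1).
    positive = sum(counts[g] / (1 + 10 ** (ph - pka))
                   for g, pka in PKA_POSITIVE.items())
    # Fraction of each acidic group that is deprotonated (charge -1).
    negative = sum(counts[g] / (1 + 10 ** (pka - ph))
                   for g, pka in PKA_NEGATIVE.items())
    return positive - negative


def isoelectric_point(seq: str, lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-4) -> float:
    """Find the pH at which the estimated net charge is zero."""
    # Net charge decreases monotonically with pH, so bisection is sufficient.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical peptides, not real protein sequences.
acidic_pi = isoelectric_point("DDEEDDEEGG")
basic_pi = isoelectric_point("KKRRKKRRGG")
```

A sequence whose estimated pI is below physiological pH carries a net negative charge under those conditions, which is the property referred to in the discussion of Table 4.1.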

4.5.5 Regulation of divergent transcript expression by Rap1 silencing cofactors

At telomeres and HM loci, Rap1 mediates the recruitment of silencing factors to DNA via interaction surfaces on its C-terminal domain (CTD) (Feeser and Wolberger, 2008). I screened a set of 46 CTD point and patch mutations, already characterised for telomere length regulation and transcriptional silencing at HM and telomeric loci. I initially hypothesised that interactions between Rap1 and silencing cofactors would be important for repression of the IRT2 divergent noncoding RNA. However, only a small subset of mutants displayed minor de-repression of divergent transcription, and were not comparable in magnitude to a complete deletion of the Rap1 CTD. Recently, it has been reported that members of the SIRT6 sirtuin family, Hst3 and Hst4, limit expression of divergent antisense and


cryptic unstable transcripts (CUTs) (Feldman and Peterson, 2019). However, this mechanism is widespread and is not specific to Rap1-regulated or RP genes. Further investigation using mutants for Rap1, Hst3, and Hst4 in combination is required to determine whether these pathways function redundantly or independently at gene promoters with the capacity for bidirectional transcription. To date, the rules that dictate whether Rap1 is an activator (e.g. at promoters) or a silencer (e.g. at telomeres and HM loci) are not known. These data indicate that Rap1 does not limit divergent transcription at gene promoters by specific recruitment of its silencing cofactors required for repression of heterochromatin.

4.5.6 Transcriptional repression of divergent core promoters by steric hindrance

Because Rap1 binds in close proximity to divergent TSSs, I propose that it can achieve repression of divergent transcription through stable physical association with its target sites at gene promoters. Rap1 may limit access of general transcription machinery and perhaps chromatin remodellers required for divergent promoter activity. Several lines of evidence support this steric hindrance hypothesis. First, Rap1 is ideally positioned to prevent or restrict initiation of divergent transcription. Divergent transcription initiates at the 5’ (upstream) edge of the promoter NDR where Rap1 also binds. Typically, the divergent TSS is located within 50 bp of the Rap1 binding sites. Due to this positioning, steric hindrance is spatially limited to the divergent core promoter and should not interfere with coding direction gene transcription. In vivo, Rap1 binding sites and coding gene TSSs are separated by several hundred nucleotides of DNA (Knight et al., 2014; Reja et al., 2015). Hypothetically, if the Rap1 binding site(s) were positioned near the coding direction TSS, I would expect lower expression of the coding transcript while divergent transcript expression increased. Second, recruitment of Rap1 to its target motifs at promoters confers effective transcriptional repression, regardless of the orientation of the Rap1 motifs. Finally, Rap1 is stably bound to its target sites at promoters during different cellular states. For example, other RP gene coactivators dissociate from the gene promoter after heat shock stress whereas Rap1 maintains stable association with DNA (Reja et al., 2015). At highly expressed genes, this could efficiently limit divergent core promoter activity.

Why has S. cerevisiae evolved to control divergent transcript expression using this unorthodox mechanism? The intrinsic features of Rap1-regulated genes, with regard to nucleosome positioning in particular, may offer an explanation. Rap1-regulated promoters are among the most active in budding yeast, and highly transcribed genes are biased towards wider NDRs (Bai and Morozov, 2010; Warner, 1999). The width of the NDR is approximately 200 – 400 bp for Rap1-regulated genes, compared to ~150 bp for the average yeast gene promoter (Bai and Morozov, 2010). Rap1, together with coactivators and chromatin remodellers, generates a wide NDR which facilitates activation of the coding gene promoter (Kubik et al., 2018). However, in the absence of stringent control mechanisms, this open chromatin could also facilitate aberrant recruitment of RNA polymerase II. This would likely manifest as aberrant transcription in both the coding and divergent directions from the distinct core promoters at the outer borders of the NDR. Depletion of Rap1 induces an upstream shift in TSS position for the coding direction core promoter (Challal et al., 2018; Wu et al., 2018b). In many cases, this shift in TSS usage compromises the expression of the downstream coding gene as well. I propose that Rap1 is ideally positioned to repress initiation of divergent transcription, while it concurrently recruits cofactors to facilitate gene expression in the protein coding direction. Rap1 may limit the association of transcription factors, general transcription machinery, or chromatin remodellers with sequences flanking its binding sites, which often comprise divergent TSSs.

To investigate whether divergent core promoters are indeed regulated by steric hindrance, I generated or exploited known synthetic transcription factors and tethered them to divergent promoters (Figure 4.12, Figure 4.13). Expression of the Rap1 DNA binding domain (DBD) alone could not repress IRT2 or iMLP1 expression. Increasing the size of the Rap1 DBD construct (~40.5 kDa) with the addition of three tandem GFP proteins (to ~121 kDa) was not sufficient to repress divergent transcription. This suggests that the size of the DNA-binding protein is not the sole determinant of transcriptional repression. In fact, the Rap1 CTD modulates the DNA-binding mode and affinity of the protein (Feldmann et al., 2015;


Feldmann and Galletto, 2014), and addition of the Rap1 CTD conferred some repression of divergent transcription (Figure 4.9). Given that a destabilising mutation in the Rap1 C-terminal domain (Δ631-696) also caused de-repression of divergent transcription (Figure 4.11), stable occupancy of the transcription factor near the core promoter may be a key aspect of regulation.
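
The size comparison underlying this argument is simple arithmetic, sketched below in Python. The masses of the Rap1 DBD construct and of dCas9 are taken from the text; the mass of a single GFP monomer (~26.9 kDa) is an assumption, and linkers and epitope tags are ignored, so the result is only approximate.

```python
RAP1_DBD_KDA = 40.5   # Rap1 DBD construct, as stated in the text
DCAS9_KDA = 163.0     # dCas9, as stated in the text
GFP_KDA = 26.9        # assumption: approximate mass of one GFP monomer


def fusion_mass(base_kda: float, tag_kda: float, n_tags: int) -> float:
    """Approximate mass of a fusion protein, ignoring linkers and epitope tags."""
    return base_kda + n_tags * tag_kda


dbd_3xgfp_kda = fusion_mass(RAP1_DBD_KDA, GFP_KDA, 3)
# Fold increase in size of the DBD construct after adding three GFP moieties.
fold_increase = dbd_3xgfp_kda / RAP1_DBD_KDA
```

Under these assumptions the DBD-3x-GFP fusion comes to roughly 121 kDa, about three times the size of the DBD alone but still smaller than dCas9, consistent with the values quoted in the text.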

Parallel work performed by the laboratory of Domenico Libri indicated that expression of the Rap1 DBD can partially suppress the changes in nucleosome positioning and ectopic transcription initiation after depletion of Rap1 protein (Challal et al., 2018). In this case nucleosome positioning and NDR width were partly restored at promoter Rap1 binding sites after expression of the Rap1 DBD. Analysis of RNA-seq data also indicated that suppression of ectopic transcription was largely limited to proximal TSSs within a 200 nt window centered on Rap1 binding sites. However, my experiments using the auxin-insensitive Rap1 DBD construct revealed that expression of the Rap1 DBD alone was not sufficient for repression of the IRT2 and iMLP1 transcripts. These differences may be due to design of the DBD construct tested (specifically the inclusion of residues 339-600 here versus residues 358-601), expression level due to selection of a single copy integration vector here versus a non-integrating centromeric plasmid, and subcellular localisation due to addition of a SV40 nuclear localisation signal (NLS) in my construct (Challal et al., 2018; Wu et al., 2018b). Future work using sensitive genome-wide approaches should resolve these findings through direct comparison.

Nuclease-inactivated Cas9 (catalytically “dead” Cas9, dCas9) has been used to repress transcription in prokaryotes and eukaryotes via CRISPR interference (CRISPRi) (Gilbert et al., 2013; Qi et al., 2013) and to functionally interrogate specific transcription factor-DNA interactions (Shariati et al., 2019). I tested whether an exogenous DNA-binding protein tethered to a Rap1-regulated divergent core promoter could achieve transcriptional repression using the CRISPRi system (Figure 4.13). If valid, this would support the steric hindrance model, instead of the alternative hypothesis in which specific functional interactions of Rap1 are predominantly important for repression of divergent transcription.


Rap1 is a pioneer transcription factor and can displace nucleosomes (Ganapathi et al., 2011; Kubik et al., 2015; Kubik et al., 2018; Mivelaz et al., 2019; Yan et al., 2018). The Rap1-regulated divergent TSSs are usually located at the upstream edge of the promoter NDR, flanked on one side by the -1 nucleosome (Kubik et al., 2015; Kubik et al., 2018). In contrast, Cas9 access to DNA is impeded by nucleosomes (Daer et al., 2017; Horlbeck et al., 2016; Kuscu et al., 2014; Singh et al., 2015; Wu et al., 2014) and CRISPRi activity is higher in nucleosome-depleted regions (Horlbeck et al., 2016). For targeting of coding genes, CRISPRi performs significantly better when sgRNA targets overlap directly with the coding direction core promoters (Radzisheuskaya et al., 2016). In the case of IRT2 and iMLP1, tethering of dCas9 and dCas9-Mxi1 to divergent core promoters was sufficient for specific and complete repression of divergent transcription. As dCas9 and dCas9-Mxi1 do not function as transcriptional activators, the activation function of Rap1 and its associated downstream coactivators involved in RP gene expression are likely not required for repression of divergent noncoding RNAs. The -1 nucleosome at promoter Rap1 sites shifts inwards after depletion of Rap1, moving adjacent to but not fully occluding the Rap1 binding site and associated divergent TSSs (Kubik et al., 2015; Kubik et al., 2018). Despite this, dCas9 is still able to confer stable repression of divergent promoter activity. These results suggest that nucleosome-displacement activity may not be strictly required for effective repression of divergent transcripts.

These results provide proof-of-principle data to inform design of CRISPRi sgRNA libraries. The function of many divergent and long noncoding RNAs remains unclear, but systematic perturbation of lncRNA expression by targeting divergent core promoters will help to answer this outstanding question. For large-scale and combinatorial CRISPRi, addition of a scalable method to efficiently assemble many gRNA monomers, such as “CARGO”, would be required (Gu et al., 2018). The expression level of dCas9 or other nuclease-inactivated CRISPR proteins could be tuned to modulate their ability to compete with sequence-specific and general transcription factors at target sites. In contrast to permanent DNA modifications achieved through CRISPR-Cas9 gene editing, the expression of dCas9 and gRNAs is programmable, reversible, and tuneable. This versatile system should facilitate experimental investigation and potential therapeutic applications in the future.

4.5.7 Conclusion

In this chapter, I investigated the functional relationship between the proximity of promoter Rap1 binding sites to divergent TSSs, and the ability of Rap1 to repress divergent transcription. I found that both the position of the Rap1 binding site and the distance to the TSS are crucial functional parameters. The Rap1 binding site must be proximal to the divergent TSS (typically within 0 – 50 bp), and cannot limit divergent transcription from a distal position located upstream or downstream of the divergent TSS. This ability to repress divergent promoter activity is independent of Rap1 motif orientation. To expand the scope of my investigation, I performed genome-wide analysis using TSS-seq and identified Rap1-regulated divergent TSSs at single nucleotide resolution. These data validated the single-locus studies and revealed that many divergent TSSs overlap with or are extremely close to Rap1 binding sites at gene promoters. I compiled preliminary data using a fluorescent reporter assay, which suggest that other transcription factors in budding yeast may also control promoter directionality in a similar manner to Rap1. By mutagenesis, I found that the interaction of Rap1 with its known silencing cofactors was not required to limit expression of the divergent RNA IRT2. However, a small region within the Rap1 CTD comprising residues 631-696 is essential for repression of divergent transcription. Finally, I investigated the properties of DNA-binding proteins that can directly affect activity of divergent core promoters. I found that, within the limited range tested, the size of the DNA-bound protein was not a major determinant of divergent promoter activity. In contrast, sequence-specific targeting of dCas9 to divergent core promoters at RP genes conferred effective transcriptional repression, suggesting that regulation of divergent transcription does not require co-regulators downstream of Rap1. In conclusion, these data support a “steric hindrance” model wherein Rap1 limits the activity of divergent core promoters and thereby controls overall promoter directionality.

Chapter 5 Results

Chapter 5. Regulatory Interplay between Rap1 and the RSC Chromatin Remodeller

5.1 Acknowledgement

Parts of this research comprising Figure 5.1 - Figure 5.3 have been published in Molecular Cell (Wu et al., 2018b), and have been modified for presentation within this chapter.

For the proteomics mass spectrometry experiments in Figure 5.1, I designed the experiments, generated the strains, collected the samples, and processed the samples up to and including the step wherein co-immunoprecipitated proteins were migrated into the gel. Folkert van Werven (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute) assisted with sample collection. David Frith and Bram Snijders from the Protein Analysis and Proteomics Platform (The Francis Crick Institute) processed and analysed the samples for label-free quantification by mass spectrometry. David and Bram also assisted with analysis of the data using the MaxQuant and Perseus software, and I subsequently plotted, analysed, and interpreted the processed data.

The fluorescent reporter system plasmids used in Figure 5.3 were constructed with the help of Folkert van Werven (Cell Fate and Gene Regulation Laboratory, The Francis Crick Institute), who generated the plasmids with transcription factor binding site motifs. Folkert also assisted with microscopy data collection and analysis. I designed the experiments, constructed the yeast strains, and performed data collection and analysis.

The RNA sequencing and nascent RNA sequencing libraries were prepared and sequenced with the help of the Advanced Sequencing Facility Science Technology Platform (The Francis Crick Institute). I designed the experiments, generated the yeast strains, collected and processed the samples prior to library preparation and sequencing, and analysed and interpreted the data.

The Bioanalyzer analysis of the nascent RNA-seq libraries was performed with the help of the Genomics Equipment Park Facility (The Francis Crick Institute).

The bioinformatic analysis for ChIP-seq, MNase-seq, and RNA-seq data in Figure 5.2 - Figure 5.15 was performed with the help of Harshil Patel (Bioinformatics and Biostatistics Science Technology Platform, The Francis Crick Institute). Harshil performed the read trimming, filtering, and mapping, differential expression analysis, processing of data to generate coverage tracks, plotting for heat maps and metagene plots, and calculation of promoter directionality scores. I designed the bioinformatic analysis strategy with the help of Harshil, and performed analysis, visualisation, and interpretation of the processed data.

5.2 Abstract

In this chapter, I explore the functions of Rap1 and the ATP-dependent chromatin remodeller RSC, with regard to control of divergent noncoding transcription and promoter directionality. Using proteomics mass spectrometry to study chromatin-bound proteins, I identify that Rap1 interacts directly or indirectly with components of ATP-dependent chromatin remodelling complexes. Specifically, deletion of residues 631-696 in the Rap1 C-terminal domain (CTD) may compromise the interactions between Rap1 and components of the RSC (Remodels the Structure of Chromatin) complex. Rap1 and RSC control nucleosome organisation and regulate divergent transcription at bidirectional promoters. RSC is not strictly required for divergent transcription at Rap1-regulated core promoters, and co-depletion of RSC only partially suppresses the divergent transcription induced after Rap1 depletion. Finally, quantitative analysis of promoter directionality using nascent RNA sequencing reveals that RSC controls divergent and noncoding transcription to a large extent at hundreds of genes in S. cerevisiae. These findings improve our understanding of the diverse mechanisms and factors that work in concert to regulate divergent transcription and control promoter directionality.

5.3 Introduction

In Chapter 4, I identified the mechanism by which Rap1 specifically limits expression of divergent transcripts. Stable occupancy of Rap1 near divergent core promoters likely interferes with initiation of divergent transcription. However, divergent transcription still occurs in the absence of Rap1, suggesting that additional regulatory factors also dictate expression of divergent noncoding RNAs. This led me to ask: what are these additional factors, and how do they work? In this chapter, I aimed to identify additional regulators of divergent transcription at Rap1-regulated and other highly expressed genes. Initially, I performed co-immunoprecipitation experiments with repression-competent and -deficient versions of Rap1 to identify potential regulators through proteomics mass spectrometry. Components of ATP-dependent chromatin remodelling complexes were significantly over-represented in differentially enriched proteins interacting with Rap1.

ATP-dependent chromatin remodellers play key roles in most processes that occur on chromatin, including DNA replication, transcription, packaging, and recombination (Clapier et al., 2017; Narlikar et al., 2013). In this chapter, I explore the roles of ATP-dependent chromatin remodellers in divergent and noncoding transcription, and specifically focus on the SWI/SNF family complex RSC (Remodels the Structure of Chromatin) (Cairns et al., 1996). RSC has essential roles in many processes including chromosome segregation (Hsu et al., 2003; Huang et al., 2004), DNA repair (Czaja et al., 2014), and even mitochondrial function (Imamura et al., 2015). However, one of its key roles is to slide or eject nucleosomes, facilitating chromatin access specifically at the nucleosomes (+1, -1) flanking the nucleosome-depleted region (NDR) (Cairns et al., 1996; Kubik et al., 2018; Parnell et al., 2008). Several chromatin remodellers, including RSC, have been described to repress divergent and noncoding transcription around promoters and NDRs (Whitehouse et al., 2007; Yadon et al., 2010). It is unclear whether Rap1 and RSC regulate similar or dissimilar classes of divergent and noncoding transcription. These regulatory functions must be situated within the context of other noncoding transcripts regulated by a range of ATP-dependent chromatin remodellers. It is also not clear whether the activities of Rap1 and RSC overlap, and how these factors could interact at gene promoters to dictate divergent transcription and overall promoter directionality. In this chapter, I examine how Rap1 and RSC regulate nucleosome positioning and transcription directionality at gene promoters by depleting each factor individually and in combination.

Previous genome-wide approaches have generated profiles of transcription after perturbation of chromatin remodellers using microarrays or polyA mRNA-seq (Alcid and Tsukiyama, 2014; Parnell et al., 2008). These approaches are limited in their dynamic range, resolution, or ability to capture the entire nascent transcriptome. Many noncoding RNAs are highly unstable and lack polyA tails, and are not fully represented using standard transcriptome sequencing. To fully appreciate the extent to which sequence-specific transcription factors and chromatin remodellers may control promoter directionality, nascent RNA sequencing approaches are required to provide highly sensitive and quantitative information. Various methods have been used to successfully analyse promoter directionality at the level of nascent transcription (Jin et al., 2017; Kwak et al., 2013; Scruggs et al., 2015; Tome et al., 2018), but have not yet been applied to directly measure the effects of depleting ATP-dependent chromatin remodellers in budding yeast. Changes in promoter directionality could reflect changes in divergent, coding, or antisense transcription, or any combination of these. Here, I explore the diverse classes of coding and noncoding RNAs regulated by the RSC chromatin remodeller, and identify groups of genes that differentially respond to RSC depletion.
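
The point that a change in promoter directionality can have several underlying causes can be illustrated with a short sketch. The score used here (a log2 ratio of sense to divergent signal with a pseudocount) and all read counts are illustrative assumptions, not the scoring scheme or data used in this thesis. The sketch shows that the same shift in the score can arise from increased divergent transcription, decreased coding transcription, or both.

```python
import math

def directionality_score(sense, divergent, pseudocount=1.0):
    """Assumed definition: log2 ratio of sense to divergent signal.

    The pseudocount avoids division by zero at silent promoters.
    """
    return math.log2((sense + pseudocount) / (divergent + pseudocount))

# Hypothetical normalised read counts (sense, divergent) for one promoter.
control = (1023.0, 31.0)
scenarios = {
    "divergent up": (1023.0, 127.0),  # coding unchanged, divergent increased
    "coding down": (255.0, 31.0),     # coding decreased, divergent unchanged
    "both": (511.0, 63.0),            # coding decreased and divergent increased
}

baseline = directionality_score(*control)
changes = {
    name: directionality_score(*counts) - baseline
    for name, counts in scenarios.items()
}

for name, delta in changes.items():
    print(f"{name}: change in score = {delta:+.2f}")
```

Because the three scenarios give the same change in score, a directionality score on its own cannot distinguish between them; the sense and divergent signals have to be inspected separately.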

5.4 Results

5.4.1 Identification of Rap1 chromatin protein interactome using proteomics mass spectrometry

The pioneer factor activity of Rap1 and its role in determining chromatin organisation across the genome have been well documented (Bai and Morozov, 2010; Ganapathi et al., 2011; Kubik et al., 2015; Mivelaz et al., 2019). Activity of Rap1 and ATP-dependent chromatin remodellers modulates the positioning of nucleosomes at gene promoters to establish an open promoter structure and facilitate productive transcription (Kubik et al., 2015; Kubik et al., 2018; Mivelaz et al., 2019; Reja et al., 2015; van Bakel et al., 2013). In budding yeast, several ATP-dependent chromatin remodellers slide, exchange, or eject nucleosomes to regulate gene expression (Clapier et al., 2017; Narlikar et al., 2013). In addition to post-translational histone modifications, sequence-specific transcription factors can also recruit chromatin remodelling complexes to specific locations through direct protein-protein interaction (Gutierrez et al., 2007; Owen-Hughes and Workman, 1996; Yudkovsky et al., 1999). Chromatin is generally repressive towards transcription, and transcription initiation at core promoters is particularly sensitive to +1 nucleosome position (Klein-Brill et al., 2019; Kubik et al., 2019). As the Rap1 binding sites at gene promoters are located near divergent TSSs and immediately flanked by -1 nucleosomes, I speculated that there might be a functional link between nucleosome positioning and divergent promoter activity. Therefore, I examined whether Rap1 interacts with or locally recruits additional factors, such as ATP-dependent chromatin remodellers, to control divergent transcription.

In Chapter 4, I identified that the Rap1 Δ631-696 mutant protein maintained its ability to bind to DNA, but was unable to repress divergent transcription. I hypothesised that this mutant might interact differently with co-activators or co-repressors of divergent transcription, compared to repression-competent forms of Rap1. I adapted an existing protocol to enrich for the chromatin fraction within yeast cell lysates prior to immunoprecipitation of V5-epitope tagged Rap1 (Figure 5.1A) (van Werven et al., 2008; Waterborg, 2000). Briefly, AID-insensitive Rap1 proteins were tagged with the V5 epitope and expressed in the Rap1-AID strain background. One hour after depletion of endogenously tagged Rap1-AID protein, cells were collected and snap-frozen for protein extraction without cross-linking. Cells were disrupted by vortexing with glass beads, and crude cellular fractionation was used to enrich for the chromatin fraction within the insoluble pellet. Crucially, resuspended cell pellets were treated with an excess of micrococcal nuclease (MNase) to digest unprotected DNA and help to solubilise the chromatin-bound proteins. After clarification of the “chromatin-released” protein fraction, anti-V5 antibodies were used to immunoprecipitate Rap1 protein tagged with the V5 epitope, and its associated interacting partners.

Figure 5.1 Identification of Rap1 chromatin protein interactome using proteomics mass spectrometry (A) Schematic diagram of experimental protocol for enrichment of chromatin fraction and solubilisation of chromatin-bound proteins by micrococcal nuclease (MNase) digestion. V5-tagged constructs for expression of full-length Rap1 (FL) (FW5420), activation domain deletion (ΔAD) (FW5424), deletion of residues 631-696 (Δ631-696) (FW5396) and an empty vector control (FW5399), were stably integrated into Rap1-AID cells. V5-tagged Rap1 protein in chromatin-released lysates was subjected to affinity purification. Co-immunoprecipitated proteins were processed for LC-MS label-free quantification. (B) Western blots showing enrichment of V5-tagged Rap1 mutant proteins after immunoprecipitation from MNase-treated chromatin extracts described in A. 0.67% of input (I), 0.67% of flow-through (FT), and 10% of immunoprecipitated sample (IP) eluted from anti-V5 beads were loaded. Folkert van Werven assisted with the sample collection. (C) Volcano plot showing differential enrichment in detected proteins comparing Rap1-V5 (FL) pulldown to empty vector (EV) control. Difference in enrichment (Log2 scale) is plotted against p-value (unpaired two-sample t-test, Log10 scale) for 916 proteins identified by mass spectrometry. Highlighted proteins: bait protein (Rap1, dark blue), TFIID TBP-associated factors (TAFs) (green), telomeric proteins (blue), nuclear pore complex (NPC) components (orange), and RSC complex components (purple). Horizontal dashed line corresponds to p = 0.05, and vertical dashed line corresponds to 2-fold enrichment. (D) Yeast Gene Ontology (GO)-Slim Process Analysis of significantly enriched Rap1 interacting proteins described in C. (E) Volcano plots comparing differential protein enrichment between ΔAD mutant (left) and Δ631-696 mutant (right) pulldowns to FL Rap1 pulldown. Only proteins that were significantly enriched in Rap1-FL vs. EV pulldown (289 proteins) are plotted, as described in C. Highlighted proteins: bait protein (Rap1, dark blue), RSC complex components (blue), NPC components (orange). (F) GO over-representation analysis of cellular components over-represented in the proteins showing significant differential enrichment in E, comparing Rap1 Δ631-696 to FL Rap1 pulldown. False discovery rate (FDR) adjusted p-value is displayed on a -Log10 scale. (A-F) David Frith and Bram Snijders processed and analysed the immunoprecipitated protein samples using mass spectrometry, and performed data analysis to generate the processed data included in this figure.

I performed this experiment using AID-insensitive constructs expressing full-length Rap1 (FL), Rap1 wherein the activation domain comprising residues 631-678 (ΔAD) was deleted, Rap1 wherein residues 631-696 (Δ631-696) were deleted, and an empty vector control. I aimed to compare the interacting partners of Rap1 Δ631-696 (which was unable to repress divergent transcription) with the interacting partners of Rap1 FL and Rap1 ΔAD (which were able to repress divergent transcription). Using this protocol to solubilise the chromatin-bound proteins, I was able to successfully immunoprecipitate the Rap1-V5 protein for all three constructs (Figure 5.1B). The samples containing proteins co-immunoprecipitated with Rap1 were processed for label-free quantification (LFQ) using proteomics mass spectrometry. I first evaluated the performance of the chromatin co-immunoprecipitation protocol by examining the proteins enriched in the full-length Rap1 (Rap1-FL) pulldown versus the empty vector (EV) control (Figure 5.1C). Rap1 itself was the most highly enriched protein in the Rap1-FL pulldown. Several proteins known to interact with Rap1 at promoters and telomeres were enriched in Rap1-FL compared to EV, including TBP-associated factors (TAFs), telomere-related proteins, and nuclear pore complex (NPC) components (Garbett et al., 2007; Layer et al., 2010; Papai et al., 2010; Van de Vosse et al., 2013). In addition, multiple subunits (12/17) of the RSC (Remodels the Structure of Chromatin) complex were identified in this experiment. Furthermore, the 276 proteins that were significantly enriched (log2 fold change > 1, padj < 0.05) in the Rap1-FL pulldown were over-represented in Rap1-related processes such as RNA polymerase II transcription and chromatin organisation (Figure 5.1D). This set of significantly enriched proteins was treated as a set of high-confidence Rap1 interacting partners for further analysis.
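
The selection of significantly enriched proteins described above can be sketched as a simple two-threshold filter. This is a minimal illustration that assumes the log2 fold change and adjusted p-value have already been computed per protein (in this thesis, that processing was done in MaxQuant and Perseus). The `Protein` record, the function name, and all values below are hypothetical.

```python
from typing import NamedTuple

class Protein(NamedTuple):
    name: str
    log2_fc: float  # log2(Rap1-FL / EV) enrichment
    p_adj: float    # adjusted p-value

def significantly_enriched(proteins, min_log2_fc=1.0, max_p_adj=0.05):
    """Return names of proteins passing both thresholds (strict inequalities)."""
    return [
        p.name
        for p in proteins
        if p.log2_fc > min_log2_fc and p.p_adj < max_p_adj
    ]

# Hypothetical measurements, for illustration only.
example = [
    Protein("bait", 8.0, 1e-6),
    Protein("candidate_A", 2.5, 0.01),
    Protein("candidate_B", 0.6, 0.001),  # significant, but below 2-fold
    Protein("candidate_C", 3.0, 0.20),   # enriched, but not significant
]

hits = significantly_enriched(example)
print(hits)
```

A log2 fold change above 1 corresponds to the 2-fold enrichment line on the volcano plot, so only proteins in the upper right quadrant pass the filter.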

Next, I selected proteins whose interaction with Rap1 was affected in the Δ631-696 mutant pulldown, but not the ΔAD mutant pulldown. I hypothesised that this repression-incompetent mutant (Δ631-696) would show substantial differences in recruitment or interaction with key co-regulators of divergent transcription. Many proteins showed differential enrichment in either mutant pulldown, compared to Rap1-FL (Figure 5.1E). Surprisingly, all identified subunits of the RSC complex were enriched in the Rap1 Δ631-696 pulldown compared to Rap1-FL, suggesting that Rap1 may negatively affect association or recruitment of RSC to chromatin. I validated this observation by performing gene ontology (GO) component analysis on the Rap1 interacting proteins significantly enriched in the Δ631-696 mutant pulldown versus the Rap1-FL pulldown (Mi et al., 2019). I found that components belonging to RSC, an ATP-dependent chromatin remodeller of the SWI/SNF family (Cairns et al., 1996), showed the most significant enrichment (Figure 5.1F). Taken together, these data suggest that Rap1 may interact with or affect the activity of the RSC chromatin remodeller on chromatin.
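
The statistic behind a GO over-representation analysis of this kind can be sketched with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. This sketch is an assumption about the general form of the test; the exact test, background set, and multiple-testing correction applied by the GO tool used here may differ. All counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of proteins in the background set
    annotated:  background proteins carrying the GO term
    sample:     number of proteins in the test set
    overlap:    test-set proteins carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 12 of 300 background proteins belong to a complex,
# and 10 of them appear among 40 differentially enriched proteins.
p_value = hypergeom_enrichment_p(population=300, annotated=12, sample=40, overlap=10)
print(f"p = {p_value:.2e}")
```

With these counts only 1.6 complex members would be expected by chance, so observing 10 gives a very small p-value. In a real analysis, the p-values for all tested terms would then be adjusted for multiple testing.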

5.4.2 Rap1 and RSC regulate chromatin organisation at Rap1-regulated genes

RSC is an ATP-dependent chromatin remodelling complex that generates nucleosome depleted regions (NDRs) at gene promoters to facilitate transcription (Cairns et al., 1996; Clapier et al., 2017; Kubik et al., 2018; Lorch et al., 2011; Parnell et al., 2008; Parnell et al., 2015). However, RSC interacts directly with DNA and nucleosomes, and does not require transcription factors for recruitment to promoters, nucleosome sliding, or nucleosome ejection (Krietenstein et al., 2016; Kubik et al., 2015; Kubik et al., 2018). Given that nucleosome positioning and local chromatin modifications affect promoter directionality (Ibrahim et al., 2018), I speculated that the activities of Rap1 and RSC might converge upon nucleosomes located at bidirectional promoters. To explore the relationship between RSC and Rap1, I first validated that the ATPase subunit of RSC, Sth1, was enriched at Rap1-regulated gene promoters. Examination of published Sth1 ChIP-seq and Sth1 MNase ChIP-seq data sets verified that Sth1 binding is enriched around promoter Rap1 binding sites, compared to the promoters of an unrelated transcriptional repressor, Ume6 (Figure 5.2A) (Lopez-Serra et al., 2014; McKnight et al., 2016; Parnell et al., 2015). Next, I closely examined the organisation of chromatin at Rap1-regulated promoters by analysing published MNase-seq data. In MNase-seq experiments, formaldehyde-fixed or native chromatin is subjected to digestion with micrococcal nuclease (MNase), which preferentially degrades unprotected linker DNA. The DNA protected by stable complexes like nucleosomes (fragments of mainly mono- and di-nucleosomal length) is extracted and prepared for sequencing. By mapping the MNase-protected fragments to the reference genome and generating a profile of their coverage, the positions of nucleosomes can be determined on a genome-wide scale (Figure 5.2B).
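
The coverage profiling described above can be sketched in a few lines. This is a toy illustration on a single short sequence, assuming fragments are given as half-open (start, end) intervals and that signal is scaled per million sequenced fragments. The function names, coordinates, and library size are hypothetical and do not reproduce the pipeline used for the figures.

```python
def coverage_per_million(fragments, length, total_fragments):
    """Per-base coverage from half-open (start, end) fragments,
    scaled to signal per million sequenced fragments."""
    cov = [0.0] * length
    scale = 1e6 / total_fragments
    for start, end in fragments:
        for pos in range(max(start, 0), min(end, length)):
            cov[pos] += scale
    return cov

def metagene(cov, anchors, flank):
    """Average coverage in a window of +/- flank around each anchor.
    Anchors whose window would run off the sequence are skipped."""
    windows = [
        cov[a - flank : a + flank + 1]
        for a in anchors
        if a - flank >= 0 and a + flank < len(cov)
    ]
    return [sum(column) / len(windows) for column in zip(*windows)]

# Toy example: protected fragments flank two hypothetical binding sites
# at positions 10 and 30, leaving a depleted region around each site.
fragments = [(2, 8), (12, 18), (22, 28), (32, 38)]
cov = coverage_per_million(fragments, length=40, total_fragments=2_000_000)
profile = metagene(cov, anchors=[10, 30], flank=5)
print(profile)
```

In the averaged profile, the signal dips at the centre of the window (the anchor) and rises on both sides, which is the pattern expected for a nucleosome-depleted region flanked by positioned nucleosomes.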

Figure 5.2 Rap1 and RSC regulate chromatin organisation at Rap1-regulated genes

(A) Metagene plots of published Sth1 ChIP-seq (GEO:GSE56994) (Lopez-Serra et al., 2014) and MNase ChIP-seq (GEO:GSE65594) (Parnell et al., 2015) data for Rap1-regulated (n = 141) and Ume6-regulated (n = 87) genes. Normalised signal per million reads plotted 1 kb up- and downstream of transcription factor binding sites, for immunoprecipitation (black lines) and input (grey lines) samples. For Sth1 MNase ChIP-seq, chromatin was solubilised by MNase digestion prior to immunoprecipitation. (B) Schematic diagram depicting typical signal and metagene coverage of protected DNA fragments from MNase-seq experiments using a high concentration of MNase. Rap1, blue polygon; Rap1 binding site, red box; transcription start site, black arrowhead; nucleosomes, grey globes; NDR, nucleosome-depleted region. (C) MNase-seq metagene plots showing that promoters with Rap1-dependent divergent transcription show differences in nucleosome occupancy after Rap1 depletion. Normalised signal per million reads is shown 450 bp up- and downstream of the Rap1 binding site (Rap1 binding site, left panel), or transcription start site (TSS, right panel). Separate plots are shown for the set of Rap1-regulated genes as shown in Figure 3.6 (n = 141), and each cluster of genes according to the antisense strand (ASc1, ASc2, ASc3). Nucleosome positions before (black) and after (grey) Rap1 depletion are shown for each plot. Data were obtained from GEO:GSE73337 (Kubik et al., 2015). (A-C) Harshil Patel performed the read trimming, filtering, mapping, and generated the metagene plots in this figure.

I then examined the nucleosome positioning at Rap1-regulated genes, using a published data set wherein MNase-seq data were generated with and without depletion of Rap1 using the anchor-away system (Kubik et al., 2015). Consistent with other published reports, Rap1-regulated genes contain an extremely wide NDR several hundred nucleotides in length (Knight et al., 2014; Reja et al., 2015). After depletion of Rap1, the flanking +1 and -1 nucleosomes shift inwards, preserving a smaller but persistent NDR (Figure 5.2C, top row). The entire arrays of adjacent nucleosomes beyond the NDR-flanking positions shift in tandem with the -1 and +1 nucleosomes, confirming that other factors in yeast beyond Rap1 are responsible for maintenance of spacing within nucleosomal arrays. Rap1 and other “general regulatory factors” such as Abf1 and Reb1 may be exploited as barriers by remodellers such as ISW1 and ISW2 to properly position nucleosomes (Hartley and Madhani, 2009; Krietenstein et al., 2016). Finally, I compared the changes in nucleosome positioning between promoters that displayed divergent transcription after Rap1 depletion, and those that did not (Figure 5.2C, rows 2-4). At Rap1-regulated genes with high levels of divergent transcription (ASc1 and ASc2 from Figure 3.6B), the -1 nucleosome upstream of the Rap1 binding site and the preceding nucleosome array shift inwards but maintain their phasing and inter-nucleosome spacing. Given that the divergent core promoters are located near the Rap1 sites, the -1 nucleosome shifts to a potentially optimal position for initiation of divergent transcription in ASc1 and ASc2. In ASc3, where divergent transcription was not observed after Rap1 depletion, the upstream flanking nucleosomes do not display this inwards shift in position. This drastic difference in chromatin organisation may be due to transcription-coupled chromatin remodelling in the divergent direction. RSC recruitment and activity at promoters is independent of Rap1 (Kubik et al., 2018), and likely helps to maintain this narrower NDR after Rap1 depletion.

5.4.3 RSC promotes divergent transcription in the absence of Rap1 at several model loci

RSC and Rap1 regulate chromatin organisation, and chromatin organisation influences transcription. Therefore, I speculated that RSC controls promoter directionality by promoting divergent transcription in the absence of Rap1. I generated an auxin-inducible depletion (AID) allele for Sth1, the ATPase subunit of the RSC complex, which I also combined with the Rap1-AID allele to allow simultaneous protein depletion (Figure 5.3A). As shown before, Rap1 depletion led to aberrant expression of the divergent transcripts IRT2 and iMLP1. Depletion of Sth1 alone (Sth1-AID +IAA) had no effect on the expression of IRT2 and iMLP1, which are normally repressed when Rap1 is present (Figure 5.3B). However, when Rap1 and Sth1 were depleted concurrently, expression of IRT2 and iMLP1 was notably reduced to about half of the level observed after Rap1 depletion (Figure 5.3B-C). I also validated these findings using the pPS fluorescent reporter system containing a Rap1 binding site (R1p) in close proximity to the SUT129/YFP TSS (Figure 5.3D). At an independent promoter that is subject to regulation by Rap1, co-depletion of RSC suppresses the increase in divergent promoter activity induced after depletion of Rap1 alone. These data indicate that RSC activity promotes expression of these divergent transcripts in the absence of Rap1.

Figure 5.3 RSC promotes divergent transcription in the absence of Rap1 at several model loci (A) Auxin-induced depletion (AID) of Rap1 and Sth1 detected by western blotting. Samples were collected from RAP1-AID (FW3877), STH1-AID (FW6032), and RAP1-AID STH1-AID (FW6231) cells before (-IAA) and after (+IAA) auxin treatment. Depletion of V5-AID tagged proteins was detected with an anti-V5 antibody. Sth1-AID-FLAG was detected with an anti-FLAG antibody after re-probing of the V5-blot. The red asterisks on the FLAG blot indicate the residual V5 blot signal. Hxk1 was detected as a loading control. (B) IRT2 and iMLP1 expression in cells after depletion of Sth1, Rap1, or both factors, detected by northern blotting. Cells harbouring RAP1-AID (FW3877), STH1-AID (FW6032), and RAP1-AID STH1-AID (FW6231) alleles were grown to exponential phase, and samples were collected before (-) and 2 hours after (+) treatment with IAA (500 μM). The transcripts IRT2, iMLP1 and SNR190 were probed. (C) Quantification of IRT2 and iMLP1 expression (right). The signal was normalised over the SNR190 loading control. The normalised signal for IRT2 or iMLP1 expression in Rap1 depleted cells (RAP1-AID +IAA) was set to 1. Mean values +SEM are plotted (n = 3). (D) SUT129 promoter activity is suppressed by co-depletion of RSC and Rap1. RAP1-AID (FW6206), STH1-AID (FW6218), and RAP1-AID STH1-AID (FW6433) cells harbouring the R1p construct were treated with IAA (500 μM) or left untreated (NT). SUT129 activity (YFP) of the R1p reporter construct was measured as described in Figure 4.2. For each sample, mean signal corrected for background signal (AU, arbitrary units) is plotted plus 95% confidence intervals for n = 50 cells. Folkert van Werven generated the plasmids containing Rap1 motifs, and assisted with collection and analysis of fluorescence microscopy data in this figure.
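
The normalisation used for the northern blot quantification in panel C (signal over the SNR190 loading control, expressed relative to the Rap1-depleted sample, then mean and SEM over three replicates) can be written out as a short worked example. The band intensities are hypothetical and the helper names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

def relative_expression(target, loading, ref_target, ref_loading):
    """Target signal normalised to its loading control, expressed
    relative to the reference lane (reference lane = 1)."""
    return (target / loading) / (ref_target / ref_loading)

def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical band intensities per replicate:
# (target, loading control) in the co-depleted lane, then
# (target, loading control) in the Rap1-depleted reference lane.
replicates = [
    (50.0, 100.0, 100.0, 100.0),
    (60.0, 120.0, 90.0, 90.0),
    (33.0, 110.0, 80.0, 160.0),
]

relative = [relative_expression(*rep) for rep in replicates]
avg, sem = mean_sem(relative)
print(f"relative expression = {avg:.2f} +/- {sem:.2f} (SEM, n = {len(relative)})")
```

Normalising within each replicate before averaging means that differences in exposure or loading between blots do not inflate the error bars.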

5.4.4 Development of nascent RNA-seq method to measure nascent transcription genome-wide

I identified that functional interplay between Rap1 and RSC controls divergent transcription at three model loci. To expand upon these findings, I adapted a published protocol to measure nascent transcription at a genome-wide scale (Churchman and Weissman, 2011, 2012). Standard sequencing of total or polyA RNA is sufficient to reliably detect noncoding transcripts and measure steady-state transcript expression, but many divergent and noncoding transcripts are less stable than their protein-coding counterparts (Neil et al., 2009; Ntini et al., 2013; Preker et al., 2008; Schulz et al., 2013; van Dijk et al., 2011; Xu et al., 2009). Therefore, differential RNA stability may adversely affect measurements of promoter directionality. Methods to measure nascent transcription minimise the effects of RNA stability on transcript expression levels, and provide more accurate measurements of changes in promoter directionality. I adapted a method for affinity purification of RNA polymerase II, successfully implemented in published NET-seq and TEF-seq studies (Churchman and Weissman, 2011; Fischl et al., 2017). In this nascent RNA-seq approach, the Rpb3 subunit of RNA polymerase II is endogenously tagged with the 3xFLAG tag to facilitate isolation of nascent RNAs bound to the RNA polymerase complex through affinity purification. I successfully combined the RPB3-3xFLAG allele with the AID system to allow nascent RNA to be purified after auxin-inducible depletion of Rap1, Sth1, or both proteins (Figure 5.4A). I collected three biological replicate samples for strains with no degron, Rap1-AID, Sth1-AID, and Rap1-AID Sth1-AID strains with and without treatment with auxin (all strains with Rpb3-3xFLAG).

Briefly, cells containing the respective AID systems and Rpb3-3xFLAG were cultured in rich media to exponential growth phase, and then treated with DMSO (vehicle) or IAA (auxin) for 2 hours (Figure 5.4B). Cells were collected by gentle centrifugation, and immediately snap-frozen in liquid nitrogen (LN2). This crucial step circumvents the drastic changes in transcriptional activity induced if the cells cultured at 30 °C were suddenly resuspended in cold lysis buffer with a high concentration of salts and detergents. Subsequently, the frozen cell pellets were subjected to cryogenic lysis by grinding in a LN2-cooled freezer mill, to prevent degradation of RNA and protein during cell lysis. The yeast “grindates” were then resuspended in lysis buffer and treated with an excess of DNase I to digest accessible DNA and solubilise chromatin-bound proteins, including RNA polymerase II containing Rpb3-3xFLAG. After clarification, the lysates containing nucleic acid-protein complexes released from chromatin were used as input material for immunoprecipitation of Rpb3-3xFLAG using anti-FLAG affinity resin. Finally, protein-RNA complexes containing Rpb3-3xFLAG were specifically released from the affinity resin by competitive elution using an excess of 3xFLAG peptide. The associated RNA was isolated by acid phenol-chloroform extraction prior to RNA fragmentation and library preparation.

Using this adapted protocol, I was able to successfully solubilise Rpb3-3xFLAG in the lysis buffer, immunoprecipitate the protein using specific antibodies coupled to an affinity resin, and elute Rpb3-3xFLAG using the 3xFLAG peptide (Figure 5.4C). However, when I assessed the quality and distribution of fragment sizes for the purified RNA, I observed that there was a significant amount of contamination from rRNA (Figure 5.4D). The purified RNA associated with Rpb3-3xFLAG included large fragments several thousand nucleotides in length, indicating that the RNA was not severely degraded during the purification protocol. However, if the final libraries comprised a significant portion of rRNA, the effective sequencing depth for nascent RNA polymerase II transcripts would be reduced. Therefore, I implemented an additional step to deplete ribosomal RNAs from the samples using hybridisation-based capture (Illumina Ribo-Zero Gold Yeast). The Ribo-Zero treatment successfully depleted rRNA from the nascent RNA samples (Figure 5.4E). Purified nascent RNA was subjected to fragmentation by incubating samples at 70 °C with zinc ions, and libraries were successfully generated for 100 bp paired-end (PE) sequencing (Figure 5.4F).

Figure 5.4 Development of nascent RNA-seq method to measure nascent transcription genome-wide (A) Western blots showing expression and auxin-inducible depletion of Rap1-AID and Sth1-AID, and expression of Rpb3 tagged with the 3xFLAG epitope for immunoprecipitation. Samples were collected from no degron (FW7228), Sth1-AID (FW7220), Rap1-AID (FW7238), and Rap1-AID Sth1-AID (FW723) cells after mock treatment with DMSO (-IAA) or 2 hours after treatment with auxin (+IAA). Depletion of V5-AID tagged proteins was detected with an anti-V5 antibody. Hxk1 was detected as a loading control. (B) Schematic diagram of the protocol used to purify nascent RNAs, involving cryogenic lysis, solubilisation of chromatin by DNase I digestion, and affinity purification of RNA polymerase II. (C) Western blot showing level of Rpb3-3xFLAG protein present in representative samples from one biological replicate experiment. Input (I, 0.06%); flow-through (FT, 0.06%); elution (E, 0.55%); beads (B, 1.67%). Proteins were detected using an anti-FLAG antibody. Samples were transferred to separate membranes after SDS-PAGE but blotted, incubated, and imaged together under identical conditions. (D) Electropherogram from Bioanalyzer analysis of a representative sample, showing the distribution of sizes for RNA co-purified with Rpb3-3xFLAG prior to fragmentation. The peaks of 18S and 26S rRNA are annotated, and the fraction of total RNA comprising rRNA was estimated in each sample by the segmented area of each peak. Bioanalyzer analysis of the nascent RNA-seq libraries was performed with the help of the Genomics Equipment Park Facility. (E) Scatter plots showing the amount of rRNA remaining in the nascent RNA-seq samples before removal of rRNAs by bead-based capture, and in the final libraries after rRNA removal and sequencing. Harshil Patel performed the read trimming, filtering, and mapping to calculate the proportion of reads mapping to rDNA loci. (F) Density plots from Agilent Tapestation analysis showing the distribution of RNA or DNA fragment sizes for the purified fragmented RNA (top) or final DNA libraries (bottom), for one set of biological replicate samples. Molecular weight ladders are shown for comparison on each plot (size in nucleotides). The Agilent Tapestation analysis was performed with the help of the Advanced Sequencing Facility in the course of library preparation.

5.4.5 Validation of nascent RNA-seq using conditional depletion mutants for Rap1

The nascent RNA-seq libraries were successfully sequenced. I then aimed to assess the quality of the nascent transcriptome sequencing data. Each library was sequenced to a depth of ~25 million paired-end reads, approximately half the depth of the total and polyA RNA sequencing experiments in Chapter 3 (Figure 5.5A). Given the combination of the small yeast genome size (~12 Mb), enrichment for RNA polymerase II transcripts, and the rRNA depletion, I expected that this sequencing depth (~25 M reads) would provide more than sufficient coverage over genic and intergenic regions. After filtering and mapping of the reads to the S. cerevisiae reference genome, approximately 90% of the total reads in each sample were used for further analysis. I observed high similarity between each biological replicate sample for the respective conditions tested (Figure 5.5C). Differential expression analysis confirmed that Rap1-regulated genes were down-regulated upon depletion of Rap1-AID protein as expected (Figure 5.5B). Furthermore, I compared the expression levels of the 141 high-confidence Rap1-regulated genes in the nascent RNA-seq data to the total RNA sequencing data (Figure 5.5D). Overall, the Rap1-regulated genes showed lower expression (e.g. in wild-type and AID strains with no depletion) in nascent RNA-seq than total RNA-seq. This apparent decrease may reflect the inherent contributions of relative RNA stability in the total RNA fraction. Depletion of Rap1 alone or Rap1 in combination with Sth1 resulted in a large decrease in expression for Rap1-regulated genes. Taken together, these data indicate that the nascent RNA sequencing approach implemented here is appropriate for genome-wide analysis of the nascent transcriptome, and can be exploited to quantitatively measure changes in promoter directionality.
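The expectation of ample coverage can be illustrated with a back-of-the-envelope calculation. The sketch below is only a nominal estimate: it divides the total number of sequenced bases by the genome size, and it assumes that "~25 million paired-end reads" refers to 25 million read pairs (the more conservative reading, 25 million individual reads, is also shown). The function name and its parameters are illustrative and are not part of the analysis pipeline. Real coverage is far from uniform, because reads concentrate on transcribed regions.

```python
def nominal_fold_coverage(n_fragments, read_length_bp, genome_size_bp,
                          reads_per_fragment=2):
    """Nominal genome-equivalent coverage: total sequenced bases / genome size."""
    total_bases = n_fragments * reads_per_fragment * read_length_bp
    return total_bases / genome_size_bp


# Values quoted in the text: ~25 million reads, 100 bp reads, ~12 Mb genome.
# Assumption: 25 million read pairs (two 100 bp reads per fragment).
coverage_pairs = nominal_fold_coverage(25_000_000, 100, 12_000_000)

# Conservative alternative: 25 million individual 100 bp reads.
coverage_reads = nominal_fold_coverage(25_000_000, 100, 12_000_000,
                                       reads_per_fragment=1)

print(f"{coverage_pairs:.0f}x (pairs), {coverage_reads:.0f}x (single reads)")
```

Under either reading, the nominal depth is in the hundreds of genome equivalents, before accounting for the ~90% of reads retained after filtering and mapping.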

Figure 5.5 Validation of nascent RNA-seq using conditional depletion mutants for Rap1 (A) Bar charts depicting the effective sequencing depth of the nascent RNA-seq libraries. The total number of 100 bp paired-end reads acquired for each library is shown in dark blue, and the number of reads remaining after mapping and filtering is shown in light blue. All strains contain the RPB3-3xFLAG allele: Wild-type (WT, FW7228), Sth1-AID (FW7220), Rap1-AID (FW7238), and Rap1-AID Sth1-AID (FW7232). Harshil Patel performed read trimming, filtering, and mapping to generate the processed data included in this figure. (B) Volcano plot showing that expression of Rap1-regulated genes is decreased upon Rap1 depletion in nascent RNA-seq data. On the y-axis the false discovery rate adjusted p-value (-Log10(padj)) is plotted, and on the x-axis the fold change is displayed (Log2(Fold change)). Samples were compared from RAP1-AID (FW3877) cells after 2 hours of IAA or DMSO treatment. Rap1-regulated genes (n = 141) are highlighted in red. Data are calculated using three biological replicate experiments. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (C) Correlation matrix heat map showing agreement between biological triplicate nascent RNA-seq samples. Colour scale corresponds to the Euclidean distance between the samples, based on all genes. Harshil Patel performed read filtering, mapping, and clustering analysis, and generated the heat map in this figure. (D) Scatter plots showing the normalised expression level of Rap1-regulated genes (n = 141) in nascent RNA-seq data (right), compared to total RNA-seq data from an earlier experiment (Figure 3.4) (left). Data are calculated as mean transcripts per million (TPM) +1 from three biological replicates for each condition.
Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

5.4.6 Nascent RNA-seq enriches for nascent RNA polymerase II transcripts

I then directly assessed the quality of enrichment for nascent transcripts. Eukaryotic pre-mRNAs are co-transcriptionally spliced (Moore and Proudfoot, 2009; Wallace and Beggs, 2017). Fractions of RNA from whole cells or the cytoplasm are enriched in processed mRNAs lacking introns, whereas nuclear RNAs associated with actively transcribing RNA polymerase II still contain introns prior to splicing. Therefore, read coverage over intronic regions can be an indicator for successful enrichment of nuclear RNAs associated with RNA polymerase II. In yeast, only 282 of ~6500 genes are spliced, but most RP genes contain introns (Spingola et al., 1999; Zerbino et al., 2018). I compared the total and nascent RNA-seq coverage at representative RP genes (Figure 5.6A, rows 1-3), and found that coverage between introns and their adjacent exons was comparable. In addition, a spliced non-RP gene, TAF14, also displayed comparable intronic and exonic read coverage. These data verify that this protocol involving solubilisation of chromatin-bound proteins and affinity purification of RNA polymerase II-associated RNAs is appropriate for detection and quantification of nascent transcripts.
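The comparison of intronic and exonic coverage was made by inspecting coverage tracks. The same idea could be expressed numerically as an intron-to-exon coverage ratio, as in the minimal sketch below. The function names, the half-open coordinate convention, and the toy coverage values are all assumptions made for illustration; they are not taken from the data in this chapter.

```python
def mean_coverage(coverage, start, end):
    """Mean per-base coverage over the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("interval must have positive length")
    return sum(coverage[start:end]) / (end - start)


def intron_exon_ratio(coverage, exons, intron):
    """Mean intron coverage divided by mean coverage of the flanking exons.

    coverage: per-base read coverage for one locus (list of numbers)
    exons:    list of (start, end) half-open intervals
    intron:   single (start, end) half-open interval
    Returns None when the exons have no coverage at all.
    """
    exon_bases = sum(end - start for start, end in exons)
    exon_signal = sum(sum(coverage[start:end]) for start, end in exons)
    exon_mean = exon_signal / exon_bases
    if exon_mean == 0:
        return None
    return mean_coverage(coverage, *intron) / exon_mean


# Toy locus: exon 1 = [0, 10), intron = [10, 30), exon 2 = [30, 50).
exons, intron = [(0, 10), (30, 50)], (10, 30)
nascent_like = [20] * 10 + [18] * 20 + [20] * 20  # intron retained
total_like = [20] * 10 + [2] * 20 + [20] * 20     # intron mostly spliced out

nascent_ratio = intron_exon_ratio(nascent_like, exons, intron)
total_ratio = intron_exon_ratio(total_like, exons, intron)
print(nascent_ratio, total_ratio)
```

A ratio close to 1 would indicate that unspliced transcripts dominate the sample, whereas a ratio close to 0 would be expected for a fraction dominated by mature mRNA.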

Figure 5.6 Nascent RNA-seq enriches for nascent RNA polymerase II transcripts Screenshots showing coverage from total RNA-seq and nascent RNA-seq experiments, for wild-type cells (FW629, total RNA tracks) and cells containing Rpb3-3xFLAG (FW7228, nascent RNA tracks) at different intron-containing genes. Signals corresponding to the Watson (blue) and Crick (red) strands are depicted separately. The scales for the normalised signals for each strand are listed on the right of each track. Screenshots are from one representative biological sample for each sequencing type. The legend at the bottom illustrates how introns and exons are depicted in the “Genes” track. Harshil Patel performed read filtering, mapping, and generated the read coverage tracks for data visualisation in this figure.

5.4.7 RSC regulates relative expression of a large fraction of coding genes and noncoding RNAs in budding yeast

To identify genes that respond to depletion of RSC, differential expression analysis was performed on coding and noncoding transcripts after Sth1 depletion. The aim of this analysis was to identify whether certain genes show higher sensitivity to RSC activity, and to highlight candidates for further investigation of promoter directionality. Previous and recent work has shown that depletion of RSC leads to a global drop in expression for most coding genes and rRNA (Klein-Brill et al., 2019; Parnell et al., 2008). This decrease in total RNA expression was validated using K. lactis cells as spike-in controls within an mRNA-seq and SLAM-seq time course (Klein-Brill et al., 2019). RSC likely affects efficiency of transcription initiation through modulation of +1 nucleosome positioning at the TSS (Klein-Brill et al., 2019; Kubik et al., 2018). As the nascent RNA samples presented here did not include spike-in controls at the cellular level, the differences in expression must be interpreted as relative changes rather than global changes (discussed in more detail at the end of this chapter). First, I compared the changes in nascent RNA expression between Sth1-AID cells treated with IAA for 2 hours, and a mock treatment. 1548 genes were differentially expressed after depletion of Sth1, which were split approximately evenly between increased and decreased expression (Figure 5.7A, left). 808 genes were significantly up-regulated and 740 genes were significantly down-regulated after auxin-induced depletion of Sth1. In contrast, only 198 genes were significantly up-regulated and 107 genes were significantly down-regulated when comparing the Sth1-AID degron strain after mock treatment to a “no degron” control strain with Rpb3-3xFLAG (Figure 5.7A, right).
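The counts of up- and down-regulated genes follow from the thresholds stated in the legend of Figure 5.7 (fold change > 2, p < 0.05). The sketch below shows how such a classification could be applied to a table of differential expression results. It assumes that the p-value threshold refers to the adjusted p-value plotted on the y-axis, and the gene identifiers and values are invented for illustration; the function names are hypothetical and do not describe the pipeline actually used.

```python
import math


def classify(log2_fold_change, padj, min_fold=2.0, alpha=0.05):
    """Label one transcript as 'up', 'down' or 'ns' (not significant)."""
    # Transcripts without a usable adjusted p-value are treated as not significant.
    if padj is None or math.isnan(padj) or padj >= alpha:
        return "ns"
    threshold = math.log2(min_fold)  # fold change > 2 means |log2FC| > 1
    if log2_fold_change > threshold:
        return "up"
    if log2_fold_change < -threshold:
        return "down"
    return "ns"


def count_calls(results):
    """Count labels for a dict of {transcript_id: (log2_fold_change, padj)}."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for log2_fc, padj in results.values():
        counts[classify(log2_fc, padj)] += 1
    return counts


# Invented example values (not data from this thesis).
toy_results = {
    "gene_a": (2.3, 0.001),          # up
    "gene_b": (-1.8, 0.01),          # down
    "gene_c": (0.4, 0.0001),         # significant but small change
    "gene_d": (3.0, 0.20),           # large change but not significant
    "gene_e": (-2.5, float("nan")),  # no adjusted p-value
}
calls = count_calls(toy_results)
print(calls)
```

Because no spike-in control was included, such calls describe changes relative to the rest of the transcriptome rather than absolute changes.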

Given the extensive effects of RSC on coding gene expression and chromatin organisation, I then determined whether RSC also affects expression of noncoding RNAs – many of which are divergent promoter transcripts. I examined the expression of annotated CUTs, SUTs, NUTs, and XUTs before and after depletion of Sth1 (Figure 5.8A). A greater number of noncoding RNAs in each class were differentially expressed after depletion of Sth1 (Figure 5.8C, left), compared to the control comparison between Sth1-AID after mock treatment and a strain with no degron (Figure 5.8C, right). More noncoding RNAs were significantly up-regulated in expression after depletion of Sth1 (Figure 5.8A, C). Fewer noncoding RNAs were differentially expressed comparing Sth1-AID with mock treatment (DMSO) to a “no degron” control strain (Figure 5.8C-D), suggesting that the STH1-AID allele has a small effect on RSC activity. In conclusion, depletion of RSC leads to differential expression of many coding and noncoding transcripts. Expression of these noncoding RNA classes tends to be higher after depletion of Sth1.

Figure 5.7 RSC regulates relative expression of a large fraction of coding genes Volcano plot showing the relative changes of all genes after depletion of Sth1. On the y-axis the false discovery rate adjusted p-value (-Log10(padj)) is plotted, and on the x-axis the fold change is displayed (Log2(Fold change)). Samples were compared between Sth1-AID (FW7220) cells after 2 hours of IAA or DMSO treatment (left), or between Sth1-AID (FW7220) cells after 2 hours of DMSO treatment and cells with no degron (FW7228) (right). All strains contain the RPB3-3xFLAG allele. The number of significantly (Sig.) up- or down-regulated genes (Fold-change > 2, p < 0.05) is stated on each plot. Data are calculated using three independent experiments. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

Figure 5.8 RSC regulates relative expression of many noncoding RNAs (A) Volcano plot showing the relative changes of noncoding RNAs (CUTs, SUTs, NUTs, and XUTs) after depletion of Sth1. On the y-axis the false discovery rate adjusted p-value (-Log10(padj)) is plotted, and on the x-axis the fold change is displayed (Log2(Fold change)). Samples were compared between Sth1-AID (FW7220) cells after 2 hours of IAA or DMSO treatment. All strains contain the RPB3-3xFLAG allele. Data are calculated using three independent experiments. (B) Similar to A, except that the comparison is between Sth1-AID (FW7220) cells after 2 hours of DMSO treatment and cells with no degron (FW7228) (right). All strains contain the RPB3-3xFLAG allele. (C) The number and fraction (in parentheses) of significantly up- or down-regulated noncoding RNAs of each class in each comparison from A and B are listed. The total number of noncoding transcripts in each class is stated on the right side of each table. Transcripts with fold change > 2 and p < 0.05 were considered to be significantly up- or down-regulated. (A-C) Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure.

5.4.8 Rap1 and RSC have distinct chromatin organisation functions at gene promoters

I successfully verified the nascent RNA-seq approach and subsequently measured nascent transcription after depletion of Rap1 and Sth1. I then decided to revisit the functional relationship between chromatin organisation and divergent transcription to identify directions for further investigation. Previously, I discovered that approximately two-thirds of Rap1-regulated genes display divergent transcription after depletion of Rap1 (ASc1-ASc2), whereas approximately one-third do not (ASc3) (Figure 5.9A). I wondered whether these groups of genes would exhibit distinct changes in nucleosome organisation in response to depletion of Rap1, RSC, or both factors. I therefore examined published MNase-seq data from these respective conditions. As described earlier in this Chapter, depletion of Rap1 results in an inwards shift of the -1 nucleosome and a reduction in NDR width in ASc1 and ASc2 only. A corresponding inwards shift of the subsequent flanking nucleosomes is also detectable (Figure 5.2C, Figure 5.9B middle column). Depletion of Sth1 caused a minor displacement of the -1 and -2 nucleosomes upstream of the TSS, significantly less than the shift observed after depletion of Rap1 (Figure 5.9B, left column). The three groups of Rap1-responsive genes (ASc1, ASc2, and ASc3) showed similar changes in nucleosome position in response to depletion of Sth1. However, when Sth1 was depleted along with Rap1, the strong phasing of upstream nucleosomes (-1, -2, and -3) observed at genes with divergent transcription was entirely lost (ASc1) or decreased (ASc2) (Figure 5.9B, right column). The coding direction nucleosomes (+1 and +2) only showed a minor displacement in response to RSC, supporting conclusions from recent work demonstrating that the nucleosome organisation activities of Rap1 and RSC at gene promoters are independent (Krietenstein et al., 2016; Kubik et al., 2018). Co-depletion of Sth1 with Rap1 did not affect phasing of upstream nucleosomes in ASc3, where little to no divergent transcription occurred after Rap1 depletion. In conclusion, Rap1 and RSC have distinct roles in organisation of chromatin at gene promoters. Nucleosome positioning at a subset of promoters where divergent transcription occurs may be more sensitive to nucleosome sliding or eviction by RSC.

Figure 5.9 Rap1 and RSC have distinct chromatin organisation functions at gene promoters (A) Heat map reproduced from Figure 3.6B, showing changes in total RNA expression on the antisense strand after depletion of Rap1 protein for Rap1-regulated genes (plot centred on promoter Rap1 binding site (bs)). Promoters were clustered based on AS (ASc1-ASc3) signals using k-means clustering (k = 3). ASc1 and ASc2 were taken to display divergent transcription after Rap1 depletion, and ASc3 was taken to display little to no divergent transcription. Colour scale corresponds to magnitude and direction of change, calculated in bins of 5 bp throughout intervals centred on Rap1 sites at promoters. Number of promoters in each cluster: ASc1 (n = 59), ASc2 (n = 47), ASc3 (n = 35). Harshil Patel performed read filtering, mapping, differential expression analysis, and generated the heat map in this figure. (B) MNase-seq metagene plots showing that promoters with Rap1-dependent divergent transcription show differences in nucleosome occupancy after depletion of Rap1, Sth1, or both factors. Normalised signal per million reads is shown 450 bp up- and downstream of the Rap1 binding site (bs). Separate plots are shown for the set of Rap1-regulated genes as described in A (n = 141), and each cluster of genes according to the antisense strand (ASc1, ASc2, ASc3). MNase-seq coverage is shown from control sample with no depletion (black or grey), after Sth1 depletion (green), after Rap1 depletion (red), or after depletion of both Rap1 and Sth1 (blue). Data were obtained from GEO:GSE73337 and GEO:GSE98260 (Kubik et al., 2015; Kubik et al., 2018). Harshil Patel performed read filtering, mapping, generated the MNase-seq read coverage tracks, and generated the metagene plots in this figure.

5.4.9 RSC activity is required for divergent transcription in some instances

The MNase-seq analysis of nucleosome positioning at Rap1-regulated genes (Figure 5.9B) showed that the contributions of Rap1 and RSC towards nucleosome positioning are largely independent, in agreement with published reports (Kubik et al., 2018). I then asked whether the functions of Rap1 and RSC in regulating divergent transcription are similarly independent, or interact genetically (e.g. show epistasis). I first examined the changes in RNA expression in total RNA and nascent RNA samples for the IRT2 and iMLP1 divergent transcripts (Figure 5.10A-B). In addition, the MNase-seq coverage from the corresponding conditions (e.g. after depletion of Rap1, Sth1, or both) was visualised for comparison. At the RPL43B locus, IRT2 is strongly expressed after depletion of Rap1 but not Sth1. Co-depletion of RSC did not suppress IRT2 transcription, and divergent transcription still occurred after depletion of Rap1 and Sth1. As expected, the phasing of the upstream promoter nucleosome arrays (corresponding to IRT2) was disrupted after co-depletion of Rap1 and RSC, and a further contraction of the NDR was observed. At the RPL40B locus, depletion of Rap1 induced expression of the iMLP1 5’ extended transcript isoform, while RPL40B expression greatly decreased (Figure 5.10B). Depletion of Sth1 alone was not sufficient to induce significant expression from the divergent TSS. Similarly to IRT2, co-depletion of RSC did not fully suppress iMLP1 expression. iMLP1 showed a similar pattern of expression after co-depletion of Rap1 and RSC to Rap1 depletion alone. Depletion of Rap1 or RSC individually reduces the width of the NDR, an effect that is compounded when these factors are depleted together. Despite this increasingly inaccessible environment, divergent transcription is still able to initiate within the diminished NDR. At IRT2 and iMLP1, nascent RNA-seq analysis therefore suggests that RSC may not be strictly required for divergent transcription.

However, I successfully identified two Rap1-regulated genes with divergent transcripts suppressed after co-depletion of Rap1 and RSC. At the RPL8A locus, a divergent noncoding transcript antisense to the neighbouring GUT1 gene is expressed after depletion of Rap1 (Figure 5.11A). Surprisingly, co-depletion of RSC completely suppressed expression of the RPL8A divergent noncoding RNA. The changes in nucleosome positioning at the RPL8A promoter in response to Rap1 and Sth1 depletion were generally similar to those at RPL43B and RPL40B. I also identified a similar phenomenon of suppression at the RPL35A locus. Expression of a divergent transcript is repressed by Rap1, unaffected by depletion of Sth1 alone, and suppressed after simultaneous depletion of Rap1 and Sth1 (Figure 5.11B). In other words, RSC activity is required for divergent transcript expression. At this gene promoter, the loss of upstream nucleosome phasing and positioning is more drastic after co-depletion of both Rap1 and RSC. This phenomenon may strongly depend on the local genomic context and chromatin environment of each gene promoter. These examples suggest that the requirement for RSC activity may not be identical at all bidirectional or divergent gene promoters.

Figure 5.10 Rap1 and RSC function independently at RPL43B and RPL40B (A) Screenshots showing RNA-seq coverage from nascent and total RNA-seq experiments showing the transcriptional responses of RPL43B and IRT2 in wild-type cells (WT, FW7228), and after depletion of Rap1 (FW7238), Sth1 (FW7220), or both (FW7232). Rap1 binding sites are represented by red boxes, exons are depicted as thicker blue lines, and introns are depicted as thinner blue lines. For reference, coverage from MNase-seq experiments showing the position of nucleosomes is also shown (Kubik et al., 2015; Kubik et al., 2018). Publicly available MNase-seq data were obtained from GEO:GSE73337 and GEO:GSE98260 (Kubik et al., 2015; Kubik et al., 2018). (B) Similar to A, except that the RPL40B locus containing the iMLP1 divergent noncoding RNA is depicted. (A-B) Harshil Patel performed read filtering, mapping, and generated the RNA-seq and MNase-seq coverage tracks for data visualisation in this figure.

Figure 5.11 RSC activity is required for divergent transcription in some instances (A) Screenshots showing RNA-seq coverage from nascent and total RNA-seq experiments showing the transcriptional responses at the RPL8A locus in wild-type cells (WT, FW7228), and after depletion of Rap1 (FW7238), Sth1 (FW7220), or both (FW7232). Rap1 binding sites are represented by red boxes, exons are depicted as thicker blue lines, and introns are depicted as thinner blue lines. For reference, coverage from MNase-seq experiments showing the position of nucleosomes is also shown (Kubik et al., 2015; Kubik et al., 2018). Publicly available MNase-seq data were obtained from GEO:GSE73337 and GEO:GSE98260 (Kubik et al., 2015; Kubik et al., 2018). (B) Similar to A, except that the RPL35A locus is depicted. (A-B) Harshil Patel performed read filtering, mapping, and generated the RNA-seq and MNase-seq coverage tracks for data visualisation in this figure.

5.4.10 Rap1 and RSC control noncoding transcription around Rap1 sites genome-wide

To gain a more comprehensive picture of the relationship between Rap1 and the RSC chromatin remodeller, I examined the changes in nascent transcription around Rap1 binding sites genome-wide. Using the +100 bp strand-specific windows centred on 564 putative Rap1 binding sites throughout the genome, I plotted the relative changes in nascent RNA expression after depletion of Rap1, Sth1, or both factors simultaneously. Depletion of Rap1 or RSC alone led to an overall increase in RNA expression around Rap1 binding sites (Figure 5.12A). However, co-depletion of Rap1 and RSC resulted in an even larger increase in RNA expression. These extensive effects on nascent RNA expression were not observed when comparing the AID strains with mock treatment to a “no degron” control sample containing Rpb3-3xFLAG (Figure 5.12A, right). Next, I measured the changes in nascent RNA expression for strand-specific windows around Rap1 binding sites at gene promoters. Depletion of Rap1 increased noncoding transcription within the antisense and sense strand windows, with a bias towards higher antisense transcription as previously observed with total RNA-seq (Figure 5.12B, Figure 3.6). After depletion of Sth1 alone, I observed smaller but equal increases in antisense and sense transcription. Finally, co-depletion of Rap1 and RSC increased nascent RNA expression within both antisense and sense strand windows, and did not differ substantially from depletion of Rap1 alone. These data suggest that RSC may regulate noncoding transcription which overlaps with a large subset of Rap1 binding sites genome-wide. However, RSC may not be strictly required for divergent transcription at gene promoters in the absence of Rap1.
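The strand-specific windows used in this analysis can be pictured as one interval per strand at each Rap1 binding site, labelled as sense or antisense relative to the downstream coding gene. The sketch below is a hypothetical illustration of that bookkeeping. The `Window` type, the function name, the coordinates, and the choice of a window extending 100 bp either side of the site centre are assumptions; the exact window definition used for the figures is not restated here.

```python
from typing import NamedTuple


class Window(NamedTuple):
    chrom: str
    start: int        # 0-based, inclusive
    end: int          # exclusive
    strand: str       # "+" (Watson) or "-" (Crick)
    orientation: str  # "sense" or "antisense" relative to the coding gene


def strand_windows(chrom, site_centre, gene_strand, half_width=100):
    """Return one sense and one antisense window centred on a binding site."""
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    start = max(0, site_centre - half_width)
    end = site_centre + half_width
    opposite = "-" if gene_strand == "+" else "+"
    return [
        Window(chrom, start, end, gene_strand, "sense"),
        Window(chrom, start, end, opposite, "antisense"),
    ]


# Hypothetical site upstream of a gene on the Crick strand.
example = strand_windows("chrIV", 250_000, "-")

# 564 sites with two strand-specific windows each give 1128 measurements.
sites = [("chrI", 1_000 * (i + 1), "+") for i in range(564)]
all_windows = [w for site in sites for w in strand_windows(*site)]
print(len(all_windows))
```

Keeping the two strands as separate measurements is what allows sense and antisense changes at the same promoter to be compared directly.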

To examine these transcriptional changes in higher resolution, the changes in transcription around Rap1-regulated gene promoters were plotted using heat maps (5 bp bin size), to highlight transcriptional changes on each strand. When clustered and ordered according to the groups of Rap1-regulated genes that previously showed divergent RNA expression (Figure 5.12C, top left), the nascent RNA samples showed nearly identical changes in noncoding transcript expression (Figure 5.12C, top row). In nascent RNA, divergent transcription occurred after depletion of Rap1 in the same gene clusters (ASc1 and ASc2), which comprise approximately two-thirds of Rap1-regulated genes. Changes in noncoding RNA expression in the sense direction around promoter Rap1 binding sites were not coupled to transcriptional changes on the antisense strand. In agreement with the differential expression analysis performed on promoter Rap1 “windows”, divergent transcription still occurred after co-depletion of Sth1 and Rap1 (Figure 5.12C, bottom left). Sense strand RNA expression was higher after co-depletion of Sth1 and Rap1, compared to Rap1 alone. Finally, I examined the direct effects of Sth1 depletion alone on antisense and sense transcription (Figure 5.12C, bottom right). After depletion of Sth1, increases in antisense transcription were observed ~150 bp upstream of the promoter Rap1 binding sites at the majority of Rap1-regulated genes. Notably, this position approximates the location of the flanking -1 nucleosome at these Rap1-regulated promoters (Figure 5.2C). Taken together, these data indicate that RSC is likely not required for divergent promoter activity at Rap1-regulated gene promoters in the absence of Rap1. Preliminary analysis also suggests that Sth1 may separately regulate divergent transcription and promoter directionality to some extent at most Rap1-regulated genes.
Whether the RSC chromatin remodeller controls promoter directionality in general remains to be determined; this will constitute the main focus of the remaining analysis in this Chapter.

Figure 5.12 Rap1 and RSC control noncoding transcription around Rap1 sites genome-wide (A) Violin and box-and-whisker plots showing changes in nascent RNA expression after depletion of Rap1, Sth1, or both factors at all Rap1 binding sites genome-wide. As a control, signals were compared between degron strains with no depletion (DMSO) and wild-type (WT, no degron) strains. +100 bp windows centred around Rap1 sites across the genome were used (n = 564 sites, signals for W and C strands computed separately to generate 1128 measurements). Data were calculated using three biological replicate samples for each condition. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (B) Scatter plots showing changes in nascent RNA expression at Rap1-regulated promoters after depletion of Rap1, Sth1, or both factors. Expression changes for antisense (AS) and sense (S) direction windows relative to the coding gene for Rap1-regulated promoters (n = 141). Each point represents one strand-specific interval at one promoter and horizontal red and blue lines represent mean values. Harshil Patel performed read filtering, mapping, and differential expression analysis to generate the processed data included in this figure. (C) Heat maps showing changes in total and nascent RNA expression on AS or S strands, for data described in A and Figure 3.6B. Promoters were clustered based on AS (ASc1-ASc3) or S (Sc1-Sc3) signals using k-means clustering (k = 3) for total RNA (top left). Colour scale corresponds to magnitude and direction of change, calculated in bins of 5 bp throughout intervals centred on Rap1 sites at promoters. Number of promoters in each cluster: ASc1 (n = 59), ASc2 (n = 47), ASc3 (n = 35). All strains in nascent RNA-seq samples contain RPB3-3xFLAG allele: Wild-type (WT, FW7228), Sth1-AID (FW7220), Rap1-AID (FW7238), and Rap1-AID Sth1-AID (FW7232).
Total RNA-seq Rap1-AID strain does not contain RPB3-3xFLAG allele (FW3877). Harshil Patel performed read filtering, mapping, differential expression analysis, and generated the heat maps in this figure.

5.4.11 Quantification of promoter directionality using nascent RNA sequencing

In order to obtain a quantitative view of global promoter directionality, a simple bioinformatic pipeline was adapted to calculate a “promoter directionality score” for each gene. This approach has been successfully implemented in S. cerevisiae for analysis of NET-seq data (Churchman and Weissman, 2011; Jin et al., 2017). The nascent RNA-seq data generated here similarly measures nascent transcription, albeit without mapping the position of RNA polymerase II at single nucleotide resolution. Essentially, two “windows” or intervals were generated for each gene, where the “sense” direction window starts at the true TSS and encompasses nucleotide positions +1 to +500 on the sense strand (Figure 5.13A). The “antisense” direction window encompasses position -1 to -500 on the antisense strand. To simplify the analysis and avoid duplicate counting, the coverage from the paired-end sequencing reads was reduced to a single nucleotide at the 3’ most position of each read. The reads in the sense and antisense windows were counted and a directionality score was calculated for each gene which represents a log10-transformed ratio of sense window reads divided by antisense window reads. The directionality score values were calculated for each coding gene within individual biological replicate samples using nascent RNA-seq data. By way of example, the ATG5 gene (directionality score: 1.13) is more directional than the YPT52 gene (directionality score: 0.36) (Figure 5.13A).
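The calculation described above can be summarised in a few lines of code. The following is a minimal sketch rather than the pipeline that produced the figures: it assumes that each read has already been reduced to the position and RNA strand of its 3’ most nucleotide, that position +1 corresponds to the TSS coordinate itself, and that a pseudocount is added to avoid division by zero. The pseudocount and all names in the code are assumptions introduced for illustration.

```python
import math


def directionality_score(three_prime_ends, tss, gene_strand,
                         window=500, pseudocount=1.0):
    """log10(sense window reads / antisense window reads) for one gene.

    three_prime_ends: iterable of (position, strand) for read 3' ends
    tss:              genomic coordinate of position +1
    gene_strand:      "+" or "-"
    """
    sense = antisense = 0
    for position, strand in three_prime_ends:
        # Offset from the TSS, counted positively in the direction of the gene.
        offset = position - tss if gene_strand == "+" else tss - position
        if strand == gene_strand and 0 <= offset < window:
            sense += 1      # positions +1 to +500 on the sense strand
        elif strand != gene_strand and -window <= offset < 0:
            antisense += 1  # positions -1 to -500 on the antisense strand
    numerator = sense + pseudocount
    denominator = antisense + pseudocount
    if numerator <= 0 or denominator <= 0:
        return None  # score undefined without reads or a pseudocount
    return math.log10(numerator / denominator)


# Toy gene on the "+" strand with its TSS at coordinate 1000.
reads = [(1000 + i, "+") for i in range(100)]  # 100 sense 3' ends
reads += [(990 + i, "-") for i in range(10)]   # 10 antisense 3' ends
reads += [(1600, "+"), (400, "-")]             # outside both windows

score_raw = directionality_score(reads, 1000, "+", pseudocount=0.0)
score_smoothed = directionality_score(reads, 1000, "+")
print(round(score_raw, 2), round(score_smoothed, 2))
```

On this scale a score of 0 corresponds to equal sense and antisense signal. The scores quoted for ATG5 (1.13) and YPT52 (0.36) correspond to sense-to-antisense ratios of roughly 13.5 and 2.3, respectively.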

In the compact yeast genome, many genes are arranged head-to-head in a divergent orientation. As my analysis primarily focused on divergent noncoding transcription, the list of coding genes was filtered to remove overlapping or divergent gene pairs, resulting in 2,609 tandemly oriented genes (Figure 5.13B) (Jin et al., 2017). The signals for antisense and sense transcription were plotted for each gene, and I found that there was a larger range of expression globally in the antisense direction than in the sense direction (Figure 5.13C, see marginal histogram plots). When divergently oriented and overlapping genes were included, the distribution of promoter directionality in wild-type S. cerevisiae cells is bimodal – there is a significant portion of both bidirectional and directional genes (Figure 5.13D). After filtering for tandem genes, the relative proportion of bidirectional genes diminished and more gene promoters showed a directional bias. This genome-wide distribution of directionality scores is consistent with the analysis by Stirling Churchman, Kevin Struhl, and colleagues (Jin et al., 2017). Therefore, the implementation and validation of this bioinformatic approach enables genome-wide analysis of promoter directionality in a quantitative manner.
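The filtering step can be sketched as follows. This is one plausible implementation under stated assumptions, not the criteria of Jin et al. (2017) or of the pipeline used here: a gene is dropped if it overlaps any other annotated gene, or if its nearest upstream neighbour lies on the opposite strand (a head-to-head pair). No distance cutoff is applied, and the `Gene` type, function names, and coordinates are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True if two genes on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def upstream_neighbour(gene, genes):
    """Nearest non-overlapping gene located upstream of the gene's 5' end."""
    candidates = []
    for other in genes:
        if other is gene or other.chrom != gene.chrom:
            continue
        if gene.strand == "+" and other.end <= gene.start:
            candidates.append((gene.start - other.end, other))
        elif gene.strand == "-" and other.start >= gene.end:
            candidates.append((other.start - gene.end, other))
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


def tandem_genes(genes):
    """Keep genes that neither overlap another gene nor form a divergent pair."""
    kept = []
    for gene in genes:
        if any(other is not gene and overlaps(gene, other) for other in genes):
            continue
        neighbour = upstream_neighbour(gene, genes)
        if neighbour is not None and neighbour.strand != gene.strand:
            continue  # upstream neighbour points away: head-to-head pair
        kept.append(gene)
    return kept


annotation = [
    Gene("geneA", "chrI", 100, 200, "+"),
    Gene("geneB", "chrI", 300, 400, "+"),  # tandem with geneA
    Gene("geneC", "chrI", 500, 600, "-"),  # divergent with geneD
    Gene("geneD", "chrI", 700, 800, "+"),  # divergent with geneC, overlaps geneE
    Gene("geneE", "chrI", 750, 900, "+"),  # overlaps geneD
]
kept_names = [gene.name for gene in tandem_genes(annotation)]
print(kept_names)
```

The pairwise comparison is quadratic in the number of genes, which is acceptable for a genome of this size; a sorted sweep along each chromosome would scale better.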

Figure 5.13 Quantification of promoter directionality using nascent RNA sequencing (A) Depiction of directionality score calculation examples for two genes, ATG5 and YPT52. Sense direction window depicted by light green box, antisense direction window depicted by pink box. Windows are centred on the transcription start site (TSS) of each gene. Only the coverage from the 3’ end of strand-specific nascent RNA-seq reads is depicted on the Watson (W, +, blue) and Crick (C, -, red) strand for each gene. The formula for calculation of the directionality score is also shown, along with representative scores for hypothetical genes with different ratios of sense to antisense transcription, and the two examples shown. Harshil Patel performed read filtering, mapping, generated the 3’ coverage tracks of nascent RNA-seq data, and calculated the directionality scores for this figure. (B) Schematic diagram depicting genes arranged in tandem, where the antisense direction is noncoding (left), versus divergently arranged genes (right), where the antisense direction for the gene on the right is the sense or “coding” direction for the gene on the left. Genes that were divergently oriented or overlapping were removed from the analysis, to focus on divergent noncoding transcription. (C) Scatter plot depicting the separate antisense (y-axis) and sense (x-axis) expression signal (arbitrary units) for each gene from nascent RNA-seq data (WT, FW7228). Each dot represents one gene, and separate plots are shown for all genes before filtering (n = 6,646, left) and after filtering for tandem and non-overlapping genes (n = 2,609, right). Marginal histogram plots on each axis show the distribution of expression on each strand. Harshil Patel performed read filtering, mapping, and calculation of promoter directionality scores to generate the processed data included in this figure. (D) Density plot showing distribution of promoter directionality scores for the sets of genes described in C.
All genes (light blue), filtered genes (dark blue). All directionality score data are calculated from three biological replicate experiments. Harshil Patel performed read filtering, mapping, and calculation of promoter directionality scores to generate the processed data included in this figure.
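As a rough illustration of how a score of this kind can be computed, the sketch below takes the summed 3’ end coverage in the sense and antisense windows of one gene and returns a log2 ratio. The log2 form, the pseudocount, the function name, and the example values are assumptions made for illustration only; the formula actually used in this chapter is the one shown in Figure 5.13A.

```python
import math

def directionality_score(sense_signal, antisense_signal, pseudocount=1.0):
    """Hypothetical score: log2 ratio of sense to antisense window signal.

    The pseudocount is an assumption; it avoids division by zero
    for windows that contain no reads.
    """
    return math.log2((sense_signal + pseudocount) / (antisense_signal + pseudocount))

# Illustrative values only, not measured data.
strongly_directional = directionality_score(1023, 0)   # sense signal dominates
bidirectional = directionality_score(99, 99)           # equal signal on both strands
print(strongly_directional, bidirectional)
```

Under this assumed definition, positive scores indicate predominantly sense (coding direction) transcription, scores near zero indicate balanced bidirectional output, and negative scores indicate predominantly antisense transcription.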

5.4.12 Contribution of Rap1 and RSC towards promoter directionality at Rap1-regulated genes

Having implemented this bioinformatic approach for analysis, I then aimed to obtain better quantitative insight into the functional interplay between transcription factors and chromatin remodellers that regulate promoter directionality. I examined the promoter directionality scores for Rap1-regulated genes and determined their responses to individual or simultaneous depletion of Rap1 and Sth1. In normal conditions, Rap1-regulated genes are among the most directional genes in yeast (Figure 5.14A). Implementation of the AID system for inducible depletion of Rap1 and Sth1 protein (without auxin treatment) did not significantly affect directionality of Rap1-regulated genes (Figure 5.14B). As expected, depletion of Rap1 led to a shift towards bidirectional transcription, as divergent transcripts are up-regulated while coding direction transcripts are concurrently down-regulated at many Rap1-regulated genes (Figure 5.14C and Figure 5.12C). Depletion of Sth1 alone did not induce a comparable shift from directional to bidirectional transcription (Figure 5.14D), and co-depletion of RSC with Rap1 partially restored the strong directionality of Rap1-regulated genes (Figure 5.14E, F). These data confirm Rap1 controls promoter directionality at a large fraction of Rap1-regulated gene promoters, and indicate that RSC activity contributes to bidirectional output of these target gene promoters in the absence of Rap1.

Figure 5.14 Contribution of Rap1 and RSC towards promoter directionality at Rap1-regulated genes (A) Distribution of promoter directionality scores for all tandem and non-overlapping genes (n = 2,609) and Rap1-regulated genes (n = 141), in a wild-type strain (FW7228). All directionality score data are calculated from three biological replicate experiments. (B) Distribution of promoter directionality scores for Rap1-regulated genes (n = 141), in WT (FW7228), Sth1-AID (FW7220), Rap1-AID (FW7238), and Rap1-AID Sth1-AID (FW7232) strains with mock treatment (DMSO). (C-E) Distribution of promoter directionality scores for Rap1-regulated genes (n = 141), comparing mock treatment (+DMSO, green), to auxin treatment (+IAA, purple) for Rap1-AID (FW7238), Sth1-AID (FW7220), and Rap1-AID Sth1-AID (FW7232) strains. (F) Distribution of promoter directionality scores for Rap1-regulated genes (n = 141), comparing Rap1-AID +IAA (FW7238, orange) to Rap1-AID Sth1-AID (FW7232, blue) strains. (A-F) Harshil Patel performed read filtering, mapping, and calculation of promoter directionality scores to generate the processed data included in this figure.

5.4.13 RSC controls promoter directionality at hundreds of yeast gene promoters

Finally, I applied the directionality score approach to determine whether activity of RSC controls directionality of coding gene promoters in general. RSC is likely recruited to most promoters, where it is important for +1 nucleosome positioning and coding gene expression (Brahma and Henikoff, 2019; Kubik et al., 2018; Skene and Henikoff, 2017). Whether RSC also regulates divergent transcription in general is unknown. After depletion of Sth1, I observed a small shift in the global distribution of promoter directionality towards more bidirectional transcription, whereas the distribution in Sth1-AID cells without auxin treatment was comparable to that of wild-type cells (Figure 5.15A-B). As changes in either sense or antisense transcription (or both) can affect the directionality score, I then examined the changes in nascent transcription at gene promoters in a strand-specific manner (Figure 5.15C). Clustering the data based on the antisense strand signals identified three main groups of promoters that respond differently to RSC depletion (RSC_ASc1, RSC_ASc2, and RSC_ASc3). Approximately 29% of tandem filtered genes showed higher divergent noncoding transcription upstream of the promoter NDR after depletion of Sth1 (RSC_ASc1: 746/2,609) (Figure 5.15C, left). A nearly equal proportion (28%) of tandem genes showed higher antisense transcription that initiates within promoters or gene bodies and overlaps with the promoter NDR (RSC_ASc2: 725/2,609). In this cluster, sense direction transcription downstream of the TSS (corresponding to the mRNA) was reduced to a greater extent, suggesting that RSC suppresses intragenic noncoding transcripts that may otherwise affect overlapping coding transcription. Here, the activity of RSC may be directed by additional factors towards the gene body rather than divergent transcript TSSs. Finally, approximately 43% of gene promoters showed a small decrease or negligible change in divergent transcription after depletion of Sth1 (RSC_ASc3: 1,138/2,609). In all clusters, depletion of RSC led to pervasive increases in nascent transcription upstream of the main coding TSS in the sense direction, supporting the findings of Domenico Libri and colleagues (Challal et al., 2018). This work showed that RSC activity modulates +1 nucleosome positioning and transcriptional fidelity at coding TSSs (Challal et al., 2018).
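The cluster proportions quoted above follow directly from the cluster sizes. A minimal check, with variable names chosen here purely for illustration:

```python
# Cluster sizes from the k-means clustering of antisense signal (Figure 5.15C).
cluster_sizes = {"RSC_ASc1": 746, "RSC_ASc2": 725, "RSC_ASc3": 1138}

# Total number of tandem, non-overlapping genes included in the analysis.
total = sum(cluster_sizes.values())

# Fraction of the filtered gene set that falls into each cluster.
fractions = {name: n / total for name, n in cluster_sizes.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```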

I also analysed published MNase-seq data to assess changes in promoter nucleosome occupancy after depletion of Sth1 (Figure 5.15D). When average signals for all tandem genes were plotted together, I observed small inward shifts for the phased coding and divergent direction nucleosome arrays, shrinking the NDR (Figure 5.15D, left). Further examination of groups of genes that displayed different transcriptional responses to depletion of RSC (clusters from Figure 5.15C) revealed more subtle changes in NDR-flanking nucleosome positioning (Figure 5.15D, right). At genes where RSC appears to repress divergent transcription, the positions of the -1 and -2 nucleosomes were not greatly affected (Figure 5.15D, RSC_ASc1). However, these upstream nucleosomes were shifted inwards towards the NDR at genes where RSC regulates intragenic antisense transcription instead (Figure 5.15D, RSC_ASc2). RSC activity may therefore be subject to differential regulation at distinct subsets of gene promoters in S. cerevisiae. The local sequence features, complement of regulatory factors, and chromatin environment of each promoter may dictate whether divergent transcription occurs, and how each gene responds to depletion of chromatin remodelling complexes like RSC.

Figure 5.15 RSC controls promoter directionality at hundreds of yeast gene promoters (A) Distribution of promoter directionality scores for all tandem and non-overlapping genes (n = 2,609), comparing a wild-type strain (FW7228, grey) to Sth1-AID strain with mock DMSO treatment (FW7220, yellow). All directionality score data are calculated from three biological replicate experiments. Harshil Patel performed read filtering, mapping, and calculation of promoter directionality scores to generate the processed data included in this figure. (B) Similar to A, except that the comparison is between Sth1-AID strain with mock treatment (FW7220 +DMSO, yellow) versus auxin treatment (FW7220 +IAA, blue). Harshil Patel performed read filtering, mapping, and calculation of promoter directionality scores to generate the processed data included in this figure. (C) Heat maps showing changes in nascent RNA expression on antisense or sense strands, after depletion of Sth1 (FW7220, comparing +IAA auxin depletion to +DMSO mock treatment). Only tandem and non-overlapping genes were analysed (n = 2,609). Promoters were clustered based on the antisense strand signals using k-means clustering (k = 3) (RSC_ASc1 to RSC_ASc3). Sense strand plot is ordered using the same gene clusters and order as the antisense plot on the left. Colour scale corresponds to magnitude and direction of change, calculated in bins of 5 bp throughout intervals centred on the coding TSS for each gene. Number of genes in each cluster: RSC_ASc1 (n = 746), RSC_ASc2 (n = 725), RSC_ASc3 (n = 1,138). Harshil Patel performed read filtering, mapping, differential expression analysis, clustering, and generated the heat maps in this figure. (D) MNase-seq metagene plots showing nucleosome occupancy after depletion of Sth1 for all tandem and non-overlapping genes (left), and clusters of genes showing differential responses to RSC depletion as described in C (right). Normalised signal per million reads is shown 500 bp up- and downstream of the coding TSS. MNase-seq coverage is shown from a control sample with no depletion (black) and after Sth1 depletion (red). Data were obtained from GEO:GSE73337 and GEO:GSE98260 (Kubik et al., 2015; Kubik et al., 2018). Harshil Patel performed the read trimming, filtering, mapping, and generated the metagene plots in this figure.

5.5 Discussion

5.5.1 Summary

In Chapter 5, I characterised the individual and partially overlapping roles of sequence-specific transcription factors and chromatin remodellers in regulation of divergent transcription. I successfully identified potential interacting partners of Rap1 through proteomics mass spectrometry. Mutation of Rap1 within its C-terminal domain affected interaction with components of the ATP-dependent chromatin remodeller RSC. I also implemented a nascent RNA sequencing approach to quantitatively assess the contributions of Rap1 and RSC towards promoter directionality throughout the genome. RSC was initially found to promote divergent transcription after Rap1 depletion at selected loci. However, subsequent genome-wide analysis revealed that RSC activity is not required for divergent transcription in the absence of Rap1. RSC and Rap1 each regulate chromatin organisation at gene promoters in distinct ways, but co-depletion of RSC only partially suppresses the aberrant noncoding transcription induced after depletion of Rap1. Finally, I explored the role of RSC itself in regulation of promoter directionality, and found that RSC regulates discrete divergent and noncoding transcription at distinct groups of genes. Together, these data elucidate the relative contributions of Rap1 and RSC towards divergent transcription and promoter directionality in S. cerevisiae.

5.5.2 Evaluation of nascent RNA-seq approach

To obtain quantitative measurements of nascent transcription and avoid possible confounding effects on RNA expression due to transcript stability, I adapted a method for nascent RNA sequencing (Churchman and Weissman, 2011, 2012). Nascent RNA-seq was able to detect and measure expression of both coding and noncoding RNAs. Detection of RNA polymerase II occupancy at single nucleotide resolution was not crucial for this analysis, and therefore the nascent RNA was fragmented and processed for strand-specific paired-end sequencing instead of NET-seq. The solubilisation of chromatin by DNase I digestion should not degrade unprotected RNAs (in contrast to MNase as used in mNET-seq), facilitating the incorporation of longer insert fragments in the library (Nojima et al., 2015).

However, a spike-in control was not included at the cellular level to account for global changes in total RNA or transcription (Chen et al., 2015). The common practice of adding spike-in controls after isolation of nascent RNA and input normalisation of all samples would not reveal global changes in RNA content. Notably, several studies have found that depletion of RSC leads to global inhibition of transcription. In one study, RNA polymerase II and III genes were repressed two hours after depletion of RSC, but whether these effects were direct or indirect consequences of RSC depletion was unclear (Parnell et al., 2008). After the nascent RNA-seq experiments in this chapter had already been completed, a study by Nir Friedman and colleagues used K. lactis spike-ins to perform mRNA-seq and nascent RNA metabolic labelling experiments (SLAM-seq) (Klein-Brill et al., 2019). Spike-in normalised samples were collected and analysed immediately following auxin-induced depletion of Sth1. Depletion of RSC led to dramatic reduction in mRNA levels after thirty minutes, and RSC showed a stronger effect on NDR nucleosome positioning and coding gene transcription at genes with low to moderate levels of expression (Klein-Brill et al., 2019). Therefore, the changes in RNA expression in Sth1 depletion strains must be interpreted as relative, not global changes. The normalisation strategy in the DESeq2 package used for data analysis assumes that most genes do not change significantly in expression when comparing two conditions (Love et al., 2014), and if the total amount of RNA should decrease then the expression level of unchanging or up-regulated genes may be artificially inflated. However, the relative proportion of forward and reverse RNA expression (i.e. the directionality score) should not be adversely affected by the global change in RNA expression in Sth1 mutants.
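The last point can be illustrated with a small numerical sketch: if a global change scales the sense and antisense signal at a promoter by the same factor, a ratio-based score is unchanged. The log2 ratio used below, the function name, and the example values are assumptions for illustration, not the exact score or data from this chapter.

```python
import math

def log2_ratio(sense, antisense):
    # Hypothetical ratio-based directionality measure (no pseudocount).
    return math.log2(sense / antisense)

sense, antisense = 800.0, 50.0   # illustrative signal values, not measured data
global_factor = 0.25             # assumed uniform drop in RNA output

before = log2_ratio(sense, antisense)
after = log2_ratio(sense * global_factor, antisense * global_factor)
print(before, after)  # the two values are identical
```

If a pseudocount is added to the score, or if the global change affects the two strands unequally, this invariance holds only approximately.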

In the future, it would be important to include a spike-in at the cellular level. This could be achieved perhaps by adding a defined amount of S. pombe or K. lactis cells to a defined amount of experimental S. cerevisiae cells, allowing for a correction factor to be calculated from the ratio of reads mapping to each species’ genome. The Rpb3-3xFLAG system used to purify RNA polymerase II and associated transcripts by affinity purification imposes a technical constraint, wherein the equivalent subunit must also be epitope-tagged in S. pombe or K. lactis. This approach has been attempted before (Shetty et al., 2017), but the S. cerevisiae spike-in control was not used to normalise S. pombe NET-seq samples in the final analysis for that study. Alternatively, metabolic labelling strategies for nascent RNA (e.g. SLAM-seq or TT-seq) would be useful in overcoming this obstacle (Herzog et al., 2017; Schwalb et al., 2016).
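A minimal sketch of how such a correction factor could be derived, assuming that the same number of spike-in cells is added to every sample, so that a larger share of spike-in reads implies lower RNA output from the experimental cells. The function names, read counts, and the choice to rescale each sample to a reference spike-in read count are illustrative assumptions rather than a protocol used in this thesis.

```python
def spike_in_scale_factor(spike_reads, reference_spike_reads):
    """Factor that rescales a sample so its spike-in signal matches a reference sample."""
    return reference_spike_reads / spike_reads

def normalise(counts, spike_reads, reference_spike_reads):
    """Apply the spike-in derived factor to a list of per-gene read counts."""
    factor = spike_in_scale_factor(spike_reads, reference_spike_reads)
    return [c * factor for c in counts]

# Illustrative numbers only: the depleted sample has twice as many spike-in reads
# as the mock sample, so its experimental counts are scaled down by half.
mock = normalise([100, 400], spike_reads=10_000, reference_spike_reads=10_000)
depleted = normalise([100, 400], spike_reads=20_000, reference_spike_reads=10_000)
print(mock, depleted)
```

In this toy case the raw counts are identical in both samples, and only the spike-in reads reveal the assumed global decrease after depletion.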

5.5.3 Regulation of divergent transcription across eukaryotic species

Across a broad range of eukaryotic organisms, an equally broad range of mechanisms is exploited to control the balance of coding and divergent transcription at promoters and enhancers (Ibrahim et al., 2018). For example, human promoters and enhancers exploit core promoter sequence and post-translational modification of proximal histones in different ways to dictate the directionality of transcription initiation. Divergent transcription is more common in C. elegans and H. sapiens than D. melanogaster, the latter of which exhibits differences in asymmetry in chromatin state and core promoter sequences (Ibrahim et al., 2018). At least in human HeLa cells, more highly expressed genes tend to be more directional (Seila et al., 2008). However, a survey of nascent transcription in S. cerevisiae revealed only modest correlation between sense and antisense transcription (Churchman and Weissman, 2011). As reviewed in the Introduction of this thesis, asymmetric biases in DNA sequence content around promoters and enhancers also dictate transcription elongation, termination, and RNA degradation (Neil et al., 2009; Schulz et al., 2013; van Dijk et al., 2011; Xu et al., 2009). The pattern of post-translational histone modifications reflects the differences in initiation directionality in different species (Duttke et al., 2015; Ibrahim et al., 2018; Scruggs et al., 2015), and thus it has been proposed that a single unified model for transcription initiation does not apply to all organisms. In S. cerevisiae, it is apparent that chromatin remodellers play an extensive role in regulating coding and divergent noncoding transcription (Alcid and Tsukiyama, 2014; Clapier et al., 2017; Kubik et al., 2019; Narlikar et al., 2013; Parnell et al., 2008; Parnell et al., 2015).

5.5.4 Contribution of ATP-dependent chromatin remodellers towards promoter directionality

In addition to RSC, components of other ATP-dependent chromatin remodellers were also identified as potential Rap1 interactors by proteomics mass spectrometry. For example, the following proteins also showed differential enrichment only in the Rap1 Δ631-696 mutant pulldown (but not Rap1 ΔAD) compared to Rap1-FL: Arp7 and Arp9, subunits shared between SWI/SNF and RSC; Rsc9 and Rsc30, RSC subunits; Isw2, the catalytic subunit of the ISWI family remodeller ISW2; and Ino80, the catalytic subunit of the INO80 complex. Many of these remodellers regulate divergent or noncoding transcription (Alcid and Tsukiyama, 2014; Challal et al., 2018; Kubik et al., 2018; Marquardt et al., 2014; Xue et al., 2017). SWI/SNF activity promotes divergent transcription through remodelling of H3K56 acetylated nucleosomes at promoters, which are incorporated by the chromatin assembly complex CAF-I (Marquardt et al., 2014). Genome-wide analysis of polyadenylated RNAs after depletion of INO80, RSC, SWR1, and ISW2 also identified that chromatin remodellers commonly repress noncoding transcripts in S. cerevisiae (Alcid and Tsukiyama, 2014). Significant overlap was observed between Ino80 and RSC-repressed transcripts, which were distinct from those regulated by SWR1 and ISW2. Both ISW2 and RSC affect chromatin organisation around TSSs of lncRNAs, and many up-regulated lncRNAs overlap with coding mRNAs due to the dense arrangement of genes within the yeast genome. In fact, ~32% (259 of 814) of chromatin remodeller-repressed lncRNAs are associated with a significant decrease in overlapping mRNA expression, highlighting the complex and interleaved nature of the yeast transcriptome. These intragenic antisense lncRNAs likely overlap to a significant extent with the intragenic antisense transcripts observed in RSC_ASc2 after depletion of Sth1 (Figure 5.15). In addition to direct effects on chromatin organisation, a complex comprising Mot1, NC2, and Ino80 binds to TBP at gene promoters and limits expression of pervasive and divergent noncoding RNAs in budding yeast and mice (Xue et al., 2017).

Within this crowded and overlapping landscape of chromatin remodellers, recent work has identified that the functions of Rap1 and RSC in regulating +1 nucleosome position and TBP occupancy are independent (Kubik et al., 2018). Both factors prevent occlusion of TBP binding sites by +1 nucleosomes at gene promoters. RSC activity at promoters is dictated by the arrangement and proximity of polyA and GC-rich sequence motifs, and RSC likely does not require Rap1 for local recruitment (Krietenstein et al., 2016). These findings support my observations that depletion of Rap1 and Sth1 affected different classes of divergent and noncoding transcripts (Figure 5.12). Finally, Domenico Libri and colleagues used TSS-seq to analyse the changes in fidelity of transcription initiation at coding gene TSSs after depletion of several ATP-dependent remodellers (Challal et al., 2018). Depletion of Snf21 did not drastically affect TSS selection, but depletion of Sth1 resulted in a shift of transcription initiation to regions just upstream of the coding TSS. In contrast, depletion of Ino80 or Isw2 resulted in higher levels of intragenic transcription initiation. As RSC had similar and subtle effects on +1 and -1 nucleosome positioning at Rap1-regulated genes (Figure 5.9), I speculate that RSC may exploit -1 nucleosome positioning to tune divergent core promoter activity.

5.5.5 Regulatory interplay between Rap1 and RSC

How do Rap1 and RSC function concurrently at bidirectional gene promoters? Initial evidence using northern blots for selected loci and a fluorescent reporter system indicated that RSC activity promotes divergent transcription in the absence of Rap1 (Figure 5.3). Based on these data and the observation that a smaller, persistent promoter NDR was present despite the depletion of Rap1 (Figure 5.2), I speculated that RSC promoted divergent transcription at bidirectional promoters by maintaining chromatin accessibility sufficient for transcription initiation. Subsequent analysis of nascent RNA-seq data revealed that co-depletion of RSC with Rap1 was sufficient to only partly suppress the shift from more unidirectional to more bidirectional transcription observed after depletion of Rap1 alone (Figure 5.14). In the absence of both RSC and Rap1, divergent transcription still occurs at two-thirds of Rap1-regulated genes (Figure 5.12). This divergent transcription likely depends on the activity of other ATP-dependent chromatin remodellers as well.

Do Rap1 and RSC interact directly? At gene promoters, RSC promotes histone exchange and sliding of the -1 and +1 nucleosomes to generate nucleosome depleted regions, facilitating transcription initiation. RSC clearly is important in regulating +1 nucleosome position to tune gene expression output, but the consequences of RSC-dependent shifts in -1 nucleosome position are less clear. I initially suspected that presence or absence of a RSC-bound “fragile nucleosome” within the promoter would dictate the likelihood of divergent transcription (Brahma and Henikoff, 2019; Kubik et al., 2015; Skene and Henikoff, 2017). This hypothesis was deemed unlikely after TSS-seq analysis identified that the divergent TSSs are located at the upstream (5’) border, not the centre of the NDR where “fragile” nucleosomes are found (Figure 4.6). Elegant in vitro reconstitution experiments have demonstrated that RSC clears away nucleosomes at promoters, while other remodellers such as INO80, ISW2, and ISW1a generate the phasing and tune the spacing of nucleosome arrays (Krietenstein et al., 2016).

Despite their co-localisation on chromatin, it is unclear whether Rap1 and RSC directly interact. RSC is strongly enriched at promoter binding sites, but its co-immunoprecipitation with Rap1 (Figure 5.1) may be indirectly mediated by DNA or other proteins. The DNA within the chromatin extracts was digested to tri-, di-, and mono-nucleosomal size after treatment with MNase (pilot experiment, data not shown). Conceivably, longer stretches of linker and nucleosomal DNA could mediate the co-immunoprecipitation of RSC with Rap1. In mouse fibroblast cells, bi-directional enhancer RNAs (eRNAs) directly bind to the CBP coactivator, and stimulate its histone acetyltransferase activity (Bose et al., 2017). However, as MNase also digests unprotected RNA, it is unlikely that the divergent noncoding RNA molecules themselves would directly recruit RSC or “tether” the complex to divergent core promoters. Perhaps the region comprising residues 631-696 in the C-terminal domain of Rap1 could be important in modulating the recruitment or activity of RSC specifically near divergent TSSs. Further in vitro experiments are required to elucidate the specific nature of potential interaction between Rap1 and RSC. In particular, RSC is notoriously refractory to ChIP-seq analysis using formaldehyde cross-linking (Ng et al., 2002; Yen et al., 2012). It should be possible to directly measure RSC occupancy before and after depletion of Rap1 using sensitive and complementary techniques such as CUT&RUN or ChEC-seq (Kubik et al., 2018; Skene and Henikoff, 2017; Zentner et al., 2015). However, I propose that instead of RSC occupancy, direct measurement of nucleosome positioning (e.g. MNase-seq) or histone exchange may be a better readout for RSC activity. In this context, it would also be useful to perform high-resolution mapping (e.g. ChIP-exo or ChIP-nexus) to characterise any effects on TBP or general initiation factor occupancy at divergent core promoters after depletion of RSC (He et al., 2015; Rhee and Pugh, 2012). Due to the stable DNA-binding occupancy of Rap1 at divergent gene promoters, I propose that Rap1 may limit sequence-directed recruitment or activity of RSC to stimulate productive transcription in the coding gene direction and confer promoter directionality.

5.5.6 Conclusion

In this chapter, I investigated the functional relationship between the sequence-specific transcription factor Rap1 and the ATP-dependent chromatin remodeller RSC in the context of promoter directionality. I found that direct or indirect interaction with RSC was affected in a Rap1 mutant (Δ631-696) deficient in repression of divergent transcripts. By exploiting an auxin-inducible depletion allele for Sth1, the catalytic subunit of RSC, I identified that RSC activity is only required for divergent transcription at a small number of genes. Rap1 and RSC also have distinct roles in nucleosome organisation around gene promoters. Genome-wide measurement of nascent transcription revealed that additional factors besides Rap1 and RSC likely drive divergent transcription. Finally, I identified that RSC controls promoter directionality at hundreds of gene promoters, and differentially regulates divergent and antisense noncoding transcripts at distinct classes of genes. These data clarify the relative contributions of a sequence-specific transcription factor and a sequence-directed chromatin remodeller towards promoter directionality across the genome.

Chapter 6. Discussion

6.1 Acknowledgement

Some of the content in this chapter has been published in a Point-of-View article in Transcription, and has been modified to present within this chapter (Wu and Van Werven, 2019).

6.2 Summary of key findings

Most eukaryotic gene promoters are inherently bidirectional, and generate divergent long noncoding RNAs (lncRNAs) together with coding messenger RNAs. As a result, bidirectional gene promoters are a major source of noncoding RNAs in eukaryotes. Here, I investigated why a group of highly expressed genes in S. cerevisiae atypically lack divergent promoter activity. Prior to this, chromatin assembly and RNA surveillance pathways were thought to constitute the main mechanisms of repression for divergent, antisense, and cryptic transcription. However, the contribution of sequence-specific transcription factors was unclear. In this thesis, I identified an additional mechanism exploited by eukaryotic cells to limit expression of divergent transcripts and control promoter directionality. Rap1 is a critical sequence-specific transcription factor that restricts expression of intergenic antisense lncRNAs in budding yeast. Other coactivators of Rap1-regulated genes do not share this function, indicating that divergent promoter activity is not simply a consequence of coding gene activation. Rap1 controls noncoding transcription at a significant portion of its binding sites genome-wide and represses divergent transcription at two-thirds of its target gene promoters. Without Rap1, aberrant noncoding transcription compromises neighbouring gene expression through various mechanisms. This novel repressive function of Rap1 is not redundant with other processes known to restrict noncoding RNA expression in yeast, notably chromatin assembly and transcriptional termination coupled to RNA degradation. Repression of divergent noncoding RNAs is extremely localised to regions near Rap1 binding sites in promoters, whereas other chromatin regulatory pathways generally limit noncoding transcription arising from intragenic regions.

Rap1 is not simply a roadblock for transcription in the antisense direction, but has a specific and proximity-dependent function in restricting initiation of nearby divergent transcripts at a genome-wide scale. Surprisingly, the heterochromatin silencing function of Rap1 is not required for control of promoter directionality. However, a region within the C-terminal domain of Rap1 comprising residues 631-696 is essential for repression of divergent RNAs. Rap1 constitutes a versatile transcription factor in its ability to suppress divergent core promoters near its binding sites, likely by steric hindrance of transcription initiation in the antisense direction. Apart from Rap1, a nuclease-inactivated Cas9 protein can also repress divergent transcription when targeted to divergent core promoters; the size and nucleosome displacing activity of the DNA-binding protein may not be essential for repression of divergent transcripts. Rap1 functions in concert with the ATP-dependent chromatin remodeller RSC to organise nucleosomes at gene promoters, but co-depletion of RSC only partially suppresses the effect of Rap1 depletion with regard to promoter directionality. In the absence of Rap1, RSC activity is only required for divergent promoter activity at a very small number of Rap1-regulated genes, indicating that an additional complement of regulatory proteins likely drives divergent transcription as well. Finally, I identified that RSC controls discrete classes of divergent and noncoding transcripts at distinct groups of coding genes. RSC represses divergent noncoding transcription at approximately 10% of coding genes in S. cerevisiae, and Rap1 may modulate the recruitment or activity of RSC at bidirectional gene promoters to facilitate productive transcription in the coding direction. Taken together, these findings identify a novel role for a sequence-specific transcription factor with regard to regulation of divergent transcription and promoter directionality in eukaryotic cells.

6.3 A model for regulation of divergent noncoding transcription in Saccharomyces cerevisiae

Here, I propose a model for regulation of divergent noncoding transcription comprising different variations within different genomic contexts. First, I focus on the key factors and regulatory proteins present at Rap1-regulated genes, and discuss the functional interplay between chromatin and transcription. I delineate this novel regulatory function of Rap1 in transcriptional fidelity and compare its role to those of canonical regulatory pathways that limit the expression of divergent noncoding RNAs in budding yeast. I then expand the scope of this discussion to encompass bidirectional gene promoters wherein the RSC chromatin remodeller plays distinct roles in regulation of divergent and intragenic noncoding transcription. Finally, I summarise the key arguments of the steric hindrance model for regulation of divergent noncoding transcription, and consider its merits and limitations.

6.3.1 Control of divergent noncoding transcription at Rap1-regulated genes

At Rap1-regulated gene promoters, the Rap1 binding sites are located at or adjacent to divergent core promoters at the upstream border of the NDR (Figure 6.1). The discrete coding direction core promoter is located several hundred nucleotides downstream and is flanked by the +1 nucleosome. The divergent and coding TSSs are most likely recognised by separate, divergently oriented pre-initiation complexes (PICs) (Rhee and Pugh, 2012). Rap1 directly mediates recruitment of coactivators and general transcription factors, leading to productive gene transcription in the coding direction (Azad and Tomar, 2016; Garbett et al., 2007; Layer et al., 2010; Papai et al., 2010). In addition, Rap1 plays a key role in determining the position of +1 and -1 nucleosomes (Ganapathi et al., 2011; Kubik et al., 2015; Kubik et al., 2018; van Bakel et al., 2013; Yan et al., 2018; Yu and Morse, 1999). Modulation of +1 nucleosome positioning is a key mechanism by which this sequence-specific transcription factor, along with its associated cofactors and chromatin remodellers, tunes coding gene expression output (Kasahara et al., 2011; Kubik et al., 2019; Kubik et al., 2018). Due to the large distance between the Rap1 binding site and the +1 nucleosome, it is thought that Rap1 regulates +1 nucleosome positioning indirectly (Reja et al., 2015). In contrast, the position of the -1 nucleosome flanking the divergent TSS is likely to be directly determined by Rap1 due to its “pioneer” nucleosome displacement activity. The large NDR present at Rap1-regulated genes likely facilitates access to cryptic divergent core promoters and Rap1 therefore represents a key regulatory factor that limits aberrant noncoding transcription in lieu of chromatin-based pathways. Upon depletion of Rap1, divergent transcription initiates from nearby cryptic core promoters and is concomitant with an inwards displacement of the -1 nucleosome, shrinking the promoter NDR. The -1 nucleosome shifts into a potentially ideal position for divergent transcription. Whether nucleosome positioning dictates transcription initiation, or vice versa, remains a topic of debate in the community (Albert et al., 2007; Bai and Morozov, 2010; Jiang and Pugh, 2009).

In contrast, after depletion of Rap1 the downstream +1 nucleosome also shifts inwards and contracts the NDR, moving away from a transcriptionally optimal position relative to the coding TSS. Concurrently, canonical TSS expression is strongly decreased and usage of upstream cryptic or “ectopic” TSSs is increased (Figure 4.6) (Challal et al., 2018). The RSC chromatin remodeller is also directed to Rap1-regulated gene promoters by specific sequence motifs (Badis et al., 2008; Krietenstein et al., 2016; Kubik et al., 2018), and does not require Rap1 for local recruitment or action. RSC helps to maintain the promoter NDR by nucleosome sliding, facilitating productive transcription. However, Rap1 plays a more dominant role in determining the position of the -1 nucleosome near the divergent TSS (Figure 5.9). Other chromatin-based pathways that also repress noncoding transcription mostly limit intragenic transcripts arising within gene bodies, instead of divergent transcripts emanating from gene promoters (Figure 3.11). Inherently, these chromatin assembly or chromatin remodelling factors must act indirectly through nucleosomes, which are depleted at Rap1-regulated and other highly expressed gene promoters.

After depletion of Rap1, divergent noncoding transcription at divergent TSSs occurs along with a concurrent shift in -1 nucleosome positioning. On the whole, divergent transcription does not require the activity of the RSC chromatin remodeller. Rap1 may specifically limit recruitment or nucleosome remodelling activity of RSC in a proximity-dependent manner near divergent core promoters, biasing RSC activity towards the main coding core promoter instead. RSC activity helps to partially restore bidirectional output at Rap1-regulated genes in the absence of Rap1 (Figure 5.14).

Other activators of divergent noncoding transcription, such as additional chromatin remodellers or TBP regulatory proteins, remain to be fully identified and characterised in this context. The experiments performed in this thesis were conducted in genetic strain backgrounds wherein RNA surveillance pathways were functional – e.g. NNS components, the nuclear exosome (Rrp6), and nonsense-mediated decay (NMD, Upf1). Despite this, the aberrant divergent transcripts normally repressed by Rap1 manage to escape pervasive RNA surveillance pathways in yeast, and potentially pose a threat to gene expression and genome stability. Regulation of TBP activity has been identified as a key regulatory mechanism exploited in yeast and mice to limit intergenic noncoding transcription (van Werven et al., 2008; Xue et al., 2017). It remains unclear to what extent TBP-regulatory factors such as Mot1, Ino80, and NC2 also affect expression of Rap1-regulated divergent noncoding RNAs. These findings highlight the fact that these regulatory pathways are likely not redundant, and Rap1 therefore plays an essential role in maintaining transcriptional fidelity in S. cerevisiae.

Figure 6.1 Control of divergent noncoding and coding transcription at Rap1-regulated genes. Schematic diagram depicting the role of Rap1 (blue polygon) in regulation of divergent noncoding transcription at its binding site (red box) within gene promoters. TSSs are depicted with black arrowheads. RNA polymerase (yellow polygon) is recruited to the coding direction core promoter and excluded from the divergent direction core promoter by Rap1. After depletion of Rap1, polyadenylated divergent noncoding RNAs (red wavy line) are expressed, which escape canonical RNA termination and degradation mechanisms in yeast. Nucleosome-remodelling activity of the RSC complex (light green polygon) helps to maintain a nucleosome-depleted region between the divergent and coding direction core promoters. The +1 and -1 nucleosomes (grey globes) shift inwards and shrink the NDR after depletion of Rap1. RSC activity is not strictly required for divergent transcription at Rap1-regulated genes, and other activators are likely involved. Cryptic TSSs within gene bodies are not regulated by Rap1 directly but instead repressed by chromatin assembly and co-transcriptional remodelling pathways (i.e. CAF-1, Spt10/21, Spt6, and FACT). The slider bars below highlight that the balance between divergent and coding direction transcription dictates the overall directionality of each promoter. Depletion of Rap1 increases divergent transcription and decreases coding transcription, shifting the directionality of the promoter towards a more bidirectional output.
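
The balance indicated by the slider bars can be expressed numerically. The following is a minimal sketch, assuming that strand-specific signal (for example, nascent RNA-seq read counts) has already been summed in a window around the coding and divergent TSSs. The function name `directionality_score`, the pseudocount, and the example counts are hypothetical and chosen for illustration only; they are not values or methods taken from the analyses in this thesis.

```python
import math


def directionality_score(coding: float, divergent: float,
                         pseudocount: float = 1.0) -> float:
    """Return log2((coding + pc) / (divergent + pc)).

    A score of 0 means equal signal in both directions (fully
    bidirectional output); larger positive values mean output is
    biased towards the coding direction.
    """
    if coding < 0 or divergent < 0:
        raise ValueError("signals must be non-negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log2((coding + pseudocount) / (divergent + pseudocount))


# Hypothetical read counts for a single promoter (illustration only).
control = directionality_score(coding=1023, divergent=15)
depleted = directionality_score(coding=255, divergent=127)

# A drop in the score after depletion indicates a shift towards
# bidirectional output.
shift = depleted - control
print(f"control={control:.2f} depleted={depleted:.2f} shift={shift:.2f}")
```

In this sketch, a promoter whose score falls towards zero after depletion of a factor has become more bidirectional, which is the behaviour the schematic describes for Rap1-regulated genes.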

6.3.2 Control of divergent noncoding transcription by RSC chromatin remodeller

Strand-specific analysis of transcription at gene promoters using nascent RNA sequencing revealed that RSC suppresses divergent noncoding transcription for at least 10% of coding genes in budding yeast. This figure is likely to be an under-estimate, because approximately 3500 yeast genes that were overlapping or divergently oriented were filtered out of the analysis. At this subset of genes, depletion of RSC leads to increases in divergent noncoding transcription several hundred nucleotides upstream of the main TSS from divergent core promoters at the upstream edge of the NDR (Figure 6.2). As mentioned before, RSC activity is guided by specific sequence motifs (Badis et al., 2008; Krietenstein et al., 2016; Kubik et al., 2015; Kubik et al., 2018). Nucleosome sliding or eviction by RSC is crucial for proper positioning of the +1 nucleosome and initiation of coding gene transcription. However, RSC activity appears to be beneficial for coding direction transcription but detrimental (repressive) to noncoding transcription in the divergent direction at a large subset of coding gene promoters.

Where do these functional differences arise? Histone variants and post-translational modifications can modulate recruitment or activity of ATP-dependent chromatin remodellers. For example, incorporation of the H2A.Z histone variant stimulates nucleosome remodelling by ISWI but not SWI/SNF family remodellers (Goldman et al., 2010), and CHD subfamily remodellers contain chromodomains which can bind to methylated histone tails (Sims et al., 2005; Watson et al., 2012). Asymmetric histone modification or variant incorporation at the +1 and -1 nucleosomes could lead to differential regulation of core promoters in opposing directions (Bagchi and Iyer, 2016). In addition to histone-based modulation, it has been suggested that sequence-specific transcription factors on DNA can specifically interact with domains of ATP-dependent chromatin remodellers and influence the efficiency of chromatin remodelling (Clapier et al., 2017; Gutin et al., 2018). I propose that at genes where RSC normally represses divergent transcription (RSC_AS_c1), RSC activity is directed to both the +1 and -1 nucleosomes by specific sequence motifs, and may be limited by gene-specific regulatory factors. RSC activity could position the +1 nucleosome optimally for coding gene transcription, while maintaining the -1 nucleosome in a sub-optimal or repressive position to limit initiation of divergent transcription. At genes where depletion of RSC leads to higher antisense intragenic RNA expression instead of divergent noncoding transcription (RSC_AS_c2), it is difficult to conclusively determine whether the effects are direct or indirect consequences of RSC depletion. Perhaps a complement of sequence-specific transcription factors could limit the activity of RSC at the -1 nucleosome and divergent core promoter. These factors may be different for distinct groups of genes, and could potentially maintain the -1 nucleosome in a repressive position.
A study by the group of Nir Friedman classified gene promoters by mapping interactions between nucleosomes and the transcription factor Reb1 (Gutin et al., 2018). Intriguingly, the movements of the -1 and +1 nucleosomes after depletion of Sth1 were distinct at different sub-classes of promoters. At a subset of target promoters, the -1 nucleosome can be “locked” in place by the presence of Reb1, which may preclude PIC formation at antisense-oriented core promoters and subsequently divergent transcription. Given that RSC activity is mainly directed to gene promoters and flanking nucleosomes (Brahma and Henikoff, 2019; Krietenstein et al., 2016; Ng et al., 2002), I speculate that the
increased intragenic noncoding transcription observed after depletion of RSC may be due to loss or reduction of co-transcriptional chromatin remodelling associated with coding transcription across the gene body. It would be interesting to investigate whether the specific GC-rich and AT-rich motifs that direct RSC action are differentially enriched at different subsets of gene promoters that display divergent transcription.

Figure 6.2 Control of divergent noncoding transcription by RSC chromatin remodeller. Schematic diagram depicting the effect of RSC on divergent and coding core promoters at a subset of genes where RSC represses divergent transcription. The nucleosome-remodelling activity of RSC is directed by specific sequence motifs at the edges of the NDR. RSC activity may help to position the +1 nucleosome at the coding direction core promoter for optimal TSS usage, while maintaining the -1 nucleosome in a sub-optimal position for divergent TSS usage.

6.3.3 Requirements and limitations of the steric hindrance model for regulation of divergent noncoding transcription

The evidence presented in this thesis supports a model in which a sequence-specific transcription factor positioned near divergent TSSs limits the access of general transcription machinery and transcriptional activators to
divergent core promoters by steric hindrance (Figure 6.3). I demonstrated that Rap1 is a key transcription factor that limits initiation of divergent transcription, but its “pioneer” nucleosome displacing activity may not be crucial for this function. Nuclease-inactivated Cas9 and other yeast transcription factors may similarly repress divergent TSSs in close proximity to their binding sites. However, the Rap1 DNA-binding domain is unable to limit divergent transcription, either alone or in conjunction with several copies of GFP protein as a bulkier DNA-binding entity. The requirements of this model are simple: a DNA-binding protein of sufficient size must be stably bound to DNA at or near cryptic core promoters. This would be relatively simple to implement from an evolutionary perspective. Eukaryotic cells could exploit their existing complement of sequence-specific transcription factors to repress divergent TSSs. For example, this could simply involve mutation of ~10 nucleotides within a gene promoter to generate a novel transcription factor target motif (Khan et al., 2018). Changes in non-coding regulatory elements are under less evolutionary constraint than changes in coding DNA sequences (Andolfatto, 2005; Borneman et al., 2007), and transcription factor binding site substitution is possible (Hogues et al., 2008). In addition, this unorthodox mode of regulation does not depend on auxiliary factors, for example the positioning of adjacent nucleosomes and the composition of their histones and post-translational modifications. Extensive evidence indicates that these regulatory principles of repression by steric hindrance are ubiquitous across prokaryotic, eukaryotic, and viral genomes (Wu and Van Werven, 2019). However, the requirement of stable chromatin occupancy may limit the number of factors that can fulfil this function. Addition of novel cis-regulatory motifs and factors could also compromise carefully coordinated programmes of gene expression.
Finally, steric hindrance of core promoters may be influenced by species-specific differences in core promoter sequence, histone modifications, and local chromatin environment (Bagchi and Iyer, 2016; Ibrahim et al., 2018). Further work is required to identify additional factors that may limit expression of divergent transcripts by steric hindrance, and fully understand their mechanisms of action.

Figure 6.3 Steric hindrance model for regulation of divergent noncoding transcription. Schematic diagram depicting candidate DNA-binding proteins evaluated within the proposed steric hindrance model for regulation of divergent core promoters. Various endogenous and synthetic transcription factors were evaluated for their ability to effectively repress divergent RNA expression, and other candidate transcription factors may also function in a similar manner to Rap1. Rap1 binding site, red box; nucleosomes, grey globes; TSS, black arrowheads; Rap1 protein, blue polygon; green fluorescent protein (GFP), green circles; Rap1 DNA-binding domain (DBD), truncated blue polygon; nuclease-inactivated Cas9 (dCas9), purple polygon; Mxi1 repressor domain, light green circle; other transcription factors (TFs), orange polygon.

6.4 Alternative hypotheses to steric hindrance model

Rap1 appears to be a transcription factor distinct in that its asymmetric binding within the nucleosome-depleted region of a bidirectional gene promoter facilitates directional transcriptional output. In the Discussion of Chapter 4 and the section above, I proposed a steric hindrance model for regulation of divergent core
promoter activity. In addition, I evaluated several alternative hypotheses using experimental evidence, including the “transcriptional roadblock” model, coupling between coding and divergent core promoters, and the role of Rap1 silencing cofactors. On the whole, the data collated from the literature and my experimental investigation support a steric hindrance model, wherein Rap1 prevents transcription initiation at divergent core promoters by limiting recruitment of general transcription factors, RNA polymerase, and other coactivators. In this section, I weigh up the merits and limitations of other alternatives to the steric hindrance model of Rap1 function that have not been experimentally tested in this thesis.

6.4.1 Regulation by pausing of RNA polymerase II

In metazoans, the release of RNA polymerase from promoter-proximal pausing can be a key regulatory step in gene regulation (Adelman and Lis, 2012). However, promoter-proximal pausing is less prominent in yeast (Churchman and Weissman, 2011), where gene regulation typically occurs at the level of transcription initiation, not elongation. After RNA polymerase II initiates divergent transcription at divergent core promoters, it is unlikely that the complex is held in a “paused” state in a Rap1-dependent manner that limits pause-release. Although chemical inhibitors can be used to inhibit various steps in the transcription cycle, blocking the release of paused RNA polymerase would not elicit expression of divergent transcripts. Potentially, use of a sequence-specific synthetic transcription elongation factor (SynTEF) could license transcription elongation at targeted genomic loci, but these molecules are limited in target sequence choice (Erwin et al., 2017). Visual inspection of NET-seq data from S. cerevisiae, which captures paused RNA polymerase II at single nucleotide resolution, does not support the accumulation of paused RNA polymerase in the divergent direction at Rap1-regulated promoters (Churchman and Weissman, 2011). These signals corresponding to antisense-oriented RNA polymerase stalled at promoter Rap1 binding sites would be exacerbated in a TFIIS mutant (dst1Δ) unable to relieve stalled and backtracked elongation complexes. Accumulation of these signals was not observed at any of the representative Rap1-regulated genes examined.
Therefore, I focused my experimental investigation on testing the mechanisms underlying the “roadblock” and “steric hindrance” models instead.

6.4.2 Regulation of TBP activity

Transcription initiation requires the recruitment of TATA-binding protein (TBP) to core promoters. TBP localisation and activity constitute a key regulatory hub for transcription regulation (True et al., 2016; van Werven et al., 2008; Xue et al., 2017). Therefore, I wondered whether Rap1 directly regulates TBP activity at bidirectional gene promoters to control divergent transcription and promoter directionality. At high concentrations in vitro, TBP directly interacts with Rap1 protein and formation of the TBP-Rap1 complex inhibits TBP binding to TATA promoter DNA (Bendjennat and Weil, 2008). In vivo, Rap1 recruits TBP to the coding direction core promoter via interactions with the general transcription factors TFIID and TFIIA (Garbett et al., 2007; Layer et al., 2010; Papai et al., 2010). As TBP is also required for initiation of divergent transcription, it is possible that Rap1 could locally limit TBP recruitment to divergent core promoters. However, Rap1-regulated RP gene promoters are strongly TFIID-dependent and tend to lack consensus TATA box motifs, in the forward or reverse orientations (Basehoar et al., 2004; Bosio et al., 2017). The extent to which non-promoter bound Rap1 may “titrate” away free TBP in the nucleoplasm or cytosol is not clear. This principle of restricting TBP activity is not limited to Rap1, as additional co-regulators such as Mot1, NC2, Ino80, and SAGA also control TBP localisation and activity (Gadbois et al., 1997; Goppelt et al., 1996; Gumbs et al., 2003; Kim et al., 1996; Sermwittayawong and Tan, 2006; van Werven et al., 2008; Xue et al., 2017). In yeast, Mot1, NC2, and Ino80 (MINC) contribute to gene repression in heterochromatin through the silencing protein Sir3 (Xue et al., 2017). MINC also binds to TBP at 5’ and 3’ gene ends and suppresses aberrant expression of CUTs, XUTs, SUTs, and divergent promoter transcripts in yeast.
Upon examination, the single-end RNA sequencing data from this study were not of sufficient quality or depth to fully assess the overlap between MINC- and Rap1-regulated transcripts. Crucially, standard ChIP-seq analysis of TBP and MINC localisation is insufficient to clarify whether MINC specifically affects TBP occupancy at the divergent core
promoters versus coding core promoters. Further experimental validation using high resolution ChIP-exo or ChIP-nexus techniques would help to address this outstanding question (He et al., 2015; Rhee and Pugh, 2012). Further investigation will also be required using more sensitive RNA sequencing approaches to fully understand the complex and overlapping relationships between these pathways that regulate transcriptional fidelity.

6.4.3 Regulation of divergent transcription by gene looping and chromatin conformation

Promoter directionality – and specifically the transcription of divergent noncoding RNAs – is also controlled by gene looping and local chromatin conformation (Tan-Wong et al., 2012). Looping between the 5’ and 3’ ends of a gene requires functional promoter and 3’ polyadenylation signals (PAS), along with a complement of protein factors that recognise these signals. In budding yeast, inactivation of the RNA polymerase II C-terminal domain (CTD) phosphatase Ssu72 prevents gene loop formation (Tan-Wong et al., 2012). Subsequently, loss of gene looping results in aberrant expression of cryptic unstable transcripts in the divergent direction, indicating that gene looping controls promoter directionality at human and yeast genes. I initially hypothesised that Rap1 might specifically recruit Ssu72 to its target promoters to facilitate gene looping and reinforce promoter directionality. However, Ssu72 was not detected as a direct or indirect Rap1 interactor in chromatin co-immunoprecipitation experiments (Figure 5.1) (Wu et al., 2018b). Gene looping requires factors associated with transcriptional activation and initiation (El Kaderi et al., 2009; Medler et al., 2011). Repression of divergent transcripts through gene looping would thus require factors recruited downstream of Rap1, a “pioneer” factor that initiates transcriptional activation. In addition, targeting the dCas9 protein, which should not recruit coactivators involved in gene looping, to the IRT2 core promoter was already sufficient to completely repress IRT2 expression. Therefore, gene looping through the Ssu72 pathway is likely present at Rap1-regulated gene promoters but insufficient to repress divergent transcription in the absence of Rap1.

Recently, detailed maps of chromatin conformation at nucleosomal (Hsieh et al., 2015) and sub-nucleosomal resolution (Ohno et al., 2019) have been generated in S. cerevisiae. Although outside the scope of this thesis, it would be interesting to examine changes in chromatin conformation and divergent transcription at a genome-wide scale after inactivation of Ssu72 and Rap1. To obtain mechanistic insights at selected loci, the PAS of a Rap1-regulated gene could be replaced with a Rnt1 cleavage signal to bypass mRNA polyadenylation while still terminating transcription (Tan-Wong et al., 2012). This should result in loss of gene looping (e.g. measured by 3C, chromosome conformation capture), and subsequently the effects on divergent transcription could be assessed by RT-qPCR or northern blotting. Recent applications of CRISPR-dCas9 technology now allow selective and reversible establishment of chromatin loops (e.g. CLOuD9 or LADL) (Morgan et al., 2017; Rege et al., 2018). It may be possible to redirect specific promoter-terminator interactions at single loci and directly examine the causal effects of gene looping on divergent promoter activity.

6.4.4 Negative supercoiling and DNA accessibility at divergently oriented core promoters

In addition to looping interactions between distal genomic sites, DNA supercoiling constitutes another aspect of genome topology that affects gene regulation. In the twin-supercoil domain model of transcribing RNA polymerase, positive supercoiling is induced downstream of the elongation complex, and negative supercoiling of DNA accumulates in upstream sequences (Ma and Wang, 2016). This supercoiling is locally relieved by topoisomerase enzymes, specifically topoisomerase I (TOP1) and topoisomerase II (TOP2) in yeast. Of the two, topoisomerase II is primarily active at gene promoters (Teves and Henikoff, 2014a). Promoter melting (i.e. generation of the transcription bubble involving transition to non-B-DNA) is a key regulatory step in transcription regulation, in addition to assembly of the pre-initiation complex and RNA polymerase pausing (Kouzine et al., 2013). I speculated that at highly expressed Rap1-regulated genes, high transcriptional activity from the coding direction core promoter would generate upstream negative supercoiling, which could affect the accessibility of the divergent
core promoter. In fact, negative supercoiling generated by RNA polymerase at the human C-MYC promoter facilitates binding of additional activator and repressor proteins to DNA (Kouzine et al., 2008). However, high levels of coding direction transcription stimulated by Rap1 should result in increased negative supercoiling, DNA accessibility, and transcriptional activity at the divergent core promoter (Seila et al., 2009). In other words, the activity of the coding and divergent core promoters should be coupled. However, the collective data from genome-wide and single locus studies indicate that the coding and divergent transcripts respond in opposite directions after Rap1 depletion. Alternatively, Rap1 could directly recruit topoisomerases to gene promoters to modulate local DNA shape and topology. Top1 and Top2 were enriched after chromatin co-immunoprecipitation of Rap1 (Figure 5.1) (Wu et al., 2018b). Future work could assess the effect of Rap1 depletion on the genome-wide chromatin occupancy of Top1 and Top2 (e.g. with ChIP-seq). These data could be complemented with high-resolution TMP-seq mapping of under-wound DNA (Teves and Henikoff, 2014b). Although the data from my genome-wide and single-locus approaches cannot rule out a direct effect of negative supercoiling on divergent promoter activity, I propose that other mechanisms predominantly regulate divergent transcription at Rap1-regulated genes.

6.4.5 Regulation of divergent transcript expression by gene positioning in the nucleus

In eukaryotes, the spatial positioning of a genomic locus in the nucleus is closely linked to gene expression (Steglich et al., 2013). Actively expressed genes associate with the nuclear pore complex (NPC) whereas transcriptionally silenced chromatin preferentially associates with the inner nuclear membrane (INM) instead (Brickner et al., 2012; Taddei and Gasser, 2012). In budding yeast, the telomeres and silent hidden mating type (HM) loci constitute a significant portion of heterochromatin. Anchoring of chromatin at the INM promotes gene silencing, and the silenced telomeres and HM loci are associated with the INM in yeast (Andrulis et al., 1998). At the nuclear envelope, Sir2, Sir3 and Sir4 form the silent information regulator (SIR) complex and contribute to heterochromatin formation (Gotta, 1996;
Taddei and Gasser, 2012). A component of the yeast NPC called Nup170 also binds to subtelomeric chromatin and is required for association of telomeres with the nuclear envelope, together with Sir4 (Van de Vosse et al., 2013). The silencing function of Nup170 is linked to its physical interaction with the RSC chromatin remodeller and Sir4. Intriguingly, Nup170 also regulates nucleosome positioning at subtelomeric and RP genes in yeast. In a nup170Δ mutant, 242 genes were up-regulated more than 2-fold, including 109 out of 138 RP genes. However, the role of Nup170 in direct or indirect regulation of divergent transcripts at RP genes has not been investigated. Several yeast transcription factors, including Gal4 and the RP gene regulators Rap1, Fhl1, and Ifh1 have been recently shown to target genes to the NPC (Brickner et al., 2019). It would be interesting to examine whether Nup170 and silencing factors contribute to “tuning” or “dampening” activity of bidirectional promoters. Potentially, a maximally active promoter would produce significant quantities of both coding and divergent RNAs, but transcription initiation and elongation in the divergent direction could be more sensitive to general silencing. This potential role for Nup170 and the NPC warrants further analysis using published whole-genome microarray data obtained from a nup170Δ strain background (Van de Vosse et al., 2013), or perhaps further RNA sequencing experiments.
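
As a rough illustration of how strongly RP genes are represented among the up-regulated set quoted above (109 of 138 RP genes, about 79%, within 242 up-regulated genes), a one-sided hypergeometric test could be sketched as follows. The function name `hypergeom_tail` is hypothetical, and the background of 6000 genes is an assumed round number for the yeast genome rather than the number of genes actually measured by Van de Vosse et al. (2013). The result should therefore be read only as an order-of-magnitude indication.

```python
from math import comb


def hypergeom_tail(total: int, category: int, drawn: int, overlap: int) -> float:
    """One-sided hypergeometric probability P(X >= overlap).

    total    -- number of genes in the background
    category -- number of background genes in the category (e.g. RP genes)
    drawn    -- number of genes in the selected set (e.g. up-regulated)
    overlap  -- number of selected genes that belong to the category
    """
    upper = min(category, drawn)
    favourable = sum(
        comb(category, k) * comb(total - category, drawn - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer arithmetic, converted to a float only at the end.
    return favourable / comb(total, drawn)


# ASSUMPTION: ~6000 genes as background; not taken from the cited study.
ASSUMED_BACKGROUND = 6000
p_value = hypergeom_tail(ASSUMED_BACKGROUND, category=138, drawn=242, overlap=109)
print(f"fraction of RP genes up-regulated: {109 / 138:.2f}")
print(f"hypergeometric tail probability: {p_value:.3g}")
```

A test of this kind addresses only over-representation of RP genes in the up-regulated set; it says nothing about whether divergent transcripts at those genes are affected.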

6.5 Relevance to higher eukaryotes

6.5.1 Bidirectional promoters and enhancers

The findings presented within this thesis may inform future studies regarding the functions and regulation of promoters and enhancers. Bidirectional transcription is a common phenomenon in eukaryotes, particularly at these regulatory elements. Promoters and enhancers are depleted of nucleosomes and enriched in cis-regulatory elements that are interpreted by a complement of trans-acting proteins, like transcription factors and chromatin remodellers. Transcription initiates from each of the separate divergently oriented core promoters at the borders of the nucleosome-depleted region at both classes of regulatory elements. Although clear parallels are evident, enhancers and promoters are functionally different in transcriptional output. Promoters stimulate unidirectional transcription of a coding
mRNA, whereas bidirectional noncoding enhancer RNAs (eRNAs) are commonly associated with active enhancers and are typically equal in expression (Kim et al., 2010). The mechanistic basis of enhancer function remains unclear despite decades of research. Both the act of transcription and the eRNAs themselves have been implicated in enhancer function (Bose et al., 2017; Gu et al., 2018; Plank and Dean, 2014). From an evolutionary perspective, it has been proposed that eRNAs can acquire properties associated with mRNAs through mutation giving rise to protein-coding transcripts (Wu and Sharp, 2013). For example, mutation and selective pressure could purge an eRNA sequence of premature polyadenylation signals (PAS) or increase the frequency of U1 snRNP binding sites to override premature transcript termination and degradation (Almada et al., 2013; Wu and Sharp, 2013). This hypothesis is supported by the discovery that promoters are inherently bidirectional, but evolutionarily retained promoter regions are shaped by mutation to promote directional coding transcription and suppress divergent noncoding transcription (Jin et al., 2017).

In Chapter 4, I demonstrated that divergent core promoter activity can be precisely targeted and repressed using a nuclease-inactivated Cas9 protein. This proof-of-principle experiment opens up several intriguing possibilities. First, CRISPRi could be used to directly coerce the bidirectional output of an enhancer, simply by repressing one divergent core promoter and promoting directional transcription. If this repression could be sustained in cells over several generations in an artificial evolution experiment, perhaps additional genetic or epigenetic changes associated with eRNA and lncRNA gain-of-function could be identified. In addition, it is possible to perform systematic genome-wide perturbation of lncRNA expression by targeting core promoters using precise and scalable CRISPRi technology (Fulco et al., 2016; Liu et al., 2017). LncRNA functions have been directly tested by perturbation using CRISPR primarily in human cells (Joung et al., 2017; Liu et al., 2017; Zhu et al., 2016). CRISPRi screening using individual gRNAs or pooled gRNA libraries could be exploited to assess the contributions of lncRNAs towards a diverse range of cellular processes and phenotypes in vitro – for example, cellular fitness, de-regulation of checkpoints, cell morphology, or migration. For large-scale screening, it is important to implement high-throughput methods of gRNA assembly to facilitate combinatorial and scalable CRISPRi
screening. For example, many lncRNAs are associated with diseases such as cancer (Huarte, 2015), and thus screening may identify potential therapeutic targets if any lncRNAs directly affect disease pathology (Joung et al., 2017). Although this technology is only in its infancy, CRISPRi offers attractive benefits in a therapeutic setting because it can precisely target specific transcripts, does not induce permanent genetic changes in DNA sequence, and is tuneable and reversible. S. pyogenes Cas9 target sequences are limited by the requirement of an “NGG” protospacer adjacent motif (PAM), but other CRISPR proteins have been identified with orthogonal or expanded PAM requirements, expanding the experimental toolbox and range of targetable sequences (Adli, 2018; Jinek et al., 2012). Further systematic investigation of the essential roles and functions of divergent lncRNAs in natural and pathological settings will help to complete our understanding of the noncoding genome and reveal insights into how vital cellular processes can deteriorate in disease.
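
The PAM constraint mentioned above can be made concrete with a short sketch. Assuming the S. pyogenes convention of a 20-nucleotide protospacer immediately followed on its 3’ side by an NGG PAM, the function below lists candidate dCas9 target sites on both strands of a DNA sequence. The function names and the example sequence are hypothetical. The sketch ignores gRNA efficiency, off-target effects, and chromatin accessibility, all of which would matter in a real gRNA design.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_targets(seq: str, protospacer_len: int = 20):
    """List (strand, start, protospacer, pam) for every NGG-adjacent site.

    `start` is the 0-based position of the protospacer on the strand
    being scanned; for the "-" strand it refers to the reverse
    complement of the input, not to the input itself.
    """
    seq = seq.upper()
    # Zero-width lookahead so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical 23-nt sequence: 20 A residues followed by a TGG PAM.
example = "A" * 20 + "TGG"
for strand, start, protospacer, pam in find_ngg_targets(example):
    print(strand, start, protospacer, pam)
```

Candidate sites found this way near a divergent core promoter would still need to be filtered for uniqueness in the genome before use in a CRISPRi experiment.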

6.5.2 Regulation of divergent core promoters and promoter directionality across the domains of life

The transcription factor Rap1 is not unique in its ability to repress transcription by occluding core promoter elements. Other DNA-binding proteins can regulate gene expression through steric hindrance. Examples of these factors are found across all three domains of life and viruses (Table 6.1). Classic transcriptional repressor proteins such as the LexA and Lac repressors in bacteria limit transcription by sterically excluding RNA polymerase from promoter DNA, where the coding direction core promoter is usually targeted (Browning and Busby, 2004). Using synthetic systems like CRISPR interference (CRISPRi) and TALE repressors (TALE, transcription activator-like effector), steric repression of core promoters can be reconstituted in prokaryotic and eukaryotic cells (Table 6.1) (Clauß et al., 2017; Gilbert et al., 2013; Li et al., 2015; Qi et al., 2013).


Chapter 6 Discussion

Factor                                      Species or Origin           Reference

Bacteria
Trp repressor                               E. coli                     (Kumamoto et al., 1987)
LexA repressor                              E. coli                     (Little et al., 1981) (Brent and Ptashne, 1981)
Lac repressor                               E. coli                     (Sellitti et al., 1987)

Archaea
MDR1 repressor                              A. fulgidus                 (Bell et al., 1999)
LrpA repressor                              P. furiosus                 (Brinkman et al., 2000)
Phr heat shock response regulator           P. furiosus                 (Vierke et al., 2003)

Eukaryotes
AP2                                         H. sapiens                  (Getman et al., 1995)
Glucocorticoid receptor (GR)                B. taurus                   (Sakai et al., 1988)
Rap1, likely Reb1 & Abf1                    S. cerevisiae               (Wu et al., 2018b) (Challal et al., 2018)

Viruses
cI and Cro                                  Lambda (λ) bacteriophage    (Meyer et al., 1975) (Johnson et al., 1978)
T antigen                                   SV40                        (Myers et al., 1981)
LBP-1 (host factor)                         HIV-1                       (Kato et al., 1991)

Synthetic systems
dCas9 (nuclease-inactivated Cas9 mutant)    From S. pyogenes            (Gilbert et al., 2013) (Qi et al., 2013)
TALEs                                       From Xanthomonas sp.        (Li et al., 2015) (Clauß et al., 2017)


Table 6.1 Examples of transcriptional repression by steric hindrance in different organisms and systems. Several examples are listed from different sources describing proteins from all three domains of life, viruses, and synthetic repression systems that limit transcription initiation by steric hindrance. This list is not intended to be comprehensive. This table was adapted from my point-of-view article published in Transcription (Wu and Van Werven, 2019).

It remains unclear whether other transcription factors can also repress divergent core promoter activity in vivo. Rap1 has been co-opted in S. cerevisiae to drive most RP gene expression, by substitution of transcription factor motifs through evolution (Hogues et al., 2008). Other transcription factors that drive RP gene expression in other fungal species also possess “pioneer” nucleosome displacement activity or show homology to the Rap1 DBD structural motifs – namely, Abf1, Reb1, Tbf1, and Cbf1 (Hogues et al., 2008; Yan et al., 2018). Whether these transcription factors also fulfil the criteria required to block divergent transcription is not known; this hypothesis requires further investigation. Rap1 itself is functionally conserved in higher eukaryotes including humans, but its role as a transcriptional activator and regulator of nucleosome positioning at gene promoters is not conserved (de Lange, 2018). Instead, other transcription factors may regulate divergent transcription in higher eukaryotes in a similar way. Recently, the contributions of chromatin states and core promoter sequence towards promoter directionality have been assessed in humans, D. melanogaster, and C. elegans (Andersson et al., 2015; Duttke et al., 2015; Ibrahim et al., 2018). In addition, transcription factors that open DNA asymmetrically are inherently present at the edges of NDRs, where divergent core promoters are also located. Several pioneer transcription factors that fulfil these criteria have been identified by analysis of genome-wide DNase hypersensitivity in mouse ES cells, including members of the Klf/Sp, NFYA, Creb/ATF, and Zfp161 families (Sherwood et al., 2014). These factors are interesting candidates for further investigation using TSS-seq and nascent RNA-seq approaches.

To highlight key regulatory principles that control expression of divergent transcripts, it can often be helpful to compare closely or distantly
related species (Jin et al., 2017). In some eukaryotes, such as D. melanogaster, it was previously thought that there was little to no divergent transcription (Nechaev et al., 2010). However, technical advances in detection of nascent RNAs have shown that divergent transcripts are widely expressed at low levels in Drosophila (Meers et al., 2018; Rennie et al., 2018). GRO-seq (global run-on sequencing) and NET-seq (native elongating transcript sequencing) performed in Arabidopsis seedlings have also revealed limited but detectable amounts of divergent transcription at coding gene promoters (Hetzel et al., 2016; Zhu et al., 2018). Coincidentally, plant genomes harbour hundreds of transcription factors containing Myb-like DNA binding domains similar to that of Rap1 in budding yeast as a result of gene family expansion through evolution (Feller et al., 2011). Orthologs of the sequence-specific transcription factor Myb are found in retroviruses and in organisms ranging from sea urchins to humans (Davidson et al., 2013). Myb-like transcription factors regulate transcriptional responses to proliferation, differentiation, and environmental stresses (Ambawat et al., 2013). In contrast, only a handful of Myb-like proteins are present in vertebrates (Davidson et al., 2013). It is possible that expansion of gene families encoding Myb-related transcription factors helps to limit divergent transcription in plants and possibly in other lineages. Future work could combine high resolution mapping of transcription factor binding sites with comparative analysis of promoter directionality between organisms with more or less divergent transcription.
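A comparative analysis of promoter directionality needs a quantity that can be compared across promoters and organisms. One simple possibility, sketched below, is a log2 ratio of coding-direction to divergent-direction signal at each promoter. The function name, the pseudocount, and the example read counts are assumptions for illustration only; this is not presented as the metric used in this thesis or in the cited studies.

```python
import math


def directionality_score(coding_reads, divergent_reads, pseudocount=1.0):
    """Log2 ratio of coding to divergent signal at one promoter.

    Positive values indicate a bias towards the coding direction, values
    near zero indicate bidirectional output, and negative values indicate
    a bias towards the divergent direction. The pseudocount (an arbitrary
    illustrative choice) avoids division by zero at silent promoters.
    """
    if coding_reads < 0 or divergent_reads < 0:
        raise ValueError("read counts must be non-negative")
    return math.log2((coding_reads + pseudocount) / (divergent_reads + pseudocount))


if __name__ == "__main__":
    # Hypothetical counts for one promoter in two conditions.
    control = directionality_score(coding_reads=700, divergent_reads=20)
    depleted = directionality_score(coding_reads=650, divergent_reads=400)
    print(f"control: {control:.2f}, depleted: {depleted:.2f}")
    print(f"shift towards bidirectionality: {control - depleted:.2f}")
```

Because the score is a ratio, it is insensitive to overall promoter strength, which may be useful when comparing organisms whose sequencing depth or expression levels differ; normalisation between data sets would still need to be handled separately.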

6.6 Resources for scientific community

Several resources have been generated in this thesis, which may be of use to the wider scientific community. Firstly, precise TSS-seq data were generated with high sensitivity and resolution. The standard reference genome available for Saccharomyces cerevisiae, to date, still annotates coding gene start sites as the start of the ORF instead of the genuine TSS (Cherry et al., 2012; Zerbino et al., 2018). A similar misannotation issue applies to 3’ transcript ends. As mRNA 5’ and 3’ untranslated regions (UTRs) can confer key regulatory information and dictate the fate of transcripts (Mayr, 2017; Mignone et al., 2002), refinement of the reference genome is essential for accurate studies involving RNA expression.


Several data sets resulting from different methods used to map transcript isoforms and transcription start sites have been generated over the past decade, but this key information has not yet been integrated into community-based resources (Arribere and Gilbert, 2013; Cherry et al., 2012; Malabat et al., 2015; Park et al., 2014; Pelechano et al., 2013; Zerbino et al., 2018). In addition, the public release of RNA-seq and TSS-seq data sets from this work provides a resource for further bioinformatic analysis and cross-validation of other approaches (GEO: GSE110004).
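The difference between an ORF-based annotation and a TSS-based annotation can be made concrete with a short sketch that derives the 5’ UTR length from a mapped TSS and the annotated ORF start, taking strand into account. The function name, coordinate convention (1-based positions), and example coordinates are hypothetical and are not taken from the data sets of this thesis.

```python
def five_prime_utr_length(tss, orf_start, strand):
    """Length of the 5' UTR implied by a TSS and an annotated ORF start.

    Both positions are 1-based genomic coordinates (illustrative
    convention): `tss` is the first transcribed nucleotide and
    `orf_start` is the first nucleotide of the start codon.
    """
    if strand == "+":
        length = orf_start - tss
    elif strand == "-":
        # On the minus strand the TSS has the larger genomic coordinate.
        length = tss - orf_start
    else:
        raise ValueError("strand must be '+' or '-'")
    if length < 0:
        raise ValueError("TSS lies downstream of the ORF start")
    return length


if __name__ == "__main__":
    # Hypothetical genes; coordinates are made up for illustration.
    print(five_prime_utr_length(tss=100, orf_start=150, strand="+"))
    print(five_prime_utr_length(tss=500, orf_start=450, strand="-"))
```

An annotation that uses the ORF start as the gene start effectively treats this length as zero for every gene, which is the information loss described above.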

I also adapted two technologies to facilitate transcriptome analysis in our research group. I implemented a nascent RNA sequencing method to generate highly sensitive data on coding and noncoding transcription across the genome. In addition, I utilised tools (e.g. dCas9, sgRNAs) to specifically perturb initiation of lncRNA transcription in a highly precise manner using CRISPRi. These tools are available as experimentally tractable plasmids for yeast. In combination with high-resolution TSS-seq, these resources provide a widely applicable method to assess lncRNA function by integrating TSS-mapping with precise CRISPR interference of divergent core promoters (Radzisheuskaya et al., 2016). This technology could easily be scaled up using gRNA libraries to assess the function of lncRNAs in various eukaryotic species individually or in combination (Joung et al., 2017; Liu et al., 2017).

6.7 Future directions

In this section, I describe future directions for investigation arising from the work presented in this thesis. I focus primarily on three key outstanding questions, and I propose experimental strategies to address them.

6.7.1 What are the consequences if Rap1 does not repress noncoding transcription?

Aberrant noncoding transcription can affect crucial programmes of gene expression through various mechanisms. In Chapter 3, I found that the IRT2
regulatory noncoding RNA is constitutively expressed when Rap1 is depleted or its binding sites at the RPL43B promoter are deleted (Figure 3.2, Figure 3.3). Due to its role in the regulatory circuit comprising IRT2, IRT1, and IME1, constitutive expression of IRT2 leads to higher expression of IME1, the master regulator gene for entry into sporulation or gametogenesis in yeast. The outcome of this crucial cell fate decision may indeed be biased due to de-repression of IRT2, and the effects on sporulation efficiency could be measured using a sporulation assay in small batch cultures (Moretto et al., 2018; Weidberg et al., 2016). In contrast, inactivation of Rap1 leads to a significant reduction in MLP1 expression at the RPL40B locus – likely through transcriptional interference with the coding TSS (Figure 3.2, Figure 3.3). Mlp1 affects mRNA export by mediating nuclear retention of unspliced mRNAs, and reduced Mlp1 expression could significantly affect global mRNA export (Galy et al., 2004). It would be interesting to directly assess the consequences for export and translation of spliced mRNAs in yeast, perhaps by tracking individual transcripts using single-molecule RNA FISH or by measuring the abundance of spliced and unspliced mRNAs in nuclear and cytoplasmic fractions. Finally, I showed that depletion of Rap1 induced noncoding transcription at hundreds of sites across the genome (Figure 3.5, Figure 3.6). Aberrant noncoding transcription can generate extensive R-loops across intragenic and intergenic regions, leading to DNA damage and genome instability (Hamperl et al., 2017; Nojima et al., 2018). To further assess the extent and consequences of this phenomenon after depletion of Rap1, genome-wide experiments involving DRIP-seq, Break-seq, and γ-H2A.X ChIP-seq could be performed (Hoffman et al., 2015; Iacovoni et al., 2010; Skourti-Stathaki et al., 2011).

Taken together, these examples highlight the diverse ways in which aberrant noncoding transcription can affect cellular function and fitness. In order to identify potential causes of disease pathology and cellular de-regulation related to expression of noncoding RNAs, it is important to fully identify, characterise, and understand the regulation of divergent and noncoding RNAs.


6.7.2 Can other sequence-specific transcription factors also control divergent transcription in a similar manner to Rap1?

Preliminary studies identified that additional sequence-specific transcription factors in budding yeast may control divergent promoter activity in a similar manner to Rap1 (Figure 4.7). These candidates could be validated using a similar approach, using gene deletion or inducible protein depletion alleles in conjunction with complementary genome-wide methods including nascent RNA-seq, TSS-seq, ChIP-seq, and MNase-seq. Potential candidates in yeast include Cbf1, Gcr1, Abf1, and Reb1. As discussed above, it would also be interesting to identify and characterise additional transcription factors that share structural homology with the Rap1 DNA-binding domain – in particular, the large family of Myb-like transcription factors in plants. However, because the AID system is endogenous to plant cells, alternative methods of ablating candidate gene expression would be required – e.g. RNAi or temperature-sensitive mutants. In the course of this work, I attempted to combine the nascent RNA enrichment method with the TSS-seq approach to measure TSS usage of nascent RNA polymerase II transcripts. This attempt was initially unsuccessful, but a recent study successfully combined biochemical fractionation for nuclear RNA enrichment with CAGE-seq to develop NET-CAGE, a method that identifies nascent transcript TSSs at single nucleotide resolution (Hirabayashi et al., 2019).

Within the steric hindrance model proposed, one stipulation is that the DNA-binding protein must be stably bound near the core promoter. If, for example, a transcription factor displayed a high turnover rate for DNA binding and the duration of each binding event were short, there might be more opportunities for RNA polymerase and additional coactivators to utilise the divergent core promoter. A previous study measured Rap1 turnover at specific binding sites genome-wide using an epitope switching method in conjunction with tiling microarrays, but this approach was limited by the kinetics of epitope-tagged protein induction in response to galactose (Lickwar et al., 2012). The advent of single-molecule live-cell imaging now enables measurement of DNA-binding stability, turnover, and other kinetics for sequence-specific transcription factors. For example, candidate transcription factors could be endogenously fused with the HALO tag and
complementary fluorescent ligands for PALM microscopy could be introduced to various cellular systems in vitro (Donovan et al., 2018b; Mehta et al., 2018; Teves et al., 2018; Teves et al., 2016). Dual colour labelling to determine locus-specific binding events has been successful in budding yeast, using a CUP1 or LexO array GFP labelling system (Mehta et al., 2018). This additional masking distinguishes binding events at specific loci and is essential for imaging of Rap1, because Rap1 binds to DNA and forms clusters at telomeres in addition to gene promoters. This filter should be dispensable for most other sequence-specific transcription factors. Finally, it is unclear whether the coding and divergent transcription events from separate core promoters occur simultaneously. This temporal information is not available from experiments that assess bulk RNA expression performed on a population of cells in small batch cultures. Conceivably, fluorescent reporter systems for nascent RNA transcription (e.g. PP7, MS2) could be combined with divergent promoter constructs to measure divergent and coding transcription using live cell imaging (Donovan et al., 2018b; Lenstra et al., 2015). Together, these complementary approaches provide a framework for future investigation of additional transcription factors that may also control divergent transcription and promoter directionality in eukaryotes.
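The reasoning about binding stability in the steric hindrance model can be sketched with a toy two-state model, in which a site switches between a bound and an unbound state with constant rates. The rate constants, function name, and units below are hypothetical; the model ignores competition with nucleosomes and the time needed for pre-initiation complex assembly, so it is only a rough illustration of why occupancy alone does not capture turnover.

```python
def two_state_summary(k_on, k_off):
    """Summarise a two-state (bound/unbound) binding model.

    k_on  : rate of switching from unbound to bound (per unit time)
    k_off : rate of switching from bound to unbound (per unit time)
    Both rates are hypothetical parameters, not measured values.
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rates must be positive")
    return {
        # Fraction of time the site is occupied at steady state.
        "occupancy": k_on / (k_on + k_off),
        # Mean duration of a single binding event.
        "mean_residence_time": 1.0 / k_off,
        # Mean duration of each unbound window.
        "mean_unbound_time": 1.0 / k_on,
        # One unbound window opens per bind/unbind cycle.
        "unbound_windows_per_time": (k_on * k_off) / (k_on + k_off),
    }


if __name__ == "__main__":
    stable = two_state_summary(k_on=9.0, k_off=1.0)
    dynamic = two_state_summary(k_on=90.0, k_off=10.0)
    for name, result in (("stable", stable), ("dynamic", dynamic)):
        print(name, result)
```

In this sketch the two hypothetical factors have the same steady-state occupancy, but the faster-cycling factor exposes the site ten times more often for correspondingly shorter periods. Whether such brief windows are sufficient for initiation is exactly the kind of question that residence time measurements could help to address.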

6.7.3 Which additional factors regulate divergent promoter activity?

In Chapter 5, I identified that a sequence-specific transcription factor and an ATP-dependent chromatin remodeller both play important roles in regulating divergent promoter activity. However, divergent transcription still occurred in the absence of both Rap1 and RSC at two-thirds of Rap1-regulated genes in yeast (Figure 3.6). The full spectrum of factors that drive or influence divergent promoter activity remains to be identified. For example, high-resolution ChIP-exo or ChIP-nexus mapping of TBP and general transcription factors could verify whether recruitment of initiation factors to divergent core promoters is directly affected by Rap1 or chromatin remodeller depletion (Doris et al., 2018; He et al., 2015; Rhee and Pugh, 2012). Individual perturbation of transcription factors or chromatin remodellers offers deeper mechanistic information, but is limited in terms of throughput and speed. Alternatively, unbiased proteomics approaches could identify additional
factors that regulate divergent promoter activity at many model loci. Recent advances that exploit nuclease-inactivated Cas9 (dCas9) complexes facilitate single-locus proteomics studies of nascent or cross-linked chromatin (Tsui et al., 2018). In addition, it would be informative to apply proximity-dependent biotin labelling through the dCas9-APEX system to label and immunoprecipitate proteins and complexes that interact only transiently with divergent promoters (Myers et al., 2018). Although further optimisation and dedicated proteomics mass spectrometry analysis are required to implement these methods, they should help to uncover additional regulators of bidirectional transcription.


Appendix

Chapter 7. Appendix

7.1 Copyright Permissions

Part of the work in this thesis has been published in two separate papers I have authored in Molecular Cell and Transcription (Wu et al., 2018b; Wu and Van Werven, 2019). The research published in Molecular Cell was first posted on bioRxiv as a preprint.

The copyright for all three publications rests with the author(s). The articles below in Molecular Cell and Transcription are open access articles distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The copyright holder for the preprint article (Wu et al., 2018a) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse without permission.

Wu, A.C.K., Patel, H., Chia, M., Moretto, F., Frith, D., Snijders, A.P., and van Werven, F.J. (2018). Repression of Divergent Noncoding Transcription by a Sequence-Specific Transcription Factor. Mol Cell 72, 942-954 e947. Copyright © 2018 The Author(s)

Wu, A.C.K., Patel, H., Chia, M., Moretto, F., Frith, D., Snijders, A.P., and van Werven, F.J. (2018). Repression of divergent noncoding transcription by a sequence-specific transcription factor. bioRxiv, doi:10.1101/314310. Copyright © 2018 The Author(s)/Funder

Wu, A.C.K., and Van Werven, F.J. (2019). Transcribe this way: Rap1 confers promoter directionality by repressing divergent transcription. Transcription 10, 164-170. Copyright © 2019 The Author(s)


References

Chapter 8. References

Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat Rev Genet 13, 720-731.

Adiconis, X., Haber, A.L., Simmons, S.K., Levy Moonshine, A., Ji, Z., Busby, M.A., Shi, X., Jacques, J., Lancaster, M.A., Pan, J.Q., et al. (2018). Comprehensive comparative analysis of 5'-end RNA-sequencing methods. Nat Methods 15, 505-511.

Adjalley, S.H., Chabbert, C.D., Klaus, B., Pelechano, V., and Steinmetz, L.M. (2016). Landscape and Dynamics of Transcription Initiation in the Malaria Parasite Plasmodium falciparum. Cell Rep 14, 2463-2475.

Adli, M. (2018). The CRISPR tool kit for genome editing and beyond. Nature communications 9, 1911.

Aguilera, A., and Garcia-Muse, T. (2012). R loops: from transcription byproducts to threats to genome stability. Mol Cell 46, 115-124.

Albert, B., Knight, B., Merwin, J., Martin, V., Ottoz, D., Gloor, Y., Bruzzone, M.J., Rudner, A., and Shore, D. (2016). A Molecular Titration System Coordinates Ribosomal Protein Gene Transcription with Ribosomal RNA Synthesis. Mol Cell 64, 720-733.

Albert, B., Tomassetti, S., Gloor, Y., Dilg, D., Mattarocci, S., Kubik, S., Hafner, L., and Shore, D. (2019). Sfp1 regulates transcriptional networks driving cell growth and division through multiple promoter-binding modes. Genes Dev 33, 288-293.

Albert, I., Mavrich, T.N., Tomsho, L.P., Qi, J., Zanton, S.J., Schuster, S.C., and Pugh, B.F. (2007). Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature 446, 572-576.

Alcid, E.A., and Tsukiyama, T. (2014). ATP-dependent chromatin remodeling shapes the long noncoding RNA landscape. Genes Dev 28, 2348-2360.

Alexander, R.P., Fang, G., Rozowsky, J., Snyder, M., and Gerstein, M.B. (2010). Annotating non-coding regions of the genome. Nat Rev Genet 11, 559-571.

Ali, M., Rincon-Arano, H., Zhao, W., Rothbart, S.B., Tong, Q., Parkhurst, S.M., Strahl, B.D., Deng, L.W., Groudine, M., and Kutateladze, T.G. (2013). Molecular basis for chromatin binding and regulation of MLL5. Proc Natl Acad Sci U S A 110, 11296-11301.

Allis, C.D., and Jenuwein, T. (2016). The molecular hallmarks of epigenetic control. Nat Rev Genet 17, 487-500.

Almada, A.E., Wu, X., Kriz, A.J., Burge, C.B., and Sharp, P.A. (2013). Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499, 360-363.

Ambawat, S., Sharma, P., Yadav, N.R., and Yadav, R.C. (2013). MYB transcription factor genes as regulators for plant responses: an overview. Physiol Mol Biol Plants 19, 307-321.

Andersen, P.R., Domanski, M., Kristiansen, M.S., Storvall, H., Ntini, E., Verheggen, C., Schein, A., Bunkenborg, J., Poser, I., Hallais, M., et al. (2013). The human cap-binding complex is functionally connected to the nuclear RNA exosome. Nat Struct Mol Biol 20, 1367-1376.

Andersson, R., Chen, Y., Core, L., Lis, J.T., Sandelin, A., and Jensen, T.H. (2015). Human Gene Promoters Are Intrinsically Bidirectional. Mol Cell 60, 346-347.

Andersson, R., Refsing Andersen, P., Valen, E., Core, L.J., Bornholdt, J., Boyd, M., Heick Jensen, T., and Sandelin, A. (2014). Nuclear stability and transcriptional directionality separate functionally distinct RNA species. Nature communications 5, 5336.

Andolfatto, P. (2005). Adaptive evolution of non-coding DNA in Drosophila. Nature 437, 1149-1152.

Andrulis, E.D., Neiman, A.M., Zappulla, D.C., and Sternglanz, R. (1998). Perinuclear localization of chromatin facilitates transcriptional silencing. Nature 394, 592-595.

Ard, R., Allshire, R.C., and Marquardt, S. (2017). Emerging Properties and Functional Consequences of Noncoding Transcription. Genetics 207, 357-367.

Arigo, J.T., Carroll, K.L., Ames, J.M., and Corden, J.L. (2006a). Regulation of yeast NRD1 expression by premature transcription termination. Mol Cell 21, 641-651.


Arigo, J.T., Eyler, D.E., Carroll, K.L., and Corden, J.L. (2006b). Termination of cryptic unstable transcripts is directed by yeast RNA-binding proteins Nrd1 and Nab3. Mol Cell 23, 841-851.

Arribere, J.A., and Gilbert, W.V. (2013). Roles for transcript leaders in translation and mRNA decay revealed by transcript leader sequencing. Genome Res 23, 977-987.

Ashe, M.P., Pearson, L.H., and Proudfoot, N.J. (1997). The HIV-1 5' LTR poly(A) site is inactivated by U1 snRNP interaction with the downstream major splice donor site. EMBO J 16, 5752-5763.

Azad, G.K., and Tomar, R.S. (2016). The multifunctional transcription factor Rap1: a regulator of yeast physiology. Front Biosci (Landmark Ed) 21, 918-930.

Azzalin, C.M., Reichenbach, P., Khoriauli, L., Giulotto, E., and Lingner, J. (2007). Telomeric repeat containing RNA and RNA surveillance factors at mammalian chromosome ends. Science 318, 798-801.

Badis, G., Chan, E.T., van Bakel, H., Pena-Castillo, L., Tillo, D., Tsui, K., Carlson, C.D., Gossett, A.J., Hasinoff, M.J., Warren, C.L., et al. (2008). A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters. Mol Cell 32, 878-887.

Bagchi, D.N., and Iyer, V.R. (2016). The Determinants of Directionality in Transcriptional Initiation. Trends Genet 32, 322-333.

Bai, L., and Morozov, A.V. (2010). Gene regulation by nucleosome positioning. Trends Genet 26, 476-483.

Balbin, O.A., Malik, R., Dhanasekaran, S.M., Prensner, J.R., Cao, X., Wu, Y.M., Robinson, D., Wang, R., Chen, G., Beer, D.G., et al. (2015). The landscape of antisense gene expression in human cancers. Genome Res 25, 1068-1079.

Baptista, T., Grunberg, S., Minoungou, N., Koster, M.J.E., Timmers, H.T.M., Hahn, S., Devys, D., and Tora, L. (2017). SAGA Is a General Cofactor for RNA Polymerase II Transcription. Mol Cell 68, 130-143 e135.

Basehoar, A.D., Zanton, S.J., and Pugh, B.F. (2004). Identification and distinct regulation of yeast TATA box-containing genes. Cell 116, 699-709.


Baßler, J., and Hurt, E. (2019). Eukaryotic Ribosome Assembly. Annu Rev Biochem 88, 281-306.

Baumann, D.G., and Gilmour, D.S. (2017). A sequence-specific core promoter-binding transcription factor recruits TRF2 to coordinately transcribe ribosomal protein genes. Nucleic Acids Res 45, 10481-10491.

Beaulieu, Y.B., Kleinman, C.L., Landry-Voyer, A.M., Majewski, J., and Bachand, F. (2012). Polyadenylation-dependent control of long noncoding RNA expression by the poly(A)-binding protein nuclear 1. PLoS Genet 8, e1003078.

Bell, S.D., Cairns, S.S., Robson, R.L., and Jackson, S.P. (1999). Transcriptional regulation of an archaeal operon in vivo and in vitro. Mol Cell 4, 971-982.

Belotserkovskaya, R., Oh, S., Bondarenko, V.A., Orphanides, G., Studitsky, V.M., and Reinberg, D. (2003). FACT facilitates transcription-dependent nucleosome alteration. Science 301, 1090-1093.

Beltran, M., Yates, C.M., Skalska, L., Dawson, M., Reis, F.P., Viiri, K., Fisher, C.L., Sibley, C.R., Foster, B.M., Bartke, T., et al. (2016). The interaction of PRC2 with RNA or chromatin is mutually antagonistic. Genome Res 26, 896-907.

Bendjennat, M., and Weil, P.A. (2008). The transcriptional repressor activator protein Rap1p is a direct regulator of TATA-binding protein. J Biol Chem 283, 8699-8710.

Berger, A.B., Decourty, L., Badis, G., Nehrbass, U., Jacquier, A., and Gadal, O. (2007). Hmo1 is required for TOR-dependent regulation of ribosomal protein gene transcription. Mol Cell Biol 27, 8015-8026.

Borneman, A.R., Gianoulis, T.A., Zhang, Z.D., Yu, H., Rozowsky, J., Seringhaus, M.R., Wang, L.Y., Gerstein, M., and Snyder, M. (2007). Divergence of transcription factor binding sites across related yeast species. Science 317, 815-819.

Bose, D.A., Donahue, G., Reinberg, D., Shiekhattar, R., Bonasio, R., and Berger, S.L. (2017). RNA Binding to CBP Stimulates Histone Acetylation and Transcription. Cell 168, 135-149 e122.

Bosio, M.C., Fermi, B., and Dieci, G. (2017). Transcriptional control of yeast ribosome biogenesis: A multifaceted role for general regulatory factors. Transcription 8, 254-260.


Brahma, S., and Henikoff, S. (2019). RSC-Associated Subnucleosomes Define MNase-Sensitive Promoters in Yeast. Mol Cell 73, 238-249 e233.

Bregman, A., Avraham-Kelbert, M., Barkai, O., Duek, L., Guterman, A., and Choder, M. (2011). Promoter elements regulate cytoplasmic mRNA decay. Cell 147, 1473-1483.

Brent, R., and Ptashne, M. (1981). Mechanism of action of the lexA gene product. Proc Natl Acad Sci U S A 78, 4204-4208.

Briand, J.-B. (2015). Etude du contrôle de la transcription envahissante par la terminaison de la transcription. In École Doctorale 426: Gènes Génomes Cellules (Paris, Université Paris Sud - Paris XI).

Brickner, D.G., Ahmed, S., Meldi, L., Thompson, A., Light, W., Young, M., Hickman, T.L., Chu, F., Fabre, E., and Brickner, J.H. (2012). Transcription factor binding to a DNA zip code controls interchromosomal clustering at the nuclear periphery. Dev Cell 22, 1234-1246.

Brickner, D.G., Randise-Hinchliff, C., Lebrun Corbin, M., Liang, J.M., Kim, S., Sump, B., D'Urso, A., Kim, S.H., Satomura, A., Schmit, H., et al. (2019). The Role of Transcription Factors and Nuclear Pore Proteins in Controlling the Spatial Organization of the Yeast Genome. Dev Cell 49, 936-947 e934.

Briggs, S.D., Bryk, M., Strahl, B.D., Cheung, W.L., Davie, J.K., Dent, S.Y., Winston, F., and Allis, C.D. (2001). Histone H3 lysine 4 methylation is mediated by Set1 and required for cell growth and rDNA silencing in Saccharomyces cerevisiae. Genes Dev 15, 3286-3295.

Brinkman, A.B., Dahlke, I., Tuininga, J.E., Lammers, T., Dumay, V., de Heus, E., Lebbink, J.H., Thomm, M., de Vos, W.M., and van Der Oost, J. (2000). An Lrp-like transcriptional regulator from the archaeon Pyrococcus furiosus is negatively autoregulated. J Biol Chem 275, 38160-38169.

Britten, R.J., and Davidson, E.H. (1969). Gene regulation for higher cells: a theory. Science 165, 349-357.

Browning, D.F., and Busby, S.J. (2004). The regulation of bacterial transcription initiation. Nat Rev Microbiol 2, 57-65.


Buck, S.W., and Shore, D. (1995). Action of a RAP1 carboxy-terminal silencing domain reveals an underlying competition between HMR and telomeres in yeast. Genes & Development 9, 370-384.

Bumgarner, S.L., Dowell, R.D., Grisafi, P., Gifford, D.K., and Fink, G.R. (2009). Toggle involving cis-interfering noncoding RNAs controls variegated gene expression in yeast. Proc Natl Acad Sci U S A 106, 18321-18326.

Buratowski, S. (2009). Progression through the RNA polymerase II CTD cycle. Mol Cell 36, 541-546.

Buratowski, S., and Kim, T. (2010). The role of cotranscriptional histone methylations. Cold Spring Harb Symp Quant Biol 75, 95-102.

Cairns, B.R., Lorch, Y., Li, Y., Zhang, M., Lacomis, L., Erdjument-Bromage, H., Tempst, P., Du, J., Laurent, B., and Kornberg, R.D. (1996). RSC, an essential, abundant chromatin-remodeling complex. Cell 87, 1249-1260.

Camblong, J., Iglesias, N., Fickentscher, C., Dieppois, G., and Stutz, F. (2007). Antisense RNA stabilization induces transcriptional gene silencing via histone deacetylation in S. cerevisiae. Cell 131, 706-717.

Candelli, T., Challal, D., Briand, J.B., Boulay, J., Porrua, O., Colin, J., and Libri, D. (2018). High-resolution transcription maps reveal the widespread impact of roadblock termination in yeast. EMBO J 37, e97490.

Canzio, D., Nwakeze, C.L., Horta, A., Rajkumar, S.M., Coffey, E.L., Duffy, E.E., Duffie, R., Monahan, K., O'Keeffe, S., Simon, M.D., et al. (2019). Antisense lncRNA Transcription Mediates DNA Demethylation to Drive Stochastic Protocadherin alpha Promoter Choice. Cell 177, 639-653 e615.

Carninci, P., Kasukawa, T., Katayama, S., Gough, J., Frith, M.C., Maeda, N., Oyama, R., Ravasi, T., Lenhard, B., Wells, C., et al. (2005). The transcriptional landscape of the mammalian genome. Science 309, 1559-1563.

Carrozza, M.J., Li, B., Florens, L., Suganuma, T., Swanson, S.K., Lee, K.K., Shia, W.J., Anderson, S., Yates, J., Washburn, M.P., et al. (2005). Histone H3 methylation by Set2 directs deacetylation of coding regions by Rpd3S to suppress spurious intragenic transcription. Cell 123, 581-592.


Carter, R., and Drouin, G. (2009). Structural differentiation of the three eukaryotic RNA polymerases. Genomics 94, 388-396.

Carvalho, S., Raposo, A.C., Martins, F.B., Grosso, A.R., Sridhara, S.C., Rino, J., Carmo-Fonseca, M., and de Almeida, S.F. (2013). Histone methyltransferase SETD2 coordinates FACT recruitment with nucleosome dynamics during transcription. Nucleic Acids Res 41, 2881-2893.

Challal, D., Barucco, M., Kubik, S., Feuerbach, F., Candelli, T., Geoffroy, H., Benaksas, C., Shore, D., and Libri, D. (2018). General Regulatory Factors Control the Fidelity of Transcription by Restricting Non-coding and Ectopic Initiation. Mol Cell 72, 955-969 e957.

Chanfreau, G., Noble, S.M., and Guthrie, C. (1996). Essential yeast protein with unexpected similarity to subunits of mammalian cleavage and polyadenylation specificity factor (CPSF). Science 274, 1511-1514.

Chang, J.S., and Winston, F. (2011). Spt10 and Spt21 are required for transcriptional silencing in Saccharomyces cerevisiae. Eukaryot Cell 10, 118-129.

Chen, J., Tresenrider, A., Chia, M., McSwiggen, D.T., Spedale, G., Jorgensen, V., Liao, H., van Werven, F.J., and Unal, E. (2017). Kinetochore inactivation by expression of a repressive mRNA. Elife 6, 27417.

Chen, K., Hu, Z., Xia, Z., Zhao, D., Li, W., and Tyler, J.K. (2015). The Overlooked Fact: Fundamental Need for Spike-In Control for Virtually All Genome-Wide Analyses. Mol Cell Biol 36, 662-667.

Chen, K., Xi, Y., Pan, X., Li, Z., Kaestner, K., Tyler, J., Dent, S., He, X., and Li, W. (2013). DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome Res 23, 341-351.

Cheng, Z., Otto, G.M., Powers, E.N., Keskin, A., Mertins, P., Carr, S.A., Jovanovic, M., and Brar, G.A. (2018). Pervasive, Coordinated Protein-Level Changes Driven by Transcript Isoform Switching during Meiosis. Cell 172, 910-923 e916.

Cherry, J.M., Hong, E.L., Amundsen, C., Balakrishnan, R., Binkley, G., Chan, E.T., Christie, K.R., Costanzo, M.C., Dwight, S.S., Engel, S.R., et al. (2012). Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res 40, D700-705.


Cheung, A.C., and Cramer, P. (2011). Structural basis of RNA polymerase II backtracking, arrest and reactivation. Nature 471, 249-253.

Cheung, V., Chua, G., Batada, N.N., Landry, C.R., Michnick, S.W., Hughes, T.R., and Winston, F. (2008). Chromatin- and transcription-related factors repress transcription from within coding regions throughout the Saccharomyces cerevisiae genome. PLoS Biol 6, e277.

Chia, M., Tresenrider, A., Chen, J., Spedale, G., Jorgensen, V., Unal, E., and van Werven, F.J. (2017). Transcription of a 5' extended mRNA isoform directs dynamic chromatin changes and interference of a downstream promoter. Elife 6, 27420.

Chiu, A.C., Suzuki, H.I., Wu, X., Mahat, D.B., Kriz, A.J., and Sharp, P.A. (2018). Transcriptional Pause Sites Delineate Stable Nucleosome-Associated Premature Polyadenylation Suppressed by U1 snRNP. Mol Cell 69, 648-663 e647.

Churchman, L.S. (2017). Not Just Noise: Genomics and Genetics Bring Long Noncoding RNAs into Focus. Mol Cell 65, 1-2.

Churchman, L.S., and Weissman, J.S. (2011). Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature 469, 368-373.

Churchman, L.S., and Weissman, J.S. (2012). Native elongating transcript sequencing (NET-seq). Curr Protoc Mol Biol Chapter 4, Unit 4 14 11-17.

Chymkowitch, P., Nguea, A.P., Aanes, H., Koehler, C.J., Thiede, B., Lorenz, S., Meza-Zepeda, L.A., Klungland, A., and Enserink, J.M. (2015). Sumoylation of Rap1 mediates the recruitment of TFIID to promote transcription of ribosomal protein genes. Genome Res 25, 897-906.

Clapier, C.R., Iwasa, J., Cairns, B.R., and Peterson, C.L. (2017). Mechanisms of action and regulation of ATP-dependent chromatin-remodelling complexes. Nature reviews. Molecular cell biology 18, 407-422.

Clark, D.J. (2010). Nucleosome positioning, nucleosome spacing and the nucleosome code. J Biomol Struct Dyn 27, 781-793.

Clauß, K., Popp, A.P., Schulze, L., Hettich, J., Reisser, M., Escoter Torres, L., Uhlenhaut, N.H., and Gebhardt, J.C.M. (2017). DNA residence time is a regulatory factor of transcription repression. Nucleic Acids Res 45, 11121-11130.

Colin, J., Candelli, T., Porrua, O., Boulay, J., Zhu, C., Lacroute, F., Steinmetz, L.M., and Libri, D. (2014). Roadblock termination by reb1p restricts cryptic and readthrough transcription. Mol Cell 56, 667-680.

Cong, L., Ran, F.A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P.D., Wu, X., Jiang, W., Marraffini, L.A., et al. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823.

Consortium, E.P. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74.

Core, L., and Adelman, K. (2019). Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation. Genes Dev 33, 960-982.

Core, L.J., Martins, A.L., Danko, C.G., Waters, C.T., Siepel, A., and Lis, J.T. (2014). Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat Genet 46, 1311-1320.

Core, L.J., Waterfall, J.J., and Lis, J.T. (2008). Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845-1848.

Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26, 1367-1372.

Cramer, P. (2002). Multisubunit RNA polymerases. Current Opinion in Structural Biology 12, 89-97.

Creamer, T.J., Darby, M.M., Jamonnak, N., Schaughency, P., Hao, H., Wheelan, S.J., and Corden, J.L. (2011). Transcriptome-wide binding sites for components of the Saccharomyces cerevisiae non-poly(A) termination pathway: Nrd1, Nab3, and Sen1. PLoS Genet 7, e1002329.

Czaja, W., Mao, P., and Smerdon, M.J. (2014). Chromatin remodelling complex RSC promotes base excision repair in chromatin of Saccharomyces cerevisiae. DNA Repair (Amst) 16, 35-43.

Daer, R.M., Cutts, J.P., Brafman, D.A., and Haynes, K.A. (2017). The Impact of Chromatin Dynamics on Cas9-Mediated Genome Editing in Human Cells. ACS Synth Biol 6, 428-438.

David, L., Huber, W., Granovskaia, M., Toedling, J., Palm, C.J., Bofkin, L., Jones, T., Davis, R.W., and Steinmetz, L.M. (2006). A high-resolution map of transcription in the yeast genome. Proceedings of the National Academy of Sciences 103, 5320-5325.

Davidson, C.J., Guthrie, E.E., and Lipsick, J.S. (2013). Duplication and maintenance of the Myb genes of vertebrate animals. Biol Open 2, 101-110.

Davis, C.A., and Ares, M., Jr. (2006). Accumulation of unstable promoter-associated transcripts upon loss of the nuclear exosome subunit Rrp6p in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 103, 3262-3267.

de Boer, C.G., and Hughes, T.R. (2012). YeTFaSCo: a database of evaluated yeast transcription factor sequence specificities. Nucleic Acids Res 40, D169-179.

de Lange, T. (2018). Shelterin-Mediated Telomere Protection. Annu Rev Genet 52, 223-247.

DeGennaro, C.M., Alver, B.H., Marguerat, S., Stepanova, E., Davis, C.P., Bahler, J., Park, P.J., and Winston, F. (2013). Spt6 regulates intragenic and antisense transcription, nucleosome positioning, and histone modifications genome-wide in fission yeast. Mol Cell Biol 33, 4779-4792.

Derrien, T., Johnson, R., Bussotti, G., Tanzer, A., Djebali, S., Tilgner, H., Guernec, G., Martin, D., Merkel, A., Knowles, D.G., et al. (2012). The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res 22, 1775-1789.

Dharmasiri, N., Dharmasiri, S., and Estelle, M. (2005). The F-box protein TIR1 is an auxin receptor. Nature 435, 441-445.

Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21.

Donovan, B.T., Chen, H., Jipa, C., Bai, L., and Poirier, M.G. (2018a). Dissociation rate compensation mechanism for budding yeast pioneer transcription factors. bioRxiv, 441469.

Donovan, B.T., Huynh, A., Ball, D.A., Poirier, M.G., Larson, D.R., Ferguson, M.L., and Lenstra, T.L. (2018b). Single-molecule imaging reveals the interplay between transcription factors, nucleosomes, and transcriptional bursting. bioRxiv, 404681.

Doris, S.M., Chuang, J., Viktorovskaya, O., Murawska, M., Spatt, D., Churchman, L.S., and Winston, F. (2018). Spt6 Is Required for the Fidelity of Promoter Selection. Mol Cell 72, 687-699 e686.

Downey, M., Knight, B., Vashisht, A.A., Seller, C.A., Wohlschlegel, J.A., Shore, D., and Toczyski, D.P. (2013). Gcn5 and sirtuins regulate acetylation of the ribosomal protein transcription factor Ifh1. Curr Biol 23, 1638-1648.

du Mee, D.J.M., Ivanov, M., Parker, J.P., Buratowski, S., and Marquardt, S. (2018). Efficient termination of nuclear lncRNA transcription promotes mitochondrial genome maintenance. Elife 7, 31989.

Duttke, S.H.C., Lacadie, S.A., Ibrahim, M.M., Glass, C.K., Corcoran, D.L., Benner, C., Heinz, S., Kadonaga, J.T., and Ohler, U. (2015). Human promoters are intrinsically directional. Mol Cell 57, 674-684.

Eick, D., and Geyer, M. (2013). The RNA polymerase II carboxy-terminal domain (CTD) code. Chem Rev 113, 8456-8490.

El Kaderi, B., Medler, S., Raghunayakula, S., and Ansari, A. (2009). Gene looping is conferred by activator-dependent interaction of transcription initiation and termination machineries. J Biol Chem 284, 25015-25025.

Engreitz, J.M., Haines, J.E., Perez, E.M., Munson, G., Chen, J., Kane, M., McDonel, P.E., Guttman, M., and Lander, E.S. (2016). Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539, 452-455.

Erwin, G.S., Grieshop, M.P., Ali, A., Qi, J., Lawlor, M., Kumar, D., Ahmad, I., McNally, A., Teider, N., Worringer, K., et al. (2017). Synthetic transcription elongation factors license transcription across repressive chromatin. Science 358, 1617-1622.

Fazzio, T.G., Kooperberg, C., Goldmark, J.P., Neal, C., Basom, R., Delrow, J., and Tsukiyama, T. (2001). Widespread collaboration of Isw2 and Sin3-Rpd3 chromatin remodeling complexes in transcriptional repression. Mol Cell Biol 21, 6450-6460.

Feeser, E.A., and Wolberger, C. (2008). Structural and functional studies of the Rap1 C-terminus reveal novel separation-of-function mutants. J Mol Biol 380, 520-531.

Feldman, J.L., and Peterson, C.L. (2019). Yeast Sirtuin Family Members Maintain Transcription Homeostasis to Ensure Genome Stability. Cell Rep 27, 2978-2989 e2975.

Feldmann, E.A., De Bona, P., and Galletto, R. (2015). The wrapping loop and Rap1 C-terminal (RCT) domain of yeast Rap1 modulate access to different DNA binding modes. J Biol Chem 290, 11455-11466.

Feldmann, E.A., and Galletto, R. (2014). The DNA-binding domain of yeast Rap1 interacts with double-stranded DNA in multiple binding modes. Biochemistry 53, 7471-7483.

Feller, A., Machemer, K., Braun, E.L., and Grotewold, E. (2011). Evolutionary and comparative analysis of MYB and bHLH plant transcription factors. Plant J 66, 94-116.

Feng, J., Gan, H., Eaton, M.L., Zhou, H., Li, S., Belsky, J.A., MacAlpine, D.M., Zhang, Z., and Li, Q. (2016). Noncoding Transcription Is a Driving Force for Nucleosome Instability in spt16 Mutant Cells. Mol Cell Biol 36, 1856-1867.

Fermi, B., Bosio, M.C., and Dieci, G. (2016). Promoter architecture and transcriptional regulation of Abf1-dependent ribosomal protein genes in Saccharomyces cerevisiae. Nucleic Acids Res 44, 6113-6126.

Fischl, H., Howe, F.S., Furger, A., and Mellor, J. (2017). Paf1 Has Distinct Roles in Transcription Elongation and Differential Transcript Fate. Mol Cell 65, 685-698 e688.

Flynn, R.A., Almada, A.E., Zamudio, J.R., and Sharp, P.A. (2011). Antisense RNA polymerase II divergent transcripts are P-TEFb dependent and substrates for the RNA exosome. Proc Natl Acad Sci U S A 108, 10460-10465.

Fong, N., Saldi, T., Sheridan, R.M., Cortazar, M.A., and Bentley, D.L. (2017). RNA Pol II Dynamics Modulate Co-transcriptional Chromatin Modification, CTD Phosphorylation, and Transcriptional Direction. Mol Cell 66, 546-557 e543.

Forsburg, S.L., and Guarente, L. (1989). Identification and characterization of HAP4: a third component of the CCAAT-bound HAP2/HAP3 heteromer. Genes Dev 3, 1166-1178.

Frankel, A.D., and Kim, P.S. (1991). Modular structure of transcription factors: implications for gene regulation. Cell 65, 717-719.

Freeman, K., Gwadz, M., and Shore, D. (1995). Molecular and genetic analysis of the toxic effect of RAP1 overexpression in yeast. Genetics 141, 1253-1262.

Frenk, S., Oxley, D., and Houseley, J. (2014). The nuclear exosome is active and important during budding yeast meiosis. PLoS One 9, e107648.

Fulco, C.P., Munschauer, M., Anyoha, R., Munson, G., Grossman, S.R., Perez, E.M., Kane, M., Cleary, B., Lander, E.S., and Engreitz, J.M. (2016). Systematic mapping of functional enhancer-promoter connections with CRISPR interference. Science 354, 769-773.

Gadbois, E.L., Chao, D.M., Reese, J.C., Green, M.R., and Young, R.A. (1997). Functional antagonism between RNA polymerase II holoenzyme and global negative regulator NC2 in vivo. Proc Natl Acad Sci U S A 94, 3145-3150.

Galy, V., Gadal, O., Fromont-Racine, M., Romano, A., Jacquier, A., and Nehrbass, U. (2004). Nuclear Retention of Unspliced mRNAs in Yeast Is Mediated by Perinuclear Mlp1. Cell 116, 63-73.

Ganapathi, M., Palumbo, M.J., Ansari, S.A., He, Q., Tsui, K., Nislow, C., and Morse, R.H. (2011). Extensive role of the general regulatory factors, Abf1 and Rap1, in determining genome-wide chromatin structure in budding yeast. Nucleic Acids Res 39, 2032-2044.

Garbett, K.A., Tripathi, M.K., Cencki, B., Layer, J.H., and Weil, P.A. (2007). Yeast TFIID serves as a coactivator for Rap1p by direct protein-protein interaction. Mol Cell Biol 27, 297-311.

Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G., Botstein, D., and Brown, P.O. (2000). Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11, 4241-4257.

Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., and Bairoch, A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31, 3784-3788.

Gates, K.S. (2009). An overview of chemical processes that damage cellular DNA: spontaneous hydrolysis, alkylation, and reactions with radicals. Chem Res Toxicol 22, 1747-1760.

Getman, D.K., Mutero, A., Inoue, K., and Taylor, P. (1995). Transcription factor repression and activation of the human acetylcholinesterase gene. J Biol Chem 270, 23511-23519.

Gilbert, L.A., Larson, M.H., Morsut, L., Liu, Z., Brar, G.A., Torres, S.E., Stern-Ginossar, N., Brandman, O., Whitehead, E.H., Doudna, J.A., et al. (2013). CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442-451.

Gilson, E., Roberge, M., Giraldo, R., Rhodes, D., and Gasser, S.M. (1993). Distortion of the DNA double helix by RAP1 at silencers and multiple telomeric binding sites. J Mol Biol 231, 293-310.

Goldman, J.A., Garlick, J.D., and Kingston, R.E. (2010). Chromatin remodeling by imitation switch (ISWI) class ATP-dependent remodelers is stimulated by histone variant H2A.Z. J Biol Chem 285, 4645-4651.

Goppelt, A., Stelzer, G., Lottspeich, F., and Meisterernst, M. (1996). A mechanism for repression of class II gene transcription through specific binding of NC2 to TBP-promoter complexes via heterodimeric histone fold domains. The EMBO Journal 15, 3105-3116.

Gotta, M. (1996). The clustering of telomeres and colocalization with Rap1, Sir3, and Sir4 proteins in wild-type Saccharomyces cerevisiae. The Journal of Cell Biology 134, 1349-1363.

Goudarzi, M., Berg, K., Pieper, L.M., and Schier, A.F. (2019). Individual long non-coding RNAs have no overt functions in zebrafish embryogenesis, viability and fertility. eLife 8, 40815.

Govind, C.K., Qiu, H., Ginsburg, D.S., Ruan, C., Hofmeyer, K., Hu, C., Swaminathan, V., Workman, J.L., Li, B., and Hinnebusch, A.G. (2010). Phosphorylated Pol II CTD recruits multiple HDACs, including Rpd3C(S), for methylation-dependent deacetylation of ORF nucleosomes. Mol Cell 39, 234-246.

Graham, I.R., Haw, R.A., Spink, K.G., Halden, K.A., and Chambers, A. (1999). In Vivo Analysis of Functional Regions within Yeast Rap1p. Molecular and Cellular Biology 19, 7481-7490.

Granneman, S., Kudla, G., Petfalski, E., and Tollervey, D. (2009). Identification of protein binding sites on U3 snoRNA and pre-rRNA by UV cross-linking and high-throughput analysis of cDNAs. Proc Natl Acad Sci U S A 106, 9613-9618.

Griggs, D.W., and Johnston, M. (1991). Regulated expression of the GAL4 activator gene in yeast provides a sensitive genetic switch for glucose repression. Proc Natl Acad Sci U S A 88, 8597-8601.

Gu, B., Swigut, T., Spencley, A., Bauer, M.R., Chung, M., Meyer, T., and Wysocka, J. (2018). Transcription-coupled changes in nuclear mobility of mammalian cis-regulatory elements. Science 359, 1050-1055.

Gu, M., Naiyachit, Y., Wood, T.J., and Millar, C.B. (2015). H2A.Z marks antisense promoters and has positive effects on antisense transcript levels in budding yeast. BMC Genomics 16, 99.

Gudipati, R.K., Villa, T., Boulay, J., and Libri, D. (2008). Phosphorylation of the RNA polymerase II C-terminal domain dictates transcription termination choice. Nat Struct Mol Biol 15, 786-794.

Gueldener, U., Heinisch, J., Koehler, G.J., Voss, D., and Hegemann, J.H. (2002). A second set of loxP marker cassettes for Cre-mediated multiple gene knockouts in budding yeast. Nucleic Acids Res 30, e23.

Gumbs, O.H., Campbell, A.M., and Weil, P.A. (2003). High-affinity DNA binding by a Mot1p-TBP complex: implications for TAF-independent transcription. EMBO J 22, 3131-3141.

Guo, Y., Xu, Q., Canzio, D., Shou, J., Li, J., Gorkin, D.U., Jung, I., Wu, H., Zhai, Y., Tang, Y., et al. (2015). CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell 162, 900-910.

Gutierrez, J.L., Chandy, M., Carrozza, M.J., and Workman, J.L. (2007). Activation domains drive nucleosome eviction by SWI/SNF. EMBO J 26, 730-740.

Gutin, J., Sadeh, R., Bodenheimer, N., Joseph-Strauss, D., Klein-Brill, A., Alajem, A., Ram, O., and Friedman, N. (2018). Fine-Resolution Mapping of TF Binding and Chromatin Interactions. Cell Rep 22, 2797-2807.

Haber, J.E. (2012). Mating-type genes and MAT switching in Saccharomyces cerevisiae. Genetics 191, 33-64.

Hahn, S., and Young, E.T. (2011). Transcriptional regulation in Saccharomyces cerevisiae: transcription factor regulation and function, mechanisms of initiation, and roles of activators and coactivators. Genetics 189, 705-736.

Hall, D.B., Wade, J.T., and Struhl, K. (2006). An HMG protein, Hmo1, associates with promoters of many ribosomal protein genes and throughout the rRNA gene locus in Saccharomyces cerevisiae. Mol Cell Biol 26, 3672-3679.

Hallais, M., Pontvianne, F., Andersen, P.R., Clerici, M., Lener, D., Benbahouche Nel, H., Gostan, T., Vandermoere, F., Robert, M.C., Cusack, S., et al. (2013). CBC-ARS2 stimulates 3'-end maturation of multiple RNA families and favors cap-proximal processing. Nat Struct Mol Biol 20, 1358-1366.

Hamperl, S., Bocek, M.J., Saldivar, J.C., Swigut, T., and Cimprich, K.A. (2017). Transcription-Replication Conflict Orientation Modulates R-Loop Levels and Activates Distinct DNA Damage Responses. Cell 170, 774-786 e719.

Han, M., and Grunstein, M. (1988). Nucleosome loss activates yeast downstream promoters in vivo. Cell 55, 1137-1145.

Hardy, C.F., Balderes, D., and Shore, D. (1992a). Dissection of a carboxy-terminal region of the yeast regulatory protein RAP1 with effects on both transcriptional activation and silencing. Mol Cell Biol 12, 1209-1217.

Hardy, C.F., Sussel, L., and Shore, D. (1992b). A RAP1-interacting protein involved in transcriptional silencing and telomere length regulation. Genes Dev 6, 801-814.

Hartley, P.D., and Madhani, H.D. (2009). Mechanisms that specify promoter nucleosome location and identity. Cell 137, 445-458.

Haruki, H., Nishikawa, J., and Laemmli, U.K. (2008). The anchor-away technique: rapid, conditional establishment of yeast mutant phenotypes. Mol Cell 31, 925-932.

Haurie, V., Perrot, M., Mini, T., Jeno, P., Sagliocco, F., and Boucherie, H. (2001). The transcriptional activator Cat8p provides a major contribution to the reprogramming of carbon metabolism during the diauxic shift in Saccharomyces cerevisiae. J Biol Chem 276, 76-85.

He, Q., Johnston, J., and Zeitlinger, J. (2015). ChIP-nexus enables improved detection of in vivo transcription factor binding footprints. Nat Biotechnol 33, 395-401.

Hedges, D., Proft, M., and Entian, K.D. (1995). CAT8, a new zinc cluster-encoding gene necessary for derepression of gluconeogenic enzymes in the yeast Saccharomyces cerevisiae. Mol Cell Biol 15, 1915-1922.

Hennig, B.P., Bendrin, K., Zhou, Y., and Fischer, T. (2012). Chd1 chromatin remodelers maintain nucleosome organization and repress cryptic transcription. EMBO Rep 13, 997-1003.

Herzel, L., Ottoz, D.S.M., Alpert, T., and Neugebauer, K.M. (2017). Splicing and transcription touch base: co-transcriptional spliceosome assembly and function. Nature reviews. Molecular cell biology 18, 637-650.

Herzog, V.A., Reichholf, B., Neumann, T., Rescheneder, P., Bhat, P., Burkard, T.R., Wlotzka, W., von Haeseler, A., Zuber, J., and Ameres, S.L. (2017). Thiol-linked alkylation of RNA to assess expression dynamics. Nat Methods 14, 1198-1204.

Hetzel, J., Duttke, S.H., Benner, C., and Chory, J. (2016). Nascent RNA sequencing reveals distinct features in plant transcription. Proc Natl Acad Sci U S A 113, 12316-12321.

Hinnebusch, A.G., Ivanov, I.P., and Sonenberg, N. (2016). Translational control by 5'-untranslated regions of eukaryotic mRNAs. Science 352, 1413-1416.

Hirabayashi, S., Bhagat, S., Matsuki, Y., Takegami, Y., Uehata, T., Kanemaru, A., Itoh, M., Shirakawa, K., Takaori-Kondo, A., Takeuchi, O., et al. (2019). Dynamics and Topology of Human Transcribed Cis-regulatory Elements. bioRxiv, 689968.

Hirota, K., Miyoshi, T., Kugou, K., Hoffman, C.S., Shibata, T., and Ohta, K. (2008). Stepwise chromatin remodelling by a cascade of transcription initiation of non-coding RNAs. Nature 456, 130-134.

Hobson, D.J., Wei, W., Steinmetz, L.M., and Svejstrup, J.Q. (2012). RNA polymerase II collision interrupts convergent transcription. Mol Cell 48, 365-374.

Hoekstra, H.E., Kapusta, A., Kronenberg, Z., Lynch, V.J., Zhuo, X., Ramsay, L., Bourque, G., Yandell, M., and Feschotte, C. (2013). Transposable Elements Are Major Contributors to the Origin, Diversification, and Regulation of Vertebrate Long Noncoding RNAs. PLoS Genetics 9, e1003470.

Hoffman, E.A., McCulley, A., Haarer, B., Arnak, R., and Feng, W. (2015). Break-seq reveals hydroxyurea-induced chromosome fragility as a result of unscheduled conflict between DNA replication and transcription. Genome Res 25, 402-412.

Hogues, H., Lavoie, H., Sellam, A., Mangos, M., Roemer, T., Purisima, E., Nantel, A., and Whiteway, M. (2008). Transcription factor substitution during the evolution of fungal ribosome regulation. Mol Cell 29, 552-562.

Holland, A.J., Fachinetti, D., Han, J.S., and Cleveland, D.W. (2012). Inducible, reversible system for the rapid and complete degradation of proteins in mammalian cells. Proc Natl Acad Sci U S A 109, E3350-3357.

Hon, C.-C., Ramilowski, J.A., Harshbarger, J., Bertin, N., Rackham, O.J.L., Gough, J., Denisenko, E., Schmeier, S., Poulsen, T.M., Severin, J., et al. (2017). An atlas of human long non-coding RNAs with accurate 5' ends. Nature 543, 199-204.

Hongay, C.F., Grisafi, P.L., Galitski, T., and Fink, G.R. (2006). Antisense transcription controls cell fate in Saccharomyces cerevisiae. Cell 127, 735-745.

Horlbeck, M.A., Witkowsky, L.B., Guglielmi, B., Replogle, J.M., Gilbert, L.A., Villalta, J.E., Torigoe, S.E., Tjian, R., and Weissman, J.S. (2016). Nucleosomes impede Cas9 access to DNA in vivo and in vitro. Elife 5, 12677.

Houalla, R., Devaux, F., Fatica, A., Kufel, J., Barrass, D., Torchet, C., and Tollervey, D. (2006). Microarray detection of novel nuclear RNA substrates for the exosome. Yeast 23, 439-454.

Houseley, J., Rubbi, L., Grunstein, M., Tollervey, D., and Vogelauer, M. (2008). A ncRNA modulates histone modification and mRNA induction in the yeast GAL gene cluster. Mol Cell 32, 685-695.

Hsieh, T.-H.S., Weiner, A., Lajoie, B., Dekker, J., Friedman, N., and Rando, O.J. (2015). Mapping Nucleosome Resolution Chromosome Folding in Yeast by Micro-C. Cell 162, 108-119.

Hsu, J.M., Huang, J., Meluh, P.B., and Laurent, B.C. (2003). The yeast RSC chromatin-remodeling complex is required for kinetochore function in chromosome segregation. Mol Cell Biol 23, 3202-3215.

Hu, H., and Li, X. (2007). Transcriptional regulation in eukaryotic ribosomal protein genes. Genomics 90, 421-423.

Huang, J., Hsu, J.M., and Laurent, B.C. (2004). The RSC Nucleosome-Remodeling Complex Is Required for Cohesin's Association With Chromosome Arms. Molecular Cell 13, 739-750.

Huang, Q., Gong, C., Li, J., Zhuo, Z., Chen, Y., Wang, J., and Hua, Z.C. (2012). Distance and helical phase dependence of synergistic transcription activation in cis-regulatory module. PLoS One 7, e31198.

Huarte, M. (2015). The emerging role of lncRNAs in cancer. Nat Med 21, 1253-1261.

Huber, F., Bunina, D., Gupta, I., Khmelinskii, A., Meurer, M., Theer, P., Steinmetz, L.M., and Knop, M. (2016). Protein Abundance Control by Non-coding Antisense Transcription. Cell Rep 15, 2625-2636.

Huertas, P., and Aguilera, A. (2003). Cotranscriptionally Formed DNA:RNA Hybrids Mediate Transcription Elongation Impairment and Transcription-Associated Recombination. Molecular Cell 12, 711-721.

Iacovoni, J.S., Caron, P., Lassadi, I., Nicolas, E., Massip, L., Trouche, D., and Legube, G. (2010). High-resolution profiling of gammaH2AX around DNA double strand breaks in the mammalian genome. EMBO J 29, 1446-1457.

Ibrahim, M.M., Karabacak, A., Glahs, A., Kolundzic, E., Hirsekorn, A., Carda, A., Tursun, B., Zinzen, R.P., Lacadie, S.A., and Ohler, U. (2018). Determinants of promoter and enhancer transcription directionality in metazoans. Nature communications 9, 4472.

Imamura, Y., Yu, F., Nakamura, M., Chihara, Y., Okane, K., Sato, M., Kanai, M., Hamada, R., Ueno, M., Yukawa, M., et al. (2015). RSC Chromatin-Remodeling Complex Is Important for Mitochondrial Function in Saccharomyces cerevisiae. PLoS One 10, e0130397.

Jeon, Y., and Lee, J.T. (2011). YY1 tethers Xist RNA to the inactive X nucleation center. Cell 146, 119-133.

Jeronimo, C., Watanabe, S., Kaplan, C.D., Peterson, C.L., and Robert, F. (2015). The Histone Chaperones FACT and Spt6 Restrict H2A.Z from Intragenic Locations. Mol Cell 58, 1113-1123.

Jiang, C., and Pugh, B.F. (2009). Nucleosome positioning and gene regulation: advances through genomics. Nat Rev Genet 10, 161-172.

Jin, Y., Eser, U., Struhl, K., and Churchman, L.S. (2017). The Ground State and Evolution of Promoter Region Directionality. Cell 170, 889-898 e810.

Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., and Charpentier, E. (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821.

Johnson, A., Meyer, B.J., and Ptashne, M. (1978). Mechanism of action of the cro protein of bacteriophage lambda. Proc Natl Acad Sci U S A 75, 1783-1787.

Johnson, A.N., and Weil, P.A. (2017). Identification of a transcriptional activation domain in yeast repressor activator protein 1 (Rap1) using an altered DNA-binding specificity variant. J Biol Chem 292, 5705-5723.

Jones, D.L., Leroy, P., Unoson, C., Fange, D., Curic, V., Lawson, M.J., and Elf, J. (2017). Kinetics of dCas9 target search in Escherichia coli. Science 357, 1420-1424.

Jorgensen, P., Rupes, I., Sharom, J.R., Schneper, L., Broach, J.R., and Tyers, M. (2004). A dynamic transcriptional network communicates growth potential to ribosome synthesis and critical cell size. Genes Dev 18, 2491-2505.

Joshi, A.A., and Struhl, K. (2005). Eaf3 chromodomain interaction with methylated H3-K36 links histone deacetylation to Pol II elongation. Mol Cell 20, 971-978.

Joung, J., Engreitz, J.M., Konermann, S., Abudayyeh, O.O., Verdine, V.K., Aguet, F., Gootenberg, J.S., Sanjana, N.E., Wright, J.B., Fulco, C.P., et al. (2017). Genome-scale activation screen identifies a lncRNA locus regulating a gene neighbourhood. Nature 548, 343-346.

Kadosh, D., and Struhl, K. (1997). Repression by Ume6 involves recruitment of a complex containing Sin3 corepressor and Rpd3 histone deacetylase to target promoters. Cell 89, 365-371.

Kaida, D., Berg, M.G., Younis, I., Kasim, M., Singh, L.N., Wan, L., and Dreyfuss, G. (2010). U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature 468, 664-668.

Kanhere, A., Viiri, K., Araujo, C.C., Rasaiyaah, J., Bouwman, R.D., Whyte, W.A., Pereira, C.F., Brookes, E., Walker, K., Bell, G.W., et al. (2010). Short RNAs are transcribed from repressed polycomb target genes and interact with polycomb repressive complex-2. Mol Cell 38, 675-688.

Kaplan, C.D., Laprade, L., and Winston, F. (2003). Transcription elongation factors repress transcription initiation from cryptic sites. Science 301, 1096-1099.

Kapranov, P., Cheng, J., Dike, S., Nix, D.A., Duttagupta, R., Willingham, A.T., Stadler, P.F., Hertel, J., Hackermuller, J., Hofacker, I.L., et al. (2007). RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science 316, 1484-1488.

Kasahara, K., Ohyama, Y., and Kokubo, T. (2011). Hmo1 directs pre-initiation complex assembly to an appropriate site on its target gene promoters by masking a nucleosome-free region. Nucleic Acids Res 39, 4136-4150.

Katayama, S., Tomaru, Y., Kasukawa, T., Waki, K., Nakanishi, M., Nakamura, M., Nishida, H., Yap, C.C., Suzuki, M., Kawai, J., et al. (2005). Antisense transcription in the mammalian transcriptome. Science 309, 1564-1566.

Kato, H., Horikoshi, M., and Roeder, R.G. (1991). Repression of HIV-1 transcription by a cellular protein. Science 251, 1476-1479.

Kent, W.J., Zweig, A.S., Barber, G., Hinrichs, A.S., and Karolchik, D. (2010). BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204-2207.

Keogh, M.C., Kurdistani, S.K., Morris, S.A., Ahn, S.H., Podolny, V., Collins, S.R., Schuldiner, M., Chin, K.Y., Punna, T., Thompson, N.J., et al. (2005). Cotranscriptional Set2 methylation of histone H3 lysine 36 recruits a repressive Rpd3 complex. Cell 123, 593-605.

Kepinski, S., and Leyser, O. (2005). The Arabidopsis F-box protein TIR1 is an auxin receptor. Nature 435, 446-451.

Kettenberger, H., Armache, K.J., and Cramer, P. (2004). Complete RNA polymerase II elongation complex structure and its interactions with NTP and TFIIS. Mol Cell 16, 955-965.

Khalil, A.M., Guttman, M., Huarte, M., Garber, M., Raj, A., Rivea Morales, D., Thomas, K., Presser, A., Bernstein, B.E., van Oudenaarden, A., et al. (2009). Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci U S A 106, 11667-11672.

Khan, A., Fornes, O., Stigliani, A., Gheorghe, M., Castro-Mondragon, J.A., van der Lee, R., Bessy, A., Cheneby, J., Kulkarni, S.R., Tan, G., et al. (2018). JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res 46, D260-D266.

Kim, J., Parvin, J.D., Shykind, B.M., and Sharp, P.A. (1996). A Negative Cofactor Containing Dr1/p19 Modulates Transcription with TFIIA in a Promoter-specific Fashion. Journal of Biological Chemistry 271, 18405-18412.

Kim, T., and Buratowski, S. (2009). Dimethylation of H3K4 by Set1 recruits the Set3 histone deacetylase complex to 5' transcribed regions. Cell 137, 259-272.

Kim, T.K., Hemberg, M., Gray, J.M., Costa, A.M., Bear, D.M., Wu, J., Harmin, D.A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., et al. (2010). Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182-187.

Kizer, K.O., Phatnani, H.P., Shibata, Y., Hall, H., Greenleaf, A.L., and Strahl, B.D. (2005). A novel domain in Set2 mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. Mol Cell Biol 25, 3305-3316.

Klein-Brill, A., Joseph-Strauss, D., Appleboim, A., and Friedman, N. (2019). Dynamics of Chromatin and Transcription during Transient Depletion of the RSC Chromatin Remodeling Complex. Cell Reports 26, 279-292.e275.

Knezetic, J.A., and Luse, D.S. (1986). The presence of nucleosomes on a DNA template prevents initiation by RNA polymerase II in vitro. Cell 45, 95-104.

Knight, B., Kubik, S., Ghosh, B., Bruzzone, M.J., Geertz, M., Martin, V., Denervaud, N., Jacquet, P., Ozkan, B., Rougemont, J., et al. (2014). Two distinct promoter architectures centered on dynamic nucleosomes control ribosomal protein gene transcription. Genes Dev 28, 1695-1709.

Konig, P., Giraldo, R., Chapman, L., and Rhodes, D. (1996). The crystal structure of the DNA-binding domain of yeast RAP1 in complex with telomeric DNA. Cell 85, 125-136.

Kornberg, R.D. (1974). Chromatin structure: a repeating unit of histones and DNA. Science 184, 868-871.

Kouzine, F., Sanford, S., Elisha-Feil, Z., and Levens, D. (2008). The functional response of upstream DNA to dynamic supercoiling in vivo. Nat Struct Mol Biol 15, 146-154.

Kouzine, F., Wojtowicz, D., Yamane, A., Resch, W., Kieffer-Kwon, K.R., Bandle, R., Nelson, S., Nakahashi, H., Awasthi, P., Feigenbaum, L., et al. (2013). Global regulation of promoter melting in naive lymphocytes. Cell 153, 988-999.

Krietenstein, N., Wal, M., Watanabe, S., Park, B., Peterson, C.L., Pugh, B.F., and Korber, P. (2016). Genomic Nucleosome Organization Reconstituted with Pure Proteins. Cell 167, 709-721 e712.

Krogan, N.J., Kim, M., Tong, A., Golshani, A., Cagney, G., Canadien, V., Richards, D.P., Beattie, B.K., Emili, A., Boone, C., et al. (2003). Methylation of histone H3 by Set2 in Saccharomyces cerevisiae is linked to transcriptional elongation by RNA polymerase II. Mol Cell Biol 23, 4207-4218.

Kubicek, K., Cerna, H., Holub, P., Pasulka, J., Hrossova, D., Loehr, F., Hofr, C., Vanacova, S., and Stefl, R. (2012). Serine phosphorylation and proline isomerization in RNAP II CTD control recruitment of Nrd1. Genes Dev 26, 1891-1896.

Kubik, S., Bruzzone, M.J., Challal, D., Dreos, R., Mattarocci, S., Bucher, P., Libri, D., and Shore, D. (2019). Opposing chromatin remodelers control transcription initiation frequency and start site selection. Nat Struct Mol Biol 26, 744-754.

Kubik, S., Bruzzone, M.J., Jacquet, P., Falcone, J.L., Rougemont, J., and Shore, D. (2015). Nucleosome Stability Distinguishes Two Different Promoter Types at All Protein-Coding Genes in Yeast. Mol Cell 60, 422-434.

Kubik, S., O'Duibhir, E., de Jonge, W.J., Mattarocci, S., Albert, B., Falcone, J.L., Bruzzone, M.J., Holstege, F.C.P., and Shore, D. (2018). Sequence-Directed Action of RSC Remodeler and General Regulatory Factors Modulates +1 Nucleosome Position to Facilitate Transcription. Mol Cell 71, 89-102 e105.

Kumamoto, A.A., Miller, W.G., and Gunsalus, R.P. (1987). Escherichia coli tryptophan repressor binds multiple sites within the aroH and trp operators. Genes & Development 1, 556-564.

Kung, J.T., Colognori, D., and Lee, J.T. (2013). Long noncoding RNAs: past, present, and future. Genetics 193, 651-669.

Kurtz, S., and Shore, D. (1991). RAP1 protein activates and silences transcription of mating-type genes in yeast. Genes Dev 5, 616-628.

Kuscu, C., Arslan, S., Singh, R., Thorpe, J., and Adli, M. (2014). Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat Biotechnol 32, 677-683.

Kwak, H., Fuda, N.J., Core, L.J., and Lis, J.T. (2013). Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950-953.

Kyrion, G., Liu, K., Liu, C., and Lustig, A.J. (1993). RAP1 and telomere structure regulate telomere position effects in Saccharomyces cerevisiae. Genes Dev 7, 1146-1159.

Lacadie, S.A., Ibrahim, M.M., Gokhale, S.A., and Ohler, U. (2016). Divergent transcription and epigenetic directionality of human promoters. FEBS J 283, 4214-4222.

Lam, M.T., Cho, H., Lesch, H.P., Gosselin, D., Heinz, S., Tanaka-Oishi, Y., Benner, C., Kaikkonen, M.U., Kim, A.S., Kosaka, M., et al. (2013). Rev-Erbs repress macrophage gene expression by inhibiting enhancer-directed transcription. Nature 498, 511-515.

Lardenois, A., Liu, Y., Walther, T., Chalmel, F., Evrard, B., Granovskaia, M., Chu, A., Davis, R.W., Steinmetz, L.M., and Primig, M. (2011). Execution of the meiotic noncoding RNA expression program and the onset of gametogenesis in yeast require the conserved exosome subunit Rrp6. Proc Natl Acad Sci U S A 108, 1058-1063.

Laughon, A., and Gesteland, R.F. (1982). Isolation and preliminary characterization of the GAL4 gene, a positive regulator of transcription in yeast. Proc Natl Acad Sci U S A 79, 6827-6831.

Layer, J.H., Miller, S.G., and Weil, P.A. (2010). Direct transactivator-transcription factor IID (TFIID) contacts drive yeast ribosomal protein gene transcription. J Biol Chem 285, 15489-15499.

Layer, J.H., and Weil, P.A. (2013). Direct TFIIA-TFIID protein contacts drive budding yeast ribosomal protein gene transcription. J Biol Chem 288, 23273-23294.

Lenstra, T.L., Coulon, A., Chow, C.C., and Larson, D.R. (2015). Single-Molecule Imaging Reveals a Switch between Spurious and Functional ncRNA Transcription. Mol Cell 60, 597-610.

Li, B., and Dewey, C.N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323.

Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-1760.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079.

Li, X., and Manley, J.L. (2005). Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122, 365-378.

Li, X., and Manley, J.L. (2006). Cotranscriptional processes and their influence on genome stability. Genes Dev 20, 1838-1847.

Li, X., Zhong, S., and Wong, W.H. (2005). Reliable prediction of transcription factor binding sites by phylogenetic verification. Proc Natl Acad Sci U S A 102, 16945-16950.

Li, Y., Jiang, Y., Chen, H., Liao, W., Li, Z., Weiss, R., and Xie, Z. (2015). Modular construction of mammalian gene circuits using TALE transcriptional repressors. Nat Chem Biol 11, 207-213.

Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930.

Lickwar, C.R., Mueller, F., Hanlon, S.E., McNally, J.G., and Lieb, J.D. (2012). Genome-wide protein-DNA binding dynamics suggest a molecular clutch for transcription factor function. Nature 484, 251-255.

Lieb, J.D., Liu, X., Botstein, D., and Brown, P.O. (2001). Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nat Genet 28, 327-334.

Lin, J.M., Collins, P.J., Trinklein, N.D., Fu, Y., Xi, H., Myers, R.M., and Weng, Z. (2007). Transcription factor binding and modified histones in human bidirectional promoters. Genome Res 17, 818-827.

Lis, M., and Walther, D. (2016). The orientation of transcription factor binding site motifs in gene promoter regions: does it matter? BMC Genomics 17, 185.

Little, J.W., Mount, D.W., and Yanisch-Perron, C.R. (1981). Purified lexA protein is a repressor of the recA and lexA genes. Proceedings of the National Academy of Sciences 78, 4199-4203.

Liu, S.J., Horlbeck, M.A., Cho, S.W., Birk, H.S., Malatesta, M., He, D., Attenello, F.J., Villalta, J.E., Cho, M.Y., Chen, Y., et al. (2017). CRISPRi-based genome-scale identification of functional long noncoding RNA loci in human cells. Science 355, aah7111.

Longtine, M.S., McKenzie, A., 3rd, Demarini, D.J., Shah, N.G., Wach, A., Brachat, A., Philippsen, P., and Pringle, J.R. (1998). Additional modules for versatile and economical PCR-based gene deletion and modification in Saccharomyces cerevisiae. Yeast 14, 953-961.

Longtine, M.S., Wilson, N.M., Petracek, M.E., and Berman, J. (1989). A yeast Telomere Binding Activity binds to two related telomere sequence motifs and is indistinguishable from RAP1. Current Genetics 16, 225-239.

Lopez-Serra, L., Kelly, G., Patel, H., Stewart, A., and Uhlmann, F. (2014). The Scc2-Scc4 complex acts in sister chromatid cohesion and transcriptional regulation by maintaining nucleosome-free regions. Nat Genet 46, 1147-1151.

Lorch, Y., Griesenbeck, J., Boeger, H., Maier-Davis, B., and Kornberg, R.D. (2011). Selective removal of promoter nucleosomes by the RSC chromatin-remodeling complex. Nat Struct Mol Biol 18, 881-885.

Lorch, Y., LaPointe, J.W., and Kornberg, R.D. (1987). Nucleosomes inhibit the initiation of transcription but allow chain elongation with the displacement of histones. Cell 49, 203-210.

Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550.

Luger, K., Mader, A.W., Richmond, R.K., Sargent, D.F., and Richmond, T.J. (1997). Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature 389, 251-260.

Luke, B., and Lingner, J. (2009). TERRA: telomeric repeat-containing RNA. EMBO J 28, 2503-2510.

Luke, B., Panza, A., Redon, S., Iglesias, N., Li, Z., and Lingner, J. (2008). The Rat1p 5' to 3' Exonuclease Degrades Telomeric Repeat-Containing RNA and Promotes Telomere Elongation in Saccharomyces cerevisiae. Molecular Cell 32, 465-477.

Lynch, M., and Marinov, G.K. (2015). The bioenergetic costs of a gene. Proc Natl Acad Sci U S A 112, 15690-15695.

Ma, J., and Wang, M.D. (2016). DNA supercoiling during transcription. Biophys Rev 8, 75-87.

Malabat, C., Feuerbach, F., Ma, L., Saveanu, C., and Jacquier, A. (2015). Quality control of transcription start site selection by nonsense-mediated-mRNA decay. Elife 4, 06722.

Mallo, M., and Alonso, C.R. (2013). The regulation of Hox gene expression during animal development. Development 140, 3951-3963.

Marcand, S., Pardo, B., Gratias, A., Cahun, S., and Callebaut, I. (2008). Multiple pathways inhibit NHEJ at telomeres. Genes Dev 22, 1153-1158.

Marion, R.M., Regev, A., Segal, E., Barash, Y., Koller, D., Friedman, N., and O'Shea, E.K. (2004). Sfp1 is a stress- and nutrient-sensitive regulator of ribosomal protein gene expression. Proc Natl Acad Sci U S A 101, 14315-14322.

Marquardt, S., Escalante-Chong, R., Pho, N., Wang, J., Churchman, L.S., Springer, M., and Buratowski, S. (2014). A Chromatin-Based Mechanism for Limiting Divergent Noncoding Transcription. Cell 158, 462.

Martin, D.E., Soulard, A., and Hall, M.N. (2004). TOR regulates ribosomal protein gene expression via PKA and the Forkhead transcription factor FHL1. Cell 119, 969-979.

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10.

Mason, P.B., and Struhl, K. (2003). The FACT complex travels with elongating RNA polymerase II and is important for the fidelity of transcriptional initiation in vivo. Mol Cell Biol 23, 8323-8333.

Matot, B., Le Bihan, Y.V., Lescasse, R., Perez, J., Miron, S., David, G., Castaing, B., Weber, P., Raynal, B., Zinn-Justin, S., et al. (2012). The orientation of the C-terminal domain of the Saccharomyces cerevisiae Rap1 protein is determined by its binding to DNA. Nucleic Acids Res 40, 3197-3207.

Mayr, C. (2017). Regulation by 3'-Untranslated Regions. Annu Rev Genet 51, 171-194.

McCracken, S., Fong, N., Rosonina, E., Yankulov, K., Brothers, G., Siderovski, D., Hessel, A., Foster, S., Shuman, S., and Bentley, D.L. (1997). 5'-Capping enzymes are targeted to pre-mRNA by binding to the phosphorylated carboxy-terminal domain of RNA polymerase II. Genes Dev 11, 3306-3318.

McKnight, J.N., Tsukiyama, T., and Bowman, G.D. (2016). Sequence-targeted nucleosome sliding in vivo by a hybrid Chd1 chromatin remodeler. Genome Res 26, 693-704.

Medler, S., Al Husini, N., Raghunayakula, S., Mukundan, B., Aldea, A., and Ansari, A. (2011). Evidence for a complex of transcription factor IIB with poly(A) polymerase and cleavage factor 1 subunits required for gene looping. J Biol Chem 286, 33709-33718.

Meers, M.P., Adelman, K., Duronio, R.J., Strahl, B.D., McKay, D.J., and Matera, A.G. (2018). Transcription start site profiling uncovers divergent transcription and enhancer-associated RNAs in Drosophila melanogaster. BMC Genomics 19, 157.

Mehta, G.D., Ball, D.A., Eriksson, P.R., Chereji, R.V., Clark, D.J., McNally, J.G., and Karpova, T.S. (2018). Single-Molecule Analysis Reveals Linked Cycles of RSC Chromatin Remodeling and Ace1p Transcription Factor Binding in Yeast. Mol Cell 72, 875-887 e879.

Mele, M., Mattioli, K., Mallard, W., Shechner, D.M., Gerhardinger, C., and Rinn, J.L. (2017). Chromatin environment, transcriptional regulation, and splicing distinguish lincRNAs and mRNAs. Genome Res 27, 27-37.

Mellor, J., Woloszczuk, R., and Howe, F.S. (2016). The Interleaved Genome. Trends Genet 32, 57-71.

Mencia, M., Moqtaderi, Z., Geisberg, J.V., Kuras, L., and Struhl, K. (2002). Activator-specific recruitment of TFIID and regulation of ribosomal protein genes in yeast. Mol Cell 9, 823-833.

Merkenschlager, M., and Nora, E.P. (2016). CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation. Annu Rev Genomics Hum Genet 17, 17-43.

Meyer, B.J., Kleid, D.G., and Ptashne, M. (1975). Lambda repressor turns off transcription of its own gene. Proc Natl Acad Sci U S A 72, 4785-4789.

Mi, H., Muruganujan, A., Ebert, D., Huang, X., and Thomas, P.D. (2019). PANTHER version 14: more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Res 47, D419-D426.

Mignone, F., Gissi, C., Liuni, S., and Pesole, G. (2002). Untranslated regions of mRNAs. Genome Biol 3, reviews0004.0001.

Mikhaylichenko, O., Bondarenko, V., Harnett, D., Schor, I.E., Males, M., Viales, R.R., and Furlong, E.E.M. (2018). The degree of enhancer or promoter activity is reflected by the levels and directionality of eRNA transcription. Genes Dev 32, 42-57.

Mischo, H.E., Chun, Y., Harlen, K.M., Smalec, B.M., Dhir, S., Churchman, L.S., and Buratowski, S. (2018). Cell-Cycle Modulation of Transcription Termination Factor Sen1. Mol Cell 70, 312-326 e317.

Mivelaz, M., Cao, A.-M., Kubik, S., Zencir, S., Hovius, R., Boichenko, I., Stachowicz, A.M., Kurat, C.F., Shore, D., and Fierz, B. (2019). The mechanistic basis for chromatin invasion and remodeling by the yeast pioneer transcription factor Rap1. bioRxiv, 541284.

Miyake, T., Hu, Y.F., Yu, D.S., and Li, R. (2000). A functional comparison of BRCA1 C-terminal domains in transcription activation and chromatin remodeling. J Biol Chem 275, 40169-40173.

Mizuno, T., Kishimoto, T., Shinzato, T., Haw, R., Chambers, A., Wood, J., Sinclair, D., and Uemura, H. (2004). Role of the N-terminal region of Rap1p in the transcriptional activation of glycolytic genes in Saccharomyces cerevisiae. Yeast 21, 851-866.

Moore, M.J., and Proudfoot, N.J. (2009). Pre-mRNA processing reaches back to transcription and ahead to translation. Cell 136, 688-700.

Morawska, M., and Ulrich, H.D. (2013). An expanded tool kit for the auxin-inducible degron system in budding yeast. Yeast 30, 341-351.

Moretti, P., Freeman, K., Coodly, L., and Shore, D. (1994). Evidence that a complex of SIR proteins interacts with the silencer and telomere-binding protein RAP1. Genes & Development 8, 2257-2269.

Moretto, F., Wood, N.E., Kelly, G., Doncic, A., and van Werven, F.J. (2018). A regulatory circuit of two lncRNAs and a master regulator directs cell fate in yeast. Nature communications 9, 780.

Morgan, S.L., Mariano, N.C., Bermudez, A., Arruda, N.L., Wu, F., Luo, Y., Shankar, G., Jia, L., Chen, H., Hu, J.F., et al. (2017). Manipulation of nuclear architecture through CRISPR-mediated chromosomal looping. Nature communications 8, 15993.

Mountoufaris, G., Canzio, D., Nwakeze, C.L., Chen, W.V., and Maniatis, T. (2018). Writing, Reading, and Translating the Clustered Protocadherin Cell Surface Recognition Code for Neural Circuit Assembly. Annu Rev Cell Dev Biol 34, 471-493.

Murray, S.C., Haenni, S., Howe, F.S., Fischl, H., Chocian, K., Nair, A., and Mellor, J. (2015). Sense and antisense transcription are associated with distinct chromatin architectures across genes. Nucleic Acids Res 43, 7823-7837.

Myers, R.M., Rio, D.C., Robbins, A.K., and Tjian, R. (1981). SV40 gene expression is modulated by the cooperative binding of T antigen to DNA. Cell 25, 373-384.

Myers, S.A., Wright, J., Peckner, R., Kalish, B.T., Zhang, F., and Carr, S.A. (2018). Discovery of proteins associated with a predefined genomic locus via dCas9-APEX-mediated proximity labeling. Nat Methods 15, 437-439.

Nagai, T., Ibata, K., Park, E.S., Kubota, M., Mikoshiba, K., and Miyawaki, A. (2002). A variant of yellow fluorescent protein with fast and efficient maturation for cell-biological applications. Nat Biotechnol 20, 87-90.

Nagawa, F., and Fink, G.R. (1985). The relationship between the "TATA" sequence and transcription initiation sites at the HIS4 gene of Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 82, 8557-8561.

Narlikar, G.J., Sundaramoorthy, R., and Owen-Hughes, T. (2013). Mechanisms and functions of ATP-dependent chromatin-remodeling enzymes. Cell 154, 490-503.

Natarajan, K., Meyer, M.R., Jackson, B.M., Slade, D., Roberts, C., Hinnebusch, A.G., and Marton, M.J. (2001). Transcriptional profiling shows that Gcn4p is a master regulator of gene expression during amino acid starvation in yeast. Mol Cell Biol 21, 4347-4368.

Nechaev, S., Fargo, D.C., dos Santos, G., Liu, L., Gao, Y., and Adelman, K. (2010). Global analysis of short RNAs reveals widespread promoter-proximal stalling and arrest of Pol II in Drosophila. Science 327, 335-338.

Neil, H., Malabat, C., d'Aubenton-Carafa, Y., Xu, Z., Steinmetz, L.M., and Jacquier, A. (2009). Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457, 1038-1042.

Neri, F., Rapelli, S., Krepelova, A., Incarnato, D., Parlato, C., Basile, G., Maldotti, M., Anselmi, F., and Oliviero, S. (2017). Intragenic DNA methylation prevents spurious transcription initiation. Nature 543, 72-77.

Ng, H.H., Robert, F., Young, R.A., and Struhl, K. (2002). Genome-wide location and regulated recruitment of the RSC nucleosome-remodeling complex. Genes Dev 16, 806-819.

Nishimasu, H., Ran, F.A., Hsu, P.D., Konermann, S., Shehata, S.I., Dohmae, N., Ishitani, R., Zhang, F., and Nureki, O. (2014). Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935-949.

Nishimura, K., Fukagawa, T., Takisawa, H., Kakimoto, T., and Kanemaki, M. (2009). An auxin-based degron system for the rapid depletion of proteins in nonplant cells. Nat Methods 6, 917-922.

Nojima, T., Gomes, T., Grosso, A.R.F., Kimura, H., Dye, M.J., Dhir, S., Carmo-Fonseca, M., and Proudfoot, N.J. (2015). Mammalian NET-Seq Reveals Genome-wide Nascent Transcription Coupled to RNA Processing. Cell 161, 526-540.

Nojima, T., Tellier, M., Foxwell, J., Ribeiro de Almeida, C., Tan-Wong, S.M., Dhir, S., Dujardin, G., Dhir, A., Murphy, S., and Proudfoot, N.J. (2018). Deregulated Expression of Mammalian lncRNA through Loss of SPT6 Induces R-Loop Formation, Replication Stress, and Cellular Senescence. Mol Cell 72, 970-984 e977.

Nomura, M., Gourse, R., and Baughman, G. (1984). Regulation of the synthesis of ribosomes and ribosomal components. Annu Rev Biochem 53, 75-117.

Ntini, E., Jarvelin, A.I., Bornholdt, J., Chen, Y., Boyd, M., Jorgensen, M., Andersson, R., Hoof, I., Schein, A., Andersen, P.R., et al. (2013). Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality. Nat Struct Mol Biol 20, 923-928.

Ohno, M., Ando, T., Priest, D.G., Kumar, V., Yoshida, Y., and Taniguchi, Y. (2019). Sub-nucleosomal Genome Structure Reveals Distinct Nucleosome Folding Motifs. Cell 176, 520-534 e525.

Owen-Hughes, T., and Workman, J.L. (1996). Remodeling the chromatin structure of a nucleosome array by transcription factor-targeted trans-displacement of histones. EMBO J 15, 4702-4712.

Pabo, C.O., and Sauer, R.T. (1992). Transcription factors: structural families and principles of DNA recognition. Annu Rev Biochem 61, 1053-1095.

Pandey, R.R., Mondal, T., Mohammad, F., Enroth, S., Redrup, L., Komorowski, J., Nagano, T., Mancini-Dinardo, D., and Kanduri, C. (2008). Kcnq1ot1 antisense noncoding RNA mediates lineage-specific transcriptional silencing through chromatin-level regulation. Mol Cell 32, 232-246.

Papai, G., Tripathi, M.K., Ruhlmann, C., Layer, J.H., Weil, P.A., and Schultz, P. (2010). TFIIA and the transactivator Rap1 cooperate to commit TFIID for transcription initiation. Nature 465, 956-960.

Park, D., Morris, A.R., Battenhouse, A., and Iyer, V.R. (2014). Simultaneous mapping of transcript ends at single-nucleotide resolution and identification of widespread promoter-associated non-coding RNA governed by TATA elements. Nucleic Acids Res 42, 3736-3749.

Parnell, T.J., Huff, J.T., and Cairns, B.R. (2008). RSC regulates nucleosome positioning at Pol II genes and density at Pol III genes. EMBO J 27, 100-110.

Parnell, T.J., Schlichter, A., Wilson, B.G., and Cairns, B.R. (2015). The chromatin remodelers RSC and ISW1 display functional and chromatin-based promoter antagonism. Elife 4, e06073.

Parry, T.J., Theisen, J.W., Hsu, J.Y., Wang, Y.L., Corcoran, D.L., Eustice, M., Ohler, U., and Kadonaga, J.T. (2010). The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery. Genes Dev 24, 2013-2018.

Paulson, J.R., and Laemmli, U.K. (1977). The structure of histone-depleted metaphase chromosomes. Cell 12, 817-828.

Pelechano, V., Wei, W., and Steinmetz, L.M. (2013). Extensive transcriptional heterogeneity revealed by isoform profiling. Nature 497, 127-131.

Petesch, S.J., and Lis, J.T. (2012). Overcoming the nucleosome barrier during transcript elongation. Trends Genet 28, 285-294.

Pijnappel, W.W., Schaft, D., Roguev, A., Shevchenko, A., Tekotte, H., Wilm, M., Rigaut, G., Seraphin, B., Aasland, R., and Stewart, A.F. (2001). The S. cerevisiae SET3 complex includes two histone deacetylases, Hos2 and Hst1, and is a meiotic-specific repressor of the sporulation gene program. Genes Dev 15, 2991-3004.

Plank, J.L., and Dean, A. (2014). Enhancer function: mechanistic and genome-wide insights come together. Mol Cell 55, 5-14.

Ponger, L., and Li, W.H. (2005). Evolutionary diversification of DNA methyltransferases in eukaryotic genomes. Mol Biol Evol 22, 1119-1128.

Porrua, O., Hobor, F., Boulay, J., Kubicek, K., D'Aubenton-Carafa, Y., Gudipati, R.K., Stefl, R., and Libri, D. (2012). In vivo SELEX reveals novel sequence and structural determinants of Nrd1-Nab3-Sen1-dependent transcription termination. EMBO J 31, 3935-3948.

Porrua, O., and Libri, D. (2013). A bacterial-like mechanism for transcription termination by the Sen1p helicase in budding yeast. Nat Struct Mol Biol 20, 884-891.

Porrua, O., and Libri, D. (2015). Transcription termination and the control of the transcriptome: why, where and how to stop. Nature reviews. Molecular cell biology 16, 190-202.

Preker, P., Almvig, K., Christensen, M.S., Valen, E., Mapendano, C.K., Sandelin, A., and Jensen, T.H. (2011). PROMoter uPstream Transcripts share characteristics with mRNAs and are produced upstream of all three major types of mammalian promoters. Nucleic Acids Res 39, 7179-7193.

Preker, P., Nielsen, J., Kammler, S., Lykke-Andersen, S., Christensen, M.S., Mapendano, C.K., Schierup, M.H., and Jensen, T.H. (2008). RNA exosome depletion reveals transcription upstream of active human promoters. Science 322, 1851-1854.

Prescott, E.M., and Proudfoot, N.J. (2002). Transcriptional collision between convergent genes in budding yeast. Proc Natl Acad Sci U S A 99, 8796-8801.

Proudfoot, N.J. (2011). Ending the message: poly(A) signals then and now. Genes Dev 25, 1770-1782.

Proudfoot, N.J., and Brownlee, G.G. (1976). 3' non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214.

Pugh, B.F. (2000). Control of gene expression through regulation of the TATA-binding protein. Gene 255, 1-14.

Qi, L.S., Larson, M.H., Gilbert, L.A., Doudna, J.A., Weissman, J.S., Arkin, A.P., and Lim, W.A. (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173-1183.

Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842.

Radzisheuskaya, A., Shlyueva, D., Muller, I., and Helin, K. (2016). Optimizing sgRNA position markedly improves the efficiency of CRISPR/dCas9-mediated transcriptional repression. Nucleic Acids Res 44, e141.

Raj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic mRNA synthesis in mammalian cells. PLoS Biol 4, e309.

Raj, A., and van Oudenaarden, A. (2008). Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences. Cell 135, 216-226.

Ramirez, F., Ryan, D.P., Gruning, B., Bhardwaj, V., Kilpert, F., Richter, A.S., Heyne, S., Dundar, F., and Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160-165.

Rando, O.J., and Winston, F. (2012). Chromatin and transcription in yeast. Genetics 190, 351-387.

Rasmussen, E.B., and Lis, J.T. (1993). In vivo transcriptional pausing and cap formation on three Drosophila heat shock genes. Proc Natl Acad Sci U S A 90, 7923-7927.

Rawal, Y., Chereji, R.V., Valabhoju, V., Qiu, H., Ocampo, J., Clark, D.J., and Hinnebusch, A.G. (2018). Gcn4 Binding in Coding Regions Can Activate Internal and Canonical 5' Promoters in Yeast. Mol Cell 70, 297-311 e294.

Rege, M., Kim, J.H., Valeri, J., Dunagin, M.C., Metzger, A., Gong, W., Beagan, J.A., Raj, A., and Phillips-Cremins, J.E. (2018). LADL: Light-activated dynamic looping for endogenous gene expression control. bioRxiv, 349340.

Rege, M., Subramanian, V., Zhu, C., Hsieh, T.H., Weiner, A., Friedman, N., Clauder-Munster, S., Steinmetz, L.M., Rando, O.J., Boyer, L.A., et al. (2015). Chromatin Dynamics and the RNA Exosome Function in Concert to Regulate Transcriptional Homeostasis. Cell Rep 13, 1610-1622.

Reja, R., Vinayachandran, V., Ghosh, S., and Pugh, B.F. (2015). Molecular mechanisms of ribosomal protein gene coregulation. Genes Dev 29, 1942-1954.

Rennie, S., Dalby, M., Lloret-Llinares, M., Bakoulis, S., Dalager Vaagenso, C., Heick Jensen, T., and Andersson, R. (2018). Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter-encoded enhancer activities. Nucleic Acids Res 46, 5455-5469.

Rhee, H.S., and Pugh, B.F. (2011). Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution. Cell 147, 1408-1419.

Rhee, H.S., and Pugh, B.F. (2012). Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature 483, 295-301.

Ribeyre, C., and Shore, D. (2012). Anticheckpoint pathways at telomeres in yeast. Nat Struct Mol Biol 19, 307-313.

Rincon-Arano, H., Halow, J., Delrow, J.J., Parkhurst, S.M., and Groudine, M. (2012). UpSET recruits HDAC complexes and restricts chromatin accessibility and acetylation at promoter regions. Cell 151, 1214-1228.

Rinn, J.L., Kertesz, M., Wang, J.K., Squazzo, S.L., Xu, X., Brugmann, S.A., Goodnough, L.H., Helms, J.A., Farnham, P.J., Segal, E., et al. (2007). Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 129, 1311-1323.

Robinson, J.T., Thorvaldsdottir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nat Biotechnol 29, 24-26.

Rudra, D., Mallick, J., Zhao, Y., and Warner, J.R. (2007). Potential interface between ribosomal protein production and pre-rRNA processing. Mol Cell Biol 27, 4815-4824.

Rudra, D., Zhao, Y., and Warner, J.R. (2005). Central role of Ifh1p-Fhl1p interaction in the synthesis of yeast ribosomal proteins. EMBO J 24, 533-542.

Ruthenburg, A.J., Allis, C.D., and Wysocka, J. (2007). Methylation of lysine 4 on histone H3: intricacy of writing and reading a single epigenetic mark. Mol Cell 25, 15-30.

Ryan, K., Calvo, O., and Manley, J.L. (2004). Evidence that polyadenylation factor CPSF-73 is the mRNA 3' processing endonuclease. RNA 10, 565-573.

Sainsbury, S., Bernecky, C., and Cramer, P. (2015). Structural basis of transcription initiation by RNA polymerase II. Nature reviews. Molecular cell biology 16, 129-143.

Sakai, D.D., Helms, S., Carlstedt-Duke, J., Gustafsson, J.A., Rottman, F.M., and Yamamoto, K.R. (1988). Hormone-mediated repression: a negative glucocorticoid response element from the bovine prolactin gene. Genes Dev 2, 1144-1154.

Sankar, T.S., Wastuwidyaningtyas, B.D., Dong, Y., Lewis, S.A., and Wang, J.D. (2016). The nature of mutations induced by replication-transcription collisions. Nature 535, 178-181.

Saunders, A., Werner, J., Andrulis, E.D., Nakayama, T., Hirose, S., Reinberg, D., and Lis, J.T. (2003). Tracking FACT and the RNA polymerase II elongation complex through chromatin in vivo. Science 301, 1094-1096.

Schawalder, S.B., Kabani, M., Howald, I., Choudhury, U., Werner, M., and Shore, D. (2004). Growth-regulated recruitment of the essential yeast ribosomal protein gene activator Ifh1. Nature 432, 1058-1061.

Schlackow, M., Nojima, T., Gomes, T., Dhir, A., Carmo-Fonseca, M., and Proudfoot, N.J. (2017). Distinctive Patterns of Transcription and RNA Processing for Human lincRNAs. Mol Cell 65, 25-38.

Schneider, C.A., Rasband, W.S., and Eliceiri, K.W. (2012). NIH Image to ImageJ: 25 years of image analysis. Nat Methods 9, 671-675.

Schreiber-Agus, N., Chin, L., Chen, K., Torres, R., Rao, G., Guida, P., Skoultchi, A.I., and DePinho, R.A. (1995). An amino-terminal domain of Mxi1 mediates anti-Myc oncogenic activity and interacts with a homolog of the yeast transcriptional repressor SIN3. Cell 80, 777-786.

Schuettengruber, B., Chourrout, D., Vervoort, M., Leblanc, B., and Cavalli, G. (2007). Genome regulation by polycomb and trithorax proteins. Cell 128, 735-745.

Schulz, D., Schwalb, B., Kiesel, A., Baejen, C., Torkler, P., Gagneur, J., Soeding, J., and Cramer, P. (2013). Transcriptome surveillance by selective termination of noncoding RNA synthesis. Cell 155, 1075-1087.

Schwalb, B., Michel, M., Zacher, B., Fruhauf, K., Demel, C., Tresch, A., Gagneur, J., and Cramer, P. (2016). TT-seq maps the human transient transcriptome. Science 352, 1225-1228.

Scruggs, B.S., Gilchrist, D.A., Nechaev, S., Muse, G.W., Burkholder, A., Fargo, D.C., and Adelman, K. (2015). Bidirectional Transcription Arises from Two Distinct Hubs of Transcription Factor Binding and Active Chromatin. Mol Cell 58, 1101-1112.

Seila, A.C., Calabrese, J.M., Levine, S.S., Yeo, G.W., Rahl, P.B., Flynn, R.A., Young, R.A., and Sharp, P.A. (2008). Divergent transcription from active promoters. Science 322, 1849-1851.

Seila, A.C., Core, L.J., Lis, J.T., and Sharp, P.A. (2009). Divergent transcription: a new feature of active promoters. Cell Cycle 8, 2557-2564.

Sellitti, M.A., Pavco, P.A., and Steege, D.A. (1987). lac repressor blocks in vivo transcription of lac control region DNA. Proc Natl Acad Sci U S A 84, 3199-3203.

Sermwittayawong, D., and Tan, S. (2006). SAGA binds TBP via its Spt8 subunit in competition with DNA: implications for TBP recruitment. EMBO J 25, 3791-3800.

Shaner, N.C., Campbell, R.E., Steinbach, P.A., Giepmans, B.N., Palmer, A.E., and Tsien, R.Y. (2004). Improved monomeric red, orange and yellow fluorescent proteins derived from Discosoma sp. red fluorescent protein. Nat Biotechnol 22, 1567-1572.

Shariati, S.A., Dominguez, A., Xie, S., Wernig, M., Qi, L.S., and Skotheim, J.M. (2019). Reversible Disruption of Specific Transcription Factor-DNA Interactions Using CRISPR/Cas9. Mol Cell 74, 622-633 e624.

Shay, J.W., and Wright, W.E. (2019). Telomeres and telomerase: three decades of progress. Nat Rev Genet 20, 299-309.

Shearwin, K.E., Callen, B.P., and Egan, J.B. (2005). Transcriptional interference--a crash course. Trends Genet 21, 339-345.

Sheridan, R.M., Fong, N., D'Alessandro, A., and Bentley, D.L. (2019). Widespread Backtracking by RNA Pol II Is a Major Effector of Gene Activation, 5' Pause Release, Termination, and Transcription Elongation Rate. Mol Cell 73, 107-118 e104.

Sherwood, R.I., Hashimoto, T., O'Donnell, C.W., Lewis, S., Barkal, A.A., van Hoff, J.P., Karun, V., Jaakkola, T., and Gifford, D.K. (2014). Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat Biotechnol 32, 171-178.

Shetty, A., Kallgren, S.P., Demel, C., Maier, K.C., Spatt, D., Alver, B.H., Cramer, P., Park, P.J., and Winston, F. (2017). Spt5 Plays Vital Roles in the Control of Sense and Antisense Transcription Elongation. Mol Cell 66, 77-88 e75.

Shi, T., Bunker, R.D., Mattarocci, S., Ribeyre, C., Faty, M., Gut, H., Scrima, A., Rass, U., Rubin, S.M., Shore, D., et al. (2013). Rif1 and Rif2 shape telomere function and architecture through multivalent Rap1 interactions. Cell 153, 1340-1353.

Shilatifard, A. (2012). The COMPASS family of histone H3K4 methylases: mechanisms of regulation in development and disease pathogenesis. Annu Rev Biochem 81, 65-95.

Shore, D., and Nasmyth, K. (1987). Purification and cloning of a DNA Binding protein from yeast that binds to both silencer and activator elements. Cell 51, 721-732.

Sigova, A.A., Mullen, A.C., Molinie, B., Gupta, S., Orlando, D.A., Guenther, M.G., Almada, A.E., Lin, C., Sharp, P.A., Giallourakis, C.C., et al. (2013). Divergent transcription of long noncoding RNA/mRNA gene pairs in embryonic stem cells. Proc Natl Acad Sci U S A 110, 2876-2881.

Silva, A.C., Xu, X., Kim, H.S., Fillingham, J., Kislinger, T., Mennella, T.A., and Keogh, M.C. (2012). The replication-independent histone H3-H4 chaperones HIR, ASF1, and RTT106 co-operate to maintain promoter fidelity. J Biol Chem 287, 1709-1718.

Silva, J., Mak, W., Zvetkova, I., Appanah, R., Nesterova, T.B., Webster, Z., Peters, A.H., Jenuwein, T., Otte, A.P., and Brockdorff, N. (2003). Establishment of histone h3 methylation on the inactive X chromosome requires transient recruitment of Eed-Enx1 polycomb group complexes. Dev Cell 4, 481-495.

Sims, R.J., 3rd, Chen, C.F., Santos-Rosa, H., Kouzarides, T., Patel, S.S., and Reinberg, D. (2005). Human but not yeast CHD1 binds directly and selectively to histone H3 methylated at lysine 4 via its tandem chromodomains. J Biol Chem 280, 41789-41792.

Singh, R., Kuscu, C., Quinlan, A., Qi, Y., and Adli, M. (2015). Cas9-chromatin binding information enables more accurate CRISPR off-target prediction. Nucleic Acids Res 43, e118.

Skene, P.J., and Henikoff, S. (2017). An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. Elife 6, 21856.

Skourti-Stathaki, K., Proudfoot, N.J., and Gromak, N. (2011). Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination. Mol Cell 42, 794-805.

Soares, L.M., He, P.C., Chun, Y., Suh, H., Kim, T., and Buratowski, S. (2017). Determinants of Histone H3K4 Methylation Patterns. Mol Cell 68, 773-785 e776.

Sohrabi-Jahromi, S., Hofmann, K.B., Boltendahl, A., Roth, C., Gressel, S., Baejen, C., Soeding, J., and Cramer, P. (2019). Transcriptome maps of general eukaryotic RNA degradation factors. Elife 8, 47040.

Spingola, M., Grate, L., Haussler, D., and Ares, M., Jr. (1999). Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA 5, 221-234.

Steglich, B., Sazer, S., and Ekwall, K. (2013). Transcriptional regulation at the yeast nuclear envelope. Nucleus 4, 379-389.

Steinmetz, E.J., Conrad, N.K., Brow, D.A., and Corden, J.L. (2001). RNA-binding protein Nrd1 directs poly(A)-independent 3'-end formation of RNA polymerase II transcripts. Nature 413, 327-331.

Strahl, B.D., Grant, P.A., Briggs, S.D., Sun, Z.W., Bone, J.R., Caldwell, J.A., Mollah, S., Cook, R.G., Shabanowitz, J., Hunt, D.F., et al. (2002). Set2 is a nucleosomal histone H3-selective methyltransferase that mediates transcriptional repression. Mol Cell Biol 22, 1298-1306.

Struhl, K. (2007). Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nat Struct Mol Biol 14, 103-105.

Sun, M., Lariviere, L., Dengl, S., Mayer, A., and Cramer, P. (2010). A tandem SH2 domain in transcription elongation factor Spt6 binds the phosphorylated RNA polymerase II C-terminal repeat domain (CTD). J Biol Chem 285, 41597-41603.

Sussel, L., and Shore, D. (1991). Separation of transcriptional activation and silencing functions of the RAP1-encoded repressor/activator protein 1: isolation of viable mutants affecting both silencing and telomere length. Proc Natl Acad Sci U S A 88, 7749-7753.

Svejstrup, J.Q. (2004). The RNA polymerase II transcription cycle: cycling through chromatin. Biochim Biophys Acta 1677, 64-73.

Taddei, A., and Gasser, S.M. (2012). Structure and function in the budding yeast nucleus. Genetics 192, 107-129.

Taft, R.J., Glazov, E.A., Cloonan, N., Simons, C., Stephen, S., Faulkner, G.J., Lassmann, T., Forrest, A.R., Grimmond, S.M., Schroder, K., et al. (2009). Tiny RNAs associated with transcription start sites in animals. Nat Genet 41, 572-578.

Tan-Wong, S.M., Zaugg, J.B., Camblong, J., Xu, Z., Zhang, D.W., Mischo, H.E., Ansari, A.Z., Luscombe, N.M., Steinmetz, L.M., and Proudfoot, N.J. (2012). Gene loops enhance transcriptional directionality. Science 338, 671-675.

Teale, W.D., Paponov, I.A., and Palme, K. (2006). Auxin in action: signalling, transport and the control of plant growth and development. Nat Rev Mol Cell Biol 7, 847-859.

Teves, S.S., An, L., Bhargava-Shah, A., Xie, L., Darzacq, X., and Tjian, R. (2018). A stable mode of bookmarking by TBP recruits RNA polymerase II to mitotic chromosomes. Elife 7, 35621.

Teves, S.S., An, L., Hansen, A.S., Xie, L., Darzacq, X., and Tjian, R. (2016). A dynamic mode of mitotic bookmarking by transcription factors. Elife 5, 22280.

Teves, S.S., and Henikoff, S. (2014a). DNA torsion as a feedback mediator of transcription and chromatin dynamics. Nucleus 5, 211-218.

Teves, S.S., and Henikoff, S. (2014b). Transcription-generated torsional stress destabilizes nucleosomes. Nat Struct Mol Biol 21, 88-94.

Thiebaut, M., Kisseleva-Romanova, E., Rougemaille, M., Boulay, J., and Libri, D. (2006). Transcription termination and nuclear degradation of cryptic unstable transcripts: a role for the nrd1-nab3 pathway in genome surveillance. Mol Cell 23, 853-864.

Tome, J.M., Tippens, N.D., and Lis, J.T. (2018). Single-molecule nascent RNA sequencing identifies regulatory domain architecture at promoters and enhancers. Nat Genet 50, 1533-1541.

Trinklein, N.D., Aldred, S.F., Hartman, S.J., Schroeder, D.I., Otillar, R.P., and Myers, R.M. (2004). An abundance of bidirectional promoters in the human genome. Genome Res 14, 62-66.

True, J.D., Muldoon, J.J., Carver, M.N., Poorey, K., Shetty, S.J., Bekiranov, S., and Auble, D.T. (2016). The Modifier of Transcription 1 (Mot1) ATPase and Spt16 Histone Chaperone Co-regulate Transcription through Preinitiation Complex Assembly and Nucleosome Organization. J Biol Chem 291, 15307-15319.

Tsui, C., Inouye, C., Levy, M., Lu, A., Florens, L., Washburn, M.P., and Tjian, R. (2018). dCas9-targeted locus-specific protein isolation method identifies histone gene regulators. Proc Natl Acad Sci U S A 115, E2734-E2741.

Tuck, A.C., and Tollervey, D. (2013). A transcriptome-wide atlas of RNP composition reveals diverse classes of mRNAs and lncRNAs. Cell 154, 996-1009.

Tyanova, S., Temu, T., Sinitcyn, P., Carlson, A., Hein, M.Y., Geiger, T., Mann, M., and Cox, J. (2016). The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat Methods 13, 731-740.

Tye, B.W., Commins, N., Ryazanova, L.V., Wuhr, M., Springer, M., Pincus, D., and Churchman, L.S. (2019). Proteotoxicity from aberrant ribosome biogenesis compromises cell fitness. Elife 8, 43002.

Uhler, J.P., Hertel, C., and Svejstrup, J.Q. (2007). A role for noncoding transcription in activation of the yeast PHO5 gene. Proc Natl Acad Sci U S A 104, 8011-8016.

Uszczynska-Ratajczak, B., Lagarde, J., Frankish, A., Guigo, R., and Johnson, R. (2018). Towards a complete map of the human long non-coding RNA transcriptome. Nat Rev Genet 19, 535-548.

Valen, E., Preker, P., Andersen, P.R., Zhao, X., Chen, Y., Ender, C., Dueck, A., Meister, G., Sandelin, A., and Jensen, T.H. (2011). Biogenic mechanisms and utilization of small RNAs derived from human protein-coding genes. Nat Struct Mol Biol 18, 1075-1082.

van Bakel, H., Tsui, K., Gebbia, M., Mnaimneh, S., Hughes, T.R., and Nislow, C. (2013). A compendium of nucleosome and transcript profiles reveals determinants of chromatin architecture and transcription. PLoS Genet 9, e1003479.

Van de Vosse, D.W., Wan, Y., Lapetina, D.L., Chen, W.M., Chiang, J.H., Aitchison, J.D., and Wozniak, R.W. (2013). A role for the nucleoporin Nup170p in chromatin structure and gene silencing. Cell 152, 969-983.

van Dijk, E.L., Chen, C.L., d'Aubenton-Carafa, Y., Gourvennec, S., Kwapisz, M., Roche, V., Bertrand, C., Silvain, M., Legoix-Ne, P., Loeillet, S., et al. (2011). XUTs are a class of Xrn1-sensitive antisense regulatory non-coding RNA in yeast. Nature 475, 114-117.

van Werven, F.J., and Amon, A. (2011). Regulation of entry into gametogenesis. Philos Trans R Soc Lond B Biol Sci 366, 3521-3531.

van Werven, F.J., Neuert, G., Hendrick, N., Lardenois, A., Buratowski, S., van Oudenaarden, A., Primig, M., and Amon, A. (2012). Transcription of two long noncoding RNAs mediates mating-type control of gametogenesis in budding yeast. Cell 150, 1170-1181.

van Werven, F.J., and Timmers, H.T. (2006). The use of biotin tagging in Saccharomyces cerevisiae improves the sensitivity of chromatin immunoprecipitation. Nucleic Acids Res 34, e33.

van Werven, F.J., van Bakel, H., van Teeffelen, H.A., Altelaar, A.F., Koerkamp, M.G., Heck, A.J., Holstege, F.C., and Timmers, H.T. (2008). Cooperative action of NC2 and Mot1p to regulate TATA-binding protein function across the genome. Genes Dev 22, 2359-2369.

Vaquerizas, J.M., Kummerfeld, S.K., Teichmann, S.A., and Luscombe, N.M. (2009). A census of human transcription factors: function, expression and evolution. Nat Rev Genet 10, 252-263.

Vasiljeva, L., and Buratowski, S. (2006). Nrd1 Interacts with the Nuclear Exosome for 3' Processing of RNA Polymerase II Transcripts. Mol Cell 21, 239-248.

Vasiljeva, L., Kim, M., Mutschler, H., Buratowski, S., and Meinhart, A. (2008). The Nrd1-Nab3-Sen1 termination complex interacts with the Ser5-phosphorylated RNA polymerase II C-terminal domain. Nat Struct Mol Biol 15, 795-804.

Velculescu, V.E., Zhang, L., Zhou, W., Vogelstein, J., Basrai, M.A., Bassett, D.E., Jr., Hieter, P., Vogelstein, B., and Kinzler, K.W. (1997). Characterization of the yeast transcriptome. Cell 88, 243-251.

Venkatesh, S., Li, H., Gogol, M.M., and Workman, J.L. (2016). Selective suppression of antisense transcription by Set2-mediated H3K36 methylation. Nat Commun 7, 13610.

Venkatesh, S., Smolle, M., Li, H., Gogol, M.M., Saint, M., Kumar, S., Natarajan, K., and Workman, J.L. (2012). Set2 methylation of histone H3 lysine 36 suppresses histone exchange on transcribed genes. Nature 489, 452-455.

Venkatesh, S., and Workman, J.L. (2015). Histone exchange, chromatin structure and the regulation of transcription. Nat Rev Mol Cell Biol 16, 178-189.

Vierke, G., Engelmann, A., Hebbeln, C., and Thomm, M. (2003). A novel archaeal transcriptional regulator of heat shock response. J Biol Chem 278, 18-26.

Wade, J.T., Hall, D.B., and Struhl, K. (2004). The transcription factor Ifh1 is a key regulator of yeast ribosomal protein genes. Nature 432, 1054-1058.

Wallace, E.W.J., and Beggs, J.D. (2017). Extremely fast and incredibly close: cotranscriptional splicing in budding yeast. RNA 23, 601-610.

Wang, K.C., Yang, Y.W., Liu, B., Sanyal, A., Corces-Zimmerman, R., Chen, Y., Lajoie, B.R., Protacio, A., Flynn, R.A., Gupta, R.A., et al. (2011). A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature 472, 120-124.

Wapinski, I., Pfiffner, J., French, C., Socha, A., Thompson, D.A., and Regev, A. (2010). Gene duplication and the evolution of ribosomal protein gene regulation in yeast. Proc Natl Acad Sci U S A 107, 5505-5510.

Warfield, L., Ramachandran, S., Baptista, T., Devys, D., Tora, L., and Hahn, S. (2017). Transcription of Nearly All Yeast RNA Polymerase II-Transcribed Genes Is Dependent on Transcription Factor TFIID. Mol Cell 68, 118-129 e115.

Warner, J.R. (1999). The economics of ribosome biosynthesis in yeast. Trends Biochem Sci 24, 437-440.

Waterborg, J.H. (2000). Steady-state levels of histone acetylation in Saccharomyces cerevisiae. J Biol Chem 275, 13007-13011.

Watson, A.A., Mahajan, P., Mertens, H.D., Deery, M.J., Zhang, W., Pham, P., Du, X., Bartke, T., Zhang, W., Edlich, C., et al. (2012). The PHD and chromo domains regulate the ATPase activity of the human chromatin remodeler CHD4. J Mol Biol 422, 3-17.

Weber, C.M., Ramachandran, S., and Henikoff, S. (2014). Nucleosomes are context-specific, H2A.Z-modulated barriers to RNA polymerase. Mol Cell 53, 819-830.

Wei, W., Hennig, B.P., Wang, J., Zhang, Y., Piazza, I., Sanchez, Y.P., Chabbert, C.D., Adjalley, S.H., Steinmetz, L.M., and Pelechano, V. (2019). Chromatin-sensitive cryptic promoters encode alternative protein isoforms in yeast. bioRxiv, 403543.

Wei, W., Pelechano, V., Jarvelin, A.I., and Steinmetz, L.M. (2011). Functional consequences of bidirectional promoters. Trends Genet 27, 267-276.

Weidberg, H., Moretto, F., Spedale, G., Amon, A., and van Werven, F.J. (2016). Nutrient Control of Yeast Gametogenesis Is Mediated by TORC1, PKA and Energy Availability. PLoS Genet 12, e1006075.

Wery, M., Descrimes, M., Vogt, N., Dallongeville, A.S., Gautheret, D., and Morillon, A. (2016). Nonsense-Mediated Decay Restricts LncRNA Levels in Yeast Unless Blocked by Double-Stranded RNA Structure. Mol Cell 61, 379-392.

Whitehouse, I., Rando, O.J., Delrow, J., and Tsukiyama, T. (2007). Chromatin remodelling at promoters suppresses antisense transcription. Nature 450, 1031-1035.

Winkler, D.D., and Luger, K. (2011). The histone chaperone FACT: structural insights and mechanisms for nucleosome reorganization. J Biol Chem 286, 18369-18374.

Winzeler, E.A., Shoemaker, D.D., Astromoff, A., Liang, H., Anderson, K., Andre, B., Bangham, R., Benito, R., Boeke, J.D., Bussey, H., et al. (1999). Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 285, 901-906.

Wlotzka, W., Kudla, G., Granneman, S., and Tollervey, D. (2011). The nuclear RNA polymerase II surveillance system targets polymerase III transcripts. EMBO J 30, 1790-1803.

Wolfe, K.H., and Shields, D.C. (1997). Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387, 708-713.

Wotton, D., and Shore, D. (1997). A novel Rap1p-interacting factor, Rif2p, cooperates with Rif1p to regulate telomere length in Saccharomyces cerevisiae. Genes Dev 11, 748-760.

Wu, A.C.K., Patel, H., Chia, M., Moretto, F., Frith, D., Snijders, A.P., and van Werven, F. (2018a). Repression of divergent noncoding transcription by a sequence-specific transcription factor. bioRxiv, 314310.

Wu, A.C.K., Patel, H., Chia, M., Moretto, F., Frith, D., Snijders, A.P., and van Werven, F.J. (2018b). Repression of Divergent Noncoding Transcription by a Sequence-Specific Transcription Factor. Mol Cell 72, 942-954 e947.

Wu, A.C.K., and Van Werven, F.J. (2019). Transcribe this way: Rap1 confers promoter directionality by repressing divergent transcription. Transcription 10, 164-170.

Wu, H., Yang, L., and Chen, L.L. (2017). The Diversity of Long Noncoding RNAs and Their Generation. Trends Genet 33, 540-552.

Wu, X., Scott, D.A., Kriz, A.J., Chiu, A.C., Hsu, P.D., Dadon, D.B., Cheng, A.W., Trevino, A.E., Konermann, S., Chen, S., et al. (2014). Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat Biotechnol 32, 670-676.

Wu, X., and Sharp, P.A. (2013). Divergent transcription: a driving force for new gene origination? Cell 155, 990-996.

Wyers, F., Rougemaille, M., Badis, G., Rousselle, J.C., Dufour, M.E., Boulay, J., Regnault, B., Devaux, F., Namane, A., Seraphin, B., et al. (2005). Cryptic pol II transcripts are degraded by a nuclear quality control pathway involving a new poly(A) polymerase. Cell 121, 725-737.

Xie, L., Pelz, C., Wang, W., Bashar, A., Varlamova, O., Shadle, S., and Impey, S. (2011). KDM5B regulates embryonic stem cell self-renewal and represses cryptic intragenic transcription. EMBO J 30, 1473-1484.

Xu, Z., Wei, W., Gagneur, J., Perocchi, F., Clauder-Munster, S., Camblong, J., Guffanti, E., Stutz, F., Huber, W., and Steinmetz, L.M. (2009). Bidirectional promoters generate pervasive transcription in yeast. Nature 457, 1033-1037.

Xue, Y., Pradhan, S.K., Sun, F., Chronis, C., Tran, N., Su, T., Van, C., Vashisht, A., Wohlschlegel, J., Peterson, C.L., et al. (2017). Mot1, Ino80C, and NC2 Function Coordinately to Regulate Pervasive Transcription in Yeast and Mammals. Mol Cell 67, 594-607 e594.

Yadon, A.N., Van de Mark, D., Basom, R., Delrow, J., Whitehouse, I., and Tsukiyama, T. (2010). Chromatin remodeling around nucleosome-free regions leads to repression of noncoding RNA transcription. Mol Cell Biol 30, 5110-5122.

Yan, C., Chen, H., and Bai, L. (2018). Systematic Study of Nucleosome-Displacing Factors in Budding Yeast. Mol Cell 71, 294-305 e294.

Yan, C., Zhang, D., Raygoza Garay, J.A., Mwangi, M.M., and Bai, L. (2015). Decoupling of divergent gene regulation by sequence-specific DNA binding factors. Nucleic Acids Res 43, 7292-7305.

Yang, M.Q., and Elnitski, L.L. (2008). Diversity of core promoter elements comprising human bidirectional promoters. BMC Genomics 9 Suppl 2, S3.

Yarragudi, A., Miyake, T., Li, R., and Morse, R.H. (2004). Comparison of ABF1 and RAP1 in chromatin opening and transactivator potentiation in the budding yeast Saccharomyces cerevisiae. Mol Cell Biol 24, 9152-9164.

Yarrington, R.M., Richardson, S.M., Lisa Huang, C.R., and Boeke, J.D. (2012). Novel transcript truncating function of Rap1p revealed by synthetic codon-optimized Ty1 retrotransposon. Genetics 190, 523-535.

Yen, K., Vinayachandran, V., Batta, K., Koerber, R.T., and Pugh, B.F. (2012). Genome-wide nucleosome specificity and directionality of chromatin remodelers. Cell 149, 1461-1473.

Yoh, S.M., Cho, H., Pickle, L., Evans, R.M., and Jones, K.A. (2007). The Spt6 SH2 domain binds Ser2-P RNAPII to direct Iws1-dependent mRNA splicing and export. Genes Dev 21, 160-174.

Yu, L., and Morse, R.H. (1999). Chromatin opening and transactivator potentiation by RAP1 in Saccharomyces cerevisiae. Mol Cell Biol 19, 5279-5288.

Yu, L., Sabet, N., Chambers, A., and Morse, R.H. (2001). The N-terminal and C-terminal domains of RAP1 are dispensable for chromatin opening and GCN4-mediated HIS4 activation in budding yeast. J Biol Chem 276, 33257-33264.

Yudkovsky, N., Logie, C., Hahn, S., and Peterson, C.L. (1999). Recruitment of the SWI/SNF chromatin remodeling complex by transcriptional activators. Genes Dev 13, 2369-2374.

Zentner, G.E., and Henikoff, S. (2013). Regulation of nucleosome dynamics by histone modifications. Nat Struct Mol Biol 20, 259-266.

Zentner, G.E., Kasinathan, S., Xin, B., Rohs, R., and Henikoff, S. (2015). ChEC-seq kinetics discriminates transcription factor binding sites by DNA sequence and shape in vivo. Nat Commun 6, 8733.

Zerbino, D.R., Achuthan, P., Akanni, W., Amode, M.R., Barrell, D., Bhai, J., Billis, K., Cummins, C., Gall, A., Giron, C.G., et al. (2018). Ensembl 2018. Nucleic Acids Res 46, D754-D761.

Zhao, J., Ohsumi, T.K., Kung, J.T., Ogawa, Y., Grau, D.J., Sarma, K., Song, J.J., Kingston, R.E., Borowsky, M., and Lee, J.T. (2010). Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol Cell 40, 939-953.

Zhao, Y., McIntosh, K.B., Rudra, D., Schawalder, S., Shore, D., and Warner, J.R. (2006). Fine-structure analysis of ribosomal protein gene transcription. Mol Cell Biol 26, 4853-4862.

Zhou, M., and Law, J.A. (2015). RNA Pol IV and V in gene silencing: Rebel polymerases evolving away from Pol II's rules. Curr Opin Plant Biol 27, 154-164.

Zhu, J., Liu, M., Liu, X., and Dong, Z. (2018). RNA polymerase II activity revealed by GRO-seq and pNET-seq in Arabidopsis. Nat Plants 4, 1112-1123.

Zhu, S., Li, W., Liu, J., Chen, C.H., Liao, Q., Xu, P., Xu, H., Xiao, T., Cao, Z., Peng, J., et al. (2016). Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR-Cas9 library. Nat Biotechnol 34, 1279-1286.

Zofall, M., Fischer, T., Zhang, K., Zhou, M., Cui, B., Veenstra, T.D., and Grewal, S.I. (2009). Histone H2A.Z cooperates with RNAi and heterochromatin factors to suppress antisense RNAs. Nature 461, 419-422.
