The Pennsylvania State University

The Graduate School

Department of Biochemistry and Molecular Biology

GLOBAL REGULATION OF GENE EXPRESSION IN

SACCHAROMYCES CEREVISIAE

VIA TATA BINDING PROTEIN REGULATORY FACTORS

A Thesis in

Biochemistry, Microbiology, and Molecular Biology

by

Kathryn L. Huisinga

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2005

The thesis of Kathryn L. Huisinga was reviewed and approved* by the following:

B. Franklin Pugh, Professor of Biochemistry and Molecular Biology, Thesis Advisor, Chair of Committee

Joseph C. Reese, Associate Professor of Biochemistry and Molecular Biology

Ross C. Hardison, T. Ming Chu Professor of Biochemistry and Molecular Biology

Naomi S. Altman, Associate Professor of Statistics

Robert A. Schlegal, Professor of Biochemistry and Molecular Biology, Head of the Department of Biochemistry and Molecular Biology

*Signatures are on file in the Graduate School

ABSTRACT

The TATA Binding Protein (TBP) is a key component of gene regulation. It binds to the promoter region of eukaryotic genes and facilitates assembly of the transcription initiation machinery, including RNA Polymerase II. Many proteins interact with TBP to both positively and negatively regulate gene expression. My thesis utilized genome-wide expression profiling in Saccharomyces cerevisiae to define the target genes of, and relationships between, the factors that regulate transcription via TBP. I found that the SAGA and TFIID co-activator complexes, both of which can deliver TBP to promoters, make overlapping contributions to the expression of nearly all yeast genes. The SAGA complex functions predominantly at ~10% of the genome, targeting genes that contain TATA boxes and are up-regulated during the environmental stress response. TFIID functions primarily at TATA-less genes, including genes down-regulated upon stress, while playing a predominant role in activation at the remaining 90% of the genome. The SAGA-dominated genes tend to be coordinately down-regulated by a variety of additional transcriptional regulators. These regulators include Mot1 and the NC2 complex, both of which down-regulate transcription primarily at TATA-containing genes, in a manner that often counteracts the positive role of SAGA. Additionally, Mot1 and NC2 target many of the same genes, whose down-regulation generally requires the combined action of both factors. Mutations that disrupt repression of transcription via TBP dimerization increase expression of genes that are expressed at very low levels and are generally located in subtelomeric regions that are also repressed by chromatin-associated mechanisms. TBP interactions with DNA are most critical at highly transcribed genes, and the TAF1 N-terminal domain plays a minimal role by itself in gene regulation but is partially redundant with other regulatory factors. In total, this work has defined a highly intertwined, gene-specific network of transcription regulators, which stimulate and inhibit gene expression via interactions with the TATA Binding Protein.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ACKNOWLEDGEMENTS

1 Introduction: An Overview of the Regulation of Gene Expression

1.1 Genes and Gene Expression

1.2 Regulation of RNA Polymerase II transcription

1.3 TATA Binding Protein’s Role in Pol II transcription

1.3.1 Identification of TFIID in yeast and higher eukaryotes

1.3.2 TBP Structure and Interaction with DNA

1.3.3 Role of yeast TAFs in transcription

1.3.4 Regulators of TBP

1.4 A Genome-wide approach to investigating transcription regulation

1.4.1 Genome-wide Technology

1.4.2 Benefits and Limitations of a Genome-wide approach

1.5 References

2 Interplay of TBP Inhibitors in Global Transcriptional Control

2.1 Summary

2.2 Introduction

2.3 Results

2.3.1 Genome-wide Effects of ΔTAND and TBP Mutations

2.3.2 Distinct Gene Expression Groups Reveal Combinatorial Interactions of TBP Regulators

2.3.3 Repressive Subtelomeric Regions are Intrinsically Accessible to the General Transcription Machinery

2.4 Discussion

2.4.1 The Yeast Genome is Negatively Regulated in Part by a Variety of TBP Inhibitors

2.4.2 NC2 Attenuates Highly Active Genes

2.4.3 Multiple Inhibitory Interactions Along TBP’s Concave Surface Provide Redundant Mechanisms for Preventing Unregulated Transcription

2.4.4 The Repressive Subtelomeric Environment is Accessible to the General Transcription Machinery

2.5 Experimental Procedures

2.6 References

3 A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae

3.1 Summary

3.2 Introduction

3.3 Results

3.3.1 Determination of transcriptional effects via expression microarray analysis

3.3.2 TFIID and SAGA each contribute to the expression of nearly all genes

3.3.3 TFIID dominates at ~90% of all genes, while SAGA dominates at ~10%

3.3.4 Stress-induced genes tend to be SAGA-dominated while stress-repressed genes tend to be TFIID-dominated

3.3.5 Genes having highly acetylated histone H4 tails tend to be TFIID-dominated

3.3.6 TAFs make a greater positive contribution at TFIID-dominated genes than at SAGA-dominated genes

3.3.7 Bdf1 and histone H4 tails are linked to TFIID regulation

3.3.8 SAGA-dominated genes are coordinately regulated

3.4 Discussion

3.4.1 TFIID and SAGA contribute to the expression of essentially all genes

3.4.2 Shared TAFs are more important for TFIID than for SAGA

3.4.3 Histone H4 tail acetylation, Bdf1 binding, and TFIID function are linked

3.4.4 SAGA-dominated genes reveal a highly regulated stress-response pathway

3.5 Materials and methods

3.6 References

4 Coordination of gene expression in Saccharomyces cerevisiae through positive and negative regulation of the TATA-Binding Protein

4.1 Summary

4.2 Introduction

4.3 Results

4.3.1 Design of the study

4.3.2 Perturbing different combinations of interactions (nodes) in the TBP regulatory network have distinct effects on cell growth

4.3.3 A portion of the yeast genome is subject to a complex TBP regulatory network

4.3.4 Clustering of genes sensitive to TBP regulation reveals a complex TBP regulatory network with many nodes

4.3.5 Validation of the TBP mutants

4.3.6 Changes in gene expression reflect changes in genome-wide occupancy of Test TBP

4.3.7 Combinations of TBP mutants elucidate the complex relationships between TBP regulators

4.3.8 Clusters of co-regulated genes have distinct intrinsic properties and exhibit relationships to additional transcriptional regulators

4.3.9 Clusters are enriched for genes with distinct functional properties and sequence motifs

4.3.10 Chromosomal duplications can occur when multiple TBP regulators are disrupted

4.4 Discussion

4.4.1 Modes of TBP regulation function coordinately, with certain regulation dominating at subsets of genes

4.4.2 A portion of the yeast genome is sensitive to a highly interconnected but minimally redundant TBP regulatory network

4.4.3 Clusters from genome-wide expression analysis characterize key relationships between TBP regulators

4.5 Materials & Methods

4.6 References

5 The Big Picture: Dissecting the in vivo role of TBP Regulatory complexes

5.1 Summary of study

5.2 Gene-specific nature of TBP regulatory complexes

5.3 Interplay between TBP regulators

5.4 The next questions

LIST OF FIGURES

Figure 1.1. Overview of transcription in Saccharomyces cerevisiae

Figure 2.1. Interaction of TBP with regulatory factors

Figure 2.2. HA-tagged TBP is expressed after galactose induction

Figure 2.3. Microarray analysis of TBP mutants in wild type and ΔTAND strains

Figure 2.4. Hierarchical clustering of mutants in individual groups

Figure 2.5. Dependency of selected groups of genes on the TAF1 TAND domain

Figure 2.6. Expression level of various gene groups

Figure 2.7. Subtelomeric frequency profile of group 3 and 4 genes

Figure 2.8. Models for the interplay of TBP effectors in regulating the four groups of genes identified in Chapter 2

Figure 3.1. Noise associated with spotted and high density microarrays

Figure 3.2. Genome-wide expression profiles of GCN5, SPT3, and TAF1 mutants

Table 3.2. Fold changes in gene expression in TFIID and SAGA mutants

Figure 3.3. Induction timecourse of those environmental stress response genes that are induced and SAGA- or TFIID-dominated

Figure 3.4. Highly coordinated co-regulation of SAGA-dominated genes

Figure 4.1. Strains and TBP mutants utilized to dissect the TBP regulatory network

Figure 4.2. The test TBP is expressed upon addition of galactose

Figure 4.3. Expression of several TBP mutants causes dominant synthetic toxicity

Figure 4.4. Cluster analysis reveals a highly interconnected TBP regulatory network

Figure 4.5. Validation of the TBP mutants

Figure 4.6. Clustering of subsets of experiments indicates a complex relationship between TBP regulators

Figure 4.7. Chromosomal duplications can occur upon disruption of the TBP regulatory network

Figure 4.8. Genes in Cluster 10 are over-represented on chromosomes XI and XII

LIST OF TABLES

Table 2.1. Number of genes affected by TBP mutations

Table 2.2. Number of genes in each cluster

Table 3.1. Yeast strains used in this study

Table 3.3. Comparison of TATA/TATA-less classification with ChIP data

Table 3.4. Percent of factor-sensitive genes that are SAGA-dominated

Table 4.1. Test TBP mutations disrupt a variety of TBP interacting factors

Table 4.2. Summary of effects observed upon disruption of modes of TBP regulation

Table 4.3. Presence of a TATA box and regulation by SAGA and TFIID at each gene cluster

Table 4.4. Transcription rate and steady state signal intensity of clusters

Table 4.5. Selected Relationships of gene clusters to defined groups

Table 4.6. Selected Relationships of gene clusters to other transcription regulators

ACKNOWLEDGEMENTS

This work would not have been possible without assistance and encouragement from many individuals.

Development and Analysis of yeast microarrays

As the yeast expression arrays played a vital role in the work presented in this thesis, these acknowledgements are applicable to all chapters. I would first, and foremost, like to thank Lata Chitikila for undertaking the project of developing microarray technology in the Pugh lab. The data from numerous microarrays presented in this thesis would not have been possible without all that she did to put this huge undertaking together. She initiated the expression studies presented in Chapter 2, and I was fortunate to inherit a project that has been so fruitful. I truly appreciate her guidance and encouragement!

All the microarray data in the world means nothing if you don’t have a method to analyze it, and that is where assistance from the statisticians is vital. I would like to thank Francesca Chiaromonte for her role, especially during the “early days”, in facilitating the development and interpretation of the microarrays. I would also like to thank Naomi Altman and the PSU Bioinformatic Consulting Center for assistance in developing the analysis used in Chapter 3 and for generally being a great resource when I’m stuck with an “R” problem! Finally, I would like to thank Andy Basehoar for helping develop the statistical analysis methods used in Chapters 2 and 4 and writing the code to execute them. Most importantly, I want to thank Andy for the work he did in identifying yeast TATA boxes. Without this information, the data presented in Chapters 3 and 4 would not be nearly as compelling a story.

I would also like to acknowledge John Szot and Craig Praul at the Penn State Microarray core for their role in printing the arrays and providing great technical assistance with all things microarray; John Chicca for his role in amplifying the PCR products used in printing the arrays; and lastly, the original Penn State yeast microarray group composed of the Workman, Simpson, Ng, Reese, and Pugh labs for realizing the importance of this technology and bringing it to Penn State.

Chapter 2

I would like to thank T. Kokubo for providing strains and plasmids, Song Tan for providing images for Figure 2.1A, and David Gilmour, Pam Mitchell, Joe Reese, and Song Tan for comments on the published manuscript.

Chapter 3

I would like to acknowledge Amanda Paul and Jaclyn Shingara for technical assistance, Sara Zanton for assistance with Excel scripts, and Winston Shen and Andreas Ladurner for sharing unpublished data. I also thank Joe Reese, Song Tan, Jerry Workman, Bing Li, Philippe Prochasson, and members of the Workman lab for advice, reagents, and comments on the published manuscript.

Chapter 4

I would like to acknowledge Kivanc Birsoy and Brian Venters for technical assistance with the cloning of TBP mutants and cell growth assays.

I will forever be indebted to my mentor, Frank Pugh, for his guidance, encouragement, and confidence in my abilities. He charted a new direction for the lab by focusing on genome-wide analysis, and I am very fortunate to have been a part of it. I would also like to thank all the current and former members of the Pugh Lab that I have worked with: Lata Chitikila, Haiping Kou, Jordan Irvin, John Robinson, Melissa Durrant, Sara Zanton, Travis Mavrich, and all of the undergrads who keep things interesting. I am very fortunate to have had an excellent group of co-workers who are always willing to help.

I would also like to thank additional members of the Penn State community for making my time here so enjoyable, enlightening, and rewarding. This includes all my classmates and friends who helped troubleshoot science questions and, more importantly, provided diversions from science when necessary; you are too numerous to list! I thank the members of the Gene Regulation “Mega” group, both current and former, from the Simpson, Workman, Reese, Tan, and Gilmour Labs for suggestions and ideas throughout this work. I’d also like to thank my committee members, Ross Hardison, Naomi Altman, Song Tan, Joe Reese, and Frank, for their direction on the work presented in this thesis. In addition, I’d like to thank the late Dr. Robert Simpson for his input as a member of my committee and his example as a scientist.

Last, but certainly not least, I want to thank my family, especially my parents, for their support and encouragement. Most importantly, I want to thank Mike for all he has done and for always being so supportive. I look forward to our life together.

Chapter 1

1 Introduction: An Overview of the Regulation of Gene Expression

1.1 Genes and Gene Expression

Every living cell contains DNA that carries the genetic information for that organism. Living organisms contain anywhere from several hundred to many thousands of genes, which are specific segments of DNA. These genes must be expressed or repressed at the right time and under the right conditions in order for the organism to grow and survive. What this requires differs between organisms. A human embryo must express certain genes and repress others in order to transform from a fertilized egg to a living, breathing baby. The single-celled brewer’s yeast must adjust which genes it expresses in order to survive once it has fermented all available sugar and needs to switch its carbon source. The expression of genes is regulated by integrating environmental signals coming into a cell to achieve the gene expression profile required for the cell to survive in its current environment. Gene expression is regulated on many levels, which include DNA accessibility, transcription initiation and elongation, mRNA processing and stability, translation, protein folding and post-translational modifications, and finally degradation. While the effects of gene expression can be very diverse, as illustrated by the above examples, the basic principles behind it are highly conserved.

For a gene to be expressed, its DNA coding region must first be transcribed into RNA. The factor responsible for transcribing DNA into RNA is RNA polymerase. There are three types of DNA-dependent RNA polymerases in most eukaryotic cells, designated Pol I, Pol II, and Pol III (plants have a recently discovered fourth RNA polymerase; Onodera et al., 2005). Each type of polymerase synthesizes RNAs that serve different functions in gene expression. RNA Pol II transcribes protein-coding genes to generate mRNA, which is then translated into proteins. The translation process requires the rRNAs and tRNAs synthesized by RNA Pol I and Pol III. While the three polymerases share several subunits, the mechanisms that regulate transcription initiation differ between them. The promoter regions where the polymerases interact with DNA differ in sequence, and each polymerase requires different accessory factors to regulate its RNA synthesis (reviewed by Hahn, 2004). Despite the differences between each type of polymerase, all three require the TATA binding protein (TBP) for transcription initiation (Cormack and Struhl, 1992; Schultz et al., 1992), highlighting its importance in cellular function. This work is focused on TBP and TBP regulatory factors involved in transcription initiation at RNA Polymerase II genes.

1.2 Regulation of RNA Polymerase II transcription

Since misregulation of gene expression can be fatal, the cell has evolved complex transcription machinery to regulate gene expression. To adapt to different signals, a cell must use mechanisms to adjust the level of mRNA production. In some instances, this means completely shutting a gene off, which involves repressors that block transcription of the gene. In other cases this means turning the gene on, which involves gene-specific activators that bind to specific DNA sites and recruit the transcription machinery, including TBP, to the promoter of a gene, allowing transcription of the gene to occur. Various levels of transcription output can be achieved, as needed, through positive and negative regulatory mechanisms. The transcription initiation machinery is highly conserved across all eukaryotes, which allows the use of model organisms in studying the regulation of gene expression. However, there are subtle differences between simple and more complex organisms. As the work presented in this thesis was performed with the single-celled yeast Saccharomyces cerevisiae, any important differences in regulation of transcription initiation between Saccharomyces and other organisms will be pointed out in this overview of RNA Pol II transcription. Figure 1.1 illustrates an overview of the transcription process in S. cerevisiae, which is outlined in more detail below.

Figure 1.1. Overview of transcription in Saccharomyces cerevisiae. Cartoon which outlines several of the main components of transcription. The TATA Binding Protein (TBP), in orange, is shown bound to promoter DNA. Other key factors are labeled in the illustration.

The transcription process is impacted by the fact that DNA inside the nucleus is packaged into chromatin. Presumably, this packaging is required to fit the large amount of cellular DNA into the available nuclear space. The DNA is wrapped around a histone core, which is composed of two copies each of H2A, H2B, H3, and H4 proteins, to form nucleosomes. These nucleosomes are then further compacted into higher order chromatin, the structure of which is not completely clear. However, chromatin is much more than just a static entity inside the nucleus. It plays a central role in regulating gene transcription, as histone-bound DNA is less accessible to the transcription machinery than naked DNA. There are two different types of chromatin inside the cell, heterochromatin and euchromatin. Heterochromatin is highly condensed and generally contains repressed genes, while euchromatin is slightly more accessible and contains regions of the genome being actively transcribed. Packaging DNA into heterochromatin is one way that a cell maintains genes in an “OFF” state when necessary, as different cell types can package genes differently based upon expression requirements (reviewed by Orphanides and Reinberg, 2002; Richards and Elgin, 2002). However, even genes located within euchromatin must contend with the presence of nucleosomes. One way that cells handle this is by using factors that alter chromatin to make the DNA more accessible.

There are two main categories of complexes that alter the properties of chromatin: ATP-dependent chromatin remodeling complexes and histone modifying complexes. The former use the energy of ATP to reposition nucleosomes, thereby making the DNA more or less accessible to gene-specific DNA binding proteins and the general transcription machinery. Histone modifying complexes catalyze post-translational modifications of histones, which facilitate gene activation or repression. The best-characterized histone modification is acetylation, which correlates with transcriptional activity. Histone Acetyl Transferases (HATs) acetylate lysine residues in the histone N-terminal tails, while Histone Deacetylases (HDACs) counteract HATs and remove acetyl groups from histones. Histones have also been shown to be methylated, phosphorylated, and ubiquitinated. While there is strong evidence linking certain histone modifications to transcriptional activation and others to inactivation, exactly how histone modifications are coordinated in a manner that regulates gene expression is still not entirely clear. A “histone code” has been proposed, which suggests that certain modifications create a pattern that is somehow decoded, possibly by proteins that bind to the modified tails. Although it is clear that certain proteins do bind specifically or preferentially to histones with certain modifications, there is also evidence that the modifications can affect the charge-sensitive interactions between histones and DNA, resulting in altered DNA accessibility. Recent studies in yeast suggest that regulation may occur through a combination of these two mechanisms (Dion et al., 2005; Kurdistani et al., 2004).

In vitro experiments in the early 1980s identified the critical components required for RNA Pol II transcription by exploiting the fact that purified mammalian Pol II could direct accurate transcription of a DNA template when supplemented with a crude cell extract (Matsui et al., 1980; Weil et al., 1979). Through fractionation of this crude extract, the general transcription factors (GTFs), which include TFIIA, TFIIB, TFIID (including TBP), TFIIE, TFIIF, and TFIIH, were identified. These factors are highly conserved from yeast through mammals, emphasizing the similarity of the general components involved in RNA Pol II transcription. Subsequent in vitro experiments elucidated the general mechanism by which these GTFs function to initiate basal transcription (“activated” transcription is discussed below). However, one should keep in mind that gene activation in vivo is not necessarily separated into the distinct steps that were elucidated with the in vitro GTF experiments. The first step is TFIID binding at the core promoter, followed by TFIIB binding to the TFIID-DNA complex. Next, the polymerase is recruited with TFIIF, and then TFIIE and TFIIH are brought in to form the preinitiation complex (PIC). TFIIA can associate after TFIID binding, and helps stabilize TFIID-DNA interactions. At this point all the factors necessary to initiate transcription are bound to the promoter, but the polymerase is not yet in an active conformation to begin transcription. For initiation to occur, the DNA at the start site must undergo a conformational change in which it separates into two strands and the template strand is positioned within the active site of Pol II. It is from this point that RNA synthesis begins as the polymerase releases from the promoter and initiation factors, in a step termed promoter clearance, and moves down the gene in productive elongation (Hahn, 2004 and references therein). There are additional factors, which will not be discussed here, that are involved in regulating Pol II during the elongation process.

The DNA sequence upstream of the coding sequence, termed the core promoter region, plays a critical role in setting up the PIC. While originally thought to be invariant, core promoters can actually be very diverse and appear to play a major role in gene regulation. Several different promoter sequences have been identified with which components of the PIC interact to position the polymerase. The first promoter element identified in eukaryotic protein-coding genes was the TATA box (originally named the Goldberg-Hogness box after its discoverers). Subsequently, TBP (a component of TFIID) was shown to bind to this sequence (Buratowski et al., 1988; Hahn et al., 1989a). An in-depth examination of the yeast genome sequence recently resulted in identification of an 8 bp consensus sequence (TATA(A/T)A(A/T)(A/G)) which is present at approximately 19% of yeast promoters (Basehoar et al., 2004). Genome analysis of other eukaryotic organisms has estimated the percentage of TATA-containing genes at 32-43% (Smale and Kadonaga, 2003). Another core promoter element of importance is the initiator element (Inr), which is where initiation of transcription occurs. In higher eukaryotes, this is generally 25-30 bp downstream of the TATA box, while in yeast the distance is more variable. However, in both situations, the Inr plays a role in accurate transcription initiation. There are two additional core promoter sequences that have been identified in higher eukaryotes but are not known to be present in yeast: the downstream promoter element (DPE), and an element recognized by TFIIB (BRE) (reviewed by Smale and Kadonaga, 2003).
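
The 8 bp consensus quoted above maps directly onto a regular expression, with each bracketed alternative becoming a character class. The following is a minimal Python sketch under that assumption. The function names and the example sequence are hypothetical and serve only as illustration; this is not the procedure used by Basehoar et al. (2004) to classify yeast promoters.

```python
import re

# Consensus from the text: TATA(A/T)A(A/T)(A/G).
# The lookahead lets overlapping matches (e.g. in TATATATATA) be reported.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT][AG]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_tata_boxes(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 8-mer) for each consensus match in seq."""
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq.upper())]


# Hypothetical promoter fragment, made up for illustration only.
example = "GCGCTATAAAAGGCGC"
print(find_tata_boxes(example))
print(find_tata_boxes(reverse_complement(example)))
```

Scanning both the sequence and its reverse complement covers either strand. Whether a match counts as a functional TATA box would depend on criteria beyond this pattern, such as its position relative to the transcription start site.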

Other key players in transcription regulation are the gene-specific activators and repressors, which bind to DNA regulatory sequences generally located upstream of the coding region in yeast. Often these opposing modes of gene regulation are performed by the same molecule, which is converted from a repressor to an activator and back through protein modifications, of which the best studied is phosphorylation. This conversion is often initiated by the external signals a cell receives, which then must be coordinated to lead to changes in the pattern of gene expression. Gene-specific activators and repressors function with the assistance of a diverse group of factors designated as coactivators or corepressors. While I have focused my discussion on coactivator complexes, one should be aware that there are corepressor complexes, which essentially function in the opposite manner as coactivator complexes. In fact, in several cases presented in this thesis, one complex might have both activator and repressor functions.

In the broadest definition, any factor that assists the gene-specific activator in assembling the transcription complex at the promoter of a gene can be considered a coactivator. Coactivators are often large protein complexes with many subunits that carry out diverse functions such as activator binding, histone modification and remodeling, recruitment of GTFs (including TBP and the polymerase) to the promoter, and even facilitating the positioning of the transcriptional apparatus over the core promoter. This definition is meant to encompass a broad range of factors, which are often classified separately based upon the activity by which they were first identified. Coactivators are thought to bridge between the gene-specific activator and the basic transcriptional machinery, although one coactivator, TFIID, was originally identified as a GTF. Others were identified based upon a requirement for them in activated transcription, either using in vitro transcription assays or in vivo with genetic screens that exploited conditional growth requirements. Interestingly, and a feature explored in more detail in this thesis, coactivator complexes often contain components which modulate transcription both positively and negatively. Complexes generally considered as coactivators include TFIID and SAGA, whose relationship is detailed in Chapter 3, as well as Mediator. In addition, complexes that modify and remodel chromatin, such as NuA4 and SWI/SNF, and factors that interact with TBP, like TFIIA and NC2, could be put into this category. Whether a factor works as a coactivator or a corepressor might change based upon what gene is being examined and what the physiological requirements of that gene product are in the cell at the time of the assay. In other words, it all depends on what gene you look at and what signals the cell is getting from its environment.

1.3 TATA Binding Protein’s Role in Pol II transcription

1.3.1 Identification of TFIID in yeast and higher eukaryotes

The TFIID fraction originally isolated from mammalian cell extracts is the GTF component that binds to and recognizes the TATA box (Nakajima et al., 1988; Sawadogo and Roeder, 1985). Subsequent experiments with yeast recapitulated the ability to specifically initiate transcription in vitro from a nuclear extract and identified a yeast factor that behaved like the mammalian TFIID fraction (Buratowski et al., 1988; Lue and Kornberg, 1987). Further characterization of yeast “TFIID” demonstrated that it could bind to consensus and non-consensus TATA promoters and corresponds to a single polypeptide which is encoded by the SPT15 gene (Eisenmann et al., 1989; Hahn et al., 1989a; Hahn et al., 1989b). On a side note, the SPT genes were identified as mutations that suppress the phenotypes associated with insertion of the Ty transposon into the promoter region of genes (Simchen et al., 1984; Winston et al., 1984). Analysis of the cohort of SPT mutants has mapped the mutations to genes that encode components of several transcription regulatory complexes as well as the core histones. This includes SPT3, a component of the SAGA complex, whose role in regulating TBP (SPT15) is investigated in this work.

The DNA binding activity of the mammalian TFIID fraction was thus mapped to a single polypeptide in yeast, which is referred to as the TATA binding protein (TBP) in the remainder of this document to distinguish it from the original TFIID fraction. After this, major progress in the field occurred with the mammalian and Drosophila systems. Further studies showed that yeast TBP, as well as the Drosophila and mammalian cloned homologues, were unable to completely recapitulate all of the activities of the original semi-purified TFIID fraction. The cloned yTBP could support basal transcription in vitro but was unable to respond to gene-specific activators in the same way as the partially purified mammalian TFIID (Pugh and Tjian, 1990; Smale et al., 1990). This led to the proposal that the partially purified fraction contained additional “coactivator” functions, which allowed it to respond to transcriptional activators (Pugh and Tjian, 1990). Coactivator subunits were identified in flies and mammalian cells and termed TAFs for TBP Associated Factors (Dynlacht et al., 1991; Tanese et al., 1991). (The remainder of this work refers to the TAF complex as TFIID, and generally considers TFIID and TBP as separate entities, although they often are associated.)

After the identification of mammalian and Drosophila TAFs, the search for yeast TAFs took off, as they were presumed to be the missing component in the reconstitution experiments with yTBP. In 1994 Reese and colleagues published the identification of a multisubunit complex that interacts with yTBP, is specifically required for activated transcription in vitro, and contains subunits homologous to TAFs previously isolated in mammals and flies (Poon et al., 1995; Reese et al., 1994). Since their discovery, the in vivo role of yeast TAFs has been the focus of intensive research, which is discussed in more detail in section 1.3.3. Subsequent studies have identified a total of 14 TAFs, as well as additional proteins, as components of yeast TFIID through co-purification experiments (Auty et al., 2004; Sanders et al., 2002). Like its counterparts in other organisms, yTFIID is composed of subunits with diverse functions whose roles in gene regulation are still under active investigation.

1.3.2 TBP Structure and Interaction with DNA

Crystal structures of TBP, both alone and bound to TATA DNA, have been solved (Chasman et al., 1993; Kim et al., 1993a; Kim et al., 1993b; Nikolov et al., 1992). TBP is a saddle-shaped molecule, composed of a ten-stranded antiparallel β-sheet that forms a curved, concave surface that interacts with DNA, and four α-helices on its upper surface (see Figure 2.1 in Chapter 2). TBP binds to DNA’s minor groove, primarily through hydrophobic interactions. This binding distorts the DNA, causing a sharp bend in it. The TBP-DNA interaction is severely impeded when DNA is assembled into a nucleosome (Imbalzano et al., 1994), presumably due to the incompatibility between DNA-histone interactions and DNA-TBP interactions (reviewed in Orphanides et al., 1996). Recent in vivo studies have shown that the position of a TATA box relative to positioned nucleosomes is critical in determining which factors are required for transcriptional activation (Martinez-Campa et al., 2004). This is an important observation as it highlights the requirement for different components of the transcriptional activation machinery in vivo, depending upon the location and presence of a TATA box in the context of nucleosomal DNA.

1.3.3 Role of yeast TAFs in transcription

Based on the in vitro studies with mammalian and Drosophila TAFs, it was expected that the yeast TAFs would also be required for activated transcription. Through the awesome power of yeast genetics, the role of TAFs in yeast was addressed primarily with in vivo experiments. Surprisingly, initial analysis with several well studied, highly inducible yeast genes indicated that gene activation did not actually require TAFs (Moqtaderi et al., 1996; Walker et al., 1996). Nevertheless, the requirement of TAFs for cell viability indicates that they play an important role in some cellular processes. Further investigation into the role of TAFs in vivo demonstrated that TAF1 and TAF5 are important for cell cycle progression (Apone et al., 1996; Walker et al., 1997), and that TAF1 is required for the expression of certain ribosomal protein genes (Shen and Green, 1997). Studies of the regulatory role of TAFs at ribosomal protein genes indicated that their dependence on TAF1 is linked to the core promoter region, not the upstream activator binding regions.

A turning point in the understanding of yeast TAFs occurred in 1998, when Workman and colleagues discovered that a subset of yeast TAFs (TAF5, TAF6, TAF9, TAF10, and TAF12) was also present in the Spt-Ada-Gcn5-acetyltransferase (SAGA) complex (Grant et al., 1998). Investigation into the in vivo requirement of these “shared” TAFs indicated a much broader requirement for them in transcription (Apone et al., 1998; Michel et al., 1998; Moqtaderi et al., 1998). Additional progress in understanding yeast TAFs was made when chromatin immunoprecipitation experiments conducted against TBP and a variety of TAFs demonstrated that some promoters had high levels of TAFs relative to bound TBP, while other promoters had relatively low levels of TAFs. This result led to the proposal that some genes in yeast are TAF-independent, meaning that they are activated in a manner that does not require TFIID (Kuras et al., 2000; Li et al., 2000). However, which yeast genes require TFIID for activation was still an area of debate. Experiments presented in this thesis, in particular Chapter 3, attempt to address the genome-wide role of TFIID through analysis of TFIID-specific TAF1.

1.3.4 Regulators of TBP

Although TBP is most readily identified as a component of TFIID, it interacts with many additional proteins besides TAFs. These “TBP regulators” function in a variety of ways to affect gene expression by modulating TBP’s activity. Some factors act as positive regulators of gene expression by facilitating the delivery of TBP to promoters, stabilizing its binding to DNA, and setting up the PIC. Others play a negative role by inhibiting TBP-DNA interactions or PIC formation. A subset of the known TBP regulators includes: the SAGA complex, which interacts with TBP through the Spt3/Spt8 module; TFIIA, which interacts with the TBP-DNA binary complex to stabilize it; TFIIB, which is thought to be the bridge between TBP and Pol II; TBP self-dimerization, in which two TBP molecules interact with each other to occlude TBP’s DNA binding interface; the NC2 complex, which binds to the TBP-DNA complex and blocks the incorporation of TFIIA and TFIIB into the PIC; and Mot1, which uses the energy of ATP hydrolysis to dissociate TBP from DNA. While this list includes many of the well-studied components of the gene regulatory machinery that target TBP, it is not meant to be comprehensive, as there are additional factors that have been demonstrated to interact with TBP.

A main objective of this thesis is to understand how the different regulatory factors which target TBP coordinate their function. To this end, several of the TBP regulators mentioned above are analyzed independently and in combination to determine their genome-wide role in regulating gene expression through TBP. More detailed information about each regulator is presented in the relevant chapter introductions.

1.4 A Genome-wide approach to investigating transcription regulation

1.4.1 Genome-wide Technology

In 1996 the finished sequence of the Saccharomyces cerevisiae genome was released, making it the first eukaryotic genome to be completely sequenced (Goffeau et al., 1996). After the identification of all ~6000 yeast genes, the next hurdle was to understand their functions. Technology developed in the Brown and Davis labs at Stanford University has been instrumental in addressing questions about genome function and the regulation of gene expression (DeRisi et al., 1996; Heller et al., 1997; Schena et al., 1995; Shalon et al., 1996). They developed a high-density DNA arrayer that could print DNA on glass microscope slides. By printing DNA that corresponded to the coding regions of a genome, they could monitor the expression of many genes simultaneously. The first yeast microarray, performed after the yeast sequence was released, used PCR to amplify several thousand yeast open reading frames (ORFs, i.e. the coding regions of the genome) (Lashkari et al., 1997). These PCR products were then arrayed and used to monitor differences in the mRNA levels between two yeast strains with different genetic backgrounds or cultured under different growth conditions. The technology can also be used to print cDNAs instead of amplified PCR products for organisms whose genomes are more complex. There are now many different varieties of arrays, which are used in a broad range of applications. One application of particular interest in the gene regulation field is using high-density arrays to determine the locations of factors that interact with DNA by combining the arrays with chromatin-immunoprecipitation experiments.
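
The two-sample comparison described above is commonly summarized as a per-gene log ratio of the two channel intensities. The following is a minimal sketch of that calculation, not the analysis pipeline used in this work. The gene names, the intensity values, and the simple background cutoff are hypothetical and serve only as illustration.

```python
import math


def log2_ratios(test, reference, background=0.0):
    """Return {gene: log2(test / reference)} for usable spots.

    A spot is kept only if its signal exceeds `background` in both
    channels (an assumed, simplified quality filter).
    """
    ratios = {}
    for gene, test_signal in test.items():
        ref_signal = reference.get(gene)
        if ref_signal is None:
            continue  # gene not measured in the reference channel
        if test_signal <= background or ref_signal <= background:
            continue  # at or below background in at least one channel
        ratios[gene] = math.log2(test_signal / ref_signal)
    return ratios


# Hypothetical intensities for three illustrative genes.
test_channel = {"YFG1": 800.0, "YFG2": 100.0, "YFG3": 50.0}
reference_channel = {"YFG1": 200.0, "YFG2": 400.0, "YFG3": 0.0}

ratios = log2_ratios(test_channel, reference_channel)
for gene, value in sorted(ratios.items()):
    print(gene, value)
```

On a log2 scale, positive values indicate higher mRNA levels in the test sample, negative values indicate lower levels, and a two-fold change in either direction has the same magnitude.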

1.4.2 Benefits and Limitations of a Genome-wide approach

Genome-wide expression profiling has been very fruitful for the study of gene regulation. Prior to genome-wide analysis, researchers who wanted to test the effect of a mutation in the gene regulatory machinery would monitor the expression of a handful of genes via Northern blot or a comparable method. Often, conclusions based on the analysis of a few commonly studied genes were applied to the entire genome. While this practice seemed reasonable at the time, in retrospect it may have been misleading, especially concerning the role of TAFs in the yeast genome, as presented in Chapter 3. By using a genome-wide approach to dissect the function of transcriptional regulatory machinery, any unintentional bias in the choice of test genes is avoided.

However, there are drawbacks and limitations to what can be done on a genome-wide scale. The most obvious downside is dealing with the large amounts of data that result from global analysis. For a biologist this is often very daunting. It is also difficult, if not impossible, to delve into the level of detail necessary to completely understand how a gene is regulated with a genome-wide approach. It is often most illuminating to combine global analysis with more detailed biochemical studies. In the past several years, there has been much growth in the genomics field, in an attempt to test the applicability of earlier conclusions about how genes are regulated against the whole yeast genome.

While the advent of the genomics field in the past decade, including the invention of high-density arrays, has dramatically altered the landscape of life sciences, there still are many unknowns about how a cell regulates its genome.

1.5 References

Apone, L. M., Virbasius, C. A., Holstege, F. C., Wang, J., Young, R. A., and Green, M. R. (1998). Broad, but not universal, transcriptional requirement for yTAFII17, a histone H3-like TAFII present in TFIID and SAGA. Mol Cell 2, 653-661.

Apone, L. M., Virbasius, C. M., Reese, J. C., and Green, M. R. (1996). Yeast TAF(II)90 is required for cell-cycle progression through G2/M but not for general transcription activation. Genes Dev 10, 2368-2380.

Auty, R., Steen, H., Myers, L. C., Persinger, J., Bartholomew, B., Gygi, S. P., and Buratowski, S. (2004). Purification of active TFIID from Saccharomyces cerevisiae. Extensive promoter contacts and co-activator function. J Biol Chem 279, 49973-49981.

Basehoar, A. D., Zanton, S. J., and Pugh, B. F. (2004). Identification and distinct regulation of yeast TATA box-containing genes. Cell 116, 699-709.

Buratowski, S., Hahn, S., Sharp, P. A., and Guarente, L. (1988). Function of a yeast TATA element-binding protein in a mammalian transcription system. Nature 334, 37-42.

Chasman, D. I., Flaherty, K. M., Sharp, P. A., and Kornberg, R. D. (1993). Crystal structure of yeast TATA-binding protein and model for interaction with DNA. Proc Natl Acad Sci U S A 90, 8174-8178.

Cormack, B. P., and Struhl, K. (1992). The TATA-binding protein is required for transcription by all three nuclear RNA polymerases in yeast cells. Cell 69, 685-696.

DeRisi, J., Penland, L., Brown, P. O., Bittner, M. L., Meltzer, P. S., Ray, M., Chen, Y., Su, Y. A., and Trent, J. M. (1996). Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nat Genet 14, 457-460.

Dion, M. F., Altschuler, S. J., Wu, L. F., and Rando, O. J. (2005). From the Cover: Genomic characterization reveals a simple histone H4 acetylation code. Proc Natl Acad Sci U S A 102, 5501-5506.

Dynlacht, B. D., Hoey, T., and Tjian, R. (1991). Isolation of coactivators associated with the TATA-binding protein that mediate transcriptional activation. Cell 66, 563-576.

Eisenmann, D. M., Dollard, C., and Winston, F. (1989). SPT15, the gene encoding the yeast TATA binding factor TFIID, is required for normal transcription initiation in vivo. Cell 58, 1183-1191.

Goffeau, A., Barrell, B. G., Bussey, H., Davis, R. W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J. D., Jacq, C., Johnston, M., et al. (1996). Life with 6000 genes. Science 274, 546, 563-567.

Grant, P. A., Schieltz, D., Pray-Grant, M. G., Steger, D. J., Reese, J. C., Yates, J. R., 3rd, and Workman, J. L. (1998). A subset of TAF(II)s are integral components of the SAGA complex required for nucleosome acetylation and transcriptional stimulation. Cell 94, 45-53.

Hahn, S. (2004). Structure and mechanism of the RNA polymerase II transcription machinery. Nat Struct Mol Biol 11, 394-403.

Hahn, S., Buratowski, S., Sharp, P. A., and Guarente, L. (1989a). Isolation of the gene encoding the yeast TATA binding protein TFIID: a gene identical to the SPT15 suppressor of Ty element insertions. Cell 58, 1173-1181.

Hahn, S., Buratowski, S., Sharp, P. A., and Guarente, L. (1989b). Yeast TATA-binding protein TFIID binds to TATA elements with both consensus and nonconsensus DNA sequences. Proc Natl Acad Sci U S A 86, 5718-5722.

Heller, R. A., Schena, M., Chai, A., Shalon, D., Bedilion, T., Gilmore, J., Woolley, D. E., and Davis, R. W. (1997). Discovery and analysis of inflammatory disease-related genes using cDNA microarrays. Proc Natl Acad Sci U S A 94, 2150-2155.

Imbalzano, A. N., Kwon, H., Green, M. R., and Kingston, R. E. (1994). Facilitated binding of TATA-binding protein to nucleosomal DNA. Nature 370, 481-485.

Kim, J. L., Nikolov, D. B., and Burley, S. K. (1993a). Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520-527.

Kim, Y., Geiger, J. H., Hahn, S., and Sigler, P. B. (1993b). Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512-520.

Kuras, L., Kosa, P., Mencia, M., and Struhl, K. (2000). TAF-Containing and TAF-independent forms of transcriptionally active TBP in vivo. Science 288, 1244-1248.

Kurdistani, S. K., Tavazoie, S., and Grunstein, M. (2004). Mapping global histone acetylation patterns to gene expression. Cell 117, 721-733.

Lashkari, D. A., DeRisi, J. L., McCusker, J. H., Namath, A. F., Gentile, C., Hwang, S. Y., Brown, P. O., and Davis, R. W. (1997). Yeast microarrays for genome wide parallel genetic and gene expression analysis. Proc Natl Acad Sci U S A 94, 13057-13062.

Li, X. Y., Bhaumik, S. R., and Green, M. R. (2000). Distinct classes of yeast promoters revealed by differential TAF recruitment. Science 288, 1242-1244.

Lue, N. F., and Kornberg, R. D. (1987). Accurate initiation at RNA polymerase II promoters in extracts from Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 84, 8839-8843.

Martinez-Campa, C., Politis, P., Moreau, J. L., Kent, N., Goodall, J., Mellor, J., and Goding, C. R. (2004). Precise nucleosome positioning and the TATA box dictate requirements for the histone H4 tail and the bromodomain factor Bdf1. Mol Cell 15, 69-81.

Matsui, T., Segall, J., Weil, P. A., and Roeder, R. G. (1980). Multiple factors required for accurate initiation of transcription by purified RNA polymerase II. J Biol Chem 255, 11992-11996.

Michel, B., Komarnitsky, P., and Buratowski, S. (1998). Histone-like TAFs are essential for transcription in vivo. Mol Cell 2, 663-673.

Moqtaderi, Z., Bai, Y., Poon, D., Weil, P. A., and Struhl, K. (1996). TBP-associated factors are not generally required for transcriptional activation in yeast. Nature 383, 188-191.

Moqtaderi, Z., Keaveney, M., and Struhl, K. (1998). The histone H3-like TAF is broadly required for transcription in yeast. Mol Cell 2, 675-682.

Nakajima, N., Horikoshi, M., and Roeder, R. G. (1988). Factors involved in specific transcription by mammalian RNA polymerase II: purification, genetic specificity, and TATA box-promoter interactions of TFIID. Mol Cell Biol 8, 4028-4040.

Nikolov, D. B., Hu, S. H., Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N. H., Roeder, R. G., and Burley, S. K. (1992). Crystal structure of TFIID TATA-box binding protein. Nature 360, 40-46.

Onodera, Y., Haag, J. R., Ream, T., Nunes, P. C., Pontes, O., and Pikaard, C. S. (2005). Plant nuclear RNA polymerase IV mediates siRNA and DNA methylation-dependent heterochromatin formation. Cell 120, 613-622.

Orphanides, G., Lagrange, T., and Reinberg, D. (1996). The general transcription factors of RNA polymerase II. Genes Dev 10, 2657-2683.

Orphanides, G., and Reinberg, D. (2002). A unified theory of gene expression. Cell 108, 439-451.

Poon, D., Bai, Y., Campbell, A. M., Bjorklund, S., Kim, Y. J., Zhou, S., Kornberg, R. D., and Weil, P. A. (1995). Identification and characterization of a TFIID-like multiprotein complex from Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 92, 8224-8228.

Pugh, B. F., and Tjian, R. (1990). Mechanism of transcriptional activation by Sp1: evidence for coactivators. Cell 61, 1187-1197.

Reese, J. C., Apone, L., Walker, S. S., Griffin, L. A., and Green, M. R. (1994). Yeast TAFIIs in a multisubunit complex required for activated transcription. Nature 371, 523-527.

Richards, E. J., and Elgin, S. C. (2002). Epigenetic codes for heterochromatin formation and silencing: rounding up the usual suspects. Cell 108, 489-500.

Sanders, S. L., Garbett, K. A., and Weil, P. A. (2002). Molecular characterization of Saccharomyces cerevisiae TFIID. Mol Cell Biol 22, 6000-6013.

Sawadogo, M., and Roeder, R. G. (1985). Interaction of a gene-specific transcription factor with the adenovirus major late promoter upstream of the TATA box region. Cell 43, 165-175.

Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470.

Schultz, M. C., Reeder, R. H., and Hahn, S. (1992). Variants of the TATA-binding protein can distinguish subsets of RNA polymerase I, II, and III promoters. Cell 69, 697-702.

Shalon, D., Smith, S. J., and Brown, P. O. (1996). A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res 6, 639-645.

Shen, W. C., and Green, M. R. (1997). Yeast TAF(II)145 functions as a core promoter selectivity factor, not a general coactivator. Cell 90, 615-624.

Simchen, G., Winston, F., Styles, C. A., and Fink, G. R. (1984). Ty-mediated gene expression of the LYS2 and HIS4 genes of Saccharomyces cerevisiae is controlled by the same SPT genes. Proc Natl Acad Sci U S A 81, 2431-2434.

Smale, S. T., and Kadonaga, J. T. (2003). The RNA polymerase II core promoter. Annu Rev Biochem 72, 449-479.

Smale, S. T., Schmidt, M. C., Berk, A. J., and Baltimore, D. (1990). Transcriptional activation by Sp1 as directed through TATA or initiator: specific requirement for mammalian transcription factor IID. Proc Natl Acad Sci U S A 87, 4509-4513.

Tanese, N., Pugh, B. F., and Tjian, R. (1991). Coactivators for a proline-rich activator purified from the multisubunit human TFIID complex. Genes Dev 5, 2212-2224.

Walker, S. S., Reese, J. C., Apone, L. M., and Green, M. R. (1996). Transcription activation in cells lacking TAFIIs. Nature 383, 185-188.

Walker, S. S., Shen, W. C., Reese, J. C., Apone, L. M., and Green, M. R. (1997). Yeast TAF(II)145 required for transcription of G1/S cyclin genes and regulated by the cellular growth state. Cell 90, 607-614.

Weil, P. A., Luse, D. S., Segall, J., and Roeder, R. G. (1979). Selective and accurate initiation of transcription at the Ad2 major late promotor in a soluble system dependent on purified RNA polymerase II and DNA. Cell 18, 469-484.

Winston, F., Chaleff, D. T., Valent, B., and Fink, G. R. (1984). Mutations affecting Ty-mediated expression of the HIS4 gene of Saccharomyces cerevisiae. Genetics 107, 179-197.

Chapter 2

2 Interplay of TBP Inhibitors in Global Transcriptional Control

Work presented in this chapter was previously published as "Interplay of TBP inhibitors in global transcriptional control." Carmelata Chitikila, Kathryn L. Huisinga, Jordan D. Irvin, Andrew D. Basehoar, and B. Franklin Pugh. Mol Cell 10(4): 871-82 and is partially reprinted here with permission.

2.1 Summary

The TATA binding protein (TBP) is required for the expression of nearly all genes and is highly regulated both positively and negatively. I have used DNA microarrays to explore the genome-wide interplay of several TBP-interacting inhibitors in the yeast Saccharomyces cerevisiae. The findings suggest the following: The NC2 inhibitor turns down, but not off, highly active genes. Auto-inhibition of TBP through dimerization contributes primarily to transcriptional repression, even at repressive subtelomeric regions. The TAND domain of TAF1 plays a primary inhibitory role at very few genes, but its function becomes widespread when other TBP interactions are compromised. These findings reveal that transcriptional output is controlled in part by a collaboration of different combinations of TBP inhibitory mechanisms.

2.2 Introduction

Transcriptional control of gene expression involves a dynamic interplay of positively and negatively acting factors, with the relative dominance of one over the other dictating transcriptional output. Negative regulation is generally associated with promoter inaccessibility due to chromatin structure (reviewed in Struhl, 1999). However, loss of chromatin components, including histone H4, TUP1, or SIR3, has surprisingly modest effects on transcription (DeRisi et al., 1997; Wyrick et al., 1999). Since promoter regions are often intrinsically accessible to nuclear proteins (Mai et al., 2000; and references therein), there are likely to be additional general mechanisms, not based solely on chromatin structure, that negatively regulate transcription complex assembly.

Direct interactions of negative regulators with the general transcription machinery might contribute to transcriptional control. In particular, several factors target the TATA binding protein (TBP) in all eukaryotes, including NC2, TAF1, TBP itself, Mot1, Spt3, and the Not-Ccr4 complex. How these factors inter-relate to regulate transcription through TBP is not known. To begin understanding the regulatory network by which these factors function, I first focused on three well-characterized interactions in yeast: TBP/NC2, TBP/TAF1, and TBP dimerization.

NC2 is a heterodimer of two histone-fold subunits (Bur6/DRAP1 and NC2β/DR1). NC2 binds to and stabilizes TBP/TATA complexes in mobility shift assays, competitively inhibiting the association of TFIIA and TFIIB (Cang et al., 1999; Goppelt and Meisterernst, 1996; Kim et al., 1997; Mermelstein et al., 1996). NC2 occupies a region just under the TBP/TATA interface (see Figure 2.1), contacting DNA on both sides of TBP (Kamada et al., 2001). A domain of NC2 reaches up and contacts residues along the convex surface of TBP, and contributes to the steric occlusion of TFIIB. A yeast TBP mutation (F182V) along this interface disrupts TBP/NC2 interactions in vitro, and causes increased expression of a number of enhancer-less genes in vivo (Cang et al., 1999). NC2 overexpression selectively suppresses phenotypes associated with this mutation, providing further evidence that TBP (F182V) is primarily defective in NC2 interactions. In addition to acting as an inhibitor, NC2 plays a positive role in transcription, although its basis is not understood (Cang et al., 1999; Geisberg et al., 2001; Willy et al., 2000).

Figure 2.1. Interaction of TBP with regulatory factors. Structures of TBP interactions relevant to this study. Shown is the core of yeast TBP interacting with itself (Chasman et al., 1993; Nikolov et al., 1992), the Drosophila TAND I domain (Liu et al., 1998), and TATA DNA plus human NC2 (Kamada et al., 2001). The TFIIA•TBP•TATA•TFIIB structure is a composite of two structures (Nikolov et al., 1995; Tan et al., 1996). All views are from the same vantage point: upstream of the TATA box looking downstream. Selected amino acids are shown as stick diagrams. The relative affinity of each negative regulator (shown in red) for the relevant TBP mutants is shown below each diagram. Those in parentheses are not significantly different from each other.

TAF1 inhibits TBP/TATA interactions (Banik et al., 2001; Kokubo et al., 1998; Nishikawa et al., 1997). NMR analysis of the Drosophila TAF1 amino terminal domain I (TAND I) indicates that it engages in molecular mimicry of the TATA box (see Figure 2.1), occluding the concave DNA binding surface of TBP (Liu et al., 1998). The yeast TAND I region is smaller, poorly conserved, and functionally dependent upon an adjacent TAND II region (Kokubo et al., 1998; Kotani et al., 1998). Although yeast TAND I also appears to interact with the concave surface of TBP (Kokubo et al., 1998), the TBP residues involved in binding have not been fully identified. Deletion of the yeast TAND domain is expected to derepress transcription in vivo. Except in certain artificial situations (Cheng et al., 2002), this has not been observed. Therefore, it remains unresolved as to whether TAND is a negative regulator in vivo.

In the absence of DNA, the conserved core of TBP crystallizes as a dimer (see Figure 2.1), which occludes its DNA binding surface (Chasman et al., 1993; Nikolov et al., 1992). In vitro, this interaction appears to be weaker in yeast TBP than human TBP (Campbell et al., 2000; Coleman and Pugh, 1997; Coleman et al., 1995). The physiological relevance of TBP self-association in both yeast and humans is supported by in vivo crosslinking experiments and mutational studies (Jackson-Fisher et al., 1999; Taggart and Pugh, 1996). Mutations along the crystallographic concave DNA binding and dimerization surface (TBPEB series) inhibit TBP self-association to varying extents, while TATA binding is equally impaired (Jackson-Fisher et al., 1999). A strong correlation exists between dimer instability measured in vitro, and elevated basal (EB) transcription in yeast, as measured by β-galactosidase activity from a lacZ reporter gene. Consistent with the notion of autorepression, overexpression of wild type TBP does not lead to elevated basal transcription. Moreover, in a dose-dependent manner, TBP overexpression suppresses the elevated basal transcription caused by the dimerization-impaired TBPEB mutants, perhaps by driving unfavorable dimer formation with the TBPEB mutants via mass action (Jackson-Fisher et al., 1999). While TBP is subjected to auto-inhibition in vivo, it is not known how broadly this mechanism is utilized genome-wide in the context of the NC2 and TAND inhibitory mechanisms discussed above.

Since the concave surface of TBP has the potential to engage in multiple positive and negative interactions, such as with TATA, TBP, and the TAF1 TAND domain, mutations along the concave surface could affect more than one interaction. However, each of these interfaces of TBP with its target is chemically distinct, and therefore could elicit a characteristic phenotype in response to a series of TBP mutations along the concave surface. It might thus be possible to correlate specific patterns of gene expression with specific interactions defined biochemically.

To investigate the potential involvement of different repression mechanisms functioning through the concave surface of TBP, Jordan Irvin examined whether the previously characterized ‘EB’ mutations in this region affect binding to the TAF1 TAND domain. Using GST-TAND to pull down the TBPEB mutants, he found that the mutants interacted with the TAND domain over a range of affinities. These results are summarized in Figure 2.1. Interestingly, the ability of the mutants to interact with TAND did not show the same pattern observed when their ability to self-dimerize was examined (Jackson-Fisher et al., 1999; summarized in Figure 2.1). While the in vitro dimer stability showed a strong negative correlation with the level of basal transcription from a reporter gene in vivo, the ability of the TBPEB mutants to interact with TAND did not exhibit a correlation.

Next, a genome-wide study to examine the interplay of factors that interact with TBP’s concave surface (DNA, TAND, TBP homodimerization, and possibly others) to regulate transcription was initiated by Carmelata Chitikila and completed by myself. TAND contribution was assessed by comparing strains containing and lacking the TAND domain. Contributions from DNA binding and TBP dimerization were examined through use of the ‘EB’ mutants. These factors are distinguishable in that DNA plays a positive role and dimerization plays a negative role in regulating gene expression. In addition, NC2 interaction was examined using a TBP mutation (F182V) that abolishes NC2 interactions.

Through microarray analysis, we find that expression of approximately 40% of the yeast genome is sensitive to either mutations along TBP’s concave surface or a mutation that affects NC2 binding, particularly when the TAND domain of TAF1 is absent. The affected genes cluster into four major groups, which show distinct sensitivities to the various mutations. The first group of genes is highly expressed, and appears to be sensitive to TBP/DNA stability. Interestingly, expression of this group appears to be attenuated primarily by NC2 and partially by the TAF1 TAND domain. The second group is positively regulated by TAND, particularly when TBP/DNA interactions are compromised. The third group is negatively regulated by at least two seemingly redundant factors: the TAF1 TAND domain and an unidentified activity. The fourth group of genes is largely repressed. TBP dimerization and the TAF1 TAND domain contribute to their repression, with TBP dimerization playing the more predominant role. These repressed genes are found throughout the genome, but are particularly prevalent in the repressive subtelomeric environment. The findings presented here suggest that a large portion of the yeast genome is negatively regulated through TBP by a collaboration of different combinations of factors including NC2, TAF1, TBP dimerization, and others. Which combination is used might be dependent upon the expression level of the gene.

2.3 Results

2.3.1 Genome-wide Effects of ΔTAND and TBP Mutations

To examine the functional relationships among TBP regulatory interactions, Carmelata Chitikila and I performed DNA microarray analysis of TBP mutants in strains harboring either wild type TAF1 or a ΔTAND derivative (the experiments conducted by each of us are outlined in Table 2.1 and Figure 2.2). The strategy involved growing yeast cells in noninducing raffinose media, then briefly (45 min.) inducing an HA-tagged version of the TBP mutants by addition of galactose (Figure 2.2). This short window of time was intended to minimize potential indirect effects, where any initial changes in gene expression lead to subsequent changes in the expression of other genes. Since all strains harbor a chromosomal copy of the wild type TBP gene, only dominant effects will be observed.

Table 2.1. Number of genes affected by TBP mutations

Mutant          Down     Up       N

wild type TAF1
nullL              0      0    5831
null               3      1    5902
wild typeL        21      3    5912
V161RL           219    119    5925
V161R            233    141    5658
V161E            153      2    5736
N69R             273    143    5646
N69S              86     51    5604
V71R             317    236    5632
V71E              67     11    5341
F182V             29    483    5045

TAF1(ΔTAND)
nullL             74     58    5880
wild typeL        71     21    5913
V161RL           234    410    5834
V161R            266    536    5970
V161E            316    216    5784
N69R             233    409    5751
N69S             245    294    5632
V71R             289    591    5915
V71E             215    332    5958

The number of genes that went up or down using the filtering criteria described in the Materials & Methods. N is the number of genes that passed filter 1 (signal above background in both channels). The reference for all experiments was wild type with ‘null’ TBP. The ‘L’ subscripts indicate experiments performed by Carmelata Chitikila.

Figure 2.2. HA-tagged TBP is expressed after galactose induction. Strains harbored either wild type or TAF1(ΔTAND), and the indicated HA-tagged TBP derivatives under control of the GAL10 promoter. TBP expression was induced for 45 minutes, and equivalent numbers of cells were analyzed for TBP by immunoblotting (Jackson-Fisher et al., 1999). Purified recombinant his-TBP standards are shown.

Approximately 2400 genes (~40% of the genome) changed expression significantly in at least one mutant (Table 2.1), as determined by a series of statistical filters described in the Experimental Procedures, which were developed by Andy Basehoar. The significantly affected genes were clustered into four groups using a K-means algorithm. These four groups reflect distinct transcriptional responses to the TBP and TAF1 mutations (Figure 2.3 and Table 2.2). Several experiments provide a frame of reference. In columns 18 & 19 are two independent reference vs. reference data sets (null TBP in a wild type TAF1 strain) that are indicative of no change. Secondly, the V161R experiments (in both wild type TAF1 and ΔTAND strains) were repeated approximately a year apart by myself and Carmelata Chitikila (V161RK vs. V161RL, columns 4, 5, 11, and 12). When these repeats are compared, of the typically >5700 genes that passed filtering criteria 1 (see Materials & Methods), correlation coefficients of 0.9 and 0.8 for wild type and ΔTAND, respectively, were obtained, indicating a high degree of reproducibility. Many of the mutations caused both an increase and decrease in transcription. However, F182V (column 20) caused primarily an increase, while V161E (column 1) caused primarily a decrease. In total, the pattern of responses suggests that some of the mutations are affecting multiple functions of TBP, while others are selective.
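As an illustration of the reproducibility check described above, a replicate comparison of this kind can be sketched in Python. The text does not state which correlation statistic was used, so the sketch assumes a Pearson coefficient computed over the log2 ratios of genes present in both data sets. The function names, gene names, and values are invented placeholders, not data from these experiments.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep_a, rep_b):
    """Correlate log2 ratios over genes present in both replicates.

    rep_a, rep_b: dict mapping gene name -> log2 ratio. Genes that
    failed filtering are assumed to be absent from the dict.
    """
    shared = sorted(set(rep_a) & set(rep_b))
    return pearson([rep_a[g] for g in shared], [rep_b[g] for g in shared])


# Hypothetical values for illustration only.
rep_k = {"geneA": -1.2, "geneB": 0.4, "geneC": 2.1, "geneD": -0.3}
rep_l = {"geneA": -1.0, "geneB": 0.6, "geneC": 1.8, "geneE": 0.9}
r = replicate_correlation(rep_k, rep_l)
print(f"replicate correlation over shared genes: {r:.3f}")
```

Restricting the comparison to shared genes mirrors the statement that only genes passing filtering criteria 1 were compared.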

Table 2.2. Number of genes in each cluster

Group    Genes    % of Genome
1          526         9
2          466         8
3          268         4
4          652        11
5          446         7
Total     2358        38

Number of genes in each cluster for Figure 2.3.
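The percentages in Table 2.2 can be checked with a few lines of Python. The table does not state its denominator; the sketch below assumes it is the 6188 open reading frames amplified for the microarray (see Experimental Procedures), which is an assumption made here for illustration.

```python
# Assumption: genome size taken as the 6188 ORFs amplified for the array.
TOTAL_ORFS = 6188

# Cluster sizes as listed in Table 2.2.
cluster_sizes = {1: 526, 2: 466, 3: 268, 4: 652, 5: 446}


def percent_of_genome(n_genes, total=TOTAL_ORFS):
    """Return n_genes as a percentage of the assumed genome size."""
    return 100.0 * n_genes / total


percents = {k: round(percent_of_genome(v)) for k, v in cluster_sizes.items()}
total_genes = sum(cluster_sizes.values())
total_percent = round(percent_of_genome(total_genes))

for group, pct in percents.items():
    print(f"group {group}: {cluster_sizes[group]} genes, {pct}%")
print(f"total: {total_genes} genes, {total_percent}%")
```

Under this assumption the rounded values agree with every row of the table, including the total.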


Figure 2.3. Microarray analysis of TBP mutants in wild type and ΔTAND strains. Cluster analysis of gene expression profiles. Cluster and Treeview (Eisen et al., 1998) were used to cluster 2358 significant changes in gene expression (defined in Materials & Methods) initially into five clusters, using the K-means algorithm. Five clusters were initially chosen since clusters greater than five were visually redundant. Upon subsequent analysis, it became evident that two of the clusters represented similar mechanisms and so were merged to form group 1. Group 1 was then sorted by the values in the F182V column. Each column represents gene expression changes in the strain designated above each column. Names are colored to signify related mutations. Strains indicated by ΔTAND contained a deletion of the TAF1 TAND domain, while the remainder were isogenic wild types. Each row corresponds to an expression ratio for a single gene (Red = increased expression, Green = decreased expression, Black = no change, Gray = missing data). The intensity of color correlates to the magnitude of change. The collection of columns was subjected to hierarchical clustering using Cluster and Treeview and the resulting dendrogram is shown above the list of mutants. Experiments in columns 4, 11, 15-17, and 19 were performed by Carmelata Chitikila.
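For readers unfamiliar with K-means, the partitioning step named in this legend can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm, not the Cluster software that was actually used. The deterministic choice of the first k rows as starting centroids, the function names, and the toy expression profiles are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Partition rows (lists of log2 ratios) into k clusters.

    Starting centroids are simply the first k rows, which keeps the
    example deterministic; real tools usually use random starts.
    """
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each gene joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical log2 ratios for four genes across three mutants.
profiles = {
    "geneA": [-1.5, -1.2, 0.1],
    "geneB": [1.4, 1.6, 0.0],
    "geneC": [-1.3, -1.4, 0.2],
    "geneD": [1.5, 1.3, -0.1],
}
labels, centroids = kmeans(list(profiles.values()), k=2)
print(dict(zip(profiles, labels)))
```

In this toy case the two down-regulated genes fall into one cluster and the two up-regulated genes into the other.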


An important concern when interpreting gene expression data is assessing indirect effects. Indirect effects are presumed to arise, in this case, when a mutation directly affects the expression of arbitrary transcriptional activators and repressors, which subsequently cause increased and decreased expression of other genes that might not have been directly affected by the mutation. In an attempt to minimize potential indirect effects, gene expression levels were examined after a brief (45 min.) induction of the HA-tagged TBP mutants from a GAL promoter (Figure 2.2), which reaches near-maximum output at ~30 min (Pokholok et al., 2002). It is suspected that little time is available for the subsequent production of sufficient quantities of activators and repressors, which would then go on to elicit indirect effects of any significant magnitude. In addition, simultaneous increases and decreases in gene expression were not observed for a number of mutants, particularly V161E, indicating that the expression patterns are likely to be due to direct effects. Chromatin immunoprecipitation and LexA-fusion studies of similar mutations along TBP’s concave surface have provided further evidence that these mutants are acting directly on target promoters and engaging the transcription machinery (Geisberg and Struhl, 2000). Taken together, these observations indicate that the patterns of gene expression in Figure 2.3 are robust, and due largely to direct effects arising from disruption of distinct interfaces. Nonetheless, the gene expression patterns are likely to include some contribution from indirect effects.

Several general observations and inferences can be made from Figure 2.3. First, the large number of genes affected by the TBP mutations indicates that a substantial amount of global gene regulation occurs via regulation of TBP. Second, the TBPEB mutants behaved similarly but not identically, indicating that mutations along TBP’s concave surface alter certain interactions but not others. Third, since some TBPEB mutations caused both increased and decreased expression, TBP’s concave surface is likely to be engaged in both positive and negative interactions. Fourth, deletion of the TAND domain had very modest effects on transcription (column 15), unless coupled to defects along TBP’s concave surface (columns 8-14). This type of behavior is indicative of functional redundancy between the TAND domain and other factors that interact with the concave surface of TBP. Fifth, overexpression of wild type TBP caused <0.5% of the genome to significantly change expression. The concentration of TBP, per se, therefore is not limiting for gene expression.

2.3.2 Distinct Gene Expression Groups Reveal Combinatorial Interactions of TBP Regulators

Since the four expression groups shown in Figure 2.3 reflect distinct response patterns to the mutations, their underlying mechanisms were investigated by examining them separately. The data were examined in a number of ways as shown in Figures 2.4, 2.5, and 2.6. First, the mutants in each group (columns) were re-clustered separately using a hierarchical method. If multiple factors interact along TBP’s surfaces, then TBP mutations that disrupt certain interactions but not others might generate a characteristic transcriptional response. Comparison of this pattern with patterns of in vitro interactions of the mutants might shed light on the underlying mechanism. Second, the specific TAND dependency of representative mutants from each group was examined quantitatively to assess the influence of TAND on gene expression. Third, to examine whether particular regulatory mechanisms direct absolute output levels, the absolute expression level of genes in each group was examined. Models to aid in the discussion of the groups are shown in Figure 2.7. For simplicity, these models do not include other components of the transcription machinery, and they make no inference about TAF occupancy at the promoters.


Figure 2.4. Hierarchical clustering of mutants in individual groups. Clustering for each group of genes was performed as described in Figure 2.3. Shown are dendrograms derived from hierarchical clustering of mutants in groups 1 and 4. Portions of the dendrogram that encompass the TBPEB arginine mutants are boxed in yellow. ‘Δ’ indicates ΔTAND.


Figure 2.5. Dependency of selected groups of genes on the TAF1 TAND domain.

For the indicated groups of genes, log2 ratios of fold changes in gene expression for the indicated mutants in the ΔTAND strain were plotted as a function of the same mutants in the corresponding wild type strain. Two groups were plotted in each panel. TAND effects are reflected as deviations of the data points from the red diagonal.
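The comparison plotted in Figure 2.5 can also be summarized numerically as the offset of each gene from the diagonal. The sketch below is one simple way to do so and is not the analysis used to generate the figure. The function names and the example values are hypothetical.

```python
def tand_offsets(wt_log2, dtand_log2):
    """Per-gene vertical offset from the y = x diagonal.

    wt_log2, dtand_log2: dict mapping gene -> log2 fold change for the
    same TBP mutant in the wild type TAF1 and ΔTAND strains.
    A value of 0 means deleting TAND made no difference for that gene.
    """
    shared = sorted(set(wt_log2) & set(dtand_log2))
    return {g: dtand_log2[g] - wt_log2[g] for g in shared}


def mean_offset(offsets):
    """Average offset across genes; the sign gives the direction of the TAND effect."""
    return sum(offsets.values()) / len(offsets)


# Hypothetical genes that decrease in the wild type TAF1 strain and
# decrease less in the ΔTAND strain (partial suppression).
wt = {"g1": -2.0, "g2": -1.5, "g3": -1.0}
dtand = {"g1": -1.0, "g2": -1.0, "g3": -0.5}
offsets = tand_offsets(wt, dtand)
avg = mean_offset(offsets)
print(offsets, round(avg, 3))
```

A positive mean offset for genes that decrease in the wild type strain corresponds to points lying above the diagonal, that is, to partial suppression of the decrease when TAND is deleted.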


Figure 2.6. Expression level of various gene groups

Fold changes in gene expression (log2 Ratio) for representative mutants in each indicated group of genes were plotted as a function of log10 expression intensity in the reference state (null TBP in a wild type TAF1 strain). Intensities represent an average from twelve reference hybridizations. The entire nullL dataset is plotted in black to provide a frame of reference for the distribution of gene expression intensities. (A) All genes for the V161RK mutant in the ΔTAND strain. (B) Group 1, represented by the V161RK mutant in the wild type TAF1 strain, is plotted in the lower half of the panel (green). Group 4, represented by the V161RK mutant in the ΔTAND strain, is plotted in the upper half of the panel (red). (C) Group 2, represented by the V161RK mutant in the ΔTAND strain, is plotted in the lower half of the panel (green). Group 3, represented by the V161RK mutant in the ΔTAND strain, is plotted in the upper half of the panel (red).


Group 1 genes decreased in expression upon mutation of the concave surface of TBP (Figure 2.3, columns 1-7 of group 1). Hierarchical clustering of the mutants (columns) in group 1 indicated that all TBPEB mutations in the wild type TAF1 strain were having equivalent negative effects on TBP function (Figure 2.4, group 1). Previous analysis had shown that all six of these TBPEB mutants are similarly impaired for TATA binding, in vitro, but show large differentials in dimerization and TAND binding (Chitikila et al., 2002; Jackson-Fisher et al., 1999). Therefore, expression of this group of genes correlated best with TBP/DNA stability.

Deletion of the TAF1 TAND domain by itself had little effect on group 1 expression (Figure 2.3, column 15 in group 1), but partially suppressed the decreased transcription caused by the TBPEB mutations (Figure 2.5A, shown as a deviation of the green data points from the diagonal). Thus, the TAF1 TAND domain might play a negative role at group 1 genes, which only becomes detectable when TBP’s positive function (i.e., DNA binding) is compromised.

Strikingly, most of the genes in group 1 appear to be negatively regulated by NC2 since the F182V mutation leads to increased expression (Figure 2.4, column 20 in group 1). Inhibition by NC2 is detectable in the presence of TAND, whereas the reciprocal is not true (column 20 vs. 15). Therefore, NC2 appears to be a more predominant inhibitor than TAND at these genes. Interestingly, group 1 genes that are the most sensitive to the F182V mutation tended to be less sensitive to the TBPEB mutations. This suggests that NC2 stabilizes TBP at many group 1 promoters, despite playing a negative role.

Next, the absolute expression level of group 1 genes in the reference state (wild type TAND, ‘null’ TBP overexpression) was examined. Fold changes in gene expression for a representative group 1 mutant (V161R) were plotted as a function of absolute expression intensity (Figure 2.6B, green). As a guide for comparison, the distribution of expression intensities for all genes in a ‘null’ mutant was also plotted (black). Of the four groups, group 1 represented the most highly expressed set, clustering to the far right of the expression intensity profile, and having a median expression level ~50% higher than the next highest group (Figure 2.6C). Together, the group 1 data suggest that NC2 is an inhibitor of highly expressed genes (Figure 2.8, model 1).

Group 2 genes were also highly active (Figure 2.6C). The genes in this group were equivalently sensitive to mutations along TBP’s DNA binding surface (Figure 2.4). Unlike group 1 genes, TAND functioned positively on group 2 genes (Figures 2.3 and 2.5B), particularly when the DNA binding surface of TBP was compromised (Figure 2.8, model 2).

Group 3 genes appeared to be less active than groups 1 and 2 (Figure 2.6). Mutations along the concave surface of TBP, in general, led to increased transcription (Figure 2.3, columns 1-14 in group 3), suggesting that positive TBP/DNA interactions are not rate-limiting for these genes. TAND negatively regulated this group only in the presence of the TBPEB mutants, as evidenced by a leftward deviation from the diagonal of group 3 genes in Figure 2.5A (shown in red). All the TBPEB data sets derived from the ΔTAND strain appeared to be very similar (reflected by the shallow dendrogram branches in Figure 2.4, group 3). This is not the behavior expected from impaired TBP dimerization. V161R, N69R, and V71R display severe dimerization defects when compared to the other EB mutants (V161E, N69S, and V71E), and thus should cluster separately from them. Since this was not observed, it is likely that an additional unidentified negative regulator that interacts with TBP’s concave surface is in play (Figure 2.8, model 3), although other interpretations are not excluded.

Group 4 genes clustered furthest to the left in the intensity profiles (Figure 2.6B, group 4 shown in red). The median expression level of this group was 20% of the group 1 level, and thus appeared to be generally repressed or only weakly active. Mutations along the concave surface of TBP led to increased transcription in both the wild type and ΔTAND TAF1 strains (Figure 2.3, group 4). Therefore, positive TBP/DNA interactions do not appear to be rate-limiting for these genes. Like group 3, deletion of the TAND domain exacerbated the increased transcription of group 4 (Figure 2.3 and 2.5B, red), indicating that TAND is playing an inhibitory role. The transcriptional response from the V161R, V71R, and N69R mutants clustered very tightly (Figure 2.4, group 4), regardless of whether TAND was present, and showed a much greater transcriptional effect than V161E, V71E, and N69S. The pattern of response is very similar to the pattern of impaired dimerization displayed by these mutants in vitro (Jackson-Fisher et al., 1999). Therefore, TBP dimerization appears to be contributing significantly to the repression of group 4 genes (Figure 2.8, model 4).

2.3.3 Repressive Subtelomeric Regions are Intrinsically Accessible to the General Transcription Machinery

Subtelomeric regions as far as 15 kb from chromosomal ends tend to be quite repressive for resident genes (Aparicio et al., 1991; Grunstein, 1998; Kurtz and Shore, 1991; Kyrion et al., 1993; Loo and Rine, 1995; Zakian, 1996). Sir proteins direct subtelomeric silencing out to about 3-4 kb, but in regions out to ~15 kb, histone H4 and presumably other histones direct Sir-independent repression (Hecht et al., 1996; Wyrick et al., 1999). However, <10% of telomere-proximal genes are derepressed upon deletion of SIR3, and <50% are derepressed upon depletion of histone H4 (Wyrick et al., 1999). Therefore, many telomere-proximal genes might be subjected to other modes of repression.

Genes that belong to groups 3 and 4 appear to be weakly active or repressed and are sensitive to TBP inhibition, rather than being sensitive exclusively to chromatin structure. Therefore, it is expected that genes in these two groups would not be found in presumably chromatin-repressed subtelomeric regions. To address this, the frequency of group 3 and 4 genes was plotted as a function of distance from chromosomal ends (Figure 2.7). As a measure of the boundary of the repressive subtelomeric region, the frequency of genes in the lowest ten percentile of genome-wide expression intensity was also plotted. The weakly expressed genes of group 3 appeared to be generally absent from subtelomeric regions, as expected. Surprisingly, group 4 genes were quite prevalent, and were as frequent as the lowest ten percentile of expressed genes throughout the genome. Expression of as much as 30 percent of the genes in the subtelomeres appeared to be sensitive to negative interactions along TBP’s concave surface. These findings indicate that the repressive subtelomeric environment is intrinsically accessible to the general transcription machinery.

Figure 2.7. Subtelomeric frequency profile of group 3 and 4 genes. Shown is a composite profile of all 32 subtelomeric regions. The frequencies of nonrepetitive genes that increased in expression in a 50-gene window, tiled every 10 genes, were plotted as a function of their average distance from the telomere (Wyrick et al., 1999). Group 3 is shown in blue and group 4 in red. Also plotted (open circles) is the percentage of genes in the same 50-gene window that are in the lowest ten percentile of genome-wide expression levels.
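The windowing scheme described in this legend (a 50-gene window tiled every 10 genes) can be sketched as follows. The sketch assumes that genes from all subtelomeric regions have been pooled and sorted by distance from the nearest telomere, which is an interpretation of "composite profile" rather than a documented detail. The function name and the toy gene list are hypothetical.

```python
def window_profile(genes, in_group, window=50, step=10):
    """Frequency of group members in sliding windows of genes.

    genes: list of (name, distance_kb) tuples, sorted by distance
           from the telomere (all chromosome ends pooled).
    in_group: set of gene names belonging to the group of interest.
    Returns a list of (mean_distance_kb, percent_in_group) tuples,
    one per full window.
    """
    profile = []
    for start in range(0, len(genes) - window + 1, step):
        chunk = genes[start:start + window]
        mean_dist = sum(dist for _, dist in chunk) / window
        hits = sum(1 for name, _ in chunk if name in in_group)
        profile.append((mean_dist, 100.0 * hits / window))
    return profile


# Toy data: 70 genes spaced 0.5 kb apart; the 15 genes closest to the
# telomere are placed in the group of interest.
genes = [(f"g{i}", 0.5 * i) for i in range(70)]
group = {f"g{i}" for i in range(15)}
profile = window_profile(genes, group)
for mean_dist, pct in profile:
    print(f"{mean_dist:6.2f} kb  {pct:5.1f}%")
```

Because consecutive windows overlap by 40 genes, neighboring points in such a profile are not independent, which smooths the curve.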


2.4 Discussion

2.4.1 The Yeast Genome is Negatively Regulated in Part by a Variety of TBP Inhibitors

To investigate how TBP interactions contribute to the global gene regulatory network in yeast, a number of mutations in TBP were utilized that differentially affect TBP binding to DNA, NC2, the TAF1 TAND domain, and a second molecule of TBP. These mutants were briefly expressed in an otherwise wild type TBP strain, and effects on the expression of individual genes throughout the genome were examined using DNA microarrays. To more directly examine the potentially subtle contribution of TAF1’s N-terminal inhibitory domain (TAND), the TBP mutants in strains lacking the TAND domain were also examined.

The microarray data are interpreted within the context of established properties of these regulators. Thus, active genes generally have TBP bound to their promoters; inactive genes generally do not (Kuras and Struhl, 1999; Li et al., 1999). Therefore, transcriptional output generally can be interpreted as a reflection of TBP occupancy. When TBP is bound to active RNA polymerase II promoters, where tested, it invariably also has NC2 bound (Geisberg et al., 2001). NC2 stabilizes TBP/TATA interactions, and competitively inhibits TFIIB and TFIIA binding (Cang et al., 1999; Goppelt and Meisterernst, 1996; Kim et al., 1997; Mermelstein et al., 1996). When TBP is not bound to DNA, evidence suggests that its DNA binding surface is complexed with inhibitors, such as the TAND domain of TAF1 (Banik et al., 2001; Kokubo et al., 1998; Nishikawa et al., 1997), or a second molecule of TBP (Coleman and Pugh, 1997; Jackson-Fisher et al., 1999). All of these observations are well supported by crystallographic or NMR structures of these interactions (Figure 2.1). Furthermore, these physical structures have been validated using mutagenesis and interaction assays, in vitro and in vivo.

The main conclusion of this work is that expression of a substantial portion of the yeast genome is regulated in part by the concerted action of a variety of TBP inhibitors (Figure 2.8). In particular, NC2 attenuates transcriptionally active genes. The TAND domain of TAF1 has both positive and negative functions, but the cell does not fully depend upon these functions unless other TBP interactions are compromised. For repressed or lowly expressed genes, these results suggest that TBP dimerization plays a substantial repressive role. TBP self-association might keep an otherwise monomeric TBP from binding to repressed genes located within accessible chromatin (including the normally repressive subtelomeric environment). The results also indicate that an unidentified activity that functions through TBP’s concave surface also inhibits TBP, particularly when TAND function is absent.


Figure 2.8. Models for the interplay of TBP effectors in regulating the four groups of genes identified in Chapter 2. Positively acting functions are shown in green, and negative in red. The thickness of the black equilibrium arrows reflects the tendency of one interaction to dominate over another.

2.4.2 NC2 Attenuates Highly Active Genes

NC2 plays both a negative and positive role in transcription (Cang et al., 1999; Geisberg et al., 2001; Willy et al., 2000). Its histone fold domain binds to the bent DNA beneath the TBP/TATA complex (Kamada et al., 2001), and is required for NC2’s positive and negative function (Willy et al., 2000). Alpha helices H4 and H5 protrude from the core of NC2 (see Figure 2.1), and are required for NC2’s inhibitory activity (Willy et al., 2000). The H5 helix binds TBP’s convex surface and interacts with amino acid F182 on TBP, positioning H4 to block TFIIB access (Kamada et al., 2001). Consistent with the inhibitory function of H5, mutation of F182 to valine disrupts NC2 binding and causes an increase in transcription.


The properties of NC2 raise a number of questions about its seemingly paradoxical behavior. First, as a repressor, NC2 might be expected to operate at lowly expressed or repressed genes, not at highly active genes. If NC2 prevents essential transcription factors like TFIIA and TFIIB from assembling at a promoter, then how can genes that have NC2 bound at their promoters be actively transcribed? Second, how can a factor act negatively on one hand and positively on the other, particularly if the same structural interactions are involved in both? The latter question is of general interest because many transcription factors including TBP, TAF1, and NC2 play both positive and negative roles in transcription.

These apparent contradictions might be reconciled in the context of a model where transcriptional output is dictated not by an all-or-none binding of factors, but by the net effect of a dynamic and continuous interplay of positively and negatively acting factors. While components of the transcription machinery may be making similar interactions regardless of gene expression levels, the relative stability of these interactions may limit transcriptional output. In particular, dynamic competition between negatively-acting NC2 and positively-acting TFIIA and/or TFIIB (and hence RNA polymerase II holoenzyme) for binding to a TBP/DNA complex might limit transcriptional output at highly active genes. For other genes, where weak TBP/DNA interactions might be limiting transcriptional output, NC2 could make a net positive contribution by stabilizing the binding of TBP to DNA (in addition to being antagonistic to TFIIA/B in a non-rate-limiting way). Indeed, NC2 plays a positive role at TATA-less promoters (Willy et al., 2000). Interestingly, highly expressed genes that are inhibited by NC2 tend to be less sensitive to mutations along TBP’s DNA binding surface, which suggests that NC2 might be stabilizing TBP/DNA interactions while nonetheless inhibiting the expression of these genes. If NC2’s positive and negative contributions are mutually offsetting at some promoters, then NC2 might not appear to regulate these promoters despite being bound to them. Consistent with this, NC2 appears to be bound to all mRNA promoters tested that are also occupied by TBP (Geisberg et al., 2001). While TFIIB also appears to be bound at the same promoters, the two might not be bound at the same time, and could be in dynamic competition.

2.4.3 Multiple Inhibitory Interactions Along TBP’s Concave Surface Provide Redundant Mechanisms for Preventing Unregulated Transcription

In contrast to NC2’s modulation of the accessibility and stability of the TBP/DNA complex, TBP dimerization appears to serve a repressive role by keeping TBP off of the DNA (Figure 2.8, model 4). Genes that have accessible promoter regions are susceptible to being turned on when the dimerization function of TBP is eliminated through mutation, or by the positive action of transcriptional activators. The buffering effect of dimerization provides one explanation as to why overexpression of wild type TBP does not lead to increased gene expression.


The TAND domain of TAF1 exhibits limited inhibitory effects on transcription, which might be attributed to a number of mechanisms. First, some genes might not be regulated by TAF1 (i.e., are TAF-independent), and thus not sensitive to deletion of its TAND domain. Second, TBP must first dissociate from DNA before TAND I can bind TBP (Banik et al., 2001; Kokubo et al., 1998). If TFIIA, TFIIB and other factors stabilize TBP/DNA binding, particularly at highly active promoters, then the TAND domain cannot inhibit TBP binding. Mutations along the concave DNA binding surface of TBP that destabilize TBP/DNA interactions could lead to increased dissociation of TBP (manifested as a decrease in transcription) and increased susceptibility to the inhibitory action of the TAND domain (Figure 2.8, models 1, 3, and 4). Thus, for group 1 genes, deletion of the TAND domain partially suppresses mutations that impair TBP/DNA interactions (Figure 2.3). A third explanation for a lack of a dominant TAND effect is applicable to group 3 and 4 genes. For these genes, alternative TBP repressors (an unknown factor for group 3, and possibly dimerization for group 4) might dominate the repression of TBP that is not bound to DNA. Only in the context of mutations that destabilize these inhibitory interactions does the inhibitory function of the TAND domain affect transcriptional output. Thus, TAND’s potential as a negative regulator may be widespread, but largely redundant with other TBP inhibitors.


2.4.4 The Repressive Subtelomeric Environment is Accessible to the General Transcription Machinery

Histones and other chromosomal proteins are important negative regulators of gene expression. Subtelomeric regions are particularly repressive (Aparicio et al., 1991; Grunstein, 1998; Kurtz and Shore, 1991; Kyrion et al., 1993; Loo and Rine, 1995; Zakian, 1996). Active genes placed within these regions are often silenced. Silencing is due in part to Sir proteins, which are thought to generate an inaccessible heterochromatin structure emanating from the telomeres and extending inward about 3-4 kb along the chromosome (Hecht et al., 1996; Wyrick et al., 1999). Less than 10% of the telomere-proximal genes fall under Sir regulation (Wyrick et al., 1999). Sir-independent nucleosomal repression extends to about 15 kb, and is much more prevalent (Wyrick et al., 1999). Consistent with this, as much as 40% of the nonrepetitive genes near the telomeres are in the lowest ten percentile of genome-wide expression levels.

Surprisingly, as much as 30% of the genes in regions close to the telomeres were classified as group 4. Group 4 genes are characterized as being repressed or lowly expressed due in part to inhibition of TBP. The frequency of occurrence of group 4 genes within subtelomeric regions is four times higher than the genome-wide average of 7%, and is about the same as the frequency of the lowest ten percentile of genome-wide expression. This suggests that many repressed promoters in subtelomeric regions and throughout the genome are accessible to TBP/TFIID and the general transcription machinery. If repressive nucleosomes reside at these promoters, then accessibility might also require chromatin remodeling activities associated with TFIID and/or the general transcription machinery. The lowly expressed genes of groups 3 and 4 show a general sensitivity toward deletion of the TAF1 TAND domain, which might reflect a requirement for TFIID in delivering TBP to their promoters. If the core promoters are intrinsically accessible, then the repressive nature of the subtelomeres might be directed at steps upstream of TBP/TFIID recruitment such as preventing key gene-specific activator proteins from binding.

One study has suggested that the Sir-repressed HMRa1 promoter is occupied by TBP, and is repressed downstream of TBP recruitment (Sekinger and Gross, 2001). In these studies, the HMRa1 gene is unaffected by any of the TBP mutants, which is consistent with the notion that this promoter and many others are not regulated at the point of TBP access. However, the data do not distinguish whether HMRa1 is regulated before or after TBP recruitment.

Ultimately, sequence-specific activators control the expression of most genes. They do so by regulating promoter accessibility, factor recruitment, and competition between positive and negative regulators. Part of the activation process involves removal of TBP inhibitors. This might be directed by TFIIA, as has been shown for TBP dimerization, TBP/TAND, TBP/Mot1, and TBP/NC2 interactions (Auble et al., 1994; Cang et al., 1999; Chicca et al., 1998; Coleman et al., 1999; Kamada et al., 2001; Kim et al., 1997; Kokubo et al., 1998; Kotani et al., 2000; Mermelstein et al., 1996). Activators might also play a direct role in alleviating TBP repression. For example, c-jun interacts with the TAND domain of hsTAF1 to alleviate transcriptional repression (Lively et al., 2001). In this study, I have described the interplay of several inhibitors of TBP. There are likely to be multiple inhibitors at all stages of the gene activation process. Peeling back each layer of this complex network of regulation should help illuminate some of the underlying mechanisms governing gene regulation.

2.5 Experimental Procedures

Yeast Microarray Production

Amplification of 6188 open reading frames (99.4% coverage) of S. cerevisiae strain S288C was performed by Carmelata Chitikila and John Chicca as described at http://cmgm.stanford.edu/pbrown/. All PCR products were confirmed to be the correct size by gel electrophoresis. Microarray fabrication was performed on aminosilane glass slides at the Penn State University Microarray facility.

Culture Growth

The yeast plasmid shuffle strain Y13.2 (MATα ura3-52 trp1-Δ63 leu2,3-112 his3-609 Δtaf145 pYN1/TAFII145), and plasmids pRS314/TAFII145(WT) and pRS314/TAFII145(Δ10-73) were a gift from T. Kokubo (NIST, Japan) (Kokubo et al., 1998). Plasmid shuffling, performed by Carmelata Chitikila, was used to exchange the endogenous pYN1/TAFII145 plasmid with either the wild type or ΔTAND plasmids. Expression experiments were performed in these Y13.2 derivative strains that had freshly been transformed with pRS315 plasmids expressing galactose inducible TBP derivatives described previously (Jackson-Fisher et al., 1999). The reference strain for all experiments contained wild type TAF1 and a galactose-inducible null derivative of TBP (having a stop codon at position 1): pCALF-T(M1stop)(GAL). Starter cultures were grown initially in 5 mL of CSM-LEU-TRP + 3% raffinose for ~12 hrs at 30˚C, 300 rpm. These cultures were diluted to an OD600 = ~0.008 in 0.5 L fresh media and grown to OD600 = ~0.8 at 300 rpm, 30˚C. After a 45 min induction of TBP expression with 2% galactose, cell pellets were quickly harvested by centrifugation at 3500 rpm for 5 min in a GS3 rotor at room temperature. To minimize variability, no more than two cultures were harvested simultaneously. All reagents were at room temperature to avoid the induction of cold shock and stress genes. Cell pellets were washed with sterile RNase free water, transferred to nuclease free 15 mL tubes, recovered by centrifugation (room temperature, 3500 rpm, IEC clinical rotor) and frozen in liquid nitrogen. Samples for immunoblotting were taken before and after induction, and equivalent numbers of cells were analyzed (equivalent to 0.5 ml of OD 1.0).
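The culture quantities above imply a fixed number of population doublings and a fixed amount of cell material per culture, which can be worked out as follows. This is a back-of-the-envelope sketch: it assumes that OD600 scales linearly with cell number over this range, and the function names are illustrative.

```python
import math


def doublings(od_start, od_end):
    """Population doublings needed to grow from od_start to od_end,
    assuming OD600 is proportional to cell number."""
    return math.log2(od_end / od_start)


def od_units(volume_ml, od600):
    """Cell material expressed as mL x OD600 ("OD600 cell equivalents")."""
    return volume_ml * od600


n_doublings = doublings(0.008, 0.8)    # dilution to harvest density
culture_units = od_units(500.0, 0.8)   # 0.5 L culture at OD600 ~0.8
blot_units = od_units(0.5, 1.0)        # immunoblot sample

print(f"doublings from OD 0.008 to 0.8: {n_doublings:.2f}")
print(f"cell equivalents per culture:   {culture_units:.0f} OD600 units")
print(f"cell equivalents per blot lane: {blot_units:.1f} OD600 units")
```

A 100-fold increase in density corresponds to between six and seven doublings in raffinose medium before induction.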

Total RNA was extracted by the hot acidic phenol method (Holstege et al., 1998). Poly(A)+ RNA was isolated using Oligotex resin (Qiagen) according to the manufacturer’s instructions. It was then treated with DNase I as described (Ausubel et al., 1994), and stored in water at –80˚C. Poly(A)+ RNA (2 µg) was then reverse-transcribed with aminoallyl-dUTP, followed by incorporation of Cy3 or Cy5. Microarrays were scanned and quantitated with a GenePix 4000A scanner and GenePix 3.0 software (Axon Instruments). All experiments were performed in duplicate from independent transformants, in which the dyes for the reference and test samples were swapped.
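One common way to combine a dye-swapped pair of hybridizations is to express each array as a log2(test/reference) ratio and average the two, which cancels dye-specific bias to a first approximation. The sketch below illustrates that idea only; the normalization actually applied to these data is not described in this section. The assignment of Cy5 to the test sample on the first array, the function names, and the intensities are all assumptions.

```python
import math


def log2_ratio(test_signal, ref_signal):
    """log2 of test over reference signal for one gene on one array."""
    return math.log2(test_signal / ref_signal)


def dye_swap_average(array1, array2):
    """Average the test/reference log2 ratio over a dye-swapped pair.

    array1: test sample labeled with Cy5, reference with Cy3 (assumed).
    array2: labels swapped, so test is Cy3 and reference is Cy5.
    Each argument is a dict with background-corrected 'cy3' and 'cy5'
    intensities for a single gene.
    """
    m1 = log2_ratio(array1["cy5"], array1["cy3"])
    m2 = log2_ratio(array2["cy3"], array2["cy5"])
    return (m1 + m2) / 2.0


# Hypothetical intensities for one gene.
combined = dye_swap_average(
    {"cy5": 800.0, "cy3": 200.0},   # test/ref = 4, log2 = 2
    {"cy3": 600.0, "cy5": 300.0},   # test/ref = 2, log2 = 1
)
print(combined)
```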

Isolation of Total RNA (Hot Phenol Method)

Total RNA was extracted by the hot acidic phenol method (Holstege et al., 1998). Frozen cell pellets were resuspended in equal volumes of TES buffer (10 mM Tris-Cl (pH 7.5), 10 mM EDTA, 0.5% SDS) and hot acid phenol-chloroform-isoamyl alcohol (125:24:1, pH 4.7, Sigma P-1944; pre-warmed to 65˚C) at 0.015 mL/OD600 cell equivalents. After 1 hr of incubation in a 65˚C water bath with 20 sec of vigorous vortexing at 10 min intervals, the aqueous and organic phases were separated at 4°C, 8600g (7300 rpm) in a Sorvall RC5C GSA rotor. The aqueous phase was re-extracted sequentially with equal volumes of hot acid phenol-chloroform-isoamyl alcohol and chloroform-isoamyl alcohol (24:1, Sigma C-0549). After the final centrifugation step, the aqueous phase was ethanol precipitated at –20˚C with 0.1 volumes of 3 M NaOAc (pH 5.2) and 3 volumes of 100% ethanol. The precipitated nucleic acid was recovered at 8600g (4°C, 20 min), washed once with 70% ethanol and air dried. The RNA was dissolved in RNase-free water at 65˚C with periodic vortexing and the concentration estimated from OD260 absorbance. Typical yield for a wild type strain grown in selective media + raffinose/galactose at 30°C was ~30 µg/ml/OD600 cell equivalent. Aliquots were frozen at –80°C.
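As a rough worked example of the scaling above (a sketch, not part of the original protocol), the snippet below converts a culture volume and OD600 into OD600 cell equivalents, then into the extraction volume at 0.015 mL per equivalent and the expected total RNA at ~30 µg per equivalent. It assumes that one "OD600 cell equivalent" means 1 mL of culture at OD600 = 1.0; the function names are illustrative only.

```python
def od_equivalents(culture_ml, od600):
    """OD600 cell equivalents, assumed here to be culture volume (mL) x OD600."""
    return culture_ml * od600


def extraction_volume_ml(equivalents, ml_per_equivalent=0.015):
    """Volume of TES buffer (and, separately, of hot phenol) for the pellet."""
    return equivalents * ml_per_equivalent


def expected_total_rna_ug(equivalents, ug_per_equivalent=30.0):
    """Expected total RNA, using the typical wild type yield quoted in the text."""
    return equivalents * ug_per_equivalent


if __name__ == "__main__":
    eq = od_equivalents(500, 0.8)  # 0.5 L culture harvested at OD600 ~0.8
    print(eq, extraction_volume_ml(eq), expected_total_rna_ug(eq))
```

Under these assumptions, a 0.5 L culture at OD600 ~0.8 corresponds to 400 equivalents, about 6 mL each of TES buffer and phenol, and roughly 12 mg of total RNA.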

Purification of Poly(A)+ RNA


Poly(A)+ RNA was isolated from 3 mg of total RNA with 115 µL Oligotex resin suspension (Qiagen) following the manufacturer’s instructions. Samples were heated to 70°C for 6 min, then cooled to room temperature over 20 min with gentle resuspension of the resin every 3 min. Poly(A)+ RNA was eluted sequentially with 40 µL and 25 µL of Buffer OEB (Qiagen kit) preheated to 70°C. The absorbance of the pooled eluates was measured and the entire Poly(A)+ RNA sample was immediately treated with DNase I. Typical yield of Poly(A)+ RNA was ~1.5-2% of the input total RNA.

DNase I Treatment of Poly(A)+ RNA

To degrade any co-purified genomic DNA, the poly(A)+ RNA was DNase I treated in 10 mM Tris-Cl (pH 7.5), 5 mM MgCl2, 5 mM CaCl2, 1 mM DTT, 0.2 u/µL cloned RNase inhibitor, 1 u/µL RNase-free DNase I for 2 hrs at 37˚C. After extraction with TE-equilibrated phenol-chloroform, the poly(A)+ RNA was ethanol precipitated (0.1 volume 3 M NaOAc, pH 5.2, 3 volumes 100% ethanol) in the presence of 0.05 µg/µL glycogen carrier. The poly(A)+ RNA was recovered by centrifugation at 14000 rpm, 4°C, in a microcentrifuge. After a 70% ethanol wash, the poly(A)+ RNA was briefly dried under vacuum, dissolved in RNase-free water at 70°C and the concentration estimated from OD260 absorbance. The poly(A)+ RNA was frozen at –80°C until used in microarray hybridizations.


Cy3 and Cy5 Labeling and Hybridization

The indirect incorporation of Cy3 or Cy5 aminoallyl-dUTP into cDNA or genomic DNA, hybridization, post-hybridization washes and scanning are modifications of protocols previously described at http://www.microarrays.org. All steps involving the fluorescent dyes were carried out under low ambient light conditions using single-use dye aliquots at room temperature. Two µg each of poly(A)+ RNA from the reference strain (wild type TAF1, ‘null’ TBP) and test strains was reverse transcribed into cDNA incorporating the nucleotide analog aminoallyl dUTP (aa-dUTP, Sigma A-0410). Each 15 µL reaction contained oligo(dT) primers (5 µg, IDT), random primers (7 µg, Amersham Pharmacia Biotech) and poly(A)+ RNA denatured at 65°C for 10 min. After snap cooling on ice, the volume was increased to 30 µL with 1x First Strand Buffer (GIBCO Life Technologies), 1.25 mM dATP, 1.25 mM dCTP, 1.25 mM dGTP, 0.5 mM dTTP, 0.75 mM aa-dUTP, 10 mM DTT, 400 units Superscript II Reverse Transcriptase (GIBCO Life Technologies) and 5 units cloned RNase inhibitor (GIBCO Life Technologies). The reactions were incubated at 42°C for 2.5 hr in a water bath. The template RNA was degraded in 200 mM NaOH, 100 mM EDTA at 65°C for 15 min and then neutralized with 333 mM HEPES pH 7.4. Unincorporated nucleotides were separated from the reactions with 5 sequential washes with water and centrifugation (12000 rpm, 10 min) in Microcon YM-30 filters (Amicon, #42410). The concentrated cDNA was dehydrated under vacuum for ~1 hr at low heat.

Fluorescent Cy3- and Cy5-NHS ester dyes (Amersham Pharmacia Biotech PA23001, PA25001) were chemically coupled to the aa-dUTP cDNA prepared from the reference and test strains, respectively, in 0.05 M sodium bicarbonate buffer (pH 9.0) at room temperature in the dark for 1 hr. Hydroxylamine was added to a final concentration of 1.33 M to quench the coupling reaction and incubated for a further 15 min in the dark. Each individual reaction was purified with a single column from the QIAquick PCR purification kit (Qiagen) according to the manufacturer’s instructions. To minimize the co-purification of unincorporated dye, each reaction was washed 4 times with buffer PE and the labeled target was eluted twice sequentially with 30 µL of buffer EB. (If multiple reference samples were labeled for hybridization of multiple arrays, they were pooled at this step.) The appropriate pairs of labeled cDNA target (reference and test samples) were mixed, dehydrated under vacuum and dissolved in 16 µL of hybridization solution (5x SSC, 50% formamide (Sigma F-9037), 7 µg yeast tRNA (Sigma R-8508), 10 µg polyA RNA (Sigma P-9403), 0.2% SDS). The labeled cDNA mixture was denatured in a boiling water bath for 2 min and co-hybridized to a microarray under a lifter slip (Corning Scientific, No. 1, 22 mm2) in a watertight hybridization cassette (Corning, http://cmgm.stanford.edu/pbrown/). Hybridization was performed in a water bath for 12-16 hr at 42˚C.

Washed microarrays were dried, then scanned and quantitated with a GenePix 4000A scanner (Axon Instruments) and GenePix 3.0 image acquisition software at 532 nm (Cy3) and 635 nm (Cy5). A minimal number of scans were performed, adjusting the PMT settings on the two lasers until the ratio of the signal at all the spots was ~1.0 as displayed by the histogram tool. (The ‘Lines to Average’ in the ‘Hardware Settings’ was set to 2 and each scan was at 10 µm resolution.)


Statistical Filtering

The statistical filtering method was implemented in “R” with code written by Andrew Basehoar. All gene expression ratios from a single array of test vs. reference were normalized by mode-centering, which sets the peak of a smoothed frequency histogram of the log2 ratios to zero. This method of normalization assumes that the most frequent ratios reflect an unchanging population of mRNAs. Unlike the more common method of normalizing to total signal, mode-centering is insensitive to changes in gene expression. This includes asymmetric changes where only increases or decreases in gene expression are observed. Normalization by mode-centering was validated in every experiment by taking an equal number of test and reference cells (measured by OD600), and spiking in equal amounts of externally generated, unique polyadenylated mRNAs (B. subtilis LysA, PheB, ThrC, TrpE, DapB) (Holstege et al., 1998). The spiked controls were processed along with the total yeast mRNA, and hybridized to cognate features on the arrays. The Cy3/Cy5 ratios of the spiking controls were normalized using the same factor used to normalize the yeast mRNA data. In all cases, their ratios were within 10% of 1.0, which validates the assumption that the most frequent ratios reflect an unchanging population of mRNAs.
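The idea of mode-centering can be sketched as follows. This is an illustrative Python sketch, not the R code used for the thesis; the bin width, the triangular smoothing kernel, and the function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def mode_center(log2_ratios, bin_width=0.05, half_width=2):
    """Shift log2 ratios so the peak of a smoothed histogram sits at zero.

    bin_width and half_width (triangular smoothing kernel, in bins) are
    assumed values, not parameters taken from the thesis.
    """
    counts = Counter(math.floor(r / bin_width) for r in log2_ratios)

    def smoothed(b):
        # Triangular weights, e.g. 1-2-3-2-1 for half_width=2.
        return sum((half_width + 1 - abs(d)) * counts.get(b + d, 0)
                   for d in range(-half_width, half_width + 1))

    peak_bin = max(range(min(counts), max(counts) + 1), key=smoothed)
    mode = (peak_bin + 0.5) * bin_width  # centre of the peak bin
    return [r - mode for r in log2_ratios], mode


def spikes_within_tolerance(spike_log2_ratios, mode, tolerance=0.10):
    """True if every spiked control, normalized by the same factor,
    has a ratio within `tolerance` of 1.0."""
    return all(abs(2 ** (r - mode) - 1.0) <= tolerance
               for r in spike_log2_ratios)
```

Because only the location of the histogram peak is used, a long tail of strongly decreased genes moves the mean of the ratios but leaves the normalization factor unchanged, which is the property the text relies on.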

To assess the intrinsic error from all sources in the microarray experiments, we used 13 independently-derived reference vs. reference hybridizations to assess gene-specific and overall variation, when no changes were taking place. The standard deviation was uniform (~10%) throughout the entire dynamic range of all homotypic hybridizations.


The greater value of either the gene-specific or overall standard deviation was used to filter each gene for significant changes in expression. Typically, in any one experiment, >5700 genes gave measurable transcriptional output (passing criterion 1, below). Fold changes in gene expression were considered significant if they met all of the following filtering criteria: 1) Raw gene expression intensities were greater than one standard deviation above local background in both the test and reference samples in both replicates; 2) Ratios changed in the same direction in each replicate; 3) Log ratios in each replicate were greater than two standard deviations above 0.0; 4) P-values of the arithmetic average of the log2 ratios were <0.005; and 5) fold changes in gene expression were >1.5. We chose to use an arithmetic average of log values so as to blunt the effect of any potential large variations between the two replicate experiments. Applying these filters to independently derived homotypic hybridizations typically resulted in no genes being reported as falsely significant. The P-value cut-off of 0.005 was assigned arbitrarily.
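The five criteria can be expressed as a single per-gene test. The sketch below is a hedged Python illustration, not the original R implementation. Several details are assumptions: the `Replicate` record and its field names are hypothetical, one local background value is used for both channels, criterion 3 is read as an absolute deviation so that decreases are also captured, and the P-value is taken from a two-sided normal test of the mean log2 ratio (the text does not state which test was used).

```python
import math
from typing import NamedTuple, Sequence


class Replicate(NamedTuple):
    test: float           # raw test-channel intensity
    ref: float            # raw reference-channel intensity
    background: float     # local background (shared by both channels here)
    background_sd: float  # standard deviation of the local background
    log2_ratio: float     # mode-centred log2(test/reference)


def is_significant(reps: Sequence[Replicate], gene_sd: float,
                   overall_sd: float, p_cutoff: float = 0.005,
                   min_fold: float = 1.5) -> bool:
    sd = max(gene_sd, overall_sd)  # the greater of the two error estimates (must be > 0)
    ratios = [r.log2_ratio for r in reps]
    mean = sum(ratios) / len(ratios)  # arithmetic average of log2 values

    # 1) both channels more than one sd above local background, every replicate
    c1 = all(min(r.test, r.ref) > r.background + r.background_sd for r in reps)
    # 2) same direction of change in every replicate
    c2 = all(x > 0 for x in ratios) or all(x < 0 for x in ratios)
    # 3) each replicate more than two sd away from 0.0
    c3 = all(abs(x) > 2 * sd for x in ratios)
    # 4) P-value of the mean; the two-sided normal test is an assumption
    z = abs(mean) / (sd / math.sqrt(len(ratios)))
    c4 = math.erfc(z / math.sqrt(2)) < p_cutoff
    # 5) fold change greater than min_fold
    c5 = abs(mean) > math.log2(min_fold)
    return c1 and c2 and c3 and c4 and c5
```

Averaging in log space means that replicate ratios of 2.0 and 0.5 average to no change rather than to 1.25, which is one way to see how the log average blunts large disagreements between replicates.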


2.6 References

Aparicio, O. M., Billington, B. L., and Gottschling, D. E. (1991). Modifiers of position effect are shared between telomeric and silent mating-type loci in S. cerevisiae. Cell 66, 1279-1287.

Auble, D. T., Hansen, K. E., Mueller, C. G., Lane, W. S., Thorner, J., and Hahn, S. (1994). Mot1, a global repressor of RNA polymerase II transcription, inhibits TBP binding to DNA by an ATP-dependent mechanism. Genes Dev 8, 1920-1934.

Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K., eds. (1994). Current Protocols in Molecular Biology (John Wiley & Sons).

Banik, U., Beechem, J. M., Klebanow, E., Schroeder, S., and Weil, P. A. (2001). Fluorescence-based analyses of the effects of full-length recombinant TAF130p on the interaction of TATA box-binding protein with TATA box DNA. J Biol Chem 276, 49100-49109.

Campbell, K. M., Ranallo, R. T., Stargell, L. A., and Lumb, K. J. (2000). Reevaluation of transcriptional regulation by TATA-binding protein oligomerization: predominance of monomers. Biochemistry 39, 2633-2638.

Cang, Y., Auble, D. T., and Prelich, G. (1999). A new regulatory domain on the TATA-binding protein. EMBO J 18, 6662-6671.


Chasman, D. I., Flaherty, K. M., Sharp, P. A., and Kornberg, R. D. (1993). Crystal structure of yeast TATA-binding protein and model for interaction with DNA. Proc Natl Acad Sci U S A 90, 8174-8178.

Cheng, J. X., Nevado, J., Lu, Z., and Ptashne, M. (2002). The TBP-inhibitory domain of TAF145 limits the effects of nonclassical transcriptional activators. Curr Biol 12, 934-937.

Chicca, J. J., 2nd, Auble, D. T., and Pugh, B. F. (1998). Cloning and biochemical characterization of TAF-172, a human homolog of yeast Mot1. Mol Cell Biol 18, 1701-1710.

Chitikila, C., Huisinga, K. L., Irvin, J. D., Basehoar, A. D., and Pugh, B. F. (2002). Interplay of TBP Inhibitors in Global Transcriptional Control. Mol Cell.

Coleman, R. A., and Pugh, B. F. (1997). Slow dimer dissociation of the TATA binding protein dictates the kinetics of DNA binding. Proc Natl Acad Sci U S A 94, 7221-7226.

Coleman, R. A., Taggart, A. K., Benjamin, L. R., and Pugh, B. F. (1995). Dimerization of the TATA binding protein. J Biol Chem 270, 13842-13849.

Coleman, R. A., Taggart, A. K., Burma, S., Chicca, J. J., 2nd, and Pugh, B. F. (1999). TFIIA regulates TBP and TFIID dimers. Mol Cell 4, 451-457.

DeRisi, J. L., Iyer, V. R., and Brown, P. O. (1997). Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680-686.


Geisberg, J. V., Holstege, F. C., Young, R. A., and Struhl, K. (2001). Yeast NC2 associates with the RNA polymerase II preinitiation complex and selectively affects transcription in vivo. Mol Cell Biol 21, 2736-2742.

Geisberg, J. V., and Struhl, K. (2000). TATA-binding protein mutants that increase transcription from enhancerless and repressed promoters in vivo. Mol Cell Biol 20, 1478-1488.

Goppelt, A., and Meisterernst, M. (1996). Characterization of the basal inhibitor of class II transcription NC2 from Saccharomyces cerevisiae. Nucleic Acids Res 24, 4450-4455.

Grunstein, M. (1998). Yeast heterochromatin: regulation of its assembly and inheritance by histones. Cell 93, 325-328.

Hecht, A., Strahl-Bolsinger, S., and Grunstein, M. (1996). Spreading of transcriptional repressor SIR3 from telomeric heterochromatin. Nature 383, 92-96.

Holstege, F. C., Jennings, E. G., Wyrick, J. J., Lee, T. I., Hengartner, C. J., Green, M. R., Golub, T. R., Lander, E. S., and Young, R. A. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717-728.

Jackson-Fisher, A. J., Chitikila, C., Mitra, M., and Pugh, B. F. (1999). A role for TBP dimerization in preventing unregulated gene expression. Mol Cell 3, 717-727.


Kamada, K., Shu, F., Chen, H., Malik, S., Stelzer, G., Roeder, R. G., Meisterernst, M., and Burley, S. K. (2001). Crystal structure of negative cofactor 2 recognizing the TBP-DNA transcription complex. Cell 106, 71-81.

Kim, S., Na, J. G., Hampsey, M., and Reinberg, D. (1997). The Dr1/DRAP1 heterodimer is a global repressor of transcription in vivo. Proc Natl Acad Sci U S A 94, 820-825.

Kokubo, T., Swanson, M. J., Nishikawa, J. I., Hinnebusch, A. G., and Nakatani, Y. (1998). The yeast TAF145 inhibitory domain and TFIIA competitively bind to TATA-binding protein. Mol Cell Biol 18, 1003-1012.

Kotani, T., Banno, K., Ikura, M., Hinnebusch, A. G., Nakatani, Y., Kawaichi, M., and Kokubo, T. (2000). A role of transcriptional activators as antirepressors for the autoinhibitory activity of TATA box binding of transcription factor IID. Proc Natl Acad Sci U S A 97, 7178-7183.

Kotani, T., Miyake, T., Tsukihashi, Y., Hinnebusch, A. G., Nakatani, Y., Kawaichi, M., and Kokubo, T. (1998). Identification of highly conserved amino-terminal segments of dTAFII230 and yTAFII145 that are functionally interchangeable for inhibiting TBP-DNA interactions in vitro and in promoting yeast cell growth in vivo. J Biol Chem 273, 32254-32264.

Kuras, L., and Struhl, K. (1999). Binding of TBP to promoters in vivo is stimulated by activators and requires Pol II holoenzyme. Nature 399, 609-613.


Kurtz, S., and Shore, D. (1991). RAP1 protein activates and silences transcription of mating-type genes in yeast. Genes Dev 5, 616-628.

Kyrion, G., Liu, K., Liu, C., and Lustig, A. J. (1993). RAP1 and telomere structure regulate telomere position effects in Saccharomyces cerevisiae. Genes Dev 7, 1146-1159.

Li, X.-Y., Virbasius, A., Zhu, X., and Green, M. (1999). Enhancement of TBP binding by activators and general transcription factors. Nature 399, 605-609.

Liu, D., Ishima, R., Tong, K. I., Bagby, S., Kokubo, T., Muhandiram, D. R., Kay, L. E., Nakatani, Y., and Ikura, M. (1998). Solution structure of a TBP-TAF(II)230 complex: protein mimicry of the minor groove surface of the TATA box unwound by TBP. Cell 94, 573-583.

Lively, T. N., Ferguson, H. A., Galasinski, S. K., Seto, A. G., and Goodrich, J. A. (2001). c-Jun binds the N terminus of human TAF(II)250 to derepress RNA polymerase II transcription in vitro. J Biol Chem 276, 25582-25588.

Loo, S., and Rine, J. (1995). Silencing and heritable domains of gene expression. Annu Rev Cell Dev Biol 11, 519-548.

Mai, X., Chou, S., and Struhl, K. (2000). Preferential accessibility of the yeast his3 promoter is determined by a general property of the DNA sequence, not by specific elements. Mol Cell Biol 20, 6668-6676.


Mermelstein, F., Yeung, K., Cao, J., Inostroza, J. A., Erdjument-Bromage, H., Eagelson, K., Landsman, D., Levitt, P., Tempst, P., and Reinberg, D. (1996). Requirement of a corepressor for Dr1-mediated repression of transcription. Genes Dev 10, 1033-1048.

Nikolov, D. B., Hu, S. H., Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N. H., Roeder, R. G., and Burley, S. K. (1992). Crystal structure of TFIID TATA-box binding protein. Nature 360, 40-46.

Nishikawa, J., Kokubo, T., Horikoshi, M., Roeder, R. G., and Nakatani, Y. (1997). Drosophila TAF(II)230 and the transcriptional activator VP16 bind competitively to the TATA box-binding domain of the TATA box-binding protein. Proc Natl Acad Sci U S A 94, 85-90.

Pokholok, D. K., Hannett, N. M., and Young, R. A. (2002). Exchange of RNA Polymerase II initiation and elongation factors during gene expression in vivo. Mol Cell 9, 799-809.

Sekinger, E. A., and Gross, D. S. (2001). Silenced chromatin is permissive to activator binding and PIC recruitment. Cell 105, 403-414.

Struhl, K. (1999). Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98, 1-4.

Taggart, A. K., and Pugh, B. F. (1996). Dimerization of TFIID when not bound to DNA. Science 272, 1331-1333.


Willy, P. J., Kobayashi, R., and Kadonaga, J. T. (2000). A basal transcription factor that activates or represses transcription. Science 290, 982-985.

Wyrick, J. J., Holstege, F. C., Jennings, E. G., Causton, H. C., Shore, D., Grunstein, M., Lander, E. S., and Young, R. A. (1999). Chromosomal landscape of nucleosome-dependent gene expression and silencing in yeast. Nature 402, 418-421.

Zakian, V. A. (1996). Structure, function, and replication of Saccharomyces cerevisiae telomeres. Annu Rev Genet 30, 141-172.


Chapter 3

3 A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae

Work presented in this chapter was previously published as "A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae." Kathryn L. Huisinga and B. Franklin Pugh. Mol Cell 13(4): 573-85 and is reprinted here with permission.

3.1 Summary

TFIID and SAGA share a common set of TAFs, regulate chromatin, and deliver TBP to promoters. This chapter examines their relationship within the context of the Saccharomyces cerevisiae genome-wide regulatory network. I find that while TFIID and SAGA make overlapping contributions to the expression of all genes, TFIID function predominates at ~90% and SAGA at ~10% of the measurable genome. Strikingly, SAGA-dominated genes are largely stress-induced and TAF-independent, and are down-regulated by the coordinate action of a variety of chromatin, TBP, and RNA polymerase II regulators. In contrast, the TFIID-dominated class is less regulated, but is highly dependent upon TAFs including those shared between TFIID and SAGA. These two distinct modes of transcription regulation might reflect the need to balance inducible stress responses with the steady output of housekeeping genes.


3.2 Introduction

The regulation of eukaryotic genes involves the dynamic assembly and disassembly of the transcription machinery. Chromatin plays a central role in this process by restricting access to promoter DNA (Struhl, 1999). Chromatin modifying complexes regulate accessibility in part through modification of histone amino terminal tails. Deacetylated histone tails recruit repressors that block transcription complex assembly, whereas acetylated tails recruit bromodomain-containing complexes that enhance assembly. One of the key components of the general transcription machinery that is the target of chromatin regulation is the TATA binding protein (TBP). Chromatin modification and TBP loading are linked in that two complexes, TFIID and SAGA, possess both histone acetyltransferase (HAT) and TBP binding activities (Brownell et al., 1996; Grant et al., 1997; Mizzen et al., 1996; Nishikawa et al., 1997).

It is currently not clear whether SAGA and TFIID represent the only complexes responsible for delivering TBP to promoters, and whether they serve distinct purposes throughout the genome. Genome-wide expression studies indicate that the TAF1 subunit of TFIID and the Spt3 subunit of SAGA are necessary for the expression of only a small subset of genes (16% and 3%, respectively) (Holstege et al., 1998; Lee et al., 2000). These estimates are conservative since genes were required to be as dependent upon TAF1 or Spt3 as they are upon RNA polymerase II (pol II) in order to be classified as dependent upon TAF1 or Spt3. Thus, any gene that could utilize either TFIID or SAGA would by this criterion not be dependent on either one. Assessments that take into account partial dependency might provide a more accurate view of the genome-wide role of TFIID and SAGA.

TFIID and SAGA share several TAF subunits (Grant et al., 1998). Since elimination of any one of these shared subunits typically results in loss of expression of up to 70% of the genome, it has been suggested that some combination of both complexes is important for expression of 70% of yeast genes (Lee et al., 2000). These findings, however, could not exclude the possibility that shared TAFs are highly important for one complex but not the other, with the former playing a major genome-wide role. The finding that a strain lacking both TAF1 and the SAGA subunit Gcn5 has broader transcriptional defects than expected from the sum of their individual effects further suggests that TFIID and SAGA are functionally redundant (Lee et al., 2000). However, Gcn5 also functions outside of SAGA, in the form of the ADA complex (Eberharter et al., 1999). Thus, it remains possible that TFIID and ADA, rather than SAGA, serve redundant roles.

The presence of related activities and subunits in TFIID and SAGA raises the question as to whether they have identical or distinct roles within the cell. Here I have generated comparatively high resolution expression profiles of genome-wide dependencies on TFIID and SAGA, focusing in particular on the TBP-related activities of these complexes in ways not previously examined. By integrating a variety of data analysis methods, a statistical assessment of the contribution of TFIID and SAGA to genome-wide expression is made. Comparisons of these data with other microarray experiments provide insights into gene control by TFIID and SAGA that were not previously recognized. In this study I address the following questions: 1) Do TFIID and SAGA each contribute to the expression of all genes? 2) Do the combined functions of TFIID and SAGA account for all pol II transcription, or is some transcription independent of these complexes, suggesting a role for additional TBP complexes? 3) Does TFIID or SAGA preferentially regulate genes involved in specific cellular processes? 4) Is TFIID- and SAGA-mediated transcription regulated similarly, or is there preferential use of specific co-regulators? 5) Do TAF subunits, which are shared between SAGA and TFIID, function equivalently in the two complexes? 6) Is the bromodomain factor Bdf1, with its specificity for acetylated H4 tails, associated with TFIID function in vivo? 7) Does Gcn5 or its HAT activity play an important global role in gene expression?

3.3 Results

3.3.1 Determination of transcriptional effects via expression microarray analysis

To determine the genome-wide dependency on TFIID and SAGA, two sets of six isogenic strains were constructed in either taf1Δ gcn5Δ or taf1Δ spt3Δ strains, with both sets carrying either TAF1 or taf1ts2 on a plasmid. The taf1Δ gcn5Δ set also contained either GCN5, gcn5hat (possessing the KQL HAT mutation), or an empty vector (Table 3.1). The taf1Δ spt3Δ set contained either SPT3, spt3E240K (spt3-401), or an empty vector. Strains were rapidly shifted from 26˚C to 37˚C to inactivate taf1ts2, then harvested after 45 minutes. The 45 minute window provides sufficient time for taf1ts2 inactivation and global mRNA turnover, while minimizing but not eliminating indirect effects (Holstege et al., 1998). Genome-wide microarray analysis was performed to determine fold changes in gene expression for each mutant relative to the wild type strain. The complete data set can be found in Supplemental Table S3-1.

Table 3.1. Yeast strains used in this study

Strain(a)  Relevant genotype(b)  Plasmid 1(c)     Plasmid 2(d)  Doubling time(e)  RNA yield(f) (µg/OD600)
YJS6       wild type             pSW104-TAF1      pJW215        2.9               19
YJS8       taf1ts2               pSW104-taf1ts2   pJW215        3.0               17
YKH100     gcn5Δ                 pSW104-TAF1      pRS314        5.3               18
YJS7       gcn5hat               pSW104-TAF1      pKQL          5.9               16
YKH101     taf1ts2 gcn5Δ         pSW104-taf1ts2   pRS314        5.5               18
YJS9       taf1ts2 gcn5hat       pSW104-taf1ts2   pKQL          5.9               16

YKH105     wild type             pSW104-TAF1      pKH72         2.6               20
YKH108     taf1ts2               pSW104-taf1ts2   pKH72         3.1               19
YKH104     spt3Δ                 pSW104-TAF1      pRS314        5.4               21
YKH106     spt3E240K             pSW104-TAF1      pKH80         3.5               22
YKH107     taf1ts2 spt3Δ         pSW104-taf1ts2   pRS314        6.1               20
YKH109     taf1ts2 spt3E240K     pSW104-taf1ts2   pKH80         3.1               24

(a) All strains were derived from Y13.2 (MATα ura3-52 trp1-Δ63 leu2,3-112 his3-609 taf145Δ pYN1-TAF1) (Kokubo et al., 1998). Chromosomal GCN5 and SPT3 were deleted from the upper and lower set, respectively, by standard gene replacement methods using KanMx (Guldener et al., 1996), and verified by PCR.
(b) The relevant genotype was obtained by transforming Y13.2 derivatives with plasmids 1 and 2, and shuffling out pYN1-TAF1.
(c) Plasmids are pRS313-based (HIS3) (Walker et al., 1996).
(d) Plasmids are pRS314-based (TRP1). Plasmid pJW215 contains GCN5 (Howe et al., 2001), and pKQL contains gcn5K126A Q127A L128A (Wang et al., 1998). Plasmids pKH72 and pKH80 were constructed by subcloning an XhoI/EcoRI fragment containing either SPT3 or spt3-401 (E240K) from pCC1 or p3-401 (Eisenmann et al., 1992) into a pRS314 backbone.
(e) Doubling time (hours) in CSM-Trp-His media at 26˚C.
(f) Micrograms of total RNA isolated per ml per OD600 of cell culture, isolated after shifting cultures to 37˚C for 45 min. The standard deviation for these measurements is 12%. While strains harboring SAGA deletions grew slower, we observed no correlation between strain doubling time and total RNA or mRNA levels.


Previously, mutants of TFIID and SAGA were examined using Affymetrix arrays (Holstege et al., 1998; Lee et al., 2000). Here, spotted arrays of full-length genes are used. To assess the false positive rate in both studies, I plotted expression intensity from one set of replicates against a second set of replicates. As shown in Figure 3.1, the spotted arrays displayed less overall error (~13% vs. ~63% standard deviation), allowed more ORFs to be queried (~99% vs. ~85%), and had fewer false positives (~6 vs. ~600) using a two-fold cut-off criterion. The low error associated with the present study should provide a high accuracy assessment of changes in gene expression.
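A replicate-versus-replicate comparison of this kind can be summarized with a few lines of code. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, scatter is reported as the standard deviation of the log2 ratios (the text reports a percentage, whose exact definition is not given here), and a point counts as a false positive when two wild type replicate sets differ by more than the fold cut-off.

```python
import math
import statistics


def replicate_noise(rep_a, rep_b, fold_cutoff=2.0):
    """Compare two sets of wild type intensities, gene by gene.

    Returns (sd of log2 ratios, number of points beyond the cut-off,
    number of plottable points). Points without signal in either
    replicate are skipped as not plottable.
    """
    logs = [math.log2(a / b) for a, b in zip(rep_a, rep_b) if a > 0 and b > 0]
    limit = math.log2(fold_cutoff)
    false_positives = sum(1 for v in logs if abs(v) > limit)
    return statistics.stdev(logs), false_positives, len(logs)
```

Since no real change is expected between wild type replicates, every point beyond the cut-off in such a comparison is by construction a false positive.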


Figure 3.1. Noise associated with spotted and high density microarrays. Fold changes in gene expression are typically determined from two independent test cultures when compared to two independent reference cultures. To display the scatter associated with two replicates, we combined the raw expression intensities from two wild type strains and compared them against their combined replicates. (A) Data were generated from strain YKH105 (wild type) using spotted arrays of full-length genes. (B) Data were generated from strains YSW87 and Z579 (wild type) using high density short-oligonucleotide arrays, and downloaded from http://staffa.wi.mit.edu/cgi-bin/young_public/factor.cgi?gene=TAF145&s=0 and http://staffa.wi.mit.edu/cgi-bin/young_public/normdata.cgi?gene=RPB1&s=0. Lines represent 2 and 4-fold changes. N is the number (percentage) of plottable data points, which are data that have a signal above background. The reported standard deviations are typical of other data sets from the same study.


3.3.2 TFIID and SAGA each contribute to the expression of nearly all genes

To examine bulk behavior of mRNA populations, I generated a smoothed frequency distribution of fold changes in gene expression for all strains (Figure 3.2A). As expected, independent wild type strains displayed a tight distribution centered over zero (no change). As an indicator of the maximum expected decrease (leftward shift of the profile) for a 45 minute shutdown of genes transcribed by pol II, the frequency distribution expected from the temperature-sensitive rpb1-1 pol II mutant is plotted (Wang et al., 2002). The broadening of the distribution for rpb1-1 presumably reflects differential mRNA half-lives.
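The expected rpb1-1 profile follows from first-order mRNA decay: if transcription stops completely, the fraction of a transcript remaining after time t is e^(-kt), so the expected log2 fold change is log2(e^(-kt)), as given in the legend to Figure 3.2. A minimal Python sketch of that calculation is shown below; the function names are illustrative, and decay rates are assumed to be expressed per minute.

```python
import math


def decay_rate_from_half_life(half_life_min):
    """First-order decay constant k (per minute) from a half-life in minutes."""
    return math.log(2) / half_life_min


def expected_log2_change(k_per_min, t_min=45.0):
    """log2(e^(-k*t)), written as -k*t/ln(2) to avoid underflow for large k*t."""
    return -k_per_min * t_min / math.log(2)
```

Under this model the log2 change equals minus the number of half-lives elapsed, so a transcript with a 45 min half-life drops 2-fold and one with a 15 min half-life drops 8-fold over the 45 min window. Gene-to-gene differences in half-life therefore spread the distribution, consistent with the broadening noted above.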


Figure 3.2. Genome-wide expression profiles of GCN5, SPT3, and TAF1 mutants. (A) Fold changes in gene expression (log2 scale) were binned in 0.05 increments on a percentage scale using Kaleidagraph 3.6 software. The binned data were plotted as a scatter plot and fit to a smoothed or interpolated curve. The horizontal “error” bars indicate the peak position of the two replicates. The rpb1-1 data were calculated using the formula y = log2(e^(-kt)), where ‘k’ is the mRNA decay rate for each gene obtained from Wang et al. 2002, and ‘t’ = 45 min. (B) Cluster plot of selected mutants. Data for YJS6 (wild type), YKH100 (gcn5Δ), YKH101 (taf1ts2 gcn5Δ), YJS8 (taf1ts2), YKH108 (taf1ts2), YKH107 (taf1ts2 spt3Δ), YKH104 (spt3Δ), and YKH105 (wild type) are shown in columns 1-8, respectively. Rows represent individual ORFs (open reading frames). Fold changes in gene expression (log2 scale) were clustered using Cluster software and visualized with Treeview (Eisen et al., 1998). 6004 ORFs are present in the clusters. Membership required a log2 absolute value of >0.76 in at least one experiment and data in four of the eight clustered experiments. Data were clustered using the K-means algorithm. K was set at 2, since higher K values gave visually minor variations of the K=2 pattern. Fold changes in gene expression are reflected in color intensity, with red, green, black, and gray reflecting increase, decrease, no change, and no data, respectively.
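The membership rule for the cluster plot can be written as a small predicate. This Python sketch is an illustration only, not the Cluster software's own filter; it assumes missing measurements are encoded as None or NaN, and reads "data in four of the eight" as at least four.

```python
import math


def is_cluster_member(log2_values, min_abs=0.76, min_present=4):
    """Apply the stated membership rule to one ORF's row of log2 fold changes.

    The row has one entry per clustered experiment (eight in Figure 3.2B).
    """
    present = [v for v in log2_values
               if v is not None and not math.isnan(v)]
    has_enough_data = len(present) >= min_present
    has_change = any(abs(v) > min_abs for v in present)
    return has_enough_data and has_change
```

For example, a row with four measured values of which one is -0.9 qualifies, whereas a row with eight measured values all within ±0.76 does not.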

The gcn5Δ strain displayed a general and uniform decrease in the level of most mRNAs, with the peak of the distribution shifting to the left of wild type by 1.8 fold (Figure 3.2A upper panel, and Table 3.2). Over 60% of the genes in the gcn5Δ strain decreased expression by >4 standard deviations (~1.7 fold) when compared to the wild type distribution. This shifted distribution profile indicates that Gcn5 makes a uniform positive, albeit modest, contribution to the expression of most genes. The frequency distribution for gcn5hat did not change significantly, indicating that the HAT activity of Gcn5 plays either a redundant or a minor role in regulating the bulk of the genome. The base of the distribution was broader for gcn5hat, possibly reflecting gene-specific effects. The more pronounced dependence of the genome on Gcn5 as a whole, rather than its HAT activity, suggests that other functions of Gcn5 may be more important than its acetylase activity.


Table 3.2. Fold changes in gene expression in TFIID and SAGA mutants.

Strain   Genotype            Peak(a)  N(b)   >4σ Down(c)  >4σ Up(c)  % Down  % Up
YJS6     wild type           1.0      6089   11           8          ~0      ~0
YKH100   gcn5Δ               1.8      5690   3608         3          63      ~0
YJS7     gcn5hat             1.2      5455   448          75         8       1
YJS8     taf1ts2             2.4      5620   4711         32         84      <1
YKH101   taf1ts2 gcn5Δ       3.2      5713   5380         26         94      <1
YJS9     taf1ts2 gcn5hat     3.3      5484   4999         22         91      <1

YKH105   wild type           1.0      5837   6            1          ~0      ~0
YKH104   spt3Δ               1.2      5832   614          28         11      <1
YKH106   spt3E240K           1.0      5522   43           48         1       1
YKH108   taf1ts2             1.7      5762   2793         88         48      2
YKH107   taf1ts2 spt3Δ       3.9      5804   5623         7          97      ~0
YKH109   taf1ts2 spt3E240K   1.8      5651   3310         107        59      2

(a) Fold change in gene expression (non-log scale) at the peak of the distribution shown in Figure 3.2A.
(b) The number of ORFs having measurable expression.
(c) A gene was determined to have significantly changed its expression if the absolute value of its log2 ratio changed by >4σ from the wild type distribution. By these criteria, the cut-off value was 0.76 (1.7 fold change in gene expression). The standard deviation reflects all sources of error associated with the ratios for the wild type data sets, which are expected to have no change.
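The >4σ columns of Table 3.2 can be reproduced in outline as follows. This is a hedged Python sketch rather than the analysis code used for the thesis: the function name is hypothetical, and σ is estimated here as the population standard deviation of the wild type log2 ratios, which is one plausible reading of the table footnote.

```python
import statistics


def classify_4sigma(mutant_log2, wild_type_log2, n_sigma=4.0):
    """Count genes whose log2 ratio lies beyond n_sigma wild type sd.

    Returns (cutoff, n_down, n_up, percent_down, percent_up), with
    percentages taken over the genes measurable in the mutant.
    """
    sigma = statistics.pstdev(wild_type_log2)
    cutoff = n_sigma * sigma
    n_down = sum(1 for v in mutant_log2 if v < -cutoff)
    n_up = sum(1 for v in mutant_log2 if v > cutoff)
    n = len(mutant_log2)
    return cutoff, n_down, n_up, 100.0 * n_down / n, 100.0 * n_up / n
```

A wild type σ of 0.19 log2 units gives the cut-off of 0.76 quoted in the footnote, which corresponds to a 2^0.76 ≈ 1.7-fold change. The percentage columns follow directly from the counts; for gcn5Δ, 3608 of 5690 measurable ORFs is 63%.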

The distribution of fold changes in gene expression in spt3Δ and spt3E240K strains did not differ substantially from wild type (Figure 3.2A, lower panel), indicating that Spt3 plays either a small or redundant role at most genes. A leftward tail was apparent on the spt3Δ distribution, indicating that a small subset of genes (11% of the genome) is particularly dependent upon the positive function of Spt3.

The taf1ts2 mutant shows no overt phenotype at the permissive temperature (26˚C), but rapidly stops growing at the restrictive temperature (37˚C) (Walker et al., 1996). Consistent with this mutant being very tight, the frequency distribution of fold changes in gene expression for taf1ts2 at the permissive temperature (dotted grey line) was essentially identical to the distribution of wild type TAF1 (Figure 3.2A, upper). Shifting the taf1ts2 strain to 37˚C for 45 minutes resulted in a leftward shift of the distribution by as much as 2.4 fold (Figure 3.2A). Approximately 84% of the genome in the YJS8 taf1ts2 strain decreased expression by >4 standard deviations when compared to the wild type distribution (Table 3.2). The population shift was not as severe as expected for rpb1-1 (median decrease of 4.4 fold by 45 min.), indicating that the genome-wide dependence on TAF1 was not absolute.

I next examined the distribution of fold changes in gene expression when subunits of both SAGA and TFIID were mutated. In contrast to the limited effect of spt3Δ and taf1ts2 individually, the distribution for the taf1ts2 spt3Δ double mutant shifted to the left by four fold, to a level equivalent to that of rpb1-1 (Figure 3.2A, lower panel). Approximately 97% of the measurable genes in the taf1ts2 spt3Δ double mutant decreased expression by >4 standard deviations when compared to the wild type distribution (Table 3.2), indicating a broad requirement for TAF1. This increased to >99% when all mutants are considered. The nearly complete shutdown of transcription in the taf1ts2 spt3Δ double mutant when compared to the single mutants has two important implications. First, it indicates that both Spt3 and TAF1 contribute to the expression of nearly all measurable genes, particularly in the absence of the other factor. This is the first demonstration that Spt3 function is partially redundant with either TAF1 or some component of TFIID. Second, the findings suggest that there are unlikely to be additional complexes that are functionally redundant to the transcriptionally relevant activity shared between TFIID and SAGA, otherwise the combined transcriptional dependence on Spt3 and TAF1 would have been only partial.

The population distributions for taf1ts2 gcn5Δ and in particular taf1ts2 gcn5hat shifted by the same magnitude as calculated for rpb1-1 (Figure 3.2A, upper). This is more than expected from the sum of the individual taf1ts2 and gcn5hat mutants. The enhanced sensitivity to gcn5hat in the taf1ts2 mutant suggests that the HAT activity of Gcn5 becomes important when TAF1 is absent, and suggests that TAF1 is associated with a physiologically important HAT activity that is functionally redundant with Gcn5’s HAT function. Our findings confirm similar conclusions drawn previously (Lee et al., 2000), and extend them by narrowing the redundancy to the HAT activity rather than some other potential activity shared between TFIID and SAGA or ADA.

3.3.3 TFIID dominates at ~90% of all genes, while SAGA dominates at ~10%

Cluster plot analysis provides an efficient means of comparing and grouping multiple data sets. In Figure 3.2B, spt3Δ, gcn5Δ, taf1ts2 single and double mutants are compared. Two major groups were identified using a K-means algorithm to cluster the data. The largest group, comprising ~90% of the measurable genome, generally showed a greater dependency on TAF1 than on Spt3 (compare columns 5 and 7). This cluster of genes is referred to throughout this chapter as the TFIID-dominated group. The smaller cluster, comprising ~10% of the genome, generally displayed a greater dependency on Spt3, and was largely TAF1-independent. This set of genes is referred to as the SAGA-dominated group. Gcn5-dependency appeared to contribute little to the clustering pattern, consistent with its uniform genome-wide role. It is striking that TFIID (TAF1) appears to be less active and perhaps slightly inhibitory at the SAGA-dominated genes, becoming important only when Spt3 is removed (compare columns 5-7).
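The grouping of genes into two clusters can be sketched with a minimal K-means routine. This is an illustration under stated assumptions rather than the procedure used for Figure 3.2B: Euclidean distance, random initial centers, and the two-column toy profiles (log2 fold change in taf1ts2 and in spt3Δ) are all hypothetical choices made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100, seed=0):
    """Plain K-means: returns one cluster label per row and the centers."""
    rng = random.Random(seed)
    centers = [list(row) for row in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each gene joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centers[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical profiles: (log2 change in taf1ts2, log2 change in spt3Δ).
profiles = [
    (-2.0, -0.1), (-1.8, 0.0), (-2.2, -0.2),   # TAF1-dependent pattern
    (0.1, -1.9), (0.0, -2.1),                  # Spt3-dependent pattern
]
labels, centers = kmeans(profiles, k=2)
```

In this toy input the first three genes end up in one cluster and the last two in the other, mirroring the TFIID-dominated and SAGA-dominated groups. The numeric label given to each cluster is arbitrary and depends on the initial centers.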

Based upon this clustering profile, genes were designated as either TFIID-dominated or SAGA-dominated (Supplemental Table S3-1), which necessarily reflects their forced classification under one set of conditions. In an alternative environment, some genes might switch classification. Genes that were obviously regulated by both TFIID and SAGA were designated as such. Genes in the lowest 5th percentile of expression intensity were designated as “no call” since they are likely to be turned off. Genes that produce highly stable mRNA could not be tested for TFIID dependency since their mRNA levels are expected to be unaffected by short term inactivation of TAF1. These genes were also regarded as “no call”.
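The low-expression “no call” rule can be sketched as a percent-rank filter. The sketch below covers only the expression-intensity criterion, not the mRNA stability criterion, and it assumes that percent rank is the percentage of genes with an intensity less than or equal to that of the gene in question. The gene names, intensities, and function names are hypothetical.

```python
import bisect


def percent_rank(values):
    """Percentage of values that are less than or equal to each value."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect.bisect_right(ordered, v) / n for v in values]


def low_expression_no_calls(intensities, percentile=5.0):
    """Return the genes whose intensity lies in the lowest percentile range."""
    ranks = percent_rank(list(intensities.values()))
    return {gene for gene, rank in zip(intensities, ranks) if rank <= percentile}


# Twenty hypothetical genes with intensities 1.0 through 20.0.
intensities = {f"g{i}": float(i) for i in range(1, 21)}
no_calls = low_expression_no_calls(intensities)
```

With twenty genes, only the single weakest gene has a percent rank of 5 or below, so it alone is set aside as a “no call”.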

To assess whether genes designated as SAGA- or TFIID-dominated are bound by these complexes in vivo, comparisons were made to 37 genes whose binding of SAGA or TFIID subunits has been previously assessed by in vivo formaldehyde crosslinking and chromatin immunoprecipitation (ChIP) assays. In Table 3.3, these genes were grouped according to whether they were generally TFIID- or SAGA-regulated, based upon their fit to expression data and ChIP comparisons presented here. Of the 19 genes regarded as TFIID-regulated, all 19 were classified by the clustering algorithm as TFIID-dominated. Eighteen bound TFIID at high levels (i.e., high TAF1/TBP ratio) in ChIP assays. Of the 18 genes regarded as SAGA-regulated/TAF-independent, 12 of them were classified as SAGA-dominated, 2 were classified (probably incorrectly) as TFIID-dominated, and 4 were a “no call”. All four of these “no calls”, however, are regulated similarly to SAGA-dominated genes when other expression data are considered (see below). Overall, there was a remarkable correspondence between the published ChIP data and our SAGA/TFIID classification, verifying that this classification in most cases is a direct consequence of SAGA and TFIID occupancy. It is striking that many of the SAGA-regulated but not TFIID-regulated genes in Table 3.3 are involved in a variety of stress responses (e.g. heat, osmotic, sporulation, and carbon, nitrogen, or phosphorus starvation).

Table 3.3. Comparison of TATA/TATA-less classification with ChIP data(a)

Gene      SAGA/TFIID Class(a)   Expression (percentile)(b)   ChIP(c)                    Ref.

TFIID regulated
ACT1      TFIID                 98                           high TFIID                 1,2
ARF1      TFIID                 96                           high TFIID                 2
EFT2      TFIID                 96                           high TFIID                 2
RPL2B     TFIID                 97                           high TFIID                 3
RPL5      TFIID                 96                           high TFIID                 2
RPL9A     TFIID                 99                           high TFIID                 2
RPL18B    TFIID                 96                           high TFIID                 3
RPL19B    TFIID                 99                           high TFIID                 3
RPL25     TFIID                 94                           high TFIID                 1
RPL26A    TFIID                 88                           high TFIID                 3
RPS5      TFIID                 97                           high TFIID                 1
RPS8A     TFIID                 92                           high TFIID                 2
RPS11A    TFIID                 91                           high TFIID                 3
RPS11B    TFIID                 99                           high TFIID                 3
RPS13     TFIID                 93                           high TFIID                 3
RPS22B    TFIID                 99                           high TFIID                 3
RPS30     TFIID                 91                           high TFIID                 1
TRP3      TFIID                 89                           high TFIID                 2
VTC3      TFIID                 74                           low TFIID*                 4

SAGA regulated / TAF-independent
ADH1      SAGA                  72                           low TFIID*                 1-3
AHP1      SAGA                  97                           high SAGA (Spt20)          4
ARG1      SAGA                  94                           high SAGA (Spt7)           5
BDF2      SAGA                  71                           low TFIID*                 3
CTT1      SAGA                  54                           low TFIID                  2
FBA1      SAGA                  97                           low TFIID                  2
GRE2      SAGA                  56                           high SAGA (Spt20)          4
PGK1      SAGA                  98                           low TFIID                  1,2
SSA4      SAGA                  79                           low TFIID                  2
TDH3      SAGA                  95                           low TFIID                  2
PHO84     SAGA/TFIID            35                           low TFIID*                 3
PHO5      SAGA/TFIID            79                           high SAGA (Ada2)           7
PYK2      TFIID                 59                           low TFIID                  2
SED1      TFIID                 64                           low TFIID                  1
HSP12     no call               99                           low TFIID                  2
HSP104    no call               97                           low TFIID                  2
SSA3      no call               38                           low TFIID                  2
GAL1      no call (glucose)     3                            low TFIID (galactose)*     1
                                                             high SAGA (Spt3)           6

(a) As determined in this study (see Supplemental Table S3-1). HSP12, HSP104, and SSA3 are ‘no call’ because their mRNAs are very stable, and thus did not decay appreciably during TAF1 inactivation. However, these genes appear to be co-regulated in a manner similar to other SAGA-dominated genes, and thus are likely to be regulated by SAGA. For ‘SAGA/TFIID’ the expression data indicate that both SAGA and TFIID regulate these genes. Since experiments were performed in glucose, where the GAL genes are not substantially expressed, no call was made on the GAL1 gene, although this is clearly a SAGA-regulated gene (Bhaumik and Green, 2001; Larschan and Winston, 2001).
(b) Percent rank of mRNA level determined in this study (see Table S3-1).
(c) Based upon the indicated reference. Shown in parentheses for ChIP analysis is the immunoprecipitated subunit. For TFIID, several TAFs were used. VTC3 was included in the TFIID class since this classification was more consistent with the expression data (presented here) and its designation as TATA-less (Basehoar et al., 2004).
* SAGA is required for TBP binding to these genes (Bhaumik and Green, 2002).
References: 1 (Li et al., 2000); 2 (Kuras et al., 2000); 3 (Mencia et al., 2002); 4 (Bhaumik and Green, 2002); 5 (Proft and Struhl, 2002); 6 (Swanson et al., 2003); 7 (Larschan and Winston, 2001); 8 (Barbaric et al., 2003).

3.3.4 Stress-induced genes tend to be SAGA-dominated while stress-repressed genes tend to be TFIID-dominated

SAGA, unlike TFIID, is nonessential and plays a predominant regulatory role at a small fraction of the genome, which raises the question as to its physiological purpose. To address this question, the general properties of SAGA- and TFIID-dominated genes found in public microarray data sets were examined. First those genes that are particularly sensitive to environmental changes (e.g., various stresses) were compared. These genes were divided into two subsets, one including genes that were most up-regulated and another for those most down-regulated during an environmental change. The percentage of genes in each subset that belonged to the SAGA-dominated group described in the previous section was determined. A summary of the findings is presented in Table 3.4 (Environment, rows 1-11), and a more complete listing can be found in Supplemental Table S3-2. An unbiased population is expected to have 9-10% of its genes in the SAGA-dominated class, as is observed throughout the entire genome. Percentages significantly above this range reflect a bias toward SAGA regulation, whereas values significantly below reflect a bias toward TFIID regulation. An advantage of this type of analysis is that noisy data will tend toward the genome-wide average of 9-10%, and thus have low significance.
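The logic of testing a gene group for bias away from the genome-wide SAGA-dominated fraction can be sketched with an exact binomial tail probability. This is only one plausible test and is an assumption of the example: the statistical procedure behind the P-values in Table 3.4 is described in the Experimental Procedures, so the numbers produced here are not expected to reproduce the tabulated values. The function names are hypothetical. The group size (283) and observed percentage (42%) are taken from row 1 of Table 3.4, and the expected fraction (9.2%) from footnote c of that table.

```python
import math


def log_binom_pmf(k, n, p):
    """Natural log of the Binomial(n, p) probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def binom_upper_tail(k, n, p):
    """P(X >= k): chance of at least k SAGA-dominated genes in an unbiased group."""
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k): chance of at most k SAGA-dominated genes in an unbiased group."""
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(0, k + 1))


genome_fraction = 0.092             # expected SAGA-dominated fraction
group_size = 283                    # stress response set, induced genes
observed = round(0.42 * group_size) # about 42% were SAGA-dominated

p_enriched = binom_upper_tail(observed, group_size, genome_fraction)
p_unbiased = binom_upper_tail(26, group_size, genome_fraction)  # ~9% of 283
```

A group close to the genome-wide average gives an unremarkable tail probability, whereas a group with 42% SAGA-dominated genes gives a vanishingly small one. This is the sense in which noisy data sets, which drift toward 9-10%, carry low significance.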

Strikingly, genes that are commonly up-regulated during general environmental stress (Causton et al., 2001; Gasch et al., 2000), including heat, oxidation, acidity, DNA damage, carbon or nitrogen starvation, and unfolded proteins, or during sporulation were strongly biased (P-values < 10^-30) toward being SAGA-dominated (Table 3.4, rows 1-5). In contrast, genes that are down-regulated during general environmental stress were biased toward being TFIID-dominated (P-value ~10^-10, row 7). This distinct behavior of the stress-induced and stress-repressed genes, as well as the ChIP relationships in Table 3.3, suggests that SAGA might be particularly geared for turning on genes that respond to stress. TFIID, on the other hand, might be more involved in regulating housekeeping genes, many of which are down-regulated during stress. Since not all stress-induced genes were SAGA-dominated, it is likely that TFIID also contributes to turning on some environmental stress response genes.

Table 3.4. Percent of factor-sensitive genes that are SAGA-dominated

Row(a)  Factor                          N(b)       % SAGAdom(c)  p-value

        None                                       9

Environment
  Increased expression during environmental stress
1       Stress response set             283        42            2 x 10^-83
2       Amino acid starvation           289        34            1 x 10^-49
3       Excess unfolded protein         298        30            4 x 10^-35
4       Diauxic phase                   288        42            3 x 10^-82
5       Sporulation                     298        29            2 x 10^-33
6       Sporulation (ndt80Δ)            299        11            2 x 10^-01
  Decreased expression during environmental stress
7       Stress response set             585        2             2 x 10^-10
8       Amino acid starvation           295        6             4 x 10^-02
9       Excess unfolded protein         188        12            1 x 10^-02
10      Diauxic phase                   3044       6             2 x 10^-07
11      Sporulation                     308        5             6 x 10^-03

Histone tail modifications
  Under-acetylated
12      H4 intergenic                   1051       16            9 x 10^-13
13      H4 coding                       1235       10            2 x 10^-01
14      H3 intergenic                   1050       11            1 x 10^-02
15      H3 coding                       1235       10            5 x 10^-01
  Over-acetylated
16      H4 intergenic                   1036       5             2 x 10^-05
17      H4 intergenic (rpd3Δ)           258        23            6 x 10^-15
18      H4 coding                       1219       6             3 x 10^-04
19      H4 coding (rpd3Δ)               295        14            5 x 10^-03
20      H3 intergenic                   1040       7             3 x 10^-02
21      H3 intergenic (hda1Δ)           250        18            1 x 10^-06
22      H3 coding                       1218       6             1 x 10^-05
23      H3 coding (hda1Δ)               297        19            8 x 10^-09

Negatively regulated or slightly inhibited/independent (*) gene groups(d)
  Chromatin regulators
24      Hda1                            261        30            6 x 10^-32
25      Rpd3                            299        27            6 x 10^-27
26      Tup1 (Ssn6-Tup1)                305        25            2 x 10^-22
27      Ssn6 (Ssn6-Tup1)                1223       20            9 x 10^-41
28      Gcn5 HAT (SAGA & ADA)           247   *    38            1 x 10^-63
29      Histone H3 tail (1-28)          731        14            3 x 10^-06
30      Histone H4 tail (2-26)          291        34            5 x 10^-52
31      Histone H2A.Z                   60         67            1 x 10^-53
32      Snf2 (SWI/SNF)                  417        33            6 x 10^-61
33      Rsc30 (RSC)                     299        4             1 x 10^-03
  TBP regulators
34      Mot1                            304        36            1 x 10^-56
35      Bur6 (NC2)                      225        49            3 x 10^-81
36      TAF1 (TFIID)                    156   *    63            4 x 10^-105
37      TAF1 TAND (TFIID)               289   *    41            9 x 10^-78
38      TAF2 (TFIID)                    169   *    31            2 x 10^-21
39      TAF5 (TFIID & SAGA)             155   *    34            6 x 10^-24
40      TAF6 (TFIID & SAGA)             168   *    50            3 x 10^-73
41      TAF9 (TFIID & SAGA)             100   *    69            5 x 10^-71
42      TAF10 (TFIID & SAGA)            76    *    62            3 x 10^-38
43      TAF12 (TFIID & SAGA)            210   *    47            1 x 10^-87
44      Bdf1 (TFIID)                    289        46            4 x 10^-106
  RNA polymerase II holoenzyme
45      Srb10 (mediator)                219        42            7 x 10^-65

Positively regulated gene groups
  Stress-response gene-specific activators
46      Msn2, Msn4 set                  136        49            6 x 10^-59
  Chromatin regulators
47      H3 tail (1-28)                  193        20            2 x 10^-07
48      H4 tail (2-26)                  2996       5             6 x 10^-16
49      Gcn5 HAT (SAGA/ADA)             273        10            5 x 10^-01
50      Sir4                            135        50            3 x 10^-59
  TBP regulators
51      Spt3 (SAGA)                     179        30            3 x 10^-27
52      Spt20 (SAGA)**                  333        26            5 x 10^-25
53      TAF1 (TFIID)                    2011       2             1 x 10^-40
54      TAF2 (TFIID)                    1730       6             9 x 10^-07
55      TAF3 (TFIID)**                  338        6             6 x 10^-02
56      TAF4 (TFIID)**                  343        4             2 x 10^-03
57      TAF5 (TFIID & SAGA)             1065       7             1 x 10^-04
58      TAF6 (TFIID & SAGA)             1869       3             3 x 10^-19
59      TAF7 (TFIID)**                  342        1             8 x 10^-07
60      TAF9 (TFIID & SAGA)             1140       2             5 x 10^-26
61      TAF10 (TFIID & SAGA)            961        7             2 x 10^-09
62      TAF11 (TFIID)**                 343        3             3 x 10^-05
63      TAF12 (TFIID & SAGA)            1614       4             1 x 10^-12
64      TAF13 (TFIID)**                 340        3             1 x 10^-04
65      Bdf1 (TFIID)                    3133       4             6 x 10^-24
  RNA polymerase II holoenzyme
66      Srb5 (mediator)                 127        28            1 x 10^-13
67      Med2 (mediator)                 164        46            8 x 10^-59

(a) Row identifier in Supplemental Table S3-2, including Pubmed reference.
(b) Number of genes in each group. The Environmental Stress Response set reflects the group of genes defined by Gasch et al. (2000) that respond to a wide variety of stresses. See Experimental Procedures for additional details.
(c) Percentage of genes in the defined group that were designated as SAGA-dominated. A full data set is expected to have a genome-wide distribution of 9.2%.
(d) Data are derived from constitutive deletion of the indicated nonessential factor or after a 45 min. temperature inactivation for temperature-sensitive alleles of essential factors.
* Denotes gene groups that were slightly inhibited or independent of the factor (i.e. in the upper 5th percentile but not meeting the 0.76 cut-off).
** Denotes gene groups where a simple 2-fold ratio cut-off was used, as this was the available data set.

What advantage might SAGA provide in the environmental stress response that TFIID does not? It is possible that SAGA allows for a more rapid response. However, SAGA-dominated stress response genes are induced with the same kinetics as TFIID-dominated stress induced genes, indicating that SAGA does not provide a quicker response (Figure 3.3). SAGA-dominated genes do display a larger fold-induction and a greater expression intensity relative to TFIID-dominated genes, indicating that SAGA and/or associated regulators might provide a greater range of expression than TFIID and/or its associated regulators.

Figure 3.3. Induction timecourse of those environmental stress response genes that are induced and SAGA- or TFIID-dominated. Data from Gasch et al., 2000 were plotted for all ESR induced genes separated into SAGA- or TFIID-dominated categories based upon the classification presented in Chapter 3.


3.3.5 Genes having highly acetylated histone H4 tails tend to be TFIID-dominated

Acetylation of H3 and H4 tails is generally, but not always, associated with transcriptional activation (Bernstein et al., 2002; Deckert and Struhl, 2001; Wu and Grunstein, 2000). Acetylated histone tails are recognized by bromodomains, which are found in a variety of complexes including SAGA and TFIID. In particular, bromodomain factor Bdf1, which associates with TFIID, binds acetylated histone H4 tails (Jacobson et al., 2000; Ladurner et al., 2003; Matangkasombut and Buratowski, 2003). Therefore, histone tail modifications might help specify the recruitment of TFIID.

The relative levels of acetylated H3 and H4 tails have been determined on a genome-wide scale under relatively low stress conditions (YPD media, 30˚C) (Bernstein et al., 2002; Robyr et al., 2002). Low stress favors under-utilization of SAGA-dominated genes. Analysis of these data shows that intergenic regions that were the most under-acetylated at H4 displayed a biased association with SAGA-dominated genes (Table 3.4, row 12), whereas H4 over-acetylated regions were biased toward being TFIID-dominated (row 16). A similar relationship was not observed with intergenic H3 acetylation (rows 14 and 20). These results suggest that the acetylation state of histone H4 tails differentially contributes to the regulation of SAGA- and TFIID-dominated genes.

Individual deletions of histone deacetylases Hda1 and Rpd3 lead to general and specific increases in histone acetylation of H3 and H4 (Bernstein et al., 2002; Bernstein et al., 2000; Kadosh and Struhl, 1998; Rundlett et al., 1998; Vogelauer et al., 2000). Strikingly, in hda1Δ and rpd3Δ strains, genes that are over-acetylated at H3 and H4 were biased toward being SAGA-dominated (rows 17 and 21). Thus, Hda1 and Rpd3 appear to be playing greater roles in keeping acetylation levels lower at SAGA-dominated genes than at TFIID-dominated genes, a notion that is consistent with their role in down-regulating SAGA-dominated genes under low stress conditions (discussed below).

SAGA-dominated genes are highly regulated compared to TFIID-dominated genes

The general classification of yeast genes into a stress-related SAGA-dominated class and a housekeeping TFIID-dominated class raises the question as to whether other transcription regulatory proteins function largely with one class or the other. Consistent with the notion that histone deacetylation is particularly tailored for down-regulating SAGA-dominated genes, genes that increase expression the most upon loss of Hda1 and Rpd3 were biased toward being SAGA-dominated (Table 3.4, rows 24 and 25).

The Ssn6-Tup1 repressor complex interacts with deacetylated histone H3 and H4 tails, and is largely associated with transcriptional repression (Deckert and Struhl, 2001; Edmondson et al., 1996; Watson et al., 2000; Wu et al., 2001). If SAGA-dominated genes tend to be down-regulated by histone deacetylation, then Ssn6-Tup1 might play an important role at these genes. Consistent with this possibility, genes that are inhibited the most by Tup1 or Ssn6 tended to be under-acetylated and SAGA-dominated, compared to the genome-wide average (Table 3.4, rows 26 and 27, and Wu et al., 2001). Taken together, these findings suggest that the combined action of histone deacetylation and Ssn6-Tup1 binding down-regulates SAGA-dominated genes more than TFIID-dominated genes.

Srb10 is a kinase component of the pol II holoenzyme mediator complex and appears to play several distinct roles in preventing stress responses. First, Srb10 inhibits pol II through phosphorylation of pol II’s C-terminal domain (Hengartner et al., 1998). Second, Srb10 phosphorylates a number of stress response activators including Msn2, Ste12 and Gcn4, promoting their nuclear exclusion and/or turnover (Chi et al., 2001; Nelson et al., 2003). Third, Srb10 has been implicated in repression by Ssn6-Tup1 (Kuchin and Carlson, 1998). Based upon these findings we examined whether SAGA-dominated genes were particularly sensitive to Srb10 regulation. Genes that are most inhibited by Srb10 displayed a strong bias toward being SAGA-dominated (Table 3.4, row 45), thereby implicating Srb10 as a negative regulator of SAGA-dominated genes. Genes regulated by the stress activators Msn2 and Msn4 were also examined, and found to be strongly biased toward being SAGA-dominated (row 46). These combined findings support mounting evidence for a coordinated stress response pathway that is up-regulated by gene-specific activators like Msn2/4 and others, and down-regulated by Srb10-directed phosphorylation as well as histone de-acetylation.

Two TBP regulators, NC2 and Mot1, have been implicated in the stress response and target many of the same genes (Andrau et al., 2002; Dasgupta et al., 2002). NC2, which is composed of the Bur6 and Ydr1 subunits, attenuates transcription by binding to a TBP/promoter complex and inhibiting the recruitment of TFIIA and TFIIB (Cang et al., 1999; Goppelt and Meisterernst, 1996; Kim et al., 1997; Mermelstein et al., 1996). Mot1 uses the energy of ATP hydrolysis to dissociate TBP from DNA, and can act on a TBP/DNA/NC2 complex (Auble et al., 1994; Darst et al., 2003). While it is clear that NC2 and Mot1 target TBP, it is not known whether these inhibitors are directed at genes regulated by TFIID, SAGA, both, or neither. To address this, those genes that were most negatively regulated by Bur6 (NC2) or Mot1 were examined. As indicated by rows 34 and 35 in Table 3.4, this set of genes displayed a strong bias toward being SAGA-dominated. Therefore, the inhibitory activities of NC2 and Mot1 appear to function largely in the context of SAGA rather than TFIID, tying these factors to the same general stress response pathway as chromatin-directed regulators.

3.3.6 TAFs make a greater positive contribution at TFIID-dominated genes than at SAGA-dominated genes

TAFs are subunits of TFIID, but a subset of TAFs are also present in SAGA (Grant et al., 1998). Strikingly, genes that are largely independent of many of the TAFs displayed a strong bias toward being SAGA-dominated (Table 3.4, rows 36-43). In contrast, genes that are positively regulated by TAFs were biased toward the TFIID-dominated class, regardless of whether they are also present in SAGA (rows 53-64). This suggests that TAFs are more important for TFIID than for SAGA, and that TAF-independent promoters are likely to be SAGA-dominated. A particularly strong bias toward TFIID was apparent with TAF6, TAF9, and TAF12 (rows 58, 60, and 63), which form part of a histone-like octamer structure in both TFIID and SAGA (Selleck et al., 2001), suggesting that the octamer structure might be particularly important for TFIID function.

As expected, genes positively regulated by SAGA-specific subunits Spt3 and Spt20 were biased toward SAGA regulation (rows 51 and 52). Genes positively regulated by Gcn5 or its HAT activity did not display any significant bias toward SAGA regulation (row 49), which suggests that Gcn5’s main global function might lie outside of SAGA.

3.3.7 Bdf1 and histone H4 tails are linked to TFIID regulation

One of the more uncertain subunits of TFIID is Bdf1. The Bdf1 homolog in higher eukaryotes is encoded by the C-terminal half of TAF1, which is missing in yeast TAF1 (Matangkasombut et al., 2000). Bdf1 is abundant in yeast cells but is present at sub-stoichiometric levels in TFIID preparations (Matangkasombut et al., 2000; Sanders et al., 2002). Since Bdf1 can bind promoter regions in the apparent absence of other TAFs, it might have functions apart from TFIID. Bdf1 contains two bromodomains, which have been implicated in binding acetylated lysines on histone H4 tails (Jacobson et al., 2000; Ladurner et al., 2003; Matangkasombut and Buratowski, 2003). Therefore, one function of Bdf1 might be to tether TFIID to nucleosomes acetylated at H4. Indeed, as shown earlier, H4 over-acetylation is associated with TFIID function. Genes that are positively regulated by Bdf1 were significantly biased toward being TFIID-dominated (Table 3.4, row 65), which is consistent with its role as a component of TFIID. Very few of the large number of transcriptional regulators examined displayed any bias toward TFIID regulation. Those that did were nearly all TAFs. Therefore, Bdf1 stands out as being more TAF-like than other regulators, which is consistent with it being a functional component of TFIID.

Interestingly, unlike the general behavior of TAFs, many genes are negatively regulated by Bdf1, and a high proportion of these genes are in the SAGA-dominated group (row 44). ChIP experiments demonstrate that Bdf1, but not TAFs, directly binds to these genes (Ladurner et al., 2003; Matangkasombut and Buratowski, 2003). If Bdf1 elicits its activity through acetylated H4 tails, then this interaction might inhibit SAGA-dominated genes by antagonizing the interaction of other positively acting bromodomain complexes with acetylated H4 tails.

The comparisons presented here are consistent with the idea that acetylated H4 tails are important for TFIID recruitment through Bdf1. To explore this concept further, microarray data sets derived from deletions of amino acids 2-26 of H4 and amino acids 1-28 of H3 (Sabet et al., 2003) were examined. This showed that genes that are most positively regulated by H4 tails were biased toward being TFIID-dominated (Table 3.4, row 48), whereas genes most negatively regulated by H4 tails were biased toward being SAGA-dominated (row 30). When interpreted in the context of genome-wide acetylation data (rows 12-23), this striking relationship suggests that acetylated H4 tails contribute positively to TFIID-directed transcription, and under-acetylated H4 tails are particularly inhibitory toward SAGA-dominated genes. Consistent with the genome-wide acetylation data, deletion of H3 tails did not generate the same bias as deletion of H4 tails (rows 29 and 47). Therefore, H4 tail acetylation might be particularly important for recruiting TFIID.

3.3.8 SAGA-dominated genes are coordinately regulated

The data presented thus far suggest that SAGA-dominated genes are subjected to a greater repertoire of regulation than TFIID-dominated genes. However, it is not clear if the regulators associated with the SAGA-dominated genes coordinate their activities on the same set of genes, or whether they target different subsets. To address this, cluster analysis was performed on the SAGA-dominated gene set using microarray data from the wide range of regulators affecting this group (Figure 3.4A). If individual regulators primarily target distinct subsets of genes, multiple sub-clusters would be expected. However, only two visually distinct sub-clusters became apparent within the SAGA-dominated set, with the major subcluster representing stress induced genes. This stress-induced subset bore the bulk of the genes that are inhibited by most of the negative regulators described here. The other subset was generally independent of these factors, and displayed a slightly stronger dependency on SAGA and mediator components. The highly concerted response of SAGA-dominated genes to a variety of negative regulators suggests that these negative regulators coordinate their activities on the same set of target genes.

Figure 3.4. Highly coordinated co-regulation of SAGA-dominated genes. (A) Fold changes in gene expression for 551 SAGA-dominated genes in response to 23 conditions were clustered by K means as described in Figure 3.1B. K was varied between 2 and 7, with K=2 being shown. K values higher than 2 generated visibly distinct clusters, but the differences were modest in comparison to the K=2 clusters. Below each column is a data set identifier, corresponding to entries in Table S3-2. (B) Caricature of SAGA- and TFIID-dominated genes. Environmental stress leads to up-regulation of a large subset of SAGA-dominated genes. Factors illustrated in red (not all of which are discussed in the text) play a negative role, and those in green play a positive role. Contributions of the TATA box are based upon Basehoar et al., 2004. In addition, TFIID and SAGA are shown contributing to the expression of each other's target genes, as illustrated by the dashed arrows.

3.4 Discussion

3.4.1 TFIID and SAGA contribute to the expression of essentially all genes

A detailed genome-wide examination of the requirement for TFIID and SAGA in gene expression finds that >99% of the measurable genome is positively regulated by the overlapping involvement of both TFIID and SAGA. Since TFIID and SAGA can account for the entire measurable genome, it seems likely that they will also contribute to the expression of the approximately 5% or more of the genome that is not expressed under the assayed conditions.

In the absence of TAF1 and Spt3, virtually all transcription is shut down to a level comparable to the loss of the Rpb1 subunit of pol II. This finding leaves little room for any additional complexes in yeast that function equivalently to TFIID and SAGA (including SLIK/SALSA) in transcription. Under one set of growth conditions approximately 90% of the genome is characterized as being TFIID-dominated, meaning that TFIID contributes more than SAGA at these promoters. The remaining ~10% is characterized as being SAGA-dominated, meaning that SAGA contributes more than TFIID. At some promoters, SAGA and TFIID contributions are more-or-less equivalent, and their classification is therefore arbitrary.

When SAGA is present, the bulk of the genome does not display an absolute dependence on TAF1 (TFIID), which is in agreement with previous reports (Holstege et al., 1998; Lee et al., 2000; Shen et al., 2003). Our findings differ in that our analysis does not require that genes be absolutely dependent upon TAF1, and thus reveals a substantial but not absolute genome-wide dependence on a single TFIID-specific TAF. These findings challenge the notion that expression of more than 80% of yeast genes is TAF1-independent. The ~10% of the genome that is predominantly regulated by SAGA displays little dependence on TAF1 or any other TAF compared to TFIID-dominated genes when SAGA is present, and thus are equivalent to TAF-independent genes. Importantly, in the absence of SAGA, these genes become TFIID-dependent.

3.4.2 Shared TAFs are more important for TFIID than for SAGA

Inactivation of either TFIID-specific or shared TAFs leads to a general down-regulation of TFIID-dominated genes and has either no effect on or modestly up-regulates SAGA-dominated genes. This suggests that TAFs are more important for TFIID than for SAGA. One plausible explanation for the greater dependency of TFIID on these shared TAFs is that the temperature-sensitive TAF mutants for which dependency was determined are more disruptive to the structure of TFIID than to SAGA. Alternatively, these shared TAFs might make contacts with promoter DNA or other factors that are specific or particularly important to TFIID-dominated genes.

3.4.3 Histone H4 tail acetylation, Bdf1 binding, and TFIID function are linked

Under low stress conditions, histone H4 acetylation correlates with TFIID-dominance rather than SAGA-dominance. This raises the possibility that H4 acetylation is more important for TFIID function than for SAGA function under conditions of low stress. The same degree of functional selectivity is not apparent with H3 tails. Bdf1, which binds acetylated H4 tails and TFIID, and selectively regulates TFIID-dominated genes, might therefore link H4 tail acetylation to TFIID binding, as previous biochemical and genetic studies suggest. Consistent with this, deletion of H4 tails inhibits TFIID-dominated genes more than SAGA-dominated genes. Interestingly, unlike TAFs, Bdf1 inhibits the stress-induced SAGA-dominated genes, which supports the notion that Bdf1 has functions apart from TFIID (Ladurner et al., 2003; Matangkasombut and Buratowski, 2003).

3.4.4 SAGA-dominated genes reveal a highly regulated stress-response pathway

Assessing the impact of environmental and genetic changes on SAGA- and TFIID-dominated genes reveals an extraordinarily high degree of coordination among regulatory factors targeting SAGA-dominated genes. First, genes that are commonly up-regulated in response to any of a variety of environmental stresses, such as heat, oxidation, acidity, DNA damage, carbon or nitrogen starvation, and excess unfolded proteins, show a greater than expected tendency toward being SAGA-dominated. Thus, SAGA appears to be a major contributor to transcriptional activation of the common environmental stress response genes. Strikingly, genes that are down-regulated during environmental stress were almost exclusively TFIID-dominated. Typically, genes involved in general housekeeping functions such as protein synthesis and cellular growth are shut down during environmental stress (Causton et al., 2001; Gasch et al., 2000). Taken together, the data reveal two distinct transcriptional regulatory pathways: a stress-mediated path utilizing primarily SAGA, and a housekeeping path involving TFIID.

Second, many factors that regulate chromatin, TBP, and pol II appear to play a greater role at SAGA-dominated than at TFIID-dominated genes. Cluster analysis of SAGA-dominated genes indicates that these regulators generally target the same set of genes rather than each being directed at distinct subsets.

Figure 3.4B illustrates the factors involved in transcriptional regulation of two extreme instances of SAGA- (left side) and TFIID-dominated (right side) genes. The prototypic SAGA-dominated gene is stress-induced, while the prototypic TFIID-dominated gene plays a less-regulated housekeeping role. Emphasized in the figure are those factors that differ in their relative contribution to SAGA- and TFIID-dominated genes. Down-regulation of SAGA-dominated genes is shown to involve a cadre of factors directed at dismantling or blocking assembly of the transcription machinery. These factors include the Srb10 kinase component of the pol II holoenzyme, which incapacitates pol II and stress-response activators through phosphorylation. Without pol II and these activators, transcription complex assembly might stall. NC2 might then gain increased access to TBP (not in the form of TFIID) and block TFIIA and TFIIB incorporation. Loss of TFIIA would allow Mot1 to act on the TBP/TATA/NC2 complex, where it would dissociate TBP.


Cleared of the general transcription machinery, the promoter region of SAGA-dominated genes might then become further packaged into nucleosomes, perhaps facilitated by chromatin remodeling/modifying activities involving histone amino-terminal tails. Nucleosome packaging at SAGA-dominated promoters might also be facilitated by Ssn6-Tup1, which interacts with deacetylated histone tails and the Hda1 and Rpd3 deacetylases. Down-regulation is likely to be further enforced by sequence-specific repressors. Counteracting this negative regulation are stress-induced activators, acetylated histone H3 and H4 tails, SAGA, and mediator components, among others. As shown by Andy Basehoar and colleagues (Basehoar et al., 2004), the presence of a TATA box is also part of the repertoire that mediates stress-induced activation. How cellular sensors of high and low stress transduce their signals to orchestrate this assembly/disassembly process is not fully known, although much of it is likely to be funneled through sequence-specific gene regulators. It seems reasonable to expect that this assembly/disassembly process is dynamic, with genes constantly being turned up or down, rather than on or off.

3.5 Materials and methods

Microarray analysis

Strains and plasmids are presented in Table 3.1. Strains were grown in CSM-Trp-His media at 26°C to OD600 = ~0.8. taf1ts2 was inactivated by shifting the cultures to 37°C with an equal volume of heated media. After 45 min at 37°C, cells were rapidly harvested at room temperature. Total RNA and mRNA isolation, and microarray co-hybridization of test and reference samples, were performed as described in Chapter 2.


Polyadenylated B. subtilis transcripts (LysA, PheB, ThrC, TrpE, DapB) were spiked into equivalent amounts of cells (based upon OD600 readings) just prior to isolation of total RNA. These transcripts hybridize to their cognate spots on the arrays. Data were normalized to these spiking controls. Experiments whose dye-swapped duplicates did not display the tight correlation shown in Figure 3.1 were repeated. Raw data are accessible at GEO (http://www.ncbi.nlm.nih.gov/geo/), accession numbers GSM13002, GSM13013-GSM13041, GSM13047.

The standard deviation for wild type in Table 3.2 was calculated using the following formula: Sqrt(V1 + 2V2/9) where V1 is the variance in the distribution of the wild type ratios (YKH105 and YJS6) and V2 is the pooled variance of the spiking controls over 30 separate arrays. V2 is increased by a factor of two since the variance is applicable to both populations that are being compared (wild type reference and mutant test samples). V2 is decreased by a factor of nine since each ratio was normalized using at least nine spiking control values.
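The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure actually used for Table 3.2: the function name, the argument names, and the choice of the sample (n-1) variance for V1 are assumptions.

```python
import math
from statistics import variance


def wild_type_sd(wt_log_ratios, pooled_spike_variance, n_spikes=9):
    """Return Sqrt(V1 + 2*V2/9) as described in the text.

    wt_log_ratios         -- wild type ratios (hypothetical input list)
    pooled_spike_variance -- V2, pooled variance of the spiking controls
    n_spikes              -- number of spiking control values per ratio (9 in the text)
    """
    # V1: variance of the wild type ratio distribution.
    # Assumption: sample variance (n - 1 denominator).
    v1 = variance(wt_log_ratios)
    # V2 is doubled (two populations are compared) and divided by the
    # number of spiking controls used for normalization.
    v2_term = 2.0 * pooled_spike_variance / n_spikes
    return math.sqrt(v1 + v2_term)
```

For example, `wild_type_sd([0.1, -0.1, 0.0], 0.045)` combines V1 = 0.01 with 2V2/9 = 0.01 and returns the square root of 0.02.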

Comparisons with public microarray data

Each row in Table 3 is defined by those genes whose log2 ratios have an absolute value > 0.76. The 0.76 cutoff (~1.7 fold) was arbitrary and reflects attempts to examine primarily those data with meaningful changes. Each group was further limited in membership to those in the upper or lower 5th percentile (calculated using the PERCENTRANK function in Excel for increased and decreased expression, respectively), except in cases where low values in the % SAGAdom column limited statistical evaluation. In such cases, the percentile limit was relaxed to 20 or 50, as designated by the “entry identifier” suffix in Supplementary Table S4. Similar results were obtained with either criterion (see Supplementary Table S4). The 5th percentile cutoff was arbitrarily set at half the maximum expected percentage (9-10%) for SAGA-dominated genes. For groups where N was small (typically <100), we eliminated the 0.76 cutoff but maintained the 5th percentile cutoff. These groups are identified by an ‘*’, and correspond to the “slightly inhibited/independent” gene groups. To eliminate potential bias arising from stable mRNA in the ‘independent/slightly inhibited’ groups, stable mRNAs were removed from these groups if the corresponding temperature-sensitive allele was subjected to a 45 min inactivation (criterion for elimination: calculated rpb1-1 log2 ratio was between ±0.5 at 45 min, using data from Wang et al., 2002). Since ‘bad’ data have been filtered out of many of the external data sets, the percentage of SAGA-dominated genes throughout an entire data set might differ from the 9.2% obtained in our data sets. In all cases where the entire data sets were available, the intrinsic genome-wide percentage of SAGA-dominated genes was used in calculating p-values. P-values were determined using the CHITEST function in Excel.
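The selection and testing steps described above can be sketched as follows. The analysis itself was done in Excel, so this Python version is only an approximation under stated assumptions: `percent_rank` is loosely modeled on PERCENTRANK (ties and interpolation are not reproduced), the p-value assumes a two-category chi-square test with one degree of freedom (SAGA-dominated versus not), and all function and argument names are hypothetical.

```python
import math


def percent_rank(values, x):
    """Percent (0-100) of the other values that are strictly smaller than x."""
    below = sum(1 for v in values if v < x)
    return 100.0 * below / (len(values) - 1)


def select_group(log2_ratios, increased=True, cutoff=0.76, percentile=5.0):
    """Genes with |log2 ratio| > cutoff that fall in the upper (increased)
    or lower (decreased) percentile of the data set."""
    values = list(log2_ratios.values())
    selected = []
    for gene, ratio in log2_ratios.items():
        if abs(ratio) <= cutoff:
            continue
        rank = percent_rank(values, ratio)
        if increased and ratio > 0 and rank >= 100.0 - percentile:
            selected.append(gene)
        elif not increased and ratio < 0 and rank <= percentile:
            selected.append(gene)
    return selected


def chi_square_p(n_saga_observed, n_group, genome_fraction=0.092):
    """P-value for the observed number of SAGA-dominated genes in a group,
    against the number expected from the genome-wide fraction.
    Assumption: two categories, hence one degree of freedom."""
    expected_saga = n_group * genome_fraction
    expected_other = n_group - expected_saga
    observed_other = n_group - n_saga_observed
    chi2 = ((n_saga_observed - expected_saga) ** 2 / expected_saga
            + (observed_other - expected_other) ** 2 / expected_other)
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A group whose SAGA-dominated fraction matches the genome-wide 9.2% gives a p-value near 1, whereas strong enrichment (for instance 50 of 100 genes) gives a p-value near 0.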


3.6 References

Andrau, J. C., Van Oevelen, C. J., Van Teeffelen, H. A., Weil, P. A., Holstege, F. C., and Timmers, H. T. (2002). Mot1p is essential for TBP recruitment to selected promoters during in vivo gene activation. EMBO J. 21, 5173-5183.

Auble, D. T., Hansen, K. E., Mueller, C. G., Lane, W. S., Thorner, J., and Hahn, S. (1994). Mot1, a global repressor of RNA polymerase II transcription, inhibits TBP binding to DNA by an ATP-dependent mechanism. Genes Dev. 8, 1920-1934.

Barbaric, S., Reinke, H., and Horz, W. (2003). Multiple mechanistically distinct functions of SAGA at the PHO5 promoter. Mol. Cell. Biol. 23, 3468-3476.

Basehoar, A. D., Zanton, S. J., and Pugh, B. F. (2004). Identification and distinct regulation of yeast TATA box-containing genes. accompanying paper.

Bernstein, B. E., Humphrey, E. L., Erlich, R. L., Schneider, R., Bouman, P., Liu, J. S., Kouzarides, T., and Schreiber, S. L. (2002). Methylation of histone H3 Lys 4 in coding regions of active genes. Proc. Natl. Acad. Sci. USA 99, 8695-8700.

Bernstein, B. E., Tong, J. K., and Schreiber, S. L. (2000). Genomewide studies of histone deacetylase function in yeast. Proc. Natl. Acad. Sci. USA 97, 13708-13713.

Bhaumik, S. R., and Green, M. R. (2001). SAGA is an essential in vivo target of the yeast acidic activator Gal4p. Genes Dev. 15, 1935-1945.


Bhaumik, S. R., and Green, M. R. (2002). Differential requirement of SAGA components for recruitment of TATA-box-binding protein to promoters in vivo. Mol. Cell. Biol. 22, 7365-7371.

Brownell, J. E., Zhou, J., Ranalli, T., Kobayashi, R., Edmondson, D. G., Roth, S. Y., and Allis, C. D. (1996). Tetrahymena histone acetyltransferase A: a homolog to yeast Gcn5p linking histone acetylation to gene activation. Cell 84, 843-851.

Cang, Y., Auble, D. T., and Prelich, G. (1999). A new regulatory domain on the TATA-binding protein. EMBO J. 18, 6662-6671.

Causton, H. C., Ren, B., Koh, S. S., Harbison, C. T., Kanin, E., Jennings, E. G., Lee, T. I., True, H. L., Lander, E. S., and Young, R. A. (2001). Remodeling of yeast genome expression in response to environmental changes. Mol. Biol. Cell 12, 323-337.

Chi, Y., Huddleston, M. J., Zhang, X., Young, R. A., Annan, R. S., Carr, S. A., and Deshaies, R. J. (2001). Negative regulation of Gcn4 and Msn2 transcription factors by Srb10 cyclin-dependent kinase. Genes Dev. 15, 1078-1092.

Chitikila, C., Huisinga, K. L., Irvin, J. D., Basehoar, A. D., and Pugh, B. F. (2002). Interplay of TBP inhibitors in global transcriptional control. Mol. Cell 10, 871-882.

Darst, R. P., Dasgupta, A., Zhu, C., Hsu, J. Y., Vroom, A., Muldrow, T., and Auble, D. T. (2003). Mot1 Regulates the DNA Binding Activity of Free TATA-binding Protein in an ATP-dependent Manner. J. Biol. Chem. 278, 13216-13226.


Dasgupta, A., Darst, R. P., Martin, K. J., Afshari, C. A., and Auble, D. T. (2002). Mot1 activates and represses transcription by direct, ATPase-dependent mechanisms. Proc. Natl. Acad. Sci. USA 99, 2666-2671.

Deckert, J., and Struhl, K. (2001). Histone acetylation at promoters is differentially affected by specific activators and repressors. Mol. Cell. Biol. 21, 2726-2735.

Eberharter, A., Sterner, D. E., Schieltz, D., Hassan, A., Yates, J. R., 3rd, Berger, S. L., and Workman, J. L. (1999). The ADA complex is a distinct histone acetyltransferase complex in Saccharomyces cerevisiae. Mol. Cell. Biol. 19, 6621-6631.

Edmondson, D. G., Smith, M. M., and Roth, S. Y. (1996). Repression domain of the yeast global repressor Tup1 interacts directly with histones H3 and H4. Genes Dev. 10, 1247-1259.

Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95, 14863-14868.

Gasch, A. P., Spellman, P. T., Kao, C. M., Carmel-Harel, O., Eisen, M. B., Storz, G., Botstein, D., and Brown, P. O. (2000). Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell 11, 4241-4257.

Goppelt, A., and Meisterernst, M. (1996). Characterization of the basal inhibitor of class II transcription NC2 from Saccharomyces cerevisiae. Nucleic Acids Res. 24, 4450-4455.


Grant, P. A., Duggan, L., Cote, J., Roberts, S. M., Brownell, J. E., Candau, R., Ohba, R., Owen-Hughes, T., Allis, C. D., Winston, F., et al. (1997). Yeast Gcn5 functions in two multisubunit complexes to acetylate nucleosomal histones: characterization of an Ada complex and the SAGA (Spt/Ada) complex. Genes Dev. 11, 1640-1650.

Grant, P. A., Schieltz, D., Pray-Grant, M. G., Steger, D. J., Reese, J. C., Yates, J. R., 3rd, and Workman, J. L. (1998). A subset of TAF(II)s are integral components of the SAGA complex required for nucleosome acetylation and transcriptional stimulation. Cell 94, 45-53.

Hengartner, C. J., Myer, V. E., Liao, S. M., Wilson, C. J., Koh, S. S., and Young, R. A. (1998). Temporal regulation of RNA polymerase II by Srb10 and Kin28 cyclin-dependent kinases. Mol. Cell 2, 43-53.

Holstege, F. C., Jennings, E. G., Wyrick, J. J., Lee, T. I., Hengartner, C. J., Green, M. R., Golub, T. R., Lander, E. S., and Young, R. A. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717-728.

Jacobson, R. H., Ladurner, A. G., King, D. S., and Tjian, R. (2000). Structure and function of a human TAF(II)250 double bromodomain module. Science 288, 1422-1425.

Kadosh, D., and Struhl, K. (1998). Targeted recruitment of the Sin3-Rpd3 histone deacetylase complex generates a highly localized domain of repressed chromatin in vivo. Mol. Cell. Biol. 18, 5121-5127.


Kim, S., Na, J. G., Hampsey, M., and Reinberg, D. (1997). The Dr1/DRAP1 heterodimer is a global repressor of transcription in vivo. Proc. Natl. Acad. Sci. USA 94, 820-825.

Kuchin, S., and Carlson, M. (1998). Functional relationships of Srb10-Srb11 kinase, carboxy-terminal domain kinase CTDK-I, and transcriptional corepressor Ssn6-Tup1. Mol. Cell. Biol. 18, 1163-1171.

Kuras, L., Kosa, P., Mencia, M., and Struhl, K. (2000). TAF-Containing and TAF-independent forms of transcriptionally active TBP in vivo. Science 288, 1244-1248.

Ladurner, A. G., Inouye, C., Jain, R., and Tjian, R. (2003). Bromodomains mediate an acetyl-histone encoded antisilencing function at heterochromatin boundaries. Mol. Cell 11, 365-376.

Larschan, E., and Winston, F. (2001). The S. cerevisiae SAGA complex functions in vivo as a coactivator for transcriptional activation by Gal4. Genes Dev. 15, 1946-1956.

Lee, T. I., Causton, H. C., Holstege, F. C., Shen, W. C., Hannett, N., Jennings, E. G., Winston, F., Green, M. R., and Young, R. A. (2000). Redundant roles for the TFIID and SAGA complexes in global transcription. Nature 405, 701-704.

Li, X. Y., Bhaumik, S. R., and Green, M. R. (2000). Distinct classes of yeast promoters revealed by differential TAF recruitment. Science 288, 1242-1244.


Matangkasombut, O., Buratowski, R. M., Swilling, N. W., and Buratowski, S. (2000). Bromodomain factor 1 corresponds to a missing piece of yeast TFIID. Genes Dev. 14, 951-962.

Matangkasombut, O., and Buratowski, S. (2003). Different sensitivities of bromodomain factors 1 and 2 to histone H4 acetylation. Mol. Cell 11, 353-363.

Mencia, M., Moqtaderi, Z., Geisberg, J. V., Kuras, L., and Struhl, K. (2002). Activator-specific recruitment of TFIID and regulation of ribosomal protein genes in yeast. Mol. Cell 9, 823-833.

Mermelstein, F., Yeung, K., Cao, J., Inostroza, J. A., Erdjument-Bromage, H., Eagelson, K., Landsman, D., Levitt, P., Tempst, P., and Reinberg, D. (1996). Requirement of a corepressor for Dr1-mediated repression of transcription. Genes Dev. 10, 1033-1048.

Mizzen, C. A., Yang, X. J., Kokubo, T., Brownell, J. E., Bannister, A. J., Owen-Hughes, T., Workman, J., Wang, L., Berger, S. L., Kouzarides, T., et al. (1996). The TAF(II)250 subunit of TFIID has histone acetyltransferase activity. Cell 87, 1261-1270.

Nelson, C., Goto, S., Lund, K., Hung, W., and Sadowski, I. (2003). Srb10/Cdk8 regulates yeast filamentous growth by phosphorylating the transcription factor Ste12. Nature 421, 187-190.

Nishikawa, J., Kokubo, T., Horikoshi, M., Roeder, R. G., and Nakatani, Y. (1997). Drosophila TAF(II)230 and the transcriptional activator VP16 bind competitively to the TATA box-binding domain of the TATA box-binding protein. Proc. Natl. Acad. Sci. USA 94, 85-90.

Proft, M., and Struhl, K. (2002). Hog1 kinase converts the Sko1-Cyc8-Tup1 repressor complex into an activator that recruits SAGA and SWI/SNF in response to osmotic stress. Mol. Cell 9, 1307-1317.

Robyr, D., Suka, Y., Xenarios, I., Kurdistani, S. K., Wang, A., Suka, N., and Grunstein, M. (2002). Microarray deacetylation maps determine genome-wide functions for yeast histone deacetylases. Cell 109, 437-446.

Rundlett, S. E., Carmen, A. A., Suka, N., Turner, B. M., and Grunstein, M. (1998). Transcriptional repression by UME6 involves deacetylation of lysine 5 of histone H4 by RPD3. Nature 392, 831-835.

Sabet, N., Tong, F., Madigan, J. P., Volo, S., Smith, M. M., and Morse, R. H. (2003). Global and specific transcriptional repression by the histone H3 amino terminus in yeast. Proc. Natl. Acad. Sci. USA 100, 4084-4089.

Sanders, S. L., Jennings, J., Canutescu, A., Link, A. J., and Weil, P. A. (2002). Proteomics of the eukaryotic transcription machinery: identification of proteins associated with components of yeast TFIID by multidimensional mass spectrometry. Mol. Cell. Biol. 22, 4723-4738.


Selleck, W., Howley, R., Fang, Q., Podolny, V., Fried, M. G., Buratowski, S., and Tan, S. (2001). A histone fold TAF octamer within the yeast TFIID transcriptional coactivator. Nat. Struct. Biol. 8, 695-700.

Shen, W. C., Bhaumik, S. R., Causton, H. C., Simon, I., Zhu, X., Jennings, E. G., Wang, T. H., Young, R. A., and Green, M. R. (2003). Systematic analysis of essential yeast TAFs in genome-wide transcription and preinitiation complex assembly. EMBO J. 22, 3395-3402.

Struhl, K. (1999). Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98, 1-4.

Swanson, M. J., Qiu, H., Sumibcay, L., Krueger, A., Kim, S. J., Natarajan, K., Yoon, S., and Hinnebusch, A. G. (2003). A multiplicity of coactivators is required by Gcn4p at individual promoters in vivo. Mol. Cell. Biol. 23, 2800-2820.

Vogelauer, M., Wu, J., Suka, N., and Grunstein, M. (2000). Global histone acetylation and deacetylation in yeast. Nature 408, 495-498.

Walker, S. S., Reese, J. C., Apone, L. M., and Green, M. R. (1996). Transcription activation in cells lacking TAFIIs. Nature 383, 185-188.

Wang, Y., Liu, C. L., Storey, J. D., Tibshirani, R. J., Herschlag, D., and Brown, P. O. (2002). Precision and functional specificity in mRNA decay. Proc. Natl. Acad. Sci. USA 99, 5860-5865.


Watson, A. D., Edmondson, D. G., Bone, J. R., Mukai, Y., Yu, Y., Du, W., Stillman, D. J., and Roth, S. Y. (2000). Ssn6-Tup1 interacts with class I histone deacetylases required for repression. Genes Dev. 14, 2737-2744.

Wu, J., and Grunstein, M. (2000). 25 years after the nucleosome model: chromatin modifications. Trends Biochem. Sci. 25, 619-623.

Wu, J., Suka, N., Carlson, M., and Grunstein, M. (2001). TUP1 utilizes histone H3/H2B-specific HDA1 deacetylase to repress gene activity in yeast. Mol. Cell 7, 117-126.


Chapter 4

4 Coordination of gene expression in Saccharomyces cerevisiae through positive and negative regulation of the TATA-Binding Protein

4.1 Summary

The TATA-binding protein is required for transcription of protein-coding genes by Pol II in yeast. Consistent with its important role in gene expression, many factors interact with TBP to modulate its function both positively and negatively. To better understand the interplay between TBP regulatory factors, the effects on gene expression upon disruption of components of the TBP regulatory network were examined using yeast expression arrays. I focused on six interactions: TBP with Spt3, the Taf1 N-terminal domain (TAND), NC2, Mot1, DNA, and itself (TBP dimerization), and examined each one individually and in all possible combinations. I find that the different modes of TBP regulation function coordinately to regulate gene expression in a highly intertwined TBP regulatory network. NC2 and Mot1 repression targets TATA-containing genes and often counteracts positive regulation by Spt3 and TBP-DNA interactions at these genes. Highly expressed genes often require the overlapping, but non-redundant, stimulatory effects of multiple TBP regulators, including Spt3, TAND, and TBP-DNA interactions. Mutations that disrupt TBP dimerization cause increased expression at genes that are also repressed by several chromatin-related mechanisms. Additionally, several factors, including Spt3, adjust their mode of regulation between stimulating and inhibiting transcription at different subsets of genes.

4.2 Introduction

Expression of protein-coding genes is a dynamic, multi-step process involving many factors. A key step is the recruitment and binding of the TATA binding protein (TBP) to the promoter region of coding genes. Binding of TBP facilitates the assembly of the preinitiation complex (PIC), which subsequently leads to transcription of the gene by RNA polymerase II. Gene-specific activators facilitate the recruitment of TBP to promoters by interacting with complexes, such as TFIID and SAGA, with which TBP is associated (Brown et al., 2001; Kokubo et al., 1998; Kotani et al., 1998; Mencia et al., 2002; Mencia and Struhl, 2001; Sterner et al., 1999). These delivery complexes and other components of the gene regulatory machinery modulate TBP’s activity, which in turn affects the transcriptional output of a gene. TBP is regulated both positively and negatively. In fact, several TBP regulators themselves have both positive and negative functions. This chapter is focused on how six TBP regulatory interactions function coordinately to modulate TBP, thereby affecting gene expression. The six TBP regulatory interactions include the NC2 complex, Mot1, the N-terminal domain of TAF1 (TAND), SPT3, TBP-DNA binding, and TBP self-dimerization.

TBP is a saddle-shaped molecule that binds to DNA through its concave surface (Chasman et al., 1993; Kim et al., 1993a; Kim et al., 1993b). Mutations on TBP’s concave surface can disrupt DNA binding in vitro, and generally do not support cell viability. In vivo genome-wide studies that assayed for dominant effects on transcription with a subset of DNA binding defective mutants showed that not all yeast genes are affected (Chitikila et al., 2002). Therefore, it appears that TBP-DNA interactions are critical at some promoters but less so at others, possibly implicating another TBP regulatory factor in compensating for decreased TBP-DNA affinity at these genes. Interestingly, the sensitivity to these mutants is correlated with the presence of a TATA box in the promoter of genes (Basehoar et al., 2004). In addition to binding DNA, TBP can also self-dimerize through its concave surface. Dimerization occludes the DNA-binding surface of the molecule, thereby inhibiting TBP-DNA interactions (Coleman and Pugh, 1997; Coleman et al., 1995; Jackson-Fisher et al., 1999; Nikolov et al., 1992). Earlier work showed that TBP dimerization is also disrupted by specific mutations on TBP’s concave surface and that this mode of regulation is important for inhibiting expression of lowly expressed genes (Chitikila et al., 2002; Jackson-Fisher et al., 1999).

TAF1, a component of the TFIID complex, is a key regulator of TBP. The TFIID complex is important for the expression of many genes, in particular the highly transcribed ribosomal protein genes (Huisinga and Pugh, 2004; Mencia et al., 2002). However, not all genes are dependent upon TFIID for expression, as ~10% of yeast genes are insensitive to inactivation of TAF1 (Huisinga and Pugh, 2004). The N-terminal domain of TAF1 (TAND1 and TAND2 domains) interacts with TBP’s concave and convex surfaces, respectively. Recently, a third region of yeast TAF1 was identified (TAND3) that also interacts with the concave surface and is partially redundant with TAND1 (Takahata et al., 2003). This interaction can facilitate TBP delivery to promoters; however, TAF1 can also repress transcription, as TAND-TBP interactions inhibit TBP from binding to DNA in vitro (Kokubo et al., 1998). Work presented in Chapter 2 showed that deletion of TAND alone has minimal effects on genome-wide transcription, but when mutations on TBP’s concave surface are combined with ΔTAND, the effects are more widespread.

As a component of the SAGA complex, Spt3 is part of the Spt3/Spt8 TBP regulatory module, which functions to positively regulate TBP. Spt3 interacts with TBP and is critical for TBP recruitment to certain promoters (Dudley et al., 1999; Eisenmann et al., 1992). Under high stress conditions, Spt3 is required for wild type levels of expression at ~10% of the yeast genome, which interestingly are the genes unaffected upon inactivation of Taf1. However, the combined loss of Spt3 and Taf1 decreases expression of nearly all yeast genes to levels comparable to those seen upon inactivation of RNA Polymerase II (Pol II) (Huisinga and Pugh, 2004). Analysis of these results led to the proposal of a bipolar genome in which ~10% of genes are SAGA-dominated and ~90% are TFIID-dominated. SAGA-dominated genes are generally stress-induced, TATA-containing, and highly regulated, while TFIID-dominated genes are stress-repressed, TATA-less, and under less regulation (Basehoar et al., 2004; Huisinga and Pugh, 2004). While this supports Spt3’s positive role in transcription, under basal conditions Spt3 plays a repressive role at some genes, demonstrated by the fact that when it is deleted, expression increases (Belotserkovskaya et al., 2000). These opposite effects indicate that Spt3’s role at a promoter may be condition-dependent and change depending upon the activation level of a gene. While genes induced under high stress require Spt3 for induction, it is unclear what role Spt3 plays at these genes under non-stress conditions.

The NC2 complex, which is encoded by the essential genes YDR1 and BUR6 in yeast, was identified as a global negative regulator of transcription. In vitro studies showed that NC2 interacts with the TBP-DNA complex to block the association of TFIIA and TFIIB (Goppelt and Meisterernst, 1996; Kim et al., 1995). Recent studies in yeast have shown that the two NC2 subunits co-purify only under certain growth conditions, implicating each individual subunit in different modes of regulation in vivo (Creton et al., 2002).

Another TBP regulator originally identified as a global repressor is Mot1 (Davis et al., 1992). As a member of the Swi/Snf family of ATPases, Mot1 uses energy obtained from hydrolyzing ATP to dissociate TBP from DNA (Auble and Hahn, 1993; Auble et al., 1994). Several observations have tied negative regulation by NC2 and Mot1 to activated transcription. First, alleles of bur6 and mot1 were identified in a screen for mutations that suppressed the requirement for the upstream activating sequence (UAS) of SUC2 (Prelich and Winston, 1993). Second, Mot1 and Bur6, as well as TBP, are recruited to promoters of genes upon activation (Geisberg et al., 2001; Geisberg et al., 2002). Third, mutations in TFIIA or components of the mediator complex suppress the requirement of NC2 for cell viability (Lemaire et al., 2000; Xie et al., 2000). Fourth, a TBP mutation that disrupts TBP-NC2 interactions causes increased expression at highly transcribed genes in yeast (Chitikila et al., 2002). These results implicate the negative regulatory functions of Mot1 and NC2 in modulation of active transcription as opposed to repressing basal transcription. Our earlier studies indicated a link between SAGA-dominated genes and genes repressed by Bur6 or Mot1, but how these opposite effects are balanced on a gene-by-gene basis under non-stress conditions is still unknown.

While Mot1 and NC2 were originally characterized as negative regulators of TBP, several lines of evidence now implicate both factors in stimulating expression of some genes. Biochemical studies in flies identified the NC2 complex as a positive regulator of TATA-less, Downstream Promoter Element (DPE) containing promoters (Willy et al., 2000). While yeast contain TATA-less promoters, a DPE sequence has not been identified. Furthermore, mutations in yeast Mot1 decrease expression at TATA-less HIS3 and HIS4 promoters (Collart, 1996). Several genome-wide expression studies have shown that Mot1 or Bur6 inactivation causes decreased expression of up to 9% of the yeast genome (Andrau et al., 2002; Cang and Prelich, 2002; Dasgupta et al., 2002; Geisberg et al., 2001; Geisberg et al., 2002). Although some effects seen in these expression studies could be indirect, ChIP analysis of promoter regions at several stimulated genes showed binding of Mot1 or Bur6 (Dasgupta et al., 2002; Geisberg et al., 2001; Geisberg et al., 2002). These genome-wide studies identified a substantial overlap between Mot1 and Bur6 regulated genes, indicating a possible functional relationship between them. Further evidence of this comes from the fact that inactivation of Mot1 causes increased association of Bur6 and Ydr1 at several different yeast promoters (Geisberg et al., 2002). In vitro studies with the human homologues showed that the NC2α (Bur6) subunit interacts with BTAF1 (Mot1) and stimulates BTAF1 association with TBP. However, it is still not clear exactly how these two factors coordinate their TBP regulation.

The in vivo relationships between different modes of TBP regulation are not well understood. This is the case for a variety of regulators, such as the NC2 complex, Mot1, the N-terminal domain of TAF1 (TAND), SPT3, TBP-DNA binding, and TBP self-dimerization. While the genome-wide role of most of these TBP regulators has been examined individually, less is known about the connections between these TBP regulatory factors on a genome-wide scale. How is the genome partitioned between these regulators? Do different genes utilize different modes of TBP regulation, and how does the mode of regulation vary depending upon the expression state of the gene? Are certain modes associated with each other? If so, how are they linked? Different regulatory modes could be redundant, where one mode fully compensates for disruption of another. Another option is that each mode is required, either fully, so that multiple disruptions are identical to individual disruptions, or partially, where multiple disruptions have a greater impact than loss of just one. The dual nature of TBP’s regulators complicates the understanding of their relationship to each other. Negative regulation by one factor may counteract positive regulation by another, or both factors may act similarly to activate or repress transcription. Making the situation more complex is the fact that different modes of regulation may occur at different subsets of genes, as the nature of each mode of regulation may be context dependent. Several variables appear to affect the mode of TBP regulation that occurs at a gene. One major factor, which most certainly affects gene expression, is the physiological condition under which the cell is growing. As different growth conditions require induction and repression of different genes, the modes of TBP regulation at a gene must adjust to meet the expression needs. A second factor that appears to play a key role is the promoter sequence of the gene and whether or not it contains a TATA box. Another factor that may affect the mode of TBP regulation is the pathway utilized to recruit TBP and assemble the PIC. Our earlier work revealed that the SAGA and TFIID complexes define two separate assembly pathways that deliver TBP to the promoter. While each pathway is linked to different promoter sequences (the presence or absence of a TATA box), it is clear that this is not the only factor that determines which pathway is utilized. Because any gene may ultimately use a combination of these two delivery pathways, the ultimate transcriptional output of any gene is the combined output of these two pathways. Factors that function negatively in one pathway may play a positive role in the other pathway.
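The possible relationships between two regulatory modes outlined in this paragraph (redundant, fully required, partially required) can be made concrete with a small sketch that compares single and double disruptions at one gene. This is an illustration only and is not the analysis used in this study: the function name, the 0.76 threshold, the 0.5 margin, and the category labels are hypothetical, and the sign of each effect is deliberately ignored.

```python
def classify_pair(single_a, single_b, double_ab, threshold=0.76, margin=0.5):
    """Classify the relationship between two regulatory modes at one gene.

    Inputs are log2 expression ratios (mutant / wild type) for each single
    disruption and for the combined disruption. Only magnitudes are used.
    """
    a, b, ab = abs(single_a), abs(single_b), abs(double_ab)
    strongest_single = max(a, b)

    if ab < threshold:
        # No effect in the double mutant.
        if strongest_single < threshold:
            return "unaffected"
        return "suppressed in combination"
    if strongest_single < threshold:
        # Neither single disruption matters, but the combination does:
        # each mode compensates for loss of the other.
        return "redundant"
    if ab > strongest_single + margin:
        # The combination has a greater impact than loss of just one mode.
        return "partially required"
    if min(a, b) >= threshold:
        # Each single disruption already gives the full effect.
        return "fully required"
    return "one mode only"
```

For instance, `classify_pair(0.1, -0.2, 1.5)` falls in the redundant class, while `classify_pair(1.0, 1.0, 2.5)` is treated as partially required.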

This study addresses how multiple modes of TBP regulation are integrated in vivo to achieve the required transcriptional output. In order to obtain the most complete picture of regulation I have used genome-wide expression profiling. By examining the expression profile of all yeast genes, any unintentional bias in our conclusions can be avoided. I have focused on the relationship between six modes of regulation: the NC2 complex, Mot1, the N-terminal domain of TAF1 (TAND), SPT3, TBP-DNA binding, and TBP self-dimerization. Several of these TBP regulators are essential, thereby eliminating the option of assessing their function through deletion of their coding gene. Often temperature-sensitive mutants are used in this situation; however, this requires that the experiments be conducted at elevated temperatures. I employed an alternative method to examine the essential modes of TBP regulation, using previously identified mutations in TBP that disrupt TBP’s interaction with NC2, Mot1, DNA, and TBP self-dimerization (Cang et al., 1999; Jackson-Fisher et al., 1999). These mutant TBPs were introduced into strains that were deleted for the N-terminal domain of TAF1 (TAND) and/or SPT3, which are non-essential TBP regulators. This system allowed us to examine each of these TBP regulatory interactions individually and in all possible combinations. By disrupting multiple regulators in combination, I can address the functional relationship between them.

I find that these six modes of TBP regulation define a highly interconnected network that affects about one-half of the yeast genome under normal growth conditions. Chromatin IP analysis reinforces that the effects on expression are due to changes in TBP binding at promoters. Our data indicate that Spt3 affects expression both positively and negatively and plays a key regulatory role at many of the genes in our network. TBP-DNA interactions are critical at many highly expressed genes, which often contain TATA boxes. Mot1 and NC2 function in a manner that generally requires both factors to down-regulate transcription, primarily at TATA-containing genes. Mutations on TBP’s concave surface exhibit a variety of effects when combined with disruptions of other modes of regulation, highlighting the critical role of this surface. The regulatory network defined by the mutants in this study tends to target TATA-containing, SAGA-dominated genes, but does affect some TATA-less, TFIID-dominated genes. In summary, I have used a systematic mutagenesis of the TBP regulatory network to elucidate relationships between regulators.

4.3 Results

4.3.1 Design of the study

In order to investigate the relationships between multiple TBP regulators I constructed a series of strains and plasmids designed to test the effects on gene expression of regulators individually as well as in all possible combinations with each other (Figure 4.1). To examine the role of two non-essential regulators, SPT3 and TAND, in TBP regulation, strains were constructed which were deleted for TAND (ΔTAND), SPT3 (spt3Δ), or for both TAND and SPT3 (ΔTAND spt3Δ). To examine regulation by the essential factors NC2 and Mot1, as well as TBP-DNA and TBP self-dimerization interactions, I constructed a series of galactose-inducible HA-tagged test TBP plasmids (subsequently referred to as “test TBP”) with mutations that disrupt TBP’s interaction with these factors. The mutations used were identified in previous studies and have been shown to disrupt TBP’s ability to interact with specific regulators (Table 4.1). A mutation in amino acid V161 of TBP to either R or E (V161R or V161E) disrupts TBP-DNA binding (Jackson-Fisher et al., 1999). In addition, the V161R mutation disrupts TBP’s ability to self-dimerize (Jackson-Fisher et al., 1999). An F182V mutation in TBP disrupts TBP’s ability to interact with the NC2 complex and a K145E mutation disrupts TBP-Mot1 interactions (Cang et al., 1999). These four mutations (V161E/R, F182V, and K145E) were introduced individually and in all possible double and triple combinations into a galactose-inducible version of TBP present on a plasmid (Figure 4.1B). A WT version and a Null version, in which the first codon was mutated to a stop, were used for controls.
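The combinatorial design of the test TBP series can be illustrated with a short enumeration. The following is a minimal Python sketch, not the procedure used to build the plasmids. It assumes that V161E and V161R are never combined in one construct, since both alter residue V161. The names `ALLELES_BY_RESIDUE` and `enumerate_constructs` are illustrative only, and the TFIIA-specific mutants and the WT and Null controls are left out.

```python
from itertools import combinations, product

# Alleles grouped by the TBP residue they alter. V161E and V161R change the
# same residue, so one construct can carry at most one of them (assumption
# stated in the lead-in).
ALLELES_BY_RESIDUE = {
    "K145": ["K145E"],
    "V161": ["V161E", "V161R"],
    "F182": ["F182V"],
}


def enumerate_constructs(alleles_by_residue):
    """List every single, double, and triple mutant as a tuple of alleles."""
    # Order residues by their position in the protein (the digits after
    # the one-letter amino acid code).
    residues = sorted(alleles_by_residue, key=lambda r: int(r[1:]))
    constructs = []
    for n_sites in range(1, len(residues) + 1):
        for chosen in combinations(residues, n_sites):
            # Pick exactly one allele at each chosen residue.
            for combo in product(*(alleles_by_residue[r] for r in chosen)):
                constructs.append(combo)
    return constructs


constructs = enumerate_constructs(ALLELES_BY_RESIDUE)
for construct in constructs:
    print(", ".join(construct))
```

Under these assumptions the enumeration gives 4 single, 5 double, and 2 triple mutants.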

Previous work has shown that the K145E mutation disrupts TBP-TFIIA interactions in vitro, in addition to the TBP-Mot1 interaction, so I constructed three additional mutants (single mutants E93R and R107E and a double mutant, K138T Y139A) which specifically affect TBP-TFIIA interactions (Bryant et al., 1996; Cang et al., 1999; Stargell and Struhl, 1995). I speculated that if the K145E mutant disrupted TBP-TFIIA interactions in vivo, these additional TFIIA-specific mutations might show similar transcription profiles, thereby allowing us to distinguish genes regulated by TFIIA from genes regulated by Mot1. Additional controls for these TFIIA-specific mutants were a Toa2-TBP fusion to either the K138T, Y139A double mutant or to wild type TBP. The fusion of Toa2 to a K138T, Y139A double mutant has been shown to suppress the defects associated with this mutation (Stargell and Struhl, 1995). In addition to various mutant versions of TBP, each strain utilized in this study contains a wild type version of the endogenous TBP to support viability.


Figure 4.1. Strains and TBP mutants utilized to dissect the TBP regulatory network. (A) Diagram of the four different yeast strains. All strains were constructed in a Y13.2 background and contain a chromosomal copy of the endogenous SPT15 (i.e. TBP) gene. The chromosomal copy of TAF1 was deleted in all strains and is replaced by a plasmid containing either the wild-type TAF1 or a version that is deleted for the N-terminal domain (ΔTAND, a.a. 10-73). The endogenous copy of SPT3 is present in the wild type and ΔTAND strains but has been deleted by homologous recombination in the spt3Δ and ΔTAND spt3Δ strains. (B) Test TBP plasmids which are transformed into each of the four yeast strains shown in panel A. Plasmids are pRS315 or pRS425 derivatives. The test TBP is under control of the GAL10 promoter and N-terminally HA-tagged. The 2µ (pRS425) origin of replication was used in the spt3Δ strains to increase the level of test TBP expressed upon galactose induction. Site-directed mutagenesis was used to create mutations in the test TBP, which disrupt TBP’s interaction with factors that regulate it. A V161R (black) mutation disrupts TBP-DNA and TBP self-dimerization, V161E (dark grey) disrupts TBP-DNA interactions and interactions with an unidentified TBP repressor, K145E (gold) disrupts TBP-Mot1 interactions and may also affect TBP-TFIIA interactions, and F182V (red) disrupts TBP-NC2 interactions. Each of these mutations is represented individually and in all possible double and triple combinations. Additional mutations (in purple), which disrupt TBP-TFIIA interactions, were also tested in some strains. As controls a wild type and null (where position 1 is mutated to a stop codon) TBP were tested.

Table 4.1. Test TBP mutations disrupt a variety of TBP interacting factors.

Test TBP Mutation     Interaction disrupted         Reference
K145E                 Mot1 and TFIIA                (Cang et al., 1999)
V161E                 DNA                           (Jackson-Fisher et al., 1999)
V161R                 DNA and self dimerization     (Jackson-Fisher et al., 1999)
F182V                 NC2                           (Cang et al., 1999)
E93R                  TFIIA                         (Bryant et al., 1996)
R107E                 TFIIA                         (Bryant et al., 1996)
K138T Y139A           TFIIA                         (Stargell and Struhl, 1995)
Toa2-K138T Y139A      Suppressor of K138T Y139A     (Stargell and Struhl, 1995)
Toa2-WT               Control                       (Stargell and Struhl, 1995)

The series of test TBP plasmids was transformed into the four strains (WT, ΔTAND, spt3Δ, or ΔTAND spt3Δ) and the transformants were utilized for cell growth, genome-wide expression, and genome-wide ChIP assays. For the cell growth assays, the test TBPs were present on a low copy plasmid and plated constitutively on galactose or glucose. However, for the expression and ChIP assays, the plasmid used in the spt3Δ strains was a high copy version in order to increase expression from the galactose promoter. Previous studies have shown that SPT3 is required for high levels of expression from the Gal1/10 promoter (Dudley et al., 1999; Larschan and Winston, 2001). Expression of the test TBP was verified by western blot analysis (Figure 4.2).

Steady state levels of the test TBPs were generally comparable to the level of endogenous wild-type TBP in the cell. Some of the mutants were present at lower levels, most likely due to destabilization of TBP dimers, which has been shown to cause an increase in TBP degradation (Jackson-Fisher et al., 1999). There is little, if any, expression of the test TBP prior to the addition of galactose in the WT and ΔTAND strains. However, in the spt3Δ strains, several test TBPs exhibit leaky expression prior to the addition of galactose. This is not completely unexpected, as Spt3 has been shown previously to repress basal transcription at several promoters, although not for the Gal10 promoter (Belotserkovskaya et al., 2000). This leaky expression seems to vary among the different test TBPs, possibly indicating a feedback loop in which some leaky expression causes an increase of the test TBP, which in turn increases the level of expression from the Gal promoter.

Figure 4.2. The test TBP is expressed upon addition of galactose. Anti-yTBP western blot analysis of samples taken from representative cultures that were used in microarray experiments. The four strains are indicated on the left with the various TBP mutants identified across the top. Samples were taken just before addition of galactose (- lanes) and after 45 minutes (for wild type and ΔTAND strains) or 3 hours (spt3Δ and ΔTAND spt3Δ strains) of induction (+ lanes).

One limitation of using test TBPs in the context of an endogenous wild-type TBP is that effects on gene expression are observed only if the test TBP acts in a dominant negative manner. If the test TBP mutation was recessive to the wild type, no effect was observed.

As both the endogenous and test TBP are present in the cell, they should be in equilibrium with TBP interacting factors, except for the factor whose interaction is disrupted. For example, based on previous in vitro studies, the V161E mutant should interact with the SAGA and TFIID complexes as well as the wild-type TBP does, but be defective in TBP-DNA interactions. The work presented in Chapter 2, which included a subset of the mutants used in this study, demonstrated that effects on gene expression could be detected with this experimental setup (Chitikila et al., 2002).

4.3.2 Perturbing different combinations of interactions (nodes) in the TBP regulatory network has distinct effects on cell growth.

Prior to analyzing the effect on gene expression of the test TBPs in the different strains, I determined their effects on cell growth. Earlier work with a subset of test TBPs in the wild type and ΔTAND strains demonstrated certain combinations were synthetically toxic (Chitikila et al., 2002). By examining effects on cell growth, I was able to quickly gain insight into the relationship between TBP regulators prior to assessing their effects on transcription.

Serial dilutions of the four strains containing the series of test TBP plasmids were performed on glucose (data not shown) and galactose media (Figure 4.3). Growth on glucose, as expected, was unchanged between the different test TBPs. The test TBPs within a strain should all be compared to the Null test TBP, shown in the first row of each panel. The Null test TBP essentially shows the growth rate of each strain, since no extra TBP is expressed. Expression of the different test TBPs causes various effects on cell growth, indicating a complex regulatory pathway. Some mutant combinations cause additive effects, in which the combined mutant is more detrimental to cell growth than each individual mutant. One example of this type of effect is the V161R and F182V single mutants and double mutant combination in the spt3Δ strain. However, other disrupted combinations exhibit synthetic effects, where each individual mutant has no effect but the combination is deleterious. For example, the K145E and F182V single mutants in the ΔTAND strain do not affect growth, but the double mutant (K145E, F182V) in the same strain is toxic. In addition, suppressive effects are observed with some combinations such as the K145E, V161R double mutant, which is severely toxic in the ΔTAND strain but less toxic in the ΔTAND spt3Δ strain.
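The three categories of growth effect described above can be written as a simple decision rule. The following Python sketch is only an illustration under stated assumptions: growth defects are encoded as hypothetical numeric scores relative to the Null test TBP (0 means growth like Null, larger means more toxic), and the tolerance `tol` is an arbitrary choice. Neither the scoring scheme nor the function `classify_interaction` comes from the original study, where plates were assessed by eye.

```python
def classify_interaction(defect_a, defect_b, defect_ab, tol=0.5):
    """Classify the combined effect of two perturbations on cell growth.

    defect_a, defect_b: hypothetical growth-defect scores for each single
    perturbation; defect_ab: score when both are combined. A score at or
    below tol is treated as "no effect" (assumed threshold).
    """
    a_affected = defect_a > tol
    b_affected = defect_b > tol
    worst_single = max(defect_a, defect_b)

    if not a_affected and not b_affected:
        # Neither single perturbation matters, but the combination might.
        return "synthetic" if defect_ab > tol else "no effect"
    if defect_ab > worst_single + tol:
        # The combination is worse than either perturbation alone.
        return "additive"
    if defect_ab < worst_single - tol:
        # One perturbation relieves the defect caused by the other.
        return "suppressive"
    return "similar to single"


# Hypothetical scores chosen to mirror the qualitative examples in the text.
print(classify_interaction(0, 0, 3))  # two silent singles, toxic double
print(classify_interaction(1, 2, 4))  # double worse than either single
print(classify_interaction(4, 0, 1))  # second perturbation relieves toxicity
```

In this encoding, a suppressive effect across strains (for example a test TBP that is toxic in one strain but less so once SPT3 is also deleted) is handled by treating the strain deletion as the second perturbation.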


Figure 4.3. Expression of several TBP mutants causes dominant synthetic toxicity. Growth assays using serial dilutions (undiluted to 10⁻⁵ from left to right) on galactose medium were performed for each test TBP in each strain. The version of the test TBP is indicated on the left and the strain is indicated across the top. The WT strain is shown after 4 days of growth at 30°C while the other three strains are shown after 6 days of growth, also at 30°C. Growth in each strain should be compared to the Null version of the test TBP, which is essentially equivalent to the growth of the strain without any extra TBP.

The growth studies demonstrate that many of the TBP mutants have effects on cell growth, indicating the test TBP is often dominant to the endogenous TBP. Furthermore, the effect on cell growth from disrupting multiple TBP regulators can be additive, synthetic, or suppressive, as detailed above. This complexity indicates a highly intertwined TBP regulatory network in which several different factors play regulatory roles at a given gene.

4.3.3 A portion of the yeast genome is subject to a complex TBP regulatory network.

In order to determine which genes are sensitive to disruptions in the TBP regulatory network, genome-wide expression studies were conducted with the test TBPs in each of the four yeast strains. Spotted arrays, which contained 6226 yeast ORFs, were utilized for two-channel microarray assays in which the reference sample was the wild type strain expressing the null test TBP. Cell growth for all expression arrays was at 30°C with a 45 minute (wild type and ΔTAND strains) or 3 hour (spt3Δ and ΔTAND spt3Δ strains) galactose induction to express the test TBP. Previous work has shown that the TBP delivery complexes SAGA and TFIID are linked to induction and repression, respectively, of Environmental Stress Response (ESR) genes (Chapter 2 and Huisinga and Pugh, 2004). However, in this study I examined the requirements for TBP regulators under non-stress growth conditions, where expression of ESR genes is not expected to be affected. Therefore, I would predict the mode of TBP regulation at these genes could change from what occurs under high stress conditions.

A total of 2982 genes demonstrated significant changes in expression in at least one strain/test TBP combination, as determined by the criteria described in the Materials & Methods. The filtering criteria are highly stringent as very few, if any, genes significantly changed in reference versus reference (homotypic) experiments (Supplemental Table S4-1). Therefore, it is possible that disruption of these TBP regulators affects additional genes, which do not meet our significant change criteria. I hypothesized that by focusing the analysis on genes with the most significant changes, the TBP regulatory network would be more robust, thereby allowing easier visualization of relationships.

4.3.4 Clustering of genes sensitive to TBP regulation reveals a complex TBP regulatory network with many nodes.

To observe the unique gene expression patterns incurred by disrupting different aspects of the TBP regulatory network, cluster analysis was performed. Cluster plots allow for visualization of changes in expression at many genes across multiple conditions. This is particularly helpful in analyzing the relationship between multiple TBP regulators across all 63 experiments in this data set. Since many of the factors examined have both positive and negative effects on transcription, I was curious as to how these opposite modes of regulation intersect. If different TBP regulators show opposite effects, what happens when they are both disrupted? Do opposite effects counteract each other, or does one mode of regulation dominate over the other? A different situation arises when regulators act similarly to activate or repress transcription. Certain regulators could be redundant and their effects only observed when both are disrupted, or they might each play a role, causing an additive effect when both are disrupted. Another possibility is that each regulator is required for correct expression, and disruption of multiple regulators exhibits the same expression pattern as disrupting each individually. These various interactions between TBP regulators are intertwined with the fact that different subsets of genes have different relationships between regulators.

Two types of clustering were applied to the 2903 genes that passed the significance criteria, exhibited a minimum change of 1.5-fold, and had data in 60% of the experiments presented (Figure 4.4A). The arrays (experiments) were clustered hierarchically, which generates nodes indicating the relationship between regulators by placing experiments that are most similar next to each other. Subsequently, K-means clustering was applied to the genes. By varying the value for “K” and examining each version of the clustering, I found that K=13 gave clusters that exhibited unique, non-redundant patterns. However, since three pairs of the clusters had similar, but not identical, overarching expression patterns, these pairs of clusters were each merged to create three clusters, each containing two sub-clusters (labeled a and b).
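The fold-change and coverage filter, followed by K-means clustering of genes, can be sketched as below. This is a minimal Python illustration and not the software used for the thesis analysis. Several points are assumptions: the 1.5-fold cutoff is applied to the largest absolute log2 ratio in any single experiment, 60% is read as the minimum fraction of experiments with data, missing values are set to 0 (no change) before clustering, and centroids are initialized from the first K rows to keep the sketch deterministic. The statistical significance criteria from the Materials & Methods are not reproduced, and the toy values are invented for illustration.

```python
import math


def passes_filter(log2_ratios, min_fold=1.5, min_coverage=0.6):
    """Return True if a gene has enough data and a large enough change.

    log2_ratios holds one log2 expression ratio per experiment, with None
    marking a missing measurement.
    """
    present = [v for v in log2_ratios if v is not None]
    if not present or len(present) / len(log2_ratios) < min_coverage:
        return False
    # Assumption: the cutoff applies to the largest change in any single
    # experiment, in either direction.
    return max(abs(v) for v in present) >= math.log2(min_fold)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=100):
    """Lloyd's algorithm on complete rows; returns one label per row."""
    if k > len(rows):
        raise ValueError("k cannot exceed the number of rows")
    # Assumption: start from the first k rows for reproducibility.
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(row, centroids[c]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Toy log2 ratios across five experiments (hypothetical values).
toy = {
    "geneA": [1.2, 1.0, None, 1.1, 0.9],
    "geneB": [1.0, 1.3, 1.1, None, 1.2],
    "geneC": [-1.1, -0.9, -1.0, -1.2, None],
    "geneD": [-1.3, -1.0, -1.1, -0.8, -1.0],
    "geneE": [0.1, -0.2, 0.0, 0.1, 0.2],    # change too small
    "geneF": [2.0, None, None, None, 1.5],  # data in only 40% of experiments
}
kept = {gene: r for gene, r in toy.items() if passes_filter(r)}
# Assumption: missing values are treated as "no change" for clustering.
filled = [[0.0 if v is None else v for v in r] for r in kept.values()]
labels = dict(zip(kept, kmeans(filled, k=2)))
print(labels)
```

In practice the choice of K would be explored as described in the text, by rerunning the clustering with different values and inspecting the resulting patterns.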


Figure 4.4. Cluster analysis reveals a highly interconnected TBP regulatory network. (A) 2903 genes that showed statistically significant changes in expression in at least one of the 63 conditions examined were grouped into 13 clusters (labeled on the right of each cluster) using the K-means algorithm. Hierarchical clustering was used to arrange the most similar array conditions next to each other. The arrays are labeled across the top by which mutations are present in that experiment. The lack of any indicated mutant means the strain or test TBP is wild type for that interaction. The dashed squares indicate the Null TBP mutant. Each row corresponds to the expression ratio of a single gene where green = decreased expression, red = increased expression, black = no change, and grey = no data. The intensity of the color indicates the magnitude of the change. (B) Increases in transcription correlate with increases in mutant TBP binding as demonstrated by ChIP-chip. Cluster plot of genome-wide chromatin IP data (columns 1-3) compared to selected expression data (columns 4-6). The genes are clustered using the same clusters as in 4.4A, except that all dubious ORFs, as determined by Kellis et al., were excluded (Kellis et al., 2003). The ChIP-chip experiments were done using the differential ChIP-chip method (Zanton and Pugh, 2004) where the reference sample was the wild type strain with the wild type “test TBP” and the test sample is indicated above each column.

The major nodes in the hierarchical clustering lend insight into relationships between TBP regulators. The homotypic control experiments (columns 1 & 2) change very little in expression, cluster together, and indicate the relatively low intrinsic variation of these experiments. The other experiments that exhibit minimal changes in gene expression cluster into three subgroups (columns 3-6, 45-51 and 52-56). These include overexpression of the wild type TBP and the TFIIA-specific TBP mutations. More interestingly, the rest of the TBP regulators examined fall into three major clusters. First, in columns 57-63, are the F182V and K145E single and double mutants in the WT and ΔTAND strains, which show very similar patterns of expression at most genes. This pattern is consistent with the previous evidence that NC2 and Mot1 function at many of the same genes (Andrau et al., 2002; Cang and Prelich, 2002; Dasgupta et al., 2002; Geisberg et al., 2002). Secondly, the concave surface mutants (V161R and V161E) in the WT or ΔTAND strains cluster next to each other and behave similarly at many, but not all, genes (columns 7-18). The V161R mutants form their own subgroup (columns 8-14), presumably due to their specific disruption of TBP dimerization. Thirdly, all of the spt3Δ and ΔTAND spt3Δ strains cluster together in the center of the clustering (columns 19-44). Most of the test TBPs exhibit very similar patterns of expression in these two strains, indicating that the primary factor that regulates these genes is Spt3. However, specific effects from the test TBPs are observed at some genes, as discussed in more detail below.

The K-means clustering of genes illustrates how the regulators examined have gene-specific effects. For the most part, each mode of regulation exhibits both positive and negative effects, which occur at different subsets of genes. How these different factors fit together to regulate TBP is one key to understanding how gene expression is regulated. The major effects on each cluster are outlined in the following paragraphs and a more detailed analysis of the interactions between regulators is undertaken in Section 4.3.7.

Almost all clusters exhibit effects to some degree from loss of Spt3, which emphasizes its importance in transcription regulation. The spt3Δ strains can have two opposing effects, either genes increase expression or genes decrease expression (Figure 4.4A, spt3Δ and ΔTAND spt3Δ strains, columns 19-44). At many of the clusters, Spt3 plays a predominant role, with little effect from the combined disruption of other modes of regulation. However, the interplay between different modes of regulation is seen at some genes (clusters 3, 4, 5, and 9).

Mutations on the concave surface (V161E/R) also exhibit opposite effects (Figure 4.4A, columns 6-18). Several clusters decrease expression in the presence of these mutations, including genes in clusters 5, 6, 8, and 9. Intriguingly, while both of these mutants disrupt DNA binding in vitro, cluster 8 genes decrease expression in the V161R mutant but are minimally affected by the V161E mutant, indicating a functional difference between these two mutants. Genes that increase expression in the presence of the V161R mutant, which disrupts TBP dimerization, define cluster 4. At these genes, dimerization is often dominant to other modes of regulation, illustrated by the fact that almost all combinations containing a test TBP carrying the V161R mutant increase expression.

Disruption of the NC2-TBP (F182V) or Mot1-TBP (K145E) interactions generally causes an increase in gene expression, emphasizing their roles as general TBP repressors. This is observed, to varying degrees, at the genes in clusters 3, 5 and 6 (Figure 4.4A, columns 57-63 in all clusters and columns 19, 22-23 and 35-40 in C3). Furthermore, two clusters, 1 and 8, show a decrease in transcription when TBP-NC2 interactions are disrupted, lending support to a positive function of NC2 at some genes.

4.3.5 Validation of the TBP mutants

As a first step towards interpreting the data in Figure 4.4, I sought to affirm that each TBP mutation was likely to be affecting the targeted interaction. Data in Chapter 2 indicated that V161E predominantly affected DNA binding, and V161R affected both DNA binding and dimerization. By comparing the two, the contribution from dimerization could be deduced. To ascertain whether the F182V mutation was consistent with a loss of NC2 interactions, I compared the changes in expression of the TBP(F182V) mutant with those of a bur6-1 mutant from Cang et al. The bur6-1 allele encodes a temperature-sensitive NC2 subunit. In addition, to assess whether the constitutive presence of endogenous wild type TBP in this study affected the expression profile from F182V, I compared our data set to data from Cang et al., who conducted expression profiling on TBP(F182V) in the absence of endogenous TBP. Strains harboring TBP(F182V) as the only source of TBP are also temperature-sensitive. For purposes of evaluating strong vs. weak correlations, comparisons were also made with galactose-induced TBP(F182V) in a taf1(ΔTAND) strain, which was expected to exhibit a strong correlation, and with a temperature-sensitive allele of TAF5, which was expected to exhibit a weak correlation. As shown in Figure 4.5A, the effects of galactose induction of TBP(F182V) correlated well with the bur6-1 data (second panel) as well as data from strains having TBP(F182V) as the sole source of TBP (third panel). The degree of correlation is remarkable in light of the fact that the Cang et al. studies were conducted after heat-inactivation of temperature-sensitive alleles. Nevertheless, the strong correlations indicate that the effects observed with the TBP(F182V) mutant are likely to be due to disruption of NC2-TBP interactions.
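The comparison between data sets rests on a correlation coefficient computed over log2 changes in gene expression. As a minimal sketch, the Python function below computes a Pearson correlation over genes that have data in both sets. It is an assumption that the reported R is a Pearson coefficient, and the function name `pearson_r` and the toy values are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of paired log2 ratios, skipping missing pairs.

    xs and ys are equal-length lists indexed by gene; None marks a gene
    with no measurement in that data set.
    """
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two genes with data in both sets")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 ratios for four genes in two data sets.
set_1 = [1.0, None, 3.0, 4.0]
set_2 = [2.0, 5.0, 6.0, 8.0]
r = pearson_r(set_1, set_2)
print(round(r, 3))
```

Restricting the calculation to genes present in both data sets matters when comparing arrays from different studies, since each platform may lack data for a different subset of genes.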


Figure 4.5. Validation of the TBP mutants. (A) Correlation scatter plots of genome-wide changes in gene expression of the TBP(F182V) mutant vs. published NC2-TBP interaction defective mutants. Data sets were derived from Figure 4.4, and published data sets were obtained from Cang et al., 2002 and Lee et al., 2000. The far left and right panels represent examples of strong and weak correlations, respectively. All axes represent log2 changes in gene expression. R denotes the correlation coefficient. (B) Correlation scatter plots of genome-wide changes in gene expression between mot1-14 and a number of TBP mutants used in this study. Log2 changes in gene expression were derived from Figure 4.4, clusters 3, 5, 8, and 9. The mot1-14 data set was from Dasgupta et al., 2002. (C) Dendrogram of TBP mutants. Dendrogram branches for selected mutants from Figure 4.4 were collected and redrawn.

Next, I analyzed K145E. The K145E mutation impairs TBP interactions with Mot1 and TFIIA in vitro. This mutant showed little change in gene expression unless placed in the context of other mutations such as taf1(ΔTAND). To assess whether TBP(K145E) was primarily defective in Mot1 or TFIIA interactions in vivo, two comparisons were made. First, I compared the genome-wide expression profile of TBP(K145E) in a taf1(ΔTAND) strain to a mot1-14 mutant for those genes residing in clusters where K145E was having an impact. As shown in Figure 4.5B, changes in gene expression in a strain in which TBP(K145E) was induced with galactose had a significantly higher correlation to changes occurring upon temperature-inactivation of mot1-14 (first panel) than did a temperature-sensitive taf5-9 mutant (second panel). In comparison, mot1-14 did not correlate as well with galactose-induced TBP(null) (third panel) or TFIIA-defective TBP(E93R) (fourth panel). In the second comparison, selected hierarchical relationships from Figure 4.4 were redrawn in Figure 4.5C, where TBP mutants that are expected to have altered interactions with TFIIA or NC2 were compared to the K145E mutation. These include E93R and R107E, which lie at the crystallographic TFIIA-TBP interface, the N2-1 mutant, which has been previously described as defective in TFIIA interactions, and Toa2-TBP, which is a fusion of the Toa2 subunit of TFIIA to the amino-terminal end of TBP. These mutants clustered together away from K145E. In fact, K145E clustered closer to F182V, which is consistent with the known functional linkage between Mot1 and NC2. Taken together, these data affirm that the K145E mutation alters the functional interactions between TBP and Mot1 more than those with TFIIA.


4.3.6 Changes in gene expression reflect changes in genome-wide occupancy of Test TBP

Although significant changes in gene expression were observed, I did not know whether these changes were due to direct or indirect effects of our test TBPs. In an attempt to minimize the indirect effects, the test TBP was expressed for the minimum amount of time necessary to obtain test TBP protein levels comparable to endogenous TBP levels. Other studies have shown that changes in gene expression are generally reflective of changes in total TBP binding (Kuras and Struhl, 1999; Li et al., 1999), but that does not address whether the test TBPs were actually responsible for the observed changes in expression. Therefore, as a representative example I examined the binding of the F182V mutant in the ΔTAND strain using a genome-wide Chromatin Immuno-Precipitation (ChIP-chip) assay. The differential ChIP-chip approach was utilized, where the binding of the F182V test TBP in ΔTAND was compared to the binding of WT test TBP in the wild type strain using yeast intergenic arrays (Zanton and Pugh, 2004). I used the wild type test TBP as a reference since it is presumably in equilibrium with the endogenous TBP and very few changes in gene expression are observed when it is expressed. As controls I immunoprecipitated the Null and WT test TBPs in the wild type strain. Figure 4.4B shows the visualization of the ratios of the genome-wide ChIP side-by-side with the expression experiments, arranged in the same clustering pattern as the expression data in Figure 4.4A. By examining the ChIP and expression data together, one can see that increases in expression correlate with increased binding of the F182V test TBP. The highest binding of the F182V mutant occurs at the genes in clusters 3 and 5, which are the same genes that increase most in expression. This is a strong case for a direct effect of the test TBP up-regulating expression. An increase in binding is not observed for the control experiments with the WT or Null test TBPs.
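One simple way to set ChIP and expression data side by side is to average each measurement within the gene clusters. The Python sketch below illustrates this summary on invented values. It is not the analysis performed for Figure 4.4B, which was a visual comparison of cluster plots; the function `mean_by_cluster`, the cluster names, and all numbers are hypothetical.

```python
from collections import defaultdict


def mean_by_cluster(cluster_of, log2_ratios):
    """Average log2 ratios within each cluster, skipping missing values.

    cluster_of maps gene -> cluster label; log2_ratios maps gene -> value
    (None marks missing data). Genes without a cluster are ignored.
    """
    totals = defaultdict(lambda: [0.0, 0])  # label -> [sum, count]
    for gene, value in log2_ratios.items():
        if value is None or gene not in cluster_of:
            continue
        entry = totals[cluster_of[gene]]
        entry[0] += value
        entry[1] += 1
    return {label: total / count for label, (total, count) in totals.items()}


# Hypothetical cluster membership and log2 ratios (not thesis data).
cluster_of = {"g1": "X", "g2": "X", "g3": "Y", "g4": "Y"}
expression = {"g1": 1.0, "g2": 2.0, "g3": 0.0, "g4": None}
occupancy = {"g1": 0.5, "g2": 1.5, "g3": 0.0, "g4": 0.25}

expression_means = mean_by_cluster(cluster_of, expression)
occupancy_means = mean_by_cluster(cluster_of, occupancy)
for label in sorted(expression_means):
    print(label, expression_means[label], occupancy_means[label])
```

Clusters whose mean expression and mean occupancy rise together would be the candidates for a direct effect of the test TBP.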

4.3.7 Combinations of TBP mutants elucidate the complex relationships between TBP regulators.

In order to more easily examine the relationship between different combinations of TBP regulators, specific subsets of experiments were examined separately, keeping the genes in the same 10 clusters defined in Figure 4.4A (see Figure 4.6 for subgroups). Three different subsets of experiments were examined: the NC2 and concave surface mutants (Figure 4.6A), Mot1 and the concave surface mutants (Figure 4.6B), and NC2 and Mot1 (Figure 4.6C), each in the presence or absence of Spt3 and TAND. A variety of relationships between TBP regulators are observed and each cluster shows varying dependency on different modes of regulation. Table 4.2 summarizes the positive and negative roles of each factor by cluster. An analysis of the relationships between modes of TBP regulation, as illustrated by different example clusters, is undertaken in the following paragraphs.


Table 4.2. Summary of effects observed upon disruption of modes of TBP regulation

Mode of TBP regulation: NC2; Mot1; Spt3; TAND; Concave surface interactions (DNA binding; TBP dimerization; Other)

Cluster   Effects
1         + - - - -
2         - - -
3         ------+
4         - - + + - - - -
5         - - - - - + + - - + + +
6         - - - + + + + + +
7         + + + + + -
8         + + + + + + + +
9         - + + +
10        NA NA NA NA NA NA NA

A “+” indicates a positive role in transcription at those genes and a “-” indicates an inhibitory role. The number of pluses and minuses are meant to indicate the relative strength of each interaction when compared to other modes of regulation at a cluster of genes. Some modes of regulation are only detected when other modes have been compromised. Regulation for cluster 10 was not determined. More information about these genes is presented in section 4.3.9.


Figure 4.6. Clustering of subsets of experiments indicates a complex relationship between TBP regulators. (A-C) Combinations of TBP mutants elucidate new and complex relationships between TBP regulators. The same K-means clustering of the 2903 genes shown in Figure 4.4 was utilized, but each subset of experiments was clustered hierarchically based upon the experiments in that sub-group.


The genes that are sensitive to TBP-DNA interactions partially overlap with the genes positively regulated by Spt3. Genes in clusters 5b, 6 and 8 decrease expression when either mode of regulation is disrupted, indicating a dependence upon both interactions. However, genes in cluster 9 are primarily dependent upon TBP-DNA interactions, which are counteracted by an inhibitory role for Spt3. This is illustrated by the V161R and V161E mutants in the spt3Δ strain, which no longer cause a decrease in gene expression (see Figure 4.6B columns 12 & 13 vs. columns 9 & 10 as an example). Furthermore, this differs from cluster 7 genes, which are primarily dependent upon Spt3, and to a slight extent TAND, to stimulate expression.

When the Mot1 and NC2 disruption mutants are compared across most strains, a very similar pattern of expression is observed (Figure 4.6C). The K145E mutant, which disrupts Mot1 interactions with TBP, shows minimal changes in expression in the wild type strain, but does affect expression when TAND or Spt3 are removed. Furthermore, the double mutant (columns 7, 10, & 13), which is disrupted for both interactions, looks very similar to each single mutant (columns 5 & 6, 11 & 12, and 14 & 15). This argues against any redundancy between these regulatory factors. Instead, it appears that genes that are repressed by NC2 and Mot1 generally require both regulators for correct expression.

NC2 and Mot1 can counteract the stimulatory role of Spt3 as observed with the pattern of expression in clusters 5 and 6. These genes increase in expression when Mot1 and NC2 regulation are disrupted, and decrease expression upon loss of Spt3. However, the reliance upon Mot1 and NC2 inhibition varies between these two groups of genes, with genes in cluster 5 being more dependent upon an inhibitory role for Mot1 and NC2.

NC2 and Mot1 inhibit expression of genes in clusters 3 and 5, and to a lesser extent the genes in clusters 4 and 6. These genes are also stimulated by TBP-DNA interactions (as demonstrated with the V161E mutants), although this mode of regulation at cluster 4 genes is overshadowed by repression involving TBP dimerization, which is discussed in more detail below. Combining mutations that compromise TBP-DNA interactions with the NC2 or Mot1 mutants results in distinct effects on gene expression. These effects can be rationalized by considering the different mechanisms by which NC2 and Mot1 repress transcription. NC2 inhibition is less effective when combined with mutants compromised for DNA binding, as illustrated by the combination of the F182V and V161E mutants in clusters 3, 5, and 6 (Figure 4.6A, column 16 vs. column 2). This is consistent with NC2 functioning to repress transcription by binding to the TBP-DNA binary complex and inhibiting further incorporation of other PIC components. Presumably, if TBP doesn’t bind well to DNA, a disruption of NC2 repression is irrelevant.

A different effect is observed when the K145E mutant, which disrupts Mot1 repression, is combined with TBP mutants compromised for DNA binding. While the K145E mutant has minimal effects in the wild type strain, it does affect gene expression when Spt3 or TAND is removed. Therefore, the effects of combining the K145E mutant with the concave surface mutants (V161E/R) can be examined in the spt3Δ and ΔTAND strains. Genes in cluster 5 decrease in expression when the concave surface mutants are expressed (for one example see Figure 4.6B, column 11), but this decrease is suppressed upon the addition of the K145E mutant (column 14). This combinatorial effect indicates that the requirement for strong TBP-DNA interactions is not as critical if Mot1 interactions are weakened. As Mot1 dissociates DNA bound TBP, disruption of Mot1 interactions may indirectly stabilize weakened TBP-DNA interactions by decreasing their vulnerability to dissociation from DNA via Mot1. This ultimately results in the TBP-DNA mutants being able to activate transcription in the context of the K145E mutant that disrupts Mot1 interactions.

The V161R mutant has previously been shown to disrupt repression of lowly expressed, chromatin repressed, subtelomeric genes, presumably due to a defect in TBP dimerization (Chapter 2 and Chitikila et al., 2002). Cluster 4 genes are strongly derepressed in the V161R mutant. This derepression is counteracted by a stimulatory role for Spt3 at these genes. The dependence upon Spt3 varies between cluster 4a and 4b, with 4b genes being less dependent upon stimulation by Spt3. Disruptions in other modes of TBP regulation, including NC2, Mot1 and TAND, enhance the effect of V161R at cluster 4 genes. Genes in cluster 4 are some of the few genes that exhibit an actual redundancy between inhibitory factors. This is observed with the K145E, V161E test TBP in the ΔTAND strain (Figure 4.6B, column 14), which increases expression, while none of the separate mutant combinations increases expression (Figure 4.6B, columns 2-4, 11, 13, & 18).


In conclusion, each mode of TBP regulation has diverse, gene specific effects on expression, illustrated by the fact that their effects can be very different from one gene to another. This gene specificity is also true of relationships between the TBP regulators, in that certain modes of regulation dominate at some genes while different modes dominate at other genes. It is important to keep in mind that the relationships observed between regulators are specific to these growth conditions and that the balance between regulators may shift under different conditions.

4.3.8 Clusters of co-regulated genes have distinct intrinsic properties and exhibit relationships to additional transcriptional regulators.

To better understand why certain modes of TBP regulation were preferentially utilized at different subsets of genes, I examined other genome-wide information about the genes’ regulation. These comparisons facilitate a broader understanding of the relationship between TBP regulators. I compared the overlap of genes in each cluster with other “groups” of genes and generated p-values indicating the significance of the overlap between groups. This included groups defined by gene-specific properties, such as the presence of a TATA-box or belonging to the Environmental Stress Response (ESR) set, and groups of the most positively or negatively regulated genes (as defined by the top or bottom 10% of the expression distributions) of publicly available genome-wide expression and ChIP data. It is important to keep in mind that when the distribution of the entire data set is increased or decreased (recall the taf1ts2 spt3Δ double mutant in Chapter 3), the extremes in the distribution may actually be the genes which are “independent” of that factor. Comparisons were made to several hundred different categories, of which several key relationships are presented in Tables 4.3-4.6. The complete comparison is available in Table S4-2.
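The overlap comparison described above can be sketched as follows. The statistical test used to generate the p-values is not stated in this section, so the upper-tail hypergeometric test shown here is an assumption (it is a common choice for gene set overlap), and the function names and example gene sets are hypothetical.

```python
from math import comb

def top_fraction(values, fraction=0.10, highest=True):
    """Return the genes in the top (or bottom) fraction of a {gene: value} mapping."""
    ranked = sorted(values, key=values.get, reverse=highest)
    n_keep = max(1, round(len(ranked) * fraction))
    return set(ranked[:n_keep])

def overlap_p_value(cluster, group, universe_size):
    """Probability of an overlap at least as large as observed, by chance.

    Assumes genes are drawn without replacement from a universe of
    `universe_size` genes (upper-tail hypergeometric test).
    """
    observed = len(cluster & group)
    n_cluster, n_group = len(cluster), len(group)
    largest_possible = min(n_cluster, n_group)
    tail = sum(
        comb(n_group, i) * comb(universe_size - n_group, n_cluster - i)
        for i in range(observed, largest_possible + 1)
    )
    return tail / comb(universe_size, n_cluster)

# Hypothetical example: a 3-gene cluster and a 4-gene group in a 10-gene universe.
cluster = {"a", "b", "c"}
group = {"a", "b", "d", "e"}
p = overlap_p_value(cluster, group, universe_size=10)
```

This tail only measures over-representation. Under-representation, which is also reported in the tables, would use the lower tail (an overlap at most as large as observed).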

One of the strongest relationships is the over-representation of TATA boxes in genes repressed by Mot1 and the NC2 complex (Table 4.3; clusters 3, 5, and 6; p = 4 x 10^-10, p = 6 x 10^-46, and p = 1 x 10^-46, respectively). However, these are not the only clusters enriched for TATA containing genes. Cluster 4, which is derepressed in TBP dimerization defective mutants, and cluster 9, which is dependent upon TBP-DNA interactions for expression, are also enriched for TATA-containing genes. When examined as a whole, the genes included in this TBP regulatory network are over-represented for TATA boxes, whereas genes unaffected upon disruption of these TBP regulators (i.e. genes which do not show significant changes in gene expression) tend to be TATA-less (p = 5 x 10^-53) and TFIID-dominated (p = 1 x 10^-46). This observation is consistent with previous analysis showing that many of the factors examined in this study, including NC2, Mot1, and Spt3, function primarily at TATA containing, SAGA-dominated genes (Huisinga and Pugh, 2004). Additionally, while the p-values are not as significant, some of the clusters in the network are enriched for TATA-less genes. Intriguingly, clusters 1 and 8, which decrease expression upon loss of NC2 interactions, are under-represented for TATA boxes (with p-values of 4 x 10^-7 and 3 x 10^-3, respectively).

Table 4.3. Presence of a TATA box and regulation by SAGA and TFIID at each gene cluster.

                    % TATA                % SAGA                % TATA-less &
                    containing            dominated             TFIID dominated
Item                in group    P-value   in group    P-value   in group          P-value
Cluster 1           6%          4E-07     5%          1E-02     80%               4E-02
Cluster 2           14%         6E-02     4%          2E-03     76%               2E-01
Cluster 3           36%         4E-10     21%         4E-10     51%               5E-04
Cluster 4           32%         7E-13     10%         4E-01     50%               7E-06
Cluster 5           48%         6E-46     33%         7E-54     34%               3E-16
Cluster 6           48%         1E-46     30%         6E-44     37%               4E-14
Cluster 7           22%         4E-02     16%         2E-04     60%               5E-02
Cluster 8           11%         3E-03     5%          2E-02     76%               9E-02
Cluster 9           32%         7E-05     13%         1E-01     54%               5E-02
Cluster 10          11%         2E-02     8%          4E-01     74%               5E-01
Unclustered genes   10%         5E-53     4%          1E-46     77%               1E-21
Genome-wide         19%                   9%                    71%

SAGA & TFIID sets of genes are defined in (Huisinga and Pugh, 2004) and TATA box containing genes are defined in (Basehoar et al., 2004). Clusters less than the genome-wide average that have a significant p-value are underrepresented for regulation, with the opposite the case for clusters higher than the genome-wide average.

The relationship between previously defined SAGA and TFIID dominated genes and these clusters sheds light onto the possible role of Spt3 at these genes under non-environmental stress conditions (recall that the SAGA/TFIID classification was done under heat stress conditions (Huisinga and Pugh, 2004)). Cluster 2, which is TFIID biased (4% SAGA; p = 2 x 10^-3), is actually repressed by Spt3 under these growth conditions (see columns 19-44 in Figure 4.3A for Spt3 effects). However, another TFIID biased group (5% SAGA; p = 2 x 10^-2), cluster 8, is actually dependent upon Spt3 for stimulation of expression. One explanation for this difference might be a difference in the expression requirement of these two groups of genes. Cluster 8 genes are very highly expressed under these growth conditions (Table 4.4; see below), which may require both TFIID and SAGA (Spt3) for correct expression. Furthermore, three clusters which were previously defined as SAGA dominated (clusters 3, 5, and 6) exhibit varying dependencies upon Spt3 under these conditions. Cluster 3 is slightly repressed by Spt3, whereas cluster 6 is positively regulated by Spt3. Another difference between these three SAGA-dominated groups is their regulation by TAND. Cluster 5 genes, in particular cluster 5b, are negatively regulated by TAND (columns 12-18 and 51 in Figure 4.3A), while cluster 3 is unaffected and cluster 6 is only very slightly affected by removal of TAND. This differential regulation helps define smaller sub-groups within the larger class of SAGA-dominated genes based upon the role of Spt3 under various conditions and their coordinated regulation with components of TFIID.

I hypothesized that the gene expression level might vary between the different clusters if different modes of TBP regulation are used to regulate activated versus basal gene expression. Studies presented in Chapter 2 indicated that this might be the case by demonstrating a link between lowly expressed genes and repression by TBP dimerization, and modulation of highly expressed genes by NC2 (Chitikila et al., 2002). To examine gene expression levels, both the transcription rate and the steady state mRNA levels were analyzed for each cluster. As an approximation of the steady state mRNA level, the percent rank of the median mRNA signal intensity in a wild type strain for each cluster was calculated. The relative transcription rate of genes in each cluster was investigated by comparing them to the most and least transcribed yeast genes as previously calculated (Holstege et al., 1998). The genes in cluster 4, which are up-regulated in the V161R dimer-defective mutants, are very lowly expressed when measured by either transcription rate or steady state level. This is consistent with observations presented in Chapter 2 linking TBP dimerization to lowly expressed genes. I also analyzed the expression level of the genes that are relatively insensitive to defects in the TBP regulatory network (i.e. the genes that don’t change expression in any of these experiments, which are referred to as “Unclustered genes”). These genes are lowly transcribed and correspondingly have lower steady state mRNA levels when compared to the genome as a whole. This suggests that their lack of ongoing transcription is what renders them insensitive to disruptions in the TBP regulatory network. If these genes were examined under conditions where they were actively engaged in transcription, the effects of disrupting the TBP regulatory network might be different.
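The "percent rank of the median signal intensity" can be illustrated with a short sketch. The exact ranking convention used for Table 4.4 is not given here, so the midpoint handling of ties is an assumption, and the function name and example intensities are hypothetical.

```python
from bisect import bisect_left, bisect_right
from statistics import median

def percent_rank_of_median(cluster_intensities, genome_intensities):
    """Percent rank, within the whole genome, of a cluster's median intensity.

    Ties are handled with a midpoint rank, so the genome-wide median of a
    distribution ranks at 50%.
    """
    ordered = sorted(genome_intensities)
    cluster_median = median(cluster_intensities)
    below = bisect_left(ordered, cluster_median)         # values strictly lower
    at_or_below = bisect_right(ordered, cluster_median)  # values lower or equal
    return 100.0 * (below + at_or_below) / (2 * len(ordered))

# Hypothetical genome of 100 genes with intensities 1..100.
genome = list(range(1, 101))
high_cluster = [90, 95, 100]
rank = percent_rank_of_median(high_cluster, genome)
```

Under this convention a cluster whose median sits at the genome-wide median ranks at 50%, which corresponds to the genome-wide row of Table 4.4.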


Table 4.4. Transcription rate and steady state signal intensity of clusters

                    mRNA/hr                   % Rank of Median    Intensity at 30˚C
Item                Top 10 %    Bottom 10 %   Signal Intensity    Top 10 %    Bottom 10 %
Cluster 1           1E-01       1E-01         41%                 3E-06       1E-02
Cluster 2           5E-01       1E-01         60%                 3E-01       2E-03
Cluster 3           2E-02       5E-02         64%                 4E-01       4E-04
Cluster 4           1E-07       2E-89         17%                 3E-08       4E-29
Cluster 5           7E-01       7E-04         63%                 3E-01       1E-05
Cluster 6           7E-03       7E-03         80%                 4E-33       2E-06
Cluster 7           6E-02       8E-11         44%                 5E-01       1E-01
Cluster 8           2E-12       5E-01         73%                 2E-08       1E-04
Cluster 9           6E-22       8E-01         84%                 2E-21       3E-04
Cluster 10          6E-02       2E-12         52%                 3E-01       5E-02
Unclustered genes   8E-04       2E-84         45%                 4E-08       2E-14
Genome-wide                                   50%

Italicized p-values indicate the cluster is under represented, as opposed to over represented, in either the top or bottom 10%. Transcription rate (mRNA/hour) is calculated from Holstege et al. (Holstege et al., 1998), and signal intensity is from Chapter 2 data (Chitikila et al., 2002).

In order to further dissect how additional modes of gene regulation relate to the TBP regulatory network under investigation, I analyzed the relationship between the genes in each cluster and previously published genome-wide expression data from disruptions of other transcriptional regulators (Tables 4.5 and 4.6; complete data set in Supplemental Table S4-2). As expected, the clusters which were most up-regulated (clusters 3, 5, & 6) or down-regulated (clusters 1 & 8) by the K145E and F182V TBP mutants (Figure 4.4A, columns 57-63) showed significant overlap with previously published data from bur6 and mot1 mutants, reinforcing the specificity of these mutations to disrupt TBP’s interactions with Mot1 and NC2. Relationships of a few of the more interesting gene clusters are analyzed in more detail below.


Table 4.5. Selected Relationships of gene clusters to defined groups.

Item         Category                                      P-value   Ref. #
Cluster 1    ESR down-regulated                            4E-79     G9-1
             CER down-regulated                            9E-49     G12-1
             Essential gene (as of 03-03-03)               2E-16     G58
Cluster 2    No relationships of relevance with p < 1E-05
Cluster 3    15min Heat Shock Induced (10%)                5E-17     G67
             Heat Shock Induced and SAGA Dominated         9E-13     G69
             Heat Shock Induced and TFIID Dominated        3E-10     G70
             CER up-regulated                              1E-06     G12-2
Cluster 4    Sporulation temporal class; middle            4E-33     G10-5
             HAST domain                                   6E-31     G32
             HZAD (H2A.Z active domain)                    2E-18     G51
             Bottom 10%ile in Dist of ATG from telomere    5E-12     P3
Cluster 5    ESR Induced and SAGA Dominated                8E-48     G65
             ESR up-regulated                              7E-35     G9-2
             CER up-regulated                              1E-25     G12-2
             SER - Msn2.4 acid regulated                   1E-21     G14
             2-fold down in spt20Δ                         8E-07     G40
Cluster 6    15min Heat Shock Induced (10%)                6E-25     G67
             ESR up-regulated                              6E-23     G9-2
             Cell Cycle; M/G1 phase                        1E-19     G19-4
             Heat Shock Induced and SAGA Dominated         1E-17     G69
             SER - Msn2.4 acid regulated                   2E-17     G14
             CER up-regulated                              6E-17     G12-2
             ESR Induced and SAGA Dominated                4E-15     G65
             Heat Shock Induced and TFIID Dominated        6E-15     G70
             2-fold down in spt20Δ                         1E-14     G40
             ESR Induced and TFIID Dominated               4E-10     G66
Cluster 7    HAST domain                                   1E-13     G32
             2-fold down in taf14                          3E-09     G42
             2-fold down in spt20Δ                         6E-07     G40
             2-fold down in tbp-E186D                      4E-06     G50
             HZAD (H2A.Z active domain)                    2E-05     G51
Cluster 8    Rap1 ChIP; Ribosomal protein                  2E-36     G37-2
             15min Heat Shock Repressed (10%)              5E-06     G68
             2-fold down in taf7-ts1                       2E-05     G47
             2-fold down in taf8-ts7                       3E-05     G46
             2-fold down in taf3-ts2                       4E-05     G44
             ESR down-regulated                            5E-05     G9-1
Cluster 9    Rap1 ChIP; Ribosomal protein                  2E-09     G37-2
             15min Heat Shock Repressed (10%)              2E-06     G68
             ESR down-regulated                            1E-05     G9-1
             CER down-regulated                            1E-05     G12-1
Cluster 10   Chromosome #12                                4E-210    G7-12
             Chromosome #11                                1E-90     G7-11

Data presented in this table do not include all relationships, but are meant to highlight representative examples. The complete analysis is in Supplemental Table S4-2, along with reference details on each group.


Table 4.6. Selected Relationships of gene clusters to other transcription regulators (Clusters 1-4).

TOP 10% BOTTOM 10% Cluster Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. # Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. #

Cluster 1 Sporulation H4 Δ(2-26) 8E-50 E325 Sporulation; 0.5 hr 1E-60 E188 Phosphate ntd80Δ; mid 8E-50 E195 Low-Pi vs High-Pi 9E-132 E314 Sporulation; 2 hr 2E-21 E189 PHO81c-1; exp1 8E-19 E318 ntd80Δ; early 4E-06 E194 pho85Δ 4E-42 E317 SWI/SNF Diauxic Shift snf2Δ; YPD 1E-12 E308 15 hrs, OD=1.8 1E-62 E183 swi1Δ; YPD 6E-15 E310 17 hrs, OD=3.7 2E-55 E184 RP genes 19 hrs, OD=6.9 4E-29 E185 rpl8a 4E-64 R140 21 hrs, OD=17.2 2E-36 E186 rpl27a (**4) 4E-27 R137 bur6-1 5E-16 E233 rpl12a 5E-25 R135 rpb1-1 1E-17 E291 rps24a (**9) 2E-20 R141 sir2Δ 3E-11 E329 rpl6b 7E-10 R139 swi2(K798A) 2E-07 E230

Cluster 2 Mediator Gcn5; (ORF) 4E-07 C171 Diauxic shift med6-ts 2E-10 E290 Rpb1; (IGR) 9E-06 C166 15 hrs, OD=1.8 2E-06 E183 srb4-138 ts 5E-13 E287 17 hrs, OD=3.7 3E-06 E184 RSC Complex rsc3-2; expA 2E-20 E320 rsc3-2; Avg 5E-12 E320/1 SAGA Complex spt3Δ 5E-28 E257 spt3(E240K) 8E-14 E256 spt8Δ 8E-14 E401 taf1-ts2 spt3Δ 1E-22 E259

Cluster 3 SAGA spt3(E240K) 1E-41 E256 spt3Δ 1E-30 E257

Mot1 mot1-1 6E-20 E231 mot1-14 4E-16 E232 NC2 bur6-1 4E-06 E233 spt15 (F182V) 2E-06 E234 SAGA & TFIID taf1-ts2 spt3 (E240K) 3E-17 E264 taf1-ts2 spt3Δ 2E-10 E259 taf1-ts2 gcn5Δ 1E-09 E269 TFIID taf1-ts2 5E-12 E265 bdf1Δ 8E-11 E282 srb4-138 ts 3E-09 E287 swi2(K798A) 6E-08 E230

Cluster 4 Histones Histones Evolution Histones H3 Δ(1-28) 4E-89 E324 H3; (IGR) 2E-23 C150 Evolved strain 2 1E-14 E177v2 H3 Ac-K18; (IGR) 4E-06 C179 H4 Δ(2-26) 1E-23 E325 H3; (ORF) 6E-14 C154 Evolved strain 3 9E-10 E178v2 H3 Ac-K9, 14; (ORF) 4E-08 C117 GAL-HH4 depletion 1E-69 E210 H2B; (IGR) 2E-13 C151 Diauxic Shift H3 Ac-K9,14; (IGR) 7E-13 C167 HDACs H4 Ac-K16; (IGR) 2E-08 C176 9 hrs, OD=0.14 5E-16 E180 H3 Me2-K4; (IGR) 2E-28 C120 hda1Δ; exp 2 (Rosetta) 9E-26 E217 H3 Ac-K9, 14; hda1Δ (ORF) 8E-24 C113 11 hrs, OD=0.46 9E-14 E181 H3 Me2-K4; (IGR) 2E-22 C122 hda1Δ; exp 1 (Rosetta) 4E-26 E218 H3 Ac-K9, 14; hda1Δ (IGR) 6E-20 C114 Sporulation H3 Me2-K4; (ORF) 2E-58 C119 sir2Δ 1E-19 E329 5 hr 9E-15 E190 H4 Ac-K5, 8, 12, 16; (IGR) 3E-21 C121 Histone associated factors 7 hr 7E-18 E191 H4 Ac-K5, 8, 12, 16; (ORF) 2E-10 C118 tup1Δ 7E-41 E215 9 hr 5E-20 E192 H4 Ac-K5, 8, 12, 16; rpd3Δ (ORF) 1E-29 C115 ssn6Δ 4E-34 R165 11.5 hr 6E-28 E193 Rpd3 in swi4Δ; (IGR) 8E-32 C165 set1Δ 3E-60 E226 ntd80Δ; early 2E-31 E194 RNA polymerase II isw1, isw2 6E-30 R089 HDAC Rpb1; (IGR) 1E-07 C166 isw1 1E-18 R088 hos2Δ 4E-11 E219 Rpb1; (ORF) 3E-07 C173 mot1-14 2E-22 E232 rpd3Δ; exp 2 2E-09 E221 bur6-1 8E-14 E233 rpd3Δ; exp 1 8E-34 E222 rpb1-1 8E-12 E291 sin3Δ; exp 1(Rosetta) 1E-20 E223 med2_ 2E-13 E332 SAGA med2_ 1E-08 E333 spt3(E240K) 2E-23 E256 abf1-1(ts) 2E-29 E398 spt3Δ 5E-29 E257 upb8Δ 1E-08 E398 bdf2Δ 2E-20 E284 SWI/SNF snf2Δ; minimal media 2E-13 E309 swi1Δ; YPD 3E-12 E310 swi1Δ; minimal media 7E-08 E311 Phosphate Low-Pi vs High-Pi 4E-16 E312 pho80Δ 3E-09 E316 Histones htz1Δ 3E-07 E326 htz1Δ hmrΔ 8E-07 E328


Table 4.6 continued. (Clusters 5-6).

TOP 10% BOTTOM 10% Cluster Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. # Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. #

Cluster 5 Diauxic shift Bdf1 (37˚C); (IGR) 3E-27 C147 Mediator 19 hrs, OD=6.9 1.75193E-29 E185 TBP (37˚C); (IGR) 2E-10 C143 med2_ 2E-12 E333 21 hrs, OD=17.2 3E-18 E186 Mot1 (37˚C); (IGR) 2E-10 C146 RSC HDAC complexes rsc30Δ; expA 8E-11 E322 hda1Δ; exp 1 (Rosetta) 7E-12 E218 rpd3Δ; exp 2 5E-15 E221 sin3Δ; exp 1(Rosetta) 9E-08 E223 Histones H4 Δ(2-26) 2E-18 E325 htz1Δ 2E-24 E326 htz1Δ sir2Δ 4E-17 E327 htz1Δ hmrΔ 1E-10 E328 NC2 bur6-1 5E-44 E233 spt15 (F182V) 8E-38 E234 Mot1 mot1-14 4E-37 E232 mot1-1 1E-35 E231 SAGA gcn5(KQL) 4E-16 E228 gcn5Δ 8E-06 E229 spt3(E240K) 3E-10 E256 TFIID taf1-ts2 2E-50 E265 taf9-ts2 2E-18 E273 taf10-ts1 1E-06 E274 taf12-23ts 2E-11 E275 bdf1Δ 4E-26 E282 SAGA & TFIID taf1-ts2 gcn5(KQL) 2E-24 E267 taf1-ts2 gcn5Δ 2E-40 E263 taf1-ts2 spt3 (E240K) 2E-14 E264 srb10-3 8E-17 E289 rpb1-1 5E-12 E291

Cluster 6 Diauxic Shift Bdf1 (37˚C); (IGR) 3E-12 C147 crt1Δ; log phase 8E-13 E209 Histones 15 hrs, OD=1.8 2E-44 E183 Mot1 (37˚C); (IGR) 2E-06 C146 rap1-17 2E-11 E211 H3 Ac-K9,14; (IGR) 3E-11 C167 17 hrs, OD=3.7 2E-34 E184 swi2(K798A) 3E-10 E230 H3; (IGR) 5E-08 C150 19 hrs, OD=6.9 2E-49 E185 SAGA H3; (IGR) 4E-05 C152 21 hrs, OD=17.2 2E-39 E186 spt3(E240K) 2E-17 E256 TFIID spt3Δ 6E-32 E257 taf1-ts2 3E-12 E268 spt8Δ 2E-21 E401 taf1-ts2 (GCN5 pRS314) 1E-23 E265 SAGA & TFIID bdf1Δ 4E-09 E282 taf1-ts2 spt3Δ 4E-23 E259 taf10-ts1 5E-05 E274 Mediator taf12-23ts 4E-10 E275 srb5Δ1 4E-08 E288 taf6-19 3E-11 E272 med6-ts 9E-06 E290 taf9-ts2 4E-17 E273 med2_ 2E-35 E332 SAGA & TFIID med2_ 7E-30 E333 taf1-ts2 gcn5(KQL) 1E-05 E267 RSC taf1-ts2 gcn5Δ 4E-27 E269 rsc3-2; expA 9E-12 E320 NC2 rsc3-2; Avg 1E-22 E320/1 bur6-1 1E-09 E233 rsc3-2; expB 4E-24 E321 spt15 (F182V) 2E-06 E234 rsc30Δ; expA 5E-06 E322 Mot1 mot1-1 2E-05 E231 mot1-14 9E-06 E232 SAGA gcn5(KQL) 6E-11 E228 gcn5Δ 1E-10 E229 Histones H4 Δ(2-26) 3E-13 E325 htz1Δ 2E-20 E326 htz1Δ hmrΔ 3E-15 E328 htz1Δ sir2Δ 3E-10 E327 HDAC complexes hda1Δ; exp 1 (Rosetta) 6E-06 E218 hda1Δ; exp 2 (Rosetta) 4E-08 E217 rpd3Δ; exp 1 5E-06 E222 rpd3Δ; exp 2 1E-12 E221 sin3Δ; exp 1(Rosetta) 2E-09 E223 Phosphorylation Low-Pi vs High-Pi 4E-16 E314 pho80Δ 3E-09 E316 PHO81c-1; exp2 2E-14 E319 pho85Δ 2E-05 E317 SWI/SNF snf2Δ; minimal media 2E-17 E309 snf2Δ; YPD 4E-05 E308 swi1Δ; minimal media 8E-33 E311 swi1Δ; YPD 9E-10 E310 Sporulation 11.5 hr 6E-14 E193 9 hr 2E-10 E192 7 hr 7E-09 E191 5 hr 3E-07 E190 srb10-3 3E-10 E289


Table 4.6 continued. (Clusters 7-9).

TOP 10% BOTTOM 10% Cluster Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. # Expression Experiment P-value Ref. # ChIP Experiment P-value Ref. #

Cluster 7 Histone associated factors H3 Ac-K9, 14; hda1Δ (ORF) 6E-07 C113 HDAC complexes Histones set1Δ 8E-16 E226 hos2Δ 5E-10 E219 H4 Ac-K5, 8, 12, 16; rpd3Δ (ORF) 9E-06 C115 tup1 Δ (haploid) 3E-17 R184 rpd3Δ; exp 2 1E-13 E221 H3 Ac-K9,14; (IGR) 9E-07 C167 ssn6 Δ (haploid) 3E-11 R165 rpd3Δ; exp 1 3E-12 E222 H3 Ac-K18; (IGR) 2E-06 C179 isw1Δ, isw2Δ 3E-10 R089 sin3Δ; exp 1(Rosetta) 2E-08 E223 Histones sap30Δ 5E-06 E225 GAL-HH4 depletion 1E-11 E210 sir2Δ 3E-12 E329 H3 Δ(1-28) 2E-05 E324 SAGA SWI/SNF spt3(E240K) 9E-27 E256 swi1Δ; YPD 2E-10 E310 spt3Δ 2E-28 E257 swi1Δ; minimal media 4E-09 E311 spt8Δ 5E-06 E401 snf2Δ; minimal media 1E-08 E309 SAGA & TFIID snf2Δ; YPD 5E-08 E308 taf1-ts2 spt3Δ 1E-06 E259 abf1-1(ts) 6E-10 E398 taf1-ts2 spt3 (E240K) 9E-06 E264 HDAC complexes TFIID hda1Δ 3E-11 R076 taf5 ts9-12 9E-11 E271 ubp8Δ 5E-11 R185 taf12-23ts 7E-06 E275 bdf1Δ 3E-10 E282 Mediator srb5Δ1 8E-09 E288 med6-ts 7E-08 E290 fcp1-1 7E-11 E292 Histones htz1Δ 7E-07 E326 htz1Δ sir2Δ 1E-35 E327 htz1Δ hmrΔ 5E-06 E328

Cluster 8 Evolved strain 3 9E-20 E178v2 H3 Ac-K9, 14; (ORF) 9E-11 C117 mot1-14 1E-27 E232 Hst1in sum1Δ; (IGR) 1E-14 C162 RSC NC2 Rpd3 in swi4Δ; (IGR) 2E-32 C165 rsc30Δ; Avg 1E-09 E322/3 bur6-1 1E-17 E233 rsc30Δ; expB 1E-09 E323 spt15 (F182V) 6E-25 E234 rsc30Δ; expA 2E-06 E322 GAL-HH4 depletion 4E-16 E210 SAGA gcn5(KQL) 1E-14 E228 spt8Δ 4E-06 E401 SAGA & TFIID taf1-ts2 spt3Δ 6E-11 E259 taf1-ts2 gcn5Δ 4E-07 E263 taf1-ts2 spt3 (E240K) 9E-06 E264 taf1-ts2 gcn5(KQL) 5E-10 E267 TFIID taf1-ts2 (GCN5 pRS314) 3E-14 E265 taf1-ts2 (SPT3 pRS314) 3E-07 E266 taf6-19 7E-06 E272 taf9-ts2 7E-08 E273 bdf1Δ 2E-06 E282 tfa1-21 4E-06 E285 kin28-ts3 8E-07 E286 Mediator srb4-138 ts 3E-12 E287 med6-ts 8E-07 E290 fcp1-1 3E-09 E292 RSC rsc3-2; expA 3E-08 E320 rsc3-2; Avg 2E-11 E320/1 rsc3-2; expB 3E-08 E321

Cluster 9 Evolution Rpb1; (ORF) 2E-12 C173 GAL-HH4 depletion 9E-34 E210 Histones Evolved strain 1 1.45038E-16 E176v2 Rap1; (ORF) 1E-09 C126 Diauxic Shift H3; (IGR) 3E-10 C150 Evolved strain 2 3.22461E-14 E177v2 Histones 19 hrs, OD=6.9 2E-06 E185 H2B; (IGR) 5E-08 C151 Evolved strain 3 3.36097E-34 E178v2 H3 (37˚C); (ORF) 2E-10 C157 21 hrs, OD=17.2 7E-09 E186 H3; (ORF) 2E-11 C154 HDAC H3 Ac-K9, 14; (ORF) 2E-08 C117 mot1-14 2E-08 E232 Rpd3 in swi4Δ; (IGR) 2E-17 C165 hda1Δ 4E-14 E216 H3 Ac-K18; (ORF) 5E-07 C190 TFIID hos2Δ 3E-16 E219 H3 Ac-K9; (IGR) 2E-06 C177 taf1-ts2 gcn5Δ 1E-06 E263 hos3Δ 6E-13 E220 H3 Ac-K27; (ORF) 4E-05 C192 taf1-ts2 (GCN5 pRS314) 4E-08 E265 Rpd-Sin3 repression taf1-ts2 (SPT3 pRS314) 5E-06 E265/6 sap30Δ 3E-16 E225 taf10 chimera97 2E-10 E279 ume6Δ 3E-11 E208 RSC RSC rsc3-2; expA 6E-11 E320 rsc30Δ; expB 1E-12 E323 rsc3-2; Avg 4E-10 E320/1 rsc30Δ; expA 2E-09 E322 rsc3-2; expB 2E-09 E321 rsc30Δ; Avg 7E-08 E322/3 med2_ 6E-09 E332 abf1-1(ts) 8E-06 E398


Cluster 4 genes are over-represented for being located close to the telomere (p = 5 x 10^-12, Table 4.5), which is consistent with our previous work that identified repression of gene expression through TBP dimerization as being important for subtelomeric genes (Chapter 2 and Chitikila et al., 2002). Analysis of overlap with other transcription regulators paints a picture of subtelomeric gene repression through “chromatin mechanisms” which require H3 and H4 histone tails, the Hda1 and Sir2 histone deacetylases, the Tup1/Ssn6 chromatin binding complex, and Isw1p, which is a member of the diverse ISW class of chromatin remodeling complexes. An overlap is also seen with the presence of histones H3 (in both ORF and intergenic regions), H2B, and, interestingly, H4 that is acetylated at K16. However, these genes generally contain the lowest presence of acetylated H3 (at K18, K9, and K14), methylated H3 (K4), and H4 that is recognized by an antibody against tetra-acetylated H4 (K5, 8, 12, and 16). This appears to implicate a different role for H4 K16 acetylation relative to H3 acetylation and methylation and H4 acetylation at other residues. This differential effect of H4 K16 acetylation is consistent with other studies, which have found a specific function for H4 AcK16 relative to other H4 modifications (Dion et al., 2005; Kurdistani et al., 2004). Another relationship, which is rather unexpected, is a link to positive regulation by Bdf2 (p = 2 x 10^-20). Previous experiments had suggested a role for Bdf1 in antisilencing at heterochromatic regions (Ladurner et al., 2003), but my analysis doesn’t indicate a significant relationship between cluster 4 genes and positive regulation by Bdf1 (p = 4 x 10^-1). The same study that tied Bdf1 to antisilencing saw very few genes that changed expression more than 2-fold upon deletion of BDF2 (Ladurner et al., 2003). Therefore, while the absolute change in expression in a bdf2Δ strain may not be dramatic, the genes most positively regulated by Bdf2 are generally lowly expressed genes that are repressed by a variety of chromatin mechanisms and TBP dimerization. Closer examination of this cluster of genes may further elucidate Bdf2’s role in gene regulation.

While many clusters are regulated by several counteracting positive and negative factors, cluster 8 is positively regulated by many factors, including TFIID, Mediator, and SAGA. Mot1, NC2, and histone H4 also play positive roles. One regulatory factor that is inhibitory is the RSC subunit Rsc30, which is consistent with the analysis in Chapter 3 that implicated Rsc30 in negative regulation of TFIID dominated genes. The requirement for several different co-activator complexes may be due to the high level of transcription from cluster 8 genes. This group of genes is over-represented for ribosomal protein genes (see below), which have previously been shown to depend upon TFIID for expression (Mencia et al., 2002). While our analysis is consistent with TFIID playing an important role at cluster 8 genes, additional positive regulators also appear to stimulate expression at these genes.

I have presented a few examples in an attempt to illustrate how additional transcription regulatory factors function at different clusters of genes defined by their sensitivities to disruption of the TBP regulatory network. This type of analysis is expected to be helpful in elucidating a more detailed picture of gene regulation at distinct subsets of genes.


4.3.9 Clusters are enriched for genes with distinct functional properties and sequence motifs.

To further investigate why certain modes of TBP regulation target distinct subsets of genes, each cluster was analyzed for enrichment of GO and MIPS classifications. I found that some of the clusters are highly enriched for genes with similar functions, although almost all clusters also contained small groups of functionally related genes that were enriched with p-values ranging from p = 1 x 10^-2 to 1 x 10^-5. Here I focus on a few clusters with very low p-values, but the complete analysis is available in Supplemental Table S4-3.

Cluster 8 is enriched for genes involved in protein synthesis (MIPS Functional classification “Protein Synthesis” p < 1 x 10^-14). This includes many genes that code for the protein components of both the large and small ribosome subunits. Cluster 9 is enriched for genes involved in respiration (MIPS Functional classification “respiration” p = 9 x 10^-13) and for protein products localized to the mitochondria (MIPS subcellular localization “mitochondrial inner membrane” p = 1 x 10^-13). However, there is some overlap of gene functions between these two clusters with respiration genes occurring in cluster 8 and ribosomal protein genes in cluster 9.

Another group of interest is cluster 1, which is enriched for genes involved in rRNA transcription and processing (MIPS Functional classification “rRNA transcription” and “rRNA processing”, p = 2 x 10^-7 and p = 6 x 10^-7, respectively). It is intriguing that perturbations in the TBP regulatory network, which decreased expression of the Ribosomal Protein (RP) genes in cluster 8, generally have the opposite effect on the genes that transcribe and process the RNA component of the ribosome, represented by cluster 1. Cluster 1 genes show a significant overlap with genes that increase expression upon deletion of selected RP genes (Table 4.6). Therefore, I suspect that the increased expression of cluster 1 genes may be an indirect effect from decreased expression of the RP genes in clusters 8 & 9. However, the biological motive for up-regulating genes involved in rRNA processing (cluster 1) when Ribosomal Protein genes decrease expression (cluster 8/9) is not immediately obvious.

I also analyzed each cluster for enrichment of upstream regulatory sequences (besides the TATA box, which is discussed above) by comparing to the sequences previously identified by Kellis et al. (Supplemental Table S4-2) (Kellis et al., 2003). I found that cluster 1 was highly enriched for genes that contain an Environmental Stress Response 1 (p = 9 x 10^-49) or 2 element (p = 7 x 10^-8). This is consistent with the fact that these genes are down regulated during an environmental stress response (Table 4.5; p = 4 x 10^-79).

While certain modes of TBP regulation may be tied to a gene’s function or promoter sequence, these relationships may not directly determine the mode of TBP regulation utilized at a promoter. These associations might have more to do with the level of transcription occurring and the presence of activators or repressors under our growth conditions. For example, clusters 8 and 9, which are enriched for ribosomal protein genes, are highly transcribed under our conditions and therefore may be more sensitive to disrupting modes of regulation required for this high level of transcription, such as TBP-DNA interactions and the presence of Spt3. If these genes were examined under high-stress conditions, in which they are not as highly transcribed, presumably they would not be as sensitive to these modes of TBP regulation. The promoter sequences and functional categories identified are part of the larger picture of how these genes are regulated, but most likely are not the only factor which determines the mode of TBP regulation at these genes.

4.3.10 Chromosomal duplications can occur when multiple TBP regulators are disrupted.

While examining each cluster of genes for relationships, I noticed enrichment in cluster

10 for genes on chromosomes XI and XII (Table 4.5). This was unusual since I did not expect any direct link between regulation of TBP and a specific chromosome(s). The genes in cluster 10 are up-regulated mainly in two experiments, the K145E V161R or

V161R F182V test TBPs in the ΔTAND spt3Δ strain. This chromosomal specific enrichment of genes was indicative of chromosomal duplications. To further analyze this possibility, I examined the average log ratio for each chromosome. Initially, I examined the data used in the analysis, which is the average of at least two replicate experiments. I would expect that the average log ratio for a given chromosome would be zero if there is no duplication. None of the test TBPs in either the wild type or ΔTAND strains exhibited any chromosome-specific up-regulation of gene expression (Figure 4.7A & B).

However, a few test TBPs in the spt3Δ and ΔTAND spt3Δ strains showed chromosome-specific increases in expression (data not shown). Since each of the experiments was the average of at least two independent cultures from different transformants, I separated the data by replicates and examined each replicate separately in the spt3Δ and ΔTAND spt3Δ strains (Figure 4.7C & D). While the average log ratio for most chromosomes was near zero, a few cultures, for both the spt3Δ and ΔTAND spt3Δ strains, exhibited chromosome-specific increases. These include one replicate for the K145E V161R F182V triple mutant, one for the V161R F182V double mutant, and one for the K145R single mutant in the spt3Δ strain, which showed increases in expression from genes on chromosome XI. In addition, the K145E mutant was slightly elevated on chromosome II. In the ΔTAND spt3Δ strain, one replicate of the K145E V161R mutant and both replicates of the V161R F182V mutant had increased expression from chromosomes XI and XII, while one replicate of the K145E V161R F182V mutant had increased expression from chromosomes V and XII. Since all of the test TBP plasmids were transformed into either the spt3Δ or ΔTAND spt3Δ strain at the same time, this indicated that the chromosomal duplications must have arisen after transformation of the plasmid. If they had occurred prior to transformation of the test TBP plasmids, I would expect to see the same duplications in all transformants.
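The per-chromosome check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for Figure 4.7: the function names, the toy gene identifiers, and the 0.5 log2 cutoff are all hypothetical, and the gene-to-chromosome map is assumed to be supplied by the user.

```python
from collections import defaultdict
from statistics import mean


def mean_log_ratio_by_chromosome(log_ratios, gene_to_chrom):
    """Average the log2 ratios of all measured genes on each chromosome.

    log_ratios: dict of gene -> log2(test/reference) for ONE replicate.
    gene_to_chrom: dict of gene -> chromosome label (assumed input).
    """
    per_chrom = defaultdict(list)
    for gene, ratio in log_ratios.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is not None and ratio is not None:
            per_chrom[chrom].append(ratio)
    return {chrom: mean(values) for chrom, values in per_chrom.items()}


def flag_candidate_duplications(chrom_means, threshold=0.5):
    """Return chromosomes whose mean log2 ratio is at or above the cutoff.

    The cutoff is a hypothetical choice; without a duplication the
    chromosome-wide mean is expected to be near zero.
    """
    return sorted(c for c, m in chrom_means.items() if m >= threshold)


# Toy data (hypothetical gene names and values).
gene_to_chrom = {"geneA": "XI", "geneB": "XI", "geneC": "IV", "geneD": "IV"}
replicate = {"geneA": 0.9, "geneB": 1.1, "geneC": 0.1, "geneD": -0.1}
chrom_means = mean_log_ratio_by_chromosome(replicate, gene_to_chrom)
print(flag_candidate_duplications(chrom_means))
```

Running the check on each replicate separately, rather than on the replicate average, mirrors the reasoning above: a duplication present in only one transformant is diluted when replicates are averaged.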

Figure 4.7. Chromosomal duplications can occur upon disruption of the TBP regulatory network. (A-D) Average Log2 Ratio by chromosome for each strain. A) Wild Type strain with duplicate experiments averaged together. B) ΔTAND strain with duplicate experiments averaged together. C) spt3Δ strain with each duplicate experiment shown separately. D) ΔTAND spt3Δ strain with each duplicate experiment shown separately.

This analysis indicates that chromosomal duplications occur with an increased frequency when certain modes of TBP regulation are disrupted. Duplication appears to be more likely in the spt3Δ strains, but loss of Spt3 alone is probably not sufficient, indicating a requirement for disruption of multiple TBP regulators. For at least some of the test TBP–strain combinations, a chromosome-specific duplication appears to be reproducible. However, it is unlikely that these duplications dramatically alter the relationships between TBP regulators outlined in previous sections, as only cluster 10 genes exhibit a strong chromosomal bias (Figure 4.8). The other clusters are driven by expression patterns of multiple test TBP and strain combinations, of which only a few had chromosomal duplications. Presumably, the genes in cluster 10 passed our stringent filtering criteria because both replicates of the ΔTAND spt3Δ strain with the V161R F182V test TBP had duplications of the same chromosomes. Likely, the disruption of multiple TBP regulators caused misexpression of genes important for regulation of chromosomal integrity, which resulted in the duplications.

Figure 4.8. Genes in Cluster 10 are over-represented on chromosomes XI and XII. Percentage of genes in clusters from each chromosome. Each cluster should be compared to the genome-wide distribution (All genes). Cluster 10 is extremely over-represented for genes from chromosomes XI and XII while the other clusters are generally unbiased.
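The comparison summarized in Figure 4.8 amounts to tabulating, for each cluster, the percentage of its genes that lie on each chromosome and setting that against the genome-wide distribution. A minimal sketch follows, assuming a user-supplied gene-to-chromosome map; the function name and toy gene identifiers are hypothetical.

```python
from collections import Counter


def chromosome_percentages(genes, gene_to_chrom):
    """Percentage of the given genes located on each chromosome."""
    counts = Counter(gene_to_chrom[g] for g in genes if g in gene_to_chrom)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {chrom: 100.0 * n / total for chrom, n in counts.items()}


# Toy data (hypothetical): compare one cluster to all genes.
gene_to_chrom = {"a": "XI", "b": "XI", "c": "XII", "d": "IV"}
cluster = ["a", "b", "c"]
print(chromosome_percentages(cluster, gene_to_chrom))
print(chromosome_percentages(gene_to_chrom.keys(), gene_to_chrom))
```

A cluster whose percentages depart strongly from the "All genes" row, as cluster 10 does for chromosomes XI and XII, is a candidate for aneuploidy-driven co-expression.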

The duplications appear to be biased to chromosomes XI and XII. The reason for this is unclear. Earlier microarray expression studies demonstrated that sometimes duplications arise on a chromosome that carries a gene homologous to the disrupted one (Hughes et al., 2000). However, this does not appear to be the explanation in this case, as none of the regulators examined in this study (TBP, SPT3, TAF1, NC2, or MOT1) are located on chromosome XI or XII. Given the multitude of players in transcriptional regulation, there could potentially be another component of the assembly complex located on chromosome XI or XII whose up-regulation may be the cause of the chromosomal bias.

In conclusion, this result reinforces the importance of evaluating microarray expression data for possible changes in expression due to aneuploidy, which have the potential to result in artificial correlations between experiments, prior to making conclusions about the data. Furthermore, it demonstrates the utility of using expression data to evaluate possible aneuploidy.

4.4 Discussion

4.4.1 Modes of TBP regulation function coordinately, with certain regulation dominating at subsets of genes.

The study presented here illuminates the highly interconnected TBP regulatory network. Rather than TBP regulators functioning equivalently at all genes or selectively at non-overlapping sets of genes, they appear to take on net negative or positive roles depending upon many factors, including whether TFIID or SAGA regulation predominates, the presence of a TATA box, and which mode of regulation dominates that group of genes. From this analysis, TBP regulation appears to be very gene specific, with subsets of genes exhibiting similar modes of regulation, allowing them to be grouped together. However, between the different groups of genes the relative importance of each mode of regulation varies. In fact, many of the TBP regulators examined in this study are bifunctional, playing both positive and negative roles at different subsets of genes. The observed transcriptional output at any gene is determined by the balance of positive and negative regulators at that gene. With the multiple disruption experiments, I observe that certain regulators tend to counteract each other at multiple clusters of genes. Nevertheless, the various gene clusters have different sensitivities to disruption of each regulator. This is the case with cluster 5 and 6 genes, which are positively regulated by Spt3 and TBP-DNA interactions and inhibited by NC2 and Mot1 to different extents. For many genes, regulation is not black and white but a gradient between positive and negative regulators that determines transcriptional output.

This gradient is further affected by the growth conditions and the requirement for each gene product. When the relative roles of SAGA and TFIID, as presented in Chapter 3, are compared to the modes analyzed here, it appears that the differences in growth conditions are a factor in determining the mode of regulation at a given gene. This is illustrated by the genes in cluster 8, which under heat stress are TFIID-dominated (i.e., require TAF1 for expression). However, under the growth conditions used in these experiments (30°C), Spt3 is also required for their expression. Therefore, Spt3, which is normally used for the induction of ESR up-regulated genes, may play a different role at some stress response genes when the cell is not under environmental stress. The idea that growth conditions regulate gene expression is hardly novel, but this variable is often overlooked when analyzing how the basic transcription machinery functions. In fact, an earlier study of NC2 showed that its properties are altered under different growth conditions (Creton et al., 2002).

4.4.2 A portion of the yeast genome is sensitive to a highly interconnected but minimally redundant TBP regulatory network.

From the relationships observed in the clustering analysis, it appears that the TBP regulators analyzed in this study generally have intertwined relationships rather than redundant relationships. Multiple TBP regulatory factors generally have an additive stimulatory or repressive effect, or counteract each other by balancing positive and negative regulation. I uncovered very few, if any, genes with a clear-cut redundancy of multiple factors. One possible exception is the cluster 4 genes, which are derepressed in the ΔTAND strain with the K145E, V161E double mutant. This increase in expression is only observed when all three interactions (Mot1, TAND and DNA binding) are disrupted, but does not occur in any of the single mutants or double mutant combinations.

Under the growth conditions examined, about one-half of the yeast genome is sensitive to the disruption of these TBP regulators. Why is only a portion of the genome sensitive to this TBP regulatory network? Several factors may impact this “insensitivity.” One trivial explanation is that these genes change slightly in expression, but do not meet our stringent criteria for being “significantly changed”. However, this is probably only the case for a small percentage of the insensitive genes. Another option is that the modes of TBP regulation examined play a role at these genes, but not in a manner that is detectable in the context of the endogenous TBP. A third possibility is that other TBP regulators that are not examined in this study play a key role at these “unchanged” genes, even possibly a novel mode of TBP regulation. These may be the reasons that some genes are excluded from the TBP regulatory network; however, the most likely explanation appears to be linked to the expression state of these genes. The insensitive genes are transcribed at a relatively low rate, which may minimize any impact of disruptions in the TBP regulatory network. If the genes generally are not expressed under the conditions used in this study, one would not expect to detect any effect if a positive regulator was disrupted. However, if their lack of expression was due to inhibition of transcription by the regulators disrupted in the study, one might expect to see an increase in expression of these genes. Since this is not what occurs, it would suggest that the inhibitory mechanisms examined in this study are not responsible for the low level of transcription of these genes. Additionally, the relationship analysis showed that these genes are generally TATA-less and biased toward regulation by TFIID. Several of the factors examined in this study, including Mot1 and NC2, are linked to regulation of TATA-containing, SAGA-dominated genes.

4.4.3 Clusters from genome-wide expression analysis characterize key relationships between TBP regulators.

Certain modes of TBP regulation are often linked, and this study reinforces several of the previously defined relationships, as well as refining them by adding more detail. Mot1 and the NC2 complex repress genes that contain TATA boxes, and all the gene clusters that are strongly repressed by Mot1 and/or NC2 show a high percentage of TATA-containing promoters. Additionally, I show that Mot1 and NC2 target similar sets of genes, which is consistent with earlier observations that negative regulation by these TBP regulators is highly interdependent. This prompts the question: what is the mechanistic relationship between Mot1 and the NC2 complex? Does one factor target the other to these promoters? Since the Mot1 mutant (K145E) has minimal effects on gene expression in the wild type strain, it is difficult to accurately assess their relationship in this context. However, in the three other strains (ΔTAND, spt3Δ and ΔTAND spt3Δ) the double mutant (K145E, F182V) closely mimics the expression pattern of each single mutant, which supports a model that requires both Mot1 and NC2 for repression.

This study also supports a role for Spt3 at promoters that varies with growth conditions. Genes in several of these clusters were previously defined as SAGA-dominated under heat stress conditions, due to their sensitivity to loss of Spt3 and insensitivity to the loss of Taf1. Other genes were classified as TFIID-dominated based upon their decreased expression upon inactivation of Taf1. However, under the non-stress conditions used in these experiments, the dependence upon a positive role for Spt3 at SAGA-dominated genes varies. Cluster 3 genes, some of which presumably depend upon Spt3 for induction under heat stress, are slightly repressed by Spt3 under non-stress conditions. The expression pattern of cluster 2 indicates that under non-stress conditions, some TFIID-dominated genes are repressed by Spt3. On the other hand, the highly expressed genes in cluster 8, while classified as TFIID-dominated, also require Spt3 for wild type levels of expression under these growth conditions. These examples all illustrate how Spt3’s role can change depending upon the expression state of a gene.

The loss of Spt3 is often dominant to disruption of other modes of regulation, implicating Spt3 as a key player in regulation at affected genes. In particular, the ΔTAND spt3Δ strain shows nearly identical expression as the spt3Δ strain alone when the two strains are compared for each version of test TBP (the exception being increased expression due to duplications in the strains discussed earlier). From this analysis, there is little, if any, redundancy or counteracting relationship between Spt3 and the N-terminal domain of TAF1. However, this is not the case with Spt3 and Mot1 or NC2. These factors clearly counteract positively regulated Spt3 genes as illustrated in cluster 5 and to a lesser extent cluster 6.

Another interesting relationship is that the two clusters (1 and 8) that decrease in expression upon disruption of NC2 and/or Mot1 regulation, indicating a positive role for these factors, are the clusters with the fewest TATA-containing genes. In vitro studies have demonstrated a positive role for Mot1 and NC2 at non-TATA promoters (Gilfillan et al., 2005; Gumbs et al., 2003); however, it has not been clearly demonstrated whether Mot1 or NC2 actually delivers TBP to promoters in vivo. Our experiments link positive regulation by NC2 and Mot1 to TATA-less genes but cannot determine if this effect is direct or indirect. One argument for an indirect effect is that as Mot1 and/or NC2 are required for repression of TATA-containing promoters, loss of Mot1 or NC2 repression at these genes shifts the balance of available TBP in a manner that is detrimental to TATA-less genes. Determining if there is a limiting pool of TBP is the first step in addressing this model.

In conclusion, I have used genome-wide expression analysis to evaluate the relationship between six different modes of TBP regulation in detail. By combining the expression patterns over these 63 experiments with additional information about the genes’ function and regulation, the main players in transcriptional regulation at each cluster of genes can be dissected out. Subsets of genes respond differently to the same TBP mutant, demonstrating that the role of a particular TBP regulatory factor may differ from gene to gene. I have shown that certain modes of regulation are linked to each other and that some modes of regulation are dependent upon the growth conditions and expression state of genes. While TBP regulation is fairly complex, even extending beyond the modes examined here, this analysis indicates that certain relationships between TBP regulatory factors are maintained throughout the genome.

4.5 Materials & Methods

Strains and Plasmids

All strains were derived from Y13.2 (MATα ura3-52 trp1Δ-63 leu2,3-112 his3-609 taf145 Δ pYN1-TAF1) (Kokubo et al., 1998). The wild type and ΔTAND strains have been previously described (Chitikila et al., 2002). The spt3Δ and ΔTAND spt3Δ strains were constructed as derivatives of the above strains by replacement of the endogenous SPT3 gene with a PCR-amplified KanR cassette through homologous recombination (Guldener et al., 1996). Single mutant plasmids carrying the galactose-inducible version of TBP on a CEN/ARS plasmid have been previously described or were constructed by site-directed mutagenesis from the wild type (Jackson-Fisher et al., 1999). Double and triple mutant combinations were constructed by standard sub-cloning methods from the single mutant plasmids or by using site-directed mutagenesis when there were no suitable restriction enzyme sites available. The 2µ version of the wild type TBP plasmid was constructed through sub-cloning by exchanging a ScaI/ClaI fragment containing the CEN/ARS replication origin with a ScaI/ClaI fragment containing the 2µ origin found in the pRS425 plasmid. This created p2uLF-yTBP(wt)(Gal10), into which the various TBP mutants were introduced by sub-cloning. All newly created plasmids were verified by sequencing.

Dominant Toxicity Assay

CEN/ARS plasmids expressing various TBP derivatives under control of the GAL10 promoter were transformed into wild-type, ΔTAND, spt3Δ and ΔTAND spt3Δ strains. Transformants were selected on CSM-Leu-Trp + 2% glucose, and subsequently grown in CSM-Leu-Trp + 3% raffinose liquid media. At OD600 = 1.0, ten microliters of washed cells, or serial 10-fold dilutions, were plated onto CSM-Leu-Trp + 2% galactose or glucose (data not shown) agar.

Cell growth for microarrays and Western blot analysis

Transformants carrying the TBP plasmids were grown in CSM-Leu-Trp + 3% raffinose liquid media until they reached an OD600 of 0.65-0.8. Cell aliquots equal to 0.5 OD units were removed (-galactose) and expression of the TBP was induced by addition of galactose to a final concentration of 2%. The wild type and ΔTAND strains contained the CEN/ARS version of the plasmids, which was induced for 45 min, while the spt3Δ and ΔTAND spt3Δ strains carried the 2µ version of the plasmids, which was induced for 3 hours. After induction, a 0.5 OD cell aliquot (+galactose) was removed prior to harvest. Anti-yTBP western blots were performed on the -/+ galactose aliquots to monitor expression of the endogenous and Gal-inducible TBPs.

Microarray Expression Analysis

Two-channel yeast genomic microarrays described previously were utilized (Chitikila et al., 2002). The reference sample for all of the arrays was the wild-type strain with the Null test TBP plasmid whose replication origin and induction time were identical to those used in the test sample. Cells were harvested, RNA extracted, and hybridized to arrays as described previously (Chitikila et al., 2002). Expression analysis of all strain-test TBP combinations was repeated at least twice, incorporating a dye-swap of the test and reference samples, with mRNA isolated from two independent transformants. Data were mode normalized and filtered for significant changes in gene expression as previously described (Chitikila et al., 2002) with the following modifications. The independently derived reference versus reference (i.e., homotypic) set was adjusted to include data from 16 hybridizations performed in cohorts with the experiments in this paper. The requirements for a gene to meet the significant change criteria were: 1) The signal intensity for each channel, as determined by subtracting the background median from the foreground mean signal, was required to be positive and greater than 1 standard deviation of the local background signal. 2) The ratio had to change in the same direction in each replicate. 3) The average log ratio of replicates was at least two standard deviations from the mean ratio for that gene in the homotypic data set. The value used for the standard deviation was the greater of either the gene-specific standard deviation or the pooled (all genes) standard deviation. 4) p-values of the average log ratio when compared to the homotypic data set were <0.005. These criteria were highly stringent and resulted in very few false positives when applied to independent homotypic experiments. The additional requirements of at least 1.5-fold change in expression and the presence of data in 60% of the experiments were applied to the subset of genes in the cluster plot (Figure 4.4A).
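Criteria 1 through 3 above translate directly into a small filter. The sketch below is illustrative only and is not the code used in this study: the function and parameter names are hypothetical, and criterion 4 (the p-value against the homotypic set) is deliberately left out because the statistical test behind it is described in the cited reference rather than here.

```python
from statistics import mean


def spot_passes_intensity(fg_mean, bg_median, bg_sd):
    """Criterion 1: background-subtracted signal must be positive and
    greater than 1 standard deviation of the local background."""
    signal = fg_mean - bg_median
    return signal > 0 and signal > bg_sd


def passes_ratio_criteria(replicate_log_ratios, homotypic_mean,
                          gene_sd, pooled_sd, n_sd=2.0):
    """Criteria 2 and 3 for one gene.

    replicate_log_ratios: log ratios from each replicate experiment.
    homotypic_mean: mean log ratio for this gene in the homotypic set.
    gene_sd / pooled_sd: gene-specific and pooled (all genes) standard
    deviations from the homotypic set; the larger of the two is used.
    """
    if not replicate_log_ratios:
        return False
    # Criterion 2: same direction of change in every replicate.
    same_direction = (all(r > 0 for r in replicate_log_ratios)
                      or all(r < 0 for r in replicate_log_ratios))
    # Criterion 3: average is at least n_sd standard deviations away.
    sd = max(gene_sd, pooled_sd)
    average = mean(replicate_log_ratios)
    far_enough = abs(average - homotypic_mean) >= n_sd * sd
    return same_direction and far_enough


# Toy values (hypothetical).
print(spot_passes_intensity(fg_mean=500, bg_median=100, bg_sd=50))
print(passes_ratio_criteria([0.8, 1.0], homotypic_mean=0.0,
                            gene_sd=0.1, pooled_sd=0.3))
```

Taking the larger of the gene-specific and pooled standard deviations guards against genes whose homotypic variance happens to be underestimated from a small number of hybridizations.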

Genome-wide ChIP analysis

Cell growth for chromatin immunoprecipitation analysis was identical to cell growth for the expression analysis. After the 45 min galactose induction, cultures were immediately cross-linked at 25°C for 2 hr with 1% formaldehyde. Differential ChIP-chip experiments, which used the wild type strain with the wild type Gal-inducible test TBP as the “reference”, were performed on two independent replicates as described with the following modifications (Basehoar et al., 2004). Sonication was performed for 18 sessions and, after removal of an input sample, all of the approximately 1.2 ml of sonicated lysate was incubated with anti-HA antibody. After purification, the enriched DNA was resuspended in 7 µl of 0.1X TE, of which 6 µl was subjected to non-specific amplification. Hybridization to intergenic arrays was performed as described (Basehoar et al., 2004; Zanton and Pugh, 2004). Signal intensities were calculated as for the expression arrays and filtered to remove any spots whose signal in either channel was less than one standard deviation above local background. The data for promoter-containing intergenic regions were normalized by setting the log2 ratio of all non-promoter-containing intergenic regions (tail-to-tail regions) equal to 0.00, followed by averaging the normalized log2 ratios of replicates.
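The normalization step can be illustrated as a shift of each replicate so that the non-promoter (tail-to-tail) regions center on zero, followed by averaging across replicates. This is a sketch under assumptions: the text does not state whether the mean or the median of the tail-to-tail regions defines the zero point, so the median is assumed here, and all names and values are hypothetical.

```python
from statistics import mean, median


def normalize_to_nonpromoter(log2_ratios, nonpromoter_ids):
    """Shift one replicate so non-promoter regions center on 0.

    The median is an assumed choice of center; the source only says
    these regions are set equal to 0.00.
    """
    reference = [log2_ratios[r] for r in nonpromoter_ids if r in log2_ratios]
    if not reference:
        raise ValueError("no non-promoter regions with data")
    offset = median(reference)
    return {region: value - offset for region, value in log2_ratios.items()}


def average_replicates(replicates):
    """Average normalized log2 ratios, region by region, over the
    replicates in which each region has data."""
    regions = set().union(*replicates)
    return {r: mean(rep[r] for rep in replicates if r in rep)
            for r in regions}


# Toy data (hypothetical region names): p1 is promoter-containing,
# t1-t3 are tail-to-tail regions.
tails = {"t1", "t2", "t3"}
rep1 = {"p1": 1.5, "t1": 0.4, "t2": 0.6, "t3": 0.5}
rep2 = {"p1": 0.8, "t1": -0.2, "t2": -0.2, "t3": -0.2}
normalized = [normalize_to_nonpromoter(rep, tails) for rep in (rep1, rep2)]
print(average_replicates(normalized)["p1"])
```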

Relationship analysis

The relationships to the top and bottom 10% of the expression and ChIP distributions (Table 4.6) were calculated in Excel with the data downloaded from the referenced lab or journal websites. The percent rank of the distribution was calculated with the PERCENTRANK function. Then the number of genes that appear in the top 10% (>0.9 in percent rank) or the bottom 10% (<0.1 in percent rank) and appear in each cluster was calculated. This resulted in a distinct number of genes that were observed in both the top 10% (or bottom 10%) and in each cluster, or not observed in both (Observed with and Observed without). These values were then compared to the expected number of observations based upon the number of data points in the distribution, the number of genes in the cluster, and the fact that we examined 10% of the distribution. The CHITEST function of Excel was then used to calculate p-values from the observed and expected values. A similar method was used for the comparison of the groups (Table 4.5) and DNA sequences (supplemental table S4-2) except that the overlap was calculated with the genes designated to belong to the group or not. The relationships to the MIPS and GO functional classifications were calculated using the FUNSPEC website without Bonferroni correction.
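For readers who wish to reproduce this kind of calculation outside of Excel, the steps can be approximated as follows. This is a hedged sketch and not the spreadsheet used here: the percent rank is modeled on the PERCENTRANK definition (fraction of values below a given value, scaled by n - 1), the expected count is simplified to 10% of the cluster genes that have data, and the chi-square p-value assumes one degree of freedom for the two observed/expected categories. All names are hypothetical.

```python
import math
from bisect import bisect_left


def percent_rank(scores):
    """Percent rank of each gene's value within the whole distribution."""
    ordered = sorted(scores.values())
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    return {g: bisect_left(ordered, v) / (n - 1) for g, v in scores.items()}


def top_fraction_enrichment(scores, cluster_genes, fraction=0.10):
    """Compare observed and expected cluster genes in the top fraction.

    Returns (observed_with, expected_with, p_value).
    """
    ranks = percent_rank(scores)
    in_cluster = [g for g in cluster_genes if g in ranks]
    if not in_cluster:
        raise ValueError("no cluster genes with data")
    observed_with = sum(1 for g in in_cluster if ranks[g] > 1 - fraction)
    observed_without = len(in_cluster) - observed_with
    expected_with = fraction * len(in_cluster)  # simplifying assumption
    expected_without = len(in_cluster) - expected_with
    chi2 = ((observed_with - expected_with) ** 2 / expected_with
            + (observed_without - expected_without) ** 2 / expected_without)
    # Upper-tail probability of a chi-square variable with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return observed_with, expected_with, p_value


# Toy distribution (hypothetical): 100 genes with values 0..99.
scores = {f"g{i}": float(i) for i in range(100)}
cluster = [f"g{i}" for i in range(90, 100)]
print(top_fraction_enrichment(scores, cluster))
```

The bottom 10% can be tested the same way by counting genes with a percent rank below the chosen fraction instead of above one minus that fraction.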

4.6 References

Andrau, J. C., Van Oevelen, C. J., Van Teeffelen, H. A., Weil, P. A., Holstege, F. C., and Timmers, H. T. (2002). Mot1p is essential for TBP recruitment to selected promoters during in vivo gene activation. Embo J 21, 5173-5183.

Auble, D. T., and Hahn, S. (1993). An ATP-dependent inhibitor of TBP binding to DNA. Genes Dev 7, 844-856.

Auble, D. T., Hansen, K. E., Mueller, C. G., Lane, W. S., Thorner, J., and Hahn, S. (1994). Mot1, a global repressor of RNA polymerase II transcription, inhibits TBP binding to DNA by an ATP-dependent mechanism. Genes Dev 8, 1920-1934.

Basehoar, A. D., Zanton, S. J., and Pugh, B. F. (2004). Identification and distinct regulation of yeast TATA box-containing genes. Cell 116, 699-709.

Belotserkovskaya, R., Sterner, D. E., Deng, M., Sayre, M. H., Lieberman, P. M., and Berger, S. L. (2000). Inhibition of TATA-binding protein function by SAGA subunits Spt3 and Spt8 at Gcn4-activated promoters. Mol Cell Biol 20, 634-647.

Brown, C. E., Howe, L., Sousa, K., Alley, S. C., Carrozza, M. J., Tan, S., and Workman, J. L. (2001). Recruitment of HAT complexes by direct activator interactions with the ATM-related Tra1 subunit. Science 292, 2333-2337.

Bryant, G. O., Martel, L. S., Burley, S. K., and Berk, A. J. (1996). Radical mutations reveal TATA-box binding protein surfaces required for activated transcription in vivo. Genes Dev 10, 2491-2504.

Cang, Y., Auble, D. T., and Prelich, G. (1999). A new regulatory domain on the TATA-binding protein. Embo J 18, 6662-6671.

Cang, Y., and Prelich, G. (2002). Direct stimulation of transcription by negative cofactor 2 (NC2) through TATA-binding protein (TBP). Proc Natl Acad Sci U S A 99, 12727-12732.

Chasman, D. I., Flaherty, K. M., Sharp, P. A., and Kornberg, R. D. (1993). Crystal structure of yeast TATA-binding protein and model for interaction with DNA. Proc Natl Acad Sci U S A 90, 8174-8178.

Chitikila, C., Huisinga, K. L., Irvin, J. D., Basehoar, A. D., and Pugh, B. F. (2002). Interplay of TBP inhibitors in global transcriptional control. Mol Cell 10, 871-882.

Coleman, R. A., and Pugh, B. F. (1997). Slow dimer dissociation of the TATA binding protein dictates the kinetics of DNA binding. Proc Natl Acad Sci U S A 94, 7221-7226.

Coleman, R. A., Taggart, A. K., Benjamin, L. R., and Pugh, B. F. (1995). Dimerization of the TATA binding protein. J Biol Chem 270, 13842-13849.

Collart, M. A. (1996). The NOT, SPT3, and MOT1 genes functionally interact to regulate transcription at core promoters. Mol Cell Biol 16, 6668-6676.

Creton, S., Svejstrup, J. Q., and Collart, M. A. (2002). The NC2 alpha and beta subunits play different roles in vivo. Genes Dev 16, 3265-3276.

Dasgupta, A., Darst, R. P., Martin, K. J., Afshari, C. A., and Auble, D. T. (2002). Mot1 activates and represses transcription by direct, ATPase-dependent mechanisms. Proc Natl Acad Sci U S A 99, 2666-2671.

Davis, J. L., Kunisawa, R., and Thorner, J. (1992). A presumptive helicase (MOT1 gene product) affects gene expression and is required for viability in the yeast Saccharomyces cerevisiae. Mol Cell Biol 12, 1879-1892.

Dion, M. F., Altschuler, S. J., Wu, L. F., and Rando, O. J. (2005). From the Cover: Genomic characterization reveals a simple histone H4 acetylation code. Proc Natl Acad Sci U S A 102, 5501-5506.

Dudley, A. M., Rougeulle, C., and Winston, F. (1999). The Spt components of SAGA facilitate TBP binding to a promoter at a post-activator-binding step in vivo. Genes Dev 13, 2940-2945.

Eisenmann, D. M., Arndt, K. M., Ricupero, S. L., Rooney, J. W., and Winston, F. (1992). SPT3 interacts with TFIID to allow normal transcription in Saccharomyces cerevisiae. Genes Dev 6, 1319-1331.

Geisberg, J. V., Holstege, F. C., Young, R. A., and Struhl, K. (2001). Yeast NC2 associates with the RNA polymerase II preinitiation complex and selectively affects transcription in vivo. Mol Cell Biol 21, 2736-2742.

Geisberg, J. V., Moqtaderi, Z., Kuras, L., and Struhl, K. (2002). Mot1 associates with transcriptionally active promoters and inhibits association of NC2 in Saccharomyces cerevisiae. Mol Cell Biol 22, 8122-8134.

Gilfillan, S., Stelzer, G., Piaia, E., Hofmann, M. G., and Meisterernst, M. (2005). Efficient binding of NC2.TATA-binding protein to DNA in the absence of TATA. J Biol Chem 280, 6222-6230.

Goppelt, A., and Meisterernst, M. (1996). Characterization of the basal inhibitor of class II transcription NC2 from Saccharomyces cerevisiae. Nucleic Acids Res 24, 4450-4455.

Guldener, U., Heck, S., Fielder, T., Beinhauer, J., and Hegemann, J. H. (1996). A new efficient gene disruption cassette for repeated use in budding yeast. Nucleic Acids Res 24, 2519-2524.

Gumbs, O. H., Campbell, A. M., and Weil, P. A. (2003). High-affinity DNA binding by a Mot1p-TBP complex: implications for TAF-independent transcription. Embo J 22, 3131-3141.

Holstege, F. C., Jennings, E. G., Wyrick, J. J., Lee, T. I., Hengartner, C. J., Green, M. R., Golub, T. R., Lander, E. S., and Young, R. A. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717-728.

Hughes, T. R., Roberts, C. J., Dai, H., Jones, A. R., Meyer, M. R., Slade, D., Burchard, J., Dow, S., Ward, T. R., Kidd, M. J., et al. (2000). Widespread aneuploidy revealed by DNA microarray expression profiling. Nat Genet 25, 333-337.

Huisinga, K. L., and Pugh, B. F. (2004). A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae. Mol Cell 13, 573-585.

Jackson-Fisher, A. J., Chitikila, C., Mitra, M., and Pugh, B. F. (1999). A role for TBP dimerization in preventing unregulated gene expression. Mol Cell 3, 717-727.

Kellis, M., Patterson, N., Endrizzi, M., Birren, B., and Lander, E. S. (2003). Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 241-254.

Kim, J. L., Nikolov, D. B., and Burley, S. K. (1993a). Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520-527.

Kim, T. K., Zhao, Y., Ge, H., Bernstein, R., and Roeder, R. G. (1995). TATA-binding protein residues implicated in a functional interplay between negative cofactor NC2 (Dr1) and general factors TFIIA and TFIIB. J Biol Chem 270, 10976-10981.

Kim, Y., Geiger, J. H., Hahn, S., and Sigler, P. B. (1993b). Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512-520.

Kokubo, T., Swanson, M. J., Nishikawa, J. I., Hinnebusch, A. G., and Nakatani, Y. (1998). The yeast TAF145 inhibitory domain and TFIIA competitively bind to TATA-binding protein. Mol Cell Biol 18, 1003-1012.

Kotani, T., Miyake, T., Tsukihashi, Y., Hinnebusch, A. G., Nakatani, Y., Kawaichi, M., and Kokubo, T. (1998). Identification of highly conserved amino-terminal segments of dTAFII230 and yTAFII145 that are functionally interchangeable for inhibiting TBP-DNA interactions in vitro and in promoting yeast cell growth in vivo. J Biol Chem 273, 32254-32264.

Kuras, L., and Struhl, K. (1999). Binding of TBP to promoters in vivo is stimulated by activators and requires Pol II holoenzyme. Nature 399, 609-613.

Kurdistani, S. K., Tavazoie, S., and Grunstein, M. (2004). Mapping global histone acetylation patterns to gene expression. Cell 117, 721-733.

Ladurner, A. G., Inouye, C., Jain, R., and Tjian, R. (2003). Bromodomains mediate an acetyl-histone encoded antisilencing function at heterochromatin boundaries. Mol Cell 11, 365-376.

Larschan, E., and Winston, F. (2001). The S. cerevisiae SAGA complex functions in vivo as a coactivator for transcriptional activation by Gal4. Genes Dev 15, 1946-1956.

Lee, T. I., Causton, H. C., Holstege, F. C., Shen, W. C., Hannett, N., Jennings, E. G., Winston, F., Green, M. R., and Young, R. A. (2000). Redundant roles for the TFIID and SAGA complexes in global transcription. Nature 405, 701-704.

Lemaire, M., Xie, J., Meisterernst, M., and Collart, M. A. (2000). The NC2 repressor is dispensable in yeast mutated for the Sin4p component of the holoenzyme and plays roles similar to Mot1p in vivo. Mol Microbiol 36, 163-173.

Li, X. Y., Virbasius, A., Zhu, X., and Green, M. R. (1999). Enhancement of TBP binding by activators and general transcription factors. Nature 399, 605-609.

Mencia, M., Moqtaderi, Z., Geisberg, J. V., Kuras, L., and Struhl, K. (2002). Activator-specific recruitment of TFIID and regulation of ribosomal protein genes in yeast. Mol Cell 9, 823-833.

Mencia, M., and Struhl, K. (2001). Region of yeast TAF 130 required for TFIID to associate with promoters. Mol Cell Biol 21, 1145-1154.

Nikolov, D. B., Hu, S. H., Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N. H., Roeder, R. G., and Burley, S. K. (1992). Crystal structure of TFIID TATA-box binding protein. Nature 360, 40-46.

Prelich, G., and Winston, F. (1993). Mutations that suppress the deletion of an upstream activating sequence in yeast: involvement of a protein kinase and histone H3 in repressing transcription in vivo. Genetics 135, 665-676.

Stargell, L. A., and Struhl, K. (1995). The TBP-TFIIA interaction in the response to acidic activators in vivo. Science 269, 75-78.

Sterner, D. E., Grant, P. A., Roberts, S. M., Duggan, L. J., Belotserkovskaya, R., Pacella,

L. A., Winston, F., Workman, J. L., and Berger, S. L. (1999). Functional organization of the yeast SAGA complex: distinct components involved in structural integrity, nucleosome acetylation, and TATA-binding protein interaction. Mol Cell Biol 19, 86-98.

Takahata, S., Ryu, H., Ohtsuki, K., Kasahara, K., Kawaichi, M., and Kokubo, T. (2003). Identification of a novel TATA element-binding protein binding region at the N terminus of the Saccharomyces cerevisiae TAF1 protein. J Biol Chem 278, 45888-45902.

Willy, P. J., Kobayashi, R., and Kadonaga, J. T. (2000). A basal transcription factor that activates or represses transcription. Science 290, 982-985.

Xie, J., Collart, M., Lemaire, M., Stelzer, G., and Meisterernst, M. (2000). A single point mutation in TFIIA suppresses NC2 requirement in vivo. Embo J 19, 672-682.

Zanton, S. J., and Pugh, B. F. (2004). Changes in genomewide occupancy of core transcriptional regulators during heat stress. Proc Natl Acad Sci U S A 101, 16843-16848.

Chapter 5

5 The Big Picture: Dissecting the in vivo role of TBP Regulatory complexes.

5.1 Summary of study

Experiments presented here investigated the global role of several components of the transcription regulatory machinery that interact with and regulate the activity of the TATA binding protein. Genome-wide expression profiling was used as the primary tool to analyze the effects of disruptions in TBP’s regulation on transcript levels, with additional experiments conducted to complement the expression studies. In Chapter 2, three interactions that were proposed to negatively regulate TBP are examined. They include the TAF1 N-terminal domain (TAND), the NC2 complex, and self-dimerization of TBP.

The impact of TBP-DNA interactions on gene expression was also addressed. I found that NC2 modulates highly transcribed genes, which are also generally sensitive to disruption of TBP-DNA interactions. Additionally, I observed that TBP mutants that affect TBP self-dimerization up-regulate lowly expressed subtelomeric genes. In Chapter 3, I present data that examine the relationship between two coactivator complexes that deliver TBP to different target genes. I find that the SAGA and TFIID co-activator complexes make overlapping contributions to the expression of nearly all yeast genes. The TFIID complex plays the predominant role at 90% of the genome while the SAGA complex functions predominantly at ~10% of the genome. SAGA-dominated genes generally contain TATA boxes and are coordinately regulated by a variety of additional factors. TFIID functions primarily at TATA-less genes, which are generally under less regulation. In addition, genes up-regulated upon environmental stress are generally SAGA-dominated while genes down-regulated by stress are primarily regulated by TFIID. Work presented in Chapter 4 expands upon the experiments in Chapter 2. The relationship between six different TBP interactions is examined by combining disruptions of multiple regulators and examining the resulting changes in gene expression. I find that the different modes of TBP regulation function coordinately to regulate gene expression in a highly intertwined TBP regulatory network that includes ~50% of the yeast genome.

NC2 and Mot1 repression targets TATA-containing genes and often counteracts positive regulation by Spt3 and TBP-DNA interactions at these genes. Highly expressed genes require the overlapping but non-redundant stimulatory effects of multiple TBP regulators, including Spt3, TAND, and TBP-DNA interactions. Additionally, several factors, including Spt3, adjust their mode of regulation from positive to negative between different subsets of genes.

5.2 Gene-specific nature of TBP regulatory complexes

One of the key findings presented is that different TBP regulatory interactions target specific genes. Not all regulators act in the same manner at every yeast gene. This is important in and of itself when conceptualizing how gene expression is controlled: one model does not fit every gene. It is also important to keep in mind when scrutinizing studies that focus on the regulation of a single gene. While this type of experiment can be very informative and explain in detail how different regulators function together, the results should not be extrapolated to every gene in the organism. While most likely there is a cohort of genes that are similarly regulated, our studies, as well as many other genome-wide expression studies with transcription regulators, indicate that there is gene-specific regulation by many factors.

So what determines which regulators are important for expression of a gene? The results presented here argue that the presence of a TATA box is one determinant. However, it is not entirely clear how the TATA box preferentially affects regulation by certain factors. Possible mechanisms that may explain this preferential usage are outlined below.

In one model, SAGA and TFIID are equally “recruited” to a promoter via interactions with activators, and the presence of a TATA box at the promoter renders TBP delivery via SAGA more productive, whereas in the absence of a TATA box TBP requires TFIID components for productive interaction with the promoter. In support of this model, there is evidence that TFIID components can facilitate the binding of TBP to TATA-less promoters. However, this model seems to imply that TFIID is somehow less functional at TATA-containing promoters.

Another possibility is that the activator of a gene preferentially targets SAGA or TFIID. This model can be combined with the possibility that the yeast genome has evolved in such a way that TATA-containing genes are under the control of activators that prefer SAGA. Primarily targeting the TBP delivery complex to promoters where it functions best would appear to be the most judicious mechanism. While both SAGA and TFIID have been demonstrated to interact with subsets of activators that correspond to the genes they preferentially activate, a comprehensive test with a larger sampling of activators is necessary to evaluate this model more thoroughly.

It is also possible that there are additional sequences, beyond the TATA box and UAS, that play a role in determining the mode of regulation at different genes. This is the case for higher eukaryotes where additional core promoter sequences have been identified. However, in yeast, additional core promoter elements have not been characterized so far.

Another attractive possibility is that the chromatin structure at promoters favors the preferential usage of certain factors over others. In all likelihood it is a combination of these mechanisms that explains the preferential usage of certain TBP regulators at different genes.

5.3 Interplay between TBP regulators

In addition to the relationship between a positive role for SAGA and the presence of a TATA box, I also found that transcription inhibition by NC2 and Mot1 generally occurs at TATA-containing genes. There appears to be a three-pronged relationship between these factors, which raises the question of what the trigger in this relationship is. Do Mot1 and NC2 target TATA-containing promoters due to the presence of a TATA box? If so, do these genes primarily use SAGA as a coactivator because it is better at counteracting Mot1 and NC2 repression? Or are TATA-containing genes more sensitive to inhibition by Mot1 and NC2 because they primarily use SAGA? It is not clear from the studies conducted so far what the determinant is in the relationship between stimulation by SAGA and inhibition by Mot1 and NC2.

5.4 The next questions

While the work presented in this thesis has helped advance the understanding of how different factors regulate transcription via TBP, there are still many questions to be answered. One question is where TFIIA fits into this network. Its genome-wide effects on expression have not been fully addressed, although experiments presented in Chapter 4 attempted to investigate its role. As genetic and biochemical studies have linked TFIIA to components of both the SAGA/TATA pathway and the TFIID/TATA-less pathway, expanded knowledge of which genes are dependent upon TFIIA would add another level of understanding to the TBP regulatory network.

Another complex that must also fit into the gene regulation network is Mediator. While genome-wide studies have been performed with single Mediator subunits, it would be interesting to combine disruptions of Mediator subunits with disruptions of other components of the TBP regulatory machinery to investigate the relationships between these factors.

While the complexity of the TBP regulatory network may not have been obvious prior to initiating this study, it definitely is now. Given the complexity of gene expression, this should not come as a surprise. However, there are several key relationships, outlined throughout this thesis, that can be observed for TBP regulatory factors. Continued analysis of how different components of the transcription machinery coordinate their functions will be critical to advancing the understanding of gene regulation.

VITA

Kathryn L. Huisinga

Education: PhD in Biochemistry, Microbiology, and Molecular Biology; August 2005
The Pennsylvania State University, University Park, PA
Dissertation Title: “Global regulation of gene expression in Saccharomyces cerevisiae via TATA Binding Protein Regulatory Factors”
Dissertation Advisor: Dr. B. Franklin Pugh

BS with Honors in Biochemistry; May 1997
University of Iowa, Iowa City, IA
Honors Thesis: “A molecular investigation of pairing mediated modulation of gene expression”
Honors Advisor: Dr. Pamela Geyer

Publications: Huisinga KL, Pugh BF. A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae. Molecular Cell. 2004 Feb 27; 13:573-85.

Kou H, Irvin JD, Huisinga KL, Mitra M, Pugh BF. Structural and functional analysis of mutations along the crystallographic dimer interface of the yeast TATA binding protein. Mol Cell Biol. 2003 May; 23:3186-201.

Chitikila C, Huisinga KL, Irvin JD, Basehoar AD, Pugh BF. Interplay of TBP Inhibitors in Global Transcriptional Control. Molecular Cell. 2002 Oct; 10:1-20.

Chen JL, Huisinga KL, Viering MM, Ou SA, Wu CT, Geyer PK. Enhancer action in trans is permitted throughout the Drosophila genome. Proc Natl Acad Sci U S A. 2002 Mar 19; 99:3723-8.

Honors and Awards: Braddock Graduate Fellowship, BMMB Dept., Penn State University, 1999-2000 and 2000-2001 academic years

Braucher Scholarship, BMMB Dept., Penn State University, Summer 2000

Althouse Outstanding TA Teaching award, BMMB Dept., Penn State University, 2000-2001 academic year

Penn State Alumni Association Dissertation Award, Penn State University, 2005