, AND TRANSCRIPTION REGULATION IN ARCHAEA

DISSERTATION

Presented in Partial Fulfillment of the requirements for

the Degree Doctor of Philosophy in the Graduate School

of the Ohio State University

By

Yunwei Xie, B.S.

*****

The Ohio State University 2005

Dissertation Committee:

Dr. John N. Reeve, Advisor

Dr. Tina Henkin

Dr. Michael Ibba

Dr. Irina Artsimovitch

Approved by

______

Advisor

Graduate Program in Microbiology

ABSTRACT

Archaea form the third domain of life, separate from Bacteria and Eukarya (Woese et al., 1990). They are single-celled prokaryotic microorganisms with very diverse physiologies, including methanogenesis, halophily and acidophily. The RNA polymerase (RNAP) and the basal transcription machinery in Archaea are more closely related to their eukaryal than bacterial counterparts (Langer et al., 1995; Bell and Jackson, 2001). Many Archaea also contain archaeal histones, which are sequence and structural homologs of eukaryal nucleosomal core histones (Reeve, 2003). I have taken advantage of an in vitro transcription system derived from Methanothermobacter thermautotrophicus (M.t.) (Darcy et al., 1999) to investigate several aspects of transcription, from initiation to elongation to gene-specific regulation. Several interesting and significant results were obtained. First, the archaeal RNAP is able to transcribe through one or two archaeal nucleosomes in vitro, albeit at a slower rate. Second, archaeal TATA-binding protein (TBP) remains bound to promoter DNA, but archaeal transcription factor B (TFB) is released from promoter DNA after transcription initiation. Third, transcription of the trpEGCFBAD operon in M.t. is under the regulation of a tryptophan-sensing protein, TRPY. The binding sites for TRPY were identified and the molecular basis for TRPY regulation of trpEGCFBAD operon expression was established.

ACKNOWLEDGEMENTS

I am most grateful to Dr. John N. Reeve, my mentor for the past six years. Dr. Reeve is not only an exemplary advisor, but also a trusted friend. I also want to thank Dr. Reeve and my thesis committee members for carefully reading and correcting this thesis.

In addition, I would like to thank all members of the Reeve lab for their teaching and sharing.

VITA

1977……………………………………………………………………….Born in China

1998……………………………………….B.S. in Biochemistry, Zhongshan University

1999-2005……………………………………...Graduate student, Ohio State University

PUBLICATIONS

Xie, Y. and J. N. Reeve (2003). In vitro transcription assays using components from Methanothermobacter thermautotrophicus. Methods Enzymol 370: 66-72.

Xie, Y. and J. N. Reeve (2004). Transcription by an archaeal RNA polymerase is slowed but not blocked by an archaeal nucleosome. J Bacteriol 186(11): 3492-3498.

Xie, Y. and J. N. Reeve (2004). Transcription by Methanothermobacter thermautotrophicus RNA polymerase in vitro releases archaeal transcription factor B but not TATA-box binding protein from the template DNA. J Bacteriol 186(18): 6306-6310.

Xie, Y. and J.N. Reeve (2005). Regulation of tryptophan operon expression in the archaeon Methanothermobacter thermautotrophicus. J Bacteriol 187(18): 6419-6429.

FIELD OF STUDY

Major Field: Microbiology

Minor Field:

TABLE OF CONTENTS

Abstract...... ii

Acknowledgements...... iii

Vita...... iv

List of figures...... x

List of abbreviations ...... xii

Chapters:

1. General introduction ...... 1

Eukaryal histones and nucleosomes...... 2

Archaeal histones and nucleosomes...... 5

Eukaryal basal transcription machinery...... 10

Chromatin and transcription in Eukarya...... 16

Archaeal basal transcription machinery...... 17

Archaeal chromatin and transcription...... 22

Transcription regulation in Archaea ...... 24

2. Transcription by an archaeal RNAP is slowed but not blocked by archaeal nucleosomes...... 28

Introduction...... 28

Materials and Methods...... 30

Chemicals and reagents...... 30

Construction of transcription templates...... 30

Growth of M.t...... 30

Purification of M.t. RNAP ...... 32

Promoter-independent transcription assay...... 33

Purification of M.t. TBP and TFB ...... 34

Purification of archaeal HMtA2...... 35

Agarose gel shift assay...... 36

[32P]-labeling of DNA templates ...... 36

Polyacrylamide gel shift assay...... 36

Micrococcal nuclease and primer extension footprinting...... 37

In vitro transcription ...... 38

Ternary complex isolation and stalled transcript elongation ...... 38

Results...... 39

Construction of T284 that positions a single nucleosome ...... 39

Positioning of HMtA2 on T284 ...... 40

Polyacrylamide gel shift assay of HMtA2 to T284 ...... 41

In vitro transcription of nucleosomal templates...... 41

HMtA2 assembly did not prevent transcript elongation...... 42

HMtA2 binding decreased the rate of elongation...... 43

Construction of T435 that positions two nucleosomes...... 43

Transcription through two nucleosomes...... 44

Discussion...... 45

Assembly of HMtA2 on T284 ...... 45

Stability of an archaeal elongation complex...... 48

M.t. RNAP transcription through an archaeal nucleosome...... 49

M.t. RNAP transcription through two archaeal nucleosomes...... 51

Regulation of gene expression by archaeal nucleosome assembly...... 51

Potential weakness of the in vitro experiments...... 53

3. Transcription initiation by an archaeal RNAP in vitro releases TFB, but not TBP from the promoter DNA...... 71

Introduction...... 71

Materials and Methods...... 72

Chemicals and reagents...... 72

Template construction...... 72

Single and multiple round transcriptions ...... 72

Template competition assays ...... 73

Immunodetection of TBP and TFB...... 74

Results...... 75

Multiple-round transcription in vitro ...... 75

Template competition ...... 76

Immunodetection of TBP and TFB...... 77

Discussion...... 78

4. In vitro transcription of additional M.t. promoters and study of the M.t. Alba protein...... 94

Introduction...... 94

Materials and Methods...... 95

Amplification of transcription templates ...... 95

In vitro transcription ...... 96

Cloning of M.t. Alba ...... 96

Purification of recombinant M.t. Alba ...... 97

EMSA of M.t. Alba ...... 97

Results...... 98

In vitro transcription from other promoters ...... 98

Discussion...... 99

5. Regulation of the tryptophan operon in M.t...... 104

Introduction...... 104

Materials and Methods...... 114

Transcription template construction ...... 106

In vitro transcription ...... 107

Mapping of transcription start sites...... 108

Cloning of trpY ...... 109

Purification of recombinant TRPY ...... 109

[32P]-labeling of oligonucleotides and ...... 110

EMSA of TRPY binding to DNA...... 111

DNase I footprinting ...... 111

Results...... 111

Mapping of trpEGCFBAD and trpY transcription start sites...... 111

Confirmation of TATA-boxes and initiator sequences...... 112

Purification of TRPY ...... 113

Effects of TRPY on transcription in vitro...... 114

DNA binding activity of TRPY ...... 115

DNase I footprinting of TRPY-DNA complexes...... 116

Transcription using templates with mutations in TRP boxes 1 and 3...... 117

Mutational studies of TRPY ...... 118

TRPY also regulates trpB2 transcription ...... 118

Discussion ...... 120

Organization of trpY and trpEGCFBAD regulatory region ...... 120

DNA-binding properties of TRPY...... 123

Model for transcription regulation of trpY and trpEGCFBAD ...... 127

TRPY also regulates trpB2 transcription ...... 130

Comparison with trp regulation in other organisms ...... 132

6. Summary and future directions...... 160

List of references...... 167

Appendix ...... 198

LIST OF FIGURES

Figure Page

2.1 Construction of plasmids pYX2 and pYX6...... 56

2.2 Construction of transcription template T435 from T284...... 58

2.3 The position of HMtA2 assembly on T284 ...... 60

2.4 EMSA of HMtA2 binding and runoff transcription ...... 62

2.5 Stalled-transcript elongation and EMSA of ternary complexes ...... 64

2.6 Transcription in the absence and presence of an archaeal nucleosome ...... 66

2.7 DNA sequence of T435 ...... 68

2.8 Archaeal nucleosome assembly on T435 and transcript elongation through two nucleosomes ...... 70

3.1 Templates T284, T225, T475 and T440 ...... 83

3.2 Transcription in the presence and absence of heparin ...... 85

3.3 Template competition assays with TBP...... 87

3.4 Template competition assays with TFB...... 89

3.5 Template competition assays with TBP when T225 is in excess ...... 91

3.6 Immunoblot assays of template binding by TFB and TBP...... 93

4.1 Transcriptions from the tRNAAla, tRNAArg, tRNALeu, nifH and glnA promoters in vitro...... 103

5.1 Divergent promoters direct transcription of trpY and trpEGCFBAD ...... 137

5.2 Purification and SDS-PAGE of recombinant TRPY ...... 139

5.3 T1 DNA sequence and effect of TRPY and amino acids on E and Y synthesis...... 141

5.4 Amino acid sequence of TRPY and TRPY’s effect on E and Y transcription ...... 143

5.5 The effects of tryptophan and 5MT on E and Y transcription...... 145

5.6 EMSA with T11 DNA using TRPY with or without amino acids ...... 147

5.7 EMSA with T11 and T13 DNA in the presence of TRPY and/or tryptophan ...... 149

5.8 Determination of TRPY binding sites by EMSA ...... 151

5.9 DNase I footprinting of TRPY-DNA complex...... 153

5.10 In vitro transcription from templates T1, T8 and T9 ...... 155

5.11 Effects on transcription in vitro by TRPY and TRPY variants G149R and A128E ...... 157

5.12 Regulation of trpB2 transcription by TRPY ...... 159

LIST OF ABBREVIATIONS

5MT 5-methyltryptophan

bp base pair(s)

BRE TFIIB/TFB responsive element

BSA bovine serum albumin

CTD C-terminal domain

ddNTP dideoxynucleotide triphosphate

DEAE Diethylaminoethyl-dextran

dNTP deoxynucleotide triphosphate

DNase deoxyribonuclease

DTT dithiothreitol

EDTA disodium ethylenediaminetetraacetic acid

EM electron microscopy

EMSA electrophoretic mobility shift assay

g gram(s)

h hour(s)

HAT histone acetyltransferase

HTH helix-turn-helix

IPTG isopropyl-β-D-thiogalactoside

KDa kilodalton

L liter(s)

LB Luria-Bertani

min minute(s)

ml milliliter(s)

MN micrococcal nuclease

mRNA messenger RNA

NMR nuclear magnetic resonance

nt nucleotide(s)

OD optical density

PAGE polyacrylamide gel electrophoresis

PCR polymerase chain reaction

PIC pre-initiation complex

PMSF phenylmethylsulphonylfluoride

RNAP RNA polymerase

rpm rotations/revolutions per minute

rRNA ribosomal RNA

RT room temperature

SDS sodium dodecyl sulfate

SELEX systematic evolution of ligands by exponential enrichment

snRNA small nuclear RNA

s second(s)

TBP TATA-box binding protein

TCA trichloroacetic acid

TFB transcription factor B

TFE transcription factor E

TFS transcription factor S

Tris 2-amino-2-(hydroxymethyl)-1,3-propanediol

U unit(s)

UV ultraviolet

V volt(s)

v/v volume/volume

W watt(s)

w/v weight/volume

X-gal 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside

CHAPTER 1

GENERAL INTRODUCTION

The research work presented in this thesis is focused on investigating transcription in Archaea. Archaea form the third domain of life, distinct from Bacteria and Eukarya (Woese et al., 1990). They are a diverse group of single-celled prokaryotic microorganisms with a wide range of physiologies, including methanogenesis, halophily, thermophily and acidophily. The RNA polymerase (RNAP) and basal transcription machinery in Archaea are more closely related to the eukaryal RNA polymerase II system than to bacterial RNA polymerase (Langer et al., 1995; Reeve, 2003). Archaea in the Euryarchaeota and Nanoarchaeota and, as found more recently, the Crenarchaeota also contain archaeal histones, which are sequence and structural homologs of eukaryal core histones (Sandman and Reeve, 2000; Sandman et al., 2001; Reeve, 2003; Waters et al., 2003; Reeve et al., 2004; Cubonova et al., 2005).

There are three separate but interrelated topics in my graduate research work. One is whether archaeal RNAP can transcribe through a nucleosomal template. The second is the fate of archaeal general transcription factors after transcription initiation. The third is transcription regulation in Archaea. Given these topics and the close relatedness between the archaeal and eukaryal basal transcription machineries and histones, I will introduce the relevant background on eukaryal histones and nucleosomes, archaeal histones and nucleosomes, eukaryal basal transcription machinery, archaeal basal transcription machinery, chromatin and transcription in Eukarya, and chromatin and transcription in Archaea. I will also describe the current status of our understanding of transcription regulation in Archaea.

Eukaryal histones and nucleosomes

Histones package eukaryal chromosomal DNA into nucleosomes and chromatin, and, prior to the discovery of archaeal histones (Sandman et al., 1990), histones and nucleosomes were considered unique features of eukaryal cells. Each nucleosome core particle consists of two copies each of histones H2A, H2B, H3 and H4 that together wrap 146 bp of DNA in 1.65 circles, or ~90 bp per circle (Luger et al., 1997a). The four core histones, H2A, H2B, H3 and H4, are small (11-16 KDa) basic proteins. The amino acid sequences of core histones are highly conserved (Sullivan et al., 2002). For example, the sequences of histone H3 from sea urchin and H3 from calf thymus differ by only a single amino acid out of a total of 135 amino acid residues, and calf thymus H4 and garden pea H4 differ by only two amino acid residues out of a total of 102 residues (Sullivan et al., 2002). Histones H3, H4, H2A and H2B all contain the histone fold, namely a short α helix (α1)-β strand loop (L1)-long α helix (α2)-β strand loop (L2)-short α helix (α3) (Arents et al., 1991; Arents and Moudrianakis, 1993). H2A forms a heterodimer with H2B, and H3 forms a heterodimer with H4. Two (H3+H4) heterodimers associate to form a tetramer, and the octamer histone core of a nucleosome is a tripartite structure with the central (H3+H4)2 tetramer flanked on each side by an (H2A+H2B) heterodimer (Luger et al., 1997a; Suto et al., 2000). In addition to the histone fold, the eukaryal histones also have N-terminal tails (Luger and Richmond, 1998). The N-terminal tails extend from the globular nucleosome core and provide sites for post-translational modifications such as acetylation, phosphorylation, methylation, ubiquitination and ADP-ribosylation that direct transitions between transcriptionally active and silent chromatin states (Strahl and Allis, 2000; Jenuwein and Allis, 2001; Wang et al., 2004).
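The per-circle figure quoted above for the nucleosome core particle follows directly from the two reported values. A minimal Python sketch of that arithmetic is given here; the variable names are illustrative and do not come from the original work.

```python
# Values quoted in the text for the eukaryal nucleosome core particle
# (Luger et al., 1997a). Variable names are illustrative only.
CORE_DNA_BP = 146           # bp of DNA wrapped around the histone octamer
SUPERHELICAL_CIRCLES = 1.65  # number of circles of DNA around the octamer

bp_per_circle = CORE_DNA_BP / SUPERHELICAL_CIRCLES
print(f"{bp_per_circle:.1f} bp per circle")  # 88.5, i.e. ~90 bp per circle
```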

The DNA between adjacent nucleosomes is bound by linker histone H1 (or H5 in avian red blood cells). Histone H1 is also rich in arginine and lysine, but its sequence is unrelated to histones H3, H4, H2A and H2B (Sullivan et al., 2002).

When chromatin is extracted from nuclei and examined in an electron microscope, its appearance is salt-concentration dependent (Malcolm and Sommerville, 1977; Cameron et al., 1979). At low salt concentrations (~1 mM NaCl), isolated chromatin resembles “beads on a string” and is devoid of histone H1. The string is a thin filament of linker DNA connecting the regularly spaced beadlike nucleosome structures that are 10 nm in diameter. When isolated at physiological salt concentration (~150 mM NaCl), chromatin has a more condensed fiberlike form that is 30 nm in diameter and retains one molecule of histone H1 per nucleosome (Thoma et al., 1979). More condensed forms of chromatin also exist in vivo, but their structures are poorly understood (van Holde and Zlatanova, 1996; Cremer et al., 2004). Actively transcribed regions of chromatin are thought to assume the extended beads-on-a-string form, and untranscribed regions are thought to exist as 30 nm fibers or in more condensed forms. Heterochromatin is a highly condensed form of chromatin, and is transcriptionally inactive (Henikoff et al., 2000).

In the cell, newly replicated DNA is assembled into nucleosomes shortly after the replication fork passes (Ahmad and Henikoff, 2002a), but when isolated histones are added to DNA in vitro at a physiological salt concentration, nucleosomes do not spontaneously form (Luger et al., 1997b). Histone assembly in vivo apparently requires other factors, and histone chaperones that bind histones and assemble them with DNA into nucleosomes in vitro have been characterized and may play similar roles in vivo (Akey and Luger, 2003). However, histone chaperones alone are not sufficient to generate regularly spaced nucleosome arrays. Instead, they lead to the assembly of irregularly spaced or closely spaced nucleosome arrays in vitro. ATP-dependent remodeling factors have been identified that generate regularly spaced nucleosome arrays in vitro (Ito et al., 1997; Tyler, 2002; Flaus and Owen-Hughes, 2004).

In addition to the canonical core histones, histone variants exist that play specific roles in vivo (Henikoff et al., 2004). CENP-A (centromere protein A) in humans or Cse4 in yeast is a variant of histone H3 that is specifically localized in active centromeres of chromosomes, and is a component of centromeric nucleosomes on which kinetochores are assembled (Alonso et al., 2003; Wieland et al., 2004; Yoda et al., 2004). H3.3 in Drosophila is another variant of histone H3. While the regular H3 is incorporated into chromatin during DNA replication, H3.3 is deposited by a replication-independent pathway, thus providing a mechanism for gene regulation (Ahmad and Henikoff, 2002b). Yeast H2A.Z is a variant of histone H2A. Deletion of the gene encoding H2A.Z in yeast increases the requirement for SWI/SNF and SAGA complexes that remodel chromatin and modify histone tails, suggesting overlapping roles of H2A.Z and SWI/SNF and SAGA (Santisteban et al., 2000; Adam et al., 2001; Fan et al., 2002; Meneghini et al., 2003; Rangasamy et al., 2004). H2A.X, another variant of H2A, is involved in DNA repair, particularly in joining the ends of double-stranded breaks. Deletion of the gene encoding H2A.X in mice leads to genomic instability (Celeste et al., 2002). Another H2A variant, macroH2A, is preferentially located on the inactive X chromosome (Changolkar and Pehrson, 2002).

Archaeal histones and nucleosomes

Methanothermus fervidus is a hyperthermophilic methanogenic archaeon with an optimal growth temperature of 83°C (Stetter, 1981). The search for factors that prevent genome melting at 83°C resulted in the discovery of HMf, a mixture of homodimers and heterodimers of two small (7.5 KDa), basic (pI ~9-10) polypeptides, designated HMfA and HMfB (Sandman et al., 1990). These proteins have 84% sequence identity to each other and their sequences are related to those of eukaryal histones H3, H4, H2A and H2B (Sandman et al., 1990). However, the homology is restricted to the histone fold domain, as HMfA and HMfB do not have the N- or C-terminal tails present in eukaryal histones. HMfA and HMfB bind, compact, and stabilize DNA in vitro and are located in the M. fervidus nucleoid (Sandman et al., 1990). NMR and crystal structures of (HMfB)2 and (HMfA)2 have confirmed the presence of histone folds in archaeal histones that are virtually identical to those of the eukaryal core histones (Starich et al., 1996; Zhu et al., 1998; Decanniere et al., 2000). Since the discovery of HMf, archaeal histones have been identified in almost all members of Euryarchaeota (Smith et al., 1997; Ng et al., 2000; Graham et al., 2001; Kawarabayasi, 2001; Deppenmeier et al., 2002; Galagan et al., 2002; Slesarev et al., 2002; Waters et al., 2003; Baliga et al., 2004; Hendrickson et al., 2004), in Nanoarchaeota (Waters et al., 2003), and, more recently, in some members of Crenarchaeota (Venter et al., 2004; Cubonova et al., 2005).

Purified archaeal histones spontaneously associate with DNA in vitro under various salt concentrations to form archaeal nucleosomes, whereas eukaryal nucleosome assembly in vitro requires dialysis from a high to low salt environment, or the addition of histone chaperones and ATP-dependent remodeling factors. HMf-DNA complexes assembled in vitro at HMf/DNA mass ratios <1:3 appear as sharp kinks in the electron microscope, but at HMf/DNA mass ratios >1:2, they appear as spherical structures separated by protein-free DNA of irregular length (Pereira et al., 1997). This is different from the regularly spaced eukaryal nucleosome arrays. EM measurements also indicate that assembly of one fully-circularized HMf/DNA complex reduces the length of the visible DNA by ~90 bp (Marc et al., 2002). In the eukaryal nucleosome core particle, each histone dimer binds to ~28 bp of DNA (Luger et al., 1997a). If an archaeal histone dimer similarly binds DNA, a reduction of ~90 bp is best explained by the assembly of three adjacent archaeal histone dimers on the DNA. However, other lines of evidence suggest that the archaeal nucleosome core is a histone tetramer. Using [125I]-labeled HMfB and [32P]-labeled 100 bp DNA, the stoichiometry of HMfB monomer to DNA was measured. Under assembly conditions when 50% and 98% of the DNA were bound by HMfB, there were 4.8, 4.4 and 4.4, 4.0 HMfB monomers to each molecule of DNA in the complex (Bailey et al., 1999). Immunoblots of archaeal histones cross-linked in vivo revealed the presence of histone tetramers, dimers and monomers, but no hexamers or higher order structures were detected (Pereira et al., 1997).
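The two lines of evidence above can be compared with simple arithmetic. The following Python sketch is illustrative only: it assumes, as the text does, that an archaeal histone dimer binds about as much DNA as a eukaryal one, and the variable names are not from the original work.

```python
# Numbers quoted in the text; names are illustrative assumptions.
BP_PER_DIMER = 28        # bp bound per histone dimer (Luger et al., 1997a)
EM_SHORTENING_BP = 90    # DNA shortening per HMf/DNA complex seen by EM

# EM-based estimate: how many dimers would account for ~90 bp?
dimers_from_em = EM_SHORTENING_BP / BP_PER_DIMER
print(f"EM estimate: {dimers_from_em:.1f} dimers")  # about 3 dimers (a hexamer)

# Stoichiometry-based estimate: measured HMfB monomers per DNA molecule
# (Bailey et al., 1999), converted to dimers.
monomers_per_dna = [4.8, 4.4, 4.4, 4.0]
dimers_from_stoichiometry = [m / 2 for m in monomers_per_dna]
print("Stoichiometry estimate:", dimers_from_stoichiometry)  # about 2 dimers (a tetramer)
```

Under these assumptions the EM data point to roughly three dimers per complex, whereas the stoichiometry data point to roughly two, which is the discrepancy discussed in the text.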

The HMfA and HMfB levels are growth phase dependent. HMfA is the predominant histone during exponential phase of growth, but HMfA and HMfB are equally abundant in stationary phase cultures (Sandman et al., 1994a). HMfA and HMfB differ in their DNA-binding affinities, and M. fervidus may use differing amounts of HMfA and HMfB to control genome packaging and gene expression. Quantitation of HMf suggests that there is enough HMf in vivo to package the whole M. fervidus genome into nucleosomes (Pereira et al., 1997). Immunoprecipitation studies demonstrated that many but not all M. fervidus genomic DNA regions were associated in vivo with archaeal histones. Specifically, the genes encoding the 7S and 16S RNA, and the hmfA, hmfB and ftr genes were associated with archaeal histones, but the mcr operon was histone-free (Pereira et al., 1997). The ftr gene and the mcr operon encode proteins responsible for methane generation from carbon dioxide and hydrogen, and they are actively transcribed under the growth conditions employed in the laboratory (Pereira et al., 1997). The reason for the different histone association status of these genes is not clear. A more systematic genome-wide chromatin immunoprecipitation and microarray approach currently undertaken by Rachel Samson may reveal if there is any general pattern between transcription and archaeal histone association.

Micrococcal nuclease (MN) digestion studies have shown that archaeal nucleosomes assembled in vitro or in vivo protect nucleosomal DNA from MN digestion. The lengths of the DNA protected are 30, 60, 90, and 120 bp, or even longer, depending on the histone to DNA ratios (Grayling et al., 1997; Pereira et al., 1997; Bailey et al., 2002; Xie and Reeve, 2004).

Although archaeal histones bind DNA regardless of the DNA sequence, SELEX (Systematic Evolution of Ligands by Exponential enrichment) procedures have generated DNA sequences that have very high affinities for HMfB (Bailey et al., 2000). These DNA sequences also direct archaeal nucleosome positioning and exhibit the dinucleotide periodicities shown previously to direct eukaryal nucleosome positioning (Bailey et al., 2000; Widom, 2001). Selex-1, a 60-bp sequence selected by the SELEX procedures, was used in this study.

The archaeal histone HMk from Methanopyrus kandleri contains two histone folds joined by a short linker region (Fahrner et al., 2001). One histone fold is more similar to eukaryal histones; the other histone fold is more similar to archaeal histones HMfA and HMfB. It has been reported that HMk forms dimers (pseudotetramers) both in solution and in a nucleosome (Pavlov et al., 2002).

Archaeal histones isolated from Methanococcus jannaschii have been subjected to high-resolution tandem mass spectrometry analyses and none of the histones had detectable post-translational modifications, regardless of whether they were isolated from exponential growth phase or stationary phase cells (Forbes et al., 2004). Archaeal histones purified from Methanothermobacter thermautotrophicus (M.t.) have also been examined for the presence of post-translational modifications by mass spectrometry, but post-translational modifications were not found (Kathleen Sandman, personal communication).

Despite more than 15 years of research on archaeal histones, the biological functions of these proteins have not been fully elucidated. A recent report provided evidence for a role of archaeal histones in regulating gene expression in vivo. Methanococcus voltae possesses two histone-encoding genes, hstA and hstB, and genetic tools are available for this archaeon. M. voltae cells with deletion of either hstA or hstB were viable, but a diverse range of proteins had significantly altered levels in the mutant cells compared to wild-type, as revealed by 2D gel electrophoresis (Heinicke et al., 2004). Archaeal histones may play multiple modulatory roles in vivo, similar to what bacterial nucleoid proteins do. Bacterial nucleoid proteins have some superficial similarities to archaeal histones, such as DNA binding ability, low molecular mass and high electrostatic charge, but their amino acid sequences are not similar to those of archaeal histones. Among the best characterized of the bacterial nucleoid proteins are HU (heat unstable) and H-NS (histone-like nucleoid structuring). Both HU and H-NS bind DNA in a sequence-independent manner, although HU prefers four-way junctions and single-stranded lesions, while H-NS prefers intrinsically curved DNA (Dame and Goosen, 2002). Studies of HU and H-NS have revealed that they play a diverse range of roles in vivo. HU, on one hand, is involved in DNA replication (Hwang and Kornberg, 1992), Mu transposition (Lavoie and Chaconas, 1994), DNA repair and recombination (Kamashev and Rouviere-Yaniv, 2000), and repression of gal transcription (Lewis and Adhya, 2002). H-NS, on the other hand, negatively regulates the expression of more than 100 genes, including the rRNA-encoding rrn operon and many genes involved in adaptation to environmental changes (Hommais et al., 2001). If archaeal histones are functionally analogous to bacterial nucleoid proteins, archaeal histones could also play a wide range of roles in vivo, and studying archaeal histones will shed light on many important biological mechanisms in Archaea.

Eukaryal basal transcription machinery

There are three related nuclear RNA polymerases in Eukarya. RNA polymerase I (RNAPI) transcribes the 18S, 5.8S and 25S/28S ribosomal RNA genes, RNA polymerase II (RNAPII) synthesizes mRNAs and most small nuclear RNAs (snRNAs), and RNA polymerase III (RNAPIII) transcribes tRNA and 5S rRNA genes (Roeder et al., 1976). All three RNAPs are multi-subunit complexes. Most subunits have homologs in all three RNAPs and some subunits are shared by all three enzymes (Cramer, 2004). The RNAPII system is discussed in more detail below, as it most closely resembles the archaeal transcription system (Langer et al., 1995; Cramer, 2004).

RNAPII is a 12-subunit enzyme with the subunits designated Rpb1 through Rpb12 in yeast. Human RNAPII is virtually identical and at least 10 human RNAPII genes can be substituted for their counterparts in yeast (Woychik, 1998). RNAPII alone can unwind the DNA double helix, polymerize RNA and proofread the nascent transcript, but it needs the participation of transcription factors, activators, coactivators and mediators to initiate transcription accurately at specific promoters and to respond to regulatory signals (Kornberg, 1998; Myers and Kornberg, 2000; Cramer, 2004).

The two largest RNAPII subunits, Rpb1 and Rpb2, are the most highly conserved. They are homologous to the β΄ and β subunits of bacterial RNAP. Rpb3 and Rpb11 are homologous to the α subunit of bacterial RNAP. Rpb6 has sequence similarity to and is a structural homolog of the ω subunit of bacterial RNAP (Minakhin et al., 2001). A distinctive feature of RNAPII is the presence of a heptapeptide repeat in the carboxyl terminal domain (CTD) of Rpb1. The heptapeptide sequence YSPTSPS is repeated 26 times in the CTD of yeast RNAPII, 42 times in the CTD of Drosophila RNAPII and 52 times in the CTD of human RNAPII. The two serine residues (serine-2 and serine-5) are sites of phosphorylation. Pre-initiation complexes (PIC) usually have hypo-phosphorylated CTDs and elongating complexes have hyper-phosphorylated CTDs, and transcription factor TFIIH has the kinase activity that phosphorylates the CTD of RNAPII (Akoulitchev et al., 1995). The phosphorylated CTD serves as a platform for the docking of mRNA processing enzymes so that nascent RNA transcripts can be capped and spliced co-transcriptionally (Proudfoot et al., 2002). Although the CTD of Rpb1 is highly conserved in most eukaryal RNAPII, it is absent in RNAPI, RNAPIII, archaeal RNAP and the RNAPII of some primitive eukaryotes (Stiller et al., 2000; Stiller and Hall, 2002).
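To make the repeat structure of the CTD concrete, the sketch below builds an idealized CTD from perfect YSPTSPS repeats and lists the positions of serine-2 and serine-5 in each repeat. This is an illustration only: the function name is hypothetical, and it assumes every repeat matches the consensus exactly, which the text above does not claim for real CTDs.

```python
# Idealized model: perfect tandem copies of the consensus heptapeptide.
HEPTAD = "YSPTSPS"
REPEATS = {"yeast": 26, "Drosophila": 42, "human": 52}  # repeat counts from the text


def phospho_serines(n_repeats: int):
    """Return 1-based positions of Ser2 and Ser5 in an idealized CTD."""
    ctd = HEPTAD * n_repeats
    ser2 = [i * len(HEPTAD) + 2 for i in range(n_repeats)]
    ser5 = [i * len(HEPTAD) + 5 for i in range(n_repeats)]
    # Sanity check: every listed position really is a serine.
    assert all(ctd[p - 1] == "S" for p in ser2 + ser5)
    return ser2, ser5


for organism, n in REPEATS.items():
    ser2, ser5 = phospho_serines(n)
    print(f"{organism}: {n * len(HEPTAD)} residues, "
          f"{len(ser2) + len(ser5)} candidate phosphorylation sites")
```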

Structures of 10-subunit and 12-subunit yeast RNAPIIs have been solved (Cramer et al., 2001; Gnatt et al., 2001; Bushnell et al., 2002; Bushnell and Kornberg, 2003; Bushnell et al., 2004; Westover et al., 2004a; Westover et al., 2004b). The 10-subunit yeast RNAPII has all the subunits except Rpb4 and Rpb7, and these two subunits exist at substoichiometric levels in vivo. The overall structure of yeast RNAPII has been described as “jaws” and resembles the structure of bacterial RNAP (Zhang et al., 1999). Rpb1 and Rpb2, the two largest subunits, form the upper and lower jaws and embody the catalytic center of the enzyme, while smaller subunits are located on the periphery. Both the upper and lower jaw may be mobile, opening and closing on the DNA. Near the catalytic center, the N-terminal region of Rpb1 and the C-terminal region of Rpb2 form a mobile clamp that closes over the DNA during transcription elongation. The catalytic center harbors a region of DNA-RNA hybrid and the 5΄ end of the RNA exits the polymerase through a groove near the CTD of Rpb1. Nucleotides are thought to enter the polymerase through a tunnel beneath the catalytic site. If RNAPII backtracks, the 3΄ end of RNA can be extruded through this tunnel and be cleaved by RNAPII with the participation of TFIIS to restart elongation.

RNAPII by itself is incapable of initiating transcription accurately from promoters. It needs the participation of general and gene-specific transcription factors. General transcription factors recruit RNAPII to the promoter site and direct the initiation of basal level transcription. Five general transcription factors, TFIID, TFIIB, TFIIF, TFIIE and TFIIH, have been identified, and order-of-addition experiments have demonstrated that assembly of the PIC in vitro is nucleated by TFIID binding to the TATA-box, followed by addition of TFIIB, RNAPII+TFIIF, TFIIE and TFIIH (Buratowski, 1994; Buratowski, 2000).

TFIID is a multisubunit complex composed of TATA-box-binding protein (TBP) and 14 TBP-associated factors (TAFs). TAFs are required for activated transcription, but

TBP plus TFIIB alone can support basal level transcription in vitro. TBP has two imperfect direct repeats that form a saddle-shaped structure which straddles the TATA- box sequence, interacting with the minor groove of DNA and inducing an 80° bend in the promoter DNA (Kim et al., 1993; Nikolov et al., 1995). The binding of TBP to the

TATA-box is a rate-limiting step in transcription initiation and this step can be stimulated by TBP-interacting activators (Pugh, 2000). The convex seat of the TBP saddle is exposed and available for interaction with such factors. The TATA-box is typically located 25-30 bp upstream of the transcription start site in metazoans, whereas in yeast the TATA-box is located 40 to 100 bp upstream of the transcription start site. The consensus sequence for the TATA-box is TATAAA. Not all promoters contain a TATA- box, although transcription from TATA-less promoters also requires TBP. In humans,

only 32% of 1031 known promoters contain a TATA-box (Suzuki et al., 2001). In

Drosophila, only 43% of 205 known promoters contain a TATA-box (Kutach and

Kadonaga, 2000). Paralogs of TBP have been identified in Drosophila, C. elegans, humans and plants and are termed TRFs (TBP-related factors). Some but not all TRFs bind to TATA-box sequences (Hansen et al., 1997). TRFs are required for tissue-specific or developmental-stage-specific transcription in metazoans and plants (Rabenstein et al.,

1999; Persengiev et al., 2003).

TFIIB is a monomeric protein that binds to the TFIIB responsive element

(Lagrange et al., 1998). The TFIIB responsive element, or BRE, is located immediately upstream of some TATA-boxes. The human BRE consensus is G/C-G/C-G/A-CGCC (the

3΄ C of the BRE is followed by the 5΄ T of the TATA-box), and a 5-out-of-7 match with the BRE consensus was found in 12% of 315 TATA-box-containing human promoters.

The BRE is recognized by a helix-turn-helix (HTH) motif in the C-terminal region of

TFIIB and the crystal structure of a TFIIB-TBP-DNA complex revealed that the C-terminal region of TFIIB, which contains imperfect repeats, contacts the DNA both upstream and downstream of the TATA-box (Tsai et al., 1998; Tsai and Sigler, 2000). In this configuration, the C-terminal region of TFIIB points towards the transcription start site. The N-terminal region of TFIIB contains a zinc ribbon and is involved in transcription start site selection and interaction with RNAPII (Pardee et al., 1998). The N-terminal region of TFIIB contacts the polymerase near the exit path of RNA and is partially inserted into the polymerase active center (Bushnell et al., 2004).
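The 5-out-of-7 criterion for scoring a candidate BRE against the human consensus G/C-G/C-G/A-CGCC can be illustrated with a short sketch. This is not the procedure used in the cited study; the function names, the 0-based `tata_start` argument and the example sequences are assumptions introduced only for illustration.

```python
# Allowed bases at each of the 7 BRE positions, read from the consensus
# G/C-G/C-G/A-CGCC quoted in the text.
BRE_CONSENSUS = ["GC", "GC", "GA", "C", "G", "C", "C"]


def bre_match_count(heptamer: str) -> int:
    """Count how many of the 7 positions agree with the BRE consensus."""
    if len(heptamer) != len(BRE_CONSENSUS):
        raise ValueError("expected a 7-nt sequence")
    return sum(
        base in allowed
        for base, allowed in zip(heptamer.upper(), BRE_CONSENSUS)
    )


def has_bre(promoter: str, tata_start: int, min_matches: int = 5) -> bool:
    """Score the 7 nt immediately upstream of the TATA-box.

    tata_start is the 0-based index of the 5' T of the TATA-box
    (a hypothetical input; how the TATA-box is located is not modeled here).
    """
    if tata_start < len(BRE_CONSENSUS):
        return False  # not enough upstream sequence to hold a BRE
    candidate = promoter[tata_start - len(BRE_CONSENSUS):tata_start]
    return bre_match_count(candidate) >= min_matches
```

For example, `has_bre("AAACGCCTATAAA", 7)` scores the heptamer AAACGCC, which agrees with the consensus at five positions and therefore passes the 5-out-of-7 threshold.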

TFIIF has two subunits, designated RAP30 and RAP74 in both yeast and metazoans, and has features reminiscent of bacterial σ factors, including tight binding to

RNAPII, suppression of non-specific binding of RNAPII to DNA, and stabilization of the

PIC (Conaway and Conaway, 1999). In addition to its role in transcription initiation,

TFIIF also suppresses transient pausing of RNAPII during transcription elongation

(Conaway et al., 2000).

TFIIE enters the PIC after RNAPII and before TFIIH. Functions attributed to

TFIIE include recruitment of TFIIH to the PIC, stimulation of TFIIH-dependent phosphorylation of the RNAPII CTD and stimulation of TFIIH-dependent ATP hydrolysis. TFIIE contains α and β subunits that form an α2β2 heterotetramer. Both subunits are highly charged, with pI values of 4.5 and 9.5 for TFIIE-α and TFIIE-β, respectively. TFIIE-α has a zinc ribbon motif (C-X2-C-X21-C-X2-C) and a protein kinase consensus sequence, but no enzymatic activity has been demonstrated for TFIIE.

The zinc ribbon motif appears to play an important, but undefined role in TFIIE function

(Ohkuma et al., 1995). TFIIE may interact with and stabilize single-stranded DNA in the initiation bubble. Consistent with this, TFIIE binds single-stranded DNA, but not double-stranded DNA (Kuldell and Buratowski, 1997) and is not required for transcription initiation from pre-melted template DNA (Pan and Greenblatt, 1994).

TFIIH, the last general transcription factor to enter the PIC, has DNA-dependent

ATPase, ATP-dependent DNA helicase and CTD kinase activities. Formation of an open promoter complex requires the ATP-dependent DNA helicase activity of TFIIH.

However, TFIIE and TFIIH are dispensable for initiation from a supercoiled template

(Parvin et al., 1994), or from a pre-melted template (Holstege et al., 1995), consistent with TFIIE and TFIIH being required for open complex formation from double-stranded linear DNA. TFIIH also participates in the transition from transcription initiation to

elongation by phosphorylating the CTD of RNAPII. Phosphorylation of the CTD causes

a conformational change within the PIC, resulting in disruption of the RNAP-TBP

interaction and promoter clearance (Proudfoot et al., 2002).

After assembly of the general transcription factors and RNAPII at the promoter,

the closed promoter complex isomerizes into an open complex in which the DNA duplex

is separated and the template strand is exposed for RNA synthesis. After transcription

initiation, RNAPII and TFIIF move away from the promoter as a complex; TFIID,

TFIIH and TFIIE remain bound at the promoter, and TFIIB is released (Yudkovskaya et al.,

2000).

Transcription elongation is possible with RNAPII alone in vitro, but many

elongation factors, including TFIIF and TFIIS, are associated with RNAPII in vivo. TFIIF

increases the rate of nucleotide incorporation by changing the Km and Vmax of RNAPII

(Elmendorf et al., 2001). TFIIS allows stalled elongation complexes to read through

DNA sequence-dependent pause sites by mediating the cleavage of nascent transcripts

(Kulish and Struhl, 2001; Weilbaecher et al., 2003). Purified human FACT (facilitates

chromatin transcription) facilitates human RNAPII transcription through nucleosomes in vitro (Orphanides et al., 1998; Belotserkovskaya et al., 2003). The Elongator complex, isolated from both yeast and human cells, consists of 6 subunits, Elp1, Elp2, Elp3, Elp4,

Elp5 and Elp6. The Elp3 subunit has histone acetyltransferase (HAT) activity towards histones H3 and H4 in vitro, and this HAT activity is thought to facilitate RNAPII transcription elongation through nucleosomes in vivo (Wittschieben et al., 1999; Kim et al., 2002; Gilbert et al., 2004).

Chromatin and transcription in Eukarya

Packaging of promoter DNA in nucleosomes prevents transcription initiation by

RNAPII in vitro. When a promoter is packaged in a nucleosome, transcription factors cannot gain access to the promoter and thus transcription initiation is repressed.

Depleting histone H4 in yeast resulted in activation of transcription of the HIS3, CUP1 and PHO5 genes under non-inducing conditions (Han and Grunstein, 1988) and genome-wide microarray studies have further shown that depleting histone H4 increased the expression of 15% of yeast genes (Wyrick et al., 1999). Transcription elongation in vitro

by RNAPII was also inhibited by a nucleosome when the salt concentration was at or

below physiological conditions (40-150 mM KCl) (Kireeva et al., 2002). Increasing the

salt concentration allowed transcription elongation and the nucleosome was converted

into a hexamer (losing one H2A/H2B dimer) that remained bound near the original

position of the nucleosome (Kireeva et al., 2002). Transcription elongation through

nucleosomes in vivo may require transcription elongation factors such as FACT,

SWI/SNF and Elongator (Orphanides et al., 1998; Otero et al., 1999; Wittschieben et al.,

1999; Kim et al., 2002; Belotserkovskaya et al., 2003; Gilbert et al., 2004).

Both histone fold-DNA interactions within the core particles and histone tail-

DNA interactions outside the core particles contribute to transcription repression, and both types of interactions are countered by specific mechanisms. Repression due to histone fold-DNA interactions within the core particles is overcome by chromatin-remodeling complexes, while histone tail-DNA interactions outside the core particles are loosened by histone tail acetylation. Chromatin-remodeling complexes are multi-subunit enzymes that use the energy of ATP to remodel chromatin. The most extensively studied

chromatin-remodeling complexes, SWI/SNF (switch/sucrose nonfermentor), RSC

(remodels structure of chromatin), CHRAC (chromatin accessibility complex), and

NURF (nucleosome remodeling factor), have been shown to loosen histone-DNA

interactions and/or to cause histone octamers to slide to adjacent DNA (Flaus and Owen-

Hughes, 2004). Many transcription activators and co-activators possess HAT activities

and many transcription repressors and co-repressors have histone deacetylase (HDAC)

activities. In addition to acetylation, the histone tails are also subject to methylation,

phosphorylation, ADP-ribosylation and ubiquitination, and the number and variety of

these modifications have been proposed to constitute a “histone code” that determines the

on or off state of gene expression (Jenuwein and Allis, 2001).

Archaeal basal transcription machinery

The basal transcription machinery in Archaea resembles the eukaryal RNAPII

system, involving a multi-subunit RNAP and three general transcription factors, TBP,

TFB and TFE, which are homologs of eukaryal TBP, TFIIB and the α subunit of TFIIE,

respectively (Reeve, 2003).

Archaeal RNAPs contain 10 to 12 different subunits (Langer et al., 1995; Darcy et

al., 1999) that are homologs of at least 10 of the 12 subunits of eukaryal RNAPII. The

only eukaryal RNAPII subunits missing in archaeal RNAPs are Rpb8 and Rpb9 (Darcy et

al., 1999), which are located on the periphery of the yeast RNAPII. Archaeal

transcription elongation factor TFS has sequence similarity to Rpb9 of RNAPII, C11 of

RNAPIII and TFIIS. It was initially thought to be an RNAP subunit, but biochemical

assays have shown that TFS behaves like TFIIS, and immunoblotting did not detect TFS

in purified archaeal RNAP preparations (Hausner et al., 2000; Lange and Hausner, 2004).

The two largest subunits of archaeal RNAP that correspond to Rpb1 and Rpb2 are generally split into two polypeptides (A΄, A΄΄ and B΄, B΄΄). Structures of several archaeal

RNAP subunits have been solved and some were used as models in determining the high-resolution structures of the yeast RNAPII (Thiru et al., 1999; Mackereth et al., 2000;

Todone et al., 2000; Todone et al., 2001). Given the conservation in subunit composition and sequence, the overall structure of an archaeal RNAP is likely to be very similar to that of yeast RNAPII. A functional archaeal RNAP capable of promoter-specific transcription has been reconstituted in vitro from purified recombinant individual subunits A΄, A΄΄, B΄, B΄΄, D, E, F, H, K, L, N and P (Werner and Weinzierl, 2002).

Most archaeal promoters have a TATA-box, located ~25 bp upstream of the transcription start site (Reiter et al., 1990; Hausner et al., 1991). A general consensus for the archaeal TATA-box is TTTATA, although there are variations depending on the gene and the organism (Soppa, 1999a). The TATA-box is recognized and bound by TBP. The

BRE is a purine-rich sequence located immediately upstream of the TATA-box that is recognized and bound by TFB (Qureshi and Jackson, 1998). Sequence comparisons have revealed the presence of BREs in a wide range of archaeal promoters (Soppa, 1999a;

Soppa, 1999b). Binding of TFB to the BRE establishes a polarity in the PIC and directs

RNAP binding downstream of the TATA-box (Bell et al., 1999b). Crystal structures of the TBP/TFB/promoter complex confirmed that archaeal TBP and TFB are indeed homologs of eukaryal TBP and TFIIB and confirmed the sequence-specific interactions of TBP with the TATA-box and TFB with the BRE (Littlefield et al., 1999).

Archaeal TFBs have three regions, an N-terminal region of about 100-120 amino acids and two C-terminal repeat regions of ~90 amino acids each, which most likely arose through a gene duplication. The C-terminal region of TFB has an HTH motif and binds to the BRE. The N-terminal region of TFB has a zinc-ribbon motif and this motif was shown to interact with the K subunit of archaeal RNAP (Magill et al., 2001).

In vitro transcription systems have been established using native RNAP and recombinant TBP and TFB from Sulfolobus solfataricus, Methanococcus jannaschii,

Methanothermobacter thermautotrophicus and Pyrococcus furiosus (Thomm et al., 1992;

Qureshi et al., 1997; Darcy et al., 1999; Hethke et al., 1999). More recently, an in vitro transcription system has also been established using Methanococcus jannaschii RNAP reconstituted from individual recombinant subunits and recombinant TBP and TFB

(Werner and Weinzierl, 2002; Ouhammouch et al., 2004). RNAP, TBP and TFB together are sufficient to direct transcription initiation in vitro, although only a small number of promoters have been used. These promoters are the SSV T6, tRNAVal, gdh and hmtB promoters that have consensus TATA-box and BRE sequences. It was reported that the

M.t. in vitro transcription system failed to initiate transcription from many methane gene promoters and the Sulfolobus system failed to initiate transcription from the lysW promoter (Darcy, 1999; Brinkman et al., 2002), suggesting that additional factors are needed for transcription from these promoters.

Some Archaea contain more than one TBP and/or TFB. For example, the

Halobacterium NRC-1 genome encodes 6 different TBPs and 7 TFBs. It was hypothesized that these Archaea use different combinations of TBPs and TFBs to regulate the expression of different genes, analogous to the use of alternative sigma

factors in Bacteria (Baliga et al., 2000). As evidence to support this hypothesis, the synthesis of at least one TFB in H. volcanii was upregulated after heat shock (Thompson et al., 1999).

All sequenced archaeal genomes also encode a protein, designated TFE, whose amino acid sequence is homologous to the sequence of the N-terminal region of the eukaryal TFIIE-α-subunit. TFE has the zinc ribbon motif that is conserved in eukaryal

TFIIE-α-subunits. Since there is no TFIIH homolog in Archaea, the function of TFE cannot be to recruit TFIIH. Although archaeal in vitro transcription can occur in reaction mixtures that contain only the template DNA, TBP, TFB and RNAP, TFE stimulated transcription from the frh, fmd, hdrC, mcr, ftr, fwd promoters 1.8- to 3.4-fold in the M. t. in vitro transcription system (Hanzelka et al., 2001). The stimulatory effect was dependent on the zinc ribbon motif, as a cysteine to alanine replacement resulted in a

TFE (C155A) variant with no stimulatory activity. TFE also stimulated transcription initiation 3-fold in Sulfolobus solfataricus and Methanococcus jannaschii in vitro transcription systems (Bell et al., 2001a; Ouhammouch et al., 2004).

Some Archaea, including Archaeoglobus fulgidus, Aeropyrum pernix, Pyrococcus abyssi, Pyrococcus horikoshii and Sulfolobus shibatae, have a homolog of the eukaryal

TIP49 (TBP-interacting protein 49 kDa). The eukaryal TIP49 was cloned based on its affinity for TBP and both rat and human TIP49 interact with TBP in nuclear extracts

(Kurokawa et al., 1999). The function of the archaeal TIP49 is unknown, but it contains

ATP/GTP binding motifs. Another TBP-interacting protein was isolated from cell extracts of Pyrococcus kodakaraensis by affinity chromatography (Matsuda et al., 1999).

This protein has a molecular weight of 26 kDa, and was designated TIP26. TIP26 binds to TBP in vitro and prevents TBP from binding to the TATA-box.

All sequenced archaeal genomes appear to encode a protein with sequence similarity to eukaryal transcription factor MBF1 (multiple bridging factor 1). The archaeal relatives of MBF1 have a C-terminal HTH domain and an N-terminal zinc ribbon motif. The homology between the archaeal and eukaryal MBF1 is limited to the

C-terminal region, as the N-terminal zinc ribbon motif is absent in eukaryal MBF1. In eukaryotes, MBF1 functions as a bridging factor between DNA-bound transcription activators and TBP (Takemaru et al., 1997; Takemaru et al., 1998; Brendel et al., 2002;

Liu et al., 2003). Archaeal MBF1 may function in a similar way, although experimental evidence is lacking.

Based on currently available genome sequences, Archaea do not have homologs of the eukaryal TAFs, TFIIF, TFIIE-β, and TFIIH. Therefore, the archaeal PIC must be a simpler version of the eukaryal PIC, and contains only RNAP, TBP, TFB and possibly

TFE. Archaeal PIC assembly is thought to be initiated by TBP binding to the TATA-box, followed by TFB binding to the BRE. RNAP is then recruited and positioned in the correct orientation for transcription initiation. DNase I footprinting has shown that a

TBP-TFB-promoter complex protects a region of DNA from –40 to –14 on the noncoding strand and from –36 to –17 on the coding strand, relative to the transcription start site (Hausner and Thomm, 2001). With the addition of RNAP, the protected region extended to +17 on the noncoding strand, and +13 on the coding strand (Hausner and

Thomm, 2001). Unlike the eukaryal RNAPII system, formation of an archaeal open promoter complex does not require the hydrolysis of the β-γ bond of ATP, and

permanganate probing has revealed that the open complex spans from –11 to –1 (Hausner and Thomm, 2001). The arrangement of TBP and TFB from base pair –35 to +20 (relative to the transcriptional start site) was analyzed by photochemical protein-DNA cross-linking (Bartlett et al., 2004; Renfrow et al., 2004). TBP was cross-linked to the TATA-box and TFB was cross-linked to DNA both upstream and downstream of the TATA-box.

The most prominent sites of TFB cross-linking are located downstream of the TATA-box, reaching as far as the start site of transcription, suggesting a role of TFB in transcription beyond RNAP recruitment.

Archaea do not have homologs of the eukaryal transcription elongation factors,

TFIIF, Spt6, FACT, Paf1, and Cdc73. However, archaeal TFS is a homolog of eukaryal elongation factor TFIIS, and in vitro transcription experiments with paused elongation complexes have shown that TFS is able to induce nascent transcript cleavage activity within an elongating archaeal RNAP, similar to the activity of eukaryal TFIIS (Hausner et al., 1999; Lange and Hausner, 2004). Many archaeal genomes also encode a homolog of the Elp3 subunit of the eukaryal Elongator complex, but not the other five subunits of the complex. Homologs of the elongation-termination-antitermination factors NusA and NusG are also encoded in many archaeal genomes, but their functions have not been experimentally investigated.

Archaeal chromatin and transcription

How archaeal chromatin affects transcription is one of the topics investigated in this thesis. In many Archaea, archaeal histones are considered the major chromatin

proteins, whereas in Sulfolobus species, which lack histones, Alba is the major chromatin protein (Reeve, 2003).

Addition of HPyA1, an archaeal histone from Pyrococcus strain GB-3a, to a

Pyrococcus furiosus in vitro transcription system inhibited transcription from the gdh promoter (Soares et al., 1998). The extent of this inhibition was dependent on both the histone-to-DNA ratio, and the topology of the template DNA. At a histone-to-DNA mass ratio of 1:1, transcription from a relaxed circular template was decreased by 70%, whereas transcription from a negatively supercoiled template was decreased by only

10%, and there was no decrease in transcription when the template was linear.

Apparently, the archaeal RNAP could transcribe DNA in the presence of this archaeal histone, but the presence of the archaeal histone did reduce the amount of transcription.

In these experiments, there was no attempt to separate the effects of histone binding on transcription initiation versus transcript elongation.

Alba is the chromatin protein in several Sulfolobus species (Xue et al., 2000;

Wardleworth et al., 2002) and is also known as Ssh10b, Sso10b, and Sac10b, depending on the Sulfolobus species of origin. Alba homologs are also present in other thermophilic and hyperthermophilic Archaea. Alba binding coats DNA, but does not apparently compact DNA, as archaeal histones do. Addition of recombinant Alba repressed transcription in a Sulfolobus in vitro transcription system at low micromolar concentrations, whereas addition of native Alba had no repressive effect at these concentrations (Bell et al., 2002). Native Alba was found to be acetylated at lysine residue 16, which lowered its affinity for DNA by a factor of 30 (Bell et al., 2002), but it was then reported that there was no significant difference in DNA-binding affinities

between the acetylated and unacetylated forms of Alba (Wardleworth et al., 2002).

Acetylated Alba is deacetylated in vitro by a Sulfolobus homolog of Sir2, an NAD-dependent

histone deacetylase. Recently, a Sulfolobus protein responsible for acetylating

Alba has also been identified (Marsh et al., 2005).

Transcription regulation in Archaea

Although the archaeal and eukaryal basal transcription machineries appear

homologous, most archaeal transcription regulators resemble bacterial regulators

(Aravind and Koonin, 1999; Kyrpides and Ouzounis, 1999). Eukaryal transcription

regulators with domains such as , MADS box, POU, bZIP, bHLH, Paired and

SANT have no apparent homologs in Archaea (Aravind and Koonin, 1999; Makarova

and Koonin, 2003). Instead, archaeal genomes have many bacterial-type HTH motif

transcription regulators. The abundance of HTH transcription regulators in archaeal and

bacterial genomes (corrected for genome size) is about equal (Aravind and Koonin,

1999). The archaeal transcription regulators are members of diverse families, including

Asn/Lrp, ModE, MerR, LysR, TetR, PbsX, MoxR, Fur, ArsR, MarR, NagC/XylR, DtxR,

HypF, PhoU, PspC, PurR, RpiR and DegT/DnrJ. Archaea may also contain unique

transcription regulators that have so far escaped detection by sequence homology

searches. The study of transcription regulation in Archaea is in its infancy, and only a few

archaeal transcription regulators have been studied in detail, but they include both

negative and positive regulators.

MDR1 (metal dependent repressor 1) of Archaeoglobus fulgidus is a homolog of

the bacterial metal dependent transcription repressor DtxR in Corynebacterium

diphtheriae. It contains an HTH DNA-binding motif and is the first gene of an operon

that also encodes an ABC metal transporter. Transcription of MDR1 in vivo is regulated

by metal ion availability, as addition of EDTA to the culture medium increased the

amount of MDR1 mRNA in the cells by 16-fold. MDR1 binds to the promoter of its own

gene in vitro in the presence of Fe2+, Mn2+, and Ni2+. MDR1 and Mn2+ together inhibited

transcription from its own promoter in vitro. DNase I footprinting revealed that the

MDR1 binding site is located downstream of the TATA-box, overlapping the

transcription start site. MDR1 does not interfere with TBP and TFB binding, but prevents

RNAP binding (Bell et al., 1999a).

Lrs14 of Sulfolobus solfataricus is a homolog of E. coli leucine responsive protein

(Lrp) that contains an HTH DNA-binding motif. Lrs14 binds to the promoter of its own gene in a leucine-independent manner and inhibits its own transcription in vitro. The

Lrs14 binding site overlaps the BRE and TATA-box and binding of Lrs14 prevents TBP and TFB from binding to the promoter, thus blocking transcription initiation (Bell and

Jackson, 2000).

LrpA of Pyrococcus furiosus is another homolog of E. coli Lrp and contains an

HTH DNA-binding motif. Similar to Lrs14, LrpA binds to its own promoter in a leucine-independent manner and inhibits its own transcription. However, the LrpA binding site is located downstream of the TATA-box, and overlaps the transcription start site. LrpA binding to the promoter does not interfere with TBP and TFB binding, but prevents

RNAP binding (Dahlke and Thomm, 2002).

Phr of Pyrococcus furiosus is another archaeal transcription repressor that contains an HTH DNA-binding motif. Phr binds to the promoter of its own gene and the

promoters of two other genes encoding Hsp20 and an AAA+ ATPase, and inhibits

transcription from these promoters in vitro. The Phr binding site is located downstream of

the TATA-box and Phr binding therefore allows TBP and TFB binding, but prevents

RNAP binding (Vierke et al., 2003).

TrmB is a repressor protein in Thermococcus litoralis that regulates transcription

of the mal operon that encodes proteins for the import of maltose and possibly other

related sugars. In the absence of maltose, TrmB binds to the promoter region of the mal

operon, inhibiting mal transcription; in the presence of maltose, TrmB is

released from the mal operon promoter, allowing mal transcription (Lee et al., 2003).

In addition to negative transcription regulators, positive transcription regulators

have also been identified in Archaea. The best biochemically studied archaeal

transcriptional activator is Ptr2 from M. jannaschii. Ptr2 is an Lrp homolog. It binds to

sequences upstream of the BRE and TATA-box of the fdxA and rb2 promoters and then

recruits TBP to the promoter. Ptr2 stimulated transcription in vitro from these two

promoters by 3-fold (Ouhammouch et al., 2003).

Based on genetic evidence, GvpE from H. salinarum activates the transcription of

the gvpA gene that encodes the major gas vesicle structural protein. When transformed

into the non-gas vesicle forming H. volcanii, gvpA was expressed only when the gvpE gene was cotransformed (Hofacker et al., 2004). GvpE is unique to H. salinarum and it does not have sequence similarity to any other protein in Eukarya, Bacteria or Archaea.

Molecular modeling revealed that GvpE has a basic leucine zipper (bZIP) motif (Kruger et al., 1998). Site-directed mutagenesis of some but not all of the predicted key residues abolished its activator function. If GvpE were indeed a bZIP transcription regulator, it

would be the first of its kind in Archaea. The mechanism by which GvpE activates gvpA transcription is unknown (Hofacker et al., 2004).

At the time this project started, an in vitro transcription system had just been established using M.t. RNAP, TBP and TFB (Darcy et al., 1999). Archaeal histones had been studied extensively, and DNA sequences that position archaeal nucleosome assembly had been isolated (Bailey et al., 2000). These developments made it possible to examine the effects of archaeal nucleosome assembly on transcription. Furthermore, the complete M.t. genome sequence had become available (Smith et al., 1997), making it possible to search the genome for potential transcription regulators and to test them using the M.t. in vitro transcription system. The following chapters present experiments that were undertaken to address three questions: 1) how does archaeal nucleosome assembly affect the basal transcription machinery; 2) what are the fates of archaeal general transcription factors after transcription initiation; and 3) how does an archaeal transcriptional regulator regulate transcription initiation from the trp operon. The findings will contribute to a better understanding of transcription in Archaea and stimulate future investigations in this new and exciting field.

CHAPTER 2

TRANSCRIPTION BY AN ARCHAEAL RNAP IS SLOWED BUT NOT BLOCKED BY ARCHAEAL NUCLEOSOMES

Introduction

Transcription by eukaryal RNAPII on nucleosomal templates has been studied extensively in vitro. In early reports, RNAPII transcription was completely inhibited by both (H3+H4)2 tetramers and complete nucleosomes (Morse et al., 1987; Hansen and

Wolffe, 1992; Izban and Luse, 1992; Chang and Luse, 1997). More recently, it was reported that a eukaryal nucleosome was a strong barrier to RNAPII, but this barrier was overcome at elevated salt concentration, which may weaken both histone dimer-tetramer and histone-DNA interactions. Further, after the passage of RNAPII, there was a quantitative loss of one H2A/H2B dimer, leaving behind a histone hexamer

(Kireeva et al., 2002). H2A/H2B removal in vitro was facilitated by FACT, which also facilitated H2A/H2B re-deposition (Belotserkovskaya et al., 2003). Early studies of transcription in living cells indicated that actively transcribed DNA has an altered nucleosome conformation and is partially depleted of H2A/H2B and H1 (Nacheva et al.,

1989; Hayes and Wolffe, 1992; Wolffe, 1994). In a more recent study, the elongating

RNAPII was found to evict all core histones (H3, H4, H2A and H2B) from the yeast

GAL10 gene in vivo, but the eviction of core histones was transient, as histones were re-deposited onto the GAL10 DNA within 1 min after the passage of RNAPII (Schwabish and Struhl, 2004).

Transcription by RNAPIII on nucleosomal templates has also been studied in vitro. In contrast to RNAPII, RNAPIII is able to transcribe through a single nucleosome in vitro at physiological salt concentrations without the help of any accessory proteins.

During this process, the nucleosome was not dissociated from the template DNA, nor was there any displacement of the H2A/H2B dimer, but the entire nucleosome was transferred to an upstream promoter-proximal location through a DNA-looping mechanism

(Studitsky et al., 1997; Bednar et al., 1999).

Archaeal RNAPs have subunit compositions and complexities similar to those of the three eukaryal RNAPs and most closely resemble RNAPII (Langer et al., 1995; Darcy et al.,

1999; Werner and Weinzierl, 2002). Archaeal nucleosomes, on the other hand, closely resemble the tetrasome formed at the center of a eukaryal nucleosome by ~90 bp of DNA wrapped around a histone (H3+H4)2 tetramer (Sandman and Reeve, 2000; Reeve, 2003).

DNA sequences that position archaeal nucleosome assemblies have been isolated using a

SELEX procedure (Bailey et al., 2000), making it possible to construct transcription templates on which an archaeal nucleosome will assemble at a specific position. Using such templates and the archaeal in vitro transcription system established using M.t.-derived components (Darcy et al., 1999), experiments were undertaken to determine the ability of the M.t. RNAP to transcribe through an archaeal nucleosome. DNA templates were also constructed that allowed the separation of transcription initiation from elongation. The results obtained demonstrated that an archaeal nucleosome slows, but does not block, transcript elongation by the M.t. RNAP.


Materials and Methods

Chemicals and reagents

Unless otherwise noted, all chemicals were purchased from Sigma (St. Louis, MO) and DNA modifying enzymes and restriction enzymes from Invitrogen (Carlsbad, CA).

Agarose was purchased from Amresco (Solon, OH), streptavidin-coated Dynabeads from

Dynal Biotech (Lake Success, NY) and plasmid miniprep kits, PCR cleanup kits and Ni-

NTA spin columns from Qiagen (Valencia, CA). All [32P] radiochemicals were

purchased from ICN (Costa Mesa, CA).

Construction of transcription templates

Plasmids pYX2 and pYX6 were generated using molecular cloning techniques as

described in Figure 2.1. Transcription template T284 was PCR amplified from pYX6

using oligonucleotides TD2 and MX1 as primers (see Appendix). A second transcription

template, T435, was generated as described in Figure 2.2. All PCR generated DNA

molecules were purified using Qiagen PCR clean-up kits before use in downstream

reactions, such as micrococcal nuclease (MN) digestion, footprinting and in vitro

transcription reactions.

Growth of M. t.

M. t. cultures were grown at 65°C under strictly anaerobic conditions. The growth

medium contained (per L) 4 g NaHCO3, 0.3 g KH2PO4, 1 g NH4Cl, 0.6 g NaCl, 100 mg

MgCl2⋅6H2O, 60 mg CaCl2⋅2H2O, 10 ml trace element solution, 0.5 g cysteine, 0.62 g

sodium thiosulfate, and a trace amount of resazurin. The trace element solution contained

(per L) 50 mg AlCl3⋅6H2O, 100 mg CaCl2⋅2H2O, CoCl2⋅6H2O, 25 mg CuCl2⋅2H2O, 1.35

g FeCl3⋅6H2O, 10 mg H3BO3, 100 mg MnCl2⋅4H2O, 1 g NaCl, 24 mg Na2MoO4⋅2H2O, 26 mg Na2SeO4⋅6H2O, 120 mg NiCl2⋅6H2O and 100 mg ZnCl2.
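Because the growth medium is specified per litre, the amounts needed for the 1.5-L and 20-L fermentor cultures follow by simple scaling. The sketch below shows that arithmetic only; the dictionary keys and the function name are illustrative assumptions, and resazurin is omitted because the text gives only "a trace amount".

```python
# Amounts per litre of growth medium, as listed in the text.
# Solids are in grams; the trace element solution is in millilitres.
# Resazurin is left out because no quantity is stated.
MEDIUM_PER_LITRE = {
    "NaHCO3 (g)": 4.0,
    "KH2PO4 (g)": 0.3,
    "NH4Cl (g)": 1.0,
    "NaCl (g)": 0.6,
    "MgCl2.6H2O (g)": 0.1,    # 100 mg
    "CaCl2.2H2O (g)": 0.06,   # 60 mg
    "trace element solution (ml)": 10.0,
    "cysteine (g)": 0.5,
    "sodium thiosulfate (g)": 0.62,
}


def scale_medium(volume_l: float) -> dict:
    """Return the amount of each component for volume_l litres of medium."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    # Round to 3 decimals (1 mg or 1 microlitre) to hide float noise.
    return {
        name: round(per_litre * volume_l, 3)
        for name, per_litre in MEDIUM_PER_LITRE.items()
    }


if __name__ == "__main__":
    for volume in (1.5, 20.0):
        print(f"--- {volume} L ---")
        for name, amount in scale_medium(volume).items():
            print(f"{name}: {amount}")
```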

To grow a 1.5-L fermentor culture, the fermentor was filled with growth medium,

sealed, and sterilized by autoclaving. Before autoclaving, the medium was dark blue,

because of the redox indicator resazurin. After autoclaving, the medium became light

blue or pink. The fermentor was then maintained at a constant temperature of 65°C and the medium sparged with a gas mixture of 89% H2 and 11% CO2. The gas flow rate was

200 ml/min and the rotating impellor was set at 600 rpm. After several hours of sparging,

the medium became colorless, indicating that it was reduced. At this stage, 10 ml of M.t.

culture from a sealed serum bottle was inoculated into the fermentor using a sterile

syringe. After the culture grew to an OD600 of ~1.5, the gas supply was stopped and the

culture harvested into 1-L Wheaton bottles anaerobically.

To grow a 20-L fermentor culture, the fermentor was filled with growth medium,

and the whole fermentor including the growth medium was steam sterilized in situ. The fermentor was then sparged with a gas mixture of 89% H2 and 11% CO2 for 2 h and was

ready for inoculation when it turned colorless. During inoculation, 500 ml of inoculum

from a Wheaton bottle was injected into the fermentor and the culture grown with

continuous gas sparging and impellor mixing. When the cultures grew to an OD600 of

~0.8, the cells were harvested anaerobically using an in-line centrifuge. After harvesting, the cells were collected as a dense paste in a sealed metal vessel, transferred into an anaerobic chamber (Coy Laboratory Products, Grass Lake, MI), and used either

immediately for RNAP purification or scraped into centrifuge tubes and stored in a -70°C freezer.

Purification of M.t. RNAP

Solutions A (1 M KCl, 10 mM MgCl2, 50 mM Tris-HCl), B (50 mM KCl, 10 mM

MgCl2, 50 mM Tris-HCl), and C (10 mM MgCl2, 50 mM Tris-HCl) were made, all at pH

8.0 and containing 20% (v/v) glycerol and a trace amount of resazurin. The solutions were filtered, autoclaved and, after cooling down to RT, transferred into an anaerobic chamber. M.t. cells (~20 g) were resuspended in 40 ml of solution B, and ruptured by passage through a French press at 18,000 psi. The cell lysate was collected directly into a serum bottle containing 200 µl of cysteine (150 mg/ml) and 200 µl of sodium-thiosulfate

(186 mg/ml) solutions. The cell lysate was transferred into centrifuge tubes, sealed and centrifuged at 10,000 g at 4°C for 90 min. The supernatant was collected and transferred into an anaerobic chamber where all subsequent chromatographic steps were undertaken at RT in an atmosphere of 97% N2 and 3% H2. The supernatant was loaded at 2 ml/min onto a 200-ml bed-volume DEAE cellulose column (Whatman; Fairfield, NJ) pre-equilibrated with solution B. The column was washed with 300 ml of solution B, and the bound proteins were then eluted using a 50 to 525 mM KCl gradient (800 ml) generated by mixing solutions A and B. Fractions (9 ml per fraction) were collected and assayed for the presence of RNAP activity by using a promoter-independent transcription assay.

The fractions containing RNAP activity were pooled, mixed with solution C to reduce the KCl concentration to 70 mM, and loaded at 1 ml/min onto a 20-ml bed-volume heparin-sepharose column pre-equilibrated with solution B. The column was

washed with 60 ml of solution B, and the bound proteins were eluted using a 50 mM to 1

M KCl gradient (200 ml). Fractions (4 ml per fraction) were collected and assayed for

RNAP activity. The fractions containing RNAP activity were pooled, mixed with

solution C to reduce the KCl concentration to 70 mM, and loaded onto a 1-ml bed-

volume mono-Q column (Amersham Pharmacia) pre-equilibrated with solution B. The

column was washed with 3 ml of solution B, and the bound proteins were eluted using a

50 mM to 1 M KCl gradient (15 ml). Fractions (0.5 ml per fraction) were collected and

assayed for RNAP activity. The fractions containing RNAP activity were pooled and

loaded onto a HiLoad 16/60 Superdex 200 gel filtration column (120 ml bed-volume, 60

cm height; Amersham Pharmacia), and eluted with a 27%: 73% mixture of solutions A

and B. Fractions (2 ml per fraction) were collected and assayed for RNAP activity. The

fractions containing RNAP activity were pooled, DTT was added (1 mM final

concentration), and the RNAP solutions were aliquoted into 1.5-ml microfuge tubes and

stored in a -70°C freezer.
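The KCl concentrations used during the chromatography steps follow from simple mixing arithmetic, because solutions A, B and C differ only in KCl (1 M, 50 mM and none, respectively). The sketch below is illustrative: the function names are hypothetical, and the pooled-fraction volume and KCl concentration in the usage example are assumed values, not measurements from this work.

```python
KCL_A_MM = 1000.0  # solution A: 1 M KCl
KCL_B_MM = 50.0    # solution B: 50 mM KCl


def kcl_of_mix(fraction_a):
    """KCl concentration (mM) of a mixture of solutions A and B,
    where fraction_a is the volume fraction of A and the rest is B."""
    return fraction_a * KCL_A_MM + (1.0 - fraction_a) * KCL_B_MM


def fraction_a_for(target_mm):
    """Volume fraction of solution A needed to reach target_mm KCl."""
    return (target_mm - KCL_B_MM) / (KCL_A_MM - KCL_B_MM)


def volume_c_to_dilute(pool_ml, pool_kcl_mm, target_mm=70.0):
    """Volume (ml) of KCl-free solution C to add to pooled fractions so
    that the final KCl concentration equals target_mm."""
    if pool_kcl_mm <= target_mm:
        return 0.0
    return pool_ml * (pool_kcl_mm / target_mm - 1.0)


if __name__ == "__main__":
    # Gel filtration buffer: 27% solution A and 73% solution B (about 306.5 mM).
    print(round(kcl_of_mix(0.27), 1))
    # Upper end of the DEAE gradient (525 mM) corresponds to a 1:1 mix.
    print(fraction_a_for(525.0))
    # Hypothetical pool: 36 ml at 350 mM KCl, diluted to 70 mM.
    print(volume_c_to_dilute(36.0, 350.0))
```

Under these definitions the 27%:73% gel filtration buffer contains about 306.5 mM KCl, and a pool at 350 mM KCl would need four volumes of solution C to reach 70 mM.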

Promoter-independent transcription assays

Promoter-independent transcription assays were used to detect RNAP activity.

For each assay, 10 µl of a fraction eluted from a column was mixed with 90 µl of a

solution of 20 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 40 mM KCl, 1 mM ATP, 0.1 mM

UTP, 0.7 µCi [α-32P] UTP (3000 Ci/mmole) and 9 µg of poly dA-dT. After incubation at

58°C for 30 min, 900 µl of an ice-cold solution of 5% trichloroacetic acid (TCA) and 165 mM NaCl was added to each reaction. The mixtures were placed on ice for 5 min and precipitated macromolecules collected by filtration through glass microfiber filters

(934-AH, Whatman). The filters were washed three times with 10 ml of cold 5% TCA, and once with 10 ml of ice-cold 95% ethanol. The filters were dried under a heat lamp, and the radioactivity bound to each filter was measured by liquid scintillation counting.

Preparation of M.t. TBP and TFB

E. coli strains overexpressing (his)6-tagged M.t. TBP and TFB were obtained from Trevor Darcy (Darcy, 1999). Cultures (50 ml) were grown in LB liquid medium containing ampicillin (50 µg/ml) at 37°C to an OD600 of ~0.6 and IPTG was added to a final concentration of 1 mM. After 2 h of further incubation, the cells were harvested by centrifugation in a Sorvall SS34 rotor at 5000 rpm for 10 min. The cells were resuspended in 1.2 ml of lysis buffer [50 mM Na3PO4 (pH 8.0), 300 mM NaCl] and passaged twice through a French press. The cell lysate was collected and centrifuged in an Eppendorf microfuge at RT at 14,000 rpm for 20 min. The supernatant was collected and loaded onto a Qiagen Ni-NTA column. The column was washed twice with 0.6 ml of wash buffer [50 mM Na3PO4 (pH 6.0), 300 mM NaCl, 50 mM imidazole] and eluted twice with

0.2 ml elution buffer [50 mM Na3PO4 (pH 6.0), 300 mM NaCl, 500 mM imidazole]. The eluates were pooled and dialyzed against 500 ml 50 mM Tris-HCl (pH 8), 300 mM KCl,

10 mM MgCl2, 1 mM DTT, 20% (v/v) glycerol to remove the imidazole. Glycerol was added to the dialyzed protein solution to a final concentration of 35% (v/v), and the TFB and TBP preparations were stored at –20 °C.

Preparation of archaeal histone HMtA2

An E. coli strain overexpressing untagged HMtA2 was obtained from Suzette

Pereira (personal communication). Cultures were grown at 37°C to an OD600 of ~0.6 and

IPTG was added to a final concentration of 1 mM. After 2 h of further incubation, the

cells were harvested by centrifugation in a Sorvall SS34 rotor at 5000 rpm for 10 min. The

cell pellet was resuspended in 2 ml of high salt buffer [3 M NaCl; 50 mM Tris-HCl (pH

8.0); 2 mM Na2HPO4 (pH 8.0)] per gram of cells (wet weight) and passaged twice through

a French press. The lysate was centrifuged in a Sorvall SS34 rotor at 15,000 rpm for 30

min, and then in a Beckman Ti60 rotor at 40,000 rpm for 90 min. The supernatant was

collected and dialyzed against 3 L of dialysis buffer [0.1 M NaCl, 50 mM Tris-HCl (pH

8.0)] at 4°C overnight using a dialysis membrane with a molecular weight cutoff of 3500.

After dialysis, the sample was treated with DNase I (20 µg/ml final concentration) for 4 h

at 37°C in the presence of 5 mM MgCl2 and 0.1 mM PMSF. Solid NaCl was then added to a final concentration of 3 M and the sample incubated at 100°C in a water bath for 10 min. The solution was filtered through cheesecloth to remove the precipitate and the filtrate was collected and dialyzed against 3 L of dialysis buffer [0.1 M NaCl, 50 mM

Tris-HCl (pH 8.0)] at 4°C overnight using a dialysis membrane with a molecular weight cutoff of 3500. The dialyzed sample was applied to a Hi-Trap heparin column

(Amersham Pharmacia) pre-equilibrated with 0.1 M KCl, 50 mM Tris-HCl (pH 8.0) and washed with a 0.1 M to 1.5 M KCl gradient in 50 mM Tris-HCl (pH 8.0). Fractions were collected and assayed for HMtA2 using agarose gel-shift assays or SDS-PAGE. The fractions containing HMtA2 were pooled and concentrated using a Microcon device into storage buffer [1 M NaCl, 50 mM Tris-HCl (pH 8.0)] and stored at –20°C.

Agarose gel shift assay

pBR322 DNA was purified from E. coli DH5α cells using a Qiagen miniprep kit

according to the manufacturer’s instructions and linearized with restriction enzyme

EcoRI. A 5-µl aliquot of the eluate from the heparin column and 50 ng of linearized pBR322 DNA were mixed and incubated in 20 µl of 50 mM KCl, 25 mM MES (pH 6.0) at RT for 20 min. The complexes formed were separated by electrophoresis through 0.8%

(w/v) agarose gels run at 20 V in TBE buffer (90 mM Tris-borate, 2 mM EDTA, pH 8.0) for 16 h. The agarose gels were visualized by ethidium bromide staining.

[32P]-labeling of DNA templates

DNA templates were labeled using T4 kinase. A 20-µl reaction mixture contained

5 U of T4 kinase and 0.2 mCi [γ-32P] ATP (7000 Ci/mmole) in forward reaction buffer

[70 mM Tris-HCl (pH 8.0), 100 mM KCl, 1 mM DTT]. The labeling reactions were

incubated at 37°C for 1 h and terminated by adding 20 µl of 20 mM EDTA. The

unincorporated nucleotides were removed from the labeled DNA by passing the reaction

mixture through a syringe packed with Sephadex G-50 resin.

Polyacrylamide gel shift assays

Aliquots of [32P]-labeled DNA (50 ng) were incubated with increasing amounts of

HMtA2 for 20 min at RT in 50 µl of transcription buffer [20 mM Tris-HCl (pH 8), 120

mM KCl, 10 mM MgCl2, 2 mM DTT]. Following the addition of 5.5 µl of loading buffer

(40% w/v sucrose, 0.4 % bromophenol blue, 0.4% xylene cyanol), the reaction products

were separated by electrophoresis through 6% native polyacrylamide gels run in TBE

buffer and visualized by autoradiography.

Aliquots of ternary complexes containing the [32P]-labeled 24-nt transcripts were incubated with increasing amounts of HMtA2 in transcription buffer at RT for 20 min.

The products were separated by electrophoresis through 5% native polyacrylamide gels and visualized using a phosphorimager.

Micrococcal nuclease footprinting of HMtA2 assembly

Aliquots of DNA (50 ng) were incubated for 20 min at RT with increasing

amounts of HMtA2 in 50 µl of transcription buffer. One microliter of 100 mM CaCl2 and

1 U of micrococcal nuclease (MN) were added, and the reaction mixtures incubated at

37°C for 1 min. One microliter of a solution of 100 mM EDTA, 20 mg of proteinase K and 1 µl of 10% (w/v) sodium dodecylsulfate (SDS) was then added, and incubation continued at 37°C for 30 min. Undigested DNA molecules were separated by electrophoresis through 10% native polyacrylamide gels and visualized by SYBR-Gold staining. MN-protected DNA fragments were purified from the gels, end-labeled, and subjected to restriction enzyme digestions. The restriction fragments were separated by electrophoresis through 12% denaturing polyacrylamide gels and visualized by autoradiography.

The boundaries of the MN-protected fragments were determined by primer extension. The MN-protected DNA (1 µl) was incubated with 1 U of Taq polymerase, 2

µl 5X PCR reaction buffer, 200 µM of dATP, dGTP, dCTP and dTTP, and 0.05 pmol of

[32P]-labeled primer KS1A or KS1B (see Appendix). The reaction mixtures were

incubated in a thermocycler through 10 cycles of 95°C for 15 s, 45°C for 10 s and 72°C

for 20 s. After the addition of 5 µl of 95% formamide, 20 mM EDTA, the reaction

products were separated by electrophoresis through 6% denaturing polyacrylamide

sequencing gels and visualized using a phosphorimager.

In vitro transcription

An in vitro transcription reaction mixture (50 µl) contained 50 ng of template

DNA, 50 ng of TBP, 300 ng of TFB, 500 ng of RNAP, 200 µM ATP, 200 µM UTP, 200

µM GTP, 20 µM CTP and 5 µCi of [α-32P] CTP (3000 Ci/mmol) in transcription buffer.

The reaction mixtures were incubated at 58°C for 30 min, stopped by adding 30 µl of

95% formamide, 20 mM EDTA, heated to 95°C in a thermocycler for 3 min, and the

products separated by electrophoresis through 6% denaturing polyacrylamide gels, and

visualized by using a phosphorimager.

When an archaeal histone was included in an in vitro transcription reaction, the

archaeal histone was incubated with the template DNA in transcription buffer at RT for

20 min before addition of TBP, TFB, RNAP and NTPs.
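The proportion of radiolabeled CTP in such a reaction can be estimated from the quantities listed above. This is a minimal sketch, assuming that the 20 µM CTP refers to unlabeled CTP at its final concentration in the 50-µl reaction; the function names are illustrative.

```python
def labeled_pmol(microcuries, specific_activity_ci_per_mmol):
    """Amount of radiolabeled nucleotide in pmol.
    Ci divided by Ci/mmol gives mmol; 1 mmol = 1e9 pmol."""
    return microcuries * 1e-6 / specific_activity_ci_per_mmol * 1e9


def unlabeled_pmol(concentration_um, volume_ul):
    """Amount of unlabeled nucleotide in pmol (µM multiplied by µl gives pmol)."""
    return concentration_um * volume_ul


if __name__ == "__main__":
    hot = labeled_pmol(5.0, 3000.0)    # 5 µCi of [α-32P] CTP at 3000 Ci/mmol
    cold = unlabeled_pmol(20.0, 50.0)  # 20 µM CTP in a 50-µl reaction
    fraction_labeled = hot / (hot + cold)
    print(f"labeled CTP: {hot:.2f} pmol")
    print(f"unlabeled CTP: {cold:.0f} pmol")
    print(f"fraction labeled: {fraction_labeled:.4%}")
```

On these assumptions, roughly 1.7 pmol of labeled CTP is diluted into 1000 pmol of unlabeled CTP, so fewer than 0.2% of the CTP molecules carry the label.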

Ternary complex isolation and stalled transcript elongation

Biotin-labeled template DNA molecules were generated by PCR using a biotin-

labeled primer (TD2) and an unlabeled primer (MX1). Ternary complexes that contained

the [32P]-labeled 24-nt U-less transcripts were generated by incubation of a biotin-labeled template DNA (50 ng), TBP (50 ng), TFB (300 ng), RNAP (500 ng), 200 µM ATP, 200

µM GTP, 20 µM CTP, and 20 µCi of [α-32P] CTP (3000 Ci/mmol) in 50 µl of

transcription buffer at 58°C for 20 min. After the mixtures were cooled to RT, 10 µg of

streptavidin-coated Dynabeads were added and the mixtures were incubated at RT for 5

min. The beads were then captured by attraction to a magnet, the supernatant removed,

and, after being washed four times with 25 µl of transcription buffer, the beads were

resuspended in transcription buffer and incubated with or without HMtA2 at RT for 20

min. The reaction mixtures were then placed at 58°C, and ATP, GTP, CTP, and UTP

(200 µM final concentrations) were added. Aliquots (10 µl) were removed at increasing

times and mixed with 10 µl of 95% formamide, 20 mM EDTA. The transcripts synthesized were separated by electrophoresis through 6% polyacrylamide sequencing gels, or through gels in which the upper two-thirds were 6% polyacrylamide and the lower third was 18% polyacrylamide, and visualized using a phosphorimager.

Results

Construction of T284 that positions a single nucleosome

Template T284 was PCR amplified from plasmid pYX6 (Figure 2.1) using

primers TD2 and MX1. The sequence of T284 is shown in Figure 2.3A. T284 has the

hmtB promoter (Darcy et al., 1999) and the first 24 bp downstream of the transcription

start site is the U-less cassette that can be transcribed in the presence of ATP, GTP and

CTP. Located ~100 bp downstream from the U-less cassette is the 60-bp Selex-1

sequence (Bailey et al., 2000). T284 has unique restriction sites for EcoRI, BamHI, NruI and HindIII.

Positioning of HMtA2 on T284

MN was used to identify the location of HMtA2 assembly on T284. As shown in

Figure 2.3B, when MN was incubated with the T284 DNA for 1 min in the absence of

HMtA2, the DNA was completely digested (lane 0). When HMtA2 was incubated with

T284 at a histone dimer to 100 bp DNA ratio of 5, which is similar to the in vivo ratio

(Pereira et al., 1997), a distinct ~90-bp DNA fragment was protected from MN digestion

(lane 5). At a histone dimer to 100 bp DNA ratio of 15, DNA molecules longer than 90 bp were observed (lane 15). At a histone dimer to 100 bp DNA ratio of 30, almost the entire T284 was protected from MN digestion (lane 30).
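The histone dimer to 100 bp DNA ratios can be converted into the number of histone dimers per template molecule and into molar amounts. The following is a rough sketch under two assumptions that are not stated in the text: that T284 is 284 bp long (inferred from its name) and that double-stranded DNA has an average mass of about 650 g/mol per base pair. The 50-ng template amount is taken from the Methods.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one base pair of dsDNA
T284_LENGTH_BP = 284           # assumed from the template name, not stated in the text


def template_pmol(mass_ng, length_bp):
    """pmol of a linear dsDNA template of the given mass and length."""
    grams = mass_ng * 1e-9
    return grams / (length_bp * AVG_BP_MASS_G_PER_MOL) * 1e12


def dimers_per_template(ratio_per_100bp, length_bp):
    """Histone dimers available per template molecule at a given
    histone dimer to 100 bp DNA ratio."""
    return ratio_per_100bp * length_bp / 100.0


def dimer_pmol(ratio_per_100bp, mass_ng, length_bp):
    """Total pmol of histone dimer needed to reach the given ratio."""
    return dimers_per_template(ratio_per_100bp, length_bp) * template_pmol(mass_ng, length_bp)


if __name__ == "__main__":
    for ratio in (5, 15, 30):
        per_template = dimers_per_template(ratio, T284_LENGTH_BP)
        total = dimer_pmol(ratio, 50.0, T284_LENGTH_BP)
        print(f"ratio {ratio}: {per_template:.1f} dimers per template, {total:.2f} pmol dimer")
```

Under these assumptions, 50 ng of template corresponds to about 0.27 pmol, and a ratio of 5 supplies about 14 dimers per template molecule.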

The ~90-bp DNA was cut out from the gel, purified, and end-labeled. The end-labeled DNA was digested with restriction enzymes. As shown in Figure 2.3C, digestion by EcoRI or by NruI generated 80-bp and 10-bp fragments and digestion by BamHI generated 25-bp and 65-bp fragments. From these digestion patterns, it was apparent that

HMtA2 assembly protected a discrete ~90-bp region of T284 with the promoter proximal boundary located ~10 bp upstream of the EcoRI site and the promoter distal boundary

~10 bp downstream of the NruI site.

Given this information, two primers KS1A and KS1B (see Appendix) were designed that annealed within the MN-protected region. Primer extension reactions were performed on the products of MN digestion and run on sequencing gels in parallel with sequencing ladders generated with the same primers (Figure 2.3D). At a histone dimer to

100 bp DNA ratio of 5, HMtA2 assembly on T284 protected an ~92-bp fragment that extended from ~50 bp downstream of the U-less cassette through most of the Selex-1 sequence. At the histone dimer to 100 bp DNA ratio of 15, the predominant MN-protected

fragment was ~152 bp, and this contained the ~92-bp region plus ~60 bp that extended 5΄ from this region and included part of the U-less cassette. At a histone dimer to 100 bp DNA ratio of 30, HMtA2 assembly incorporated DNA both 5΄ and 3΄ to the

~92-bp region and protected almost the entire T284 from MN digestion.

Polyacrylamide gel shift assays of HMtA2 binding to T284

Polyacrylamide gel shift assays were performed to determine the histone to DNA ratio at which all template DNA molecules have an assembled archaeal nucleosome. As shown in Figure 2.4A, at the histone dimer to 100 bp DNA ratio of 3, all the template

DNA molecules were shifted, indicating that all DNA molecules had an archaeal nucleosome. There was little change in the migration of the histone-DNA complexes formed at the histone to DNA ratios from 3 to 15. However, at the histone dimer to 100 bp DNA ratios of 20, 25 and 30, the mobility of the histone-DNA complexes was further reduced, and at the ratios of 25 and 30, some of the complexes formed did not enter the gel.

In vitro transcription of nucleosomal templates

When incubated with TBP, TFB, RNAP and NTPs and in the absence of HMtA2, a 225-nt transcript was generated from T284, consistent with transcription initiation from the expected site (Figure 2.4B, lane -). T284 was incubated with increasing amounts of

HMtA2 to assemble nucleosomes. Then, TBP, TFB, RNAP and NTPs were added to the reaction to determine if transcription could take place on nucleosome-containing templates. As shown in Figure 2.4B, at the histone dimer to 100 bp DNA ratios from 3 to

10, there was 6% or less reduction in transcript accumulation. At histone dimer to 100 bp DNA ratios from 15 to 25, there was a gradual reduction in transcript accumulation. At the histone dimer to DNA ratio of 30, transcription was 99% inhibited.

HMtA2 assembly did not prevent transcript elongation

To separate transcription initiation from transcript elongation, transcription was first initiated in the presence of only ATP, GTP and CTP using biotin-labeled T284 as the template. The stalled ternary complexes that resulted were captured using streptavidin-coated Dynabeads, washed, resuspended in transcription buffer, and incubated with increasing amounts of HMtA2. All four NTPs, including UTP, were then added to the reaction mixtures to allow transcript elongation to resume. As shown in Figure 2.5A, at all HMtA2 to ternary complex ratios, the 24-nt transcript of the ternary complex was extended into a 225-nt full-length run-off transcript. Apparently, assembly of HMtA2 on the ternary complexes did not prevent transcript elongation.

Given this result, it was possible that HMtA2 did not bind to the ternary complexes. Gel shift experiments were undertaken to investigate this possibility (Figure

2.5B). Ternary complexes were incubated with either HMtA2 or an archaeal histone variant (HMfB K13T+R19S+T54K) that lacks DNA-binding activity (Soares et al.,

2000). The addition of HMfB K13T+R19S+T54K did not change the mobility of the ternary complexes, but the addition of histone HMtA2 retarded the migration of the ternary complexes. The results indicate that HMtA2 bound to the ternary complexes under transcription conditions, but did not prevent transcript elongation.

HMtA2 binding decreased the rate of transcript elongation

Although HMtA2 binding to the template DNA did not prevent transcript

elongation, it was still possible that it reduced the rate of transcription. To determine if

this was the case, HMtA2 was incubated with ternary complexes containing a [32P]-labeled 24-nt stalled transcript to allow assembly of an archaeal nucleosome at the Selex-1

site, and these complexes were then added to a complete in vitro transcription reaction

mixture. The lengths of the labeled transcripts synthesized after increasing times of

incubation at 58°C were determined (Figure 2.6). On control templates lacking an archaeal nucleosome, transcript elongation occurred at a rate of 20 nt/s and almost all of

the 24-nt transcripts that were extended reached full length (225 nt) in 20 s. With the

archaeal nucleosome present, transcript elongation was slowed and 225-nt transcripts first became evident only after 30 s. The M.t. RNAP paused at several locations, both before

and within the archaeal nucleosome, and the predominant pause site coincided precisely

with the promoter-proximal boundary of the archaeal nucleosome (position +75, relative

to transcription start). Pauses were also observed at positions +35 and +50, and at several

locations separated by multiples of ~10 bp within the nucleosome (positions +90, +100,

+110, +130 and +150).
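The relationship between the reported pause positions and the positioned nucleosome can be tabulated as follows. This is an illustrative sketch: the promoter-proximal boundary (+75) and the pause positions are taken from the text, whereas the promoter-distal boundary (+165) is an assumption based on the ~90-bp protected region.

```python
NUCLEOSOME_START = 75  # promoter-proximal boundary, from the text
NUCLEOSOME_END = 165   # assumed: boundary plus the ~90-bp protected region

# Pause positions relative to the transcription start site, from the text.
PAUSES = [35, 50, 75, 90, 100, 110, 130, 150]


def classify(position):
    """Locate a pause site relative to the positioned nucleosome."""
    if position < NUCLEOSOME_START:
        return "upstream"
    if position == NUCLEOSOME_START:
        return "boundary"
    if position <= NUCLEOSOME_END:
        return "within"
    return "downstream"


# Pauses inside the nucleosome and the spacing between consecutive pauses.
within = [p for p in PAUSES if classify(p) == "within"]
spacings = [b - a for a, b in zip(within, within[1:])]

if __name__ == "__main__":
    for p in PAUSES:
        print(f"+{p}: {classify(p)}")
    print("spacings within nucleosome (bp):", spacings)
```

With these values every interval between consecutive pauses inside the nucleosome is a multiple of 10 bp.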

Construction of T435 that positions two nucleosomes

T435 was constructed to determine if the archaeal RNAP could transcribe through

two nucleosomes. T435 has the same promoter and U-less cassette as T284, but has two

repeats of the ~90 bp nucleosome assembly sequence (the region where a nucleosome

assembles on T284), separated by ~90 bp (Figures 2.7 and 2.8A). The locations of the

two assembly sequences are +75 to +165 and +250 to +340, relative to the transcription start site. DNase I footprinting showed that two regions of T435, each ~90 bp, were protected from DNase I digestion when T435 was incubated with HMtA2. The locations of the DNase I footprints are consistent with the assembly of two separate nucleosome structures on this template at the predicted locations separated by ~90 bp of histone-free

DNA (Figure 2.8A).

Transcription through two nucleosomes

Ternary transcription complexes containing [32P]-labeled 24-nt stalled transcripts were generated on biotin-labeled T435 by transcription in the absence of UTP. They were captured with streptavidin-coated beads, washed, and resuspended in transcription buffer.

HMtA2 was then added to these complexes at a histone dimer to 100 bp DNA ratio of 10, to allow archaeal nucleosome assembly on the template DNA. At this ratio, HMtA2 binding was determined by DNase I footprinting to result in protection at two separate nucleosome positioning locations (Figure 2.8A). After addition of all NTPs, transcript elongation was followed (Figure 2.8B). On control templates lacking the archaeal nucleosomes, transcript elongation occurred at a rate of 10 nt/s and almost all of the 24-nt transcripts that were extended reached full length (352 nt) in 30 s. With archaeal nucleosomes present, the overall rate of transcript elongation was slowed to ~3 nt/s and

352-nt transcripts first became evident only after 120 s. Some pauses were observed on both control DNA and nucleosomal DNA (indicated by ○) at positions +155 and +200, relative to the transcription start site. Some pauses were observed only on the

nucleosomal DNA (indicated by ●), at positions +75 and +225, which corresponded to the promoter-proximal boundaries of the two archaeal nucleosomes.
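The overall elongation rates on T435 can be checked against the time at which full-length transcripts appeared. The calculation below is a simple sketch that uses only values given in the text (24-nt stalled transcript, 352-nt run-off transcript, 30 s and 120 s); the function name is illustrative. Because samples were taken at discrete times, such values are averages over the whole template and do not reflect instantaneous rates.

```python
def average_rate(full_length_nt, stalled_nt, seconds):
    """Average elongation rate (nt/s) for extending a stalled transcript
    to full length within the given time."""
    return (full_length_nt - stalled_nt) / seconds


if __name__ == "__main__":
    control = average_rate(352, 24, 30)       # template without nucleosomes
    nucleosomal = average_rate(352, 24, 120)  # template with two nucleosomes
    print(round(control, 1))      # 10.9
    print(round(nucleosomal, 1))  # 2.7
```

These averages agree with the rates of 10 nt/s and ~3 nt/s stated above.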

Discussion

Assembly of HMtA2 on T284

Template T284 constructed in this study had the 60-bp Selex-1 sequence, and the results obtained from MN digestion, restriction digestion and primer extension experiments confirmed that the Selex-1 sequence had the ability to position archaeal nucleosome assembly in this template DNA context (Figure 2.3). Archaeal histone

HMtA2 was used, rather than HMfB, which was used in the original SELEX procedures that led to the isolation of the Selex-1 sequence (Bailey et al., 2000). This was because the in vitro transcription system was derived from M.t. components, and a histone from the same organism was therefore more appropriate. The DNA binding property of

HMtA2 is similar to that of HMfB (Tabassum et al., 1992) and HMtA2 has approximately the same affinity for Selex-1 sequence as HMfB (Bailey, 2000). Two other histones are present in M.t., but they were not used in this study; HMtA1 has relatively weak affinity for the Selex-1 sequence and HMtB does not bind DNA, as revealed by gel shift assays (Bailey, 2000).

At the histone dimer to 100 bp DNA ratio of 5, nucleosome assembly protected

~90 bp of T284 from MN digestion (Figure 2.3B). This suggests that the nucleosome structure formed under this condition had ~90 bp of DNA wrapped around a histone core,

consistent with a previous study based on EM measurements in which DNA wrapped

around an archaeal histone core resulted in an apparent length reduction of 91±5 bp

(Marc et al., 2002). The ~90 bp MN protected region included 45 bp of the 60-bp Selex-1

sequence and 45 bp of the DNA immediately upstream of the Selex-1 sequence. It is not

clear why the entire Selex-1 sequence was not incorporated into the archaeal nucleosome.

It indicates that the rules for archaeal nucleosome positioning are more complex than the

simple preference for AA and TA dinucleotides repeated at 10 or 11 bp intervals (Bailey,

2000). The ~90-bp MN-protected region in T284 must provide a higher thermodynamic

stability for a histone-DNA complex. Studies of eukaryal nucleosome positioning have

revealed a number of variables that affect nucleosome positioning, including DNA bend,

bendability, DNA twist and twistability (Widom, 2001), and it is likely that such

variables also affect archaeal nucleosome positioning.

At the histone dimer to 100 bp DNA ratio of 15, the length of DNA protected

from MN increased to ~152 bp, extending ~60 bp towards the promoter proximal end

(Figure 2.3B and D). It is not clear why nucleosome assembly extended in one direction

and not in the other direction, but presumably, this is a feature inherent in T284 sequence.

At the histone dimer to 100 bp DNA ratio of 30, almost the entire length of T284 was

protected from MN (Figure 2.3B and D). What are the structures of the histone-DNA

complexes formed at these different histone to DNA ratios? Do they all exist in vivo?

At present, there are no obvious answers to these questions. The minimal nucleosome structure seems to be a histone tetramer (Bailey et al., 1999), surrounded by 90 bp of

DNA, but at higher histone concentrations, additional histone dimer polymerization creates complexes that protect more than 90 bp of DNA. Archaeal

histone-DNA complexes that protected ~120 and ~180 bp from MN have been reported in several other studies (Pereira and Reeve, 1999; Tomschik et al., 2001). The concentration of histones in M.t. has not been accurately determined, and there appear to be growth-phase-dependent fluctuations in the synthesis of the three individual histones

(Kathleen Sandman, personal communication).

The polyacrylamide EMSA data provided additional information on nucleosome assembly on T284 (Figure 2.4A). At the histone dimer to 100 bp DNA ratios from 3 to

10, a distinct shifted band was seen, consistent with the formation of an archaeal nucleosome on each template molecule. At the histone dimer to 100 bp DNA ratios from

25 to 30, the mobility of the histone-DNA complexes formed was reduced, and some of the complexes formed did not enter the gel. This suggests that at these histone to DNA ratios, larger histone-DNA complexes were formed, consistent with the results obtained from MN digestion and primer extension experiments.

When transcription was performed at the histone dimer to DNA ratios from 3 to

10, there was 6% or less reduction in transcript accumulation (Figure 2.4B). Because the nucleosome was positioned under these conditions at the ~90-bp region distant from the promoter, it apparently did not interfere with transcription initiation and did not affect transcript accumulation. At the histone dimer to DNA ratios from 12 to 25, there was a gradual reduction in transcript accumulation. At the histone dimer to DNA ratio of 30, transcription was almost completely inhibited. From the MN digestion and primer extension results (Figure 2.3), it is known that under these conditions, the promoter region was partially or completely incorporated into an extended histone-DNA complex. This indicates that when the promoter was incorporated into a

histone-DNA complex, transcription was repressed, because the promoter would no

longer be accessible to TBP, TFB and/or RNAP. These results are consistent with reports

that eukaryal nucleosome assembly at a eukaryal promoter inhibited transcription in vitro

by preventing TBP or TFIID binding to the TATA-box (Workman and Roeder, 1987;

Hansen and Wolffe, 1992; Imbalzano et al., 1994).

Stability of an archaeal ternary transcription complex

Another feature of T284 is the U-less cassette that was used to separate transcription initiation from transcript elongation. When transcription was initiated in the presence of ATP, GTP and CTP, a stalled ternary complex containing a 24-nt nascent transcript, the template DNA and RNAP was generated. If the template was biotin-labeled at one end, the stalled ternary complex could be captured by streptavidin-coated beads and purified while retaining its transcription competency. The 24-nt nascent transcripts were extended into the 225-nt full-length transcripts in transcription buffer after addition of all four NTPs (Figure 2.5A). Apparently, the archaeal ternary complexes were very stable, as they did not dissociate during several rounds of centrifugation and washing.

The ability of the ternary complexes to resume transcript elongation was also very high, as almost 100% of the 24-nt nascent transcripts were extended to full-length transcripts.

Such a high degree of stability has also been demonstrated for M. thermolithotrophicus

RNAP elongation ternary complexes stalled at the end of a 25 bp C-less cassette

(Hausner et al., 2000). Transcription ternary complexes formed by E. coli RNAP also

remain stable and active for up to 24 h, even when incubated in 1 M KCl (Arndt and

Chamberlin, 1990). Apparently, the multi-subunit RNAPs from Archaea, Bacteria and

Eukarya share some common mechanisms in stabilizing their elongation complexes, such as a downstream DNA binding clamp, RNA/DNA hybrid binding sites, single-stranded

RNA binding sites and upstream RNA binding sites (Zhang et al., 1999; Korzheva et al.,

2000; Gnatt et al., 2001; Korzheva and Mustaev, 2001; Gnatt, 2002; Vassylyev et al.,

2002).

M.t. RNAP transcription through an archaeal nucleosome

M.t. RNAP was able to transcribe through one nucleosome without the help of

any additional protein factors, but the rate of transcript elongation was reduced (Figure

2.6). This result is consistent with those reported for transcription in vitro of eukaryal

nucleosome-containing templates by phage T7 and SP6 RNAPs (Kirov et al., 1992;

O'Neill et al., 1993; Chirinos et al., 1999; Walter and Studitsky, 2001) and by eukaryal

RNAPIII (Studitsky et al., 1997). In all these cases, progress of the RNAP in vitro is

slowed but not blocked on encountering the histone-DNA complex. This suggests that the

elongation property of archaeal RNAP is more similar to eukaryal RNAPIII than

RNAPII, as RNAPII cannot easily transcribe through a nucleosome in vitro (Izban and

Luse, 1992; Chang and Luse, 1997; Kireeva et al., 2002). Neither archaeal RNAP nor

RNAPIII has the C-terminal domain (CTD), which is present in RNAPII and functions as

a scaffold for the assembly of elongation and mRNA-processing factors (Proudfoot et al.,

2002; Stiller and Hall, 2002; Kamenski et al., 2004; Meinhart and Cramer, 2004). The

basis for the higher sensitivity to the nucleosome barrier in RNAPII is unknown, but it

has been suggested that the difference between RNAPII and RNAPIII has evolved to

provide an additional opportunity for RNAPII-specific regulation (Kireeva et al., 2002).

Transcription elongation factors, such as SWI/SNF and FACT (Orphanides and Reinberg,

2000; Kim et al., 2002; Svejstrup, 2002; Woychik and Hampsey, 2002; Belotserkovskaya et al., 2003), can then be used to regulate gene expression by facilitating RNAPII transcription through a nucleosome. Consistent with this evolutionary scenario, archaeal genomes do not encode obvious relatives of these RNAPII elongation factors (Kyrpides and Ouzounis, 1999; Ponting, 2002). However, since the study of archaeal transcription is in its infancy and so many archaeal genes are unassayed, it remains possible that archaeal elongation factors distantly related to SWI/SNF or FACT will be discovered in the future.

During transcript elongation by the M.t. RNAP on a nucleosomal template, the most prominent pause occurred at the promoter-proximal boundary of the positioned nucleosome (Figure 2.5). This pause most likely reflects the event when the RNAP first encountered the nucleosome and had to pause before being able to continue transcript elongation. Pauses at two more upstream sites were likely caused by topological constraints in the DNA from binding of the RNAP on one end and the nucleosome on the other end. The pauses observed at ~10-bp intervals during transcription through the nucleosome were reminiscent of the pausing patterns of eukaryal RNAPIII on a mononucleosomal template (Studitsky et al., 1997). It is likely that the advancing archaeal RNAP causes short stretches of DNA to detach sequentially from the histone core and moves forward, stepwise through the nucleosome, transcribing each detached region, as RNAPIII does. The mechanism used by RNAPIII also transfers the complete histone octamer upstream against the direction of transcription by 20 to 90 bp (Studitsky et al., 1997). Further experiments are needed to determine if the archaeal nucleosome is

transferred to an upstream region or dissociated from the DNA by passage of an archaeal

RNAP.

M.t. RNAP transcription through two archaeal nucleosomes

The assembly of two archaeal nucleosomes on T435 (Figure 2.8) demonstrated the feasibility of constructing archaeal nucleosome arrays using tandem copies of the

~90-bp nucleosome positioning sequence. Compared to a mononucleosomal template, nucleosome arrays better mimic the situation eukaryal and archaeal RNAPs encounter in vivo when transcribing long genes and operons. Eukaryal nucleosome arrays have been generated using head-to-tail repeats of the 5S rRNA gene of Lytechinus variegatus, which rotationally and translationally positions a eukaryal nucleosome (Simpson and Stafford,

1983; Hansen et al., 1989; Dong et al., 1990).

Based on a literature search, the results with T435 are the first demonstration that a multisubunit RNAP can transcribe sequentially through more than one nucleosome in vitro. RNAPIII elongation has only been investigated on mononucleosomal templates

(Studitsky et al., 1997), and RNAPII elongation was inhibited, extending by <10 bp in a

15 min reaction, by an array of six nucleosomes (Chang and Luse, 1997).

Regulation of gene expression by archaeal nucleosome assembly

Different histones from the same archaeon have different affinities for the same

DNA sequence (Bailey et al., 2002; Marc et al., 2002), and are synthesized differentially depending on growth conditions (Sandman et al., 1994a; Dinger et al., 2000). Given these observations and that a relatively short DNA sequence (~90 bp) can precisely position

archaeal nucleosome assembly (Figures 2.3 and 2.8A), archaeal nucleosome positioning could be used in vivo to regulate gene expression, replication, and/or recombination.

Consistent with this, EM of archaeal genomic DNA has revealed that archaeal nucleosomes are not as regularly packed as nucleosomes in eukaryal chromatin, but rather are interspersed between histone-free regions (Takayanagi et al., 1992; Pereira et al., 1997). Transcription regulation could then be based on a binding competition between archaeal histones and transcription factors. Most archaeal transcription regulators identified to date function by competing with TBP and TFB for binding to the

TATA-BRE region or with RNAP for the site of transcription initiation (Reeve, 2003). In this regard, when HMtA2 assembly incorporated the upstream regulatory region, transcription in vitro was inhibited (Figure 2.4B). The almost universal presence of histones in Eukarya has led to the argument that it was the evolution of the histone fold-based mechanism of DNA compaction that facilitated genome expansion and evolution (Henikoff et al., 2003). Accommodating much larger genomes within the confines of a nucleus may, however, also have required that essentially all the genomic

DNA was bound by histones. Under such conditions, regulation based on a simple competition between histones and transcription factors would have been compromised, and the histone tails and histone modifying and chromatin remodeling complexes may then have been needed to access and regulate the expression of histone-bound DNA. As eukaryal genome sequences accumulate, it may become possible to identify when the sophistication of eukaryal chromatin expression arose. From the archaeal histone sequences now available, it can be concluded that it was after the divergence of the

archaeal and eukaryal lineages (Sandman et al., 2001; Malik and Henikoff, 2003; Waters et al., 2003).

Potential weaknesses of the in vitro experiments

The Selex-1 sequence used in this study has very high binding affinity for archaeal histones HMfB and HMtA2 (Bailey et al., 2000). The high binding affinity of

Selex-1 is desirable for in vitro biochemical assays, but using such a specialized sequence has a potential weakness. The binding affinity of the Selex-1 sequence may be higher not only than that of random M.t. genomic DNA sequences, but also than that of naturally occurring archaeal nucleosome positioning sequences. The Selex-1 sequence may bind histones so tightly in vitro that nucleosome disassembly is inhibited, whereas naturally occurring nucleosome positioning sequences in vivo may bind histones less tightly and allow a dynamic equilibrium between nucleosome assembly and disassembly.

Experiments using naturally occurring nucleosome positioning sequences, not the Selex-1 sequence, would have better replicated what is going on in living cells, but a systematic search for naturally occurring archaeal nucleosome positioning sequences in M.t. genomic DNA was not successful (Christine Utz, personal communication). One must be cautious about the in vitro results concerning RNAP elongation on nucleosomal templates (the overall rate of elongation and RNAP pausing), as the high affinity of

Selex-1 sequence may have inhibited the dissociation of archaeal histones from the template DNA. Another caveat is that all in vitro transcription reactions were carried out in 120 mM KCl, but the K+ concentration in M.t. cells was 780 mM (Sprott and Jarrel,

1981). Thus, histone-DNA interactions and RNAP behavior in vivo versus in vitro may be quite different.

Figure 2.1. Construction of plasmids pYX2 and pYX6. Primers TD2, KS2, KS3, KS4,

KS5, KS6, KS7 (see Appendix) were used to construct the hmtB promoter and the U-less cassette. They were mixed (50 pmol of each primer in a 50-µl reaction volume), hybridized by heating to 95ºC and cooling gradually to room temperature (RT), and the resulting molecules cloned into the NsiI plus EcoRI digested Litmus 28 (New England

Biolabs) to generate pYX1. Primers KS8, KS9, KS10, KS11, KS12, KS13 (see Appendix) were used to construct the Selex-1 sequence. They were mixed, hybridized and the resulting molecules cloned into BamHI plus HindIII digested pUC18 to generate pKS564

(Kathleen Sandman, personal communication). A fragment released from pKS564 by

EcoRI and HindIII digestion was cloned into pYX1 linearized with EcoRI and HindIII to generate pYX2. Primers MX23, MX24, MX25, MX26, MX27, MX28 (see Appendix) were used to construct the spacer region that would separate the U-less cassette and the

Selex-1 sequence. They were mixed, hybridized and the resulting molecules cloned into pYX2 linearized with EcoRI and BamHI to generate pYX6. pYX6 was used as the template to generate transcription template T284 by PCR with primers TD2 and MX1.

The arrows on the plasmids represent the hmtB promoter, the black boxes represent the

U-less cassette, and the open boxes represent the Selex-1 sequence. The short lines marked by letters B, E, H, or N represent BamHI, EcoRI, HindIII, or NsiI restriction site, respectively.

[Figure 2.1: cloning scheme. Primers TD2 and KS2 to KS7 were ligated into NsiI/EcoRI-digested Litmus28 to give pYX1 (hmtB promoter); primers KS8 to KS13 were ligated into BamHI/HindIII-digested pUC18 to give pKS564 (Selex-1). The EcoRI + HindIII fragment of pKS564 was ligated into pYX1 to give pYX2; primers MX23 to MX28 were ligated into EcoRI + BamHI-digested pYX2 to give pYX6; PCR from pYX6 with primers TD2 and MX1 gave the template containing the U-less cassette and the Selex-1 sequence.]

Figure 2.1

Figure 2.2. Construction of transcription template T435 from T284. T284 was digested

with HindIII, and the digestion fragment that contained the hmtB promoter was purified.

A second DNA molecule was generated by PCR from T284 using primer MX50, which

carried an overhang containing a HindIII site, and primer MX51, which carried an

overhang identical in sequence to primer MX52; this product was digested with HindIII and purified.

The two DNA molecules were mixed, ligated, and T435 was generated by PCR from the

ligation product using primers TD2 and MX52. The letter H represents the HindIII restriction site, and the striped box indicates the complementary HindIII region. The arrows on T284 and T435 represent the hmtB promoter, the black boxes represent the U-less cassette, the open boxes represent the Selex-1 sequences, and the ovals represent the locations where archaeal nucleosomes are expected to assemble.

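[Figure 2.2: construction scheme. T284 (primers TD2 and MX1) was digested with HindIII (H); a second fragment was amplified from T284 by PCR with primers MX50 and MX51 and digested with HindIII; the ligation product was amplified by PCR with primers TD2 and MX52 to give T435.]

Figure 2.2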

Figure 2.3. The position of HMtA2 assembly on T284. (A) The sequence of T284 is

shown with the BRE and TATA-box sequence from the hmtB promoter (Darcy et al.,

1999), the U-less cassette (dark gray), Selex-1 sequence (light gray) (Bailey et al., 2000),

site of transcription initiation ( ), and 3' ( ) and 5' (asterisks) boundaries of the MN-

protected fragments indicated. (B) Electrophoretic separation of the fragments of the

T284 DNA protected from MN digestion by assembly into complexes at the HMtA2

dimer to 100-bp DNA ratios indicated above the corresponding lanes. Control lanes

contained size standards (S), untreated T284 DNA (–) and T284 DNA exposed to MN for

1 min in the absence of HMtA2 (0). The resulting molecules are indicated by differing

numbers of asterisks connected to a triangle ( ). The asterisks mark the 5' end of the

molecules and the triangle marks the 3' end of the molecules. (C) Autoradiogram of the

electrophoretic separation of the restriction fragments generated from the 90-bp MN-

protected DNA by EcoRI (E), BamHI (B), or NruI (N). The control lanes contained size standards (S) and an aliquot of the 90-bp DNA not exposed to any restriction enzymes

(–). (D) Diagram showing the position of the archaeal nucleosome (shaded oval) and the

lengths of MN-protected DNA as determined by primer extension. Lanes (5, 15, 30)

contained primer extension products from MN-protected DNA at 5, 15 or 30 histone

dimers per 100 bp of template DNA. Adjacent lanes (G, C, T, A) contained DNA

sequencing ladders.

Figure 2.3

Figure 2.4. EMSA of HMtA2 binding and runoff transcription. (A) Autoradiogram of the electrophoretic separation of the complexes formed by incubation of [32P]-labeled T284

DNA (50 ng) without (–) and with HMtA2 at the histone dimer to 100 bp DNA ratios indicated above each lane. (B) Electrophoresis of the 225-nt [32P]-labeled runoff transcripts synthesized in 30 min at 58°C from T284 preincubated with HMtA2 at the histone dimer to 100 bp DNA ratios indicated above each lane. The amount of transcript, as a percentage of that synthesized in the absence of HMtA2 (–), is listed below each lane.

Lane S contained size standards.

[Figure 2.4: panels A and B.]

Figure 2.4

Figure 2.5. Stalled-transcript elongation and EMSA of ternary complexes. (A) Ternary complexes that contained the [32P]-labeled 24-nt U-less transcripts were incubated without (–) or with HMtA2 and then added to complete transcription reaction mixtures and incubated at 58°C for 20 min. A control aliquot of the ternary complex was incubated in a reaction mixture that lacked UTP (–UTP). The transcripts synthesized were separated by electrophoresis, and [32P]-labeled transcripts detected by autoradiography. The

HMtA2 dimer to 100 bp DNA ratios are indicated above the corresponding lanes. The control lanes contained size standards (S1 and S2). (B) Aliquots of ternary complexes that contained the [32P]-labeled 24-nt U-less transcripts were incubated with HMtA2 at the histone dimer to 100 bp DNA ratios indicated above the corresponding lanes. The products were separated by electrophoresis and visualized by autoradiography. Control lanes contained a sample of [32P]-end-labeled template DNA (T), an aliquot of the ternary complexes incubated without HMtA2 addition (–), and an aliquot incubated with the

HMfB K13T+R19S+T54K variant which lacks DNA binding ability (Soares et al., 2000) at the histone dimer to 100 bp DNA ratio of 30 (C).

Figure 2.5

Figure 2.6. Transcript elongation in the absence and presence of an archaeal nucleosome.

Ternary complexes containing T284, M.t. RNAP and [32P]-labeled 24-nt U-less transcript incubated with HMtA2 (+HMtA2) or without HMtA2 (–HMtA2) were added to complete reaction mixtures and placed at 58°C. Aliquots were taken at the times indicated, and the transcripts synthesized were separated by electrophoresis and visualized using a phosphorimager. Control lanes contained size standards (S1 and S2). In the diagram, the template DNA is shown with or without the positioned archaeal nucleosome. The rate of transcription was estimated, as indicated, from the increase in length of transcripts during incubation at 58°C.
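The rate estimate described in this legend can be sketched as a least-squares slope of transcript length against incubation time. The Python sketch below is illustrative only: the function name is hypothetical, and the time and length values are placeholder numbers chosen for the example, not measurements from this study.

```python
def elongation_rate(times_s, lengths_nt):
    """Return the least-squares slope (nt/s) of transcript length versus time."""
    n = len(times_s)
    if n < 2 or n != len(lengths_nt):
        raise ValueError("need at least two paired time/length points")
    mean_t = sum(times_s) / n
    mean_l = sum(lengths_nt) / n
    # Sum of squares of time and cross-product of time and length
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (l - mean_l)
              for t, l in zip(times_s, lengths_nt))
    return sxy / sxx


# Hypothetical placeholder data: longest transcript (nt) at each time (s),
# starting from the 24-nt stalled transcript.
example_times = [0, 30, 60, 90]
example_lengths = [24, 84, 144, 204]
print(elongation_rate(example_times, example_lengths))
```

Applying the same function to the time courses with and without HMtA2 would give two slopes whose ratio expresses the slowing caused by the nucleosome, under the assumption that elongation is approximately linear over the sampled interval.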

Figure 2.6

Figure 2.7. DNA sequence of T435. The TATA-box of the hmtB promoter is shown and the transcription start site is marked with an arrow. The U-less cassette is boxed, and the

DNA sequences where archaeal nucleosomes are expected to assemble are underlined.

T435 DNA sequence:

CTCAGAAAAACCTTAAAATTAGCGATATATTTATATAGGA TATATGAATAGATAATATCACACGGAGACAACAACACG CGGAATTTCCGCGGGTGTCGAACCATTTTGATGACTCGC ATATGGGGAGCCAACAACACGCGGAATTCGAGCTCGGT ACCCGGGATCCGATATCAACCGTACTGGTGTTGTCCTAC GCTAATCTAAGCCGTTTACTCGCGATTTTGAAAATAGCTT AGGTGGAGATCTGATATCAAGCTTTTTCCGCGGGTGTCG AACCATTTTGATGACTCGCATATGGGGAGCCAACAACAC GCGGAATTCGAGCTCGGTACCCGGGATCCGATATCAAC CGTACTGGTGTTGTCCTACGCTAATCTAAGCCGTTTACTC GCGATTTTGAAAATAGCTTAGGTGGCTAGGTCTCCCTGA AAGGCA

Figure 2.7
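As a cross-check on the printed sequence, the restriction sites used in constructing and mapping the templates can be located by a simple string scan. The Python sketch below is a minimal illustration: the sequence is copied from the figure with the line-break spaces removed, the recognition sequences are the standard ones for these enzymes, and the function and variable names are hypothetical.

```python
# T435 sequence as printed in Figure 2.7, with line-break spaces removed.
T435 = (
    "CTCAGAAAAACCTTAAAATTAGCGATATATTTATATAGGA"
    "TATATGAATAGATAATATCACACGGAGACAACAACACG"
    "CGGAATTTCCGCGGGTGTCGAACCATTTTGATGACTCGC"
    "ATATGGGGAGCCAACAACACGCGGAATTCGAGCTCGGT"
    "ACCCGGGATCCGATATCAACCGTACTGGTGTTGTCCTAC"
    "GCTAATCTAAGCCGTTTACTCGCGATTTTGAAAATAGCTT"
    "AGGTGGAGATCTGATATCAAGCTTTTTCCGCGGGTGTCG"
    "AACCATTTTGATGACTCGCATATGGGGAGCCAACAACAC"
    "GCGGAATTCGAGCTCGGTACCCGGGATCCGATATCAAC"
    "CGTACTGGTGTTGTCCTACGCTAATCTAAGCCGTTTACTC"
    "GCGATTTTGAAAATAGCTTAGGTGGCTAGGTCTCCCTGA"
    "AAGGCA"
)

# Standard recognition sequences; all four are palindromic,
# so scanning one strand is sufficient.
SITES = {
    "HindIII": "AAGCTT",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "NruI": "TCGCGA",
}


def find_sites(seq, site):
    """Return the 1-based start position of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions


for enzyme, site in SITES.items():
    print(enzyme, find_sites(T435, site))
```

For the sequence as printed, the scan finds a single HindIII site, consistent with the HindIII junction used to join the two fragments (Figure 2.2), and two copies each of the EcoRI, BamHI and NruI sites, consistent with the tandem repeat.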

Figure 2.8. Archaeal nucleosome assembly on T435 and transcript elongation through

two nucleosomes. (A) [32P]-end-labeled T435 DNA was preincubated with HMtA2 at the

histone dimer to 100 bp DNA ratios indicated to the left of each lane (0, 5, 10, 20, 40)

and digested by DNase I (1 U) at RT for 5 min. The digestion products were separated by

electrophoresis through a 6% denaturing gel and visualized by using a phosphorimager.

In the diagram above the gel, the locations of the two archaeal nucleosomes are indicated

by two ovals. (B) Ternary complexes containing a [32P]-labeled 24-nt U-less transcript were incubated with HMtA2 (+HMtA2) or without HMtA2 (–HMtA2) and then added to complete reaction mixtures and placed at 58°C. Aliquots were taken at the times indicated, and the transcripts synthesized were separated by electrophoresis and visualized using a phosphorimager. The control lane contained nucleotide size standards

(S). Some pauses were observed on both control DNA and nucleosomal DNA (○), whereas other pauses were observed only on nucleosomal DNA (●).

[Figure 2.8: (A) DNase I footprint of T435 at HMtA2:DNA ratios of 0, 5, 10, 20 and 40, with positions given in nt relative to the transcription start; (B) time courses (in s) of transcript elongation without (–HMtA2) and with (+HMtA2) histone, with the 352-nt run-off transcript, size standards (S, in nt) and pause sites (○, ●) indicated.]

Figure 2.8


CHAPTER 3

TRANSCRIPTION INITIATION BY ARCHAEAL RNAP IN VITRO RELEASES TFB, BUT NOT TBP FROM THE PROMOTER DNA

Introduction

The basal transcriptional machinery in Archaea closely resembles the eukaryal

RNAPII system (Reeve, 2003). In the eukaryal RNAPII system, TBP or TFIID, which contains TBP, remains associated with the promoter DNA after transcription initiation and departure of the RNAPII (Zawel et al., 1995; Weideman et al., 1997; Yudkovsky et al., 2000; Chen et al., 2002), and thus facilitates transcriptional re-initiation. The stability of the TBP-promoter complex is further strengthened by transcription activators or coactivators (Yudkovsky et al., 2000). Promoter-bound TBP can be removed by Mot1, a yeast protein that employs ATP hydrolysis to disrupt TBP-DNA interactions (Pugh, 2000;

Darst et al., 2001). In contrast to TBP, the other general transcription factors, including

TFIIB, are either released after transcription initiation and must therefore re-assemble into PIC before each round of transcription, or remain attached to RNAPII and leave the promoter as part of the transcription elongation complex (Yudkovsky et al., 2000).

Archaeal transcription initiation in vitro requires only archaeal TBP and TFB (Reeve,

2003). Based on the homology with the eukaryal RNAPII system, it was predicted that

TBP should remain bound at the promoter and TFB leave after transcription initiation.

The experiments reported in this chapter were undertaken to determine the fates of TBP

and TFB after transcription initiation from the hmtB promoter. The results obtained

confirmed the RNAPII-based prediction.

Materials and Methods

Chemicals and reagents

Anti-Xpress antibodies were purchased from Invitrogen, peroxidase-labeled anti-mouse antibodies from Pierce (Rockford, IL), the ECL Plus Western-blot detection kit and

Hybond nitrocellulose membranes from Amersham Pharmacia, and heparin from USB

(Cleveland, OH).

Template construction

Plasmid pYX2 and pYX6 constructions are described in Chapter 2 (Figure 2.1).

Biotinylated T284 and T225 (Figure 3.1) were generated by PCR amplification from

pYX6 and pYX2 using biotinylated primer TD2 and non-biotinylated primer MX1.

Biotinylated T475 (Figure 3.1) was PCR amplified from pYX2 using biotinylated primer

MX2 and non-biotinylated primer MX1. Promoterless T439 (Figure 3.1) was PCR

amplified from Litmus28 using biotinylated primer MX2 and non-biotinylated primer

MX3.

Single and multiple round transcriptions

T284 DNA (200 ng) was incubated with TBP (100 ng), TFB (600 ng), RNAP

(1000 ng), ATP (400 µM), GTP (400 µM), CTP (20 µM), and 10 µCi of [α-32P] CTP

(3000 Ci/mmol) in transcription buffer and the mixture (100 µl final volume) incubated at

58°C for 8 min to allow transcription initiation. The reaction mixture was divided equally

into two tubes and heparin was added to one tube to a final concentration of 80 µg/ml to

prevent transcription re-initiation. UTP was added to both tubes to a final concentration

of 400 µM and incubation was continued at 58°C. Aliquots (8 µl) were taken at 2, 5, 10,

15, 20, and 30 min, and mixed with an equal volume of 95% formamide, 20 mM EDTA.

The [32P]-labeled transcripts synthesized were electrophoresed through 6% denaturing polyacrylamide gels, visualized and quantitated using a phosphorimager.

Template competition assays

Template competition assays were adapted from the RNAPII-based protocols developed by Zawel et al. (1995). To investigate the fate of TBP after transcription initiation, T284 (100 ng) was incubated with TBP (50 ng), TFB (300 ng), and RNAP (500 ng) in transcription buffer (50 µl final volume) at RT for 20 min. Streptavidin-coated beads (10 µg) were added to the reaction and the mixture was incubated at RT for 5 min. The beads were then captured with a magnet and the supernatant was removed. After the beads were washed 4 times with 200 µl of transcription buffer, they were resuspended in

50 µl of transcription buffer. In a separate tube, T225 (80 ng) was incubated with TFB

(300 ng) and RNAP (500 ng) in transcription buffer (50 µl final volume) at RT for 20 min. The contents of the two tubes were then mixed, incubated at RT for 10 min, and then incubated at 58°C, followed by addition of ATP (400 µM final concentration), UTP

(400 µM), GTP (400 µM), CTP (20 µM), and 10 µCi of [α-32P] CTP. Aliquots (10 µl) were removed after 10, 20, and 30 min. The transcripts synthesized were electrophoresed through 6% denaturing polyacrylamide gel and visualized using a phosphorimager. As a

control for this experiment, TBP (50 ng) was added after the two mixtures were mixed. In

a reciprocal experiment, T284 was incubated with TFB and RNAP only, and T225 was

incubated with TBP, TFB and RNAP. Schematic diagrams of these experiments are

shown in Figure 3.3.

To investigate the fate of TFB after transcription initiation, the same experimental

protocol was followed but TFB was the transcription factor withheld from either T225 or

T284 during the initial incubation (Figure 3.4).

To investigate the extent to which TBP might dissociate from the promoter after

transcription initiation, reaction mixtures contained T225, TBP, TFB and RNAP in the

same ratios, but in final volumes of 50, 125, 250, 375 and 500 µl. After incubation at RT

for 20 min, T225 complexes were captured by streptavidin-coated Dynabeads, washed

and resuspended in 50 µl transcription buffer with a constant amount of T284 (100 ng),

TFB and RNAP. The remainder of the procedures followed the protocol described above.
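The relationship between the scaled reaction volumes and the resulting template ratios can be sketched as follows. This Python sketch rests on stated assumptions: that the 50-µl reaction contained 80 ng of T225 (the amount used in the 50-µl competition protocol above), that the amount of T225 scaled with the volume, and that T225 and T284 are 225 bp and 284 bp long, as their names suggest. The function names are hypothetical.

```python
BASE_VOLUME_UL = 50        # smallest reaction volume in the series
T225_NG_AT_BASE = 80       # assumption: T225 mass in the 50-ul reaction
T225_BP = 225              # assumption: length inferred from template name
T284_NG = 100              # constant amount of second template
T284_BP = 284              # assumption: length inferred from template name


def relative_moles(mass_ng, length_bp):
    """Relative molar amount of linear dsDNA (mass / length).

    The average mass per base pair cancels out when two templates
    are compared, so it is omitted here.
    """
    return mass_ng / length_bp


def t225_to_t284_ratio(volume_ul):
    """Molar ratio of T225 to T284 for a PIC assembly reaction of given volume."""
    t225_ng = T225_NG_AT_BASE * volume_ul / BASE_VOLUME_UL
    return relative_moles(t225_ng, T225_BP) / relative_moles(T284_NG, T284_BP)


for volume in (50, 125, 250, 375, 500):
    print(volume, round(t225_to_t284_ratio(volume), 2))
```

Under these assumptions the five volumes correspond to T225:T284 molar ratios of approximately 1, 2.5, 5, 7.5 and 10.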

Immunodetection of TBP and TFB

To detect the presence of TBP and TFB in the PIC, T475 (200 ng) or T439 (200

ng) was incubated with TBP (50 ng), TFB (300 ng), and RNAP (500 ng) in transcription

buffer (50 µl final volume) at RT for 20 min. The complexes were captured by streptavidin-coated beads and washed 4 times with transcription buffer. The beads were resuspended in 16 µl transcription buffer and digested with SpeI (10 U) at 37°C for 30 min to separate the promoter-containing DNA region from the beads (see Figure 3.1 for the location of SpeI site on T475 and T439). The beads were captured by attraction to a magnet and the supernatant was transferred to a fresh tube. The samples were mixed with

4 µl of 5X SDS loading buffer and boiled for 5 min and the polypeptides present were

separated by electrophoresis through 15% SDS polyacrylamide gels. The proteins were

transferred to Hybond nitrocellulose membranes. The membranes were probed with anti-

Xpress antibody (diluted 1:10,000) and peroxidase-labeled anti-mouse antibody (diluted

1:5,000) and the complexes detected using the ECL Plus kit. To determine if TBP or TFB

remained attached to the template DNA after initiation, T475 (200 ng) was incubated

with TBP (50 ng), TFB (300 ng), and RNAP (500 ng) in transcription buffer (50 µl final volume) at RT for 20 min. The complexes were captured by attraction to a magnet, washed, resuspended in transcription buffer (50 µl) and salmon sperm DNA (10 µg) was added as competitor DNA. The reaction mixture was incubated at 58°C for 10 min with or without the nucleotides indicated in Figure 3.6. The DNA was removed, washed, and subjected to SpeI digestion. The proteins attached to the DNA released from the beads by

SpeI cleavage were subjected to SDS-PAGE and western blotting as described above.

Results

Multiple-round transcription in vitro

Heparin was used to test the multiple-round nature of the in vitro transcription

system, because it was known to prevent transcription re-initiation but not elongation by

archaeal RNAP (Bartlett et al., 2004). In vitro transcription reactions were performed in

the presence or absence of heparin and aliquots were taken at increasing times. As shown

in Figure 3.2, in the transcription reactions without heparin, there was a linear increase in

the amount of the 225-nt transcripts synthesized from T284 for 30 min. In the reactions

with heparin, there was no increase in the amounts of transcripts synthesized beyond the

initial round of synthesis. If the M.t. in vitro transcription system supported only a single

round of transcription, there would have been no difference in the amounts of transcripts

synthesized in the presence or in the absence of heparin. The continued increase in

transcript accumulation in the absence of heparin was consistent with multiple rounds of

transcription initiation by M.t. RNAP on T284 during 30 min incubation at 58°C.

Template competitions

T284 or T225 (as the first template) was incubated with archaeal RNAP, TBP and

TFB to allow the assembly of PICs. These complexes were then captured using streptavidin-coated beads, and unbound transcription factors removed. The complexes were then resuspended in transcription buffer. T225 or T284 (as the second template) was incubated with archaeal RNAP and either TBP or TFB, but not with both transcription factors. The two mixtures were combined, NTPs were added and the reaction mixtures were incubated at 58°C for 30 min. If a transcription factor (TBP or TFB) was released from the first template after initiation, it would become available to the second template, and two transcripts of different lengths (165 and 225 nt) would be synthesized.

However, if the transcription factor remained associated with the first template after initiation, it would not be available to the second template, so only one transcript would be synthesized. As shown in Figure 3.3, when TBP was the transcription factor not provided to the second template, only one transcript was synthesized and specifically the transcript from the template incubated initially with TBP, suggesting that TBP was bound to the first template and was not released after transcription initiation. In contrast to TBP,

when TFB was the transcription factor not provided to the second template (Figure 3.4), both transcripts were synthesized, suggesting that TFB was released from the first template after initiation, and became available for initiation of transcription from the second template.
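The interpretive logic of the template competition assay can be summarized in a few lines. The Python sketch below is an illustration only: the function name and the outcome labels are hypothetical, and the sketch treats the result as all-or-none, whereas a real gel would be quantified. The transcript lengths (225 nt from T284, 165 nt from T225) are those given in the text.

```python
def infer_factor_fate(observed_lengths, first_template_nt, second_template_nt):
    """Classify the fate of the factor withheld from the second template.

    observed_lengths: run-off transcript lengths (nt) detected on the gel.
    first_template_nt: run-off length from the template preincubated
        with both factors.
    second_template_nt: run-off length from the template lacking one factor.
    """
    observed = set(observed_lengths)
    saw_first = first_template_nt in observed
    saw_second = second_template_nt in observed
    if saw_first and saw_second:
        # The withheld factor reached the second template.
        return "released"
    if saw_first:
        # Only the fully supplied template was transcribed.
        return "retained"
    # No transcript from the first template: the control itself failed.
    return "inconclusive"


# T284 (225-nt transcript) as first template, T225 (165-nt) as second.
print(infer_factor_fate([225], 225, 165))       # retained
print(infer_factor_fate([225, 165], 225, 165))  # released
```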

To further investigate the TBP observation, increasing amounts of PICs containing T225, TBP, TFB and RNAP were incubated with a fixed amount of T284 incubated with only TFB and RNAP, but not TBP (Figure 3.5). At T225 to T284 molar ratios of 1 and 2.5, little or no transcript was synthesized from T284. When the molar ratios were increased to 5, 7.5 and 10, transcripts from T284 were observed. However, the amounts of the transcripts from T284 were much less (<10%) than the amounts of the transcripts from T225. Most of the TBP did therefore remain attached to the promoter of

T225 after transcription initiation, but a small amount of TBP dissociated from the promoter of T225 and became available to direct transcription initiation from the promoter of T284.

Immunodetection of TBP and TFB

Western blot experiments were performed to detect binding of TBP and/or TFB to T475, which has an archaeal promoter, and to T439, which has no archaeal promoter. T475 and T439 were attached to streptavidin-coated beads through a biotin label. Both DNA molecules have one SpeI site, and the DNA distal to this site, relative to the bead, plus any proteins bound to this DNA were released from the beads by SpeI digestion and remained in the supernatant after bead removal. DNA-associated proteins that remained in the supernatant were separated by SDS-PAGE and were incubated with

anti-Xpress antibody that bound to an epitope present in the N-terminal his-tag region of the recombinant TBP and TFB. When T439 was used, no TBP or TFB remained detectable in the supernatant after SpeI digestion and bead removal (Figure 3.6A).

However, when T475 was used, both TBP and TFB were present in the supernatant after the same treatment (Figure 3.6A). This is consistent with TFB and TBP binding to the promoter region of T475.

PICs assembled on T475 were also incubated with ATP, or ATP and CTP, or

ATP, CTP and GTP, or all four NTPs at 58°C for 10 min, and the T475 and T475-associated proteins were collected using Dynabeads and a magnet. Proteins associated with

T475 were released from beads by SpeI digestion and probed using the anti-Xpress antibody. As shown in Figure 3.6B, both TBP and TFB remained associated with the

T475 DNA after addition of ATP or ATP and CTP. However, after addition of ATP, CTP and GTP, or all four NTPs, only TBP was detected, while TFB was no longer detectable.

The lengths of transcripts that could be synthesized in the presence of ATP, ATP+CTP,

ATP+CTP+GTP and all four NTPs were 1, 4, 24 and 165 nt, respectively (Figure 3.1).

This indicates that TBP remained associated with T475 from the synthesis of the first nucleotide through transcription run-off. In contrast, TFB remained associated with the promoter during the initial stages of transcription, but was released during transcript elongation from position +4 to +24.
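The dependence of transcript length on the NTPs supplied can be illustrated by walking along the non-template strand until a nucleotide is required that is not available. The Python sketch below makes two assumptions that should be read as such: the sequence fragment is taken from the T435 sequence in Figure 2.7 (T475 is assumed to share the same initially transcribed region), and the start site is placed at the first A of the U-less cassette because that placement reproduces the reported 1-, 4- and 24-nt lengths. The function name is hypothetical.

```python
# Base on the non-template (coding) DNA strand that each NTP supplies.
NTP_TO_BASE = {"ATP": "A", "CTP": "C", "GTP": "G", "UTP": "T"}

# Assumed initially transcribed region (coding strand, 5' to 3'),
# copied from the T435 sequence; the start position is an assumption.
FRAGMENT = "ACACGGAGACAACAACACGCGGAATTTCCGCGGGTGTCG"


def max_transcript_length(coding_strand, ntps):
    """Length of the longest transcript that can be made from the given NTPs."""
    allowed = {NTP_TO_BASE[name] for name in ntps}
    length = 0
    for base in coding_strand.upper():
        if base not in allowed:
            break  # RNAP stalls: the next required NTP is missing
        length += 1
    return length


print(max_transcript_length(FRAGMENT, ["ATP"]))                # 1
print(max_transcript_length(FRAGMENT, ["ATP", "CTP"]))         # 4
print(max_transcript_length(FRAGMENT, ["ATP", "CTP", "GTP"]))  # 24
```

With all four NTPs the walk continues to the end of whatever sequence is supplied, which for the full template corresponds to the run-off transcript.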

Discussion

The results obtained from transcription in vitro from T284 in the presence and absence of heparin (Figure 3.2) confirmed that the M.t. in vitro transcription system

containing RNAP, TBP and TFB supported multiple rounds of transcription. This was an essential control because the template competition assays used to determine the fates of TBP and TFB after transcription initiation required a multiple-round transcription system.

In the template competition assays, when the first template was pre-incubated with TBP and the second template was not, transcripts were synthesized only from the first template (Figure 3.3). TBP did not dissociate from the hmtB promoter, consistent with the results obtained with eukaryal TBP or TFIID on adenovirus major late promoter

DNA in the eukaryal RNAPII-based system (Zawel et al., 1995). When the first template was preincubated with TFB and the second template was not, transcripts were synthesized from both templates (Figure 3.4). TFB therefore dissociated from the first template after transcription initiation and became available to the second template, consistent with the results for TFIIB (homolog of TFB) in the eukaryal RNAPII system (Zawel et al., 1995). However, when a 5- to 10-fold excess of the T225 complexes was mixed with T284 that had not been preincubated with TBP, most TBP molecules remained bound to the first template after transcription initiation, but a small percentage (<10%) of the TBP molecules dissociated and became available for the second template (Figure 3.5).

The results from Western blot assays (Figure 3.6) corroborated the template competition results and further identified the timing of TFB dissociation. TFB was still bound to the promoter after incubation with ATP and CTP, but was no longer bound after incubation with ATP, CTP and GTP. Based on the U-less cassette sequence of T475

DNA (Figure 3.1), incubation with ATP and CTP would allow synthesis of a 4-nt nascent transcript, whereas incubation with ATP, CTP and GTP would allow synthesis of a 24-nt

nascent transcript. TFB therefore dissociated when the nascent transcript was extended

from 4 to 24 nt.

The results obtained from the template competitions and Western blot assays are

consistent with earlier footprinting results that demonstrated that the TATA-box region of

the archaeal gdh promoter remained protected from ExoIII digestion after transcription

initiation in a Pyrococcus furiosus in vitro transcription system, and that a structural

transition occurred in the elongation complex during transcript elongation from position

+7 to +9, most likely coinciding with the first forward translocation step of the RNAP

(Spitalny and Thomm, 2003). The RNA:DNA hybrid within such an archaeal elongation

complex is 9-12 bp (Spitalny and Thomm, 2003) and for transcription to continue beyond

12 nucleotides, the 5' end of the nascent transcript must exit the elongation complex. In

the eukaryal PIC, TFIIB binding to RNAPII covers the transcript exit site, and protrusion

of the nascent transcript has been proposed to dissociate TFIIB from RNAPII and to

facilitate promoter clearance (Chen and Hahn, 2003; Bushnell et al., 2004; Chen and

Hahn, 2004). The results obtained with the archaeal system are consistent with TFB

being similarly dissociated from the archaeal RNAP elongation complex by nascent

transcript protrusion.

The results reported were generated in vitro, in a minimal initiation system, and

additional transcription factors undoubtedly contribute in vivo to the assembly, stabilization, and longevity of archaeal initiation complexes. However, based on the

RNAPII precedent (Cang and Prelich, 2002; Chen et al., 2002), archaeal TBP may well

remain bound to genomic DNA in vivo. This would create a problem for transcription

regulation, as several archaeal transcription regulators, including MDR1, LrpA and Phr,

compete with TBP for the same binding site (Bell et al., 1999; Dahlke and Thomm, 2002;

Vierke et al., 2003). If the TATA-box were always occupied by TBP, then transcription

regulators could never bind to this site, and there would be no competition. Archaeal TBP

could be removed by an unrecognized archaeal analog of MOT1 or possibly as a

consequence of DNA replication, transcription from an upstream promoter (Martens et al.,

2004) or adjacent promoter (Chapter 5), or DNA distortion induced by a chromatin

protein (Reeve, 2003). In Eukarya, NC2 interacts specifically with the DNA in TBP-

TATA-box complexes by employing a histone fold-based mechanism of DNA binding and suppresses transcription initiation (Kamada et al., 2001). Proteins with TBP-binding activity have been identified in Pyrococcus species (Matsuda et al., 1999; Matsuda et al.,

2001), but they are not widely conserved in Archaea. All Archaea do have DNA-

distorting chromatin proteins. TBP could be dislodged from promoter DNA by adjacent

histone binding, DNA compaction, and distortion. In Archaea that do not have histones, this role of histones could be taken over by other DNA-binding proteins, such as Alba in

Sulfolobus species.

Figure 3.1. Templates T284, T225, T475 and T439. Boxes indicate the locations of the

BRE, TATA box, and U-less cassette (U–) and, as indicated by arrows, transcription initiated at the start of the U-less cassette will result in 225-nt or 165-nt runoff transcripts.

A biotin molecule (o) was attached to the template DNA. T225 has the same sequence as

T284 except for the 60-bp deletion ( ) as indicated. T475 was amplified from pYX2

using primers MX1 and MX2, and T439 was amplified from Litmus28 using primers

MX2 and MX3. The SpeI (S) and NdeI sites (N) are indicated. The sequence of the U-less

cassette in T475 is shown.

[Figure 3.1: schematic of templates T284, T225, T475 and T439. U-less cassette sequence of T475: ACACGGAGCCAACAACACGCGGAATTT..]

Figure 3.2. Transcription in the presence and absence of heparin. T284 was incubated with TBP, TFB, RNAP, ATP, GTP and [α-32P]-CTP to allow the formation of stalled elongation complexes. The reaction mixture was divided in half, and heparin was added to one (○) of the two resulting reaction mixtures. UTP was then added to both, and aliquots were removed after 2, 5, 10, 15, 20, and 30 min of incubation at 58°C. The transcripts present were separated by electrophoresis through a 6% denaturing polyacrylamide gel, visualized and quantitated using a phosphorimager. The graph and the phosphorimage show the relative amounts of the 225-nt transcript synthesized in the presence (○) and absence (●) of heparin.

[Figure 3.2: graph and phosphorimage; time points 2, 5, 10, 15, 20 and 30 min]

Figure 3.3. Template competition assays with TBP. T284 (upper experiment) or T225

(lower experiment) DNA was incubated with TBP, TFB, and RNAP, and the complexes

formed were captured, washed, and added to the second reaction mixture that contained

TFB and RNAP, but lacked TBP. After 10 min at 20°C, NTPs (with [32P]-CTP) were

added, the reaction mixtures were placed at 58°C, and the [32P]-labeled transcripts present in aliquots taken after 10, 20, and 30 min of incubation were separated by PAGE (upper experiment, lanes 4, 5, and 6; lower experiment, lanes 7, 8, and 9) and visualized using a phosphorimager. As illustrated for the upper experiment, TBP (50 ng) was added in control reactions and the transcripts present 10, 20, and 30 min after NTP addition were separated in lanes 1, 2, and 3. Lane S contained size standards.

[Figure 3.3: phosphorimage; reaction labels: T284 +TBP +TFB +RNAP; T225 +TFB +RNAP; T284 +TFB +RNAP; T225 +TBP +TFB +RNAP]

Figure 3.4. Template competition assays with TFB. T284 (upper experiment) or T225

(lower experiment) DNA was incubated with TBP, TFB, and RNAP, and the complexes

formed were captured, washed, and added to the second reaction mixture that contained

TBP and RNAP, but lacked TFB. After 10 min at 20°C, NTPs (with [32P]-CTP) were

added, the reaction mixtures were placed at 58°C, and the [32P]-labeled transcripts present in aliquots taken after 10, 20, and 30 min of incubation were separated by PAGE (upper experiment, lanes 4, 5, and 6; lower experiment, lanes 7, 8, and 9) and visualized using a phosphorimager. As illustrated for the upper experiment, TFB (50 ng) was added in control reactions and the transcripts present 10, 20, and 30 min after NTP addition were separated in lanes 1, 2, and 3. Lane S contained size standards.

[Figure 3.4: phosphorimage; reaction labels: T284 +TBP +TFB +RNAP; T225 +TBP +RNAP; T284 +TBP +RNAP; T225 +TBP +TFB +RNAP]

Figure 3.5. Template competition assays with TBP when complexes formed on T225 were in excess. Increasing amounts of T225 DNA were incubated with TBP, TFB, and

RNAP, and the complexes formed were captured, washed, and added to the second reaction mixture that contained a fixed amount of T284 DNA, TFB and RNAP, but lacked TBP. After 10 min at 20°C, NTPs (including [32P]-CTP) were added, the reaction mixtures were placed at 58°C, and the [32P]-labeled transcripts present after 30 min of incubation were separated by PAGE and visualized using a phosphorimager. In lanes 2, 3,

4, 5 and 6, the molar ratios of T225 to T284 were 1, 2.5, 5, 7.5 and 10, respectively. TBP

(50 ng) was added to a control reaction and the transcripts present 30 min after NTP addition were separated in lane 1.

[Figure 3.5: phosphorimage; lanes 1 to 6; positions of the 225-nt and 165-nt transcripts indicated]

Figure 3.6. Immunoblot assays of template binding by TFB and TBP. (A) T475 or T439 was incubated with TBP, TFB, and RNAP in transcription buffer for 10 min at RT. The complexes formed were captured, washed, and subjected to SpeI digestion, and the transcription factors remaining in the supernatant after bead removal were separated by

SDS-PAGE and detected by immunoblotting. (B) After washing, T475-containing PICs were incubated in reaction mixtures that contained either no NTPs (–), ATP (A), ATP plus CTP (AC), ATP plus CTP plus GTP (ACG), or all four NTPs (ACGU). The complexes were then washed and subjected to SpeI digestion. After bead removal, the presence of TFB and TBP in the supernatant was determined by immunoblotting.

[Figure 3.6: (A) flow diagram: T475 or T439 +TBP +TFB +RNAP; wash; SpeI digestion; bead removal; immunoblot of the supernatant for TFB and TBP. (B) flow diagram: T475 +TBP +TFB +RNAP; wash; transcription for 10 min at 58°C with A, AC, ACG or ACGU; wash; SpeI digestion; bead removal; immunoblot of the supernatant for TFB and TBP.]

CHAPTER 4

IN VITRO TRANSCRIPTION OF ADDITIONAL M.t. PROMOTERS AND STUDY OF THE M.t. ALBA PROTEIN

Introduction

When the M.t. in vitro transcription system was first established using TBP, TFB and RNAP, transcription in vitro was observed from the hmtB promoter, but not from other promoters known to be active in vivo (Darcy, 1999). This led to the suspicion that there exist unidentified archaeal general transcription factors. Later, TFE was identified as the third archaeal general transcription factor (Hanzelka et al., 2001). When added to in vitro transcription reactions, TFE stimulated transcription from several M.t. methane gene promoters, which otherwise directed little or no transcription in vitro. However,

TFE did not stimulate transcription from the hmtB promoter in vitro.

To further investigate the M.t. derived in vitro transcription system, several additional promoters were PCR amplified from M.t. genomic DNA and tested for their abilities to direct transcription initiation in vitro. These include promoters of three tRNA genes (MTH1527 tRNAAla, MTH1344 tRNAArg and MTH0638 tRNALeu) and promoters of the nifH and glnA genes that encode proteins involved in nitrogen fixation. All of these promoters directed transcription initiation in vitro, but transcription termination in vitro was not observed downstream of the tRNA genes.

Alba is a general DNA-binding protein in Sulfolobus species (Chapter 1). The

M.t. genome also encodes an Alba homolog (Smith et al., 1997). However, a lysine

residue critical for DNA-binding and reversible acetylation of the Sulfolobus Alba is replaced by an asparagine residue in M.t. Alba. It is also interesting that the M.t. genome

encodes homologs of the eukaryal histone deacetylase Rpd3 and the histone

acetyltransferase Elp3. Thus, the M.t. Alba could be acetylated and deacetylated by these

enzymes. A recombinant M.t. Alba was therefore generated, and its DNA-binding

properties investigated.

Materials and Methods

Amplification of transcription templates

All transcription templates were PCR amplified from M.t. genomic DNA (a gift

from Rachel Samson). Primers MX43 and MX44 (see Appendix) were used to PCR

amplify the promoter region of MTH1527 (tRNAAla), primers MX45 and MX46 to PCR

amplify the promoter region of MTH1344 (tRNAArg), and primers MX47 and MX48 to

PCR amplify the promoter region of MTH0638 (tRNALeu). Primers MX119 and MX120

were used to PCR amplify the promoter region of the nifH gene (MTH1560), and primers

MX109 and MX110 to PCR amplify the promoter region of the glnA gene (MTH1570).

The PCR products were purified using Qiagen PCR clean-up kits and then used in in vitro transcription reactions. The sequences of the BRE/TATA-box regions and the expected sizes of run-off transcripts from these templates are shown in Figure 4.1.

In vitro transcription

In vitro transcription reactions were carried out in 25 µl volumes containing

template DNA (50 ng), RNAP (500 ng), TBP (300 ng), TFB (50 ng), 200 µM ATP, 200

µM UTP, 200 µM GTP, 20 µM CTP, and 5 µCi of [α-32P] CTP (3000 Ci/mmol). The reaction mixtures were incubated at 58°C for 30 min and then mixed with 25 µl of 95% formamide, 20 mM EDTA. The products were separated by electrophoresis through 6% denaturing polyacrylamide gels and visualized using a phosphorimager.
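
As a worked check of the labeling conditions, the amounts of unlabeled and labeled CTP in one reaction can be estimated from the stated concentration, volume and specific activity. This is a sketch only: it assumes that the 20 µM CTP is unlabeled and that the labeled CTP is added on top of it, and the function names are illustrative.

```python
def pmol_from_concentration(conc_um, volume_ul):
    """Micromolar multiplied by microliters gives picomoles."""
    return conc_um * volume_ul


def pmol_from_activity(activity_uci, specific_activity_ci_per_mmol):
    """Microcuries divided by Ci/mmol gives 1e-6 mmol, which is 1e3 pmol."""
    return activity_uci / specific_activity_ci_per_mmol * 1e3


unlabeled_ctp = pmol_from_concentration(20, 25)   # 20 µM CTP in a 25 µl reaction
labeled_ctp = pmol_from_activity(5, 3000)         # 5 µCi at 3000 Ci/mmol
labeled_fraction = labeled_ctp / (unlabeled_ctp + labeled_ctp)

print(f"unlabeled CTP: {unlabeled_ctp:.0f} pmol")
print(f"labeled CTP:   {labeled_ctp:.2f} pmol")
print(f"labeled share: {labeled_fraction:.2%}")
```

Under these assumptions the labeled nucleotide is well under 1% of the CTP pool, so it has little effect on the total CTP concentration.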

Cloning of M.t. Alba in E. coli

MTH1483 encodes Alba and was PCR amplified from M.t. genomic DNA using

primers MX11 and MX12 (see Appendix) and the Pfu polymerase. Primer MX11 added a

BamHI restriction site and MX12 added an EcoRI restriction site to the PCR product. The

PCR product was purified using a Qiagen PCR clean-up kit and digested with BamHI and

EcoRI restriction enzymes and ligated with BamHI- and EcoRI-digested pTrcHisA. The ligation product was used to transform E. coli Top10 competent cells, and transformants were plated on LB agar plates that contained ampicillin (50 µg/ml). Colonies that were resistant to ampicillin were picked and resuspended in LB liquid medium containing ampicillin

(50 µg/ml). When the OD600 reached 0.4, 0.7 ml of culture was mixed with 0.3 ml of

sterile glycerol in a screw-capped vial, vortexed briefly and transferred to a –70˚C

freezer for long-term storage.

Purification of recombinant M.t. Alba

To obtain recombinant M.t. Alba, 50-ml cultures were grown in LB liquid medium containing ampicillin (50 µg/ml) at 37°C to an OD600 of ~0.6 and IPTG was added to a final concentration of 1 mM. After 2 h of further incubation, the cells were harvested by centrifugation in a Sorvall SS34 rotor at 5000 rpm for 10 min, resuspended in 1.2 ml lysis buffer [50 mM Na3PO4 (pH 8.0), 300 mM NaCl] and passaged twice through a

French press. The cell lysate was collected and centrifuged in an Eppendorf microfuge at

14,000 rpm for 20 min. The supernatant was collected and applied to a Qiagen Ni-NTA column that was washed twice with 0.6 ml wash buffer [50 mM Na3PO4 (pH 6.0), 300 mM NaCl, 50 mM imidazole] and eluted twice with 0.2 ml elution buffer [50 mM

Na3PO4 (pH 6.0), 300 mM NaCl, 500 mM imidazole]. The eluates were pooled and dialyzed against 500 ml of 50 mM Tris-HCl (pH 8.0), 300 mM KCl, 10 mM MgCl2, 1 mM DTT, 20% (v/v) glycerol to remove the imidazole. Glycerol was added to the dialyzed protein solution to a final concentration of 35% (v/v). The protein solution was then stored at -20°C.
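
The glycerol adjustment from 20% to 35% (v/v) is a simple mixing calculation. The sketch below assumes that pure (100% v/v) glycerol was added and that volumes are additive; neither detail is stated in the protocol, and the function name is illustrative. Because the volume of the dialyzed sample is not given, the result is expressed per ml of dialyzed protein solution.

```python
def volume_to_add(start_volume, start_fraction, target_fraction, stock_fraction=1.0):
    """Volume of stock needed to raise a solution from start_fraction to target_fraction.

    Solves (start_fraction * V + stock_fraction * x) / (V + x) = target_fraction for x.
    """
    if not start_fraction < target_fraction < stock_fraction:
        raise ValueError("target must lie between the starting and stock fractions")
    return start_volume * (target_fraction - start_fraction) / (stock_fraction - target_fraction)


# Per 1 ml of dialyzed protein in 20% glycerol, aiming for 35% glycerol.
glycerol_per_ml = volume_to_add(1.0, 0.20, 0.35)
print(f"{glycerol_per_ml:.3f} ml of glycerol per ml of dialyzed protein")
```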

EMSA of M.t. Alba

DNA molecules for EMSA were generated by PCR from pYX2 (Figure 2.1) using primers TD2 and MX11 (see Appendix). The PCR product was purified using a Qiagen

PCR clean-up kit, end-labeled using T4 kinase and [γ-32P] ATP and separated from unincorporated nucleotides by passage through a 1-ml syringe filled with Sephadex G-50 resin. For gel shift assays, aliquots of the labeled DNA (1 ng) were incubated with increasing amounts of recombinant Alba in 100 mM Tris-HCl (pH 8.0), 100 mM KCl.

After 20 min incubation at RT, the reaction products were separated by electrophoresis

through 6% native polyacrylamide gels and visualized using a phosphorimager.
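
To relate the 1 ng of labeled DNA to a molar amount, the mass can be converted using the fragment length. The sketch below uses the 84-bp length given for this fragment elsewhere in the text and assumes an average mass of about 650 g/mol per base pair, which is a common approximation rather than a value from this work; the function name is illustrative.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; approximate, assumed value


def dsdna_fmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) into femtomoles."""
    moles = mass_ng * 1e-9 / (length_bp * bp_mass)
    return moles * 1e15


probe_fmol = dsdna_fmol(1, 84)  # 1 ng of the 84-bp fragment
print(f"{probe_fmol:.1f} fmol of labeled DNA per reaction")
```

Under this approximation each binding reaction contains roughly 18 fmol of DNA, which gives a scale for judging the molar excess of protein added.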

Results

In vitro transcription using other promoters

Run-off transcripts of the expected sizes were synthesized from the templates carrying the promoters of the nifH, glnA, and three tRNA genes (Figure 4.1).

Transcription of the three tRNA genes continued to the end of the template, generating run-off transcripts without terminating at the TA-rich sequences following the tRNA genes. The TA-rich sequences are similar but non-identical to a sequence

(TTTTAATTTT) that terminated transcription downstream of the tRNAVal gene in the

Methanococcus thermolithotrophicus in vitro transcription system (Thomm, 1996). The sequences are AAATCTTTTTT, TTTTTTATTAA, and TTTTTTTAATA downstream of tRNAAla, tRNAArg and tRNALeu, respectively. These three tRNA genes are apparently monocistronic transcription units, as they have well defined BRE and TATA-box sequences, and the ORFs immediately upstream and downstream of the tRNA genes are transcribed in opposite directions.

Purification and assays of recombinant M.t. Alba

Recombinant M.t. Alba was generated, with the additional his-tag sequence

MGGSHHHHHHGMASMTGGQQ attached to the N-terminus. The protein was over-expressed in E. coli and was soluble. When the protein was incubated with the 84-bp

[32P]-labeled hmtB DNA, and the reaction products subjected to electrophoresis through

native polyacrylamide gels, no gel shift was observed, indicating this Alba preparation

did not bind DNA. Using an independently cloned C-terminal his-tagged M.t. Alba

protein and several other DNA molecules as binding substrates, Kathleen Sandman did not observe any DNA-binding activity of recombinant M.t. Alba (Kathleen Sandman, personal communication).

Discussion

The observation of in vitro transcription from the nifH, glnA, and three tRNA promoters demonstrated that the M.t. in vitro transcription system was capable of transcription from not only the hmtB promoter (Darcy et al., 1999), but also several other

M.t. promoters. These promoters all have easily identifiable TATA-box and BRE sequences (Figure 4.1A). The lengths of run-off transcripts were consistent with transcription initiation occurring ~24 bp downstream from these TATA-boxes (Figure

4.1). Approximately the same amounts of transcripts were generated from these promoters, except for the tRNAArg promoter in which the first base of the TATA-box is

A, not T, as in the consensus for archaeal TATA-boxes. The templates carrying the nifH and glnA promoters could now be used to study how expression of these nitrogen fixation genes is regulated in M.t. The robust transcription observed in vitro suggests that regulation of these two promoters most likely employs a repressor-based system, as has been demonstrated in vivo in Methanococcus maripaludis (Lie and Leigh, 2002; Lie and

Leigh, 2003). However, as an in vitro transcription system is not available for M. maripaludis, the regulation of these nitrogen fixation genes as a model in Archaea may be better studied biochemically using the M.t. in vitro transcription system.
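
The TATA-box comparison made above can be reproduced from the sequences in Figure 4.1A. The sketch below takes the second sequence group of each row in that figure as the TATA-box and uses T as the consensus first base, as stated in the text; the dictionary and function names are illustrative.

```python
# TATA-box sequences as listed in Figure 4.1A (second sequence group of each row).
TATA_BOXES = {
    "tRNAAla": "TTAATATA",
    "tRNAArg": "ATTTAAAT",
    "tRNALeu": "TTTAAATA",
    "nifH": "TTTAAATA",
    "glnA": "TATATAAA",
}


def deviates_at_first_base(tata_box, consensus_base="T"):
    """True when the first base of the TATA-box differs from the consensus base."""
    return tata_box[0] != consensus_base


deviating = [name for name, box in TATA_BOXES.items() if deviates_at_first_base(box)]
print(deviating)
```

Only the tRNAArg promoter is flagged, which matches the promoter that yielded a different amount of transcript in vitro.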

The lack of transcription termination downstream of the three tRNA genes is noteworthy, as a 10-bp TA-rich sequence (TTTTAATTTT) downstream of the tRNAVal

gene from Methanococcus thermolithotrophicus was reported to be necessary and sufficient for transcription termination in vitro (Thomm, 1996). Mutations in this

sequence abolished its function (Thomm, 1996). Downstream from the three

tRNA genes used in this study are TA rich sequences that resemble the terminator

sequence described by Thomm, but they did not cause transcription termination in the

M.t. in vitro transcription system. It remains possible that TA rich sequences do constitute

transcription termination signals, but additional protein factor(s) are needed. The

termination of transcription observed downstream of the tRNAVal gene from M. thermolithotrophicus in vitro by Thomm may be an exception, not a general rule. When

the tRNAVal terminator sequence was inserted into a transcription template downstream of the hmtB promoter and investigated using the M. thermautotrophicus in vitro transcription system, it did not cause transcription termination in vitro (Thomas

Santangelo, personal communication), suggesting that there are inherent differences between the M. thermolithotrophicus and the M. thermautotrophicus in vitro transcription systems. On the other hand, in vivo studies have demonstrated that a T-rich sequence

(TTATTCTTT) was sufficient and necessary to cause transcription termination in

Haloferax volcanii (Kuo, 1997; Thompson et al., 1999). More recently, a terminator that could encode an mRNA with the potential to fold into a stem-loop was identified downstream of the M.t. mcr operon, and this terminator caused transcription termination in the M.t. in vitro transcription system (Thomas Santangelo, personal communication), suggesting that different types of terminator sequences exist in Archaea.
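
The statement that the TA-rich sequences downstream of the three tRNA genes resemble, but do not match, the terminator described by Thomm can be made concrete with a simple comparison. This is an illustrative sketch only: A+T content and the longest shared run of bases are crude similarity measures chosen here for clarity, not measures used in the original work, and the names are hypothetical.

```python
REFERENCE_TERMINATOR = "TTTTAATTTT"  # downstream of tRNAVal (Thomm, 1996)

# TA-rich sequences downstream of the three M.t. tRNA genes, as given in the Results.
DOWNSTREAM = {
    "tRNAAla": "AAATCTTTTTT",
    "tRNAArg": "TTTTTTATTAA",
    "tRNALeu": "TTTTTTTAATA",
}


def at_fraction(seq):
    """Fraction of positions that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def longest_shared_run(seq, ref):
    """Length of the longest contiguous stretch of ref that also occurs in seq."""
    best = 0
    for start in range(len(ref)):
        for end in range(start + 1, len(ref) + 1):
            if ref[start:end] in seq:
                best = max(best, end - start)
    return best


for name, seq in DOWNSTREAM.items():
    exact = REFERENCE_TERMINATOR in seq
    shared = longest_shared_run(seq, REFERENCE_TERMINATOR)
    print(f"{name}: A+T = {at_fraction(seq):.2f}, exact match = {exact}, longest shared run = {shared}")
```

None of the three sequences contains the complete 10-bp reference sequence, although all are almost entirely A and T.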

The lack of DNA-binding by the M.t. Alba was surprising, given that this activity

was well documented for Alba proteins from Sulfolobus species (Bell et al., 2002;

Wardleworth et al., 2002; Aravind et al., 2003; Zhao et al., 2003; Edmondson et al.,

2004). The most likely explanation is that the N-terminal and C-terminal his-tagged recombinant Alba generated in E. coli did not fold correctly and therefore did not have the correct configuration for DNA binding. This can be tested by generating a recombinant Alba without any tag in E. coli, or by purifying native Alba from M.t. cell extracts and repeating the gel shift experiments. Since the recombinant his-tagged M.t.

Alba generated in this study did not bind DNA, experiments to study its effects on transcription were not pursued.

Figure 4.1. Transcription from the tRNAAla, tRNAArg, tRNALeu, nifH and glnA promoters in vitro. (A) The sequences of the BRE and TATA-box regions of the tRNAAla, tRNAArg, tRNALeu, nifH and glnA genes are shown, with the predicted sizes of the run-off transcripts indicated to the right. (B) The products of in vitro transcription reactions were separated by electrophoresis through 6% denaturing polyacrylamide gels and visualized using a phosphorimager. The names of the promoters are indicated and the control lane

(marker) contained size standards.

[Figure 4.1]

A.
Promoter   BRE and TATA-box region       Predicted size of run-off transcript (nt)
tRNAAla    GGAAAATT TTAATATA GGAT        255
tRNAArg    GAGAAAAT ATTTAAAT GGTT        189
tRNALeu    CCGCAAAC TTTAAATA GTAA        210
nifH       CCAAAAAA TTTAAATA GATG        136
glnA       ACAAAAAA TATATAAA CCAG        230

B. [Phosphorimage; lanes labeled tRNALeu, tRNAAla, tRNAArg, nifH, glnA and marker; size standards of 250, 200, 150, 100 and 50 nt]

CHAPTER 5

REGULATION OF THE trp OPERON IN M.t.

Introduction

The biosynthesis of tryptophan is metabolically the most expensive of all amino acids (Bentley, 1990) and expression of the trp genes that encode the enzymes that catalyze the pathway is tightly controlled in Bacteria and Eukarya. The trp genes themselves are highly conserved, but a variety of different molecular mechanisms regulate their expression. The trp genes are also conserved in Archaea, and trp gene transcripts have been shown to be more abundant in Methanothermobacter marburgensis and Pyrococcus kodakarensis KOD1 cells grown in the absence than in the presence of tryptophan (Gast et al., 1994; Gast et al., 1997; Tang et al., 1999; Tang et al., 2000), but the molecular details of this regulation have not been reported. Given this evidence of regulation, and precedents established from Bacteria and Eukarya available for comparison, I decided to investigate trp gene regulation in M.t. using the in vitro transcription system.

In M.t., the 7 trp genes encoding the enzymes required for tryptophan biosynthesis from chorismate are clustered in the trpEGCFBAD operon (Figure 5.1A).

Upstream of the trp operon are several imperfect repeats with the consensus sequence

TGTACA, which were proposed to be binding sites for transcription regulator(s) and

designated TRP boxes (Gelfand et al., 2000). The gene (MTH1654) immediately

upstream of the trp operon and transcribed in the opposite direction encodes a protein that has homologs only in Archaea. Computer analyses of this protein and its homologs predict that it contains an ACT domain (acetolactate synthase, chorismate mutase and

TyrR) in the C-terminal region (Gelfand et al., 2000). ACT domains have been

implicated in allosteric regulation of metabolic enzymes (in acetolactate synthase and

chorismate mutase) and transcription regulation (in E. coli TyrR) in response to amino

acid binding (Chipman and Shaanan, 2001). The structures of two proteins with ACT

domains have been determined. One is the E. coli 3-phosphoglycerate dehydrogenase (3-PGDH)

and the other is the rat phenylalanine hydroxylase (PheOH). 3-PGDH binds

serine at the protein dimer interface, while the phenylalanine-binding site of PheOH was

not located in the structure (Chipman and Shaanan, 2001).

The trpEGCFBAD operon in M. marburgensis has the same organization and an

upstream promoter sequence similar to that in M.t. (Meile et al., 1991). M. marburgensis

mutants that constitutively express the trp operon have been isolated, based on their

resistance to 5-methyltryptophan (5MT). One mutant had a single base pair deletion in

the region upstream of the trp operon, and a second mutant had a missense mutation in

the upstream ORF that corresponds to MTH1654 in M.t. (Gast et al., 1997).

Experiments were therefore undertaken to determine if the MTH1654 gene

product is a transcription regulator of the trp operon in M.t. The results obtained

demonstrate that this protein is a sequence-specific DNA binding protein that auto-represses

the transcription of its own gene. It also represses the transcription of the

trpEGCFBAD operon when tryptophan is present. Based on this function as a trp operon

regulator, but to avoid confusion with the E. coli TrpR that has the same function but a

different structure, MTH1654 has been designated trpY, and the encoded protein TRPY.

The promoter region of MTH1476 (here designated trpB2), which encodes a TrpB

homolog but is not linked to the trp operon, contains two TRP box sequences (Gelfand et al., 2000). Experiments were undertaken to determine if trpB2 transcription is also regulated by TRPY. The results obtained show that trpB2 transcription is repressed by

TRPY.

Materials and Methods

Transcription template construction

A PCR fragment encompassing the intergenic region between MTH1654 (trpY)

and MTH1655 (trpE) was PCR amplified from M.t. genomic DNA using primers MX75

and MX57 (see Appendix) and cloned into pCR2.1-TOPO using a TOPO TA cloning kit.

The resulting plasmid was named p7557 and its sequence was verified by automated

sequencing. Template T1 (466 bp) (Figures 5.1A and 5.3A) was PCR amplified from

p7557 using primers MX75 and MX57, and was used in in vitro transcription reactions.

T11 (142 bp) (Figure 5.6A) was amplified from p7557 using primers MX56 and MX64,

and was used in EMSA.

Plasmid p7557 was subjected to site-directed mutagenesis to generate templates

with mutated sequences (T2 through T9). Each reaction mixture (50 µl) contained 20 ng

of p7557, 5 µl 10X Pfu buffer [200 mM Tris-HCl (pH 8.8), 100 mM (NH4)2SO4, 100 mM KCl, 1 mg/ml BSA, 20 mM MgSO4], 1 µl 10 mM dNTPs, 1 µl Turbo Pfu

polymerase (2.5 U/µl), and 50 pmol of each of two mutagenizing primers (see Appendix).

The reaction mixtures were incubated in a thermocycler for 18 cycles (94°C, 15 sec;

45°C, 1 min; 68°C, 8 min). DpnI (5 U) was then added and the reaction mixtures were incubated further at 37°C for 1 h. A 1 µl aliquot of each reaction mixture was then used to transform competent E. coli DH5α cells, which were plated on LB plates containing kanamycin (50 µg/ml) to select for transformants. Colonies were picked and cultures grown in 2 ml LB liquid medium that contained kanamycin (50 µg/ml). Plasmids were obtained using Qiagen miniprep kits.

The inserts in the plasmids were sequenced to verify the presence of the desired mutations. Plasmids with desired mutations were used as templates in PCR reactions with primers MX57 and MX75 to generate the transcription templates T2 through T9.
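
The final concentrations in the 50 µl mutagenesis reactions follow from the stock concentrations and volumes listed above. The sketch below applies the usual dilution relation (final = stock × volume added / total volume). It treats the 10 mM dNTP stock as a single concentration, since the text does not say whether that figure is per nucleotide or total, and the names are illustrative.

```python
TOTAL_UL = 50  # reaction volume stated in the protocol


def final_concentration(stock, added_volume_ul, total_volume_ul=TOTAL_UL):
    """Dilution relation: final = stock * added volume / total volume."""
    return stock * added_volume_ul / total_volume_ul


# Components of the 10X Pfu buffer as listed in the text.
BUFFER_10X = {
    "Tris-HCl (mM)": 200,
    "(NH4)2SO4 (mM)": 100,
    "KCl (mM)": 100,
    "BSA (mg/ml)": 1,
    "MgSO4 (mM)": 20,
}

final_buffer = {name: final_concentration(conc, 5) for name, conc in BUFFER_10X.items()}
dntp_mm = final_concentration(10, 1)      # 1 µl of 10 mM dNTPs
pfu_u_per_ul = final_concentration(2.5, 1)  # 1 µl of polymerase at 2.5 U/µl
primer_um = 50 / TOTAL_UL                 # 50 pmol in 50 µl; pmol/µl equals µM
template_ng_per_ul = 20 / TOTAL_UL        # 20 ng of p7557

for name, conc in final_buffer.items():
    print(f"{name}: {conc:g}")
print(f"dNTPs: {dntp_mm:g} mM, Pfu: {pfu_u_per_ul:g} U/µl, "
      f"each primer: {primer_um:g} µM, template: {template_ng_per_ul:g} ng/µl")
```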

Template T10 (438 bp) (Figure 5.12A) carrying the trpB2 promoter region was amplified from M.t. genomic DNA using primers MX111 and MX138, and used in in vitro transcription reactions. T12 (136 bp) (Figure 5.12A and E) was amplified from M.t.

genomic DNA using primers MX112 and MX132 and was used in EMSA. T13 (84 bp)

(Figure 5.7) was PCR amplified from pYX2 using primers TD2 and MX11, and was used

in EMSA as a non-specific DNA substrate.

In vitro transcription

In vitro transcription reaction mixtures (25 µl) contained template DNA (30 ng),

RNAP (250 ng), TBP (50 ng), TFB (300 ng), 200 µM ATP, 200 µM UTP, 200 µM GTP,

20 µM CTP and 5 µCi of [α-32P] CTP (3000 Ci/mmol). The reaction mixtures were

incubated at 58°C for 30 min. When TRPY and/or amino acids were added, the amounts

or concentrations are noted in the relevant text, figures and figure legends. Because of

the low solubility of 5MT in water, 5MT was dissolved in methanol as a 20 mM stock

solution. All other amino acids were dissolved in water.

Mapping of transcription start sites

In vitro transcription reactions were performed in reaction mixtures scaled up to

200 µl. After 30 min incubation at 58°C, 5 U of DNase I were added and the reactions were incubated at 37°C for 30 min. After incubation, 2 µl of glycogen (200 mg/ml), 100

µl of 1 M Na-acetate, and 900 µl of 95% ethanol were added and the tubes were incubated at -20°C for 10 min. After centrifugation in an Eppendorf microfuge at 14,000 rpm for 15 min, the supernatant was discarded, and the RNA pellets resuspended in 30 µl of TE buffer [10 mM Tris-HCl (pH 7.5), 1 mM EDTA].

Primer extension reaction mixtures (20 µl total volume) contained 2 µl of 10X reverse transcription buffer [100 mM Tris-HCl (pH 7.8), 100 mM (NH4)2SO4, 1 mM

DTT, 1 mg/ml BSA, 25 mM MgCl2], 2 µl of 5 mM dNTPs, 0.25 pmol of [32P]-labeled primer MX77, MX64 or MX133 (see Appendix), 4 U of OmniScript and 2 µl of in vitro transcribed RNA solution. The reaction mixtures were incubated at

37°C for 60 min, and the reaction stopped by adding 12 µl of 96% formamide, 25 mM

EDTA, and the products separated by electrophoresis on sequencing gels alongside DNA sequencing ladders generated by using the same [32P]-labeled primer and T1 or T10 as the template DNA.

Cloning of trpY

MTH1654, which encodes TRPY, was PCR amplified from M.t. genomic DNA using primers MX86 and MX87 (see Appendix) and the Pfu polymerase. Primer MX86 added an NdeI restriction site to the PCR product and MX87 added a HindIII restriction site.

The PCR product was purified using a Qiagen PCR clean-up kit and digested with NdeI and HindIII. Plasmid pET30a was also digested with NdeI and HindIII. The PCR product and the linearized pET30a were ligated using T4 ligase. An aliquot of the ligation product was used to transform competent E. coli DH5α cells with transformants selected on LB agar plates that contained kanamycin (50 µg/ml).

Kanamycin-resistant colonies were picked from the agar plates and resuspended in 2 ml of LB liquid medium containing kanamycin (50 µg/ml). The cultures were grown overnight and the plasmid DNA purified using a Qiagen plasmid miniprep kit. The plasmids were sequenced to verify the insert sequence. One such plasmid with the correct insert sequence (p8687) was used to transform competent E. coli Rosetta (DE3) cells and transformants were selected by plating on LB agar plates that contained both kanamycin

(50 µg/ml) and chloramphenicol (25 µg/ml).

Preparation of recombinant TRPY

To obtain recombinant TRPY, 50-ml cultures of E. coli Rosetta (DE3; p8687) were grown in LB liquid medium containing kanamycin (50 µg/ml) and chloramphenicol

(25 µg/ml) at 37°C to an OD600 of ~0.6 and IPTG then added to a final concentration of 1 mM. After 2 h of continued incubation, the cells were harvested by centrifugation in a

Sorvall SS34 rotor at 5000 rpm for 10 min. The cell pellet was resuspended in 1.2 ml lysis

buffer [50 mM Na3PO4 (pH 8.0), 300 mM NaCl] and passaged twice through a French

press. The cell lysate was centrifuged in an Eppendorf microfuge at 14,000 rpm for 20

min. The supernatant was collected and applied to a Qiagen Ni-NTA column. The

column was washed twice with 0.6 ml wash buffer [50 mM Na3PO4 (pH 6.0), 300 mM

NaCl, 50 mM imidazole] and eluted twice with 0.2 ml elution buffer [50 mM Na3PO4 (pH

6.0), 300 mM NaCl, 500 mM imidazole]. The eluates were pooled and dialyzed against

500 ml of 50 mM Tris-HCl (pH 8.0), 300 mM KCl, 10 mM MgCl2, 1 mM DTT, 20%

(v/v) glycerol to remove the imidazole. Glycerol was added to the dialyzed protein

solution to a final concentration of 35% (v/v). The TRPY protein preparation was stored

at –20°C. The concentration of TRPY protein was measured using the Bradford assay

with BSA (Sigma) as a standard. The purity of the protein preparation was estimated by

SDS-PAGE and Coomassie staining.

[32P]-labeling of oligonucleotides and DNA

Single-stranded oligonucleotides or double-stranded DNA molecules were labeled

using T4 kinase. A typical 20 µl reaction mixture contained 4 µl of forward reaction buffer [350 mM Tris-HCl (pH 7.6), 50 mM MgCl2, 500 mM KCl, 5 mM DTT], 10 U T4

kinase, 200 µCi [γ-32P] ATP (7000 Ci/mmol), and was incubated at 37°C for 1 h. The

reactions were terminated by the addition of 20 µl of 20 mM EDTA. The labeled DNA

was separated from unincorporated [γ-32P] ATP by passage through a syringe packed with Sephadex G-50 resin. Double-stranded DNA molecules were prepared by mixing equimolar amounts of two single-stranded complementary oligonucleotides (25 pmol/µl). The mixture was heated to 95°C for 3 min and then cooled to RT.

EMSA of TRPY binding to DNA

[32P]-labeled DNAs (1 ng) were incubated with increasing amounts of TRPY

and/or tryptophan plus 50 ng of poly dI-dC for 20 min at RT in 10 µl of transcription buffer. Following the addition of 1.5 µl of loading buffer (40% [w/v] sucrose, 0.4% bromophenol blue, 0.4% xylene cyanol), the reaction products were separated by electrophoresis through 6% native polyacrylamide gels and visualized using a phosphorimager.

DNase I footprinting

DNA molecules for DNase I footprinting were PCR amplified using [32P]-end-

labeled primer MX64 and unlabeled primer MX75 from plasmids p7557, p8283

(mutation in TATAE box) and p8485 (mutation in TATAY box). Each reaction mixture

(10 µl) contained 25 ng of labeled DNA and 1 U of DNase I in transcription buffer.

TRPY, tryptophan, TBP and TFB were added when necessary and their amounts are

indicated in the corresponding figures and figure legends. After incubation at RT for 7

min, the reactions were stopped by adding 10 µl of 96% formamide, 25 mM EDTA, and

the products separated by electrophoresis through 6% sequencing gels, and visualized

using a phosphorimager.

Results

Primer extension mapping of trpEGCFBAD and trpY transcription start sites

A diagram of the M.t. trpEGCFBAD operon and the adjacent trpY gene and the intergenic region is shown in Figure 5.1A. Results of primer extension are shown in

Figure 5.1C. Transcription of the trpEGCFBAD operon was found to start at a G, located

24 bp downstream from a potential TATA-box (TTTAAATA) and 32 bp downstream from a potential BRE (ACAAGA). Transcription of the trpY gene also started at a G, located 24 bp downstream from a potential TATA-box (TACATATA) and 32 bp from a potential BRE (ACAAAG). The two genes are located on opposite DNA strands and their promoters overlap. In both cases, the initiating G occurs within the sequence context

TTGT. The first nucleotide of the trpY transcript coincides with the bp that was deleted in the M. marburgensis mutant MWR1 that was resistant to 5MT (Gast et al., 1997).

Confirmation of the trpY/trpE TATA-box sequences and the importance of the nucleotide at the site of transcription initiation

When T1 DNA was used as the template in in vitro transcription reactions, two run-off transcripts were generated (Figure 5.1B). One, designated E, is 262 nt long, corresponding to the trpE transcript; the other, designated Y, is 169 nt long, corresponding to the trpY transcript. Templates T2 and T3 were constructed to confirm that the candidate TATA-box sequences (TATAE = TTTAAATA; TATAY =

TACATATA) were required for E and Y transcription, respectively. The AA and AT at positions 5 and 6 of each TATA-box were replaced by GG in T2 and T3. As predicted, only Y was synthesized in vitro from T2 and only E from T3 (Figure 5.1B). DNase I footprinting further confirmed that the TATAE and TATAY sequences were required for binding and ternary complex formation by TBP and TFB (Figure 5.1E). Consistent with

previous reports (Bell et al., 1999; Bell and Jackson, 2000; Hausner and Thomm, 2001;

Vierke et al., 2003), TBP and TFB binding to T2 and T3 protected regions of ~25 bp and the DNase I footprints generated covered the TATAE and TATAY sequences, respectively.

With T1, the original template with both TATA-boxes intact, TBP-TFB binding protected ~45 bp and the footprint extended to both TATA-boxes. A dominant hypersensitive site was generated upstream of the TATAE sequence by TBP and TFB binding on templates T1 and T2 with a couple of minor hypersensitive sites also introduced downstream of the TATA-box. A dominant hypersensitive site was not generated on T3, but minor hypersensitive sites were introduced upstream of the DNase I footprint.

Templates T4 through T7 were constructed to determine if the initiator nucleotide of the trpY gene was important for transcription initiation (Figure 5.1D). When the initiator nucleotide was changed from G to A (T4), Y synthesis was reduced. When the initiator nucleotide was changed from G to T (T5), or C (T6), or deleted (T7), Y synthesis was abolished.

Purification of TRPY

The recombinant TRPY used in this study contained 167 amino acid residues plus a C-terminal his-tag (LAAALQHHHHHH) (Figure 5.4A). The calculated molecular mass of the recombinant his-tagged TRPY is 21.2 kDa. The purified protein migrated slightly more slowly than a 20 kDa marker in SDS-PAGE (Figure 5.2), consistent with the calculated molecular mass. The purified protein was greater than 95% pure, as judged by

Coomassie staining after SDS-PAGE.

Effects of TRPY on transcription in vitro

In the absence of TRPY and tryptophan, in vitro transcription using T1 as template resulted in the E and Y transcripts (Figures 5.1B and 5.3). Addition of tryptophan (800 µM final concentration) had no effect on the amounts of the E and Y transcripts synthesized. Addition of TRPY (200 ng) had no effect on the amount of the E transcript synthesized, but reduced the amount of the Y transcript by >80% (Figure 5.3).

When tryptophan (800 µM final concentration) and TRPY protein (200 ng) were both added, syntheses of both E and Y were abolished (Figure 5.3). When tryptophan was replaced by tyrosine, leucine, , phenylalanine, alanine or arginine (800 µM final concentration), TRPY inhibited only Y synthesis, not E synthesis (Figure 5.3).

The effects of TRPY on E and Y synthesis were determined with increasing amounts of TRPY (25, 50, 100, 150, 200 ng) present in the in vitro transcription reactions

(Figure 5.4B, C). In the absence of tryptophan, there was no effect on E synthesis, regardless of the TRPY concentration, but a TRPY concentration-dependent decrease in

Y synthesis was observed. In the presence of tryptophan (800 µM final concentration), both E and Y synthesis were decreased in a TRPY concentration-dependent manner. In the absence of tryptophan, the amount of TRPY that caused 50% inhibition of Y synthesis was ~75 ng, corresponding to 38 TRPY monomers per template DNA molecule.

In the presence of tryptophan (800 µM), the amounts of TRPY that caused 50% inhibition of E and Y synthesis were ~100 ng and ~75 ng, respectively, corresponding to

50 and 38 TRPY monomers per template DNA molecule.
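
The monomer-to-template ratios quoted above follow from a simple molar conversion. The sketch below is illustrative only: the 21.2 kDa mass is the calculated mass of his-tagged TRPY, whereas the template amount (0.094 pmol per reaction) is an assumption back-calculated from the quoted ratios and is not a value stated in the text.

```python
TRPY_MASS_KDA = 21.2           # calculated mass of his-tagged TRPY (from the text)
ASSUMED_TEMPLATE_PMOL = 0.094  # hypothetical; back-calculated, not stated in the text


def monomers_per_template(trpy_ng, template_pmol=ASSUMED_TEMPLATE_PMOL,
                          mass_kda=TRPY_MASS_KDA):
    """Return the molar ratio of TRPY monomers to template DNA molecules."""
    trpy_pmol = trpy_ng / mass_kda  # ng divided by kDa gives pmol
    return trpy_pmol / template_pmol


if __name__ == "__main__":
    for ng in (75, 100):
        ratio = round(monomers_per_template(ng))
        print(f"{ng} ng TRPY -> about {ratio} monomers per template")
```

Under this assumed template amount, 75 ng and 100 ng of TRPY give ratios that round to the 38 and 50 monomers per template reported here.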

When increasing concentrations of tryptophan (2, 8, 24, 80, 160, 240, 480, 800 and 1600 µM) were added to in vitro transcription reactions that contained a fixed amount of TRPY (200 ng), E synthesis decreased as the tryptophan concentration increased (Figure 5.5). The concentration of tryptophan that caused 50% inhibition of E synthesis was ~8 µM, close to the internal tryptophan concentration (11.3 µM) measured in M. marburgensis cells grown in tryptophan-free medium (Gast et al., 1997). Addition of the tryptophan analog 5MT also inhibited E synthesis in the presence of 200 ng of

TRPY. This indicates that TRPY responded to both tryptophan and 5MT.

DNA binding activity of TRPY

To determine the DNA binding activity of TRPY and to identify the DNA binding sites in the trpY/trpE intergenic region, EMSA experiments were undertaken. TRPY binding to T11 (Figure 5.6A) produced a gel shift band in the absence of tryptophan

(Figure 5.6B). In the presence of tryptophan, a diffuse gel shift band was produced, suggesting that a larger complex or a conformationally different complex was formed. In the presence of other amino acids and TRPY, the diffuse gel shift band was not produced.

EMSA experiments also showed that TRPY did not bind to the control T13 (hmtB promoter) DNA, regardless of the presence of tryptophan (Figure 5.7).

To define the sites of TRPY binding, DNA molecules with the sequences shown in Figure 5.8A were used in EMSA. When incubated with TRPY, A12 produced a gel shift, in the presence and absence of tryptophan, but A23, A34, A50 and A56 did not produce a gel shift. A12 therefore contains the binding site for TRPY. To define which sequence within A12 was important for TRPY binding, derivatives (A13–A19) with the

sequences shown in Figure 5.8A were generated and used in EMSA. A13 and A14, which have altered spacing between TRP box 1 and 2, did not produce a gel shift. A15,

A16 and A18, which have mutations outside the TRP boxes, produced a gel shift, but

A17 and A19, which have mutations within TRP box 1 or 2 respectively, did not. To extend this study, additional DNA molecules with the sequences shown in Figure 5.8B were used in EMSA. B12 has the wild-type sequence and produced a gel shift in the presence and absence of tryptophan. The complexes formed with B12 in the presence of tryptophan were more diffuse than the complexes formed in the absence of tryptophan.

B13, which has a mutation in TRP box 1, did not produce a gel shift at all. B15 and B16, which have mutations in TRP box 3 and 4, produced a gel shift, but did not produce the more diffuse complexes.

Determination of TRPY binding sites by DNase I footprinting

DNA molecules for DNase I footprinting were generated by PCR from p7557 using an unlabeled primer MX75 and a [32P]-labeled primer MX64. The DNA molecules

(25 ng) were incubated with increasing amounts of TRPY (0, 25, 50, 100 and 150 ng) in the absence or presence of tryptophan (800 µM final concentration) at RT for 20 min and the complexes formed subjected to DNase I digestion for 8 min. In the absence of tryptophan, DNase I protection resulted in a ~25 bp footprint, located in the region of

TRP boxes 1 and 2 (Figure 5.9, lane 7). The footprint region was the same region that was found to be important for TRPY binding in the EMSA experiments (Figure 5.8).

There was a prominent hypersensitive site introduced by TRPY binding within TRP box

1, and this hypersensitive site corresponded to the third base in TRP box 1. There was

also a minor hypersensitive site introduced in TRP box 2. In the presence of tryptophan, the DNase I footprint was extended to ~55 bp, and included TRP boxes 1 through 4

(Figure 5.9, lane 13). There were two major hypersensitive sites introduced, one in TRP box 1 and one in TRP box 4. The first was the same as seen without tryptophan; the second was located at the third base of TRP box 4. These two sites were separated by 42 bp, or ~4 helical turns of DNA. There were also three minor hypersensitive sites introduced at 10 or 11 bp intervals. In the presence and absence of tryptophan, a hypersensitive site was also generated in TRP box 5. These footprinting results were consistent with binding of TRPY to TRP boxes 1 and 2 in the absence of tryptophan, to

TRP boxes 1, 2, 3 and 4 in the presence of tryptophan, and to TRP box 5 in the presence and absence of tryptophan.
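
The conversion from base pairs to helical turns used above can be sketched as follows. The helical repeat of 10.5 bp per turn is the commonly quoted value for B-form DNA in solution and is an assumption here; the function names and the tolerance are illustrative only.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA in solution


def helical_turns(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Convert a spacing in base pairs to a number of helical turns."""
    return spacing_bp / bp_per_turn


def same_face(spacing_bp, tolerance_turns=0.1, bp_per_turn=BP_PER_TURN):
    """True if the spacing is within the tolerance of a whole number of turns."""
    turns = helical_turns(spacing_bp, bp_per_turn)
    return abs(turns - round(turns)) <= tolerance_turns


if __name__ == "__main__":
    for spacing in (10, 11, 42):
        print(spacing, "bp =", round(helical_turns(spacing), 2), "turns;",
              "same face" if same_face(spacing) else "different face")
```

With this repeat, 42 bp corresponds to four turns, and spacings of 10 or 11 bp each lie within about 0.05 turn of a single turn.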

Transcription using templates with mutations in TRP boxes 1 and 3

Templates T8 and T9 were constructed to investigate the role of TRP boxes 1 and

3 in TRPY binding and transcription regulation (Figure 5.10). The TRP box 1 sequence was changed from TGTACA to TGGGGG in template T8, the same sequence change that was introduced to B12 to generate B13 that did not produce a gel shift when incubated with TRPY in the presence or absence of tryptophan (Figure 5.8B). The TRP box 3 sequence was changed from AGTACC to AGGGGC in template T9 and this change was introduced in B12 to generate B15 that gave a gel shift in the presence of TRPY but did not generate the diffuse band in the presence of TRPY plus tryptophan (Figure 5.8B).

When T8 was used as the template for in vitro transcription, there was decreased inhibition of Y synthesis in the absence of tryptophan, consistent with decreased binding

of TRPY to the mutated TRP box 1. This mutation had no effect on TRPY inhibition of E synthesis in the presence of tryptophan. When T9 was used as the template for in vitro transcription, there was decreased inhibition of E synthesis in the presence of tryptophan, consistent with decreased TRPY binding to the mutated TRP box 3.

Mutational studies of TRPY protein

Mutations in the trpY gene have been identified in 5MT-resistant strains of M. thermautotrophicus by Lubomira Cubonova (personal communication). Recombinant versions of the encoded TRPY proteins were generated in E. coli and tested in gel shift assays. One TRPY variant, A128E, retained the DNA-binding activity. When A128E was added to in vitro transcription reactions containing template T1, it repressed Y synthesis, but did not repress E synthesis in the presence of tryptophan (Figure 5.11). Another

TRPY variant, G149R, which confers the 5MT-resistance phenotype in M. thermautotrophicus, inhibited E synthesis in in vitro transcription reactions in the presence of tryptophan (Figure 5.11).

trpB2 transcription is also regulated by TRPY

Diagrams of the MTH1476 (trpB2) and MTH1477 genes are shown in Figures

5.12A and B. trpB2 is a homolog of trpB and is present in both Archaea and Bacteria. It is postulated to be involved in indole or serine metabolism (Hettwer and Sterner, 2002;

Xie et al., 2002). The function of MTH1477 is unknown and it is present only in Archaea.

The intergenic region between trpB2 and MTH1477 contains two perfect TRP boxes, separated by 4 bp, located upstream of a potential TATA-box for the trpB2 gene.

Template T10 (438 bp) was generated by PCR from M.t. genomic DNA, and T10 includes the trpB2 and MTH1477 intergenic region. When T10 was used in in vitro transcription reactions, a transcript (B2) that had the electrophoretic mobility consistent with a run-off transcript (201 nt) was synthesized. The transcription start site was located

24 bp downstream of the putative TATA-box (TTTAAATA). Between the transcription and translation start sites, there is a short ORF that could encode a 15-residue leader peptide that would contain 5 serine residues (Figure 5.12B). Addition of TRPY inhibited

B2 synthesis in the absence and presence of tryptophan, but the inhibition was more severe in the presence of tryptophan (Figure 5.12C and D). When 200 ng of TRPY was added, B2 synthesis was reduced by 70% in the absence of tryptophan, and by 95% in the presence of tryptophan (800 µM). B2 synthesis was reduced by 50% by ~150 ng of

TRPY (corresponding to 80 TRPY monomers per template DNA molecule) in the absence of tryptophan and by ~50 ng of TRPY in the presence of tryptophan. A transcript from the MTH1477 gene was not generated in these in vitro transcription reactions. T12

(136 bp) was generated by PCR from M.t. genomic DNA. It includes the trpB2/MTH1477 intergenic region, but is shorter than T10, and so more suitable for

EMSA. When T12 was incubated with TRPY, a gel shift band was observed in the presence and absence of tryptophan (Figure 5.12E). In contrast to the in vitro transcription reaction results, in which TRPY inhibition of B2 synthesis was increased by the presence of tryptophan, a complete gel shift was observed at a lower concentration of

TRPY in the absence of tryptophan than in the presence of tryptophan.

Discussion

Organization of the trpY and trpEGCFBAD regulatory region

The results obtained from in vitro transcription reactions from templates with and without mutations in the potential TATA-boxes confirmed that divergent transcription was initiated from two TATA-boxes in the trpY and trpEGCFBAD intergenic region

(Figure 5.1B). The direction of archaeal transcription is determined by TFB binding to

the BRE, a 6-bp region located immediately upstream of the TATA-box. The consensus

sequence of the BRE is RNWAAW (R= purine, W = A or T, N = any base; Bell et al.,

1999). For TATAE, the 6-bp sequence upstream is ACAAGA; for TATAY, the 6-bp

sequence upstream is ACAAAG, both closely matching the BRE consensus. The 6-bp sequences

downstream of TATAE and TATAY are AGGTTC and TCCGTG and these poorly

conform to the BRE consensus. trpY and trpE therefore contain their own TATA-boxes

and BREs, and the arrangement of the BREs relative to the TATA-boxes determines the

polarity of PIC assembly and the direction of transcription initiation.
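
The comparison with the BRE consensus can be made explicit by counting position-wise mismatches against RNWAAW. This is a minimal sketch that compares the hexamers exactly as they are written above; the strand and orientation in which each hexamer should be read is an assumption, and the helper name is illustrative.

```python
# IUPAC codes used in the consensus: R = purine, W = A or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "W": "AT", "N": "ACGT"}
BRE_CONSENSUS = "RNWAAW"


def bre_mismatches(hexamer, consensus=BRE_CONSENSUS):
    """Count positions at which the hexamer violates the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(base not in IUPAC[code] for base, code in zip(hexamer, consensus))


if __name__ == "__main__":
    for label, hexamer in [("upstream of TATAE", "ACAAGA"),
                           ("upstream of TATAY", "ACAAAG"),
                           ("downstream of TATAE", "AGGTTC"),
                           ("downstream of TATAY", "TCCGTG")]:
        print(label, hexamer, bre_mismatches(hexamer), "mismatch(es)")
```

Read this way, each upstream hexamer departs from the consensus at a single position, whereas the downstream hexamers depart at four and five positions.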

Transcription of trpY and trpEGCFBAD is directed by overlapping promoters

(Figure 5.1A and E). The sizes of DNase I footprints (~25 bp) on T2 and T3 are

consistent with previous results for the assembly of one TBP and one TFB on a promoter

(Bell et al., 1999; Bell and Jackson, 2000; Hausner and Thomm, 2001). The size of

DNase I footprint (~45 bp) on T1 is consistent with two TBP/TFB complexes binding to

the same DNA. TBP/TFB binding generated hypersensitive sites on T1 and T2 and these

corresponded to the region where TFB binds to the BRE (Bell et al., 1999). Based on the

crystal structure of the archaeal TBP-TFB-promoter complex, TBP contacts the bases in

the TATA-box and TFB contacts both phosphate groups and bases in the BRE (Littlefield

et al., 1999). If the TBP-TATA-box and TFB-BRE interactions are mapped to the trpY and trpEGCFBAD intergenic region, it is possible to have two TBP/TFB complexes positioned in this region without steric conflict. However, one caveat remains. The TFB in the crystal structure did not have its N-terminal region, but photochemical crosslinking experiments have demonstrated that TFB binds not only to the BRE, but also to the region between the TATA-box and transcription initiation site, and this binding involves the N-terminal region of TFB (Bartlett et al., 2004; Renfrow et al., 2004). If these interactions are added to the trpY and trpEGCFBAD intergenic region, a conflict between

TBP binding and TFB binding at both promoters seems likely. Given that the in vitro transcription and footprinting results suggest that two sets of TBP/TFB can bind to the intergenic region simultaneously, additional structural studies are needed to define where the TFB N-terminal region interacts with DNA downstream of the TATA-box.

If two sets of TBP/TFB complexes bind to the intergenic region simultaneously, can two RNAPs be recruited and initiate transcription simultaneously? DNase I footprinting revealed that an archaeal RNAP binds to the region from –15 to +15, relative to the transcription start site, on both the template and non-template strands (Hausner and

Thomm, 2001). Similar results were obtained from photochemical crosslinking of RNAP to the template DNA (Bartlett et al., 2004; Renfrow et al., 2004). Such an assembly of two RNAPs would conflict with the two TBP/TFB complexes bound at the divergent promoters. A further complication is the transcription initiation bubble, which extends from –10 to +4 in Archaea (Hausner and Thomm, 2001). Such a transcription bubble from one promoter will overlap the TATA-box of the second promoter. So it seems there is no possibility of two RNAPs initiating simultaneously from TATAY and TATAE.

If transcription cannot initiate from TATAY and TATAE simultaneously, then the

choice of initiation from either TATAY or TATAE must depend on the assembly of a PIC

at TATAY or TATAE. If it occurs with equal frequency at TATAY versus TATAE, then there will be equal numbers of initiation events from TATAY versus TATAE. TBP has been shown to stay bound to the promoter after transcription initiation (Chapter 3), so it is possible that the TATA-box occupied by TBP first will direct multiple rounds of transcription initiation before the second TATA-box is occupied by TBP.

The nucleotide at the site of initiation was found to be critically important for trpY transcription, as deletion of this G or substitution with C or T abolished initiation and substitution with A reduced initiation (Figure 5.1D). The same nucleotide was deleted in the 5MT-resistant strain (MWR1) of M. marburgensis. An explanation for the relationship between the deletion and 5MT-resistance phenotype seems apparent. With the initiating nucleotide deleted, the M. marburgensis strain is defective in trpY

transcription, and cannot synthesize TRPY. The lack of TRPY would lead to the

constitutive expression of the trpEGCFBAD operon, resulting in increased intracellular

concentration of tryptophan that outcompetes 5MT and so relieves its toxicity.

It is not surprising that archaeal RNAP preferred G or A as the initiating

nucleotide (Figure 5.1D). This preference for G or A is consistent with observations from

other RNAPs. For example, there are 17 T7 RNAP promoters in the T7 phage genome,

15 of which begin with G and 2 begin with A (Dunn and Studier, 1983). E. coli RNAP

initiates transcription usually with ATP or GTP, although occasionally CTP and UTP are

used (Reddy and Chatterji, 1993). In eukaryotes, random and site-directed mutational

studies have identified an initiator (Inr) element in many RNAPII promoters, with the

consensus sequence YY A/G (+1) N T/A YY (Smale and Kadonaga, 2003). A survey of

recent literature on archaeal transcription reveals that ATP and GTP are used frequently

(~40% each) as initiating nucleotide, followed by UTP (~20%), while CTP is almost

never used.

To create an RNA chain de novo, RNAP must start by base-pairing the initiating

nucleotide with the template base. The preference for initiation with GTP suggests that

Watson-Crick base-pairing between the initiating nucleotide and the template is

important, as a G:C base pair is stronger than a T:A base pair. However, the lack of

initiation with CTP invalidates this simple explanation. More likely, the choice of

initiating nucleotide depends on the fit of the nucleotide within the RNAP (Young et al., 2002;

Kuzmine et al., 2003), and this could explain why E. coli RNAP prefers ATP and T7

RNAP prefers GTP. It can be hypothesized that ATP and GTP have a better fit in the M.t.

RNAP than CTP and UTP. Since the transcription initiation region of the trpY gene has

the sequence TCTTG (+1) TCT, deleting the G or substituting the G with C or T makes

the region completely devoid of A or G, thus making transcription initiation from an

adjacent alternative nucleotide very unlikely.
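
The argument about the sequence around the trpY start site can be checked directly. The sketch below rebuilds the region TCTTG(+1)TCT for the wild type and for the changes made in templates T4 to T7, and tests whether any purine remains; the variable and function names are illustrative.

```python
PURINES = set("AG")
UPSTREAM, DOWNSTREAM = "TCTT", "TCT"  # flanks of the +1 G, as given in the text


def region(start_base):
    """Sequence around the start site with the given base (or none) at +1."""
    return UPSTREAM + start_base + DOWNSTREAM


def has_purine(seq):
    """True if the sequence contains at least one A or G."""
    return any(base in PURINES for base in seq)


VARIANTS = {
    "wild type (G)": region("G"),
    "T4 (G to A)": region("A"),
    "T5 (G to T)": region("T"),
    "T6 (G to C)": region("C"),
    "T7 (G deleted)": region(""),
}

if __name__ == "__main__":
    for name, seq in VARIANTS.items():
        print(name, seq, "purine present" if has_purine(seq) else "no purine")
```

Only the wild-type and G-to-A regions retain a purine, in line with the residual Y synthesis seen with T4 and its absence with T5, T6 and T7.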

DNA-binding properties of TRPY

The EMSA results demonstrated that A12 contains binding sites for TRPY

(Figure 5.8A). Sequence changes outside the TRP boxes did not eliminate the gel shift, whereas sequence changes within TRP box 1 or 2 did. The EMSA results confirmed that

TRP boxes 1 and 2 are essential for TRPY binding. As deletion or addition of 2 bp between TRP boxes 1 and 2 also abolished gel shift, the spacing between TRP boxes 1

and 2 must be important for TRPY binding. TRPY binding therefore requires two TRP boxes separated by 4 bp. The TRPY binding site has the consensus sequence

TGTACANNNNTGTACA, reminiscent of the E. coli TyrR binding sequence

TGTAAANNNNNNTTTACA (Andrews et al., 1991). TRPY probably binds as a dimer, with each monomer binding to one TRP box and 4 bp being the optimal distance between the two DNA binding domains of the two TRPY monomers. Many transcription regulators form dimers and bind to two adjacent sites on DNA. This mechanism of binding provides higher sequence specificity and usually results in a lower dissociation constant than monomer binding. If TRPY binding required only a single 6-bp TRP box, there would be one binding site per 4096 bp of DNA on average, and the M.t. genome (1.75 Mbp;

Smith et al., 1997) would contain ~450 such binding sites. If TRPY binding requires two perfect TRP box sequences separated by 4 bp, then the expected frequency of such a binding site is once per 16.7 Mbp of DNA. Allowing one mismatch in each TRP box would increase the frequency to once per 1 Mbp of DNA. A FASTA search for the

TGTACANNNNTGTACA sequence in the entire M.t. genome identified sequences in the promoter regions of the trpEGCFBAD operon (TGTACANNNNTATACA) and the trpB2 gene (TGTACANNNNTGTACA), with the mismatched nucleotide underlined.

Sequences with two mismatches (TGTTCANNNNAGTACA in the MTH1129 mrtA coding region and TGTTCANNNNTGTACC in the MTH1401 unknown conserved protein coding region) and sequences with three mismatches (TGTTCANNNNAGTTCA,

TGATCANNNNTGTGCA, AGTACTNNNNTGTGCA, TGTGCANNNNAGTACC,

AGTTCANNNNTGTTCA, TGTGCANNNNGGTGCA, AGTTCANNNNTGTTCA,

AGTACANNNNAGTTCA, TGGTCANNNNTGGACA, TGTCCANNNNAGTGCA,

TGTACANNNNAATACT, AGTACCNNNNTGAACA, AGTACCNNNNTGTATA in

TRP boxes 3 and 4, TGTACANNNNTGAATG in TRP boxes 5 and 6 in trpY and trpE intergenic region) are also present in the M.t. genome.

The lack of gel shift with A23, A34, A50 and A56 (Figure 5.8A) presumably reflects their low affinity for TRPY, although they do contain TRP box sequences. A23 has TRP boxes 2 and 3, but the spacing is 15 bp, not 4 bp, whereas A34 has TRP boxes 3 and 4 and the correct spacing, but TRP boxes 3 and 4 have two mismatches and one mismatch, respectively, relative to the consensus sequence. A56 has TRP boxes 5 and 6. The TRP box 5 sequence is a perfect match to the consensus, but the TRP box 6 sequence has three mismatches.
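
The mismatch counts for the individual TRP boxes can be tallied against the TGTACA consensus as follows. The box sequences are taken from the paired sequences quoted above; assigning the first hexamer of each pair to the lower-numbered box is an assumption, and the names are illustrative.

```python
TRP_BOX_CONSENSUS = "TGTACA"

# Hexamers read from the paired sequences quoted in the text.
TRP_BOXES = {
    1: "TGTACA", 2: "TATACA",  # TGTACANNNNTATACA
    3: "AGTACC", 4: "TGTATA",  # AGTACCNNNNTGTATA
    5: "TGTACA", 6: "TGAATG",  # TGTACANNNNTGAATG
}


def mismatch_count(box, consensus=TRP_BOX_CONSENSUS):
    """Number of positions at which a box differs from the consensus."""
    if len(box) != len(consensus):
        raise ValueError("box and consensus must have the same length")
    return sum(a != b for a, b in zip(box, consensus))


if __name__ == "__main__":
    for number, box in TRP_BOXES.items():
        print("TRP box", number, box, mismatch_count(box), "mismatch(es)")
```

The counts for boxes 3, 4, 5 and 6 (two, one, none and three) agree with the description of A34 and A56.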

The diffuse gel shift band observed with B12 (Figure 5.8B) indicates that new

TRPY-DNA complexes were formed in the presence of tryptophan. The new complexes could be either larger complexes or have a different conformation from the complexes formed in the absence of tryptophan. When B13 (with a mutation in TRP box 1) was used, there was no gel shift at all, regardless of the presence of tryptophan, arguing that TRP box 1 is essential for TRPY binding. When B15 and B16 were used (with mutations in

TRP boxes 3 or 4), a gel shift band was observed, but the diffuse gel shift band was not observed in the presence of tryptophan, indicating that TRP boxes 3 and 4 are needed to form the larger complex or to generate the conformational change that resulted in the diffuse band.

The diffuse band was formed in the presence of tryptophan, but not in the presence of other amino acids (Figure 5.6), consistent with TRPY responding specifically to tryptophan, and not to other amino acids, including the aromatic amino acids phenylalanine and tyrosine. Tryptophan is most likely sensed by binding to the predicted

ACT domain in the C-terminal region of TRPY. In E. coli TyrR, a binding site for

tyrosine has been mapped by mutagenesis to an ACT domain, but this has not been

confirmed by crystal structure determination (Pittard et al., 2005). In E. coli 3PGDH, the

only ACT domain family member whose structure has been solved with a bound ligand, a

serine molecule is positioned at the interface of two ACT domains (Chipman and Shaanan,

2001). This protein forms a tetramer in solution, and 4 molecules of serine are bound at

four monomer:monomer interfaces (Al-Rabiee et al., 1996; Grant et al., 1996). The

number of tryptophan molecules bound per TRPY is not known, but can be resolved by

the determination of the TRPY structure in the presence of tryptophan. Alternatively, 14C-

Trp binding to TRPY can also provide an answer with accurate 14C-Trp and TRPY

quantitations (Kathleen Sandman, personal communication).

DNase I footprinting (Figure 5.9) revealed that TRPY binding introduced a major

hypersensitive site in TRP box 1, and a less pronounced hypersensitive site in TRP box 5

in the absence of tryptophan. In the presence of tryptophan, two major hypersensitive

sites were generated, one in TRP box 1 and one in TRP box 4, with three minor

hypersensitive sites located in between, and also a minor hypersensitive site in TRP box 5.

The hypersensitive site in TRP box 5 indicates that TRPY binds to this region, but

binding to this region must be relatively unstable as a gel shift was not observed when

TRPY was incubated with A56 that contained this DNA region (Figure 5.8A). The

distance between the two major hypersensitive sites is 42 bp, or ~4 helical turns of DNA,

with the three intervening minor hypersensitive sites separated from each other by 1

helical DNA turn. This suggests that TRPY binding constrains the DNA in such a way

that DNase I has preferential access at 10 or 11-bp intervals. Similar periodic cutting of

DNA occurred on the surface of a nucleosome exposed to DNase I or hydroxyl radicals

(Hayes and Wolffe, 1992; Pereira and Reeve, 1999), suggesting that DNA might wrap around TRPY in a similar way as DNA wraps around histones.

Model for trpY and trpEGCFBAD transcription regulation

Based on the results obtained and arguments presented above, a model for trpY and trpEGCFBAD transcription regulation can be proposed. Transcription of trpEGCFBAD and trpY occurs from divergent overlapping promoters (Figure 5.1). In the absence of tryptophan, TRPY binds to the TRP box 1 and 2 region (from +4 to +16) located immediately downstream from the trpY transcription start site (Figures 5.8 and

5.9). Given that the footprint of an archaeal RNAP bound to a promoter region covers a region from -20 to +15 (Hausner and Thomm, 2001), TRPY binding must sterically prevent RNAP binding, but should not interfere with TBP/TFB binding. Thus, the inhibition of trpY transcription follows the repression mechanism established for several other archaeal transcription regulators, including MDR1, LrpA and Phr (Bell et al., 1999;

Dahlke and Thomm, 2002; Vierke et al., 2003), namely prevention of RNAP access to the site of transcription initiation. In the presence of tryptophan, TRPY appears to contact not only the TRP box 1 and 2 region, but also the TRP box 3 and 4 region, thus preventing TBP/TFB access to both TATAE and TATAY boxes and causing the inhibition of transcription from both the trpY and trpEGCFBAD promoters. This is the mechanism of transcription repression established for several other archaeal transcription regulators, including Lrs14 and TrmB (Bell and Jackson, 2000; Lee et al., 2003). E synthesis in vitro was reduced by ~50% when tryptophan was present at a concentration of ~8 µM (Figure

5.5), a concentration that is close to the internal tryptophan concentration (11.3 µM) measured in M. marburgensis cells grown in the absence of tryptophan (Gast et al., 1997).

Apparently, the effect of tryptophan on transcription in vitro is relevant to the in vivo situation.

This model only describes regulation at the level of transcription initiation, and it is possible that additional levels of regulation occur during transcription elongation and translation. The second amino acid residue in M.t. TRPY and all its archaeal homologs is invariably a tryptophan, and this is the only tryptophan in these proteins. This tryptophan residue could play a regulatory role such that in the absence of charged tRNATrp, translation of trpY mRNA would abort or stall, thus lowering the TRPY concentration in the cell. Additionally, the entire M.t. trpEGCFBAD operon contains only a single tryptophan codon (W175 in trpB), and this tryptophan residue is conserved in all TrpB proteins from Bacteria, Archaea and Eukarya, except in Chlamydia psittaci and

Rhodobacter capsulatus (Xie et al., 2001). This tryptophan residue is not directly involved in the catalysis of indole and serine to tryptophan (Hyde et al., 1988), but could have an unidentified essential function. The rarity of Trp codons in the trp genes is also observed in other Archaea, although M.t. is an extreme case. Based on the average Trp codon frequency of ~0.83% in M.t., there should be 18 Trp codons in the trpEGCFBAD operon. It is interesting that the habitat of M.t. is sewage sludge (Zeikus and Wolfe, 1972) and sewage sludge is rich in indole (a breakdown product of tryptophan), suggesting that tryptophan might also be rich in such an environmental niche (Ishikawa et al., 1981). M.t. will have a growth advantage if it does not express the trp operon when tryptophan can be obtained from the environment. Trp codons are also underrepresented in the trp genes

in E. coli and B. subtilis, which grow in animal intestines or soils. Mathematical models predict that such underrepresentations of Trp codons cause tryptophan biosynthetic enzymes to be synthesized more quickly when tryptophan is suddenly depleted in the environment (Alves and Savageau, 2005).

The finding that TRPY auto-regulates the transcription of its own gene is consistent with many other transcription regulators in Bacteria, Eukarya and Archaea. In

E. coli, ~40 % of the known transcription regulators auto-regulate the expression of their own genes (Rosenfeld et al., 2002). Autoregulation fine-tunes the level of mRNA, and therefore the amount of regulator protein present in the cell. But once repression is achieved, how is an auto-repressed promoter reactivated? There are some well-studied examples in Bacteria and Eukarya. As an example in Bacteria, E. coli LexA normally binds to the promoter of its own gene and prevents its transcription. However, when

RecA is activated by single-stranded DNA resulting from DNA damage in the cell, LexA on the lexA promoter is cleaved, and this allows the lexA gene to be transcribed (Brent and Ptashne, 1982). In Eukarya, c-Myc auto-regulates the transcription of its own gene by binding to its promoter region. The repression is relieved through heterodimerization of c-Myc with Max that is expressed under the control of the Ras-MAPK or Wnt signaling pathway when cells are exposed to stimulatory growth factors or cytokines (Secombe et al., 2004). In addition to these direct activation strategies, some regulators are diluted through cell growth. Since TRPY represses transcription of its own gene in the presence and in the absence of tryptophan, how and when is it ever made? It is possible that TRPY dissociates from the TRP box 1 and 2 region sporadically, and this allows brief transcription of trpY. As TRPY is a sequence-specific transcription regulator, its

129 concentration in the cell is probably very low, and this can be measured if anti-TRPY

antibodies are available. As a model, E. coli has 120 TrpR dimers per cell in cells grown

in tryptophan-containing medium, and 375 TrpR dimers in cells grown in tryptophan-free

medium (Gunsalus et al., 1986). So even a very brief period of trpY transcription can

generate enough trpY mRNA. TRPY could also be removed by the DNA replication

machinery once per every cell cycle.
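To put the TrpR copy numbers into concentration terms, the sketch below converts molecules per cell into an approximate molar concentration. The cell volume of 1 femtolitre is an assumed round figure for E. coli and is not a value from this work; no cell volume for M.t. is given here, so the conversion is shown for the E. coli numbers only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_to_nanomolar(n_molecules: float, cell_volume_litres: float = 1e-15) -> float:
    """Convert a per-cell copy number into nanomolar concentration.

    cell_volume_litres defaults to 1 fL, an assumed round value for E. coli.
    """
    moles = n_molecules / AVOGADRO
    return moles / cell_volume_litres * 1e9


# TrpR dimers per cell quoted from Gunsalus et al. (1986)
for label, dimers in [("tryptophan-containing medium", 120),
                      ("tryptophan-free medium", 375)]:
    print(f"{label}: {dimers} dimers ~ {molecules_to_nanomolar(dimers):.0f} nM")
```

With the assumed volume, a few hundred dimers per cell correspond to a concentration in the sub-micromolar range.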

How can TRPY binding be reconciled with the finding that TBP stays bound to the TATA-box region of a promoter after transcription initiation (Chapter 3)? In the absence of tryptophan, TRPY binds to the TRP box 1 and 2 region, and this does not interfere with TBP/TFB binding to TATAE/BRE, based on DNase I footprints obtained using TBP, TFB and template T1, T2 or T3 (Figure 5.1E) and previous reports (Bell et al., 1999; Hausner and Thomm, 2001). In the presence of tryptophan, TRPY may bind to the TRP box 3 and 4 region, and this region contains TATAE and TATAY. If TBP molecules have already bound to TATAE and TATAY, TRPY has to dislodge these TBP molecules in order to gain access to the TRP box 3 and 4 region. A likely scenario is that a TRPY dimer bound at the TRP box 1 and 2 region recruits a second TRPY dimer, and the two dimers tetramerize and dislodge TBP from TATAE and TATAY. Mot1, which disrupts TBP-DNA binding in yeast, binds similarly to a region immediately upstream of the TATA-box before dislodging TBP (Darst et al., 2001; Dasgupta et al., 2002).

TRPY also regulates transcription of trpB2

The intergenic region between MTH1476 (trpB2) and MTH1477 (unknown conserved archaeal protein) has two TRP boxes, separated by 4 bp, located 7 bp upstream of the TATA-box (Figure 5.12A). Addition of TRPY repressed trpB2 transcription in vitro in the presence and absence of tryptophan, but more severely in its presence than in its absence (Figure 5.12C). trpB2 transcription is therefore regulated by TRPY. In contrast to the transcription results, the EMSA results indicate that TRPY has higher affinity for the T12 DNA that contains the trpB2 regulatory region in the absence of tryptophan than in the presence of tryptophan (Figure 5.12E).

The function of TrpB2 is unknown, but TrpB2 has been proposed to play a role in indole or serine metabolism (Hettwer and Sterner, 2002; Xie et al., 2002). Unlike TrpB, TrpB2 cannot form a complex with TrpA, because several conserved amino acid residues needed to make contacts with TrpA are absent. Recombinant TrpB2 binds indole much more tightly than TrpB, so it could function as an indole scavenger or as a sensor for the presence of other microorganisms in the community that secrete indole, possibly as a quorum signal (Hettwer and Sterner, 2002). Alternatively, TrpB2 could function as a serine deaminase, as genome analyses have revealed that the presence of TrpB2 correlates almost perfectly with the absence of a primary serine deaminase (COG1760) (Xie et al., 2002).

Since only the trpB2 transcript (B2) was synthesized in vitro (Figure 5.12C), the MTH1477 promoter apparently did not function in vitro. The function of the MTH1477 gene product is unknown, and homologs are present only in a few methanogens, Archaeoglobus fulgidus and some uncultured environmental Archaea. A potential TATA-box (TATCTTTA) for MTH1477 preceded by a potential BRE (GTAAGA) is located 6 bp downstream from the trpB2 TATA-box, between the trpB2 TATA-box and the site of trpB2 initiation (Figure 5.12A). Why MTH1477 was not transcribed in vitro is not clear, but the 4th base of the putative TATA-box is C, and mutational studies have shown that the 4th and 5th bases of a TATA-box must be A or T to support transcription in vitro (Darcy, 1999). TBP may have low affinity for this TATA-box and a transcription activator may be needed to recruit TBP to this TATA-box and/or to stabilize the complex formed. This would be analogous to the mechanism of the Ptr2 activator on the rb2 promoter in M. jannaschii (Ouhammouch et al., 2003; Ouhammouch et al., 2005).

Could there be additional genes in M.t. whose expression is also regulated by TRPY? There are sequences within MTH748 (chorismate synthase; aroC), MTH804 (chorismate mutase; pheA) and MTH1640 (prephenate dehydrogenase) that resemble TRP boxes (Gelfand et al., 2000). However, these TRP boxes are separated by 5 bp rather than 4 bp and are located in protein-coding regions, rather than in upstream regulatory regions. It remains possible that TRPY will bind to these sites and regulate transcription elongation instead of initiation.
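The kind of search implied here, looking for sequences that resemble the TGTACA consensus and checking the spacing between neighboring boxes, can be sketched as follows. This is a minimal illustration and not the procedure used to identify the sites discussed above: the function names and the tolerance of one mismatch per box are assumptions made for the example, and the test sequence is the A12 fragment listed in Figure 5.8. Because TGTACA is its own reverse complement, scanning one strand is sufficient.

```python
TRP_BOX_CONSENSUS = "TGTACA"  # consensus quoted in the Figure 5.1 legend


def find_trp_boxes(seq, consensus=TRP_BOX_CONSENSUS, max_mismatches=1):
    """Return (start, site, mismatches) for every window close to the consensus.

    max_mismatches=1 is an illustrative threshold, not a value from the text.
    Positions are 0-based.
    """
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for start in range(len(seq) - k + 1):
        site = seq[start:start + k]
        mismatches = sum(a != b for a, b in zip(site, consensus))
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


def box_spacings(hits, box_length=len(TRP_BOX_CONSENSUS)):
    """Number of bp between the end of one box and the start of the next."""
    return [nxt[0] - (cur[0] + box_length) for cur, nxt in zip(hits, hits[1:])]


# A12 fragment (Figure 5.8), which spans TRP boxes 1 and 2
A12 = "CTTCACCACATTGATGTACATTATTATACAGAC"
hits = find_trp_boxes(A12)
print(hits)                # [(14, 'TGTACA', 0), (24, 'TATACA', 1)]
print(box_spacings(hits))  # [4]
```

Applied to a candidate region, a spacing of 4 would match the arrangement at the trpY/trpE and trpB2 promoters, whereas a spacing of 5 would correspond to the sites noted within MTH748, MTH804 and MTH1640.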

Comparison with trp gene regulation in other organisms

Although the structural genes for tryptophan biosynthesis are conserved in Bacteria, Archaea and Eukarya, diverse mechanisms have evolved to regulate their expression. It is thus interesting to compare the mechanism of trp gene regulation in M.t. with the mechanisms in S. cerevisiae, E. coli, and B. subtilis.

The TRP gene organization and the regulation of their transcription in yeast differ in several respects from the situation in Bacteria and Archaea. The TRP genes in yeast are scattered on four different chromosomes, and not arranged in operons. Therefore, each TRP gene has its own promoter and regulatory region, and is transcribed independently. Even in the presence of tryptophan, there is some basal level transcription of TRP genes and, consequently, enzyme synthesis (Arndt et al., 1987). When a yeast cell experiences tryptophan starvation, it activates transcription of not only the TRP genes, but also genes for the biosynthesis of arginine, histidine, isoleucine, leucine, lysine, serine and (Schurch et al., 1974). This cross-pathway transcription regulation is directed by a positive regulator, not a negative regulator as in Bacteria and Archaea. A leucine zipper protein, GCN4, binds to consensus sequences upstream of the TRP genes and activates their transcription by recruiting chromatin modifying complexes and destabilizing nucleosomes, allowing transcription initiation (Arndt and Fink, 1986; Suckow et al., 1994; Topalidou and Thireos, 2003). GCN4 is rapidly degraded by the ubiquitin pathway during conditions of non-starvation, but its half-life is increased upon amino acid starvation (Kornitzer et al., 1994; Meimoun et al., 2000). Translational control also contributes to the level of GCN4 in the cell, with four small upstream ORFs (uORF1-4) in the 5΄ leader region of the GCN4 mRNA acting as negative regulators of translation when amino acids are abundant in the cell (Hinnebusch et al., 1997).

Among the prokaryotes, the mechanism of trp gene regulation in M.t. is more similar to that in E. coli than to that in B. subtilis, as both M.t. and E. coli use a DNA-binding tryptophan-sensing protein as a regulator, whereas B. subtilis uses an RNA-binding tryptophan-sensing protein (Yanofsky, 2000). Further, both M.t. TRPY and E. coli TrpR auto-regulate their own expression, and appear to have altered DNA binding properties in the presence and absence of tryptophan (Schevitz et al., 1985; Marmorstein et al., 1987; Zhang et al., 1987; Otwinowski et al., 1988). Despite these functional similarities, M.t. TRPY and E. coli TrpR are entirely different in their primary sequences and predicted secondary structures. For example, E. coli TrpR consists completely of α helices, whereas M.t. TRPY is predicted to have both α helices and β strands. The tryptophan-binding site in E. coli TrpR has been located in the crystal structure, and although the tryptophan-binding site in M.t. TRPY is predicted to be the ACT domain, this has yet to be determined. E. coli also uses transcriptional attenuation to sense the level of tRNATrp to regulate trp gene expression, but to date, there is no experimental evidence for a post-initiation mechanism of transcription regulation for the M.t. trp operon.

In contrast to M.t. and E. coli, transcription of the trp operon in B. subtilis is regulated by the trp RNA-binding attenuation protein (TRAP) (Yanofsky, 2000). TRAP is composed of 11 identical subunits. Tryptophan-activated TRAP binds to 11 triplet repeats (7 GAG and 4 UAG) present in the trp leader transcript. TRAP binding results in transcription termination upstream of the trp gene. TRAP binding to trp transcripts that escape attenuation also sequesters the trpE Shine-Dalgarno sequence, inhibiting ribosome binding and translation initiation.


Figure 5.1. Divergent promoters direct transcription of trpY and trpEGCFBAD. (A) The diagram shows the organization (drawn to scale) of the trpY and trpEGCFBAD genes with ORF numbers as designated in the M. thermautotrophicus genome sequence database indicated (Smith et al., 1997). The locations of the tryptophan codons (W2 in MTH1654 and W175 in MTH1659) are indicated. The black bar represents transcription template T1 that directs synthesis of two transcripts, E (262 nt) and Y (169 nt). The DNA sequence of the intergenic region is shown with two arrows (labeled E and Y) indicating the start sites and directions of transcription. The shaded triangle indicates the bp deleted in M. marburgensis strain MWR1 that conferred resistance to 5MT (Gast et al., 1997). In transcription templates T4, T5, T6 and T7, this nucleotide was changed from G to A, T, C or deleted. The TATA-boxes for E and Y transcription are marked as TATAE and TATAY and their sequences are boxed. Below TATAE and TATAY are the mutated TATA-box sequences present in templates T2 and T3. The protein encoding sequences are boxed and marked as trpY and trpE. Six putative TRP boxes are identified with the nucleotides in these TRP boxes that match the consensus TGTACA marked with (*) and mismatches marked with (-). (B) In vitro transcription from T1, T2 and T3. Arrows indicate the E and Y transcripts. Lane (S) contained the size standards. (C) The top panel shows the primer extension product generated from the E transcript (lane P) adjacent to DNA sequencing ladders (lanes T, A, C, G). The TATA box and the initiation region sequence are shown with the first nucleotide (G) marked with an asterisk (*). The bottom panel shows the primer extension products generated from the Y transcript (lane P) and the DNA sequencing ladders. (D) In vitro transcripts synthesized from T1, T4, T5, T6 and T7 with arrows identifying the E and Y transcripts. Lane (S) contained the size standard. (E) DNase I footprint obtained with T1, T2 and T3 DNA incubated with (+) or without (-) TBP and TFB. Lane (S) contained the size standard. The intergenic sequence is shown above the gel with the T1 sequence and the mutated TATA boxes in T2 and T3 indicated. The DNA regions protected from DNase I digestion are indicated by shaded bars beneath the DNA sequence and adjacent to the footprint in the gel.

[Figure 5.1, panels A-E; image not reproduced. Gene map in panel A: MTH1654 (trpY), MTH1655 (trpE), MTH1656 (trpG), MTH1657 (trpC), MTH1658 (trpF), MTH1659 (trpB), MTH1660 (trpA), MTH1661 (trpD). Labeled features: Trp codons W2 and W175; template T1; transcripts E (262 nt) and Y (169 nt); MWR1 deletion site; TRP boxes 1-6; TATAE and TATAY; templates T2-T7.]

Figure 5.2. Purification and SDS-PAGE of recombinant TRPY. Recombinant TRPY was overexpressed in E. coli. The proteins in aliquots of the E. coli cell lysate, flow-through fraction, wash fractions and elution fractions from a Ni-NTA column were separated by electrophoresis through a 12% polyacrylamide gel and stained with Coomassie blue. The band corresponding to recombinant TRPY is indicated by an arrow. The lane (marker) contained protein molecular mass standards, and the 20 kDa and 50 kDa bands are identified.

[Figure 5.2; gel image not reproduced. Lanes: marker, cell lysate, flow-through, wash fractions, elution fractions. Labeled bands: 50 kDa, 20 kDa, recombinant TRPY.]

Figure 5.3. T1 DNA sequence and the effect of TRPY and amino acids on E and Y synthesis. (A) T1 DNA sequence. The transcription start sites from the trpE and trpY promoters are identified with short arrows. Primers that anneal to this DNA region are identified with long arrows. Putative TRP boxes are underlined. (B) In vitro transcription reactions contained T1 DNA and either no tryptophan and no TRPY (-), tryptophan (+Trp), TRPY (+TRPY) or TRPY plus the amino acid listed. The reaction products were separated by electrophoresis through 6% denaturing gels and visualized using a phosphorimager. The lane (M) contained size standards.

A. T1 DNA sequence (the positions of the primer labels MX75, MX77, MX56, MX64 and MX57 and of the trpY and trpE start-site arrows are not reproduced):

CATCGCTTATCTCAACGTCGTCACAGTATATTTTGCCGT
TTCTGTCTATTCTGAACCCGAGGTCTATGATCTTCCTGG
CCACATACATGCGGGAGGGGTATCCCTCAAATCTGTGC
TTTATCTGCTTCCACATACACTTCACCACATTGATGTAC
ATTATTATACAGACAAGATTTAAATAGTACCTCCGTGTA
TATGTACTTTGTGGATACAGAGGTGATATGCATGTACAG
GAGTGAATGTTTTTGGCATTACAGAACTGAAAAACCCG
GTAAAGGAAAAGATAGAATTTAAAGAACCCTTTGAACTA
TTCAAAAGTATTTATTCTGAATACGACTCATCATTCCTTC
TGGAGTCAATGGAAAGCGACACGGGTCTTGCAAGATAC
TCATTCATGGGATTCGAGCCACAGATGATAATAAGGGC
CCGTTCAGGTTTTATAGAAGTTGAATACGAGGGATCAA
GG

B. [Gel image not reproduced. Size standards (nt): 300, 250, 200, 150. Labeled transcripts: E and Y.]

Figure 5.3

Figure 5.4. Amino acid sequence of TRPY and TRPY’s effect on E and Y synthesis. (A) The amino acid sequence of TRPY is shown with predicted secondary structures (α-helices represented by zigzags and β-strands represented by EEEEE). The boxed region is the putative ACT domain and the shaded oval marks the putative tryptophan-binding site (Gelfand et al., 2000; Chipman and Shaanan, 2001). Residues that are conserved in archaeal TRPY homologs that have >35% identity to M.t. TRPY are indicated by shaded circles. (B) E and Y synthesis in vitro in the absence or presence of tryptophan (800 µM) with 0, 25, 50, 100, 150 and 200 ng of TRPY added. (C) The relative amounts of E or Y synthesized, quantitated by using a phosphorimager, are graphed with the amount synthesized in the absence of TRPY set at 100%. (●) and (○) E transcripts synthesized in the absence and presence of tryptophan, (■) and (□) Y transcripts synthesized in the absence and presence of tryptophan.

A. TRPY amino acid sequence (secondary-structure annotations not reproduced):

  1 MWKQIKHRFEGYPSRMYVARKIIDLGFR
 29 IDRNGKIYCDDVEISDVALARAVGVDRR
 57 TVRATANTILEDEKLRGIFENMMPAGAL
 85 LRDAAGELDFGVVEIEADARNPGILAAA
113 ARLIADKGISIRQAHAGDPELDETPRLT
141 IITETPIPGGLLKDFLKIDGVKRVSIY 167

B., C. [Gel image and graph not reproduced. Graph axes: Transcript (%) versus TRPY (ng).]

Figure 5.4

Figure 5.5. The effects of tryptophan and 5MT on E and Y transcription. In vitro transcription reaction mixtures contained 50 ng of T1 DNA, and when present (+), 200 ng of TRPY. The concentrations of tryptophan or 5MT are indicated above the corresponding lanes. The transcripts were separated by electrophoresis through 6% denaturing gels and visualized using a phosphorimager. The E and Y transcripts are identified. The amounts of E and Y synthesized are given below each lane, with the amount synthesized in the absence of TRPY set at 100%. ND, not determined. The lane on the extreme left contained size standards.

[Figure 5.5; gel image not reproduced. Size standards (nt): 250, 200, 150. The per-lane Trp (µM) and 5MT (µM) concentration labels could not be recovered from the figure. Values below each lane, left to right:]

TRPY              -    +    +   +   +   +   +   +   +   +   +   +    +   +   +   +   +
E synthesis (%)   100  100  93  51  26  13  10  11  6   3   1   100  93  92  94  15  11
Y synthesis (%)   100  7    6   7   4   3   1   1   1   1   1   ND   ND  ND  ND  ND  ND

Figure 5.5

Figure 5.6. EMSA with T11 DNA using TRPY in the presence or absence of amino acids. (A) The DNA sequence of T11. The transcription start sites from trpE and trpY are identified by short arrows. The primers used to PCR amplify the DNA fragment are identified with long arrows. Putative TRP boxes are underlined. (B) [32P]-labeled T11 DNA (1 ng) was incubated with TRPY (200 ng) and/or various amino acids (800 µM final concentration) plus 50 ng of poly dI-dC for 20 min at RT in 10 µl of transcription buffer. The reaction products were separated by electrophoresis through a 6% native polyacrylamide gel and visualized using a phosphorimager. Protein-free T11 DNA and the TRPY-T11 DNA complexes formed are indicated.

A. T11 DNA sequence (amplified with primers MX56 and MX64; the positions of the primer and trpY/trpE start-site arrows are not reproduced):

ACATACACTTCACCACATTGATGTACATTATTATACAGAC
AAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGGA
TACAGAGGTGATATGCATGTACAGGAGTGAATGTTTTTG
GCATTACAGAACTGAAAAACCCGG

B. [Gel image not reproduced. Labeled bands: TRPY-T11 DNA complexes, T11 DNA. Lane contents:]

TRPY               -  -  +  +  +    +    +    +    +    +
tryptophan         -  +  -  +  -    -    -    -    -    -
other amino acid   -  -  -  -  Tyr  Leu  His  Ala  Phe  Arg

Figure 5.6

Figure 5.7. EMSA with T11 and T13 DNA in the presence of TRPY and/or tryptophan. [32P]-labeled T11 and/or T13 DNA (1 ng) were/was incubated with TRPY (200 ng) and/or tryptophan (800 µM final concentration) plus 50 ng of poly dI-dC for 20 min at RT in 10 µl of transcription buffer. The reaction products were separated by electrophoresis through 6% native polyacrylamide gels and visualized using a phosphorimager.

[Figure 5.7; gel image not reproduced. Labeled bands: T11 DNA (trpE/trpY), T13 DNA (hmtB). Lane contents:]

T11 DNA      +  +  +  +  +  +  -  -  -
T13 DNA      -  -  -  +  +  +  +  +  +
TRPY         -  +  +  -  +  +  -  +  +
tryptophan   -  -  +  -  -  +  -  -  +

Figure 5.7

Figure 5.8. Determination of TRPY binding sites by EMSA. (A) The DNA molecules A12, A23, A34, A50 and A56 are indicated by lines below the DNA sequence with the 6 putative TRP boxes identified and numbered 1 through 6. A13 through A19 are derivatives of A12, and their sequences are shown. These DNA molecules (1 ng) were incubated with TRPY (200 ng) and/or tryptophan (800 µM final concentration) plus 50 ng of poly dI-dC for 20 min at RT in 10 µl of transcription buffer. The reaction products were separated by electrophoresis and visualized using a phosphorimager. (B) The sequences of B12 and its derivatives B13, B15, B16 are shown. These DNA molecules (1 ng) were incubated with TRPY (0, 25, 50, 100, 150 and 200 ng) with or without tryptophan (800 µM final concentration) plus 50 ng of poly dI-dC for 20 min at RT in 10 µl of transcription buffer. The reaction products were separated by electrophoresis and visualized using a phosphorimager.

A. Intergenic sequence, both strands (the positions of TRP boxes 1-6 and of the lines marking fragments A12, A23, A34, A50, A56 and B12 are not reproduced):

CTTCACCACATTGATGTACATTATTATACAGACAAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGGATACAGAGGTGATATGCATGTACAGGAGTGAATGTTTTTGGCATTACAGAACTGAAAAACCCGG
GAAGTGGTGTAACTACATGTAATAATATGTCTGTTCTAAATTTATCATGGAGGCACATATACATGAAACACCTATGTCTCCACTATACGTACATGTCCTCACTTACAAAAACCGTAATGTCTTGACTTTTTGGGCC

A12 CTTCACCACATTGATGTACATTATTATACAGAC
A13 CTTCACCACATTGATGTACATTATATTATACAGAC
A14 CTTCACCACATTGATGTACATTTATACAGAC
A15 CGGGGCCACATTGATGTACATTATTATACAGAC
A16 CTTCACCAGGGGGATGTACATTATTATACAGAC
A17 CTTCACCACATTGATGGGGGTTATTATACAGAC
A18 CTTCACCACATTGATGTACATCGGTATACAGAC
A19 CTTCACCACATTGATGTACATTATTAGGGGGAC

[EMSA gel images not reproduced. Lanes: A12, A23, A34, A50, A56; A12, A13, A14; A12, A15, A16, A17, A18, A19; each with TRPY and tryptophan (trp) as indicated.]

B.

B12 CTTCACCACATTGATGTACATTATTATACAGACAAGATTTAAATAGTACCTCCGTGTATATGTAC
B13 CTTCACCACATTGATGGGGGTTATTATACAGACAAGATTTAAATAGTACCTCCGTGTATATGTAC
B15 CTTCACCACATTGATGTACATTATTATACAGACAAGATTTAAATAGGGGCTCCGTGTATATGTAC
B16 CTTCACCACATTGATGTACATTATTATACAGACAAGATTTAAATAGTACCTCCGGCGGTATGTAC

[EMSA gel images not reproduced. Panels: B12, B13, B15 and B16, each with a TRPY titration in the absence (-trp) and presence (+trp) of tryptophan.]

Figure 5.8

Figure 5.9. DNase I footprinting of TRPY-DNA complexes. The sequence of the DNA region between the trpE (GTG start codon) and trpY (ATG start codon) translation start sites is shown on the right, with the putative TRP boxes indicated 1 through 6. The dashed arrows indicate transcription start sites of trpE and trpY. [32P]-labeled DNA molecules (25 ng) generated by PCR from p7557 using primers MX75 and [32P]-labeled MX64 were incubated with TRPY (0, 25, 50, 100 and 150 ng) in the absence (-trp) or presence (+trp) of tryptophan (800 µM final concentration). The complexes formed were exposed to DNase I. The reaction products were separated by electrophoresis through 6% sequencing gels and visualized using a phosphorimager. Lanes S10 and S50 contained size standards. The shaded bars represent the footprint regions and the arrows connect the hypersensitive sites to the corresponding nucleotides on the DNA sequence.

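[Figure 5.9; footprint gel image not reproduced. Lanes 1-13: TRPY titrations in the absence (-trp) and presence (+trp) of tryptophan, with size standards S10 and S50. Size scale (nt): 30 to 250. The E and Y transcription start sites are labeled.]

Figure 5.9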

Figure 5.10. In vitro transcription from templates T1, T8 and T9. (A) Sequence of the intergenic regions in T1, T8 and T9 with the TRP boxes boxed and the arrows indicating transcription start sites. T8 has mutations in TRP box 1, and T9 has mutations in TRP box 3. (B) E and Y synthesis from T1, T8 and T9 in vitro in the absence (-Trp) or presence (+Trp) of tryptophan (800 µM) with 0, 25, 50, 100, 150 and 200 ng of TRPY added. (C) The amounts of E or Y synthesized from T8 and T9 were quantitated using a phosphorimager and graphed as percentages of the amount synthesized in the absence of TRPY. (●) and (○) E transcript synthesized in the absence and presence of tryptophan, (■) and (□) Y transcript synthesized in the absence and presence of tryptophan.

A. T1 intergenic sequence, both strands:

CTTCACCACATTGATGTACATTATTATACAGACAAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGGATAC
GAAGTGGTGTAACTACATGTAATAATATGTCTGTTCTAAATTTATCATGGAGGCACATATACATGAAACACCTATG

Mutated regions, both strands:

T8 -TGATGGGGGTTA-
   -ACTACCCCCAAT-
T9 -AATAGGGGCTCC-
   -TTATCCCCGAGG-

B., C. [Gel images and graphs not reproduced. Gel panels: T1, T8 and T9, each with -Trp and +Trp titrations. Graph axes: Transcript (%) versus TRPY (ng), for T8 and T9.]

Figure 5.10

Figure 5.11. Effects of TRPY and TRPY variants G149R and A128E on transcription in vitro. In vitro transcription reaction mixtures contained T1 DNA (30 ng), 800 µM tryptophan plus 200 ng of TRPY, TRPY (G149R) or TRPY (A128E) as indicated. The reaction products were separated by electrophoresis through a 6% denaturing polyacrylamide gel and visualized using a phosphorimager. The arrows indicate the E and Y transcripts.

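[Figure 5.11; gel image not reproduced. Labeled transcripts: E and Y. Lane contents:]

TRPY variant   -  WT  WT  G149R  G149R  A128E  A128E
TRPY           -  +   +   +      +      +      +
tryptophan     -  -   +   -      +      -      +

Figure 5.11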

Figure 5.12. Regulation of trpB2 transcription by TRPY. (A) Organization of the MTH1477 (unknown conserved archaeal protein) and MTH1476 (trpB2) genes. Three tryptophan codons are present in the trpB2 gene and are indicated at their locations (W15, W59 and W137). The black bars represent templates T10 and T12. (B) Alignment of the MTH1477/trpB2 and trpY/trpE intergenic sequences. The arrows marked as Y, E and B2 indicate the transcription start sites of trpY, trpE and trpB2, respectively. Nucleotides identical in both sequences are connected by vertical lines. TRP boxes, TATA-boxes and ORFs are shaded. Between the trpB2 transcription and translation start sites is a sequence that could encode a leader peptide with its amino acid sequence shown in the box above the DNA sequence. The asterisk (*) indicates the stop codon. (C) In vitro transcription reaction mixtures that contained T10 (30 ng) were incubated with 0, 25, 50, 100, 150 and 200 ng of TRPY in the absence (▲) or presence (∆) of tryptophan (800 µM) at 58°C for 30 min. The transcripts synthesized were separated by electrophoresis through 6% denaturing polyacrylamide gels and visualized using a phosphorimager. (D) The amounts of B2 transcript synthesized from T10, quantitated using a phosphorimager, are graphed as percentages of the amount synthesized in the absence of TRPY. (E) EMSA of T12 with TRPY. [32P]-labeled T12 DNA (1 ng) was incubated with 0, 25, 50, 100, 150 and 200 ng of TRPY in the absence (-Trp) or presence (+Trp) of tryptophan (800 µM final concentration) for 20 min at RT. The complexes formed were separated by electrophoresis through 6% native polyacrylamide gels and visualized using a phosphorimager. T12 DNA, TRPY-T12 DNA complexes, and a non-specific PCR product present in the T12 DNA preparation are indicated.

[Figure 5.12, panels A-E; images not reproduced. Panel A: templates T10 and T12; MTH1477 and MTH1476 (trpB2); transcript B2 (201 nt); Trp codons W15, W59 and W137. Panel B: aligned MTH1477/trpB2 and trpY/trpE intergenic sequences, with the putative leader peptide MMISSVGTSSSHEVI*. Panels C and D: gel and graph of B2 transcript (%) in the absence (-Trp) and presence (+Trp) of tryptophan. Panel E: EMSA with -Trp and +Trp titrations; labeled bands: T12-TRPY complex, non-specific PCR product, T12.]

Figure 5.12

CHAPTER 6

SUMMARY AND FUTURE DIRECTIONS

Archaea are more closely related to Eukarya than to Bacteria (Woese et al., 1990), with the homologies most obvious in the components of the basal transcription machinery and in the presence of histones (Reeve, 2003). Archaeal histones, however, do not have N-terminal tails and are not subject to post-translational modifications. Archaea do not have homologs of the multisubunit complexes in Eukarya that remodel chromatin and activate transcription, but rather have bacterial-like transcription regulators that must therefore interact with the eukaryal-like basal transcription machinery.

I have used the M.t. in vitro transcription system previously established by Trevor Darcy (Darcy et al., 1999) to investigate events in archaeal transcription initiation, elongation, and gene-specific regulation. The results obtained have established that the M.t. RNAP is able to transcribe through one or two archaeal nucleosomes in vitro, without the aid of additional transcriptional factors, albeit at a slower rate (Chapter 2). Archaeal TBP remains associated with the promoter, but archaeal TFB is released from the promoter after transcription initiation (Chapter 3). Several additional M.t. promoters support transcription initiation in vitro (Chapter 4). Transcription of the trpEGCFBAD operon is regulated by a tryptophan-sensing repressor that also auto-regulates the transcription of its own gene (Chapter 5). Some thoughts and future directions that stem from these results are presented below.

Archaeal RNAP-nucleosome interactions

Although the archaeal RNAP was able to transcribe through one or two nucleosomes in vitro without the help of any accessory proteins, the rate of elongation was slow (~3 nt/s) and the RNAP paused at several sites along the nucleosomal template (Chapter 2). Mechanisms must therefore exist in vivo that increase the rate of transcript elongation through archaeal nucleosomes. In Eukarya and Bacteria, many elongation factors have been identified that prevent RNAP pausing, arrest and termination (Shilatifard, 2004; Borukhov et al., 2005). In Eukarya, many factors also exist that facilitate transcript elongation through nucleosomes (Sims et al., 2004; Svejstrup, 2004).

As future work, I would propose that two potential archaeal elongation factors should be investigated: TFS, a homolog of eukaryal TFIIS, and MTH817, a homolog of Elp3, which is a subunit of the Elongator complex in eukaryotes. TFIIS promotes RNAPII readthrough of pause and arrest sites by inducing a transcript cleavage activity that is intrinsic to RNAPII (Fish and Kane, 2002). TFIIS is inserted into the RNAPII secondary channel and reaches the active site (Kettenberger et al., 2003). This is similar to GreA/GreB and DksA in Bacteria, which are also located inside RNAP secondary channels (Laptenko et al., 2003; Opalka et al., 2003; Sosunova et al., 2003; Perederina et al., 2004). TFIIS also promotes RNAP readthrough of roadblocks, such as nucleosomes (Izban and Luse, 1992), sequence-specific DNA-binding proteins (Reines and Mote, 1993) and DNA-binding drugs (Mote et al., 1994). Archaeal TFS has been shown to promote transcript cleavage and increase transcription fidelity (Hausner et al., 2000; Lange and Hausner, 2004), but it has not been tested for an activity in enhancing transcription through nucleosomal templates. When the M.t. RNAP encountered an archaeal nucleosome, there was a prominent pause (Chapter 2). The pause suggests that the 3΄-end of the nascent RNA was probably backtracked into the secondary channel, and TFS may help cleave the RNA and create a new 3΄-end, giving the RNAP the opportunity to invade the nucleosome. TFS should now be tested for its ability to facilitate M.t. RNAP transcription through nucleosomes in the in vitro transcription system.

The eukaryal Elongator complex stimulates transcript elongation on nucleosomal templates in vitro. It is recruited to elongating RNAPII by the CTD and has histone acetyltransferase activity in vitro, but the detailed molecular mechanism remains obscure (Otero et al., 1999; Wittschieben et al., 1999; Kim et al., 2002; Petrakis et al., 2004). An archaeal homolog of Elp3, a subunit of the Elongator complex, is encoded in many archaeal genomes. It is possible that archaeal Elp3 interacts with archaeal RNAP to promote elongation through archaeal nucleosomes. Such a hypothesis is very tenuous, as the archaeal RNAP lacks the CTD and the archaeal histones are not acetylated. Nevertheless, the hypothesis can be tested easily in vitro using the M.t. in vitro transcription system, and for this purpose, I have already generated a recombinant version of the M.t. Elp3.

In addition to this candidate gene approach, M.t. cell lysate should be fractionated and assayed for factors that might accelerate archaeal RNAP elongation on nucleosomal templates. It is possible that Archaea have unique transcription elongation factors that have no homology to any bacterial or eukaryal proteins, and assays of cell fractions will provide a relatively unbiased approach to discover these novel elongation factors. In Eukarya, many elongation factors, including FACT, Elongins and the Elongator complex, were identified by this biochemical fractionation and assay approach (Sims et al., 2004).

Another question raised by experiments described in Chapter 2 is the fate of the archaeal nucleosome after archaeal RNAP passage. Three alternatives can be proposed: 1) an archaeal RNAP transcribes through a nucleosome using a DNA looping mechanism as described for RNAPIII and this transfers the intact nucleosome to an upstream promoter-proximal location (Studitsky et al., 1997); 2) an archaeal RNAP transcribes through a nucleosome by transiently disrupting local DNA-histone contacts at 10 or 11 bp intervals and the nucleosome remains at the same location after the passage of RNAP; 3) an archaeal RNAP removes the archaeal nucleosome completely from the DNA and the archaeal nucleosome reassembles spontaneously on the DNA at the same location after the passage of RNAP. Experiments using MN digestion followed by primer extension or DNase I footprinting can be used to determine if the nucleosome location has changed after the passage of RNAP, consistent with alternative 1, or not, consistent with alternatives 2 and 3. In alternative 2, the histones are never fully dissociated from DNA, whereas in alternative 3, the histones are temporarily dissociated from DNA. To distinguish between these two alternatives, nucleosome templates could be assembled using [125I]-labeled histones (Bailey et al., 1999) and transcript elongation allowed in the presence of unlabeled competitor histones. If [125I]-labeled histones remain associated with the transcription template after the passage of RNAP, alternative 2 would be favored. If [125I]-labeled histones are replaced by unlabeled histones, alternative 3 would be favored. The experiments should also establish the extent of spontaneous histone exchange in the absence of RNAP elongation, and the spontaneous exchange should be subtracted to obtain the extent of histone exchange caused by RNAP passage. Alternatively, [125I]-labeled histones can be assembled with biotin-labeled DNA templates immobilized on streptavidin-coated beads, and whether the [125I]-labeled histones remain immobilized or are released into solution after the passage of M.t. RNAP can be assayed to determine the fate of the archaeal histone.
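The background correction proposed above can be written out as a small calculation. The sketch below is only an illustration: the function name and the radioactivity counts are hypothetical, and it assumes that template-bound [125I] signal is measured before the reaction, after a mock incubation without elongation, and after transcription.

```python
def rnap_dependent_exchange(bound_before: float,
                            bound_after_transcription: float,
                            bound_after_mock: float) -> float:
    """Fraction of labeled histone lost from the template because of RNAP passage.

    Total exchange (with elongation) minus spontaneous exchange (mock
    incubation without elongation), each expressed as a fraction of the
    label initially bound to the template.
    """
    if bound_before <= 0:
        raise ValueError("bound_before must be positive")
    total = 1 - bound_after_transcription / bound_before
    spontaneous = 1 - bound_after_mock / bound_before
    return total - spontaneous


# Hypothetical counts, for illustration only
net = rnap_dependent_exchange(bound_before=10_000,
                              bound_after_transcription=8_500,
                              bound_after_mock=9_000)
print(f"RNAP-dependent exchange: {net:.0%}")
```

In this scheme a net value near zero would favor alternative 2, whereas a substantial net loss of label would favor alternative 3; what counts as substantial would have to be judged against the experimental error.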

Transcription elongation experiments can also be repeated with wild-type HMfB or HMfB variants known to have higher or lower affinities for DNA (Soares et al., 2000;

Soares et al., 2003). These experiments should reveal specifically which histone residue-

DNA interactions are important in causing archaeal RNAP pausing.

Search for an archaeal transcription terminating factor

The lack of transcription termination under the in vitro conditions used (Chapter 3) raises the question of how transcription is terminated in Archaea. Several non-exclusive alternatives seem likely. 1) Archaeal RNAP terminates on encountering a DNA sequence that functions as an intrinsic terminator. This has been demonstrated downstream of the M. thermolithotrophicus tRNAVal gene, where the sequence TTTTAATTTT terminates

transcription in vitro (Thomm et al., 1996). An intrinsic terminator has also been

identified downstream of the M.t. mcr operon that encodes methyl coenzyme M reductase

(Thomas Santangelo, personal communication). 2) A dedicated termination factor

promotes termination by archaeal RNAP, analogous to the Rho factor in Bacteria

(Richardson, 1990; Richardson, 2002). As archaeal genomes do not encode any Rho

homologs, the archaeal termination factor must be a novel protein. However, the E. coli

Rho factor does cause M.t. RNAP to terminate transcription in vitro (Thomas Santangelo,

personal communication). 3) Termination is coupled to 3′-RNA cleavage, analogous

to the exonuclease-based mechanism in the RNAPII system (Kim et al., 2004; Teixeira et

al., 2004; West et al., 2004). 4) Termination results from RNAP interacting with a

sequence-specific DNA binding protein, analogous to the mechanism in the RNAPI

system (Bartsch et al., 1988).

To identify archaeal transcription termination factor(s), the M.t. in vitro

transcription system can be used to assay candidate terminator sequences, factors or

fractions from M.t. cell lysate. It will be necessary to distinguish transcription termination

from transcription pause or arrest. Such experiments are currently ongoing (Thomas

Santangelo, personal communication).

Structural studies of M.t. TRPY

TRPY is a sequence-specific DNA-binding protein that senses tryptophan and

represses trp operon expression (Chapter 5). Although the function of this protein has been determined, many questions regarding its mechanism remain to be answered.

Specifically, does TRPY bind tryptophan directly? If so, where is the tryptophan binding site in TRPY? What is the DNA-binding domain of TRPY? How does it recognize the

TRP box sequence? What is the structure of the TRPY-DNA complex?

Mutational studies of TRPY are already identifying amino acid residues important for DNA binding and tryptophan sensing (Chapter 5; Lubomira Cubonova and Kathleen

Sandman, personal communications), but a high-resolution X-ray crystal structure should

reveal much more about the detailed structure of this protein. Structure determination of a protein is not a trivial endeavor, but I think in the case of TRPY, such an effort is justified. First, TRPY is one of the very few archaeal transcription regulators whose physiological function is known. Previous studies have focused on archaeal Lrp proteins, but the regulatory targets of many of these proteins are unknown (Napoli et al., 1999;

Bell and Jackson, 2000; Brinkman et al., 2000; Enoru-Eta et al., 2000; Brinkman et al.,

2002; Dahlke and Thomm, 2002; Enoru-Eta et al., 2002). Second, given the very large amount of structural and functional information known for E. coli TrpR (Schevitz et al.,

1985; Marmorstein et al., 1987; Zhang et al., 1987; Otwinowski et al., 1988), M.t. TRPY is a very attractive protein for comparative studies. Third, M.t. is a thermophilic organism, and proteins from thermophilic organisms are usually more amenable to crystallization (Edwards et al., 2000). In fact, M.t. has been selected as a model organism for a high-throughput structural genomics investigation (Christendat et al., 2000), and there are well-established procedures for the crystallization of M.t. proteins. So there is a good chance that crystallization of M.t. TRPY will succeed.

M.t. TRPY structures should be established in the presence and absence of tryptophan, and then crystallization of TRPY-DNA complexes in the presence or absence of tryptophan should be attempted. M.t. TRPY has the potential to become a model system for investigating the interactions and interplay of archaeal gene-specific regulators with the archaeal basal transcription machinery.

LIST OF REFERENCES:

Adam, M., Robert, F., Larochelle, M., and Gaudreau, L. (2001). H2A.Z is required for global chromatin integrity and for recruitment of RNA polymerase II under specific conditions. Mol Cell Biol 21, 6270-6279.

Adelman, K., La Porta, A., Santangelo, T. J., Lis, J. T., Roberts, J. W., and Wang, M. D. (2002). Single molecule analysis of RNA polymerase elongation reveals uniform kinetic behavior. Proc Natl Acad Sci USA 99, 13538-13543.

Ahmad, K., and Henikoff, S. (2002a). Histone H3 variants specify modes of chromatin assembly. Proc Natl Acad Sci USA 99, 16477-16484.

Ahmad, K., and Henikoff, S. (2002b). The histone variant H3.3 marks active chromatin by replication-independent nucleosome assembly. Mol Cell 9, 1191-1200.

Akey, C. W., and Luger, K. (2003). Histone chaperones and nucleosome assembly. Curr Opin Struct Biol 13, 6-14.

Akoulitchev, S., Makela, T. P., Weinberg, R. A., and Reinberg, D. (1995). Requirement for TFIIH kinase activity in transcription by RNA polymerase II. Nature 377, 557-560.

Allers, T., and Mevarech, M. (2005). Archaeal genetics - the third way. Nat Rev Genet 6, 58-73.

Alonso, A., Mahmood, R., Li, S., Cheung, F., Yoda, K., and Warburton, P. E. (2003). Genomic microarray analysis reveals distinct locations for the CENP-A binding domains in three human chromosome 13q32 neocentromeres. Hum Mol Genet 12, 2711-2721.

Al-Rabiee, R., Zhang, Y., and Grant, G. A. (1996). The mechanism of velocity modulated allosteric regulation in D-3-phosphoglycerate dehydrogenase. Site-directed mutagenesis of effector binding site residues. J Biol Chem 271, 23235-23238.

Andrews, A. E., Dickson, B., Lawley, B., Cobbett, C., and Pittard, A. J. (1991). Importance of the position of TyrR boxes for repression and activation of the tyrP and aroF genes in Escherichia coli. J Bacteriol 173, 5079-5085.

Aravind, L., Iyer, L. M., and Anantharaman, V. (2003). The two faces of Alba: the evolutionary connection between proteins participating in chromatin structure and RNA metabolism. Genome Biol 4, R64.

Aravind, L., and Koonin, E. V. (1999). DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res 27, 4658-4670.

Arents, G., Burlingame, R. W., Wang, B. C., Love, W. E., and Moudrianakis, E. N. (1991). The nucleosomal core histone octamer at 3.1 A resolution: a tripartite protein assembly and a left-handed superhelix. Proc Natl Acad Sci USA 88, 10148-10152.

Arents, G., and Moudrianakis, E. N. (1993). Topography of the histone octamer surface: repeating structural motifs utilized in the docking of nucleosomal DNA. Proc Natl Acad Sci USA 90, 10489-10493.

Arents, G., and Moudrianakis, E. N. (1995). The histone fold: a ubiquitous architectural motif utilized in DNA compaction and protein dimerization. Proc Natl Acad Sci USA 92, 11170-11174.

Arndt, K., and Fink, G. R. (1986). GCN4 protein, a positive transcription factor in yeast, binds general control promoters at all 5' TGACTC 3' sequences. Proc Natl Acad Sci USA 83, 8516-8520.

Arndt, K. M., and Chamberlin, M. J. (1990). RNA chain elongation by Escherichia coli RNA polymerase. Factors affecting the stability of elongating ternary complexes. J Mol Biol 213, 79-108.

Arndt, K. T., Styles, C., and Fink, G. R. (1987). Multiple global regulators control HIS4 transcription in yeast. Science 237, 874-880.

Awrey, D. E., Weilbaecher, R. G., Hemming, S. A., Orlicky, S. M., Kane, C. M., and Edwards, A. M. (1997). Transcription elongation through DNA arrest sites. A multistep process involving both RNA polymerase II subunit RPB9 and TFIIS. J Biol Chem 272, 14747-14754.

Bailey, K. A. (2000). DNA interactions of the archaeal histone HMf from the hyperthermophilic methanogen Methanothermus fervidus, Ph.D. Dissertation, Ohio State University, Columbus, Ohio.

Bailey, K. A., Chow, C. S., and Reeve, J. N. (1999). Histone stoichiometry and DNA circularization in archaeal nucleosomes. Nucleic Acids Res 27, 532-536.

Bailey, K. A., Marc, F., Sandman, K., and Reeve, J. N. (2002). Both DNA and histone fold sequences contribute to archaeal nucleosome stability. J Biol Chem 277, 9293-9301.

Bailey, K. A., Pereira, S. L., Widom, J., and Reeve, J. N. (2000). Archaeal histone selection of nucleosome positioning sequences and the procaryotic origin of histone-dependent genome evolution. J Mol Biol 303, 25-34.

Baliga, N. S., Bonneau, R., Facciotti, M. T., Pan, M., Glusman, G., Deutsch, E. W., Shannon, P., Chiu, Y., Weng, R. S., Gan, R. R., et al. (2004). Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea. Genome Res 14, 2221-2234.

Baliga, N. S., Goo, Y. A., Ng, W. V., Hood, L., Daniels, C. J., and DasSarma, S. (2000). Is gene expression in Halobacterium NRC-1 regulated by multiple TBP and TFB transcription factors? Mol Microbiol 36, 1184-1185.

Bartlett, M. S., Thomm, M., and Geiduschek, E. P. (2004). Topography of the euryarchaeal transcription initiation complex. J Biol Chem 279, 5894-5903.

Bartsch, I., Schoneberg, C., and Grummt, I. (1988). Purification and characterization of TTFI, a factor that mediates termination of mouse ribosomal DNA transcription. Mol Cell Biol 8, 3891-3897.

Bednar, J., Studitsky, V. M., Grigoryev, S. A., Felsenfeld, G., and Woodcock, C. L. (1999). The nature of the nucleosomal barrier to transcription: direct observation of paused intermediates by electron cryomicroscopy. Mol Cell 4, 377-386.

Bell, S. D., Botting, C. H., Wardleworth, B. N., Jackson, S. P., and White, M. F. (2002). The interaction of Alba, a conserved archaeal chromatin protein, with Sir2 and its regulation by acetylation. Science 296, 148-151.

Bell, S. D., Brinkman, A. B., van der Oost, J., and Jackson, S. P. (2001a). The archaeal TFIIEα homologue facilitates transcription initiation by enhancing TATA-box recognition. EMBO Rep 2, 133-138.

Bell, S. D., Cairns, S. S., Robson, R. L., and Jackson, S. P. (1999a). Transcriptional regulation of an archaeal operon in vivo and in vitro. Mol Cell 4, 971-982.

Bell, S. D., and Jackson, S. P. (2000). Mechanism of autoregulation by an archaeal transcriptional repressor. J Biol Chem 275, 31624-31629.

Bell, S. D., Kosa, P. L., Sigler, P. B., and Jackson, S. P. (1999b). Orientation of the transcription preinitiation complex in Archaea. Proc Natl Acad Sci USA 96, 13662-13667.

Bell, S. D., Magill, C. P., and Jackson, S. P. (2001b). Basal and regulated transcription in Archaea. Biochem Soc Trans 29, 392-395.

Belotserkovskaya, R., Oh, S., Bondarenko, V. A., Orphanides, G., Studitsky, V. M., and Reinberg, D. (2003). FACT facilitates transcription-dependent nucleosome alteration. Science 301, 1090-1093.

Bentley, R. (1990). The shikimate pathway--a metabolic tree with many branches. Crit Rev Biochem Mol Biol 25, 307-384.

Boehm, A. K., Saunders, A., Werner, J., and Lis, J. T. (2003). Transcription factor and polymerase recruitment, modification, and movement on dhsp70 in vivo in the minutes following heat shock. Mol Cell Biol 23, 7628-7637.

Borukhov, S., Lee, J., and Laptenko, O. (2005). Bacterial transcription elongation factors: new insights into molecular mechanism of action. Mol Microbiol 55, 1315-1324.

Brendel, C., Gelman, L., and Auwerx, J. (2002). Multiprotein bridging factor-1 (MBF-1) is a cofactor for nuclear receptors that regulate lipid metabolism. Mol Endocrinol 16, 1367-1377.

Brinkman, A. B., Bell, S. D., Lebbink, R. J., de Vos, W. M., and van der Oost, J. (2002). The Sulfolobus solfataricus Lrp-like protein LysM regulates lysine biosynthesis in response to lysine availability. J Biol Chem 277, 29537-29549.

Brinkman, A. B., Dahlke, I., Tuininga, J. E., Lammers, T., Dumay, V., de Heus, E., Lebbink, J. H., Thomm, M., de Vos, W. M., and van der Oost, J. (2000). An Lrp-like transcriptional regulator from the archaeon Pyrococcus furiosus is negatively autoregulated. J Biol Chem 275, 38160-38169.

Buratowski, S. (1994). The basics of basal transcription by RNA polymerase II. Cell 77, 1-3.

Buratowski, S. (2000). Snapshots of RNA polymerase II transcription initiation. Curr Opin Cell Biol 12, 320-325.

Bushnell, D. A., Cramer, P., and Kornberg, R. D. (2002). Structural basis of transcription: alpha-amanitin-RNA polymerase II cocrystal at 2.8 A resolution. Proc Natl Acad Sci USA 99, 1218-1222.

Bushnell, D. A., and Kornberg, R. D. (2003). Complete, 12-subunit RNA polymerase II at 4.1-A resolution: implications for the initiation of transcription. Proc Natl Acad Sci USA 100, 6969-6973.

Bushnell, D. A., Westover, K. D., Davis, R. E., and Kornberg, R. D. (2004). Structural basis of transcription: an RNA polymerase II-TFIIB cocrystal at 4.5 Angstroms. Science 303, 983-988.

Cameron, I. L., Pavlat, W. A., and Jeter, J. R., Jr. (1979). Chromatin substructure: an electron microscopic study of thin-sectioned chromatin subjected to sequential protein extraction and water swelling procedures. Anat Rec 194, 547-562.

Cang, Y., and Prelich, G. (2002). Direct stimulation of transcription by negative cofactor 2 (NC2) through TATA-binding protein (TBP). Proc Natl Acad Sci USA 99, 12727-12732.

Chang, C. H., and Luse, D. S. (1997). The H3/H4 tetramer blocks transcript elongation by RNA polymerase II in vitro. J Biol Chem 272, 23427-23434.

Chen, D., Hinkley, C. S., Henry, R. W., and Huang, S. (2002). TBP dynamics in living human cells: constitutive association of TBP with mitotic chromosomes. Mol Biol Cell 13, 276-284.

Chen, H. T., and Hahn, S. (2003). Binding of TFIIB to RNA polymerase II: Mapping the binding site for the TFIIB zinc ribbon domain within the preinitiation complex. Mol Cell 12, 437-447.

Chen, H. T., and Hahn, S. (2004). Mapping the location of TFIIB within the RNA polymerase II transcription preinitiation complex: a model for the structure of the PIC. Cell 119, 169-180.

Chipman, D. M., and Shaanan, B. (2001). The ACT domain family. Curr Opin Struct Biol 11, 694-700.

Chirinos, M., Hernandez, F., and Palacian, E. (1999). Transcription of DNA templates associated with histone (H3+H4)2 tetramers. Arch Biochem Biophys 370, 222-230.

Christendat, D., Yee, A., Dharamsi, A., Kluger, Y., Savchenko, A., Cort, J. R., Booth, V., Mackereth, C. D., Saridakis, V., Ekiel, I., et al. (2000). Structural proteomics of an archaeon. Nat Struct Biol 7, 903-909.

Conaway, J. W., and Conaway, R. C. (1999). Transcription elongation and human disease. Annu Rev Biochem 68, 301-319.

Conaway, J. W., Shilatifard, A., Dvir, A., and Conaway, R. C. (2000). Control of elongation by RNA polymerase II. Trends Biochem Sci 25, 375-380.

Cramer, P. (2002). Multisubunit RNA polymerases. Curr Opin Struct Biol 12, 89-97.

Cramer, P. (2004). Structure and function of RNA polymerase II. Adv Protein Chem 67, 1-42.

Cramer, P., Bushnell, D. A., and Kornberg, R. D. (2001). Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution. Science 292, 1863-1876.

Cremer, T., Kupper, K., Dietzel, S., and Fakan, S. (2004). Higher order chromatin architecture in the cell nucleus: on the way from structure to function. Biol Cell 96, 555-567.

Cubonova, L., Sandman, K., Hallam, S., DeLong, E. F., and Reeve, J. N. (2005). Histones in Crenarchaea. J Bacteriol 187, 1877-1882.

Dahlke, I., and Thomm, M. (2002). A Pyrococcus homolog of the leucine-responsive regulatory protein, LrpA, inhibits transcription by abrogating RNA polymerase recruitment. Nucleic Acids Res 30, 701-710.

Dame, R., and Goosen, N. (2002). HU: promoting or counteracting DNA compaction? FEBS Lett 529, 151-156.

Darcy, T. J. (1999). In vitro analysis of transcription from the thermophilic archaeon Methanobacterium thermoautotrophicum, Ph.D. Dissertation, Ohio State University, Columbus, Ohio.

Darcy, T. J., Hausner, W., Awery, D. E., Edwards, A. M., Thomm, M., and Reeve, J. N. (1999). Methanobacterium thermoautotrophicum RNA polymerase and transcription in vitro. J Bacteriol 181, 4424-4429.

Darst, R. P., Wang, D., and Auble, D. T. (2001). MOT1-catalyzed TBP-DNA disruption: uncoupling DNA conformational change and role of upstream DNA. EMBO J 20, 2028-2040.

Dasgupta, A., Darst, R. P., Martin, K. J., Afshari, C. A., and Auble, D. T. (2002). Mot1 activates and represses transcription by direct, ATPase-dependent mechanisms. Proc Natl Acad Sci USA 99, 2666-2671.

Decanniere, K., Babu, A. M., Sandman, K., Reeve, J. N., and Heinemann, U. (2000). Crystal structures of recombinant histones HMfA and HMfB from the hyperthermophilic archaeon Methanothermus fervidus. J Mol Biol 303, 35-47.

DeDecker, B. S., O'Brien, R., Fleming, P. J., Geiger, J. H., Jackson, S. P., and Sigler, P. B. (1996). The crystal structure of a hyperthermophilic archaeal TATA-box binding protein. J Mol Biol 264, 1072-1084.

Dellino, G. I., Schwartz, Y. B., Farkas, G., McCabe, D., Elgin, S. C., and Pirrotta, V. (2004). Polycomb silencing blocks transcription initiation. Mol Cell 13, 887-893.

Deppenmeier, U., Johann, A., Hartsch, T., Merkl, R., Schmitz, R. A., Martinez-Arias, R., Henne, A., Wiezer, A., Baumer, S., Jacobi, C., et al. (2002). The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea. J Mol Microbiol Biotechnol 4, 453-461.

Dinger, M. E., Baillie, G. J., and Musgrave, D. R. (2000). Growth phase-dependent expression and degradation of histones in the thermophilic archaeon Thermococcus zilligii. Mol Microbiol 36, 876-885.

Dixon, M. P., Pau, R. N., Howlett, G. J., Dunstan, D. E., Sawyer, W. H., and Davidson, B. E. (2002). The central domain of Escherichia coli TyrR is responsible for hexamerization associated with tyrosine-mediated repression of gene expression. J Biol Chem 277, 23186-23192.

Dong, F., Hansen, J. C., and van Holde, K. E. (1990). DNA and protein determinants of nucleosome positioning on sea urchin 5S rRNA gene sequences in vitro. Proc Natl Acad Sci USA 87, 5724-5728.

Dunn, J. J., and Studier, F. W. (1983). Complete nucleotide sequence of T7 DNA and the locations of T7 genetic elements. J Mol Biol 166, 477-535.

Edmondson, S. P., Kahsai, M. A., Gupta, R., and Shriver, J. W. (2004). Characterization of Sac10a, a hyperthermophile DNA-binding protein from Sulfolobus acidocaldarius. Biochemistry 43, 13026-13036.

Edwards, A. M., Arrowsmith, C. H., Christendat, D., Dharamsi, A., Friesen, J. D., Greenblatt, J. F., and Vedadi, M. (2000). Protein production: feeding the crystallographers and NMR spectroscopists. Nat Struct Biol 7, 970-972.

Edwards, A. M., Kane, C. M., Young, R. A., and Kornberg, R. D. (1991). Two dissociable subunits of yeast RNA polymerase II stimulate the initiation of transcription at a promoter in vitro. J Biol Chem 266, 71-75.

Elmendorf, B. J., Shilatifard, A., Yan, Q., Conaway, J. W., and Conaway, R. C. (2001). Transcription factors TFIIF, ELL, and Elongin negatively regulate SII-induced nascent transcript cleavage by non-arrested RNA polymerase II elongation intermediates. J Biol Chem 276, 23109-23114.

Enoru-Eta, J., Gigot, D., Glansdorff, N., and Charlier, D. (2002). High resolution contact probing of the Lrp-like DNA-binding protein Ss-Lrp from the hyperthermoacidophilic crenarchaeote Sulfolobus solfataricus P2. Mol Microbiol 45, 1541-1555.

Enoru-Eta, J., Gigot, D., Thia-Toong, T. L., Glansdorff, N., and Charlier, D. (2000). Purification and characterization of Sa-lrp, a DNA-binding protein from the extreme thermoacidophilic archaeon Sulfolobus acidocaldarius homologous to the bacterial global transcriptional regulator Lrp. J Bacteriol 182, 3661-3672.

Fahrner, R. L., Cascio, D., Lake, J. A., and Slesarev, A. (2001). An ancestral nuclear protein assembly: crystal structure of the Methanopyrus kandleri histone. Protein Sci 10, 2002-2007.

Fan, J. Y., Gordon, F., Luger, K., Hansen, J. C., and Tremethick, D. J. (2002). The essential histone variant H2A.Z regulates the equilibrium between different chromatin conformational states. Nat Struct Biol 9, 172-176.

Feng, Q., Wang, H., Ng, H. H., Erdjument-Bromage, H., Tempst, P., Struhl, K., and Zhang, Y. (2002). Methylation of H3-lysine 79 is mediated by a new family of HMTases without a SET domain. Curr Biol 12, 1052-1058.

Fish, R. N., and Kane, C. M. (2002). Promoting elongation with transcript cleavage stimulatory factors. Biochim Biophys Acta 1577, 287-307.

Flaus, A., and Owen-Hughes, T. (2004). Mechanisms for ATP-dependent chromatin remodelling: farewell to the tuna-can octamer? Curr Opin Genet Dev 14, 165-173.

Forbes, A. J., Patrie, S. M., Taylor, G. K., Kim, Y. B., Jiang, L., and Kelleher, N. L. (2004). Targeted analysis and discovery of posttranslational modifications in proteins from methanogenic archaea by top-down MS. Proc Natl Acad Sci USA 101, 2678-2683.

Galagan, J. E., Nusbaum, C., Roy, A., Endrizzi, M. G., Macdonald, P., FitzHugh, W., Calvo, S., Engels, R., Smirnov, S., Atnoor, D., et al. (2002). The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res 12, 532-542.

Gast, D. A., Jenal, U., Wasserfallen, A., and Leisinger, T. (1994). Regulation of tryptophan biosynthesis in Methanobacterium thermoautotrophicum Marburg. J Bacteriol 176, 4590-4596.

Gast, D. A., Wasserfallen, A., Pfister, P., Ragettli, S., and Leisinger, T. (1997). Characterization of Methanobacterium thermoautotrophicum Marburg mutants defective in regulation of L-tryptophan biosynthesis. J Bacteriol 179, 3664-3669.

Gelfand, M. S., Koonin, E. V., and Mironov, A. A. (2000). Prediction of transcription regulatory sites in Archaea by a comparative genomic approach. Nucleic Acids Res 28, 695-705.

Gnatt, A. (2002). Elongation by RNA polymerase II: structure-function relationship. Biochim Biophys Acta 1577, 175-190.

Gnatt, A. L., Cramer, P., Fu, J., Bushnell, D. A., and Kornberg, R. D. (2001). Structural basis of transcription: an RNA polymerase II elongation complex at 3.3 A resolution. Science 292, 1876-1882.

Graham, D. E., Kyrpides, N., Anderson, I. J., Overbeek, R., and Whitman, W. B. (2001). Genome of Methanocaldococcus jannaschii. Methods Enzymol 330, 40-123.

Grant, G. A., Schuller, D. J., and Banaszak, L. J. (1996). A model for the regulation of D-3-phosphoglycerate dehydrogenase, a Vmax-type allosteric enzyme. Protein Sci 5, 34-41.

Grayling, R. A., Bailey, K. A., and Reeve, J. N. (1997). DNA binding and nuclease protection by the HMf histones from the hyperthermophilic archaeon Methanothermus fervidus. Extremophiles 1, 79-88.

Gunsalus, R. P., Miguel, A. G., and Gunsalus, G. L. (1986). Intracellular Trp repressor levels in Escherichia coli. J Bacteriol 167, 272-278.

Hansen, J. C., Ausio, J., Stanik, V. H., and van Holde, K. E. (1989). Homogeneous reconstituted oligonucleosomes, evidence for salt-dependent folding in the absence of histone H1. Biochemistry 28, 9129-9136.

Hansen, J. C., and Wolffe, A. P. (1992). Influence of chromatin folding on transcription initiation and elongation by RNA polymerase III. Biochemistry 31, 7977-7988.

Hansen, S. K., Takada, S., Jacobson, R. H., Lis, J. T., and Tjian, R. (1997). Transcription properties of a cell type-specific TATA-binding protein, TRF. Cell 91, 71-83.

Hanzelka, B. L., Darcy, T. J., and Reeve, J. N. (2001). TFE, an archaeal transcription factor in Methanobacterium thermoautotrophicum related to eucaryal transcription factor TFIIEalpha. J Bacteriol 183, 1813-1818.

Hausner, W., Frey, G., and Thomm, M. (1991). Control regions of an archaeal gene. A TATA box and an initiator element promote cell-free transcription of the tRNA(Val) gene of Methanococcus vannielii. J Mol Biol 222, 495-508.

Hausner, W., Lange, U., and Musfeldt, M. (2000). Transcription factor S, a cleavage induction factor of the archaeal RNA polymerase. J Biol Chem 275, 12393-12399.

Hausner, W., and Thomm, M. (1995). The translation product of the presumptive Thermococcus celer TATA-binding protein sequence is a transcription factor related in structure and function to Methanococcus transcription factor B. J Biol Chem 270, 17649-17651.

Hausner, W., and Thomm, M. (2001). Events during initiation of archaeal transcription: open complex formation and DNA-protein interactions. J Bacteriol 183, 3025-3031.

Hayes, J. J., and Wolffe, A. P. (1992). The interaction of transcription factors with nucleosomal DNA. Bioessays 14, 597-603.

Heinicke, I., Muller, I., Pittekow, M., and Klein, A. (2004). Mutational analysis of genes encoding chromatin proteins in the archaeon Methanococcus voltae indicates their involvement in the regulation of gene expression. Mol Genet Genomics 272, 76-87.

Hendrickson, E. L., Kaul, R., Zhou, Y., Bovee, D., Chapman, P., Chung, J., Conway de Macario, E., Dodsworth, J. A., Gillett, W., Graham, D. E., et al. (2004). Complete genome sequence of the genetically tractable hydrogenotrophic methanogen Methanococcus maripaludis. J Bacteriol 186, 6956-6969.

Henikoff, S., Ahmad, K., Platero, J. S., and van Steensel, B. (2000). Heterochromatic deposition of centromeric histone H3-like proteins. Proc Natl Acad Sci USA 97, 716-721.

Henikoff, S., Furuyama, T., and Ahmad, K. (2004). Histone variants, nucleosome assembly and epigenetic inheritance. Trends Genet 20, 320-326.

Hethke, C., Bergerat, A., Hausner, W., Forterre, P., and Thomm, M. (1999). Cell-free transcription at 95 degrees: thermostability of transcriptional components and DNA topology requirements of Pyrococcus transcription. Genetics 152, 1325-1333.

Hettwer, S., and Sterner, R. (2002). A novel tryptophan synthase beta-subunit from the hyperthermophile Thermotoga maritima. Quaternary structure, steady-state kinetics, and putative physiological role. J Biol Chem 277, 8194-8201.

Hochheimer, A., Hedderich, R., and Thauer, R. K. (1999). The DNA binding protein Tfx from Methanobacterium thermoautotrophicum: structure, DNA binding properties and transcriptional regulation. Mol Microbiol 31, 641-650.

Hofacker, A., Schmitz, K. M., Cichonczyk, A., Sartorius-Neef, S., and Pfeifer, F. (2004). GvpE- and GvpD-mediated transcription regulation of the p-gvp genes encoding gas vesicles in Halobacterium salinarum. Microbiology 150, 1829-1838.

Holstege, F. C., Tantin, D., Carey, M., van der Vliet, P. C., and Timmers, H. T. (1995). The requirement for the basal transcription factor IIE is determined by the helical stability of promoter DNA. EMBO J 14, 810-819.

Hwang, D., and Kornberg, A. (1992). Opposed actions of regulatory proteins, DnaA and IciA, in opening the replication origin of Escherichia coli. J Biol Chem 267, 23087-23091.

Imbalzano, A. N., Kwon, H., Green, M. R., and Kingston, R. E. (1994). Facilitated binding of TATA-binding protein to nucleosomal DNA. Nature 370, 481-485.

Ishikawa, T., Ose, Y., and Sato, T. (1981). An investigation of organic compounds in night soil and the treated water by gas chromatography-mass spectrometry. Sci Total Environ 20, 241-253.

Ito, T., Tyler, J. K., and Kadonaga, J. T. (1997). Chromatin assembly factors: a dual function in nucleosome formation and mobilization? Genes Cells 2, 593-600.

Izban, M. G., and Luse, D. S. (1992). Factor-stimulated RNA polymerase II transcribes at physiological elongation rates on naked DNA but very poorly on chromatin templates. J Biol Chem 267, 13647-13655.

Jenuwein, T., and Allis, C. D. (2001). Translating the histone code. Science 293, 1074-1080.

Jiang, Y., and Gralla, J. D. (1995). Nucleotide requirements for activated RNA polymerase II open complex formation in vitro. J Biol Chem 270, 1277-1281.

Kamada, K., Shu, F., Chen, H., Malik, S., Stelzer, G., Roeder, R. G., Meisterernst, M., and Burley, S. K. (2001). Crystal structure of negative cofactor 2 recognizing the TBP-DNA transcription complex. Cell 106, 71-81.

Kamashev, D., and Rouviere-Yaniv, J. (2000). The histone-like protein HU binds specifically to DNA recombination and repair intermediates. EMBO J 19, 6527-6535.

Kamenski, T., Heilmeier, S., Meinhart, A., and Cramer, P. (2004). Structure and mechanism of RNA polymerase II CTD phosphatases. Mol Cell 15, 399-407.

Kawarabayasi, Y. (2001). Genome of Pyrococcus horikoshii OT3. Methods Enzymol 330, 124-134.

Kettenberger, H., Armache, K. J., and Cramer, P. (2003). Architecture of the RNA polymerase II-TFIIS complex and implications for mRNA cleavage. Cell 114, 347-357.

Kim, J. H., Lane, W. S., and Reinberg, D. (2002). Human Elongator facilitates RNA polymerase II transcription through chromatin. Proc Natl Acad Sci USA 99, 1241-1246.

Kim, J. L., Nikolov, D. B., and Burley, S. K. (1993). Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520-527.

Kim, M., Krogan, N. J., Vasiljeva, L., Rando, O. J., Nedea, E., Greenblatt, J. F., and Buratowski, S. (2004). The yeast Rat1 exonuclease promotes transcription termination by RNA polymerase II. Nature 432, 517-522.

Kireeva, M. L., Walter, W., Tchernajenko, V., Bondarenko, V., Kashlev, M., and Studitsky, V. M. (2002). Nucleosome remodeling induced by RNA polymerase II: loss of the H2A/H2B dimer during transcription. Mol Cell 9, 541-552.

Kirov, N., Tsaneva, I., Einbinder, E., and Tsanev, R. (1992). In vitro transcription through nucleosomes by T7 RNA polymerase. EMBO J 11, 1941-1947.

Kornberg, R. D. (1998). Mechanism and regulation of yeast RNA polymerase II transcription. Cold Spring Harb Symp Quant Biol 63, 229-232.

Korzheva, N., and Mustaev, A. (2001). Transcription elongation complex: structure and function. Curr Opin Microbiol 4, 119-125.

Korzheva, N., Mustaev, A., Kozlov, M., Malhotra, A., Nikiforov, V., Goldfarb, A., and Darst, S. A. (2000). A structural model of transcription elongation. Science 289, 619-625.

Kosa, P. F., Ghosh, G., DeDecker, B. S., and Sigler, P. B. (1997). The 2.1-A crystal structure of an archaeal preinitiation complex: TATA-box-binding protein/transcription factor (II)B core/TATA-box. Proc Natl Acad Sci USA 94, 6042-6047.

Kruger, K., Hermann, T., Armbruster, V., and Pfeifer, F. (1998). The transcriptional activator GvpE for the halobacterial gas vesicle genes resembles a basic region leucine-zipper regulatory protein. J Mol Biol 279, 761-771.

Kuldell, N. H., and Buratowski, S. (1997). Genetic analysis of the large subunit of yeast transcription factor IIE reveals two regions with distinct functions. Mol Cell Biol 17, 5288-5298.

Kulish, D., and Struhl, K. (2001). TFIIS enhances transcriptional elongation through an artificial arrest site in vivo. Mol Cell Biol 21, 4162-4168.

Kuo, Y.-P. (1997). In vivo characterization of archaeal transcription termination signals and characterization of a Haloferax volcanii heat shock gene: a model for gene regulation, Ohio State University, Columbus, Ohio.

Kyrpides, N. C., and Ouzounis, C. A. (1999). Transcription in archaea. Proc Natl Acad Sci USA 96, 8545-8550.

Lagrange, T., Kapanidis, A. N., Tang, H., Reinberg, D., and Ebright, R. H. (1998). New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB. Genes Dev 12, 34-44.

Lange, U., and Hausner, W. (2004). Transcriptional fidelity and proofreading in Archaea and implications for the mechanism of TFS-induced RNA cleavage. Mol Microbiol 52, 1133-1143.

Langer, D., Hain, J., Thuriaux, P., and Zillig, W. (1995). Transcription in archaea: similarity to that in eucarya. Proc Natl Acad Sci USA 92, 5768-5772.

Lavoie, D., and Chaconas, G. (1994). A second high affinity HU binding site in the phage Mu transpososome. J Biol Chem 269, 15571-15576.

Lee, S. J., Engelmann, A., Horlacher, R., Qu, Q., Vierke, G., Hebbeln, C., Thomm, M., and Boos, W. (2003). TrmB, a sugar-specific transcriptional regulator of the trehalose/maltose ABC transporter from the hyperthermophilic archaeon Thermococcus litoralis. J Biol Chem 278, 983-990.

Lewis, D., and Adhya, S. (2002). In vitro repression of the gal promoters by GalR and HU depends on the proper helical phasing of the two operators. J Biol Chem 277, 2498-2504.

Lie, T. J., and Leigh, J. A. (2002). Regulatory response of Methanococcus maripaludis to alanine, an intermediate nitrogen source. J Bacteriol 184, 5301-5306.

Lie, T. J., and Leigh, J. A. (2003). A novel repressor of nif and glnA expression in the methanogenic archaeon Methanococcus maripaludis. Mol Microbiol 47, 235-246.

Littlefield, O., Korkhin, Y., and Sigler, P. B. (1999). The structural basis for the oriented assembly of a TBP/TFB/promoter complex. Proc Natl Acad Sci USA 96, 13668-13673.

Liu, J., and Turnbough, C. L., Jr. (1994). Effects of transcriptional start site sequence and position on nucleotide-sensitive selection of alternative start sites at the pyrC promoter in Escherichia coli. J Bacteriol 176, 2938-2945.

Liu, Q. X., Jindra, M., Ueda, H., Hiromi, Y., and Hirose, S. (2003). Drosophila MBF1 is a co-activator for Tracheae Defective and contributes to the formation of tracheal and nervous systems. Development 130, 719-728.

Luger, K., Mader, A. W., Richmond, R. K., Sargent, D. F., and Richmond, T. J. (1997a). Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature 389, 251-260.

Luger, K., Rechsteiner, T. J., Flaus, A. J., Waye, M. M., and Richmond, T. J. (1997b). Characterization of nucleosome core particles containing histone proteins made in bacteria. J Mol Biol 272, 301-311.

Luger, K., and Richmond, T. J. (1998). The histone tails of the nucleosome. Curr Opin Genet Dev 8, 140-146.

Mackereth, C. D., Arrowsmith, C. H., Edwards, A. M., and McIntosh, L. P. (2000). Zinc-bundle structure of the essential RNA polymerase subunit RPB10 from Methanobacterium thermoautotrophicum. Proc Natl Acad Sci USA 97, 6316-6321.

Magill, C. P., Jackson, S. P., and Bell, S. D. (2001). Identification of a conserved archaeal RNA polymerase subunit contacted by the basal transcription factor TFB. J Biol Chem 276, 46693-46696.

Makarova, K. S., and Koonin, E. V. (2003). Comparative genomics of Archaea: how much have we learned in six years, and what's next? Genome Biol 4, 115.

Malcolm, D. B., and Sommerville, J. (1977). The structure of nuclear ribonucleoprotein of amphibian oocytes. J Cell Sci 24, 143-165.

Malik, H. S., and Henikoff, S. (2003). Phylogenomics of the nucleosome. Nat Struct Biol 10, 882-891.

Marc, F., Sandman, K., Lurz, R., and Reeve, J. N. (2002). Archaeal histone tetramerization determines DNA affinity and the direction of DNA supercoiling. J Biol Chem 277, 30879-30886.

Marmorstein, R. Q., Joachimiak, A., Sprinzl, M., and Sigler, P. B. (1987). The structural basis for the interaction between L-tryptophan and the Escherichia coli trp aporepressor. J Biol Chem 262, 4922-4927.

Marsh, V. L., Peak-Chew, S. Y., and Bell, S. D. (2005). Sir2 and the acetyltransferase, Pat, regulate the archaeal chromatin protein, Alba. J Biol Chem.

Martens, C., Krett, B., and Laybourn, P. J. (2001). RNA polymerase II and TBP occupy the repressed CYC1 promoter. Mol Microbiol 40, 1009-1019.

Martens, J. A., Laprade, L., and Winston, F. (2004). Intergenic transcription is required to repress the Saccharomyces cerevisiae SER3 gene. Nature 429, 571-574.

Matsuda, T., Fujikawa, M., Haruki, M., Tang, X. F., Ezaki, S., Imanaka, T., Morikawa, M., and Kanaya, S. (2001). Interaction of TIP26 from a hyperthermophilic archaeon with TFB/TBP/DNA ternary complex. Extremophiles 5, 177-182.

Matsuda, T., Morikawa, M., Haruki, M., Higashibata, H., Imanaka, T., and Kanaya, S. (1999). Isolation of TBP-interacting protein (TIP) from a hyperthermophilic archaeon that inhibits the binding of TBP to TATA-DNA. FEBS Lett 457, 38-42.

Matsuzaki, H., Kassavetis, G. A., and Geiduschek, E. P. (1994). Analysis of RNA chain elongation and termination by Saccharomyces cerevisiae RNA polymerase III. J Mol Biol 235, 1173-1192.

Meile, L., Stettler, R., Banholzer, R., Kotik, M., and Leisinger, T. (1991). Tryptophan gene cluster of Methanobacterium thermoautotrophicum Marburg: molecular cloning and nucleotide sequence of a putative trpEGCFBAD operon. J Bacteriol 173, 5017-5023.

Meinhart, A., and Cramer, P. (2004). Recognition of RNA polymerase II carboxy-terminal domain by 3'-RNA-processing factors. Nature 430, 223-226.

Meneghini, M. D., Wu, M., and Madhani, H. D. (2003). Conserved histone variant H2A.Z protects euchromatin from the ectopic spread of silent heterochromatin. Cell 112, 725-736.

Minakhin, L., Bhagat, S., Brunning, A., Campbell, E. A., Darst, S. A., Ebright, R. H., and Severinov, K. (2001). Bacterial RNA polymerase subunit omega and eukaryotic RNA polymerase subunit RPB6 are sequence, structural, and functional homologs and promote RNA polymerase assembly. Proc Natl Acad Sci USA 98, 892-897.

Morse, R. H., Pederson, D. S., Dean, A., and Simpson, R. T. (1987). Yeast nucleosomes allow thermal untwisting of DNA. Nucleic Acids Res 15, 10311-10330.

Mote, J., Jr., Ghanouni, P., and Reines, D. (1994). A DNA minor groove-binding ligand both potentiates and arrests transcription by RNA polymerase II. Elongation factor SII enables readthrough at arrest sites. J Mol Biol 236, 725-737.

Moudrianakis, E. N., and Arents, G. (1993). Structure of the histone octamer core of the nucleosome and its potential interactions with DNA. Cold Spring Harb Symp Quant Biol 58, 273-279.

Myers, L. C., and Kornberg, R. D. (2000). Mediator of transcriptional regulation. Annu Rev Biochem 69, 729-749.

Nacheva, G. A., Guschin, D. Y., Preobrazhenskaya, O. V., Karpov, V. L., Ebralidse, K. K., and Mirzabekov, A. D. (1989). Change in the pattern of histone binding to DNA upon transcriptional activation. Cell 58, 27-36.

Napoli, A., van der Oost, J., Sensen, C. W., Charlebois, R. L., Rossi, M., and Ciaramella, M. (1999). An Lrp-like protein of the hyperthermophilic archaeon Sulfolobus solfataricus which binds to its own promoter. J Bacteriol 181, 1474-1480.

Ng, H. H., Ciccone, D. N., Morshead, K. B., Oettinger, M. A., and Struhl, K. (2003). Lysine-79 of histone H3 is hypomethylated at silenced loci in yeast and mammalian cells: a potential mechanism for position-effect variegation. Proc Natl Acad Sci USA 100, 1820-1825.

Ng, H. H., Feng, Q., Wang, H., Erdjument-Bromage, H., Tempst, P., Zhang, Y., and Struhl, K. (2002). Lysine methylation within the globular domain of histone H3 by Dot1 is important for telomeric silencing and Sir protein association. Genes Dev 16, 1518-1527.

Ng, W. V., Kennedy, S. P., Mahairas, G. G., Berquist, B., Pan, M., Shukla, H. D., Lasky, S. R., Baliga, N. S., Thorsson, V., Sbrogna, J., et al. (2000). Genome sequence of Halobacterium species NRC-1. Proc Natl Acad Sci USA 97, 12176-12181.

Nikolov, D. B., Chen, H., Halay, E. D., Usheva, A. A., Hisatake, K., Lee, D. K., Roeder, R. G., and Burley, S. K. (1995). Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature 377, 119-128.

Ohkuma, Y., Hashimoto, S., Wang, C. K., Horikoshi, M., and Roeder, R. G. (1995). Analysis of the role of TFIIE in basal transcription and TFIIH-mediated carboxy-terminal domain phosphorylation through structure-function studies of TFIIE-α. Mol Cell Biol 15, 4856-4866.

Okamoto, T., Yamamoto, S., Watanabe, Y., Ohta, T., Hanaoka, F., Roeder, R. G., and Ohkuma, Y. (1998). Analysis of the role of TFIIE in transcriptional regulation through structure-function studies of the TFIIE β subunit. J Biol Chem 273, 19866-19876.

Okuda, M., Tanaka, A., Arai, Y., Satoh, M., Okamura, H., Nagadoi, A., Hanaoka, F., Ohkuma, Y., and Nishimura, Y. (2004). A novel structure in the large subunit of human general transcription factor TFIIE. J Biol Chem 279, 51395-51403.

O'Neill, T. E., Smith, J. G., and Bradbury, E. M. (1993). Histone octamer dissociation is not required for transcript elongation through arrays of nucleosome cores by phage T7 RNA polymerase in vitro. Proc Natl Acad Sci USA 90, 6203-6207.

Orphanides, G., LeRoy, G., Chang, C. H., Luse, D. S., and Reinberg, D. (1998). FACT, a factor that facilitates transcript elongation through nucleosomes. Cell 92, 105-116.

Orphanides, G., and Reinberg, D. (2000). RNA polymerase II elongation through chromatin. Nature 407, 471-475.

Otero, G., Fellows, J., Li, Y., de Bizemont, T., Dirac, A. M., Gustafsson, C. M., Erdjument-Bromage, H., Tempst, P., and Svejstrup, J. Q. (1999). Elongator, a multisubunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. Mol Cell 3, 109-118.

Otwinowski, Z., Schevitz, R. W., Zhang, R. G., Lawson, C. L., Joachimiak, A., Marmorstein, R. Q., Luisi, B. F., and Sigler, P. B. (1988). Crystal structure of Trp repressor/operator complex at atomic resolution. Nature 335, 321-329.

Ouhammouch, M., Dewhurst, R. E., Hausner, W., Thomm, M., and Geiduschek, E. P. (2003). Activation of archaeal transcription by recruitment of the TATA-binding protein. Proc Natl Acad Sci USA 100, 5097-5102.

Ouhammouch, M., Langham, G. E., Hausner, W., Simpson, A. J., El-Sayed, N. M., and Geiduschek, E. P. (2005). Promoter architecture and response to a positive regulator of archaeal transcription. Mol Microbiol 56, 625-637.

Ouhammouch, M., Werner, F., Weinzierl, R. O., and Geiduschek, E. P. (2004). A fully recombinant system for activator-dependent archaeal transcription. J Biol Chem 279, 51719-51721.

Pan, G., and Greenblatt, J. (1994). Initiation of transcription by RNA polymerase II is limited by melting of the promoter DNA in the region immediately upstream of the initiation site. J Biol Chem 269, 30101-30104.

Pardee, T. S., Bangur, C. S., and Ponticelli, A. S. (1998). The N-terminal region of yeast TFIIB contains two adjacent functional domains involved in stable RNA polymerase II binding and transcription start site selection. J Biol Chem 273, 17859-17864.

Parvin, J. D., Shykind, B. M., Meyers, R. E., Kim, J., and Sharp, P. A. (1994). Multiple sets of basal factors initiate transcription by RNA polymerase II. J Biol Chem 269, 18414-18421.

Pavlov, N. A., Cherny, D. I., Jovin, T. M., and Slesarev, A. I. (2002). Nucleosome-like complex of the histone from the hyperthermophile Methanopyrus kandleri (MkaH) with linear DNA. J Biomol Struct Dyn 20, 207-214.

Pereira, S. L., Grayling, R. A., Lurz, R., and Reeve, J. N. (1997). Archaeal nucleosomes. Proc Natl Acad Sci USA 94, 12633-12637.

Pereira, S. L., and Reeve, J. N. (1999). Archaeal nucleosome positioning sequence from Methanothermus fervidus. J Mol Biol 289, 675-681.

Persengiev, S. P., Zhu, X., Dixit, B. L., Maston, G. A., Kittler, E. L., and Green, M. R. (2003). TRF3, a TATA-box-binding protein-related factor, is vertebrate-specific and widely expressed. Proc Natl Acad Sci USA 100, 14887-14891.

Petrakis, T. G., Wittschieben, B. O., and Svejstrup, J. Q. (2004). Molecular architecture, structure-function relationship, and importance of the Elp3 subunit for the RNA binding of holo-elongator. J Biol Chem 279, 32087-32092.

Pfeifer, F., Gregor, D., Hofacker, A., Plosser, P., and Zimmermann, P. (2002). Regulation of gas vesicle formation in halophilic archaea. J Mol Microbiol Biotechnol 4, 175-181.

Pittard, J., Camakaris, H., and Yang, J. (2005). The TyrR regulon. Mol Microbiol 55, 16-26.

Ponting, C. P. (2002). Novel domains and orthologues of elongation factors. Nucleic Acids Res 30, 3643-3652.

Proudfoot, N. J., Furger, A., and Dye, M. J. (2002). Integrating mRNA processing with transcription. Cell 108, 501-512.

Pugh, B. F. (2000). Control of gene expression through regulation of the TATA-binding protein. Gene 255, 1-14.

Qureshi, S. A., Bell, S. D., and Jackson, S. P. (1997). Factor requirements for transcription in the Archaeon Sulfolobus shibatae. EMBO J 16, 2927-2936.

Qureshi, S. A., and Jackson, S. P. (1998). Sequence-specific DNA binding by the S. shibatae TFIIB homolog, TFB, and its effect on promoter strength. Mol Cell 1, 389-400.

Rabenstein, M. D., Zhou, S., Lis, J. T., and Tjian, R. (1999). TATA box-binding protein (TBP)-related factor 2 (TRF2), a third member of the TBP family. Proc Natl Acad Sci USA 96, 4791-4796.

Rangasamy, D., Greaves, I., and Tremethick, D. J. (2004). RNA interference demonstrates a novel role for H2A.Z in chromosome segregation. Nat Struct Mol Biol 11, 650-655.

Reeve, J. N. (2003). Archaeal chromatin and transcription. Mol Microbiol 48, 587-598.

Reeve, J. N., Bailey, K. A., Li, W. T., Marc, F., Sandman, K., and Soares, D. J. (2004). Archaeal histones: structures, stability and DNA binding. Biochem Soc Trans 32, 227-230.

Reeve, J. N., Sandman, K., and Daniels, C. J. (1997). Archaeal histones, nucleosomes, and transcription initiation. Cell 89, 999-1002.

Reines, D., Chamberlin, M. J., and Kane, C. M. (1989). Transcription elongation factor SII (TFIIS) enables RNA polymerase II to elongate through a block to transcription in a human gene in vitro. J Biol Chem 264, 10799-10809.

Reines, D., and Mote, J., Jr. (1993). Elongation factor SII-dependent transcription by RNA polymerase II through a sequence-specific DNA-binding protein. Proc Natl Acad Sci USA 90, 1917-1921.

Reiter, W. D., Hudepohl, U., and Zillig, W. (1990). Mutational analysis of an archaebacterial promoter: essential role of a TATA box for transcription efficiency and start-site selection in vitro. Proc Natl Acad Sci USA 87, 9509-9513.

Renfrow, M. B., Naryshkin, N., Lewis, L. M., Chen, H. T., Ebright, R. H., and Scott, R. A. (2004). Transcription factor B contacts promoter DNA near the transcription start site of the archaeal transcription initiation complex. J Biol Chem 279, 2825-2831.

Richardson, J. P. (1990). Rho-dependent transcription termination. Biochim Biophys Acta 1048, 127-138.

Richardson, J. P. (2002). Rho-dependent termination and ATPases in transcript termination. Biochim Biophys Acta 1577, 251-260.

Roeder, R. G., Schwartz, L. B., and Sklar, V. E. (1976). Function, structure, and regulation of eukaryotic nuclear RNA polymerases. Symp Soc Dev Biol, 29-52.

Rosenfeld, N., Elowitz, M. B., and Alon, U. (2002). Negative autoregulation speeds the response times of transcription networks. J Mol Biol 323, 785-793.

Sandman, K., Grayling, R. A., Dobrinski, B., Lurz, R., and Reeve, J. N. (1994a). Growth-phase-dependent synthesis of histones in the archaeon Methanothermus fervidus. Proc Natl Acad Sci USA 91, 12624-12628.

Sandman, K., Krzycki, J. A., Dobrinski, B., Lurz, R., and Reeve, J. N. (1990). HMf, a DNA-binding protein isolated from the hyperthermophilic archaeon Methanothermus fervidus, is most closely related to histones. Proc Natl Acad Sci USA 87, 5788-5791.

Sandman, K., Perler, F. B., and Reeve, J. N. (1994b). Histone-encoding genes from Pyrococcus: evidence for members of the HMf family of archaeal histones in a non-methanogenic Archaeon. Gene 150, 207-208.

Sandman, K., and Reeve, J. N. (2000). Structure and functional relationships of archaeal and eukaryal histones and nucleosomes. Arch Microbiol 173, 165-169.

Sandman, K., Soares, D., and Reeve, J. N. (2001). Molecular components of the archaeal nucleosome. Biochimie 83, 277-281.

Santisteban, M. S., Kalashnikova, T., and Smith, M. M. (2000). Histone H2A.Z regulates transcription and is partially redundant with nucleosome remodeling complexes. Cell 103, 411-422.

Schevitz, R. W., Otwinowski, Z., Joachimiak, A., Lawson, C. L., and Sigler, P. B. (1985). The three-dimensional structure of trp repressor. Nature 317, 782-786.

Schnapp, G., Graveley, B. R., and Grummt, I. (1996). TFIIS binds to mouse RNA polymerase I and stimulates transcript elongation and hydrolytic cleavage of nascent rRNA. Mol Gen Genet 252, 412-419.

Schurch, A., Miozzari, J., and Hutter, R. (1974). Regulation of tryptophan biosynthesis in Saccharomyces cerevisiae: mode of action of 5-methyl-tryptophan and 5-methyl-tryptophan-sensitive mutants. J Bacteriol 117, 1131-1140.

Schwabish, M. A., and Struhl, K. (2004). Evidence for eviction and rapid deposition of histones upon transcriptional elongation by RNA polymerase II. Mol Cell Biol 24, 10111-10117.

Sekinger, E. A., and Gross, D. S. (2001). Silenced chromatin is permissive to activator binding and PIC recruitment. Cell 105, 403-414.

Shilatifard, A. (2004). Transcriptional elongation control by RNA polymerase II: a new frontier. Biochim Biophys Acta 1677, 79-86.

Simpson, R. T., and Stafford, D. W. (1983). Structural features of a phased nucleosome core particle. Proc Natl Acad Sci USA 80, 51-55.

Sims, R. J., 3rd, Belotserkovskaya, R., and Reinberg, D. (2004). Elongation by RNA polymerase II: the short and long of it. Genes Dev 18, 2437-2468.

Slesarev, A. I., Mezhevaya, K. V., Makarova, K. S., Polushin, N. N., Shcherbinina, O. V., Shakhova, V. V., Belova, G. I., Aravind, L., Natale, D. A., Rogozin, I. B., et al. (2002). The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc Natl Acad Sci USA 99, 4644-4649.

Smale, S. T., and Kadonaga, J. T. (2003). The RNA polymerase II core promoter. Annu Rev Biochem 72, 449-479.

Smith, D. R., Doucette-Stamm, L. A., Deloughery, C., Lee, H., Dubois, J., Aldredge, T., Bashirzadeh, R., Blakely, D., Cook, R., Gilbert, K., et al. (1997). Complete genome sequence of Methanobacterium thermoautotrophicum deltaH: functional analysis and comparative genomics. J Bacteriol 179, 7135-7155.

Soares, D., Dahlke, I., Li, W. T., Sandman, K., Hethke, C., Thomm, M., and Reeve, J. N. (1998). Archaeal histone stability, DNA binding, and transcription inhibition above 90 degrees C. Extremophiles 2, 75-81.

Soares, D. J., Marc, F., and Reeve, J. N. (2003). Conserved eukaryotic histone-fold residues substituted into an archaeal histone increase DNA affinity but reduce complex flexibility. J Bacteriol 185, 3453-3457.

Soares, D. J., Sandman, K., and Reeve, J. N. (2000). Mutational analysis of archaeal histone-DNA interactions. J Mol Biol 297, 39-47.

Soppa, J. (1999a). Normalized nucleotide frequencies allow the definition of archaeal promoter elements for different archaeal groups and reveal base-specific TFB contacts upstream of the TATA box. Mol Microbiol 31, 1589-1592.

Soppa, J. (1999b). Transcription initiation in Archaea: facts, factors and future aspects. Mol Microbiol 31, 1295-1305.

Spitalny, P., and Thomm, M. (2003). Analysis of the open region and of DNA-protein contacts of archaeal RNA polymerase transcription complexes during transition from initiation to elongation. J Biol Chem 278, 30497-30505.

Starich, M. R., Sandman, K., Reeve, J. N., and Summers, M. F. (1996). NMR structure of HMfB from the hyperthermophile, Methanothermus fervidus, confirms that this archaeal protein is a histone. J Mol Biol 255, 187-203.

Stetter, K. O., Thomm, M., Winter, J., Wildgruber, G., Huber, H., Zillig, W., Janecovic, D., König, H., Palm, P., and Wunderl, S. (1981). Methanothermus fervidus, sp. nov., a novel extremely thermophilic methanogen isolated from an Icelandic hot spring. Zbl Bakt Hyg, I Abt Orig C 2, 166-178.

Stiller, J. W., and Hall, B. D. (2002). Evolution of the RNA polymerase II C-terminal domain. Proc Natl Acad Sci USA 99, 6091-6096.

Stiller, J. W., McConaughy, B. L., and Hall, B. D. (2000). Evolutionary complementation for polymerase II CTD function. Yeast 16, 57-64.

Strahl, B. D., and Allis, C. D. (2000). The language of covalent histone modifications. Nature 403, 41-45.

Studitsky, V. M., Kassavetis, G. A., Geiduschek, E. P., and Felsenfeld, G. (1997). Mechanism of transcription through the nucleosome by eukaryotic RNA polymerase. Science 278, 1960-1963.

Suckow, M., Madan, A., Kisters-Woike, B., von Wilcken-Bergmann, B., and Muller-Hill, B. (1994). Creating new DNA binding specificities in the yeast transcriptional activator GCN4 by combining selected amino acid substitutions. Nucleic Acids Res 22, 2198-2208.

Sullivan, S., Sink, D. W., Trout, K. L., Makalowska, I., Taylor, P. M., Baxevanis, A. D., and Landsman, D. (2002). The Histone Database. Nucleic Acids Res 30, 341-342.

Suto, R. K., Clarkson, M. J., Tremethick, D. J., and Luger, K. (2000). Crystal structure of a nucleosome core particle containing the variant histone H2A.Z. Nat Struct Biol 7, 1121-1124.

Suzuki, Y., Tsunoda, T., Sese, J., Taira, H., Mizushima-Sugano, J., Hata, H., Ota, T., Isogai, T., Tanaka, T., Nakamura, Y., et al. (2001). Identification and characterization of the potential promoter regions of 1031 kinds of human genes. Genome Res 11, 677-684.

Svejstrup, J. Q. (2002). Chromatin elongation factors. Curr Opin Genet Dev 12, 156-161.

Svejstrup, J. Q. (2004). The RNA polymerase II transcription cycle: cycling through chromatin. Biochim Biophys Acta 1677, 64-73.

Tabassum, R., Sandman, K. M., and Reeve, J. N. (1992). HMt, a histone-related protein from Methanobacterium thermoautotrophicum delta H. J Bacteriol 174, 7890-7895.

Takayanagi, S., Morimura, S., Kusaoke, H., Yokoyama, Y., Kano, K., and Shioda, M. (1992). Chromosomal structure of the halophilic archaebacterium Halobacterium salinarium. J Bacteriol 174, 7207-7216.

Takemaru, K., Harashima, S., Ueda, H., and Hirose, S. (1998). Yeast MBF1 mediates GCN4-dependent transcriptional activation. Mol Cell Biol 18, 4971-4976.

Takemaru, K., Li, F. Q., Ueda, H., and Hirose, S. (1997). Multiprotein bridging factor 1 (MBF1) is an evolutionarily conserved transcriptional coactivator that connects a regulatory factor and TATA element-binding protein. Proc Natl Acad Sci USA 94, 7251-7256.

Tang, X., Ezaki, S., Fujiwara, S., Takagi, M., Atomi, H., and Imanaka, T. (1999). The tryptophan biosynthesis gene cluster trpCDEGFBA from Pyrococcus kodakaraensis KOD1 is regulated at the transcriptional level and expressed as a single mRNA. Mol Gen Genet 262, 815-821.

Tang, X. F., Ezaki, S., Atomi, H., and Imanaka, T. (2000). Biochemical analysis of a thermostable tryptophan synthase from a hyperthermophilic archaeon. Eur J Biochem 267, 6369-6377.

Teixeira, A., Tahiri-Alaoui, A., West, S., Thomas, B., Ramadass, A., Martianov, I., Dye, M., James, W., Proudfoot, N. J., and Akoulitchev, A. (2004). Autocatalytic RNA cleavage in the human beta-globin pre-mRNA promotes transcription termination. Nature 432, 526-530.

Thiru, A., Hodach, M., Eloranta, J. J., Kostourou, V., Weinzierl, R. O., and Matthews, S. (1999). RNA polymerase subunit H features a beta-ribbon motif within a novel fold that is present in archaea and eukaryotes. J Mol Biol 287, 753-760.

Thoma, F., Koller, T., and Klug, A. (1979). Involvement of histone H1 in the organization of the nucleosome and of the salt-dependent superstructures of chromatin. J Cell Biol 83, 403-427.

Thomm, M. (1996). Transcription factors and termination of transcription in Methanococcus. Systematic and Applied Microbiology 16, 648-655.

Thomm, M., Sandman, K., Frey, G., Koller, G., and Reeve, J. N. (1992). Transcription in vivo and in vitro of the histone-encoding gene hmfB from the hyperthermophilic archaeon Methanothermus fervidus. J Bacteriol 174, 3508-3513.

Thompson, D. K., Palmer, J. R., and Daniels, C. J. (1999). Expression and heat-responsive regulation of a TFIIB homologue from the archaeon Haloferax volcanii. Mol Microbiol 33, 1081-1092.

Todone, F., Brick, P., Werner, F., Weinzierl, R. O., and Onesti, S. (2001). Structure of an archaeal homolog of the eukaryotic RNA polymerase II RPB4/RPB7 complex. Mol Cell 8, 1137-1143.

Todone, F., Weinzierl, R. O., Brick, P., and Onesti, S. (2000). Crystal structure of RPB5, a universal eukaryotic RNA polymerase subunit and transcription factor interaction target. Proc Natl Acad Sci USA 97, 6306-6310.

Tomschik, M., Karymov, M. A., Zlatanova, J., and Leuba, S. H. (2001). The archaeal histone-fold protein HMf organizes DNA into bona fide chromatin fibers. Structure (Camb) 9, 1201-1211.

Topalidou, I., and Thireos, G. (2003). Gcn4 occupancy of open reading frame regions results in the recruitment of chromatin-modifying complexes but not the mediator complex. EMBO Rep 4, 872-876.

Tsai, F. T., Littlefield, O., Kosa, P. F., Cox, J. M., Schepartz, A., and Sigler, P. B. (1998). Polarity of transcription on Pol II and archaeal promoters: where is the "one-way sign" and how is it read? Cold Spring Harb Symp Quant Biol 63, 53-61.

Tsai, F. T., and Sigler, P. B. (2000). Structural basis of preinitiation complex assembly on human pol II promoters. EMBO J 19, 25-36.

Tyler, J. K. (2002). Chromatin assembly. Cooperation between histone chaperones and ATP-dependent nucleosome remodeling machines. Eur J Biochem 269, 2268-2274.

Uptain, S. M., Kane, C. M., and Chamberlin, M. J. (1997). Basic mechanisms of transcript elongation and its regulation. Annu Rev Biochem 66, 117-172.

van Holde, K., and Zlatanova, J. (1996). What determines the folding of the chromatin fiber? Proc Natl Acad Sci USA 93, 10548-10555.

Vassylyev, D. G., Sekine, S., Laptenko, O., Lee, J., Vassylyeva, M. N., Borukhov, S., and Yokoyama, S. (2002). Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 A resolution. Nature 417, 712-719.

Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., Eisen, J. A., Wu, D., Paulsen, I., Nelson, K. E., Nelson, W., et al. (2004). Environmental genome shotgun sequencing of the Sargasso Sea. Science 304, 66-74.

Vierke, G., Engelmann, A., Hebbeln, C., and Thomm, M. (2003). A novel archaeal transcriptional regulator of heat shock response. J Biol Chem 278, 18-26.

Walter, W., and Studitsky, V. M. (2001). Facilitated transcription through the nucleosome at high ionic strength occurs via a histone octamer transfer mechanism. J Biol Chem 276, 29104-29110.

Wang, Y., Fischle, W., Cheung, W., Jacobs, S., Khorasanizadeh, S., and Allis, C. D. (2004). Beyond the double helix: writing and reading the histone code. Novartis Found Symp 259, 3-17; discussion 17-21, 163-169.

Wardleworth, B. N., Russell, R. J., Bell, S. D., Taylor, G. L., and White, M. F. (2002). Structure of Alba: an archaeal chromatin protein modulated by acetylation. EMBO J 21, 4654-4662.

Waters, E., Hohn, M. J., Ahel, I., Graham, D. E., Adams, M. D., Barnstead, M., Beeson, K. Y., Bibbs, L., Bolanos, R., Keller, M., et al. (2003). The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Proc Natl Acad Sci USA 100, 12984-12988.

Weideman, C. A., Netter, R. C., Benjamin, L. R., McAllister, J. J., Schmiedekamp, L. A., Coleman, R. A., and Pugh, B. F. (1997). Dynamic interplay of TFIIA, TBP and TATA DNA. J Mol Biol 271, 61-75.

Weilbaecher, R. G., Awrey, D. E., Edwards, A. M., and Kane, C. M. (2003). Intrinsic transcript cleavage in yeast RNA polymerase II elongation complexes. J Biol Chem 278, 24189-24199.

Werner, F., and Weinzierl, R. O. (2002). A recombinant RNA polymerase II-like enzyme capable of promoter-specific transcription. Mol Cell 10, 635-646.

West, S., Gromak, N., and Proudfoot, N. J. (2004). Human 5' --> 3' exonuclease Xrn2 promotes transcription termination at co-transcriptional cleavage sites. Nature 432, 522-525.

Westover, K. D., Bushnell, D. A., and Kornberg, R. D. (2004a). Structural basis of transcription: separation of RNA from DNA by RNA polymerase II. Science 303, 1014-1016.

Westover, K. D., Bushnell, D. A., and Kornberg, R. D. (2004b). Structural basis of transcription: nucleotide selection by rotation in the RNA polymerase II active center. Cell 119, 481-489.

Wettach, J., Gohl, H. P., Tschochner, H., and Thomm, M. (1995). Functional interaction of yeast and human TATA-binding proteins with an archaeal RNA polymerase and promoter. Proc Natl Acad Sci USA 92, 472-476.

Widom, J. (2001). Role of DNA sequence in nucleosome stability and dynamics. Q Rev Biophys 34, 269-324.

Wieland, G., Orthaus, S., Ohndorf, S., Diekmann, S., and Hemmerich, P. (2004). Functional complementation of human centromere protein A (CENP-A) by Cse4p from Saccharomyces cerevisiae. Mol Cell Biol 24, 6620-6630.

Wittschieben, B. O., Otero, G., de Bizemont, T., Fellows, J., Erdjument-Bromage, H., Ohba, R., Li, Y., Allis, C. D., Tempst, P., and Svejstrup, J. Q. (1999). A novel histone acetyltransferase is an integral subunit of elongating RNA polymerase II holoenzyme. Mol Cell 4, 123-128.

Woese, C. R., Kandler, O., and Wheelis, M. L. (1990). Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 87, 4576-4579.

Wolffe, A. P. (1994). The transcription of chromatin templates. Curr Opin Genet Dev 4, 245-254.

Woodcock, C. L. (1977). Reconstitution of chromatin subunits. Science 195, 1350-1352.

Workman, J. L., and Roeder, R. G. (1987). Binding of transcription factor TFIID to the major late promoter during in vitro nucleosome assembly potentiates subsequent initiation by RNA polymerase II. Cell 51, 613-622.

Woychik, N. A. (1998). Fractions to functions: RNA polymerase II thirty years later. Cold Spring Harb Symp Quant Biol 63, 311-317.

Woychik, N. A., and Hampsey, M. (2002). The RNA polymerase II machinery: structure illuminates function. Cell 108, 453-463.

Xie, G., Forst, C., Bonner, C., and Jensen, R. A. (2002). Significance of two distinct types of tryptophan synthase beta chain in Bacteria, Archaea and higher plants. Genome Biol 3, 404-410.

Xie, Y., and Reeve, J. N. (2004). Transcription by an archaeal RNA polymerase is slowed but not blocked by an archaeal nucleosome. J Bacteriol 186, 3492-3498.

Xue, H., Guo, R., Wen, Y., Liu, D., and Huang, L. (2000). An abundant DNA binding protein from the hyperthermophilic archaeon Sulfolobus shibatae affects DNA supercoiling in a temperature-dependent fashion. J Bacteriol 182, 3929-3933.

Yan, Q., Moreland, R. J., Conaway, J. W., and Conaway, R. C. (1999). Dual roles for transcription factor IIF in promoter escape by RNA polymerase II. J Biol Chem 274, 35668-35675.

Yang, J., Hwang, J. S., Camakaris, H., Irawaty, W., Ishihama, A., and Pittard, J. (2004). Mode of action of the TyrR protein: repression and activation of the tyrP promoter of Escherichia coli. Mol Microbiol 52, 243-256.

Yanofsky, C. (2000). Transcription attenuation: once viewed as a novel regulatory strategy. J Bacteriol 182, 1-8.

Yoda, K., Morishita, S., and Hashimoto, K. (2004). Histone variant CENP-A purification, nucleosome reconstitution. Methods Enzymol 375, 253-269.

Yokomori, K., Verrijzer, C. P., and Tjian, R. (1998). An interplay between TATA box-binding protein and transcription factors IIE and IIA modulates DNA binding and transcription. Proc Natl Acad Sci USA 95, 6722-6727.

Yudkovsky, N., Ranish, J. A., and Hahn, S. (2000). A transcription reinitiation intermediate that is stabilized by activator. Nature 408, 225-229.

Zawel, L., Kumar, K. P., and Reinberg, D. (1995). Recycling of the general transcription factors during RNA polymerase II transcription. Genes Dev 9, 1479-1490.

Zhang, G., Campbell, E. A., Minakhin, L., Richter, C., Severinov, K., and Darst, S. A. (1999). Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 A resolution. Cell 98, 811-824.

Zhang, R. G., Joachimiak, A., Lawson, C. L., Schevitz, R. W., Otwinowski, Z., and Sigler, P. B. (1987). The crystal structure of trp aporepressor at 1.8 A shows how binding tryptophan enhances DNA affinity. Nature 327, 591-597.

Zhao, K., Chai, X., and Marmorstein, R. (2003). Structure of a Sir2 substrate, Alba, reveals a mechanism for deacetylation-induced enhancement of DNA binding. J Biol Chem 278, 26071-26077.

Zhu, Q., Zhao, S., and Somerville, R. L. (1997). Expression, purification, and functional analysis of the TyrR protein of Haemophilus influenzae. Protein Expr Purif 10, 237-246.

Zhu, W., Sandman, K., Lee, G. E., Reeve, J. N., and Summers, M. F. (1998). NMR structure and comparison of the archaeal histone HFoB from the mesophile Methanobacterium formicicum with HMfB from the hyperthermophile Methanothermus fervidus. Biochemistry 37, 10573-10580.

Zimmermann, P., and Pfeifer, F. (2003). Regulation of the expression of gas vesicle genes in Haloferax mediterranei: interaction of the two regulatory proteins GvpD and GvpE. Mol Microbiol 49, 783-794.

APPENDIX

SEQUENCES OF THE OLIGONUCLEOTIDES USED IN THIS STUDY

TD2: 5′CTCAGAAAAACCTTAAAATTAGCGATATATTTATATA
KS2: 5′GGATTATATGAATAGATAATA
KS3: 5′TCACACGGAGCCAACAACACGCGG
KS4: 5′AATTCCGCGTGTTGTTGG
KS5: 5′CTCCGTGTGATATTATCTATT
KS6: 5′CATATAATCCTTATATAAATATATCGC
KS7: 5′TAATTTTAAGGTTTTTCTGAGTGCA
KS8: 5′GATCCGATATCAACCGTACTGGTGT
KS9: 5′TGTCCTACGCTAATCTAAGCCGTTTACTCGCGA
KS10: 5′TTTGAAAATAGCTTAGGTGGAGATCTGATATCA
KS11: 5′AGCTTGATATCAGATCTCCACCTAAGCT
KS12: 5′ATTTTCAAAATCGCGAGTAAACGGCTTAGATTA
KS13: 5′GCGTAGGACAACACCAGTACGGTTGATATCG
KS1A: 5′TAATAAGCTTACTTAAACAAGGAGG
KS1B: 5′GTGATATAGAATTCTGATATG
MX1: 5′CTAGAACCGGTGACGTCACCA
MX2: 5′GTCAGGGGGGCGGAGCCTATGG
MX3: 5′CGCCAGGGTTTTCCCAGTCACGACGTT
MX11: 5′AATCGGATCCATGTCAGAGGAGAATGTAGTATACAT
MX12: 5′AATCGAATTCTTAATCCTTTCGGAGCTGAATTTCT
MX23: 5′AATTTCCGCGGGTGTCGAACCA
MX24: 5′TTTTGATGACTCGCATATGGGGAGCCAACA
MX25: 5′ACACGCGGAATTCGAGCTCGGTACCCGG
MX26: 5′GTCATCAAAATGGTTCGACACCCGCGGA
MX27: 5′TTCCGCGTGTTGTTGGCTCCCCATATGCGA
MX28: 5′GATCCCGGGTACCGAGCTCGAA
MX50: 5′TTGCTTGGTCAAGCTTTTTCCGCGGGTGTCGAACCA
MX51: 5′TGCCTTTCAGGGAGACCTAGCCACCTAAGCTATTTTCAAAAT
MX52: 5′TGCCTTTCAGGGAGACCTAG
MX56: 5′ACATACACTTCACCACATTGATGTA
MX57: 5′CCTTGATCCCTCGTATTCAACTTCT
MX64: 5′CCGGGTTTTTCAGTTCTGTAATGCC
MX75: 5′CATCGCTTATCTCAACGTCGTCACAG
MX77: 5′GGGAGGGGTATCCCTCAAATCTGT
MX80: 5′CATTATTATACAGAAAGATTTAAATAGTACCTCCGTG
MX81: 5′CACGGAGGTACTATTTAAATCTTTCTGTATAATAATG
MX82: 5′CATTATTATACAGACAAGATTTGGATAGTACCTCCGTG
MX83: 5′CACGGAGGTACTATCCAAATCTTGTCTGTATAATAATG
MX84: 5′ATAGTACCTCCGTGTATCCGTACTTTGTGGATACAGAG
MX85: 5′CTCTGTATCCACAAAGTACGGATACACGGAGGTACTAT
MX86: 5′GGTGAAGCATATGTGGAAGCAGATAAAGCACAGA
MX87: 5′ACCGAAAGCTTTCCGTATATTGAGACCCTCTTGACGCC
MX99: 5′CGGGGCCACATTGATGTACATTATTATACAGAC
MX100: 5′GTCTGTATAATAATGTACATCAATGTGGCCCCG
MX101: 5′CTTCACCAGGGGGATGTACATTATTATACAGAC
MX102: 5′GTCTGTATAATAATGTACATCCCCCTGGTGAAG
MX103: 5′CTTCACCACATTGATGGGGGTTATTATACAGAC
MX104: 5′GTCTGTATAATAACCCCCATCAATGTGGTGAAG
MX105: 5′CTTCACCACATTGATGTACATTATTAGGGGGAC
MX106: 5′GTCCCCCTAATAATGTACATCAATGTGGTGAAG
MX111: 5′GCACTCCCCTTGAGAATACCCTTGGCAG
MX113: 5′GATGTACATTATTATACAGAGAAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGG
MX114: 5′CCACAAAGTACATATACACGGAGGTACTATTTAAATCTTCTCTGTATAATAATGTACATC
MX115: 5′GATGTACATTATTATACAGATAAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGG
MX116: 5′CCACAAAGTACATATACACGGAGGTACTATTTAAATCTTATCTGTATAATAATGTACATC
MX117: 5′GATGTACATTATTATACAGAAAAGATTTAAATAGTACCTCCGTGTATATGTACTTTGTGG
MX118: 5′CCACAAAGTACATATACACGGAGGTACTATTTAAATCTTTTCTGTATAATAATGTACATC
MX124: 5′GTACATATACACGGAGCCCCTATTTAAATCTT
MX128: 5′GTACATACCGCCGGAGGTACTATTTAAATCTT
MX129: 5′CTTCACCACATTGATGTACATTATATTATACAGAC
MX130: 5′GTCTGTATAATATAATGTACATCAATGTGGTGAAG
MX131: 5′CTTCACCACATTGATGTACATTTATACTGAC
MX132: 5′GTCAGTATAAATGTACATCAATGTGGTGAAG
MX133: 5′CAAGAACGATCTTATTCATCATATCACC
MX138: 5′TCCCCCGGATTCTCCCTGCAATTATCCTTG
MX139: 5′CTTCACCACATTGATGTACATCGGTATACAGAC
MX140: 5′GTCTGTATACCGATGTACATCAATGTGGTGAAG
MX141: 5′AAGATTTAAATAGGGCTCCGTGTATATGTAC
