
IN VIVO CHARACTERIZATION OF ARCHAEAL TERMINATION SIGNALS AND CHARACTERIZATION OF A HALOFERAX VOLCANII HEAT SHOCK GENE: A MODEL FOR GENE REGULATION

DISSERTATION

Presented in Partial fulfillment of the Requirements for the Degree of Doctor of Philosophy in the Graduate School of the Ohio State University

By

Yen-Ping Kuo, B.S., M.S.

*****

The Ohio State University 1997

Dissertation Committee:
    Charles J. Daniels, Advisor
    Tina Henkin
    John N. Reeve
    F. Robert Tabita

Approved by

Adviser
Department of Microbiology

UMI Number: 9731658

UMI Microform 9731658 Copyright 1997, by UMI Company. All rights reserved.

This microform edition is protected against unauthorized copying under Title 17, United States Code.

UMI, 300 North Zeeb Road, Ann Arbor, MI 48103

ABSTRACT

To investigate the mechanisms of gene expression in the Archaea, we developed an in vivo assay system to examine the requirements for transcription termination in Haloferax volcanii and began studies to identify a model for studying gene regulation.

The availability of a transformation system and gene expression vectors for H. volcanii has facilitated the development of a plasmid-based in vivo assay for transcription termination. Using this system, we established that a eukaryal RNA polymerase III terminator was recognized by the archaeal transcription apparatus and functioned as an efficient termination element. This element was combined with the H. volcanii tRNALys promoter and the yeast tRNAProM reporter gene to construct a transcription termination assay module that could be used to investigate the sequence and structural characteristics of termination signals. Results of this study established that oligo-dT tracts function as efficient terminators in vivo and that a second, bacterial-like rho-independent termination mechanism may also be present. Furthermore, the termination properties of elements containing two T-tracts suggested the occurrence of an “inch-worming” mechanism for archaeal transcription termination.

To study regulated gene expression in H. volcanii, we chose to examine the heat shock response as a potential model system. Using information available for global expression of heat shock loci, we identified and cloned a gene encoding the chaperonin-containing TCP-1 (CCT1). The deduced amino acid sequence of this gene predicted a protein of 59 kDa that was similar to Sulfolobus shibatae TF55β and human TCP-1. Transcripts of cct1 and its related genes were present at higher levels when cells were challenged with heat stress or salt shock. Transcript analysis indicated that transcription of the cct1 gene initiated 25 bp downstream from a typical archaeal TATA promoter element under both control and heat shock conditions, and analysis of the termination site indicated that the transcript terminated in a T6 T-tract. These studies established the cct1 gene as a suitable gene for the study of transcription regulation in H. volcanii.

This work is dedicated to my children, Shin and Joey, and my husband, Tenfu, who have often shared family time with my research.

ACKNOWLEDGMENTS

I would like to especially thank my advisor, Dr. Charles J. Daniels, for his guidance, support, understanding and patience over the past five years and in the preparation of this dissertation. I also thank my Committee members, Dr. John Reeve, Dr. F. Robert Tabita, and Dr. Tina Henkin, for their encouragement and advice. I am indebted to David Armbruster, who has been a wonderful mentor and friend since I joined the laboratory. I am grateful to Dorothea Thompson both for her friendship and her assistance in editing this dissertation. Many thanks also go to my co-workers in Daniels’ lab for creating a fun working environment and introducing me to the heart of American culture. I will always be grateful to my parents, who emphasized the value of higher education and gave me unlimited support during my pursuit of a Ph.D. Finally, my sincere appreciation goes to my loving husband for his sacrifice, encouragement and unconditional support.

VITA

July 18, 1962 ...... Born, Chia-I, Taiwan

1984 ...... B.S., National Taiwan University, Taipei, Taiwan

1984-1987 ...... Research Assistant and Medical Technologist, National Taiwan University Hospital, Taipei, Taiwan

1989 ...... M.S., Michigan State University, East Lansing, Michigan

1990-1991 ...... Microbiologist, Ohio Department of Health, Columbus, Ohio

1991-present ...... Graduate Teaching and Research Associate, The Ohio State University, Columbus, Ohio

PUBLICATIONS

Research Publication

Kuo, Y.-P. 1989. Detection of Enterohemorrhagic Escherichia coli O157:H7 by radioactive and non-radioactive DNA probes. M.S. Thesis. Michigan State University, East Lansing, MI.

Abstract

Palmer, J. R., Y.-P. Kuo, and C. J. Daniels. 1994. In vivo analysis of archaeal transcription signals. American Society for Microbiology, May.

Kuo, Y.-P. and M.-C. Shen. 1986. Immunological status of patients with hemophilia in Taiwan. XVII International Congress of the World Federation of Hemophilia, Milano.

Kuo, Y.-P., C.-H. Wang, and M.-C. Shen. 1986. A test of platelet bound immunoglobulin by electroimmunoassay in thrombocytopenic purpura. Chinese Hematology Association, Taipei.

Kuo, Y.-P., F.-W. Liu, R.-F. Hsiu, and M.-C. Shen. 1986. Immunological defects in Hemophiliacs with or without HTLV-III infections. Chinese Hematology Association, Taipei.

Chang, S.-C., S. C. Lue, Y.-P. Kuo, and M.-C. Shen. 1986. Analysis of DNA in Hemophilia A patients using a cloned probe. Chinese Hematology Association, Taipei.

FIELD OF STUDY

Major Field: Microbiology

Molecular Biology of the Archaea

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
VITA
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS

CHAPTER 1: GENERAL INTRODUCTION
    Archaea and the Three Domains of Life
    Gene Transcription
        DNA-dependent RNA polymerases
        Promoter, Transcription Initiation Factors, and Initiation Complex Formation
        Transcription Elongation
        Transcription Termination
    Heat Shock Proteins: Structure and Function
        HSP70 family
        Hsp60 family
        CCT/TriC/TCP-1/TF55 Family
        Archaeal Heat Shock Proteins
    Mechanisms of Regulated Gene Expression and Heat Shock Genes as a Paradigm for Regulation
        Transcription Regulation Strategies in the Three Domains
        Heat Shock Regulation
    Research Problems
        In vivo Characterization of Archaeal Transcription Termination Signals
        Development of a Gene Regulation Model System

CHAPTER 2: MATERIAL AND METHODS
    Reagents and Enzymes
    Bacterial Strains and Culture Conditions
    Nucleic Acid Isolation
        Small-scale Plasmid/Cosmid DNA Isolation
        Small-scale H. volcanii Genomic DNA Isolation
        Extraction of DNA from Agarose Gels and Polyacrylamide Gels
        RNA Isolation
    Nucleic Acid Quantitation
    DNA Restriction Analysis
    Electrophoretic Techniques for Nucleic Acid Analysis
        Nondenaturing Agarose Gel Electrophoresis
        Formaldehyde Agarose Gel Electrophoresis
        Nondenaturing Polyacrylamide Gel Electrophoresis
        Denaturing Polyacrylamide Gel Electrophoresis
    Cell Extract Preparation and SDS-Polyacrylamide Gel Electrophoresis
    Preparation of Competent Cells and Transformation
        E. coli DH5α
        E. coli JM110
        H. volcanii WFD11
    Cloning
        Preparation of the Insert
        Vectors and Ligation Reactions
        Identification of Clones
    Nucleic Acid Sequence Analysis
        DNA Sequencing
        Comparative Sequence Analysis and RNA Structure Prediction
    Radiolabeling and Purification of DNA Probe
        Radiolabeling of DNA Probe
        Purification of Radiolabeled DNA Probe
    Colony Blot Hybridization
        Colony Blotting
        Hybridization and Detection
    Northern Analysis
    Southern Analysis
    Primer Extension
        Annealing of DNA Probes to RNA
        Reverse Transcription and Product Analysis
    S1 Nuclease Mapping
        3’ End-Labeling of dsDNA Probe
        Annealing of DNA Probes to RNA
        S1 Nuclease Digestion and Product Analysis
    Bent DNA Analysis
    Induction of the Heat Shock Gene
        Heat-Shock Induction
        Salt-Shock Induction

CHAPTER 3: IN VIVO CHARACTERIZATION OF ARCHAEAL TRANSCRIPTION TERMINATION SIGNALS
    Introduction
    Results
        Yeast tRNAProM RNA Polymerase III Terminator Directs Transcriptional Termination in H. volcanii
        Determining the 3’ Terminus of the Yeast tRNAProM Transcript
        Determining the Functional Role of the Yeast Pol III Terminator in H. volcanii
        Development of a New Expression Module, sptProM, for in vivo Termination Studies
        Sequence Requirements of Archaeal T-tract Terminators
        Orientation-Dependency of tbp2 and cct1 Terminators
        5’ and 3’ Block Deletion Mutagenesis of tbp2 and cct1 Terminators
        Sequence Specificity in the 3’ Regions of the tbp2 and cct1 Terminators
        Spacing Constraints Between the T-Tracts of tbp2/cct1 and the pol III Terminators
        Sequence Requirements within the T-tract
        The Function of H. volcanii tRNALys Promoter as a Termination Element
        Effect of the Structural Environment on a T-tract Terminator
        Termination Efficiency of a T-tract Located within a Stem-Loop Structure
        Termination Activity of Bent DNA
        Determination of DNA Curvature in Active Terminators
        Activity of a Bacterial ρ-Independent-Like Terminator in H. volcanii
        Termination Activity of a ρ-Independent Terminator-Like Sequence from the H. volcanii rRNA
        Terminator Activity of E. coli trpA Terminator in H. volcanii
    Discussion
        Terminator Function of a Eukaryal Pol III Termination Element in H. volcanii
        An Expression Module for Monitoring in vivo Termination
        Characteristics of T-tract Terminators
        Bacterial ρ-Independent-like Termination in the Archaea
        Archaeal Transcription Termination and RNAP Inchworming
    Summary

CHAPTER 4: IDENTIFICATION OF AN H. VOLCANII HEAT SHOCK GENE ENCODING A MEMBER OF THE CCT FAMILY
    Introduction
    Results
        Cloning and Sequencing of an H. volcanii Heat Shock (HS) Gene
        Analysis of the cct1 Gene
        Phylogenetic Analyses
        Presence of Additional cct Gene in H. volcanii
        Transcript Level of cct-Related Genes in Response to Environmental Stresses Study
        Mapping the cct1 Transcription Start Site
        Mapping the cct1 Transcription Termination Site
        Construction of an H. volcanii cct1 Gene Overexpression Strain
        Expression of cct1 in the H. volcanii Gene Overexpression Strain
    Discussion
        Identification of cct1 and its Protein Product
        Characterization of the Putative CCT1 Protein
        Regulated Gene Expression of H. volcanii cct1 and Its Related Genes
        Transcriptional Signals of cct1
    Summary

CHAPTER 5: CONCLUSION
    Archaeal Transcription Termination
    Characterization of a Heat Shock Gene

APPENDIX A: LIST OF TRANSCRIPTION TERMINATION CONSTRUCTS
APPENDIX B: LIST OF STRAINS
BIBLIOGRAPHY

LIST OF FIGURES

Figure   Title

1.1   A rooted universal phylogenetic tree
1.2   Schematic depicting the separation of eukaryal, archaeal and bacterial DNA-dependent RNA polymerase subunits in an SDS-polyacrylamide gel
1.3   Consensus of the TATA elements in the archaeal and eukaryal pol II promoters
3.1   Mapping the 3’ terminus of the yeast tRNAProM transcript
3.2   Functional analysis of the yeast tRNAProM terminator
3.3   Primer extension analysis of derived from the tRNATrpO 16M-AGGAG reporter gene
3.4   The construction and use of the sptProM termination assay expression module
3.5   Mapping tRNAProM gene transcripts generated from the sptProM expression module
3.6   A survey of archaeal DNA sequences at mapped transcription termination sites
3.7   Termination activity of the H. volcanii tbp2 and cct1 terminators
3.8   Analysis of the requirement for sequences in the 5’ and 3’ regions of the tbp2 T-tract terminator
3.9   Effects of deletion in the 3’ regions of the tbp2 and cct1 T-tracts on in vivo termination
3.10  Effects of altering the sequences 3’ of the tbp2 T-tract on termination efficiency in vivo
3.11  Effects of single nucleotide replacements on the termination activity of the tbp2 terminator T-tract
3.12  The ability of the H. volcanii tRNALys TATA/BoxA sequence to function as a transcription terminator
3.13  Termination activity of new TBP2Ts constructs
3.14  Activity of bent DNAs as transcription terminators
3.15  Electrophoretic mobility of termination elements
3.16  H. volcanii rRNA operon terminator
3.17  Mapping the 3’ terminus of the transcript from SsrT.F construct by S1 nuclease digestion
3.18  Termination activity of E. coli trpA terminator in H. volcanii
3.19  The influence of spacing distances on the termination efficiencies of T-tracts
4.1   Identification of MluI fragments from cosmid A199 encoding an H. volcanii heat shock (HS) responsive gene, cct1
4.2   Identification of H. volcanii HS gene-containing subclones from cosmid A199
4.3   Subcloning and sequencing strategy of the H. volcanii cct1 gene
4.4   Nucleotide sequence of the H. volcanii cct1 gene and its deduced amino acid sequence
4.5   Protein sequence alignment of H. volcanii CCT1 and other CCTs
4.6   Phylogenetic tree of CCTs based on pairwise sequence comparison using the Clustal method
4.7   Southern analysis of H. volcanii genomic DNA
4.8   Induction of the H. volcanii cct transcripts
4.9   Mapping the H. volcanii cct1 gene transcription start site
4.10  S1 nuclease mapping of the 3’ terminus of the cct1 transcript
4.11  Construction and cloning of H. volcanii cct1 gene-overexpression strain
4.12  Expression of cct1 in an H. volcanii gene-overexpression strain

LIST OF TABLES

Table   Title

1.1   Comparison of the information processing system of the three domains
1.2   Protein sequence similarities and relatedness of the Sulfolobus acidocaldarius RNAP subunits to the corresponding bacterial and eukaryal RNAP subunits
2.1   Synthetic oligonucleotides used in the termination study
2.2   Synthetic oligonucleotides used in the heat shock gene study
3.1   Spacing constraints between multiple T-tracts
3.2   The influence of the two nucleotides flanking the T-tract on the termination efficiency
4.1   List of H. volcanii heat shock gene subclones
4.2   Pairwise comparisons of selected eukaryal, archaeal, and bacterial chaperonins using the Clustal method

LIST OF ABBREVIATIONS

bp        base pair(s)
BSA       bovine serum albumin
cpm       counts per minute
ddH2O     double distilled water
DNA       deoxyribonucleic acid
DNase     deoxyribonuclease
dNTP      deoxynucleotide triphosphate
Δ         deletion
DTT       dithiothreitol
EDTA      ethylenediaminetetraacetic acid
g         gram
IPTG      isopropylthio-β-galactoside
kbp       kilobase pairs
kDa       kilodalton
μ         micro
μg        microgram
μl        microliter
μM        micromolar
mg        milligram
ml        milliliter
mM        millimolar
min       minutes
M         molar
MOPS      3-(N-morpholino)propanesulfonic acid
MW        molecular weight
mRNA      messenger RNA
N.D.      not detectable
ng        nanogram
nM        nanomolar
nt        nucleotide
OD        optical density
PAGE      polyacrylamide gel electrophoresis
PCR       Polymerase Chain Reaction
PEG       polyethylene glycol
PIPES     piperazine-N,N’-bis(2-ethanesulfonic acid)
pol I     RNA polymerase I
pol II    RNA polymerase II
pol III   RNA polymerase III
RNA       ribonucleic acid
RNAP      RNA polymerase
RNaseA    ribonuclease A
rpm       revolutions per minute
rRNA      ribosomal RNA
SDS       sodium dodecyl sulfate
TBE       Tris-borate EDTA buffer
TEMED     N,N,N’,N’-tetramethylethylenediamine
Tm        melting temperature of duplex DNA
Tris      tris(hydroxymethyl)aminomethane
tRNA      transfer RNA
U         unit(s) of enzyme activity
UV        ultraviolet light
X-gal     5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside
W.T.      wild type

CHAPTER 1

GENERAL INTRODUCTION

Archaea and the Three Domains of Life

Prior to 1977, it was thought that all life forms could be simply divided into two phylogenetically distinct kingdoms, the prokaryotes (no nuclei) and the eukaryotes (true nuclei). As it turned out, evolution has generated a more profound divergence in nature, as Woese and Fox discovered using 16S (or 18S) ribosomal RNA (rRNA) cataloguing (Woese and Fox, 1977). They found a group of prokaryotes that were more distinct from other prokaryotes than they were from eukaryotes. These “new” prokaryotes, the Archaebacteria (now the Archaea), encompass a group of organisms that flourish in extreme habitats such as high salt, high temperature, exclusively anoxic conditions, and extremely low temperature (Barns et al., 1994; DeLong et al., 1994).

As more data on Archaea accumulated from biochemical and molecular biological studies (Wolters and Erdmann, 1989), it became clear that the diversity of life could not be accurately reflected by the conventional five-kingdom taxonomy (Whittaker, 1959) or the eukaryote-prokaryote dichotomy. In 1990, Woese formally proposed a three-domain taxonomy (Woese et al., 1990) in which all of life was divided into three Domains, namely Bacteria, Archaea, and Eukarya. Each Domain was then further divided into kingdoms, the classical taxon. Other phylogenetic analyses of conserved macromolecules, such as DNA-dependent RNA polymerases (Iwabe et al., 1991; Puhler et al., 1989; Zillig et al., 1989), elongation factors (Baldauf et al., 1996), transfer RNA (tRNA) synthetases (Brown and Doolittle, 1995), and ATPases (Iwabe et al., 1989) also confirmed the validity of the three-domain proposal. Based on the 16S rRNA sequences and ancestral gene duplications, Olsen and Woese proposed the rooted universal tree shown in Figure 1.1 (Olsen and Woese, 1993). In this tree, the Archaea and the Eukarya diverge from a common ancestor as sister groups, and the inferred root is located between the Bacteria and Archaea/Eukarya linkages. The Archaea branch is further divided into two kingdoms, the Euryarchaeota and the Crenarchaeota, which comprise four distinct phenotypes. The crenarchaeotes are mainly sulfur-dependent thermophilic archaea; in contrast, the euryarchaeotes are phenotypically diverse, with members including the three groups of methanogens (Methanococcales, Methanobacteriales, and Methanomicrobiales), the extreme halophiles, and Thermococcales. Alternatively, phylogenetic analysis based on Hsp70 proteins suggests a close relationship between the Archaea and the Gram-positive bacteria (Gupta and Singh, 1994; Gupta et al., 1997) and a chimeric origin of the Eukarya (Gupta and Golding, 1996).

Since their discovery, the Archaea have commanded the attention of many researchers. Understanding the molecular properties of the Archaea might help to elucidate the evolutionary processes that led the primitive ancestor to modern day organisms and to understand the molecular mechanism of adaptation to extreme environments. Our knowledge of the Archaea has grown significantly in the past two decades, yet perhaps one of the most exciting events in Archaea research occurred in 1996 when the entire genome of Methanococcus jannaschii, a strict anaerobic methanogen, was sequenced (Bult et al., 1996). With the elucidation of the first archaeal genome, we are now able to “measure” the similarities and dissimilarities between the Archaea, Bacteria, and Eukarya by comparing the properties of representative proteins and RNAs from each domain, rather than generalizing from a limited number of molecules.

Figure 1.1: A rooted universal phylogenetic tree. The order and the lengths of the branches were determined using 16S rRNA phylogeny, and the root was inferred from ancestral gene duplications. Adapted from Olsen and Woese (1993).

Analysis (Bult et al., 1996) of the M. jannaschii 1.73-megabase pair genome revealed that only 38% of the 1738 predicted M. jannaschii protein-coding genes could be assigned potential functions. Homologs have not been identified for the remaining genes based on sequence comparisons. Furthermore, out of the 38% matches, only low homologies are shared between M. jannaschii and the previously sequenced bacterial genomes of Haemophilus influenzae (11%) and Mycoplasma genitalium (17%). The majority of genes in M. jannaschii that are most similar to those found in Bacteria are involved in energy production, cell division, and metabolism. However, most of the M. jannaschii genes involved in the information-processing systems (transcription, translation, and replication) are more similar to those found in Eukarya. The information obtained from analyzing the M. jannaschii genome is consistent with previous findings that point to the mosaic nature of the archaeal genome (Keeling et al., 1994; Ouzounis and Kyrpides, 1996). A comparison of selected molecular features among the three domains is listed in Table 1.1. A more detailed discussion of the archaeal transcription machinery and its regulation will be presented in the following sections.
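To make the genome figures quoted above concrete, the short sketch below converts the reported percentage into approximate gene counts. It is an illustration only: the function name and the rounding are assumptions, and the "bp per gene" value is a crude density estimate (genome size divided by gene number), not an average gene length.

```python
# Figures quoted in the text for the M. jannaschii genome (Bult et al., 1996).
GENOME_SIZE_BP = 1_730_000   # 1.73 megabase pairs
PREDICTED_GENES = 1738       # predicted protein-coding genes
FRACTION_ASSIGNED = 0.38     # share of genes assigned a potential function


def summarize(genes: int, fraction_assigned: float, genome_bp: int) -> dict:
    """Convert the quoted percentage into approximate gene counts (hypothetical helper)."""
    assigned = round(genes * fraction_assigned)
    return {
        "assigned": assigned,                  # genes with a potential function
        "unassigned": genes - assigned,        # genes without an identified homolog
        "avg_bp_per_gene": genome_bp / genes,  # rough gene density, not gene length
    }


summary = summarize(PREDICTED_GENES, FRACTION_ASSIGNED, GENOME_SIZE_BP)
print(summary)
```

Under these assumptions, roughly 660 genes carry a putative function and about 1080 remain unassigned, which is the scale of the "unknown" fraction discussed in the text.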

Gene Transcription

Studies of the archaeal transcription system in the past decade have greatly benefited from advances in techniques. Our understanding of archaeal RNA polymerase complexes, promoters, and eukaryal-like transcription initiation factors suggests that the archaeal transcription machinery presents both the clearest distinction between Archaea and Bacteria and the highest similarity between Archaea and Eukarya (Baumann et al., 1995; Langer et al., 1995; Zillig et al., 1993).

DNA-dependent RNA polymerases

Bacteria contain a single type of RNA polymerase (RNAP) for transcribing all of their genes. The bacterial RNAP holoenzyme is a simple complex containing the core enzyme and one of several specific sigma factors (σ-factors). The core enzyme, which consists of a β’ (165 kDa), a β (155 kDa), and two identical α (35 kDa) subunits, contains all the functions required for RNA synthesis (Eick et al., 1994): the β’ subunit is responsible for template DNA binding; the β subunit binds nucleotide substrates (Allison et al., 1985; Sweetser et al., 1987); and the α subunit is involved in RNAP assembly, promoter binding, and interactions with transcriptional activators (Gaal et al., 1996; Kato et al., 1996; Landini and Volkert, 1995). The specific σ-factor determines the promoter specificity and directs the core enzyme to the promoter (Haldenwang, 1995; Helmann, 1994).

Character                            Bacteria   Archaea   Eukarya

DNA packaging proteins (a)
    Histones                            -          +         +
    HU-protein                          +          +         -

Transcription units (b)
    Protein operons                     +          +         -
    rRNA operons (16S-23S-5S)           +          +/-       -

Transcription (c)
  RNA polymerase
    α2ββ’σ                              +          -         -
    complex                             -          +         +
  Promoter type
    (-35/-10)                           +          -         -
    TATA                                -          +         +
  Termination signal
    Hairpin-(U)n                        +          -/+       -
    Oligo-T                             -          +         +
  Transcription factors
    TBP                                 -          +         +
    TFIIB                               -          +         +

Intron occurrence (d)
    Spliceosome                         -          -         +
    Group I                             +          (+)       +
    Group II                            +          -         +
    tRNA                                -          +         +

RNA processing (e)
  mRNA
    5’ cap                              -          -         +
    3’ polyA                            -          +/-       +
    Editing                             -          -         +
  rRNA
    protein endonuclease                +          (+)       (+)
    protein+RNA endonuclease            -          ?         +
    fibrillarin                         -          +         +
  tRNA
    RNaseP RNA ribozyme                 +          +/-       -
    CCA addition                        +/-        +/-       -
    Base modification                   +          ++        +

Translation (f)
    Initiation with tRNAfMet            +          -         -
    Shine-Dalgarno interaction          +          +/-       -
    Protein splicing                    +          +         +

Table 1.1: Comparison of the information processing system in three domains. +/- indicates a coexistence of positive and negative conditions; (+) indicates that further evidence is still needed to establish the molecular feature. a: Darcy et al., 1995; DeLange et al., 1981; Ouzounis and Kyrpides, 1996; Sandman et al., 1990; Sandman et al., 1994; Starich et al., 1996. b: Brown et al., 1989; Ramirez et al., 1989; Shimmin et al., 1989. c: Iwabe et al., 1991; Langer et al., 1995. d: Daniels et al., 1985a; Datta et al., 1989; Haas et al., 1989. e: Amiri, 1994; Reiter et al., 1990; Shimmin and Dennis, 1996; Thompson et al., 1989; Thompson and Daniels, 1988. f: Hodges et al., 1992.

In contrast, the Eukarya have three nuclear RNAPs, each specializing in the synthesis of different classes of RNAs (Eick et al., 1994): polymerase I (pol I) transcribes rRNA, polymerase II (pol II) transcribes messenger RNA (mRNA) and several small nuclear RNAs (snRNA), and polymerase III (pol III) transcribes tRNA, 5S rRNA, and small cellular and viral RNAs. Pols I, II, and III are protein complexes, each comprising 10 to 15 subunits, which are evolutionarily related in all three RNAPs (Sentenac, 1985; Woychik et al., 1990) (see Figure 1.2 and Table 1.2). The two largest subunits of eukaryal RNAPs are the structural and functional homologs of the bacterial β and β’ subunits (Allison et al., 1985; Sweetser et al., 1987; Woychik and Young, 1994). Two other subunits (AC40 and AC19), which share only limited sequence identity with the bacterial α subunits, contain a highly conserved 20-amino-acid region of α subunits directly related to the bacterial RNAP assembly (Woychik and Young, 1994). Thus, they are likely to also play a role in RNA polymerase subunit interactions in Eukarya. The roles of other eukaryal RNAP subunits are not yet well defined.

Figure 1.2: Schematic depicting the separation of eukaryal, archaeal, and bacterial DNA-dependent RNA polymerase subunits in an SDS-polyacrylamide gel. Homologous subunits are given an identical pattern design, and the apparent molecular weights for some of the subunits are indicated at the left. Pol I, II, and III are the three eukaryal nuclear RNAPs from Saccharomyces cerevisiae; the archaeal B-type RNAP is from Sulfolobus acidocaldarius, and the archaeal B’B”-type RNAP is from Halobacterium halobium; the bacterial RNAP is from Escherichia coli. The capital letters designate the archaeal and the eukaryal large subunits, and the Greek letters designate the bacterial subunits. Redrawn from Klenk et al. (1994).

                 E. coli RNAP                          S. cerevisiae RNAP
Subunits of      Homologous  %           %             Homologous         %           %
S. ac. RNAP      subunit     similarity  identity      subunit            similarity  identity

B                β           54          30            B150               65          44
A’               β’          51          29            B220 (first 2/3)   63          43
A”               β’          54          28            B220 (last 1/3)    59          33
D                α           46          22            AC40               57          34
E                                                      B16                46          23
F
G
H                                                      ABC27 (last 1/3)   55          40
I
K                                                      ABC23 (last 3/4)   61          38
L                                                      AC19 (last 3/4)    62          35
M
N                                                      ABC10β             67          52

Table 1.2: Protein sequence similarities and relatedness of the Sulfolobus acidocaldarius RNAP subunits to the corresponding bacterial and eukaryal RNAP subunits (Langer et al., 1995; Lanzendorfer et al., 1994).

Although a single enzyme complex, the archaeal RNAP has been shown by genetic and biochemical analyses to be comparable to the eukaryal pols I, II, and III in terms of numbers, sizes, and sequence homologies of the subunits (Klenk et al., 1992; Klenk et al., 1994; Langer et al., 1994; Leffers et al., 1989; Madon et al., 1983; Prangishvilli et al., 1982; Puhler et al., 1989; Puhler et al., 1989; Schnabel et al., 1982; Zillig et al., 1979). Among all the archaeal RNAPs, the most studied is the RNAP of Sulfolobus acidocaldarius, which comprises 15 single subunits; all but one of the subunit-coding genes have been sequenced (Langer et al., 1995; Lanzendorfer et al., 1994). Similarly, as many as eleven genes encoding the RNAP subunits were identified in the M. jannaschii genome (Bult et al., 1996). Although the three and five largest subunits of the S. acidocaldarius RNAP and the M. jannaschii RNAP, respectively, are homologs of both the two largest eukaryal subunits and the bacterial β and β’ subunits, the archaeal and the eukaryal RNAP subunits share immunological cross-reactivity and higher sequence identity and thus appear to be more closely related to each other. In addition, other archaeal RNAP small subunits, which appear to have eukaryal homologs, are absent in bacteria. Archaeal RNAPs are also resistant to the potent bacterial RNA polymerase inhibitors rifampicin and streptolydigin (Leffers et al., 1989; Yang and Price, 1995; Zavriev and Shemyakin, 1982). A comparison of the RNAP components in the three domains is presented in Figure 1.2, and the sequence similarities of the homologous subunits are listed in Table 1.2.

Promoter, Transcription Initiation Factors, and Initiation Complex Formation

Typical bacterial promoters consist of two consensus hexamer sequences separated by 16 to 18 bps and centered at positions -35 and -10 relative to the transcription start site. For transcription initiation to occur, the σ factor in the preassembled RNAP holoenzyme complex contacts the -35 and -10 regions and directs the binding of the RNAP to specific promoters (Busby and Ebright, 1994; Eick et al., 1994; Haldenwang, 1995; Helmann, 1994). The E. coli “housekeeping” σ factor (σ70) is the product of the rpoD gene and recognizes the consensus sequences TTGACA in the -35 region and TATAAT in the -10 region. The RNAP recruiting event is followed by partial melting of the promoter, which results in an open promoter-RNAP complex containing approximately 12 base pairs of melted DNA (the DNA bubble), repeating abortive initiation, and then productive elongation [reviewed in (Eick et al., 1994; Yager and von Hippel, 1987)]. DNA footprinting of the E. coli open promoter-RNAP complex yielded a protected region spanning from -57 to +20 (Krummel and Chamberlin, 1989).
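The promoter architecture described above (two consensus hexamers separated by 16 to 18 bp) can be illustrated with a minimal sequence scan. This is a sketch under simplifying assumptions, not a promoter-prediction method: it accepts only exact matches to TTGACA and TATAAT, whereas natural promoters usually deviate from the consensus at one or more positions. The function name and the toy sequence are hypothetical.

```python
import re

MINUS_35 = "TTGACA"  # consensus hexamer in the -35 region
MINUS_10 = "TATAAT"  # consensus hexamer in the -10 region


def find_sigma70_like_promoters(seq: str, min_spacer: int = 16, max_spacer: int = 18):
    """Return (start of -35 box, spacer length, start of -10 box) for exact consensus matches.

    Positions are 0-based indices into seq. A lookahead is used so that
    overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(
        rf"(?=({MINUS_35})([ACGT]{{{min_spacer},{max_spacer}}}?)({MINUS_10}))"
    )
    return [(m.start(1), len(m.group(2)), m.start(3)) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical): both hexamers separated by a 17-bp spacer.
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
print(find_sigma70_like_promoters(toy))
```

A practical search would replace the exact-match pattern with a position weight matrix or a mismatch tolerance; the exact-match version is kept here only because it mirrors the consensus as stated in the text.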

In addition to the homology of their RNAPs, the similarities between the archaeal and the pol II transcription systems also extend to the architecture of their promoters

(Figure 1.3). The archaeal consensus promoter, deduced from sequence comparison

(Brown et al., 1989; Zillig et al., 1993) and mutation analysis in vitro (Hain et al., 1992;

Reiter et al., 1990) and in vivo (Danner and Soppa, 1996; Palmer and Daniels, 1995), contains two conserved sequence elements, designated boxA and boxB. BoxA, which has the consensus sequence 5’-T/CTTAT/AA-3’, is situated 27±2 bps upstream from the transcription start site and resembles the eukaryal pol II TATA box (consensus 5’-TATAAA-3’) in both location and sequence (Smale, 1994). BoxB, which contains the transcription start site (usually at a purine residue), is generally less conserved at the sequence level (5’-T/CG/A-3’); nevertheless, its location resembles the weakly conserved

Halobacterial Consensus      T T T A A N C A T
Archaeal Consensus           T T T A T A C A
Eukaryal pol II Consensus    T A T A A A

Figure 1.3: Consensus of the TATA elements in the archaeal and eukaryal pol II promoters. (Palmer and Daniels, 1995; Hain et al., 1992; Smale, 1994)

(Py)2CA(Py)4 sequence at most pol II transcription start sites (Smale, 1994). Because of the resemblance between the archaeal and pol II promoters, the following discussion of eukaryal transcription initiation complex formation and elongation will focus on the pol II system.
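As an illustration of how the boxA description could be applied to a sequence, the following Python sketch searches for the boxA consensus (5’-T/CTTAT/AA-3’) at 27±2 bp upstream of a known transcription start site. It is a sketch under assumptions: the function name is hypothetical, the distance is measured from the fourth base of the hexamer (taken here as its approximate center), and `tss_index` is the 0-based index of the start-site base.

```python
import re

# boxA consensus 5'-T/CTTAT/AA-3' written as a regular expression.
# The lookahead lets overlapping candidates be reported as well.
BOXA = re.compile(r"(?=([TC]TTA[TA]A))")


def find_boxa(seq, tss_index, center=27, tol=2):
    """Return (start, match, distance) for boxA-like hexamers whose
    approximate center lies center +/- tol bp upstream of tss_index.

    Measuring from the fourth base of the hexamer is an assumption made
    for this example; the text does not state the reference point.
    """
    hits = []
    for m in BOXA.finditer(seq.upper()):
        start = m.start(1)
        mid = start + 3                 # fourth base of the hexamer
        dist = tss_index - mid          # bp upstream of the start site
        if abs(dist - center) <= tol:
            hits.append((start, m.group(1), dist))
    return hits
```

Relaxing `tol` or replacing the regular expression with a mismatch-tolerant comparison would be natural variations, for example when scanning halobacterial promoters whose consensus differs slightly.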

Unlike the less complicated initiation complex formation in bacteria, transcription initiation by RNA pol II is a multi-step process involving many additional initiation factors

[for details, see recent reviews (Baumann et al., 1995; Conaway and Conaway, 1993; Eick et al., 1994; Gralla, 1996; Roeder, 1996)]. In the first step, TFIID [a complex of TATA-binding protein (TBP) and its associated factors (Tjian, 1996)] binds to the core promoter via the TBP component to form the initial complex. This serves as the recognition site for pol II. TFIIA, which stabilizes TFIID binding at the promoter, and TFIIB, which acts as a bridge between the incoming pol II and the initial complex, enter the complex. Pol II, along with other basal transcription factors (TFIIE, TFIIF, TFIIH), is then recruited to the complex to form a complete preinitiation complex. The fully assembled preinitiation complex (closed complex) is subsequently converted into an activated complex (open complex). This conversion step is accompanied by the addition of phosphate groups to the carboxy terminus of the largest subunit of pol II (Jiang et al., 1996; Jiang et al., 1993;

Feaver et al., 1991). The complex is now committed to transcription. DNase I footprinting analysis using the human AdML promoter showed that the fully assembled preinitiation complex protected a region from -40 to +30 (Buratowski et al., 1989).

Interestingly, despite the complexity of the initiation complex in vivo, TBP and TFIIB alone were sufficient for initiating specific transcription by RNA polymerase II in vitro

(Tyree et al., 1993), therefore representing a minimal set of pol II transcription initiation factors. The TBP-TATA interaction is the first event in the initiation, and recent evidence

(Leuther et al., 1996) suggests that TFIIB plays a crucial role in start site selection. Both

TBP and TFIIB homologs have been identified in the Archaea (see below).

In light of the similarities between the archaeal and the eukaryal transcription initiation machinery, it was not surprising to discover that, like the eukaryal RNAPs, the archaeal RNAP alone is unable to initiate specific promoter binding (Frey et al., 1990; Hudepohl et al., 1990) and requires the assistance of eukaryal-like transcription initiation factors (Thomm et al., 1994; for review see Thomm, 1996). The first report of a eukaryal-like homolog in the Archaea came in 1992. An open reading frame (ORF) potentially encoding the TFIIB homolog (TFB) in Pyrococcus woesei was found to share 32% to 36% sequence identity with its eukaryal counterparts (Ouzounis and Sander, 1992). Since then, the TFIIB homologs have been identified and purified from Methanococcus and Pyrococcus (Gohl et al., 1995; Hausner et al., 1996; Hethke et al., 1996; Thomm et al., 1994). Recently, the putative TFIIB-encoding genes from S. shibatae (Qureshi et al., 1995.b) and from H. volcanii (Palmer, unpublished data) were also cloned. Furthermore, the eukaryal-like TBP was also reported in a variety of Archaea: Thermococcus celer (Marsh et al., 1994), P. woesei (Rowlands et al., 1994), P. furiosus (Hethke et al., 1996), M. thermolithotrophicus (Gohl et al., 1995), S. shibatae (Qureshi et al., 1995.a), and H. volcanii (Palmer, unpublished). These potential TBP-encoding genes displayed approximately 40% sequence identity with human TBP. In addition to the overall sequence conservation, the structural and functional motifs conserved in the eukaryal TBP and TFIIB proteins are also present in the archaeal homologs. The structural relatedness of the archaeal and eukaryal TBPs has been demonstrated with the recent solution of the 3-D structure of P. woesei TBP (DeDecker et al., 1996). This protein shares all of the structural features of the eukaryal TBPs and retains the potential for all reported DNA-protein contacts and TBP-TFIIB interaction. Moreover, functional studies have shown that the Methanococcus TBP could be replaced by yeast or human TBP in an in vitro transcription system (Wettach et al., 1995), that the recombinant S. shibatae TBP could cooperate with the purified RNAP to direct specific transcription (Qureshi et al., 1995.a), and that the P. woesei TBP interacted specifically with the promoter and facilitated TFB binding (Rowlands et al., 1994). Recently, Hausner et al. (1996) reported that recombinant P. woesei TBP and TFIIB homologs could cooperate with P. woesei RNAP to accurately initiate transcription and generate specific footprints on the DNA template. Based on functional studies and sequence analysis, there is no evidence for the presence of other eukaryal basal transcription factors in the Archaea.

Taken together, the archaeal transcription initiation process might resemble a simpler version of the eukaryal mechanism, involving mainly the promoter, the RNAP, and the two eukaryal-like basal transcription factors, TBP and TFIIB.

Transcription Elongation

Before engaging in transcription elongation, the bacterial RNAP undergoes many cycles of abortive initiation. As the nascent transcript in the transcription bubble grows beyond eight or nine residues, the transcription complex becomes much more processive and enters the elongation stage, in which the DNA bubble grows to 18 base pairs in length

(Yager and von Hippel, 1987). The beginning of transcription elongation is signaled by the translocation of the enzyme away from the promoter, the release of the sigma factor, and the formation of the elongation complex with the assistance of two elongation factors,

GreA and GreB (Hsu et al., 1995). The GreA/B factors also play a role in the elongation process by stabilizing the ternary elongation complex (Altmann et al., 1994) and

antagonizing the action of natural elongation-arrest sites that trap the advancing complex via a cleavage and restart action (Borukhov et al., 1993; Orlova et al., 1995; Reines,

1994). Two protein factors, NusA and NusG, which are involved in termination and antitermination, also play a role in elongation. NusA slows down RNAP (DeVito and

Das, 1994), whereas NusG stimulates transcription elongation (Burova et al., 1995; reviewed in Platt, 1996).

After the preinitiation complex is assembled in Eukarya, an ATP-dependent activation step results in the unwinding of the promoter DNA and the formation of an open complex at the transcription start site. These events further contribute to the conformational change of the preinitiation complex into an active form. This phase is characterized by the dissociation of the basal transcription factors, and, eventually, a transition to productive transcription elongation (Conaway and Conaway, 1993; Kane,

1994). Five factors involved in pol II transcription elongation have been identified so far, namely P-TEFb, TFIIS, TFIIF, Elongin and ELL, reviewed in (Kane, 1994; Reines et al.,

1996). In particular, TFIIF was found to stimulate the rate of elongation (Yankulov et al.,

1996), and the highly conserved pol II transcriptional elongation factor TFIIS has antiattenuation activity (Agarwal et al., 1991; Archambault et al., 1992; Bengal et al.,

1991). By binding to pol II, TFIIS may suppress transient pausing of the elongation complex induced by intrinsic pausing or termination sites, and sequence-specific DNA-binding proteins (Takagi et al., 1995). This action maintains the 3’ hydroxyl terminus of the nascent RNA chain in its proper position in the polymerase active site.

TFIIS also has the cleavage and restart capabilities of the bacterial GreA/B factors

(Cipres-Palacin and Kane, 1994; Reines, 1994).

In Archaea, not much is known about the transcription elongation process and the factors involved. An ORF encoding a protein with a C-domain that shares sequence identity with TFIIS was identified downstream from an RNAP subunit L-encoding gene in S. acidocaldarius (Langer and Zillig, 1993). A putative TFIIS homolog sharing 59% protein sequence identity with its eukaryal counterpart was also found in the M. jannaschii genome (Bult et al., 1996). NusA and NusG homologs were also identified in M. jannaschii.

Our view of elongation as a steady process in both Bacteria and Eukarya has changed. Recent findings, such as the various sizes of the RNAP footprints (Krummel and

Chamberlin, 1992.a,b; Nudler et al., 1994), the changing distance between the front and rear ends of the enzyme from its catalytic site, and the expansion and contraction of the transcription bubble (Zaychikov et al., 1995), suggest that movement of the elongation complex is discontinuous (Chan and Landick, 1994). It has been proposed, therefore, that elongation occurs by a process called “inchworming” in which RNAP moves along the template in ~10-nucleotide cycles (Chamberlin, 1995; Nudler et al., 1994).

Transcription Termination

Despite facing many antagonistic factors, such as nucleotide starvation and the presence of DNA template structures, the elongation complexes in Bacteria and Eukarya remain remarkably stable and progress at an average rate of approximately 40

nucleotides/sec. Therefore, for transcription termination to take place, specific strategies must be utilized by the cells. Generally, there are two types of terminators: the intrinsic terminators, which rely mainly on the nucleic acid sequence or structure, and the factor-dependent terminators, which require specific interactions between cellular factors and DNA/RNA sequences (Kerppola and Kane, 1991; Platt, 1996). For both systems, components triggering the pausing of the RNAP and the release of the transcript are essential. Although the mechanism responsible for the final release of the transcript has not been clearly defined, recent evidence (Markovtsov et al., 1996; Wang et al., 1995) supports a new model, in which the stable occupancy of the RNAP “exit channel” by the growing transcript determines the processivity of the elongation complex (Nudler et al., 1996). Therefore, the secondary structure of the transcript or the binding of protein factors causes termination by stripping the transcript away from the exit channel.

Bacteria Domain. Bacterial RNA polymerases escape the elongation process by utilizing

ρ (Rho)-independent or ρ-dependent terminators (Platt, 1986; Yager and von Hippel,

1987). A ρ-independent terminator is an intrinsic terminator characterized by a special type of DNA sequence that gives rise to a GC-rich hairpin loop followed by a tract of U

residues in the RNA transcript. A direct correlation between the RNA transcript displacement from the transcription bubble and the termination efficiency indicated that the formation of the transcript hairpin structure is essential for this type of intrinsic terminator

(Daube et al., 1994). Perhaps such a secondary structure in the nascent transcript elicits an elongation pause and disrupts the interaction between the transcript and the RNAP, particularly in the “exit cleft” (Kerppola and Kane, 1991). Traditionally, it was believed

that the role of the tract of U residues was to contribute to the instability of the dA:rU hybrid (Platt, 1986; Yager and von Hippel, 1987). This might have been an oversimplification, since neither the free energy of hairpin formation nor the length of the U-tract alone could explain the termination efficiency (d'Aubenton Carafa et al., 1990; Reynolds and Chamberlin, 1992). Interestingly, the distance from the stem of the hairpin to the 3’ end of the transcript located within the U-tract seems to play a role: within a 6- to 8-nucleotide distance, the U-tract efficiently facilitates transcript dissociation. When the distance is increased to 10 to 12 nucleotides, only pausing, induced by the hairpin, occurs (Platt, 1996). Furthermore, sequence variation within the stem-loop (Cheng et al., 1991) and the flanking sequence can also affect termination efficiency (Reynolds and Chamberlin,

1992).
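The two features named above for a ρ-independent terminator, a GC-rich hairpin followed by a U-tract, can be expressed as a simple pattern search. The Python sketch below is illustrative only and rests on several assumptions that do not come from the text: a perfect stem of fixed length, a loop of 3 to 8 nucleotides, a GC-fraction cutoff, and a window of 8 nucleotides after the stem in which a run of U residues must occur. All names and thresholds are hypothetical, and, as noted above, such sequence features alone do not predict termination efficiency.

```python
COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMP[b] for b in reversed(rna))


def find_intrinsic_terminators(rna, stem=5, loops=range(3, 9),
                               min_gc=0.6, min_u=4, window=8):
    """Return (stem_start, stem_end, loop_len) for candidate terminators.

    A candidate is a perfect stem of `stem` bp with GC fraction >= min_gc,
    a loop whose length is in `loops`, and a run of >= min_u U residues
    within `window` nt after the 3' arm. All thresholds are assumed values.
    """
    rna = rna.upper().replace("T", "U")
    n = len(rna)
    hits = []
    for i in range(n - 2 * stem - min(loops) + 1):
        left = rna[i:i + stem]
        if set(left) - set(COMP):          # skip ambiguous bases
            continue
        gc = (left.count("G") + left.count("C")) / stem
        if gc < min_gc:
            continue
        for loop in loops:
            j = i + stem + loop            # start of the 3' arm
            right = rna[j:j + stem]
            if len(right) < stem:
                break
            if right != revcomp(left):
                continue
            tail = rna[j + stem:j + stem + window]
            if "U" * min_u in tail:
                hits.append((i, j + stem, loop))
    return hits
```

The `window` parameter loosely mirrors the observation that the U-tract acts efficiently only within a short distance of the hairpin stem; a more realistic search would allow mismatches and G:U pairs in the stem and would score hairpin stability.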

Of all the types of protein factor-induced termination, the bacterial ρ-dependent termination has been studied in the most detail. The active E. coli ρ factor is a homohexamer; each monomer (46 kDa) contains specific RNA-binding and ATP-binding domains. The current working model for ρ action favors a tethered tracking mechanism (Steinmetz and Platt, 1994). Upon binding to RNA and assisted by RNA-dependent ATP hydrolysis, the ρ protein can translocate itself on the RNA towards the 3’ end (as far as 50 to 100 nucleotides or more), catch up to the elongation complex, unwind the RNA-DNA duplex, and eventually cause the dissociation of RNA polymerase from the RNA transcript and the release of the nascent RNA (Platt, 1994). Despite considerable investigation, no simple consensus sequence, pattern, or structure correlates with termination efficiency or ρ-binding. However, ρ-binding favors -rich sequences (Schneider et al., 1993) and the binding affinity highly depends on the cytosine content (Wang and von Hippel,

1993), although there is still controversy over whether the content or the position of the cytosine residues determines the binding (Platt, 1994).

Eukarya Domain. In the Eukarya, a complex hierarchy of transcription termination sites suggests a much more sophisticated transcription system. All of the pol I terminators examined are orientation-dependent and contain a protein-binding site (pausing element) located downstream from the mapped 3’ terminus of the transcript (Reeder and Lang, 1994). Transcription termination of pol I requires species- and sequence-specific protein factor binding to the pausing element. One of the best studied pol I termination factors, murine TTFI (polymerase I ), binds to a SalI box, 5’-AGGTCGACCAGA/TA/TNTCG-3’, located 18 bases downstream from the 3’ end of the pre-rRNA (Smid et al., 1992). The TTFI homolog in yeast, Reb1p, binds DNA at the region located 108 bps downstream from the mature 25S rRNA (Lang and Reeder, 1993). The binding of Reb1p at the pausing element serves as a roadblock and does not appear to require specific interaction between the termination factor and pol I (Lang et al., 1994). In addition to the pausing site, efficient termination also demands a 5’ releasing-sequence element which works in concert with the termination factor-induced pausing and results in the release of the transcript. Studies on the yeast Reb1p system revealed that the release element is a 12-nucleotide T-rich region (on the nontemplate strand) corresponding to the 3’-terminal segment of the transcript (Lang and Reeder, 1995). Furthermore, the 5’ flanking sequences of the releasing site appear to influence the position and efficiency of 3’ end formation (Reeder and Lang, 1994).
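Degenerate consensus sequences such as the one quoted above for the TTFI binding site use a slash to mark alternative bases at one position and N for any base. As a reading aid, the following Python sketch converts that notation into a regular expression; the function name is hypothetical and the conversion rule (slash-joined bases form one position, N matches A, C, G, or T) is the assumption made here.

```python
import re


def consensus_to_regex(consensus):
    """Convert slash notation (e.g. 'A/T' = A or T at one position,
    'N' = any base) into a regular-expression string."""
    tokens = re.findall(r"[ACGTN](?:/[ACGTN])*", consensus.upper())
    parts = []
    for tok in tokens:
        bases = tok.split("/")
        if "N" in bases:
            parts.append("[ACGT]")
        elif len(bases) == 1:
            parts.append(bases[0])
        else:
            parts.append("[" + "".join(bases) + "]")
    return "".join(parts)


# Consensus as quoted in the text for the murine TTFI binding site.
SAL_BOX = re.compile(consensus_to_regex("AGGTCGACCAGA/TA/TNTCG"))
```

The same helper reads the archaeal boxA consensus, T/CTTAT/AA, as the pattern [TC]TTA[TA]A.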

Pol II terminators are still poorly understood. The difficulty in studying pol II termination is largely due to the 3’ mRNA processing, in which longer transcripts (up to many kilobases) are synthesized and trimmed back and a poly(A) tail is added to the 3’ end.

Besides T-rich regions on the nontemplate DNA strand, no specific sequence or structure motif of the terminators has been identified so far. However, the fact that not all T-tracts act as intrinsic terminators (Kerppola and Kane, 1990; Reines et al., 1987) and that the termination efficiency does not directly correlate with the number of Ts (Kerppola and

Kane, 1990) suggests that additional factors are involved in transcription termination.

Enhanced termination by binding of an AT-rich minor groove-binding peptide, netropsin, at the termination site (Ueno et al., 1992) and electrophoretic mobility analysis (Kerppola and Kane, 1990) of the human H.H3 terminator, 5’-TTTTTTTTCCTTTTT-3’, suggest that DNA bending could also play a role in determining the termination efficiency.

Moreover, the termination event seems to be affected by sequences flanking the terminator

(Kerppola and Kane, 1990), efficiency of 3’ processing (Edwalds-Gilbert et al., 1993), appropriate spacing between the termination element and poly(A)-processing site

(Edwalds-Gilbert et al., 1993; Tantravahi et al., 1993), and potential binding of protein factors (Roberts et al., 1992).

Among the three eukaryal RNAPs, pol III terminators appear to be the simplest. The general consensus of a pol III terminator is a run of Ts (> 4) on the nontemplate strand of the transcribed gene (Bogenhagen and Brown, 1981; Mazabraud et al., 1987), although, in some cases, A-clusters or AT-rich regions are used (Hess et al.,

1985; Matsummoto et al., 1989). There appears to be no general secondary structure

requirement, yet two nucleotides (preferentially G and C) immediately adjacent to the sequence seem critical in determining termination efficiency (Bogenhagen and Brown,

1981). In addition, the possible involvement of termination factors has been demonstrated by studying the action of the La protein in transcription termination. La protein binds to uridine residues at the 3' end of the transcript and possibly alters the conformation of the terminator region (Gottlieb and Steitz, 1989). This binding results in the release of the transcript and recycling of pol III by an ATPase-dependent helicase activity (Maraia,

1996; Maraia et al., 1994). A two-step model for pol III termination is supported by evidence in which pol III pausing and transcript release were both shown to be required for an efficient termination event and could be experimentally uncoupled (Campbell and

Setzer, 1992).

Archaea Domain. Despite the recent advances in understanding the archaeal transcription system, little is known about archaeal transcription termination. The only functional analysis of the archaeal termination signal reported to date was based on an in vitro study of the oligo-dT sequence at the 3' end of the Methanococcus tRNAval gene

(Thomm et al., 1994). This study identified the archaeal terminator as a eukaryal-like termination element, 5’-TTTTAATTTT-3’. Beyond this in vitro study, our understanding of archaeal transcription terminators is based mainly on comparing the sequences located at or close to the 3' end of transcripts. According to a survey of primary sequences at the transcription termination regions of archaeal genes (Brown et al., 1989), there appear to be two types of sequences, one that resembles the bacterial p-independent terminator and one that is similar to eukaryal T-tract terminators. However, as more genes are sequenced

and the 3' ends of the transcripts are mapped, the common feature of termination regions is a T-tract on the nontemplate strand (Chapter 3). While there appears to be no conserved secondary structure or element besides the conserved T-rich sequence, archaeal transcription termination, similar to the initiation and elongation machinery, might also prove to resemble the complex eukaryal system. However, mapping the 3’ termini of transcripts cannot distinguish a processing site from a true termination site, nor does it accurately predict termination elements. Therefore, more functional studies of archaeal termination signals are essential.
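The kind of sequence survey described above, locating T-tracts on the nontemplate strand relative to a mapped transcript 3' end, can be sketched in a few lines of Python. This is a hedged illustration: the function names, the minimum tract length, and the distance cutoff are assumptions chosen for the example, and, as stated above, finding a T-tract near a mapped 3' end does not by itself distinguish termination from processing.

```python
import re


def t_tracts(nontemplate, min_len=5):
    """Return (start, end, length) for each run of >= min_len T residues.

    Coordinates are 0-based and end-exclusive. min_len is an assumed cutoff.
    """
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), m.end(), m.end() - m.start())
            for m in pattern.finditer(nontemplate.upper())]


def tracts_near_end(nontemplate, three_prime_index, min_len=5, max_dist=20):
    """T-tracts within max_dist nt of a mapped 3' end (0-based index).

    The distance is 0 when the mapped end lies inside the tract.
    """
    near = []
    for start, end, length in t_tracts(nontemplate, min_len):
        if start <= three_prime_index < end:
            dist = 0
        elif three_prime_index < start:
            dist = start - three_prime_index
        else:
            dist = three_prime_index - (end - 1)
        if dist <= max_dist:
            near.append((start, end, length, dist))
    return near
```

With `min_len=4`, the Methanococcus element 5’-TTTTAATTTT-3’ quoted above is reported as two adjacent tracts of four T residues each.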

Heat Shock Proteins: Structure and Function

All living organisms have evolved specific molecular responses in order to adapt to environmental stresses. In particular, the universally conserved heat shock response occurs when organisms are exposed to elevated temperatures and results in the rapid and transient induction of a set of heat shock proteins (HSPs). HSPs, also known as molecular chaperones, play important roles in facilitating in vivo protein folding and the assembly of multimeric structures; they achieve these goals by either blocking nonproductive protein-protein interactions or by sequestering the folding intermediates/denatured protein so that they can properly assume their native conformation (for detailed review, see Becker and Craig, 1994; Craig, 1993; Craig et al.,

1994; Georgopoulos, 1993; Hendrick and Hartl, 1995; Mager and Ferreira, 1993; Parsell and Lindquist, 1994). Upon heat stress, the accumulation of abnormally folded proteins in the cell might deplete the free pool of HSPs and lead to the increased expression of HSPs.

Several HSP families have been identified and, with the exception of the recently identified chaperonin family (CCT/TriC/TF55), most of the eukaryal HSPs are designated according to their apparent molecular mass, namely Hsp100, Hsp90, Hsp70 (DnaK in E. coli), Hsp60 (GroEL in E. coli), and other small HSP families. Although they are functionally distinct, members of the Hsp70 and Hsp60 families participate in a sequential protein-folding pathway during normal physiological conditions (protein maturation) and under cellular stress (anti-aggregation and folding) (for a review, see Frydman and Hartl, 1994; Georgopoulos, 1993; Hendrick and Hartl, 1995). The discussion in this section focuses mainly on three heat shock protein families: Hsp70, Hsp60 and TCP/CCT.

Hsp70 family

Members of the highly conserved Hsp70 family in eukaryal cells can either be synthesized constitutively or expressed only under heat shock conditions. The cellular distribution of these proteins includes the cytoplasm, nucleus, endoplasmic reticulum, mitochondria, and chloroplasts. E. coli has only one type of Hsp70, DnaK, which is expressed constitutively and further induced upon heat (or other metabolic) stress. All Hsp70 members share a consensus structural model which includes a highly conserved ATPase domain at the amino-terminal end and a less conserved peptide-binding domain at the carboxy terminus. Hsp70 binds to exposed hydrophobic regions of unfolded or misfolded proteins, such as nascent polypeptides on the ribosome (de novo synthesis) or heat-denatured proteins (thermal stress), and prevents them from aggregating. Upon binding ATP, Hsp70 undergoes a conformational change, releases the substrate proteins, and allows them to refold (reviewed by Frydman and Hartl, 1994; Hendrick and Hartl, 1995). In addition, the binding of Hsp70 to abnormal (damaged/denatured) proteins also facilitates the rapid degradation of these proteins via ATP-dependent proteases (Craig et al., 1994; Frydman and Hartl, 1994; Georgopoulos, 1993).

Another important physiological role of Hsp70 is in protein translocation

(Georgopoulos, 1993; Hendrick and Hartl, 1995). Since a stably folded structure prevents nascent proteins from crossing the bacterial inner membrane or entering the subcellular organelles (such as mitochondria, chloroplasts, or endoplasmic reticulum), secreted protein precursors must be maintained in a translocation-competent state posttranslationally. Although the nature of the action is still not fully understood, Hsp70 and other auxiliary cellular factors maintain the proteins in a structural state necessary for translocation.

Most of the cellular functions of Hsp70 are performed in conjunction with other factors. In the DnaK system in E. coli, for example, there is the 45 kDa chaperone DnaJ, whose gene is situated in the same operon as DnaK, and the nucleotide exchange factor

GrpE. The current model for Hsp70 in protein folding is best illustrated by the E. coli

DnaK/DnaJ/GrpE system (Frydman and Hartl, 1994). Under physiological conditions, DnaK is in an ATP-bound form, which has low affinity for DnaJ and unfolded polypeptides. Unfolded polypeptides first bind to DnaJ, which, in turn, triggers the ATP hydrolysis of DnaK and recruits it to form a ternary complex. GrpE then facilitates the ATP-ADP exchange on DnaK, and DnaK returns to the low-affinity ATP state and releases the polypeptides. Finally, the released polypeptides may fold to their native state, reenter the DnaK-DnaJ cycle, or join the GroEL/GroES complex for further folding

(described in the next section). In Eukarya, the identification of DnaJ homologs with similar regulatory roles in peptide binding suggests that the Hsp70 reaction cycle might be similar to that found in bacteria, except that Hsp70 ATP hydrolysis might play a different role (Minami et al., 1996).

Hsp60 family

The Hsp60 (or Cpn60) family is a class of highly conserved chaperonins found in bacterial cells (GroEL), mitochondria and chloroplasts [ribulose bisphosphate carboxylase

(Rubisco)-binding protein] (reviewed in Frydman and Hartl, 1994; Georgopoulos, 1993; Hendrick and Hartl, 1995; Horwich and Willison, 1993). Members of the Hsp60 family also bind newly synthesized or unfolded polypeptides. Their physiological role appears to be the folding of monomeric polypeptide chains in an ATP-dependent manner and the stabilization of these polypeptides in an “assembly-competent” state until they engage in productive interactions with other subunits. This protein-folding activity of Hsp60, although particularly important under thermal stress, is essential for cells at all temperatures. All Hsp60 family members share a conserved homooligomeric structure that contains two stacked rings, each containing seven 60 kDa subunits, with a central cavity which can accommodate 1 to 2 polypeptides. The crystal structure of E. coli

GroEL (Braig et al., 1994) confirmed the stacked double-ring structure and has defined the three dimensional structure of each subunit. It appears that a large portion of GroEL’s

functional surfaces, including ATP-binding pockets and substrate polypeptide- and

GroES-binding sites, are located on the wall of the central cavity and its invaginations.

The protein folding function of Hsp60 requires the participation of another protein,

Hsp10 (GroES in E. coli), a highly conserved single heptameric ring (seven 10 kDa subunits), which binds to one end of the Hsp60 double toroid. A recent study showed that Hsp10 from Mycobacterium leprae (Mande et al., 1996), upon associating with GroEL, provides a very hydrophilic surface to the central cavity and, therefore, might play an active role in assisting the protein-folding process. Chaperonin-assisted protein folding is a reaction cycle; the E. coli GroEL/GroES system is the best studied example of such a cycle (Martin et al., 1993; reviewed by Frydman and Hartl, 1994). At the non-binding stage, GroEL associates with GroES, and ADP is tightly bound to its subunits.

Upon protein binding, the unfolded protein enters the GroEL central cavity through the end not associated with GroES and binds to the ring. The binding of the protein substrate triggers the dissociation of ADP (exchange with ATP) and GroES. The binding of ATP to GroEL weakens its affinity for the bound protein and results in the reassociation of

GroES with the ring containing the substrate protein. Subsequent ATP hydrolysis, which possibly induces a conformational change in GroEL (Hayer-Hartl et al., 1996), causes the release of the substrate protein within the shielded ring cavity where it folds into its correct conformation. Eventually, regeneration of the ADP-state stabilizes the

GroEL/GroES interaction, and the partially folded protein substrate binds again to GroEL to start another folding cycle. These cycles continue until the protein achieves its native state and loses its affinity for GroEL. The ability of GroEL to interact with the side

chains of hydrophobic and polar residues might allow this chaperone to act as an amphiphilic organizer in the process of protein folding (Richarme and Kohiyama, 1994).

Both Hsp60 and Hsp10 are induced in response to environmental stresses and play significant roles under conditions that cause protein denaturation. At elevated temperatures, Hsp60 members were found to bind to native proteins in vivo and in vitro, to stabilize them, and to protect them from thermal inactivation (formation of protein aggregates). When the stress of the high temperature is removed, incubation of the thermally unfolded protein-GroEL complex with Mg/ATP and GroES in vitro allows the renaturation of the protein. In addition, higher levels of both Hsp60 and Hsp10 are required for the de novo folding of newly synthesized proteins under such stress conditions.

CCT/TriC/TCP-l/TF55 Family

Another recently identified chaperonin family, whose members include T-complex polypeptide 1 (TCP-1) from the eukaryal cytosol and Thermophilic factor-55 (TF55) from the Archaea, represents the newest addition to an already growing list of molecular chaperones (recently reviewed by Horwich and Willison, 1993; Kubota et al., 1995;

Willison and Kubota, 1994). Interestingly, whereas the archaeal chaperonins are heat-inducible (Trent et al., 1994; Trent et al., 1991; Trent et al., 1990; this study), transcription of the yeast TCP-1-encoding genes is depressed under heat shock (Ursic and

Culbertson, 1992). The nomenclature for this new family of chaperonins, however, still varies: for example, TCP (Gupta, 1995), CCT (Kubota et al., 1995; Willison and Kubota,

1994) for chaperonin-containing TCP-1, TriC (Frydman and Hartl, 1994) for TCP-1 ring complex, and TF55 (Trent et al., 1991). In this dissertation, this group of chaperonins is referred to as the CCT family, a name which has been proposed recently by Kubota et al.

(1995).

Although members of the CCT and Hsp60 families share some degree of sequence identity and resemblance in their double-stacked ring structures, they are significantly different in many ways (reviewed in Kubota et al., 1995). For example, all Hsp60 members display 7-fold symmetry, whereas CCT members display 8- or 9-fold symmetry

(Kagawa et al., 1995; Marco et al., 1994; Trent et al., 1991; Waldmann et al., 1995.a); the ring structure of Hsp60 is a homo-oligomeric complex, whereas most of the CCTs are hetero-oligomeric complexes, each composed of two (Waldmann et al., 1995.c) or eight to ten different subunits (Willison and Kubota, 1994); and the Hsp10 (GroES)-like cofactor is not present in the CCT system (Kubota et al., 1995). Phylogenetic analysis based on the amino acid sequences of the putative ATPase domains (Kubota et al., 1995) and of the overall Hsp60 and CCT proteins (Willison and Kubota, 1994) revealed that the chaperonins of bacteria and organelles (Hsp60) diverged from those of the eukaryal cytosol and the Archaea (CCT) early in chaperonin evolution.

Although they display more than two forms of structural symmetry and are different in the numbers of subunit types, members of the CCT family share a high degree of sequence identity and structural similarity to each other (Kubota et al., 1995). Based on their comparison with the GroEL crystal structure, it appears that the subunits of CCT family members, like their Hsp60 counterparts, also contain a conserved structural motif

involved in ATP binding. However, the putative polypeptide-binding regions of CCT subunits do not share significant identity among themselves or with subunits of Hsp60

(Kim et al., 1994). A plausible hypothesis is that the distinct subunits of CCT confer different substrate-binding specificities. Studies of protein folding by eukaryal CCT demonstrated that these proteins are capable of mediating the folding of a wide range of proteins, including actin, tubulin, neurofilament, luciferase (in firefly), and hepatitis B virus capsid (Chen et al., 1994; Frydman et al., 1992; Lingappa et al., 1994; Vinh and Drubin,

1994). Eukaryal CCTs also release newly folded proteins in an ATP-dependent fashion

(Gao et al., 1992; Yaffè et al., 1992). Although the underlying mechanism for protein folding by CCT is still not as well understood, studies on P-actin folding revealed that binding of the unfolded polypeptide to CCT, like Hsp60, also occurs within the inner channel of the chaperonin (Marco et al., 1994). Furthermore, two cofactors (unrelated to

HsplO/GroES in sequence and structure), which modulated the ATPase activity of CCT mimicking the role of HsplO/GroES, were found to be required for the folding of a- and p-tubulin (Gao et al., 1993).

Archaeal Heat Shock Proteins

Although only a small number of archaeal heat shock proteins and their biological roles have been reported so far, all species of Archaea examined do display the heat shock response (reviewed in Baross and Holden, 1996; Conway de Macario and Macario, 1994; Trent, 1996). Daniels et al. (1984) examined the synthesis of HSPs in seven halophiles (H. volcanii, H. trapanicum, H. marismortui, H. salinarium, R-4, and Y-11) by pulse-labeling the cellular proteins with S35-methionine under normal and heat-shock conditions. A limited number of HSPs (4 to 6) were induced and their apparent molecular weights clustered in the ranges of 105-75, 45-44, and 28-21 kDa. The identities of these proteins were not determined. Putative Hsp70 genes, based on deduced protein sequence comparison, have been identified from H. marismortui (Gupta and Singh, 1992) and H. cutirubrum (Gupta and Singh, 1994). However, their expression, particularly their heat-shock inducibility, has not been examined. Furthermore, direct evidence for the presence of other HSP or CCT homologs in halophiles has not been reported.

In methanogens, Hebert et al. (1991) examined the heat shock response of Methanococcus voltae and reported the induction of 11 HSPs which fell within an apparent molecular weight range of 18-90 kDa. Others have identified an ORF encoding an Hsp70 homolog from Methanosarcina mazei S6 which shares up to 65% protein sequence identity with the bacterial DnaKs (Macario et al., 1991). In addition, genes encoding the putative DnaJ (Macario et al., 1993) and GrpE (Conway de Macario et al., 1994) homologs were identified from two ORFs adjacent to the DnaK-encoding gene. The three M. mazei S6 genes are arranged in a cluster (GrpE-DnaK-DnaJ), expressed monocistronically, and are heat-shock inducible (Clarens et al., 1995; Conway De Macario et al., 1995). Recently, a high molecular weight protein complex, whose projected structure (based on transmission electron microscopy) displays double-stacked rings with 8-fold symmetry, was purified from a hyperthermophilic methanogen, Methanopyrus kandleri (Andra et al., 1996). This polypeptide complex, which was called thermosome after its Pyrodictium occultum counterpart, is homo-oligomeric and is a member of the newly identified chaperonin (CCT/TF55) family. The gene encoding the 59.5-kDa subunit was identified, and its deduced protein sequence shares 69.3% identity with a Pyrococcus heat shock protein.

For hyperthermophiles, a correlation between HSP induction and enhanced thermotolerance was observed in ES4 (Holden and Baross, 1993) and Sulfolobus sp. 312 (Trent et al., 1990). Although attempts to identify Hsp70 have not been successful (Trent, 1996), the presence of CCT has been documented in many hyperthermophiles, including Sulfolobus shibatae (Kagawa et al., 1995; Trent et al., 1991), S. solfataricus (Knapp et al., 1994; Marco et al., 1994), P. occultum (Phipps et al., 1991; Phipps et al., 1993), Desulfurococcus SY (Kagawa et al., 1995), and Thermoplasma acidophilum (Waldmann et al., 1995.a; Waldmann et al., 1995.b; Waldmann et al., 1995.c). The Sulfolobus shibatae TF55 complex (recently renamed rosettasome by Kagawa et al., 1995 or archaeosome by Quaite-Randall et al., 1995) was the first archaeal CCT member to be reported (Trent et al., 1991).

Sulfolobus shibatae TF55 is one of the most abundant cellular proteins, and its synthesis can be induced 8- to 9-fold upon heat shock, thus enhancing the thermotolerance of cells at lethal temperatures (Trent et al., 1994; Trent et al., 1991). The TF55 complex is composed of two stacked 9-membered rings containing α and β polypeptides in a 1:1 stoichiometry, possibly with one type of the two polypeptides in each ring (Trent et al., 1991; Kagawa et al., 1995). In vitro studies suggest that the TF55 complex exists in a cycle including two conformational states, an ATP-dependent closed complex and an open complex, which results from ATP hydrolysis of the closed complex (Quaite-Randall et al., 1995). The open complex then further dissociates into subunits, which can reassemble into the complex form. The equilibrium between the complexes and the free subunits is affected by the temperature and ATP concentrations. Functional studies have demonstrated that the TF55 complex is capable of interacting with unfolded polypeptides (Trent et al., 1991) in both conformational states and as free subunits in vitro (Quaite-Randall et al., 1995). Both α- and β-polypeptide-encoding genes have been identified, and their deduced protein sequences show 54% identity with each other and 35% identity with eukaryal CCT members (Kagawa et al., 1995; Trent et al., 1991). The expression of these two heat-inducible genes appears to be coregulated at the transcriptional level. Furthermore, polyclonal antibodies raised against the S. shibatae TF55 α- and β-polypeptides cross-react with proteins from seven different crenarchaea. Double-ring complexes showing 8- or 9-fold symmetry also have been detected in most of the archaeal CCTs reported for hyperthermophiles (except Desulfurococcus SY), and some of their chaperonin-related activities were reported (reviewed by Trent, 1996).

Mechanisms of Regulated Gene Expression and Heat Shock Genes as a Paradigm for Regulation

Bacterial and eukaryal cells have developed a variety of strategies to regulate gene expression. Here we discuss the general schemes for regulation of transcription observed in the three Domains. This is followed by a review of heat shock gene regulation, which exhibits characteristic features of the mechanisms used in these organisms.

Transcription Regulation Strategies in the Three Domains

Bacterial transcription regulation can occur at the initiation, elongation, or termination stages (Yanofsky, 1992). A common regulatory strategy at the transcription initiation stage involves the binding of positive-acting and/or negative-acting proteins to sites at or near the promoter, often called operator sites. Generally, these regulatory proteins undergo an allosteric conformational change induced by the binding of appropriate ligands or by other modification, such as phosphorylation. These changes can trigger the release or stimulate the binding of these factors to DNA. Another common regulatory scheme, at the level of transcription initiation, involves the utilization of alternative σ-factors to direct RNAP to specific promoters. Alternative sigma factors have been observed in the regulation of a number of genes involved in the heat shock response, sporulation, phage growth, flagellar expression and chemotaxis, nitrogen metabolism, and stationary phase growth (reviewed in Haldenwang, 1995; Mager, 1995; Helmann, 1994; Hengge-Aronis, 1993; Yura et al., 1993; Losick and Stragier, 1992; Helmann, 1991).

Although transcription elongation proceeds with a high level of processivity, transcription pausing and attenuation, caused by sequences and structures existing in the DNA or RNA, contribute to the overall efficiency of transcription (Kane, 1994). In addition, protein factors such as GreA/GreB and NusA/NusG can also affect the elongation rate and hence play a role in transcription regulation (Chamberlin, 1995; Kane, 1994). The “termination vs. antitermination” strategy (Henkin, 1996) is an example of transcriptional regulation at the level of termination. This type of regulatory scheme involves a competition between transcriptional read-through and termination at a ρ-dependent or ρ-independent terminator located upstream of structural genes. Novel metabolic sensing mechanisms, including ribosome occupancy (Landick and Turnbough, 1992), uncharged tRNA interaction (Henkin, 1994), or binding of a protein factor (Gollnick, 1994), can determine the terminator availability by promoting or preventing the formation of RNA structures (terminators vs. antiterminators).

The general strategy for the transcriptional regulation of eukaryal genes appears to involve a complex network of protein factors, which include DNA-binding and non-DNA-binding factors, working in concert (Yanofsky, 1992). The DNA-binding factors recognize and bind to specific response elements and interact directly or indirectly with RNAP assembled at the promoter. The regulatory activities of these bound factors can be positive, negative, or both, and the response elements are normally located upstream from the transcription initiation site (Calkhoven and Ab, 1996). The non-DNA-binding factors “talk” to RNAP indirectly by affecting the action or the DNA-binding ability of the DNA-binding factors (Calkhoven and Ab, 1996). In addition to the regulation occurring at the transcription initiation stage, eukaryal gene expression can also be regulated at the elongation and termination stages by interactions between protein factors and RNA structures or sequences (Yanofsky, 1992).

Even though the mechanisms of gene regulation in the Archaea remain to be elucidated, regulated gene expression has been reported for many archaeal genes whose transcription depends on the growth phase or substrate availability. These include the fdhCAB operon from M. thermoformicicum Z-245 (Nolling and Reeve, 1997), fwuB/fwcB from M. kandleri (Vorholt et al., 1997), mtsA/mtsB from M. barkeri (Paul and Krzycki, 1996), cdhA from M. thermophila (Sowers et al., 1993), and arcRACB from H. salinarium (Ruepp and Soppa, 1996). Transcriptional induction of heat shock genes has also been reported for thermophilic (Kagawa et al., 1995) and methanogenic (Clarens et al., 1995) Archaea. In some instances, regulatory factors have been identified. For example, the induction of two salt shock-responsive ORFs in H. mediterranei might involve an upstream Z-DNA sequence (Mojica et al., 1993). The transcription of the bop gene of H. halobium is also stimulated by a Z-DNA centered 23 bp upstream from the transcription start site (Young et al., 1996) and is activated by the products of the adjacent genes, bat and brp (Betlach et al., 1989; Leong et al., 1988). Recently, Cohen-Kupiec et al. (1997) demonstrated that the transcription of the M. maripaludis nif gene is regulated by binding of a repressor to a palindromic sequence located immediately downstream of the transcription start site. The potential involvement of a repression mechanism was also reported for the nifHDK2 operon from M. barkeri 227 (Chien and Zinder, 1996). In addition, ΦH phage in H. halobium strains was found to contain a protein resembling coliphage repressors (Ken and Hackett, 1991). Interestingly, our laboratory has identified multiple copies of TBP- and TFB-encoding genes from H. volcanii and found that the expression of some of these basal transcription factor-encoding genes is growth phase- or heat shock-dependent (Palmer and Daniels, Thompson and Daniels, unpublished data). The possible use of alternative transcription factors by halophilic Archaea would represent a novel regulatory scheme. The Archaea appear to use a variety of strategies to control gene expression, some of which, like the repressor proteins, are similar to the strategies used in bacterial cells. Others, such as alternative transcription factor pairings, may represent new schemes for gene regulation.

Heat Shock Regulation

The HSPs produced in Bacteria, Archaea, and Eukarya are highly conserved in both their structure and function, and their induction is generally regulated at the level of transcription initiation. In general, the underlying mechanism of heat shock gene regulation reflects one of the general schemes of gene regulation in these organisms and can serve as a model for studying gene expression.

Bacteria Domain. Bacterial heat shock gene regulation has been studied most extensively in E. coli, in which RNAP interacts with σ32 instead of σ70 to specifically direct the holoenzyme complex to heat shock gene promoters (recently reviewed by Mager and De Kruijff, 1995; Bukau, 1993; Georgopoulos et al., 1994; Yura et al., 1993). The σ32 promoters have a -35 region with the consensus sequence 5’-TCTCNCCCTTGAA-3’, a spacing of 13 to 17 nucleotides, and a -10 region with the sequence CCCCATNTA (Cowing et al., 1985). Under normal growth conditions, σ32, which must compete with σ70 for core RNAP, exists at very low concentrations (10-30 molecules/cell) and is responsible for the basal expression of the major heat shock genes, such as dnaK, dnaJ, groEL, and groES. The low level of σ32 is due to the instability of the protein and the repression of its gene, rpoH. DnaK, DnaJ, and GrpE were shown to associate with σ32, possibly targeting the protein for degradation by proteases, and contributing to the instability of σ32 either directly or indirectly (Bukau, 1993; Gamer et al., 1992; Liberek et al., 1992; Liberek and Georgopoulos, 1993). Furthermore, such protein-protein interactions interfere with the activity of σ32 in the transcription machinery for HSP gene expression.
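The promoter description above (a -35 consensus, a 13 to 17 nucleotide spacer, and a -10 consensus) can be expressed as a simple pattern search. The following Python fragment is only an illustrative sketch: the function name find_sigma32_promoters is hypothetical, and the search demands an exact match to the consensus of Cowing et al. (1985), which is an assumption made for simplicity, since natural promoters usually deviate from a consensus at some positions.

```python
import re

# "N" in the published consensus stands for any nucleotide.
MINUS_35 = "TCTC[ACGT]CCCTTGAA"   # -35 region consensus
SPACER = "[ACGT]{13,17}"          # 13 to 17 nucleotides between the regions
MINUS_10 = "CCCCAT[ACGT]TA"       # -10 region consensus

# A lookahead lets every start position be tested, so overlapping
# candidates are not skipped; at most one hit is reported per start.
PROMOTER = re.compile(f"(?=({MINUS_35})({SPACER})({MINUS_10}))")


def find_sigma32_promoters(seq):
    """Return (start, spacer_length) for each exact consensus match."""
    seq = seq.upper()
    return [(m.start(), len(m.group(2))) for m in PROMOTER.finditer(seq)]


if __name__ == "__main__":
    # Made-up test sequence: consensus -35, 15-nt spacer, consensus -10.
    demo = "GG" + "TCTCACCCTTGAA" + "A" * 15 + "CCCCATTTA" + "GG"
    print(find_sigma32_promoters(demo))
```

A more realistic search would score partial matches (for example, with a position weight matrix) rather than require identity to the consensus.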

Upon heat shock or exposure to other stresses, the transient accumulation of σ32 results from an increase in its half-life (up to 8-fold) and its synthesis. The enhanced stability of σ32 is transient (4-5 minutes) (Straus et al., 1987) and is likely to be caused by an increase in denatured proteins which titrate DnaK/DnaJ/GrpE away from σ32 (Gamer et al., 1996; Georgopoulos et al., 1994). Although there is a slight increase in the transcription of rpoH upon heat shock, the induction of σ32 synthesis is mainly regulated at the translational level. Three 5’-proximal cis-acting elements were identified within the coding region of rpoH mRNA (Nagai et al., 1994; Nagai et al., 1991; Nagai et al., 1991). It was proposed that the first two cis-acting elements (positions +6 to +20 and +153 to +247) form a secondary structure that possibly modulates the frequency of translation initiation or provides binding sites for activator or repressor proteins (Yura et al., 1993). The third element (position +364 to +433) is a binding site for the DnaK machinery, which, when bound to this mRNA region, can mediate translational arrest of σ32 (McCarty et al., 1996; Nagai et al., 1994).

As a consequence of σ32 accumulation, the HSP genes are rapidly induced from their σ32-specific promoters and the levels of their protein products increase. When the stress persists, the induction of the HSPs is followed by an adaptation achieved by an autoregulation mechanism (Georgopoulos et al., 1994). As a result of HSP induction, free DnaK, DnaJ, and GrpE are again available to associate with σ32 and therefore repress its activity, increase its instability, repress its synthesis, and eventually turn off HSP induction (Gamer et al., 1996).

Eukarya Domain. The induction of HSPs in eukarya is modulated by the interaction between a transcriptional activator (HSF) and a conserved DNA sequence element (HSE) (for recent review see: Fernandes et al., 1994; Mager and De Kruijff, 1995; Wu, 1995). With the exception of the yeast HSF, which binds constitutively to the HSE and is responsible for expression of the heat shock genes under normal growth conditions, HSF in higher eukarya does not appear to associate with HSE under nonstress conditions. Under normal growth conditions, HSF is in an inactive (non-DNA binding) monomeric form in both the cytoplasm and nucleus. Upon heat shock, the HSFs associate to form an active trimer that accumulates in the nucleus and binds to the HSE, leading to activation of heat shock gene transcription (Morimoto, 1993; Westwood and Wu, 1993). It appears that the activation of the trimerized HSFs is facilitated by the phosphorylation of the HSF (Cotto et al., 1996; Sorger, 1990; Sorger and Pelham, 1988). Once the heat stress is removed, or if an intermediate elevated temperature is maintained (< 42°C), the HSF trimers return to the inactive monomeric state, and transcriptional activation is attenuated (Morimoto, 1993).

Although only one HSF gene has been identified in yeast (Wiederrecht et al., 1988) and Drosophila, higher eukarya have up to 3 HSFs: HSF1, HSF2, HSF3 (Morimoto et al., 1992). However, only HSF1, which can respond to stresses such as elevated temperatures, heavy metals, and amino acid analogs, is the functional homolog of yeast HSF (Fiorenza et al., 1995; Sarge et al., 1993). Although they lack strong protein sequence conservation, members of the HSF family do share some common structural features (Wu, 1995), such as an N-terminal DNA-binding domain with a non-canonical helix-turn-helix motif (Vuister et al., 1994; Wiederrecht et al., 1988), a leucine zipper (trimerization domain) (Sorger and Pelham, 1988), and a heptad repeat located at the C-terminus (transactivation domain) (Wisniewski et al., 1996).

A typical HSE contains a contiguous array of conserved 5’ nGAAn 3’ repeats (usually 3 to 6) arranged in alternating orientations (head-to-head or tail-to-tail) (reviewed in Fernandes et al., 1994; Lis and Wu, 1992; Mager and De Kruijff, 1995). The configuration of HSEs within HSP gene promoters may vary in the number of HSE units, their positions (relative to the transcription initiation sites), and the spacing between them (Amin et al., 1988; Xiao and Lis, 1988). HSF interacts with the HSE at the 5-bp repeat unit, and binding of HSFs is cooperative (Xiao and Lis, 1991). This multiple HSF-binding at adjacent HSEs allows the oligomerization of HSF and plays a critical role in stabilizing the protein-DNA complex. Therefore, both the sequence homology of the 5-bp repeats and the arrangement of the HSEs within the promoter can influence the binding affinity of HSF for the HSE.
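The HSE architecture described above, contiguous 5-bp nGAAn units in alternating orientation, lends itself to a short scanning routine. The Python sketch below is illustrative only; the names unit_type and find_hse are hypothetical, the inverted unit is taken to be nTTCn (the reverse complement of nGAAn), and perfect GAA or TTC cores are required, which is an assumption, since natural HSEs often contain imperfect units.

```python
def unit_type(pentamer):
    """Classify a 5-bp unit: 'G' for nGAAn, 'T' for nTTCn, otherwise None."""
    core = pentamer[1:4]
    if core == "GAA":
        return "G"
    if core == "TTC":
        return "T"
    return None


def find_hse(seq, min_units=3):
    """Return (start, unit_count) for each contiguous alternating array."""
    seq = seq.upper()
    hits = []
    i = 0
    while i + 5 <= len(seq):
        kind = unit_type(seq[i:i + 5])
        if kind is None:
            i += 1
            continue
        # Extend in 5-bp steps for as long as the orientation alternates.
        count, prev, j = 1, kind, i + 5
        while j + 5 <= len(seq):
            nxt = unit_type(seq[j:j + 5])
            if nxt is None or nxt == prev:
                break
            count, prev, j = count + 1, nxt, j + 5
        if count >= min_units:
            hits.append((i, count))
            i = j  # continue the scan after this array
        else:
            i += 1
    return hits


if __name__ == "__main__":
    # Made-up sequence carrying three alternating units: nGAAn nTTCn nGAAn.
    print(find_hse("CCC" + "AGAAC" + "GTTCT" + "AGAAG" + "CCC"))
```

The min_units default of 3 follows the lower bound of the "usually 3 to 6" repeats noted in the text; gapped or mismatched arrays would need a scoring scheme instead of this exact test.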

Yeast HSF has the intrinsic ability to oligomerize and hence to bind to DNA (Sarge et al., 1993). Therefore, HSF must be regulated in a negative fashion to maintain its inactive monomer state under nonstress conditions. Although it is still controversial (Mager and De Kruijff, 1995; Wu, 1995), a model similar to the autoregulation scheme in E. coli is supported by many observations (Boorstein and Craig, 1990; Mager and De Kruijff, 1995; Morimoto, 1993). In this model, transient interactions between Hsp70 and HSF stabilize a specific conformation of HSF under normal growth conditions. Misfolded proteins resulting from heat shock then compete with the HSFs for binding to Hsp70; consequently, HSFs become available for trimerization and DNA-binding. The inactivation of HSF is also regulated by the elevated synthesis of Hsp70. Alternatively, recent findings suggest that all the regulatory elements controlling HSF activity may reside within the HSF itself (Goodson and Sarge, 1995; Larson et al., 1995). It is possible that both Hsp70 and HSF sequences play a regulatory role in HSF activity. Further investigation is needed before the regulation of HSF activity is fully elucidated.

Research Problems

The work presented in this dissertation focuses on two research areas. The first set of studies was conducted to characterize the in vivo termination signals of archaeal transcription. The purpose of the second project was to develop a model system to study regulated gene expression in H. volcanii.

In vivo characterization of archaeal transcription termination signals

Transcription studies often focus on the initiation and elongation stages; however, how committed RNAP complexes efficiently and precisely terminate their tasks is also critical in the gene expression process. Transcription termination has been extensively studied in both bacteria and eukarya. In archaea, other than the in vitro study of M. vannielii tRNAVal termination (Thomm et al., 1994), little is known about transcription termination.

The availability of gene expression vectors for H. volcanii has made it possible for us to examine the requirements for transcription termination in vivo (Nieuwlandt and Daniels, 1990; Palmer and Daniels, 1994). We also developed a transcription termination vector and used this system to investigate the sequence and structural characteristics of archaeal transcription termination signals. The results of this study provide a set of rules for the identification of halobacterial terminators and suggest the occurrence of an “inchworming” mechanism for transcription termination.

Development of a gene regulation model system

Although our knowledge of archaeal gene expression has increased greatly in the last decade, our understanding of gene regulation in this third domain is very limited. To address this problem, we chose to examine heat shock as a potential model for studying gene regulation in H. volcanii. A complete set of overlapping cosmids, representing >90% of the H. volcanii genome, was available and a preliminary study reported that several of these cosmids carried heat shock-responsive genes (Charlebois et al., 1989; Trieselmann and Charlebois, 1992).

We examined cosmid A199 and found that this cosmid carried a gene encoding a CCT-related protein. This cct1 gene encodes a 560 amino acid protein sharing 47% sequence identity with Sulfolobus shibatae TF55 and 37% sequence identity with human TCP-1. The cct1 gene was monocistronic, and the transcript level of cct-related genes increased when cells were challenged with heat stress or salt shock. Transcript mapping revealed that the cct1 gene initiated transcription from a typical archaeal TATA promoter for both basal and induced transcription and that the transcript from this gene terminated in a T6 T-tract. Introduction of the cct1 gene into H. volcanii on a multicopy plasmid showed that the protein was constitutively expressed at high levels. Further investigation of the regulation of the cct1 gene was then carried out by another laboratory member (Thompson, unpublished data).

CHAPTER 2

MATERIALS AND METHODS

Reagents and Enzymes

Restriction endonucleases, T4 DNA ligase, T4 DNA kinase, and Superscript II™ were purchased from Bethesda Research Laboratories, Inc. (Gaithersburg, MD). Sequenase (T7 DNA polymerase) version 2 and 7-deaza-dGTP sequencing kit were obtained from United States Biochemical Corp. (Cleveland, OH). Random-primed DNA labeling kit, RNase-free DNase I, and S1 nuclease were purchased from Boehringer Mannheim Corp. (Indianapolis, IN). Prep-A-Gene DNA purification kit was obtained from BioRad Laboratories (Hercules, CA). AmpliTaq™ was purchased from Perkin Elmer Corp. (Norwalk, CT). [α-P32]dATP and [γ-P32]ATP were obtained from ICN Biochemicals, Inc. (Irvine, CA). [α-P33]dATP and [α-S35]dATP were purchased from DuPont NEN (Boston, MA). Diethyl pyrocarbonate (DEPC), kanamycin, ampicillin, isopropyl-β-D-thiogalactopyranoside (IPTG), X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside), and other chemical reagents were obtained from Sigma Chemical Co. (St. Louis, MO). Oligonucleotides (Tables 2.1 and 2.2) were synthesized by Ransom Hill Bioscience, Inc. (Ramona, CA) or The Great American Gene Company (Ramona, CA).

Bacterial Strains and Culture Conditions

Haloferax volcanii WFD11 (Charlebois et al., 1987) was grown at 37°C in complex medium, which contained 125 g NaCl, 45 g MgCl2·6H2O, 10 g MgSO4·7H2O, 10 g KCl, 1.34 g CaCl2·2H2O, 3 g yeast extract, and 5 g tryptone (and 15 g agar for plates) per liter of distilled water (Nieuwlandt and Daniels, 1990). Strain WFD11 has been cured of plasmid pHV2 (Charlebois et al., 1987). Portions of the plasmid pHV2, including the origin of replication, were used to construct the E. coli-H. volcanii shuttle vector, pWL (Lam and Doolittle, 1989). For growth of H. volcanii cells transformed with the pWL plasmid, 20 µM mevinolin was also included in the complex medium.

Escherichia coli ED8767 carrying cosmid A199 was obtained from R. L. Charlebois (Charlebois et al., 1989) and grown at 37°C in Luria-Bertani (LB) medium (10 g tryptone, 5 g yeast extract and 10 g NaCl per liter of distilled water, pH 7.5) supplemented with kanamycin (30 µg per ml). Cosmid A199 contains 21 kbp of the H. volcanii genome, including one of the heat-responsive loci (Trieselmann and Charlebois, 1992). E. coli DH5α was used to propagate plasmid constructs and E. coli JM110 was used as a methylation-modification host before delivering constructs into H. volcanii WFD11. Both E. coli strains were grown at 37°C in LB medium, which was supplemented with ampicillin (100 µg per ml) when needed.

Nucleic Acid Isolation

Small-scale Plasmid/Cosmid DNA Isolation

Both plasmid and cosmid DNA were extracted from E. coli cultures grown overnight using the alkaline lysis procedure described by Sambrook et al. (1989) with some minor modifications: the addition of lysozyme to the cell suspension and the 10-minute incubation before ethanol precipitation were omitted. The dried nucleic acid pellets were then resuspended in 20 µl ddH2O (for every 1.5 ml culture) containing 1 µl of 10 mg/ml RNaseA and incubated at 37°C for 10 minutes to degrade the RNA. The DNA suspension was stored at -20°C.

Small-scale H. volcanii Genomic DNA Isolation

To isolate H. volcanii genomic DNA, cells from 5 ml of a late-log phase H. volcanii culture were spun down and resuspended in TE buffer containing 1% Triton X-100. Lysed cells were extracted once with an equal volume of phenol and then with a 1:1 phenol/chloroform mixture at room temperature for 20 minutes. The nucleic acids in the aqueous phase were precipitated with the addition of a 1/10 volume of 3 M sodium acetate and 2.5 volumes of 95% ethanol. Precipitated DNA was collected by centrifugation, and the pellet was washed with 70% ethanol, dried, and resuspended in TE buffer (10 mM Tris-HCl, pH 7.4, and 1 mM EDTA, pH 8.0) containing RNaseA. The resuspension step was allowed to proceed overnight at room temperature. The DNA suspension was then stored at -20°C.

Extraction of DNA from Agarose and Polyacrylamide Gels

Restriction DNA fragments needed for isotope-labeling or subcloning purposes were separated by agarose gel electrophoresis and then recovered from agarose gel slices by using the BioRad Prep-A-Gene™ protocol (for DNA sizes > 500 bp) according to the manufacturer’s instructions or by the freeze-squeeze method (for DNA sizes > 200 bp and < 500 bp). To “freeze-squeeze” DNAs from agarose gels, the gel slice containing the DNA of interest was placed in a 500 µl Eppendorf tube plugged with glass wool. A small hole was carefully made at the bottom of the tube, and the tube was then capped and submerged in liquid nitrogen for 2 to 5 minutes. After the freezing process, the tube was removed from liquid nitrogen, inserted into a 1.5 ml Eppendorf tube, and spun at 14,000 rpm (Eppendorf centrifuge model 5415C) for 5 minutes. After centrifugation, the eluate collected in the 1.5 ml tube was extracted with an equal volume of 1:1 phenol/chloroform, and the DNA in the aqueous layer was recovered by ethanol precipitation.

Small DNA fragments (< 200 bp) were separated by polyacrylamide gel electrophoresis and purified via the elution method. The gel slice containing the DNA fragments of interest was excised, crushed, and placed in a 1.5 ml Eppendorf tube; 400 µl of elution buffer (20 mM Tris-HCl, pH 8.0, 2 mM EDTA, 0.4 M NaCl, and 0.05% SDS) was added to the tube, and the tube was placed on a rocker overnight at room temperature. The next day, the gel matrix was spun down, the DNA in the elution buffer was extracted with phenol/chloroform, and the DNA was recovered by ethanol precipitation.

RNA Isolation

Termination Studies. We have observed that some pWL-based plasmids (Nieuwlandt and Daniels, 1990) are lost from cells stored at -70°C. Therefore, total RNA from H. volcanii cells carrying pWL plasmids with cloned termination elements was extracted from fresh H. volcanii transformants. A single colony from a transformation was used to inoculate a 3 ml starter culture. Of this late-log culture, 150 µl was used to inoculate 15 ml of H. volcanii complex medium. The H. volcanii culture was then grown to an OD550 of 1.0 ± 0.3. Total RNA was isolated from 3 ml of cells using Trizol™ reagent (BRL), a mono-phasic solution of phenol and guanidine isothiocyanate. Cells from two 1.5 ml volumes of the same culture were spun down in an Eppendorf centrifuge at 14,000 rpm for 2 minutes. Each cell pellet was immediately homogenized with 500 µl of Trizol™ reagent, and then the homogenized samples from the same culture were combined into a single tube. After 5 minutes of incubation at room temperature, 200 µl of chloroform was added to the samples with vigorous shaking for 15 seconds. After 2 to 3 minutes of incubation at room temperature, the mixture was centrifuged at 12,500 x g, and the resulting aqueous phase was transferred to a fresh Eppendorf tube. The RNAs were precipitated from the aqueous solution by adding 500 µl of isopropanol, incubating the mixture for 10 minutes at room temperature, and collecting the RNAs by centrifugation at 12,000 x g for 10 minutes (at room temperature). The pellet was then washed in 75% ethanol, dried briefly under vacuum, and resuspended in 400 µl of ddH2O. With the addition of 1/10 volume of 3 M sodium acetate and 2.5 volumes of 95% ethanol to the suspension, isolated RNAs could be stored for up to one year at -70°C without degradation.

Heat Shock Studies. In the heat shock gene studies, special caution was taken to minimize the in vivo degradation of the short-lived heat shock-specific mRNA (Trieselmann and Charlebois, 1992). H. volcanii cultures intended for RNA isolation were quickly chilled in a salt-ice bath. Cells were collected by centrifugation at 4000 rpm for 10 minutes at -15°C and immediately subjected to lysis. RNAs used in the initial Northern analysis for identifying the heat shock gene were extracted from 50 ml of culture as described by Nieuwlandt and Daniels (1990). RNAs used in the transcription study were isolated from 5 ml of culture using the RNeasy™ system (Qiagen, Inc., Chatsworth, CA), which is more efficient in isolating higher molecular weight RNA molecules than the Trizol™ reagent. However, total RNAs isolated using the RNeasy™ system were frequently contaminated with high molecular weight DNAs. Contaminating DNAs were removed from the RNA preparations by adding 4 µl of 10X DNase buffer (60 mM MgCl2, 1 M sodium acetate, pH 5.0) and 10 units of RNase-free DNase I to the 40 µl of RNA solution. Digestion reactions were incubated at 37°C for 15 minutes. After DNase treatment, the RNAs were extracted with an equal volume of 1:1 phenol/chloroform and stored in 75% ethanol at -70°C. RNA samples were precipitated when needed for Northern analysis and transcript mapping.

Nucleic Acid Quantitation

Nucleic acid concentrations were determined spectrophotometrically (Sambrook et al., 1989). Normally, a 100- to 500-fold dilution of the DNA, RNA, or oligonucleotide sample was prepared in ddH2O, and the OD was measured at 260 nm and 280 nm. The OD260/OD280 ratio provided an estimate of the purity of the original samples (1.8 for DNA and 2.0 for RNA), and the OD260 reading was used to calculate the nucleic acid concentration (X). The calculation formulae are as follows:

For dsDNA: X (µg/ml) = OD260 × Dilution factor × 50 µg/ml

For RNA: X (µg/ml) = OD260 × Dilution factor × 40 µg/ml

For Oligonucleotides: X (µg/ml) = OD260 × Dilution factor × 33 µg/ml
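The three formulae differ only in the conversion factor, so they can be illustrated with one small helper. The following is a minimal sketch; the function names and the example reading are hypothetical, and only the conversion factors (50, 40, and 33 µg/ml per OD260 unit) and the purity ratios are taken from the text.

```python
# Conversion factors (µg/ml per OD260 unit) as listed in the formulae above.
CONVERSION_UG_PER_ML = {"dsDNA": 50.0, "RNA": 40.0, "oligo": 33.0}


def nucleic_acid_concentration(od260, dilution_factor, kind):
    """Return the concentration (µg/ml) of the undiluted sample."""
    return od260 * dilution_factor * CONVERSION_UG_PER_ML[kind]


def purity_ratio(od260, od280):
    """Return OD260/OD280 (about 1.8 for pure DNA, about 2.0 for pure RNA)."""
    return od260 / od280


# Hypothetical reading: a 200-fold dilution of an RNA sample with OD260 = 0.25.
print(nucleic_acid_concentration(0.25, 200, "RNA"))  # 0.25 * 200 * 40 = 2000.0
```

With these illustrative numbers the undiluted RNA stock would be 2000 µg/ml, that is, 2 µg/µl.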

DNA Restriction Analysis

DNA restriction reactions were normally performed according to the manufacturer’s instructions. A typical restriction reaction contained 1 µg plasmid, 5 µg cosmid, or 10-15 µg genomic DNA and 5-10 units of restriction enzymes in a total reaction volume of 20 µl. Depending on the intended use of the restricted DNA, the digestion reactions were terminated by extracting the DNA with phenol-chloroform, by purifying with the Prep-A-Gene™ method, or by adding 5 µl of 5X DNA loading dye (10% Ficoll 400, 50 mM EDTA, pH 8.0, 0.5% SDS, and 0.125% xylene cyanol and bromophenol blue).

Electrophoretic Techniques for Nucleic Acid Analysis

Nondenaturing Agarose Gel Electrophoresis

DNA fragments with sizes greater than 500 base pairs (bps) were separated by agarose gel electrophoresis for sizing and purification purposes. Depending on the size of the desired DNA, gel solutions containing different percentages of agarose in 1X TBE buffer (0.089 M Tris, 0.089 M boric acid, 0.05 M EDTA, pH 8.0) were used to cast the gel (for example: 0.7%-1% for DNAs > 2 kbps; 1.2%-1.5% for DNAs < 2 kbps and > 500 bps). Unless the gel was going to be used in a Southern transfer, 1 µl of 10 mg/ml ethidium bromide solution was mixed into 50 ml of the agarose solution before casting the gel. Electrophoresis was typically performed at 8 V/cm in 1X TBE running buffer. To visualize the DNA, the gel was placed on a UV-transilluminator after electrophoresis to be photographed or to cut out the DNA of interest. If ethidium bromide was not included in the gel, the DNAs were stained after completion of electrophoresis by submerging the gel in a 0.5 µg/ml ethidium bromide solution for 15 minutes and then visualized using a UV-transilluminator.
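The in-gel ethidium bromide concentration is not stated directly, but it follows from the stock concentration and the volumes given above. A minimal sketch of that dilution arithmetic (C1 × V1 = C2 × V2) is shown here; the helper name is hypothetical, and the 1 µl of added stock is assumed to be negligible relative to the 50 ml of agarose solution.

```python
def final_concentration(stock_conc, stock_volume, final_volume):
    """Solve C1 * V1 = C2 * V2 for C2; both volumes must use the same unit."""
    return stock_conc * stock_volume / final_volume


# 1 µl of a 10 mg/ml stock (= 10 µg/µl) mixed into 50 ml (= 50,000 µl) of agarose.
conc_ug_per_ul = final_concentration(10.0, 1.0, 50_000.0)
conc_ug_per_ml = conc_ug_per_ul * 1000.0  # convert µg/µl to µg/ml
print(round(conc_ug_per_ml, 3))  # 0.2 µg/ml in the cast gel
```

Under this assumption the cast gel contains about 0.2 µg/ml ethidium bromide, somewhat less than the 0.5 µg/ml used for post-staining.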

Formaldehyde Agarose Gel Electrophoresis

For heat shock gene Northern analysis, total RNA was separated on a 1.2% agarose gel containing formaldehyde as the denaturing reagent (Ausubel et al., 1987). Prior to this procedure, all labware and reagents were made RNase-free. Glassware was baked in the oven at 200°C for 6 to 16 hours; the gel electrophoresis apparatus was wiped clean with RNAzap™ (Ambion); and chemical reagents and ddH2O were treated with 0.1% diethylpyrocarbonate (DEPC) followed by incubation at 70°C for 2 hours to inactivate the DEPC.

RNA samples were pelleted by centrifugation at 14,000 rpm for 10 minutes, resuspended in DEPC-treated distilled water, and quantitated by measuring the OD260. Approximately 5-10 µg of total RNA was resuspended in 5 µl of DEPC-treated distilled water and then incubated at 55°C for 15 minutes after adding 19.4 µl of the sample treatment mixture (50 µl of 10X MOPS buffer, 87.5 µl of 37% formaldehyde, and 250 µl of deionized formamide). The 10X MOPS buffer contained 2 M MOPS [3-(N-morpholino)-propanesulfonic acid] at pH 7.0, 0.05 M sodium acetate, and 10 mM EDTA (pH 8.0). The deionized formamide was prepared by mixing 50 ml of formamide with 2.5 g of AG-501-X8(D) mixed bed resin at 4°C for 30 minutes and filtering the mixture twice through Whatman membrane. The deionized formamide was stored at -20°C. Prior to gel electrophoresis, 5 µl of the loading buffer (1 mM EDTA, pH 8.0; 0.05% bromophenol blue; 0.05% xylene cyanol; 50% glycerol) was added to each sample.
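The final composition of each denatured RNA sample (before the loading buffer is added) can be worked out from the volumes above. The sketch below is illustrative only; the variable and function names are hypothetical, and percentages are expressed relative to the 24.4 µl sample (5 µl of RNA plus 19.4 µl of mixture).

```python
# Volumes (µl) of the sample treatment mixture, as given in the text.
MIX_UL = {"10X MOPS": 50.0, "37% formaldehyde": 87.5, "formamide": 250.0}
RNA_UL = 5.0         # RNA in DEPC-treated water
MIX_ADDED_UL = 19.4  # mixture added to each RNA sample

mix_total = sum(MIX_UL.values())                    # 387.5 µl of mixture
mix_share = MIX_ADDED_UL / (RNA_UL + MIX_ADDED_UL)  # fraction of sample that is mixture


def fraction_in_sample(component):
    """Volume fraction of one mixture component in the final RNA sample."""
    return MIX_UL[component] / mix_total * mix_share


mops_x = 10.0 * fraction_in_sample("10X MOPS")                    # in units of X
formaldehyde_pct = 37.0 * fraction_in_sample("37% formaldehyde")  # percent
formamide_pct = 100.0 * fraction_in_sample("formamide")           # percent (v/v)
print(round(mops_x, 2), round(formaldehyde_pct, 1), round(formamide_pct, 1))
# prints: 1.03 6.6 51.3
```

The odd-looking 19.4 µl therefore brings the sample to roughly 1X MOPS, about 6.6% formaldehyde, and about 51% formamide.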

Duplicate sets of samples (usually 10 µg/sample) and RNA size markers (3 µg of 0.24-9.5 Kb RNA ladder, BRL) were normally loaded on one gel. Gels were run at 5 V/cm for 3 hours. After separation, the gel was soaked in two changes of DEPC-treated 10X SSC solution (15 minutes each) to remove the formaldehyde. The section containing the RNA MW markers was stained in an ethidium bromide (5 µg/ml) bath, destained in DEPC-treated distilled water overnight at 4°C in the dark, and photographed the next day. The other section of the gel was then used for Northern blot analysis.

Nondenaturing Polyacrylamide Gel Electrophoresis

To separate small DNA fragments (< 500 bp), an 8% (for DNA sizes > 100 bp) or a 12% to 15% (for DNA sizes < 100 bp) nondenaturing polyacrylamide gel was used (Sambrook et al., 1989). Electrophoresis was performed in a vertical electrophoresis apparatus with 1X TBE buffer at 25 V/cm. After completing the electrophoresis, the glass plates were carefully separated to leave the gel attached to one of the plates, and a 0.5 µg/ml ethidium bromide staining solution was then poured on the gel and left for 15 minutes before being rinsed off with ddH2O. DNA bands were visualized on a UV-transilluminator.

Denaturing Polyacrylamide Gel Electrophoresis

For the termination studies, the RNA species were separated by denaturing polyacrylamide gel electrophoresis. A gel was cast with 6% polyacrylamide gel solution containing 7 M urea as the denaturant and set up as described for the nondenaturing polyacrylamide gels. The RNA samples were resuspended in loading buffer (7 M urea, 10% glycerol, 0.05% bromophenol blue, and 0.05% xylene cyanol) and incubated at 50°C for 10 minutes. Before loading the samples, the gel was pre-run for 30 minutes, the power was turned off, and the wells were rinsed with 1X TBE buffer. Electrophoresis was carried out at 15 V/cm until the bromophenol blue dye reached the bottom of the gel. The RNAs were visualized as described for nondenaturing polyacrylamide gels.

For separating single-stranded DNA sequencing products, a gel was cast using large (32 x 40 cm) glass plates, 0.4 mm spacers, and shark-tooth combs. Electrophoresis was done in the BRL Model S2 apparatus at 50 watts (1200-1500 volts) in 1X TBE buffer. Typically, gels were pre-run for 30 minutes, and 2-4 µl of the heat-denatured (2 minutes at 90°C) sequencing samples was loaded per lane. To resolve longer sequences, two or three sample loadings were applied at intervals of one and a half to two hours. The electrophoresis was allowed to proceed until the bromophenol blue reached the bottom of the gel. At the end of electrophoresis, the gel was soaked in 4 liters of a solution containing 10% acetic acid and 12% methanol for 15 minutes to remove the urea, transferred to 3 MM Whatman paper, and dried at 70°C for 40 minutes on a vacuum gel dryer.

Cell Extract Preparation and SDS-Polyacrylamide Gel Electrophoresis

H. volcanii strains overexpressing the heat-shock gene (HSOPs) were analyzed for their protein contents. Crude cell extracts were prepared from H. volcanii HSOP cultures (OD550 0.8-1.0) before and after heat shock at 60°C for 60 minutes. The cells were lysed in 30 µl of 0.1 M Tris-HCl, pH 7.5, and 0.1% Triton X-100 by pumping the solution up and down with a Gilson Pipetman. One µl of 5 µg/ml DNase and 1 µl of 1 mg/ml RNaseA were then added to the cell extract, followed by a 20-minute incubation at 37°C. The protein concentration of the crude extract was determined by using the BioRad Protein Assay system: 800 µl of a 1:1000 crude extract dilution and 200 µl of the BioRad Protein Assay reagent were mixed by vortexing and allowed to stand for at least 2 minutes at room temperature, and the OD595 was determined. Based on a standard curve derived from known protein samples, the concentration (X) of the total protein in the original crude extract was then calculated as follows:

X (µg/µl) = (OD595 - 1.2857 × 10^[…]) / (4.35 × 10^[…])
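The calculation above inverts a linear standard curve of the form OD = intercept + slope × concentration. A minimal sketch of that inversion follows; the intercept and slope are passed in as parameters because the exponents of the fitted values are not legible here, and the function name, the optional dilution factor, and the example numbers are hypothetical rather than taken from the study.

```python
def protein_concentration(od595, intercept, slope, dilution_factor=1.0):
    """Invert a linear standard curve OD = intercept + slope * concentration.

    The result is in whatever concentration unit the standard curve was
    fitted in, multiplied by the dilution factor of the measured sample.
    """
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return (od595 - intercept) / slope * dilution_factor


# Hypothetical curve parameters and reading, for illustration only.
example = protein_concentration(od595=0.45, intercept=0.05, slope=0.04)
print(example)
```

Whether a dilution factor was folded into the fitted constants of the original formula cannot be determined from the text, so it is left as an explicit argument here.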

Proteins were separated by one-dimensional SDS-PAGE (Laemmli method) as described by Sasse in Current Protocols in Molecular Biology (Ausubel et al., 1987). The following molecular weight markers (MWSDS-200 kit; Sigma) were used: carbonic anhydrase (29 kd), egg albumin (45 kd), bovine plasma albumin (66 kd), phosphorylase B (97.4 kd), E. coli β-galactosidase (116 kd), and rabbit muscle myosin (205 kd). Gels were run at 50 mA for 2 hours and 40 minutes. To stain the proteins, the gel was soaked in 5 gel volumes of 12.5% trichloroacetic acid in a Pyrex dish for one hour and stained overnight in Coomassie brilliant blue G-250 solution [0.1% (w/v) Coomassie brilliant blue G-250, 6% (w/v) ammonium sulfate, 2% (v/v) phosphoric acid] (Harlow and Lane, 1988).

Preparation of Competent Cells and Transformation

Three transformation host strains were used in this study. E. coli DH5α was the plasmid-propagation host used in cloning for sequencing or other construction purposes. E. coli JM110 was the methylation-modification host, and H. volcanii WFD11 was the expression host for all the in vivo studies in H. volcanii.

E. coli DH5α

E. coli DH5α competent cells were prepared by SEM (Simple and Efficient Method) as described by Inoue et al. (1990). An E. coli DH5α starter culture (incubated overnight at 37°C) was used to inoculate 150 ml of SOB [2% (w/v) tryptone, 0.5% (w/v) yeast extract, 10 mM NaCl, 2.5 mM KCl, 20 mM MgCl2 and MgSO4] at 1:100 dilution. The culture was grown to an OD600 of 0.6 at 18°C with shaking (200 rpm). The flask was then transferred from the 18°C incubator to an ice bath and chilled for 10 minutes. The cells were spun down in four Oak Ridge centrifuge tubes in a Sorvall centrifuge (4000 x g at 4°C for 10 minutes). The pelleted cells were then resuspended in 40 ml of ice-cold TB solution (10 mM Pipes, 55 mM MnCl2, 15 mM CaCl2, 250 mM KCl, pH 6.7) by gentle vortexing, chilled on ice for 10 minutes, and spun down as described. The cell pellets were then resuspended in 10 ml of ice-cold TB solution, and ice-cold DMSO (dimethyl sulfoxide) was added slowly with gentle swirling to a final concentration of 7%. The cell suspension was then aliquoted into sterile 1.5 ml Eppendorf tubes (50 µl/tube) and stored at -70°C until needed.
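The volume of DMSO needed to reach a 7% final concentration is not given explicitly. A minimal sketch of the calculation follows, assuming that the 7% is volume per volume of the final mixture (the text does not say so) and that volumes are additive; the function name is hypothetical.

```python
def additive_volume(initial_volume, final_fraction):
    """Volume to add so that the additive makes up final_fraction (v/v) of the total.

    Solves v / (initial_volume + v) = final_fraction for v.
    """
    if not 0 <= final_fraction < 1:
        raise ValueError("final_fraction must be in [0, 1)")
    return final_fraction * initial_volume / (1.0 - final_fraction)


# DMSO needed for 10 ml of cell suspension at 7% (v/v) final concentration.
dmso_ml = additive_volume(10.0, 0.07)
print(round(dmso_ml, 2))  # 0.75 ml
```

Under these assumptions about 0.75 ml of DMSO would be added to the 10 ml suspension; adding 0.70 ml instead would give 6.5% of the final volume.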

For transformation, competent cells were thawed on ice and 1 to 5 µl of miniprep DNA or ligation mixture (containing 10 to 20 ng of DNA) was added to each tube. Following a 30-minute incubation on ice, the transformation mixture was heat-shocked for 30 seconds at 42°C and chilled on ice. Then 450 µl of SOC (SOB plus 20 mM glucose) was added, and the cells were incubated at 37°C with shaking for 1 hour. After incubation, 20 to 100 µl of the cell suspension was plated on LB plates containing 10 pM X-gal, 0.33 pM IPTG, and ampicillin (100 µg per ml).

E. coli JM110

Preparation of highly efficient competent cells appeared more difficult for E. coli JM110 cells than for E. coli DH5α cells. Three different methods for the preparation of competent E. coli JM110 cells were compared: the CaCl2 procedure (Sambrook et al., 1989), the SEM (described in the DH5α section), and the one-step protocol described by Chung et al. (1989). The last method yielded E. coli JM110 cells with the highest competency. Therefore, the JM110 competent cells were routinely prepared according to the one-step protocol. Typically, 100 ml of JM110 culture in LB broth was grown to an OD600 of 0.3-0.4 at 37°C. The cells were centrifuged (4000 x g at 4°C for 10 minutes) in Oak Ridge tubes and resuspended in 1/10 volume of ice-cold 1X TSS solution [LB broth with 10% (w/v) PEG (600), 5% (v/v) DMSO, and 20-50 mM Mg2+, pH 6.5]. The cell suspension was stored as 100-µl aliquots in 1.5 ml Eppendorf tubes at -70°C and thawed on ice when needed.

For transformations using the pWL vector isolated from E. coli DH5α, 1 µl of miniprep DNA (approximately 0.1 µg) was added to the competent cells on ice and this mixture was incubated for 1 hour on ice. Following this incubation, 900 µl of SOC solution was added, and the mixture was incubated for 1 hour at 37°C with shaking. At the end of this incubation, cells were concentrated 10-fold and plated on LB plates supplemented with ampicillin (100 µg/ml).

H. volcanii WFD11

To prepare H. volcanii WFD11 competent cells, 100 ml of culture in complex medium was grown in a 250 ml flask to mid-log phase (OD550 0.45-0.55) at 37°C with shaking. The cells were then spun down as described (see E. coli DH5α) and gently resuspended in 0.1 volume of spheroplasting solution (0.8 M NaCl, 27 mM KCl, 15% sucrose, 50 mM Tris-HCl, pH 8.2) containing 15% glycerol (w/v). The cell suspension was stored in 100-µl aliquots in 1.5 ml Eppendorf tubes at -70°C until needed.

For transformation, the frozen competent cells were thawed on ice and then brought to room temperature. All procedures, unless stated otherwise, were performed in Eppendorf tubes at room temperature. For each 100 µl of spheroplast cell suspension, 10 µl of 0.5 M EDTA (pH 8.0) was added with gentle mixing, followed by the addition of a 5 µl DNA sample containing 1 µl of DNA (approximately 0.1 µg plasmid DNA) mixed with 4 µl of spheroplasting solution. After the mixture was allowed to sit for 5 minutes, an equal volume of 60% polyethylene glycol 600 (PEG) (600 µl of PEG plus 400 µl of spheroplasting solution) was gradually added with swirling. After a 15-minute incubation, 1 ml of regeneration broth (3.5 M NaCl, 0.15 M MgSO4, 50 mM KCl, 7 mM CaCl2, 15% sucrose, and 50 mM Tris-HCl, pH 7.2) was added with gentle inversion; the cells were pelleted in an Eppendorf centrifuge at 14,000 rpm for 2 minutes; the supernatant was discarded; and the cells were resuspended in 500 µl of fresh regeneration broth. The cell suspension was incubated at 42°C overnight, and 100 µl of cells was plated on H. volcanii complex medium plates containing 20 µM mevinolin. The plates were placed in a plastic bag with a damp paper towel and incubated at 42°C until colonies appeared (approximately 7-10 days).

Cloning

Preparation of the Insert DNAs

Termination elements examined in the in vivo termination study were generally DNA fragments prepared by the polymerase chain reaction (PCR) or by annealing complementary oligonucleotides. Synthetic oligonucleotides (see Table 2.1) containing appropriate restriction sites at their 5’ end were used as primers for PCR. A typical 100 µl PCR reaction mixture contained 20 to 50 ng of template DNA, 150 pmoles of each primer, 2 µl of 10 mM dNTPs, 10 µl of 10X Taq polymerase buffer (300 mM Tricine, pH 8.0; 20 mM MgCl2; 50 mM β-mercaptoethanol; 0.1% gelatin; and 1% Thesit), and 2.5 units of Taq DNA polymerase. PCR was performed using 30 amplification cycles. Each amplification cycle included template denaturation at 94°C for 30 seconds, primer annealing at 50°C to 55°C (depending on the Tm of the primers) for 30 seconds, and primer extension at 72°C for 30 seconds to 2 minutes (depending on the size of the DNA fragment to be amplified). The PCR products were precipitated, digested with appropriate restriction enzymes, analyzed by nondenaturing polyacrylamide gel electrophoresis, and purified from the gel by the elution method as described earlier.
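The final concentrations in the 100 µl reaction follow from the stock concentrations and volumes listed above. The sketch below works them out for illustration; the helper name is hypothetical, and the "10 mM dNTPs" stock is taken at face value because the text does not say whether 10 mM refers to each nucleotide or to the total.

```python
REACTION_VOLUME_UL = 100.0


def diluted(stock_conc, stock_volume_ul, total_volume_ul=REACTION_VOLUME_UL):
    """Concentration after diluting stock_volume_ul of stock into total_volume_ul."""
    return stock_conc * stock_volume_ul / total_volume_ul


dntp_mM = diluted(10.0, 2.0)       # 2 µl of 10 mM dNTP stock
tricine_mM = diluted(300.0, 10.0)  # 10 µl of 10X buffer, 300 mM Tricine
mgcl2_mM = diluted(20.0, 10.0)     # 10 µl of 10X buffer, 20 mM MgCl2
primer_uM = 150.0 / REACTION_VOLUME_UL  # 150 pmol in 100 µl; pmol/µl equals µM

print(dntp_mM, tricine_mM, mgcl2_mM, primer_uM)  # 0.2 30.0 2.0 1.5
```

Each primer is therefore present at 1.5 µM, with 0.2 mM dNTPs and 2 mM MgCl2 in 1X buffer.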

Some small DNA fragments were prepared by annealing synthetic complementary oligonucleotide pairs. To improve subsequent cloning efficiency, 200 pmoles of each oligonucleotide was phosphorylated at its 5’ end using T4 polynucleotide kinase. Reactions were carried out according to the manufacturer’s instructions for 5’ end-labeling. After phosphorylation, the oligonucleotides were precipitated and resuspended in 20 µl of annealing buffer (10 mM Tris-HCl, pH 7.9; 2 mM MgCl2; 50 mM NaCl; 1 mM EDTA). The annealing mixture was heated to 95°C for 5 minutes and allowed to cool gradually to room temperature. The annealed DNAs were separated by nondenaturing polyacrylamide gel electrophoresis and purified from the gel by the elution method.

In the initial cloning of the heat shock gene, MluI fragments of cosmid A199 that hybridized to heat-shock specific RNA were purified from agarose gels by the Prep-A-Gene™ method and the ends made flush with the Klenow fragment. The Klenow reaction was carried out at room temperature for 30 minutes in a 20 µl mixture, which contained 0.1 µg DNA, 2 µl of 10X nick-translation buffer (0.5 M Tris-HCl, pH 7.2; 0.1 M MgSO4; 1 mM dithiothreitol; 500 µg/ml bovine serum albumin), 2 nmoles of dNTPs, and 1 unit of Klenow fragment. In the subsequent subcloning, the insert DNAs were mostly restriction fragments of cosmid A199 or of the two MluI fragment-containing subclones, purified by the Prep-A-Gene™ method.

For overexpression of the heat shock gene in H. volcanii, a DNA fragment containing the entire heat shock gene including its native promoter was amplified from H. volcanii genomic DNA by PCR. The two synthetic oligonucleotide primers (HSCOMP1&2, see Table 2.2) used in the PCR also introduced synthetic BamHI endonuclease restriction sites at both ends of the gene. The PCR reaction mixture contained 200 ng of H. volcanii genomic DNA as the template, 150 pmoles of each primer, and other components as described earlier. The PCR conditions were different from those previously described. Prior to amplification, the reaction mixture was incubated at 97°C for two minutes followed by addition of Taq DNA polymerase. Each amplification cycle included template denaturation at 94°C for 1 minute, primer annealing at 55°C for 1 minute, and primer extension at 72°C for 5 minutes (Ponce and Micol, 1992). The PCR product was extracted with phenol/chloroform, precipitated, resuspended in ddH2O, and digested with BamHI restriction enzyme. The restricted DNA fragments were separated in an agarose gel, and the desired fragments were purified by the Prep-A-Gene™ method.

Vectors and Ligation Reactions

The termination element was normally cloned into the pUC19-derived plasmid vector which contained the H. volcanii tRNALys promoter (Nieuwlandt and Daniels, 1990) and the yeast tRNAProM gene (Palmer and Daniels, 1994) or the H. volcanii tRNATrpO 16M-AGGAG reporter gene (Nieuwlandt et al., 1993). The pUC vector was linearized with appropriate restriction enzymes, precipitated, and dephosphorylated with calf intestine alkaline phosphatase (CIAP; product of BRL) as described by the manufacturer. The dephosphorylation mixture was then subjected to agarose gel electrophoresis, and the vector DNA was purified by the Prep-A-Gene™ method. T4 DNA ligase reactions (BRL), typically containing 25-50 ng of pUC vector DNA, were performed according to the manufacturer’s instructions.

For the purpose of sequencing the heat shock gene, pUC19 vectors digested with appropriate restriction enzymes and prepared as described earlier were used in all the cloning and subcloning. For overexpressing the heat shock gene from its native promoter in H. volcanii, a new E. coli-H. volcanii shuttle vector was constructed. This vector, designated pWL200, was generated by replacing the HindIII-EcoRI region of pWL202 (Nieuwlandt and Daniels, 1990) with the multiple cloning region of pUC19. This removed the H. volcanii tRNALys promoter and ensured that the cloned gene would be expressed from its native promoter. The DNA fragment containing the heat shock gene was then ligated into the BamHI site of the pWL200 vector.

Identification of Clones

The ampicillin-resistant E. coli DH5α transformants obtained from transforming with the ligation mixture of pUC vector and termination elements were analyzed by restriction analysis, typically using XbaI and EcoRI restriction endonucleases. The size of the XbaI-EcoRI DNA fragment separated from the larger vector-containing DNA fragment indicated the number of inserts cloned into the vector. The sequence accuracy and the orientation of the insert in the single insert-containing constructs were then determined by DNA sequence analysis (described in the Sequence Analysis section). Once the construct in the pUC vector (propagated in E. coli DH5α) was verified, the HindIII-EcoRI fragment carrying the complete termination expression module was removed by restriction enzyme digestion, separated by 1.5% agarose gel electrophoresis, purified using the “freeze-squeeze” method, and cloned into the pWL shuttle vector. When cloning with the pWL vector, a higher quantity of vector DNA (typically 100-200 ng) compared to pUC vector (25-50 ng) was used in the ligation reaction. The pWL constructs were then passed through E. coli DH5α and E. coli JM110 (for modification of the methylation pattern), and then introduced into H. volcanii.

Identification of the E. coli DH5α and E. coli JM110 clones that carried the pWL vector containing the HindIII-EcoRI expression module was done by restriction analysis. Because of the difficulty in obtaining good quality plasmid preparations, recombinant plasmids from H. volcanii clones were verified by transforming E. coli DH5α with the H. volcanii miniprep DNA (prepared by the alkaline lysis procedure) and sequencing the plasmid DNA isolated from E. coli DH5α.

For the heat shock studies, subclones of the heat shock gene in pUC19 were easily identified as white E. coli DH5α colonies on LB plates containing ampicillin (100 µg/ml). The heat shock gene subclones were then further verified by restriction analysis. E. coli DH5α carrying the pWL200 vector with the entire heat shock gene was screened by colony hybridization (see Colony Hybridization section) and verified by restriction analysis. Once verified, the pWL vector was passed through E. coli JM110 and introduced into H. volcanii. Identification of clones at each stage was performed as above.

Nucleic Acid Sequence Analysis

DNA Sequencing

Plasmid DNA for sequencing was prepared by the alkaline lysis miniprep method (described earlier). Typically, 2-5 µg of DNA was used in a single sequencing reaction. The volume of the DNA solution was adjusted to 200 µl with ddH2O, and the DNA was extracted with a 1:1 phenol/chloroform mixture to remove the RNaseA added to the DNA miniprep earlier. After transferring the aqueous layer containing the DNA to a clean Eppendorf tube, the DNA was denatured by incubation with 20 µl (1/10 volume) of 2 N NaOH/2 mM EDTA for 30 minutes at 37°C. The single-stranded DNA was then recovered by ethanol precipitation, dried, and used for subsequent sequencing. The DNA was sequenced according to the manufacturer’s instructions for the Sequenase T7 DNA polymerase sequencing kit (United States Biochemical, USB) or for the Bst DNA Polymerase sequencing kit (BioRad). Both procedures are based on the dideoxynucleotide chain termination method described by Sanger (Sanger et al., 1977). The sequencing primers used were either universal primers purchased from USB or custom-synthesized oligonucleotide primers. Sequencing products were subjected to 6% denaturing polyacrylamide gel electrophoresis (described previously) and visualized by autoradiography.

Comparative Sequence Analysis and RNA Structure Prediction

DNA nucleotide sequences were analyzed using the EditSeq program (DNASTAR, Inc., Madison, WI) and compared with known protein-coding genes in the NCBI data bank. Based on the search results, the gene sequences scoring high in the search reports were downloaded from Entrez Browser by gene accession numbers, and the collected gene sequences were then aligned using the MegAlign program (DNASTAR, Inc.). For prediction of RNA secondary structures, the mFold RNA folding program (Zuker, 1994) was used (http://www.ibc.wustl.edu/~zuker/ma/forml.cgi).

Radiolabeling and Purification of DNA Probe

Radiolabeling of DNA Probe

5’ end 32P-labeled oligonucleotides were used as probes in Southern, Northern, and colony blot hybridizations, and as primers in primer extension analysis. A standard labeling reaction mixture contained the following components: 5 pmole of synthetic oligonucleotides, 5 µl of 5X forward reaction buffer (350 mM Tris-HCl, pH 7.6; 50 mM MgCl2; 500 mM KCl; 5 mM 2-mercaptoethanol) supplied by the manufacturer, 1 µl (10 units) of T4 polynucleotide kinase, 20 µCi [γ-32P]ATP (7000 Ci/mmole), and ddH2O to a final volume of 25 µl. The reaction was incubated at 37°C for 30 minutes. Following incubation, the labeled probe was added directly to the hybridization solution or purified by gel filtration (described in the next section).

For the initial identification of the heat shock gene by Northern analysis, DNA probes were prepared by random-primer labeling of MluI-restricted fragments derived from cosmid A199. MluI fragments from cosmid A199 were purified from an agarose gel, and 50 ng of DNA was typically used in one labeling reaction. Reactions were carried out according to the instructions provided for the Random Primers DNA Labeling System (BRL). For this procedure, the DNA to be labeled was placed in a 1.5 ml Eppendorf tube, denatured for 5 minutes in a boiling water bath, and quickly chilled on ice. The following components were then added to the reaction tube: 2 µl of 0.5 mM dATP, dGTP, dTTP; 15 µl of random primer buffer mixture (0.67 M HEPES, 0.17 M Tris-HCl, 17 mM MgCl2, 33 mM 2-mercaptoethanol, 1.33 mg/ml BSA, 18 OD260 units/ml oligodeoxyribonucleotide primers, pH 6.8); 50 µCi of [α-32P]dCTP; ddH2O up to a final volume of 49 µl; and 1 µl of the Klenow fragment. The reaction mixture was incubated at 25°C for one hour, terminated by adding 5 µl of 0.2 M EDTA (pH 7.5), and then immediately added to the hybridization solution.

Purification of Radiolabeled DNA Probe

After the labeling reactions, DNA probes to be used for primer extension or S1 nuclease mapping (described in a later section) were purified before they were added to the RNA solutions. To remove unincorporated radionucleotides, the reaction mixture was passed through a Sephadex G-25 size-exclusion gel filtration column, which was prepared by pouring a suspension of G-25 in TE buffer (pH 8.0) into a borosilicate glass pipette plugged with glass wool. The end-labeling reaction mixture was brought up to a total volume of 200 µl with TE buffer and applied to the column. DNA probes were eluted with TE buffer, and 600 µl of the eluate containing the majority of the labeled DNA was collected. The DNAs were extracted once with phenol and chloroform to remove any possible RNase contamination.

Colony Blot Hybridization

Colony Blotting

When cloning with the pWL vector, for which α-complementation is not available as a screening method, putative clones were identified by colony blot hybridization. To determine which transformants contained recombinant plasmids, colonies were picked with sterile toothpicks and streaked on plates containing the appropriate medium and antibiotic. The inoculation was done in such a fashion that the orientation of the replica colonies could be easily identified. The plates were incubated for 16 to 18 hours before performing colony lifts.

To prepare the colony lifts, ZetaProbe™-GT membrane (BioRad), which was cut to the size of the plates, was carefully placed over the patched colonies and smoothed with a sterile glass “hockey-stick”. After 2-5 minutes, the membrane was lifted carefully, placed on two layers of 3 MM Whatman paper saturated with 0.4 M NaOH for 5 minutes to lyse the cells, and briefly blotted dry. The cell lysis procedure was repeated once, followed by two 5-minute washes with 2X SSC [20X SSC (pH 7.4) contained 175.3 g of NaCl and 88.3 g of sodium citrate per liter] and 0.2% SDS to remove cellular debris. The blots were allowed to air-dry and the DNA was fixed to the membrane by UV crosslinking (125 mJ/cm2) using the GS GeneLinker™ (BioRad) before proceeding to hybridization. The plates were incubated for 4 hours at 37°C to regenerate visible colonies.
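The 20X SSC recipe is given by weight, so the molar composition of the 20X stock and of the 2X wash can be checked with a short calculation. The sketch below assumes that "sodium citrate" means trisodium citrate dihydrate (294.10 g/mol), which the text does not specify; the function name is hypothetical.

```python
NACL_G_PER_MOL = 58.44
# Assumption: the sodium citrate used was trisodium citrate dihydrate.
SODIUM_CITRATE_DIHYDRATE_G_PER_MOL = 294.10


def molarity(grams_per_liter, g_per_mol):
    """Convert a mass concentration (g/l) to a molar concentration (mol/l)."""
    return grams_per_liter / g_per_mol


nacl_20x = molarity(175.3, NACL_G_PER_MOL)
citrate_20x = molarity(88.3, SODIUM_CITRATE_DIHYDRATE_G_PER_MOL)

# 20X stock, then the 2X wash solution (a 10-fold dilution of the stock).
print(round(nacl_20x, 2), round(citrate_20x, 2))            # 3.0 0.3
print(round(nacl_20x / 10, 2), round(citrate_20x / 10, 2))  # 0.3 0.03
```

On this assumption the 20X stock is 3 M NaCl and 0.3 M sodium citrate, so the 2X wash contains 0.3 M NaCl and 0.03 M sodium citrate.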

Hybridization and Detection

Membranes were prehybridized at 65°C for 5 to 30 minutes in sealed plastic bags containing hybridization solution [0.25 M NaH2PO4 (pH 7.2), 7% SDS; 150 µl of solution per cm2 of membrane]. After prehybridization, the solution in the bag was discarded and replaced with fresh hybridization solution. 32P-labeled probes were added to the bag and allowed to hybridize overnight (16 to 20 hours) at 65°C. Excess and nonspecifically bound DNA probes were removed from the membrane by two 15-minute washes in a solution (500 µl/cm2) of 2X SSC and 0.5% SDS at room temperature. The signals were visualized and quantitated using the InstantImager™ (Packard Instrument Company, Downers Grove, IL). In addition, blots were exposed to X-ray film for a permanent record.

Northern Analysis

The in vivo termination activities of all the potential terminator elements were evaluated by Northern analysis. RNAs isolated by the Trizol™ method, as described previously, were separated by denaturing polyacrylamide gel electrophoresis, and the RNAs were transferred to ZetaProbe™-GT nylon membrane by electroblotting using the Genie Electrophoretic Blotter (Idea Scientific Company; Minneapolis, MN). The blotting procedure was carried out according to the manufacturer’s instructions in 1X TBE buffer for 1 hour at 12 volts. The RNAs were fixed to the membrane by UV-crosslinking (125 mJ/cm2). The procedure for prehybridization, hybridization, and washing of the blots was the same as described in the Colony Blot Hybridization section, except that the hybridization temperature was 50°C.

To determine the amount and the size of the heat-responsive RNA, RNAs were separated by formaldehyde agarose gel electrophoresis (described previously) and transferred to ZetaProbe™-GT nylon membrane. Capillary transfer of the RNAs to the membrane was carried out in 10X SSC for 16 to 20 hours (Thomas, 1980). The remaining steps in this procedure were as described above.

Southern Analysis

Southern blot analysis was done to detect the presence of other related heat shock genes in the H. volcanii genome (Southern, 1975). Various single and double restriction digestions of H. volcanii genomic DNA were performed. Approximately 5-10 µg of genomic DNA was used in each restriction reaction; digests performed included MluI, XhoI, SstI, ApaI, ApaI/SstI, MluI/XhoI, XhoI/SstI, and MluI/SstI. Restricted DNA fragments were separated by electrophoresis in a 0.7% agarose gel and transferred onto ZetaProbe™-GT membrane by capillary transfer overnight in 0.4 M NaOH. The probe was prepared by random-primer labeling of the complete heat-shock gene fragment generated by PCR. Probe hybridization and signal detection were carried out as previously described in the Colony Blot Hybridization section.

Primer Extension

Annealing of DNA Probes to RNA

For each primer extension reaction carried out in the termination studies, approximately 10 ng of a 5’ 32P end-labeled proEXI oligonucleotide primer (1/10 of total labeled probe) was added to the total RNA isolated from 1 ml of H. volcanii cells, and the DNA/RNA mixture was recovered by ethanol precipitation. The pellet was washed with 75% ethanol, dried briefly under vacuum, and resuspended in hybridization buffer containing 0.3 M NaCl, 2 mM EDTA, and 10 mM Tris (pH 7.5). The mixture was heated to 80°C for 4 minutes and then incubated at the annealing temperature (50°C) for 1 hour. The appropriate control (minus RNA in the reaction) was also carried out.

Annealing of DNA probes to total RNA (isolated under different conditions) in the heat shock gene study was carried out in three 1.5 ml Eppendorf tubes, each containing 60 µl of the 5’ end-labeled HSPE oligonucleotide (approximately 10 ng). In an effort to balance the signal intensity derived from the heat-responsive RNA isolated under different conditions (non-shocked, heat-shocked, and salt-shocked), various amounts of total RNA were added to the probes: one tube contained RNAs isolated from 4 ml of non-shocked cells; a second tube contained RNAs isolated from 1 ml of heat-shocked cells (60 minutes at 60°C); and a third tube contained RNAs isolated from 2 ml of salt-shocked cells (60 minutes in complex medium containing 0.9 M NaCl). Annealing reactions were performed as described above.

Reverse Transcription and Product Analysis

Prior to the reverse transcription reaction, the primer-RNA hybrid was precipitated. To the pellet in each tube, 12 µl of DEPC-treated ddH2O, 2 µl of 0.1 M DTT, 1 µl of 10 mM dNTP, and 4 µl of 5X first-strand buffer [250 mM Tris-HCl (pH 8.3 at room temperature), 375 mM KCl, and 15 mM MgCl2; provided with SUPERSCRIPT™ II by BRL] were added. The mixture was equilibrated at 37°C for 2 minutes, and reverse transcription was initiated by the addition of 1 µl of SUPERSCRIPT™ RNase H⁻ reverse transcriptase (200 units). After allowing the extension reaction to proceed for one hour, RNAs in the reaction mixture were removed by adding 1 µl of RNase A (10 mg/ml) and incubating the tubes at 37°C for 25 minutes. After the RNA digestion, 10 µl of sequencing reaction loading buffer (USB sequencing kit) was added to the final mixture. The reverse-transcribed DNA products were analyzed on a sequencing gel along with a sequencing ladder prepared with the same primer and a DNA template carrying the target gene.
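The reaction composition described above can be checked with a short dilution calculation. The following is a minimal sketch, assuming the precipitated pellet contributes negligible volume; the helper name `final_concentration` is illustrative and not part of the original protocol.

```python
# Volumes (in microliters) added to each primer-RNA pellet, as listed in the text.
components_ul = {
    "DEPC-treated ddH2O": 12.0,
    "0.1 M DTT": 2.0,
    "10 mM dNTP": 1.0,
    "5X first-strand buffer": 4.0,
    "reverse transcriptase": 1.0,
}


def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Apply C1*V1 = C2*V2 and return C2 in the same unit as the stock."""
    return stock * stock_volume_ul / total_volume_ul


# Assumption: the pellet itself adds no measurable volume.
total_ul = sum(components_ul.values())

dtt_mM = final_concentration(100.0, components_ul["0.1 M DTT"], total_ul)
dntp_mM = final_concentration(10.0, components_ul["10 mM dNTP"], total_ul)
buffer_x = final_concentration(5.0, components_ul["5X first-strand buffer"], total_ul)

print(f"total volume: {total_ul} ul")
print(f"DTT: {dtt_mM} mM, dNTP: {dntp_mM} mM, buffer: {buffer_x}X")
```

Under that assumption the listed volumes sum to a 20 µl reaction in which the first-strand buffer is at its 1X working strength.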

S1 Nuclease Mapping

3’ End-Labeling of dsDNA Probe

To map the 3’ terminus of transcripts derived from the heat shock gene and selected terminator constructs, S1 nuclease analysis was carried out. The DNA probes used were prepared by PCR. The PCR products were amplified from appropriate template DNAs using oligonucleotide primers complementary to sequences spanning the 3’ terminus of the gene, the transcription terminator, and its downstream region. The PCR products were purified (as previously described in the Cloning section), digested with BstNI (in termination studies) or MspI (in the heat shock gene study), and separated by electrophoresis in an 8% nondenaturing polyacrylamide gel or a 1.5% agarose gel. The DNA fragments containing the putative terminator regions (identified by their known size) were purified from the gels by the elution or freeze-squeeze methods. The purified DNA fragments were then labeled at the 3’ end on the nontemplate strand by filling in the restriction site containing the 5’ overhang. Labeling reactions contained 50-80 ng of probe DNA, 10 µCi of [α-32P]dCTP (at the MspI site) or [α-32P]dATP (at the BstNI site), 7 units of Klenow, 2.5 µl of 10X Klenow buffer (0.5 M Tris-HCl, pH 7.5; 0.1 M MgCl2; 10 mM DTT; and 0.5 mg/ml BSA), and ddH2O up to a final volume of 25 µl. The reaction mixture was then incubated at 25°C for 30 minutes. The labeled DNA probes were immediately purified by passage through a Sephadex G-25 column and extracted with phenol/chloroform before addition to the RNAs for the annealing reaction.

Annealing of DNA Probes to RNA

Since dsDNA was used as the probe, it was particularly important to establish hybridization conditions that favored the DNA:RNA hybrid over the DNA:DNA hybrid during the annealing stage. To determine the best hybridization temperature, trial experiments were conducted in which hybridization was initially performed at three different temperature settings: Tm − 3°C, Tm, and Tm + 3°C. The Tm of the DNA probe was determined using the following formula (Berk, 1989):

$$T_m = 81.5 + 0.5\,(\%\,\mathrm{G+C}) + 16.6\,\log[\mathrm{Na^+}] - 0.6\,(\%\,\mathrm{formamide})$$
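As an illustration of how this formula behaves, the sketch below evaluates it for one probe. It assumes that the logarithm is base 10 and that the sodium concentration is in mol/l, neither of which is stated explicitly in the text. The 60% G+C content is a hypothetical value; the 0.4 M NaCl and 80% formamide are those of the hybridization buffer used in this section.

```python
import math


def probe_tm(gc_percent, na_molar, formamide_percent):
    """Evaluate the Tm formula quoted from Berk (1989).

    Assumptions: log is base 10 and na_molar is in mol/l.
    """
    return (81.5
            + 0.5 * gc_percent
            + 16.6 * math.log10(na_molar)
            - 0.6 * formamide_percent)


# Hypothetical probe with 60% G+C in 0.4 M NaCl and 80% formamide.
tm = probe_tm(gc_percent=60.0, na_molar=0.4, formamide_percent=80.0)
print(f"estimated Tm: {tm:.1f} C")
```

In this form every additional percent of formamide lowers the estimate by 0.6°C, which is why a high-formamide buffer brings the working temperature for a G+C-rich halophile probe down to a practical range.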

For all hybridization conditions, the DNA probe (from column elution) was precipitated alone (control) or coprecipitated with total RNA isolated from an H. volcanii culture. When precipitated with RNA, the typical ratio was 10 ng of DNA to total RNA from 1 ml of culture (~5 µg). The DNA or DNA/RNA pellets were then resuspended in 30 µl of hybridization buffer (40 mM PIPES, pH 6.4; 1 mM EDTA, pH 8.0; 0.4 M NaCl; and 80% formamide), denatured at 85°C for 10 minutes, and hybridized overnight (14-16 hours), followed by S1 nuclease digestion.

S1 Nuclease Digestion and Product Analysis

S1 digestion was initiated by adding 270 µl of S1 nuclease buffer (30 mM sodium acetate, 250 mM NaCl, and 1 mM ZnCl2) containing 270 units of S1 nuclease to the hybridization mixture. The S1 digestion was typically carried out at room temperature, with the exception that the yeast tRNAProM reaction was performed on ice. Aliquots of the digestion reaction were quenched at various times (normally 0, 10, 30, and 60 minutes) by adding 1/6 volume of stop buffer (4 M ammonium acetate; 0.1 M EDTA, pH 8.0; and yeast tRNA, 40 µg/ml). The products were then ethanol precipitated, and their sizes were determined by running them in a sequencing gel adjacent to a sequencing ladder that served as size markers.
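The dilution that occurs when the nuclease buffer is added to the hybridization mixture can be worked out directly. This is a minimal sketch, assuming the 30 µl hybridization volume given earlier in this section and simple additive volumes; the variable and function names are illustrative.

```python
# Volumes and amounts taken from the protocol text.
hyb_ul = 30.0         # hybridization mixture (0.4 M NaCl, 80% formamide)
s1_buffer_ul = 270.0  # S1 nuclease buffer (0.25 M NaCl) added to start digestion
s1_units = 270.0      # units of S1 nuclease in the added buffer

# Assumption: volumes are additive.
total_ul = hyb_ul + s1_buffer_ul

# Enzyme concentration in the added buffer, in units per ml.
units_per_ml = s1_units / s1_buffer_ul * 1000.0

# Final concentrations in the digestion reaction after mixing.
nacl_molar = (0.4 * hyb_ul + 0.25 * s1_buffer_ul) / total_ul
formamide_percent = 80.0 * hyb_ul / total_ul


def stop_buffer_ul(aliquot_ul):
    """Volume of stop buffer for an aliquot, at 1/6 of the aliquot volume."""
    return aliquot_ul / 6.0


print(total_ul, units_per_ml, round(nacl_molar, 3), formamide_percent)
print(stop_buffer_ul(60.0))
```

The calculation shows that the formamide carried over from the hybridization step is diluted tenfold before digestion begins.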

In each assay, various control reactions were included to ensure that the key components were working properly at every experimental stage. The first control reaction contained only the ethanol-precipitated labeled DNA probes; this control served to verify the success of the labeling reactions and the integrity and sizes of the probes prior to S1 digestion. In the second control, the DNA probes were precipitated, resuspended in hybridization solution, and digested by S1 nuclease; this control yielded the background signals that might have occurred as a result of S1 nuclease nicking the residual dsDNA in the experimental reaction. In the third control, the DNA probe was precipitated, resuspended in hybridization buffer, heat-denatured, and digested by S1 nuclease; this reaction served to check the efficiency of S1 nuclease digestion of ssDNA under the given conditions. The last control reaction was carried out along with the experimental reaction except that RNA was not included; this reaction was useful in identifying the protected products obtained specifically from the DNA:RNA hybrid in the experimental reaction.

Bent DNA Analysis

In order to determine the intrinsic bend of termination sequence elements, we constructed a new cloning vector, pUCbend, using pUC19 as the parental vector and the HindIII-EcoRI fragment from pBend2 (Zwieb et al., 1991) as the insert. The pBend2 plasmid, a low copy number pBR322 derivative containing a set of permuted restriction sites as well as unique SalI and XbaI sites in a HindIII-EcoRI fragment, was a gift from K. Sandman. Approximately 1 µg of pBend2 DNA was digested with HindIII and EcoRI restriction enzymes, and its fragments were separated by electrophoresis in an 8% nondenaturing polyacrylamide gel. The 241 bp HindIII-EcoRI fragment in the gel was extracted by the elution method and purified as described earlier. This fragment was ligated into the equivalent sites of pUC19, and the resulting recombinant plasmid, pUCbend, was purified from a 1% agarose gel by the Prep-A-Gene™ method.

The sequence elements to be examined for their bending ability were prepared by PCR in which XbaI restriction sites were added to each end of the fragments. These fragments were cloned into the pUCbend vector as XbaI-XbaI fragments and sequenced to verify the constructs. Individual digestion reactions using MluI, BglII, XhoI, EcoRV, PvuII, SmaI, NruI, KpnI, and BamHI were performed. The electrophoretic mobility of the individual fragments was examined using a large (32 cm x 40 cm) 8% nondenaturing polyacrylamide gel. The electrophoresis was performed at 4 V/cm for 2 hours, stopped to load the samples, and continued for 26 hours at 4 V/cm at room temperature. The low voltage setting prevented “warming-up” of the gel during the electrophoretic process. To visualize the DNA, the gel was stained with ethidium bromide solution and photographed on a UV-transilluminator.

Induction of the Heat Shock Gene

Heat-Shock Induction

A temperature-course study was performed to determine the temperature giving maximum induction of the heat shock gene. An H. volcanii culture (200 ml) was grown to an OD550 of 0.6 to 0.8 at 37°C with shaking (150 rpm). Aliquots (20 ml each) from the 200 ml culture were transferred into 150 ml pre-warmed flasks and placed in shaker water baths (150 rpm) set at 37°C, 45°C, 50°C, 55°C, 60°C, 65°C, or 70°C for 45 min. For each temperature point, RNAs were isolated immediately from 5 ml of cells using the RNAeasy™ method and stored at -70°C in 75% ethanol until needed.

Once the optimum heat-induction temperature was determined from the temperature-course study, a time-course study was conducted. In the time-course study, 200 ml of an H. volcanii culture (OD550 0.6-0.8) growing at 37°C was transferred to a shaker water bath (150 rpm) set at the optimum heat-shock response temperature. Aliquots (5 ml each) of the 200 ml culture were removed from the flask at time 0 and at 15-minute intervals for up to 90 minutes. Total RNA from each cell aliquot was immediately isolated.

Salt-Shock Induction

To study the effect of reduced salt concentrations on the expression of the heat shock gene, 60 ml of an H. volcanii culture was split into four 50-ml culture tubes. Cells were collected by centrifugation (4000 rpm, 10 minutes) at room temperature and resuspended thoroughly in 15 ml of complex medium (pre-warmed to 37°C) containing 2.2 M (100%), 1.76 M (80%), 1.32 M (60%), or 0.88 M (40%) NaCl. The culture tubes with the cell suspension were then incubated at 37°C with shaking for one hour, and 5 ml from each tube was removed for RNA extraction at the end of the salt-shock challenge.
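The NaCl concentration corresponding to each percentage level follows from the full-strength value. The sketch below recomputes the series, taking 2.2 M as 100% as the text does; the function name is illustrative.

```python
# Full-strength NaCl concentration of the complex medium (100%), from the text.
FULL_STRENGTH_NACL_M = 2.2


def nacl_molar(percent_of_full_strength):
    """NaCl molarity at a given percentage of the full-strength medium."""
    return round(FULL_STRENGTH_NACL_M * percent_of_full_strength / 100.0, 2)


levels = {percent: nacl_molar(percent) for percent in (100, 80, 60, 40)}
for percent, molar in levels.items():
    print(f"{percent:>3}% -> {molar} M NaCl")
```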

Name Sequence (5’→3’)

pWL vector sequencing primer:

pWLreverse GGCGTATCACGAGGCCC

For yeast tRNAProM terminator:

Term1 CTAGTCTAGAGGTACCCACTGCTACTTTAA

Term2 GGCCTCTAGACCCGAATAATTTTTTTTTGCC

PIIIT9.1 GATCCAATTTTTTTTTGCG

PIIIT9.2 GATCCGCAAAAAAAAATTG

For constructing new expression vector (sptProM):

Newt5 GGCCGAGCTCAATAATTTTTTTTTGCC

Newt3 CTAGGAGCTCACTGCTACTTTAATTTTATAC

TLess3 GTACGGATCCCGGGGCGAGCTGGGAATTGA

For H. volcanii rRNA operon terminator:

5STL GAACTCTAGATTCATACTTCACAG

5STR AGGTTCTAGACCGTGCGTGCCGCTCTG

Trich.l GATCCGGCTTTGTTCATTTTATGCG

Trich.2 GATCCGCATAAAATGAACAAAGCCG

ShorT.l GACTGGATCCATTCATACTTCACAG

ShorT.2 TCGAGGATCCTGAACAAAGCCACTG

Table 2.1: Synthetic oligonucleotides used in the termination study. Underlined sequences indicate synthetic restriction sites.


DeSTLP.l GACTGGATCCGCGGTGTCCAGTGGC

DeSTLP.2 TCGAGGATCCCCGTGCGTGCCGCTC

For E. coli trpA terminator:

TrpA.l

TrpA.2 GATCCAAAAAAAAGCCCGCTCATTAGGCGGGCTG

For AT-rich bent-DNA element:

A3T3.1 GATCCAAATTTGTCCGAAATTACTGAG

A3T3.2 GATCCTCAGTAATTTCGGACAAATTTG

Atract.1 GATCCGCGTCTTTTTGGCATCTTTTTCATGG

Atract.2 GATCCCATGAAAAAGATGCCAAAAAGACGC

For H. volcanii tbp2 terminator:

TBP2Term.l GACTGGATCCGCGCTCCCGAAGTATCC

TBP2Term.2 TCGAGGATCCATTTCACACTTATGGG

TBP2Term.3 TCGAGGATCCCCGCGAAAAATCGG

TBP2Term.4 GACTGGATCCCCGATTTTTCGCGG

TBP2Mu5’ GACTGGATCCTGAGCAGACGTCATACCGATTTTTCGCGGTGCTAGG

TBP2(A4)3’ TCGAGGATCCCACACTTATGGGTGTTC

TBP2(A9)3’ TCGAGGATCCTTATGGGTGTTCAGACACC

TBP2(A20)3’ TCGAGGATCCCAGACACCTAGCACCGC

TBP2(A28)3’ TCGAGGATCCTAGCACCGCGAAAAATCGG

TBP2TAT GACTGGATCCCCGATTAATCGCGGTGCTAGG

TBP2T(A)10.1 GATCCCCGATTTTTAACGGTGCTAGGTGTCTGG

TBP2T(A)10.2 GATCCCAGACACCTAGCACCGTTAAAAATCGGG

TBP2T(A)12.1 GATCCCCGATTTTTCGAAGTGCTAGGTGTCTGG

TBP2T(A)12.2 GATCCCAGACACCTAGCACTTCGAAAAATCGGG

TBP2T(A)14.1 GATCCCCGATTTTTCGCGAAGCTAGGTGTCTGG

TBP2T(A)14.2 GATCCCAGACACCTAGCTTCGCGAAAAATCGGG

TBP2T(A)16 TCGAGGATCCCAGACACCTATTACCGCG

TBP2T(A)18 TCGAGGATCCCAGACACCTTGCACCGCG

TBP2T(A)20 TCGAGGATCCCAGACATTTAGCACCGCG

TBP2T(A)22 TCGAGGATCCCAGATTCCTAGCACCGCG

TBP2T(A)24 TCGAGGATCCCATTCACCTAGCACCGCG

TBP2T(A)26 TCGAGGATCCTTGACACCTAGCACCGCG

TBP2T(A)14.20-1 GACTGGATCCCCGATTTTTCGCGAAGC

TBP2T(A)14.20-2 TCGAGGATCCCAGACATTTAGCTTCGCGAAAAATCG

TBP2T(CACC)20 TCGAGGATCCCAGAGGTGTAGC

TBP2T(A14)5(GGTG) TCGAGGATCCCAGACACCGTCAGTAGCTTCGCGAAAAATCG

TBP2T(A14)10(GGTG) TCGAGGATCCCAGACACCATCGTGTCAGTAGCTTCGCGAAAAATCG

TBP2mu3’ TCGAGGATCCTCTGACTTGCATGCCGCGAAAAATCG

Tinsert GCTAGGTACCCGGGGAAAAATCCCAGACACC

TBP2T4 GACTGGATCCCCGATTTTCGCGGTG

TBP2T3 GACTGGATCCCCGATTTCGCGGTGC

TBP2A4 GACTGGATCCCCGATTTATCGCGG

TBP2G4 GACTGGATCCCCGATTTGTCGCGG


TBP2C4 GACTGGATCCCCGATTTCTCGCGG

TBP2A3 GACTGGATCCCCGATTATTCGCGG

TBP2G3 GACTGGATCCCCGATTGTTCGCGG

TBP2C3 GACTGGATCCCCGATTCTTCGCGG

TBP2A2 GACTGGATCCCCGATATTTCGCGG

TBP2G2 GACTGGATCCCCGATGTTTCGCGG

TBP2C2 GACTGGATCCCCGATCTTTCGCGG

TBP2TSNew.l GATCCGATTTTTCGCGGTAC

TBP2TSNew.2 CGCGAAAAATCG

For H. volcanii cct1 (heat shock gene) terminator:

HSTerm.1 GACTGGATCCGCGTCCCAGTAGGCC

HSTerm.2 GACTGGATCCCGGCCAAATTCACGA

HST(A11)3’ TCGAGGATCCGGGTGTAAAGCCGCA

HST(A16)3’ TCGAGGATCCTAAAGCCGCAAAAAACGTCG

HST(A20)3’ TCGAGGATCCGCCGCAAAAAACGTCGTG

For bent DNA analysis:

PIIITX.1 GACTTCTAGAATAATTTTTTTTTGCCTATC

PIIITX.2 TCGATCTAGACTGCTACTTTAATTTTATAG

TBP2(A20)X.1 GACTTCTAGACCGATTTTTCGCGG

TBP2(A20)X.2 TCGATCTAGACAGACACCTAGCACC

PIIIT3X GACTTCTAGAATAATTTGCCTATC


TBP2TATX GACTTCTAGACCGATTAATCGCGG

HSdeLX.l GACTTCTAGACGGCCAAATTCACG

HSdeLX.2 TCGATCTAGACAGCTGCTAACGGG

AtractX.1 CTAGAGCGTCTTTTTGGCATCTTTTTCATGT

AtractX.2 CTAGACATGAAAAAGATGCCAAAAAGACGCT

For 16 nts-extension in vector:

plus 16 CCCGGGTACCATCGTCGAAGTCACTGGAGCTCAATAATTTTTTTTTGCCTATC

For H. volcanii tRNALys promoter:

LysPT29 TCGAGGATCCGGAAAGTCTTTTTACCC

LysPA27 TCGAGGATCCGGAAAGTCATATTACCC

LysPA25 TCGAGGATCCGGAAAGTCATTTAACCC

LysP3’ TAGCGGATCCACTGCCGGTGGG

Name Sequence (5’→3’)

M13/pUC19 Universal primer F(-20)* GTAAAACGACGGCCAGT

M13/pUC19 Universal primer F(-40)* GTTTTCCCAGTCACGACG

M13/pUC19 Universal primer R(-24)* AACAGCTATGACCATG

M13/pUC19 Universal primer R(-48)* AGCGGATAACAATTTCACACAGGA

HSPE CATCTTGTCCATCCCTTTGGGGCC

Hsendf ACCGGCAGCGACGACGAC

Hsendr GCCGTCGTCGTCGTCGCT

HsendO CGAATCGACTCGACTCGCTCGGCC

HSCOMP1 ACGCGGATCCCGTCTGTTCAACTGAGACGC

HSCOMP2 GTCCGGATCCCAGCTGCTAACGGGTGTAAA

Table 2.2: Synthetic oligonucleotides used in the heat shock gene study. * Oligos purchased from United States Biomedical. Underlined sequences indicate the BamHI restriction site.
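Several of the ".1/.2" oligonucleotide pairs in Table 2.1 begin with GATC, which suggests that they were designed to anneal into a duplex with 5’ GATC overhangs compatible with a BamHI-cut vector. That design intent is an inference from the sequences, not something stated in the table. The sketch below checks the property for two pairs copied from Table 2.1; the function names are illustrative.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_with_gatc_overhangs(oligo_1, oligo_2):
    """True if the oligos pair everywhere except a 5' GATC on each strand."""
    return (oligo_1.startswith("GATC")
            and oligo_2.startswith("GATC")
            and reverse_complement(oligo_1[4:]) == oligo_2[4:])


# Sequences copied from Table 2.1 (PIIIT9.1/.2 and Trich.1/.2).
pairs = {
    "PIIIT9": ("GATCCAATTTTTTTTTGCG", "GATCCGCAAAAAAAAATTG"),
    "Trich": ("GATCCGGCTTTGTTCATTTTATGCG", "GATCCGCATAAAATGAACAAAGCCG"),
}

results = {name: anneals_with_gatc_overhangs(a, b)
           for name, (a, b) in pairs.items()}
print(results)
```

The same check can be applied to any other pair in the table to catch transcription errors in a sequence before ordering or reusing it.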

CHAPTER 3

IN VIVO CHARACTERIZATION OF ARCHAEAL TRANSCRIPTION TERMINATION SIGNALS

Introduction

Transcription involves three distinct processes: initiation, elongation, and termination. Although the initiation event has been the most intensively studied, transcription termination efficiency can also play an important role in the regulation of gene expression (reviewed in Henkin, 1996; Kane, 1994). In the past decade, studies using bacterial and eukaryal systems have advanced our understanding of termination signals and the molecular mechanisms controlling termination; however, little is known about archaeal transcription termination. Hence, the goal of this project was to determine the sequence elements that dictate transcriptional termination in H. volcanii. Using an E. coli-H. volcanii pWL shuttle vector and tRNA reporter genes as tools for in vivo transcription studies, we have demonstrated that a T-rich yeast RNA polymerase III (pol III) terminator functions as a strong termination element in H. volcanii. We then constructed a termination assay expression module that would allow us to examine the specific requirements for termination in vivo. With this assay system we tested both endogenous and model terminators and examined the requirements for nucleotide sequences and structural elements. This study represents the first detailed analysis of the requirements for archaeal transcription termination in vivo and provides a foundation for further analysis of the molecular aspects of this process.

Results

Yeast tRNAProM RNA Polymerase III Terminator Directs Transcriptional Termination in H. volcanii

Our laboratory has used a yeast tRNAProM gene as a reporter gene in studies of intron processing and promoter definition in H. volcanii (Palmer and Daniels, 1994; Palmer et al., 1994). The cloned fragment carrying the yeast pol III gene also has an accompanying termination element composed of a tract of 9 thymine residues (T9-tract). When this gene was expressed in H. volcanii on the E. coli-H. volcanii shuttle vector, pWL222, Northern analysis indicated that the transcript of this gene terminated in the eukaryal pol III element rather than in a native H. volcanii terminator (poly C+U) that was present downstream (Palmer and Daniels, 1994). This section describes experiments that examine whether this eukaryal terminator functions as a terminator or an RNA processing signal in H. volcanii.

Determining the 3’ Terminus of the Yeast tRNAProM Transcript

S1 nuclease mapping was used to determine the 3’ terminus of the yeast tRNAProM RNA expressed in H. volcanii. The DNA probe used in S1 nuclease analysis was prepared from a PCR-generated DNA fragment containing the 5’ leader and the tRNAProM gene region. The PCR DNA was digested with XbaI, and the antisense strand was 3’-[32P] end-labeled by filling in the recessed end of the probe DNA with dTTP, dCTP, and [α-32P]dATP using the Klenow enzyme (see Figure 3.1A). This gave a probe of 206 nts.

Figure 3.1B shows the results of the S1 nuclease mapping. Lanes 1 to 3 are controls. In lane 1, the assay was stopped after DNA/RNA hybridization, without S1 digestion, to confirm the size of the DNA probe (antisense strand). In lane 2, the DNA probe alone was heat-denatured and digested with S1 nuclease; this control verified that the S1 nuclease activity under the given condition (60 minutes, on ice) was sufficient to degrade single-stranded DNA. Lane 3 shows the result of a complete assay carried out in the absence of RNA. The fact that there is no detectable signal in lane 3 indicates that the protected S1 products present in lanes 4 to 7 were derived specifically from the DNA:RNA hybrid. Lanes 4 through 7 show DNAs remaining after digestion of the DNA:RNA hybrid with S1 nuclease. A single band, corresponding to a 122 nt DNA, was the prominent product. This places the 3’ terminus of the transcript within the stretch of U residues of the yeast pol III terminator (Figure 3.2A). The absence of the labeled probe in lanes 4 to 7 might result from poor recovery of ssDNA during ethanol precipitation

Figure 3.1: Mapping the 3’ terminus of the yeast tRNAProM transcript. (A) A schematic representation of the expression module used for in vivo expression of tRNAProM. The yeast tRNAProM gene is under the transcriptional control of the H. volcanii tRNALys promoter. The location of the DNA probe used in S1 mapping is shown. The 3’-end of this fragment was generated by XbaI digestion and labeled with [α-32P]dATP. The arrow indicates the expected 5’-end of the protected DNA fragment. (B) S1 mapping. Lane 1 is a control containing the hybridization buffer, DNA probe, and RNA isolated from H. volcanii cells carrying the construct shown in (A); S1 nuclease was omitted. Lane 2 is a control for S1 nuclease activity; the DNA probe was heat-denatured and digested with S1 nuclease for 60 min on ice. Lane 3 is the “minus RNA” control. Lanes 4 to 7 contain the S1 reactions; DNA probe and RNA were hybridized at 56°C for 16 hours and digested with 10 volumes of S1 nuclease (1000 units/ml) on ice for 10 min, 20 min, 30 min, and 60 min, respectively. The DNA size markers are labeled based on the migration of known ssDNA fragments.

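[Figure 3.1. (A) Schematic of the expression module: H. volcanii tRNALys promoter (BoxA, BoxB), yeast tRNAProM (Exon 1, Intron, Exon 2), and pol III termination element, with the HindIII, EcoRI, PstI, SphI, and XbaI sites, the S1 probe, and the 122-nt protected fragment marked. (B) S1 mapping gel, lanes 1 to 7; size markers from 110 to 210 nt; the 206-nt probe and the 122-nt product are indicated.]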

after S1 nuclease digestion. The S1 analysis result is consistent with the transcript size observed earlier in Northern hybridizations. While the data are consistent with this sequence functioning as a termination element, we could not eliminate the possibility that this sequence acted as a 3’ RNA processing signal. Hence, further studies were performed to investigate its functional role.

Determining the Functional Role of the Yeast Pol III Terminator in H. volcanii

To determine whether the tRNAProM terminator functions as a transcription terminator or as an RNA processing signal in H. volcanii, this sequence element was amplified using PCR and ligated into the XbaI site of shuttle vector pWL212. This shuttle vector contains the H. volcanii tRNALys promoter and a tRNA reporter gene, tRNATrp016M-AGGAG (Figure 3.2A). Introduction of the yeast pol III terminator in the XbaI site of this vector placed this element between the promoter and the reporter gene. The reporter gene, tRNATrp016M-AGGAG, is a mutant form of the H. volcanii tRNATrp gene that produces an RNA that does not undergo intron processing. If the pol III termination element acted as a transcription terminator in H. volcanii, transcription from the tRNALys promoter would be terminated before reaching the reporter gene, and the tRNATrp016M-AGGAG transcript would not be made. However, if the pol III termination element in H. volcanii acted as a processing signal, the reporter gene would be transcribed, and the tRNATrp016M-AGGAG transcript would be processed at the pol III terminator region. In the latter case, the final products would remain stable and be readily detected by Northern analysis. In addition, to determine whether the function of this element is sequence-dependent or is associated with any potential structure, we cloned this element in both the forward and reverse orientations. In the reverse orientation the secondary structure of the element would be preserved, although the DNA sequence (except the AT-richness) would be changed.

Figure 3.2B shows the results of a Northern analysis with RNA derived from cells carrying these constructs. When the oligonucleotide probe TrpMEl, specific for the tRNATrp016M-AGGAG exon 1 (left panel), was used, only the no-terminator control (lane 3) and the clone containing the reverse-terminator (lane 1) gave hybridization signals. No signal corresponding to the tRNATrp016M-AGGAG transcript was visible from the clone containing the pol III terminator in the forward orientation (lane 2). By measuring the hybridization signals, we estimated only 5-10% termination read-through in the forward-terminator clone relative to the signal from the reverse-terminator clone. After removing the first probe, this Northern blot was reprobed with oligonucleotides Term1 and Term2, which contained sequences complementary to the terminator sequences in the forward and reverse orientations, respectively (right panel). A hybridization signal was observed only in RNAs derived from cells carrying the reverse-terminator clone (lane 1), thus confirming that this 315-nt transcript contained the terminator sequence and was a terminator-read-through product.

Since the insertion of this eukaryal pol III terminator at the XbaI site of pWL212 placed the T9-tract (the termination site) of this sequence element only 27 bp downstream from the expected transcription start site, we also wanted to determine whether the

Figure 3.2: Functional analysis of the yeast tRNAProM terminator. (A) Schematic representation of the pWL212 plasmid carrying the pol III terminator. The pol III terminator was placed at the XbaI site of pWL212 in both orientations. The pWL212 vector contains an expression module carrying the tRNATrp016M-AGGAG reporter gene under the control of the H. volcanii tRNALys promoter. Nucleotide sequences corresponding to the pol III terminator are shown in both orientations. The solid bar on top of the sequence indicates the location of the 3’ terminus of the tRNAProM transcript mapped in Figure 3.1. (B) Northern analysis. A Northern blot was hybridized with the 5’-[32P] end-labeled oligonucleotide TrpMEl (exon 1 probe; left panel). This blot was stripped and reprobed with 5’-[32P] end-labeled Term1 and Term2 oligonucleotides (pol III terminator probe; right panel). Lane 1 contains RNA from the reverse-terminator construct. Lane 2 contains RNA isolated from cells carrying the forward-terminator construct. Lane 3 contains RNA from cells carrying the construct that does not contain the pol III terminator. The tRNATrp016M-AGGAG RNA is partially processed at its 5’- and 3’-ends, and its typical hybridization signal is labeled accordingly (left panel, lane 3).

(A) Yeast tRNAProM Terminator

Forward CCCGAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGTGGGTACC

Reverse GGTACCCACTGCTACTTTAATTTTATAGATAGGCAAAAAAAAATTATTCGGG

[Figure 3.2. (A) Map of pWL212 (10.5 kb) showing the H. volcanii tRNALys promoter and the tRNATrp016M-AGGAG reporter gene (Exon 1, Intron, Exon 2). (B) Northern blots hybridized with the tRNATrp016M-AGGAG exon 1 probe and with the terminator probe specific for both strands; the 315-nt read-through transcript and the 5’- and 3’-processed Pre016M-AGGAG RNAs are labeled.]

presence of the terminator affected transcription initiation from the tRNALys promoter. To test this, primer extension was performed to map the transcription start site. The 5’-[32P] end-labeled exon 1-specific oligonucleotide, TrpMEl, was used as the primer with RNAs extracted from H. volcanii cells carrying the forward-terminator and the reverse-terminator constructs. Figure 3.3 shows the results of this analysis. In both constructs, the transcription start site of tRNATrp016M-AGGAG RNA was mapped to the residue located in boxB of the tRNALys promoter, indicating that the RNAP complex was able to initiate transcription in both constructs at the normal start site. The fact that the tRNATrp016M-AGGAG transcript from the forward clone, which was not detectable in the Northern analysis, was now observed might be due to the higher sensitivity of the primer extension analysis.

Taken together, these data indicated that the pol III terminator functions as an efficient transcription terminator rather than as an RNA processing signal in H. volcanii. Moreover, the terminator activity of this eukaryal terminator functioned in an orientation-dependent manner, suggesting that its role as a termination element is sequence-dependent; that is, it requires the T-tract to be located on the non-template strand.

Development of a New Expression Module, sptProM, for in vivo Termination Studies

Taking advantage of the strong termination activity of the yeast pol III terminator, we constructed a new expression module, sptProM, for the further investigation of potential terminators. This new expression module contains the H. volcanii tRNALys promoter,

Figure 3.3: Primer extension analysis of RNAs derived from the tRNATrp016M-AGGAG reporter gene. The synthetic oligonucleotide TrpMEl was used as the primer for DNA sequencing and reverse transcription reactions. The RNAs were isolated from H. volcanii cells carrying the pWL212 plasmid containing the pol III terminator in the reverse (lane 1) and forward (lane 2) orientations at the XbaI site (Figure 3.2A). Sequences surrounding the mapped transcription start site (*) are shown.

[Figure 3.3]

followed by the terminator-less yeast tRNAProM reporter gene, a multiple cloning region located 3’ of the gene, and the pol III terminator located further downstream. In this new assay system, termination resulting from an inserted DNA element can be determined by measuring the amount of transcript terminating within the insert region versus termination at the yeast pol III terminator located downstream.
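The comparison described above can be expressed as a simple ratio. The following is a minimal sketch, assuming that both transcripts are detected with equal efficiency by the same probe and are equally stable in vivo; the band intensities are hypothetical numbers chosen for illustration, not measurements from this study.

```python
def termination_efficiency(insert_signal, readthrough_signal):
    """Percent of transcripts that stop within the inserted element.

    insert_signal: band intensity of the transcript ending in the insert.
    readthrough_signal: band intensity of the transcript that reads through
        the insert and ends at the downstream pol III terminator.
    """
    total = insert_signal + readthrough_signal
    if total <= 0:
        raise ValueError("at least one band must have a positive signal")
    return 100.0 * insert_signal / total


# Hypothetical densitometry values for one construct (arbitrary units).
efficiency = termination_efficiency(insert_signal=90.0, readthrough_signal=10.0)
print(f"termination efficiency: {efficiency:.0f}%")
```

Because the downstream pol III terminator captures the read-through transcripts, both products appear in the same lane and the ratio does not depend on how much RNA was loaded.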

Figure 3.4A illustrates the construction of the sptProM expression plasmid. Two PCR reactions, using a DNA template that contained the H. volcanii tRNALys promoter and the yeast tRNAProM gene, were performed to isolate the reporter gene and the pol III terminator for reconstruction of the expression module. To isolate the reporter gene, the primers used in the PCR reaction were the pUC19/M13 universal primer reverse (-48) and the Tless3 oligonucleotide (see Table 2.1). The Tless3 primer has a synthetic BamHI site at its 5’ end followed by a sequence complementary to the 3’ end of the yeast tRNAProM gene located upstream from the terminator. To isolate the pol III termination element, oligonucleotides Newt5 and Newt3 (see Table 2.1) were used as primers in the PCR reaction. The amplified products generated from these two PCRs were the terminator-less yeast tRNAProM gene located downstream of the H. volcanii tRNALys promoter (PCR.1) and the yeast pol III terminator (PCR.2) flanked by a pair of SstI sites. We first cloned the PCR.1 product as a HindIII-BamHI fragment at the equivalent sites in pUC19. The PCR.2 product was subsequently cloned into the SstI site of the pUC19 vector carrying the PCR.1 fragment to generate the new expression vector, pUCsptProM. Potential termination elements could then be inserted at the BamHI or KpnI sites located

Figure 3.4: The construction and use of the sptProM termination assay expression module. (A) A schematic representation showing the construction of the sptProM module in pUC19. Primers used in PCR.1 were M13/pUC19 universal primer reverse (-48) (primer 1.1) and Tless3 oligonucleotide (primer 1.2). Primers used in PCR.2 were Newt5 (primer 2.1) and Newt3 (primer 2.2). (B) Use of the sptProM module as a genetic tool for identifying potential terminators in vivo. This new expression module allows the introduction of potential termination elements between the tRNAProM reporter and the yeast pol III terminator.

[Figure 3.4. (A) Construction of pUCsptProM: the PCR.1 product (primers 1.1 and 1.2), carrying the H. volcanii tRNALys promoter and the terminator-less yeast tRNAProM gene, is ligated into pUC19 as a HindIII-BamHI fragment; the PCR.2 product (primers 2.1 and 2.2), carrying the pol III termination element, is then ligated into the SstI site. (B) The sptProM module: tRNALys promoter, yeast tRNAProM, cloning sites for the test terminator element, and the yeast tRNAPro terminator, with the outcomes for active and inactive terminators indicated.]

between the tRNAProM coding region and the pol III terminator (Figure 3.4B). Since there is another SmaI site located at the 3’ end of the tRNAProM gene, the SmaI site between the BamHI and KpnI sites was not available for cloning purposes. When the sptProM module was introduced into H. volcanii on the pWL plasmid, this new expression module yielded a 154-nt tRNAProM product (Figure 3.8), suggesting that transcription termination took place at the mapped thymine residues of the pol III terminator.

Primer extension and S1 nuclease mapping were performed to determine if the correct transcription start and stop sites were used in this new expression module. Figure 3.5A shows the result of primer extension analysis using proEXI, an oligonucleotide complementary to exon 1 of tRNAProM, as the primer. A single extension product was observed that corresponded to a transcript initiating at the expected guanine residue located in boxB of the tRNALys promoter (Palmer and Daniels, 1994). A 97 bp fragment that contained the 3’ end of the tRNAProM-coding region and the terminator was used in the S1 nuclease analysis. After S1 nuclease digestion, the DNA probe yielded major protected products with sizes of 54 to 58 nt (Figure 3.5B), placing the termination site within the T-tract of the pol III terminator. Hence, the reconstruction of the tRNAProM gene did not affect transcription initiation or termination in this new expression module.

Figure 3.5: Mapping tRNAProM gene transcripts generated from the sptProM expression module. (A) Primer extension analysis using 5’-[32P] end-labeled proEXI as the primer and RNA isolated from H. volcanii cells carrying pWLsptProM. The mapped start site (*) and its adjacent sequence are shown. (B) S1 mapping. Lane 1 contains the 3’-[32P] end-labeled 97 bp DNA probe. Lane 2 is the “minus RNA” control. Lane 3 contains labeled DNA probe and RNA hybridized at 49°C for 16 hours, followed by digestion with 10 volumes of S1 nuclease (1000 units/ml) at room temperature (25°C) for 30 minutes. The major protected products were 54- to 58-nt fragments, which mapped the 3’ terminus of tRNAProM within the T-tract of the pol III terminator.

[Figure 3.5: panel (A) shows sequencing lanes T, C, G, and A beside the primer extension (PE) lane.]

Sequence Requirements of Archaeal T-tract Terminators

A survey of archaeal DNA sequences surrounding the region of the mapped transcription termination sites (Figure 3.6) indicates that there are no conserved secondary-structure elements or sequences other than an oligo(dT) sequence (T-tract) on the non-template strand. This feature is similar to that of many eukaryal terminators (Chang et al., 1986; Kerppola and Kane, 1990; Lang and Reeder, 1995). To learn more about the structure of T-tract-containing sequence elements and their role as transcription terminators in the Archaea, we analyzed the terminators from two protein-encoding genes, tbp2 (Palmer, unpublished data) and cct1 (this work; see Chapter 4).

Orientation-Dependence of tbp2 and cct1 Terminators

The T-tract element in the tbp2 3’-region is located 20 to 24 nucleotides downstream from the translational stop codon. We designed two oligonucleotides, TBP2Term1 and TBP2Term2, that would direct PCR amplification of an 80-bp DNA fragment from tbp2. This PCR product contained 62 bp of sequence located immediately downstream from the stop codon (see TBP2comp.F sequence in Figure 3.7A) and introduced synthetic BamHI restriction sites with 4 bp of random sequence at each end.

After digestion with BamHI and purification from an 8% polyacrylamide gel, the tbp2 terminator-containing sequence element was cloned into pUCsptProM at the BamHI site in both orientations. The constructs, TBP2comp.F for the forward sequence and TBP2comp.R for the reverse sequence, were then expressed in H. volcanii as previously described. Figure 3.7A shows the results of the Northern analysis of these constructs.

Figure 3.6: A survey of archaeal DNA sequences at mapped transcription termination sites. The mapped termination sites are in bold print. The lines above the sequences indicate inverted repeats. With the exception of H. volcanii tbp2 (Palmer, unpublished data), cct1 (this work), and SB12SSV1 T1, T2, T3, Tind1, and Tind2 (Reiter et al., 1988), sequences were obtained from NCBI Genbank™. These genes are: Hvo rRNA (X02128); Hvo L11e and L12e (X58924); Hvo sod1 (M97486); Hvo sod2 (M97487); Hcu sod (M97484); Hcu slg (M26502); Hcu L12 and L11e (X15078); Hha BO (J01727); Hma S9 (M76567); Hma sod (M97485); Hma msg (M76567); Hmo S7 (X57145); Mva tRNAVal, Dmo 23S T1 and T2 (X06190); Tte 23S (X06157); Mth rRNA (X62857), Hme gvp (X64701). Abbreviations for the organisms are: Hvo, Haloferax volcanii; Hcu, Halobacterium cutirubrum; Hha, Halobacterium halobium; Hma, Haloarcula marismortui; Hme, Haloferax mediterranei; Hmo, Halococcus morrhuae; SB12SSV1, Sulfolobus strain 12 virus-like particle; Mva, Methanococcus vannielii; Dmo, Desulfurococcus mobilis; Tte, Thermoproteus tenax; Mth, Methanococcus thermoautotrophicum.

Hvo tbp2          GCGCTCCCGAACTATCCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAACTGTCAAA
Hvo cct1          AATTCGACTCGACTCGCTCGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTA
Hvo rRNA          ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCATTTTATGCAGAGCGGCACGCACGG
Hvo L11e          CGTCTCGGAACCGTCTCGGCGGTTCGAACCGCCCTTTCCGTCGGGCCGAGTCGACTCGGCCCGCTTTCTTCTCT
Hvo L12e          CCGCCGCACCCGCGGCGACGCTCGCTTCGAAACTCCGATTTCTTTCGATGGAA
Hvo sod1          GCCCCGCCAACCGGCACCCTCCTCGAACGCACGGCACATTTTTCGGTGAGTGACGGCGACAGAGGCCGTC
Hvo sod2          ACCCGGTACATCCCGACTTTTTCTTTCGGACGCAACGCCGTCAGCGGCGCGCTGTCCGC
Hcu sod           GCGGGGACACCCTGAAACGCCACGCTTTTTCGCCGTGTAGCGTTCCGATAGCTTCTATACACAGCCCAG
Hcu slg           CCGAACACGCTCCCGTGTTTTTTTCGCGTCATGGCGGCTCATCAGTGGCTGGCGTGAGTGCGGCGACCGTGTGCTCG
Hcu L12           CCCGGTCGCGTCGCGCGCCGACAAGCCACGATCACATCGTTTTTTAGCCGCGTGCCACTCGGGAAGCCACG
Hcu L11e          CGCCGCCCGAGGAGTTTCTGCGCCGTTCGGTTCGCGTTCGCGTACTCGATAGCGGCGTGTGTCCGCGGG
Hha BO            tcgcacacgcaggacagccc CSCA a CCSg c g CBBctgtgttcaacgacacacgatga
Hma a operon      ACCGATGACCGCCCGCTCCCCGAGCGCGACGACCGAAAGGGGTTTTACGGGGCACGCAAAACGAAC
Hma S9            CGGTGCCGAGCCGACGTACGGACTCGACGACTTCGAGTCCGACATCTAACGTCGCTGAACCGTTCGTTTTCTCGCTGTC
Hma sod           [sequence not legible in this copy]
Hma msg           CGTCGCTGAACCGTTCGTTTTCTCGCTGTCGAAGTGCGAGTGGCGATGCCAGTATCGCCAACACAA
Hmo S7            CCGTTCGCACAGTTCTTTTCGGTTTTCGATCCTTGCCGTTCGGTCAGCCATCGCTTTGACAGCAACGACTGTTAT
Hme gvp           TTGTAAGTAACGCTACTCGACCGATTTTTCGAACCCACT
SB12SSV1 T1       ATGGAAATCAGTTTAAAGCCAATCATTTTTTTGGTC6 TTTTTATCATCGTAGGGATAGCA
SB12SSV1 T2       GCCCTTTATAAAGTCATATTCTTTTTCTTTCCCTGATGAGTGCGTTAGGGGATGTAATCTACA
SB12SSV1 T3       GCAAAATCTTTTTTTACCTCTTTTTAAATCTGTCTTATATGAAAAAACTG
SB12SSV1 Tind1    CAATAAATGATTTGTTCTAAACTATTTTTTCTCTCTATCTCTATATCTATATATATACATAACTAAA
SB12SSV1 Tind2    ACCATTATACAAACTCAGAAAAACTATTTTTTTGTTATACTCTTACCCCATATATATATAGATATATA
Mva tRNAVal       GTTCGAATCCGGCTGGGTCCACTATTTTAATTTTGAGCATATGTATC
Dmo 23S T1        GCAGCGCTCGTAACGGTTTTATTCCTCTCCTGGCTTTTCATCAACTAGTAGTGGTGGGTATG
Dmo 23S T2        AGATTATTTTTCAATGATGAGGATACCTAACTCCCTTATGAGTGGTGTTGGAGCAG
Tte 23S           GGCGTGGGGTACTCGTAATACTCCTTTTTCTCCCCGGGCTCCGTTTGGCTACAGTGTG
Mth rRNA          TTAACGACTGATGCCCAAGGTCCTTCTATGACATGGGCTGATGCCCGGGATGAT

Figure 3.6

When placed in the forward orientation, 80% of the tRNAProM transcription termination could be attributed to the tbp2 terminator. The reverse sequence of the tbp2 terminator also appeared to direct termination. The size of this transcript suggested that its 3’ terminus was in close proximity to the 3’ end of the tRNAProM coding region. Since the 5’ AT-rich element of the reverse element does not fit the criteria for a T-tract terminator (see below), we speculate that this AT-rich element might, in fact, cooperate with the tRNA structure to form a bacterial ρ-independent-like terminator (this is discussed in more detail in the rRNA terminator section) and subsequently direct termination.

A similar approach was taken to study the termination activity of the cct1 terminator (see Figure 3.7B for sequences). To amplify the cct1 terminator, the HScomp2 and HSTerm1 oligonucleotides were used as primers in PCR amplification. Figure 3.7B shows the results of the Northern analysis of the constructs containing the cct1 terminator. The cct1 terminator directed termination only when it was placed in the expression module in the forward orientation (HScomp.F; lane 1 in Figure 3.7B). These data suggest that T-tracts function as terminators when they are present on the non-template strand.
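The strand dependence described above can be illustrated with a short Python sketch. It is not part of the original analysis: it takes the 27-nt TBP2Δ20 element listed in Figure 3.10, computes its reverse complement, and searches each orientation for a run of thymine residues. The function names and the minimum run length of five are assumptions made for illustration only.

```python
import re

# Non-template-strand sequence of the TBP2Δ20 element (as listed in Figure 3.10).
TBP2_D20 = "CCGATTTTTCGCGGTGCTAGGTGTCTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_t_tracts(seq: str, min_len: int = 5):
    """Return (start, end) spans of runs of at least min_len consecutive T's.

    min_len=5 is an illustrative threshold, not a value defined in the text.
    """
    return [m.span() for m in re.finditer(r"T{%d,}" % min_len, seq)]


forward = TBP2_D20
reverse = reverse_complement(TBP2_D20)

# The forward orientation carries the T-tract; the reverse orientation
# carries an A-tract instead, so no T run is reported for it.
print(forward, find_t_tracts(forward))
print(reverse, find_t_tracts(reverse))
```

In this sketch the reversed element presents an A-tract on the non-template strand, which is consistent with the loss of T-tract-directed termination in the reverse constructs.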

5’ and 3’ Block Deletion Mutagenesis of the tbp2 and cct1 Terminators

The orientation-dependence of the tbp2 and cct1 terminators described in the previous section suggests the existence of a potential sequence requirement for these two termination elements. Consequently, our next goal was to determine whether sequences

Figure 3.7: Termination activity of the H. volcanii tbp2 and cct1 terminators. (A) Northern analysis of TBP2comp.F and TBP2comp.R. The sequence element immediately downstream of the tbp2 stop codon is shown in both the forward (TBP2comp.F) and reverse (TBP2comp.R) orientations. The termination efficiencies of TBP2comp.F (lane 1) and TBP2comp.R (lane 2) in the sptProM expression module are presented next to the Northern blot. (B) Northern analysis of HScomp.F and HScomp.R. The sequence element immediately downstream of the cct1 stop codon is shown in both forward (HScomp.F) and reverse (HScomp.R) orientations. The termination efficiencies of these two sequence elements are also shown. The tRNAProM transcripts in both Northern blots were probed with 5’-[32P] end-labeled proEX1 oligonucleotide, specific for exon 1 of tRNAProM. The termination efficiency was calculated by dividing the total counts derived from the tRNAProM transcript with its 3’-terminus located within the insert terminator by the total counts derived from all tRNAProM transcripts. Hybridization with a tRNAIle probe (IleuT7E3), specific for chromosomally encoded tRNAIle, served as an internal control for RNA recovery. The ratio of tRNAProM to tRNAIle was used to determine if the tRNAProM derived from a construct was differentially degraded. Predicted termination sites (in bold print) were deduced from the known transcription start site and the size of the transcripts.
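The calculation described in this legend can be written out as a minimal Python sketch. The function names and the band counts below are hypothetical and are not data from this work; the sketch only restates the ratio defined in the legend and the tRNAProM/tRNAIle normalization.

```python
def termination_efficiency(terminated_counts: float, total_counts: float) -> float:
    """Percent of tRNAProM transcripts whose 3' end lies in the insert terminator."""
    if total_counts <= 0:
        raise ValueError("total_counts must be positive")
    return 100.0 * terminated_counts / total_counts


def recovery_ratio(trna_prom_counts: float, trna_ile_counts: float) -> float:
    """tRNAProM signal normalized to the chromosomal tRNAIle internal control."""
    if trna_ile_counts <= 0:
        raise ValueError("trna_ile_counts must be positive")
    return trna_prom_counts / trna_ile_counts


# Hypothetical band counts for a single lane (illustration only).
terminated = 800.0    # transcripts ending within the insert terminator
read_through = 200.0  # transcripts ending at the downstream pol III terminator
trna_ile = 500.0      # internal-control band

total = terminated + read_through
print(f"termination efficiency: {termination_efficiency(terminated, total):.0f}%")
print(f"tRNAProM/tRNAIle ratio: {recovery_ratio(total, trna_ile):.2f}")
```

Comparing the tRNAProM/tRNAIle ratio between lanes is one way to flag a construct whose transcript is differentially degraded before its efficiency value is interpreted.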

Figure 3.7 (A)

TBP2comp.F  5’ GCGCTCCCGAACTATCCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAAAT 3’
TBP2comp.R  5’ ATTTCACACTTATGGGTGTTCAGACACCTAGCACCGCGAAAAATCGGATAGTTCGGGAGCGC 3’
(each element is flanked by BamHI sites)

Termination efficiency (%)    1. TBP2comp.F    2. TBP2comp.R
tbp2 term. insert             80               <5*
pol III Term.                 20               30

*: Approximately 70% of transcripts terminate at the junction of the vector and the insert DNA (see nucleotides in bold print).

Northern blot labels: read-thru; tbp2 term. insert; vector-insert junction; pol III Term.; tRNAIle.

Figure 3.7 (B)

HScomp.F  5’ GCGTCCCAGTAGGCCACCATACACCCACACCCTGATTCGAATCGACTCGACTCGCTCGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTG 3’
HScomp.R  5’ CAGCTGCTAACGGGTGTAAAGCCGCAAAAAACGTCGTGAATTTGGCCGAGCGAGTCGAGTCGATTCGAATCAGGGTGTGGGTGTATGGTGGCCTACTGGGACGC 3’

Termination efficiency (%)    1. HScomp.F    2. HScomp.R
cct1 term. insert             80             <5
pol III Term.                 20             >90

Northern blot labels: read-thru; cct1 term. insert; pol III Term.; tRNAIle.

adjacent to the T-tract element were required for termination. For this purpose, we first performed block deletion mutagenesis in the 5’ and 3’ regions surrounding the tbp2 T-tract terminator. The tbp2 T-tract and selected 5’ and 3’ regions were isolated by PCR amplification using the following primers: TBP2Term1 and TBP2Term3 for TBP2deR; TBP2Term2 and TBP2Term4 for TBP2deL; TBP2Term3 and TBP2Term4 for TBP2Ts (see Figure 3.8 for the termination element sequences and Table 2.1 for the primer sequences). For the removal of the 5’ region of the cct1 terminator, primers HSdeL and HScomp2 (see Table 2.1) were used in the PCR amplification. Deletion of the 3’ region of the cct1 terminator is described in the next section.

As shown in Figure 3.8, the removal of the 5’-region, up to 5 nucleotides from the T-tract, did not appear to adversely affect the termination function of the tbp2 terminator and actually yielded a higher level of tRNAProM transcript (lane 2), possibly due to an increase in transcript stability by shortening the unstructured 3’ tail. This result suggested that the region 5’ of the T-tract in the tbp2 terminator was not essential for termination activity. This was confirmed by using a block replacement mutant in which this 5’-region was replaced with a random sequence of the same size. In this case no change in termination efficiency was observed (data not shown). Similarly, the 5’-region of the cct1 terminator, containing 57 bp downstream from the stop codon, was not required for the termination activity; with only 17 bp left in its 5’ flanking region, the cct1 T-tract terminated transcription at 85% efficiency (W.T. in Figure 3.9B). As observed for the shortened tbp2 terminator, a higher level of tRNAProM was also observed in this case (compared with lane 1 in Figure 3.7B).

Figure 3.8: Analysis of the requirement for sequences in the 5’ and 3’ regions of the tbp2 T-tract terminator. The top panel shows the 5’ and 3’ sequences accompanying the tbp2 T-tract terminator. The construct numbers correspond to the lane designations in the Northern analysis shown in the panel below. The lower panel presents the Northern analysis of RNAs isolated from cells carrying these deletions. Lane C is the expression module control, sptProM. “T” refers to hybridization signals corresponding to tRNAProM transcripts with 3’ termini located within the T-tract of the tbp2 terminator, and “P” designates hybridization signals corresponding to read-through products with 3’ termini located within the T-tract of the pol III terminator. Hybridization with a tRNAIle probe served as an internal control (see figure legend of Figure 3.7). The termination efficiency of each insert sequence element was determined as described in Figure 3.7.

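Figure 3.8

5’ GCGCTCCCGAACTATCCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAAAT 3’

Constructs: 1. TBP2comp.F; 2. TBP2deL; 3. TBP2deR; 4. TBP2Ts (the bars marking the extent of each construct are not reproduced).

Northern blot labels: P, pol III Term.; 154 nt; T, tbp2 Term.; tRNAIle.

Termination efficiency (%) at the tbp2 terminator: 85, 95 (lane assignments are not recoverable from this copy).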

In contrast to the removal of sequences from the 5’ region, the deletion of the 3’-region or deletions that included both 5’ and 3’ regions of the tbp2 T-tract abolished termination (Figure 3.8, lanes 3 and 4). This suggested that the 3’-region plays an indispensable role in the termination activity of the tbp2 terminator. We reasoned that this region could provide specific sequences required for the binding of transcription termination factors or fulfill a spacing constraint between the tbp2 and yeast pol III terminators.

Sequence Specificity in the 3’ Regions of the tbp2 and cct1 Terminators

To determine if the 3’-regions of T-tract terminators contain a specific sequence element, possibly required for the binding of a termination factor, we first constructed a series of deletion mutants in the 3’ region of the tbp2 and cct1 terminators. Deletions in the tbp2 terminator region were obtained by using the primers TBP2(Δ4)3’, TBP2(Δ9)3’, TBP2(Δ20)3’ or TBP2(Δ28)3’ in conjunction with TBP2Term2 in PCRs. The sequences of these mutant terminators and the Northern analysis of these constructs are shown in Figure 3.9A. The results indicated that, while the last 20 bp of TBP2deL (W.T.) were dispensable, deletion of 8 (TBP2Δ28) to 13 bp (TBP2Δ33, equivalent to TBP2Ts) from the 3’ end of TBP2Δ20 decreased tbp2 terminator activity significantly.

Furthermore, we deleted the pol III terminator from the TBP2Δ20 construct and found that the tbp2 termination element remained an active terminator (data not shown). These data indicated that the termination activity of TBP2Δ20 was independent of the

Figure 3.9: Effects of deletions in the 3’ regions of the tbp2 and cct1 T-tracts on in vivo termination. The top panels show the sequences of the wild type and 3’ deletion constructs for tbp2 (A) and cct1 (B). The lower panels show the Northern analysis of RNAs isolated from cells carrying these deletions. Locations of the read-through and tbp2 or cct1 directed termination products are indicated. Hybridization with a tRNAIle probe served as an internal control (see figure legend of Figure 3.7). The termination efficiency of the tbp2 and cct1 sequence elements was calculated as described in Figure 3.7.

Figure 3.9 (A)

tbp2 Terminators

W.T.  CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAAAT
Δ4    CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTG
Δ9    CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAA
Δ20   CCGATTTTTCGCGGTGCTAGGTGTCTG
Δ28   CCGATTTTTCGCGGTGCTA
Δ33   CCGATTTTTCGCGG

Northern blot lanes: Δ4, Δ9, Δ20, Δ28, Δ33. Labels: pol III term. (read-thru); tbp2 term.; tRNAIle. Termination efficiency (%): values not legible in this copy.

Figure 3.9 (B)

cct1 Terminator

W.T.  CGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTG
Δ11   CGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCC
Δ16   CGGCCAAATTCACGACGTTTTTTGCGGCTTTA
Δ20   CGGCCAAATTCACGACGTTTTTTGCGGC

Northern blot labels: pol III term. in vector; cct1 term.; tRNAIle.

                                                 W.T.   Δ11   Δ16   Δ20
Termination efficiency (%) of cct1 terminator    85     58    27    15

downstream pol III terminator, and that the last 13 nt of TBP2Δ20 were necessary for its function as a terminator.
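The 3’ deletion series can be reproduced from the wild-type element with a few lines of Python. This is an illustrative sketch rather than the procedure used in this work, in which the deletions were generated by PCR with the primers named above; the function name is hypothetical, and the sequence is the W.T. (TBP2deL) element shown in Figure 3.9A.

```python
# Wild-type (TBP2deL) element as shown in Figure 3.9A, non-template strand.
TBP2_WT = "CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAAAT"


def three_prime_deletion(seq: str, n: int) -> str:
    """Return seq with n nucleotides removed from its 3' end."""
    if not 0 <= n <= len(seq):
        raise ValueError("n must be between 0 and len(seq)")
    return seq[: len(seq) - n]


deletions = {n: three_prime_deletion(TBP2_WT, n) for n in (4, 9, 20, 28, 33)}

for n, seq in deletions.items():
    # The first 9 nt are CCGA plus the T5-tract, so the remainder is the
    # sequence retained 3' of the T-tract.
    downstream = len(seq) - 9
    print(f"delta{n:<2d} {seq:<43s} {downstream:2d} nt 3' of the T-tract")
```

The difference in length between the Δ20 and Δ33 elements is 13 nt, the region identified in the text as necessary for terminator function.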

A similar observation was made with deletions in the region 3’ of the cct1 T-tract. For the 3’ deletion of the cct1 terminator, the termination elements (Figure 3.9B) were PCR amplified using the primers HSTerm1 and HST(Δ11), HST(Δ16) or HST(Δ20). Figure 3.9B shows the sequences of these PCR products and the results of the Northern analysis of the expression constructs carrying these deletion sequence elements. The deletion of the 3’ region in the cct1 terminator (HSdeL) had an apparent negative effect when as few as 11 bp were removed from the end, and the loss of termination efficiency appeared to correlate with the number of nucleotides removed, confirming the essential role of the 3’ region in T-tract terminators in our expression module.

Although the regions 3’ of both the tbp2 and cct1 terminators appeared to possess a critical role in termination, sequence comparison of these regions revealed very little sequence conservation. The only apparent conserved motif was the 5’-GGTG-3’ element, which appears twice in the tbp2 terminator, and its absence seemed to coincide with a drop in termination efficiency. The cct1 terminator does not have the same motif; however, it contains the reverse complement sequence, 5’-CACC-3’, located 21 bp downstream of the T-tract. In an effort to demonstrate the significance of the 5’-GGTG-3’ motif, DNAs containing mutations in this sequence were also constructed. Since the tbp2 terminator constructs gave well defined signals in Northern analysis, we chose to focus on the tbp2 terminator.

To investigate the role of sequences in the 3’ region of the TBP2Δ20 T-tract, a series of mutant terminators were constructed containing AA dinucleotide replacements, sequence block replacements, and sequence insertions (Figure 3.10). We chose to mutagenize the 3’ sequence with adenine dinucleotides since adenine occurs at low frequency in this region. Results of the AA replacements indicate that, although there was a mild decrease (20-30%) in the termination efficiency, no single sequence appeared to be essential. This included the single interruption of either 5’-GGTG-3’ motif (see TBP2.A20 and TBP2.A22). Terminator activity was not completely abolished unless the mutation occurred within the T-tract (see TBP2.TAT).

Suspecting that the single dinucleotide replacement might not be sufficient to inhibit the interaction of a potential termination factor, a property observed for some bacterial ρ-dependent terminators (Zalatan et al., 1993), we also constructed a double dinucleotide-replacement mutant (TBP2.A14.20). Two mutants were also constructed where the 5’-GGTG-3’ motif was replaced with 5’-CACC-3’ [TBP2.(CACC)20 and TBP2.A14.(CACC)20]. In addition, the importance of helical phasing of the 5’-GGTG-3’ motifs was examined by inserting 5 or 10 nt of sequence between the two sequence motifs [TBP2.A14.5.(GGTG) and TBP2.A14.10.(GGTG)]. Figure 3.10 shows that all of the mutants described maintained a termination activity comparable to that of the wild type construct. These data suggested that the need for sequences 3’ of the T-tract did not reflect a requirement for specific sequences. Finally, the TBP2.mu3’ construct, in which the 3’ region of TBP2Δ20 (W.T.) was replaced with a random sequence containing the

Construct             Sequence                                 Termination Efficiency (%)
TBP2Δ20 (W.T.)        CCGATTTTTCGCGGTGCTAGGTGTCTG              90
TBP2.A26              CCGATTTTTCGCGGTGCTAGGTGTCAA              87
TBP2.A24              CCGATTTTTCGCGGTGCTAGGTGAATG              75
TBP2.A22              CCGATTTTTCGCGGTGCTAGGAATCTG              52
TBP2.A20              CCGATTTTTCGCGGTGCTAAATGTCTG              66
TBP2.A18              CCGATTTTTCGCGGTGCAAGGTGTCTG              67
TBP2.A16              CCGATTTTTCGCGGTAATAGGTGTCTG              89
TBP2.A14              CCGATTTTTCGCGAAGCTAGGTGTCTG              92
TBP2.A12              CCGATTTTTCGAAGTGCTAGGTGTCTG              88
TBP2.A10              CCGATTTTTAACGGTGCTAGGTGTCTG              65
TBP2.TAT              CCGATTAATCGCGGTGCTAGGTGTCTG              1
TBP2.A14.20           CCGATTTTTCGCGAAGCTAAATGTCTG              76
TBP2.(CACC)20         CCGATTTTTCGCGGTGCTACCACTCTG              87
TBP2.A14.(CACC)20     CCGATTTTTCGCGAAGCTACCACTCTG              91
TBP2.A14.5.(GGTG)     CCGATTTTTCGCGAAGCTActgacGGTGTCTG         94
TBP2.A14.10.(GGTG)    CCGATTTTTCGCGAAGCTActgacacgatGGTGTCTG    92
TBP2.mu3’             CCGATTTTTCGCGGCATGCAAGTCAGA              87

Figure 3.10: Effects of altering the sequences 3’ of the tbp2 T-tract on termination efficiency in vivo. The sequences in bold print indicate site-specific replacements; the lower case letters indicate sequence insertions. The termination efficiency of each mutant was determined using the in vivo assay system.

same number of nucleotides, retained essentially wild-type activity (Figure 3.10, TBP2.mu3’). This again indicated that the requirement for sequences 3’ of the T-tract was independent of sequence.

Spacing Constraints Between the T-Tracts of tbp2/cct1 and the pol III Terminators

The absence of a specific sequence requirement in the 3’ region of the tbp2 terminator suggested that, instead of providing a specific sequence element, the 3’ region fulfills a spacing constraint in the context of our expression module. This hypothesis is supported by the data presented in Figure 3.9, in which deletion of sequences 3’ of the tbp2 and cct1 terminators led to the loss of termination efficiency, and by the data of Figure 3.10, which indicate that this region does not provide a specific sequence element. To further examine the spacing requirement, we reevaluated the results obtained for several existing constructs and created additional mutants to alter the distance between the two T-tract elements. Table 3.1 summarizes the analysis of these constructs. Increasing the spacing between the two T-tracts in the TBP2TsNew construct (the nature of TBP2TsNew is discussed below, under Termination Efficiency of a T-tract Located within a Stem-Loop Structure) by inserting 16 nt of random sequence led to a 34% increase in termination activity at the first T-tract (see TBP2TsNew+16). Similarly, when a shortened version of the pol III terminator that contained only the T9-tract of the pol III terminator and the two immediately adjacent nucleotides on each side (pIIIT9) was inserted, it had a termination efficiency of approximately 14%. However, after inserting a 16-nt spacer between the two T-tracts in the pIIIT9 construct, the termination activity of pIIIT9 increased up to 50% (see pIIIT9+16). The reason that the termination efficiency of the pIIIT9 sequence element in the new construct (pIIIT9+16) was still not comparable to that of the wild type pol III terminator (estimated > 90%) might be a result of structural constraint, which will be described in the following section. Nevertheless, these data suggest that a spacing of approximately 36 to 44 nt between T-tracts is required for efficient termination directed by the first T-tract.
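The relationship between spacing and termination efficiency listed in Table 3.1 can be summarized with a small Python sketch. This is an illustrative regrouping of the tabulated values, not an analysis performed in this work; the 36-nt cut-off is simply the lower bound quoted above, and the helper name is hypothetical.

```python
from statistics import mean

# (construct, spacing in nt, termination efficiency in %) as listed in Table 3.1.
TABLE_3_1 = [
    ("TBP2Δ20", 44, 90),
    ("TBP2Δ28", 36, 69),
    ("TBP2TsNew", 20, 62),
    ("TBP2TsNew+16", 36, 96),
    ("HSdeL", 51, 85),
    ("HSΔ11", 40, 58),
    ("HSΔ16", 34, 27),
    ("HSΔ20", 30, 15),
    ("pIIIT9", 28, 14),
    ("pIIIT9+16", 44, 50),
]

THRESHOLD_NT = 36  # illustrative cut-off: the lower bound quoted in the text


def split_by_spacing(rows, threshold=THRESHOLD_NT):
    """Partition rows into (spacing >= threshold, spacing < threshold)."""
    wide = [r for r in rows if r[1] >= threshold]
    narrow = [r for r in rows if r[1] < threshold]
    return wide, narrow


wide, narrow = split_by_spacing(TABLE_3_1)
print("spacing >= 36 nt:", [name for name, _, _ in wide])
print("mean efficiency: %.1f%%" % mean(eff for _, _, eff in wide))
print("spacing <  36 nt:", [name for name, _, _ in narrow])
print("mean efficiency: %.1f%%" % mean(eff for _, _, eff in narrow))
```

The grouping is only a descriptive summary of ten constructs. Values such as those for TBP2TsNew and pIIIT9+16 may also reflect RNA structure in the terminator region, so the means should not be read as a statistical test of the spacing hypothesis.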

Sequence Requirements within the T-tract

In addition to an apparent spacing requirement between T-tracts, the conservation of the thymine residues in the T-tract terminator seems to play a critical role in termination function. Placing the T-tract terminator in the reverse orientation (TBP2comp.R and HScomp.R, Figure 3.7) or replacing two thymine residues within the T-tract with adenine residues (TBP2.TAT, Figure 3.10) abolished terminator activity. To further characterize the sequence requirements within the T-tract, we performed a mutagenesis study of the T-tract of the tbp2 terminator. Figure 3.11 shows the result of this study, including deletion and single nucleotide replacements. The data indicate that deleting a single thymine residue (ΔT) from the T-tract decreased its termination efficiency by 33%, and removing two thymine residues (ΔTT) nearly eliminated the terminator activity (7%). Therefore, a minimum of 4 thymine residues within the T-tract is required for the activity of the tbp2 terminator. Furthermore, to examine the overall sequence requirements, we constructed single point mutations that replaced each of the three inner thymine residues of the T5-tract with the other three possible nucleotides (A, G

Construct         Spacing (nt)#    Termination Efficiency (%)
TBP2Δ20           44               90
TBP2Δ28           36               69
TBP2TsNew         20               62
TBP2TsNew+16      36               96
HSdeL             51               85
HSΔ11             40               58
HSΔ16             34               27
HSΔ20             30               15
pIIIT9            28               14
pIIIT9+16†        44               50

Table 3.1: Spacing constraints between multiple T-tracts.

The sequence of each termination element listed is given in Appendix A.
#: Spacing distance is expressed as the number of nucleotides between the 3’ end of the T-tract of the inserted termination element and the 5’ end of the pol III terminator.
†: This level of termination may also reflect the consequences of altered RNA structure in the terminator region (see below).

Figure 3.11: Effects of single nucleotide replacements on the termination activity of the tbp2 terminator T-tract. (A) Termination activity determined by Northern analysis. The W.T. sequence (TBP2Δ20) of the tbp2 terminator is shown on the top and the mutated thymine residues are labeled and presented in bold print. The ΔT mutant had one thymine residue deleted from the T-tract, and the ΔTT had two. (B) Data summary. The termination efficiency of each mutant examined in (A) was determined as previously described (Figure 3.7). Hybridization with a tRNAIle probe served as an internal control (see figure legend of Figure 3.7).

Figure 3.11 (A)

W.T. (TBP2Δ20), with the T-tract residues labeled T1 to T5:
CCGA T1 T2 T3 T4 T5 CGCGGTGCTAGGTGTCTGAACGCCCATAAGTGTGAAAT

Northern blot lanes, upper panel: W.T.; T4 replaced by A, G, C; T3 replaced by A, G, C. Lower panel: T2 replaced by A, G, C; W.T.; ΔT; ΔTT. Labels: read-thru; tbp2 Term. in tRNAProM; tRNAIle.

Figure 3.11 (B)

T-tract                Termination efficiency (%)
T1T2T3T4T5 (W.T.)      87
T1T2T3T4 (ΔT)          54
T1T2T3 (ΔTT)           7

Replacement    at T2    at T3    at T4
A              22%      14%      42%
G              3%       5%       16%
C              40%      13%      46%

and C). As summarized in Figure 3.11B, the presence of a guanine residue in any of the three positions strongly inhibited termination, and there was a particularly stringent demand for the thymine residue at the T3 position. While a cytosine residue was preferred over an adenine residue at the T2 position, its influence on termination was essentially the same as when it occurred at the T4 position. Moreover, the results also clearly demonstrated that any substitution in the T-tract could significantly inhibit termination. The least affected mutant sequence had a cytosine residue at the T4 position, yet its termination efficiency (46%) was still only approximately 52% of the W.T. (87%).
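The set of nine single-point mutants described above can be enumerated with a short Python sketch. It is an illustration only, not the mutagenesis procedure used in this work; the mutant labels such as "T3->G" and the function names are hypothetical, and the sequence is the TBP2Δ20 element listed in Figure 3.10.

```python
# TBP2Δ20 wild-type element (Figure 3.10); the T5-tract occupies indices 4-8.
WT = "CCGATTTTTCGCGGTGCTAGGTGTCTG"
T_TRACT_START = 4  # 0-based index of T1


def substitute(seq: str, index: int, base: str) -> str:
    """Return seq with the nucleotide at a 0-based index replaced by base."""
    return seq[:index] + base + seq[index + 1:]


def inner_t_mutants(seq: str = WT, start: int = T_TRACT_START):
    """Build the nine single substitutions at T2, T3 and T4 (A, G or C each)."""
    mutants = {}
    for position in (2, 3, 4):  # T2, T3, T4 (1-based within the tract)
        index = start + position - 1
        if seq[index] != "T":
            raise ValueError("expected a thymine at the selected position")
        for base in "AGC":
            mutants[f"T{position}->{base}"] = substitute(seq, index, base)
    return mutants


for name, seq in inner_t_mutants().items():
    print(f"{name:6s} {seq}")
```

Each generated sequence differs from the wild-type element at exactly one position within the T-tract, matching the design of the replacements summarized in Figure 3.11B.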

The Function of the H. volcanii tRNALys Promoter as a Termination Element

The fact that the T-tract itself is the only region within the termination element that contains sequence-specificity raised the question of its role in the termination process. Having observed that an H. volcanii tRNALys promoter with 5 consecutive T residues functioned as an active promoter (Palmer and Daniels, 1995), we asked how a protein factor, such as TBP, could distinguish between a promoter and terminator and initiate the appropriate interaction. We speculated that the transcription initiation factors, such as TBP and TFB, might bind to the T-tract element of a terminator. However, since there is no boxB located downstream from the T-tract, the initiation complex is not expected to initiate transcription but serves merely as a road-block causing the release of RNAP and the nascent transcript. Such a termination scheme would be very economical for these organisms. To examine this hypothesis, we used the H. volcanii tRNALys promoter as a model.

Based on the results of an in vivo analysis of the H. volcanii tRNALys promoter (Palmer and Daniels, 1995), we selected three boxA mutants. All of these mutant promoters have transcription initiation activities approximately 2-fold greater than the W.T. promoter, and therefore were capable of efficiently recruiting initiation factors. Furthermore, their sequences (Figure 3.12) represented a spectrum of good to bad T-tract termination elements. Figure 3.12 shows the results of this study. With the exception of LysPT29.F (lane 1), which had a T5-tract and displayed a moderate terminator activity (32%), all other constructs failed to terminate transcription at the boxA sequence. This suggested that T-tract sequences, characteristic of transcription termination elements, do not efficiently recruit TBP or other preinitiation proteins for functioning as a stable road-block. In addition to the read-through transcript, a 165-nt RNA species that has its predicted 3’-terminus located within the 16-nt spacer sequence also consistently appeared in all constructs. Since we have detected this transcript in many constructs containing an inactive termination element with the 16-nt spacer, it is probably the result of 3’ degradation of the longer transcript targeted at the spacer sequence; thus, these RNAs represent read-through transcripts.

Effect of the Structural Environment on a T-tract Terminator

In addition to analyzing the terminators at the primary sequence level, we also wanted to examine the structural environment in which the T-tracts reside and to investigate whether the local structural environment influences the termination activity of

Figure 3.12: The ability of the H. volcanii tRNALys TATA/BoxA sequence to function as a transcription terminator. The -37 to -12 sequence region of W.T. and mutant forms of the H. volcanii tRNALys promoter in both forward and reverse orientations is shown (sequences 1 to 6). Boxed sequences indicate the TATA-like elements of the tRNALys promoter. Relative promoter strengths (Palmer and Daniels, 1995) of these mutants, which are in part a reflection of TBP binding, are also indicated. Northern analysis of the constructs is presented in the bottom panel. The control lane (lane C) contains the sptProM+16 construct without the insert sequence. Hybridization with a tRNAIle probe served as an internal control (see figure legend of Figure 3.7). The termination efficiency of each insert sequence element is indicated. N.D. indicates that the termination activity of the insert DNA was not detectable.

tRNALys Promoter                                 Relative Promoter    Termination
                                                 Activity (%)         Efficiency (%)
W.T.            GGAAAGTCATTTTACCCACCGGCAGT       100                  -
1. LysPT29.F    GGAAAGTCTTTTTACCCACCGGCAGT       271                  32
2. LysPT29.R    ACTGCCGGTGGGTAAAAAGACTTTCC       -                    N.D.
3. LysPA27.F    GGAAAGTCATATTACCCACCGGCAGT       205                  N.D.
4. LysPA27.R    ACTGCCGGTGGGTAATATGACTTTCC       -                    N.D.
5. LysPA25.F    GGAAAGTCATTTAACCCACCGGCAGT       249                  5
6. LysPA25.R    ACTGCCGGTGGGTTAAATGACTTTCC       -                    N.D.

Northern blot lanes: C, 1, 2, 3, 4, 5, 6. Labels: read-thru; tRNALys promoter; tRNAIle.

Figure 3.12

the T-tract elements. In particular, we investigated the effects of placing the T-tract in a stem-loop structure, whether T-tracts were bent, and whether bent sequences would facilitate termination.

Termination Efficiency of a T-tract Located within a Stem-Loop Structure

We have shown earlier (Figure 3.8 and Figure 3.9A) that the TBP2Ts (equivalent to TBP2Δ33) version of the tbp2 terminator, whose T-tract is only 31 bp upstream from the T-tract of the pol III terminator, was inactive. However, when 13 bp of a random sequence was added to the 3’ end of the mutant terminator, termination activity was completely recovered (see TBP2.mu3’ in Figure 3.10). We reasoned that the tbp2 terminator in TBP2Ts was inactive due to the inhibition imposed by the pol III terminator in close proximity. We then deleted the pol III terminator from the TBP2Ts construct (TBP2TsX; see Appendix A for the sequence) in an attempt to eliminate its interference with the activity of the tbp2 terminator. However, the TBP2Ts element in this new construct remained inactive (data not shown), suggesting that, although it contained sequence sufficient for its termination activity, the TBP2Ts was inherently inactive as a terminator.

A possible explanation was that a secondary structure might form that hindered the ability of the T-tract to direct termination. To explore this hypothesis, we analyzed the potential secondary structure of TBP2Ts in the context of the expression module. Analysis of the sequence between the tRNAProM-coding region and the pol III terminator for potential RNA secondary structure (Zucker, 1994) indicated a possible secondary structure (Figure 3.13A) that contained two stem-loops with an overall free energy (ΔG) of -22.2 kcal/mole. Within this predicted structure, the T-tract of TBP2Ts is located in the loop of the first stem-loop structure. To examine if this sequence/structure configuration inhibited the termination activity of TBP2Ts, we reconstructed the TBP2Ts terminator by removing the vector sequence containing the BamHI (at the 3’ end of the TBP2Ts), SmaI and partial KpnI sites (see Figure 3.13A). This sequence region was responsible for forming the 3’-stem in the first stem-loop and the 5’-stem of the second stem-loop. Deleting this sequence was predicted to eliminate both stem-loop structures.

As shown in Figure 3.13B, this new construct (TBP2TsNew) exhibited markedly higher termination activity, increasing from 0-3% (TBP2Ts) to 59-66% (lane 1). This result suggested that the secondary structure indeed inhibited the termination activity of TBP2Ts in the previous construct. However, the efficiency of the tbp2 terminator in this new construct was still not at the W.T. level (90%). We suspected that this was due to the short distance (20 nt) between the T-tracts of the tbp2 and the pol III terminators in the TBP2TsNew construct. To test this possibility, we reconstructed TBP2TsNew by adding a 16-bp spacer to form TBP2TsNew+16 and by constructing a mutant of TBP2TsNew where the downstream pol III terminator was deleted, TBP2TsNewX. As shown in Figure 3.13B lanes 2 and 3, the TBP2TsNew+16 and TBP2TsNewX sequence elements terminated transcription at 96% and 97% efficiency, respectively. In the case of TBP2TsNewX, the minor read-through represents termination in the vector. The data

Figure 3.13: Termination activity of new TBP2Ts constructs. (A) The predicted secondary structure of the TBP2Ts construct in the region downstream from the tRNAProM. Nucleotides are numbered according to their position from the 3’ terminus of the tRNAProM-coding region. The sequence regions belonging to the tbp2 and pol III terminators are marked. The boxed sequence was removed in the TBP2TsNew construct, and the arrow points to the position of the 16-nt insertion in TBP2TsNew.16. (B) Northern analysis of constructs carrying sequence elements TBP2TsNew (lane 1), TBP2TsNew.16 (lane 2), or TBP2TsNewX (lane 3). Residual read-through in TBP2TsNewX reflects termination at a site in the vector. The termination efficiencies of these elements were determined as previously described (Figure 3.7) and are listed in the table. Hybridization with a tRNAIle probe served as an internal control (see the legend of Figure 3.7).

[Figure 3.13. (A) Predicted secondary structure of TBP2Ts (TBP2A33), energy = -22.2 kcal/mole, with the tbp2 terminator and pol III terminator regions marked. (B) Northern blot, lanes 1 to 3, with the read-through, tbp2-terminated and tRNAIle bands marked.]

Termination Efficiency of TBP2 Terminator (%)
1. TBP2TsNew       62
2. TBP2TsNew.16    96
3. TBP2TsNewX      97

Figure 3.13

obtained from studying these two constructs suggested that the presence of the pol III terminator in TBP2TsNew was responsible for at least a 30% reduction in the termination efficiency of the tbp2 terminator and again confirmed the spacing requirement between T-tracts.
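The "at least 30%" figure can be checked against the efficiencies listed in Figure 3.13B. The following is a minimal sketch, not the quantification procedure used in this work: the function name and the definition of efficiency as terminated signal over terminated plus read-through signal are assumptions made for illustration (the actual procedure is the one described with Figure 3.7), and only the three percentages are taken from the figure.

```python
def termination_efficiency(terminated, read_through):
    """Percent of transcripts ending at the tested element.

    Assumed definition (hypothetical, for illustration only): terminated
    signal divided by terminated plus read-through signal, times 100.
    """
    total = terminated + read_through
    if total <= 0:
        raise ValueError("no signal to quantify")
    return 100.0 * terminated / total

# Efficiencies (%) as listed in Figure 3.13B.
reported = {"TBP2TsNew": 62, "TBP2TsNew.16": 96, "TBP2TsNewX": 97}

# Difference, in percentage points, between each relieved construct
# (spacer added, or downstream terminator deleted) and TBP2TsNew.
reduction = {
    name: value - reported["TBP2TsNew"]
    for name, value in reported.items()
    if name != "TBP2TsNew"
}
print(reduction)
```

Both differences (34 and 35 percentage points) exceed 30, which is the basis for the statement above.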

The negative influence of a T-loop structure on the termination activity of the T-tracts was also seen in the PIIIT9+16 construct described earlier (see Spacing Constraints between the T-tracts). In this construct (see Appendix A for its sequence), the upstream and downstream regions of the T9-tract can form a 7-bp stem, consequently placing the T9-tract in the loop. This structure (free energy ΔG = -13.9 kcal/mole) is not as stable as the TBP2Ts hairpin, and the construct retains some termination activity (50%), but not at the level of the wild-type pol III terminator (> 90%).

Termination Activity of Bent DNA

In Eukarya, studies have demonstrated that intrinsically bent DNA (Kerppola and Kane, 1990) or artificially induced DNA curvature (Ueno et al., 1992) can cause transcription termination. Here we examined two DNA elements that have been shown by the SELEX assay (Beutel and Gold, 1992) to cause DNA bending. The A3T3 element (see Figure 3.14A) has two imperfect repeats of 3’-AAATTT-5’ phased by one helical turn. This sequence represents a non-T-tract bent DNA (as compared to a minimal T-tract for terminators). The other bent-DNA sequence, designated Tract (see Figure 3.14A), has two T5-tracts phased by one helical turn. Figure 3.14 shows the sequences of these two elements and the Northern analysis of constructs containing these sequences in their forward and reverse orientations. The results indicate that only Tract.F was able to direct termination at a significant level: 60% at the first T5-tract and 10% at the second one (lane 3, Figure 3.14B). Although Tract.R still maintained its intrinsic curvature, it did not induce termination at a detectable level (lane 4). The A3T3 element, on the other hand, was not an efficient terminator, regardless of its orientation (lanes 1 and 2). Hence, DNA bending alone was not sufficient to cause termination in H. volcanii. Furthermore, the fact that the Tract.F element was the only efficient termination signal suggests that this element functions by virtue of its T-tract sequences rather than its ability to bend DNA.
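The relationship between orientation and T-tract content can be illustrated with the four sequences of Figure 3.14A. This is a minimal sketch under stated assumptions: the sequences are assumed to be written as the non-template strand, the helper names are hypothetical, and the threshold of four Ts is borrowed from the minimal T-tract length discussed later in this chapter.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def t_tracts(seq, min_len=4):
    """Return (0-based start, length) for every run of >= min_len Ts."""
    pattern = "T{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq.upper())]

# Sequences as listed in Figure 3.14A (strand assignment is an assumption).
ELEMENTS = {
    "A3T3.F": "AAATTTGTCCGAAATTACTGA",
    "A3T3.R": "TCAGTAATTTCGGACAAATTT",
    "Tract.F": "GCGTCTTTTTGGCATCTTTTTCATG",
    "Tract.R": "CATGAAAAAGATGCCAAAAAGACGC",
}

for name, seq in ELEMENTS.items():
    print(name, t_tracts(seq))
```

Only Tract.F carries runs of four or more Ts on the listed strand; its two T5-tracts start 11 nt apart, close to one helical turn of B-form DNA. Each reverse element is the exact reverse complement of its forward partner, so it keeps the same base composition while presenting A-tracts instead of T-tracts.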

To ensure that the minimal spacing requirement was met and that the results were not an artifact of spacing constraints, constructs were prepared with 16-bp inserts (Tract.F/R+16; A3T3.F/R+16; see Appendix A for sequences). The termination activities of these new constructs were essentially the same (data not shown), except that termination at the second T5-tract of Tract.F increased from 10 to 26%, which was probably due to the increased distance (from 30 to 46 nt) between this T5-tract and the T9-tract of the pol III terminator.

Determination of DNA Curvature in Active Terminators

Although our bent-DNA termination data suggested that DNA curvature alone is not sufficient for termination, it is possible that DNA bending assists termination in T-tract terminators. Since bent-DNA conformation is normally associated with anomalous

Figure 3.14: Activity of bent DNAs as transcription terminators. (A) Sequences of the bent-DNA elements. These sequences were previously identified as bent DNAs using the SELEX procedure (Beutel and Gold, 1992). (B) Termination assays. The tRNAProM transcripts derived from the constructs containing the fragments listed in (A) were analyzed by Northern hybridization. Lane numbers correspond to the DNAs presented in Panel A. Lane C is a control, containing the sptProM expression module only. Termination efficiencies are indicated below the individual lanes. N.D. indicates not detectable. Hybridization with a tRNAIle probe served as an internal control (see the legend of Figure 3.7).

(A) Bent-DNA Element

1. A3T3.F     AAATTTGTCCGAAATTACTGA
2. A3T3.R     TCAGTAATTTCGGACAAATTT
3. Tract.F    GCGTCTTTTTGGCATCTTTTTCATG
4. Tract.R    CATGAAAAAGATGCCAAAAAGACGC

(B) [Northern blot, lanes 1 to 4 and C, with the read-through, bent-DNA-terminated and tRNAIle bands marked.]

Termination Efficiency (%): lane 1, 12; lane 2, 20; lane 3, 70; lane 4, N.D.

Figure 3.14

electrophoretic mobility (Beutel and Gold, 1992; Crothers et al., 1990; Hagerman, 1990), the intrinsic DNA-bending ability of a sequence is often determined by examining the electrophoretic mobility of small DNA fragments containing such a sequence. A sequence element is most capable of introducing curvature when it is located in the center of the DNA fragment and least capable when it is located at the terminus of a DNA fragment. Curved DNA fragments normally migrate more slowly than linear fragments of the same size during electrophoresis.

The sequence elements we chose to examine were Tract (a positive bent-DNA control), the yeast tRNAProM pol III terminator, the tbp2 terminator (TBP2A20) and the region from the tbp2 terminator to the pol III terminator in the TBP2A20 construct. Figure 3.15B shows an 8% PAGE analysis of these fragments recovered from the pUCbend vector by digestion with MluI and EcoRV. Whereas the MluI digest places the insert DNA at the end of the fragment, the EcoRV digest places the insert in the center. Lanes 1M and 1E in the PAGE presented in Figure 3.15 represent the vector control (M, MluI digest; E, EcoRV digest). This control reflects any inherent conformational abnormality within the vector sequence. Lane 2 is the positive control, Tract; the mobility shift of the EcoRV fragment of Tract is the most apparent among all the constructs examined. We did not detect any electrophoretic abnormality in fragments containing the yeast tRNAProM pol III terminator or the TBP2A20-pIIIT fragment (lanes 3 and 5). However, the EcoRV fragment containing TBP2A20 (lane 4) did show a minor degree of retardation, suggesting an inherent DNA-bending ability. Overall, these data indicate

Figure 3.15: Electrophoretic mobility of termination elements. (A) The polylinker region of pBend2 (or pUCbend) (Zwieb et al., 1991). This region contains 17 pairs of permutated restriction sites as well as two unique cloning sites, XbaI and SalI, at the center. Termination elements were cloned into the XbaI site (indicated with an arrowhead). When digested with any of the restriction enzymes at the permutated restriction sites, this polylinker region yields a 120-bp fragment. (B) 8% polyacrylamide gel electrophoresis. The MluI (M) and EcoRV (E) fragments derived from the polylinker region of each construct were analyzed for their electrophoretic mobility on an 8% PAGE at 4 V/cm for 26 hours. Lane 1 is the control, which does not contain an insert in the polylinker region of pUCbend. Lanes 2 to 5 contain constructs carrying the Tract, pIIIT (pol III terminator), TBP2A20, and TBP2A20-pIIIT fragments at the XbaI site, respectively.

[Figure 3.15. (A) Map of the polylinker region, listing the restriction sites in order: EcoRI, MluI, BglII, NheI, ClaI, StyI, SpeI, XhoI, EcoRV, PvuII, SmaI, StuI, NruI, SspI, KpnI, NcoI, BamHI, XbaI, SalI, MluI, BglII, NheI, ClaI, StyI, SpeI, XhoI, EcoRV, PvuII, SmaI, StuI, NruI, SspI, KpnI, NcoI, BamHI, HindIII. (B) Polyacrylamide gel, lanes 1 to 5, each with an MluI (M) and an EcoRV (E) digest.]

Figure 3.15

that bent DNA is not a general requirement for a functional termination element in H. volcanii. However, they do not rule out the involvement of DNA bending in archaeal termination, since one terminator (tbp2) is a potential bent DNA.

Activity of a Bacterial ρ-Independent-Like Terminator in H. volcanii

In addition to the eukaryal-like T-tract terminators, some archaeal genes have 3’-terminal sequences resembling the bacterial ρ-independent terminators (Brown et al., 1989; Lechner and Sumper, 1987; Muller et al., 1985). To determine if a bacterial ρ-independent-like termination system exists in the archaea, we examined a potential ρ-independent terminator found in the 3’-region of the H. volcanii rRNA operon. H. volcanii has two copies of the rRNA operon. Each operon contains (in order) 16S rRNA, tRNAAla, 23S rRNA, and 5S rRNA, with one of the copies containing additional tRNACys genes located downstream of the 5S rRNA. In particular, the 5S rRNA regions of both copies have been completely sequenced (Daniels et al., 1985b). One hundred percent nucleotide sequence identity was observed from 20 bp upstream to 74 bp downstream of the duplicated 5S rRNA genes. Within the 74-bp downstream region, a sequence element containing a potential stem-loop structure followed by three clusters of Ts (5’-TTTGTTCATTTT-3’) is a likely transcription terminator (see 5SrT.F in Figure 3.16A for its sequence). The goal of this part of the investigation was to understand the structure and/or sequence requirements within this type of terminator.
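The stem-loop followed by a T-rich stretch can be located with a simple inverted-repeat search on the 74-nt forward element of Figure 3.16A. The sketch below is illustrative only and is not the folding analysis used in this work: it assumes perfect Watson-Crick pairing (no G:U pairs, bulges or energy model), and the function name, parameters and loop-size limits are hypothetical choices.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def longest_perfect_hairpin(seq, min_stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_len, loop_len) of the longest perfect inverted
    repeat separated by a short loop, or None if there is none."""
    seq = seq.upper()
    n = len(seq)
    for stem in range(n // 2, min_stem - 1, -1):        # longest stems first
        for start in range(n - 2 * stem - min_loop + 1):
            left = seq[start:start + stem]
            for loop in range(min_loop, max_loop + 1):
                right_start = start + stem + loop
                right = seq[right_start:right_start + stem]
                if len(right) == stem and right == reverse_complement(left):
                    return start, stem, loop
    return None

# Forward rRNA operon element (74 nt), as listed in Figure 3.16A.
RRNA_TERMINATOR_F = ("ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGG"
                     "CTTTGTTCATTTTATGCAGAGCGGCACGCACGG")

start, stem, loop = longest_perfect_hairpin(RRNA_TERMINATOR_F)
left_arm = RRNA_TERMINATOR_F[start:start + stem]
downstream = RRNA_TERMINATOR_F[start + 2 * stem + loop:]
print(left_arm, loop, downstream[:13])
```

Under these assumptions the search finds a 9-bp stem with a 5-nt loop, and the sequence immediately downstream of the right arm begins with the T-rich stretch quoted above.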

Terminator Activity of a ρ-Independent Terminator-Like Sequence from the H. volcanii rRNA Operon

Many archaeal stable RNA genes do not contain T-tract terminators and possibly generate the final RNA product by means of post-transcriptional processing (Zillig et al., 1993). Therefore, we first wanted to determine whether the 3’ sequence element of the H. volcanii 16S-23S-5S rRNA operon is a signal for endonucleolytic processing, such as cleavage by RNase III or RNase E, or a termination element. The 3’ flanking region of the 5S rRNA gene (the last gene in the operon) was amplified from H. volcanii genomic DNA by PCR using primers (5STL and 5STR; see Table 2.1) containing synthetic XbaI sites at each end. The XbaI-XbaI PCR DNA fragment was placed into sptProM at the XbaI site located between the H. volcanii tRNALys promoter and the tRNAProM gene. Figure 3.16B shows the Northern analysis of the constructs with the insert DNA placed in either orientation. These data indicated that 5SrT.F (forward construct) completely terminated transcription before RNAP reached the tRNAProM coding region, since no tRNAProM was detected in cells carrying this construct (lane 1). Had 5SrT acted as a processing signal, we would have detected the stable processed transcript. The 5SrT.R sequence element (reverse construct), on the other hand, did not seem to have termination activity (lane 2). Therefore, the 3’ sequence element of the H. volcanii rRNA operon (5SrT) most likely functioned as a transcription terminator. Consequently, the question that we wanted to address next was whether 5SrT operated like a bacterial ρ-independent terminator. To do so, we generated constructs containing 5SrT or mutant versions of this sequence, and introduced these DNAs

downstream of the tRNAProM gene in the sptProM expression module. This allowed us to directly assess the efficiency and accuracy of the termination event.

The XbaI-XbaI 5SrT DNA fragment was first cloned into pUC1318 to pick up the BamHI sites at each end and then cloned into the expression module (sptProM) at the BamHI site. The DNA was cloned in both orientations. As shown in Figure 3.16C, lanes 1 and 2, while the reverse sequence (5SrT.R) did not terminate transcription at a significant level, the forward sequence (5SrT.F) displayed 52% termination activity. We also observed two smaller transcripts being made from the 5SrT.F construct. The failure to detect any read-through products in the 5SrT.F construct, coupled with the observation that tRNAProM does not undergo 5’ end or intron processing (Palmer et al., 1994), suggests that these RNAs might result from general 3’ degradation. Therefore, the termination activity of the 5SrT.F sequence element might, in fact, be much higher than 52%. Furthermore, using S1 nuclease mapping, we have mapped the 3’-terminus of the tRNAProM derived from the 5SrT.F construct to the 3 nucleotides immediately following the stem-loop structure (see Figure 3.17).

To determine if the 5SrT.F termination activity requires both the stem-loop structure and the following T-rich region, we constructed the following deletion mutants using the 5SrT.F sequence: deSTLP, which has a deletion of the 5’-region up to the left arm of the stem; ShorT, which has the T4-tract (in the T-rich region) and its downstream region deleted; and Trich, which contains only the 3’-region beginning at the bottom of the stem. The sequences of these elements are shown in Figure 3.16A. The results of Northern analysis for the constructs carrying these sequences (Figure 3.16C) indicated that none of the mutant sequences were efficient in directing transcription termination, suggesting that both the intact stem-loop structure and the 3’ T-rich region play a role in termination. Moreover, comparison between the deSTLP and Trich constructs suggested that the residual stem-loop sequence in deSTLP might be a target site for degradation. Although the only difference between these two sequence elements was that deSTLP contained an additional upstream region including the loop and the right-arm stem, the deleterious effect on the stability of the tRNAProM transcript was apparent for the deSTLP construct but not for the Trich construct. Also, an interesting phenomenon was observed for the Trich construct (Figure 3.16C, lane 5): in addition to the read-through product terminating at the pol III terminator, four smaller transcripts appeared as minor products from this construct. Based on their sizes, two transcripts are likely to represent read-through termination within the vector sequence, and the other two (labeled rT; comprising 20% of the total transcripts) are likely to be transcripts terminating at the T3- and T4-tracts of the Trich sequence element. We also inserted the 16-nt spacer in the Trich construct (Trich+16) to increase the distance between the T4-tract in Trich and the T9-tract in the pol III terminator from 30 nt to 46 nt. For Trich+16, we obtained a clear signal, which represented approximately 40% of the total product and had the predicted 3’-terminus of the tRNAProM residing in the T4-tract of the Trich sequence (data not shown). It is tempting to speculate that removal of the hairpin results in a switch from ρ-independent-like termination to T-tract termination. Taken together,

Figure 3.16: H. volcanii rRNA operon terminator. (A) Sequences of the H. volcanii rRNA operon termination elements. The first sequence element, 5SrT.F, contains the 74-nt sequence that is located immediately downstream from the 5S rRNA coding region. Arrows indicate inverted repeats and arrowheads mark the termination sites determined by S1 nuclease mapping (Figure 3.17). 5SrT.R is the reverse complement of 5SrT.F, and the other three sequence elements are deletion mutants of 5SrT.F. The sequence regions deleted in the mutants are indicated by dotted lines. (B) Northern analysis. The Northern blot contains RNAs from cells carrying the 5SrT.F (lane 1) or 5SrT.R (lane 2) terminators at the XbaI site of sptProM. No read-through product was detected in lane 1, so the terminator activity of 5SrT.F was designated 100%. The activity of 5SrT.R was below the level of detection, since the probe (exon 1 probe) detects only read-through products (< 5%; designated as not detectable, N.D.). Lane C contains the sptProM control. (C) Northern analysis of constructs carrying the sequence elements listed in (A) at the BamHI site of sptProM. The hybridization signals corresponding to transcripts terminating at the rRNA operon termination element are labeled rT. The smaller transcripts in lanes 2 to 3 might be the result of general degradation. The termination efficiency for each insert DNA is indicated below. The symbol N.D. designates no detectable signal at the rRNA terminator. The control lane contains RNA from cells carrying the expression module, sptProM. Hybridization with a tRNAIle probe served as an internal control (see the legend of Figure 3.7).

(A)

1. 5SrT.F    ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCATTTTATGCAGAGCGGCACGCACGG
2. 5SrT.R    CCGTGCGTGCCGCTCTGCATAAAATGAACAAAGCCACTGGACACCGCGTCCAGTGGGCTCTGTGAAGTCTGAAT
3. deSTLP    ---------------------------GCGGTGTCCAGTGGCTTTGTTCATTTTATGCAGAGCGGCACGCACGG
4. ShorT     ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCA------------------------
5. Trich     ---------------------------------------GGCTTTGTTCATTTTATGCAGAGCGGCACGCACGG

(B) [Northern blot, lanes C, 1 and 2, with the read-through and tRNAIle bands marked.]

(C) [Northern blot, lanes C and 1 to 5, with bands labeled read-thru (in vector; pol III Term., 154 nt), rT and tRNAIle.]

Termination Efficiency (%) at rRNA Terminator: (B) lane 1, 100; lane 2, N.D. (C) lane 1, 52; lane 2, N.D.; lane 3, N.D.; lane 4, N.D.; lane 5, 20.

Figure 3.16

Figure 3.17: Mapping the 3’ terminus of the transcript from the 5SrT.F construct by S1 nuclease digestion. The DNA probe was prepared from a DNA fragment amplified by PCR using the ProMSP and NewT3 oligonucleotide primers and digested with 5jrNl. RNA was isolated from H. volcanii cells carrying the 5SrT.F construct. Lane 1 is a control that contained the labeled 188-nt DNA probe. The two additional bands in this lane might be the result of non-specific labeling of the non-template strand DNA and of minor contamination during preparation of the restriction DNA fragment from the agarose gel. Lane 2 is a control for S1 nuclease activity, in which the DNA probe was heat-denatured and digested with S1 nuclease for 30 min at room temperature. Lanes 3 to 5 are the “minus” RNA controls, in which the DNA probe in the hybridization buffer was digested with S1 nuclease for 15 min, 30 min, and 45 min, respectively. Lanes 6 to 8 contain the S1 reactions, in which approximately 10 ng DNA and 10 µg RNA were hybridized at 47°C for 16 hours and digested with 10 volumes of S1 nuclease (1000 units/ml) on ice for 15 min, 30 min, and 45 min, respectively. The positions of the nucleotides corresponding to the 5’ terminus of the major protected products, which are 77 to 79 nt in size, are indicated in Figure 3.16A. The DNA size markers are labeled based on a sequencing ladder.

[Figure 3.17. Gel of the S1 nuclease protection assay, lanes 1 to 8, with size markers at 100 and 74 nt and the protected products at 77-79 nt.]

Figure 3.17

these data suggested that the element located downstream of the H. volcanii rRNA operon contains a bacterial ρ-independent-like terminator.

Terminator Activity of the E. coli trpA Terminator in H. volcanii

To confirm that the ρ-independent termination system functions in H. volcanii, we decided to test a known bacterial ρ-independent terminator in our termination assay system. We chose the E. coli trpA terminator (see Figure 3.18A for its sequence and structure) as a model since it is a well-defined ρ-independent terminator in E. coli (Christie et al., 1981; Platt, 1981). Complementary oligonucleotides containing the trpA terminator sequence were annealed and cloned into sptProM at the BamHI site. Unexpectedly, the termination activity of the trpA terminator was only approximately 9% in our expression module (TrpA.T construct; see Figure 3.18C, lane 1). Furthermore, deletion of the pol III terminator downstream from the trpA terminator in the TrpA.T construct did not restore the activity of the trpA terminator (TrpA.TX; lane 2), suggesting that the trpA terminator either could not act as a terminator or that it was placed in a context that prevented its function. Figure 3.18B shows the structure of the trpA terminator sequence region (in bold print) in the context of TrpA.T or TrpA.TX, as predicted by the mFold RNA folding program. It appears that the trpA terminator in this case could exist in an alternative structure that places its T-tract element within the bulge loop. This predicted structure disrupts the native structure of the trpA element as a ρ-independent terminator. It also prevents the T-rich sequence region from acting as a T-tract terminator since the T-tract is located in a

Figure 3.18: Termination activity of the E. coli trpA terminator in H. volcanii. (A) The native secondary structure of the E. coli trpA terminator. (B) The secondary structure of the region containing the E. coli trpA terminator in sptProM. The secondary structure of the region downstream from the tRNAProM to the 5’ SstI site in the TrpA.T or TrpA.TX constructs was predicted using the mFold RNA folding program (Zucker, 1994). The trpA terminator sequence is shown in bold print. (C) Northern analysis. Lane 1 contains the TrpA.T construct, and lane 2 contains TrpA.TX. In lane 2, the transcript labeled vector.1 has its 3’-terminus located approximately 37 nucleotides downstream from the EcoRI site (the 3’-end of the sptProM expression module). The vector.2 label indicates the RNA that has its 3’-terminus at approximately the 3’ site, that is, the 3’-end of the structure shown in (B). Hybridization with a tRNAIle probe served as an internal control (see the legend of Figure 3.7). Termination efficiencies of these constructs are indicated in the table to the right.

[Figure 3.18. (A) Native stem-loop structure of the trpA terminator, followed by a run of Ts. (B) Predicted structure of the trpA terminator in the context of sptProM. (C) Northern blot, lanes 1 (TrpA.T) and 2 (TrpA.TX), with bands labeled vector.1, pol III Term., vector.2, trpA Term. and tRNAIle, and a table of termination efficiency (%) at the trpA terminator.]

Figure 3.18

hairpin. This structure also explains the dominant signal (70%; labeled vector.2 in lane 2) of TrpA.TX in the Northern analysis. Since there was no efficient terminator present in the TrpA.TX construct, the 3’-terminus of most of the transcripts possibly resulted from general degradation from the unstructured 3’ end.

Discussion

Data from studies focusing on transcription termination in Bacteria and Eukarya suggest that different organisms have adopted different strategies for regulating transcription at the level of termination. However, it seems that certain features of successful termination mechanisms have been preserved in evolution (reviewed in Kerppola and Kane, 1991; Platt, 1996). Consequently, we have used the bacterial and eukaryal termination mechanisms as models for exploring transcription termination in the third phylogenetic domain, Archaea.

Terminator Function of a Eukaryal pol III Termination Element in H. volcanii

In studying the in vivo termination of the yeast tRNAProM terminator, we demonstrated that a eukaryal pol III terminator functions as a strong terminator in H. volcanii. This observation suggests that the archaeal RNAP can recognize a eukaryal pol III terminator and terminate transcription efficiently. Hence, the archaeal transcription machinery might, in fact, utilize a eukaryal-like termination mechanism.

The fact that the yeast pol III terminator, when cloned in the reverse orientation, was not able to direct termination in H. volcanii also provides some insight into the requirements for archaeal termination. First, since the reverse complementary sequence would still be AT-rich, archaeal termination is not simply the result of DNA:RNA instability attributed to weak A:U base pairing. The T-tract must be present on the non-template strand to direct efficient termination. Second, an interesting observation concerns the apparent requirements for termination in vivo as compared with those described for terminators using in vitro systems. The reverse sequence of the pol III terminator contains the element 5’-CTTTAATTTT-3’, which resembles 5’-TTTTAATTTT-3’, which was described as an efficient terminator of Methanococcus tRNAVal transcription in vitro (Thomm et al., 1994). We observed, however, that the sequence 5’-CTTTAATTTT-3’ did not terminate transcription in vivo. This discrepancy may reflect differences in the termination systems utilized by halophiles and methanogens, but more likely, it demonstrates an inherent shortcoming of in vitro studies using purified RNAP (Manley et al., 1989). In in vitro transcription assays it is difficult to distinguish between transient RNAP pausing and true termination, in which the nascent transcript and RNAP are released from the elongation complex. For these reasons we undertook a detailed analysis of the requirements for transcription termination in vivo.

An Expression Module for Monitoring in vivo Termination

The sptProM expression module constructed for this study is a very useful genetic tool for studying archaeal termination in vivo. This termination module contains the yeast tRNAProM reporter gene under the transcriptional control of the H. volcanii tRNALys promoter, synthetic BamHI and KpnI cloning sites, and a eukaryal pol III terminator at the 3’ end. The yeast tRNAProM gene gives rise to a discrete RNA product that is not processed and therefore can be readily measured by Northern analysis (Palmer et al., 1994).

The sptProM expression module has several advantages as a genetic tool for investigating transcription termination. First, it provides a positive assay for transcription termination. This is in contrast to the placement of potential terminators in a region upstream of the H. volcanii tRNATrpO 16M-AGGAG gene, where lack of read-through (negative data) was interpreted as a positive termination event. As with the AGGAG construct, this module also allows insertion of putative terminators in the region 5’ of the reporter. This provided an alternative means to assess whether a sequence element was a target for post-transcriptional processing. Using S1 mapping and Northern analysis, the accuracy and efficiency of a potential termination element could be evaluated.

Characteristics of T-tract Terminators

Our studies on archaeal T-tract terminators demonstrated that they consist of either one or multiple oligo(dT) stretches on the non-template strand. Tracts of thymine residues are also the recurring theme in the active artificial termination elements we examined, such as the bent DNA and Lys tRNA promoter sequences. Using S1 nuclease digestion, the 3’ termini of transcripts terminating at the tbp2, cct1 and pol III terminators were consistently mapped within the T-tract elements. Other tRNAProM transcripts that terminated at an inserted termination element, although not precisely mapped, most likely had 3’-termini within a T-tract based on the size of the transcripts as determined by Northern analysis. Therefore, nascent RNA transcripts terminated at T-tract terminators are mostly released within the T-tract. T-rich regions are also characteristic of eukaryal terminators (Campbell and Setzer, 1992; Kerppola and Kane, 1990; Lang and Reeder, 1995).

In some cases (Figure 3.6), the 3’ termini of archaeal gene transcripts were mapped to sequence regions adjacent to a T-tract motif (Reiter et al., 1988) or to a thymine not located within a poly-T tract (Shimmin and Dennis, 1996). Since S1 nuclease is a very aggressive enzyme, we have found that it is capable of hydrolyzing double-stranded nucleic acids at regions where dsDNA “breathing” can take place (particularly at A-U base pairs). Therefore, when conducting S1 mapping, we consistently used only a 3’-end-labeled DNA probe and carefully selected the S1 digestion conditions (normally a low temperature, 0°C to 20°C) under which digestion of double-stranded nucleic acids was reduced to a minimum (Berk, 1989). Such an approach should yield S1 nuclease protection data that reflect true termination.

After confirming the function of the T-tract as the “core” element of archaeal T-tract terminators, we then defined in detail the sequence requirements within the T-tract motif. Mutational studies using the tbp2 terminator as a model revealed the following sequence requirements for the T-tract motif: (1) a minimum of 4 Ts is required; (2) guanine residues are inhibitory at any T-tract position; (3) replacement of the T at position 3 in the T5-tract (5 Ts) with any of the other three possible nucleotides suppresses termination; (4) substitution of cytosine at the T2 and T4 positions is acceptable. Together, these data suggest that archaeal T-tract terminators and eukaryal pol III terminators are similar.
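Observations (2) to (4) can be written as a small decision rule for 5-nt variants of a T5-tract. This is only a sketch of how the listed observations combine, not a predictor derived from the data: the function name and the three labels are hypothetical, observation (4) is read here as applying to a single C substitution at position 2 or 4 (an assumption), and any variant not covered by the listed observations is reported as untested rather than guessed. Observation (1) concerns tract length and is not encoded.

```python
def classify_t5_variant(variant):
    """Classify a 5-nt variant of a T5-tract as 'active', 'inactive' or
    'untested', using only observations (2) to (4) listed above."""
    v = variant.upper()
    if len(v) != 5 or set(v) - set("ACGT"):
        raise ValueError("expected five nucleotides from A, C, G, T")
    if v == "TTTTT":
        return "active"        # unmodified T5-tract
    if "G" in v:
        return "inactive"      # (2) G is inhibitory at any position
    if v[2] != "T":
        return "inactive"      # (3) position 3 must remain T
    changed = [i + 1 for i, base in enumerate(v) if base != "T"]
    if len(changed) == 1 and changed[0] in (2, 4) and v[changed[0] - 1] == "C":
        return "active"        # (4) single C at position 2 or 4 (assumed reading)
    return "untested"          # not covered by the listed observations

for example in ("TTTTT", "TTGTT", "TTATT", "TCTTT", "TTTCT", "ATTTT"):
    print(example, classify_t5_variant(example))
```

Returning 'untested' for combinations such as a substitution at position 1 keeps the sketch from asserting more than the mutational data listed above support.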

It has been proposed that oligo(dT) tracts on the non-template strand constitute eukaryal pol III terminators (reviewed in Geiduschek and Tocchini-Valentini, 1988). However, the minimum number of Ts varies among species: T7 in yeast (Allison and Hall, 1985) and T4 in Xenopus (Bogenhagen and Brown, 1981). In addition to the T-tract motif, the two nucleotides flanking the T-tract can influence its termination efficiency (Bogenhagen and Brown, 1981; Mazabraud et al., 1987). However, the identities of these flanking nucleotides do not appear to be conserved, since they vary in different terminators of Xenopus laevis genes: for example, 5’-GC-3’ for 5S RNA (Bogenhagen and Brown, 1981) and 5’-AT-3’ for tRNALys (Mazabraud et al., 1987).

The identity of nucleotides adjacent to T-tracts in archaeal terminators might also have some impact on termination efficiency. For the TBP2.A10 mutant, in which the two nucleotides immediately 3’ to the T-tract of the tbp2 terminator were changed from 5’-CG-3’ to 5’-AA-3’, the termination efficiency of the T-tract was reduced by approximately 25%. Comparison of the dinucleotide sequences found upstream and downstream of T-tract terminators (Table 3.2) indicated a possible preference for G and C residues immediately 3’ of the T-tract motif. However, as exceptions to such a preference exist in some T-tract terminators (Figure 3.6), these adjacent nucleotides in archaeal terminators likely function by influencing the environmental context of the T-tracts rather than by providing sequence specificity.

Terminators         Sequences (5’->3’)(a)    Termination Efficiency (%)

HScomp (W.T.)       CGTTTTTTGC               90
TBP2comp (W.T.)     GATTTTTCG                90
TBP2.A10            GATTTTTAA                65
pol III             AATTTTTTTTTGC            >90#
LysPT29             AGTCTTTTAC               32

Table 3.2: The influence of the two nucleotides flanking the T-tract on the termination efficiency. (a): shows the sequence containing the T-tract and its adjacent nucleotides; #: estimated number.
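The flanking dinucleotides compared in Table 3.2 can be extracted mechanically from the listed sequences. The sketch below is illustrative: the helper name is hypothetical, it assumes each entry is written as two flanking nucleotides, an uninterrupted run of Ts, and two flanking nucleotides, and it therefore covers only the four entries that fit that layout.

```python
def split_flanks(seq):
    """Split 'NN' + T-run + 'NN' into (5' flank, tract length, 3' flank)."""
    seq = seq.upper()
    five, tract, three = seq[:2], seq[2:-2], seq[-2:]
    if len(seq) < 5 or set(tract) != {"T"}:
        raise ValueError("expected two flanking nt on each side of one T-run")
    return five, len(tract), three

# Sequences as listed in Table 3.2 (entries with an uninterrupted T-run only).
SEQUENCES = {
    "HScomp": "CGTTTTTTGC",
    "TBP2comp": "GATTTTTCG",
    "TBP2.A10": "GATTTTTAA",
    "pol III": "AATTTTTTTTTGC",
}
# Efficiencies (%) for the two tbp2 entries, as listed in Table 3.2.
EFFICIENCY = {"TBP2comp": 90, "TBP2.A10": 65}

flanks = {name: split_flanks(seq) for name, seq in SEQUENCES.items()}
drop = EFFICIENCY["TBP2comp"] - EFFICIENCY["TBP2.A10"]
print(flanks)
print(drop)
```

The two tbp2 entries differ only in the 3’ dinucleotide (CG versus AA), and the difference in their listed efficiencies is 25 percentage points, matching the reduction quoted in the text.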

In both the E. coli (Reynolds and Chamberlin, 1992; Telesnitsky and Chamberlin, 1989) and pol II (Kerppola and Kane, 1990) termination systems, the sequences flanking the termination site can influence the efficiency of the RNAPs. Moreover, transcription termination in the eukaryal pol I system requires the binding of a species-specific termination factor to a DNA sequence located downstream from the T-rich element (Lang et al., 1994; Reeder and Lang, 1994). It has been shown that binding of the yeast pol I termination factor, Reb1p, is essential for pausing of the elongation complex and for stimulating the release of the RNA transcript and pol I at the upstream T-rich region (Lang and Reeder, 1995). There is also a specific interaction between the bound Reb1p and the T-rich motif. The T-rich motif is required for both the pausing and release events, and the spacing between the T-rich and the Reb1p-binding motifs is critical. However, our mutational studies on the flanking regions of the tbp2 and cct1 terminators indicated that such a pol I-like terminator does not exist in these genes. While the 3’ regions of the tbp2 terminator were necessary for maintaining a spacing requirement between the putative termination element and the pol III terminator in our expression module (discussed later), they do not confer sequence specificity as in the pol I system. Furthermore, the 5’ regions located upstream of the T-tract motif in the tbp2 terminator appeared to be dispensable.

Since the T-tract itself seems to be the core element of archaeal T-tract terminators, how does such a sequence signal direct the termination event? It is unlikely that the presence of a T-tract merely reduces the stability of the RNA:DNA hybrid in the transcription bubble, as suggested in an earlier model for bacterial ρ-independent termination (Yager and von Hippel, 1991; Yager and von Hippel, 1987). If the termination event were solely dependent on the thermodynamic stability of the RNA:DNA hybrid, replacing the thymine residues within the T-tract with adenine residues should not affect termination efficiency. However, our data show that this was clearly not the case: a dinucleotide (AA) replacement within the T-tract of the tbp2 terminator nearly abolished its termination activity.

Evidence obtained from studying eukaryal terminators (Campbell and Setzer, 1992; Deng et al., 1996; Gottlieb and Steitz, 1989; Lang et al., 1994; Lang and Reeder, 1995; Maraia et al., 1994; McStay and Reeder, 1990; Xie and Price, 1996) suggests that the common strategy for transcription termination involves a two-step mechanism. Therefore, an intrinsic terminator must satisfy two distinct termination steps: (1) stopping the RNAP (pausing) and (2) release of the RNAP and the nascent RNA transcript. The signals responsible for these two events may or may not be different. In the pol III system, the two steps in the termination process have been experimentally uncoupled (Campbell and Setzer, 1992). While recognition of the pol III terminator could be accomplished by purified RNAP alone (Cozzarelli et al., 1983; James and Hall, 1990; Shaaban et al., 1995), the completion of the transcript, its release, and recycling of the RNAP in complete cell extracts required binding of La antigen to the RNA at its 3’ U residues (Gottlieb and Steitz, 1989.a; Gottlieb and Steitz, 1989.b; Maraia et al., 1994). All the events for pol III termination occurred within the T-tract of the terminator. Since the archaeal transcription system is eukaryal-like and the sequence features of archaeal T-tract terminators resemble those of the pol III terminator, the T-tracts of archaeal terminators are likely responsible for both pausing and transcript release/cleavage.

Currently, there are no genetic tools or techniques available for directly examining the mechanism of archaeal termination. However, we investigated the possibility that DNA secondary structure was involved in archaeal termination using an in vivo approach. Since DNA bending has been correlated with the termination efficiency of intrinsic pol II (Kerppola and Kane, 1990) and pol III (Gottlieb and Steitz, 1989) terminators, we tested whether a bent DNA could function as a terminator in Archaea and whether a functional T-tract terminator could bend DNA intrinsically. Our study on the termination activity of two bent DNA elements identified by a SELEX procedure (Beutel and Gold, 1992) revealed that bent DNA without a T-tract containing at least 4 Ts could not efficiently terminate transcription and that the T-tract-containing bent DNA was only functional in one orientation (Ts on the non-template strand). Therefore, DNA bending alone is not sufficient for termination, and the role of the T-tract is not limited to DNA bending (if it does induce bending). However, since the tbp2 terminator (one of the two active terminators we examined for bending) is capable of introducing some DNA curvature, it remains to be determined whether bending of the DNA template is one of the strategies used by the Archaea for effecting elongation pausing.

During our study of archaeal terminators, we discovered that a T-tract (or U-tract in RNA) located in the loop region (T-loop) of a potential stem-loop structure could not direct termination (see Figure 3.13). While such a structure might not exist in native archaeal terminators, this observation suggests a potential underlying termination mechanism. We suggest two possible hypotheses for why the T-loop is inhibitory for the termination activity of the T-tract motif. One is that the T-loop inhibits the release of RNA transcripts, perhaps by preventing the necessary interactions between an RNA-binding protein and the U residues, thereby interfering with the cleavage of the transcript. Examples of RNA-binding proteins involved in termination include the La antigen for pol III termination (Gottlieb and Steitz, 1989.a; Gottlieb and Steitz, 1989.b) and the vaccinia capping enzyme for vaccinia virus RNAP termination (Shuman and Moss, 1988). Both proteins recognize the U residues in the 3’-end of RNA; the latter is nonfunctional if sequestered in the stem of an RNA secondary structure (Shuman and Moss, 1988). The second hypothesis is that the T-loop inhibits the discontinuous movement of RNAP (“inchworming”). Recently, the T-tract sequence (or A-tract on the template strand) has been shown to signal the entry of the elongation complex into an inchworm-like translocation cycle on the DNA template, which correlates with the termination event (Nudler et al., 1995) (discussed in more detail in the following section). As the RNAP approaches the T-tract region on the DNA, the dsDNA melts in the transcription bubble, and the ssDNA becomes available to form a structure that prevents the T-loop from making direct contacts with RNAP. Therefore, the RNAP misses the signal for initiating inchworming and fails to terminate transcription.

Bacterial ρ-Independent-like Termination in the Archaea

Although there is no conserved secondary structure beyond the common T-tract element in most of the 3’ regions of the archaeal genes, some termination sites are located in T-tracts preceded by inverted repeats (Brown et al., 1989). Such a sequence-structure feature is the signature of bacterial ρ-independent terminators. Our study of the 3’ sequence region of the H. volcanii rRNA operon demonstrated that this bacterial ρ-independent terminator-like sequence element is an efficient terminator in vivo. Furthermore, the 3’ terminus of the tRNAProM transcript derived from the construct carrying this rRNA terminator was mapped within the 3 nucleotides at the bottom of the stem to the first T-tract following the secondary structure. The results of the mutational analyses of the rRNA terminator demonstrated that the efficiency of this archaeal terminator depended on the integrity of the stem-loop structure and the T-rich region, the characteristic features of the classical ρ-independent terminator. Taken together, our data suggest that the H. volcanii rRNA terminator is analogous to the bacterial ρ-independent terminator and that features of both eukaryal (T-tract) and bacterial (ρ-independent) termination exist in the archaeal termination system.

Unfortunately, due to a structural problem, our test of an authentic bacterial ρ-independent terminator in H. volcanii using the E. coli trpA terminator was inconclusive. However, since all classical bacterial ρ-independent terminators contain a 3’ T-tract, it would be difficult to determine whether the termination activity exhibited by a bacterial ρ-independent terminator in the Archaea was due to the entire sequence/structure or to the T-tract of the terminator. Consequently, we did not further pursue the bacterial ρ-independent terminator. Resolving this problem will require an in vitro transcription system and purified factors involved in the termination process, together with footprinting and pulse/chase experiments.

The fact that the Archaea are capable of utilizing both T-tract and ρ-independent terminators means that they should be able to switch between these two different termination mechanisms. This is exactly what we have observed. When the stem-loop structure was deleted from the H. volcanii rRNA terminator, the T-rich sequence region left behind behaved like a T-tract terminator and exhibited termination activity.

Archaeal Transcription Termination and RNAP Inchworming

Throughout this study, we consistently observed that the spacing between the T-tract of the putative terminator and the downstream pol III terminator appeared to affect the termination efficiency of the first T-tract. Comparison of the termination activities of selected constructs (Table 3.2) suggests that bringing the T-tract of the pol III terminator into close proximity to the T-tract of the putative termination element reduced the termination activity of the first T-tract by approximately 30%. Such an observation is indicative of the RNAP inchworming mechanism of transcription termination reported recently in E. coli (Nudler et al., 1995).

Recent studies on the E. coli elongation complex have changed our view of the transcription elongation and termination process (Nudler et al., 1996; Nudler et al., 1994; Nudler et al., 1995; Wang et al., 1995). In the classical transcription bubble paradigm, it was thought that the elongation complex moved monotonically and that its stability depended solely on the stability of the DNA:RNA hybrid within the transcription bubble. In contrast to the classical model, the current view of an elongation complex is one of a dynamic complex whose stability is determined by the complex interactions between the RNAP, the nucleic acids, and other protein factors. In this new elongation model (called the “inchworm” model) (Chamberlin, 1995; Chan and Landick, 1994; Platt, 1996), the RNAP advances in alternating laps of monotonic and inchworm-like movements. The nascent transcript is inserted into an RNA-binding site on the RNAP, and the contacts between the RNA transcript and the RNAP change upon elongation arrest (Markovtsov et al., 1996). As a result of RNAP inchworming, and depending on the topological state at which the elongation complex is captured, the footprint of the E. coli RNAP on the DNA template can range from 25 to 40 bp and the size of the transcription bubble varies between 14 and 18 nt (Krummel and Chamberlin, 1992.b). This inchworming-like movement is also implicated in the eukaryal pol II (reviewed in Aso et al., 1995) and pol III termination systems. In the pol II system, the footprint of the elongation complex on the DNA ranges from 48 to 55 bp and the size of the transcription bubble varies between 18 and 27 nt (Gu et al., 1996). The inchworming movement of RNAP is not intrinsic but represents a response by the elongation complex to specific DNA sequences (Wang et al., 1995).

Recently, Nudler et al. (1995) have shown that the T-tract motif is a sequence element signaling the entry of E. coli RNAP into inchworming and that the leaping of the RNAP at the terminator coincides precisely with transcription termination in vitro. It is suggested that the low occupancy of the transcript exit channel in the RNAP, due to the formation of the RNA hairpin or stripping by ρ-factor, disrupts the interaction between the RNA transcript and RNAP and destabilizes the ternary structure. Threading of RNA during the leap of the RNAP (approximately 10 nts) (Zaychikov et al., 1995) causes further instability of the elongation complex and leads to an irreversible termination event (Nudler et al., 1995; Platt, 1996).

As the universality of RNAP inchworming becomes more and more evident, it is likely that this mechanism exists in all three phylogenetic domains. If termination in the Archaea also involves RNAP inchworming signaled by T-tract elements on the DNA, then the distance between two T-tracts should influence the termination efficiency of the first T-tract. Here we present a hypothetical model (Figure 3.19) to explain the effect of different spacing distances on the termination efficiencies of T-tracts. First, as shown in Figure 3.19A, when the distance between two T-tracts is equal to or less than the size of a transcription bubble, the RNAP can recognize both T-tracts in one translocation cycle, and in this case, both T-tracts function as a single termination signal. In this situation, the majority of the transcripts will be cleaved/released at the first T-tract, while the remaining transcripts will be released at the second T-tract. Although the size of the archaeal transcription bubble has not been mapped, the footprint of the Methanococcus RNAP has been determined to extend approximately 50 bp (Thomm, 1996), which is close to the size of the pol II footprint. Therefore, the size of the DNA bubble in the archaeal elongation complex might also resemble the size of the pol II DNA bubble (18 to 27 nt).

Figure 3.19B illustrates the second situation, in which the two T-tracts are further apart (a distance greater than the size of the transcription bubble) yet are close enough that RNAP can reach the second T-tract after it completes the inchworming signaled by the first T-tract. In this case, about 30% of the RNAPs will read through the first T-tract, start another translocation cycle at the second T-tract, and eventually be released at the second site. Under these circumstances, the distance between the T-tracts is likely to be the size of the transcription bubble plus the distance that RNAP leaps. Using the footprinting data for the E. coli and pol II systems as a reference, we estimate that this number should range between approximately 30 and 39 nts. Finally, if the two T-tracts are even further apart (> 40 nts) (Figure 3.19C), the RNAP should only detect one T-tract at a time and efficiently terminate transcription at the first T-tract. In this case, the presence of the second T-tract will not affect the termination efficiency of the first T-tract.
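The three spacing regimes of this hypothetical model can be written down as a simple lookup, which makes the boundaries explicit. The sketch below is only an illustration: the function name `classify_spacing` and its default thresholds are assumptions taken from the approximate values quoted above (bubble of up to 27 nt, intermediate window of about 30 to 39 nt). Distances of 28 to 29 nt, which the model does not assign, are reported as unassigned rather than guessed.

```python
def classify_spacing(distance_nt, bubble_max=27, leap_window=(30, 39)):
    """Map the distance between two T-tracts (nt) to a panel of Figure 3.19.

    The thresholds are the approximate values from the text, not measured
    constants for any archaeal RNAP.
    """
    if distance_nt < 0:
        raise ValueError("distance must be non-negative")
    low, high = leap_window
    if distance_nt <= bubble_max:
        return "A"  # both T-tracts read as a single termination signal
    if low <= distance_nt <= high:
        return "B"  # a fraction of RNAPs reads through to the second T-tract
    if distance_nt > high:
        return "C"  # first T-tract terminates efficiently on its own
    return "unassigned"  # gap between the bubble size and the leap window


PREDICTION = {
    "A": "most transcripts released at the first T-tract, the rest at the second",
    "B": "about 30% of RNAPs read through the first T-tract",
    "C": "second T-tract does not affect termination at the first",
}
```

For example, `classify_spacing(35)` returns `"B"`, the regime in which roughly 30% read-through of the first T-tract is predicted.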

Summary

Taken together, the results generated from this research provide a set of rules for archaeal termination signals. Consistent with the mosaic nature of their genomes, the Archaea preserve termination signals that resemble those used in the other two phylogenetic domains. Furthermore, our data suggest that the underlying mechanism of archaeal termination involves the recently described RNAP inchworming process. Further elucidation of the archaeal termination mechanism can build upon the knowledge obtained from this study and will require a well-defined in vitro transcription system.

[Figure 3.19 image. Labels recoverable from the panels: RNAP footprint ~50 bp; transcription bubble ~18-27 bp; non-template strand; template strand; RNA polymerase. (A) D < 27 bp. (B) D = ~30-39; T1; 60%, 30%; RNAP translocates after inchworming. (C) D > ~39 bp; T1 > 90%, T2 < 10%; RNAP translocates after inchworming.]

Figure 3.19: The influence of spacing distances on the termination efficiencies of T-tracts. Panels A, B, and C illustrate three conditions in which the distances (D) between two T-tracts on the non-template strand are different. T1, first T-tract; T2, second T-tract. The predicted termination efficiency of each T-tract is indicated. Detailed discussion of this figure is presented in the text.

CHAPTER 4

IDENTIFICATION OF AN H. VOLCANII HEAT SHOCK GENE ENCODING A MEMBER OF THE CCT FAMILY

Introduction

The basal transcription machinery in Archaea has a number of eukaryal-like features, namely a complex multisubunit RNAP, a pol II-like core promoter element, and the eukaryal-like transcription factors TBP and TFB (Langer et al., 1995; Thomm, 1996; Zillig et al., 1993). Although there appears to be a close relationship between the archaeal and eukaryal transcription systems, our understanding of how Archaea regulate gene expression is very limited. To address this issue, we chose to use the heat shock (HS) or stress response as a model for studying gene regulation in Archaea.

The HS response is an ideal system for studying gene regulation in Archaea, since this regulatory response is universally conserved, and the molecular mechanisms of this response have been extensively studied in both Bacteria and Eukarya (Mager and De Kruijff, 1995). In these organisms, the expression of HS response genes is controlled at the level of transcription. In E. coli, an alternative sigma factor, σ32, is induced during heat shock and directs RNA polymerase to heat shock promoters (Bukau, 1993; Georgopoulos et al., 1994; Yura et al., 1993). In eukaryal cells, HS gene expression is modulated by the interactions between the heat shock transcription factor (HSF) and the basal transcription machinery (Fernandes et al., 1994; Wu, 1995). In each case, the regulatory schemes follow the general regulatory paradigms for these organisms. We reasoned that regulation of heat shock genes in the Archaea would also reflect the general regulatory scheme in these organisms. We chose H. volcanii, a halophilic archaeon, as a model since this organism has been shown to have an HS response (Daniels et al., 1984), and 7 potential HS response loci have been identified in global gene expression studies (Trieselmann and Charlebois, 1992). This organism also offers the advantage of having a genetic exchange system (Nieuwlandt and Daniels, 1990) and an available vector system for in vivo gene expression studies (Palmer and Daniels, 1994).

We examined cosmid A199, which was shown (Trieselmann and Charlebois, 1992) to carry one of the seven HS-responsive loci in H. volcanii. Sequence analysis of this locus revealed a protein sequence that shares high sequence identity with other archaeal CCTs (44% to 62%) and eukaryal TCP-1 proteins (approximately 35%) and only moderate similarity (16% to 19%) to bacterial GroEL and eukaryal Hsp60 proteins. Analysis of the cct1 gene transcript, its transcription pattern, and its protein product (by means of overexpression) was also performed in this study. Identification of the regulatory sequence element(s) within the cct1 5’ flanking region and the underlying induction mechanism is currently being investigated by another laboratory member (Thompson, unpublished data).
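The identity percentages quoted above are pairwise comparisons of aligned protein sequences. As a minimal sketch of how such a value can be obtained from two rows of an alignment, the hypothetical function below counts identical residues over the columns in which neither sequence has a gap. This denominator is an assumption; alignment programs differ on this point, so values computed this way need not reproduce the figures reported here. The example strings in the usage note are invented for illustration and are not CCT sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For the invented rows `"MKV-LA"` and `"MRVQLA"`, five columns are compared and four match, so the function returns 80.0.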

Results

Cloning and Sequencing of an H. volcanii Heat Shock (HS) Gene

Using radiolabeled cDNAs derived from H. volcanii DS2 total RNAs and a minimal set of overlapping cosmid clones (Charlebois et al., 1991), Trieselmann and Charlebois identified 7 heat shock-responsive loci (Trieselmann and Charlebois, 1992). We obtained cosmid A199 from R. L. Charlebois. This cosmid carried a 34-kbp MluI partial fragment of the H. volcanii DS2 genome, including two potential heat shock loci located on 4.1 kb and 2.2 kb MluI fragments (Charlebois et al., 1989; Trieselmann and Charlebois, 1992). Figure 4.1A presents a partial restriction map of the DNA insert present in cosmid A199, based on the preliminary data of Trieselmann and Charlebois and on our restriction analysis. To identify the HS gene-containing DNA fragments from cosmid A199, we first digested the cosmid clone with MluI restriction endonuclease, which generated 12 detectable fragments (Figure 4.1B). Based on the preliminary data, the HS loci were present in the 4 kb and 2 kb size ranges. Two DNA windows in the agarose gel, one containing fragments about 4 kb in size and the other containing fragments about 2 kb in size (Figure 4.1B), were purified using the Prep-A-Gene™ procedure, radiolabeled by the random primer-labeling method, and used to probe Northern blots containing RNAs isolated from H. volcanii cells grown under normal growth conditions or challenged by heat shock. Figure 4.1C shows that the probes (five MluI fragments) derived from the mixture of the 4 kb and 2 kb windows hybridized to a

Figure 4.1: Identification of MluI fragments from cosmid A199 encoding an H. volcanii heat shock (HS) responsive gene, cct1. (A) A partial restriction map of cosmid A199. Cosmid A199 contains a 34 kb MluI partial fragment of H. volcanii genomic DNA (Charlebois et al., 1991). The location of the cct1 gene is shown by the solid rectangular bar, and the direction of its transcription is indicated by an arrow. (B) MluI digestion of cosmid A199. Lane 1: HindIII-digested λ DNA molecular weight markers. Lane 2: cosmid A199 digested with MluI. The two windows of ~4 kb and ~2 kb DNA fragments used as the probes for Northern hybridization are indicated at the right. (C) Northern analysis of H. volcanii total RNA isolated from cells grown at 37°C (lane 1) or challenged at 60°C for 30 min (lane 2), 60 min (lane 3), or 75 min (lane 4). A mixture of the five MluI restriction fragments isolated from the 2 kb and 4 kb windows (B) was used as the probe.

[Figure 4.1 image. (A) Map of the H. volcanii cct1 region with BamHI, PstI, and HindIII sites and a 5 kb scale bar. (B) and (C) Gel and blot lanes 1 and 2, with the 4 kbp and 2 kbp windows and a 1700 nt band marked.]

1.7 kb heat shock-specific RNA that exists at a low level under physiological conditions and increases in abundance upon shifting the cells to 60°C.

The five MluI fragments were then blunt-ended and cloned into the SmaI site of pUC19. To determine which clones carried the HS gene-containing fragments, we labeled the cloned DNA fragments by random-primer labeling and used each as a probe in separate Northern hybridizations. Two clones, designated HS5 and HS21, were found to hybridize to the same heat shock RNA (see Figure 4.2A and 4.2B, respectively). HS5 contained a 2.3 kb DNA insert with unique HindIII and XhoI restriction sites. HS21 contained a 4.1 kb fragment with unique PstI, XhoI, and NruI restriction sites. The sizes of these two DNA fragments were consistent with the data reported by Trieselmann and Charlebois (1992).

Initial sequencing of HS5 and HS21 revealed that these two clones contained an ORF with a deduced amino acid sequence that shared similarity with S. shibatae TF55 (now TF55β) (Trent et al., 1991) and eukaryal TCP-1 proteins (CCTs) (Kubota et al., 1994), both members of the CCT chaperonin family (Horwich and Willison, 1993; Willison and Kubota, 1994). HS5 appeared to encode the N-terminal half of the protein, whereas HS21 contained the C-terminal half (Figure 4.2B). Overlapping subclones (Figure 4.3 and Table 4.1) were generated from HS5, HS21, and cosmid A199, and the complete sequence of the gene and its flanking regions was determined (Figure 4.3 and Figure 4.4). Based on the similarity of this gene to the CCT proteins, this gene was named cct1. Furthermore, deduced from the sequence information and restriction

Figure 4.2: Identification of H. volcanii HS gene-containing subclones from cosmid A199. (A) Northern analysis. Total RNAs were isolated from H. volcanii cells grown at 37°C (lane 1) and heat shocked at 60°C for 30 min (lane 2) or 75 min (lane 3). The DNA probes used were radiolabeled HS5 (left panel) and HS21 (right panel). RNA molecular weight markers are indicated on the left. (B) Restriction map of HS5 and HS21. HS5 contained a 2.3 kb fragment and HS21 carried a 4.1 kb fragment (from cosmid A199) inserted into the SmaI site of pUC19. The cct1 gene region, identified by sequencing, is indicated by the solid bar with the arrow indicating the direction of transcription. Restriction enzymes are B, BamHI; E, EcoRI; H, HindIII; K, KpnI; N, NruI; P, PstI; S, SstI; X, XhoI.

[Figure 4.2 image. (A) Northern blots with a marker at 9.49 and the 1700 nt band indicated. (B) Restriction maps of HS5 (2.3 kb) and HS21 (4.1 kb) in pUC19 (2.7 kb), with a 1 kb scale bar.]

Figure 4.3: Subcloning and sequencing strategy for the H. volcanii cct1 gene. The solid bar represents the cct1 gene coding region. Selected restriction sites are indicated: H, HindIII; M, MluI; Ps, PstI; Pv, PvuII; Sa, SalI; Ss, SstI; X, XhoI. Slashed bars (above the gene) indicate the subclones constructed, and their numbers correspond to those listed in Table 4.1. Regions of DNA sequence information obtained from each sequencing reaction are shown as arrows.

[Figure 4.3 image: map of the cct1 region with numbered subclone bars, restriction sites, 100 bp and 1.7 kb scale markers, and sequencing arrows; individual labels are not legible in this copy.]

Number   Name            Insert Description
1        8PP             8 kb PstI-PstI from cosmid A199
2        5PB             5 kb PstI-BamHI from cosmid A199
3        HS5             2.3 kb MluI partial HS gene-containing fragment
4        HS21            4.1 kb MluI partial HS gene-containing fragment
5        HS5.S1          1.5 kb SalI fragment from HS5
6        HS5.S4          0.38 kb SalI fragment from HS5
7        HS5.S1.SH       0.5 kb [illegible]-HindIII fragment from HS5S1
8        HS5S4-XS        HS5S4 with XhoI-SstI removed
9        HS5.HP          HindIII-PstI of HS5
10       8PP-XX          8PP with XhoI fragment removed
11       HS21.PvPv       PvuII fragment from HS21
12       HS21.Pvpv-Ps    HS21.PvPv with PstI-PstI removed
13       HS21.Pvpv-Ss    HS21.PvPv with SstI-SstI removed
14       HS21.S1         0.18 kb SalI fragment from HS21
15       HS21.S2         0.31 kb fragment from HS21
16       HS21.S3         0.25 kb fragment from HS21
17       HS21.S4         0.42 kb fragment from HS21

Table 4.1: List of H. volcanii heat shock gene subclones

analysis of cct1, the location of the cct1 gene on cosmid A199 was identified and is shown in Figure 4.1A.

Analysis of the cct1 Gene

The nucleotide sequence of the cct1 gene and its flanking regions is shown in Figure 4.4. The sequence contains a 1680-bp ORF starting with the ATG codon at position 238 and ending with the stop codon, TGA, at position 1918. The nucleotide sequence 5’-TTTATA-3’, which matches the consensus sequence for the archaeal TATA promoter element, 5’-T/CTTAT/AA-3’ (Palmer and Daniels, 1995; Zillig et al., 1993), was identified 32 to 27 bp upstream from the ATG start codon. As in many archaeal protein-coding genes, no ribosome binding sequence was apparent in the sequence immediately 5’ of the start codon. In addition, a T6-tract with adjacent GC residues, features that fit the description of archaeal transcription terminators (Zillig et al., 1993) and that were shown in this work to function as a terminator, was found 74 bp downstream from the stop codon. Based on the sequence information, the expected size of such a transcript would be approximately 1760 nts, which is consistent with the Northern blot analysis (Figure 4.1 and 4.2).
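The coordinates given above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `orf_summary` is hypothetical, and it assumes that position 1918 is the first nucleotide of the stop codon, as in the numbering of Figure 4.4. The transcript length it returns runs from the start codon to the beginning of the T6-tract and ignores any 5’ leader, so it is a lower bound on the transcript size.

```python
ATG_START = 238         # first nucleotide of the ATG start codon
STOP_START = 1918       # first nucleotide of the stop codon (assumed)
TERMINATOR_OFFSET = 74  # bp between the stop codon and the T6-tract


def orf_summary(atg_start, stop_start, terminator_offset):
    """Return (coding length in nt, number of codons, minimal transcript nt)."""
    coding_nt = stop_start - atg_start
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coordinates do not define an in-frame ORF")
    codons = coding_nt // 3
    # Coding region + 3-nt stop codon + spacer up to the T-tract.
    minimal_transcript_nt = coding_nt + 3 + terminator_offset
    return coding_nt, codons, minimal_transcript_nt


coding_nt, codons, minimal_transcript_nt = orf_summary(
    ATG_START, STOP_START, TERMINATOR_OFFSET
)
```

With the values above, this gives a 1680-nt coding region, 560 codons, and a minimal transcript of 1757 nt, in line with the estimate of approximately 1760 nts.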

The presumed CCT1 protein contains 560 amino acids, giving the protein an estimated molecular weight of 59 kDa. In addition, based on the amino acid sequence, CCT1 is very acidic, similar to other characterized halobacterial proteins, having a calculated pI of 3.87. Figure 4.5 shows an amino acid sequence alignment of the H. volcanii CCT1, human TCP-1 (a representative of the eukaryal CCT protein family), and

Figure 4.4: Nucleotide sequence of the H. volcanii cct1 gene and its deduced amino acid sequence. The complete nucleotide sequence of the H. volcanii cct1 gene, including 5’ and 3’ flanking sequences, is shown. The amino acid residues in the coding region are indicated under the corresponding nucleotides. The mapped transcription start site is indicated with an asterisk (*), and the experimentally determined transcription termination sites are marked with bullets. The consensus TATA element is boxed, and two sets of direct repeats are indicated by arrows. The underlined sequence is complementary to the anti-sense oligonucleotide, HSPE, used in the primer extension and transcription induction studies. The amino acids in the putative ATP-binding pocket (Kim et al., 1994) are shown in bold print.

CGACAGAACAACTGAGACGCCACCGCCGACGAGTCCACCCCGACGCGACACGGCACTCGC 60

CCGCATCGTCTTTTCACCGCACCCGCTTCGGGACTGGCGAGACCGCCCGGCGAAAGCCAC 120

AGATAAACCGCCGAAATCGACCGCCGTCTCCGGGTCTCGACCACCCCACAGTGTTGCGTT 180

TTGGCATACCAATAAACGAAGCTTT|rTTATAbAATCACAAACAATCAGGCGAT$ACTATG 240 • - • ► - • -► * M

AGCCAGCGAATGCAGCAGGGTCAGCCCATGATCATTCTGGGCGAAGACTCCCAGCGCACA 300 SQRMQQGQPMIILGEDSQRT

TCCGGACAGGATGCGCAGTCGATGAACATCACGGCCGGGAAGGCCGTCGCAGAGGCCGTA 360 SGQDAQSMNITAGKAVAEAV

CGCACGACGCTCGGCCCCAAAGGGATGGACAAGATGCTCGTCGACTCCGGCGGGCAGGTC 420 RTTLGPKGMDKMLVDSGGQV

GTCGTCACGAACGACGGCGTCACCATCCTCAAGGAGATGGACATCGACCACCCCGCGGCC 480 VVTNDGVTILKEMDIDHPAA

AACATGATCGTCGAAGTCTCCGAGACCCAGGAGGACGAGGTCGGAGACGGCACGACGACG 540 NMIVEVSETQEDEVGDGTTT

GCCGTCATCAACGCCGGTGAACTCCTCGACCAGGCCGAGGACCTCCTCGACTCCGACGTC 600 AVINAGELLDQAEDLLDSDV

CACGCGACGACCATCGCGCAGGGCTACCGCCAGGCCGCCGAGAAGGCCAAGGAAGTCCTC 660 HATTIAQGYRQAAEKAKEVL

GAGGACAACGCCATCGAGGTCACGGAGGACGACCGCGAGACCCTCACGAAGATCGCCGCA 720 EDNAIEVTEDDRETLTKIAA

ACGGCGATGACCGGTAAGGGCGCGGAGTCCGCGAAGGACCTGCTCTCCGAACTCGTCGTC 780 TAMTGKGAESAKDLLSELVV

GACGCCGTGCTGGCCGTCAAGGACGACGACGGCATCGACACGAACAACGTCTCCATCGAG 840 DAVLAVKDDDGIDTNNVSIE

AAGGTCGTCGGCGGCACCATCGACAACTCCGAACTCGTCGAGGGCGTCATCGTCGACAAG 900 KVVGGTIDNSELVEGVIVDK

GAACGCGTCGACGAGAACATGCCCTACGCCGTCGAGGACGCGAACATCGCCATCCTCGAC 960 ERVDENMPYAVEDANIAILD

GACGCGCTGGAAGTCCGCGAGACCGAAATCGACGCGGAAGTCAACGTCACGGACCCCGAC 1020 DALEVRETEIDAEVNVTDPD

CAGCTTCAGCAGTTCCTCGACCAGGAAGAAAAGCAGCTGAAAGAGATGGTCGACCAACTC 1080 QLQQFLDQEEKQLKEMVDQL


GTCGAGGTCGGCGCTGACGCCGTCTTCGTCGGTGACGGCATCGACGACATGGCGCAGCAC 1140 VEVGADAVFVGDGIDDMAQH

TACCTCGCCAAGGAGGGCATCCTCGCGGTCCGCCGCGCCAAGTCCTCCGACCTCAAGCGT 1200 YLAKEGILAVRRAKSSDLKR

CTCGCCCGCGCGACGGGCGGCCGCGTCGTCAGCAGTCTCGACGACATCGAGGCCGACGAC 1260 LARATGGRVVSSLDDIEADD

CTCGGCTTCGCCGGCTCCGTCGGACAGAAGGACGTCGGCGGCGACGAGCGCATCTTCGTC 1320 LGFAGSVGQKDVGGDERIFV

GAGGACGTCGAGGACGCCAAGTCCGTCACGCTCATCCTCCGCGGTGGCACGGAACACGTC 1380 EDVEDAKSVTLILRGGTEHV

GTCGACGAGCTCGAACGCGCCATCGAGGACTCCCTCGGCGTCGTCCGCACGACGCTCGAA 1440 VDELERAIEDSLGVVRTTLE

GACGGCAAGGTCCTCCCCGGCGGCGGCGCTCCCGAGACGGAGCTCTCCCTGCAGCTCCGC 1500 DGKVLPGGGAPETELSLQLR

GAGTTCGCTGACTCCGTCGGCGGCCGCGAGCAGCTCGCCGTCGAGGCGTTCGCCGAGGCG 1560 EFADSVGGREQLAVEAFAEA

CTGGACATCATCCCGCGCACCCTCGCCGAGAACGCCGGTCTCGACCCCATCGACTCCCTC 1620 LDIIPRTLAENAGLDPIDSL

GTCGACCTGCGCTCGCGCCACGACGGCGGCGAGTTCGCAGCCGGCCTCGACGCCTACACG 1680 VDLRSRHDGGEFAAGLDAYT

GGCGAGGTCATCGACATGGAAGAAGAGGGCGTCGTGGAGCCCCTCCGCGTCAAGACCCAG 1740 GEVIDMEEEGVVEPLRVKTQ

GCTATCGAGTCCGCGACCGAAGCCGCAGTCATGATTCTCCGCATCGACGACGTCATCGCG 1800 AIESATEAAVMILRIDDVIA

GCTGGCGACCTCTCCGGTGGCCAGACCGGCAGCGACGACGACGACGGCGGAGCACCCGGC 1860 AGDLSGGQTGSDDDDGGAPG

GGCATGGGCGGCGGCATGGGTGGCATGGGCGGTATGGGCGGCATGGGCGGCGCAATGTGA 1920 GMGGGMGGMGGMGGMGGAM

GCGTCCCAGTAGGGCACCATACACCCACACCCTGATTCGAATCGACTCGACTCGCTCGGC 1980

CAAATTCACGAC g M Î W t GCGGCTTTACACCCGTTAGCAGCTGCTTTGCCCGCGGACGC 2040

CCGGTTTGGACGCCGTTAATCGAGCGCCCGT 2071

Figure 4.5: Protein sequence alignment of H. volcanii CCT1 and other CCTs. All gene sequences were obtained from NCBI GenBank, and the alignment was performed using the Clustal method of the DNASTAR Lasergene programs. Sequence abbreviations and gene accession numbers are as follows: Hum TCP1, human TCP1 (P17987); Mkan therm, M. kandleri thermosome (Z50745); Pocc therm, Pyrococcus sp. (KOD1) thermosome (D29672); Desulf HHSP, Desulfurococcus 57 HHSP (NCBI gibbsq 170854); Tacid thermA, T. acidophilum thermosome α (Z46649); Tacid thermB, T. acidophilum thermosome β (Z46650); Ssh TF55A, S. shibatae TF55α (P28488); Ssh TF55B, S. shibatae TF55β (P46219); Hvol CCT1, H. volcanii CCT1 (this study); Hvol CCT2, H. volcanii CCT2 (Kuo et al., 1997).

[Figure 4.5: multiple sequence alignment, printed in consecutive blocks over two pages. Rows in each block, top to bottom: Hvol CCT1, Hvol CCT2, Mkan therm, Desulf HHSP, Pocc therm, Tacid thermA, Tacid thermB, Ssh TF55A, Ssh TF55B, Hum TCP1.]

other known archaeal CCTs. The extensive sequence identity between this H. volcanii protein and other CCT members occurs throughout the entire protein, except at the extreme carboxy terminus. However, the sequence conservation between the eukaryal CCT and these archaeal CCTs is most striking at the N- and C-terminal regions. These highly conserved regions correspond to the equatorial domains of E. coli GroEL (residues 1-133 and 409-523) (Kim et al., 1994), which are responsible for subunit assembly and for ATP binding and hydrolysis (Braig et al., 1994). One of the unique features of the CCT1 sequence is the five GGM/GGGM repeats situated at its carboxy terminus. We searched the NCBI Genbank database with this sequence motif and found that this repeat, while generally absent in CCT proteins, occurs in many Hsp60/GroEL proteins (Gupta, 1995; Picketts et al., 1989). The functional role of this element is presently unknown: its deletion did not appear to affect the biological activity of E. coli GroEL in vitro (McLennan et al., 1993), although it has been suggested that the repeat might be involved in anchoring the protein in the membrane (Vodkin and Williams, 1988; Chitnis and Nelson, 1991).

The ATP-binding site and its surrounding structural sites, which are highly conserved in all CCTs and in members of the Hsp60/GroEL family (Kim et al., 1994), were also found in H. volcanii CCT1. The conserved GDGTT motif, located at amino acid residues 96 to 100, is the putative ATP-binding site (Kubota et al., 1994). Mutation of the aspartate within the same motif in GroEL abolishes its ATPase activity and prevents the release of polypeptide bound to GroEL (Braig et al., 1994). Two other conserved sequence elements, the hexapeptide VVTNDG (amino acid residues 61 to 66) and a triple glycine loop (amino acid residues 408 to 410), are adjacent to the ATP-binding site in the structural pocket. The VVTNDG motif of CCT1 does not perfectly match the hexapeptide consensus motif of eukaryal CCTs (TITXDG); however, replacement of the first threonine and/or the isoleucine with valine in this motif occurs in many archaeal CCTs (see Figure 4.5).

Table 4.2 summarizes the pairwise comparison of sequence similarities between the H. volcanii CCT1, some eukaryal CCTs, all known archaeal CCTs, and Hsp60s from bacteria and eukarya. The data show that H. volcanii CCT1 shares 43.8% to 61.5% similarity with archaeal CCTs (1 vs. 2 to 9) and approximately 35% similarity with all three eukaryal CCTs (1 vs. 10 to 12). The sequence conservation between the H. volcanii CCT1 and Hsp60s (1 vs. 13 to 15) is much lower (less than 19%), which is similar to the comparison between the other CCTs and Hsp60s (2 to 12 vs. 13 to 15). In addition, among all archaeal CCTs, the H. volcanii CCT1 was most similar to the M. kandleri thermosome (61.5%). Taken together, the sequence information suggests that the H. volcanii HS protein is a new member of the CCT family. This cct1 gene is the first chaperonin-encoding gene identified from a halophilic archaeon.
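A pairwise comparison of this kind can be illustrated with a minimal sketch. The function below computes a simple percent identity over the aligned, gap-free columns of two sequences. It is an assumption-laden illustration only: the similarity index reported by the Lasergene Clustal method may be computed differently, and the function name and the short aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both residues are identical.

    Illustrative only: this is not the similarity index used by Lasergene.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not taken from Figure 4.5).
example = percent_identity("GDGTTSV-A", "GDGTTTVLA")
print(example)
```

Columns containing a gap are excluded from the denominator here; other conventions (for example, counting gaps as mismatches) give different values for the same alignment.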

Phylogenetic Analyses

A phylogenetic tree of the CCTs and Hsp60s listed in Table 4.2 was constructed using the Clustal method of the Lasergene program (DNAStar, Inc.). As shown in Figure 4.6, CCTs diverged from Hsp60s as a separate group, and the CCT lineage is further branched into two sister groups: the eukaryal CCTs and the archaeal CCTs. Within the archaeal CCT branch, the H. volcanii CCT1 is clustered with all other euryarchaeotal CCTs and two crenarchaeotal CCTs [the P. occultum thermosome and the Desulfurococcus hyperthermophilic heat shock protein (HHSP)]. The other archaeal CCT group consists of the α and β subunits of S. shibatae TF55.

Percent Similarity (values above the diagonal)

       1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
 1   ---  57.8  61.5  54.9  54.2  50.1  50.3  43.8  46.0  34.9  35.6  34.5  17.2  18.8  16.4   1  Hvol CCT1
 2  39.1   ---  54.9  51.0  50.9  46.2  46.0  41.1  44.7  29.9  30.8  28.7  19.7  19.5  18.9   2  Hvol CCT2
 3  36.9  43.9   ---  67.5  67.7  57.2  56.5  50.6  53.8  32.5  37.6  34.5  18.2  20.6  18.0   3  Mkan therm
 4  43.6  47.3  30.9   ---  93.9  57.6  56.5  51.6  54.7  33.2  35.4  34.7  16.9  19.3  16.5   4  Desulf HHSP
 5  44.3  47.2  30.7   5.5   ---  58.2  58.4  52.4  55.5  36.4  35.8  34.1  16.8  19.5  17.2   5  Pocc therm
 6  48.2  51.2  41.3  40.7  40.4   ---  58.9  48.6  50.6  34.1  34.3  33.4  18.0  21.1  20.6   6  Tacid thermA
 7  48.4  52.7  42.6  43.0  41.6  39.9   ---  47.5  54.5  33.7  35.0  34.1  15.8  18.8  16.6   7  Tacid thermB
 8  52.4  55.5  45.8  45.1  44.3  48.5  49.8   ---  51.4  33.3  33.0  30.8  18.1  18.9  16.6   8  Ssh TF55A
 9  51.9  52.3  45.4  44.6  44.0  47.4  45.0  45.9   ---  34.6  33.6  30.4  16.4  18.2  16.1   9  Ssh TF55B
10  61.9  64.8  62.0  60.8  60.7  60.2  60.8  60.9  62.7   ---  65.0  60.3  15.1  17.1  14.4  10  Hum Tcp1
11  61.5  64.4  59.4  61.9  61.4  59.6  59.5  61.9  63.9  34.1   ---  59.4  15.0  16.0  15.4  11  Athal Tcp1
12  61.2  65.7  62.4  62.8  63.3  63.6  63.8  62.9  65.2  37.1  38.4   ---  13.0  15.1  14.7  12  Scere Tcp1
13  77.8  78.5  77.1  79.8  80.0  79.7  80.0  80.9  79.8  82.9  82.6  82.5   ---  60.1  49.5  13  Eco GroEL
14  76.5  77.3  75.2  77.7  77.5  77.0  76.5  78.1  79.1  80.1  81.4  81.9  38.1   ---  49.1  14  Bsub GroEL
15  79.4  80.5  78.8  78.1  78.1  78.4  80.9  80.2  80.5  82.2  82.9  82.3  49.0  49.0   ---  15  Hum Hsp60

Table 4.2: Pairwise comparisons of selected eukaryal, archaeal, and bacterial chaperonins using the Clustal method.

Genbank accession numbers: A. thaliana Tcp1, D11351; S. cerevisiae Tcp1, M21160; B. subtilis GroEL, B41884; E. coli GroEL, S56371; human Hsp60, M34664; others are listed in the Figure 4.5 legend.

[Figure 4.6: tree leaves, top to bottom: P. occultum thermosome; Desulfurococcus SY HHSP; M. kandleri thermosome; H. volcanii CCT1; H. volcanii CCT2; T. acidophilum thermosome α; T. acidophilum thermosome β; S. shibatae TF55α; S. shibatae TF55β; human TCP1; A. thaliana TCP1; S. cerevisiae TCP1; B. subtilis GroEL; E. coli GroEL; human Hsp60. Scale beneath the tree: 50, 40, 30, 20, 10.]

Figure 4.6: Phylogenetic tree of CCTs based on pairwise sequence comparison using the Clustal method. The distance between sequences can be measured by the scale beneath the tree.

Detection of an Additional cct-Related Gene in H. volcanii

It was not until the end of this study that the hetero-oligomeric nature of archaeal CCT complexes became known. The first archaeal CCT, TF55 (the only one identified at the time), was thought to be a homo-oligomeric complex (Trent et al., 1991). To date, all other purified archaeal CCTs, with the exception of the M. kandleri thermosome (a homo-oligomeric complex) (Andra et al., 1996), display a hetero-oligomeric structure of two stacked rings formed by two types of highly related subunits (51% for S. shibatae TF55; 59% for the T. acidophilum thermosome) (Kagawa et al., 1995; Knapp et al., 1994; Phipps et al., 1991; Waldmann et al., 1995.a; Waldmann et al., 1995.b; Waldmann et al., 1995.c).

To determine whether other cct-related gene(s) are present in H. volcanii, we performed Southern blot hybridization in which H. volcanii genomic DNA was digested with various restriction enzymes and probed with radiolabeled DNA encompassing the entire cct1 gene. Radiolabeled cct1 DNA hybridized to three MluI fragments (see lane 1 of Figure 4.7): the 4.1 kb and 2.3 kb fragments previously shown to carry the cct1 gene and an additional 3.3 kb fragment. When genomic DNA was digested with XhoI, two fragments were expected to show hybridization signals: a 2.0 kb fragment containing the carboxy terminus of the CCT1 ORF, and another fragment (size unknown) containing the amino terminus. However, in addition to the 2.0 kb fragment, two other fragments (approximately 18 kb and 11 kb in size) also hybridized to the probe (lane 2). In addition to the two cct1-containing SstI fragments (1.5 kb and 3.0 kb in size; see lane 3), we consistently detected two other signals corresponding to a 6.2 kb fragment and an 8.6 kb fragment (lane 3). Furthermore, two ApaI fragments (only one of which contained cct1) hybridized to the probe (lane 4). Together, these digests suggest the presence of at least one other cct-related gene in H. volcanii.

Figure 4.7: Southern analysis of H. volcanii genomic DNA. (A) Agarose gel electrophoresis of H. volcanii genomic DNA. Each lane contains approximately 15 µg of H. volcanii genomic DNA digested with MluI (lane 1), XhoI (lane 2), SstI (lane 3), and ApaI (lane 4). (B) Southern hybridization of H. volcanii genomic digests. A PCR product encompassing the entire cct1 gene was used as probe. λ-HindIII digestion products were used as size markers. Lane labels correspond to those in panel (A).

[Figure 4.7, panel (B): lanes 1 to 4; size markers (kb): 23, 9.4, 6.5, 4.3, 2.3, 2.0.]
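The reasoning behind the expected fragments can be sketched in code: if the sequence of a region is known, the positions of each recognition site predict which fragment sizes should appear. The sketch below is a hypothetical illustration, not the procedure used in this study. The recognition sequences for MluI, XhoI, SstI, and ApaI are the standard ones, but the function name and the toy sequence are assumptions, and the cut is placed at the start of each site for simplicity rather than at the true cleavage position.

```python
import re

# Standard recognition sequences of the four enzymes (all palindromic).
SITES = {
    "MluI": "ACGCGT",
    "XhoI": "CTCGAG",
    "SstI": "GAGCTC",
    "ApaI": "GGGCCC",
}


def fragment_lengths(seq: str, site: str) -> list:
    """Lengths of the linear fragments obtained by cutting at every site.

    Simplification: the cut is placed at the first base of the site.
    """
    seq = seq.upper()
    # A lookahead pattern also finds overlapping occurrences of the site.
    cuts = [m.start() for m in re.finditer("(?=" + re.escape(site) + ")", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


# Toy 23-nt sequence with two MluI sites (hypothetical, for illustration).
toy = "AAAA" + "ACGCGT" + "CCCCC" + "ACGCGT" + "GG"
print(fragment_lengths(toy, SITES["MluI"]))
print(fragment_lengths(toy, SITES["XhoI"]))
```

Fragments predicted from a known gene sequence that carry probe-complementary sequence are the expected signals; additional hybridizing bands of other sizes, such as those described above, point to a second related locus.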

Transcript Level of cct-Related Genes in Response to Environmental Stresses

To demonstrate that the level of cct1 mRNA is temperature-dependent, the 5'-end-labeled HSPE oligonucleotide (see Figure 4.4 for its complementary sequence in the gene) was used to probe total RNAs isolated from H. volcanii cells incubated at different temperatures (Figure 4.8). However, one complication arising from the realization that H. volcanii has multiple CCTs was the potential for cross-hybridization in the Northern analysis. We later checked the specificity of the HSPE oligonucleotide probe used in the induction studies (described below) by Southern analysis (data not shown). The result indicated that the HSPE oligonucleotide also hybridized to the additional 3.3 kb MluI fragment detected earlier in the Southern hybridization that used the entire cct1 gene as a probe. Recently, this second cct gene, named cct2, has been completely sequenced (Kuo et al., 1997), and its deduced amino acid sequence is shown in Figure 4.5. The regions of cct1 and cct2 complementary to the HSPE oligonucleotide share more than 90% sequence identity at the nucleotide level. Therefore, the signals we obtained in the following Northern analysis may represent two different transcripts.

A temperature-course experiment was conducted first to determine the temperature for maximum induction of the cct mRNA. In this experiment, cells growing at 37°C were subjected to heat stress at 45°C, 50°C, 55°C, 60°C, or 65°C for 45 min. Total RNA was isolated immediately using the RNeasy™ system and subjected to Northern analysis (Figure 4.8A). Induction became apparent when cells were challenged at 50°C, and the level of cct mRNA peaked between 55°C and 60°C, with an approximately 13-fold increase in HSPE-derived signal compared to the non-shock condition. The induction was calculated as the ratio of the total counts of cct-specific mRNA detected under thermal stress to the total counts of cct-specific mRNA detected under non-stress conditions (lane 1 in Figure 4.8A, B, and C). Since no appropriate internal RNA control for signal quantitation was available, we consistently loaded total RNA isolated from 1.2 ml of an H. volcanii culture (approximately 10 µg) per gel lane.
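The induction ratio described above amounts to a single division per lane. A minimal sketch follows, assuming hypothetical count values chosen only to illustrate the calculation; the function name and the numbers are not from this study.

```python
def fold_induction(stress_counts: float, control_counts: float) -> float:
    """Ratio of probe-specific counts under stress to counts in the non-stress lane."""
    if control_counts <= 0:
        raise ValueError("control counts must be positive")
    return stress_counts / control_counts


# Hypothetical total counts per lane; lane 1 is the non-stressed control.
control = 1500.0
lanes = {"lane 2": 1500.0, "lane 3": 7500.0, "lane 4": 19500.0}
for name, counts in lanes.items():
    print(name, fold_induction(counts, control))
```

Because no internal RNA control was available, a ratio of this kind is only as reliable as the equal loading of total RNA across lanes.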

Based on the results of the temperature-course study, we then conducted a time-course study at 60°C. The results of this experiment are shown in Figure 4.8B. The data indicate that the level of the H. volcanii cct transcripts increased dramatically after 30 minutes at 60°C and was still rising at 75 minutes. However, in a separate experiment, we observed that induction began to decline after 60 minutes at 60°C (data not shown). This inconsistency may reflect instability of heat shock gene transcripts, which complicates their isolation and detection.

In addition to the heat shock response, Daniels et al. (1984) examined the protein profiles of H. volcanii under salt shock conditions. They reported that lowering the salt concentration of the growth medium induced at least one unique protein and two other proteins (91 kDa and 79 kDa), which seemed to comigrate with two HSPs. To examine whether the level of the cct transcripts also responded to salt shock, we subjected H. volcanii cells (OD550 1.0) to reduced NaCl concentrations at 37°C for 60 minutes. Total RNA from these cells was analyzed by Northern analysis as in the temperature-course and time-course studies. Figure 4.8C shows that cct1 was also induced under lower salt conditions: 2.7-fold at 60% and 4.6-fold at 40% of the 2.2 M NaCl concentration. Therefore, CCT1 and/or CCT2 might be a general stress protein of H. volcanii that plays an important role in the survival of cells encountering environmental stresses.

Mapping the cct1 Transcription Start Site

To examine the possibility that different promoters in the cct1 gene were utilized under different physiological conditions, we performed primer extension analysis using total RNA extracted from H. volcanii cells that were not challenged, cells that were heat shocked at 60°C for 60 minutes, and cells that were salt shocked in complex medium containing 40% of the 2.2 M NaCl concentration. In order to clearly detect primer extension products under each condition, different amounts of RNA were used (the relative amounts were: control, 4; heat shock, 1; salt shock, 2). Therefore, the primer extension products could not be used for quantitation purposes.

Figure 4.8: Induction of the H. volcanii cct transcripts. Northern analyses of total RNA isolated from H. volcanii cells under stress conditions are shown. (A) Temperature-course study. H. volcanii cultures grown at 37°C (lane 1) were exposed to a 45 min temperature upshift at 45°C, 50°C, 55°C, 60°C, or 65°C (lanes 2-6, respectively). (B) Time-course study. An H. volcanii culture was heat-shocked at 60°C for 0, 5, 10, 15, 30, 45, 60, or 75 minutes (lanes 1-8, respectively). (C) Salt shock study. H. volcanii cells were challenged for 60 minutes with complex medium containing NaCl at 2.2 M (100%), 1.8 M (80%), 1.3 M (60%), or 0.9 M (40%) (lanes 1-4, respectively). End-labeled oligonucleotide HSPE was used as the probe. The sizes of the transcripts are indicated, and the fold induction for each condition is shown.

[Figure 4.8: a 1.7 kb transcript is marked in each panel. Fold induction by lane: (A) lanes 1-6: 1, 1, 5, 13, 12, 7; (B) lanes 1-8: 1, 1, 1, 1, 7, 11, 20, 24; (C) lanes 1-4: 1, 1, 3, 5.]

A discrete primer extension product in lanes 1, 2, and 3 of Figure 4.9 suggested that transcription of "cct1" initiated from the same guanine residue under non-stressed, heat shock, and salt shock conditions. However, since the synthetic antisense oligonucleotide HSPE was used as the primer in the cDNA synthesis, the cDNA again represents the product of both the cct1 and cct2 genes. Nevertheless, the transcription start sites of cct1 and cct2 used under both heat shock and non-heat shock conditions have been confirmed with constructs containing fusions of the cct promoters to a tRNAProM reporter gene (Thompson and Daniels, unpublished data). Since the distance between the region complementary to the HSPE primer and the mapped transcription start site is identical in the cct1 and cct2 genes, only one cDNA product was detected in our primer extension analysis.

The cct1 transcription start site, which is located 4 nucleotides upstream from the translational start codon (shown in Figures 4.5 and 4.9), lies in the boxB element (5'-TG-3') of the promoter. Based on the primer extension data, the TATA element of the cct1 promoter is the 5'-TTTATA-3' element at positions -28 to -23, which resembles the archaeal consensus promoter. Furthermore, the same promoter was utilized under both heat shock and non-heat shock conditions.
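The coordinate convention used for the promoter element (the start site is +1, the base immediately upstream is -1, and there is no position 0) can be made explicit with a short sketch. The function name and the toy sequence below are hypothetical; the toy sequence is constructed only so that the motif falls at -28 to -23 and is not the cct1 promoter sequence.

```python
def upstream_motif_positions(seq: str, motif: str, tss_index: int) -> list:
    """Return (first, last) coordinates of each motif occurrence upstream of the start site.

    tss_index is the 0-based index of the +1 nucleotide; upstream positions
    are negative and the coordinate system has no position 0.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        if start + len(motif) <= tss_index:  # keep fully upstream matches only
            first = start - tss_index
            hits.append((first, first + len(motif) - 1))
        start = seq.find(motif, start + 1)
    return hits


# Toy sequence: 4 nt, the motif, 22 nt of spacer, then the start-site G at index 32.
toy = "CCCC" + "TTTATA" + "C" * 22 + "GCCCATG"
print(upstream_motif_positions(toy, "TTTATA", tss_index=32))
```

A scan of this kind only locates candidate elements; assigning one of them as the functional TATA element rests on the mapped start site, as in the primer extension data above.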

Figure 4.9: Mapping the H. volcanii cct1 gene transcription start site. Different amounts of total RNA were used in the assays in order to obtain balanced signals. Lane 1 contained total RNA (approximately 30 to 40 µg) from 4 ml of a nonstressed H. volcanii culture; lane 2 contained total RNA (approximately 7.5 to 10 µg) from 1 ml of a heat-shocked culture (60 minutes at 60°C); lane 3 contained total RNA (approximately 15 to 20 µg) from 2 ml of a 40% salt-shocked culture. The HSPE oligonucleotide was used as the primer for both the extension and sequencing reactions. Sequences adjacent to the identified start site (marked by *) are shown.

[Figure 4.9: sequencing lanes labeled T, C, G, A.]

Mapping the cct1 Transcription Termination Site

The 3' terminus of the cct1 transcript was determined by S1 nuclease mapping. The DNA probe was prepared from a dsDNA fragment generated by PCR using the Hsendf and M13/pUC19 universal reverse (-48) primers (Table 2.2) and HS21S4 as the template (see Figure 4.3 and subclone 17 in Table 4.1). The resultant DNA was digested with the MspI restriction enzyme. This probe contained 172 bp complementary to the expected RNA and 79 bp of 3' sequence. The DNA:RNA hybridization was carried out at 53°C, and the S1 nuclease digestion was carried out at room temperature. Figure 4.10 shows the results of the S1 nuclease mapping. The major products were 137 to 141 nucleotides long and were derived specifically from the protected DNA:RNA hybrid. As predicted, the 3' terminus of the cct1 transcript lay within the T6 tract located 74 nucleotides downstream from the stop codon on the non-template strand (see Figure 4.5). Minor termination sites that correspond to RNAs of 167 and 155 nts represent termination within the downstream T-rich region and the vector sequence, respectively. Since the sequences of the cct1 and cct2 genes diverge significantly in the termination region, the 3' mapping of the cct1 transcript should not have been affected by the presence of the cct2 transcript.

Construction of an H. volcanii cct1 Gene-Overexpression Strain

In order to verify that CCT1 is indeed synthesized in H. volcanii, the cct1 gene was cloned into the E. coli-H. volcanii pWL shuttle vector and introduced into H. volcanii. By comparing the protein profiles of H. volcanii cells carrying the pWL plasmid with or without the cct1 gene, we anticipated that CCT1 synthesis could be detected. For this purpose, we constructed a new shuttle vector, designated pWL200. The scheme for constructing pWL200 and cloning cct1 is illustrated in Figure 4.11. A pWL204 vector created earlier in this laboratory was digested to remove the HindIII-EcoRI fragment containing the H. volcanii tRNALys promoter, and the multiple cloning region of pUC19 was ligated into the equivalent sites. In addition to the multiple cloning region, this new vector, pWL200, retains important features of pWL204, such as the mevinolin- and ampicillin-resistance markers and the origins of replication for E. coli and for H. volcanii (Lam and Doolittle, 1989).

Figure 4.10: S1 nuclease mapping of the 3' terminus of the cct1 transcript. Approximately 10 ng of probe DNA was used in each reaction, and the total RNA (approximately 10 µg) used in each hybridization reaction was isolated from 1.2 ml of H. volcanii culture challenged at 60°C for 60 minutes. Lanes 1-6 are controls. Lane 1 shows the 3'-[32P]-end-labeled DNA probe (251 nts). In lane 2, the labeled probe was resuspended in hybridization buffer and subjected to S1 nuclease digestion for 60 min at room temperature. In lanes 3 and 4, the labeled DNA probes were resuspended in hybridization buffer, boiled, chilled on ice to denature, and then treated with S1 nuclease for 30 or 60 minutes, respectively. Lanes 5 and 6 contain the DNA probe without RNA and were incubated with S1 nuclease for 30 and 60 min, respectively. Lanes 7, 8, and 9 contain both DNA and RNA; these samples were digested with S1 nuclease for 0, 30, and 60 min, respectively. A size ladder based on the sequencing reaction of a known DNA and the predicted size range of the protected products are indicated.

[Figure 4.10: lanes 1 to 9; size markings (nt): 251, 250, 230, 210, 190, 170, 155, 137-141, 135.]

The cct1 gene-containing DNA fragment was prepared by PCR using the primers HScomp1 and HScomp2 (see Table 2.2) and cosmid A199 as template DNA. Each primer contained a synthetic BamHI restriction site at its 5' end. The 2044-bp PCR product, which contained 237 bp of sequence upstream from the cct1 translational start codon, the coding region, and 104 bp of sequence downstream from the stop codon, was digested with BamHI and cloned into the equivalent site of pWL200. The resulting overexpression construct contained the entire cct1 gene under the control of its own promoter. Ten overexpression clones were identified in E. coli DH5α, and six of these were introduced into H. volcanii WFD11. The transcription and translation of these plasmid-borne cct1 HSOPs (heat shock overproducers) were examined.

[Figure 4.11 (schematic): pWL204 (10.4 kb; Mev and Amp markers; H. volcanii tRNALys promoter) undergoes a HindIII-EcoRI deletion and ligation to the multiple cloning region of pUC19, giving pWL200. The cct1 fragment is amplified by PCR with primers HSCOMP1 and HSCOMP2 from H. volcanii genomic DNA, digested with BamHI, and ligated into the BamHI site of pWL200, giving HSOP1 (12.5 kb). Subsequent steps: transform E. coli DH5α, screen by colony hybridization, and confirm by restriction analysis; transform E. coli JM110; transform H. volcanii WFD11.]

Figure 4.11: Construction and cloning of H. volcanii cct1 gene-overexpression strain.

Expression of cct1 in the H. volcanii Gene-Overexpression Strain

Figure 4.12 shows the SDS-PAGE protein profiles of cell lysates prepared from H. volcanii cells with and without HSOP. The HSOP strain produced a 79 kDa protein at elevated levels (lane 4). This protein was not detectable in cell lysates prepared from non-transformed cells (lanes 1 and 2) or from cells transformed with the pWL vector without the cct1 gene (lane 3). It is likely that this protein represents the product of the cct1 gene. The high level of the CCT1 protein in the HSOP strain under non-stress conditions was probably due to basal transcription ("leakiness") from the multiple copies of the cct1 gene carried by the plasmid. Interestingly, the CCT1 protein level in the HSOP strain did not increase under stress conditions (lane 5); this pattern was observed with all six HSOP strains examined (data not shown). Results from a preliminary Northern analysis (data not shown) of two HSOP strains indicated that H. volcanii cells carrying plasmid-borne copies of the cct1 gene exhibited a higher basal level of cct transcripts than non-transformed cells. The level of cct-related mRNA in the HSOP2-containing strains was further induced upon heat shock. Therefore, the cct1 gene carried by the pWL vector was expressed and inducible at the transcriptional level. Together, these data suggest that cct1 expression may be controlled at two different levels, namely transcription and translation. Perhaps transcriptional control ensures a positive response of cct1 expression to stress, while translational control holds the level of CCT1 protein at an elevated but tightly regulated state.

We also noted that the apparent molecular weight of the overexpressed protein in SDS-PAGE was approximately 79 kDa, about one-third larger than the molecular weight predicted from the amino acid sequence. This difference could result from post-translational modification of the protein, such as phosphorylation or glycosylation, or from the low electrophoretic mobility of acidic proteins (Spicher et al., 1992). The latter possibility appears more likely, since CCT1 appeared to be unstable and was frequently degraded into two smaller polypeptides with apparent molecular weights of 32.5 kDa and 25 kDa (data not shown), which have a total mass similar to the predicted mass of 59 kDa.
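The size comparison in this paragraph is simple arithmetic, sketched below using only the masses quoted in the text (79 kDa apparent, 59 kDa predicted, and fragments of 32.5 kDa and 25 kDa). The function name is hypothetical, and the percentage depends on rounding of the gel-estimated masses.

```python
def percent_larger(apparent_kda: float, predicted_kda: float) -> float:
    """Percent by which the apparent mass exceeds the predicted mass."""
    return 100.0 * (apparent_kda - predicted_kda) / predicted_kda


apparent = 79.0          # apparent mass on SDS-PAGE (kDa)
predicted = 59.0         # mass calculated from the deduced sequence (kDa)
fragments = [32.5, 25.0] # apparent masses of the two degradation products (kDa)

print(round(percent_larger(apparent, predicted), 1))  # excess of apparent over predicted mass
print(sum(fragments))                                 # combined mass of the two fragments
```

The combined fragment mass falls close to the predicted mass rather than to the apparent 79 kDa, which is the basis for favoring anomalous mobility over covalent modification.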

Discussion

Heat shock gene regulation is an ideal paradigm for studying transcription activation in Archaea for several reasons. First, the heat shock response is universal, and the induction of heat shock gene transcription is rapid and dramatic, which can facilitate the detection of its regulatory components. Second, a wealth of information on heat shock gene regulation, acquired from comprehensive studies conducted in the Bacteria and Eukarya, is readily available. For the purpose of understanding archaeal gene regulation, we have cloned and characterized a heat shock gene, cct1, from H. volcanii. The predicted protein product of cct1 is a member of a new chaperonin family called CCT, for chaperonin containing TCP1. This inducible gene is presently being used in our laboratory as a model for studying gene regulation in Archaea.

Figure 4.12: Expression of cct1 in an H. volcanii gene-overexpression strain. Each lane of the 10% SDS-PAGE gel was loaded with H. volcanii cell lysate containing approximately 30 µg of protein. Lanes 1 and 2 contained lysates from H. volcanii WFD11 under non-heat shock conditions and after heat shock at 60°C for 60 minutes, respectively. Lane 3 contained lysate from H. volcanii WFD11 carrying the pWL vector without the cct1 gene. Lanes 4 and 5 contained lysates from the HSOP2-containing strain under non-heat shock conditions and after heat shock at 60°C for 60 minutes, respectively. The masses of protein size markers are indicated.

[Figure 4.12: the 79 kDa band is marked.]

Identification of cct1 and Its Protein Product

To date, archaeal CCTs have been identified primarily in thermophiles. Most recently, a CCT homolog, called a thermosome, was purified and its gene sequenced from the methanogen M. kandleri (Andra et al., 1996). Comparative sequence analysis revealed that the product of the H. volcanii cct1 gene shares the highest protein sequence similarity with the M. kandleri thermosome (61.5%), 43.8% to 61.5% similarity with the CCTs from other thermophilic Archaea, and approximately 34% similarity with eukaryal CCTs. The H. volcanii cct1 gene represents the first description of a chaperonin-encoding gene from a halophilic archaeon.

Structural analyses of six archaeal CCT complexes, using electron microscopy, have shown that these archaeal CCTs are stacked double-ring complexes with 8 to 9 subunits per ring (Andra et al., 1996; Trent, 1996). Functional studies of TF55 complexes from S. shibatae (Trent et al., 1991) and S. solfataricus (Knapp et al., 1994) and thermosomes from P. occultum (Phipps et al., 1991) and T. acidophilum (Waldmann et al., 1995b) indicated that these CCT complexes possess chaperonin functions, such as ATPase activity, peptide binding, and protein folding (reviewed in Baross and Holden, 1996). Hence, archaeal CCTs, although they share only low primary sequence identities (16-18%) with HSP60/GroEL, are the functional homologs of HSP60/GroEL and are likely to retain chaperonin function. Given the likely indispensable biological roles that CCTs might have in protein folding, CCTs are probably as prevalent in Archaea as Hsp60 and GroEL are in the Eukarya and Bacteria, respectively.

Characterization of the Putative CCT1 Protein

Based on its deduced amino acid sequence, the calculated mass of CCT1 is 59 kDa. This size does not agree with the sizes of the heat shock proteins detected by [35S]methionine labeling of cellular proteins under heat shock conditions (Daniels et al., 1984). Daniels et al. demonstrated the increased synthesis of five HSPs with sizes of 98, 91, 85, 79, and 21 kDa, none of which had an apparent MW of 59 kDa. We detected the protein product of cct1 by overexpressing cct1 on a multicopy plasmid (pWL200) in H. volcanii and examining the protein profiles by SDS-PAGE. We found that the cct1 protein product had an apparent MW of 79 kDa, which matched the MW of one of the five HSPs reported by Daniels et al. (1984). Furthermore, Daniels and coworkers reported that the synthesis of this 79 kDa protein might also be induced in response to reduced salt concentration, which is consistent with our observation that transcription of cct1 is also salt shock-inducible (see below). The fact that CCT1 migrated at a higher apparent mass on SDS-PAGE than predicted from its primary sequence may be due to post-translational modification or, more likely, to the acidic nature of the protein. A similar situation was reported for the M. kandleri thermosome (Andra et al., 1996). The subunit comprising the M. kandleri thermosome is also an acidic polypeptide, with a pI of 4.27 and an estimated MW of 59 kDa, yet it migrated on SDS-PAGE as a 75 kDa protein. This lower mobility may be due to inefficient binding of SDS to acidic proteins (Kaufmann et al., 1984; Spicher et al., 1992).
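The size discrepancies discussed above can be put on a common scale with a short calculation. The following is a minimal sketch; the function name `percent_excess` is introduced here for illustration only, and the masses are the values quoted in the text.

```python
def percent_excess(apparent_kda, predicted_kda):
    """Return by how much the apparent mass exceeds the predicted mass,
    expressed as a percentage of the predicted mass."""
    return 100.0 * (apparent_kda - predicted_kda) / predicted_kda

# H. volcanii CCT1: 59 kDa predicted, 79 kDa apparent on SDS-PAGE.
cct1_excess = percent_excess(79.0, 59.0)

# M. kandleri thermosome subunit: 59 kDa estimated, 75 kDa apparent.
thermosome_excess = percent_excess(75.0, 59.0)

print(round(cct1_excess, 1))        # 33.9
print(round(thermosome_excess, 1))  # 27.1
```

Both acidic proteins run roughly 27% to 34% above their predicted masses, which is the sense in which the two cases are described as similar.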

Like other CCTs, H. volcanii CCT1 contains the conserved sequence motifs corresponding to the putative ATP-binding site and its surrounding sites within the pocket (Fenton et al., 1994; Kim et al., 1994). One unique and interesting feature of the H. volcanii CCT1 is the set of GGM repeats located at its extreme carboxy-terminus. This GGM motif has been suggested to play a role in anchoring the 60 kDa immunoreactive proteins of Coxiella burnetii (Vodkin and Williams, 1988) and Mycobacterium leprae (Gillis et al., 1985) to the membrane. Furthermore, GGM repeats are also present in a chaperone-encoding gene from a cyanobacterium, whose deduced protein sequence shares high sequence identity with members of a peripheral membrane protein family (Chitnis and Nelson, 1991). However, there is no direct evidence supporting this proposed role of the GGM repeats. Most CCTs lack these repeats and are soluble proteins (Kim et al., 1994). We have not examined the subcellular location of the H. volcanii CCT1 protein.

Regulated Gene Expression of H. volcanii cct1 and Its Related Genes

The discovery of a second cct gene (cct2) brought the realization that the probe (HSPE) used in the Northern analysis for cct1 mRNA cross-hybridized with both cct transcripts. Therefore, the data obtained from Northern analysis must be reevaluated and reinterpreted as detection of the cct-related transcripts, namely cct1 and cct2. Under non-heat shock conditions, the cct transcripts exist at a low basal level; when cells were challenged with elevated temperatures of 50°C or higher for more than 30 minutes, large amounts of the cct transcripts began to accumulate in vivo. The level of cct transcripts appeared to peak between 55°C and 60°C, and as much as a 24-fold increase was detected in the time-course study conducted at 60°C. However, given the unstable nature of heat shock-specific mRNA, and without an appropriate internal control for RNA recovery, the fold induction should not be considered an absolute value.

The temperature-course and time-course studies of cct-related gene expression indicated that the cct transcript level responded to heat shock, which could be the outcome of transcription induction or an increase in RNA stability. Since no archaeal RNAP inhibitor is yet available, determining the in vivo decay rate of the cct mRNA remains difficult. However, independent studies of cct1/cct2 promoter and reporter gene fusions (Thompson and Daniels, unpublished data) have confirmed the heat shock inducibility of both cct genes and their induction at the level of transcription initiation.

Since halophilic Archaea grow in extreme salt environments, reduced salt concentrations are likely to impose an environmental stress on these organisms. In the Bacteria and Eukarya, osmoregulation, which is signaled by two-component signal transduction systems (Csonka and Hanson, 1991; Chang and Meyerowitz, 1994; Maeda et al., 1994; Mager and Varela, 1993; Ota and Varshavsky, 1993), occurs mostly at the transcriptional level (Csonka and Hanson, 1991; Maeda et al., 1994; Yu et al., 1995). In this study, we observed that transcription of the cct genes is also salt shock-responsive. The cct response to salt shock was less dramatic than the response to heat shock, possibly because cells need a longer response time to sense changes in environmental osmolarity. Alternatively, it may be that only one of the cct genes is salt shock-responsive. In addition to our findings, at least two other reports have described transcription induction triggered by lowered salt in halophilic Archaea. In H. mediterranei, two ORFs that were differentially expressed in response to medium salinity were identified, and their regulation may involve Z-DNA structure in the upstream regions (Mojica et al., 1993). In another study, Ferrer et al. (1996) prepared cDNA probes from salt-shocked total RNA and used them to probe the H. volcanii cosmid library (Charlebois et al., 1991). Nine regions of the H. volcanii genome were identified as being transcribed at elevated levels under dilute salt conditions (0.48 M NaCl). Surprisingly, none of the nine salt shock-responsive loci reported were located on cosmid A199, which contains the cct1 gene, or cosmid 268, which contains the cct2 gene (Kuo et al.). In our study, the salt shock Northern blot was probed specifically for the cct transcripts. Furthermore, the apparent MW of CCT1 on SDS-PAGE was consistent with one of the two proteins (79 kDa and 91 kDa) synthesized at higher levels upon salt shock and heat shock (Daniels et al., 1984). Therefore, the salt shock inducibility of cct is evident, and the cct transcripts might not have been well represented in the pool of cDNA probes prepared by Ferrer et al. (1996). Like other heat shock proteins, CCT1/CCT2 may play an important role in protein folding under normal physiological conditions as well as under environmental stress, thus qualifying as general stress proteins.

In addition to transcriptional induction, the data presented suggest the involvement of translational or post-translational repression in cct1 regulation. Although the cct1 transcript was induced in response to heat shock, cct1 showed little induction at the protein level in the overexpression strain (HSOP2). These data suggested that CCT1 was maintained at an elevated but steady-state level by a post-transcriptional regulatory mechanism. Here, we propose two potential models for the molecular mechanisms controlling cct1 expression. The first model involves a positive regulatory scheme in which heat-induced translation of the cct1 transcript requires the assistance of another factor. In such a case, extremely high levels of cct1 mRNA (HSOP2 under heat shock conditions) might saturate the factors required for its translation, resulting in inefficient translation of transcripts from both the chromosomal and plasmid copies. However, this hypothesis cannot account for the slight increase in CCT1 synthesis in wild-type H. volcanii cells under heat shock conditions. In a preliminary Northern analysis, we observed that, although the cct mRNA level in the wild-type cells was similar to that in the non-heat shocked HSOP strain, the CCT1 protein level in heat-shocked wild-type cells was significantly lower than that observed for the HSOP strain. This observation suggests that depletion of a positive regulatory factor did not occur.

The second model involves a negative regulatory scheme in which the cct1 mRNA or CCT1 protein might be the binding target of negative regulator(s) induced after heat shock. Binding of a negative modulator could arrest translation, introduce RNA instability, or subject the nascent CCT1 polypeptide to degradation. This second scheme resembles the translational repression model for rpoH mRNA in E. coli, in which DnaK/DnaJ/GrpE bind to the mRNA or its protein product and subsequently reduce the level of σ32 by means of translational repression and protein degradation (reviewed in Georgopoulos et al., 1994; Mager and De Kruijff, 1995; Yura et al., 1993). In such a model, non-transformed WFD11 cells would display high but controlled levels of CCT1 protein upon reaching a new steady state after heat shock. Although the cct1 genes in the non-heat shocked HSOP cells were transcribed only at a low level, transcripts derived from more than 10 copies of the plasmid-borne cct1 gene could be significant. Thus, in the absence of a heat-responsive repressor, the cct1 transcript from non-shocked HSOP cells was translated uncontrollably or, alternatively, the protein product remained remarkably stable in vivo. Upon heat shock, however, the translational repressor became available, prevented transcripts from being translated, and possibly targeted CCT1 for degradation. Consequently, CCT1 protein levels in HSOP cells under physiological and heat-shock conditions displayed little change. The second model appears more plausible, especially in view of our preliminary observation.

Transcriptional Signals of cct1

Although identification of the transcription start site of the cct1 gene by primer extension analysis was complicated by the presence of the cct2 gene, our result is confirmed by the promoter fusion studies (Thompson and Daniels, unpublished data). Under both physiological and stress conditions, transcription of the cct1 gene initiated from the same guanine residue within the pyrimidine-purine dinucleotide located just 4 bp upstream from the putative translational start codon. Approximately 28 to 23 nucleotides upstream from the mapped transcription start site, we identified a putative TATA element (5’-TTTATA-3’) matching the archaeal boxA consensus (Palmer and Daniels, 1995; Zillig et al., 1993). The presence of these archaeal consensus promoter elements suggests that expression of cct1 is not regulated by the heat shock gene regulation scheme used by E. coli, which involves an alternative σ-factor in the holoenzyme complex (Heimann, 1994; Mager and De Kruijff, 1995). In view of the similarities between the archaeal RNA polymerase and eukaryal RNA polymerase II, it is more likely that the regulation of cct1 resembles the eukaryal system (Lis and Wu, 1994; Mager and De Kruijff, 1995), in which the interaction between a trans-acting protein factor (HSF) and a cis-acting heat shock element (HSE) in the promoter region regulates the expression of heat shock genes in a positive fashion. Although no eukaryal HSE-like 5’-nGAAn-3’ inverted repeats were found in the upstream region of cct1, we observed five near-perfect 5’-ACCGCC-3’ repeats upstream from the TATA element and two pairs of overlapping direct repeats, 5’-AATCA-3’ and 5’-ACAA-3’, located between the TATA element and boxB (Figure 4.4). Based on the alignment of the upstream regions of the S. shibatae TF55α- and TF55β-encoding genes, Kagawa et al. (1995) detected three conserved regions 5’ to the ORFs of the two genes, two of which were located upstream from the transcription start site: the -90 region, which contains a pseudo-palindromic sequence, and the TATA element together with its 10 upstream nucleotides. However, we do not see a palindromic sequence architecture in the upstream region of the H. volcanii cct1, nor do the two S. shibatae CCT-encoding genes contain the sequence repeats we observed for cct1. Therefore, it would be premature to predict that any sequence element of cct1 has a regulatory role. Recent data from this laboratory have shown that the regulatory element controlling heat shock transcription initiation lies within the 4 nt located 5 nt upstream from boxA (Thompson and Daniels, personal communication), indicating a scheme for heat shock regulation that differs from the eukaryal and bacterial mechanisms.

The S1 nuclease digestion experiment mapped the 3’-terminus of the cct1 transcript to the first 5 thymidine residues in the T6-tract located 74 nucleotides downstream from the stop codon. In addition to the major protected products, two minor groups of products were also detected: one had a 3’-terminus corresponding to the region between the two shorter T-tracts following the T6-tract; the other had a 3’-terminus corresponding to a region in the plasmid sequence. These minor products are most likely due to “leaky” termination as a result of heat shock induction. Therefore, we predict that the terminator element for cct1 is the T6-tract, which is preceded by a 10-nucleotide AT-rich sequence and immediately flanked by GC residues. Using the mfold program (Zuker, 1994), no stable secondary structure was predicted in the 3’-terminus sequence of cct1, which encompasses the 74 nucleotides of the untranslated region, the terminator, and its downstream sequence. The ability of this sequence to direct transcription termination in vivo was verified in the termination studies described in Chapter Three of this dissertation.

Summary

Using preliminary data on global gene expression patterns in H. volcanii (Trieselmann and Charlebois, 1992), we have cloned and sequenced a heat shock-inducible gene. This gene encodes a relative of the CCT family of eukaryal chaperone proteins and is related to previously described thermofactors and thermosome proteins of other Archaea. This gene provides a workable model system for the detailed analysis of the molecular mechanisms of gene regulation in the halophilic Archaea, which is currently underway.

CHAPTER 5

CONCLUSION

The research described in this dissertation focused on two areas: the elucidation of transcription termination in vivo in H. volcanii and the development of a model system for studying regulated gene expression in the same organism. In the termination study, an in vivo termination assay module was developed and used to investigate the sequence and structural requirements for transcription termination in the Archaea. The results of this study define a set of criteria for the identification of transcription terminators in the halophilic Archaea and suggest the occurrence of an “inch-worming” mechanism for transcription termination. To study regulated gene expression in H. volcanii, we chose to examine the heat shock response as a potential model system and identified a cct-related gene. The cct1 gene was sequenced and its transcription induction pattern examined. The findings of each study are summarized in the following sections.

Archaeal Transcription Termination

The 3’ terminus of the yeast tRNAProM RNA under the transcriptional control of the H. volcanii tRNALys promoter was mapped within the T9-tract of the eukaryal pol III terminator. Placement of the pol III terminator between the H. volcanii tRNALys promoter and a reporter gene, tRNATrpO 16M-AGGAG, showed that the pol III termination element prematurely terminated tRNATrpO 16M-AGGAG transcription in vivo. Furthermore, the activity of the eukaryal pol III terminator was found to be orientation-dependent, requiring the T9-tract to be on the nontemplate strand. These results suggest that the transcription machinery in Archaea recognizes the pol III sequence element as a terminator.
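The orientation dependence described above can be illustrated by reverse-complementing a terminator element and checking which strand then carries the T-tract. The following is a minimal sketch: the helper names are introduced here for illustration, the pol III element is the 17-nt stretch around the T9-tract as it appears in the vector sequence of Appendix A, and the 25-nt Tract.F and TractR elements are copied from Appendix A under the assumption that the element ends where the vector BamHI site (GGATCC) begins.

```python
# Watson-Crick complement table for an uppercase DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def longest_run(seq, base):
    """Return the length of the longest uninterrupted run of `base`."""
    best = run = 0
    for ch in seq.upper():
        run = run + 1 if ch == base else 0
        best = max(best, run)
    return best

# Pol III terminator region as read on the nontemplate strand (Appendix A).
POL3_FORWARD = "AATAATTTTTTTTTGCC"
# Same element cloned in the opposite orientation.
POL3_REVERSED = reverse_complement(POL3_FORWARD)

# Forward and reverse constructs of one test element (Appendix A).
TRACT_F = "GCGTCTTTTTGGCATCTTTTTCATG"
TRACT_R = "CATGAAAAAGATGCCAAAAAGACGC"

print(longest_run(POL3_FORWARD, "T"))          # 9
print(longest_run(POL3_REVERSED, "T"))         # 2
print(longest_run(POL3_REVERSED, "A"))         # 9
print(reverse_complement(TRACT_F) == TRACT_R)  # True
```

In the reversed orientation the nontemplate strand carries an A9-tract instead of a T9-tract, which is the configuration reported as inactive.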

Taking advantage of the termination activity of the yeast pol III terminator, a new expression module, sptProM, was constructed for use as an in vivo termination assay. This new module contained (5’→3’) the H. volcanii tRNALys promoter, the yeast tRNAProM reporter gene, restriction enzyme sites for the introduction of putative terminators, and the yeast pol III terminator. Transcript mapping was performed to verify that transcription initiation and termination of the tRNAProM reporter gene were not altered in the new module. The sptProM module was then used to examine the sequence and structural requirements for H. volcanii terminators in vivo.

Analysis of the H. volcanii tbp2 and cct1 terminators suggested that these terminators function efficiently when their T-tracts are located on the nontemplate DNA strand. The sequence requirements within the T-tract terminator were defined by deletion and sequence-replacement mutagenesis of the tbp2 terminator. These results suggested that the minimal number of Ts in the T-tract is four. In addition, replacement of individual thymine residues within the T-tract with each of the other three possible nucleotides had different effects on termination efficiency. Guanine was found to be inhibitory at any position, and cytosine was acceptable at the T2- and T4-positions. Overall, three contiguous Ts within the T-tract motif are required, and no substitution is allowed at the T3-position. In addition to the T-tract itself, nucleotides immediately 3’ to the T-tract also appeared to influence termination to some degree. Comparison of the different termination sequences (wild type or mutant) used in this study indicated that, while the nucleotide requirements immediately 5’ to the T-tract were not stringent, there was a preference for G or C residues adjacent to the 3’ end of the T-tract. Therefore, archaeal T-tract terminators resemble eukaryal pol III terminators at the nucleotide sequence level.
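Two of the criteria summarized above, a run of at least four Ts on the nontemplate strand and a preference for G or C immediately 3’ of the run, lend themselves to a simple sequence scan. The following is a minimal sketch, not the procedure used in this study: the function name and the returned fields are hypothetical, the position-specific substitution rules are deliberately not encoded, and the test sequences are the tbp2 element and the pol III region as listed in Appendix A.

```python
import re

def find_t_tracts(nontemplate, min_len=4):
    """Scan a nontemplate-strand sequence for runs of at least `min_len` Ts.

    Returns one dict per run with its 0-based start, its length, the base
    immediately 3' of the run, and whether that base is G or C.
    """
    seq = nontemplate.upper()
    hits = []
    for match in re.finditer("T{%d,}" % min_len, seq):
        next_base = seq[match.end():match.end() + 1]  # "" at sequence end
        hits.append({
            "start": match.start(),
            "length": match.end() - match.start(),
            "next_base": next_base,
            "gc_flank": next_base in ("G", "C"),
        })
    return hits

# tbp2 terminator element (as in construct TBP2TsNew+16, Appendix A).
tbp2_hits = find_t_tracts("CCGATTTTTCGCGG")
# Pol III terminator region from the vector sequence (Appendix A).
pol3_hits = find_t_tracts("AATAATTTTTTTTTGCC")

print(tbp2_hits)
print(pol3_hits)
```

A hit from this scan is only a candidate: the flanking-spacing and secondary-structure effects described in the following paragraphs are outside the scope of the sketch.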

The requirements within the flanking regions of the T-tracts were investigated initially using 5’ and 3’ deletion mutagenesis. The deletion data suggested that, whereas the region 5’ of the T-tract was dispensable, deletion of the region downstream of the T-tract hindered termination activity. Extensive site-directed and block-replacement mutagenesis of the 3’ region of the tbp2 terminator indicated that there were no sequence requirements in this region. However, comparison of various deletion or insertion constructs suggested that bringing the pol III terminator into close proximity to the upstream terminator inhibited the termination activity of the first T-tract. Therefore, the 3’ region of the T-tract terminator is required for maintaining a spacing distance between the T-tracts of the putative terminator and the downstream pol III terminator. Such a situation would require RNAP to “look ahead” on the template during the elongation/termination process and would favor an RNAP inchworming model. In a termination process that involves RNAP inchworming, the presence of a proximal downstream T-tract might signal the RNAP to enter the inchworming cycle and prevent it from being released efficiently from the first T-tract.

Although the role of the T-tract in the termination process is not yet clear, we specifically examined two possibilities: inducing DNA curvature and providing a binding site for a protein “road-block”. Tests of two previously characterized bent DNAs indicated that not every bent DNA could direct termination. Therefore, the role of the T-tract is not simply to induce DNA curvature. Furthermore, of the two active terminators examined, only one could induce DNA curvature, suggesting that bent DNA is not a required feature of an archaeal T-tract terminator. Similarly, binding of a protein “road-block” (such as a transcription initiation complex) alone appears to be insufficient for directing termination. For example, the TATA sequence elements of the H. volcanii tRNALys promoter, which should be capable of recruiting the initiation complex, could not direct termination at the presumed termination site unless a T-tract was present.

In addition, the sequence context in which a T-tract resides can also influence its termination activity. A T-tract located in the loop of a stem-loop structure could not direct termination, possibly because RNAP or a protein factor cannot recognize a T-tract within a hairpin.

Results from studying the H. volcanii rRNA operon terminator suggest that Archaea also utilize bacterial ρ-independent-like termination systems. The 3’ region of the H. volcanii rRNA operon has a potential stem-loop structure followed by a T-rich region. Integrity of both the structure and sequence elements is required for termination function. Termination directed by the rRNA terminator yields an RNA transcript with a 3’ terminus within the region that includes the bottom of the stem and the T-tract immediately following. Such features resemble those described for the bacterial ρ-independent terminators. Therefore, Archaea are capable of utilizing at least two types of termination elements.
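The arrangement described above, a stem-loop immediately followed by a T-rich region, can be searched for with a short script. The following is a minimal sketch under stated assumptions: the function name, its default thresholds (`min_stem`, `min_loop`, `min_t`), and the returned fields are hypothetical choices, only perfect Watson-Crick stems ending directly at the first T-tract are considered, and the test input is the 28-nt element of construct TrpA.T from Appendix A rather than the rRNA operon sequence, which is not reproduced in this chapter.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def stem_before_t_tract(seq, min_stem=4, min_loop=3, min_t=4):
    """Find the longest perfect stem whose 3' arm ends at the first T-tract.

    Returns None when no T-tract or no qualifying stem is found.
    """
    seq = seq.upper()
    tract = re.search("T{%d,}" % min_t, seq)
    if tract is None:
        return None
    t_start = tract.start()
    # Try the longest possible 3' arm first, then shorter ones.
    for k in range(t_start // 2, min_stem - 1, -1):
        arm3 = seq[t_start - k:t_start]
        upstream_end = t_start - k - min_loop  # leave room for the loop
        if upstream_end < k:
            continue
        pos = seq.find(reverse_complement(arm3), 0, upstream_end)
        if pos != -1:
            return {
                "stem_length": k,
                "arm5_start": pos,
                "loop": seq[pos + k:t_start - k],
                "t_tract_start": t_start,
                "t_tract_length": tract.end() - t_start,
            }
    return None

# Element of construct TrpA.T (Appendix A), used here only as test input.
result = stem_before_t_tract("AGCCCGCCTAATGAGCGGGCTTTTTTTT")
print(result)
```

The sketch ignores G-U pairing, mismatches, and stems that extend into the T-tract, so a negative result does not rule out a structured terminator.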

To the best of our knowledge, the work presented here is the first archaeal termination study conducted in vivo. This research provides a more extensive definition of archaeal termination signals than the description based on in vitro studies (Thomm et al., 1994) and 3’-terminus transcript mapping (Brown et al., 1989). Furthermore, the data suggest a potential underlying mechanism for transcription termination in the Archaea. Elucidation of this mechanism will require the use of an in vitro transcription system, which is currently being explored in our laboratory.

Characterization of a Heat Shock Gene

A complete set of cosmids representing 96% of the H. volcanii genome was available, and a preliminary study reported that these cosmids carried a total of seven heat shock-responsive loci. We used restriction fragments from cosmid A199 as probes for Northern analysis and identified two MluI fragments carrying a heat shock gene. From the clones that contained these two MluI fragments, the cct1 gene was sequenced and characterized.

The cct1 gene encodes a 560 amino acid protein with an estimated MW of 59 kDa and a pI of 3.87. Comparative sequence analysis indicated that this protein shares 46% sequence similarity with Sulfolobus shibatae TF55β and 35% sequence similarity with human TCP-1. Southern analysis of the H. volcanii genome using the entire cct1 gene as a probe indicated the existence of another cct-related gene. Detection of the transcripts derived from the cct-related genes revealed increased transcript levels when cells were challenged with heat stress or salt shock. Time-course and temperature-course studies indicated that the level of the cct transcripts was maximally induced after 60 minutes at 60°C. Salt shock treatment with 1.7 M, 1.3 M, or 0.9 M NaCl (2.2 M NaCl is the physiological condition) for 60 minutes resulted in 1-, 3-, and 5-fold induction, respectively. The cct1 gene is monocistronic. Primer extension analysis indicated that the cct1 gene initiated transcription 25 bp downstream from a typical archaeal TATA promoter element, 5’-TTTATA-3’. This promoter was used for both basal and induced transcription. S1 mapping showed that the cct1 transcript terminated in a T-tract of six thymines, a common T-tract motif of archaeal terminators. The function of this potential termination element was verified in the in vivo termination project.

Introduction of the cct1 gene into H. volcanii on a multicopy plasmid showed that the CCT1 protein was expressed at high levels in both the absence and presence of heat shock, even though preliminary Northern analysis of the overexpression strains indicated an increase in the cct1 transcript level. Therefore, there might be a negative regulatory mechanism that tightly regulates the level of the CCT1 protein, either by affecting CCT1 synthesis or by directly affecting the protein itself. Regulation of cct1 gene expression is currently being investigated by another member of our laboratory (Thompson, D. K.).

In this study, a model system has been developed for investigating gene regulation in halophilic Archaea. Combined with our increased understanding of the archaeal basal transcription system, use of this heat shock gene model system should provide insight into the regulatory mechanisms governing archaeal gene expression.

APPENDIX A

LIST OF TRANSCRIPTION TERMINATION CONSTRUCTS

Tract.F GCGTCTTTTTGGCATCTTTTTCATGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTT GCCTATCTATAAAATTAAAGTAGCAGT

Tract.F+16 GCGTCTTTTTGGCATCTTTTTCATGGGATCCCCGGGTACCatcgtcgaagtcactgGAGC TCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

TractR CATGAAAAAGATGCCAAAAAGACGCGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTT GCCTATCTATAAAATTAAAGTAGCAGT

TractR+16 CATGAAAAAGATGCCAAAAAGACGCGGATCCCCGGGTACCatcgtcgaagtcactgGAGC TCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

A3T3.F AAATTTGTCCGAAATTACTGAGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCT ATCTATAAAATTAAAGTAGCAGT

A3T3.F+16 AAATTTGTCCGAAATTACTGAGGATCCCCGGGTACCatcgtcgaagtcactgGAGCTCAA TAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

A3T3.R TCAGTAATTTCGGACAAATTTGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCT ATCTATAAAATTAAAGTAGCAGT

Bold print indicates termination elements. Plain print indicates vector sequences, and italic print indicates 16-bp insert sequence.

SsrT.F ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCATTTTATGCAG AGCGGCACGCACGGTCTAGAGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTA TCTATAAAATTAAAGTAGCAGT

SsrT.R CCGTGCGTGCCGCTCTGCATAAAATGAACAAAGCCACTGGACACCGCGTCCAGTGGGCTC TGTGAAGTCTGAATTCTAGAGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTA TCTATAAAATTAAAGTAGCAGT

DeSTLP GCGGTGTCCAGTGGCTTTGTTCATTTTATGCAGAGCGGCACGCACGGGGATCCCCGGGTA CCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

ShorT.F ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCAGGATCCCCGG GTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

ShorT.F+16 ATTCATACTTCACAGAGCCCACTGGACGCGGTGTCCAGTGGCTTTGTTCAGGATCCCCGG GTACCatcgtcgaagtcactgGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAA GTAGCAGT

Trich.F GGCTTTGTTCATTTTATGCGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTAT CTATAAAATTAAAGTAGCAGT

Trich+16 GGCTTTGTTCATTTTATGCGGATCCCCGGGTACCatcgtcgaagtcactgGAGCTCAATA ATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT

TrpA.T AGCCCGCCTAATGAGCGGGCTTTTTTTTGGATCCCCGGGTACCGAGCTCAATAATTTTTT TTTGCCTATCTATAAAATTAAAGTAGCAGT

TrpA.T+16 AGCCCGCCTAATGAGCGGGCTTTTTTTTGGATCCCCGGGTACCatcgtcgaagtcactgG AGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAGT


TrpA.TX AGCCCGCCTAATGtAGCGGGCTTTTTTTTGGATCCCCGGGTACCGAGCTC

PIIIT9

AATTAAAGTAGCAGT

PIIIT9+16 AATTTTTTTTTGCGGATCCCCGGGTACCatcgtcgaagtcactgGAGCTCAATAATTTTT TTTTGCCTATCTATAAAATTAAAGTAGCAGT

HScomp GCGTCCCAGTAGGCCACCATACACCCACACCCTGATTCGAATCGACTCGACTCGCTCGGC CAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTGGGATCCCCGGGTACCG AGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

HScomp.R CAGCTGCTAACGGGTGTAAAGCCGCAAAAAACGTCGTGAATTTGGCCGAGCGAGTCGAGT CGATTCGAATCAGGGTGTGGGTGTATGGTGGCCTACTGGGACGCGGATCCCCGGGTACCG AGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

HSdeL CGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTGGGATCCCCGGGT ACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

HSdeL.R GAGAGGCCCTGGGTTCAATTCCCAGCTCGCCCCGGGATCCCAGCTGCTAACGGGTGTAAA GCCGCAAAAAACGTCGTGAATTTGGCCGGGATCCCCGGGTACCGAGCTCAATAATTTTTT TTTGCCTATCTATAAAATTAAAGTAGCAG

HSLT GCGTCCCAGTAGGCCACCATACACCCACACCCTGATTCGAATCGACTCGACTCGCTCGGC CAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTGCTTTGCCCGCGGACGC GCCGGTTTGGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAAT TAAAGTAGCAG


HSLTdeL CGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGTTAGCAGCTGCTTTGCCCGCGG ACGCGCCGGTTTGGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATA AAATTAAAGTAGCAG

HST(-11) CGGCCAAATTCACGACGTTTTTTGCGGCTTTACACCCGGATCCCCGGGTACCGAGCTCAA TAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

HST(-16) CGGCCAAATTCACGACGTTTTTTGCGGCTTTAGGATCCCCGGGTACCGAGCTCAATAATT TTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

HST(-20) CGGCCAAATTCACGACGTTTTTTGCGGCGGATCCCCGGGTACCGAGCTCAATAATTTTTT TTTGCCTATCTATAAAATTAAAGTAGCAG

LysPA25.F GGAAAGTCATTTAACCCACCGGCAGTGGATCCCCGGGTACCatcgtcgaagtcactgGAG CTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

LysPA25.R ACTGCCGGTGGGTTAAATGACTTTCCGGATCCCCGGGTACCatcgtcgaagtcactgGAG CTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

LysPA27.F GGAAAGTCATATTACCCACCGGCAGTGGATCCCCGGGTACCatcgtcgaagtcactgGAG CTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

LysPA27.R ACTGCCGGTGGGTAATATGACTTTCCGGATCCCCGGGTACCatcgtcgaagtcactgGAG CTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

LysPA29.F GGAAAGTCTTTTTACCCACCGGCAGTGGATCCCCGGGTACCatcgtcgaagtcactgGAG CTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG


LysPA29.R ACTGCCGGTGGGTAAAAAGACTTTCCGGATCCCCGGGTACCatcgtcgaagtcactgGAG

TBP2comp.F: GCGCTCCCGAACTATCCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAA ATGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAG CAG

TBP2comp.R ATTTCACACTTATGGGTGTTCAGACACCTAGCACCGCGAAAAATCGGATAGTTCGGGAGC GCGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAG CAG

TBP2deL: GGATCCCCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAAATGGATCCC CGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deR: GGATCCGCGCTCCCGAACTATCCGATTTTTCGCGGGGATCCCCGGGTACCGAGCTCAATA ATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2mu5' TGAGCAGACGTCATACCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGAA ATGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAG CAG

TBP2deL(-4)3': CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGTGTGGGATCCCCGGGTACCGA GCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deL(-9)3': CCGATTTTTCGCGGTGCTAGGTGTCTGAACACCCATAAGGATCCCCGGGTACCGAGCTCA ATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deL(-20)3': CCGATTTTTCGCGGTGCTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTT TTGCCTATCTATAAAATTAAAGTAGCAG


TBP2deL(-28)3': CCGATTTTTCGCGGTGCTAGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deL(-33)3': CCGATTTTTCGCGGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2TsNew

GCAG

TBP2TsNew+16 CCGATTTTTCGCGGTACCatcgtcgaagtcactgGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2TsNewX CCGATTTTTCGCGGTACCGAGCTC

TBP2(A26): CCGATTTTTCGCGGTGCTAGGTGTCaaGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A24): CCGATTTTTCGCGGTGCTAGGTGaaTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A22): CCGATTTTTCGCGGTGCTAGGaaTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A20): CCGATTTTTCGCGGTGCTAaaTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A18): CCGATTTTTCGCGGTGCaaGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

Bold print indicates termination elements. Plain print indicates vector sequences, italic print indicates 16-bp insert sequence, and underline marks sequence replacement

TBP2(A16): CCGATTTTTCGCGGTaaTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A14): CCGATTTTTCGCGaaGCTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A12): CCGATTTTTCGaaGTGCTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A10): CCGATTTTTaaCGGTGCTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2TAT: CCGATTaaTCGCGGTGCTAGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2A14.20: CCGATTTTTCGCGAAGCTAAATGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2[CACC]20: CCGATTTTTCGCGGTGCTACACCTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2A14.[CACC]20: CCGATTTTTCGCGAAGCTACACCTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A14).5.[GGTG]: CCGATTTTTCGCGAAGCTActgacGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2(A14).10.[GGTG]: CCGATTTTTCGCGAAGCTActgacacgatGGTGTCTGGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deL(-20)mu3' CCGATTTTTCGCGGCATGCAAGTCAGAGGATCCCCGGGTACCGAGCTCAATAATTTTTTTTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2deL(-20)mu3'X CCGATTTTTCGCGGCATGCAAGTCAGAGGATCCCCGGGTACCgagctc

TBP2deLT(-20)X CCGATTTTTCGCGGTGCTAGGTGTCTGGGATCCCCGGGTACCgagctc

TBP2TspIIIT3 CCGATTTTTCGCGGGGATCCCCGGGTACCGAGCTCAATAATTTGCCTATCTATAAAATTAAAGTAGCAG

TBP2TSX CCGATTTTTCGCGGGGATCCCCGGGTACCgagctc

Bold print indicates termination elements. Plain print indicates vector sequences, and italic print indicates 16-bp insert sequence.

APPENDIX B

LIST OF STRAINS

Strain: E. coli DH5α
Genotype: F-, φ80dlacZΔM15, Δ(lacZYA-argF)U169, endA1, recA1, hsdR17(rK-, mK+), deoR, thi-1, supE44, λ-, gyrA96, relA1
Source: BRL

Strain: E. coli JM110
Genotype: F'[traD36, proAB+, lacIq, lacZΔM15], dam, dcm, supE44, hsdR17, thi, leu, thr, rpsL, lacY, galK, galT, ara, tonA, tsx, Δ(lac-proAB)
Source: Stratagene

Strain: H. volcanii WFD11
Genotype: H. volcanii DS2 (ATCC 29605, type strain) cured of the endogenous plasmid pHV2
Source: W. F. Doolittle

BIBLIOGRAPHY

Agarwal, K., Baek, K. H., Jeon, C. J., Miyamoto, K., Ueno, A., and Yoon, H. S. (1991). Stimulation of transcript elongation requires both the zinc finger and RNA polymerase II binding domains of human TFIIS. Biochemistry 30, 7842-7851.

Allison, D. S., and Hall, B. D. (1985). Effects of alterations in the 3' flanking sequence on in vivo and in vitro expression of the yeast SUP4-0 tRNATyr gene. EMBO J. 4, 2657-2664.

Allison, L. A., Moyle, M., Shales, M., and Ingles, C. J. (1985). Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases. Cell 42, 599-610.

Altmann, C. R., Solow-Cordero, D. E., and Chamberlin, M. J. (1994). RNA cleavage and chain elongation by Escherichia coli DNA-dependent RNA polymerase in a binary enzyme:RNA complex. Proc. Natl. Acad. Sci. USA 91, 3784-3788.

Amin, J., Ananthan, J., and Voellmy, R. (1988). Key features of heat shock regulatory elements. Mol. Cell. Biol. 8, 3761-3769.

Amiri, K. A. (1994). Fibrillarin-like proteins occur in the domain Archaea. J. Bacteriol. 176, 2124-2127.

Andra, S., Frey, G., Nitsch, M., Baumeister, W., and Stetter, K. O. (1996). Purification and structural characterization of the thermosome from the hyperthermophilic archaeum Methanopyrus kandleri. FEBS Letters 379, 127-131.

Archambault, J., Lacroute, F., Ruet, A., and Friesen, J. D. (1992). Genetic interaction between transcription elongation factor TFIIS and RNA polymerase II. Mol. Cell. Biol. 12, 4142-4152.

Aso, T., Conaway, J. W., and Conaway, R. C. (1995). The RNA polymerase II elongation complex. FASEB J. 9, 1419-1428.

Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Smith, J. A., Seidman, J. G., and Struhl, K. (1987). Current Protocols in Molecular Biology (New York: Greene Publishing Associates and Wiley-Interscience).

Baldauf, S. L., Palmer, J. D., and Doolittle, W. F. (1996). The root of the universal tree and the origin of eukaryotes based on elongation factor phylogeny. Proc. Natl. Acad. Sci. USA 93, 7749-7754.

Barns, S. M., Fundyga, R. E., Jeffries, M. W., and Pace, N. R. (1994). Remarkable archaeal diversity detected in a Yellowstone National Park hot spring environment. Proc. Natl. Acad. Sci. USA 91, 1609-1613.

Baross, J. A., and Holden, J. F. (1996). Overview of hyperthermophiles and their heat shock proteins. Advances in Protein Chemistry 48, 1-34.

Baumann, P., Qureshi, S. A., and Jackson, S. P. (1995). Transcription: new insights from studies on Archaea. Trends Genet. 11, 279-283.

Becker, J., and Craig, E. A. (1994). Heat-shock proteins as molecular chaperones. Eur. J. Biochem. 219, 11-23.

Bengal, E., Flores, O., Krauskopf, A., Reinberg, D., and Aloni, Y. (1991). Role of the mammalian transcription factors IIF, IIS, and IIX during elongation by RNA polymerase II. Mol. Cell. Biol. 11, 1195-1206.

Berk, A. J. (1989). Characterization of RNA molecules by S1 nuclease analysis. Methods in Enzymology 180, 248-347.

Betlach, M. C., Shand, R. F., and Leong, D. M. (1989). Regulation of the bacterio-opsin gene of a halophilic archaebacterium. Can. J. Microbiol. 35, 134-40.

Beutel, B. A., and Gold, L. (1992). In vitro evolution of intrinsically bent DNA. J. Mol. Biol. 228, 803-812.

Bogenhagen, D. F., and Brown, D. D. (1981). Nucleotide sequences in Xenopus 5S DNA required for transcription termination. Cell 24, 261-70.

Boorstein, W. R., and Craig, E. A. (1990). Transcriptional regulation of SSA3, an HSP70 gene from Saccharomyces cerevisiae. Mol. Cell. Biol. 10, 3262-3267.

Borukhov, S., Sagitov, V., and Goldfarb, A. (1993). Transcript cleavage factors from E. coli. Cell 72, 459-66.

Braig, K., Otwinowski, Z., Hegde, R., Boisvert, D. C., Joachimiak, A., Horwich, A. L., and Sigler, P. B. (1994). The crystal structure of the bacterial chaperonin GroEL at 2.8 Å. Nature 371, 578-586.

Brown, J. R., and Doolittle, W. F. (1995). Root of the universal tree of life based on ancient aminoacyl-tRNA synthetase gene duplications. Proc. Natl. Acad. Sci. USA 92, 2441-2445.

Brown, J. W., Daniels, C. J., and Reeve, J. N. (1989). Gene structure, organization, and expression in archaebacteria. Crit. Rev. Microbiol. 16, 287-338.

Bukau, B. (1993). Regulation of the Escherichia coli heat-shock response. Mol. Microbiol. 9, 671-680.

Bult, C. J., White, O., Olsen, G. J., Zhou, L., Fleischmann, R. D., Sutton, G. G., Blake, J. A., FitzGerald, L. M., Clayton, R. A., Gocayne, J. D., Kerlavage, A. R., Dougherty, B. A., Tomb, J. F., Adams, M. D., Reich, C. I., Overbeek, R., Kirkness, E. F., Weinstock, K. G., Merrick, J. M., Glodek, A., Scott, J. L., Geoghagen, N. S. M., Weidman, J. F., Fuhrmann, J. L., Venter, J. C., et al. (1996). Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273, 1058-1073.

Buratowski, S., Hahn, S., Guarente, L., and Sharp, P. A. (1989). Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549-561.

Burova, E., Hung, S. C., Sagitov, V., Stitt, B. L., and Gottesman, M. E. (1995). Escherichia coli NusG protein stimulates transcription elongation rates in vivo and in vitro. J. Bacteriol. 177, 1388-1392.

Busby, S., and Ebright, R. H. (1994). Promoter structure, promoter recognition, and transcription activation in prokaryotes. Cell 79, 743-746.

Calkhoven, C. F., and Ab, G. (1996). Multiple steps in the regulation of transcription-factor level and activity. Biochem. J. 317, 329-342.

Campbell, F. E., Jr., and Setzer, D. R. (1992). Transcription termination by RNA polymerase III: uncoupling of polymerase release from termination signal recognition. Mol. Cell. Biol. 12, 2260-2272.

Chamberlin, M. J. (1995). New models for the mechanism of transcription elongation and its regulation. Harvey Lect. 88, 1-21.

Chan, C. L., and Landick, R. (1994). New perspectives on RNA chain elongation and termination by E. coli RNA polymerase. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 297-322.

Chang, C., and Meyerowitz, E. M. (1994). Eukaryotes have "two-component" signal transducers. Res. Microbiol. 145, 481-486.

Chang, Y. N., Pirtle, I. L., and Pirtle, R. M. (1986). Nucleotide sequence and transcription of a human tRNA gene cluster with four genes. Gene 48, 165-174.

Charlebois, R. L., Hofman, J. D., Schalkwyk, L. C., Lam, W. L., and Doolittle, W. F. (1989). Genome mapping in halobacteria. Can. J. Microbiol. 35, 21-29.

Charlebois, R. L., Lam, W. L., Cline, S. W., and Doolittle, W. F. (1987). Characterization of pHV2 from Halobacterium volcanii and its use in demonstrating transformation of an archaebacterium. Proc. Natl. Acad. Sci. USA 84, 8530-8534.

Charlebois, R. L., Schalkwyk, L. C., Hofman, J. D., and Doolittle, W. F. (1991). Detailed physical map and set of overlapping clones covering the genome of the archaebacterium Haloferax volcanii DS2. J. Mol. Biol. 222, 509-524.

Chen, X., Sullivan, D. S., and Huffaker, T. C. (1994). Two yeast genes with similarity to TCP-1 are required for microtubule and actin function in vivo. Proc. Natl. Acad. Sci. USA 91, 9111-9115.

Cheng, S. W., Lynch, E. C., Leason, K. R., Court, D. L., Shapiro, B. A., and Friedman, D. I. (1991). Functional importance of sequence in the stem-loop of a transcription terminator. Science 254, 1205-1207.

Chien, Y. T., and Zinder, S. H. (1996). Cloning, functional organization, transcript studies, and phylogenetic analysis of the complete nitrogenase structural genes (nifHDK2) and associated genes in the archaeon Methanosarcina barkeri 227. J. Bacteriol. 178, 143-148.

Chitnis, P. R., and Nelson, N. (1991). Molecular cloning of the genes encoding two chaperone proteins of the cyanobacterium Synechocystis sp. PCC 6803. J. Biol. Chem. 266, 58-65.

Christie, G. E., Farnham, P. J., and Platt, T. (1981). Synthetic sites for transcription termination and a functional comparison with operon termination sites in vitro. Proc. Natl. Acad. Sci. USA 78, 4180-4184.

Chung, C. T., Niemela, S. L., and Miller, R. H. (1989). One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc. Natl. Acad. Sci. USA 86, 2172-2175.

Cipres-Palacin, G., and Kane, C. M. (1994). Cleavage of the nascent transcript induced by TFIIS is insufficient to promote read-through of intrinsic blocks to elongation by RNA polymerase II. Proc. Natl. Acad. Sci. USA 91, 8087-8091.

Clarens, M., Macario, A. J., and Conway de Macario, E. (1995). The archaeal dnaK-dnaJ gene cluster: organization and expression in the methanogen Methanosarcina mazei. J. Mol. Biol. 250, 191-201.

Cohen-Kupiec, R., Blank, C., and Leigh, J. A. (1997). Transcriptional regulation in Archaea: in vivo demonstration of a repressor binding site in a methanogen. Proc. Natl. Acad. Sci. USA 94, 1316-1320.

Conaway, R. C., and Conaway, J. W. (1993). General initiation factors for RNA polymerase II. Annu. Rev. Biochem. 62, 161-190.

Conway de Macario, E., Clarens, M., and Macario, A. J. (1995). Archaeal grpE: transcription in two different morphologic stages of Methanosarcina mazei and comparison with dnaK and dnaJ. J. Bacteriol. 177, 544-550.

Conway de Macario, E., Dugan, C. B., and Macario, A. J. (1994). Identification of a grpE heat-shock gene homolog in the archaeon Methanosarcina mazei. J. Mol. Biol. 240, 95-101.

Conway de Macario, E., and Macario, A. J. L. (1994). Heat-shock response in Archaea. Trends Biotechnol. 12, 512-518.

Cormack, B. P., and Struhl, K. (1992). The TATA-binding protein is required for transcription by all three nuclear RNA polymerases in yeast cells. Cell 69, 685-696.

Cotto, J. J., Kline, M., and Morimoto, R. I. (1996). Activation of heat shock factor 1 DNA binding precedes stress-induced serine phosphorylation. J. Biol. Chem. 271, 3355-3358.

Cowing, D. W., Bardwell, J. C., Craig, E. A., Woolford, C., Hendrix, R. W., and Gross, C. A. (1985). Consensus sequence for Escherichia coli heat shock gene promoters. Proc. Natl. Acad. Sci. USA 82, 2679-2683.

Cozzarelli, N. R., Gerrard, S. P., Schlissel, M., Brown, D. D., and Bogenhagen, D. F. (1983). Purified RNA polymerase III accurately and efficiently terminates transcription of 5S RNA genes. Cell 34, 829-835.

Craig, E. A. (1993). Chaperones: helpers along the pathways to protein folding. Science 260, 1902-1903.

Craig, E. A., Weissman, J. S., and Horwich, A. L. (1994). Heat shock proteins and molecular chaperones: mediators of protein conformation and turnover in the cell. Cell 78, 365-372.

Crothers, D. M., Haran, T. E., and Nadeau, J. G. (1990). Intrinsically bent DNA. J. Biol. Chem. 265, 7093-7096.

Csonka, L. N., and Hanson, A. D. (1991). Prokaryotic osmoregulation: genetics and physiology. Annu. Rev. Microbiol. 45, 569-606.

Daniels, C. J., Gupta, R., and Doolittle, W. F. (1985).a. Transcription and excision of a large intron in the tRNATrp gene of an archaebacterium, Halobacterium volcanii. J. Biol. Chem. 260, 3132-3134.

Daniels, C. J., Hofman, J. D., MacWilliam, J. G., Doolittle, W. F., Woese, C. R., Luehrsen, K. R., and Fox, G. E. (1985).b. Sequence of 5S ribosomal RNA gene regions and their products in the archaebacterium Halobacterium volcanii. Mol. Gen. Genet. 198, 270-274.

Daniels, C. J., McKee, A. H. Z., and Doolittle, W. F. (1984). Archaebacterial heat-shock proteins. EMBO J. 3, 745-749.

Danner, S., and Soppa, J. (1996). Characterization of the distal promoter element of halobacteria in vivo using saturation mutagenesis and selection. Mol. Micro. 19, 1265-1276.

Darcy, T. J., Sandman, K., and Reeve, J. N. (1995). Methanobacterium formicicum, a mesophilic methanogen, contains three HFo histones. J. Bacteriol. 177, 858-860.

Datta, P. K., Hawkins, L. K., and Gupta, R. (1989). Presence of an intron in elongator methionine-tRNA of Halobacterium volcanii. Can. J. Microbiol. 35, 189-194.

Daube, S. S., Hart, C. R., and von Hippel, P. H. (1994). Coupling of RNA displacement and intrinsic termination in transcription from synthetic RNA:DNA bubble duplex constructs. Proc. Natl. Acad. Sci. USA 91, 9539-9543.

d'Aubenton Carafa, Y., Brody, E., and Thermes, C. (1990). Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J. Mol. Biol. 216, 835-858.

DeDecker, B. S., O'Brien, R., Fleming, P. J., Geiger, J. H., Jackson, S. P., and Sigler, P. B. (1996). The crystal structure of a hyperthermophilic archaeal TATA-box binding protein. J. Mol. Biol. 264, 1072-1084.

DeLange, R. J., Williams, L. C., and Searcy, D. G. (1981). A histone-like protein (HTa) from Thermoplasma acidophilum. II. Complete amino acid sequence. J. Biol. Chem. 256, 905-911.

DeLong, E. F., Wu, K. Y., Prezelin, B. B., and Jovine, R. V. (1994). High abundance of Archaea in Antarctic marine picoplankton. Nature 371, 695-697.

Deng, L., Hagler, J., Shuman, S., Memon, A. R., Meng, B., Mullet, J. E., Kaukinen, K. H., Tranbarger, T. J., Misra, S., Washburn, R. S., Jin, D. J., Stitt, B. L., Washburn, R. S., and Stitt, B. L. (1996). Factor-dependent release of nascent RNA by ternary complexes of vaccinia RNA polymerase. J. Biol. Chem. 271, 19556-19562.

DeVito, J., and Das, A. (1994). Control of transcription processivity in phage lambda: Nus factors strengthen the termination-resistant state of RNA polymerase induced by N antiterminator. Proc. Natl. Acad. Sci. USA 91, 8660-8664.

Edwalds-Gilbert, G., Prescott, J., and Falck-Pedersen, E. (1993). 3' RNA processing efficiency plays a primary role in generating termination-competent RNA polymerase II elongation complexes. Mol. Cell. Biol. 13, 3472-3480.

Eggen, R. I. (1994). Regulated gene expression in methanogens. FEMS Microbiol. Rev. 15, 251-260.

Eick, D., Wedel, A., and Heumann, H. (1994). From initiation to elongation: comparison of transcription by prokaryotic and eukaryotic RNA polymerases. Trends Genet. 10, 292-296.

Feaver, W. J., Gileadi, O., Li, Y., and Kornberg, R. D. (1991). CTD kinase associated with yeast RNA polymerase II initiation factor b. Cell 67, 1223-1230.

Fenton, W. A., Kashi, Y., Furtak, K., and Horwich, A. L. (1994). Residues in chaperonin GroEL required for polypeptide binding and release. Nature 371, 614-619.

Fernandes, M., O'Brien, T., and Lis, J. (1994). Structure and regulation of heat shock gene promoters. Volume 1, R. I. Morimoto, A. Tissiere and C. Georgopoulos, eds. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), pp. 375-393.

Ferrer, C., Mojica, F. J., Juez, G., and Rodriguez-Valera, F. (1996). Differentially transcribed regions of Haloferax volcanii genome depending on the medium salinity. J. Bacteriol. 178, 309-313.

Fiorenza, M. T., Farkas, T., Dissing, M., Kolding, D., and Zimarino, V. (1995). Complex expression of murine heat shock transcription factors. Nucl. Acids Res. 23, 467-474.

Frey, G., Thomm, M., Brudigam, B., Gohl, H. P., and Hausner, W. (1990). An archaebacterial cell-free transcription system. The expression of tRNA genes from Methanococcus vannielii is mediated by a transcription factor. Nucl. Acids Res. 18, 1361-7.

Frydman, J., and Hartl, F.-U. (1994). Molecular chaperone functions of hsp70 and hsp60 in protein folding, R. I. Morimoto, A. Tissiere and C. Georgopoulos, eds. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), pp. 251-283.

Frydman, J., Nimmesgern, E., Erdjument-Bromage, H., Wall, J. S., Tempst, P., and Hartl, F. U. (1992). Function in protein folding of TRiC, a cytosolic ring complex containing TCP-1 and structurally related subunits. EMBO J. 11, 4767-4778.

Gaal, T., Ross, W., Blatter, E. E., Tang, H., Jia, X., Krishnan, V. V., Assa-Munt, N., Ebright, R. H., and Gourse, R. L. (1996). DNA-binding determinants of the alpha subunit of RNA polymerase: novel DNA-binding domain architecture. Genes Dev. 10, 16-26.

Gamer, J., Bujard, H., and Bukau, B. (1992). Physical interaction between heat shock proteins DnaK, DnaJ, and GrpE and the bacterial heat shock transcription factor sigma 32. Cell 69, 833-842.

Gamer, J., Multhaup, G., Tomoyasu, T., McCarty, J. S., Rudiger, S., Schonfeld, H. J., Schirra, C., Bujard, H., and Bukau, B. (1996). A cycle of binding and release of the DnaK, DnaJ and GrpE chaperones regulates activity of the Escherichia coli heat shock transcription factor sigma32. EMBO J. 15, 607-617.

Gao, Y., Thomas, J. O., Chow, G.-H. L., and Cowan, N. J. (1992). A cytoplasmic chaperonin that catalyzes β-actin folding. Cell 69, 1043-1050.

Gao, Y., Vainberg, I. E., Chow, R. L., and Cowan, N. J. (1993). Two cofactors and cytoplasmic chaperonin are required for the folding of α- and β-tubulin. Mol. Cell. Biol. 13, 2478-2485.

Geiduschek, E. P., and Tocchini-Valentini, G. P. (1988). Transcription by RNA polymerase III. Annu. Rev. Biochem. 57, 873-914.

Georgopoulos, C. (1993). Role of the major heat shock proteins as molecular chaperones. Annu. Rev. Cell Biol. 9, 601-634.

Georgopoulos, C., Liberek, K., Zylicz, M., and Ang, D. (1994). Properties of the heat shock proteins of Escherichia coli and the autoregulation of the heat shock response, R. I. Morimoto, A. Tissiere and C. Georgopoulos, eds. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), pp. 209-249.

Gillis, T. P., Miller, R. A., Young, D. B., Khanolkar, S. R., and Buchanan, T. M. (1985). Immunochemical characterization of a protein associated with Mycobacterium leprae cell wall. Infect. Immun. 49, 371-377.

Gohl, H. P., Grondahl, B., and Thomm, M. (1995). Promoter recognition in archaea is mediated by transcription factors: identification of transcription factor aTFB from Methanococcus thermolithotrophicus as archaeal TATA-binding protein. Nucl. Acids Res. 23, 3837-3841.

Gollnick, P. (1994). Regulation of the Bacillus subtilis trp operon by an RNA-binding protein. Mol. Microbiol. 11, 991-997.

Goodson, M. L., and Sarge, K. D. (1995). Heat-inducible DNA binding of purified heat shock transcription factor. J. Biol. Chem. 270, 2447-2450.

Gottlieb, E., and Steitz, J. A. (1989).a. Function of the mammalian La protein: evidence for its action in transcription termination by RNA polymerase III. EMBO J. 8, 851-861.

Gottlieb, E., and Steitz, J. A. (1989).b. The RNA binding protein La influences both the accuracy and the efficiency of RNA polymerase III transcription in vitro. EMBO J. 8, 841-850.

Gralla, J. D. (1996). Global steps during initiation by RNA polymerase II. Methods Enzymol. 273, 99-110.

Gu, W., Wind, M., and Reines, D. (1996). Increased accommodation of nascent RNA in a product site on RNA polymerase II during arrest. Proc. Natl. Acad. Sci. USA 93, 6935-6940.

Gupta, R. S., Bustard, K., Falah, M., and Singh, D. (1997). Sequencing of heat shock protein 70 (DnaK) homologs from Deinococcus proteolyticus and Thermomicrobium roseum and their integration in a protein-based phylogeny of prokaryotes. J. Bacteriol. 179, 345-357.

Gupta, R. S., and Golding, G. B. (1996). The origin of the eukaryotic cell. Trends Biochem. Sci. 21, 166-171.

Gupta, R. S. (1995). Evolution of the chaperonin families (Hsp60, Hsp10 and Tcp-1) of proteins and the origin of eukaryotic cells. Mol. Microbiol. 15, 1-11.

Gupta, R. S., and Singh, B. (1994). Phylogenetic analysis of 70 kD heat shock protein sequences suggests a chimeric origin for the eukaryotic cell nucleus. Curr. Biol. 4, 1104-1114.

Gupta, R. S., and Singh, B. (1992). Cloning of the HSP70 gene from Halobacterium marismortui: relatedness of archaebacterial HSP70 to its eubacterial homologs and a model for the evolution of the HSP70 gene. J. Bacteriol. 174, 4594-605.

Haas, E. S., Daniels, C. J., and Reeve, J. N. (1989). Genes encoding 5S rRNA and tRNAs in the extremely thermophilic archaebacterium Methanothermus fervidus. Gene 77, 253-63.

Hagerman, P. J. (1990). Sequence-directed curvature of DNA. Annu. Rev. Biochem. 59, 755-781.

Hain, J., Reiter, W. D., Hudepohl, U., and Zillig, W. (1992). Elements of an archaeal promoter defined by mutational analysis. Nucl. Acids Res. 20, 5423-5428.

Haldenwang, W. G. (1995). The sigma factors of Bacillus subtilis. Microbiol. Rev. 59, 1-30.

Harlow, E., and Lane, D. (1988). Antibodies: A laboratory manual (Cold Spring Harbor Laboratories), p. 650.

Hausner, W., Wettach, J., Hethke, C., and Thomm, M. (1996). Two transcription factors related with the eucaryal transcription factors TATA-binding protein and transcription factor IIB direct promoter recognition by an archaeal RNA polymerase. J. Biol. Chem. 271, 30144-30148.

Hebert, A. M., Kropinski, A. M., and Jarrell, K. F. (1991). Heat shock response of the archaebacterium Methanococcus voltae. J. Bacteriol. 173, 3224-3227.

Helmann, J. D. (1991). Alternative sigma factors and the regulation of flagellar gene expression. Mol. Microbiol. 5, 2875-2882.

Helmann, J. D. (1994). Bacterial sigma factors. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 1-17.

Hendrick, J. P., and Hartl, F. U. (1995). The role of molecular chaperones in protein folding. FASEB J. 9, 1559-1569.

Hengge-Aronis, R. (1993). Survival of hunger and stress: the role of rpoS in early stationary phase gene regulation in E. coli. Cell 72, 165-168.

Henkin, T. M. (1996). Control of transcription termination in prokaryotes. Annu. Rev. Genet. 30, 35-57.

Henkin, T. M. (1994). tRNA-directed transcription antitermination. Mol. Microbiol. 13, 381-387.

Hess, J., Perez-Stable, C., Wu, G. J., Weir, B., Tinoco, I., Jr., and Shen, C. K. (1985). End-to-end transcription of an Alu family repeat. A new type of polymerase-III-dependent terminator and its evolutionary implication. J. Mol. Biol. 184, 7-21.

Hethke, C., Geerling, A. C., Hausner, W., de Vos, W. M., and Thomm, M. (1996). A cell-free transcription system for the hyperthermophilic archaeon Pyrococcus furiosus. Nucl. Acids Res. 24, 2369-2376.

Hodges, R. A., Perler, F. B., Noren, C. J., and Jack, W. E. (1992). Protein splicing removes intervening sequences in an archaea DNA polymerase. Nucl. Acids Res. 20, 6153-6157.

Holden, J. F., and Baross, J. A. (1993). Enhanced thermotolerance and temperature-induced changes in protein composition in the hyperthermophilic archaeon ES4. J. Bacteriol. 175, 2839-2843.

Horwich, A. L., and Willison, K. R. (1993). Protein folding in the cell: functions of two families of molecular chaperone, hsp60 and TF55-TCP1. Phil. Trans. R. Soc. Lond. B 339, 313-326.

Hsu, L. M., Vo, N. V., and Chamberlin, M. J. (1995). Escherichia coli transcript cleavage factors GreA and GreB stimulate promoter escape and gene expression in vivo and in vitro. Proc. Natl. Acad. Sci. USA 92, 11588-11592.

Hudepohl, U., Reiter, W. D., and Zillig, W. (1990). In vitro transcription of two rRNA genes of the archaebacterium Sulfolobus sp. B12 indicates a factor requirement for specific initiation. Proc. Natl. Acad. Sci. USA 87, 5851-5855.

Inoue, H., Nojima, H., and Okayama, H. (1990). High efficiency transformation of Escherichia coli with plasmids. Gene 96, 23-28.

Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S., and Miyata, T. (1989). Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA 86, 9355-9359.

Iwabe, N., Kuma, K., Kishino, H., Hasegawa, M., and Miyata, T. (1991). Evolution of RNA polymerases and branching patterns of the three major groups of Archaebacteria. J. Mol. Evol. 32, 70-78.

James, P., and Hall, B. D. (1990). ret1-1, a yeast mutant affecting transcription termination by RNA polymerase III. Genetics 125, 293-303.

Jiang, Y., Smale, J. T., and Gralla, J. D. (1993). A common ATP requirement for open complex formation and transcription at promoters containing initiator or TATA elements. J. Biol. Chem. 268, 6535-6540.

Jiang, Y., Yan, M., and Gralla, J. D. (1996). A three-step pathway of transcription initiation leading to promoter clearance at an activated RNA polymerase II promoter. Mol. Cell. Biol. 16, 1614-1621.

Kagawa, H. K., Osipiuk, J., Maltsev, N., Overbeek, R., Quaite-Randall, E., Joachimiak, A., and Trent, J. D. (1995). The 60 kDa heat shock proteins in the hyperthermophilic archaeon Sulfolobus shibatae. J. Mol. Biol. 253, 712-725.

Kagawa, Y., Ohta, T., Abe, Y., Endo, H., Yohda, M., Kato, N., Endo, I., Hamamoto, T., Ichida, M., Hoaki, T., et al. (1995). Gene of heat shock protein of sulfur-dependent archaeal hyperthermophile Desulfurococcus. Biochem. Biophys. Res. Commun. 214, 730-736.

Kane, C. (1994). Transcript elongation and gene regulation in eukaryotes. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 279-296.

Kato, N., Aiba, H., and Mizuno, T. (1996). Suppressor mutations in alpha-subunit of RNA polymerase for a mutant of the positive regulator, OmpR, in Escherichia coli. FEMS Microbiol. Lett. 139, 175-180.

Kaufmann, E., Geisler, N., and Weber, K. (1984). SDS-PAGE strongly overestimates the molecular masses of the neurofilament proteins. FEBS Lett. 170, 81-84.

Keeling, P. J., Charlebois, R. L., and Doolittle, W. F. (1994). Archaebacterial genomes: eubacterial form and eukaryotic content. Curr. Opin. Genet. Dev. 4, 816-822.

Ken, R., and Hackett, N. R. (1991). Halobacterium halobium strains lysogenic for phage phi H contain a protein resembling coliphage repressors. J. Bacteriol. 173, 955-960.

Kerppola, T. K., and Kane, C. M. (1990). Analysis of the signals for transcription termination by purified RNA polymerase II. Biochemistry 29, 269-278.

Kerppola, T. K., and Kane, C. M. (1991). RNA polymerase: regulation of transcript elongation and termination. FASEB J. 5, 2833-2842.

Kim, S., Willison, K. R., and Horwich, A. L. (1994). Cytosolic chaperonin subunits have a conserved ATPase domain but diverged polypeptide-binding domains. Trends Biochem. Sci. 19, 543-548.

Klenk, H. P., Palm, P., Lottspeich, F., and Zillig, W. (1992). Component H of the DNA-dependent RNA polymerases of Archaea is homologous to a subunit shared by the three eucaryal nuclear RNA polymerases. Proc. Natl. Acad. Sci. USA 89, 407-410.

Klenk, H.-P., Palm, P., and Zillig, W. (1994). DNA-dependent RNA polymerases as phylogenetic marker molecules. System. Appl. Micro. 16, 138-147.

Knapp, S., Schmidt-Krey, I., Hebert, H., Bergman, T., Jornvall, H., and Ladenstein, R. (1994). The molecular chaperonin TF55 from the thermophilic archaeon Sulfolobus solfataricus. A biochemical and structural characterization. J. Mol. Biol. 242, 397-407.

Krummel, B., and Chamberlin, M. J. (1989).b. RNA chain initiation by Escherichia coli RNA polymerase. Structural transitions of the enzyme in early ternary complexes. Biochemistry 28, 7829-7842.

Krummel, B., and Chamberlin, M. J. (1992).a. Structural analysis of ternary complexes of Escherichia coli RNA polymerase. Individual complexes halted along different transcription units have distinct and unexpected biochemical properties. J. Mol. Biol. 225, 221-37.

Krummel, B., and Chamberlin, M. J. (1992).b. Structural analysis of ternary complexes of Escherichia coli RNA polymerase. Deoxyribonuclease I footprinting of defined complexes. J. Mol. Biol. 225, 239-250.

Kubota, H., Hynes, G., Carne, A., Ashworth, A., and Willison, K. (1994). Identification of six Tcp-1-related genes encoding divergent subunits of the TCP-1-containing chaperonin. Curr. Biol. 4, 89-99.

Kubota, H., Hynes, G., and Willison, K. (1995). The chaperonin containing t-complex polypeptide 1 (TCP-1). Multisubunit machinery assisting in protein folding and assembly in the eukaryotic cytosol. Eur. J. Biochem. 230, 13-16.

Kuo, Y.-P., Thompson, D. K., St. Jean, A., Charlebois, R. L., and Daniels, C. J. (1997). Characterization of two heat shock genes from Haloferax volcanii: A model system for transcription regulation in the Archaea. J. Bacteriol. (submitted for publication).

Lam, W. L., and Doolittle, W. F. (1989). Shuttle vectors for the archaebacterium Halobacterium volcanii. Proc. Natl. Acad. Sci. USA 86, 5478-82.

Landick, R., and Turnbough, C. L. (1992). Transcriptional attenuation. In Transcriptional regulation. Volume 1, S. L. McKnight and K. R. Yamamoto, eds. (Cold Spring Harbor Laboratory Press), pp. 407-446.

Landini, P., and Volkert, M. R. (1995). RNA polymerase alpha subunit binding site in positively controlled promoters: a new model for RNA polymerase-promoter interaction and transcriptional activation in the Escherichia coli aidA and aidB genes. EMBO J. 14, 4329-4335.

Lang, W. H., Morrow, B. E., Ju, Q., Warner, J. R., and Reeder, R. H. (1994). A model for transcription termination by RNA polymerase I. Cell 79, 527-534.

Lang, W. H., and Reeder, R. H. (1993). The REB1 site is an essential component of a terminator for RNA polymerase I in Saccharomyces cerevisiae. Mol. Cell. Biol. 13, 649-658.

Lang, W. H., and Reeder, R. H. (1995). Transcription termination of RNA polymerase I due to a T-rich element interacting with Reb1p. Proc. Natl. Acad. Sci. USA 92, 9781-9785.

Langer, D., Hain, J., Thuriaux, P., and Zillig, W. (1995). Transcription in archaea: similarity to that in eucarya. Proc. Natl. Acad. Sci. USA 92, 5768-5772.

Langer, D., Lottspeich, F., and Zillig, W. (1994). A subunit of an archaeal DNA-dependent RNA polymerase contains the S1 motif. Nucl. Acids Res. 22, 694.

Langer, D., and Zillig, W. (1993). Putative tfIIs gene of Sulfolobus acidocaldarius encoding an archaeal transcription elongation factor is situated directly downstream of the gene for a small subunit of DNA-dependent RNA polymerase. Nucl. Acids Res. 21, 2251.

Lanzendorfer, M., Langer, D., Hain, J., Klenk, H.-P., Holz, I., Arnold-Ammer, I., and Zillig, W. (1994). Structure and function of the DNA-dependent RNA polymerase of Sulfolobus. System. Appl. Micro. 16, 156-164.

Larson, J. S., Schuetz, T. J., and Kingston, R. E. (1995). In vitro activation of purified human heat shock factor by heat. Biochemistry 34, 1902-1911.

Lechner, J., and Sumper, M. (1987). The primary structure of a procaryotic glycoprotein. Cloning and sequencing of the cell surface glycoprotein gene of halobacteria. J. Biol. Chem. 262, 9724-9729.

Leffers, H., Gropp, F., Lottspeich, F., Zillig, W., and Garrett, R. A. (1989). Sequence, organization, transcription and evolution of RNA polymerase subunit genes from the archaebacterial extreme halophiles Halobacterium halobium and Halococcus morrhuae. J. Mol. Biol. 206, 1-17.

Leong, D., Boyer, H., and Betlach, M. (1988). Transcription of genes involved in bacterio-opsin gene expression in mutants of a halophilic archaebacterium. J. Bacteriol. 170, 4910-4915.

Leuther, K. K., Bushnell, D. A., and Kornberg, R. D. (1996). Two-dimensional crystallography of TFIIB- and IIE-RNA polymerase II complexes: implications for start site selection and initiation complex formation. Cell 85, 773-779.

Liberek, K., Galitski, T. P., Zylicz, M., and Georgopoulos, C. (1992). The DnaK chaperone modulates the heat shock response of Escherichia coli by binding to the sigma 32 transcription factor. Proc. Natl. Acad. Sci. USA 89, 3516-3520.

Liberek, K., and Georgopoulos, C. (1993). Autoregulation of the Escherichia coli heat shock response by the DnaK and DnaJ heat shock proteins. Proc. Natl. Acad. Sci. USA 90, 11019-11023.

Lingappa, J. R., Martin, R. L., Wong, M. L., Ganem, D., Welch, W. J., and Lingappa, V. R. (1994). A eukaryotic cytosolic chaperonin is associated with a high molecular weight intermediate in the assembly of hepatitis B virus capsid, a multimeric particle. J. Cell. Biol. 125, 99-111.

Lis, J., and Wu, C. (1992). Heat shock factor. In Transcriptional Regulation, S. L. McKnight and K. R. Yamamoto, eds. (Cold Spring Harbor Laboratory Press), pp. 907-930.

Lis, J. T., and Wu, C. (1994). Transcriptional regulation of the heat shock genes. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 457-476.

Losick, R., and Stragier, P. (1992). Crisscross regulation of cell-type-specific gene expression during development in B. subtilis. Nature 355:601-604.

Macario, A. J., Dugan, C. B., Clarens, M., and Conway de Macario, E. (1993). dnaJ in archaea. Nucl. Acids Res. 21, 2773.

Macario, A. J., Dugan, C. B., and Conway de Macario, E. (1991). A dnaK homolog in the archaebacterium Methanosarcina mazei S6. Gene 108, 133-137.

Madon, J., Leser, U., and Zillig, W. (1983). DNA-dependent RNA polymerase from the extremely halophilic archaebacterium Halococcus morrhuae. Eur. J. Biochem. 135, 279-283.

Maeda, T., Wurgler-Murphy, S. M., and Saito, H. (1994). A two-component system that regulates an osmosensing MAP kinase cascade in yeast. Nature 369, 242-245.

Mager, W. H., and De Kruijff, A. J. (1995). Stress-induced transcriptional activation. Microbiol. Rev. 59, 506-531.

Mager, W. H., and Ferreira, P. M. (1993). Stress response of yeast. Biochem. J. 290, 1-13.

Mager, W. H., and Varela, J. C. (1993). Osmostress response of the yeast Saccharomyces. Mol. Microbiol. 10, 253-258.

Mande, S. C., Mehra, V., Bloom, B. R., and Hol, W. G. J. (1996). Structure of the heat shock protein chaperonin-10 of Mycobacterium leprae. Science 271, 203-207.

Manley, J. L., Proudfoot, N. J., and Platt, T. (1989). RNA 3'-end formation. Genes Dev. 3, 2218-2222.

Maraia, R. J. (1996). Transcription termination factor La is also an initiation factor for RNA polymerase III. Proc. Natl. Acad. Sci. U.S.A. 93, 3383-3387.

Maraia, R. J., Kenan, D. J., and Keene, J. D. (1994). Eukaryotic transcription termination factor La mediates transcript release and facilitates reinitiation by RNA polymerase III. Mol. Cell. Biol. 14, 2147-2158.

Marco, S., Carrascosa, J. L., and Valpuesta, J. M. (1994). Reversible interaction of beta-actin along the channel of the TCP-1 cytoplasmic chaperonin. Biophys. J. 67, 364-368.

Marco, S., Urena, D., Carrascosa, J. L., Waldmann, T., Peters, J., Hegerl, R., Pfeifer, G., Sack-Kongehl, H., and Baumeister, W. (1994). The molecular chaperone TF55. Assessment of symmetry. FEBS Lett. 341, 152-155.

Markovtsov, V., Mustaev, A., and Goldfarb, A. (1996). Protein-RNA interactions in the active center of transcription elongation complex. Proc. Natl. Acad. Sci. U.S.A. 93, 3221-3226.

Marsh, T. L., Reich, C. I., Whitelock, R. B., and Olsen, G. J. (1994). Transcription factor IID in the Archaea: sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes. Proc. Natl. Acad. Sci. USA 91, 4180-4184.

Martin, J., Mayhew, M., Langer, T., and Hartl, F. U. (1993). The reaction cycle of GroEL and GroES in chaperonin-assisted protein folding. Nature 366, 228-233.

Matsumoto, K., Takii, T., and Okada, N. (1989). Characterization of a new termination signal for RNA polymerase III responsible for generation of a discrete-sized RNA transcribed from salmon total genomic DNA in a HeLa cell extract. J. Biol. Chem. 264, 1124-1131.

Mazabraud, A., Scherly, D., Muller, F., Rungger, D., and Clarkson, S. G. (1987). Structure and transcription termination of a lysine tRNA gene from Xenopus laevis. J. Mol. Biol. 195, 835-845.

McCarty, J. S., Rudiger, S., Schonfeld, H. J., Schneider-Mergener, J., Nakahigashi, K., Yura, T., and Bukau, B. (1996). Regulatory region C of the E. coli heat shock transcription factor, sigma32, constitutes a DnaK binding site and is conserved among eubacteria. J. Mol. Biol. 256, 829-837.

McLennan, N. F., Girshovich, A. S., Lissin, N. M., Charters, Y., and Masters, M. (1993). The strongly conserved carboxyl-terminus glycine-methionine motif of the Escherichia coli GroEL chaperonin is dispensable. Mol. Microbiol. 7, 49-58.

McStay, B., and Reeder, R. H. (1990). A DNA-binding protein is required for termination of transcription by RNA polymerase I in Xenopus laevis. Mol. Cell. Biol. 10, 2793-2800.

Minami, Y., Hohfeld, J., Ohtsuka, K., and Hartl, F. U. (1996). Regulation of the heat-shock protein 70 reaction cycle by the mammalian DnaJ homolog, Hsp40. J. Biol. Chem. 271, 19617-19624.

Mojica, F. J., Juez, G., and Rodriguez-Valera, F. (1993). Transcription at different salinities of Haloferax mediterranei sequences adjacent to partially modified PstI sites. Mol. Microbiol. 9, 613-621.

Morimoto, R. I. (1993). Cells in stress: transcriptional activation of heat shock genes. Science 259, 1409-1410.

Morimoto, R. I., Sarge, K. D., and Abravaya, K. (1992). Transcriptional regulation of heat shock genes. A paradigm for inducible genomic responses. J. Biol. Chem. 267, 21987-21990.

Mullakhanbhai, M. F., and Larsen, H. (1975). Halobacterium volcanii spec. nov., a Dead Sea halobacterium with a moderate salt requirement. Arch. Microbiol. 104, 207-214.

Muller, B., Allmansberger, R., and Klein, A. (1985). Termination of a transcription unit comprising highly expressed genes in the archaebacterium Methanococcus voltae. Nucleic Acids Res 13, 6439-6445.

Nagai, H., Yuzawa, H., Kanemori, M., and Yura, T. (1994). A distinct segment of the sigma 32 polypeptide is involved in DnaK-mediated negative control of the heat shock response in Escherichia coli. Proc. Natl. Acad. Sci. USA 91, 10280-10284.

Nagai, H., Yuzawa, H., and Yura, T. (1991). Interplay of two cis-acting mRNA regions in translational control of sigma 32 synthesis during the heat shock response of Escherichia coli. Proc. Natl. Acad. Sci. USA 88, 10515-10519.

Nagai, H., Yuzawa, H., and Yura, T. (1991). Regulation of the heat shock response in E. coli: involvement of positive and negative cis-acting elements in translation control of sigma 32 synthesis. Biochimie 73, 1473-9.

Nieuwlandt, D. T., Carr, M. B., and Daniels, C. J. (1993). In vivo processing of an intron-containing archaeal tRNA. Mol. Microbiol. 8, 93-99.

Nieuwlandt, D. T., and Daniels, C. J. (1990). An expression vector for the archaebacterium Haloferax volcanii. J. Bacteriol. 172, 7104-7110.

Nolling, J., and Reeve, J. N. (1997). Growth- and substrate-dependent transcription of the formate dehydrogenase (fdhCAB) operon in Methanobacterium thermoformicicum Z-245. J. Bacteriol. 179, 899-908.

Nudler, E., Avetissova, E., Markovtsov, V., and Goldfarb, A. (1996). Transcription processivity: protein-DNA interactions holding together the elongation complex. Science 273, 211-217.

Nudler, E., Goldfarb, A., and Kashlev, M. (1994). Discontinuous mechanism of transcription elongation. Science 265, 793-796.

Nudler, E., Kashlev, M., Nikiforov, V., and Goldfarb, A. (1995). Coupling between transcription termination and RNA polymerase inchworming. Cell 81, 351-357.

Olsen, G. J., and Woese, C. R. (1993). Ribosomal RNA: a key to phylogeny. FASEB J. 7, 113-123.

Orlova, M., Newlands, J., Das, A., Goldfarb, A., and Borukhov, S. (1995). Intrinsic transcript cleavage activity of RNA polymerase. Proc. Natl. Acad. Sci. USA 92, 4596-4600.

Ota, I. M., and Varshavsky, A. (1993). A yeast protein similar to bacterial two-component regulators. Science 262, 566-569.

Ouzounis, C., and Kyrpides, N. (1996). The emergence of major cellular processes in evolution. FEBS Lett. 390, 119-123.

Ouzounis, C., and Sander, C. (1992). TFIIB, an evolutionary link between the transcription machineries of archaebacteria and eukaryotes [letter]. Cell 71, 189-190.

Ouzounis, C. A., and Kyrpides, N. C. (1996). Parallel origins of the nucleosome core and eukaryotic transcription from Archaea. J. Mol. Evol. 42, 234-239.

Palmer, J. R., and Daniels, C. J. (1995). In vivo definition of an archaeal promoter. J. Bacteriol. 177, 1844-1849.

Palmer, J. R., and Daniels, C. J. (1994). A transcriptional reporter for in vivo promoter analysis in the archaeon Haloferax volcanii. Appl. Environ. Microbiol. 60, 3867-3869.

Palmer, J. R., Nieuwlandt, D. T., and Daniels, C. J. (1994). Expression of a yeast intron-containing tRNA in the archaeon Haloferax volcanii. J. Bacteriol. 176, 3820-3823.

Parsell, D. A., and Lindquist, S. (1994). Heat shock proteins and stress tolerance. In The biology of heat shock proteins and molecular chaperones. Volume 1, R. I. Morimoto, A. Tissieres and C. Georgopoulos, eds. (Cold Spring Harbor Laboratory Press), pp. 457-493.

Paul, L., and Krzycki, J. A. (1996). Sequence and transcript analysis of a novel Methanosarcina barkeri methyltransferase II homolog and its associated corrinoid protein homologous to methionine synthase. J. Bacteriol. 178, 6599-6607.

Phipps, B. M., Hoffmann, A., Stetter, K. O., and Baumeister, W. (1991). A novel ATPase complex selectively accumulated upon heat shock is a major cellular component of thermophilic archaebacteria. EMBO J. 10, 1711-22.

Phipps, B. M., Typke, D., Hegerl, R., Volker, S., Hoffmann, A., Stetter, K. O., and Baumeister, W. (1993). Structure of a molecular chaperone from a thermophilic archaebacterium. Nature 361, 475-477.

Picketts, D. J., Mayanil, C. S., and Gupta, R. S. (1989). Molecular cloning of a Chinese hamster mitochondrial protein related to the "chaperonin" family of bacterial and plant proteins. J. Biol. Chem. 264, 12001-12008.

Platt, T. (1994). Rho and RNA: models for recognition and response. Mol. Microbiol. 11, 983-990.

Platt, T. (1996). RNA structure in transcription elongation, termination, and antitermination. In RNA structure and function. R. Simons and M. Grunberg-Manago, eds. (New York: Cold Spring Harbor Press).

Platt, T. (1981). Termination of transcription and its regulation in the tryptophan operon of E. coli. Cell 24, 10-23.

Platt, T. (1986). Transcription termination and the regulation of gene expression. Ann. Rev. Biochem. 55, 339-372.

Ponce, M. R., and Micol, J. L. (1992). PCR amplification of long DNA fragments. Nucl. Acids Res. 20, 623.

Prangishvili, D., Zillig, W., Gierl, A., Biesert, L., and Holz, I. (1982). DNA-dependent RNA polymerase of thermoacidophilic archaebacteria. Eur. J. Biochem. 122, 471-477.

Puhler, G., Leffers, H., Gropp, F., Palm, P., Klenk, H. P., Lottspeich, F., Garrett, R. A., and Zillig, W. (1989). Archaebacterial DNA-dependent RNA polymerases testify to the evolution of the eukaryotic nuclear genome. Proc. Natl. Acad. Sci. USA 86, 4569-4573.

Puhler, G., Lottspeich, F., and Zillig, W. (1989). Organization and nucleotide sequence of the genes encoding the large subunits A, B and C of the DNA-dependent RNA polymerase of the archaebacterium Sulfolobus acidocaldarius. Nucl. Acids Res. 17, 4517-4534.

Quaite-Randall, E., Trent, J. D., Josephs, R., and Joachimiak, A. (1995). Conformational cycle of the archaeosome, a TCP1-like chaperonin from Sulfolobus shibatae. J. Biol. Chem. 270, 28818-28823.

Qureshi, S. A., Baumann, P., Rowlands, T., Khoo, B., and Jackson, S. P. (1995).a. Cloning and functional analysis of the TATA binding protein from Sulfolobus shibatae. Nucl. Acids Res. 23, 1775-1781.

Qureshi, S. A., Khoo, B., Baumann, P., and Jackson, S. P. (1995).b. Molecular cloning of the transcription factor TFIIB homolog from Sulfolobus shibatae. Proc. Natl. Acad. Sci. USA 92, 6077-6081.

Ramirez, C., Shimmin, L. C., Newton, C. H., Matheson, A. T., and Dennis, P. P. (1989). Structure and evolution of the L11, L1, L10, and L12 equivalent ribosomal proteins in eubacteria, archaebacteria, and eucaryotes [published erratum appears in Can. J. Microbiol. 1989 Oct;35(10):975]. Can. J. Microbiol. 35, 234-244.

Reeder, R. H., and Lang, W. (1994). The mechanism of transcription termination by RNA polymerase I. Mol. Microbiol. 12, 11-15.

Reines, D. (1994). Nascent RNA cleavage by transcription elongation complexes. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 263-278.

Reines, D., Conaway, J. W., and Conaway, R. C. (1996). The RNA polymerase II general elongation factors. Trends Biochem. Sci. 21, 351-355.

Reines, D., Wells, D., Chamberlin, M. J., and Kane, C. M. (1987). Identification of intrinsic termination sites in vitro for RNA polymerase II within eukaryotic gene sequences. J. Mol. Biol. 196, 299-312.

Reiter, W. D., Hudepohl, U., and Zillig, W. (1990). Mutational analysis of an archaebacterial promoter: essential role of a TATA box for transcription efficiency and start-site selection in vitro. Proc. Natl. Acad. Sci. USA 87, 9509-13.

Reiter, W. D., Palm, P., and Zillig, W. (1988). Transcription termination in the archaebacterium Sulfolobus: signal structures and linkage to transcription initiation. Nucleic Acids Res 16, 2445-59.

Reynolds, R., and Chamberlin, M. J. (1992). Parameters affecting transcription termination by Escherichia coli RNA polymerase. II. Construction and analysis of hybrid terminators. J. Mol. Biol. 224, 53-63.

Richarme, G., and Kohiyama, M. (1994). Amino acid specificity of the Escherichia coli chaperone GroEL (heat shock protein 60). J. Biol. Chem. 269, 7095-7098.

Roberts, S., Purton, T., and Bentley, D. L. (1992). A protein-binding site in the c-myc promoter functions as a terminator for RNA polymerase II transcription. Genes Dev. 6, 1562-1574.

Roeder, R. G. (1996). The role of general initiation factors in transcription by RNA polymerase II. Trends Biochem. Sci. 21, 327-35.

Rowlands, T., Baumann, P., and Jackson, S. P. (1994). The TATA-binding protein: a general transcription factor in eukaryotes and archaebacteria. Science 264, 1326-1329.

Ruepp, A., and Soppa, J. (1996). Fermentative arginine degradation in Halobacterium salinarium (formerly Halobacterium halobium): genes, gene products, and transcripts of the arcRACB gene cluster. J. Bacteriol. 178, 4942-4947.

Sambrook, J., Maniatis, T., and Fritsch, E. F. (1989). Molecular Cloning: a laboratory manual, 2nd ed. (Plainview, NY: Cold Spring Harbor Press).

Sandman, K., Krzycki, J. A., Dobrinski, B., Lurz, R., and Reeve, J. N. (1990). HMf, a DNA-binding protein isolated from the hyperthermophilic archaeon Methanothermus fervidus, is most closely related to histones. Proc. Natl. Acad. Sci. USA 87, 5788-5791.

Sandman, K., Perler, F. B., and Reeve, J. N. (1994). Histone-encoding genes from Pyrococcus: evidence for members of the HMf family of archaeal histones in a non-methanogenic Archaeon. Gene 150, 207-208.

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463.

Sarge, K. D., Murphy, S. P., and Morimoto, R. I. (1993). Activation of heat shock gene transcription by heat shock factor 1 involves oligomerization, acquisition of DNA-binding activity, and nuclear localization and can occur in the absence of stress [published errata appear in Mol. Cell. Biol. 1993 May;13(5):3122-3 and 1993 Jun;13(6):3838-9]. Mol. Cell. Biol. 13, 1392-407.

Schnabel, R., Zillig, W., and Schnabel, H. (1982). Component E of the DNA-dependent RNA polymerase of the archaebacterium Thermoplasma acidophilum is required for the transcription of native DNA. Eur. J. Biochem. 129, 473-477.

Schneider, D., Gold, L., and Platt, T. (1993). Selective enrichment of RNA species for tight binding to Escherichia coli rho factor. FASEB J. 7, 201-7.

Sentenac, A. (1985). Eukaryotic RNA polymerases. CRC Crit. Rev. Biochem. 18, 31-90.

Shaaban, S. A., Krupp, B. M., and Hall, B. D. (1995). Termination-altering mutations in the second-largest subunit of yeast RNA polymerase III. Mol. Cell. Biol. 15, 1467-1478.

Sharp, P. A. (1992). TATA-binding protein is a classless factor. Cell 68, 819-821.

Shimmin, L., and Dennis, P. (1996). Conserved sequence elements involved in regulation of ribosomal protein gene expression in halophilic archaea. J. Bacteriol. 178, 4737-4741.

Shimmin, L. C., Newton, C. H., Ramirez, C., Yee, J., Downing, W. L., Louie, A., Matheson, A. T., and Dennis, P. P. (1989). Organization of genes encoding the L11, L1, L10, and L12 equivalent ribosomal proteins in eubacteria, archaebacteria, and eucaryotes. Can. J. Microbiol. 35, 164-170.

Shuman, S., and Moss, B. (1988). Factor-dependent transcription termination by vaccinia virus RNA polymerase. Evidence that the cis-acting termination site is in nascent RNA. J. Biol. Chem. 263, 6220-6225.

Smale, S. T. (1994). Core promoter architecture for eukaryotic protein-coding genes. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 63-81.

Smid, A., Finsterer, M., and Grummt, I. (1992). Limited proteolysis unmasks specific DNA-binding of the murine RNA polymerase I-specific transcription termination factor TTFI. J. Mol. Biol. 227, 635-47.

Sorger, P. K. (1990). Yeast heat shock factor contains separable transient and sustained response transcriptional activators. Cell 62, 793-805.

Sorger, P. K., and Pelham, H. R. (1988). Yeast heat shock factor is an essential DNA-binding protein that exhibits temperature-dependent phosphorylation. Cell 54, 855-864.

Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.

Sowers, K. R., Thai, T. T., and Gunsalus, R. P. (1993). Transcriptional regulation of the carbon monoxide dehydrogenase gene (cdhA) in Methanosarcina thermophila. J. Biol. Chem. 268, 23172-23178.

Spicher, K., Nuernberg, B., Jager, B., Rosenthal, W., and Schultz, G. (1992). Heterogeneity of three electrophoretically distinct Go alpha-subunits in mammalian brain. FEBS Lett 307, 215-218.

Starich, M. R., Sandman, K., Reeve, J. N., and Summers, M. F. (1996). NMR structure of HMfB from the hyperthermophile, Methanothermus fervidus, confirms that this archaeal protein is a histone. J. Mol. Biol. 255, 187-203.

Steinmetz, E. J., and Platt, T. (1994). Evidence supporting a tethered tracking model for helicase activity of Escherichia coli Rho factor. Proc. Natl. Acad. Sci. USA 91, 1401-1405.

Straus, D. B., Walter, W. A., and Gross, C. A. (1987). The heat shock response of E. coli is regulated by changes in the concentration of sigma 32. Nature 329, 348-351.

Sweetser, D., Nonet, M., and Young, R. A. (1987). Prokaryotic and eukaryotic RNA polymerases have homologous core subunits. Proc. Natl. Acad. Sci. USA 84, 1192-1196.

Takagi, Y., Conaway, J. W., and Conaway, R. C. (1995). A novel activity associated with RNA polymerase II elongation factor SIII. SIII directs promoter-independent transcription initiation by RNA polymerase II in the absence of initiation factors. J. Biol. Chem. 270, 24300-24305.

Tantravahi, J., Alvira, M., and Falck-Pedersen, E. (1993). Characterization of the mouse beta maj globin transcription termination region: a spacing sequence is required between the poly(A) signal sequence and multiple downstream termination elements. Mol. Cell. Biol. 13, 578-587.

Telesnitsky, A., and Chamberlin, M. J. (1989). Terminator-distal sequences determine the in vitro efficiency of the early terminators of bacteriophages T3 and T7. Biochemistry 28, 5210-5218.

Thomas, P. S. (1980). Hybridization of denatured RNA and small DNA fragments transferred to nitrocellulose. Proc. Natl. Acad. Sci. U. S. A. 77, 5201-5205.

Thomm, M. (1996). Archaeal transcription factors and their role in transcription initiation. FEMS Microbiol. Rev. 18, 159-171.

Thomm, M., Hausner, W., and Hethke, C. (1994). Transcription factors and termination of transcription in Methanococcus. System. Appl. Micro. 16, 148-155.

Thompson, L. D., Brandon, L. D., Nieuwlandt, D. T., and Daniels, C. J. (1989). Transfer RNA intron processing in the halophilic archaebacteria. Can. J. Microbiol. 35, 36-42.

Thompson, L. D., and Daniels, C. J. (1988). A tRNA(Trp) intron endonuclease from Halobacterium volcanii. Unique substrate recognition properties. J. Biol. Chem. 263, 17951-17959.

Tjian, R. (1996). The biochemistry of transcription in eukaryotes: a paradigm for subunit regulatory complexes. Phil. Trans. R. Soc. Lond. B Biol. Sci. 351, 491-499.

Trent, J. D. (1996). A review of acquired thermotolerance, heat-shock proteins, and molecular chaperones in archaea. FEMS Microbiol. Rev. 18, 249-258.

Trent, J. D., Gabrielsen, M., Jensen, B., Neuhard, J., and Olsen, J. (1994). Acquired thermotolerance and heat shock proteins in thermophiles from the three phylogenetic domains. J. Bacteriol. 176, 6148-6152.

Trent, J. D., Nimmesgern, E., Wall, J. S., Hartl, F. U., and Horwich, A. L. (1991). A molecular chaperone from a thermophilic archaebacterium is related to the eukaryotic protein t-complex polypeptide-1. Nature 354, 490-493.

Trent, J. D., Osipiuk, J., and Pinkau, T. (1990). Acquired thermotolerance and heat shock in the extremely thermophilic archaebacterium Sulfolobus sp. strain B12. J. Bacteriol. 172, 1478-1484.

Trieselmann, B. A., and Charlebois, R. L. (1992). Transcriptionally active regions in the genome of the archaebacterium Haloferax volcanii. J. Bacteriol. 174, 30-34.

Tyree, C. M., George, C. P., Lira-DeVito, L. M., Wampler, S. L., Dahmus, M. E., Zawel, L., and Kadonaga, J. T. (1993). Identification of a minimal set of proteins that is sufficient for accurate initiation of transcription by RNA polymerase II. Genes Dev. 7, 1254-1265.

Ueno, A., Baek, K., Jeon, C., and Agarwal, K. (1992). Netropsin specifically enhances RNA polymerase II termination at terminator sites in vitro. Proc. Natl. Acad. Sci. USA 89, 3676-3680.

Ursic, D., and Culbertson, M. R. (1992). [letter] Is yeast TCP-1 a chaperonin. Nature 356, 392.

Vinh, D. B., and Drubin, D. G. (1994). A yeast TCP-1-like protein is required for actin function in vivo. Proc. Natl. Acad. Sci. USA 91, 9116-9120.

Vodkin, M. H., and Williams, J. C. (1988). A heat shock operon in Coxiella burnetii produces a major antigen homologous to a protein in both mycobacteria and Escherichia coli. J. Bacteriol. 170, 1227-34.

Vorholt, J. A., Vaupel, M., and Thauer, R. K. (1997). A selenium-dependent and a selenium-independent formylmethanofuran dehydrogenase and their transcriptional regulation in the hyperthermophilic Methanopyrus kandleri. Mol. Microbiol. 23, 1033-1042.

Vuister, G. W., Kim, S. J., Orosz, A., Marquardt, J., Wu, C., and Bax, A. (1994). Solution structure of the DNA-binding domain of Drosophila heat shock transcription factor. Nat. Struct. Biol. 1, 605-614.

Waldmann, T., Lupas, A., Kellermann, J., Peters, J., and Baumeister, W. (1995).a. Primary structure of the thermosome from Thermoplasma acidophilum. Biol. Chem. Hoppe-Seyler 376, 119-126.

Waldmann, T., Nimmesgern, E., Nitsch, M., Peters, J., Pfeifer, G., Muller, S., Kellermann, J., Engel, A., Hartl, F. U., and Baumeister, W. (1995).b. The thermosome of Thermoplasma acidophilum and its relationship to the eukaryotic chaperonin TRiC. Eur. J. Biochem. 227, 848-856.

Waldmann, T., Nitsch, M., Klumpp, M., and Baumeister, W. (1995).c. Expression of an archaeal chaperonin in E. coli: formation of homo- (alpha, beta) and hetero-oligomeric (alpha+beta) thermosome complexes. FEBS Lett 376, 67-73.

Wang, D., Meier, T. I., Chan, C. L., Feng, G., Lee, D. N., and Landick, R. (1995). Discontinuous movements of DNA and RNA in RNA polymerase accompany formation of a paused transcription complex. Cell 81, 341-350.

Wang, Y., and von Hippel, P. H. (1993). Escherichia coli transcription termination factor rho. I. ATPase activation by oligonucleotide cofactors. J. Biol. Chem. 268, 13940-13946.

Westwood, J. T., and Wu, C. (1993). Activation of Drosophila heat shock factor: conformational change associated with a monomer-to-trimer transition. Mol. Cell. Biol. 13, 3481-3486.

Wettach, J., Gohl, H. P., Tschochner, H., and Thomm, M. (1995). Functional interaction of yeast and human TATA-binding proteins with an archaeal RNA polymerase and promoter. Proc. Natl. Acad. Sci. USA 92, 472-476.

Whittaker, R. H. (1959). On the broad classification of organisms. Q. Rev. Biol. 34, 210-226.

Wiederrecht, G., Seto, D., and Parker, C. S. (1988). Isolation of the gene encoding the S. cerevisiae heat shock transcription factor. Cell 54, 841-853.

Willison, K. R., and Kubota, H. (1994). The structure, function, and genetics of the chaperonin containing TCP-1 (CCT) in eukaryotic cytosol. In The biology of heat shock proteins and molecular chaperones. R. I. Morimoto, A. Tissieres and C. Georgopoulos, eds. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), pp. 299-312.

Wisniewski, J., Orosz, A., Allada, R., and Wu, C. (1996). The C-terminal region of Drosophila heat shock factor (HSF) contains a constitutively functional transactivation domain. Nucl. Acids Res. 24, 367-374.

Woese, C. R., and Fox, G. E. (1977). Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc. Natl. Acad. Sci. USA 74, 5088-5090.

Woese, C. R., Kandler, O., and Wheelis, M. L. (1990). Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. USA 87, 4576-4579.

Wolters, J., and Erdmann, V. A. (1989). The structure and evolution of archaebacterial ribosomal RNAs. Can. J. Microbiol. 35, 43-51.

Woychik, N. A., Liao, S. M., Kolodziej, P. A., and Young, R. A. (1990). Subunits shared by eukaryotic nuclear RNA polymerases. Genes Dev. 4, 313-323.

Woychik, N. A., and Young, R. A. (1994). Exploring RNA polymerase II structure and function. In Transcription Mechanisms and Regulation, R. C. Conaway and J. W. Conaway, eds. (New York: Raven Press), pp. 227-242.

Wu, C. (1995). Heat shock transcription factors: structure and regulation. Ann. Rev. Cell. Dev. Biol. 11, 441-469.

Xiao, H., Perisic, O., and Lis, J. T. (1991). Cooperative binding of Drosophila heat shock factor to arrays of a conserved 5 bp unit. Cell 64:585-593.

Xiao, H., and Lis, J. T. (1988). Germline transformation used to define key features of heat shock response elements. Science 239, 1139-1142.

Xie, Z., and Price, D. H. (1996). Purification of an RNA polymerase II transcript release factor from Drosophila. J. Biol. Chem. 271, 11043-11046.

Yaffe, M. B., Farr, G. W., Miklos, D., Horwich, A. L., Sternlicht, M. L., and Sternlicht, H. (1992). TCP1 complex is a molecular chaperone in tubulin biogenesis. Nature 358, 245-248.

Yager, T. D., and von Hippel, P. H. (1991). A thermodynamic analysis of RNA transcript elongation and termination in Escherichia coli. Biochemistry 30, 1097-1118.

Yager, T. D., and von Hippel, P. H. (1987). Transcript elongation and termination in Escherichia coli. In Escherichia coli and Salmonella typhimurium. 1st Edition, Volume 2, Neidhardt, F. C. et al. ed. (Washington D. C.: American Society for Microbiology). pp. 1241-1275.

Yang, X., and Price, C. W. (1995). Streptolydigin resistance can be conferred by alterations to either the beta or beta' subunits of Bacillus subtilis RNA polymerase. J. Biol. Chem. 270, 23930-23933.

Yankulov, K. Y., Pandes, M., McCracken, S., Bouchard, D., and Bentley, D. L. (1996). TFIIH functions in regulating transcriptional elongation by RNA polymerase II in Xenopus oocytes. Mol. Cell. Biol. 16, 3291-3299.

Yanofsky, C. (1992). Transcriptional regulation: elegance in design and discovery. In Transcriptional regulation. Volume 1, S. L. McKnight and K. R. Yamamoto, eds. (Cold Spring Harbor Laboratory Press), pp. 1-11.

Young, C. F., Kim, J. M., Molinari, E., and DasSarma, S. (1996). Genetic and topological analyses of the bop promoter of Halobacterium halobium: stimulation by DNA supercoiling and non-B-DNA structure. J. Bacteriol. 178:840-845.

Yu, G., Deschenes, R. J., and Fassler, J. S. (1995). The essential transcription factor, Mcm1, is a downstream target of Sln1, a yeast "two-component" regulator. J. Biol. Chem. 270, 8739-8743.

Yura, T., Nagai, H., and Mori, H. (1993). Regulation of the heat-shock response in bacteria. Annu. Rev. Microbiol. 47, 321-350.

Zalatan, F., Galloway-Salvo, J., and Platt, T. (1993). Deletion analysis of the Escherichia coli rho-dependent transcription terminator trp t'. J. Biol. Chem. 268, 17051-17056.

Zavriev, S. K., and Shemyakin, M. F. (1982). RNA polymerase-dependent mechanism for the stepwise T7 phage DNA transport from the virion into E. coli. Nucl. Acids Res. 10, 1635-1652.

Zaychikov, E., Denissova, L., and Heumann, H. (1995). Translocation of the Escherichia coli transcription complex observed in the registers 11 to 20: "jumping" of RNA polymerase and asymmetric expansion and contraction of the "transcription bubble". Proc. Natl. Acad. Sci. USA 92, 1739-1743.

Zillig, W., Klenk, H. P., Palm, P., Puhler, G., Gropp, F., Garrett, R. A., and Leffers, H. (1989). The phylogenetic relations of DNA-dependent RNA polymerases of archaebacteria, eukaryotes, and eubacteria. Can. J. Microbiol. 35, 73-80.

Zillig, W., Palm, P., Klenk, H.-P., Langer, D., Hudepohl, U., Hain, J., Lanzendorfer, M., and Holz, I. (1993). Transcription in archaea. In The biochemistry of Archaea. Kates, M. et al. ed. (Elsevier Science Publishers B. V.). pp. 367-391.

Zillig, W., Stetter, K. O., and Janekovic, D. (1979). DNA-dependent RNA polymerase from the archaebacterium Sulfolobus acidocaldarius. Eur. J. Biochem. 96, 597-604.

Zuker, M. (1994). Prediction of RNA secondary structure by energy minimization in Computer analysis of sequence data. In Methods Mol. Biol., A. M. Griffin and H. G. Griffin, eds. (Humana Press, Inc.), pp. 267-294.

Zwieb, C., Kim, J., and Adhya, S. (1991). Detection of DNA bending by gel electrophoresis: use of plasmid vectors. In A laboratory guide to in vitro studies of protein-DNA interactions. Volume 5, J. P. Jost and H. P. Saluz, eds. (Birkhauser Verlag Basel), pp. 245-257.
