GENOME-SCALE STUDIES OF RHO-DEPENDENT TERMINATION

IN ESCHERICHIA COLI

by Jason Matthew Peters

A dissertation submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy (Genetics)

at the UNIVERSITY OF WISCONSIN – MADISON 2012

Date of final oral examination: 8/10/12

The dissertation is approved by the following members of the Final Oral Committee:

Robert Landick, Professor

Gary Roberts, Professor Emeritus, Bacteriology

Richard Burgess, Professor Emeritus, Oncology

Audrey Gasch, Associate Professor, Genetics

Nicole Perna, Associate Professor, Genetics

Genome-scale studies of Rho-dependent transcription termination in Escherichia coli

Jason Matthew Peters

under the supervision of Professor Robert Landick

University of Wisconsin – Madison

Rho is a homohexameric, ring-shaped protein that translocates nascent RNA through its central cleft in an ATP-dependent manner, then dissociates RNA polymerase (RNAP) from RNA and template DNA. Despite decades of study, little was known about Rho association with transcription elongation complexes (ECs); sites of Rho termination across the E. coli chromosome; and the effects of the elongation factors NusG and NusA and the nucleoid-structuring protein H-NS on Rho termination. I found that Rho and RNAP have very similar distributions on DNA, suggesting that Rho associates with ECs early and throughout the process of transcription elongation. This association allows Rho to quickly respond to termination signals such as those that are unmasked when transcription and translation become uncoupled.

Prior to my studies, the sites of Rho termination in E. coli were mostly unknown. I identified Rho-dependent terminators by examining the distribution of RNAP on DNA in Rho-inhibited cells. I found that Rho terminates highly structured RNAs, such as transfer RNAs (tRNAs) and small RNAs (sRNAs). This finding was surprising, because it was thought that Rho could not associate with structured RNAs. I also found that Rho terminated a small set of novel antisense transcripts that occurred within genes. Upon examining the transcriptome of cells in which Rho was inhibited, I determined that widespread increases in antisense transcription occurred throughout the genome. This finding established that a major function of Rho in vivo is to prevent elongation of antisense transcripts.

Finally, I investigated the effects of NusG, NusA, and H-NS on Rho termination. I found that NusG enhances termination at a minority subset of Rho-dependent terminators that are defined by a low C/G ratio at the termination site. In contrast, a large deletion within nusA had no effect on Rho termination in vivo. I also identified a strong overlap between Rho-dependent terminators and H-NS binding sites, as well as genetic interactions between hns and rho. This led me to propose a model in which H-NS enhances Rho termination by slowing RNAP elongation, providing a longer kinetic window for Rho to terminate transcription.

Acknowledgements

I give my thanks to Bob, Rachel, and members of the Landick and Gourse labs for their patience in enduring more than half a decade's worth of bad jokes. I would also like to give special thanks to the transcription termination factor Rho. I couldn’t have done it without you, buddy.

Table of contents

Abstract ...... i

Acknowledgements ...... iii

Table of contents...... iv

List of figures ...... xiii

List of tables ...... xviii

Chapter 1 - Introduction ...... 1

Introduction to transcription termination ...... 2

Elongation complex stability and the definition of termination ...... 3

The signal and steps in the pathway of intrinsic termination ...... 7

The intrinsic termination signal ...... 7

Pausing at the intrinsic terminator ...... 8

Terminator hairpin nucleation ...... 11

Possible role of the partially formed terminator hairpin in pausing ...... 12

Commitment to the termination pathway ...... 15

Sequence conservation of intrinsic terminators ...... 16

The mechanism of intrinsic termination ...... 18

Changes in the nucleic acid scaffold during intrinsic termination ...... 19

Conformational changes in RNAP during intrinsic termination ...... 20

Intrinsic termination in other ...... 23

Introduction to Rho-dependent termination ...... 25

The Rho termination signal and steps in the pathway of Rho termination ...... 26

Rho binding to RNA ...... 26

RNA translocation through Rho...... 32

EC pausing at the site of termination ...... 33

Kinetic coupling ...... 34

Rho association with elongation complexes ...... 35

The mechanism of Rho termination ...... 36

Changes in the nucleic-acid scaffold during Rho termination ...... 37

Conformational changes in RNAP during Rho Termination ...... 39

Physiological functions and targets of Rho ...... 40

Targets of Rho ...... 40

Silencing of foreign DNA ...... 41

Suppression of antisense transcription ...... 42

The mechanism of Rho-dependent polarity ...... 42

Rho-ribosome competition for RNA ...... 43

Rho-ribosome competition for NusG ...... 43

Two-checkpoint model for polarity...... 47

Protein factors that may affect Rho termination ...... 47

NusA ...... 47

H-NS ...... 48

Conclusions ...... 50

Outline of thesis chapters ...... 51

References ...... 53

Chapter 2 - Rho trafficking on bacterial transcription units in vivo...... 69

Abstract ...... 70

Introduction ...... 71

Results ...... 76

Analysis of RNAP ChIP-chip signals on E. coli TUs ...... 76

Regulator trafficking on representative E. coli TUs ...... 77

σ70, NusA, NusG, and Rho associate with ECs in different patterns ...... 80

NusG apparent occupancy depends on TU length, not function ...... 89

Promoter-proximal RNAP peaks correlate with promoter-proximal NusA and Rho peaks ...... 92

Rho-dependent termination is not the primary cause of promoter-proximal RNAP peaks ...... 96

Discussion ...... 98

NusA, NusG, and Rho exhibit different patterns of EC association, but no TU-specific specialization ...... 98

σ70 appears to associate with ECs stochastically ...... 103

The mechanistic basis of promoter-proximal RNAP peaks ...... 104

Materials and Methods ...... 106

Acknowledgements ...... 122

Supplementary Figures ...... 123

References ...... 138

Chapter 3 - Rho directs widespread termination of intragenic and stable RNA transcription ...... 149

Abstract ...... 150

Introduction ...... 151

Results ...... 153

BCM Alters the Distribution of RNAP ...... 153

Rho termination at tRNAs ...... 162

Rho termination of sRNA synthesis ...... 166

Rho inhibition reveals novel antisense transcription ...... 168

Discussion ...... 172

Rho-dependent termination and stable RNA synthesis ...... 173

Rho terminates novel antisense transcripts in E. coli ...... 174

Rho-dependent termination and horizontal gene transfer ...... 176

Conclusion ...... 178

Materials and Methods ...... 179

Acknowledgements ...... 182

Supplementary Figures ...... 183

References ...... 193

Chapter 4 - Rho and NusG suppress pervasive antisense transcription in Escherichia coli ...... 199

Abstract ...... 201

Introduction ...... 201

Results ...... 206

A major function of Rho is suppression of antisense transcription ...... 206

Rho and H-NS silence transcription at the same genomic loci ...... 213

NusG principally assists termination at a minority subset of Rho-dependent terminators ...... 217

NusG is required at Rho-dependent terminators with sub-optimal nucleic-acid sequences ...... 220

REP elements also are associated with Rho-dependent terminators...... 224

A classic nusA deletion has no effect on Rho-dependent termination in vivo ...... 226

Discussion ...... 228

Impact of antisense transcription on sense transcription ...... 228

H-NS may assist Rho in silencing antisense transcription ...... 231

Mechanism of NusG enhancement and relevance to polarity models ...... 232

Analogous control of antisense transcription in bacteria and eukaryotes ...... 234

Materials and Methods ...... 235

Acknowledgements ...... 242

Supplementary Figures ...... 243

References ...... 251

Chapter 5 - Conclusions and Future Directions ...... 260

Conclusions ...... 261

Future Directions ...... 264

Rho crosslinking and association with ECs ...... 264

Rho binding to structured RNAs ...... 265

Rho-dependent termination in other bacteria ...... 266

Mechanism of termination enhancement by NusG...... 268

Roles of NusA in vivo ...... 269

The mechanistic basis for Rho-H-NS synergy ...... 271

References ...... 273

Appendix A - Synthetic lethal screening identifies a genetic connection between transcript cleavage and central metabolism ...... 277

Introduction ...... 278

Results ...... 281

A screen for mutations that are synthetic lethal with ΔgreA (slgA mutants) ...... 281

All three slgA mutations are transposon insertions in pgi ...... 284

Discussion ...... 294

Materials and Methods ...... 297

Acknowledgements ...... 299

References ...... 300

Appendix B - Mutations in rho identify a novel mechanism of ethanol tolerance in Escherichia coli ...... 302

Introduction ...... 303

Results ...... 304

The ethanol-tolerant strain, MTA156, contains a mutation in the rho gene that is sufficient for ethanol tolerance ...... 304

Strains containing the rho(L270M) mutation are defective for transcription termination ...... 305

Altered fatty acid composition in rho(L270M) may relate to ethanol tolerance ..... 305

Discussion ...... 313

Materials and Methods ...... 314

Acknowledgements ...... 316

References ...... 317

Appendix C - Investigating the NusG-S10 model of transcription-translation coupling ...... 319

Introduction ...... 320

Results ...... 324

NusG and S10 interact in the bacterial two-hybrid assay ...... 324

A screen for substitutions that disrupt the interaction between NusG and S10 .... 324

Substitutions in S10 residues M88, D97, V98, and S101 disrupt the interaction between NusG and S10 ...... 331

Discussion ...... 338

Materials and Methods ...... 338

Acknowledgements ...... 344

References ...... 345

Appendix D - Generating strains for use in genome-scale phenotypic mapping to investigate transcription ...... 347

Introduction ...... 348

Results ...... 350

Generation of mutant strains for phenotype mapping ...... 350

Discussion ...... 352

Materials and Methods ...... 354

Acknowledgements ...... 358

References ...... 359

Appendix E - Investigating nusA essentiality in Escherichia coli...... 361

Introduction ...... 362

Results ...... 364

A partial deletion of nusA (ΔnusA* ) is not a null allele ...... 364

Expression of the NusA-NTD alone is insufficient for MDS42 viability ...... 365

Unexpected properties of the rho(E134D) ΔnusA* strain ...... 369

Discussion ...... 370

Materials and Methods ...... 371

Acknowledgements ...... 373

References ...... 374

List of Figures

Chapter 1 - Introduction ...... 1

Figure 1.1. Structure of the EC...... 4

Figure 1.2. Mechanisms of intrinsic termination...... 9

Figure 1.3. Sequence features of intrinsic terminators...... 13

Figure 1.4. Steps prior to Rho termination...... 27

Figure 1.5. Crystal structures of Rho...... 30

Figure 1.6. Two-checkpoint mechanism to suppress Rho termination in protein-coding genes (polarity)...... 44

Chapter 2 - Rho trafficking on bacterial transcription units in vivo...... 69

Figure 2.1. Bacterial regulators of transcript elongation...... 72

Figure 2.2. Apparent occupancy profiles of RNAP and regulators on representative TUs...... 79

Figure 2.3. Mid-TU regulator signals correlated with RNAP signals...... 82

Figure 2.4. Aggregate apparent occupancy for highly expressed TUs...... 85

Figure 2.5. Gene-averaged regulator/RNAP ratios...... 90

Figure 2.6. Frequency of Rho and NusA co-occurrence for RNAP peaks associated with genes...... 94

Figure 2.7. Model of transcription regulator trafficking during initiation to elongation transition...... 100

Figure 2.S1. RNAP (β´) peak Occapp versus σ70 peak Occapp ...... 123

Figure 2.S2. Offset of RNAP peaks from σ70 peaks ...... 124

Figure 2.S3. Shape and location of RNAP and σ70 peaks with and without treatment of cells with rifampicin...... 126

Figure 2.S4. Comparison of NusG profiles ...... 129

Figure 2.S5. RT-PCR quantitation of RNA levels near transcription start sites and within TUs ...... 130

Figure 2.S6. Fraction of σ70 peaks not associated with transcripts (Reppas et al. 2006) for which NusA or Rho peaks occur within 300 bp of an associated RNAP peak, binned by RNAP peak height ...... 132

Figure 2.S7. Comparison of aggregate Occapp profiles from TUs exhibiting promoter-proximal RNAP peaks to those for TUs regulated by attenuation (trp and pyrBI). ... 136

Figure 2.S8. Effect of Rho inhibition by bicyclomycin on expression of genes (Cardinale et al. 2008) as a function of traveling ratio...... 138

Chapter 3 - Rho directs widespread termination of intragenic and stable RNA transcription ...... 149

Figure 3.1. Global effects of Rho inhibition on the distribution of RNAP...... 154

Figure 3.2. Locations of BSRs and BSR associated genes across the E. coli chromosome...... 158

Figure 3.3. Rho termination at tRNAs and sRNAs ...... 163

Figure 3.4. Rho inhibition reveals antisense transcription...... 169

Figure 3.S1. Comparison of BSR positions to genes reported to be affected by BCM in expression profiling experiments from Cardinale et al ...... 183

Figure 3.S2. Quantitative PCR confirmation of ChIP-chip results ...... 183

Figure 3.S3. BCM effect on the distribution of RNAP at the thrW tRNA ...... 183

Figure 3.S4. BCM effect on the distribution of RNAP at the rygD sRNA ...... 183

Figure 3.S5. Transcriptional readthrough from tRNA onto K-12 specific genes and prophage elements ...... 183

Chapter 4 - Rho and NusG suppress pervasive antisense transcription in Escherichia coli ...... 199

Figure 4.1. Regulators of transcript elongation in bacteria...... 202

Figure 4.2. Genome-wide analysis of Rho-dependent transcription termination...... 207

Figure 4.3. Effects of Rho inhibition at class I and class II Rho-dependent terminators...... 210

Figure 4.4. Spatial and functional associations between H-NS and Rho-dependent termination...... 214

Figure 4.5. Effects of ΔnusG and ΔnusA* on Rho-dependent termination...... 218

Figure 4.6. Basis for NusG effects on Rho-dependent termination...... 222

Figure 4.7. Models of antisense transcription termination by Rho...... 222

Figure 4.S1. Increased antisense transcription due to Rho inhibition does not affect sense transcription ...... 243

Figure 4.S2. The combination of rho15(Ts) and Δhns is synthetic lethal ...... 243

Figure 4.S3. Motifs found near Rho-dependent terminators correspond to REP elements ...... 243

Figure 4.S4. Confirmation of the ΔnusA* allele ...... 243

Appendix A - Synthetic lethal screening identifies a genetic connection between transcript cleavage and central metabolism ...... 277

Figure A.1. Overview of synthetic lethal screening...... 279

Figure A.2. slgA greA double mutants are viable at 37 °C...... 285

Figure A.3. slgA greA double mutants are temperature-sensitive for growth...... 287

Figure A.4. Growth of pgi greA on M9 minimal glucose medium...... 290

Figure A.5. Growth of pgi greA on M9 minimal fructose-6-phosphate medium...... 293

Appendix B - Mutations in rho identify a novel mechanism of ethanol tolerance in Escherichia coli ...... 302

Figure B.1. Readthrough of a Rho-dependent terminator at the rho locus ...... 306

Figure B.2. Readthrough of a Rho-dependent terminator upstream of fabF...... 308

Figure B.3. Mutations in fabF affect growth of wild type and rho(L270M) mutant in 5% ethanol...... 311

Appendix C - Investigating the NusG-S10 model of transcription-translation coupling ...... 319

Figure C.1. Structural model of the NusG-S10-NusB complex...... 321

Figure C.2. The bacterial two-hybrid assay...... 325

Figure C.3. NusG and S10 interact in the bacterial two-hybrid assay...... 327

Figure C.4. Quantification of bacterial two-hybrid interactions...... 329

Figure C.5. Substitutions in S10 that affect the binding of NusG to S10...... 332

Figure C.6. Substitutions in S10 that eliminate binding to NusG have little effect on the S10-NusB interaction...... 334

Appendix E - Investigating nusA essentiality in Escherichia coli...... 361

Figure E.1. PCR analysis of nusA recombinants...... 366

List of Tables

Chapter 3 - Rho directs widespread termination of intragenic and stable RNA transcription ...... 149

Table 3.1. BSR Annotation Summary ...... 160

Appendix A - Synthetic lethal screening identifies a genetic connection between transcript cleavage and central metabolism ...... 277

Table A.1. Synthetic lethal screening...... 282

Table A.2. Strains used in this study...... 298

Appendix B - Mutations in rho identify a novel mechanism of ethanol tolerance in Escherichia coli ...... 302

Table B.1. Strains and primers used in this study...... 315

Appendix C - Investigating the NusG-S10 model of transcription-translation coupling ...... 319

Table C.1. Strains, plasmids, and primers used in this study...... 339

Appendix D - Generating strains for use in genome-scale phenotypic mapping to investigate transcription ...... 347

Table D.1. Mutations selected for phenotype mapping ...... 351

Table D.2. Strains, plasmids, and primers used in this study ...... 355

Appendix E - Investigating nusA essentiality in Escherichia coli...... 361

Table E.1. PCR analysis of nusA recombinants...... 368

Table E.2. Strains, plasmids, and primers used in this study...... 372

Chapter 1

Introduction

This chapter has been published in part (Jason M. Peters, Abbey D. Vangeloff, and Robert Landick 2011. Bacterial transcription terminators: the RNA 3´-end chronicles. Journal of Molecular Biology 412:793-813).

Introduction to transcription termination

The mechanism of transcription termination by bacterial RNA polymerase (RNAP) has been a focus of studies since the earliest days of bacterial molecular genetics. Although the basic framework of both intrinsic and factor-dependent termination (Rho-independent and Rho-dependent termination, respectively) has been known for many years, the detailed molecular mechanisms by which terminators destabilize and dissociate the elongating transcription complex (EC) have remained elusive. This uncertainty about the nature and order of molecular contacts and rearrangements is all the more remarkable given the detailed structural knowledge of ECs now available. Complicating this uncertainty about fundamental mechanisms, especially for intrinsic termination, has been the need to annotate termination sites in bacterial genomes computationally (d'Aubenton Carafa et al. 1990; Washio et al. 1998; Ermolaeva et al. 2000; Lesnik et al. 2001; Unniraman et al. 2002; Hosid and Bolshoy 2004; de Hoon et al. 2005; Kingsford et al. 2007; Mitra et al. 2008; Mitra et al. 2009; Mitra et al. 2010). Although vitally important for parsing the burgeoning collection of bacterial genome sequences, computational methods are limited by our knowledge of what constitutes a validated terminator. Multiple models for the molecular basis of termination have been posited and substantiated to varying degrees (Santangelo and Roberts 2004; Park and Roberts 2006; Epshtein et al. 2007; Larson et al. 2008; Epshtein et al. 2010), making this an appropriate time to review bacterial termination mechanisms.

We will focus only on intrinsic and Rho-dependent termination, which are thought to be the principal mechanisms by which bacterial transcription units are defined and by which bacterial gene expression is regulated during transcript elongation. We will consider first the molecular determinants of EC stability, since reduction of EC stability is the principal requirement for terminating transcription. We will then describe the determinants and mechanisms of intrinsic and Rho-dependent termination.

Elongation complex stability and the definition of termination

The structure of bacterial ECs and the molecular determinants of their exceptional stability are relatively well understood. Once RNAP has escaped the initiation phase of transcription, it maintains a canonical set of contacts to RNA and DNA that allow it to transcribe >10⁴ bp without dissociation. These include contacts to ~18 bp of not-yet-transcribed duplex DNA downstream of the active site, an 8-10 bp RNA:DNA hybrid within a 12-14 nt transcription bubble, and ~5 nt of ssRNA in an RNA exit channel (Fig. 1.1; Komissarova and Kashlev 1998; Korzheva et al. 2000). The exit channel is separated from the main channel, which holds the hybrid, by the lid domain. The lid covers the RNA -11 nt, which is held in a shallow pocket by weak exit-channel contacts (Fig. 1.1c; Vassylyev et al. 2007b). The DNA duplex reforms as the template strand exits the main channel, but does not make strong RNAP contacts (Fig. 1.1a); single-molecule FRET experiments on yeast RNAPII ECs place it near the clamp domain (Andrecka et al. 2009) and footprinting experiments confirm that upstream DNA is not protected by RNAP (Zaychikov et al. 1995; Wang and Landick 1997). Substrate NTPs are thought to enter the active site through a secondary channel that is separated from the main cleft by the bridge helix. Rapid nucleotide addition requires folding of the trigger loop (TL) into antiparallel trigger helices that pack against the bridge helix in a 3-helix bundle and contact the bound NTP substrate. Downstream DNA, the RNA:DNA hybrid, and the exiting RNA are surrounded by semi-mobile RNAP domains called the clamp, protrusion, and lobe (Fig. 1.1a).

Figure 1.1. Structure of the EC. (A) Model of EC based on a T. thermophilus EC crystal structure (PDB ID 2o5i; Vassylyev et al. 2007b). (B) Cutaway view of EC model showing locations of open and closed conformations of the trigger loop (PDB IDs 1iw7 and 2o5j, respectively; Vassylyev et al. 2002; Vassylyev et al. 2007a) and the location of the lid separating the RNA exit and main channels. (C) Close-up view of the RNA exit channel with the flap tip removed and the RNA nucleotides numbered for an EC in the pretranslocated register.

In vitro, ECs containing Escherichia coli RNAP are stable to 1 M NaCl or up to 65 °C (Wilson and von Hippel 1994; Nudler et al. 1996; Komissarova et al. 2002). This remarkable stability is attributable principally to RNAP contacts to the RNA:DNA hybrid (Nudler et al. 1997; Korzheva et al. 1998; Komissarova et al. 2002), but significant contributions are also made by contacts to the downstream DNA duplex and exiting RNA (Nudler et al. 1996; Komissarova and Kashlev 1998).

Termination occurs when these contacts are sufficiently destabilized that the rate of EC inactivation and eventual dissociation becomes significant relative to the rate at which the next nucleotide is added to the growing RNA transcript (von Hippel and Yager 1992). This branched kinetic mechanism has several important consequences. First, the rate of transcript elongation defines a kinetic window within which termination at any given DNA position is possible. For this reason, transcriptional pausing is thought to be the first step in a termination pathway and a prerequisite to efficient termination. Second, true termination requires dissociation of the EC, with release of RNA and DNA from RNAP. Some paused ECs instead become arrested on DNA (i.e., transcriptionally inactivated without dissociating), via either backtracking of the RNA and DNA chains through RNAP or other processes; such arrested ECs have not terminated. Finally, the kinetic branch between elongation and termination may occur prior to actual EC dissociation. Formation of an inactivated EC that eventually dissociates, rather than direct EC dissociation, may compete with elongation (Yin et al. 1999), but the structure of the inactivated EC remains unresolved (Berlin and Yanofsky 1983; Gusarov and Nudler 1999; Kashlev and Komissarova 2002; Epshtein et al. 2007). Thus, termination may occur in at least three steps: an initial pause, formation of a termination intermediate, and dissociation of the EC.

The intrinsic termination signal and steps in the pathway of intrinsic termination

Intrinsic termination, sometimes called Rho-independent termination, refers to dissociation of the EC caused solely by interactions of DNA and RNA with RNAP without the assistance of auxiliary transcription regulators. In E. coli and many other bacteria, intrinsic terminators are found at the end of operons where they form mRNA 3´ ends, and also between, within, or upstream from genes where they can regulate transcription via attenuation. Intrinsic terminators exhibit canonical common features and characteristics, but vary in sequence, termination efficiency, and mechanism.

The intrinsic termination signal

An intrinsic terminator is characterized by a GC-rich dyad repeat followed by a stretch of Ts in the nontemplate DNA strand that, when transcribed into RNA, forms a GC-rich stem-loop structure (or hairpin) followed by a 7-8 nt U-rich tract in the RNA:DNA hybrid (Rosenberg and Court 1979; d'Aubenton Carafa et al. 1990). The downstream DNA sequence also plays a role at some intrinsic terminators (Telesnitsky and Chamberlin 1989a; Epshtein et al. 2007; Martinez-Trujillo et al. 2010). An A-tract is sometimes present upstream from the hairpin, but this typically reflects encoding of a bi-directional intrinsic terminator; addition of an A-tract does not enhance sense-strand termination (Wilson and von Hippel 1995). The terminator hairpin and U-tract appear to be universal features of intrinsic terminators, but differences in exact sequence and structure cause variations in the efficiency of termination and in mechanism (Larson et al. 2008). The intrinsic termination signal causes dissociation of ECs in discrete steps whose order and function are now relatively well understood: (1) a transcriptional pause, (2) hairpin nucleation, (3) EC disruption by hairpin completion, and (4) EC dissociation (Fig. 1.2).
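
To make the sequence signature described above concrete, the following minimal sketch scans a nontemplate-strand DNA sequence for a perfect GC-rich inverted repeat followed immediately by a T-rich tract. It is a naive heuristic written for illustration only. The function names, the thresholds (stem length, loop size, GC fraction, number of Ts), and the test sequence are all assumptions and are not drawn from the source. The published terminator predictors cited in this chapter use more elaborate scoring than this pattern match.

```python
"""Naive scan for intrinsic-terminator-like signals (illustrative heuristic)."""

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_terminator_like_sites(dna, stem_len=6, loop_range=(3, 8),
                               tract_len=8, min_gc=0.6, min_t=6):
    """Return (stem_start, tract_start) index pairs on the nontemplate strand.

    A hit is a perfect inverted repeat (hairpin stem) whose left arm is
    GC-rich, separated by a short loop, and followed directly by a T-rich
    tract. All thresholds are assumed values for illustration.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - stem_len + 1):
        left = dna[i:i + stem_len]
        if gc_fraction(left) < min_gc:
            continue
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem_len + loop                      # start of right arm
            right = dna[j:j + stem_len]
            tract = dna[j + stem_len:j + stem_len + tract_len]
            if len(right) < stem_len or len(tract) < tract_len:
                continue                                 # ran off the end
            if right == reverse_complement(left) and tract.count("T") >= min_t:
                hits.append((i, j + stem_len))
    return hits


# Synthetic (made-up) sequence: flank + stem + loop + stem' + T-tract + flank.
example = "ATGAC" + "GCCGCC" + "TTAA" + "GGCGGC" + "TTTTTTTT" + "ACGT"
print(find_terminator_like_sites(example))
```

For the synthetic sequence above, the scan reports a single candidate, with the stem starting at index 5 and the T-tract starting at index 21. A real annotation pipeline would also need to allow imperfect stems and to weigh hairpin stability, which this sketch deliberately ignores.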

Pausing at the intrinsic terminator

In the first step of termination, a transcriptional pause halts nucleotide addition, allowing the terminator hairpin to form while the U-tract RNA is still within the RNA:DNA hybrid. Prior to hairpin formation, the pause is induced principally by the U-tract itself (Gusarov and Nudler 1999). Although the intrinsic-terminator-associated pause has not been extensively dissected, studies of other transcriptional pause signals make it highly likely that the sequences of the exiting RNA, the nucleotides in the active site, and the downstream DNA duplex also contribute to the duration of this pause (Lee et al. 1990; Chan and Landick 1993; Chan et al. 1997; Kireeva and Kashlev 2009). An important unanswered question is whether the intrinsic-terminator-associated pause involves backtracking; backtracking would inhibit formation of the terminator hairpin by protecting greater amounts of RNA within the exit channel of the paused RNAP. A U-tract preceded by GC-rich RNA is generally thought to favor backtracking (Komissarova and Kashlev 1997; Nudler et al. 1997), and some researchers argue that all pausing involves backtracking (Mejia et al. 2008; Depken et al. 2009). However, strong evidence for the existence of nonbacktracked pauses has been reported (Toulokhonov et al. 2007; Kireeva and Kashlev 2009; Landick 2009).

Figure 1.2. Mechanisms of Intrinsic Termination. The major intermediates in the intrinsic termination pathway are depicted in schematic form. Three alternative routes to EC disruption by hairpin completion are depicted. The version of hairpin invasion depicted corresponds to the specific conformational change model of Epshtein et al. (2007). The changes in trigger loop conformation in different intermediates are emphasized by color changes, but remain speculative.

Terminator Hairpin Nucleation

Hairpin nucleation occurs by closure of the hairpin loop by one to several bp (Wilson and von Hippel 1995; Woodside et al. 2006). This can be as fast as microseconds, but varies with sequence context, and our understanding of the kinetics of hairpin formation is still in its infancy. Current understanding suggests multiple routes are possible, ranging from initial loop formation to initial formation of the first few hairpin bp (Ma et al. 2006). Hairpin formation is likely promoted by the flap and ZBD domains at the mouth of the RNA exit channel (Epshtein et al. 2007). Upon nucleation, the stem likely extends quickly to within 1-2 nt of the upstream end of the hybrid before encountering a significant barrier posed by the lid domain (Fig. 1.1c).

An important and underappreciated contribution to hairpin nucleation is competition from upstream RNA structures (Fig. 1.2). Although alternative structures that compete with a terminator hairpin are well known from their roles in transcriptional attenuation mechanisms (Henkin and Yanofsky 2002), competing structures likely play a much more general role in limiting the efficiency of termination. Indeed, eliminating upstream RNA structure using mild force in single-molecule experiments or by sequestration with oligonucleotides increases termination efficiency even when alternative structures are not obvious (Larson et al. 2008). Such competition of upstream RNA for hairpin nucleation likely explains reports that promoter-proximal sequence (Goliger et al. 1989; Telesnitsky and Chamberlin 1989b) or complementary alterations to terminator hairpin structure (Cheng et al. 1992; Wilson and von Hippel 1995) can alter intrinsic termination efficiency. Studies of termination should be constructed with care to avoid these potentially confounding effects.

Possible role of the partially formed terminator hairpin in pausing

Once the stem extends to the lid, the configuration of the EC is remarkably similar to that of a hairpin-stabilized, paused EC in which a hairpin in the RNA exit channel that leaves 11-12 unpaired nt in the 3´-proximal RNA is known to prolong pausing (Fig. 1.3; Toulokhonov and Landick 2003). The extent to which the partially formed terminator hairpin contributes to pausing at terminators is unknown. However, it is unlikely to be essential because interactions of the hairpin with the RNAP flap domain, which are required for hairpin-stabilization of the his pause, contribute only modestly to termination efficiency at several terminators tested (Toulokhonov and Landick 2003). Thus, the flap may assist hairpin nucleation or extension, but does not play an essential role in pausing as it does at the hairpin-stabilized his pause (Toulokhonov and Landick 2003; Kuznedelov et al. 2006).

Figure 1.3. Sequence Features of Intrinsic Terminators. Canonical RNA sequences (red) are depicted paired to a DNA scaffold with paired and unpaired nucleotides depicted as filled circles and lines. The sequence of λtR2 is used to depict canonical features in the RNA, except the hairpin loop, which is shown as filled circles and lines. (A) RNA/DNA configuration at the pause step, with portions of the DNA scaffold omitted. The extent of conservation in the RNA 3´ stem and U-tract is shown above the DNA as information content using WebLogo (www.weblogo.berkeley.edu; Schneider and Stephens 1990; Crooks et al. 2004). The %AT for downstream DNA of 50 near-perfect U-tract terminators (blue) and 50 imperfect U-tract terminators (red) selected as described in the text is shown. The variations in %AT at positions 11-13 and 19-21 (numbered relative to the U-tract) are significantly different from a randomized sequence (p≤0.05; t test). (B) RNA/DNA configuration after hairpin nucleation. (C) RNA/DNA configuration during EC disruption, with canonical terminator features labeled.

EC Disruption

In the next step of termination, the terminator hairpin extends to ≤ 8 nt from the terminated RNA 3´ end. This hairpin extension melts ~3 bp of the RNA:DNA hybrid by extracting the RNA strand from the hybrid, by rearrangements of RNAP involving the lid, the exit channel, and the main cleft, or both (Gusarov and Nudler 1999; Komissarova et al. 2002). Hybrid melting disrupts and destabilizes the EC to the point that dissociation becomes favorable; this appears to be the energetically limiting step in termination (Larson et al. 2008). In principle, dissociation could occur by initial release of RNA followed by bubble collapse and DNA release, initial release of DNA followed by RNA release, or near simultaneous release of both RNA and DNA. The order of these events, whether they are obligatory or stochastic, and whether they are universal or differ among terminators are unknown.

Commitment to the termination pathway

Two key questions about intrinsic termination are whether the EC becomes irreversibly committed to termination prior to dissociation and which step in the process of termination is rate-limiting. For the his terminator and the λtR2 terminator, it has been argued that commitment occurs prior to EC dissociation (Gusarov and Nudler 1999; Yin et al. 1999). Single-molecule observations revealed that a long-lived state unable to resume elongation forms at the his terminator prior to DNA release (Yin et al. 1999); this complex retains RNA until DNA release (Pyun 2005). A salt-sensitive “trapped” complex has been isolated and studied at the λtR2 terminator (Gusarov and Nudler 1999; Epshtein et al. 2007), although it has been suggested to be a binary RNA-RNAP complex similar to a binary complex that forms after RNAP dissociates from the trpL terminator (Berlin and Yanofsky 1983; Kashlev and Komissarova 2002).

In principle, any step in the termination pathway prior to a committed intermediate can contribute to the efficiency of termination. For instance, the rate of paused EC formation relative to the rate of elongation past sites of potential termination may set an upper limit on termination efficiency. Escape from the paused EC can also occur and reduce termination efficiency. Thus, rather than a simple partition between elongation and dissociation as envisioned by early models of termination (von Hippel and Yager 1992), the aggregate consequence of several steps in a termination pathway may determine overall efficiency. This may explain the complex ways that solute concentration, supercoiling, and temperature influence termination efficiency (Reynolds et al. 1992; Wilson and von Hippel 1994), but also makes assigning sequence effects on termination complicated.
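The idea that several steps jointly set termination efficiency can be illustrated with a minimal kinetic-partitioning sketch. It assumes, purely for illustration, a strictly sequential pathway in which each step is a first-order competition between moving forward and leaving the pathway, with no re-entry. The step names and all rate constants below are hypothetical and are not measured values.

```python
def branch_probability(k_forward, k_competing):
    """Probability of the forward branch in a two-way first-order competition."""
    return k_forward / (k_forward + k_competing)


def termination_efficiency(steps):
    """Multiply per-step branch probabilities for a strictly sequential pathway.

    `steps` is a list of (k_forward, k_competing) pairs. Assumes (hypothetically)
    that an EC leaving the pathway at any step resumes elongation for good.
    """
    efficiency = 1.0
    for k_forward, k_competing in steps:
        efficiency *= branch_probability(k_forward, k_competing)
    return efficiency


# Hypothetical rate constants (arbitrary units), chosen only to show the arithmetic.
pause_vs_bypass = (9.0, 1.0)      # pause formation vs. elongation past the site
extension_vs_escape = (4.0, 1.0)  # hairpin extension vs. escape from the pause
commit_vs_revert = (1.0, 1.0)     # commitment vs. return to elongation

steps = [pause_vs_bypass, extension_vs_escape, commit_vs_revert]
overall = termination_efficiency(steps)
upper_limit = branch_probability(*pause_vs_bypass)
print(f"overall efficiency = {overall:.2f}; upper limit set by pausing = {upper_limit:.2f}")
```

In this toy model the overall efficiency can never exceed the pausing probability, which is the sense in which pause formation sets an upper limit; a change in any single rate constant alters the product, which is why sequence effects are hard to assign to one step.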

Sequence conservation of intrinsic terminators

The conservation of intrinsic terminator sequence determinants has important consequences both for the mechanism of termination (specifically whether termination occurs by a single pathway or by alternative pathways), and for the bioinformatic identification of terminators in genome sequences. To illustrate the sequence conservation of intrinsic terminators, we examined E. coli terminators catalogued in RegulonDB (.ccg.unam.mx; Gama-Castro et al. 2010). Because even terminators compiled from the literature may not be experimentally verified (see below), we identified a subset of 100 terminators that matched the predictions of the best available prediction algorithm, TransTermHP (Kingsford et al. 2007). These “gold-standard” terminators highlight the key sequence features of intrinsic terminators (Fig. 1.3).

The terminator hairpin stem varies from 5-17 bp, with an average of ~8 bp, and exhibits strong bias for GC at the 5 positions nearest the U-tract with a modest preference for G at -3 (-1 is the 3´-most hairpin nt). This is consistent with the importance of the bottom of the stem in supplying the energy that destabilizes the EC in the final step of termination (Gusarov and Nudler 1999; Komissarova et al. 2002; Larson et al. 2008). The preference for G at -3 (-10 relative to termination at U7) may reflect the role of G at -10 in favoring pausing, presumably by formation of a 10-bp hybrid (Kyzer et al. 2007; Herbert et al. 2008). The terminator loops vary from 3-10 nt, with an average of ~4 (70% of the terminators had loops of 3 or 4 nt). The prevalence of these so-called tetraloops in intrinsic terminator hairpins likely reflects their ability to stabilize RNA structures (Antao et al. 1991; Antao and Tinoco 1992).
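Positional biases of this kind are summarized in Fig. 1.3 as information content. A minimal sketch of that per-position calculation is given here, assuming pre-aligned sequences of equal length and a four-letter RNA alphabet. It uses the uncorrected form (maximum entropy minus observed Shannon entropy) and leaves out any small-sample correction that a tool such as WebLogo may apply. The toy alignment is invented and does not represent real terminators.

```python
import math
from collections import Counter


def information_content(column, alphabet="ACGU"):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet size) minus the Shannon entropy of the observed
    base frequencies. No small-sample correction is applied, so values are
    overestimated when few sequences are aligned.
    """
    counts = Counter(base for base in column.upper() if base in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no bases from the alphabet")
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(alphabet)) - entropy


def column_profile(aligned_sequences):
    """Information content at each position of equal-length, pre-aligned sequences."""
    return [information_content("".join(col)) for col in zip(*aligned_sequences)]


# Toy alignment (invented sequences, for illustration only).
toy_stems = ["GCCGC", "GCGGC", "GCCGC", "GCAGC"]
print([round(bits, 2) for bits in column_profile(toy_stems)])
```

A fully conserved position scores 2 bits and a position with all four bases at equal frequency scores 0 bits, so a strong GC bias near the U-tract would appear as tall columns at those positions.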

The U-tract exhibited near-universal presence of at least two Us adjacent to the hairpin, a strong bias for U in the proximal 5-nt segment of the U-tract, and significantly greater sequence diversity in the distal 3-nt segment of the U-tract (Fig. 1.3a). About half the terminators had a perfect or near-perfect U-tract (at most one A in positions 4-8 of the U-tract), whereas nearly all the remainder contained at least 2 non-U residues with at least one C or G in the distal U-tract (imperfect U-tract; Fig. 1.3a). The downstream DNA sequence exhibited marked differences in %AT for the near-perfect vs. imperfect U-tract classes of terminators. Imperfect U-tract terminators exhibited high %AT at positions +10-12, whereas near-perfect U-tract terminators exhibited low %AT at the same positions. The two classes of terminators exhibited the opposite sequence bias at positions +18-19. These sequence patterns are consistent with the evidence that downstream DNA can compensate for an imperfect U-tract at the T7 terminator (Telesnitsky and Chamberlin 1989a; Reynolds and Chamberlin 1992), and suggest that the +10-12 region may affect the ease of duplex melting during hypertranslocation. The +18-19 region corresponds to a contact made by the clamp domain (+10-11 relative to the active site; Vassylyev et al. 2007b).
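The two U-tract classes and the %AT comparison described above can be expressed as a short sketch. The classification rules are one assumed reading of the definitions in the text: an 8-nt U-tract with position 1 adjacent to the hairpin, "distal" meaning positions 6-8, and the count of non-U residues taken over the whole tract. The function names and example sequences are hypothetical.

```python
def classify_u_tract(u_tract):
    """Classify an 8-nt U-tract (position 1 is adjacent to the hairpin).

    Assumed reading of the text: near-perfect = positions 4-8 are all U except
    for at most one A; imperfect = at least 2 non-U residues in the tract with
    at least one C or G in the distal 3 nt (positions 6-8); otherwise 'other'.
    """
    tract = u_tract.upper().replace("T", "U")
    if len(tract) != 8:
        raise ValueError("expected an 8-nt U-tract")
    positions_4_to_8 = tract[3:8]
    distal = tract[5:8]
    if set(positions_4_to_8) <= {"U", "A"} and positions_4_to_8.count("A") <= 1:
        return "near-perfect"
    non_u = sum(1 for base in tract if base != "U")
    if non_u >= 2 and any(base in "CG" for base in distal):
        return "imperfect"
    return "other"


def percent_at(dna, first, last):
    """%AT over 1-based, inclusive positions first..last of a DNA string."""
    window = dna.upper()[first - 1:last]
    if not window:
        raise ValueError("empty window")
    return 100.0 * sum(base in "AT" for base in window) / len(window)


# Invented examples, for illustration only.
for tract in ("UUUUUUUU", "UUUUUAUU", "UUUUCUGU"):
    print(tract, classify_u_tract(tract))
print(percent_at("GGGGGGGGGAAT", 10, 12))
```

Applying `percent_at` to the same downstream window for each class would give the kind of per-class %AT comparison shown in Fig. 1.3a; how positions are numbered relative to the U-tract would need to follow the figure.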

The mechanism of intrinsic termination

Although the basic steps in intrinsic termination are relatively clear (Figs. 1.2 & 1.3), the structural changes that destabilize and dissociate the EC and the extent to which these vary at terminators with different sequences are at best partially understood. Although mechanistic models have often been divided into so-called rigid-body models that emphasize changes to the thermodynamic stability of the RNA/DNA scaffold and conformational change models that emphasize conformational changes in RNAP, this is a false dichotomy because changes to the structures of both the scaffold and RNAP must occur during termination. Thus, we will instead consider what is known about these two aspects of intrinsic termination mechanisms: structural changes in the nucleic acid scaffold and conformational changes in RNAP.


Changes in the nucleic acid scaffold during intrinsic termination

Careful examination of the proximal U-tract using crosslinking and chemical probing reveals that the upstream 3 bp in the RNA:DNA hybrid, corresponding to the first 3 bases of the U-tract, melt upon terminator hairpin extension (Gusarov and Nudler 1999; Komissarova et al. 2002). Three models have been proposed to explain this melting: hybrid shearing, hypertranslocation, and hairpin invasion (Fig. 1.2; Macdonald et al. 1993; Yarnell and Roberts 1999; Komissarova et al. 2002; Toulokhonov and Landick 2003; Santangelo and Roberts 2004; Epshtein et al. 2007; Larson et al. 2008). In hybrid shearing, extension of the hairpin pulls the RNA out the exit channel by transiently breaking and reforming base pairs in the hybrid as the RNA shifts out of register with the DNA strand. In hypertranslocation, extension of the hairpin pulls the RNA out the exit channel but retains the register of the RNA:DNA hybrid by translocation of both the RNA and DNA without accompanying nucleotide addition. In hairpin invasion, the hairpin extends into the main cleft of RNAP rather than pulling RNA out the exit channel and causes hybrid melting due to steric constraints in the main cleft.

Several lines of evidence favor the hybrid shearing and hypertranslocation models, and suggest the selection between them is governed by the ease of shearing the U-tract vs. the ease of translocating the DNA bubble. A switch between these two events was observed in single-molecule experiments that detect the ability of force that assists or opposes translocation to alter termination efficiency; an imperfect U-tract terminator (t500) exhibited force-dependence, whereas perfect and near-perfect U-tract terminators (his and λtR2) did not (Larson et al. 2008). The imperfect U-tract t500 terminator is also inhibited by blocking translocation with a roadblock or blocking DNA unwinding with a crosslink (Santangelo and Roberts 2004). These results suggest that hybrid shearing occurs when the hybrid is weak, but that hypertranslocation becomes favorable when the hybrid is stronger. The high %AT at positions +10-12 of imperfect U-tract terminators (Fig. 1.3a), which would favor hypertranslocation, suggests a general use of hypertranslocation at imperfect U-tract terminators. This view is consistent with findings that mismatches in the upstream portion of the bubble inhibit termination (Ryder and Roberts 2003), although reannealing also could occur in a hairpin invasion model.

The key observation favoring the hairpin invasion model is the retention of a 3´-RNA-nt crosslink to the RNAP active site in an inactivated termination intermediate formed at λtR2 (Epshtein et al. 2007). Although this model would also be consistent with the lack of effect on termination efficiency at λtR2 in DNA-pulling experiments (because no translocation is involved), it is difficult to explain why stabilization of the distal hybrid in an imperfect U-tract terminator leads to a requirement for hypertranslocation if hairpin invasion alone can dissociate an EC. A comparable examination of RNA 3´-end location at terminators found to hypertranslocate and during steps between formation of the termination intermediate and EC dissociation would be instructive.

Conformational changes in RNAP during intrinsic termination

Given that the EC is held together by interactions of flexible domains with RNA and DNA (e.g., the clamp, flap lobe, and protrusion; Figs 1.1A & 1.2), at least some conformational changes seem certain to occur during termination regardless of the mechanism that alters the scaffold structure. However, the nature and role of these RNAP conformational changes are the least understood aspect of the mechanism. The clamp domain has been observed in both open and closed conformations (Cramer et al. 2001; Darst et al. 2002; Tagami et al. 2010), is postulated to close during promoter binding (Landick 2001; Mukhopadhyay et al. 2008), and should favor termination if opened. Even the initial step of transcriptional pausing is proposed to involve some clamp movement; the TL is thought to be trapped in an inactive configuration that is linked to nascent RNA hairpins, the hybrid, and downstream DNA through the bridge helix and movements of the clamp (Toulokhonov et al. 2001; Toulokhonov and Landick 2003; Toulokhonov et al. 2007). A recent report of an EC with a more open clamp conformation, presumably in response to an exit channel hairpin, strongly supports this view (Tagami et al. 2010). The hairpin in this structure matches the configuration of the partially formed terminator hairpin and the his pause hairpin (extends to -12; Figs 1.1C, 1.2, & 1.3B; Toulokhonov and Landick 2003; Toulokhonov et al. 2007). Inability to resolve the hairpin and the presence of the transcription inhibitor, Gfh1, in the secondary channel of this relatively low resolution structure (~4.3 Å) limit interpretation of this result, but minimally, it makes it likely that clamp opening participates in both hairpin-stabilization of pauses and in intrinsic termination.

Clamp involvement in termination is also supported by the ability of sequence changes in the downstream DNA at the point of clamp contact and a deletion in this clamp region to alter termination efficiency (Epshtein et al. 2007), as well as the bias in sequence composition at this point (+19-21 from U-tract or +9-11 from active site; Fig. 1.3a). Clamp changes in the RNA exit channel (e.g., deletion of the ZBD; Epshtein et al. 2007), deletion of the flap tip (Toulokhonov and Landick 2003; Epshtein et al. 2007), and changes in contacts to the RNA:DNA hybrid opposite the clamp (Yarnell and Roberts 1999) also have been reported to affect termination efficiency. At least some of these effects occur after pausing, as they can be detected using halted ECs (Epshtein et al. 2007). Understanding the extent, timing, and mechanistic importance of clamp opening is a fertile area for future study, for instance by using FRET and by blocking movements using crosslinking.

Recently, a detailed set of conformational changes in combination with hairpin invasion has been proposed as an “allosteric” model of intrinsic termination (although allostery is usually defined as a change in the activity of a catalytic site caused by the binding of a diffusible effector molecule to a distinct allosteric site; Epshtein et al. 2007). In this version of the hairpin invasion mechanism, the terminator hairpin is proposed to dissociate the EC while the RNA 3´ end remains in the active site by sweeping across the main cleft, disrupting EC-stabilizing contacts, and causing TL folding by direct contact with the trigger loop (Epshtein et al. 2007). TL folding is proposed to dissociate the EC. Although the energy source for this extensive hairpin motion is unclear, the model makes several testable predictions. First, TL folding or partial folding can be blocked to determine if it is required for termination as it is for nucleotide addition (Vassylyev et al. 2007a). Second, the model predicts a many-angstrom movement of the hairpin loop toward the downstream DNA that should be readily detectable by FRET between probes in the hairpin loop and the downstream DNA. Finally, since the model was generated using the λtR2 terminator sequence that does not require hypertranslocation, it would be instructive to apply the same tests to terminators with imperfect U-tracts for which hypertranslocation is now indicated.


Intrinsic termination in other bacteria

Remarkably, most knowledge about intrinsic termination mechanisms comes from study of a handful of terminators (e.g., λtR2), with only minimal effort to sample known sequence and structural diversity (Reynolds et al. 1992; Reynolds and Chamberlin 1992). Thus, mechanistic study of a wider variety of terminator sequences is highly desirable. This is especially important because canonical intrinsic terminators (Fig. 1.3) are absent downstream from genes in many bacteria, including Mycobacteria, Helicobacter, Treponema, Synechocystis, Mycoplasma, and Borrelia (Washio et al. 1998; Ermolaeva et al. 2000; Mitra et al. 2009). In many cases, other types of RNA structures can be identified after genes (Mitra et al. 2009), which led to the suggestion that novel mechanisms of intrinsic termination operate in these bacteria. For two reasons, we wish to caution against assuming these structures operate as intrinsic terminators, especially in bioinformatic analyses of bacterial genomes. First, even the sets of intrinsic terminators often relied on as being experimentally validated (e.g., in RegulonDB or in d'Aubenton Carafa et al. 1990) include sequences that, although near sites of in vivo RNA 3´-end formation, were identified by visual inspection and lack canonical features (e.g., the E. coli trpR and hupB terminators; Gunsalus and Yanofsky 1980; Kohno et al. 1990; Gama-Castro et al. 2010). True experimental validation of an intrinsic terminator requires demonstrating the following: (1) it causes dissociation of EC during in vitro transcription as detected by release of RNA and DNA from RNAP; (2) it generates terminated RNA 3´ ends before readthrough transcripts appear during synchronized in vitro transcription; (3) it generates the terminated RNA 3´ ends in vivo; and (4) it significantly reduces synthesis of RNA downstream from the site in vivo. Reports of efficient intrinsic termination at noncanonical sequences do not meet all these criteria (Ingham et al. 1995; Abe and Aiba 1996; Unniraman et al. 2001), which are necessary because terminator-like structures in RNA may be present after genes for many other reasons. For example, rather than causing intrinsic termination, such structures could stabilize mRNA against 3´ exonucleases, guide processing of RNAs by binding nucleases, mediate RNA-RNA interactions important for regulation, pause or arrest transcription without causing intrinsic termination, facilitate termination by binding unknown termination factors, or guide DNA uptake in horizontal gene transfer (Kingsford et al. 2007). To emphasize this point, we offer two instructive examples. The E. coli trp operon contains an “obvious” intrinsic terminator after trpA and corresponding to the 3´ end of the trp mRNA. Nonetheless, termination is inefficient at this site and mostly occurs further downstream by Rho-dependent termination followed by processing back to the apparent terminator (Wu et al. 1981). Similarly, the E. coli asnU gene is followed by a GC-rich hairpin and U-tract that deviates from consensus only by the presence of a single A between the hairpin and U-tract. Nonetheless, essentially all termination of asnU in vivo is caused by Rho (Peters et al. 2009).

Clearly, much remains to be learned about transcriptional termination in diverse bacteria. Given the absence of obvious intrinsic terminators in some bacteria, it seems likely that new termination mechanisms and new termination proteins will be uncovered. However, extrapolation from existing knowledge without accompanying biochemical study that distinguishes intrinsic termination from factor-dependent termination and the many other explanations for the appearance of RNA structures and RNA 3´ ends will be more confusing than illuminating. As molecular biology transitions to an era that incorporates genome-scale datasets and computational prediction, experimental verification becomes especially important.

Although the mechanisms of termination by archaeal and eukaryotic RNAPs are also of significant interest, neither intrinsic nor factor-dependent termination mechanisms for archaeal or eukaryotic RNAPs that match the bacterial paradigms have so far been described. Commonalities in the structures of bacterial, archaeal, and eukaryotic complexes suggest that termination mechanisms must overcome similar energetic and structural barriers. Thus, the apparent divergence of mechanisms by which termination occurs in bacteria, archaea, and eukaryotes suggests that their termination pathways have evolved independently.

Introduction to Rho-dependent termination

The key distinction between intrinsic and Rho-dependent termination is that the latter requires the participation of Rho, a homohexameric ring protein that binds to the nascent RNA transcript and then threads RNA 5´ → 3´ through the center of the ring as an ATP-powered translocase. Once the nascent RNA passes through the ring, Rho dissociates RNAP from RNA and template DNA. Recent work has increased our overall understanding of the signals and sequence elements required for Rho-dependent termination, clarified the nature of Rho as a translocase/helicase, provided refined models for the termination process, and suggested new possibilities for regulation by factor-dependent termination in vivo.

The Rho termination signal and steps in the pathway of Rho termination

Rho binding to RNA

Rho termination is governed by sequences in the nascent RNA and template DNA that act at three steps: (1) Rho binding to RNA and activation of translocase activity, (2) translocation of RNA through Rho, and (3) pausing of the EC at the site of termination (Fig. 1.4). In the first step, Rho binds to C-rich unstructured RNA, which triggers the RNA-dependent ATPase activity that powers translocation (reviewed elsewhere; Richardson 2003). Rho has the highest affinity for synthetic poly(C) RNA, which maximally stimulates ATPase activity (Lowery-Goldhammer and Richardson 1974).

Natural Rho binding sites, which are called rut (Rho utilization) sites and lie in RNA upstream from points of termination, are ~80 nt in length with high C content and relatively little secondary structure (Chen and Richardson 1987; McSwiggen et al. 1988; Zalatan and Platt 1992). Unpaired C residues within the rut site are important for termination, since blocking rut with complementary oligonucleotides or replacing rut with highly structured RNA greatly reduces termination (Chen et al. 1986). A consensus rut sequence is not required for termination; DNA encoding CA-rich RNA or completely synthetic sequences consisting mostly of C and T residues are sufficient to elicit termination at λtR1 (Hart and Roberts 1991) or an otherwise non-terminator site (Guerin et al. 1998), respectively, by acting as artificial rut sites. The observed depletion of G residues from rut sites (Alifano et al. 1991; Peters et al. 2009) also plays an indirect role in Rho binding by making the RNA less likely to form strong secondary structures with G:C basepairing. Sequence comparisons of known rut sites reveal few common features (other than C-richness; Ciampi 2006), making bioinformatic prediction of Rho-dependent terminators problematic. Combining high-resolution maps of Rho termination obtained experimentally (Peters et al. 2009) with genomic sequence data may help to uncover similarities between rut sites that have previously gone unnoticed.

Figure 1.4. Steps prior to Rho termination. The blue panel (left) depicts a model in which Rho binds only to RNA (Banerjee et al. 2006), whereas the green panel (right) depicts an alternate model in which Rho binds directly to RNAP (Epshtein et al. 2010). Both simple RNA extraction or conformational change models are possible in either pathway, as depicted in the darker panels (bottom).
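The compositional features noted for rut sites (high C content and depletion of G over a stretch of roughly 80 nt) can be tabulated with a simple sliding-window scan, sketched here. This is only a way to summarize composition and is not a validated predictor of Rho-dependent terminators; the function names, the step size, and the cutoff values are hypothetical choices made for illustration.

```python
def scan_c_and_g(rna, window=80, step=10):
    """Return (start, C fraction, G fraction) for each full window along an RNA.

    `start` is a 0-based index. The 80-nt default follows the approximate rut
    length quoted in the text; `step` is an arbitrary illustrative choice.
    """
    seq = rna.upper().replace("T", "U")
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        results.append((start, chunk.count("C") / window, chunk.count("G") / window))
    return results


def c_rich_g_poor(rna, window=80, step=10, min_c=0.35, max_g=0.15):
    """Start positions of windows passing HYPOTHETICAL C-rich / G-poor cutoffs."""
    return [start
            for start, c_frac, g_frac in scan_c_and_g(rna, window, step)
            if c_frac >= min_c and g_frac <= max_g]


# Invented sequence: a C-rich, G-free segment followed by a G-rich segment.
toy_rna = "CACUCCAUCC" * 8 + "GGCGGAGGCG" * 8
print(c_rich_g_poor(toy_rna))
```

Because composition alone ignores secondary structure and the position of downstream pause sites, windows flagged this way would at most be candidates for comparison with experimentally mapped termination sites.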

Structural studies have provided atomic-level views of RNA binding and subsequent isomerization by Rho. Both crystallographic and electron microscopy (EM) results suggest Rho initially binds to RNA in an open, “lock washer” conformation (Fig. 1.5a; Skordalakes and Berger 2003) and then isomerizes into a closed ring as RNA transfers to the central cavity (Fig. 1.5b; Thomsen and Berger 2009). EM images depict Rho hexamers in either a closed or “notched” state when a short (23 nt) RNA cofactor is present (Gogol et al.). When larger RNAs that exceed the capacity of the primary site (100 nt) are added, the notched population of Rho hexamers converts to the closed state. In crystals of open Rho, RNA is bound to the N-terminal domain, consistent with previous structural work indicating that the N-terminal domain functions as the primary, or high-affinity RNA binding site (Fig. 1.5a; Briercheck et al. 1998; Bogden et al. 1999). RNA associates with a cleft in the N-terminal domain that is only large enough to fit pyrimidines, and exhibits a preferred interaction with C residues (Bogden et al. 1999). Two independent structures revealed closed forms of Rho with RNA bound to the C-terminal ATPase domain, which contains the secondary, or low affinity binding site (not shown and Fig. 1.5b; Skordalakes and Berger 2006; Thomsen and Berger 2009).


Figure 1.5. Crystal structures of Rho. (A) The open form of Rho (1pvo) bound to AMPPNP (phosphoaminophosphonic acid-adenylate ester: an ATP analogue), and RNA (Skordalakes and Berger 2003). (B) The closed asymmetric form of Rho (3ice) bound to ADP-BeF (adenosine-5'-diphosphate: an ATP analogue, and BeF: a phosphate mimic), and RNA (Thomsen and Berger 2009).


However, the RNA makes different contacts in the two structures, confusing identification of the secondary binding site and understanding of its strict specificity for RNA. The RNA-dependent transition between open and closed states is hypothesized to represent a switch between RNA loading and translocation-competent forms of Rho (Skordalakes and Berger 2006); however, large-scale changes in Rho conformation may also occur during translocation (Boudvillain et al. 2010).

RNA translocation through Rho

In the second step of Rho termination, RNA is translocated 5´ → 3´ through the central cavity of Rho, but can be impeded by structural blocks, such as RNA hairpins, other RNA-binding proteins, or ribosomes. Thus, the second component of a Rho signal is an unimpeded RNA segment that facilitates translocation. Recent evidence suggests that Rho may be capable of bypassing certain RNA structures by binding to the single stranded regions around the structure, effectively "stepping around" RNA hairpins (Schwartz et al. 2007b). Alternatively, the strong translocase activity of Rho, which can displace streptavidin from a biotinylated RNA (Schwartz et al. 2007a), may directly melt RNA structures and displace some RNA-binding proteins.

Biochemical studies have led to several detailed models for RNA translocation through Rho (reviewed elsewhere; Patel 2009; Boudvillain et al. 2010), which can be divided into two classes that are relevant to termination. In the tethered tracking model, rut RNA remains bound to the primary site of Rho during translocation (Steinmetz and Platt 1994), whereas in simple translocation models RNA contacts only the central cavity of Rho during translocation. A structural model of a closed Rho hexamer extrapolated from a dimer crystal structure minus the C-terminal 65 residues (which were unresolved) showed RNA bound to both the primary site and C-terminal domain; this model is consistent with the tethered tracking model (Skordalakes and Berger 2006). However, biochemical tests proved inconsistent with the location of RNA in the C-terminal domain (Rabhi et al. 2011), and a more recent closed Rho structure reveals an asymmetric hexamer with RNA in the central cavity but not bound to the primary site (Fig. 1.5B; Thomsen and Berger 2009). Thus current structural data, while not excluding the tethered tracking model, are more consistent with a simple translocation model. However, active ATP hydrolysis by Rho increases RNA protection from nuclease digestion (Galluppi and Richardson), suggesting that Rho could retain RNA contacts during translocation. Further study is needed to distinguish whether RNA contacts with the primary site are lost during translocation, and if so, how these strong contacts are lost. One complication of some existing studies is reliance on Rho bearing N- or C-terminal His tags, which are known to compromise Rho function in vivo (Balasubramanian and Stitt) and in vitro (Miwa et al.).

EC pausing at the site of termination

In the final step of Rho termination, Rho dissociates an EC halted at a pause site. Thus, the third component of Rho termination is a pause sequence that renders the EC susceptible to Rho termination. Several studies of natural and arbitrary sequences establish a correlation between RNAP pausing and the positions of Rho termination (Kassavetis and Chamberlin 1981; Morgan et al. 1983; Lau and Roberts 1985; Stewart et al. 1986; Galloway and Platt 1988). However, not all RNAP pause sites on the DNA template function as efficient Rho termination sites (Kassavetis and Chamberlin 1981), and some, such as the his pause, inhibit Rho termination (Dutta et al. 2008). The presence of an RNA hairpin in the exit channel at the his pause site may interfere with RNA translocation and termination; arrested and highly backtracked ECs also are poor substrates for Rho termination (Dutta et al. 2008). Although general determinants of pausing likely apply to pauses at Rho termination sites, additional factors such as distance from the rut site, stability of the EC (von Hippel and Yager 1992), sequence of the 3´ end of the RNA in the active site of RNAP (Epshtein et al. 2010), susceptibility to backtracking, and structure of the nascent RNA (Dutta et al. 2008) contribute to efficiency of termination. This complexity further complicates predicting sites of Rho termination based on sequence.

Kinetic coupling

Rho translocates on RNA being actively synthesized by RNAP. Termination is thought to begin when little or no RNA remains between Rho and RNAP. Thus, competing rates of RNA chain growth and Rho translocation dictate the likelihood of termination. This is referred to as kinetic coupling, and is supported by experiments showing that pause-susceptible RNAPs or reduced RNAP elongation rates increase the efficiency of Rho termination (Jin et al. 1992).
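The race implied by kinetic coupling can be made concrete with a deliberately simple sketch. It assumes constant translocation and elongation rates and treats a pause as a fixed interval during which only Rho moves; the function names and every numerical value are hypothetical and serve only to show the direction of the effects.

```python
import math


def catch_up(gap_nt, v_rho, v_rnap):
    """Time (s) and RNAP travel (nt) before Rho closes an initial gap of gap_nt.

    Assumes constant rates (nt/s) and no pausing. Returns (inf, inf) if Rho is
    not faster than RNAP, in which case it never catches up in this toy model.
    """
    if v_rho <= v_rnap:
        return math.inf, math.inf
    time_s = gap_nt / (v_rho - v_rnap)
    return time_s, v_rnap * time_s


def gap_after_pause(gap_nt, v_rho, pause_s):
    """Gap (nt) remaining after RNAP pauses for pause_s while Rho keeps translocating."""
    return max(0.0, gap_nt - v_rho * pause_s)


# Hypothetical numbers: a 100-nt gap and Rho translocating at 60 nt/s.
for v_rnap in (40.0, 20.0):
    time_s, travel_nt = catch_up(100.0, 60.0, v_rnap)
    print(f"RNAP at {v_rnap:.0f} nt/s: caught after {time_s:.1f} s and {travel_nt:.0f} nt")
print(f"gap left after a 1-s pause: {gap_after_pause(100.0, 60.0, 1.0):.0f} nt")
```

In this sketch, slowing RNAP or lengthening a pause shortens the distance RNAP travels before Rho reaches it, which is the qualitative behavior expected if slower or more pause-prone RNAPs terminate more efficiently.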


Rho association with elongation complexes

Recent studies suggest that Rho associates with ECs in vivo even when not engaged in termination (Mooney et al. 2009a), and that Rho may bind directly to RNAP (Epshtein et al. 2010). Rho-EC association in vivo is indicated by very similar genome-wide distributions of RNAP and Rho in ChIP-chip experiments (Mooney et al. 2009a). Rho-RNAP association in vitro is based on retention of Rho by bead-immobilized ECs with transcripts too short to emerge from the RNA exit channel (Epshtein et al. 2010). In addition, wild-type Rho fails to terminate ECs that have been pre-incubated with a termination-defective mutant Rho, implying that mutant Rho is bound to the EC stably enough to prevent binding of active Rho (Epshtein et al. 2010).

Possible RNAP-Rho association in vivo raises important questions. First, is Rho bound to all ECs? The ChIP-chip assay reports only enrichment, not stoichiometry, so it is currently unclear if every EC binds Rho (Mooney et al. 2009a). Rho is reported to be ~0.1% of total protein in E. coli (Imai and Shigesada 1978), corresponding to ~1400 Rho hexamers per cell. Interestingly, ~1250 of the 13,000 RNAP molecules are in ECs (Grigorova et al. 2006); thus enough Rho appears present to bind every EC. However, if Rho can bind RNAP not in an EC, as current results suggest (Epshtein et al. 2010), then not every EC would contain Rho. Better quantitation of the amount of Rho present in cells and more direct tests for EC-Rho association in vivo are needed.
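The stoichiometric argument can be checked with the numbers quoted above. The sketch below only divides those quoted values; the second ratio rests on the hypothetical assumption that Rho distributes evenly over all RNAP molecules, whether or not they are in ECs.

```python
# Values as quoted in the text.
rho_hexamers_per_cell = 1400   # estimated from Rho being ~0.1% of total protein
rnap_in_ecs = 1250             # RNAP molecules engaged in ECs
rnap_total = 13000             # total RNAP molecules per cell

# If Rho bound only ECs, there would be slightly more than one hexamer per EC.
hexamers_per_ec = rho_hexamers_per_cell / rnap_in_ecs

# Hypothetical limiting case: Rho spread evenly over all RNAP, in ECs or not.
hexamers_per_rnap = rho_hexamers_per_cell / rnap_total

print(f"hexamers per EC: {hexamers_per_ec:.2f}")
print(f"hexamers per RNAP (all RNAP): {hexamers_per_rnap:.3f}")
```

The two ratios bracket the possibilities: about one hexamer per EC if Rho binds only ECs, and roughly one hexamer per nine or ten RNAP molecules if it binds all RNAP indiscriminately, which is why better quantitation matters.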

Second, what is the function of Rho-EC association? The local concentration of Rho could be higher, which would allow Rho to engage RNA rapidly when a rut site becomes exposed (Epshtein et al. 2010). Binding of RNAP-associated Rho to rut would cause an RNA loop to form between the RNA exit channel and the central cavity of Rho (Epshtein et al. 2010). This RNA loop would then be translocated through Rho, possibly altering the trajectory of RNA coming out of the exit channel. When the RNA loop becomes taut, Rho could either remain bound to the initial site on RNAP, or could transfer to the RNA exit channel. Although this could impact the mechanism of termination, it does not change the concept of kinetic coupling: termination still requires that the rate at which RNA is pulled through Rho exceed the rate at which RNAP extends the nascent RNA chain, whether the RNA loops between an RNAP-Rho complex or tethers Rho to RNAP.

Validation of the Rho-RNAP association model requires definition of the binding determinants on Rho and RNAP that, when altered, disrupt Rho-RNAP association and alter Rho termination in vivo and in vitro. Given the extensive history of Rho and RNAP genetics, it is notable that such determinants have not been reported to date.

The mechanism of Rho termination

Despite major advances in understanding the structural and biochemical properties of the Rho hexamer, the detailed mechanisms by which Rho dissociates the EC have remained obscure. As for intrinsic termination, changes in both the nucleic-acid scaffold and RNAP conformation are likely to occur during Rho termination. The major classes of the mechanisms are only superficially altered by whether or not Rho is pre-bound to ECs (Fig. 1.4).


Changes in the nucleic-acid scaffold during Rho termination

Rho is a powerful molecular motor capable of applying force through RNA translocation (Schwartz et al. 2007a), as well as an RNA:DNA helicase (Brennan et al. 1987); these activities are thought to supplant the function of the terminator hairpin in each of the possible classes of termination mechanisms (Fig. 1.2). In hybrid shearing, RNA is translocated through Rho until the RNA becomes taut, resulting in a pulling force that indirectly disrupts the RNA:DNA hybrid (Richardson 2002). In hypertranslocation, Rho exerts a pushing force that causes RNAP to translocate forward on the DNA template without extension of the RNA chain (Park and Roberts 2006). In the invasion model, the RNA 3´-end remains in the RNAP active site, and the helicase activity of Rho is used to directly unwind the RNA:DNA hybrid (Epshtein et al. 2010).

For hybrid shearing, Rho must generate sufficient force to shear the RNA:DNA hybrid (Richardson 2002), which can be more stable than the U-tract hybrids found at intrinsic terminators. Although Rho can generate >200 pN of force (based on its ability to displace a streptavidin bead; Schwartz et al. 2007a), the force required to shear non-U-tract hybrids is unknown because it is greater than the 30 pN at which other linkages break in force-clamp experiments (Dalal et al. 2006). Direct, single-molecule measurements of these forces would be invaluable to test the physical plausibility of hybrid shearing by Rho.

The hypertranslocation model is supported by the findings that Rho allows RNAP to elongate by two additional nucleotides against a downstream roadblock, and that Rho termination is inhibited when upstream DNA strands in the transcription bubble cannot reanneal (Park and Roberts 2006). However, it remains unknown if Rho can create a hypertranslocated EC in which the RNA 3´-end no longer resides in the active site.

Although inhibition of Rho termination by mutations that prevent DNA reannealing in the upstream bubble is consistent with hypertranslocation (Park and Roberts 2006), less is known about the effects of downstream bubble unwinding on termination. The hypertranslocation model predicts that inhibiting downstream unwinding would inhibit both forward translocation by RNAP and termination, as has been observed for some classes of intrinsic terminators. An unresolved question is whether hypertranslocation during Rho termination is dependent on hybrid sequence in a way that parallels the sequence dependence observed in intrinsic termination, making tests of hypertranslocation on templates encoding different Rho-dependent terminators highly desirable.

The invasion model is supported by crosslinking data that suggest that the RNA 3´-end remains in the RNAP active site during termination (Epshtein et al. 2010). These crosslinking experiments were performed on inactivated intermediates, which may or may not be on the termination pathway. Interestingly, the position of RNA 3´-end crosslinking shifts slightly in the presence of Rho, suggesting either that the conformation of protein components in the active site changes, or that the RNA 3´-end moves independently of a conformational change. The invasion model also posits that Rho uses its helicase activity to directly unwind the upstream end of the RNA:DNA hybrid (Epshtein et al. 2010). However, since the hybrid is buried within RNAP, significant conformational changes would need to take place for Rho to have direct access to the hybrid.


Conformational changes in RNAP during Rho termination

Most models for Rho termination are silent about potential conformational changes in RNAP that may coincide with alterations to the nucleic acid scaffold of the EC. However, this does not mean that hybrid shearing or hypertranslocation would occur without RNAP conformational changes. Both mechanisms would be favored by the same conformational changes discussed for intrinsic termination (e.g., clamp opening). Further, bubble collapse, as envisioned when forward translocation is blocked, is physically implausible without conformational changes in RNAP (Park and Roberts 2006).

A specific conformational change model has been proposed in which Rho “pushes” against the RNAP lid domain, causing clamp movement and subsequent unfolding of the TL (Epshtein et al. 2010). TL unfolding is proposed to cause further clamp opening, loss of nucleic acid contacts, and irreversible inactivation of the EC. This proposal is based on the findings that tagetitoxin (Tgt), which binds directly to the TL (Vassylyev et al. 2007b), inhibits Rho termination (Dutta et al. 2008), and that substitutions in the TL enhance Rho termination (Epshtein et al. 2010). Interestingly, in the proposed conformational change model of Rho termination the unfolded TL destabilizes the EC (Epshtein et al. 2010), whereas in the conformational change model of intrinsic termination TL folding is proposed to destabilize the EC (Epshtein et al. 2007). Ultimately, further experiments involving alterations that stabilize, destabilize, or eliminate folding of the TL will be needed to clarify the role of TL conformational changes in Rho termination.


Physiological functions and targets of Rho

Rho termination plays a variety of roles in cellular physiology, including formation of transcript 3´-ends (Roberts 1969), prevention of persistent RNA:DNA hybrids (R-loops) that form during transcription (Harinarayanan and Gowrishankar 2003), and the enforcement of transcription and translation coupling through polarity (De Crombrugghe et al. 1973). Recent studies have added new roles for Rho termination: “silencing” transcription of foreign (horizontally-transferred) DNA (Cardinale et al. 2008), suppression of antisense transcription (Peters et al. 2009), and elimination of stalled ECs that interfere with DNA replication (Washburn and Gottesman 2010).

Targets of Rho

Genome-wide analysis of Rho termination using RNAP ChIP-chip in the presence of the Rho inhibitor bicyclomycin (BCM) revealed ~200 putative Rho termination sites in the E. coli K-12 genome (Peters et al. 2009), ~10 of which were known prior to the study (reviewed elsewhere; Ciampi 2006). Rho termination sites were found downstream of stable RNAs, including ~1/3 of all tRNA operons and 7 annotated sRNAs. Rho termination of stable RNAs is unexpected because such RNAs are highly structured. One potential explanation of this result is that Rho could bind to the unstructured "tail" regions of the stable RNA prior to processing (Rossi et al. 1981). However, it is also possible that certain stable RNAs could act as aptamers that bind directly to Rho (Peters et al. 2009). In vitro mapping of Rho binding sites within stable RNA transcripts should help clarify this issue.
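The logic of locating Rho termination sites from RNAP occupancy with and without BCM can be sketched in a few lines. The example below is an illustrative assumption, not the analysis procedure of Peters et al. 2009: the function name, the window size, and the probe values are all hypothetical. It scores a candidate site by the log2 ratio of mean RNAP signal downstream of the site in BCM-treated versus untreated cells, so that readthrough caused by Rho inhibition gives a positive score.

```python
import math


def readthrough_score(untreated, bcm, site_index, window=3):
    """log2 ratio of mean downstream RNAP signal, BCM-treated over untreated.

    untreated, bcm: per-probe RNAP occupancy along one transcription unit,
    ordered in the direction of transcription (hypothetical input format).
    site_index: index of the first probe downstream of the candidate site.
    """
    down_untreated = untreated[site_index:site_index + window]
    down_bcm = bcm[site_index:site_index + window]
    if not down_untreated or len(down_untreated) != len(down_bcm):
        raise ValueError("profiles must cover the same downstream probes")
    mean_untreated = sum(down_untreated) / len(down_untreated)
    mean_bcm = sum(down_bcm) / len(down_bcm)
    if mean_untreated <= 0 or mean_bcm <= 0:
        raise ValueError("mean occupancy must be positive to take a log ratio")
    return math.log2(mean_bcm / mean_untreated)


# Hypothetical occupancy values, for illustration only.
untreated = [8.0, 8.0, 8.0, 1.0, 1.0, 1.0]
bcm_treated = [8.0, 8.0, 8.0, 4.0, 4.0, 4.0]
score = readthrough_score(untreated, bcm_treated, site_index=3)
print(f"log2 readthrough score: {score:.2f}")
```

A real analysis would also need normalization between arrays, replicate handling, and a statistical threshold, none of which are represented in this sketch.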

Silencing of foreign DNA

Expression array profiling of BCM-treated cells found a general upregulation of genes within cryptic prophages and other horizontally-transferred (i.e., foreign) elements, indicating that Rho silences transcription of these elements (Cardinale et al. 2008). RNAP ChIP-chip in BCM-treated cells also identified a statistically significant association of Rho-dependent terminators with foreign DNA (Peters et al. 2009). There are several possible explanations for this association. First, foreign DNA may have a greater number of Rho terminators involved in regulation, such as the timm terminator that suppresses the induction of toxic genes in the rac prophage (Cardinale et al. 2008). Second, insertion of foreign DNA into active transcription units may require Rho termination in the foreign segments to compensate for loss of natural terminators; this is hypothesized to occur when phage integrases remove intrinsic terminators from tRNA genes during recombination (Peters et al. 2009). Third, suboptimal codon usage in foreign DNA could inhibit translation and expose rut sites in the RNA to which Rho can bind and terminate transcription (Cardinale et al. 2008).


Suppression of antisense transcription

Antisense transcription may be a general target of Rho termination in vivo. Twenty-four novel antisense transcripts were identified in RNAP ChIP-chip experiments because Rho inhibition with BCM increased transcription in a direction opposite to the annotated gene (Peters et al. 2009), but this number is thought to be a significant underestimate of the number of Rho-terminated antisense transcripts due to technical limitations. Recently reported deep RNA sequencing of E. coli detected ~1000 antisense RNA sequences (Dornenburg et al. 2010). Rho termination is better suited to terminate antisense transcription than intrinsic termination, because coding requirements on the sense strand make encoding an antisense intrinsic terminator problematic (Peters et al. 2009). Global, strand-specific RNA quantification should help determine the extent of antisense transcript suppression by Rho.

The mechanism of Rho-dependent polarity

Rho termination of untranslated RNA transcripts causes polarity, which is the decrease in expression of distal genes in an operon caused by premature stop codons or inefficient translation in an upstream gene within the same operon. Although models for polarity have existed for several decades, recent studies reveal added complexity in the ways translating ribosomes suppress Rho function. We will describe the two prevailing models for polarity, Rho-ribosome competition for RNA and Rho-ribosome competition for NusG, and suggest how they can be combined into a two-checkpoint model to detect translation failure in mRNAs. We note that these concepts are similar irrespective of whether Rho binds RNAP in vivo.

Rho-ribosome competition for RNA

In the RNA competition model, Rho is prevented from associating with nascent RNA in the presence of the ribosome (Fig. 1.6; Adhya and Gottesman 1978). During active translation, potential Rho binding sites on the RNA (rut sites) are occupied by the ribosome. Since RNA binding by Rho is a prerequisite for Rho termination, physical occlusion of Rho binding sites by ribosomes prevents Rho activation and, thus, termination. In conditions where translation is optimal, Rho can only bind rut sites 5´ of the ribosome and Rho termination is suppressed. When translation slows or terminates, Rho can bind rut near RNAP, termination occurs, and polarity results.

Rho-ribosome competition for NusG

In the NusG competition model, Rho and the ribosome are alternate targets for the NusG-CTD (Fig. 1.6). NusG is a small protein (21 kDa in E. coli) that enhances Rho termination. NusG has two conserved domains (NTD and CTD). The NusG-NTD binds to RNAP via interactions with the β´ clamp helices (Mooney et al. 2009b), and the NusG-CTD binds to Rho (Burmann et al.; Chalissery et al.). Interactions with both RNAP and Rho are needed for termination enhancement by NusG (Mooney et al. 2009b), which increases the rate of RNA release by an unknown mechanism (Chalissery et al.). Recent NMR studies have found that the NusG-CTD binds to ribosomal protein S10 (also known as NusE; Burmann et al.). Binding of the NusG-CTD to Rho or S10 is mutually exclusive (Burmann et al.), but the affinity of NusG for Rho is much greater than for S10 (Kd ≈ 12 nM versus 50 μM, respectively; Burmann et al.; Pasman and von Hippel 2000). The surface of S10 bound by NusG-CTD is exposed when S10 is in the ribosome, consistent with the idea that the EC and the ribosome are physically linked by NusG-S10 interaction (Burmann et al.). If transcription is actively coupled with translation, the ribosome binds to the NusG-CTD, which blocks the ability of NusG to enhance Rho-dependent termination. However, if the ribosome dissociates due to a premature translation stop codon or if translation is inefficient, Rho has access to the NusG-CTD, and can cause polarity by terminating transcription.

Figure 1.6. Two-checkpoint mechanism to suppress Rho termination in protein-coding genes (polarity). Competition can occur for recruitment of rut RNA sequestered in the ribosome and for the NusG-CTD, which binds ribosomal protein S10 or Rho. Because NusG affects Rho dissociation of ECs but not recruitment, the two mechanisms may operate as sequential checks to determine whether an mRNA can be translated.

Several important questions remain about the NusG-based model of polarity. Due to the large difference in affinity for NusG between S10 and Rho (Burmann et al.), the ribosome must remain close to the EC to win the competition (Proshkin et al.), or additional factors present in the translation-coupled EC must prevent Rho association or action. Several ribosomal proteins are known to associate with RNAP (Squires and Zaporojets 2000; Torres et al. 2001) and could serve as potential "coupling factors." Some Rho-dependent terminators may not require NusG for efficient termination (Sullivan and Gottesman 1992), so competition for NusG may be relevant for only some Rho termination sites; other sites may be governed only by competition for rut RNA (Fig. 1.6).
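The consequence of the affinity difference can be illustrated with a simple equilibrium calculation. The sketch below assumes two mutually exclusive ligands (S10 and Rho) competing for a single site on the NusG-CTD, with both ligands in excess over NusG; this is a textbook competitive-binding approximation, not an analysis from the cited studies. The Kd defaults are the values quoted above (50 μM for S10, 12 nM for Rho), whereas the function name and the concentrations are hypothetical.

```python
def nusg_ctd_occupancy(s10_conc_M, rho_conc_M, kd_s10_M=50e-6, kd_rho_M=12e-9):
    """Equilibrium fractions of NusG-CTD that are free, S10-bound, or Rho-bound.

    Assumes one site on the NusG-CTD, mutually exclusive binding, and free
    ligand concentrations approximately equal to the totals (ligand excess).
    """
    s = s10_conc_M / kd_s10_M   # S10 concentration in units of its Kd
    r = rho_conc_M / kd_rho_M   # Rho concentration in units of its Kd
    denom = 1.0 + s + r
    return {"free": 1.0 / denom, "S10": s / denom, "Rho": r / denom}


# Hypothetical concentrations, for illustration only.
# Equal bulk concentrations (1 uM each): Rho is favored by its lower Kd.
print(nusg_ctd_occupancy(s10_conc_M=1e-6, rho_conc_M=1e-6))

# A closely coupled ribosome modeled as a high effective local S10
# concentration (10 mM): S10 can now compete with Rho.
print(nusg_ctd_occupancy(s10_conc_M=10e-3, rho_conc_M=1e-6))
```

Under these assumptions, S10 competes effectively only when its effective local concentration is several orders of magnitude higher than that of Rho, which is one way to express the requirement that the ribosome remain close to the EC.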


Two-checkpoint model for polarity

A combination of the RNA and NusG competition models may operate as a two-checkpoint mechanism, in which ribosome competition for RNA blocks Rho recruitment and S10 binding to NusG prevents Rho dissociation of ECs. This two-checkpoint mechanism is consistent with the known biochemical properties of NusG, in that NusG has no effect on the binding of RNA to Rho. Because Rho must first bind RNA to act on the EC, RNA binding would be the first of the two checkpoints. Experiments using mutants of S10 that fail to bind to NusG will be critical in assessing the importance of the NusG-S10 interaction to polarity.

Protein factors that may affect Rho termination

NusA

The transcription elongation factor NusA has been proposed to play contradictory roles in Rho termination. NusA is a multi-domain protein (NTD-S1-KH1-KH2-CTD; Worbs et al. 2001) that binds at two places on the surface of RNAP: the NusA-NTD and the NusA-CTD bind to the β-Flap and α-CTD domains of RNAP, respectively (Mah et al. 2000; Ha et al. 2010). NusA increases the duration of hairpin-stabilized pausing by RNAP (Toulokhonov et al. 2001) and enhances intrinsic termination (Greenblatt et al. 1981). In vitro, NusA decreases the efficiency of Rho termination at model terminators (such as λtR1; Lau and Roberts 1985), and may directly inhibit the Rho ATPase (Schmidt and Chamberlin 1984). The effects of NusA on Rho termination in vivo are less clear. Point mutations in the DNA encoding the S1 and KH1 domains of NusA reduce Rho-induced polarity in the (Ward and Gottesman 1981) and termination at λtR1 (Saxena and Gowrishankar 2011a). Also, a partial deletion of nusA (ΔnusA*) that removes the DNA encoding the S1 through KH2 domains shows a similar effect on gene expression in ORF-based microarray experiments as Rho inhibition by BCM or deletion of nusG, suggesting that NusA generally enhances Rho termination in vivo (Cardinale et al. 2008). However, the ΔnusA* allele can only be the sole source of NusA in a strain background that either lacks cryptic prophages (MDS42; Posfai et al. 2006; Cardinale et al. 2008), or that produces a defective Rho protein (rho(E134D); Zheng and Friedman 1994). The observation that a defective Rho protein can suppress a loss-of-function mutation in nusA argues that NusA normally antagonizes Rho termination (Zheng and Friedman 1994). High-resolution genome-wide experiments that examine the effects of defective NusA variants at Rho-dependent terminators will clarify the role of NusA in vivo.

H-NS

The histone-like nucleoid structuring protein (H-NS), and similar proteins (e.g., YdgT, Hha, and StpA) that assemble into nucleoid-associated filaments, may also affect Rho termination. H-NS binds DNA at AT-rich sequences, then oligomerizes into long filaments that can extend up to several kb (Kahramanoglou et al. 2011). H-NS oligomerization is associated with gene silencing by occluding or trapping RNAP at promoters (Fang and Rimsky 2008), and with general structuring of the bacterial nucleoid (Fang and Rimsky 2008). H-NS may also act as a roadblock to RNAP elongation, although experimental evidence for this role is lacking. Dominant negative mutations in hns that are defective for dimerization suppress the termination defects of certain rho and nusG mutants (Saxena and Gowrishankar 2011b). Further, overexpression of the H-NS-like protein YdgT also suppresses termination defects in rho and nusG strains, although wild-type hns is required for this suppression (Saxena and Gowrishankar 2011b). Finally, deletion of ydgT or hha (which encodes another H-NS-like protein) further exacerbates termination defects (Saxena and Gowrishankar 2011b). These data suggest that changes to nucleoid protein filaments containing H-NS can either directly or indirectly affect Rho termination. Genome-wide comparison of H-NS binding sites with Rho-dependent terminators will help further establish the role of H-NS in Rho termination.


Conclusions

The steps in intrinsic and Rho-dependent termination, the signals that provoke them, and some of the ways they are regulated are now relatively well understood. However, the detailed mechanisms by which changes to the nucleic acid scaffold and to the structure of RNAP destabilize and dissociate the EC remain under study. Strong evidence suggests an energetic trade-off between ease of hybrid shearing and hypertranslocation at different intrinsic terminator sequences that determines which pathway operates, but the rules that govern this switch, the nature of RNAP conformational changes, and the mechanistic importance of these changes await further study. Rho termination may operate by similar mechanisms, but the rules governing them and the roles of RNAP conformational changes are even less clear. The exciting discovery of the NusG interaction with a ribosomal protein suggests a possible two-checkpoint mechanism by which Rho termination can halt synthesis of untranslatable mRNAs. Finally, elucidation of both intrinsic and factor-dependent termination pathways in diverse bacteria promises to be a fertile area for future studies of transcription termination. Even after 40 years, a complete understanding of termination mechanisms remains an elusive goal.


Outline of thesis chapters

Research presented in this thesis focused on the roles of Rho-dependent transcription termination in vivo. In this work, I defined the distribution of Rho across the Escherichia coli genome, identified novel targets of Rho termination, and found functional synergy between Rho, NusG, and H-NS.

In Chapter 2, I examined the distribution of Rho and RNA polymerase (RNAP) across the E. coli genome using chromatin immunoprecipitation and microarrays (ChIP-chip). I found that Rho and RNAP had strikingly similar distributions on DNA; Rho loads onto early elongation complexes (ECs), and is found at RNAP “peaks” near the 5′ ends of certain genes. Rho termination is not responsible for RNAP peaks, however, because inhibition of Rho with bicyclomycin (BCM) did not increase RNAP occupancy downstream of RNAP peaks. I conclude that Rho can interact with ECs early and throughout transcription, consistent with Rho-dependent polar effects throughout transcription units (TUs) when transcription and translation are uncoupled.

In Chapter 3, I determined targets of Rho termination using RNAP ChIP-chip in the presence and absence of the Rho inhibitor BCM. I identified ~200 Rho-dependent terminators in total; half were found at the end of genes, while the other half were located within genes. The genes terminated by Rho included small RNAs (sRNAs) and transfer RNAs (tRNAs), establishing a new role for Rho in termination of stable RNA transcription.

In Chapter 4, I examined the effects of Rho inhibition on the transcriptome of E. coli using tiling expression microarrays and RNAseq. I found that the vast majority of Rho termination prevented antisense transcription. Some antisense transcripts could only be identified in conditions in which Rho was inhibited by BCM. I also investigated the roles of the putative Rho cofactors NusG and NusA in Rho termination. I determined that only a minority subset of Rho-dependent terminators required NusG for function, and none required NusA. Finally, I identified genetic interactions between rho and hns, as well as overlap between Rho-dependent terminators and H-NS binding. These data suggest that H-NS plays a genome-wide role in Rho termination of antisense transcripts.

Chapter 5 provides a summary of the thesis and future directions for the field of bacterial transcription termination.


References

Abe H, Aiba H. 1996. Differential contributions of two elements of rho-independent terminator to transcription termination and mRNA stabilization. Biochimie 78: 1035-1042.

Adhya S, Gottesman M. 1978. Control of transcription termination. Annu Rev Biochem 47: 967-996.

Alifano P, Rivellini F, Limauro D, Bruni CB, Carlomagno MS. 1991. A consensus motif common to all Rho-dependent prokaryotic transcription terminators. Cell 64: 553-563.

Andrecka J, Treutlein B, Arcusa MA, Muschielok A, Lewis R, Cheung AC, Cramer P, Michaelis J. 2009. Nano positioning system reveals the course of upstream and nontemplate DNA within the RNA polymerase II elongation complex. Nucleic Acids Res 37: 5803-5809.

Antao VP, Lai SY, Tinoco I, Jr. 1991. A thermodynamic study of unusually stable RNA and DNA hairpins. Nucleic Acids Res 19: 5901-5905.

Antao VP, Tinoco I, Jr. 1992. Thermodynamic parameters for loop formation in RNA and DNA hairpin tetraloops. Nucleic Acids Res 20: 819-824.

Balasubramanian K, Stitt BL. Evidence for amino acid roles in the of ATP hydrolysis in Escherichia coli Rho. J Mol Biol 404: 587-599.

Banerjee S, Chalissery J, Bandey I, Sen R. 2006. Rho-dependent transcription termination: more questions than answers. J Microbiol 44: 11-22.

Berlin V, Yanofsky C. 1983. Release of transcript and template during transcription termination at the . J Biol Chem 258: 1714-1719.


Bogden CE, Fass D, Bergman N, Nichols MD, Berger JM. 1999. The structural basis for terminator recognition by the Rho transcription termination factor. Mol Cell 3: 487-493.

Boudvillain M, Nollmann M, Margeat E. 2010. Keeping up to speed with the transcription termination factor Rho motor. Transcription 1: 70-75.

Brennan CA, Dombroski AJ, Platt T. 1987. Transcription termination factor rho is an RNA-DNA helicase. Cell 48: 945-952.

Briercheck DM, Wood TC, Allison TJ, Richardson JP, Rule GS. 1998. The NMR structure of the RNA binding domain of E. coli suggests possible RNA-protein interactions. Nat Struct Biol 5: 393-399.

Burmann BM, Schweimer K, Luo X, Wahl MC, Stitt BL, Gottesman ME, Rosch P. A NusE:NusG complex links transcription and translation. Science 328: 501-504.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Chalissery J, Muteeb G, Kalarickal NC, Mohan S, Jisha V, Sen R. Interaction Surface of the Transcription Terminator Rho Required to Form a Complex with the C-Terminal Domain of the Antiterminator NusG. J Mol Biol.

Chan C, Wang D, Landick R. 1997. Spacing from the transcript 3' end determines whether a nascent RNA hairpin interacts with RNA polymerase to prolong pausing or triggers termination. J Mol Biol 268: 54-68.

Chan CL, Landick R. 1993. Dissection of the his leader pause site by base substitution reveals a multipartite signal that includes a pause RNA hairpin. J Mol Biol 233: 25-42.

Chen CY, Galluppi GR, Richardson JP. 1986. Transcription termination at lambda tR1 is mediated by interaction of rho with specific single-stranded domains near the 3' end of cro mRNA. Cell 46: 1023-1028.


Chen CY, Richardson JP. 1987. Sequence elements essential for rho-dependent transcription termination at lambda tR1. J Biol Chem 262: 11292-11299.

Cheng S-W, Lynch EC, Leason KR, Court DL, Shapiro BA, Friedman DI. 1992. Functional importance of sequence in the stem-loop of a transcription terminator. Science 254: 1205-1207.

Ciampi MS. 2006. Rho-dependent terminators and transcription termination. Microbiology 152: 2515-2528.

Cramer P, Bushnell D, Kornberg R. 2001. Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science 292: 1863-1876.

Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14: 1188-1190.

d'Aubenton Carafa Y, Brody E, Thermes C. 1990. Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J Mol Biol 216: 835-858.

Dalal RV, Larson MH, Neuman KC, Gelles J, Landick R, Block SM. 2006. Pulling on the nascent RNA during transcription does not alter kinetics of elongation or ubiquitous pausing. Mol Cell 23: 231-239.

Darst SA, Opalka N, Chacon P, Polyakov A, Richter C, Zhang G, Wriggers W. 2002. Conformational flexibility of bacterial RNA polymerase. Proc Natl Acad Sci U S A 99: 4296-4301.

De Crombrugghe B, Adhya S, Gottesman M, Pastan I. 1973. Effect of Rho on transcription of bacterial operons. Nat New Biol 241: 260-264.

de Hoon MJ, Makita Y, Nakai K, Miyano S. 2005. Prediction of transcriptional terminators in Bacillus subtilis and related species. PLoS Comput Biol 1: e25.


Depken M, Galburt EA, Grill SW. 2009. The origin of short transcriptional pauses. Biophys J 96: 2189-2193.

Dornenburg JE, Devita AM, Palumbo MJ, Wade JT. 2010. Widespread Antisense Transcription in Escherichia coli. MBio 1.

Dutta D, Chalissery J, Sen R. 2008. Transcription termination factor rho prefers catalytically active elongation complexes for releasing RNA. J Biol Chem 283: 20243-20251.

Epshtein V, Cardinale CJ, Ruckenstein AE, Borukhov S, Nudler E. 2007. An allosteric path to transcription termination. Mol Cell 28: 991-1001.

Epshtein V, Dutta D, Wade J, Nudler E. 2010. An allosteric mechanism of Rho-dependent transcription termination. Nature 463: 245-249.

Ermolaeva MD, Khalak HG, White O, Smith HO, Salzberg SL. 2000. Prediction of transcription terminators in bacterial genomes. J Mol Biol 301: 27-33.

Fang FC, Rimsky S. 2008. New insights into transcriptional regulation by H-NS. Curr Opin Microbiol 11: 113-120.

Galloway JL, Platt T. 1988. Signals sufficient for rho-dependent transcription termination at trp t' span a region centered 60 base pairs upstream of the earliest 3' end point. J Biol Chem 263: 1761-1767.

Galluppi GR, Richardson JP. 1980. ATP-induced changes in the binding of RNA synthesis termination protein Rho to RNA. J Mol Biol 138: 513-539.

Gama-Castro S, Salgado H, Peralta-Gil M, Santos-Zavaleta A, Muniz-Rascado L, Solano-Lira H, Jimenez-Jacinto V, Weiss V, Garcia-Sotelo JS, Lopez-Fuentes A et al. 2010. RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units). Nucleic Acids Res.

Gogol EP, Seifried SE, von Hippel PH. 1991. Structure and assembly of the Escherichia coli transcription termination factor rho and its interaction with RNA. I. Cryoelectron microscopic studies. J Mol Biol 221: 1127-1138.

Goliger JA, Yang XJ, Guo HC, Roberts JW. 1989. Early transcribed sequences affect termination efficiency of Escherichia coli RNA polymerase. J Mol Biol 205: 331-341.

Greenblatt J, McLimont M, Hanly S. 1981. Termination of transcription by nusA gene protein of Escherichia coli. Nature 292: 215-220.

Grigorova IL, Phleger NJ, Mutalik VK, Gross CA. 2006. Insights into transcriptional regulation and sigma competition from an equilibrium model of RNA polymerase binding to DNA. Proc Natl Acad Sci U S A 103: 5332-5337.

Guerin M, Robichon N, Geiselmann J, Rahmouni AR. 1998. A simple polypyrimidine repeat acts as an artificial Rho-dependent terminator in vivo and in vitro. Nucleic Acids Res 26: 4895-4900.

Gunsalus RP, Yanofsky C. 1980. Nucleotide sequence and expression of Escherichia coli trpR, the structural gene for the trp aporepressor. Proc Natl Acad Sci U S A 77: 7117-7121.

Gusarov I, Nudler E. 1999. The mechanism of intrinsic transcription termination. Mol Cell 3: 495-504.

Ha KS, Toulokhonov I, Vassylyev DG, Landick R. 2010. The NusA N-terminal domain is necessary and sufficient for enhancement of transcriptional pausing via interaction with the RNA exit channel of RNA polymerase. J Mol Biol 401: 708-725.


Harinarayanan R, Gowrishankar J. 2003. Host factor titration by chromosomal R-loops as a mechanism for runaway replication in transcription termination-defective mutants of Escherichia coli. J Mol Biol 332: 31-46.

Hart CM, Roberts JW. 1991. Rho-dependent transcription termination. Characterization of the requirement for cytidine in the nascent transcript. J Biol Chem 266: 24140-24148.

Henkin TM, Yanofsky C. 2002. Regulation by transcription attenuation in bacteria: how RNA provides instructions for transcription termination/antitermination decisions. Bioessays 24: 700-707.

Herbert KM, Greenleaf WJ, Block SM. 2008. Single-molecule studies of RNA polymerase: motoring along. Annu Rev Biochem 77: 149-176.

Hosid S, Bolshoy A. 2004. New elements of the termination of transcription in . J Biomol Struct Dyn 22: 347-354.

Imai M, Shigesada K. 1978. Studies on the altered rho factor in a nitA mutants of Escherichia coli defective in transcription termination. I. Characterization and quantitative determination of rho in cell extracts. J Mol Biol 120: 451-466.

Ingham CJ, Hunter IS, Smith MC. 1995. Rho-independent terminators without 3' poly-U tails from the early region of actinophage ΦC31. Nucleic Acids Res 23: 370-376.

Jin DJ, Burgess RR, Richardson JP, Gross CA. 1992. Termination efficiency at rho-dependent terminators depends on kinetic coupling between RNA polymerase and rho. Proc Natl Acad Sci U S A 89: 1453-1457.

Kahramanoglou C, Seshasayee AS, Prieto AI, Ibberson D, Schmidt S, Zimmermann J, Benes V, Fraser GM, Luscombe NM. 2011. Direct and indirect effects of H-NS and on global gene expression control in Escherichia coli. Nucleic Acids Res 39: 2073-2091.


Kashlev M, Komissarova N. 2002. Transcription termination: primary intermediates and secondary adducts. J Biol Chem 277: 14501-14508.

Kassavetis GA, Chamberlin MJ. 1981. Pausing and termination of transcription within the early region of T7 DNA in vitro. J Biol Chem 256: 2777-2786.

Kingsford CL, Ayanbule K, Salzberg SL. 2007. Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biol 8: R22.

Kireeva ML, Kashlev M. 2009. Mechanism of sequence-specific pausing of bacterial RNA polymerase. Proc Natl Acad Sci U S A 106: 8900-8905.

Kohno K, Wada M, Kano Y, Imamoto F. 1990. Promoters and autogenous control of the Escherichia coli hupA and hupB genes. J Mol Biol 213: 27-36.

Komissarova N, Becker J, Solter S, Kireeva M, Kashlev M. 2002. Shortening of RNA:DNA hybrid in the elongation complex of RNA polymerase is a prerequisite for transcription termination. Mol Cell 10: 1151-1162.

Komissarova N, Kashlev M. 1997. RNA polymerase switches between inactivated and activated states by translocating back and forth along the DNA and the RNA. J Biol Chem 272: 15329-15338.

Komissarova N, Kashlev M. 1998. Functional topography of nascent RNA in elongation intermediates of RNA polymerase. Proc Natl Acad Sci U S A 95: 14699-14704.

Korzheva N, Mustaev A, Kozlov M, Malhotra A, Nikiforov V, Goldfarb A, Darst SA. 2000. A structural model of transcription elongation. Science 289: 619-625.

Korzheva N, Mustaev A, Nudler E, Nikiforov V, Goldfarb A. 1998. Mechanistic model of the elongation complex of Escherichia coli RNA polymerase. Cold Spring Harb Symp Quant Biol 63: 337-345.

Kuznedelov KD, Komissarova NV, Severinov KV. 2006. The role of the bacterial RNA polymerase beta subunit flexible flap domain in transcription termination. Dokl Biochem Biophys 410: 263-266.

Kyzer S, Ha KS, Landick R, Palangat M. 2007. Direct versus limited-step reconstitution reveals key features of an RNA hairpin-stabilized paused transcription complex. J Biol Chem 282: 19020-19028.

Landick R. 2001. RNA polymerase clamps down. Cell 105: 567-570.

Landick R. 2009. Transcriptional pausing without backtracking. Proc Natl Acad Sci U S A 106: 8797-8798.

Larson MH, Greenleaf WJ, Landick R, Block SM. 2008. Applied force reveals mechanistic and energetic details of transcription termination. Cell 132: 971-982.

Lau LF, Roberts JW. 1985. Rho-dependent transcription termination at lambda R1 requires upstream sequences. J Biol Chem 260: 574-584.

Lee DN, Phung L, Stewart J, Landick R. 1990. Transcription pausing by Escherichia coli RNA polymerase is modulated by downstream DNA sequences. J Biol Chem 265: 15145-15153.

Lesnik EA, Sampath R, Levene HB, Henderson TJ, McNeil JA, Ecker DJ. 2001. Prediction of rho-independent transcriptional terminators in Escherichia coli. Nucleic Acids Res 29: 3583-3594.

Lowery-Goldhammer C, Richardson JP. 1974. An RNA-dependent nucleoside triphosphate phosphohydrolase (ATPase) associated with rho termination factor. Proc Natl Acad Sci U S A 71: 2003-2007.

Ma H, Proctor DJ, Kierzek E, Kierzek R, Bevilacqua PC, Gruebele M. 2006. Exploring the energy landscape of a small RNA hairpin. J Am Chem Soc 128: 1523-1530.

Macdonald LE, Zhou Y, McAllister WT. 1993. Termination and slippage by bacteriophage T7 RNA polymerase. J Mol Biol 232: 1030-1047.

Mah TF, Kuznedelov K, Mushegian A, Severinov K, Greenblatt J. 2000. The alpha subunit of E. coli RNA polymerase activates RNA binding by NusA. Genes Dev 14: 2664-2675.

Martinez-Trujillo M, Sanchez-Trujillo A, Ceja V, Avila-Moreno F, Bermudez-Cruz RM, Court D, Montanez C. 2010. Sequences required for transcription termination at the intrinsic λtI terminator. Can J Microbiol 56: 168-177.

McSwiggen JA, Bear DG, von Hippel PH. 1988. Interactions of Escherichia coli transcription termination factor rho with RNA. I. Binding stoichiometries and free energies. J Mol Biol 199: 609-622.

Mejia YX, Mao H, Forde NR, Bustamante C. 2008. Thermal probing of E. coli RNA polymerase off-pathway mechanisms. J Mol Biol 382: 628-637.

Mitra A, Angamuthu K, Jayashree HV, Nagaraja V. 2009. Occurrence, divergence and evolution of intrinsic terminators across eubacteria. Genomics 94: 110-116.

Mitra A, Angamuthu K, Nagaraja V. 2008. Genome-wide analysis of the intrinsic terminators of transcription across the genus Mycobacterium. Tuberculosis (Edinb) 88: 566-575.

Mitra A, Kesarwani AK, Pal D, Nagaraja V. 2010. WebGeSTer DB--a transcription terminator database. Nucleic Acids Res.

Miwa Y, Horiguchi T, Shigesada K. 1995. Structural and functional dissections of transcription termination factor rho by random mutagenesis. J Mol Biol 254: 815-837.

Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R. 2009a. Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33: 97-108.

Mooney RA, Schweimer K, Rosch P, Gottesman M, Landick R. 2009b. Two structurally independent domains of E. coli NusG create regulatory plasticity via distinct interactions with RNA polymerase and regulators. J Mol Biol 391: 341-358.

Morgan WD, Bear DG, von Hippel PH. 1983. Rho-dependent termination of transcription. II. Kinetics of mRNA elongation during transcription from the bacteriophage lambda PR promoter. J Biol Chem 258: 9565-9574.

Mukhopadhyay J, Das K, Ismail S, Koppstein D, Jang M, Hudson B, Sarafianos S, Tuske S, Patel J, Jansen R et al. 2008. The RNA polymerase "switch region" is a target for inhibitors. Cell 135: 295-307.

Nudler E, Avetissova E, Markovtsov V, Goldfarb A. 1996. Transcription processivity: protein-DNA interactions holding together the elongation complex. Science 273: 211-217.

Nudler E, Mustaev A, Lukhtanov E, Goldfarb A. 1997. The RNA-DNA hybrid maintains the register of transcription by preventing backtracking of RNA polymerase. Cell 89: 33-41.

Park JS, Roberts JW. 2006. Role of DNA bubble rewinding in enzymatic transcription termination. Proc Natl Acad Sci U S A 103: 4870-4875.

Pasman Z, von Hippel PH. 2000. Regulation of rho-dependent transcription termination by NusG is specific to the Escherichia coli elongation complex. Biochemistry 39: 5573-5585.

Patel SS. 2009. Structural biology: Steps in the right direction. Nature 462: 581-583.

Peters JM, Mooney RA, Kuan PF, Rowland JL, Keles S, Landick R. 2009. Rho directs widespread termination of intragenic and stable RNA transcription. Proc Natl Acad Sci U S A 106: 15406-15411.

Posfai G, Plunkett G, 3rd, Feher T, Frisch D, Keil GM, Umenhoffer K, Kolisnychenko V, Stahl B, Sharma SS, de Arruda M et al. 2006. Emergent properties of reduced-genome Escherichia coli. Science 312: 1044-1046.

Proshkin S, Rahmouni AR, Mironov A, Nudler E. 2010. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science 328: 504-508.

Pyun J. 2005. Single-molecule studies of active and static transcription complexes. in Biochemistry, p. 135. Brandeis University, Waltham.

Rabhi M, Gocheva V, Jacquinot F, Lee A, Margeat E, Boudvillain M. 2011. Mutagenesis-based evidence for an asymmetric configuration of the ring-shaped transcription termination factor rho. J Mol Biol 405: 497-518.

Reynolds R, Bermudez-Cruz RM, Chamberlin MJ. 1992. Parameters affecting transcription termination by Escherichia coli RNA polymerase. I. Analysis of 13 rho-independent terminators. J Mol Biol 224: 31-51.

Reynolds R, Chamberlin MJ. 1992. Parameters affecting transcription termination by Escherichia coli RNA polymerase. II. Construction and analysis of hybrid terminators. J Mol Biol 224: 53-63.

Richardson JP. 2002. Rho-dependent termination and ATPases in transcript termination. Biochim Biophys Acta 1577: 251-260.

Richardson JP. 2003. Loading Rho to terminate transcription. Cell 114: 157-159.

Roberts JW. 1969. Termination factor for RNA synthesis. Nature 224: 1168-1174.

Rosenberg M, Court D. 1979. Regulatory sequences involved in the promotion and termination of RNA transcription. Annu Rev Genet 13: 319-353.

Rossi J, Egan J, Hudson L, Landy A. 1981. The tyrT locus: termination and processing of a complex transcript. Cell 26: 305-314.

Ryder AM, Roberts JW. 2003. Role of the non-template strand of the elongation bubble in intrinsic transcription termination. J Mol Biol 334: 205-213.

Santangelo TJ, Roberts JW. 2004. Forward translocation is the natural pathway of RNA release at an intrinsic terminator. Molecular Cell 14: 117-126.

Saxena S, Gowrishankar J. 2011a. Compromised factor-dependent transcription termination in a nusA mutant of Escherichia coli: spectrum of termination efficiencies generated by perturbations of Rho, NusG, NusA, and H-NS family proteins. J Bacteriol 193: 3842-3850.

-. 2011b. Modulation of Rho-dependent transcription termination in Escherichia coli by the H-NS family of proteins. J Bacteriol 193: 3832-3841.

Schmidt MC, Chamberlin MJ. 1984. Binding of rho factor to Escherichia coli RNA polymerase mediated by NusA protein. J Biol Chem 259: 15000-15002.

Schneider TD, Stephens RM. 1990. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18: 6097-6100.

Schwartz A, Margeat E, Rahmouni AR, Boudvillain M. 2007a. Transcription termination factor rho can displace streptavidin from biotinylated RNA. J Biol Chem 282: 31469-31476.

Schwartz A, Walmacq C, Rahmouni AR, Boudvillain M. 2007b. Noncanonical interactions in the management of RNA structural blocks by the transcription termination rho helicase. Biochemistry 46: 9366-9379.

Skordalakes E, Berger JM. 2003. Structure of the Rho transcription terminator: mechanism of mRNA recognition and helicase loading. Cell 114: 135-146.

-. 2006. Structural insights into RNA-dependent ring closure and ATPase activation by the Rho termination factor. Cell 127: 553-564.

Squires CL, Zaporojets D. 2000. Proteins shared by the transcription and translation machines. Annu Rev Microbiol 54: 775-798.

Steinmetz EJ, Platt T. 1994. Evidence supporting a tethered tracking model for helicase activity of Escherichia coli Rho factor. Proc Natl Acad Sci U S A 91: 1401-1405.

Stewart V, Landick R, Yanofsky C. 1986. Rho-dependent transcription termination in the tryptophanase operon leader region of Escherichia coli K-12. J Bacteriol 166: 217-223.

Sullivan SL, Gottesman ME. 1992. Requirement for E. coli NusG protein in factor-dependent transcription termination. Cell 68: 989-994.

Tagami S, Sekine S, Kumarevel T, Hino N, Murayama Y, Kamegamori S, Yamamoto M, Sakamoto K, Yokoyama S. 2010. Crystal structure of bacterial RNA polymerase bound with a transcription inhibitor protein. Nature 468: 978-982.

Telesnitsky A, Chamberlin MJ. 1989a. Terminator-distal sequences determine the in vitro efficiency of the early terminators of T3 and T7. Biochemistry 28: 5210-5218.

Telesnitsky AP, Chamberlin MJ. 1989b. Sequences linked to prokaryotic promoters can affect the efficiency of downstream termination sites. J Mol Biol 205: 315-330.

Thomsen ND, Berger JM. 2009. Running in reverse: the structural basis for translocation polarity in hexameric helicases. Cell 139: 523-534.

Torres M, Condon C, Balada JM, Squires C, Squires CL. 2001. Ribosomal protein S4 is a transcription factor with properties remarkably similar to NusA, a protein involved in both non-ribosomal and ribosomal RNA antitermination. EMBO J 20: 3811-3820.

Toulokhonov I, Artsimovitch I, Landick R. 2001. Allosteric control of RNA polymerase by a site that contacts nascent RNA hairpins. Science 292: 730-733.

Toulokhonov I, Landick R. 2003. The flap domain is required for pause RNA hairpin inhibition of catalysis by RNA polymerase and can modulate intrinsic termination. Mol Cell 12: 1125-1136.

Toulokhonov I, Zhang J, Palangat M, Landick R. 2007. A central role of the RNA polymerase trigger loop in active-site rearrangement during transcriptional pausing. Mol Cell 27: 406-419.

Unniraman S, Prakash R, Nagaraja V. 2001. Alternate paradigm for intrinsic transcription termination in eubacteria. J Biol Chem 276: 41850-41855.

Unniraman S, Prakash R, Nagaraja V. 2002. Conserved economics of transcription termination in eubacteria. Nucleic Acids Res 30: 675-684.

Vassylyev D, Vassylyeva M, Zhang J, Palangat M, Artsimovitch I, Landick R. 2007a. Structural basis for substrate loading in bacterial RNA polymerase. Nature 448: 163-168.

Vassylyev DG, Sekine S, Laptenko O, Lee J, Vassylyeva MN, Borukhov S, Yokoyama S. 2002. Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature 417: 712-719.

Vassylyev DG, Vassylyeva MN, Perederina A, Tahirov TH, Artsimovitch I. 2007b. Structural basis for transcription elongation by bacterial RNA polymerase. Nature 448: 157-162.

von Hippel PH, Yager TD. 1992. The elongation-termination decision in transcription. Science 255: 809-812.

Wang D, Landick R. 1997. Nuclease cleavage of the upstream half of the nontemplate strand DNA in an E. coli transcription elongation complex causes upstream translocation and transcriptional arrest. J Biol Chem 272: 5989-5994.

Ward DF, Gottesman ME. 1981. The nus mutations affect transcription termination in Escherichia coli. Nature 292: 212-215.

Washburn RS, Gottesman ME. 2010. Transcription termination maintains chromosome integrity. Proc Natl Acad Sci U S A.

Washio T, Sasayama J, Tomita M. 1998. Analysis of complete genomes suggests that many prokaryotes do not rely on hairpin formation in transcription termination. Nucleic Acids Res 26: 5456-5463.

Wilson KS, von Hippel PH. 1994. Stability of Escherichia coli transcription complexes near an intrinsic terminator. J Mol Biol 244: 36-51.

Wilson KS, von Hippel PH. 1995. Transcription termination at intrinsic terminators: the role of the RNA hairpin. Proc Natl Acad Sci U S A 92: 8793-8797.

Woodside MT, Behnke-Parks WM, Larizadeh K, Travers K, Herschlag D, Block SM. 2006. Nanomechanical measurements of the sequence-dependent folding landscapes of single nucleic acid hairpins. Proc Natl Acad Sci U S A 103: 6190-6195.

Worbs M, Bourenkov GP, Bartunik HD, Huber R, Wahl MC. 2001. An extended RNA binding surface through arrayed S1 and KH domains in transcription factor NusA. Mol Cell 7: 1177-1189.

Wu AM, Christie GE, Platt T. 1981. Tandem termination sites in the tryptophan operon of Escherichia coli. Proc Natl Acad Sci U S A 78: 2913-2917.

Yarnell WS, Roberts JW. 1999. Mechanism of intrinsic transcription termination and antitermination. Science 284: 611-615.

Yin H, Artsimovitch I, Landick R, Gelles J. 1999. Nonequilibrium mechanism of transcription termination from observations of single RNA polymerase molecules. Proc Natl Acad Sci U S A 96: 13124-13129.

Zalatan F, Platt T. 1992. Effects of decreased cytosine content on rho interaction with the rho-dependent terminator trp t' in Escherichia coli. J Biol Chem 267: 19082-19088.

Zaychikov E, Denissova L, Heumann H. 1995. Translocation of the Escherichia coli transcription complex observed in the registers 11 to 20: "jumping" of RNA polymerase and asymmetric expansion and contraction of the "transcription bubble". Proc Natl Acad Sci U S A 92: 1739-1743.

Zheng C, Friedman DI. 1994. Reduced Rho-dependent transcription termination permits NusA-independent growth of Escherichia coli. Proc Natl Acad Sci U S A 91: 7543-7547.

Chapter 2

Rho trafficking on bacterial transcription units in vivo

This chapter has been published (Rachel A. Mooney, Sarah E. Davis, Jason M. Peters, Jennifer L. Rowland, Aseem Z. Ansari, and Robert Landick 2009. Regulator trafficking on bacterial transcription units in vivo. Molecular Cell 33:97-108). I performed the Rho ChIP-chip and RT-PCR experiments. Rachel A. Mooney and Sarah E. Davis performed all other ChIP-chip experiments, Robert Landick and I performed data analysis, and Rachel A. Mooney and Robert Landick wrote the paper. Supplementary figures can be found at the end of the chapter and supplementary tables can be downloaded at: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2747249/.

Abstract

The in vivo trafficking patterns on DNA by the bacterial regulators of transcript elongation σ70, Rho, NusA, and NusG and the explanation for high promoter-proximal levels or peaks of RNA polymerase (RNAP) are unknown. Genome-wide ChIP-chip on E. coli revealed distinct association patterns of regulators as RNAP transcribes away from promoters (Rho first, then NusA, and then NusG). However, the interactions of elongating complexes with these regulators, including a weak interaction with σ70, did not differ significantly among most transcription units. A modest variation of NusG signal among genes reflected increased NusG interaction as transcription progresses, rather than functional specialization of elongating complexes. Promoter-proximal RNAP peaks were offset from σ70 peaks in the direction of transcription and co-occurred with NusA and Rho peaks, suggesting that the RNAP peaks reflected elongating, rather than initiating, complexes. However, inhibition of Rho did not increase RNAP levels within genes downstream of the RNAP peaks, suggesting the peaks are caused by a mechanism other than simple Rho-dependent attenuation.

Introduction

Transcription of genes by RNAP is controlled by a multiplicity of regulators that modulate template DNA conformation, control initiation, or govern RNAP’s progress through transcription units (TUs) in response to internal and environmental signals. In bacteria and eukaryotes, transcription regulators can be divided into those acting during transcript initiation, elongation, or termination. Precisely where initiation regulators release and elongation regulators associate with RNAP is unknown. Further, the distinction between these classes of regulators is not absolute; some may act during multiple stages of transcription, possibly with different effects. Finally, although some elongation regulators are known to target subsets of TUs, it is unclear whether general elongation regulators like NusA, NusG, and Rho interact with most elongating complexes (ECs) equivalently or instead preferentially interact with certain TUs or sites within TUs.

In bacteria, σ initiation factors bind tightly to core RNAP (consisting of β´, β, α2, and ω subunits) and determine the sequence specificity of RNAP-promoter interactions (Fig. 2.1A). σs are thought to be released shortly after RNA synthesis begins. However, whether σ release occurs obligately or stochastically, whether σ may be completely retained on a subset of TUs, and whether σ may transiently rebind to the EC during elongation with possible regulatory consequence all remain in debate (Bar-Nahum and Nudler 2001; Mukhopadhyay et al. 2001; Mooney and Landick 2003; Wade and Struhl 2004; Kapanidis et al. 2005; Mooney et al. 2005; Raffaelle et al. 2005; Reppas et al. 2006; Wade and Struhl 2008).

Figure. 2.1. Bacterial regulators of transcript elongation. (A) Regulator trafficking during the transcription cycle. RNAP binds σ70 to form holoenzyme, which can specifically bind promoter DNA and initiate transcription. Once the nascent RNA has reached a certain length, RNAP releases its strong contacts to σ70 and transitions into a more stable elongation complex (EC). ECs can be targeted by NusA, NusG, Rho, and σ70 to modulate transcription. (B) ChIP-chip profiles of RNAP and regulators across the E. coli genome. Log2(IP/input) ratios for σ70 (orange) and RNAP (β´; blue) are shown above a plot of E. coli genes (rightward and leftward transcription relative to origin separated above and below center). Regions identified as background RNAP interaction are shown as black bars below the RNAP profile (bkgd, see text). Genes encoding rRNA (blue) and tRNA (green) are indicated. An expanded region around 0.95 Mb is shown for RNAP, σ70, NusA (red), NusG (green), and Rho (violet) with the locations of known (vertical lines with black horizontal arrows) or predicted (vertical lines with gray horizontal arrows) promoters indicated and the baseline set at the Tukey bi-weight mean (Supplemental Experimental Procedures). The middle shaded region shows an example of a region exhibiting low background signals. serS (bold) is one of the 109 high-quality TUs (Fig. 2.1D). (C) Histogram of RNAP (β´) log2(IP/input) signals (blue) with overlaid histogram from background regions (black). A blowup of the highest signal region showing the point selected for Occapp=1 (mean of top ten 3-probe clusters) is shown in an inset. (D) Locations of the 109 high-quality TUs selected for analysis. The numbers correspond to their position on the genetic map; gene names and map positions are listed in Table 2.S1.

Figure 2.1

During or after promoter escape, the EC can associate with one or more elongation regulators (Fig. 2.1A). In bacteria, NusA and NusG alter EC properties differently via direct and independent interactions with RNAP, and are the best characterized regulators of elongation (Greenblatt et al. 1981; Li et al. 1992; Linn and Greenblatt 1992; Sullivan and Gottesman 1992; Burns et al. 1998). NusA preferentially enhances transcriptional pausing associated with nascent RNA hairpins (Greenblatt et al. 1981; Farnham et al. 1982; Artsimovitch and Landick 2000; Yakhnin and Babitzke 2002), enhances intrinsic termination at some sites more than others (Kassavetis and Chamberlin 1981; Linn and Greenblatt 1992; Yakhnin and Babitzke 2002), modulates Rho-dependent termination (Burns et al. 1998), and is an essential component of antitermination complexes that form on ribosomal RNA (rrn) and phage λ operons (Mason et al. 1992; Vogel and Jensen 1997; Torres et al. 2001; Shankar et al. 2007). NusG increases the rate of RNA chain extension, at least partly by decreasing pausing associated with backtracking (Artsimovitch and Landick 2000), enhances Rho-dependent termination via interactions with RNAP and Rho (Li et al. 1992, 1993; Sullivan and Gottesman 1992), and also is a component of both rrn and λ antitermination complexes (Mason et al. 1992; Torres et al. 2001). Despite these multiple roles of NusA and NusG, it is unclear whether they associate equivalently with ECs on all TUs, differentially with subsets of TUs, or differentially at locations within TUs.

The homohexameric Rho protein terminates transcription after binding to unstructured, C-rich nascent RNA. RNA-stimulation of its ATP-dependent translocase activity allows Rho to travel 5′ to 3′ along the RNA and dissociate ECs unless blocked by intervening ribosomes (Richardson 2002). It is uncertain where within TUs Rho interacts with ECs and whether Rho preferentially affects a subset of TUs. The report of Reppas et al. (2006) that a significant fraction of TUs in Escherichia coli exhibit promoter-proximal peaks of RNAP heightens interest in knowing whether promoter-proximal, Rho-dependent termination could contribute to the apparent decrease in RNAP density downstream from promoters.

To investigate trafficking of these regulators on bacterial TUs and the reported promoter-proximal block to transcription (Reppas et al. 2006), we used “chromatin immunoprecipitation” (Solomon et al. 1988; Kuo and Allis 1999) followed by microarray hybridization (ChIP-chip; Wade et al. 2007). Our study provides comparative analysis with improved resolution of some proteins examined previously (RNAP, σ70, and NusA; Wade and Struhl 2004; Grainger et al. 2005; Herring et al. 2005; Raffaelle et al. 2005; Reppas et al. 2006) as well as the first genome-wide views of NusA, NusG and Rho, leading to important new insights into trafficking of bacterial transcription regulators.

Results

Analysis of RNAP ChIP-chip signals on E. coli TUs

We applied ChIP-chip to E. coli K-12 at mid-log phase of growth at 37 °C in defined minimal glucose medium (Experimental Procedures), conditions in which many biosynthetic genes must be expressed and that were used previously for expression analysis (Allen et al. 2003). Using specific antibodies targeting core RNAP, σ70, NusA, Rho, or a hemagglutinin (HA) epitope present in three copies at the N-terminus of the chromosomal nusG gene, we obtained associated DNA that was then fluorescently labeled and hybridized to a tiled oligonucleotide microarray (~25 bp spacing; Experimental Procedures). Initial analysis of the immunoprecipitated DNAs relative to input DNA revealed excellent correspondence among the sites of enrichment by anti-σ70 and anti-RNAP (anti-β´) antibodies (Fig. 2.1B). Closer examination (e.g., of the expanded region around 0.94 Mb shown in Fig. 2.1B) revealed that σ70 was predominantly associated with DNA near promoters, whereas RNAP could be detected in association with both promoter and transcribed-region DNA. The strongest signals were in genes encoding tRNA, rRNA, and ribosomal proteins (e.g., serW and rpsA), as expected and reported previously (Wade and Struhl 2004; Grainger et al. 2005; Raffaelle et al. 2005; Reppas et al. 2006). NusA, NusG, and Rho were associated with ECs in most locations where RNAP was present.

RNAP is known to associate non-specifically with chromosomal DNA (von Hippel et al. 1974; deHaseth et al. 1978; Grigorova et al. 2006). To estimate the corresponding non-specific (background) ChIP-chip signal for RNAP, we examined regions of the bacterial chromosome thought to be devoid of transcription, such as the cryptic bglB gene (Defez and De Felice 1981). We identified 170 regions greater than 1 kb whose average RNAP ChIP signal was indistinguishable from that on bglB (bkgd, Fig. 2.1B; gray box near 0.94 Mb in expanded region; Table 2.2). The signals for these regions were normally distributed with a mean below the signal for ~84% of the complete genome-wide probe set (compare black to blue histograms, Fig. 2.1C; Supplemental Experimental Procedures). This suggests that most of the E. coli genome is transcribed at levels above the non-specific background, consistent with previous estimates (Selinger et al. 2000).
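
The background comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names, the (position, signal) tuple representation of probes, and the toy values are all hypothetical, and the background regions are assumed to have been identified already.

```python
from statistics import mean

def background_mean(probes, regions):
    """Mean log2(IP/input) of probes that fall inside any background region.

    probes  -- list of (genome_position_bp, log2_ratio) tuples (hypothetical layout)
    regions -- list of (start_bp, end_bp) tuples, inclusive
    """
    values = [sig for pos, sig in probes
              if any(start <= pos <= end for start, end in regions)]
    return mean(values)

def fraction_above(probes, threshold):
    """Fraction of all probes whose signal exceeds the threshold."""
    return sum(sig > threshold for _, sig in probes) / len(probes)

# Toy data (invented values, not measurements from this study)
probes = [(100, -0.2), (125, -0.6), (150, -0.7),
          (1000, 0.8), (1025, 1.2), (1050, 2.0), (1075, 0.1), (1100, -0.9)]
regions = [(100, 150)]  # one assumed bglB-like background region

bkgd = background_mean(probes, regions)
frac = fraction_above(probes, bkgd)
print(f"background mean = {bkgd:.2f}; fraction of probes above = {frac:.3f}")
```

Applied to the full probe set, the second function would give the kind of fraction quoted in the text (~84%), provided the same background definition is used.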

To characterize RNAP and regulator occupancy further, we identified “high-quality” TUs that were significantly above this background and for which signals from adjacent TUs did not obscure the pattern of RNAP and regulator association and dissociation (e.g., serS in the expanded region as opposed to clpA and cydCD, which were obscured by strong signals from the adjacent serW tRNA gene). We identified 109 such TUs, which were spread across the E. coli genome and represented a range of expression levels and TU lengths (Fig. 2.1D and Table 2.1).

Regulator trafficking on representative E. coli TUs

To gauge the basic patterns of regulator trafficking on these 109 TUs, we wished to scale the data in proportion to occupancy of regulators on DNA. Although true occupancy is impossible to measure without knowing the relative efficiencies of crosslinking for each protein at each TU location as well as the signals corresponding to zero and full occupancy, we nevertheless defined an apparent occupancy (Occapp) by linearly scaling signals for each protein between zero, which was set equal to the background defined by bglB-similar regions (Fig. 2.1C; Table 2.2), and one, which was arbitrarily defined as the average of the ten 3-probe clusters with highest average value (Fig. 2.1C; Supplemental Experimental Procedures). Therefore, Occapp is a function of true occupancy and relative “crosslinkability.”
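
The linear scaling that defines Occapp can be illustrated with a short Python sketch. It is an illustration only, under assumptions that the text does not specify: the 3-probe clusters are taken here to be overlapping runs of adjacent probes, the scaling is applied to whatever signal values are supplied, and all function names and toy values are hypothetical.

```python
from statistics import mean

def top_cluster_mean(signals, cluster_size=3, n_top=10):
    """Average of the n_top highest means over runs of cluster_size adjacent probes.

    Assumption: clusters are overlapping sliding runs of adjacent probes.
    """
    cluster_means = [mean(signals[i:i + cluster_size])
                     for i in range(len(signals) - cluster_size + 1)]
    return mean(sorted(cluster_means, reverse=True)[:n_top])

def apparent_occupancy(signals, background, full):
    """Linearly rescale signals so that background maps to 0 and full maps to 1."""
    return [(s - background) / (full - background) for s in signals]

# Toy signal track (invented values, ordered along the genome)
signals = [-0.5, -0.4, 0.3, 1.1, 2.6, 2.9, 2.5, 0.9, 0.1, -0.6]
full = top_cluster_mean(signals, n_top=2)   # stands in for the top-ten average
occ_app = apparent_occupancy(signals, background=-0.5, full=full)
print([round(v, 2) for v in occ_app])
```

Because the upper anchor is an average over clusters, individual probes can exceed an Occapp of one, and probes below the background mean fall below zero.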

An examination of eight representative TUs (seven from among the 109 high-quality TUs plus rrnE) revealed significant variation both in the uniformity of RNAP and regulator Occapp across TUs and in the ratios of RNAP Occapp to σ70 and other regulators at locations within TUs (Fig. 2.2). In some cases, the peak of σ70 Occapp surrounding the transcription start site (TSS) was much greater than RNAP Occapp, with the latter exhibiting a relatively uniform distribution across the TU (serS, rpsF, and acnB; Figs. 2.2A, D, and F). In other cases, the σ70 peak was more similar to the corresponding RNAP Occapp (atpIBEFHAGDC, gltBDF, and carAB; Figs. 2.2C, E, and H); in these cases RNAP typically exhibited a pronounced promoter-proximal peak similar to that previously reported (Reppas et al. 2006; Wade and Struhl 2008). These representative examples suggest there is no one-to-one correspondence between σ70 Occapp and RNAP Occapp at promoters; this observation was reflected in the modest (0.77) correlation between peak Occapp values for σ70 and RNAP (Fig. 2.S1).

Figure. 2.2. Apparent occupancy profiles of RNAP and regulators on representative TUs. Occapp for the rrnE TU and seven representative TUs from among the 109 TUs selected for the absence of interfering upstream or downstream signals (Fig. 2.1D and Table 2.1). Occapp was calculated as described in the Supplemental Experimental Procedures using two rounds of sliding-window smoothing (500 bp window for RNAP, NusA, NusG, and Rho; 175 bp window for σ70). Genes are depicted as labeled open arrows; promoters, as vertical lines capped with arrows; and known intrinsic terminators, as hairpins. Note that the scales of Occapp and TU length (in kb, denoted by hatch marks) differ in each panel. Protein-encoding genes are colored blue, and the rRNA TU is colored yellow. Regulators are colored as in Fig. 2.1. Vertical dotted lines mark the center of the σ70 peak. For the rrn TU, there are two promoters (and two σ70 peaks). (A) serS, a monocistronic TU encoding seryl-tRNA synthetase. (B) rrnE, one of seven E. coli rRNA TUs. Due to near-sequence-identity among the rRNA TUs, these signals represent the average of all seven rRNA TUs. (C) atpIBEFHAGDC, the nine-gene TU encoding the F0,F1 ATP synthase. (D) rpsFpriBrpsRrplI, encoding the ribosomal protein S6, DNA replication primosome protein N, ribosomal protein S18, and ribosomal protein L9. (E) gltBDF, encoding glutamate synthase large and small subunits and a periplasmic protein involved in nitrogen metabolism. (F) acnB, a monocistronic TU encoding aconitase B. (G) cyoABCDE, encoding cytochrome bo terminal oxidase and heme O synthase. (H) carAB, encoding carbamoyl phosphate synthetase.

Figure 2.2

σ70, NusA, NusG, and Rho associate with ECs in different patterns

The regulators σ70, NusA, NusG, and Rho all appeared to be present on each TU, but with notable differences in their Occapp distributions. NusA closely mirrored RNAP on each TU, appearing to associate with RNAP as the signal from σ70 disappeared. This is consistent with the long-standing view that NusA displaces σ70 during transcript elongation (Gill et al. 1991). In contrast, NusG appeared to associate with elongating RNAP farther from promoters and did not appear to be present at locations where RNAP formed promoter-proximal peaks. Rather, NusG Occapp rose gradually to levels that exceeded those of other regulators on most TUs. The ratio of NusG/RNAP Occapp appeared to be much greater in the distal portions of some TUs (e.g., atpIBEFHAGDC and cyoABCDE; Figs. 2.2C and G) than others (e.g., rrnE and rpsF-priB-rpsR-rplI; Figs. 2.2B and D). The different pattern of NusG on rrnE may reflect its participation (with NusA, NusB, NusE, and a subset of ribosomal proteins) in the rrn antitermination complex (Torres et al. 2001). Rho exhibited a striking pattern of significant promoter-proximal peaks near σ70 peaks and RNAP promoter-proximal peaks, but a lower Occapp over most of the TU. Finally, although σ70 was principally present at promoters, as reported previously (Wade and Struhl 2004; Reppas et al. 2006), σ70 Occapp remained above zero across most TUs (e.g., serS, rrnE, and cyoABCDE; Figs. 2.2A, B, and G).

To examine the correlation between RNAP and regulator presence on TUs more carefully, we calculated the average ChIP-chip signals for each in a 200 bp window in the middle of the 109 high-quality TUs (Fig. 2.3A) and compared the regulator and

RNAP ChIP-chip signals directly (Figs. 2.3B-F). Strikingly, σ70, NusA, NusG, and Rho

82

Figure. 2.3. Mid-TU regulator signals correlated with RNAP signals. (A) Diagram illustrating calculation of mid-TU signals. For each of the 109 high-quality TUs, the log2(IP/input) signals for all probes within a 200-bp window surrounding the center of the

TU were averaged to yield an estimate signal due to elongating RNAP or regulator associated with the elongating RNAP. Regulators are colored according to Figs. 2.1 and 2.2. (B) Correlation of σ70 and RNAP mid-TU signals. Only TUs for which the mid-

TU point was more than 500 bp from the σ70 peak were included (to avoid influence of signal from the σ70 peak; n=80); r=0.68; p<0.001. (C) Correlation of NusA and RNAP mid-TU signals (n=109); r=0.97, p<0.001. (D) Correlation of NusG and RNAP mid-TU signals (n=109); r=0.71, p<0.001. (E) Correlation of Rho and RNAP mid-TU signals

(n=109); r=0.78, p=<0.001. (F) The correlation coefficient between the RNAP signal and each of the regulator signals plotted versus mean mid-TU signal for the regulator.

Mean signals for NusA, NusG, σ70, and Rho are 73%, 95%, 33%, and 51% of mean

RNAP signals, respectively.

83

Figure 2.3

84 mid-TU signals all exhibit an obvious correlation with RNAP mid-TU signals. However, the correlation was much greater for NusA than for σ70, NusG, or Rho (Fig. 2.3F). For

σ70 and Rho, the weaker correlation is consistent with lower signal-to-noise ratio resulting from the reduced mean signals in the middle of the TUs. However, this is not the case for NusG, where the mean mid-gene signal was as large as the RNAP signal despite the much-reduced correlation (Fig. 2.3F). These results suggest that elongating

RNAPs do not exhibit TU-specific variations in affinity for σ70, NusA, NusG, or Rho.

Although the relative affinity of each regulator for ECs differs (i.e., σ70 and Rho exhibit lower signals than NusA and NusG), there is no indication that they target one subset of

TUs relative to others. Thus, they can rightly be classified as general elongation regulators as opposed to specialized regulators like RfaH that are recruited to a specific subset of TUs (Artsimovitch and Landick 2002).
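The mid-TU comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used for this work: the probe layout (a list of position and log2(IP/input) pairs) and both function names are hypothetical, and a Pearson coefficient is assumed because the text does not state which correlation coefficient was computed.

```python
from math import sqrt

def mid_tu_signal(probes, tu_start, tu_end, window=200):
    """Average log2(IP/input) for probes inside a window centered on the TU midpoint.

    probes: list of (genome_position, log2_ratio) tuples (hypothetical layout).
    window: total window width in bp (200 bp in the text).
    """
    center = (tu_start + tu_end) / 2
    half = window / 2
    values = [v for pos, v in probes if center - half <= pos <= center + half]
    if not values:
        raise ValueError("no probes fall inside the mid-TU window")
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed; the source reports only 'r')."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In use, one mid-TU value would be computed per TU for RNAP and for each regulator, and the two resulting lists passed to pearson_r.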

To resolve the pattern of σ70, NusA, NusG, and Rho interactions with RNAP more accurately, we took advantage of the similarity of these interactions among TUs to compute aggregate Occapp profiles (Fig. 2.4). For this purpose, we selected a set of highly transcribed TUs among the 109 high-quality TUs (to improve signal-to-noise ratios) and avoided TUs known to contain transcription attenuators (e.g., trp or leu) or multiple promoters that might complicate the distribution of RNAP. This yielded a set of

42 TUs that included 13 lacking an obvious promoter-proximal RNAP peak and 29 containing a readily discerned promoter-proximal RNAP peak (traces B and C in Fig.

2.4A). We computed the aggregate Occapp for these TUs by aligning them relative to

the genome coordinate of their σ70 peak and then averaging normalized Occapp values for each protein (normalized relative to the highest Occapp for that protein in a given TU).
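The alignment, normalization, and averaging steps just described can be illustrated as follows. This is a sketch under assumptions rather than the procedure actually implemented: the dictionary layout, the strand field, and the function name are hypothetical, and the rolling average mentioned in the legend of Fig. 2.4 is omitted for brevity.

```python
def aggregate_profile(tus, offsets):
    """Average peak-normalized occupancy across TUs aligned at their sigma70 peak.

    tus: list of dicts (hypothetical layout) with keys
         'sigma_peak' (genome coordinate of the sigma70 peak),
         'strand' (+1 or -1; downstream means decreasing coordinate on -1),
         'occ' (dict mapping genome coordinate -> apparent occupancy).
    offsets: bp offsets relative to the sigma70 peak, positive = downstream.
    """
    offsets = list(offsets)
    sums = {o: 0.0 for o in offsets}
    counts = {o: 0 for o in offsets}
    for tu in tus:
        top = max(tu["occ"].values())
        if top <= 0:
            continue  # skip TUs that cannot be normalized
        for o in offsets:
            pos = tu["sigma_peak"] + o * tu.get("strand", 1)
            if pos in tu["occ"]:
                # normalize to the highest occupancy within this TU, then accumulate
                sums[o] += tu["occ"][pos] / top
                counts[o] += 1
    return {o: sums[o] / counts[o] for o in offsets if counts[o]}
```

Normalizing each TU to its own maximum before averaging keeps a few very highly transcribed TUs from dominating the aggregate.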


Figure 2.4. Aggregate apparent occupancy for highly expressed TUs. (A) Aggregate normalized Occapp for 42 highly transcribed TUs (curve A; Table 2.1) was calculated by averaging Occapp from the TUs (2x rolling-averaged; 300-bp window). Occapp for each TU was normalized as a fraction of the highest Occapp in each TU prior to averaging. TUs were aligned to the peak of σ70 Occapp. σ70, orange; RNAP, blue. Curve A is the aggregate for all 42 TUs; lines B and C represent subsets of this aggregate: B, 13 TUs lacking a promoter-proximal RNAP peak; C, 29 TUs exhibiting a promoter-proximal RNAP peak. Lines B and C are shown in panels B and C with the aggregate signals for the other regulators. (B) Aggregate normalized Occapp for RNAP and regulators on the 13 TUs that lacked an obvious promoter-proximal RNAP peak (also shown as B for RNAP in panel A). Numbers in the figure correspond to the distance of the peak to the σ70 peak. Colors as in Fig. 2.1. (C) Aggregate normalized Occapp for RNAP and regulators on the 29 TUs that exhibited an obvious promoter-proximal RNAP peak (also shown as C for RNAP in panel A). Numbers in the figure correspond to the distance of the peak to the σ70 peak.


Figure 2.4


The RNAP peak in the aggregate Occapp for the 42 TUs was offset in the direction of transcription from the σ70 peak by ~150 bp (d in Fig. 2.4A). The size of this offset was widely distributed among different TUs and was uncorrelated with RNAP mid-TU signal

(Fig. 2.S2). However, the 29 TUs exhibiting pronounced peaks were, on average, longer (3.43 kb average length), whereas the TUs on which Occapp declined much more slowly were, on average, shorter TUs (1.36 kb average length; Mann-Whitney p<0.001).

The aggregate Occapp profiles highlighted differences in regulator trafficking on E. coli TUs. σ70 appeared to dissociate from RNAP as RNAP lost contact with the promoter (as reported previously by Wade and Struhl 2004; Raffaelle et al. 2005;

Reppas et al. 2006). Although the σ70 peak was nearly symmetric around its center as noted by Reppas et al. (2006), it was skewed ~20 bp downstream at its vertical midpoint in our data (Fig. 2.S3). This σ70 skew was caused by translocation of RNAP relative to the TSS, as evidenced by loss of the skew and a slight upstream shift of the σ70 peak upon treatment of cells with rifampicin (Fig. 2.S3). Conversely, NusA appeared to associate fully with elongating RNAP sometime after the σ70 signal disappeared (Figs.

2.4B and C). Both the NusA and Rho aggregate profiles exhibited promoter-proximal peaks, as observed for the individual profiles (compare Figs. 2.2 and 2.4C). However, the Rho peak was displaced ~50 bp upstream (relative to the RNAP peak), whereas the

NusA peak was displaced downstream. Finally, NusG associated with elongating

RNAP much more slowly than either NusA or Rho, reaching a plateau of Occapp ~800 bp downstream of the σ70 peak. The same aggregate and individual-TU patterns of

NusG association were observed using anti-NusG polyclonal antibody (Fig. 2.S4), ruling out perturbation caused by the HA3 tag.


Taken together, our analysis of regulator trafficking on E. coli TUs (Figs. 2.2-2.4) leads to the following key conclusions. First, σ70 crosslinks almost exclusively to promoter DNA, although a downstream skew of the σ70 peak and weak σ70 signal in the middle of TUs are consistent with stochastic release of σ70 from elongating RNAP followed by weak σ70 association with ECs (Mooney et al. 2005). The extent of σ70-EC association is difficult to assess from ChIP-chip data (see Discussion); we cannot exclude the possibility that non-specific antibody-EC interaction contributes to the mid-TU σ70 signal.

Second, NusG associates with ECs more slowly than NusA on most TUs (Figs.

2.2 and 2.4), except on antiterminated rrn TUs where its faster association likely reflects incorporation into an antiterminated EC (Torres et al. 2001). Conversely, the slower association of NusG on other TUs may suggest its binding is stimulated by a feature of the EC that increases the farther RNAP transcribes.

Third, Rho is evident at most TU locations, with a peak interaction at locations in between the strongest σ70 and RNAP signals (Figs 2.4B-C). This suggests that Rho may associate with transcripts shortly after the initiation of transcription. Rho is detectable throughout TUs, and the extent of this interaction is well-correlated with the amount of RNAP located on the TU (Fig. 2.3E). This is consistent with the generally accepted role of Rho in premature termination whenever translation is compromised.


NusG apparent occupancy depends on TU length, not gene function

To investigate the greater variability of NusG/RNAP ratios and NusG’s apparently slower association with ECs, we computed the NusG/RNAP, NusA/RNAP, and

Rho/RNAP ratios for each gene and examined these ratios as a function of the average

RNAP signal per gene (Figs. 2.5A-C). NusA and Rho both exhibited relatively uniform distributions; genes with low RNAP signals exhibited higher ratios (as expected mathematically; Figs. 2.5A-B). In this analysis, NusA/RNAP ratios on rRNA genes were slightly above the trend line, but were still consistent with at least 1:1 NusA:RNAP on most ECs. tRNA genes exhibited disproportionately high ratios of both NusA and Rho, suggesting transcription of tRNA genes may differ from protein-coding genes. Small

RNA (sRNA) genes, in contrast, exhibited normal ratios of NusA and Rho to RNAP.

The NusG/RNAP ratio distribution differed strikingly from the NusA or Rho ratios.

Although rRNA genes exhibited high NusG/RNAP ratios, a subset of genes with lower average RNAP signal exhibited even higher NusG/RNAP ratios (inset, Fig. 2.5C).

Interestingly, several of these were genes involved in energy production (genes from the nuo and cyo operons), murein/peptidoglycan biosynthesis and recycling (oppD&F, murB&E), or amino-acid biosynthesis (trpA&B, metI, cysM). This raised the possibility of a functional connection to elevated NusG levels on certain TUs (e.g., to localize transcription of certain genes). As an alternative, we considered whether the length of

TUs might explain the abnormal NusG/RNAP ratios (e.g., if long TUs acquire higher

NusG occupancy). To test this, we compared the NusG/RNAP ratio to the distance of genes from their TSS (for cases where the TSS is known) and found a strong correlation of TSS-gene distance to NusG/RNAP ratio (Spearman r = 0.57; Fig.


Figure 2.5. Gene-averaged regulator/RNAP ratios. (A) Gene-averaged NusA/RNAP ratios computed using average NusA IP/input values for each gene divided by the average RNAP IP/input values and plotted as a function of RNAP log2(IP/input) values

(Table 2.S3). Each gene is represented by one data point. (B) Gene-averaged

Rho/RNAP ratios. (C) Gene-averaged NusG/RNAP ratios. Zoomed in region shows identity of genes with unusually high NusG/RNAP ratios. (D) Gene-averaged

NusG/RNAP ratios plotted as a function of distance of gene from TSS using genes for which this distance could be assigned and for which the average RNAP log2(IP/input) signal was greater than 0.1 (830 genes; Table 2.S3). Zoomed in region shows identity of genes with unusually low NusG/RNAP ratios. (E) Gene-averaged NusG/RNAP ratios computed for the 30 functional classes of genes shown in the panel (Table 2.S5) and plotted as a function of the average distance to the TSS for genes in each functional class (Supplemental Experimental Procedures). Red dot represents the genome average.


Figure 2.5


2.5D). Genes that deviated significantly from this strong correlation by exhibiting low

NusG/RNAP ratios included rfa and rfb genes (inset, Fig. 2.5D). This is readily explained because rfa and rfb genes are regulated by RfaH, a specialized paralog of

NusG that competes with NusG for interaction with ECs (Belogurov et al. 2007).
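The ratio and rank-correlation calculations in this analysis can be sketched as below. The code is illustrative only: the function names and input lists are hypothetical, ties are handled by average ranks (an assumption, since the source does not describe tie handling), and the Spearman coefficient is computed as the Pearson correlation of the ranks.

```python
def gene_ratio(regulator_values, rnap_values):
    """Gene-averaged regulator/RNAP ratio: mean regulator IP/input over mean RNAP IP/input."""
    reg_mean = sum(regulator_values) / len(regulator_values)
    rnap_mean = sum(rnap_values) / len(rnap_values)
    return reg_mean / rnap_mean

def ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = shared
        i = j + 1
    return out

def spearman(xs, ys):
    """Spearman rank correlation, e.g. TSS-gene distance versus NusG/RNAP ratio."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

A rank-based coefficient suits this comparison because it asks only whether the ratio tends to rise with distance, without assuming a linear relationship.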

We conclude that the gradual increase in NusG association as transcription progresses, rather than a connection to gene function, explains elevated NusG/RNAP ratios on some genes. The high NusG/RNAP ratios on energy-related and amino-acid-biosynthetic operons simply reflect the greater-than-average length of these TUs. To confirm this interpretation, we plotted the average NusG/RNAP ratios for different gene functional classes against the average TSS-gene distance for the functional class (Fig.

2.5E). Classes with NusG/RNAP signal ratios below the genome average (red circle,

Fig. 2.5E) contained, on average, shorter genes, whereas classes exhibiting significantly higher NusG/RNAP signal ratios contained longer genes. Thus, the primary determinant of NusG levels is TSS-gene distance, rather than gene function.

Promoter-proximal RNAP peaks correlate with promoter-proximal NusA and Rho peaks

Promoter-proximal RNAP peaks have been detected in E. coli and Drosophila, and are suggested to reflect RNAPs kinetically blocked early in elongation (for Drosophila) or possibly even prior to promoter escape (Reppas et al. 2006; Muse et al. 2007; Zeitlinger et al. 2007; Core and Lis 2008; Wade and Struhl 2008). Therefore, we asked whether promoter-proximal RNAP peaks were associated with NusA and Rho, which presumably requires promoter escape. We first calculated the traveling ratio (TR; the

ratio of RNAP signal in the promoter-proximal peak to that within the TU; Reppas et al. 2006) for a set of genes that had a 5´-σ70 peak and were greater than 1 kb in length (to ensure the peak and mid-gene signals were well separated; Fig. 2.6A). We then tested whether a NusA peak, Rho peak, or both occurred within 300 bp of the RNAP peak and binned the results based on TR (Fig. 2.6B). If the RNAP peaks reflect RNAPs poised prior to promoter escape, then the fraction of RNAP peaks with NusA or Rho co-peaks should decrease at low TR (because a low TR would indicate promoter-bound RNAP that should not recruit NusA or Rho, in contrast to ECs that can bind both). Instead, we observed little change in the frequency of NusA and Rho co-peaks at low TR.
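The co-peak test and TR binning can be illustrated with the sketch below. It is a hypothetical reconstruction, not the peak-calling or binning code used here: peak positions and TR values are taken as precomputed inputs, the bin edges are arbitrary, and the function names are invented for illustration.

```python
from bisect import bisect_left

def has_copeak(rnap_peak, other_peaks_sorted, window=300):
    """True if any regulator peak lies within `window` bp of the RNAP peak.

    other_peaks_sorted must be sorted in ascending genome coordinate.
    """
    i = bisect_left(other_peaks_sorted, rnap_peak - window)
    return i < len(other_peaks_sorted) and other_peaks_sorted[i] <= rnap_peak + window

def copeak_fraction_by_bin(rnap_peaks, other_peaks, bin_edges, window=300):
    """Fraction of RNAP peaks with a co-peak, per TR bin [edge_i, edge_i+1).

    rnap_peaks: list of (position, traveling_ratio) pairs (TR precomputed).
    Returns None for bins that contain no RNAP peaks.
    """
    other = sorted(other_peaks)
    n_bins = len(bin_edges) - 1
    hits = [0] * n_bins
    totals = [0] * n_bins
    for pos, tr in rnap_peaks:
        for b in range(n_bins):
            if bin_edges[b] <= tr < bin_edges[b + 1]:
                totals[b] += 1
                hits[b] += has_copeak(pos, other, window)
                break
    return [h / t if t else None for h, t in zip(hits, totals)]
```

The same function could be reused for the expression-level and peak-height binning by passing those values in place of TR.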

We also binned the frequency of NusA and Rho co-peaks based on gene expression level (Allen et al. 2003), to ask if a block to promoter escape correlates with low expression (as suggested previously by Reppas et al. 2006; Fig. 2.6C). No correlation was evident. Further, the frequency of co-peaks correlated with RNAP peak height (Fig. 2.6D), suggesting that the failure to detect NusA or Rho co-peaks for a fraction of RNAP peaks (~25%) is mostly explained by false negatives in the peak-calling algorithm, since the signal-to-noise ratio for RNAP is better than that for NusA or

Rho. Taken together, these results suggest that promoter-proximal RNAP peaks reflect

RNAPs that have escaped promoters, at which point signals for NusA and Rho become detectable.

To verify that RNAP peaks reflected premature termination rather than a block to promoter escape, we used quantitative RT-PCR to test representative sets of TUs that exhibited or lacked RNAP peaks (Figs. 2.4B,C) for a drop in RNA transcript levels. This is an imperfect test because RNAs generated by premature termination are more


Figure 2.6. Frequency of Rho and NusA co-occurrence for RNAP peaks associated with genes. (A) Diagram illustrating the identification of RNAP peaks associated with genes and the calculation of traveling ratio (TR; Reppas et al. 2006;

Supplemental Experimental Procedures). Vertical dotted lines indicate σ70 peak

(orange) and RNAP peak (blue). (B) Fraction of promoter-proximal RNAP peaks for which NusA or Rho peaks exist within 300 bp of the RNAP peak, binned by the TR of the gene associated with the RNAP peak. Only RNAP peaks that could be associated with specific genes and only genes longer than 1 kb were included in this analysis.

NusA peaks, red columns; Rho peaks, purple columns. (C) Fraction of promoter-proximal RNAP peaks for which NusA or Rho peaks exist within 300 bp, binned by expression level (Allen et al. 2003) of the gene associated with the RNAP peak. Only

RNAP peaks that could be associated with specific genes and only genes longer than

500 bp were included in this analysis. (D) Fraction of promoter-proximal RNAP peaks for which NusA or Rho peaks exist within 300 bp, binned by height of RNAP peak. The same RNAP peaks as shown in Fig. 2.6C were included in this analysis. (E) Effect of

Rho inhibition on the aggregate RNAP Occapp profile for the 29 TUs with promoter-proximal RNAP peaks (Table 2.S1; Fig. 2.4C). RNAP signal is for cells grown with

(lavender) or without (blue) 20 µg bicyclomycin (BCM)/ml to cause Rho inhibition. The

σ70 aggregate profile for cells grown without BCM treatment is shown for reference (light orange). (F) Histogram of TR calculated for the 293 genes shown in Fig. 2.6B with or without Rho inhibition. RNAP signal with no inhibition, blue; RNAP signal with Rho inhibition, lavender. (G) Occapp on the rho gene with (lavender) or without (blue) Rho inhibition. The effect of BCM is apparent in the readthrough of the rho attenuator.


Figure 2.6

difficult than long mRNAs to quantify accurately and also may be unstable.

Nonetheless, 6 of 8 TUs exhibiting RNAP peaks produced significantly more RNA near the 5´ end versus 0 of 4 for TUs lacking RNAP peaks (Fig. 2.S5; p<0.005; Student’s t-test). Thus, most RNAP peaks are associated with premature transcription termination.

Reppas et al. (2006) raised the possibility that RNAP peaks might instead correspond to

RNAPs poised prior to promoter escape in part because they found 300 σ70 peaks not associated with detectable mRNAs. Thus, we asked if these σ70 peaks exhibited NusA or Rho co-peaks. Of the 300 peaks, 20 correspond to highly expressed stable RNA genes; 138 of the remainder were associated with an RNAP peak (Table 2.S6). Of these 138, 74 were within 300 bp of σ70 and RNAP peaks in our data. Of these 74, 45

(61%) were associated with a NusA peak; 49 (66%) were associated with a Rho peak;

33 (46%) were associated with both; and 13 (18%) were associated with neither (Fig.

2.S6). As noted above, some NusA and Rho co-peaks for small RNAP peaks were probably missed. Nonetheless, a few RNAP peaks likely represent promoter-bound enzyme: of three examples specifically cited by Reppas et al. (2006), one (hepA) was associated with NusA and Rho but two (deoB and yjiT) were associated with neither

(data not shown).

Rho-dependent termination is not the primary cause of promoter-proximal RNAP peaks

The finding that promoter-proximal RNAP peaks correspond to RNAPs blocked early in elongation raised the possibility they result from transcriptional attenuation. Indeed, the

Occapp profiles of genes regulated by attenuation resembled the aggregate profiles of

genes associated with promoter-proximal RNAP peaks (Fig. 2.S7). To ask if Rho, which also forms promoter-proximal peaks, could cause the RNAP peaks by Rho-dependent attenuation before a ribosome can bind and initiate translation, we examined the effect of the well-characterized Rho inhibitor, bicyclomycin (Supplemental Experimental

Procedures). If the RNAP peaks were caused by Rho-dependent attenuation, they should be reduced when cells are treated with bicyclomycin. Instead, we observed little effect on the aggregate RNAP Occapp profiles of genes exhibiting promoter-proximal

RNAP peaks (Fig. 2.6E), even though pronounced effects were evident on a gene known to be regulated by Rho-dependent attenuation (rho; Fig. 2.6G; Matsumoto et al.

1986). We also examined the effect of Rho inhibition on TR and observed little if any effect (Fig. 2.6F). Consistent with this result, there also is no preferential effect of even higher levels of bicyclomycin on expression of genes that exhibit low TRs (Cardinale et al. 2008; Fig. 2.S8). Thus, Rho-dependent attenuation does not appear to be the principal cause of promoter-proximal RNAP peaks.


Discussion

Our ChIP-chip study of the distributions of RNAP, σ70, NusA, NusG, and Rho on E. coli

TUs reveals the patterns of trafficking for regulators most central to control of transcript elongation in bacteria, and has important implications for understanding the mechanisms underlying these patterns. σ70, NusA, NusG, and Rho are distributed relatively uniformly among most transcribing RNAP molecules with apparent relative affinities for elongating RNAP of NusA≈NusG>Rho> σ70. As RNAP moves away from a promoter, crosslinking of σ70 greatly decreases. Rho and NusA appear to associate with RNAP as σ70 association decreases, with Rho slightly preceding NusA, whereas

NusG associates with elongating RNAP more slowly. As previously reported (Reppas et al. 2006), RNAP exhibits strong promoter-proximal peaks on many, but not all TUs.

We find that these peaks correspond to ECs and that they do not result from Rho-dependent attenuation.

NusA, NusG, and Rho exhibit different patterns of EC association, but no TU-specific specialization

Our finding that NusA, NusG, and Rho are, to a first approximation, uniformly associated with ECs on most TUs suggests they act as general modulators of transcript elongation with about equal probability of altering responses of RNAP to intrinsic pause, arrest or termination sites, regardless of where these sites occur in the genome. Due to the limited resolution of ChIP-chip, this does not preclude specific associations of regulators at intrinsic sites that affect only a minority of elongating RNAP molecules or

at which events occur rapidly relative to movement of RNAP over the surrounding DNA sequences. The results do rule out the possibilities that NusA, NusG, or Rho associate with certain TUs or certain sites within TUs to the exclusion of other TUs or locations.

Nonetheless, each regulator associates with ECs as they move away from promoters in a distinct, regulator-specific pattern that is similar on most TUs (Fig. 2.7).

NusA exhibits negligible signal at promoters and associates with RNAP as σ70 association is lost, closely paralleling RNAP levels once RNAP moves away from a promoter (Figs. 2.2-2.4). NusA’s highest-affinity contacts occur between the NusA CTD and the α-subunit CTD; additional contacts are made by NusA’s KH and S1 domains to the nascent RNA and by the NusA NTD to a second site on RNAP, which may include the β-subunit flap tip (Liu et al. 1996; Mah et al. 1999; Mah et al. 2000; Toulokhonov et al. 2001). At promoters, the α CTD binds to upstream DNA, either sequence-specifically at UP elements or non-specifically in association with σ70 (Estrem et al.

1999), and σ70 region 4 occupies the flap-tip until nascent RNA reaches 16-17 nt in length (Murakami et al. 2002; Nickels et al. 2006). Thus, NusA contacts are either not possible (to nascent RNA) or masked by DNA or σ70 until RNAP moves away from the promoter, at which point the association of NusA with the α CTD and nascent transcript likely tether NusA to the EC via interactions that are largely independent of EC position in a TU (Fig. 2.7).

Like NusA, NusG exhibits negligible signal at promoters, but unlike NusA appears to associate with RNAP in two phases. In the first phase, evident in aggregate

Occapp profiles (Fig. 2.4), NusG increases association with RNAP rapidly to ~1 kb downstream from promoters. This first phase is distinct from NusA association both in


Figure 2.7. Model of transcription regulator trafficking during the initiation-to-elongation transition. As RNAP moves away from a promoter, contacts to upstream

DNA are presumably lost upon the transition from abortive to productive synthesis

(Revyakin et al. 2006). At least some of σ70 contacts to RNAP must release during this transition. Release of upstream DNA contacts would free the α CTD and flap tip for interaction with NusA, and thus explain the early association of NusA. Rho appears to target RNA as it emerges from the RNAP exit channel and to bind without terminating transcription. As elongation progresses, NusG may slowly displace σ70 from interaction with the clamp helices, and ribosome binding could occlude Rho interaction with RNAP.


Figure 2.7

the slower rise (NusA association appears to be complete by 300 bp into TUs) and in that NusG signal does not mirror the promoter-proximal RNAP peaks (Fig. 2.4C). In the second phase, NusG Occapp increases more slowly, resulting in the increased

NusG/RNAP ratios for genes farther from promoters (Figs. 2.5D-E).

One explanation for the delayed association pattern of NusG could be competition with σ70 for its binding location on RNAP. NusG is suggested to bind RNAP via contacts to the clamp helices (Belogurov et al. 2007), which also make the tightest

RNAP contact to σ70 (via σ70 region 2; Arthur and Burgess 1998; Young et al. 2001).

Although σ70 region 4 dissociates from the flap tip when 16-17 nt of RNA are synthesized, the σ70 region 2-clamp helices interaction can persist in the EC without steric conflict (Mooney et al. 2005). In this case, slow NusG association could reflect delayed dissociation of σ70 region 2. This would mean that σ70 dissociates from RNAP more slowly than reported by the ChIP-chip assay, which instead shows a sharp fall off in σ70 crosslinking immediately downstream from promoters (Fig. 2.4; (Wade and Struhl

2004; Raffaelle et al. 2005; Reppas et al. 2006); see below). Alternatively, σ70 may release rapidly and NusG binding could require long RNA transcripts since it has been suggested that NusG contains an RNA-binding activity (Steiner et al. 2002).

Rho associates with TUs closer to promoters than either NusA or NusG, and then appears to decrease somewhat in TU association farther from promoters, with an approximately uniform association relative to RNAP signal (Figs. 2.3-2.4). The location of the promoter-proximal Rho peak is consistent with the requirement of 80-100 nt for

Rho effects on ECs (Richardson 2002). Thus, Rho appears to bind as soon as the requisite nascent transcript becomes available, but perhaps fails to terminate

transcription because NusG is not yet associated with RNAP. This early binding could position Rho to detect and subsequently terminate synthesis of the occasional mRNA on which translation fails. The strong Rho ChIP signal may be reduced once ribosomes load onto nascent RNA and prevent Rho from translocating close to RNAP (Fig. 2.7).

σ70 appears to associate with ECs stochastically

Our analysis of σ70 confirmed prior reports that the great majority of σ70 ChIP signal is lost as RNAP escapes the promoter (Wade and Struhl 2004; Raffaelle et al. 2005;

Reppas et al. 2006), but asymmetry of the σ70 aggregate Occapp peak suggests the signal is lost on average ~20 bp into TUs (Fig. 2.S3). However, a low σ70 ChIP signal was present and was correlated with RNAP signal at the middle of TUs (Fig. 2.3B).

This likely reflects σ70-EC interaction, although we cannot exclude other possibilities

(e.g., that transcription increases non-specific binding of σ70-containing holoenzyme to

DNA, for instance by removing nucleoid proteins from DNA). In any case, it is difficult to assess the extent of the interaction from the low σ70 ChIP signal because it may reflect far less efficient σ70 crosslinking to DNA than for promoter complexes (e.g., indirect σ70-RNAP and RNAP-DNA crosslinking in ECs rather than direct σ70-promoter DNA crosslinking). Our results are consistent with the view that σ70 breaks DNA contact when

RNAP escapes a promoter after which σ70’s weakened contacts to RNAP allow its stochastic release (Shimamoto et al. 1986; Mooney et al. 2005; Raffaelle et al. 2005) but still support at least a weak equilibrium association with ECs and σ70 rebinding at

promoter-like sequences encountered during elongation (Mooney and Landick 2003;

Mooney et al. 2005).

The mechanistic basis of promoter-proximal RNAP peaks

In principle, promoter-proximal RNAP peaks could reflect one of at least three mechanistically distinct types of blocks to transcription. RNAP could be trapped (1) prior to promoter escape (e.g., before strand opening or in ); (2) early in elongation in a paused (or poised) state from which it can be released to productive elongation; or (3) by premature and presumably regulated transcription termination

(transcriptional attenuation). Promoter-proximal RNAP peaks are common for human and Drosophila genes where they appear to be correlated with developmentally regulated rather than with housekeeping genes (ENCODE Project Consortium 2004;

Guenther et al. 2007; Muse et al. 2007; Zeitlinger et al. 2007). These peaks have been attributed to promoter-proximal pausing based on several criteria (Core and Lis, 2008;

Muse et al., 2007; Zeitlinger et al., 2007). In Saccharomyces cerevisiae, promoter-proximal peaks occur only in stationary phase and by an unknown mechanism (Wade and

Struhl, 2008). All three types of mechanisms are well characterized in E. coli: promoter trapping (Laishram and Gowrishankar, 2007; Rosenthal et al., 2008), promoter-proximal pausing (Marr and Roberts, 2000; Hatoum and Roberts, 2008), and attenuation (Merino and Yanofsky, 2005).

Our findings establish that most promoter-proximal E. coli RNAP peaks correspond to ECs. First, the promoter-proximal RNAP peaks were offset in the

direction of transcription by ~150 bp (Fig. 2.4). The transition from abortive to productive elongation, marked by release of σ70 from promoter contacts (or from RNAP contacts), occurs within the first 20 nt of transcript elongation (Revyakin et al. 2006;

Chander et al. 2007). Known cases of σ70-stimulated pausing in vivo occur no later than

+25 (Ring et al. 1996). Thus, the location of RNAP peaks at +150 is inconsistent with a block prior to promoter escape and EC formation. Second, NusA, which is thought to bind to ECs after release of σ70, and Rho, which requires >50 nt of RNA to bind, both appeared to be associated with RNAP in the promoter-proximal peaks.

Assuming that ChIP-chip captures a close-to-instantaneous snapshot of RNAP positions on DNA, we suggest that the promoter-proximal RNAP peaks reflect transcriptional attenuation caused by a mechanism other than Rho-dependent termination, rather than RNAP poised at promoters (Wade and Struhl, 2008). The position of these RNAP peaks is consistent with the typical position of transcription attenuators (Merino and Yanofsky 2005) and strongly resembles ChIP-chip profiles of

RNAP on TUs known to be subject to transcriptional attenuation (e.g., trp and pyrBI;

Fig. 2.S7). Promoter-proximal peaks in eukaryotes have been ascribed to paused ECs

(Core and Lis, 2008; Muse et al., 2007; Zeitlinger et al., 2007). Although long elusive, transcription attenuation is now clearly shown to occur in eukaryotes (Steinmetz et al.

2006). Conclusive evidence that promoter-proximal halted RNAPs are actually paused rather than on a termination pathway exists only for a limited number of cases (e.g.,

Drosophila heat shock genes and bacteriophage λ PR´; Marr and Roberts 2000;

Adelman et al. 2005). The regulation of early elongation by attenuation may prove to be more common in all organisms than has been appreciated.


Materials and Methods

Materials

E. coli K12 strains MG1655 and MG1655 HA3::nusG were used for all experiments.

MG1655 HA3::nusG was constructed by gene replacement without selection to give a strain isogenic to MG1655 encoding three copies of the haemagglutinin (HA) epitope tag at the 5´ end of nusG. Monoclonal antibodies against σ70 (2G10), RNAP (anti-β´,

NT73 or anti-β, NT63), and NusA (1NA1) were purchased from Neoclone (Madison,

WI). The monoclonal 12CA5 anti-HA antibody (to target HA3::nusG) was purchased from Roche. The polyclonal antibody against NusG was generated by Proteintech

() and polyclonal antibody against Rho was a kind gift from Jeff Roberts

(Cornell U.). After labeling, ChIP samples were hybridized to a custom microarray from

Nimblegen (Madison, WI) that contains two copies of 187,204 Tm-matched ≥45mer oligonucleotides that tile the E. coli chromosome with an average spacing of 24.5 bp.

Cell growth and ChIP-chip

Cells were grown in defined minimal medium (with 0.2% glucose) with vigorous shaking at 37 °C to mid-log (light scattering at 600 nm equivalent to 0.4 OD). Formaldehyde was added to 1% final and shaking was continued for 5 min before quenching with glycine. Cells were harvested, washed with PBS, and stored at -80 °C. Cells were sonicated and digested with micrococcal nuclease and RNase A before immunoprecipitation. The ChIP DNA sample was amplified by ligation-mediated PCR

(Lee et al. 2006) to yield >4 µg of DNA, pooled with two other independent samples,

and sent to Nimblegen where samples were labeled with Cy3 and Cy5 fluorescent dyes

(one for the ChIP sample and one for a control input sample) and hybridized to a single microarray as a two-color experiment.

Strains

All experiments were performed with E. coli K12 strain MG1655 or an isogenic strain encoding three copies of the haemagglutinin (HA) epitope tag at the 5´ end of nusG

(MG1655 HA3::nusG; RL1664). This strain was constructed by λ Red-mediated recombination without selection using a suicide plasmid encoding HA3::nusG (Herring et al. 2003). The plasmid was created as follows. First, 500 bp of DNA upstream from the translational start of NusG was amplified from chromosomal DNA by PCR: upstream1:

5´-TAGGGATAACAGGGTAATcgtaccagaacctggctcat-3´; downstream1:

5´-TGGGTAAACCATctcagaacctcaggccagtgat-3’; for both primers, the sequence in lowercase is complementary to the chromosomal sequence. For the upstream primer, the uppercase sequence encodes an I-SceI homing endonuclease recognition site. For the downstream primer, the uppercase sequence encodes the first 12 nucleotides of nusG, and is part of the complementarity shared with the upstream primer used in a second PCR reaction to amplify the HA-tagged nusG gene (upstream2:

GAGGTTCTGAGatggtttacccatacgatgttcct; downstream2: gcgaacggaccatcattaac; uppercase portion of upstream primer is part of the complementarity shared with the downstream primer from the first PCR reaction (downstream1)). For this second PCR reaction, plasmid pRM467, which expresses NusG with the HA tag at its N-terminus

under control of the IPTG-inducible trc promoter, was used as the template (see below).

The two PCR reactions were then purified, combined, and amplified with the upstream1 and downstream2 primers, yielding a larger single fusion product. This nusG product was cloned into a vector using blunt-end ligation and sequenced.
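The fusion step relies on the downstream1 and upstream2 primers sharing complementary ends, so that the two first-round products can anneal and be extended into one fragment. The sketch below checks that overlap directly from the primer sequences given above. It is an illustrative check only; the function names are hypothetical, and case is ignored when comparing bases.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (case-insensitive input, uppercase output)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def overlap_length(reverse_primer, forward_primer):
    """Longest stretch where the reverse primer's complement ends as the forward primer begins.

    This is the region through which the two first-round PCR products
    can anneal to each other in the fusion reaction.
    """
    top_strand = revcomp(reverse_primer)
    forward = forward_primer.upper()
    for n in range(min(len(top_strand), len(forward)), 0, -1):
        if top_strand[-n:] == forward[:n]:
            return n
    return 0

# Primer sequences as written in the text (5' to 3').
DOWNSTREAM1 = "TGGGTAAACCATctcagaacctcaggccagtgat"
UPSTREAM2 = "GAGGTTCTGAGatggtttacccatacgatgttcct"

shared = overlap_length(DOWNSTREAM1, UPSTREAM2)
```

A nonzero value for shared indicates that the 3´ end of the first product and the 5´ end of the second product can base-pair, which is what allows amplification with the outer primers (upstream1 and downstream2) to yield a single fusion product.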

To perform the recombination, E. coli K12 strain MG1655 was grown, made competent, and electroporated with the new nusG plasmid and pACBSR, a plasmid encoding the λ Red recombination genes and the I-SceI homing endonuclease (Herring et al. 2003).

After selection for transformants containing both plasmids, the gene replacement process was performed as described. Individual colonies were screened by PCR to detect the introduction of the HA tag and then cured of the plasmids (Herring et al. 2003). The resulting strain, RL1664, is MG1655 HA3::nusG. To test for phenotypic effects of the chromosomally tagged nusG, we measured growth alongside wild-type and saw no difference in the growth rate of this strain (data not shown). We also performed Western analysis using an antibody specific to the HA epitope and were able to detect the fusion protein (data not shown). This established that the HA epitope at the N-terminus of NusG was accessible to antibody, thus allowing its use for ChIP analysis.

To construct plasmid pRM467, PCR amplification of 3 repeats of the HA epitope tag (YPYDVPDYPG) was performed, adding flanking NcoI and BspHI restriction sites. The HA-tag PCR product was purified, digested with NcoI and BspHI, and cloned into pRM431 (Ptrc-nusG) cut similarly to yield plasmid pRM467 (Ptrc-HA3::nusG). This plasmid expresses NusG protein with the following HA3 sequence at the N-terminus (underlined M is the start codon of NusG; HA epitopes enclosed in parentheses):

MV(YPYDVPDYPG)(YPYDVPDYAGS)(YPYDVPDYA)ELMGSSHHHHHHSSGLVPRGSHM….

Preparation of DNA for ChIP-chip analysis

To ensure that our experiments would capture a snapshot of transcription within the cell, we first sought conditions that would yield adequate crosslinking with minimal formaldehyde exposure time, so as to minimize the possibility of RNAP or regulators moving to new locations during treatment. We found that a 5 min treatment with 1% formaldehyde was optimal for antibody specificity. We also verified that E. coli was unable to re-program transcription during this treatment (data not shown).

To prepare cells for ChIP-chip analysis, strain MG1655 or isogenic strain MG1655 HA3::nusG cells were grown in MOPS minimal medium supplemented with 0.2% glucose (Neidhardt et al. 1974). Cells were grown with vigorous shaking at 37 °C to mid-log (light scattering at 600 nm equivalent to 0.4 OD). Crosslinking and cell preparation were performed largely as described (Raffaelle et al. 2005) using protocols adapted from the labs of Alan Grossman and Peggy Farnham (Lin and Grossman 1998; http://genomics.ucdavis.edu/farnham/protocol.html). Sodium phosphate (1/100 vol. of 1 M, pH 7.6; 10 mM final) was added to the mid-log cultures followed by formaldehyde to 1% final, and shaking was continued for 5 min. Cold 2.5 M glycine was added to 100 mM and the mixture was incubated at 4 °C with agitation for 30 minutes to stop the crosslinking. Cells were spun at 5000 × g, and washed repeatedly with phosphate-buffered saline before being frozen at -80 °C.


Cell pellets (from initial 50 ml of culture) were thawed and resuspended in 250 µl of IP buffer (100 mM Tris pH 8, 300 mM NaCl, 2% Triton X-100) and sonicated using a microtip sonicator set at 10% output for 20-second intervals with periods of cooling in between. Cells were then treated for one hour at 4 °C with RNase A (2 ng/ml; USB, Inc.), micrococcal nuclease (50 units; USB, Inc.), 20 mM CaCl2, 1.2 mM KCl, 0.3 mM NaCl, 6 mM sucrose, and 10 mM DTT. After treatment, a distribution of DNA fragments ranging from 200-600 bp was detected by agarose-gel electrophoretic separation of a small sample that was de-crosslinked by incubation at 65 °C for >4 hr. EDTA was added to 10 mM to stop the micrococcal nuclease and the samples were spun down to remove cell debris. The lysate was then incubated with a 50/50 slurry of Sepharose protein A beads (Upstate; now Millipore) and protein G beads (GE Healthcare) in IP buffer for 2-3 hours at 4 °C. The beads were removed by centrifugation and antibody was added to the pre-cleared lysate for an overnight incubation. The next day, 30 µl of a 50/50 slurry of Sepharose protein A and G beads in IP buffer was added to the lysate to capture antibody-protein-DNA complexes for one hour at 4 °C. Beads were then washed once with 1 ml of 250 mM LiCl wash buffer (100 mM Tris pH 8, 250 mM LiCl, 2% Triton X-100), twice with 600 mM NaCl wash buffer (100 mM Tris pH 8, 600 mM NaCl, 2% SDS), twice with 300 mM NaCl wash buffer (100 mM Tris pH 8, 300 mM NaCl, 2% SDS), and twice with TE. Elution buffer (50 mM Tris pH 8, 10 mM EDTA, 1% SDS) was added after the final wash step, and beads were incubated at 65 °C for 30 minutes to remove the crosslinked protein-DNA complexes from the beads. After centrifugation to remove the beads, the samples were incubated overnight at 65 °C to reverse the protein-DNA formaldehyde crosslinks. DNA was purified using Qiagen’s PCR Purification kit and eluted to a final volume of 50 µl with EB.

To prepare the ChIP DNA samples for array analysis (ChIP-chip), the DNA was amplified by ligation-mediated PCR (LM-PCR) following a protocol provided by Nimblegen (http://www.chiponchip.org/protocol_itm3.html) and adapted from one developed in the lab of Rick Young (Ng et al. 2003; http://jura.wi.mit.edu/cgi-bin/young_public/navframe.cgi?s=19&f=ChIPLMPCR). Briefly, the ChIP DNA was treated with T4 DNA polymerase to form blunt ends, and then ligated to annealed linkers (oJW102 5’-GCGGTGACCCGGGAGATCTGAATTC; oJW103 5’-GAATTCAGATC). The ligated DNA product was amplified using oJW102 and purified.

Individual DNA samples for microarray analysis were pools generated from triplicate independent IP samples obtained from separate cultures grown in parallel. Samples were combined after final DNA amplification and before labeling.

Array design, hybridization, and data extraction

For the ChIP-chip analysis, immunoprecipitated samples were compared to control samples of input DNA (a DNA sample recovered from the clarified cell extract immediately prior to IP). Following amplification of IP and input control DNA samples, 4 µg each of unlabeled IP and input DNA samples were delivered to Nimblegen for subsequent labeling with Cy3 or Cy5 dyes. The labeled DNAs were then mixed and hybridized to the microarray as a two-color experiment by Nimblegen, and the resulting Cy3- and Cy5-specific signals for each probe were reported to us. The microarray contains two copies of 187,204 Tm-matched ≥45mer oligonucleotides that tile the chromosome with an average separation of 24.5 bp. The Cy3 and Cy5 intensities for the probe set were converted to log2(IP/input) values that were corrected for dye interaction by lowess normalization (Yang et al. 2002), using the normalizeWithinArrays function (Smyth and Speed 2003) in the limma package (Smyth 2005) for the statistical program R (R Development Core Team 2006). Lowess normalization corrects for dye interactions by generating a local regression model of log2(IP/input) vs. log2(IP * input) signals, with the global median of the data set set to zero. For each experiment, data were processed from one, two, or three microarrays (Table 2.S7) and averaged after quantile normalization using the normalize.quantiles function in the R package affy (Gautier et al. 2004). Each datapoint in the combined datasets was associated with a genome position corresponding to the midpoint of the corresponding probe.
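The averaging of replicate arrays after quantile normalization can be illustrated with a minimal Python sketch. It is not the R code used for the analysis: the function names are hypothetical, and ties are broken by probe order rather than handled as normalize.quantiles may handle them. Each array is assumed to be a list of log2(IP/input) values in the same probe order.

```python
from statistics import mean

def quantile_normalize(arrays):
    """Force every array onto the same empirical distribution.

    Assumption: all arrays list the same probes in the same order.
    Ties are broken by probe order, which is a simplification.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    # Sort each array, then average the sorted values rank by rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [mean(s[i] for s in sorted_arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest signal.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

def average_arrays(arrays):
    """Average replicate arrays probe by probe after quantile normalization."""
    normalized = quantile_normalize(arrays)
    return [mean(values) for values in zip(*normalized)]
```

After this step, each replicate contributes the same distribution of values, so the probe-wise average is not dominated by whichever array had the widest dynamic range.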

Selection of TUs and data analysis

To identify high-quality TUs for careful analysis, the ChIP-chip signals for RNAP and σ70 across the entire E. coli K12 genome were visually compared to gene and TU annotations using SignalMap (Nimblegen, Inc.; U00096.2 gene annotation, www.ncbi.nlm.nih.gov; RegulonDB 4/4/06 TU annotation, http://regulondb.ccg.unam.mx:80). Promoter locations were identified based on previous genome annotations (Huerta and Collado-Vides 2003; Gama-Castro et al. 2008). High-quality TUs (109 total; Table 2.S1) were selected using two criteria: (1) the RNAP signal across the entire TU was visibly above the robust mean of the signal across the entire genome (Tukey bi-weight mean, see following section); and (2) adjacent signals from other TUs were well separated from the signals from the identified TU. For each TU, the center of the σ70 peak and the upstream and downstream ends of the TU (defined as the 5’ end of the first gene and the 3’ end of the last gene, respectively) were recorded. The following parameters were then calculated computationally: the center of the σ70 peak, the center of the promoter-proximal RNAP peak, the average signal of RNAP over the TU, and the average signals for RNAP, σ70, NusA, NusG, and Rho over a 200 bp segment surrounding the center of the TU.

Background determination

The background signal distribution (mean, s.d.) for each dataset was calculated from the signals in genome regions that met three criteria: (i) the region was greater than 1 kb; (ii) the log2(IP/input) signals for all 300 bp windows within the region were indistinguishable from those of bglB (Student’s t test; p<0.05); and (iii) no portion of the region overlapped a gene for which the estimated transcript abundance is 1/cell or higher in expression analysis conducted in identical growth conditions (Allen et al. 2003). There were 170 regions that met these criteria, corresponding to 328 kb or 7.1% of the genome (Table 2.2). The distribution of log2(IP/input) signals in these regions was normal for each dataset (Cramer-von Mises normality test; p<0.02). It is likely that this method of defining background regions of non-specific RNAP or regulator interactions with DNA underestimates the true extent of these regions. As mentioned in the text, 84% of the RNAP signals for the genome-wide probe set were above this mean background signal, consistent with the estimate of Selinger et al. (2000) that 87% of E. coli ORFs are expressed during log-phase growth. Thus, it is also likely that low levels of specific RNAP or regulator association with DNA are not detected above background in our ChIP-chip experiments, but might be detected if the efficiencies of crosslinking or IP were greater (as they could be in other experiments). For this reason, the regions used for definition of background levels of non-specific association here (Table 2.S2) should not be generalized to other experiments, but need to be reassessed for each experiment.

The assignment of background signal in RNAP and regulator ChIP-chip experiments is a complex issue. Most available methods were adapted from expression array analysis and offer imperfect solutions. Nimblegen, for instance, subtracts a background value derived from the Tukey bi-weight robust mean estimator (Hoaglin et al. 2000). This is an appropriate background estimate when the number of DNA locations occupied by a protein is small relative to the genome-wide probe set. However, this is not the case for RNAP and many regulators of transcription that interact with RNAP. Hence, the background levels estimated by the method described above are well below the Tukey bi-weight mean (0 in Fig. 2.1C). We considered using the mode of the ChIP-signal distribution (-0.23 in Fig. 2.1C) as the true background, under the assumption that it represents the mean of a normal background distribution whose right portion is merged with signal above background. However, even the mode proved to be above the background for RNAP as estimated from bglB-like regions (Fig. 2.1C). We settled on the average signal from bglB-like regions as the best estimator of background signal in our ChIP-chip experiments, but note that even this background may include signal from non-specifically bound RNAP and that the extent of non-specific RNAP association with DNA could be affected by other proteins that interact with DNA, including chromatin-like proteins or actively transcribing RNAP.

Calculation of apparent occupancy (Occapp)

We defined the apparent occupancy (Occapp) as the fractional signal (S; lowess-corrected and quantile-normalized IP signal/input signal) for a given genome location relative to the maximal enrichment signal (M) observed anywhere in the genome, after correcting both values by subtraction of the background signal (B, see preceding section). The maximal enrichment signal was defined empirically as the average of the ten sets of three adjacent probes with highest total signal in the dataset. We calculated this statistic using rolling sets of three adjacent probes to minimize distortions from single-probe outliers exhibiting abnormal hybridization. Where the maximal enrichment signal reflects an actual occupancy of 1 (i.e., a given DNA site is occupied by the protein in question 100% of the time) and where the efficiency of crosslinking can be assumed to be constant for all DNA sites for a given protein, then Occapp can be regarded as the actual occupancy of the protein for a given site on DNA. More generally, Occapp is an empirical measure of apparent occupancy that will be a function of both the relative rate of crosslinking between the protein and a given DNA site and the fractional occupancy of the protein on that site relative to the maximum observed in a given experiment (where the maximum observed occupancy may be less than 100%).

These qualifications are especially important in the case of active RNAP-DNA complexes (e.g., initiating and elongating complexes) because the exact protein-DNA contacts, and thus the proximity of formaldehyde-reactive moieties, are constantly changing as DNA translocates through RNAP. σ factors present an especially profound case of these qualifications because σ contacts to DNA change dramatically as RNAP moves away from a promoter; both σ itself and σ-RNAP contacts undergo profound rearrangements (Marr et al. 2001; Murakami et al. 2002; Murakami and Darst 2003; Mooney et al. 2005; Nickels et al. 2005; Kapanidis et al. 2006). Thus, it is likely that the rates of σ70-DNA crosslinking (either directly via σ70-DNA crosslinks or indirectly via combined σ70-RNAP and RNAP-DNA crosslinks) will vary dramatically between initiating and elongating complexes. For this reason, the level of σ70 occupancy on ECs is uncertain (see Discussion).

Subject to these qualifications, Occapp can be calculated from the observed, maximal, and background signals (S, M, and B, respectively) using equation 1.

$$\mathrm{Occ}_{\mathrm{app}} = \frac{S - B}{M - B} \qquad (1)$$

Since the ratio signals are usually manipulated as log2 ratios, this can be rearranged into equation 2, which is convenient for direct calculation once log2(M) and log2(B) have been determined.

$$\mathrm{Occ}_{\mathrm{app}} = \frac{2^{\log_2(S) - \log_2(B)} - 1}{2^{\log_2(M) - \log_2(B)} - 1} \qquad (2)$$
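As an illustration, the calculation in equations 1 and 2 can be sketched in a few lines of Python. The function names are hypothetical. The sketch assumes that probe signals are supplied in genome order, that the rolling sets of three probes may overlap, and that the caller decides whether M is estimated on linear or log2 values, since the text does not specify these details.

```python
def max_enrichment(signals, n_sets=10, width=3):
    """Estimate M as the mean per-probe signal of the n_sets rolling
    windows of `width` adjacent probes with the highest total signal.

    Assumption: windows may overlap (rolling sets).
    """
    if len(signals) < width:
        raise ValueError("need at least `width` probes")
    totals = [sum(signals[i:i + width])
              for i in range(len(signals) - width + 1)]
    top = sorted(totals, reverse=True)[:n_sets]
    return sum(top) / (len(top) * width)

def occ_app(log2_s, log2_b, log2_m):
    """Equation 2: apparent occupancy from log2 signal (S),
    log2 background (B), and log2 maximal enrichment (M)."""
    return (2 ** (log2_s - log2_b) - 1) / (2 ** (log2_m - log2_b) - 1)
```

By construction, a probe at background gives an Occapp of 0 and a probe at the maximal enrichment signal gives 1; probes below background give negative values.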

Calculation of Aggregate Normalized Occapp Profiles


To compute aggregate normalized signals for RNAP and regulators (Figs. 2.4, 2.6E, 2.6G, 2.S3, and 2.S4A), the ChIP-chip signals [log2(IP/input)] for each of the TUs to be averaged were extracted, interpolated to 20 bp steps, smoothed by calculating the average of each data point with the 2 points on either side of it, and then assigned minus or plus bp coordinates relative to the center of the σ70 peak in the promoter region for each TU (which was set to 0). The log2(IP/input) signals were then converted to Occapp as described above and normalized so that the highest Occapp value in the TU for a given regulator or RNAP was set to 1. The Occapp values were combined to yield an average value for each TU coordinate from -500 to +1000 (for all high expression TUs and for TUs not exhibiting promoter-proximal peaks, Figs. 2.4A and 2.4B) or to +1300 (for TUs exhibiting promoter-proximal peaks, Figs. 2.4C, 2.6E, 2.6G, 2.S3, and 2.S4A).
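The per-TU steps described above (interpolation, smoothing, re-centering, conversion to Occapp, normalization, and averaging) might be sketched as follows. This is an illustrative outline rather than the analysis code: all function names are hypothetical, linear interpolation is assumed, the smoothing window is truncated at the ends of the profile, and probe coordinates are assumed to be sorted, distinct, and increasing in the direction of transcription.

```python
from bisect import bisect_right

def interpolate(positions, values, grid):
    """Linearly interpolate (positions, values) onto grid.
    Values beyond either end are held constant (an assumption)."""
    out = []
    for x in grid:
        j = bisect_right(positions, x)
        if j == 0:
            out.append(values[0])
        elif j == len(positions):
            out.append(values[-1])
        else:
            x0, x1 = positions[j - 1], positions[j]
            y0, y1 = values[j - 1], values[j]
            out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out

def smooth(values, half_width=2):
    """Average each point with the half_width points on either side."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half_width): i + half_width + 1]
        out.append(sum(window) / len(window))
    return out

def tu_profile(positions, log2_signal, sigma_center, log2_b, log2_m,
               start=-500, end=1000, step=20):
    """Normalized Occapp profile for one TU, with the sigma70 peak at 0."""
    rel = [p - sigma_center for p in positions]
    grid = list(range(start, end + 1, step))
    smoothed = smooth(interpolate(rel, log2_signal, grid))
    # Equation 2, applied point by point.
    occ = [(2 ** (s - log2_b) - 1) / (2 ** (log2_m - log2_b) - 1)
           for s in smoothed]
    top = max(occ)  # assumes at least one point lies above background
    return grid, [o / top for o in occ]

def aggregate(profiles):
    """Average normalized profiles coordinate by coordinate."""
    return [sum(col) / len(col) for col in zip(*profiles)]
```

Normalizing each TU to its own maximum before averaging keeps a few very highly occupied TUs from dominating the aggregate profile.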

Peak Identification

We used the method of Reppas et al. (2006) to compute the locations of RNAP, σ70, NusA, NusG, and ρ peaks near gene starts (Table 2.S4; Figs. 2.6 and 2.S5). The background-corrected, quantile-averaged log2(IP/input) ratios were assigned to the midpoint coordinate of the corresponding probe and then used to interpolate values at 20-bp intervals along the genome. The interpolated values were smoothed by two rounds of rolling averaging using a 280-bp window centered on each 20-bp interval genome coordinate. The smoothed data were used to assign maxima wherever a value was higher than the values ±140 bp and to assign minima wherever a value was lower than the values ±140 bp. Maxima within 280 bp were merged; a peak location was assigned corresponding to the highest absolute signal value. Adjacent minima were merged using analogous criteria. Peak heights were then computed as the greater of the differences between the peak signal and the signals at the minima on either side of the peak.
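A simplified sketch of the smoothing and maxima-calling steps is given below. It is not the implementation of Reppas et al. (2006): the function names are hypothetical, the input is assumed to be already interpolated to 20-bp intervals, the chromosome is treated as linear rather than circular, a maximum is assumed to mean a value higher than every other value within ±140 bp, and minima and peak heights are omitted for brevity.

```python
STEP = 20            # bp between interpolated points (from the text)
HALF = 140 // STEP   # 140 bp on either side corresponds to 7 grid points

def rolling_mean(values, half=HALF):
    """One round of rolling averaging over a centered 280-bp window."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def local_maxima(values, half=HALF):
    """Indices whose value exceeds every other value within +/- half points."""
    peaks = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        if all(v > values[j] for j in range(lo, hi) if j != i):
            peaks.append(i)
    return peaks

def merge_close(peaks, values, min_gap=280 // STEP):
    """Merge maxima within min_gap points, keeping the higher one."""
    merged = []
    for p in peaks:
        if merged and p - merged[-1] <= min_gap:
            if values[p] > values[merged[-1]]:
                merged[-1] = p
        else:
            merged.append(p)
    return merged

def find_peaks(signal):
    """Two rounds of smoothing followed by maxima calling and merging."""
    smoothed = rolling_mean(rolling_mean(signal))
    peaks = merge_close(local_maxima(smoothed), smoothed)
    return smoothed, peaks
```

The returned indices refer to positions on the 20-bp grid, so multiplying by 20 and adding the grid origin recovers genome coordinates.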

To assess the statistical significance of the peaks, we divided the peaks into 0.1 interval bins with a lower cutoff of peak height=0.25. Starting with the lowest bin, we then calculated the distance of each peak to the nearest gene start or stop and compared these distances to those computed using genome coordinates arbitrarily rotated 1 × 10^6 bp around the E. coli K12 chromosome, using the Wilcoxon-Mann-Whitney rank-sum test for non-similarity of distributions (for σ70 peaks we used only gene starts).

By these criteria, σ70 peaks in the ≥0.45 peak-height bins and RNAP, NusA, and ρ peaks in the ≥0.35 peak-height bins were statistically significant (p values for similarity of the distributions <0.0001).
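The distance comparison that underlies this test can be sketched as follows. The function names are hypothetical, and the genome length is passed in by the caller rather than assumed. The sketch only produces the two sets of distances (observed and rotated); the rank-sum test itself would then be applied to them with a statistics package and is not reproduced here.

```python
from bisect import bisect_left

def nearest_distance(pos, sorted_features, genome_length):
    """Circular distance from pos to the closest feature coordinate."""
    i = bisect_left(sorted_features, pos)
    # Neighbors on either side; index -1 and the modulo wrap around
    # the origin of the circular chromosome.
    candidates = (sorted_features[i - 1],
                  sorted_features[i % len(sorted_features)])
    return min(min(abs(pos - f), genome_length - abs(pos - f))
               for f in candidates)

def peak_distances(peaks, features, genome_length, offset=0):
    """Distances from peaks to the nearest feature after rotating the
    peak coordinates by `offset` bp around the chromosome."""
    feats = sorted(features)
    return [nearest_distance((p + offset) % genome_length, feats, genome_length)
            for p in peaks]
```

Calling peak_distances once with offset=0 and once with offset=1_000_000 gives the observed and rotated distance distributions for one peak-height bin.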

Average RNAP and regulator levels on genes

We computed average log2(IP/input) for RNAP, σ70, NusA, NusG, and ρ for each E. coli K12 gene by averaging the background-corrected, quantile-averaged log2(IP/input) ratios for all probes whose midpoints were within 100 bp of the middle of a TU (Table 2.S1; Fig. 2.3) or were between the start and stop of a gene (Table 2.3; Fig. 2.5). To avoid bias from nearby peaks for RNAP, σ70, NusA, or Rho, we excluded values for probes whose midpoints were within 500 bp of a peak if the absolute peak height was >0.2 above the mid-TU- or gene-average.


Functional classification of genes and determination of average NusG/RNAP ratios

We calculated the average NusG/RNAP ratio for each gene and then sorted the genes according to the functional classification of Riley (1997). This classification separates all E. coli genes into 30 general classes based on the function of the gene product (Table 2.S5). We then averaged the NusG/RNAP ratio for all genes within each functional group to determine a NusG/RNAP average for each class and plotted it against the average length of the genes in the class (Fig. 2.5E).

Bicyclomycin inhibition of Rho

We used bicyclomycin, a known antibiotic and inhibitor of Rho (Zwiefka et al. 1993), to test the effect of Rho on RNAP ChIP-chip signals. To do this, we used conditions previously shown to inhibit Rho function without significant effects on growth of E. coli (Ederth et al. 2006). RNAP signals from ChIP DNA samples were compared for cells grown with or without 20 µg/ml bicyclomycin (Fig. 2.6).

Calculation of traveling ratio (TR)

We used the method of Reppas et al. (2006) to calculate TR for genes greater than 1 kb in length and to which an RNAP peak could be assigned (Table 2.S4; Figs. 2.6 and 2.S5) based on occurrence of the peak within 300 bp of the start of the gene. TR was calculated as the ratio of the average Occapp within 100 bp of the mid-point of the gene to the RNAP peak Occapp.
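A minimal sketch of this calculation is shown below, assuming probe midpoints and their Occapp values are available as parallel lists. The function names and the handling of genes with no probes near the midpoint are illustrative choices, not part of the published method.

```python
def eligible(gene_start, gene_end, peak_pos, min_length=1000, max_offset=300):
    """Gene is longer than 1 kb and has an RNAP peak within 300 bp of its start."""
    return (abs(gene_end - gene_start) > min_length
            and abs(peak_pos - gene_start) <= max_offset)

def traveling_ratio(positions, occ, gene_start, gene_end, peak_occ,
                    half_window=100):
    """Average Occapp within half_window bp of the gene midpoint,
    divided by the Occapp of the promoter-proximal RNAP peak."""
    mid = (gene_start + gene_end) / 2
    body = [o for p, o in zip(positions, occ) if abs(p - mid) <= half_window]
    if not body:
        raise ValueError("no probes near the gene midpoint")
    return (sum(body) / len(body)) / peak_occ
```

A TR near 1 indicates similar RNAP occupancy at the promoter-proximal peak and in the middle of the gene, whereas a low TR indicates that occupancy drops after the peak.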

RT-PCR quantitation of RNA levels

To obtain the data shown in supplemental Fig. 2.S5, we performed RT-PCR on RNA isolated from MG1655 cells grown in conditions identical to those used for ChIP-chip experiments (early log-phase in MOPS minimal medium with 0.2% glucose). Total RNA was isolated using boiling lysis and phenol extraction as described by Schneider et al. (2003). Total RNA was converted to cDNA using the MasterAmp High Fidelity RT-PCR Kit with random nonamers as primers according to the instructions provided by the manufacturer (Epicentre Biotechnologies). Quantitative real-time PCR (qPCR) was performed using the SYBR Green JumpStart Taq ReadyMix for Quantitative PCR (Sigma-Aldrich) on a model 7500 real-time PCR thermal cycler (Applied Biosystems). Primers for qPCR were designed to hybridize within the first 100 bp downstream of the σ70 peak to measure attenuated and full-length transcripts (5´-proximal RNA; PCR product 1 in Fig. 2.S5) or within the last 100 bp of the transcription unit to detect full-length transcripts (3´-proximal RNA; PCR product 2 in Fig. 2.S5). Primer sequences are available upon request. Primer quality was assessed using Primer3 (Rozen and Skaletsky, http://frodo.wi.mit.edu/). Differences in priming efficiency were accounted for by normalizing the cycle threshold (Ct) values of the experimental samples to those produced by a standard curve of defined DNA concentrations. In Fig. 2.S5, for each TU, the relative amount of cDNA signal from PCR is shown normalized to the amount of signal from the upstream primer pair.
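One common way to carry out this kind of standard-curve normalization is sketched below. It is an assumption about the general approach rather than a record of the exact calculation used: Ct is modeled as a linear function of log10(DNA concentration) for each primer pair, the fitted line is inverted to convert sample Ct values to quantities, and the downstream quantity is divided by the upstream quantity. All function names are hypothetical.

```python
import math

def fit_standard_curve(concentrations, cts):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template quantity from Ct."""
    return 10 ** ((ct - intercept) / slope)

def relative_amount(ct_downstream, curve_downstream, ct_upstream, curve_upstream):
    """Downstream (3'-proximal) signal normalized to the upstream
    (5'-proximal) signal, each read off its own standard curve."""
    q_down = quantity_from_ct(ct_downstream, *curve_downstream)
    q_up = quantity_from_ct(ct_upstream, *curve_upstream)
    return q_down / q_up
```

Using a separate curve for each primer pair is what corrects for differences in priming efficiency, because each pair's Ct values are converted to quantities on its own scale before the ratio is taken.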


Acknowledgements

We thank C. Herring for construction of the HA3::nusG allele, K. Struhl for helpful discussions and sharing results prior to publication, and J. Grass for assistance with a control experiment. This work was supported by grants to A.Z.A. (USDA Hatch) and R.L. (NIH GM38660).


Supplementary Figures

Figure 2.S1. RNAP (β´) peak Occapp versus σ70 peak Occapp. Of 1625 σ70 peaks that were determined to be significant (see Supplemental Experimental Procedures), only 671 were within 500 bp of a significant RNAP (β´) peak. The Pearson correlation of peak Occapp values for these 671 cases was 0.77, with scatter as represented in the diagram (note log scale).

Figure 2.S1

Figure 2.S2. Offset of RNAP peaks from σ70 peaks. Peak locations were identified as the location within a given TU of the highest values for RNAP and σ70 log2(IP/input) signals (averaged and smoothed twice over a 300 bp window as described in Supplemental Experimental Procedures). The average RNAP signal over a 200 bp window in the middle of the TU (as described for Fig. 2.3) is plotted (y-axis) versus the offset distance of the RNAP peak from the σ70 peak (x-axis; d in Fig. 2.4A). (A) Distribution of peak offsets for all 109 high-quality TUs (Table 2.S1). Mean = 140 ± 110 bp. (B) Distribution of peak offsets for 13 TUs lacking an obvious promoter-proximal RNAP peak (see Fig. 2.4 and Table 2.S1). Mean = 180 ± 100 bp. (C) Distribution of peak offsets for 29 TUs exhibiting an obvious promoter-proximal RNAP peak (see Fig. 2.4 and Table 2.S1). Mean = 130 ± 80 bp.

Figure 2.S2

Figure 2.S3. Shape and location of RNAP and σ70 peaks with and without treatment of cells with rifampicin. Aggregate normalized Occapp was calculated for the 29 TUs exhibiting an obvious promoter-proximal RNAP peak (Fig. 2.4 and Table 2.S1) as described in Materials and Methods. The centers of the peaks were located as the highest point in each peak. The slight skewing of the σ70 peak in the direction of transcription is indicative of persistence of σ70-DNA crosslinking (either direct or mediated through RNAP) for about 20 bp after the initiation of transcription by RNAP. The slight upstream shift in the σ70 peak (and its coincidence with the RNAP peak) after treatment of cells with rifampicin also suggests that translocation of DNA through RNAP during active transcription shifts the location of the σ70 and RNAP peaks downstream (by about 25 bp and 155 bp, respectively). Note that because the Occapp scales were created from background and maximal signal values specific to each curve (see Materials and Methods), the aggregate normalized Occapp for each curve should not be equated with actual occupancy or compared to values in the other curves.

Figure 2.S3

Figure 2.S4. Comparison of NusG profiles. ChIP profiles are colored as described in the legend to Fig. 2.4 (light green is NusG detected by 12CA5 monoclonal antibody in a strain containing HA3-NusG) with NusG detected by anti-NusG polyclonal antisera shown in dark green (NusG* in panel A inset). (A) Occapp on the carAB TU. (B) Occapp on the trp TU. (C) Aggregate normalized Occapp for the 13 TUs not exhibiting a promoter-proximal peak (Fig. 2.4B). (D) Aggregate normalized Occapp for the 29 TUs exhibiting a promoter-proximal peak (Fig. 2.4C).

Figure 2.S4

Figure 2.S5. RT-PCR quantitation of RNA levels near transcription start sites and within TUs. RNAs were isolated and quantified as described in Materials and Methods. For each panel, the 5´-proximal RNA level is represented by the left column and the RNA level measured within the TU is represented by the right column. Horizontal black bars beneath the charts depict the locations of PCR fragments used for quantitation relative to the genes in the TUs (genes in different panels are not drawn to scale relative to genes in other panels). (A)-(H) Examples of TUs that exhibit promoter-proximal RNAP peaks (blue). (I)-(J) Examples of TUs that do not exhibit promoter-proximal RNAP peaks (yellow).

Figure 2.S5

Figure 2.S6. Fraction of σ70 peaks not associated with transcripts (Reppas et al. 2006) for which NusA or Rho peaks occur within 300 bp of an associated RNAP peak, binned by RNAP peak height. The 74 peaks reported by Reppas et al. (2006) that are associated with RNAP peaks in both the Reppas et al. data and our own data were tested for association with NusA or Rho peaks. Overall, 45 (61%) of these 74 were associated with a NusA peak, 49 (66%) were associated with a ρ peak, 33 (46%) were associated with both, and only 13 (18%) were associated with neither a NusA nor a Rho peak (Table 2.S6). These data were binned based on the height of the RNAP peak (in our data). The correlation of NusA and ρ peak association with RNAP peak height suggests that some associated NusA and ρ peaks were missed because they fell below the limit of detection.

Figure 2.S6

Figure 2.S7. Comparison of aggregate Occapp profiles from TUs exhibiting promoter-proximal RNAP peaks to those for TUs regulated by attenuation (trp and pyrBI). (A) Aggregate normalized Occapp was calculated for the 29 TUs exhibiting an obvious promoter-proximal RNAP peak (Fig. 2.4C and Table 2.S1; see Materials and Methods). (B) Normalized Occapp was calculated for the trp operon as described for aggregate normalized Occapp except that it was not averaged with other TUs. (C) Normalized Occapp was calculated for the pyrBI operon as described for aggregate normalized Occapp except that it was not averaged with other TUs.

Figure 2.S7

Figure 2.S8. Effect of Rho inhibition by bicyclomycin on expression of genes (Cardinale et al. 2008) as a function of traveling ratio. The 293 genes greater than 1 kb for which traveling ratios could be calculated (see legend to Fig. 2.6) were examined in the datasets reported by Cardinale et al. (2008). Expression ratios were calculated for each bicyclomycin concentration as described by the authors. The absence of any increase in expression ratios for genes that exhibit low traveling ratios confirms our conclusion that ρ-dependent termination is not required for formation of promoter-proximal RNAP peaks. The only evident trend is decreased expression of many genes, irrespective of traveling ratio, at the highest bicyclomycin concentration (black dots).

Figure 2.S8

References

Adelman K, Marr MT, Werner J, Saunders A, Ni Z, Andrulis ED, Lis JT. 2005. Efficient release from promoter-proximal stall sites requires transcript cleavage factor TFIIS. Mol Cell 17: 103-112.

Allen TE, Herrgard MJ, Liu M, Qiu Y, Glasner JD, Blattner FR, Palsson BO. 2003. Genome-scale analysis of the uses of the Escherichia coli genome: model-driven analysis of heterogeneous data sets. J Bacteriol 185: 6392-6399.

Arthur TM, Burgess RR. 1998. Localization of a sigma70 binding site on the N terminus of the Escherichia coli RNA polymerase beta' subunit. J Biol Chem 273: 31381-31387.

Artsimovitch I, Landick R. 2000. Pausing by bacterial RNA polymerase is mediated by mechanistically distinct classes of signals. Proc Natl Acad Sci U S A 97: 7090-7095.

Artsimovitch I, Landick R. 2002. The transcriptional regulator RfaH stimulates RNA chain synthesis after recruitment to elongation complexes by the exposed nontemplate DNA strand. Cell 109: 193-203.

Bar-Nahum G, Nudler E. 2001. Isolation and characterization of sigma(70)-retaining transcription elongation complexes from Escherichia coli. Cell 106: 443-451.

Belogurov GA, Vassylyeva MN, Svetlov V, Klyuyev S, Grishin NV, Vassylyev DG, Artsimovitch I. 2007. Structural basis for converting a general transcription factor into an operon-specific virulence regulator. Mol Cell 26: 117-129.

Burns CM, Richardson LV, Richardson JP. 1998. Combinatorial effects of NusA and NusG on transcription elongation and Rho-dependent termination in Escherichia coli. J Mol Biol 278: 307-316.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.


Chander M, Austin KM, Aye-Han NN, Sircar P, Hsu LM. 2007. An alternate mechanism of abortive release marked by the formation of very long abortive transcripts. Biochemistry 46: 12687-12699.

Core LJ, Lis JT. 2008. Transcription regulation through promoter-proximal pausing of RNA polymerase II. Science 319: 1791-1792.

Defez R, De Felice M. 1981. Cryptic operon for beta-glucoside metabolism in Escherichia coli K12: genetic evidence for a regulatory protein. Genetics 97: 11-25.

deHaseth PL, Lohman TM, Burgess RR, Record MT, Jr. 1978. Nonspecific interactions of Escherichia coli RNA polymerase with native and denatured DNA: differences in the binding behavior of core and holoenzyme. Biochemistry 17: 1612-1622.

Ederth J, Mooney R, Isaksson L, Landick R. 2006. Functional interplay between the downstream DNA-Jaw domain of bacterial RNA polymerase and allele-specific residues in the product RNA-binding pocket. J Mol Biol 356: 1163-1179.

ENCODE Project Consortium. 2004. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306: 636-640.

Estrem ST, Ross W, Gaal T, Chen ZW, Niu W, Ebright RH, Gourse RL. 1999. Bacterial promoter architecture: subsite structure of UP elements and interactions with the carboxy-terminal domain of the RNA polymerase alpha subunit. Genes Dev 13: 2134-2147.

Farnham PJ, Greenblatt J, Platt T. 1982. Effects of NusA protein on transcription termination of the tryptophan operon of Escherichia coli. Cell 29: 945-951.

Gama-Castro S, Jimenez-Jacinto V, Peralta-Gil M, Santos-Zavaleta A, Penaloza-Spinola MI, Contreras-Moreira B, Segura-Salazar J, Muniz-Rascado L, Martinez-Flores I, Salgado H et al. 2008. RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res 36: D120-124.

Gautier L, Cope L, Bolstad BM, Irizarry RA. 2004. affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20: 307-315.

Gill SC, Weitzel SE, von Hippel PH. 1991. Escherichia coli sigma 70 and NusA proteins. I. Binding interactions with core RNA polymerase in solution and within the transcription complex. J Mol Biol 220: 307-324.

Grainger DC, Hurd D, Harrison M, Holdstock J, Busby SJ. 2005. Studies of the distribution of Escherichia coli cAMP-receptor protein and RNA polymerase along the E. coli chromosome. Proc Natl Acad Sci U S A 102: 17693-17698.

Greenblatt J, McLimont M, Hanly S. 1981. Termination of transcription by nusA gene protein of Escherichia coli. Nature 292: 215-220.

Grigorova IL, Phleger NJ, Mutalik VK, Gross CA. 2006. Insights into transcriptional regulation and sigma competition from an equilibrium model of RNA polymerase binding to DNA. Proc Natl Acad Sci U S A 103: 5332-5337.

Guenther MG, Levine SS, Boyer LA, Jaenisch R, Young RA. 2007. A chromatin landmark and transcription initiation at most promoters in human cells. Cell 130: 77-88.

Herring CD, Glasner JD, Blattner FR. 2003. Gene replacement without selection: regulated suppression of amber mutations in Escherichia coli. Gene 311: 153-163.

Herring CD, Raffaelle M, Allen TE, Kanin EI, Landick R, Ansari AZ, Palsson BO. 2005. Immobilization of Escherichia coli RNA polymerase and location of binding sites by use of chromatin immunoprecipitation and microarrays. J Bacteriol 187: 6166-6174.

Hoaglin D, Mosteller F, Tukey J. 2000. Understanding Robust and Exploratory Data Analysis. Wiley, New York.

Huerta AM, Collado-Vides J. 2003. σ70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals. J Mol Biol 333: 261-278.

Kapanidis AN, Margeat E, Ho SO, Kortkhonjia E, Weiss S, Ebright RH. 2006. Initial transcription by RNA polymerase proceeds through a DNA-scrunching mechanism. Science 314: 1144-1147.

Kapanidis AN, Margeat E, Laurence TA, Doose S, Ho SO, Mukhopadhyay J, Kortkhonjia E, Mekler V, Ebright RH, Weiss S. 2005. Retention of transcription initiation factor σ70 in transcription elongation: single-molecule analysis. Mol Cell 20: 347-356.

Kassavetis GA, Chamberlin MJ. 1981. Pausing and termination of transcription within the early region of bacteriophage T7 DNA in vitro. J Biol Chem 256: 2777-2786.

Kuo MH, Allis CD. 1999. In vivo cross-linking and immunoprecipitation for studying dynamic Protein:DNA associations in a chromatin environment. Methods 19: 425-433.

Lee TI, Johnstone SE, Young RA. 2006. Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat Protoc 1: 729-748.

Li J, Horwitz R, McCracken S, Greenblatt J. 1992. NusG, a new Escherichia coli elongation factor involved in transcriptional antitermination by the N protein of phage lambda. J Biol Chem 267: 6012-6019.

Lin DC, Grossman AD. 1998. Identification and characterization of a bacterial chromosome partitioning site. Cell 92: 675-685.

Linn T, Greenblatt J. 1992. The NusA and NusG proteins of Escherichia coli increase the in vitro readthrough frequency of a transcriptional attenuator preceding the gene for the beta subunit of RNA polymerase. J Biol Chem 267: 1449-1454.

Liu K, Zhang Y, Severinov K, Das A, Hanna MM. 1996. Role of Escherichia coli RNA polymerase alpha subunit in modulation of pausing, termination and anti-termination by the transcription elongation factor NusA. EMBO J 15: 150-161.

Mah TF, Kuznedelov K, Mushegian A, Severinov K, Greenblatt J. 2000. The alpha subunit of E. coli RNA polymerase activates RNA binding by NusA. Genes Dev 14: 2664-2675.

Mah TF, Li J, Davidson AR, Greenblatt J. 1999. Functional importance of regions in Escherichia coli elongation factor NusA that interact with RNA polymerase, the bacteriophage lambda N protein and RNA. Mol Microbiol 34: 523-537.

Marr MT, Datwyler SA, Meares CF, Roberts JW. 2001. Restructuring of an RNA polymerase holoenzyme elongation complex by lambdoid phage Q proteins. Proc Natl Acad Sci U S A 98: 8972-8978.

Marr MT, Roberts JW. 2000. Function of transcription cleavage factors GreA and GreB at a regulatory pause site. Mol Cell 6: 1275-1285.

Mason SW, Li J, Greenblatt J. 1992. Host factor requirements for processive antitermination of transcription and suppression of pausing by the N protein of bacteriophage lambda. J Biol Chem 267: 19418-19426.

Matsumoto Y, Shigesada K, Hirano M, Imai M. 1986. Autogenous regulation of the gene for transcription termination factor rho in Escherichia coli: localization and function of its attenuators. J Bacteriol 166: 945-958.

Merino E, Yanofsky C. 2005. Transcription attenuation: a highly conserved regulatory strategy used by bacteria. Trends Genet 21: 260-264.

Mooney RA, Darst SA, Landick R. 2005. Sigma and RNA polymerase: an on-again, off-again relationship? Mol Cell 20: 335-345.

Mooney RA, Landick R. 2003. Tethering σ70 to RNA polymerase reveals high in vivo activity of σ factors and σ70-dependent pausing at promoter-distal locations. Genes Dev 17: 2839-2851.

Mukhopadhyay J, Kapanidis AN, Mekler V, Kortkhonjia E, Ebright YW, Ebright RH. 2001. Translocation of sigma(70) with RNA polymerase during transcription: fluorescence resonance energy transfer assay for movement relative to DNA. Cell 106: 453-463.

Murakami KS, Darst SA. 2003. Bacterial RNA polymerases: the wholo story. Curr Opin Struct Biol 13: 31-39.

Murakami KS, Masuda S, Campbell EA, Muzzin O, Darst S. 2002. Structural basis of transcription initiation: an RNA polymerase holoenzyme/DNA complex. Science 296: 1285-1290.

Muse GW, Gilchrist DA, Nechaev S, Shah R, Parker JS, Grissom SF, Zeitlinger J, Adelman K. 2007. RNA polymerase is poised for activation across the genome. Nat Genet 39: 1507-1511.

Neidhardt FC, Bloch PL, Smith DF. 1974. Culture medium for enterobacteria. J Bacteriol 119: 736-747.

Ng HH, Robert F, Young RA, Struhl K. 2003. Targeted recruitment of Set1 histone methylase by elongating Pol II provides a localized mark and memory of recent transcriptional activity. Mol Cell 11: 709-719.

Nickels BE, Garrity SJ, Mekler V, Minakhin L, Severinov K, Ebright RH, Hochschild A. 2005. The interaction between sigma70 and the beta-flap of Escherichia coli RNA polymerase inhibits extension of nascent RNA during early elongation. Proc Natl Acad Sci U S A 102: 4488-4493.

Nickels BE, Roberts CW, Roberts JW, Hochschild A. 2006. RNA-mediated destabilization of the σ70 region 4/beta flap interaction facilitates engagement of RNA polymerase by the Q antiterminator. Mol Cell 24: 457-468.

R Development Core Team. 2006. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Raffaelle M, Kanin EI, Vogt J, Burgess RR, Ansari AZ. 2005. Holoenzyme switching and stochastic release of sigma factors from RNA polymerase in vivo. Mol Cell 20: 357-366.

Reppas NB, Wade JT, Church GM, Struhl K. 2006. The transition between transcriptional initiation and elongation in E. coli is highly variable and often rate limiting. Mol Cell 24: 747-757.

Revyakin A, Liu C, Ebright RH, Strick TR. 2006. Abortive initiation and productive initiation by RNA polymerase involve DNA scrunching. Science 314: 1139-1143.

Richardson J. 2002. Rho-dependent termination and ATPases in transcript termination. Biochim Biophys Acta 1577: 251.

Riley M. 1997. Genes and proteins of Escherichia coli K-12 (GenProtEC). Nucleic Acids Res 25: 51-52.

Ring B, Yarnell W, Roberts J. 1996. Function of E. coli RNA polymerase σ factor σ70 in promoter-proximal pausing. Cell 86: 485-493.

Selinger DW, Cheung KJ, Mei R, Johansson EM, Richmond CS, Blattner FR, Lockhart DJ, Church GM. 2000. RNA expression analysis using a 30 base pair resolution Escherichia coli genome array. Nat Biotechnol 18: 1262-1268.

Shankar S, Hatoum A, Roberts JW. 2007. A transcription antiterminator constructs a NusA-dependent shield to the emerging transcript. Mol Cell 27: 914-927.

Shimamoto N, Kamigochi T, Utiyama H. 1986. Release of the sigma subunit of Escherichia coli DNA-dependent RNA polymerase depends mainly on time elapsed after the start of initiation, not on length of product RNA. J Biol Chem 261: 11859-11865.

Smyth G. 2005. Limma: linear models for microarray data. In Bioinformatics and Computational Biology Solutions using R and Bioconductor (eds. R Gentleman, V Carey, S Dudoit, R Irizarry, W Huber), pp. 397-420. Springer, New York.

Smyth G, Speed T. 2003. Normalization of cDNA microarray data. Methods 31: 265-273.

Solomon MJ, Larsen PL, Varshavsky A. 1988. Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell 53: 937-947.

Steiner T, Kaiser JT, Marinkovic S, Huber R, Wahl MC. 2002. Crystal structures of transcription factor NusG in light of its nucleic acid- and protein-binding activities. EMBO J 21: 4641-4653.

Steinmetz EJ, Warren CL, Kuehner JN, Panbehi B, Ansari AZ, Brow DA. 2006. Genome-wide distribution of yeast RNA polymerase II and its control by Sen1 helicase. Mol Cell 24: 735-746.

Sullivan SL, Gottesman ME. 1992. Requirement for E. coli NusG protein in factor-dependent transcription termination. Cell 68: 989-994.

Torres M, Condon C, Balada JM, Squires C, Squires CL. 2001. Ribosomal protein S4 is a transcription factor with properties remarkably similar to NusA, a protein involved in both non-ribosomal and ribosomal RNA antitermination. EMBO J 20: 3811-3820.

Toulokhonov I, Artsimovitch I, Landick R. 2001. Allosteric control of RNA polymerase by a site that contacts nascent RNA hairpins. Science 292: 730-733.

Vogel U, Jensen KF. 1997. NusA is required for ribosomal antitermination and for modulation of the transcription elongation rate of both antiterminated RNA and mRNA. J Biol Chem 272: 12265-12271.

von Hippel PH, Revzin A, Gross CA, Wang AC. 1974. Non-specific DNA binding of genome regulating proteins as a biological control mechanism: I. The lac operon: equilibrium aspects. Proc Natl Acad Sci U S A 71: 4808-4812.

Wade JT, Struhl K. 2004. Association of RNA polymerase with transcribed regions in Escherichia coli. Proc Natl Acad Sci U S A 101: 17777-17782.

Wade JT, Struhl K. 2008. The transition from transcriptional initiation to elongation. Curr Opin Genet Dev 18: 130-136.

Wade JT, Struhl K, Busby SJ, Grainger DC. 2007. Genomic analysis of protein-DNA interactions in bacteria: insights into transcription and chromosome organization. Mol Microbiol 65: 21-26.

Yakhnin AV, Babitzke P. 2002. NusA-stimulated RNA polymerase pausing and termination participates in the Bacillus subtilis trp operon attenuation mechanism in vitro. Proc Natl Acad Sci U S A 99: 11067-11072.

Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP. 2002. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 30: e15.

Young BA, Anthony LC, Gruber TM, Arthur TM, Heyduk E, Lu CZ, Sharp MM, Heyduk T, Burgess RR, Gross CA. 2001. A coiled-coil from the RNA polymerase β' subunit allosterically induces selective nontemplate strand binding by σ70. Cell 105: 935-944.

Zeitlinger J, Stark A, Kellis M, Hong JW, Nechaev S, Adelman K, Levine M, Young RA. 2007. RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat Genet 39: 1512-1516.

Zwiefka A, Kohn H, Widger WR. 1993. Transcription termination factor rho: the site of bicyclomycin inhibition in Escherichia coli. Biochemistry 32: 3564-3570.

Chapter 3

Rho directs widespread termination of intragenic and stable RNA transcription

This chapter has been published (Jason M. Peters, Rachel A. Mooney, Pei F. Kuan, Jennifer L. Rowland, Sunduz Keles, and Robert Landick. 2009. Rho directs widespread termination of intragenic and stable RNA transcription. Proceedings of the National Academy of Sciences U.S.A. 106:15406-11). I performed the quantitative PCR experiments and the data analysis, and Robert Landick and I wrote the manuscript. Rachel A. Mooney performed the ChIP-chip experiments, Pei F. Kuan and Sunduz Keles provided software and assistance with statistical analyses, and Jennifer L. Rowland performed ChIP-chip sample preparation. Supplementary figures can be found at the end of the chapter, and supplementary tables can be downloaded at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2741264/.

Abstract

The transcription termination factor Rho is a global regulator of RNA polymerase (RNAP). Although individual Rho-dependent terminators have been studied extensively, less is known about the sites of RNAP regulation by Rho on a genome-wide scale. Using chromatin immunoprecipitation and microarrays (ChIP-chip), we examined changes in the distribution of Escherichia coli RNAP in response to the Rho-specific inhibitor bicyclomycin (BCM). We found ~200 Rho-terminated loci that were divided evenly into two classes: intergenic (at the ends of genes) and intragenic (within genes). The intergenic class contained noncoding RNAs such as small RNAs (sRNAs) and transfer RNAs (tRNAs), establishing a previously unappreciated role of Rho in termination of stable RNA synthesis. The intragenic class of terminators included a novel set of short antisense transcripts, as judged by a shift in the distribution of RNAP in BCM-treated cells that was opposite to the direction of the corresponding gene. These Rho-terminated antisense transcripts point to a novel role of noncoding transcription in E. coli gene regulation that may resemble the ubiquitous noncoding transcription recently found to play myriad roles in eukaryotic gene regulation.

Introduction

Transcription termination is critical for maintaining control over gene expression. Bacteria employ two distinct types of termination: intrinsic termination, for which a GC-rich RNA hairpin followed by a U-tract dissociates RNA polymerase (RNAP) without the need for accessory proteins, and factor-dependent termination caused by the Rho protein. Rho was originally identified as a factor that increased the “accuracy” of in vitro transcription by terminating RNAP at specific positions on a bacteriophage λ DNA template (Roberts 1969). Later, Rho was found to be the cause of polarity, whereby the uncoupling of transcription and translation by premature stop codons decreases expression of downstream genes in an operon (Richardson et al. 1975). Rho is a homohexameric protein with RNA-dependent ATPase activity (Galluppi and Richardson 1980). Rho binds to the nascent RNA and translocates 5´ to 3´ along RNA using energy derived from ATP hydrolysis (Richardson 2006). At certain sites, Rho contacts RNAP and terminates the elongation complex (EC) by an unknown mechanism (Banerjee et al. 2006).

Bicyclomycin (BCM) is a specific inhibitor of Rho (Zwiefka et al. 1993). BCM blocks Rho-dependent termination in vivo (Yanofsky and Horn 1995) and in vitro (Magyar et al. 1996) through non-competitive inhibition of the RNA-dependent ATPase activity of Rho (Park et al. 1995). Biochemical and structural analyses show that BCM binds adjacent to the ATPase active site of Rho (Magyar et al. 1996) and prevents ATP hydrolysis by interfering with a key glutamic acid residue that is involved in catalysis (Skordalakes et al. 2005). Treatment of wild-type Escherichia coli K-12 with high concentrations of BCM is lethal (Zwiefka et al. 1993), because rho is an essential gene (Bubunenko et al. 2007). However, sub-lethal doses of BCM are sufficient to perturb Rho termination in vivo (Yanofsky and Horn 1995).

Genome-wide studies have documented the role of Rho as a global regulator of RNAP. Chromatin immunoprecipitation assays using tiling microarrays (ChIP-chip) reveal remarkably similar global distributions of RNAP and Rho on DNA (Mooney et al. 2009). These similar distributions suggest that Rho contacts ECs soon after initiation, interacts with ECs throughout elongation, and interacts with ECs on nearly all transcription units (TUs), rather than having specificity for a small set of genes. Cardinale et al. (2008) used expression array analysis to gauge the effect of BCM treatment on mRNA levels. Their results showed changes in abundance of a subset of transcripts, particularly for genes integrated into the genome by horizontal transfer. Thus, Rho termination occurs preferentially on a subset of genes, even though its physical distribution is widespread. However, the specific locations of BCM-inhibited Rho-dependent terminators have not yet been determined.

We used ChIP-chip to examine changes in the distribution of RNAP in response to Rho inhibition by BCM. We found ~200 Rho-terminated loci where BCM shifted the distribution of RNAP downstream of the apparent termination site. Half of the Rho-dependent terminators were located at the 3´ ends of genes (intergenic), including small RNAs (sRNAs) and transfer RNAs (tRNAs). The other half were found within the coding sequence of annotated genes (intragenic). For one set of intragenic terminators, the readthrough event was in the opposite direction of the gene, indicating antisense transcription.

Results

BCM Alters the Distribution of RNAP

To determine the contribution of Rho to the genome-wide distribution of RNAP, ChIP was performed on cells grown in the presence or absence of BCM at 20 µg/ml. This concentration of BCM was chosen because it did not alter the growth rate of cells under the conditions used in these experiments (Ederth et al. 2006), and thus limited the potential indirect effects that could result from inhibiting Rho. DNAs from ChIP experiments targeting the β or β′ subunit of RNAP and “input” genomic DNA were differentially labeled with Cy3 and Cy5 dyes, then hybridized to a tiling microarray (see Materials and Methods), revealing the RNAP distribution in BCM-treated and untreated conditions. Independent biological replicates showed good agreement (Pearson’s R = 0.9).
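As a rough illustration of the replicate comparison mentioned above, Pearson's R between two replicates can be computed from paired per-probe signals. This is a minimal sketch, not the analysis pipeline used in this work; the function name and the example values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances are left unnormalized; the factors cancel in R.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-probe log2(ChIP/input) values from two replicates
# (illustrative numbers only, not data from the study).
replicate_1 = [0.1, 0.5, 1.2, 2.0, 0.3]
replicate_2 = [0.2, 0.4, 1.0, 2.2, 0.1]
print(round(pearson_r(replicate_1, replicate_2), 3))
```

In practice the two input lists would hold one value per array probe, paired by probe position.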

Changes in the distribution of RNAP upon BCM treatment were readily apparent by visual inspection of the data, and were quantified statistically. A moving average method implemented in the program CMARRT (Kuan et al. 2008) was used to identify regions where at least three consecutive probes exhibited increased ChIP-chip signal in BCM-treated cells versus untreated cells (see Materials and Methods; no BCM-induced reductions in RNAP occupancy were detected). This analysis revealed a total of 199 BCM significant regions (BSRs) dispersed throughout the E. coli K-12 chromosome. Most of the probes with increased ChIP-chip signal in BCM-treated cells were within BSRs, but they represented only a small percentage of the total probes (~3%, Fig. 3.1A). This suggests that the effects of BCM were mostly direct consequences of Rho inhibition rather than a large-scale redistribution of RNAP in response to cellular stresses or other pleiotropic effects.

Figure 3.1. Global effects of Rho inhibition on the distribution of RNAP. (A) Scatterplot of ChIP-chip data from untreated cells versus cells grown in BCM. Probes within BSRs are colored magenta. (B) BCM effect on the distribution of RNAP at the rho locus. RNAP ChIP-chip data from untreated (blue) and BCM-treated conditions (navy) were smoothed using two rounds of sliding-window averaging over 500 bp. Magenta dashed lines represent left and right boundaries of the rhoL BSR. Genes are shown as labeled arrows. (C) BSR classifications. Intergenic and intragenic BSRs are represented as separate pie charts, with corresponding keys below each chart.
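The region-calling idea described in the text (a moving average followed by a requirement for at least three consecutive probes with increased signal in BCM-treated cells) can be sketched as follows. This is a simplified illustration and not the CMARRT algorithm itself, which models probe correlation and assigns statistical significance; the fixed `threshold`, the function names, and the window size are assumptions made only for this example.

```python
def moving_average(values, window=3):
    """Centered moving average; edge windows are truncated to available probes."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def call_regions(positions, treated, untreated,
                 threshold=1.0, min_probes=3, window=3):
    """Return (start, end) intervals where the smoothed treated-minus-untreated
    signal exceeds `threshold` (a hypothetical cutoff) for at least
    `min_probes` consecutive probes."""
    diff = [t - u for t, u in zip(treated, untreated)]
    smoothed = moving_average(diff, window)
    regions, run = [], []
    for pos, value in zip(positions, smoothed):
        if value > threshold:
            run.append(pos)
        else:
            if len(run) >= min_probes:
                regions.append((run[0], run[-1]))
            run = []
    # Close a run that extends to the last probe.
    if len(run) >= min_probes:
        regions.append((run[0], run[-1]))
    return regions

# Hypothetical probes spaced every 100 bp (illustrative values only).
positions = list(range(0, 1000, 100))
untreated = [0.0] * 10
treated = [0, 0, 0, 3, 3, 3, 3, 0, 0, 0]
print(call_regions(positions, treated, untreated))
```

A real analysis would replace the fixed cutoff with a significance threshold estimated from the data.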

The BSR dataset was compared to previously characterized Rho-dependent terminators, which confirmed that BCM effectively inhibited Rho in our experiment. For instance, the rho gene is autoregulated by a Rho-dependent terminator immediately upstream of its coding sequence (Matsumoto et al. 1986). As expected, a BSR was found at the rho locus just after the rhoL gene (Fig. 3.1B). In untreated cells, ChIP-chip signal for RNAP was highest at the rho promoter, situated just upstream of rhoL. After the rhoL gene, the signal for RNAP decreased, indicative of Rho-dependent termination. When BCM was used to inhibit Rho function, however, the RNAP signal remained high throughout the Rho-dependent terminator region and gradually decreased across the rho gene, indicating readthrough of the rhoL terminator.

Our results also are broadly consistent with the effects of BCM on global mRNA expression reported by Cardinale et al. (2008), but provide high-resolution positional information that could not be accessed through mRNA expression analysis alone. Based on genomic position, approximately half of all BSRs were located within 300 bp of an expression array probeset that was upregulated at least two-fold in mRNA expression (Fig. 3.S1). However, the mRNA expression analysis did not detect a large fraction of the BSRs identified in our dataset (49%). The lower resolution of the expression array data relative to the tiling array-based ChIP-chip data likely explains this discordance, although differences in experimental growth conditions could also contribute. Importantly, the ChIP-chip-derived BSR data define the locations at which Rho-dependent termination normally occurs.
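The positional comparison described above (whether a BSR lies within 300 bp of an upregulated probeset) amounts to an interval-proximity test. The sketch below shows one way such a comparison could be made; it is not the procedure used in this chapter, the function names and coordinates are hypothetical, and for simplicity it treats the chromosome as linear rather than circular.

```python
def gap_between(a, b):
    """Distance in bp between two closed intervals (0 if they overlap)."""
    (a_start, a_end), (b_start, b_end) = a, b
    return max(0, max(a_start, b_start) - min(a_end, b_end))

def fraction_near(bsrs, probesets, max_gap=300):
    """Fraction of BSR intervals lying within `max_gap` bp of any probeset."""
    if not bsrs:
        return 0.0
    near = sum(
        1 for bsr in bsrs
        if any(gap_between(bsr, probe) <= max_gap for probe in probesets)
    )
    return near / len(bsrs)

# Hypothetical coordinates (illustrative only, not data from the study).
bsrs = [(100, 200), (5000, 5100)]
upregulated_probesets = [(450, 600)]
print(fraction_near(bsrs, upregulated_probesets))
```

The all-pairs comparison is adequate for a few hundred regions; a sorted sweep would be preferable for much larger interval sets.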

To understand the roles of these Rho-dependent terminators, we next sought to associate each BSR with a specific gene. Although ChIP-chip experiments do not provide strand information per se, the “directionality” of terminator readthrough was used to assess the orientation of RNAP on DNA. An example of directionality can be found at the rho locus (Fig. 3.1B). The distribution of RNAP shifts to the right downstream of rhoL in ChIP-chip data from BCM-treated cells compared to untreated cells. Therefore, the terminator must be oriented in the “plus” direction at the 3´ end of rhoL. This logic was extended to assign each BSR to a particular gene (Table 3.S1). When directionality could not be determined (as was the case for 15 BSRs), the BSR was assigned to the gene that contained the majority of significant probes for that BSR. Quantitative PCR of ChIP DNA was used to confirm the array results at three of the BSR-associated loci (rho, valVW, and rygD; Fig. 3.S2).

Our analysis revealed a diverse set of Rho targets in the E. coli genome (Fig. 3.1C and 3.2; Table 3.1). Half of the targets (102) were after genes (intergenic targets), where Rho would be expected to terminate transcription. Most of these followed protein-coding genes (83 mRNAs), but 12 followed tRNA genes and 7 followed sRNA genes. However, the other half (97) were within coding regions (intragenic targets), including 25 that could be assigned to antisense transcripts. This distribution suggests that Rho plays important roles in E. coli transcription in addition to termination at the ends of operons or mediation of polarity.

Figure 3.2. Locations of BSRs and BSR-associated genes across the E. coli chromosome. Genome features are represented to scale as colored bars.

Table 3.1. BSR Annotation Summary.

BSR type                 tRNA*   sRNA†   mRNA‡   Total   K12-specific‖   Prophage¶
Intergenic                 12       7      83     102         21            15
Intragenic, sense           0       0      57      57         10             4
Intragenic, antisense       0       0      25      25          6             3
Intragenic, ND‖             0       0      15      15          2             1
Total                      12       7     180     199         39            23

*Number of BSRs associated with annotated tRNA genes

†Number of BSRs associated with annotated sRNA genes

‡Number of BSRs associated with annotated mRNA genes

§Total number of BSRs associated with E. coli K-12-specific genes or prophage DNA¶

‖Directionality was not determined

¶(ASAP Database, http://www.genome.wisc.edu/tools/asap.htm)

Rho termination at tRNAs

Many tRNA operons appeared to be terminated by Rho. Of the 36 tRNA-containing TUs located outside of rrn (and thus subject to termination), 12 had a BSR immediately downstream of the mature 3´ end of the last tRNA in the TU (Table 3.S2). Rho termination had been previously demonstrated in vivo and in vitro at one of these tRNA TUs (tyrTV; Kupper et al. 1978). Two tRNA loci that show the effects of BCM treatment on the distribution of RNAP are valVW and thrW (Fig. 3.3A and 3.S3). Although the RNAP ChIP-chip signal is restricted to the tRNA operon itself in untreated cells, BCM treatment caused the distribution of RNAP to extend downstream past the presumed Rho-dependent termination point. The ChIP-chip signals on tRNA operons without significant BCM effects, such as lysT-valT-lysW-valZ-lysYZQ, were qualitatively and quantitatively distinct from tRNA operons affected by BCM (compare Fig. 3.3A and 3.S3 to Fig. 3.3B).

Figure 3.3. Rho termination at tRNAs and sRNAs. BCM affects the distribution of RNAP at (A) the valVW tRNA operon, but not (B) the lysT-valT-lysW-valZ-lysYZQ tRNA operon. The distribution of RNAP at (C) the sroG sRNA is affected by BCM. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that noncoding RNA genes are colored yellow.

To determine the distinctions between tRNA operons that were affected by BCM and those that were not, we analyzed the sequence within and surrounding the BSR. The number of tRNAs in an operon and the direction of transcription in genes downstream of the operon had no relationship with Rho termination (Table 3.S2, Fig. 3.3A, Fig. 3.S3). Additionally, no obvious “termination sequence” could be ascribed to tRNA BSRs using motif-finding algorithms (e.g. MEME, http://meme.sdsc.edu/meme4/). However, the first 50 nucleotides after the mature 3´ end of the tRNA differed significantly in C and G content for tRNAs affected by BCM (Table 3.S3). Although these sequences were only 25% C on average, they were significantly more enriched for C than their non-Rho-terminated counterparts (Student’s t-test, p = 0.01) and significantly depleted in G. The average G content was only 12% within the first 50 nucleotides after these tRNAs, which was significantly lower than for tRNAs without corresponding BSRs (Student’s t-test, p = 0.001). These patterns are consistent with previous studies that noted a bias toward C and away from G in cases of Rho-dependent polarity after premature stop codons (Alifano et al. 1991).

Unsurprisingly, the feature that most distinguished tRNA operons affected by BCM from those that were not was the presence or absence of a putative intrinsic terminator hairpin RNA structure. Of the 24 tRNA operons that lacked associated BSRs, 22 (92%) encoded putative intrinsic terminator hairpin structures and corresponding U-tracts within 150 bp of the 3´ end of the tRNA. Potential hairpins were identified by examining the RNA secondary structure in silico using the mFold algorithm (Zuker 2003; Table 3.S2). The two exceptions lacking both a BSR and a putative intrinsic terminator were ileY and the thrU-tyrU-glyT-thrT operon. The ileY tRNA gene produced very little RNAP ChIP-chip signal in both BCM-treated and untreated conditions, and likely fell below the limits of detection. The thrU-tyrU-glyT-thrT operon is known to be co-transcribed with the downstream tufB gene (Lee et al. 1981). Although a small drop in RNAP ChIP-chip signal occurred between thrT and tufB, the majority of ECs apparently were not terminated. Eleven of the 12 (92%) tRNA operons with an associated BSR lacked putative intrinsic terminator hairpin structures. The exception, asnU, contained a putative RNA structure that resembled an intrinsic terminator hairpin despite being affected by BCM treatment (Table 3.S2). However, the purported terminator contained an unpaired A residue between the hairpin stem and U-tract. Systematic substitutions of U-tract residues with A in the canonical pyrBI intrinsic terminator revealed that mutations closer to the hairpin stem caused progressively greater termination defects (Sipos et al. 2007; the first U of the U-tract was not tested). Also, weakening the base of the hairpin stem reduces termination markedly (Larson et al. 2008). Therefore, this deviation from a canonical intrinsic terminator would likely disrupt the function of the terminator hairpin. This finding raises the intriguing possibility that Rho-dependent termination is a “default” termination pathway in E. coli, taking over when intrinsic terminator hairpins are disrupted by mutation or removed by horizontal transfer events.

Rho termination of sRNA synthesis

Genes in a second class uncovered in the BSR analysis encoded known sRNAs. Seven annotated sRNA genes were found to have BSRs associated with the 3´ end of the gene (Fig. 3.1C). Two types of Rho-dependent terminators were found at sRNAs. The first type was primarily involved in sRNA 3´ end formation. The rygD gene (also known as sibD) produces a noncoding stable RNA product that regulates the toxicity of the short, hydrophobic IbsD protein (Fozo et al. 2008). An extension in the distribution of RNAP at rygD is seen in the presence of BCM, indicating that this sRNA is terminated by Rho (Fig. 3.S4).

The second type of Rho-dependent terminator found at sRNAs appeared to play a role in the regulation of downstream genes. The sroG sRNA is situated between the promoter and protein-coding sequence of the ribB gene, which is involved in riboflavin synthesis (Vogel et al. 2003). Although the exact function of the SroG RNA has not been demonstrated experimentally, sequence alignments suggest that it contains a flavin mononucleotide (FMN)-binding riboswitch known as an RFN (riboflavin) element (Vitreschak et al. 2002). Based on the absence of an intrinsic terminator hairpin, and complementarity between the Shine-Dalgarno (SD) sequence of ribB and upstream sequences in the RNA, the riboswitch contained in sroG was proposed to operate by blocking translation of ribB in conditions of high FMN concentration (Vitreschak et al. 2002). Interestingly, a BSR occurred at the 3´ end of sroG, implicating Rho-dependent termination as a mechanism for tightening the regulation of this riboswitch (Fig. 3.3C). The ribB transcript, when left untranslated, is logically a good substrate for Rho action, and termination by Rho would prevent synthesis of the full-length ribB mRNA. This would ensure that RibB protein could not be produced, even if SD pairing is lost by FMN release from the riboswitch. This system is similar to the Bacillus subtilis trp operon, where Rho termination occurs after translation initiation is blocked by a hairpin that occludes the SD of trpE (Yakhnin et al. 2001).

Our findings indicate that Rho termination at sRNAs can be involved both in 3´-end formation and in the mechanism by which sRNAs regulate their target genes. Just seven of the ~80 known sRNAs are terminated by Rho. Previous studies have identified sRNAs by searching for promoter-intrinsic terminator pairs in intergenic regions (see Livny and Waldor 2007), suggesting that only a fraction of Rho-terminated sRNAs have been discovered. Therefore, identifying Rho-dependent terminators with associated promoters could function as an additional method for finding novel sRNAs.

Rho inhibition reveals novel antisense transcription

Although half of the BSRs were found at the 3´ ends of genes, as would be predicted if Rho functions to terminate RNAP at the ends of TUs (intergenic), the other half were located within genes (intragenic). In many cases, we found that the “directionality” of intragenic terminator readthrough was opposite to the direction of the annotated gene (Fig. 3.4A and 3.4B). These observations were indicative of antisense transcription by RNAP. In total, we found 25 instances of antisense transcription in the BSR dataset, 24 of which were novel transcripts (Table 3.S4). A majority (17/25) of the antisense transcripts had an associated σ70 peak in ChIP-chip data from Mooney et al. (2009) that indicated a putative promoter for the transcript. We estimated the approximate lengths of the antisense transcripts by finding the distance between the start of the BSR and the midpoint of its associated σ70 peak. The average antisense transcript length was 456 nt. This number likely overstates the transcript length, since the same analysis applied to tRNAs overestimated their lengths by 50-150 nucleotides. Thus, the average length of antisense transcripts found in this study falls within the range of 50 to 400 nt typically assigned to sRNAs (Vogel and Sharma 2005).
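The length estimate described above (the distance between the start of the BSR and the midpoint of the associated σ70 peak) reduces to simple coordinate arithmetic. The following minimal sketch assumes chromosome coordinates in bp; the function name and the example coordinates are hypothetical and are not taken from Table 3.S4.

```python
def estimate_transcript_length(bsr_start, peak_start, peak_end):
    """Approximate transcript length (nt) as the distance from the midpoint of
    the promoter-associated peak to the start of the BSR. The absolute value
    lets the same function serve transcripts on either strand."""
    peak_midpoint = (peak_start + peak_end) / 2
    return abs(bsr_start - peak_midpoint)

# Hypothetical (bsr_start, peak_start, peak_end) coordinates in bp.
examples = [
    (1500, 900, 1100),   # BSR lies to the right of the peak
    (4000, 4300, 4500),  # BSR lies to the left of the peak (opposite strand)
]
lengths = [estimate_transcript_length(*coords) for coords in examples]
print(lengths, sum(lengths) / len(lengths))
```

If the 50-150 nt overestimate observed for tRNAs also applied to the antisense transcripts, the 456 nt average would correspond to roughly 306-406 nt.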

Reppas et al. (2006) had previously identified an antisense transcript on the opposite strand of the eutB gene. This transcript is also apparent in the BSR dataset due to the directionality of terminator readthrough (Fig. 3.4A), and a corresponding peak in σ70 ChIP-chip data suggests a promoter location for the transcript. An example of a novel antisense transcript found in the BSR dataset lies within the “cryptic” bgl operon on the opposite strand of the bglF gene (Fig. 3.4B). The ambiguous directionality of the σ70 and RNAP peaks in bglF was made clear by readthrough of an antisense, Rho-dependent terminator in BCM-treated cells, establishing the existence of antisense transcription in bglF.

Figure 3.4. Rho inhibition reveals antisense transcription. (A) An antisense transcript within the eutB gene was detected based on the direction of terminator readthrough by RNAP in BCM-treated cells. σ70 ChIP-chip data (orange) from Mooney et al. (2009) suggest a promoter location for the antisense transcript. (B) Novel antisense transcription within the bglF gene. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that putative antisense transcripts are represented by red arrows.

Our finding of ~100 intragenic Rho-dependent terminators demonstrates that transcription in E. coli is much more complex than previously envisioned, with many transcripts terminated within coding sequences and a greater amount of antisense transcription. Intragenic terminators are associated with both sense and antisense transcription. Intragenic antisense transcripts terminated by Rho represent a mostly novel group of RNAs with unknown functions. Intragenic sense Rho-dependent terminators may be associated with transcriptional attenuation (Henkin and Yanofsky 2002), premature termination due to failed translation, or synthesis of sRNAs that lie within larger genes.


Discussion

Our results lead to three novel insights into the role of Rho in global gene regulation. First, Rho terminates synthesis of small noncoding RNAs, including tRNAs, to a much greater extent than previously realized. This is significant because the extensive structure of such RNAs is thought to inhibit Rho binding. Second, Rho terminates synthesis of intragenic transcripts, including antisense transcripts, of unknown function. Many of these likely represent novel, noncoding transcripts in E. coli. Finally, the strong effect of Rho on horizontally transferred genes may reflect the propensity of such genes to insert at tRNA-encoding loci, rather than Rho-targeting of foreign DNA per se.

The BSR dataset is likely to reveal only a subset of Rho-dependent terminators in E. coli. Detection of Rho terminators by ChIP-chip requires sufficient occupancy of RNAP prior to the terminator to see the readthrough event. For instance, the well-characterized Rho-dependent trp t’ terminator (Wu et al. 1981) was barely discernible, and failed to meet the statistical cutoff due to low RNAP signal at the 3´ end of the trp operon. Many condition-specific Rho terminators also were likely missed (e.g., the tnaC terminator in the catabolite-repressed tna operon (Stewart et al. 1986)). Finally, “cryptic” Rho terminators that occur only when transcription and translation are uncoupled would not be found since translation should be efficient under our assay conditions.

To estimate the total extent to which Rho terminates mRNA synthesis, we examined 109 “high-quality” TUs for which the RNAP ChIP-chip signal was significantly above background and could readily be distinguished from adjacent TUs (Mooney et al. 2009). Of these 109 TUs, 18 were associated with intergenic BSRs, indicating that 17% of these TUs are terminated at their 3´ ends by Rho. We extrapolated this percentage out to the total predicted number of TUs in E. coli (2271 (Salgado et al. 2000)), which gave 386 as the estimated number of intergenic Rho-dependent terminators. Based on this estimate, Rho-dependent termination is likely to account for ~20% of the total mRNA 3´-end formation in E. coli, rather than the 50% estimate that is often cited (Zhu and von Hippel 1998). We note that the 50% estimate does not appear to be based on a genome-scale analysis.
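The arithmetic behind this extrapolation can be written out as a short sketch. The variable names are illustrative; the three input numbers are those given in the text. The sketch assumes that the reported figure of 386 follows from rounding the fraction to 17% before multiplying, which is what the numbers suggest.

```python
high_quality_tus = 109   # TUs with a clear, separable RNAP ChIP-chip signal
rho_terminated = 18      # of those, TUs associated with an intergenic BSR
total_tus = 2271         # predicted TUs in E. coli (Salgado et al. 2000)

fraction = rho_terminated / high_quality_tus       # about 0.165
rounded_fraction = round(fraction, 2)              # 0.17, the percentage quoted in the text

# Estimate using the rounded 17% (assumed to be how 386 was obtained).
estimate = round(rounded_fraction * total_tus)

# Estimate without intermediate rounding, for comparison.
estimate_unrounded = round(fraction * total_tus)
```

Either way, the estimate is on the order of 375 to 386 intergenic Rho-dependent terminators, so the conclusion of roughly 20% or less does not depend on the rounding step.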

Rho-dependent termination and stable RNA synthesis

Stable RNA transcripts are surprising substrates for Rho action, since they are typically highly structured whereas Rho is thought to bind unstructured RNA. However, ChIP-chip analysis reveals Rho occupancy across most TUs, including sRNAs, tRNAs, and rRNAs (Mooney et al. 2009). Thus, Rho appears capable of association with structured transcripts, consistent with our finding that Rho terminates these transcripts.

Rho-dependent termination of tRNA and sRNA transcripts is also unexpected because Rho generates heterogeneous transcript 3´ ends that would seemingly be problematic for the function of these RNAs. Extraneous 3´ nucleotides may interfere with folding or enzymatic modifications of stable RNAs, many of which require specific secondary structures for biological activity. However, extra 3´ nucleotides can be removed by multiple 3´→5´ exonucleases that exist in E. coli. For instance, a 3´ tail on the Rho-terminated valVW tRNA transcript becomes detectable only in a pnp rnb double mutant, implying that 3´ ends generated by Rho termination are rapidly degraded by redundant 3´→5´ exonucleases encoded by pnp and rnb (Mohanty and Kushner 2007). Thus, heterogeneous 3´ tails generated by Rho can be readily removed to avoid interfering with RNA function.

Rho inhibition had no effect on RNAP occupancy of rRNA TUs, whereas the recent proposal that Rho removes paused RNAPs predicts a 3´-proximal decrease (Klumpp and Hwa 2008). However, the lower level of rrn transcription expected for minimal medium could preclude detection of this effect.

Rho terminates novel antisense transcripts in E. coli

We find that Rho is involved in termination of a set of antisense transcripts with unknown function. These antisense transcripts are likely to be noncoding, since the protein-coding sequence on the opposite strand greatly constrains the sequence of the antisense RNA. The 25 antisense transcripts we detected likely represent only a small fraction of a larger set of similar antisense transcripts in E. coli. As noted above for intergenic transcripts, our method will miss a significant number on which the effect of BCM fails to generate a BSR. Additionally, to be detectable, antisense TUs within genes must also generate signals significantly above the level of the corresponding genic TU. Thus, E. coli likely possesses a large set of intragenic, antisense TUs of which the 25 we detected are only a limited, highly transcribed subset.

Some antisense transcripts may encode small RNAs with specific regulatory functions. For instance, antisense RNAs are known to block translation by pairing to a sense transcript (e.g., RyhB and IS10 in bacteria (Waters and Storz 2009)), to block formation of persistent RNA-DNA hybrids (e.g., RNAI in ColE1-type plasmids (Waters and Storz 2009)), or to interfere with sense transcription during their synthesis (Ward and Murray 1979). Some of these transcripts could conceivably produce sRNAs with functions unrelated to the genes within which they are embedded.

However, the possibility that some intragenic transcripts result from “transcriptional noise” must be considered. The involvement of Rho is itself compellingly analogous to some types of noncoding transcription in eukaryotes.

Bacterial RNAP and eukaryotic RNAPII are both terminated by at least two distinct pathways. In many bacteria, intrinsic termination appears to be the dominant mechanism for termination of mRNA synthesis; indeed, our results suggest Rho terminates only a minority of full-length E. coli mRNAs. RNAPII termination is coupled to transcript cleavage and polyadenylation for most mRNAs (Lykke-Andersen and Jensen 2007), but can instead occur by the Nrd1/Nab3/Sen1-dependent pathway for small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), and some short mRNAs (Steinmetz et al. 2006). Sen1 contains an ATP-dependent, 5´→3´ RNA/DNA helicase activity and may function similarly to Rho (Steinmetz and Brow 1996). Thus, Rho-dependent termination in bacteria appears to be analogous to Sen1-dependent termination in eukaryotes. The Nrd1/Nab3/Sen1 pathway is implicated in the termination of cryptic unstable transcripts (CUTs) that become detectable in S. cerevisiae mutants defective for nuclear RNA degradation (Arigo et al. 2006). Similar to pervasive noncoding transcription in other eukaryotes, the biological function of CUTs is unknown; however, CUTs may simply reflect transcriptional noise that is an unavoidable consequence of robust gene expression, and the Nrd1/Nab3/Sen1 pathway may play a role in “genome surveillance” by suppressing them. Given the similarities of Rho-dependent and Sen1-dependent termination, one possibility is that at least some antisense transcription terminated by Rho in bacteria may also reflect transcriptional noise.

Rho-dependent termination and horizontal transfer

Our findings are consistent with a connection between Rho and suppressed expression of horizontally transferred, “foreign” genes (Cardinale et al. 2008), but suggest an indirect mechanism underlies this relationship. The connection is evident from the significant association of BSRs with E. coli K-12 genes lacking homologs in E. coli O157:H7 EDL933 (Mann-Whitney U, p < 0.001). However, specific targeting of Rho to AU-rich RNA in horizontally transferred genes (Cardinale et al. 2008) can be ruled out, since the global distribution of Rho lacks bias toward any particular set of TUs (Mooney et al. 2009).

Three non-mutually exclusive ideas may explain why Rho termination is associated with “foreign” DNA. First, foreign genes acquired from distantly related organisms may not be adapted to the E. coli translation apparatus, allowing Rho to act on poorly translated RNAs. Second, some foreign DNA may contain specific Rho-dependent terminators. For instance, the rac prophage contains the Rho-dependent timm terminator upstream of the lethal kilR gene (Cardinale et al. 2008); BCM causes readthrough of timm and the appearance of a BSR (Table 3.S1).

Third, foreign DNA may preferentially insert into active TUs, and thereby produce readthrough transcription into foreign DNA that is terminated by Rho. Of the 63 E. coli K-12-specific genes associated with a BSR, 24 are inserted into active TUs at which Rho terminated transcription into the horizontally transferred DNA. This phenomenon is apparent at tRNA operons terminated by Rho (Fig. 3.S5). Half the tRNAs terminated by Rho have associated BSRs that read into E. coli K-12-specific genes or prophage elements (Fig. 3.1C, Table 3.S1). Indeed, the majority of prophages and other horizontally transferred elements in γ-proteobacteria encode integrases that specifically target tRNA genes as attachment sites (Williams 2002). Thus, horizontally transferred elements may integrate into the chromosome by disruption of tRNA genes, causing loss of their intrinsic terminators. In such cases, Rho can supply an alternate termination mechanism to prevent transcription of potentially toxic foreign genes from the tRNA gene promoter. Williams categorized horizontally transferred elements that use tRNA as insertion points across several species of proteobacteria and Gram-positive bacteria (Williams 2002). Of the 54 characterized, horizontally transferred elements that insert into tRNA, 22 (41%) lacked an intrinsic terminator within 400 bases of the mature 3´ end of the tRNA (Williams 2002). These data suggest that Rho termination provides a general mechanism for guarding the borders of tRNA transcription against the deleterious consequences of foreign gene expression in a diverse set of bacteria.


Conclusion

Rho-dependent termination plays many roles in bacterial transcription, including generation of full-length mRNA 3´ ends (Roberts 1969), establishment of polarity (Richardson et al. 1975), resolution of extended RNA-DNA hybrids (Harinarayanan and Gowrishankar 2003), and protection of cells from harmful expression of foreign genes (Cardinale et al. 2008). Our results suggest Rho plays additional, and possibly more significant, roles by halting RNA chain elongation in a novel antisense transcriptome and by terminating synthesis of stable RNAs, including tRNAs and sRNAs. Rho-dependent termination is especially well suited for halting antisense transcription. The stringent sequence requirements of intrinsic terminators would be incompatible with a protein-coding gene on the opposite strand. In contrast, Rho-dependent terminators exhibit modest sequence specificity (C enriched and G depleted), which would place few limitations on codon usage in a protein-coding gene. Taken together, these data suggest Rho may play a principal role in halting transcription at locations where intrinsic terminators could not readily evolve (e.g., horizontally transferred DNA and antisense transcripts). Thus, further study of the targets of Rho may help elucidate the scope of the noncoding transcriptome of E. coli.


Materials and Methods

Growth Conditions

E. coli K-12 MG1655 was grown in MOPS minimal medium containing 0.2% glucose at 37 °C with vigorous agitation in the presence or absence of 20 μg/ml BCM (Mooney et al. 2009). BCM was obtained from Fujisawa Pharmaceutical Co. (Osaka, Japan).

ChIP-chip

ChIP-chip assays were performed as previously described (Mooney et al. 2009). Briefly, cells were grown to an apparent OD600 of ~0.4 and crosslinked by the addition of formaldehyde at 1% final concentration with continued shaking at 37 °C for 5 min before quenching with glycine (100 mM final). Cells were then lysed and DNAs were sheared by sonication followed by treatment with micrococcal nuclease and RNase A. RNAP crosslinked to DNA was immunoprecipitated using antibodies against either the β or β′ subunit (antibodies 8RB13 and NT73, respectively, Neoclone, Madison, WI) using sepharose protein A beads. Enriched ChIP DNA and input DNA were amplified by linker-mediated PCR (Ng et al. 2003), and processed by Nimblegen, Inc. (Madison, WI) to incorporate Cy3 or Cy5 dyes, hybridized to a tiling array, and quantified by fluorescence scanning. Two biological replicates were obtained for both BCM-treated and untreated conditions.


Array Designs

We used two distinct isothermal tiling arrays that cover the entire E. coli K-12 MG1655 genome. The first array contained 187,204 oligonucleotide probes synthesized on the array in duplicate with ~24-bp spacing (Mooney et al. 2009), whereas the second contained 374,408 probes that alternated strands with ~12-bp spacing.

Data Analysis

We performed locally weighted linear regression (LOWESS) normalization (Yang et al. 2002) on raw Cy3 and Cy5 signals to correct for intensity-dependent dye effects within each array using the normalizeWithinArrays function (Smyth and Speed 2003) in the limma package (Smyth et al. 2005) for the statistical program R (Team 2008). Normalized log2 ratios were then averaged over probe positions found in the 187,204-probe array to make the two array formats directly comparable. Next, biological replicates for BCM-treated or untreated conditions were quantile normalized between arrays using the normalize.quantiles function in the R package affy (Gautier et al. 2004). For each of the BCM-treated (Trt) and untreated (UnTrt) conditions, we computed the average of the two biological replicates for each probe position. The analysis to identify regions enriched in Trt relative to UnTrt was performed using CMARRT (Kuan et al. 2008) on the difference between the average of treated and untreated conditions (AveTrt-AveUnTrt) at the FDR level of 0.05.
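The between-array steps above (quantile normalization of replicates, per-probe averaging, and the AveTrt-AveUnTrt difference) can be illustrated with a minimal, self-contained sketch. This is not the limma, affy, or CMARRT code used in the analysis; the function names and the four-probe example values are hypothetical, ties are broken by probe position for simplicity, and the CMARRT peak-calling step is not reproduced.

```python
from statistics import mean


def quantile_normalize(replicates):
    """Give equal-length replicate lists the same value distribution."""
    n = len(replicates[0])
    # Probe indices of each replicate, ordered from smallest to largest value.
    orders = [sorted(range(n), key=lambda i: rep[i]) for rep in replicates]
    # Reference distribution: mean of the k-th smallest value across replicates.
    reference = [mean(rep[order[k]] for rep, order in zip(replicates, orders))
                 for k in range(n)]
    normalized = [[0.0] * n for _ in replicates]
    for out, order in zip(normalized, orders):
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
    return normalized


def condition_average(replicates):
    """Average the replicates probe by probe."""
    return [mean(values) for values in zip(*replicates)]


# Hypothetical log2(IP/input) ratios for four probes, two replicates per condition.
trt = quantile_normalize([[0.2, 1.5, 2.0, 0.1], [0.4, 1.1, 2.4, 0.3]])
untrt = quantile_normalize([[0.1, 1.4, 0.3, 0.2], [0.3, 1.0, 0.5, 0.0]])

# AveTrt - AveUnTrt for each probe position.
difference = [t - u for t, u in zip(condition_average(trt), condition_average(untrt))]
```

In this toy example the third probe shows the largest treated-minus-untreated difference, which is the kind of signal a region-calling method such as CMARRT would then evaluate across neighboring probes.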


Quantitative PCR

Quantitative PCR was performed on ChIP DNA using SYBR Green JumpStart Taq ReadyMix for Real-Time PCR (Sigma-Aldrich) in a model 7500 Real-Time PCR System thermal cycler (Applied Biosystems). Two primer pairs were designed for each BSR locus tested. The first primer pair annealed before the BSR, and the second annealed within the BSR. Primer sequences are available upon request. Cycle threshold values obtained from quantitative PCR were converted to a relative quantity of DNA based on a standard curve created for each primer pair. The relative DNA quantity within the BSR was then normalized to the quantity before the BSR.
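The conversion from cycle threshold to normalized quantity can be sketched as follows. The sketch assumes the conventional log-linear standard curve, Ct = slope × log10(quantity) + intercept; the curve parameters, Ct values, and function names are hypothetical and are not taken from the experiments.

```python
def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def bsr_relative_signal(ct_within, ct_before, curve_within, curve_before):
    """Relative DNA quantity within the BSR, normalized to the quantity before it."""
    within = quantity_from_ct(ct_within, *curve_within)
    before = quantity_from_ct(ct_before, *curve_before)
    return within / before


# Hypothetical (slope, intercept) standard curve, used here for both primer pairs.
curve = (-3.32, 30.0)
ratio = bsr_relative_signal(ct_within=23.36, ct_before=26.68,
                            curve_within=curve, curve_before=curve)
```

With these illustrative values the signal within the BSR is about ten-fold higher than the signal before it. In practice each primer pair would have its own fitted curve, as stated above.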


Acknowledgements

We thank Yann Dufour for array design, and Nicole Perna for assistance in defining E. coli K-12-specific genes. We also thank Richard Gourse, David Brow, Charles Turnbough Jr., and members of the Landick Lab for critical reading of the manuscript. This work was supported by NIH grant GM38660 to R.L.


Supplementary Figures

Figure 3.S1. Comparison of BSR positions to genes reported to be affected by BCM in expression profiling experiments from Cardinale et al. Bars represent the percentage of BSRs that are within the indicated distance of a gene reported to be at least two-fold upregulated by BCM treatment. Distance indicates overlap between BSRs and BCM upregulated genes.


Figure 3.S2. Quantitative PCR confirmation of ChIP-chip results. Bar graphs indicate the relative quantity of DNA as determined by quantitative PCR normalized to the primer set before the BSR. Each primer set is designated by a horizontal black bar above its priming position. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that noncoding RNA genes are colored yellow.


Figure 3.S3. BCM effect on the distribution of RNAP at the thrW tRNA. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that noncoding RNA genes are colored yellow.


Figure 3.S4. BCM effect on the distribution of RNAP at the rygD sRNA. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that noncoding RNA genes are colored yellow.


Figure 3.S5. Transcriptional readthrough from tRNA operons onto K-12-specific genes and prophage elements. Genes above the black line are transcribed to the right, and genes below the line are transcribed to the left. Colors, labels, and data smoothing are as described in Fig. 3.1B, except that tRNA genes are colored green, and E. coli K-12-specific genes and prophage genes are colored violet.


References

Alifano P, Rivellini F, Limauro D, Bruni CB, Carlomagno MS. 1991. A consensus motif common to all Rho-dependent prokaryotic transcription terminators. Cell 64: 553-563.

Arigo JT, Eyler DE, Carroll KL, Corden JL. 2006. Termination of cryptic unstable transcripts is directed by yeast RNA-binding proteins Nrd1 and Nab3. Mol Cell 23: 841-851.

Banerjee S, Chalissery J, Bandey I, Sen R. 2006. Rho-dependent transcription termination: more questions than answers. J Microbiol 44: 11-22.

Bubunenko M, Baker T, Court DL. 2007. Essentiality of ribosomal and transcription antitermination proteins analyzed by systematic gene replacement in Escherichia coli. J Bacteriol 189: 2844-2853.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Ederth J, Mooney RA, Isaksson LA, Landick R. 2006. Functional interplay between the jaw domain of bacterial RNA polymerase and allele-specific residues in the product RNA-binding pocket. J Mol Biol 356: 1163-1179.

Fozo EM, Kawano M, Fontaine F, Kaya Y, Mendieta KS, Jones KL, Ocampo A, Rudd KE, Storz G. 2008. Repression of small toxic protein synthesis by the Sib and OhsC small RNAs. Mol Microbiol 70: 1076-1093.

Galluppi GR, Richardson JP. 1980. ATP-induced changes in the binding of RNA synthesis termination protein Rho to RNA. J Mol Biol 138: 513-539.

Gautier L, Cope L, Bolstad BM, Irizarry RA. 2004. affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20: 307-315.


Harinarayanan R, Gowrishankar J. 2003. Host factor titration by chromosomal R-loops as a mechanism for runaway plasmid replication in transcription termination-defective mutants of Escherichia coli. J Mol Biol 332: 31-46.

Henkin TM, Yanofsky C. 2002. Regulation by transcription attenuation in bacteria: how RNA provides instructions for transcription termination/antitermination decisions. Bioessays 24: 700-707.

Klumpp S, Hwa T. 2008. Stochasticity and traffic jams in the transcription of ribosomal RNA: Intriguing role of termination and antitermination. Proc Natl Acad Sci U S A 105: 18159-18164.

Kuan PF, Chun H, Keles S. 2008. CMARRT: A tool for the analysis of ChIP-Chip data from tiling arrays by incorporating the correlation structure. Pacific Symposium on Biocomputing 13: 515-526.

Kupper H, Sekiya T, Rosenberg M, Egan J, Landy A. 1978. A rho-dependent termination site in the gene coding for tyrosine tRNA su3 of Escherichia coli. Nature 272: 423-428.

Larson MH, Greenleaf WJ, Landick R, Block SM. 2008. Applied force reveals mechanistic and energetic details of transcription termination. Cell 132: 971-982.

Lee JS, An G, Friesen JD, Fill NP. 1981. Location of the tufB promoter of E. coli: cotranscription of tufB with four transfer RNA genes. Cell 25: 251-258.

Livny J, Waldor MK. 2007. Identification of small RNAs in diverse bacterial species. Curr Opin Microbiol 10: 96-101.

Lykke-Andersen S, Jensen TH. 2007. Overlapping pathways dictate termination of RNA polymerase II transcription. Biochimie 89: 1177-1182.


Magyar A, Zhang X, Kohn H, Widger WR. 1996. The antibiotic bicyclomycin affects the secondary RNA binding site of Escherichia coli transcription termination factor Rho. J Biol Chem 271: 25369-25374.

Matsumoto Y, Shigesada K, Hirano M, Imai M. 1986. Autogenous regulation of the gene for transcription termination factor rho in Escherichia coli: localization and function of its attenuators. J Bacteriol 166: 945-958.

Mohanty BK, Kushner SR. 2007. Ribonuclease P processes polycistronic tRNA transcripts in Escherichia coli independent of ribonuclease E. Nucleic Acids Res 35: 7614-7625.

Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R. 2009. Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33: 97-108.

Ng HH, Robert F, Young RA, Struhl K. 2003. Targeted recruitment of Set1 histone methylase by elongating Pol II provides a localized mark and memory of recent transcriptional activity. Mol Cell 11: 709-719.

Park HG, Zhang X, Moon HS, Zwiefka A, Cox K, Gaskell SJ, Widger WR, Kohn H. 1995. Bicyclomycin and dihydrobicyclomycin inhibition kinetics of Escherichia coli rho-dependent transcription termination factor ATPase activity. Arch Biochem Biophys 323: 447-454.

Reppas NB, Wade JT, Church GM, Struhl K. 2006. The transition between transcriptional initiation and elongation in E. coli is highly variable and often rate limiting. Mol Cell 24: 747-757.

Richardson JP. 2006. How Rho exerts its muscle on RNA. Mol Cell 22: 711-712.

Richardson JP, Grimley C, Lowery C. 1975. Transcription Termination Factor Rho Activity Is Altered in Escherichia coli with suA Gene Mutations. Proc Natl Acad Sci U S A 72: 1725-1728.


Roberts JW. 1969. Termination factor for RNA synthesis. Nature 224: 1168-1174.

Salgado H, Santos-Zavaleta A, Gama-Castro S, Millan-Zarate D, Blattner FR, Collado-Vides J. 2000. RegulonDB (version 3.0): transcriptional regulation and operon organization in Escherichia coli K-12. Nucleic Acids Res 28: 65-67.

Sipos K, Szigeti R, Dong X, Turnbough CL, Jr. 2007. Systematic mutagenesis of the thymidine tract of the pyrBI attenuator and its effects on intrinsic transcription termination in Escherichia coli. Mol Microbiol 66: 127-138.

Skordalakes E, Brogan AP, Park BS, Kohn H, Berger JM. 2005. Structural mechanism of inhibition of the Rho transcription termination factor by the antibiotic bicyclomycin. Structure 13: 99-109.

Smyth GK, Michaud J, Scott HS. 2005. Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21: 2067-2075.

Smyth GK, Speed T. 2003. Normalization of cDNA microarray data. Methods 31: 265-273.

Steinmetz EJ, Brow DA. 1996. Repression of gene expression by an exogenous sequence element acting in concert with a heterogeneous nuclear ribonucleoprotein-like protein, Nrd1, and the putative helicase Sen1. Mol Cell Biol 16: 6993-7003.

Steinmetz EJ, Warren CL, Kuehner JN, Panbehi B, Ansari AZ, Brow DA. 2006. Genome-wide distribution of yeast RNA polymerase II and its control by Sen1 helicase. Mol Cell 24: 735-746.

Stewart V, Landick R, Yanofsky C. 1986. Rho-dependent transcription termination in the tryptophanase operon leader region of Escherichia coli K-12. J Bacteriol 166: 217-223.

Team RDC. 2008. R: A language and environment for statistical computing.


Vitreschak AG, Rodionov DA, Mironov AA, Gelfand MS. 2002. Regulation of riboflavin biosynthesis and transport genes in bacteria by transcriptional and translational attenuation. Nucleic Acids Res 30: 3141-3151.

Vogel J, Bartels V, Tang TH, Churakov G, Slagter-Jager JG, Huttenhofer A, Wagner EG. 2003. RNomics in Escherichia coli detects new sRNA species and indicates parallel transcriptional output in bacteria. Nucleic Acids Res 31: 6435-6443.

Vogel J, Sharma CM. 2005. How to find small non-coding RNAs in bacteria. Biol Chem 386: 1219-1238.

Ward DF, Murray NE. 1979. Convergent transcription in bacteriophage lambda: interference with gene expression. J Mol Biol 133: 249-266.

Waters LS, Storz G. 2009. Regulatory RNAs in bacteria. Cell 136: 615-628.

Williams KP. 2002. Integration sites for genetic elements in prokaryotic tRNA and tmRNA genes: sublocation preference of integrase subfamilies. Nucleic Acids Res 30: 866-875.

Wu AM, Christie GE, Platt T. 1981. Tandem termination sites in the tryptophan operon of Escherichia coli. Proc Natl Acad Sci U S A 78: 2913-2917.

Yakhnin H, Babiarz JE, Yakhnin AV, Babitzke P. 2001. Expression of the Bacillus subtilis trpEDCFBA operon is influenced by translational coupling and Rho termination factor. J Bacteriol 183: 5918-5926.

Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP. 2002. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 30: e15.


Yanofsky C, Horn V. 1995. Bicyclomycin sensitivity and resistance affect Rho factor-mediated transcription termination in the tna operon of Escherichia coli. J Bacteriol 177: 4451-4456.

Zhu AQ, von Hippel PH. 1998. Rho-dependent termination within the trp t' terminator. I. Effects of rho loading and template sequence. Biochemistry 37: 11202-11214.

Zuker M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31: 3406-3415.

Zwiefka A, Kohn H, Widger WR. 1993. Transcription termination factor rho: the site of bicyclomycin inhibition in Escherichia coli. Biochemistry 32: 3564-3570.


Chapter 4

Rho and NusG suppress pervasive antisense transcription in Escherichia coli

This chapter has been submitted for publication (Jason M. Peters, Rachel A. Mooney, Jeffrey A. Grass, Frances Tran, and Robert Landick 2012. Rho and NusG suppress pervasive antisense transcription in Escherichia coli). I performed the tiling expression microarray experiments, hns genetic experiments, data analysis, and Robert Landick and I wrote the paper. Rachel A. Mooney performed the ΔnusG and ΔnusA* RNAP ChIP-chip experiments, Jeffrey A. Grass extracted RNA for tiling expression microarrays and RNAseq, and Frances Tran performed H-NS ChIP-chip experiments. Supplementary figures can be found at the end of the chapter.


Abstract

Despite the prevalence of antisense transcripts in bacterial transcriptomes, little is known about how their synthesis is controlled. We report that a major function of the Escherichia coli termination factor Rho and its co-factor NusG is suppression of ubiquitous antisense transcription genome-wide. Rho binds C-rich unstructured nascent RNA (high C/G ratio) prior to its ATP-dependent dissociation of transcription complexes. NusG is required for efficient termination at minority subsets (~20%) of both antisense and sense Rho-dependent terminators with lower C/G ratio sequences. In contrast, a widely studied nusA deletion proposed to compromise Rho-dependent termination had no effect on antisense or sense Rho-dependent terminators in vivo. Global co-localization of the nucleoid-associated protein H-NS with Rho-dependent terminators and genetic interactions between hns and rho suggest that H-NS aids Rho in suppression of antisense transcription. The combined actions of Rho, NusG, and H-NS appear to be analogous to the Sen1-Nrd1-Nab3 and systems that suppress antisense transcription in eukaryotes.


Introduction

Antisense transcription is a common feature of both bacterial and eukaryotic transcriptomes. Antisense transcripts have been identified in diverse bacteria (Georg and Hess 2011), including Bacillus subtilis (Rasmussen et al. 2009; Irnov et al. 2010; Nicolas et al. 2012) and Escherichia coli (Peters et al. 2009; Dornenburg et al. 2010; Shinhara et al. 2011). With the exception of a few well-studied examples (e.g., λ OOP; Krinke and Wulff 1987 and E. coli GadY; Opdyke et al. 2004), bacterial antisense transcripts remain largely uncharacterized. Although some antisense transcripts have specific regulatory functions, others may result from “transcriptional noise” generated by nonspecific transcription initiation or weak promoters that become fixed within genes by evolutionary constraints on coding sequence (Struhl 2007). At high levels, however, even spurious antisense transcription could have deleterious effects by interfering with sense transcription, stimulating mRNA degradation, or diverting important cellular resources.

The bacterial Rho-dependent transcription termination pathway, which relies on the ATP-dependent translocase Rho (Fig. 4.1; Roberts 1969; reviewed in Banerjee et al. 2006; Boudvillain et al. 2010; Peters et al. 2011), targets untranslated RNAs including tRNAs, sRNAs, and at least some antisense transcripts (Peters et al. 2009). Rho binds to ~80 nt of C-rich unstructured (G-depleted) nascent RNA segments known as Rho-utilization (rut) sites, and subsequently dissociates elongation complexes (EC) via its RNA translocase activity (reviewed in Peters et al. 2011). Rho targeting of untranslated RNA causes the polar effects of nonsense mutations on expression of downstream genes in bacterial operons (Adhya et al. 1974).
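The notion of a C-rich, G-depleted segment of ~80 nt can be made concrete with a small sketch that scans an RNA sequence for the window with the highest C/G ratio. This is only an illustration of the sequence property described above, not a rut-site predictor and not the method used in this work; the function names, the pseudocount (added to avoid division by zero), and the toy sequence are assumptions.

```python
def cg_ratio(rna: str, pseudocount: float = 1.0) -> float:
    """Ratio of C to G counts in a sequence, with a pseudocount on both terms."""
    seq = rna.upper()
    return (seq.count("C") + pseudocount) / (seq.count("G") + pseudocount)


def best_window(rna: str, window: int = 80):
    """Return (start, ratio) for the window with the highest C/G ratio."""
    starts = range(len(rna) - window + 1)
    start = max(starts, key=lambda s: cg_ratio(rna[s:s + window]))
    return start, cg_ratio(rna[start:start + window])


# Toy sequence: a G-rich stretch, then a C-rich, G-free stretch, then more G.
toy_rna = "G" * 80 + "C" * 60 + "A" * 20 + "G" * 40
start, ratio = best_window(toy_rna)
```

For the toy sequence, the scan selects the 80-nt window that begins where the G-rich stretch ends, since that window contains 60 C residues and no G.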


Figure 4.1. Regulators of transcript elongation in bacteria. The EC is composed of RNAP (β′βα2ω subunits), DNA template, and RNA transcript. The Rho hexamer binds nascent RNA in primary sites on each subunit and a secondary site in the central pore. NusG contacts RNAP via its N-terminal domain and Rho via a C-terminal domain connected by a flexible linker. NusA binds RNAP via an N-terminal domain near the RNA exit channel, and contains additional domains (KH1, KH2, S1, and a C-terminal domain present in some bacteria).


Bacteria contain two general elongation factors, NusA and NusG, that may modulate Rho-dependent termination (Fig. 4.1). In vitro, NusG enhances Rho termination through direct interactions with Rho and RNAP (Li et al. 1993; Pasman and von Hippel 2000; Mooney et al. 2009b; Chalissery et al. 2011). In vivo, NusG is proposed either to aid all Rho-dependent termination (Cardinale et al. 2008) or to assist just a subset of terminators (Sullivan and Gottesman 1992). In vitro, NusA exhibits complex effects on Rho-dependent termination (Ward and Gottesman 1981; Burns and Richardson 1995; Cardinale et al. 2008; Saxena and Gowrishankar 2011a), but is proposed to aid all Rho-dependent termination in vivo (Cardinale et al. 2008).

The bacterial histone-like nucleoid-structuring protein (H-NS) also can inhibit transcription. H-NS represses transcription initiation by binding directly or adjacent to promoter DNA, then oligomerizing into higher order structures that occlude RNAP binding or trap RNAP at promoters (Fang and Rimsky 2008; Grainger and Busby 2008; Dorman 2009). H-NS may also affect transcription elongation by binding downstream of ECs and acting as a roadblock, although experimental evidence for such effects is lacking. Recent studies have identified a genetic link between Rho and H-NS-like proteins (Saxena and Gowrishankar 2011b; Tran et al. 2011). It is therefore possible that Rho and H-NS function together to silence transcription at the same loci. However, a comparison of the Rho termination sites to H-NS binding locations has not been reported.

In a previous study, we identified 25 sites in the E. coli genome at which Rho terminated antisense transcription (Peters et al. 2009). These findings raised several important questions. Is suppression of antisense transcription a major, global function of


Rho, or is it particular to a relatively small set of transcripts? What are the contributions of the elongation factors NusG and NusA to suppression of antisense transcription and

Rho termination in general? Finally, if bacterial Rho and eukaryotic Sen1 play analogous roles in terminating antisense transcripts, does H-NS also participate in silencing noncoding transcription in a manner similar to in eukaryotes?

In this study, we used two strand-specific, global RNA profiling techniques, tiling microarray analysis (Perocchi et al. 2007) and RNAseq (Parkhomchuk et al. 2009) to obtain high-resolution maps of Rho-dependent termination in E. coli, and to determine the relationships between Rho, H-NS, NusG, and NusA. Our results establish a major role for Rho in widespread suppression of antisense transcription, show that NusG, but not full-length NusA, plays a significant role in Rho-dependent termination, establish the sequence basis for NusG effects on Rho-dependent termination, and reveal synergy between Rho and H-NS in transcriptional silencing.


Results

A major function of Rho is suppression of antisense transcription

To investigate the effects of inhibiting Rho on the transcriptome of E. coli K-12, we first used tiling arrays to measure RNAs from wild-type E. coli (MG1655) grown with or without the specific Rho inhibitor bicyclomycin (BCM) at a concentration that reduces

Rho function without affecting the rate of cell growth (Ederth et al. 2006). We defined sites of Rho termination as positions at which BCM treatment caused a statistically significant increase in downstream transcript levels [False Discovery Rate (FDR) ≤ 5%;

Table 4.S1]. By these criteria, we identified a total of 1264 bicyclomycin significant transcripts (BSTs, Fig. 4.2A) whose levels or lengths increased when Rho was inhibited.

We next used RNAseq to confirm the identification of Rho-dependent terminators. The majority (91%) of terminators detected using tiling arrays also exhibited a two-fold or greater increase in RNAseq-defined readthrough in BCM-treated cells (p < 10⁻⁴), which confirmed that these terminators were not the result of microarray artifacts.

Analysis of the Rho-dependent terminator dataset revealed a striking connection between Rho termination and antisense transcription. We divided terminators into three broad categories based on whether they affected antisense transcription, sense transcription, or intergenic transcription (Fig. 4.2A; Supplementary Materials and

Methods), using an updated E. coli K-12 genome annotation (www..org; Keseler et al. 2011). Antisense BSTs were caused by readthrough of Rho-dependent terminators either upstream of an oppositely oriented gene (classes I and III) or within a gene (class II). Sense BSTs resulted from readthrough of terminators at the ends of


Figure 4.2. Genome-wide analysis of Rho-dependent transcription termination. (A)

Distribution of Rho-dependent terminators in the E. coli genome as detected by

Bicyclomycin Significant Transcripts (BSTs). Features on the + strand are shown above the solid black lines, and features on the – strand are shown below the lines. Genes are depicted as black boxes and transcript abundance detected by tiling arrays as blue bar graphs. Rho-dependent termination sites are shown as colored boxes; the colors correspond to BST annotations from Fig. 4.2B. (B) Rho-dependent terminator annotations. Terminators are divided into eight classes based on their locations relative to annotated genes. The diagram immediately to the right of the class number illustrates the orientation of transcripts and termination sites relative to annotated genes. Light blue arrows represent genes, and the arrows point in the direction of translation of the gene. The blue line indicates the length of the transcript in untreated cells, and the violet, dashed line shows extension of the transcript in BCM-treated cells. The bar graph to the right of the gene diagram shows the number and percentage (rounded) of terminators in each class. (C) Effects of Rho inhibition on sense and antisense transcription. The log2 ratios of normalized read counts in BCM-treated versus untreated conditions are shown for the coding strand (sense) or the strand opposite the coding strand (antisense) for all E. coli K-12 genes in biological duplicate. Each row represents one gene. Fold increases in transcript abundance due to BCM treatment are shown in yellow and decreases are shown in blue.


Figure 4.2

genes (classes IV and VI) or within genes (class V), and intergenic BSTs resulted from readthrough of terminators at the ends of genes (class VII) or where no gene is known to exist (class VIII). The vast majority (88%) of Rho-dependent terminators controlled antisense transcription (Fig. 4.2B); 52% were class I (e.g., readthrough of grxD transcription into lhr; Fig. 4.3A); 35% were class II (e.g., arising within bglF, or within bglH; Fig. 4.3B), and a few (~1%) were class III.

To quantify the occurrence of Rho-dependent termination in sense vs. antisense transcription, we used RNAseq to measure the levels of both sense and antisense transcripts in each gene in E. coli (Fig. 4.2C). Using a stringent cutoff (FDR ≤ 1%, and

5-fold effect), we found that the antisense strands of 1555 genes (34% of all genes) were significantly upregulated by BCM treatment, whereas only 416 genes were upregulated to the same degree on the sense strand (Table 4.S2). Taken together, our results reveal a much greater increase in genome-wide antisense transcription after

Rho inhibition than in sense transcription, and thus establish that a major function of

Rho is to suppress antisense transcription.

Rho-dependent termination is significantly enriched at transcription units (TUs) with the potential to generate class I antisense transcripts relative to TUs lacking this potential. Of the 2054 genes at the end of TUs, 725 (35%) are terminated by Rho. Of the 1051 genes at the end of TUs at which the next gene is oriented in the opposite direction, 536 (51%) are terminated by Rho (p < 10⁻⁴), whereas only 189 of the 1003 (19%) genes at the end of TUs for which the next gene is in the same orientation are terminated by Rho.
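The enrichment reported above can be checked directly from the counts in the text. The sketch below is illustrative only: the text does not name the test behind this p value, so the one-sided hypergeometric (Fisher-type) tail used here is an assumption, and the function and variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n items are drawn without replacement
    from N items, K of which are 'successes'."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return float(Fraction(tail, total))

# Counts taken from the text.
TU_END_GENES = 2054       # genes at the ends of TUs
RHO_TERMINATED = 725      # of these, terminated by Rho
OPPOSITE = 1051           # next gene in the opposite orientation
OPPOSITE_RHO = 536        # of these, terminated by Rho
SAME = 1003               # next gene in the same orientation
SAME_RHO = 189            # of these, terminated by Rho

pct_opposite = 100 * OPPOSITE_RHO / OPPOSITE
pct_same = 100 * SAME_RHO / SAME

# Assumed test: chance of seeing at least 536 Rho-terminated TUs among the
# 1051 opposite-orientation TUs if orientation were irrelevant.
p_enrichment = hypergeom_upper_tail(OPPOSITE_RHO, OPPOSITE, RHO_TERMINATED, TU_END_GENES)

print(f"opposite: {pct_opposite:.0f}%  same: {pct_same:.0f}%  p = {p_enrichment:.2e}")
```

Exact integer arithmetic is used so that the very small tail probability is not lost to floating-point rounding before the final conversion.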


Figure 4.3. Effects of Rho inhibition at class I and class II Rho-dependent terminators. (A) BCM effects on the class I Rho-dependent terminator at the grxD locus. A statistically significant increase in transcript levels (magenta dashed brackets) between untreated (blue bars) and BCM-treated (violet bars) cells occurs at the 3′ end of the grxD gene, indicating that the grxD transcript is terminated by Rho. (B) BCM effects on two class II Rho-dependent terminators at the bgl locus. Rho terminates antisense transcripts that arise from within the bglF and bglH genes. Note that the bglH antisense transcript is essentially undetectable in untreated conditions.


Figure 4.3


Sense transcription is not affected by moderate increases in antisense transcription

The functions of Rho-terminated antisense transcripts within genes are unknown.

Proposed functions of the few antisense transcripts characterized to date include reduction of sense strand gene expression due either to pairing-mediated mRNA degradation or transcriptional interference (Georg and Hess 2011). In such cases, levels of sense and antisense transcripts should be anticorrelated; in other words, increases in antisense transcription would lead to decreases in sense expression. To determine if increased antisense transcription in Rho-inhibited cells caused a decrease in sense strand transcription or an increase in RNA degradation, we calculated the correlation between BCM effects on sense and antisense transcription for annotated genes. We found no correlation between BCM-induced changes in sense and antisense transcript abundance at the gene level (r = -0.09; Fig. 4.S1A). However, the effects of antisense transcription might be evident only at a subset of sites where antisense transcripts are most highly upregulated. To address this possibility, we conducted a second correlation analysis comparing the transcript levels from both strands of antisense BSTs that occurred within genes (class II BSTs). Again, there was little correlation between sense and antisense transcript levels (r = -0.01; Fig. 4.S1B). We conclude that an increase in antisense transcription caused by sub-lethal inhibition of

Rho does not inhibit sense transcription, consistent with the idea that most antisense transcription is transcriptional noise. Greater increases in antisense transcription

(caused by complete, and lethal, inhibition of Rho) could reveal effects on sense RNAs, but would be unlikely to reflect a physiologically relevant state of regulation in the cell.
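The correlation analysis described above can be illustrated with a short sketch. It assumes that r is a Pearson coefficient computed on per-gene log2(BCM/untreated) ratios; the values below are hypothetical placeholders, not data from this study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical log2(BCM/untreated) ratios for five genes (illustration only).
sense_log2 = [0.1, -0.2, 0.0, 0.3, -0.1]
antisense_log2 = [2.5, 3.1, 0.4, 1.8, 2.9]

r = pearson_r(sense_log2, antisense_log2)
print(f"r = {r:.2f}")
```

If antisense transcription interfered with sense expression, r would be expected to be strongly negative; values near zero, as reported above, indicate no detectable relationship.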


Rho and H-NS silence transcription at the same genomic loci

Recent studies have identified genetic interactions between Rho activity and genes encoding H-NS-like proteins (Saxena and Gowrishankar 2011b; Tran et al. 2011).

These interactions may result from cooperative inhibition of deleterious transcription at certain loci by both Rho and H-NS. To test this hypothesis, we identified the genome-wide locations of H-NS binding using chromatin immunoprecipitation on high-density tiling microarrays (ChIP-chip) under defined growth conditions, and determined the overlap between these locations and the sites at which Rho terminates transcription.

We found a strong association between H-NS binding sites and Rho-dependent terminators (Fig. 4.4A-C). Of the 1264 Rho-dependent terminators identified by tiling expression, 1066 (84%) were within 300 bp of sites with significant H-NS ChIP-chip signal (p < 10⁻⁴). Of the 1105 Rho-dependent terminators that suppress antisense transcription, 921 (83%) were associated with H-NS binding sites. The occupancy of H-NS associated with Rho-dependent terminators was greatest near the site of termination, indicating that Rho termination occurs within H-NS patches (Fig. 4.4A). Increased H-NS occupancy near Rho-dependent terminators was not simply due to H-NS binding in intergenic regions at the ends of genes; terminators that occurred within genes (class II BSTs) were bound to an even greater extent by H-NS (Fig. 4.4A). These results establish that H-NS generally occupies DNA at the sites of Rho termination, and are consistent with a model in which Rho and H-NS act cooperatively to silence antisense transcription (see Discussion).
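The 300-bp proximity criterion used above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function name and coordinates are hypothetical, each terminator is treated as a single position, and the circularity of the chromosome is ignored.

```python
from bisect import bisect_right

def fraction_within(sites, intervals, max_dist=300):
    """Fraction of point sites lying within max_dist bp of any (start, end) interval."""
    # Pad each interval by max_dist and merge overlaps so one binary search suffices.
    merged = []
    for start, end in sorted(intervals):
        start, end = start - max_dist, end + max_dist
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    starts = [m[0] for m in merged]
    hits = 0
    for pos in sites:
        i = bisect_right(starts, pos) - 1  # last padded interval starting at or before pos
        if i >= 0 and pos <= merged[i][1]:
            hits += 1
    return hits / len(sites)

# Hypothetical coordinates (illustration only).
terminators = [100, 1000, 4790, 5000]
hns_regions = [(350, 600), (4000, 4500)]

overlap = fraction_within(terminators, hns_regions)
print(f"{100 * overlap:.0f}% of terminators lie within 300 bp of an H-NS region")
```

Assessing significance would additionally require comparing the observed fraction with that obtained for randomly placed positions; that step is not shown here.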

The overlap between sites of Rho-dependent termination and H-NS binding may reflect a functional relationship between Rho and H-NS. For instance, H-NS could slow


Figure 4.4. Spatial and functional associations between H-NS and Rho-dependent termination. (A) H-NS binding near Rho-dependent terminators. The median H-NS

ChIP signal (see Supplemental Materials and Methods) is shown at specified distances from the 5′ end of all BSTs (violet), class II BSTs (black) or random chromosome positions (grey). (B) H-NS binding at the class I Rho-dependent terminator at the yhhJ locus. Colors are as in Fig. 4.3, except that H-NS ChIP-chip data is shown in orange.

(C) H-NS binding at the class II Rho-dependent terminator within the xylG gene. (D)

Genetic interactions between hns and rho. Fitness is expressed as the ratio of the doubling times (D) for wild-type (DWT = 24.75 ± 0.33 min) versus mutant cells in liquid

LB medium. Wild-type fitness is set at one. Predicted fitness of double mutants is based on the multiplicative model (fitness of mutant #1 × fitness of mutant #2 = predicted fitness of double mutant; St Onge et al. 2007).


Figure 4.4

elongating RNAP and allow time for Rho to terminate transcription of silenced genes.

Functional relationships between genes can often be inferred from the growth phenotypes of double mutant strains; a slower-than-expected growth rate for the double mutant indicates a genetic interaction that suggests shared function (St Onge et al.

2007). To test for a functional relationship between Rho and H-NS, we measured the growth rate of strains containing defective rho alleles (either rho4 (Morse and Guertin 1972) or rho115 (Guterman and Howitt 1979)) and a deletion of hns (Δhns; Baba et al. 2006) in LB medium (Fig. 4.4D). Double mutant rho4 Δhns and rho115 Δhns strains grew more slowly than expected from a multiplicative model of fitness (Fig. 4.4D; St Onge et al. 2007), revealing genetic interactions between rho and hns. A third rho allele, rho15(Ts) (Das et al. 1976), was even more detrimental to cell viability when combined with Δhns. rho15(Ts) allows growth at 30 °C, but not 42 °C. We could only construct a strain carrying both rho15(Ts) and Δhns in the presence of a rho+ complementing plasmid, regardless of the growth temperature, suggesting that the combination of rho15(Ts) and Δhns is lethal to cells. To test this possibility, we monitored retention of an unstable plasmid (Koop et al. 1987; Bernhardt and de Boer

2004) carrying a rho+ allele and a lacZ+ reporter gene by the rho15(Ts) Δhns double mutant on plates containing X-gal at 30 °C (Fig. 4.S2). The rho15(Ts) Δhns strain formed only blue colonies, indicating that the rho+ plasmid was required for viability.

These genetic interactions confirm cooperative action of Rho and H-NS in transcriptional silencing.
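The multiplicative model of fitness used above can be written out as a short sketch. Only the wild-type doubling time (24.75 min) is taken from the text; the mutant doubling times are hypothetical placeholders, and the function names are illustrative.

```python
D_WT = 24.75  # wild-type doubling time in minutes (Fig. 4.4 legend)

def fitness(doubling_time_mutant, doubling_time_wt=D_WT):
    """Fitness as the ratio of wild-type to mutant doubling time (wild type = 1)."""
    return doubling_time_wt / doubling_time_mutant

def predicted_double_mutant(fitness_1, fitness_2):
    """Multiplicative model: expected fitness if the two mutations act independently."""
    return fitness_1 * fitness_2

# Hypothetical doubling times in minutes (illustration only, not measured values).
f_rho = fitness(30.0)
f_hns = fitness(33.0)
f_double_observed = fitness(50.0)

f_double_predicted = predicted_double_mutant(f_rho, f_hns)
# Negative deviation: the double mutant grows more slowly than predicted.
deviation = f_double_observed - f_double_predicted
print(f"predicted {f_double_predicted:.3f}, observed {f_double_observed:.3f}, "
      f"deviation {deviation:+.3f}")
```

A negative deviation of observed from predicted fitness is the signature of the aggravating genetic interaction described above.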


NusG principally assists termination at a minority subset of Rho-dependent terminators

It is uncertain whether NusG enhancement is a significant requirement for termination at all Rho-dependent terminators (Cardinale et al. 2008; Burmann et al. 2010), or only at a subset of Rho-dependent terminators (Sullivan and Gottesman 1992) in vivo. We used ChIP-chip to detect RNAP readthrough of Rho-dependent terminators in a multiple-deletion strain lacking cryptic prophages (MDS42; Posfai et al. 2006) and also lacking NusG (MDS42 ∆nusG; Cardinale et al. 2008). We found a statistically significant overlap between Rho-dependent terminators and sites affected by deletion of nusG (p < 10⁻⁴).

Of the 157 Rho-dependent terminators detectable by ChIP-chip (Peters et al. 2009) that are present in MDS42, 42 (27%) were within 300 bp of a statistically significant increase in RNAP occupancy in ∆nusG cells (Table 4.S3). These results indicate that NusG enhancement significantly affects termination efficiency at less than half of Rho-dependent terminators.

To investigate the contribution of NusG to suppression of antisense transcription, we performed tiling expression analysis on cells deleted for nusG and the rac prophage

(Fig. 4.5ABC). Consistent with our ChIP-chip results, only a subset of the Rho-terminated transcripts were affected by deletion of nusG. Of 1264 Rho-dependent terminators, 247 (20%) exhibited greater readthrough in ∆nusG cells. Of the 1105 Rho-dependent terminators controlling antisense transcription, 229 (21%) exhibited greater readthrough when NusG was absent. These results demonstrate that NusG acts in concert with Rho to suppress antisense transcription at a minority subset of Rho-dependent terminators, rather than at all terminators (Cardinale et al. 2008).


Figure 4.5. Effects of ΔnusG and ΔnusA* on Rho-dependent termination. (A)

Effects of ΔnusG and ΔnusA* on expression within MDS42 BSTs. The log2 ratios of median intensity within MDS42 BSTs in BCM-treated or mutant cells versus untreated cells are shown in biological duplicate. Each row represents one MDS42 BST. Fold increases in transcript abundance due to either BCM treatment, ΔnusG or ΔnusA* are shown in yellow and decreases are shown in blue. (B) Effects of ΔnusG or ΔnusA* on the class I Rho-dependent terminator at the grxD locus. Neither ΔnusG nor ΔnusA* has a significant effect on transcript abundance at the 3′ end of the grxD gene. Colors are as in Fig. 4.3, except that transcripts from ΔnusG cells are shown in green and transcripts from ΔnusA* cells are shown in red. (C) Effects of ΔnusG or ΔnusA* on the class II Rho-dependent terminator within the bglF gene. ΔnusG had a significant effect on transcript abundance of the bglF antisense transcript, but ΔnusA* had no significant effect. Note that the ΔnusG effect on termination of the bglF antisense transcript was not as potent as direct inhibition of Rho with BCM.


Figure 4.5


The overlap between sites of Rho termination and either NusG enhancement or

H-NS binding suggests that Rho, NusG, and H-NS may act synergistically at the same sites to suppress transcription. Alternatively, NusG may be required at terminators lacking H-NS. To distinguish between these possibilities, we compared Rho-dependent terminators that require NusG for efficient termination to those associated with H-NS.

We found that the majority of NusG-dependent terminators were bound by H-NS (210 out of 247, or 85%; p < 10⁻⁴). Further, H-NS association with NusG-affected terminators was statistically indistinguishable from H-NS association with terminators in general (p = 0.1). We conclude that NusG is not required at terminators lacking H-NS; instead, our data are consistent with the idea that Rho, NusG, and H-NS act coordinately at specific loci.

NusG is required at Rho-dependent terminators with sub-optimal nucleic-acid sequences

The features that distinguish Rho-dependent terminators that require NusG for efficient termination from those that do not are unknown. The NusG requirement at certain terminators is not explained by a lack of H-NS binding. In principle, NusG dependence could be associated with the type or function of the terminated gene (mRNA, tRNA, sRNA), whether the gene downstream of the terminator is oriented in the same (sense) or opposite (antisense) direction as readthrough, or nucleic-acid sequences present at the terminator. To determine if NusG-dependent terminators are enriched at certain types of genes or classes of terminators, we compared gene associations and classes

from the total set of BSTs to those BSTs with significant overlapping NusG effects. We found that NusG-dependent terminators were not significantly enriched at mRNAs, tRNAs, or sRNAs (p = 0.44). Further, NusG enhancement was not differentially represented at any specific class (sense, antisense, within genes, at the end of genes, etc.) of Rho-dependent terminator (p = 0.21, Fig. 4.6A). Finally, functional annotation analysis using the DAVID Bioinformatics Database (david.abcc.ncifcrf.gov; Huang da et al. 2009) failed to identify any particular set of genes that required NusG for efficient termination. We conclude that NusG stimulation of Rho termination is not associated with gene type, function, or location of the terminator relative to annotated genes.

We next considered whether particular sequences distinguish NusG-dependent

Rho terminators. Although upstream rut sites are present at some Rho-dependent terminators (e.g., λ tR1 (Chen and Richardson 1987) and E. coli trp t´ (Zalatan and Platt 1992)), sequences associated with Rho termination or NusG-dependence have not been examined on a genome scale. To identify possible sequence patterns at Rho-dependent terminators, we calculated the C/G ratio in 100-bp windows from -500 bp upstream to

+1000 bp downstream of the first probe significantly affected by BCM treatment (Fig.

4.6B; high C/G ratio sequences will generate rut sites in RNA). We found a peak in the

C/G ratio centered at approximately +200 bp (p < 10⁻⁴; Fig. 4.6B). This peak was due to both an increase in C (from 25% to 27%) and a decrease in G residues (from 25% to

21%). Because we defined Rho-dependent terminators by the location of the transcript

3´ end in untreated cells and because Rho-terminated transcripts are typically processed by cellular 3´→5´ exonucleases (Mohanty and Kushner 2007), the elevated

C/G ratio downstream from the mapped 3´ ends likely reflects Rho-dependent


Figure 4.6. Basis for NusG effects on Rho-dependent termination. (A) NusG enhancement of termination is not associated with terminator class. All BSTs and BSTs with an overlapping significant ΔnusG effect were not statistically distinguishable by terminator class (Fisher Exact Test; p = 0.2053). Colors are as in Fig. 4.2B. (B) NusG enhancement of termination is associated with sequences present at the termination site. The median C/G ratio (see Supplemental Materials and Methods) is shown at specified distances from the 5′ end of all BSTs (violet), NusG-independent BSTs (black),

NusG-dependent BSTs (green) or random chromosome positions (grey). The orange

“pacman” represents putative processing of transcripts by 3´→5´ exonucleases.


Figure 4.6

termination in or near the high C/G ratio sequences followed by processing of these transcripts to generate RNA 3´ ends that map upstream from these sequences (Fig.

4.6B). Interestingly, the median C/G ratio remained above the random baseline more than 1 kb downstream of the initial BCM effect (Table 4.S4), suggesting that sites capable of eliciting Rho termination are arranged consecutively in the genome to increase overall termination efficiency.
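The windowed C/G ratio described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure in Supplemental Materials and Methods: windows are taken as consecutive and non-overlapping, the sequence supplied is assumed to be the transcribed (RNA-like) strand, and all names are hypothetical. For reference, the reported shift from 25% C and 25% G to 27% C and 21% G corresponds to the C/G ratio rising from 1.0 to about 1.29.

```python
def cg_ratio(seq):
    """Ratio of C to G residues in a sequence (NaN if the sequence has no G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    return c_count / g_count if g_count else float("nan")

def windowed_cg_ratio(genome, anchor, upstream=500, downstream=1000, window=100):
    """C/G ratio in consecutive windows from anchor - upstream to anchor + downstream.

    Returns (offset, ratio) pairs, where offset is the window start relative to
    the anchor (e.g., the first probe significantly affected by BCM).
    """
    profile = []
    for offset in range(-upstream, downstream, window):
        start = anchor + offset
        if start < 0 or start + window > len(genome):
            continue  # skip windows that run off the end of the sequence
        profile.append((offset, cg_ratio(genome[start:start + window])))
    return profile

# Hypothetical 2-kb sequence with equal C and G content (illustration only).
toy_genome = "ACGT" * 500
profile = windowed_cg_ratio(toy_genome, anchor=600)
for offset, ratio in profile:
    print(f"{offset:+5d}  {ratio:.2f}")
```

Taking the median of such profiles across all terminators, and across random positions as a baseline, would give curves of the kind summarized in Fig. 4.6B.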

To investigate if NusG effects on termination are related to nucleotide content at Rho-dependent terminators, we examined the C/G ratios of 300 terminators that were highly dependent on NusG for efficient termination, and 300 that were NusG-independent (see

Supplemental Materials and Methods). We found that terminators with strong NusG dependence exhibited a lower median C/G ratio than the total set of Rho-dependent terminators (Fig. 4.6B, Table 4.S3). In contrast, NusG-independent terminators displayed an even greater skew toward C and away from G than the majority of Rho-dependent terminators. Differences in the median C/G ratio for both NusG-dependent and NusG-independent terminators compared to all terminators were statistically significant within

300 bp of the 5′ end of the BST (Table 4.S4). We conclude that the NusG-dependence of some Rho terminators results from the presence of sequences less likely to generate rut sites (i.e., lower C/G ratio sequences).

REP elements also are associated with Rho-dependent terminators

We used the MEME algorithm (Bailey and Elkan 1994) to identify potential motifs in the 300 bp upstream of Rho-dependent terminators (Supplementary Materials and Methods). We found several long (20-29 nt) motifs with high information content (E values from 1.7e-36 to 3.7e-108) at a subset of terminators (Fig. 4.S3A). The sequences and locations of the motifs matched those of REP elements, which are repetitive sequences found in intergenic regions near the 3´ ends of genes (Stern et al. 1984). Of the 695 REP elements present in the E. coli K-12 genome, 334 (48%) were within 300 bp of a Rho-dependent terminator (p < 10⁻⁴). REP elements contain palindromic units

(PUs) with dyad symmetry that form hairpin structures when transcribed into RNA (Fig.

4.S3B). REP elements near Rho-dependent terminators could potentially function as modulators of termination (Espeli et al. 2001) or as stabilizing hairpins that prevent RNA decay (Stern et al. 1984) in the absence of intrinsic terminator hairpins. To distinguish between these possibilities, we determined the distribution of REP elements from -500 bp upstream to +1000 bp downstream of the first probe significantly affected by BCM treatment (Fig. 4.S3C). We found that the distribution of REP elements was centered at approximately -100, about 300 bp upstream of high C/G ratios that represent Rho binding sites. Thus, the majority of REP elements appear too far upstream to directly affect Rho binding and subsequent termination, but optimally positioned to explain RNA

3' ends present after exonuclease trimming. We suggest that the major function of REP elements found near Rho-dependent terminators is to inhibit RNA decay by impeding the processivity of 3´→5´ exonucleases, rather than to affect Rho termination efficiency.


A classic nusA deletion has no effect on Rho-dependent termination in vivo

The transcription elongation factor NusA has been proposed to enhance all Rho-dependent termination in vivo (Cardinale et al. 2008). We used RNAP ChIP-chip to identify sites of terminator readthrough in a strain that carried the same large nusA deletion studied previously either in a rho mutant strain containing unmapped mutations that allowed viability or in MDS42 (Zheng and Friedman 1994; Cardinale et al. 2008).

This deletion, which we term ∆nusA*, disrupts nusA at codon 128, leaving the NusA

NTD intact but removing the NusA S1, KH1, and KH2 domains. We found little or no overlap between Rho-dependent terminators and sites affected by ∆nusA*. Of the 157

MDS42 Rho-dependent terminators detectable by ChIP-chip (Peters et al. 2009), only 9

(6%) were within 300 bp of a statistically-significant increase in RNAP occupancy in

∆nusA* cells (p = 0.2756; Table 4.S3). To rule out the possibility that our inability to detect a major effect of ∆nusA* on Rho termination was due to the low sensitivity of the

ChIP-chip technique, we performed tiling expression analysis on ∆nusA* cells (Fig.

4.5ABC). Again, we found no significant overlap between BCM effects and ∆nusA* effects. Of the 1264 Rho-dependent terminators identified by tiling expression, only 4

(<1%) overlapped with a transcript significantly upregulated in the ∆nusA* strain (p =

0.4139). Because our ∆nusA* results differed dramatically from those previously published using the same nusA allele (Cardinale et al. 2008), we confirmed that our strain contained the ∆nusA* allele by sequencing of the nusA locus (data not shown), and by visual inspection of nusA transcript data (Fig. 4.S4). We were unable to explain the discrepancy between our results and those reported by Cardinale et al. (2008). We conclude that the S1, KH1, and KH2 RNA-binding domains of NusA (the domains

deleted in ∆nusA*) have little effect on Rho termination in vivo under standard E. coli growth conditions.


Discussion

Our transcriptomic analysis of Rho termination establishes suppression of antisense transcription as a major role of Rho in bacteria. Most antisense transcription suppressed by Rho arises from a large and mostly uncharacterized set of antisense promoters within genes (internal antisense; Fig. 4.7) and from continuation of sense transcription past the ends of genes into oppositely oriented downstream genes (readthrough antisense; Fig. 4.7). Both H-NS and NusG contribute to the suppression of antisense transcription through apparently independent mechanisms. These findings raise several questions that merit discussion and further study. What impact does pervasive antisense transcription have on gene expression? Does H-NS affect Rho-dependent termination directly, possibly by slowing elongation by RNAP and allowing more time for Rho to terminate transcription? What is the mechanistic basis for NusG effects that depend on terminator sequence? Finally, how similar are the bacterial and eukaryotic systems that control antisense transcription?

Impact of antisense transcription on sense transcription

We observed no apparent effect on sense transcription when Rho inhibition caused significantly elevated antisense transcription (Fig. 4.S1AB). Although complete Rho inhibition causing possibly higher levels of antisense transcription could affect sense transcription, it is also lethal. Thus, if physiologically relevant effects on sense transcription occur by Rho modulation in wild-type cells, we should have detected them.

This result suggests counterintuitively that most antisense transcription in wild-type cells


Figure 4.7. Models of antisense transcription termination by Rho. Rho terminates transcription at the end of genes, preventing antisense transcription into downstream genes (Readthrough Antisense, or class I terminators). Rho also terminates antisense transcription arising from within genes (Internal Antisense, or class II terminators).

Termination sites are more C rich and G poor than random genomic DNA, and, thus, facilitate Rho loading onto the nascent RNA (violet box labeled “C>G”). H-NS (orange ovals) is typically bound adjacent to sites of Rho termination (H-NS binding sites are shown as orange boxes labeled “H-NS”), and is functionally synergistic with Rho. NusG enhances Rho termination at termination sites with reduced C and increased G content, which account for less than a quarter of all Rho-dependent terminators (dashed black arrow). The RNA-binding domains of NusA do not affect Rho termination of antisense transcription (black crossout). Class I terminators can also be associated with REP elements, which may stabilize the terminated mRNA (light blue box labeled “REP”).


Figure 4.7

has minimal effects on gene expression, even though specific effects of some antisense transcripts on gene expression are well known (Georg and Hess 2011).

Apparently, collisions between RNAPs as a result of antisense transcription are tolerated in normal cells as a form of transcriptional noise (Struhl 2007). The density of

RNAP on any mRNA gene is low (less than one per cell; Bon et al. 2006), so collisions between sense and antisense RNAP molecules are likely infrequent. Even when collisions do occur, transcription by sense but not antisense RNAP molecules will be aided by ribosomes translating nascent mRNA (Proshkin et al. 2010). Sense RNAPs may thus overpower antisense RNAPs, causing them to halt or backtrack and thereby facilitate Rho dissociation of antisense transcription complexes.

H-NS may assist Rho in silencing antisense transcription

Our findings that H-NS is generally associated with DNA at the sites of Rho termination, and that hns and rho genetically interact, suggest that Rho and H-NS act coordinately to silence transcription. One possible explanation for these synergistic effects is that H-NS increases the time window for effective Rho action by slowing transcript elongation or increasing pausing by RNAP.

Several observations support this direct model of H-NS/Rho-synergy, although we cannot rule out indirect effects. First, H-NS occupancy of DNA increases as RNAP approaches the termination site, with maximal H-NS binding occurring at the position of termination. This is consistent with RNAP “running into” a downstream patch of oligomerized H-NS, which may serve as a block to elongation. Second, overexpression of H-NS lacking its C-terminal DNA-binding domain or of ydgT, whose product

resembles the N-terminal H-NS oligomerization domain, can suppress defects in Rho-dependent termination caused by mutations in rho or nusG in hns+ cells (Williams et al. 1996; Saxena and Gowrishankar 2011b). These results suggest that changes to the H-NS nucleoprotein filament alter the efficiency of Rho termination. These effects are likely specific to sites of H-NS binding, rather than reflecting general effects of YdgT or

H-NS fragments on rates of transcript elongation because neither YdgT nor an H-NS fragment altered the rate of mRNA synthesis in a gene (lacZ) that does not bind H-NS

(Grainger et al. 2006; Oshima et al. 2006; Kahramanoglou et al. 2011; Saxena and

Gowrishankar 2011b). Finally, sites of H-NS binding and Rho termination need to be coincident for a direct model of H-NS and Rho synergy, but not for an indirect model.

Testing whether H-NS affects transcription elongation, pausing, and Rho-termination directly will require carefully designed mechanistic and structure/function experiments.

Mechanism of NusG enhancement and relevance to polarity models

Our results establish that NusG affects a subset of Rho-dependent terminators, and that these effects depend on sequences at the termination site. The simplest mechanistic explanation for these results is that NusG is required at terminators with suboptimal Rho-binding sites (rut sites). High C/G sequences that distinguish NusG-independent and NusG-dependent terminators should generate rut sites that bind Rho with high affinity. Further, lower C/G sequences should generate RNA structures that further diminish unpaired C residues available for Rho binding. Logically, NusG may help Rho bind nascent RNA at terminators with lower C/G ratios by tethering Rho near the RNAP exit channel.
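The C/G ratio invoked in this argument can be illustrated with a short calculation. The following is a minimal sketch, assuming that the ratio is simply the number of C residues divided by the number of G residues in an RNA window; the function name `cg_ratio` and the example sequences are hypothetical and are not taken from the analysis itself.

```python
def cg_ratio(rna):
    """Return (number of C) / (number of G) for an RNA sequence.

    Assumption: the ratio is a plain residue count ratio over the whole
    sequence passed in. Returns None when the sequence contains no G.
    """
    seq = rna.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if g_count == 0:
        return None
    return c_count / g_count


# Hypothetical sequences: a C-rich, G-poor window and a G-rich window.
c_rich = cg_ratio("CCCUCCGA")   # 5 C, 1 G
g_rich = cg_ratio("GGGCGA")     # 1 C, 4 G
```

Under this definition, a C-rich, G-poor window (high C/G) leaves more unpaired C residues available, which is the property the text associates with high-affinity rut sites.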

However, NusG stimulation of Rho-RNA binding challenges existing ideas about Rho termination. NusG does not affect the half-maximal concentration of Rho required for termination at a NusG-dependent terminator in lacZ (Burns and Richardson 1995). Further, Sen and co-workers argue that NusG stimulates the EC dissociation step rather than the RNA binding step of Rho-dependent termination (Chalissery et al. 2011; Kalyani et al. 2011). Others argue that Rho is permanently associated with RNAP (Epshtein et al. 2010), obviating a role of NusG in tethering Rho to the EC. Conceivably, the sequence dependence of NusG effects on Rho-dependent terminators could reflect NusG modulation of a step other than RNA binding that also is favored by Rho-rut site interaction. Unambiguously determining the mechanistic basis of termination enhancement by NusG will require careful assays of the Rho-concentration dependence of termination at a collection of Rho-dependent terminators.

Our finding that NusG affects a subset of Rho-dependent terminators also has implications for models of transcriptional polarity (Peters et al. 2011). In the RNA-competition model, ribosomes block Rho binding to rut sites on the nascent RNA (Adhya et al. 1974). In the NusG-competition model, ribosomes sequester the NusG-CTD in an interaction with ribosomal protein S10 (Burmann et al. 2010) and thereby prevent the NusG-CTD from activating Rho-dependent termination. Since only a minority of Rho-dependent terminators require NusG, however, most polarity suppression cannot be explained by ribosomes sequestering the NusG-CTD. Our results favor a mixed model in which both RNA competition and NusG competition play roles. Mutations in the ribosome that disrupt the binding between S10 and NusG will be crucial to test the contribution of the S10-NusG interaction to polarity in vivo.

Analogous control of antisense transcription in bacteria and eukaryotes

Although the synergy among chromatin structure and elongation factors in terminating spurious transcription is more complex in eukaryotes, the helicase Sen1 plays a role analogous to that of bacterial Rho. Together with Nrd1 and Nab3, which confer both RNA recognition (Carroll et al. 2007) and early elongation phase-specific RNAPII CTD recognition (Vasiljeva et al. 2008), Sen1, like Rho, terminates cryptic, untranslated transcripts including antisense transcripts (Arigo et al. 2006; Brow 2011). Both histone modifications and nucleosome composition affect these processes in eukaryotes (Carrozza et al. 2005; Santisteban et al. 2011), and some histone modifications appear to aid Sen1 action (Terzi et al. 2011). Further, the evolutionarily conserved Spt5 (NusG) elongation factor both aids transcript elongation (Hirtreiter et al. 2010) and, through its multiple C-terminal KOW domains, recruits elongation factors (Zhang et al. 2005), apparently including Nrd1 (Vasiljeva and Buratowski 2006; Lepore and Lafontaine 2011). Thus, in both bacteria and eukaryotes, nucleoprotein structure and a functionally analogous termination complex increase the overall fidelity of RNA synthesis by suppressing noncoding transcription.


Materials and Methods

Strains, plasmids, and primers

Strains are listed in Table 4.S4. rho and hns strains were constructed by P1vir-mediated transduction (Burns and Richardson 1995). The unstable plasmid, pJP124 (pRC7-rho+), was generated by cloning a PCR product containing the rho promoter and rho gene between the ApaI and HindIII sites of pRC7 (Koop et al. 1987; Bernhardt and de Boer 2004) using the following primers: 5′-ctctctctgggcccataagggaatttcatgttcgg and 5′-ctctctaagcttatgagcgtttcatcattt. Transcription of rho+ and lacZ+ was driven by the rho promoter (Prho-rhoL+-rho+-lacZYA+).

RNA isolation

Cells were grown in MOPS minimal medium (Neidhardt et al. 1974) with 0.2% glucose at 37 °C in gas-sparged Roux bottles to mid-log phase (OD600 ~ 0.3-0.4). Culture samples were transferred directly into an ice-cold ethanol/phenol stop solution (Rhodius et al. 2006), which immediately inactivated cellular RNases. Cells were collected by centrifugation and stored at -80 °C until RNA extraction. Total RNA was extracted from cell pellets by hot phenol extraction (Khodursky et al. 2003). The integrity of total RNA was determined from agarose gel or microchannel (Agilent Bioanalyzer) electrophoretograms. Ribosomal RNA (16S and 23S) was depleted prior to construction of RNAseq libraries using MICROBExpress reagents (Ambion).

RNAseq, data normalization, processing, and significant gene identification


RNAseq was performed by the DOE Joint Genome Institute (Walnut Creek, CA), using the dUTP method (Parkhomchuk et al. 2009). Briefly, ribosome-depleted RNA was fragmented in a buffered zinc solution (Ambion), then purified using AMPure SPRI beads (Agencourt). First-strand cDNAs were then synthesized from the fragmented RNA using Superscript II reverse transcriptase (Invitrogen), followed by a second bead purification. dUTP was included in the second-strand synthesis reaction in addition to dTTP to chemically mark the second strand. Two further bead purification steps using different ratios of beads to cDNA (85/100, then 140/100) selected cDNAs in a range between 150 and 350 bp. cDNAs were then A-tailed using Exo- Klenow, followed by ligation of sequencing adapter oligos. Following bead purification, dUTP was cleaved from the second strand using AmpErase UNG (uracil N-glycosylase, Applied Biosystems), resulting in adaptor-ligated single-stranded cDNAs. Deep sequencing of cDNAs was performed using the Illumina Tru-Seq Sequencing Platform. RNAseq reads were mapped to the E. coli K-12 MG1655 genome (Genbank ID U00096.2) using SOAP (Li et al. 2008; Table 4.S6). RNAseq data depicted in Figs. 2C, 3A, and 3B were normalized by dividing either the total number of read counts per sense or antisense strand of each gene (Fig. 4.2C) or the number of read counts at a given chromosome coordinate (Figs. 3A and 3B) by the total library size in millions (i.e., counts per million).

Normalized biological replicates showed good agreement (r ≥ 0.99 at the count per gene level). The log2 ratios for BCM-treated versus untreated cells shown in Fig. 4.2C were calculated after a constant of one read per million was first added to each gene to avoid divide-by-zero errors. These log2 ratios were quantile normalized between biological replicates of the same strand (sense or antisense) using R (normalizequantiles; Arigo et al. 2006). To identify genes with significant changes in expression in BCM-treated cells, raw (unnormalized) RNAseq reads between the start and end coordinates of genes were summed for both sense and antisense strands. Library normalization and significant gene determination were carried out using an R implementation of the edgeR algorithm (Robinson et al. 2010). Fold-effect and FDR values for each gene are listed in Table 4.S2.
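The counts-per-million scaling and the pseudocount-stabilized log2 ratio described in this section can be sketched as follows. This is an illustration only, not the code used for the analysis (which was carried out in R); the function names, gene names, and read counts are hypothetical, and quantile normalization and the edgeR step are not reproduced.

```python
import math


def counts_per_million(counts, library_size):
    """Divide raw read counts by the total library size in millions."""
    scale = library_size / 1_000_000
    return {gene: n / scale for gene, n in counts.items()}


def log2_ratio(treated_cpm, untreated_cpm, pseudocount=1.0):
    """log2(treated / untreated) after adding a pseudocount (in reads per
    million) to both values, which avoids division by zero."""
    return {
        gene: math.log2(
            (treated_cpm[gene] + pseudocount) / (untreated_cpm[gene] + pseudocount)
        )
        for gene in treated_cpm
    }


# Hypothetical raw counts for one strand of two genes.
untreated = {"geneA": 100, "geneB": 0}
treated = {"geneA": 300, "geneB": 30}

cpm_untreated = counts_per_million(untreated, library_size=10_000_000)
cpm_treated = counts_per_million(treated, library_size=10_000_000)
ratios = log2_ratio(cpm_treated, cpm_untreated)
```

In this toy case geneB has no reads in untreated cells, so its ratio is defined only because of the added constant: (3 + 1) / (0 + 1) gives a log2 ratio of 2.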

Tiling expression, data normalization, processing, and significant transcript identification

Reverse transcription and cDNA labeling were performed as previously described (Cho et al. 2009), except that Cy3 was used instead of Cy5. Microarrays were designed using chipD (Dufour et al. 2010) and contained 378,408 probes that alternate strands with 12 bp spacing (Roche-Nimblegen, Madison, WI). Hybridization and washing of microarrays were performed according to standard Nimblegen protocols (http://www.nimblegen.com). Microarrays were scanned at 532 nm using a GenePix 4000B scanner (Molecular Devices). Raw probe intensities were normalized across samples using RMA analysis implemented in the NimbleScan software package (Roche-Nimblegen, Madison, WI). Data from Δrac and MDS42 strain backgrounds were normalized separately to avoid problems arising from the lack of probe intensity in MDS42 deleted regions. Normalized biological replicates showed good agreement (r ≥ 0.90 at the probe level). Normalized probe intensities were transformed to log2 and split into strands, and biological replicates were averaged. Transcripts that were significantly upregulated in BCM-treated or mutant cells were identified using an R implementation of the CMARRT algorithm (Washburn and Gottesman 2011). First, data from untreated cells were subtracted from data from BCM-treated or mutant cells. Second, CMARRT was used to identify regions of at least three consecutive significantly upregulated probes (FDR ≤ 5%) in the subtracted data. Probe data from MDS42 deletions were discarded prior to CMARRT analysis to avoid significant transcripts that spanned deletions being called as two distinct transcripts. The significance of overlap between two genome features (e.g., significantly upregulated transcripts in BCM-treated cells and significantly upregulated transcripts in ΔnusG cells) was determined by counting the number of overlapping features before and after rotation of the positions of one set of features by one megabase. Significance was then determined using the Mann-Whitney U-test.
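The rotation control used for the overlap analysis can be sketched as below. This is a minimal illustration under stated assumptions: features are (start, end) intervals on a circular chromosome, the chromosome length constant is an assumed value for MG1655, and the function names and toy intervals are hypothetical. Only the counting of overlaps before and after rotation is shown; the subsequent Mann-Whitney U-test is not reproduced.

```python
GENOME_LENGTH = 4_639_675  # assumed length (bp) of the E. coli K-12 MG1655 chromosome


def rotate(features, shift, genome_length):
    """Shift (start, end) intervals by `shift` bp around a circular chromosome.

    A rotated interval keeps its length, so its end may run past genome_length.
    """
    rotated = []
    for start, end in features:
        new_start = (start + shift) % genome_length
        rotated.append((new_start, new_start + (end - start)))
    return rotated


def overlaps(a, b, genome_length):
    """True if intervals a and b share at least one position, allowing for
    intervals that wrap past the chromosome origin."""
    for offset in (-genome_length, 0, genome_length):
        if a[0] <= b[1] + offset and b[0] + offset <= a[1]:
            return True
    return False


def count_overlapping(set_a, set_b, genome_length):
    """Number of features in set_a that overlap at least one feature in set_b."""
    return sum(any(overlaps(a, b, genome_length) for b in set_b) for a in set_a)


# Hypothetical feature sets (e.g., transcripts called in two conditions).
set_a = [(100, 200), (500, 600), (900, 950)]
set_b = [(150, 250), (550, 580), (2000, 2100)]

observed = count_overlapping(set_a, set_b, GENOME_LENGTH)
control = count_overlapping(
    set_a, rotate(set_b, 1_000_000, GENOME_LENGTH), GENOME_LENGTH
)
```

Rotating one feature set preserves the sizes and relative spacing of its features while breaking any positional relationship to the other set, so the rotated count estimates the overlap expected by chance.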

ChIP-chip, data normalization, processing, and significant region identification

ChIP-chip was performed as previously described (Mooney et al. 2009a). Monoclonal antibody against RNAP (anti-β, NT63) was purchased from Neoclone (Madison, WI), and polyclonal antiserum against H-NS (Harlan Laboratories) was purified by adsorption with cell powder from an E. coli ∆hns strain. ChIP-chip data were normalized and averaged as previously described (Mooney et al. 2009a). H-NS ChIP-chip data shown in Fig. 4.4 were smoothed using two rounds of sliding-window averaging over 300 bp. Significant increases in RNAP occupancy in ΔnusG and ΔnusA* cells were identified using CMARRT as previously described for BCM (Peters et al. 2009), except that probe data from MDS42 deletions were discarded prior to CMARRT analysis to avoid regions of increased RNAP occupancy that spanned deletions being called as two significant regions. Significant regions of H-NS occupancy were defined using CMARRT as regions of at least three consecutive probes that were significantly above background (FDR ≤ 5%).
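The region-calling criterion (at least three consecutive probes at FDR ≤ 5%) can be illustrated with a short run-finding routine. This sketch does not implement CMARRT, which accounts for correlation between neighboring probes when assigning significance; it assumes that per-probe FDR values are already available, and the function name and example values are hypothetical.

```python
def significant_runs(fdr_values, threshold=0.05, min_probes=3):
    """Return (first_index, last_index) for every run of at least
    `min_probes` consecutive probes with FDR <= threshold."""
    runs = []
    run_start = None
    for i, fdr in enumerate(fdr_values):
        if fdr <= threshold:
            if run_start is None:
                run_start = i  # a new run begins at this probe
        else:
            if run_start is not None and i - run_start >= min_probes:
                runs.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last probe.
    if run_start is not None and len(fdr_values) - run_start >= min_probes:
        runs.append((run_start, len(fdr_values) - 1))
    return runs


# Hypothetical per-probe FDR values, ordered by chromosome coordinate.
fdr = [0.5, 0.01, 0.02, 0.04, 0.3, 0.01, 0.01, 0.9, 0.03, 0.02, 0.01, 0.05]
regions = significant_runs(fdr)
```

Here probes 1-3 and 8-11 form regions, whereas the two-probe run at positions 5-6 is too short to be called.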

Bicyclomycin Significant Transcript (BST) annotation

BST annotation proceeded by a two-step process. First, genes were computationally assigned to BSTs. If a BST was located less than 600 bp from the nearest upstream gene on the same strand, then the BST was assigned to the end of that gene. The 600 bp distance was arbitrarily chosen, and is approximately two times the median distance of the entire BST dataset from the nearest upstream gene on the same strand (591 bp). BSTs that were located more than 600 bp from the nearest upstream gene on the same strand were assigned to the gene that overlapped the first significant probe in the BST. If this gene was on the same strand, the BST was annotated as sense; if the gene was on the opposite strand, the BST was annotated as antisense. Second, a small number of BSTs (< 10%) were reannotated after visual inspection.
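The computational assignment step can be sketched as a simple rule-based function. This is an illustration under assumptions, not the original annotation code: the `Gene` record, the function name `annotate_bst`, the "gene end" and "unassigned" labels, and the toy coordinates are all hypothetical, and the BST is represented only by the coordinate and strand of its first significant probe.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # leftmost chromosome coordinate
    end: int     # rightmost chromosome coordinate
    strand: str  # "+" or "-"


def annotate_bst(bst_start, bst_strand, genes, max_distance=600):
    """Return (gene_name, label) for a BST whose first significant probe
    lies at bst_start on bst_strand."""
    # Rule 1: nearest upstream gene on the same strand, within max_distance.
    best = None
    for g in genes:
        if g.strand != bst_strand:
            continue
        # Distance from the 3' end of the gene to the 5' end of the BST,
        # measured in the direction of transcription.
        distance = bst_start - g.end if bst_strand == "+" else g.start - bst_start
        if 0 <= distance < max_distance and (best is None or distance < best[0]):
            best = (distance, g)
    if best is not None:
        return best[1].name, "gene end"
    # Rule 2: gene overlapping the first significant probe.
    for g in genes:
        if g.start <= bst_start <= g.end:
            return g.name, "sense" if g.strand == bst_strand else "antisense"
    return None, "unassigned"


# Hypothetical genes.
genes = [Gene("geneA", 1000, 2000, "+"), Gene("geneB", 5000, 6000, "-")]

end_of_gene = annotate_bst(2300, "+", genes)
antisense = annotate_bst(5500, "+", genes)
sense = annotate_bst(5500, "-", genes)
```

Cases that fit neither rule are returned as unassigned in this sketch; in practice such cases would be candidates for the visual inspection step.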

Analysis of H-NS ChIP-chip signal near BSTs

Normalized H-NS ChIP-chip signal was averaged in 100-bp bins at specified distances from the 5′ probe coordinate (5′ end) of each BST (all BSTs) or from the 5′ probe coordinate of each BST shifted by one megabase to generate a random dataset. The median value of H-NS ChIP-chip signal for all bins (1264 bins for all BSTs) at a specified distance was plotted (Fig. 4.4A).
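The binning and median calculation can be sketched as follows. This is a simplified illustration: probes are hypothetical (coordinate, signal) pairs, strand orientation of the BST is ignored, and the function names are assumptions rather than names from the original analysis. The random control would be obtained by passing anchors shifted by one megabase.

```python
from statistics import mean, median


def binned_signal(probes, anchor, offset, bin_width=100):
    """Mean signal of probes in [anchor + offset, anchor + offset + bin_width).

    Returns None if no probe falls in the bin.
    """
    low = anchor + offset
    high = low + bin_width
    values = [signal for position, signal in probes if low <= position < high]
    return mean(values) if values else None


def median_profile(probes, anchors, offsets, bin_width=100):
    """For each offset, the median across anchors of the binned mean signal."""
    profile = {}
    for offset in offsets:
        bins = [binned_signal(probes, a, offset, bin_width) for a in anchors]
        bins = [b for b in bins if b is not None]
        profile[offset] = median(bins) if bins else None
    return profile


# Hypothetical normalized ChIP-chip signal and two BST 5' ends.
probes = [(1000, 1.0), (1050, 3.0), (1120, 5.0),
          (2010, 2.0), (2060, 4.0), (2150, 8.0)]
anchors = [1000, 2000]

profile = median_profile(probes, anchors, offsets=[0, 100])
```

Taking the median across BSTs at each distance limits the influence of a few terminators with unusually strong signal.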

NusG-dependence sequence analysis

The median log2 probe intensity for the first 300 bp after the 5′ end of the BST was calculated for BCM-treated and ΔnusG tiling expression data. The ΔnusG/BCM ratio of median probe intensities was used to sort terminators for NusG effects. The 300 most NusG-dependent terminators (ΔnusG/BCM ratio of 0.76 or greater) and the 300 most NusG-independent terminators (ΔnusG/BCM ratio of less than 0.002) were chosen for sequence analysis.
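The ratio used to sort terminators can be sketched as below. This is a minimal illustration with hypothetical data and names: probes are (coordinate, log2 intensity) pairs from the subtracted data for one strand, strand orientation is ignored, and the window is taken as the 300 bp starting at the 5′ end of each BST.

```python
from statistics import median


def median_intensity(probes, bst_start, window=300):
    """Median log2 intensity of probes in [bst_start, bst_start + window)."""
    values = [v for position, v in probes if bst_start <= position < bst_start + window]
    return median(values)


def nusg_bcm_ratio(nusg_probes, bcm_probes, bst_start, window=300):
    """Ratio of the median intensity in the nusG deletion data to that in BCM-treated data."""
    return (median_intensity(nusg_probes, bst_start, window)
            / median_intensity(bcm_probes, bst_start, window))


# Hypothetical probe data for two terminators (5' ends at 100 and 1100).
bcm = [(100, 2.0), (200, 4.0), (300, 6.0), (1100, 2.0), (1200, 4.0), (1300, 6.0)]
nusg = [(100, 1.5), (200, 3.0), (300, 5.0), (1100, 0.1), (1200, 0.4), (1300, 0.5)]
bst_starts = {"terminator1": 100, "terminator2": 1100}

ratios = {name: nusg_bcm_ratio(nusg, bcm, start) for name, start in bst_starts.items()}
ranked = sorted(ratios, key=ratios.get, reverse=True)  # most NusG-dependent first
```

A ratio near 1 indicates that deleting nusG causes nearly as much readthrough as inhibiting Rho (NusG-dependent), whereas a ratio near 0 indicates little effect of the deletion (NusG-independent).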

Identification of REP element motifs

DNA sequences from 300 bp upstream to the 5′ end of each BST were submitted to a web-based implementation of the MEME algorithm (meme.sdsc.edu/meme; Bailey and Elkan 1994) for motif identification using the default settings. The genomic locations and sequences of the motifs identified by MEME (Fig. 4.S3) matched those of REP elements in the EcoCyc database (www.ecocyc.org; Keseler et al. 2011).


Clustering and heatmap analysis

Centroid-linkage clustering with Euclidean distance as the distance metric was performed using the program Cluster 3.0 (Eisen et al. 1998). Heatmaps of clustered data were visualized in the program Java Treeview (Figs. 1C, 4A; Saldanha 2004).


Acknowledgements

We thank members of the Landick lab, A. B. Banta (UW-Madison), and M. E. Gottesman (Columbia) for critical reading of the manuscript; B. K. Cho and B. Ø. Palsson (UCSD) for the tiling expression microarray protocol; D. H. Keating (Great Lakes Bioenergy Research Center), D. L. Court (NCI-Frederick), and M. E. Gottesman (Columbia) for strains; the US DOE Joint Genome Institute (Walnut Creek, CA) for RNAseq library preparation and sequencing; and H. Yan (Great Lakes Bioenergy Research Center) for RNAseq read mapping. This work was supported by the National Institutes of Health (GM38660) and for RNAseq work by the US DOE Great Lakes Bioenergy Research Center (DOE Office of Science BER DE-FC02-07ER64494) and the US DOE Joint Genome Institute (DOE Office of Science Contract No. DE-AC02-05CH11231).


Supplementary Figures

Figure 4.S1. Increased antisense transcription due to Rho inhibition does not affect sense transcription. (A) Correlation between the effects of BCM treatment on sense strand transcription for all genes versus antisense strand transcription (r = -0.09). (B) Correlation between the effects of BCM treatment on the antisense strand of antisense BSTs and the sense strand of antisense BSTs (r = -0.01).


Figure 4.S2. The combination of rho15(Ts) and Δhns is synthetic lethal. The rho+ lacZ+ unstable plasmid is rapidly lost from wild-type and Δhns cells, resulting in blue (blue arrow) and white (white arrow) sectored colonies and white colonies on X-gal-containing LB plates (rho+ hns+ and rho+ Δhns, respectively). The unstable plasmid is also lost from rho15(Ts) cells, albeit at a lower frequency, forming large blue colonies and small white colonies (rho15(Ts) hns+). rho15(Ts) and Δhns constitute a synthetic lethal pair; thus, rho15(Ts) Δhns cells retain the unstable plasmid, forming all blue colonies (rho15(Ts) Δhns). The small colony phenotype of rho15(Ts) Δhns, even when complemented by the rho+ plasmid, is likely due to mixed hexamer formation between wild-type Rho and Rho15 monomers.


Figure 4.S3. Motifs found near Rho-dependent terminators correspond to REP elements. (A) Three statistically significant motifs identified by MEME that correspond to REP elements at the ends of genes near class I Rho-dependent terminators. (B) Predicted hairpin structure of a REP element PU based on Mfold analysis (Zuker 2003) of the PU consensus sequence (Bachellier et al. 1994). (C) Histogram of the occurrence of REP elements and median C/G ratios near Rho-dependent terminators (BSTs). Colors and symbols are as in Fig. 6.


Figure 4.S4. Confirmation of the ΔnusA* allele. Tiling expression and ChIP-chip input DNA lacked signal intensity for nusA DNA deleted in the ΔnusA* allele, confirming that the strain used in our study was ΔnusA*.


References

Adhya S, Gottesman M, De Crombrugghe B. 1974. Termination and antitermination in transcription: control of gene expression. Basic Life Sci 3: 213-221.

Arigo JT, Eyler DE, Carroll KL, Corden JL. 2006. Termination of cryptic unstable transcripts is directed by yeast RNA-binding proteins Nrd1 and Nab3. Mol Cell 23: 841-851.

Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006.0008.

Bailey TL, Elkan C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol 2: 28-36.

Banerjee S, Chalissery J, Bandey I, Sen R. 2006. Rho-dependent transcription termination: more questions than answers. J Microbiol 44: 11-22.

Bernhardt TG, de Boer PA. 2004. Screening for synthetic lethal mutants in Escherichia coli and identification of EnvC (YibP) as a periplasmic septal ring factor with murein hydrolase activity. Mol Microbiol 52: 1255-1269.

Bon M, McGowan SJ, Cook PR. 2006. Many expressed genes in bacteria and yeast are transcribed only once per cell cycle. FASEB J 20: 1721-1723.

Boudvillain M, Nollmann M, Margeat E. 2010. Keeping up to speed with the transcription termination factor Rho motor. Transcr 1: 70-75.

Brow DA. 2011. Sen-sing RNA terminators. Mol Cell 42: 717-718.

Burmann BM, Schweimer K, Luo X, Wahl MC, Stitt BL, Gottesman ME, Rosch P. 2010. A NusE:NusG complex links transcription and translation. Science 328: 501-504.


Burns CM, Richardson JP. 1995. NusG is required to overcome a kinetic limitation to Rho function at an intragenic terminator. Proc Natl Acad Sci U S A 92: 4738-4742.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Carroll KL, Ghirlando R, Ames JM, Corden JL. 2007. Interaction of yeast RNA-binding proteins Nrd1 and Nab3 with RNA polymerase II terminator elements. RNA 13: 361-373.

Carrozza MJ, Li B, Florens L, Suganuma T, Swanson SK, Lee KK, Shia WJ, Anderson S, Yates J, Washburn MP et al. 2005. Histone H3 methylation by Set2 directs deacetylation of coding regions by Rpd3S to suppress spurious intragenic transcription. Cell 123: 581-592.

Chalissery J, Muteeb G, Kalarickal NC, Mohan S, Jisha V, Sen R. 2011. Interaction surface of the transcription terminator Rho required to form a complex with the C-terminal domain of the antiterminator NusG. J Mol Biol 405: 49-64.

Chen CY, Richardson JP. 1987. Sequence elements essential for rho-dependent transcription termination at lambda tR1. J Biol Chem 262: 11292-11299.

Cho BK, Zengler K, Qiu Y, Park YS, Knight EM, Barrett CL, Gao Y, Palsson BO. 2009. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol 27: 1043-1049.

Das A, Court D, Adhya S. 1976. Isolation and characterization of conditional lethal mutants of Escherichia coli defective in transcription termination factor rho. Proc Natl Acad Sci U S A 73: 1959-1963.

Dorman CJ. 2009. Nucleoid-associated proteins and bacterial physiology. Adv Appl Microbiol 67: 47-64.


Dornenburg JE, Devita AM, Palumbo MJ, Wade JT. 2010. Widespread antisense transcription in Escherichia coli. MBio 1.

Dufour YS, Wesenberg GE, Tritt AJ, Glasner JD, Perna NT, Mitchell JC, Donohue TJ. 2010. chipD: a web tool to design oligonucleotide probes for high-density tiling arrays. Nucleic Acids Res 38: W321-325.

Ederth J, Mooney RA, Isaksson LA, Landick R. 2006. Functional interplay between the jaw domain of bacterial RNA polymerase and allele-specific residues in the product RNA-binding pocket. J Mol Biol 356: 1163-1179.

Epshtein V, Dutta D, Wade J, Nudler E. 2010. An allosteric mechanism of Rho-dependent transcription termination. Nature 463: 245-249.

Espeli O, Moulin L, Boccard F. 2001. Transcription attenuation associated with bacterial repetitive extragenic BIME elements. J Mol Biol 314: 375-386.

Fang FC, Rimsky S. 2008. New insights into transcriptional regulation by H-NS. Curr Opin Microbiol 11: 113-120.

Georg J, Hess WR. 2011. cis-antisense RNA, another level of gene regulation in bacteria. Microbiol Mol Biol Rev 75: 286-300.

Grainger DC, Busby SJ. 2008. Global regulators of transcription in Escherichia coli: mechanisms of action and methods for study. Adv Appl Microbiol 65: 93-113.

Grainger DC, Hurd D, Goldberg MD, Busby SJ. 2006. Association of nucleoid proteins with coding and non-coding segments of the Escherichia coli genome. Nucleic Acids Res 34: 4642-4652.

Guterman SK, Howitt CL. 1979. Rifampicin supersensitivity of rho strains of E. coli, and suppression by sur mutation. Mol Gen Genet 169: 27-34.


Hirtreiter A, Damsma GE, Cheung AC, Klose D, Grohmann D, Vojnic E, Martin AC, Cramer P, Werner F. 2010. Spt4/5 stimulates transcription elongation through the RNA polymerase clamp coiled-coil motif. Nucleic Acids Res 38: 4040-4051.

Huang da W, Sherman BT, Lempicki RA. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4: 44-57.

Irnov I, Sharma CM, Vogel J, Winkler WC. 2010. Identification of regulatory RNAs in Bacillus subtilis. Nucleic Acids Res 38: 6637-6651.

Kahramanoglou C, Seshasayee AS, Prieto AI, Ibberson D, Schmidt S, Zimmermann J, Benes V, Fraser GM, Luscombe NM. 2011. Direct and indirect effects of H-NS and Fis on global gene expression control in Escherichia coli. Nucleic Acids Res 39: 2073-2091.

Kalyani BS, Muteeb G, Qayyum MZ, Sen R. 2011. Interaction with the nascent RNA is a prerequisite for the recruitment of Rho to the transcription elongation complex in vitro. J Mol Biol 413: 548-560.

Keseler IM, Collado-Vides J, Santos-Zavaleta A, Peralta-Gil M, Gama-Castro S, Muniz-Rascado L, Bonavides-Martinez C, Paley S, Krummenacker M, Altman T et al. 2011. EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res 39: D583-590.

Khodursky AB, Bernstein JA, Peter BJ, Rhodius V, Wendisch VF, Zimmer DP. 2003. Escherichia coli spotted double-strand DNA microarrays: RNA extraction, labeling, hybridization, quality control, and data management. Methods Mol Biol 224: 61-78.

Koop AH, Hartley ME, Bourgeois S. 1987. A low-copy-number vector utilizing beta-galactosidase for the analysis of gene control elements. Gene 52: 245-256.

Krinke L, Wulff DL. 1987. OOP RNA, produced from multicopy plasmids, inhibits lambda cII gene expression through an RNase III-dependent mechanism. Genes Dev 1: 1005-1013.


Lepore N, Lafontaine DL. 2011. A functional interface at the rDNA connects rRNA synthesis, pre-rRNA processing and nucleolar surveillance in budding yeast. PLoS One 6: e24962.

Li J, Mason SW, Greenblatt J. 1993. Elongation factor NusG interacts with termination factor rho to regulate termination and antitermination of transcription. Genes Dev 7: 161-172.

Li R, Li Y, Kristiansen K, Wang J. 2008. SOAP: short oligonucleotide alignment program. Bioinformatics 24: 713-714.

Mohanty BK, Kushner SR. 2007. Ribonuclease P processes polycistronic tRNA transcripts in Escherichia coli independent of ribonuclease E. Nucleic Acids Res 35: 7614-7625.

Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R. 2009a. Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33: 97-108.

Mooney RA, Schweimer K, Rosch P, Gottesman M, Landick R. 2009b. Two structurally independent domains of E. coli NusG create regulatory plasticity via distinct interactions with RNA polymerase and regulators. J Mol Biol 391: 341-358.

Morse DE, Guertin M. 1972. Amber suA mutations which relieve polarity. J Mol Biol 63: 605-608.

Neidhardt FC, Bloch PL, Smith DF. 1974. Culture medium for enterobacteria. J Bacteriol 119: 736-747.

Nicolas P, Mader U, Dervyn E, Rochat T, Leduc A, Pigeonneau N, Bidnenko E, Marchadier E, Hoebeke M, Aymerich S et al. 2012. Condition-dependent transcriptome reveals high-level regulatory architecture in Bacillus subtilis. Science 335: 1103-1106.


Opdyke JA, Kang JG, Storz G. 2004. GadY, a small-RNA regulator of acid response genes in Escherichia coli. J Bacteriol 186: 6698-6705.

Oshima T, Ishikawa S, Kurokawa K, Aiba H, Ogasawara N. 2006. Escherichia coli histone-like protein H-NS preferentially binds to horizontally acquired DNA in association with RNA polymerase. DNA Res 13: 141-153.

Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A. 2009. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res 37: e123.

Pasman Z, von Hippel PH. 2000. Regulation of rho-dependent transcription termination by NusG is specific to the Escherichia coli elongation complex. Biochemistry 39: 5573-5585.

Perocchi F, Xu Z, Clauder-Munster S, Steinmetz LM. 2007. Antisense artifacts in transcriptome microarray experiments are resolved by actinomycin D. Nucleic Acids Res 35: e128.

Peters JM, Mooney RA, Kuan PF, Rowland JL, Keles S, Landick R. 2009. Rho directs widespread termination of intragenic and stable RNA transcription. Proc Natl Acad Sci U S A 106: 15406-15411.

Peters JM, Vangeloff AD, Landick R. 2011. Bacterial transcription terminators: the RNA 3'-end chronicles. J Mol Biol 412: 793-813.

Posfai G, Plunkett G, 3rd, Feher T, Frisch D, Keil GM, Umenhoffer K, Kolisnychenko V, Stahl B, Sharma SS, de Arruda M et al. 2006. Emergent properties of reduced-genome Escherichia coli. Science 312: 1044-1046.

Proshkin S, Rahmouni AR, Mironov A, Nudler E. 2010. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science 328: 504-508.


Rasmussen S, Nielsen HB, Jarmer H. 2009. The transcriptionally active regions in the genome of Bacillus subtilis. Mol Microbiol 73: 1043-1057.

Rhodius VA, Suh WC, Nonaka G, West J, Gross CA. 2006. Conserved and variable functions of the σE stress response in related genomes. PLoS Biol 4: e2.

Roberts JW. 1969. Termination factor for RNA synthesis. Nature 224: 1168-1174.

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140.

Santisteban MS, Hang M, Smith MM. 2011. Histone variant H2A.Z and RNA polymerase II transcription elongation. Mol Cell Biol 31: 1848-1860.

Saxena S, Gowrishankar J. 2011a. Compromised factor-dependent transcription termination in a nusA mutant of Escherichia coli: spectrum of termination efficiencies generated by perturbations of Rho, NusG, NusA, and H-NS family proteins. J Bacteriol 193: 3842-3850.

Saxena S, Gowrishankar J. 2011b. Modulation of Rho-dependent transcription termination in Escherichia coli by the H-NS family of proteins. J Bacteriol 193: 3832-3841.

Shinhara A, Matsui M, Hiraoka K, Nomura W, Hirano R, Nakahigashi K, Tomita M, Mori H, Kanai A. 2011. Deep sequencing reveals as-yet-undiscovered small RNAs in Escherichia coli. BMC Genomics 12: 428.

St Onge RP, Mani R, Oh J, Proctor M, Fung E, Davis RW, Nislow C, Roth FP, Giaever G. 2007. Systematic pathway analysis using high-resolution fitness profiling of combinatorial gene deletions. Nat Genet 39: 199-206.

Stern MJ, Ames GF, Smith NH, Robinson EC, Higgins CF. 1984. Repetitive extragenic palindromic sequences: a major component of the bacterial genome. Cell 37: 1015-1026.

Struhl K. 2007. Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nat Struct Mol Biol 14: 103-105.

Sullivan SL, Gottesman ME. 1992. Requirement for E. coli NusG protein in factor-dependent transcription termination. Cell 68: 989-994.

Terzi N, Churchman LS, Vasiljeva L, Weissman J, Buratowski S. 2011. H3K4 trimethylation by Set1 promotes efficient termination by the Nrd1-Nab3-Sen1 pathway. Mol Cell Biol 31: 3569-3583.

Tran L, van Baarsel JA, Washburn RS, Gottesman ME, Miller JH. 2011. Single-gene deletion mutants of Escherichia coli with altered sensitivity to bicyclomycin, an inhibitor of transcription termination factor Rho. J Bacteriol 193: 2229-2235.

Vasiljeva L, Buratowski S. 2006. Nrd1 interacts with the nuclear exosome for 3' processing of RNA polymerase II transcripts. Mol Cell 21: 239-248.

Vasiljeva L, Kim M, Mutschler H, Buratowski S, Meinhart A. 2008. The Nrd1-Nab3-Sen1 termination complex interacts with the Ser5-phosphorylated RNA polymerase II C-terminal domain. Nat Struct Mol Biol 15: 795-804.

Ward DF, Gottesman ME. 1981. The nus mutations affect transcription termination in Escherichia coli. Nature 292: 212-215.

Washburn RS, Gottesman ME. 2011. Transcription termination maintains chromosome integrity. Proc Natl Acad Sci U S A 108: 792-797.


Williams RM, Rimsky S, Buc H. 1996. Probing the structure, function, and interactions of the Escherichia coli H-NS and StpA proteins by using dominant negative derivatives. J Bacteriol 178: 4335-4343.

Zalatan F, Platt T. 1992. Effects of decreased cytosine content on rho interaction with the rho-dependent terminator trp t' in Escherichia coli. J Biol Chem 267: 19082-19088.

Zhang Z, Fu J, Gilmour DS. 2005. CTD-dependent dismantling of the RNA polymerase II elongation complex by the pre-mRNA 3'-end processing factor, Pcf11. Genes Dev 19: 1572-1580.

Zheng C, Friedman DI. 1994. Reduced Rho-dependent transcription termination permits NusA-independent growth of Escherichia coli. Proc Natl Acad Sci U S A 91: 7543-7547.


Chapter 5

Conclusions and Future Directions


Conclusions

The overarching goal of my thesis research was to understand the roles of Rho-dependent transcription termination in vivo. Our initial studies defined the distribution of Rho and RNA polymerase (RNAP) on DNA using chromatin immunoprecipitation and microarrays (ChIP-chip; Chapter 2; Mooney et al. 2009a). We determined that the distributions of Rho and RNAP are very similar, suggesting that Rho can interact with elongation complexes (ECs) soon after promoter escape, and throughout elongation. The source of Rho-DNA crosslinking remains unknown, and may be indirect (see Future Directions). We also showed that Rho and NusA are associated with RNAP ChIP-chip peaks near the 5′ ends of genes, indicating that RNAP peaks represent ECs, rather than RNAP molecules trapped in initiation at the promoter (Reppas et al. 2006).

Although Rho and RNAP peaks are coincident, inhibition of Rho termination did not shift the distribution of RNAP peaks further into the gene body. Thus, Rho termination does not appear to be the cause of RNAP ChIP-chip peaks.

Our ChIP-chip studies of Rho and RNAP demonstrated that Rho could associate with ECs throughout transcription, but the locations at which Rho acted to terminate transcription remained unknown. To identify sites of Rho termination, we initially determined the distribution of RNAP using ChIP-chip in the presence and absence of the Rho inhibitor bicyclomycin (BCM; Chapter 3; (Peters et al. 2009). We were able to find ~ 200 locations at which Rho inhibition caused a downstream shift in the distribution of RNAP indicative of terminator readthrough. Of the ~200 Rho-dependent terminators identified using this method, approximately half were found downstream of genes and

262 the other half were found within genes. We discovered Rho-dependent terminators at the ends of stable RNA genes such as transfer RNAs (tRNAs) and small RNAs

(sRNAs), including the sroG riboswitch. This finding is particularly significant because

Rho was thought not to bind highly structured RNAs such as stable RNAs. Whether tRNAs and sRNAs act as Rho aptamers, or Rho binds to unstructured portions of stable

RNAs prior to processing is unknown (see Future Directions). The set of Rho- dependent terminators found within genes included 24 novel antisense transcripts judged by a shift in the distribution of RNAP after Rho inhibition that was in the opposite direction of the annotated gene. In theory, non-coding antisense transcripts make ideal targets for Rho termination, as the ribosome is not present to prevent Rho binding.

Finally, we confirmed that Rho termination is associated with horizontally-transferred DNA (i.e., “foreign” DNA) in E. coli (Cardinale et al. 2008). However, our results suggest that the largest effects of Rho inhibition on horizontally-transferred DNA occur due to terminator readthrough at tRNA genes that are positioned at the junction between E. coli DNA and foreign DNA elements, rather than within those elements (Cardinale et al. 2008). It is currently unknown if Rho termination is associated with horizontally-transferred DNA in other bacteria (see Future Directions).

Our ChIP-chip investigations into Rho termination suggested that termination of antisense transcription may be a major function of Rho in the cell. However, our ability to detect antisense transcripts using ChIP-chip was limited by its lack of strand specificity and limited sensitivity. To investigate Rho termination more sensitively and in a strand-specific manner, we used a combination of tiling expression microarrays and deep sequencing of RNAs (RNA-seq) to analyze transcripts from Rho-inhibited cells (Chapter 4; Peters et al. 2012). We found that the vast majority of Rho-dependent terminators (>80%) prevented antisense transcription either at the ends of genes or within genes. In addition, we identified hundreds of novel transcripts within genes that, in some cases, could only be detected in Rho-inhibited conditions. The Rho co-factor NusG also prevented antisense transcription, but only at a subset of Rho-dependent terminators (rather than at all terminators; Cardinale et al. 2008). NusG-dependent terminators were characterized by a low C/G ratio, which would likely impede Rho loading onto the nascent RNA. How sequences at the termination site affect the mechanism of NusG-dependent termination enhancement remains an open question (see Future Directions).
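The strand logic behind calling a transcript "antisense" can be illustrated with a minimal sketch. This is not the analysis pipeline used in Chapter 4; the `Feature` class, the `classify` function, and all coordinates are hypothetical and serve only to show how a strand-specific signal is compared with the annotated gene it overlaps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval with 1-based inclusive coordinates (hypothetical type)."""
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Feature, b: Feature) -> bool:
    """Return True if the two intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


def classify(transcript: Feature, genes: list[Feature]) -> str:
    """Label a transcript relative to the annotated genes it overlaps.

    'sense'      : overlaps at least one gene on the same strand
    'antisense'  : overlaps only genes on the opposite strand
    'intergenic' : overlaps no annotated gene
    """
    hits = [g for g in genes if overlaps(transcript, g)]
    if not hits:
        return "intergenic"
    if any(g.strand == transcript.strand for g in hits):
        return "sense"
    return "antisense"


# Hypothetical example: one gene annotated on the + strand.
genes = [Feature(100, 500, "+")]
print(classify(Feature(200, 300, "-"), genes))  # transcript on the opposite strand
print(classify(Feature(200, 300, "+"), genes))  # transcript on the same strand
print(classify(Feature(600, 700, "-"), genes))  # transcript outside any gene
```

A method without strand information, such as ChIP-chip, cannot distinguish the first two cases directly, which is why strand-specific tiling arrays and RNA-seq were needed.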

In contrast to previous reports (Cardinale et al. 2008), a large deletion within the nusA gene had no effect on Rho termination. The essential roles of NusA in transcription elongation and termination (and potentially other processes) in vivo are poorly understood (see Future Directions and Appendix E). We also found that the nucleoid structuring protein H-NS generally bound near Rho-dependent terminators, and that mutations in rho and hns show negative epistasis. The underlying mechanism behind rho and hns synergy has not been shown definitively (see Future Directions), although we favor a model where H-NS acts as a roadblock to RNAP, thus slowing transcription elongation and providing Rho with a larger kinetic window in which to terminate. Finally, we show that repetitive extragenic palindromic elements (REP elements; Stern et al. 1984) are generally associated with Rho-dependent terminators found at the ends of mRNAs. We suggest that these elements stabilize Rho-terminated messages by forming hairpin structures that impede the processivity of 3′→5′ exonucleases.


Future Directions

Rho crosslinking and association with ECs

Rho ChIP-chip experiments show that Rho and RNAP have similar distributions on DNA (Mooney et al. 2009a). The source of RNAP-DNA crosslinking is obvious; RNAP crosslinks directly to template DNA that passes through the enzyme. However, the physical basis for Rho-DNA crosslinking is less clear. Conceivably, Rho-DNA crosslinks could be direct and result from Rho helicase activity on persistent RNA/DNA hybrids that form when untranslated nascent RNAs strand-invade into template DNA (Harinarayanan and Gowrishankar 2003). Rho could also crosslink to DNA indirectly through the nascent RNA, which would then crosslink to DNA through RNAP. The inclusion of RNase A in our ChIP-chip experiments would seem to negate this possibility; however, the distance between Rho and RNAP may be short enough to block the access of RNase A to the intervening nascent RNA.

One of the more intriguing possible explanations for Rho-DNA crosslinking is that Rho is directly bound to RNAP throughout elongation. Whether RNAP and Rho make a specific protein-protein interaction that alters the properties of RNAP (rather than a simple steric interaction) is currently a source of contention in the termination field (Epshtein et al. 2010; Kalyani et al. 2011; reviewed in Peters et al. 2011). Epshtein et al. propose that Rho is stably bound to RNAP in elongation and even initiation complexes (Epshtein et al. 2010). Unfortunately, the in vitro Rho-RNAP association experiments performed by Epshtein et al. lacked key negative controls and were only performed under one experimental condition (i.e., the concentration of proteins or salts was not varied; Epshtein et al. 2010). Kalyani et al. (2011) could not reproduce the Rho-RNAP association seen by Epshtein et al., and, thus, concluded that Rho can only associate with RNAP through the nascent RNA. Both groups used hexahistidine-tagged Rho proteins that are known to perturb Rho function in vivo (Balasubramanian and Stitt 2010). Our Rho ChIP-chip results support, but by no means confirm, the idea that Rho binds specifically to RNAP throughout elongation (Mooney et al. 2009a). An alternate interpretation of the Rho-DNA crosslinks observed in ChIP-chip experiments is that Rho, while translocating on the nascent RNA, makes transient non-specific contacts with RNAP that are trapped by crosslinking. Unpublished results from our lab disfavor the idea of Rho binding to initiation complexes in vivo, as cells treated with the antibiotic rifampicin (which locks RNAP at promoters) show a loss of Rho ChIP-chip signal (Jeffrey A. Grass and Rachel A. Mooney, unpublished). Future ChIP-chip experiments that vary the concentration of formaldehyde used for crosslinking, or that use site-specific crosslinkers incorporated into the nascent RNA, RNAP, or Rho, together with experiments using Rho variants that are defective for RNA binding or translocation, may help clarify the physical basis for Rho-DNA crosslinking.

Rho binding to structured RNAs

Our finding that Rho terminates stable RNAs such as tRNAs and sRNAs opens the possibility that highly structured RNAs may serve as aptamers for Rho binding (Peters et al. 2009). Recently, Hollands et al. reported that Rho is involved in terminating transcription downstream of the mgtA riboswitch in Salmonella enterica serovar Typhimurium (Hollands et al. 2012). Like other riboswitches, the mgtA riboswitch adopts alternate RNA conformations in the presence and absence of ligand (in this case, Mg2+); the ligand-bound form of the riboswitch promotes transcription termination, and the unbound form allows terminator readthrough into downstream genes (the mgtA gene encodes a Mg2+ transporter; Cromie et al. 2006). In the presence of excess Mg2+, the mgtA riboswitch adopts a conformation that is compatible with Rho binding and termination (Hollands et al. 2012). Although the Mg2+-bound riboswitch remains structured overall, a C-rich, single-stranded portion of the RNA is exposed that is apparently sufficient for Rho binding. Thus, structured RNAs, such as riboswitches, can act as Rho aptamers.

Testing structured RNAs, such as tRNAs and sRNAs, for binding to Rho will require carefully designed in vitro experiments. Rho is unlikely to bind folded, mature (i.e., nuclease-processed) tRNAs and sRNAs (Chae et al. 2011). Instead, Rho may associate with structured RNAs co-transcriptionally, interacting with partially folded intermediates rather than the final folded form of the RNA. Using translocation-defective Rho variants (Chalissery et al. 2007) in purified transcription reactions may allow detection of stable RNA complexes with Rho by footprinting or other methods.

Rho-dependent termination in other bacteria

Little is known about Rho-dependent termination outside of a handful of model organisms. Genome-scale datasets involving Rho termination are only available for two species: E. coli (Cardinale et al. 2008; Peters et al. 2009; Peters et al. 2012) and Bacillus subtilis (Nicolas et al. 2012). E. coli, a Gram-negative bacterium, and B. subtilis, a Gram-positive bacterium, are distantly related, but Rho proteins from the two species are ~60% identical (Quirk et al. 1993). Despite similarities at the amino acid level, the relative importance of Rho to cell viability is considerably different. rho is an essential gene in E. coli, but is dispensable in B. subtilis, with only a modest effect on growth in rich medium (Quirk et al. 1993). The number of experimentally-determined Rho-dependent terminators in E. coli versus B. subtilis differs by nearly an order of magnitude (1264 versus 174, respectively; Nicolas et al. 2012; Peters et al. 2012). Also, Rho in B. subtilis is present at <5% of the level found in E. coli relative to the total protein content of the cell (Ingham et al. 1999). Interestingly, Rho retains its role in preventing antisense transcription in B. subtilis (Nicolas et al. 2012), although on a much more limited scale. Ultimately, additional genome-wide experiments coupling loss of Rho function with tiling microarrays or RNA sequencing will determine if Rho is generally more utilized in Gram-negatives than in Gram-positives.

It is also unknown if Rho-dependent terminators are associated with horizontally-transferred DNA in bacteria outside of E. coli and B. subtilis (Cardinale et al. 2008; Peters et al. 2009; Nicolas et al. 2012; Peters et al. 2012). It is crucial to determine Rho termination sites experimentally within or adjacent to foreign DNA, rather than to assign a Rho-dependent terminator to the end of a gene at which computational prediction of an intrinsic terminator has failed. Horizontally-transferred DNA typically contains long operons in which termination does not occur at the ends of most genes (e.g., in prophages), and may contain non-canonical intrinsic termination structures that are not adequately predicted by computational algorithms. Further, bacteria outside of E. coli may contain novel termination factors or operate using unexpected mechanisms (Ingham et al. 1995; Unniraman et al. 2001; Hosid and Bolshoy 2004).

Mechanism of termination enhancement by NusG

The mechanistic basis for our finding that NusG-dependent terminators have lower C/G ratios than those that are NusG-independent is unclear (Peters et al. 2012). The simplest explanation is that NusG assists or stabilizes Rho in binding to the nascent RNA by tethering Rho near the RNA exit channel of RNAP. However, in vitro transcription experiments performed on a template carrying both NusG-dependent and NusG-independent terminators showed that NusG has no effect on the half-maximal concentration of Rho needed to terminate transcription (Burns and Richardson 1995). This result strongly suggests that NusG does not enhance Rho binding to RNA. This finding is not conclusive, however, as only one NusG-dependent terminator was investigated (tiZ1 within the lacZ coding sequence), and it is unknown how this terminator responds to NusG in vivo (Burns and Richardson 1995). The tiZ1 terminator was not detected in our experiments because lacZ expression was not induced, and because tiZ1 only functions when transcription and translation are uncoupled by a premature stop codon in lacZ. An additional possibility is that NusG alters the interaction between Rho and RNA in an undefined way that enhances Rho release activity but not RNA binding.
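For readers unfamiliar with the metric, the C/G ratio of a sequence is simply the number of C residues divided by the number of G residues. The sketch below is illustrative only: the function names, the window and step sizes, and the example sequence are assumptions, not the parameters used in Peters et al. 2012.

```python
import math


def cg_ratio(seq: str) -> float:
    """Return (count of C) / (count of G) for a nucleotide sequence.

    Case-insensitive. Returns infinity when the sequence has C but no G,
    and NaN when it has neither residue.
    """
    s = seq.upper()
    c = s.count("C")
    g = s.count("G")
    if g == 0:
        return math.inf if c > 0 else math.nan
    return c / g


def windowed_cg_ratio(seq: str, window: int = 50, step: int = 10):
    """Compute the C/G ratio in sliding windows along a sequence.

    Returns a list of (window_start, ratio) tuples with 0-based starts.
    The window and step defaults are arbitrary choices for illustration.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (i, cg_ratio(seq[i:i + window]))
        for i in range(0, last_start + 1, step)
    ]


# Hypothetical sequence: a C-rich stretch followed by a G-rich stretch.
example = "CCCGCGGG"
print(cg_ratio(example))
print(windowed_cg_ratio(example, window=4, step=4))
```

Under this definition, a C-rich, G-poor region upstream of a terminator gives a high ratio, whereas the NusG-dependent terminators discussed above would give comparatively low values.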

Alternatively, low C/G ratios at NusG-dependent terminators may affect some property of RNAP that alters the ability of Rho to terminate. NusG may enhance termination by stabilizing an elongation-competent form of RNAP that may be a better substrate for Rho termination (Dutta et al. 2008). The NusG NTD appears to inhibit opening of the RNA polymerase clamp domain (Sevostyanova et al. 2011), and NusG is known to prevent RNAP backtracking or arrest at certain sequences (Pasman and von Hippel 2000). The sequences at Rho-dependent terminators affected by NusG could make these contributions more important than at terminators not affected by NusG. However, the kinetic coupling model of Rho-dependent termination (Jin et al. 1992) predicts that NusG stimulation of elongation should reduce, not increase, Rho-dependent termination. Further, the NusG NTD alone stimulates elongation by RNAP in vitro, but the NusG CTD, which binds to Rho (Burmann et al. 2010; Chalissery et al. 2011), is required for enhancement of Rho termination (Mooney et al. 2009b). Thus, the subset of terminators affected by NusG must require this NusG CTD-Rho contact. As we now have positional data for hundreds of NusG-dependent and NusG-independent terminators (Peters et al. 2012), we can test the NusG-dependence of several terminators in vitro to better understand the sequence dependence of NusG termination enhancement.

Roles of NusA in vivo

We determined that the ΔnusA* allele has no effect on Rho termination (Peters et al. 2012); however, we also determined that ΔnusA* is not null (Appendix E). Although it is formally possible that a fragment of NusA encoded by ΔnusA* (such as the NusA-NTD) is entirely responsible for potential NusA effects on Rho termination in vivo, unpublished in vitro transcription experiments from our laboratory failed to show an effect of the NusA-NTD on Rho termination, even at high concentrations (Rachel A. Mooney, unpublished). Therefore, the essential role of NusA in E. coli is unlikely to be in Rho termination; rather, we suggest that the critical function of NusA may be enhancement of intrinsic termination or enforcement of transcription-translation coupling by affecting RNAP pausing.

Determining the roles of NusA in vivo will be complicated by the fact that nusA is an essential gene. Attempts to deplete NusA by shutting off expression from an inducible promoter (λpL under control of the temperature-sensitive λcI-857 repressor) had only a modest effect on growth (Jason M. Peters and Max E. Gottesman, unpublished), suggesting that NusA is a stable protein and that even lowered levels of NusA are sufficient for viability. Also, tagging the chromosomal copy of nusA with a C-terminal DAS+4 degradation tag that targets proteins to the ClpXP protease (McGinness et al. 2006) had no effect on cell viability (Jason M. Peters, unpublished).

Assuming that a genetic method of reducing NusA function is feasible, testing NusA effects on intrinsic termination and pausing in vivo is fairly straightforward due to recent improvements in sequencing technology. Genome-wide sequencing of nascent RNA 3′ ends (NET-seq; Churchman and Weissman 2012) can determine altered patterns of RNAP pausing in the absence of NusA activity, and ribosome footprinting can show altered ribosome densities. Standard strand-specific RNA-seq can be used to determine readthrough of intrinsic terminators in strains with reduced NusA activity that may be difficult to detect with less sensitive, microarray-based methods.


The mechanistic basis for Rho-H-NS synergy

Although we favor a model in which the negative epistasis between Rho and H-NS is due to direct effects of H-NS on the efficiency of Rho termination, it is also possible that H-NS repression and Rho termination constitute two separate pathways that converge at the same sites to silence transcription. For example, H-NS can repress transcription initiation from cryptic promoters, and Rho can terminate untranslated, antisense transcripts. This indirect model is analogous to suppression of antisense transcription elongation by Sen1-dependent termination (Brow 2011) and silencing of antisense promoters by nucleosomes in eukaryotes (Neil et al. 2009). Loss of H-NS silencing causes transcription from cryptic promoters (Defez and De Felice 1981; Higgins et al. 1988); these transcripts would be terminated by Rho if untranslated and sufficiently C-rich and G-poor. Loss of both Rho and H-NS silencing may allow continued elongation of cryptic transcripts, leading to transcription of toxic genes (e.g., those that reside in prophages; Cardinale et al. 2008), or to inhibitory effects of elongated RNAs such as occlusion of ribosome binding or degradation of sense transcripts. Interestingly, the reduced-genome E. coli strain MDS42, which lacks toxic prophage genes, is more resistant than wild-type to Rho inhibition, but becomes highly sensitive to BCM when deleted for hns (Tran et al. 2011). These results support the idea that Rho and H-NS synergy extends to regions outside horizontally-transferred DNA and within evolutionarily conserved portions of the bacterial chromosome.

The direct and indirect models for Rho and H-NS synergy lead to distinct predictions about the behavior of RNAP in an hns null strain. If H-NS directly influences the efficiency of Rho termination, deletion of hns should phenocopy Rho inhibition at terminators bound by H-NS. If H-NS and Rho act independently, deletion of hns should increase transcription initiation upstream from or within Rho-dependent terminators. These two possibilities can be distinguished by testing the distribution of RNAP on DNA or examining transcript 3′ ends in strains lacking H-NS.


References

Balasubramanian K, Stitt BL. 2010. Evidence for amino acid roles in the chemistry of ATP hydrolysis in Escherichia coli Rho. J Mol Biol 404: 587-599.

Brow DA. 2011. Sen-sing RNA terminators. Mol Cell 42: 717-718.

Burmann BM, Schweimer K, Luo X, Wahl MC, Stitt BL, Gottesman ME, Rosch P. 2010. A NusE:NusG complex links transcription and translation. Science 328: 501-504.

Burns CM, Richardson JP. 1995. NusG is required to overcome a kinetic limitation to Rho function at an intragenic terminator. Proc Natl Acad Sci U S A 92: 4738-4742.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Chae H, Han K, Kim KS, Park H, Lee J, Lee Y. 2011. Rho-dependent termination of ssrS (6S RNA) transcription in Escherichia coli: implication for 3' processing of 6S RNA and expression of downstream ygfA (putative 5-formyl-tetrahydrofolate cyclo-ligase). J Biol Chem 286: 114-122.

Chalissery J, Banerjee S, Bandey I, Sen R. 2007. Transcription termination defective mutants of Rho: role of different functions of Rho in releasing RNA from the elongation complex. J Mol Biol 371: 855-872.

Chalissery J, Muteeb G, Kalarickal NC, Mohan S, Jisha V, Sen R. 2011. Interaction surface of the transcription terminator Rho required to form a complex with the C-terminal domain of the antiterminator NusG. J Mol Biol 405: 49-64.

Churchman LS, Weissman JS. 2012. Native elongating transcript sequencing (NET-seq). Curr Protoc Mol Biol Chapter 4: Unit 4 14 11-17.

Cromie MJ, Shi Y, Latifi T, Groisman EA. 2006. An RNA sensor for intracellular Mg2+. Cell 125: 71-84.

Defez R, De Felice M. 1981. Cryptic operon for beta-glucoside metabolism in Escherichia coli K12: genetic evidence for a regulatory protein. Genetics 97: 11-25.

Dutta D, Chalissery J, Sen R. 2008. Transcription termination factor rho prefers catalytically active elongation complexes for releasing RNA. J Biol Chem 283: 20243-20251.

Epshtein V, Dutta D, Wade J, Nudler E. 2010. An allosteric mechanism of Rho-dependent transcription termination. Nature 463: 245-249.

Harinarayanan R, Gowrishankar J. 2003. Host factor titration by chromosomal R-loops as a mechanism for runaway plasmid replication in transcription termination-defective mutants of Escherichia coli. J Mol Biol 332: 31-46.

Higgins CF, Dorman CJ, Stirling DA, Waddell L, Booth IR, May G, Bremer E. 1988. A physiological role for DNA supercoiling in the osmotic regulation of gene expression in S. typhimurium and E. coli. Cell 52: 569-584.

Hollands K, Proshkin S, Sklyarova S, Epshtein V, Mironov A, Nudler E, Groisman EA. 2012. Riboswitch control of Rho-dependent transcription termination. Proc Natl Acad Sci U S A 109: 5376-5381.

Hosid S, Bolshoy A. 2004. New elements of the termination of transcription in prokaryotes. J Biomol Struct Dyn 22: 347-354.

Ingham CJ, Dennis J, Furneaux PA. 1999. Autogenous regulation of transcription termination factor Rho and the requirement for Nus factors in Bacillus subtilis. Mol Microbiol 31: 651-663.


Ingham CJ, Hunter IS, Smith MC. 1995. Rho-independent terminators without 3' poly-U tails from the early region of actinophage ΦC31. Nucleic Acids Res 23: 370-376.

Jin DJ, Burgess RR, Richardson JP, Gross CA. 1992. Termination efficiency at rho-dependent terminators depends on kinetic coupling between RNA polymerase and rho. Proc Natl Acad Sci U S A 89: 1453-1457.

Kalyani BS, Muteeb G, Qayyum MZ, Sen R. 2011. Interaction with the nascent RNA is a prerequisite for the recruitment of Rho to the transcription elongation complex in vitro. J Mol Biol 413: 548-560.

McGinness KE, Baker TA, Sauer RT. 2006. Engineering controllable protein degradation. Mol Cell 22: 701-707.

Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R. 2009a. Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33: 97-108.

Mooney RA, Schweimer K, Rosch P, Gottesman M, Landick R. 2009b. Two structurally independent domains of E. coli NusG create regulatory plasticity via distinct interactions with RNA polymerase and regulators. J Mol Biol 391: 341-358.

Neil H, Malabat C, d'Aubenton-Carafa Y, Xu Z, Steinmetz LM, Jacquier A. 2009. Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457: 1038-1042.

Nicolas P, Mader U, Dervyn E, Rochat T, Leduc A, Pigeonneau N, Bidnenko E, Marchadier E, Hoebeke M, Aymerich S et al. 2012. Condition-dependent transcriptome reveals high-level regulatory architecture in Bacillus subtilis. Science 335: 1103-1106.

Pasman Z, von Hippel PH. 2000. Regulation of rho-dependent transcription termination by NusG is specific to the Escherichia coli elongation complex. Biochemistry 39: 5573-5585.

Peters JM, Mooney RA, Grass JA, Tran F, Landick R. 2012. Rho and NusG suppress pervasive antisense transcription in Escherichia coli. submitted.

Peters JM, Mooney RA, Kuan PF, Rowland JL, Keles S, Landick R. 2009. Rho directs widespread termination of intragenic and stable RNA transcription. Proc Natl Acad Sci U S A 106: 15406-15411.

Peters JM, Vangeloff AD, Landick R. 2011. Bacterial transcription terminators: the RNA 3'-end chronicles. J Mol Biol 412: 793-813.

Quirk PG, Dunkley EA, Jr., Lee P, Krulwich TA. 1993. Identification of a putative Bacillus subtilis rho gene. J Bacteriol 175: 8053.

Reppas NB, Wade JT, Church GM, Struhl K. 2006. The transition between transcriptional initiation and elongation in E. coli is highly variable and often rate limiting. Mol Cell 24: 747-757.

Sevostyanova A, Belogurov GA, Mooney RA, Landick R, Artsimovitch I. 2011. The beta subunit gate loop is required for RNA polymerase modification by RfaH and NusG. Mol Cell 43: 253-262.

Stern MJ, Ames GF, Smith NH, Robinson EC, Higgins CF. 1984. Repetitive extragenic palindromic sequences: a major component of the bacterial genome. Cell 37: 1015-1026.

Tran L, van Baarsel JA, Washburn RS, Gottesman ME, Miller JH. 2011. Single-gene deletion mutants of Escherichia coli with altered sensitivity to bicyclomycin, an inhibitor of transcription termination factor Rho. J Bacteriol 193: 2229-2235.

Unniraman S, Prakash R, Nagaraja V. 2001. Alternate paradigm for intrinsic transcription termination in eubacteria. J Biol Chem 276: 41850-41855.


Appendix A

Synthetic lethal screening identifies a genetic connection between transcript cleavage and central metabolism

I performed all of the experiments shown in this section.


Introduction

GreA is a transcription elongation factor that stimulates the intrinsic RNA cleavage activity of RNA polymerase (RNAP). Intrinsic cleavage of RNA relieves RNAP from a state of backtracked pausing (where RNAP moves backward on the RNA/DNA hybrid), allowing the enzyme to continue transcription from a newly generated 3′ end (Borukhov et al. 1992). GreA-enhanced intrinsic cleavage by RNAP also enhances promoter escape at certain promoters. Much of what is known about GreA has been elucidated through in vitro biochemical studies. As such, the role of GreA in vivo is not well understood. The single greA deletion has only minor phenotypes (such as slight salt sensitivity). A strain containing a double deletion of greA and its structural and functional homolog greB shows reduced growth at 37 °C and is inviable at 42 °C (Orlova et al. 1995).

To further characterize the roles of GreA in vivo, I performed a ΔgreA synthetic lethal screen (Fig. A.1). Synthetic lethal screening identifies otherwise lethal double mutations as colonies from a transposon insertion library that retain an unstable plasmid expressing the gene of interest (in this case, greA; Bernhardt and de Boer 2004). I found that greA and pgi, which encodes the glycolytic enzyme phosphoglucose isomerase, constitute a synthetic lethal pair at 42 °C. My results thus uncover a novel connection between the processes of transcript cleavage and central metabolism at high temperatures.


Figure A.1. Overview of synthetic lethal screening. (i) Unstable plasmid sectoring. greA+ was cloned into an unstable mini-F′ plasmid with a defective origin of replication and a lacZ reporter gene (pJP101). pJP101 was not required for viability in ΔgreA cells and was lost at high frequency, resulting in blue and white sectored colonies. (ii) Synthetic lethal screening. ΔgreA cells transformed with pJP101 were randomly mutagenized with Tn5 transposons. Most transposon mutants lost pJP101 at high frequency; however, slgA::Tn5 mutants retained pJP101, resulting in solid blue colonies.


Figure A.1


Results

A screen for mutations that are synthetic lethal with ΔgreA (slgA mutants)

To screen for mutations that were synthetic lethal with ΔgreA, I first cloned greA+ under control of the IPTG-inducible trc promoter as a transcriptional fusion to lacZYA+ in a single-copy, unstable vector (pRC7; Koop et al. 1987) that is lost at elevated frequency in the absence of selection. I then transformed the plasmid (pRC7-greA+, a.k.a. pJP101) into a ΔgreA Δlac strain, and monitored plasmid loss by plating on LB X-gal. As expected, the strain formed blue, white, or blue and white sectored colonies of equal size, indicating that pJP101, and thus greA+, was not required for growth. However, a ΔgreA ΔgreB Δlac strain grown at 42 °C formed only blue colonies, demonstrating that pJP101 was retained under conditions where greA+ was essential. I also confirmed that uninduced expression of greA+ from the trc promoter was sufficient to complement the modest salt sensitivity phenotype caused by the greA deletion.

I next randomly mutagenized the ΔgreA Δlac pJP101 strain with Tn5 transposons, and screened the resulting double mutant libraries for plasmid retention (solid blue colonies) on M9 or LB X-gal plates at 37 °C. After screening ~34,000 colonies, I failed to identify any that remained solid blue after restreaking (Table A.1).

Because ΔgreA was known to have a genetic interaction with ΔgreB at 42 °C, I reasoned that gre function might be more important at elevated temperatures. To test this hypothesis, I repeated the synthetic lethal screen at 42 °C. I found 3 colonies (slgA-24, slgA-71, and slgA-98) out of ~10,000 screened that retained pJP101 after restreaking, suggesting that the transposon insertion in these strains caused a loss of function mutation in a gene that is required for the growth of ΔgreA cells at 42 °C (Table A.1). All three slgA ΔgreA mutants lost pJP101 after growth at 37 °C, consistent with the fact that they were not identified in the screen performed at 37 °C (Fig. A.2). I verified that the white slgA ΔgreA colonies that had lost the pJP101 plasmid during growth at 37 °C were temperature sensitive for viability at 42 °C (Fig. A.3). My identification of mutations that are synthetic lethal with ΔgreA at 42 °C, but not 37 °C, suggests that greA+ is part of a pathway that is critical for growth at high temperatures.

Table A.1. Synthetic lethal screening.

Temperature   Medium     # plates(a)   # colonies(b)   # restreaks(c)   # positive(d)
37 °C         LB X-gal   200           24,000          48               0
37 °C         M9 X-gal   100           10,000          13               0
42 °C         LB X-gal   100           10,000          14               3

(a) Number of plates screened.
(b) Approximate number of colonies screened.
(c) Number of colonies that were initially solid blue, and were restruck to confirm retention of pJP101.
(d) Number of colonies that retained pJP101 after restreaking.
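The totals quoted in the text follow directly from Table A.1, as the short tally below shows. The data structure and function name are illustrative assumptions; the numbers are the approximate counts reported in the table.

```python
# Rows of Table A.1:
# (temperature, medium, plates, approximate colonies, restreaks, positives)
SCREENS = [
    ("37 C", "LB X-gal", 200, 24_000, 48, 0),
    ("37 C", "M9 X-gal", 100, 10_000, 13, 0),
    ("42 C", "LB X-gal", 100, 10_000, 14, 3),
]


def totals_by_temperature(rows):
    """Sum colonies, restreaks, and confirmed positives per temperature."""
    totals = {}
    for temp, _medium, _plates, colonies, restreaks, positives in rows:
        entry = totals.setdefault(
            temp, {"colonies": 0, "restreaks": 0, "positives": 0}
        )
        entry["colonies"] += colonies
        entry["restreaks"] += restreaks
        entry["positives"] += positives
    return totals


totals = totals_by_temperature(SCREENS)
for temp, t in totals.items():
    # Fraction of screened colonies that retained pJP101 after restreaking.
    frequency = t["positives"] / t["colonies"]
    print(temp, t["colonies"], t["restreaks"], t["positives"], frequency)
```

The two 37 °C screens sum to ~34,000 colonies with no confirmed positives, whereas the 42 °C screen yielded 3 confirmed positives in ~10,000 colonies, a frequency of roughly 3 in 10,000.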

All three slgA mutations are transposon insertions in pgi

I mapped the locations of transposon insertion in the three slgA mutants by arbitrary PCR (see Materials and Methods; O'Toole and Kolter 1998). I found that slgA-24, slgA-71, and slgA-98 were transposon insertions at nucleotides 527, 484, and 1144 of the pgi gene, respectively. The temperature-sensitive phenotype of ΔgreA pgi::Tn5 mutants was not due to the direction of transposon insertion relative to the pgi coding sequence, as insertions occurred in both the 5′→3′ (slgA-24) and the 3′→5′ direction (slgA-71 and slgA-98).

Transcript mapping experiments (Cho et al. 2009) showed that pgi is the sole gene present on transcripts originating from the pgi promoter, arguing that disruption of pgi, rather than polar effects on downstream transcription caused by the Tn5 insertion, caused the temperature-sensitive phenotype. I confirmed that expression of pgi from a multi-copy plasmid restored growth of the pgi::Tn5 mutants at 42 °C (data not shown).


Figure A.2. slgA greA double mutants are viable at 37 °C. slgA greA cells were grown at 37 °C on LB X-gal plates. I observed two distinct colony phenotypes: large blue colonies containing pJP101 (blue arrows), and small white colonies that had lost pJP101.


Figure A.2


Figure A.3. slgA greA double mutants are temperature-sensitive for growth. A dilution series of slgA greA cells (labeled slgA) that contained pJP101 (pRC7-greA+) or had lost the plasmid after growth at 30 °C (sectored) were spotted on LB X-gal plates and grown overnight at 42 °C.


Figure A.3


Growth on alternate carbon sources fails to rescue the temperature sensitivity of a greA pgi double mutant

pgi encodes the glycolytic enzyme phosphoglucose isomerase (Pgi), which catalyzes the conversion of β-D-glucose-6-phosphate to β-D-fructose-6-phosphate (Fraenkel 1967). Despite having an early role in glycolysis, strains lacking Pgi are viable due to metabolic flux through alternate pathways (e.g., the pentose phosphate pathway). I performed my initial experiments on complex LB plates containing an undefined set of metabolites. To simplify the number of metabolic inputs to my system, I evaluated growth of ΔgreA pgi::Tn5 mutants on minimal M9 plates containing a single carbon source (Fig. A.4). Extracellular β-D-glucose is transported into the cell and modified to produce β-D-glucose-6-phosphate by the glucose PTS permease, then converted to β-D-fructose-6-phosphate by Pgi. ΔgreA pgi::Tn5 mutants failed to grow on M9 glucose plates, demonstrating that the temperature-sensitive phenotype could be reproduced in defined medium.

One possible explanation for the observed phenotype of ΔgreA pgi::Tn5 mutants is that GreA is necessary for expression of some component of an alternate metabolic pathway to early glycolysis, such as the pentose phosphate pathway, at high temperature. To test this model, I grew ΔgreA pgi::Tn5 mutants on plates containing D-mannose, a sugar that is modified through several steps to D-glyceraldehyde-3-phosphate, which enters glycolysis downstream of Pgi. D-glyceraldehyde-3-phosphate is also the product of the pentose phosphate pathway, but it can be produced by an alternate pathway that is used when mannose or fructose is the sole carbon source. Thus, growth on mannose bypasses the pentose phosphate pathway. ΔgreA pgi::Tn5 mutants failed to grow on M9 0.2% mannose (v/v) at 42 °C, indicating that GreA effects are not manifested through the pentose phosphate pathway (ΔgreA and pgi::Tn5 controls grew normally; data not shown).

Figure A.4. Growth of pgi greA on M9 minimal glucose medium. A dilution series of pgi greA cells were spotted on M9 0.2% glucose (v/v) X-gal plates and grown for ~24 hours at 37 °C or 42 °C. The large circles seen in the last two rows of the plate incubated at 42 °C are air bubbles.

Figure A.4

Formally, absence of the pgi mRNA, Pgi protein, or Pgi enzymatic product (β-D-fructose-6-phosphate) could be responsible for the temperature-sensitive phenotype of ΔgreA pgi::Tn5 mutants. I attempted to bypass the defect of ΔgreA pgi::Tn5 mutants by growing cells directly on fructose-6-phosphate. ΔgreA pgi::Tn5 mutants formed colonies on M9 fructose-6-phosphate at 37 °C, but not at 42 °C, suggesting that the temperature-sensitive phenotype is caused by something other than the lack of the Pgi enzymatic product (Fig. A.5; ΔgreA and pgi::Tn5 controls grew similarly to the pictured double mutant at 37 °C; data not shown).

Figure A.5. Growth of pgi greA on M9 minimal fructose-6-phosphate medium. A dilution series of pgi greA cells was spotted on M9 0.2% fructose-6-phosphate (v/v) X-gal plates and grown for ~24 hours at 37 °C or 42 °C.

Discussion

My synthetic genetic analysis establishes a connection between GreA-enhanced transcript cleavage by RNAP and central metabolism at high temperatures. greA and pgi constitute a synthetic lethal pair at 42 °C, but not at 37 °C, suggesting that these genes may be involved in cellular responses to heat stress. greA pgi lethality at 42 °C could not be rescued by alternate carbon sources that bypass either the pentose phosphate pathway or the enzymatic activity of Pgi. These results raise several important questions. First, are greA and pgi part of two independent pathways that provide resistance to heat stress? Second, is the catalytic activity of Pgi required to survive heat stress in a greA strain background? Finally, does the greA homolog greB show similar genetic interactions with pgi?

GreA and Pgi may be components of two independent pathways that respond to stress incurred by growth at high temperatures. Expression of the greA gene is regulated by the membrane stress-responsive sigma factor σE (a.k.a. σ24; Rhodius et al. 2006). Heat stress causes misfolding of outer membrane proteins, which serves as a signal to activate proteolysis of the membrane-anchored anti-σE factor RseA (Ades et al. 1999; Ades et al. 2003). σE is then released from the membrane and can form holoenzymes with free core RNAP. These holoenzymes then transcribe the σE regulon, including greA (Rhodius et al. 2006). The downstream targets of GreA in cells grown at 42 °C are unknown. It is also unclear whether GreA is required in a pgi strain background at 42 °C because of the effects of GreA on promoter escape or on transcript elongation. Regardless, transcript profiling experiments in the presence and absence of greA+ at 42 °C would likely shed light on heat stress pathways that involve GreA.


To my knowledge, no published research has linked Pgi to heat stress tolerance. However, metabolic profiling experiments have shown that metabolites involved in glycolysis and the pentose phosphate pathway are reduced in abundance after heat stress (as well as other stresses; Jozefczuk et al. 2010). I speculate that mutations in central metabolism, such as those identified in pgi, affect the ability of cells to sense heat stress (or other stresses) and appropriately downregulate genes involved in metabolism. In this model, GreA stimulates transcription of genes involved in alleviating membrane stress while Pgi is part of a signaling pathway that downregulates central metabolism in response to stress. Thus, the combined effects of extensive membrane stress (ΔgreA) without concomitant changes in metabolism (pgi::Tn5) would be lethal to cells. To test this model, global gene expression could be measured in greA, pgi, and greA pgi cells as a time course following temperature shift to 42 °C. This analysis should show whether GreA and Pgi are involved in separate heat-stress regulation pathways. Analysis of greA pgi suppressor mutants that grow at 42 °C would likely also be informative as to the cause of greA pgi lethality.


Materials and Methods

Strains and plasmids

Strains are listed in Table A.2. pJP101 (pRC7-greA+) was generated by cloning an ApaI-HindIII restriction fragment from pDNL278 (pTrc-greA+) containing the trc promoter and greA+ into pRC7. pJP106 (pTrc-pgi+) was constructed by cloning a PCR product containing pgi+ between the NcoI and HindIII sites of pTrc99c using the following primers: 5′-acacacccatggATGAAAAACATCAATCCAAC (pgi cloning 1), and 5′-gtgtgtaagcttTTAACCGCGCCACGCTTTAT (pgi cloning 2).
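As a quick check on the primer design described above, the following minimal sketch scans the two pgi cloning primers for the NcoI (CCATGG) and HindIII (AAGCTT) recognition sequences. It assumes, based only on the letter case used in the primer listing, that the lowercase 12-nt tail is added sequence and that the uppercase portion anneals to pgi. The function and variable names are illustrative and are not part of any pipeline used in this work.

```python
# Recognition sequences for the two enzymes named in the cloning strategy.
SITES = {"NcoI": "CCATGG", "HindIII": "AAGCTT"}

# Primer sequences copied from the Strains and plasmids section (5' to 3').
PRIMERS = {
    "pgi cloning 1": "acacacccatggATGAAAAACATCAATCCAAC",
    "pgi cloning 2": "gtgtgtaagcttTTAACCGCGCCACGCTTTAT",
}

# Assumption: the lowercase tail (leader plus restriction site) is 12 nt long.
TAIL_LENGTH = 12


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based position."""
    upper = primer.upper()
    return {name: upper.find(site) for name, site in SITES.items() if site in upper}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        annealing = seq[TAIL_LENGTH:]
        print(name, find_sites(seq), "annealing region:", annealing)
    # The forward annealing region should begin with the pgi start codon, and
    # the reverse primer, read on the coding strand, should end with a stop codon.
    print(PRIMERS["pgi cloning 1"][TAIL_LENGTH:][:3])
    print(reverse_complement(PRIMERS["pgi cloning 2"][TAIL_LENGTH:])[-3:])
```

Each primer carries exactly one of the two sites, which is what directional cloning between NcoI and HindIII requires.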

Synthetic lethal screening

Synthetic lethal screening was carried out as previously described (Bernhardt and de Boer 2004), except that IPTG was not included in the screening plates because pJP101 complemented ΔgreA without IPTG addition.


Table A.2. Strains used in this study.

Strain           Background  Genotype                             Source       Reference
MG1655 (RL1655)  MG1655      F- λ- rph-1                          F. Blattner  (Blattner et al. 1997)
DJ4000 (RL1621)  MG1655      lacX74                               W. Ross      -
RL1828           MG1655      lacX74 ΔgreA::cat                    This study   -
RL1857           MG1655      pgi-24::Tn5(KanR)                    This study   -
RL1858           MG1655      pgi-71::Tn5(KanR)                    This study   -
RL1859           MDS42       pgi-98::Tn5(KanR)                    This study   -
RL1860           MG1655      lacX74 pgi-24::Tn5(KanR)             This study   -
RL1861           MG1655      lacX74 pgi-71::Tn5(KanR)             This study   -
RL1862           MG1655      lacX74 pgi-98::Tn5(KanR)             This study   -
RL1863           MG1655      lacX74 pgi-24::Tn5(KanR) ΔgreA::cat  This study   -
RL1864           MG1655      lacX74 pgi-71::Tn5(KanR) ΔgreA::cat  This study   -
RL1865           MG1655      lacX74 pgi-98::Tn5(KanR) ΔgreA::cat  This study   -


Acknowledgements

I thank Tom Bernhardt (Harvard) for strains, Diana Downs (UW-Madison) for β-D-fructose-6-phosphate, and Jennifer Rowland and Jennifer Ross for technical assistance.


References

Ades SE, Connolly LE, Alba BM, Gross CA. 1999. The Escherichia coli σE-dependent extracytoplasmic stress response is controlled by the regulated proteolysis of an anti-sigma factor. Genes Dev 13: 2449-2461.

Ades SE, Grigorova IL, Gross CA. 2003. Regulation of the alternative sigma factor σE during initiation, adaptation, and shutoff of the extracytoplasmic heat shock response in Escherichia coli. J Bacteriol 185: 2512-2519.

Bernhardt TG, de Boer PA. 2004. Screening for synthetic lethal mutants in Escherichia coli and identification of EnvC (YibP) as a periplasmic septal ring factor with murein hydrolase activity. Mol Microbiol 52: 1255-1269.

Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453-1462.

Borukhov S, Polyakov A, Nikiforov V, Goldfarb A. 1992. GreA protein: a transcription elongation factor from Escherichia coli. Proc Natl Acad Sci U S A 89: 8899-8902.

Cho BK, Zengler K, Qiu Y, Park YS, Knight EM, Barrett CL, Gao Y, Palsson BO. 2009. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol 27: 1043-1049.

Fraenkel DG. 1967. Genetic mapping of mutations affecting phosphoglucose isomerase and fructose diphosphatase in Escherichia coli. J Bacteriol 93: 1582-1587.

Jozefczuk S, Klie S, Catchpole G, Szymanski J, Cuadros-Inostroza A, Steinhauser D, Selbig J, Willmitzer L. 2010. Metabolomic and transcriptomic stress response of Escherichia coli. Mol Syst Biol 6: 364.

Koop AH, Hartley ME, Bourgeois S. 1987. A low-copy-number vector utilizing beta-galactosidase for the analysis of gene control elements. Gene 52: 245-256.

O'Toole GA, Kolter R. 1998. Initiation of biofilm formation in Pseudomonas fluorescens WCS365 proceeds via multiple, convergent signalling pathways: a genetic analysis. Mol Microbiol 28: 449-461.

Orlova M, Newlands J, Das A, Goldfarb A, Borukhov S. 1995. Intrinsic transcript cleavage activity of RNA polymerase. Proc Natl Acad Sci U S A 92: 4596-4600.

Rhodius VA, Suh WC, Nonaka G, West J, Gross CA. 2006. Conserved and variable functions of the σE stress response in related genomes. PLoS Biol 4: e2.


Appendix B

Mutations in rho identify a novel mechanism of ethanol tolerance in Escherichia coli

This work is a subset of a manuscript in preparation (David H. Keating, Michael Schwalbach, Jason M. Peters, Mary Tremaine, Edward Pohlmann, Frances Tran, Jeffrey Vinokur, Alan Higbee, Patricia J. Kiley, and Robert Landick. Mutations in rho identify a novel mechanism of ethanol tolerance in Escherichia coli). David Keating wrote the manuscript. Michael Schwalbach performed ethanol tolerance experiments and analyzed expression microarray data. Mary Tremaine performed directed evolution experiments. Frances Tran performed RNA polymerase (RNAP) chromatin immunoprecipitation and microarray (ChIP-chip) experiments. I generated strains, analyzed ChIP-chip data, and assisted David Keating in writing portions of the manuscript. I rewrote the parts of the manuscript that are shown in this thesis to summarize my role in the project.


Introduction

The mechanism of ethanol tolerance is of general interest to studies of bacterial bioenergy (Yomano et al. 2008). The benefits of Escherichia coli as a model organism for genetics are well known; E. coli is genetically tractable, and many genetic tools have been developed to allow for transformation and genome manipulation (Thomason et al. 2007; Sharan et al. 2009). E. coli also has a well-defined metabolism. Metabolic research on E. coli has been carried out for many decades, culminating in genome-wide computational models that can predict metabolic flux (Feist et al. 2009). Despite these advantages, wild-type E. coli is only modestly ethanol tolerant (Yomano et al. 1998). Although studies have sought to address ethanol sensitivity by using directed evolution to generate ethanol-resistant E. coli strains (Goodarzi et al. 2010), little is known about the cellular mechanisms of acquired ethanol tolerance.

Previous directed evolution studies have identified mutations in the gene encoding the transcription termination factor Rho in the genomes of ethanol-tolerant strains (Goodarzi et al. 2010). The consensus explanation for the appearance of mutations in rho and other genes that encode components of the transcription machinery is that large-scale changes in transcription can be achieved with a single point mutation that allows cells to quickly adapt to multiple stresses caused by ethanol (Alper and Stephanopoulos 2007). Directed evolution experiments carried out by the Great Lakes Bioenergy Research Center (GLBRC) generated ethanol-tolerant strains that contained mutations in rho, as well as numerous other genes (Mary Tremaine, unpublished, Table B.1). It is currently unknown why rho mutations are linked to ethanol tolerance.


Results

The ethanol-tolerant strain, MTA156, contains a mutation in the rho gene that is sufficient for ethanol tolerance

Directed evolution experiments carried out by the GLBRC produced the strain MTA156, which experiences a growth yield reduction of 20% in 3% ethanol (v/v) versus 50% for wild-type E. coli (data not shown). Genome resequencing revealed that MTA156 had several nonsynonymous mutations in annotated coding sequences (Table B.1), one of which was found in the rho gene. The mutation was a C to A transversion that would result in substitution of methionine for leucine at residue 270. To determine if the rho(L270M) mutation alone could provide ethanol tolerance, we constructed a strain that carried only the rho mutation in an otherwise wild-type background (MG1655). The strain was constructed by two successive P1vir-mediated transductions. The first transduction moved a Tn10 transposon insertion in ilvD into MG1655, resulting in a strain that was auxotrophic for isoleucine. ilvD is ~70% linked to rho. The second transduction moved the rho(L270M) and ilvD+ alleles from MTA156 into the ilvD::Tn10 strain, resulting in a prototrophic rho(L270M) strain that was otherwise unmarked. The rho(L270M) strain was as ethanol tolerant as MTA156 at 3% ethanol (data not shown), demonstrating that the rho(L270M) mutation alone was sufficient for ethanol tolerance.
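The description of the mutation as a C to A transversion that changes leucine to methionine constrains which leucine codon is involved. The sketch below, which assumes only the standard genetic code, enumerates every leucine codon and every single C to A change and reports those that yield a methionine codon. It is an illustration of the reasoning, not a statement about the sequenced rho allele; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PURINES = {"A", "G"}


def is_transversion(ref_base, alt_base):
    """A transversion swaps a purine for a pyrimidine or vice versa."""
    return (ref_base in PURINES) != (alt_base in PURINES)


def codons_explaining(ref_aa, alt_aa, ref_base, alt_base):
    """List (codon, position, mutant codon) for single-base changes of
    ref_base to alt_base that convert ref_aa into alt_aa."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for i, base in enumerate(codon):
            if base != ref_base:
                continue
            mutant = codon[:i] + alt_base + codon[i + 1:]
            if CODON_TABLE[mutant] == alt_aa:
                hits.append((codon, i + 1, mutant))
    return hits


if __name__ == "__main__":
    print(is_transversion("C", "A"))               # True
    print(codons_explaining("L", "M", "C", "A"))   # [('CTG', 1, 'ATG')]
```

Under the standard code, CTG to ATG at the first codon position is the only single C to A change that converts leucine to methionine.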


Strains containing the rho(L270M) mutation are defective for transcription termination

The L270 residue is buried within the Rho-CTD (which contains the ATPase, secondary RNA binding site, and translocase functions) in crystal structures of Rho (Thomsen and Berger 2009), suggesting that the L270M substitution may cause defects in transcription termination. Also, expression microarray data showed upregulation of the rho gene (2-fold) and cryptic prophage genes (e.g., a 6.1-fold increase in nmpC of the DLP12 prophage), consistent with a loss of Rho function (Cardinale et al. 2008; Peters et al. 2009). To assess Rho function in the rho(L270M) mutant, we performed RNAP ChIP-chip. We examined the RNAP distribution at the rho locus, as rho is known to be autoregulated by termination. As expected, the RNAP distribution was shifted downstream into the rho gene in the rho(L270M) mutant compared to wild-type (Fig. B.1). The shift in RNAP distribution in rho(L270M) was similar to that observed in the presence of the Rho inhibitor bicyclomycin (BCM; Peters et al. 2009). This indicated that Rho(L270M) is defective for termination.

Altered fatty acid composition in rho(L270M) may relate to ethanol tolerance

One of the more obvious sites of terminator readthrough in the rho(L270M) strain was upstream of fabF (Fig. B.2). fabF encodes the enzyme β-ketoacyl-ACP synthase II, which regulates fatty acid composition by catalyzing the conversion of palmitoleate to cis-vaccenate (Garwin et al. 1980). As a result of terminator readthrough, fabF was upregulated 2-fold in cells grown in the absence of ethanol and 5-fold in cells grown in its presence. Changes in fatty acid composition had been previously implicated in ethanol tolerance.

Figure B.1. Readthrough of a Rho-dependent terminator at the rho locus. RNAP ChIP-chip signals are shown from wild-type (blue) and rho(L270M) (purple) cells. Magenta dashed lines indicate a region significantly affected by BCM in a previous study (Peters et al. 2009).

Figure B.2. Readthrough of a Rho-dependent terminator upstream of fabF. RNAP ChIP-chip signals are shown from wild-type (blue) and rho(L270M) (purple) cells. Magenta dashed lines indicate a region significantly affected by BCM in a previous study (Peters et al. 2009).

The rho(L270M) strain showed an increase in the ratio of saturated to unsaturated fatty acids in the presence of ethanol compared to wild-type [the ratio of saturated to unsaturated fatty acids in the presence of ethanol was 1.4 for wild type versus 2.1 for rho(L270M); David Keating, unpublished]. A fabF deletion mutant grew slowly in the presence of ethanol, even when combined with the rho(L270M) mutation (Fig. B.3), suggesting that fabF is involved in ethanol tolerance caused by the rho mutation.
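To put the reported ratios on a more intuitive scale, the short sketch below converts each saturated-to-unsaturated ratio into the fraction of fatty acids that are saturated and computes the relative change between strains. It assumes, for illustration only, that every fatty acid falls into one of these two classes; the function name is hypothetical.

```python
def saturated_fraction(sat_to_unsat_ratio):
    """Convert a saturated:unsaturated ratio into the saturated fraction,
    assuming the two classes account for all fatty acids."""
    return sat_to_unsat_ratio / (1.0 + sat_to_unsat_ratio)


# Ratios reported in the text for cells grown in the presence of ethanol.
WILD_TYPE_RATIO = 1.4
RHO_L270M_RATIO = 2.1

if __name__ == "__main__":
    print(round(RHO_L270M_RATIO / WILD_TYPE_RATIO, 2))    # 1.5
    print(round(saturated_fraction(WILD_TYPE_RATIO), 3))  # 0.583
    print(round(saturated_fraction(RHO_L270M_RATIO), 3))  # 0.677
```

On this simplified accounting, the 1.5-fold higher ratio in rho(L270M) corresponds to a shift from roughly 58% to roughly 68% saturated fatty acids.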

Figure B.3. Mutations in fabF affect growth of wild type and rho(L270M) mutant in 5% ethanol. A fabF::kan deletion was transduced into wild type MG1655 and the rho(L270M) mutant. The cells were then cultured in M9 medium containing 5% ethanol, and growth was measured by OD600. (i) Growth of wild type and fabF::kan mutant in 5% ethanol (wild type, closed circles; fabF::kan mutant, closed squares). (ii) Growth of the rho(L270M) mutant and rho(L270M) fabF::kan double mutant in 5% ethanol [rho(L270M), closed triangles; rho(L270M) fabF::kan, open squares]. Error bars, in most cases too small to be seen, represent standard deviation. This figure was provided by David Keating.

Discussion

We used directed evolution to isolate E. coli strains that are tolerant to ethanol. This selection produced a strain that contained the rho(L270M) mutation, which causes a defect in transcription termination. Interestingly, other rho mutants that were isolated based on termination defects were also more ethanol tolerant than wild-type (data not shown). Also, an independently evolved ethanol-tolerant strain, MTA236, contained a rho(L345F) mutation. This suggests that ethanol tolerance is not specific to certain rho alleles, but is generally associated with defects in Rho termination.


Materials and Methods

Strains and primers

Strains and primers are listed in Table B.1.

ChIP-chip

ChIP-chip was performed as previously described (Mooney et al. 2009).

Cell growth and RNA extraction

Cells were grown in M9 minimal medium with 10 g glucose per liter at 37 °C in gas-sparged Roux bottles to mid-log phase (OD600 ~0.3-0.4). Cells were then challenged with 4% ethanol (v/v) or were left untreated for 60 min before samples were taken for RNA extraction. RNA for expression microarray analysis was extracted using the hot phenol method as previously described (Khodursky et al. 2003).


Table B.1. Strains and primers used in this study.

Strain             Background  Genotype                                   Source       Reference
MG1655 (RL1655)    MG1655      F- λ- rph-1                                F. Blattner  (Blattner et al. 1997)
CAG18431 (RL1890)  MG1655      ilvD::Tn10(TetR)                           CGSC
MTA156 (RL2324)    MG1655      topA(H122L) fadI(T350S) ispB(I32L)         M. Tremaine  -
                               gltB(G1062S) rpsQ(H31P) rho(L270M)
RL2325             MG1655      rho(L270M)                                 This study   -
MTA236 (RL2421)    MG1655      rho(L345F), numerous additional mutations  M. Tremaine  -
JW1081             BW25113     ΔfabF759::FRT-kan-FRT                      D. Keating   (Baba et al. 2006)

Primer  Sequence (5′-3′)      Purpose
7396    TCCTGCCATACCATTCACAA  PCR amplify and sequence rho
7397    AAGCAAAACGCCACGTAAAC  PCR amplify and sequence rho


Acknowledgements

We thank the DOE Joint Genome Institute for genome resequencing, and the Gene Expression Center (UW-Madison) for expression microarrays.


References

Alper H, Stephanopoulos G. 2007. Global transcription machinery engineering: a new approach for improving cellular phenotype. Metab Eng 9: 258-267.

Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006 0008.

Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453-1462.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO. 2009. Reconstruction of biochemical networks in microorganisms. Nat Rev Microbiol 7: 129-143.

Garwin JL, Klages AL, Cronan JE, Jr. 1980. Structural, enzymatic, and genetic studies of beta-ketoacyl-acyl carrier protein synthases I and II of Escherichia coli. J Biol Chem 255: 11949-11956.

Goodarzi H, Bennett BD, Amini S, Reaves ML, Hottes AK, Rabinowitz JD, Tavazoie S. 2010. Regulatory and metabolic rewiring during laboratory evolution of ethanol tolerance in E. coli. Mol Syst Biol 6: 378.

Khodursky AB, Bernstein JA, Peter BJ, Rhodius V, Wendisch VF, Zimmer DP. 2003. Escherichia coli spotted double-strand DNA microarrays: RNA extraction, labeling, hybridization, quality control, and data management. Methods Mol Biol 224: 61-78.

Mooney RA, Davis SE, Peters JM, Rowland JL, Ansari AZ, Landick R. 2009. Regulator trafficking on bacterial transcription units in vivo. Mol Cell 33: 97-108.


Peters JM, Mooney RA, Kuan PF, Rowland JL, Keles S, Landick R. 2009. Rho directs widespread termination of intragenic and stable RNA transcription. Proc Natl Acad Sci U S A 106: 15406-15411.

Sharan SK, Thomason LC, Kuznetsov SG, Court DL. 2009. Recombineering: a homologous recombination-based method of genetic engineering. Nat Protoc 4: 206-223.

Thomason LC, Costantino N, Court DL. 2007. E. coli genome manipulation by P1 transduction. Curr Protoc Mol Biol Chapter 1: Unit 1 17.

Thomsen ND, Berger JM. 2009. Running in reverse: the structural basis for translocation polarity in hexameric helicases. Cell 139: 523-534.

Yomano LP, York SW, Ingram LO. 1998. Isolation and characterization of ethanol-tolerant mutants of Escherichia coli KO11 for fuel ethanol production. J Ind Microbiol Biotechnol 20: 132-138.

Yomano LP, York SW, Zhou S, Shanmugam KT, Ingram LO. 2008. Re-engineering Escherichia coli for ethanol production. Biotechnol Lett 30: 2097-2103.


Appendix C

Investigating the NusG-S10 model of transcription-translation coupling

Soon Li Teh and I performed all of the experiments shown in this section.


Introduction

In bacteria and archaea, translation begins prior to the complete synthesis and release of mRNA by RNA polymerase (RNAP). Thus, transcription and translation are coupled; the presence of a translating ribosome on the nascent mRNA affects the elongation and termination properties of RNAP. Translation prevents RNAP backtracking and arrest (Proshkin et al. 2010), and can stimulate RNAP escape from regulatory pauses (Landick et al. 1985). Translation also blocks Rho-dependent transcription termination that would otherwise cause a decrease in expression of distal genes in operons (i.e., polarity; Adhya et al. 1974).

There are two non-mutually-exclusive models that explain how translation prevents Rho termination. In the RNA competition model, the ribosome outcompetes Rho for binding to the nascent RNA (Adhya et al. 1974). In the NusG competition model, the ribosome is physically tethered to RNAP via an interaction between ribosomal protein S10 and the Rho co-factor NusG. Because NusG binds to Rho and S10 through the same domain (the NusG-CTD) (Mooney et al. 2009; Chalissery et al. 2011), tethering of the ribosome to RNAP prevents Rho binding to NusG, and, thus, NusG-dependent enhancement of Rho termination.

The NusG competition model is based solely on a solution structure of S10 bound to the NusG-CTD (Fig. C.1; Burmann et al. 2010). As such, many questions remain about how the NusG-S10 interaction influences polarity in vivo. For instance, does transcription-translation coupling through the NusG-S10 interaction affect all translated genes, or only a subset of genes? To answer such questions, we identified several substitutions in S10 that specifically disrupt the interaction between S10 and NusG using the bacterial two-hybrid assay (Fig. C.2; Dove and Hochschild 2004; Nickels 2009). These S10 variants should be useful tools to determine the importance of the NusG-S10 interaction to polarity in vivo.

Figure C.1. Structural model of the NusG-S10-NusB complex. Residues that are part of the NusG-S10 interaction surface are shown in red (NusG) and orange (S10). S10 proteins from the NusG-S10 solution structure (2KVQ) and the S10-NusB crystal structure (3D3B) were aligned using the PyMOL command “super” (RMSD = 0.06 Å).


Results

NusG and S10 interact in the bacterial two-hybrid assay

To test if the interaction between NusG and S10 could be recapitulated in the bacterial two-hybrid system, we first cloned full-length nusG as a translational fusion to rpoA, and full-length rpsJ as a translational fusion to cI (Fig. C.2). We then transformed the fusion plasmids into a two-hybrid reporter strain, and plated the transformants on X-gal plates (Fig. C.3). The strain containing both nusG and rpsJ fusions was qualitatively above background (darker blue) compared to control strains lacking one of the fusions, indicating that NusG and S10 interacted in the two-hybrid assay. We quantified the level of interaction between NusG and S10 using the β-galactosidase assay (Fig. C.4). The NusG-S10 interaction was ~2-fold above background, whereas the interactions between NusG-RNAP (clamp helices) and NusB-S10 were ~5-fold and ~7-fold above background, respectively. This result is consistent with previous reports that measured weak binding between NusG and S10 in vitro (Kd ~50 µM; Burmann et al. 2010). We were unable to detect an interaction between the NusG-CTD alone and S10, although this may be due to poor expression or instability of the α::NusG-CTD fusion in vivo. We conclude that the interaction between full-length NusG and S10 can be reproduced in the bacterial two-hybrid system.
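The fold-above-background values quoted above are ratios of β-galactosidase activities. As a minimal sketch, the code below implements the classic Miller unit formula and a fold-over-background ratio. This is an assumption for illustration: the assays here were run in a plate reader following Nickels (2009), and that protocol may normalize readings differently. The function names and the example readings are hypothetical and are not data from this study.

```python
def miller_units(a420, a550, a600, minutes, volume_ml):
    """Classic Miller formula:
    1000 * (A420 - 1.75 * A550) / (t * v * A600),
    where t is reaction time in minutes and v is culture volume in ml."""
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("time, volume, and A600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


def fold_over_background(sample_units, background_units):
    """Ratio of a sample's activity to that of the no-interaction control."""
    if background_units <= 0:
        raise ValueError("background activity must be positive")
    return sample_units / background_units


if __name__ == "__main__":
    # Hypothetical absorbance readings, chosen only to exercise the functions.
    control = miller_units(a420=0.20, a550=0.01, a600=0.45, minutes=20, volume_ml=0.1)
    sample = miller_units(a420=0.40, a550=0.01, a600=0.45, minutes=20, volume_ml=0.1)
    print(fold_over_background(sample, control))
```

Expressing each interaction relative to the control strains makes values from different fusion pairs directly comparable, which is how the ~2-fold, ~5-fold, and ~7-fold figures are stated in the text.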

A screen for substitutions that disrupt the interaction between NusG and S10

Figure C.2. The bacterial two-hybrid assay. Proteins of interest are fused either to the α subunit of RNAP (e.g., NusG), or to the DNA-binding protein λcI (e.g., S10). Interaction of the proteins fused to α (encoded by rpoA) and λcI recruits RNAP to a weak promoter upstream of the lacZ reporter gene. Expression of lacZ scales with the strength of the interaction between the two fusion proteins. This figure was adapted from Nickels, 2009.

Figure C.3. NusG and S10 interact in the bacterial two-hybrid assay. The bacterial two-hybrid reporter strain was transformed with plasmids that expressed α or λcI alone or as an α::NusG fusion or a λcI::S10 fusion. Only the strain expressing both the α::NusG and λcI::S10 fusions showed above-background production of β-galactosidase (blue colonies on X-gal plates).

Figure C.4. Quantification of bacterial two-hybrid interactions. β-galactosidase activity (in Miller units) was quantified for the two-hybrid reporter strain transformed with the indicated fusions. “RpoC” is a fragment of the β′ subunit of RNAP (residues 262-309) that contains the binding site for NusG (Mooney et al. 2009; Nickels 2009).

The solution structure of the NusG-S10 complex shows several side chain interactions between NusG and S10 that potentially contribute to binding (Fig. C.1; Burmann et al. 2010). NusG residues R167, I164, and E172 interact with S10 residues V98, M88, and S101, respectively. Also, a salt bridge may form between NusG-R167 and S10-D97. We identified substitutions in S10 that specifically disrupt binding to NusG by a three-step process. First, we randomized the rpsJ codons that encode S10 residues M88, D97, V98, and S101 using site-directed mutagenesis. Next, we transformed the two-hybrid reporter strain with the rpsJ randomized libraries and pBR-rpoA::nusG, and screened for colonies in which the interaction between NusG and S10 was lost (white colonies). Finally, we transformed the plasmids that expressed variant S10 proteins that no longer interacted with NusG into the two-hybrid reporter strain with pBR-rpoA::nusB, and screened for colonies in which the interaction between NusB and S10 was retained (blue colonies). The third, counterscreening step was used to eliminate S10 variants that failed to bind to NusG because of non-specific changes in the overall structure of the S10 protein.
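The logic of the screen and counterscreen described above can be sketched as a simple filter: keep a variant only if its NusG signal falls to background and its NusB signal stays near wild type. The screen itself scored colony color on X-gal plates, so the numeric version below is an illustrative analog. The class, the function, the threshold parameters, and the example values are all hypothetical and do not represent measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    nusg_signal: float  # reporter activity with the NusG fusion
    nusb_signal: float  # reporter activity with the NusB fusion


def passes_screen(variant, background, wt_nusb,
                  loss_margin=1.5, retain_fraction=0.75):
    """Return True if the variant loses the NusG interaction (screen)
    but retains the NusB interaction (counterscreen).

    loss_margin and retain_fraction are assumed cutoffs for illustration."""
    lost_nusg = variant.nusg_signal <= background * loss_margin
    kept_nusb = variant.nusb_signal >= wt_nusb * retain_fraction
    return lost_nusg and kept_nusb


if __name__ == "__main__":
    # Made-up values in arbitrary units, used only to show the three outcomes.
    candidates = [
        Variant("specific loss", nusg_signal=100, nusb_signal=800),
        Variant("misfolded", nusg_signal=100, nusb_signal=200),
        Variant("still binds NusG", nusg_signal=400, nusb_signal=900),
    ]
    for v in candidates:
        print(v.name, passes_screen(v, background=100, wt_nusb=1000))
```

Requiring both conditions is what separates substitutions that specifically disrupt the NusG contact from those that simply destabilize S10.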

Substitutions in S10 residues M88, D97, V98, and S101 disrupt the interaction between NusG and S10

We identified the S10 variants M88P, D97V, D97Q, V98K, V98E, S101R, and S101G as substitutions that failed to interact with NusG, but retained interaction with NusB (Figs. C.5 and C.6). We quantified the two-hybrid interactions of variant S10 proteins with NusG and NusB using β-galactosidase assays. We found that S10 variants showed background levels of interaction with NusG, with the exceptions of D97Q and S101R, which were slightly above background. All S10 variants also retained at least 75% of their interaction with NusB, and S10-V98K showed ~1.6-fold greater interaction with NusB than wild-type S10.

Figure C.5. Substitutions in S10 that affect the binding of NusG to S10. β-galactosidase activity (in Miller units) was quantified for the two-hybrid reporter strain transformed with the indicated substitutions. The magenta dashed line represents the assay background. The first bar on the graph is the wild-type level of interaction.

Figure C.6. Substitutions in S10 that eliminate binding to NusG have little effect on the S10-NusB interaction. β-galactosidase activity (in Miller units) was quantified for the two-hybrid reporter strain transformed with the indicated substitutions. The magenta dashed line represents the assay background. The first bar on the graph is the wild-type level of interaction.

In the process of our screen for substitutions that disrupt the interaction between NusG and S10, we unexpectedly identified a darker blue colony that qualitatively appeared to have an increased interaction. Upon quantification of the interaction, we found that the S10 M88R substitution showed ~3-fold increased binding to NusG, and only a slightly higher interaction with NusB (~1.2-fold).

To determine if S10 variants that fail to interact with NusG can provide the essential functions of S10 to E. coli cells, we used λRed-mediated homologous recombination (λRed) to insert mutant rpsJ alleles into the chromosomal rpsJ locus. We succeeded in generating strains encoding the S10 variants D97V, D97Q, V98E, S101G, and S101R as the sole source of S10, as well as the M88R variant that strengthens the interaction between S10 and NusG. All mutants were confirmed by Sanger sequencing. The strains expressing variant S10 proteins had no apparent phenotype when grown on LB + 20 µg/ml chloramphenicol plates at 37 °C. The S10 variants M88P and V98K could not provide the essential function of S10, and could possibly be defective in other ways, such as ribosome binding.


Discussion

Our two-hybrid analysis of the interaction between the transcription elongation factor NusG and ribosomal protein S10 establishes tools for further exploration of the NusG competition model of transcription-translation coupling in vivo. S10 substitutions that disrupt its interaction with NusG, but not with NusB, support the relative binding sites of NusG and NusB on S10 determined by NMR (Burmann et al. 2010) and X-ray crystallography (Luo et al. 2008), respectively. As the 16S ribosomal RNA and NusB bind to the same surface of S10 (Luo et al. 2008; Burmann et al. 2010), it is also unlikely that the substitutions identified in our study would affect S10 association with the ribosome.

Several questions remain about the S10 substitutions identified in our study. For instance, do S10 variants that fail to bind NusG cause defects in NusG recruitment to transcription elongation complexes (ECs), or do the mutations only affect ribosome recruitment to ECs? Monitoring NusG recruitment by ChIP-chip and ribosome recruitment by deep sequencing of ribosome-protected RNAs in cells expressing variant S10 proteins will be helpful in making these distinctions.


Materials and Methods

Strains and plasmids

Strains, plasmids, and primers are listed in Table C.1. All plasmids were constructed by cloning a PCR product containing the gene of interest between the NotI and BamHI sites of either pAC-λcI::rpoC(262-309) or pBR-rpoA::nusG(1-132) using the primers listed in Table C.1. Site-directed mutagenesis was carried out as previously described (Zhang et al. 2010).

Bacterial two-hybrid plate and β-galactosidase assays

Plates for bacterial two-hybrid screening included 100 µg/ml carbenicillin, 50 µg/ml kanamycin, 25 µg/ml chloramphenicol, 100 µg/ml IPTG, 40 µg/ml X-gal, and 75 mg/ml phenylethyl-β-D-thiogalactoside (TPEG; a competitive inhibitor of β-galactosidase that was required to reduce background X-gal hydrolysis). β-galactosidase assays were performed in 96-well plates in an ELx808 plate reader (BioTek Instruments) as previously described (Nickels 2009), except that IPTG was omitted from overnight cultures.

λRed-mediated recombination

λRed-mediated recombination was carried out as previously described (Bubunenko et al. 2007). The cat (CmR) gene was amplified from strain NC397 using primers 8221 and 8222, then amplified a second time using primers that contained mutations in the rpsJ coding sequence, and was recombined downstream of rpsJ as a transcriptional fusion.
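The mutagenic primers in Table C.1 show the altered rpsJ codon in uppercase within an otherwise lowercase sequence. When designing primers of this kind, one quick sanity check is to translate the uppercase triplet with the standard genetic code. The following is a minimal sketch under that assumption (exactly one uppercase triplet per primer); the function names are hypothetical and are not part of the original methods.

```python
import re
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mutant_codon(primer: str) -> str:
    """Return the single uppercase triplet embedded in a lowercase primer."""
    runs = re.findall(r"[ACGT]+", primer)  # uppercase runs only
    if len(runs) != 1 or len(runs[0]) != 3:
        raise ValueError("expected exactly one uppercase codon in the primer")
    return runs[0]


def encoded_residue(primer: str) -> str:
    """Translate the uppercase codon to a one-letter amino acid code."""
    return CODON_TABLE[mutant_codon(primer)]


# Fragments of primers 8223 and 8225 as printed in Table C.1.
fragment_8223 = "ttgatgctctgCCTcgtctggatctg"
fragment_8225 = "ctgccggtgtaGTAgtgcagatcagc"

print(mutant_codon(fragment_8223), encoded_residue(fragment_8223))  # CCT P
print(mutant_codon(fragment_8225), encoded_residue(fragment_8225))  # GTA V
```

Comparing the translated residue with the substitution named in the primer's purpose column gives a simple consistency check before ordering or using a primer.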

Table C.1. Strains, plasmids, and primers used in this study.

Strain | Background | Genotype | Source | Reference
MG1655 (RL1655) | MG1655 | F- λ- rph-1 | F. Blattner | (Blattner et al. 1997)
NC397 (RL1951) | W3110 | lacI′-kan(KanR)-rrnBT1-cat(CmR)-sacB-lacZYA, contains defective λ prophage | D. Court | -
BG455 (RL2134) | CSH100 | F′[two-hybrid test promoter KanR] | A. Hochschild | (Dove and Hochschild 2004)
RL2294 | BG455 | pAC-λcI pBR-rpoA | This study | -
RL2295 | BG455 | pAC-λcI::rpoC(262-309) pBR-rpoA | This study | -
RL2296 | BG455 | pAC-λcI pBR-rpoA::nusG | This study | -
RL2297 | BG455 | pAC-λcI::rpoC(262-309) pBR-rpoA | This study | -
RL2298 | BG455 | pAC-λcI::rpsJ pBR-rpoA | This study | -
RL2299 | BG455 | pAC-λcI::rpsJ pBR-rpoA | This study | -
RL2300 | BG455 | pAC-λcI::rpsJ pBR-rpoA::nusB | This study | -
RL2301 | BG455 | pAC-λcI pBR-rpoA::nusB | This study | -
RL2794 | BG455 | pAC-λcI::rpsJ(M88P) pBR-rpoA::nusG | This study | -
RL2795 | BG455 | pAC-λcI::rpsJ(M88R) pBR-rpoA::nusG | This study | -
RL2796 | BG455 | pAC-λcI::rpsJ(D97V) pBR-rpoA::nusG | This study | -
RL2797 | BG455 | pAC-λcI::rpsJ(D97Q) pBR-rpoA::nusG | This study | -
RL2798 | BG455 | pAC-λcI::rpsJ(V98K) pBR-rpoA::nusG | This study | -
RL2799 | BG455 | pAC-λcI::rpsJ(V98E) pBR-rpoA::nusG | This study | -
RL2800 | BG455 | pAC-λcI::rpsJ(S101R) pBR-rpoA::nusG | This study | -
RL2801 | BG455 | pAC-λcI::rpsJ(S101G) pBR-rpoA::nusG | This study | -
RL2802 | BG455 | pAC-λcI::rpsJ(M88P) pBR-rpoA::nusB | This study | -
RL2803 | BG455 | pAC-λcI::rpsJ(M88R) pBR-rpoA::nusB | This study | -
RL2804 | BG455 | pAC-λcI::rpsJ(D97V) pBR-rpoA::nusB | This study | -
RL2805 | BG455 | pAC-λcI::rpsJ(D97Q) pBR-rpoA::nusB | This study | -
RL2806 | BG455 | pAC-λcI::rpsJ(V98K) pBR-rpoA::nusB | This study | -
RL2807 | BG455 | pAC-λcI::rpsJ(V98E) pBR-rpoA::nusB | This study | -
RL2808 | BG455 | pAC-λcI::rpsJ(S101R) pBR-rpoA::nusB | This study | -
RL2809 | BG455 | pAC-λcI::rpsJ(S101G) pBR-rpoA::nusB | This study | -
RL2834 | MG1655 | rpsJ+-cat | This study | -
RL2835 | MG1655 | rpsJ(M88R)-cat | This study | -
RL2836 | MG1655 | rpsJ(D97V)-cat | This study | -
RL2837 | MG1655 | rpsJ(D97Q)-cat | This study | -
RL2838 | MG1655 | rpsJ(V98E)-cat | This study | -
RL2839 | MG1655 | rpsJ(S101G)-cat | This study | -
RL2840 | MG1655 | rpsJ(S101R)-cat | This study | -

Plasmid (stock #) | Description | Source | Reference
pAC-λcI (4845) | bacterial two-hybrid plasmid | B. Nickels | (Nickels 2009)
pBR-rpoA (4846) | bacterial two-hybrid plasmid | B. Nickels | (Nickels 2009)
pAC-λcI::rpoC(262-309) (4847) | bacterial two-hybrid plasmid | B. Nickels | (Nickels 2009)
pBR-rpoA::nusG(1-132) (4848) | bacterial two-hybrid plasmid | B. Nickels | (Nickels 2009)
pBR-rpoA::nusG (4849) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ (4850) | bacterial two-hybrid plasmid | This study | -
pBR-rpoA::nusB (4851) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(M88P) (4852) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(M88R) (4853) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(D97V) (4854) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(D97Q) (4855) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(V98K) (4856) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(V98E) (4857) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(S101R) (4858) | bacterial two-hybrid plasmid | This study | -
pAC-λcI::rpsJ(S101G) (4859) | bacterial two-hybrid plasmid | This study | -

Primer | Sequence (5′-3′) | Purpose
6508 | tatatgcggccgcaATGTCTGAAGCTCCTAAAAA | PCR amplify nusG for cloning
6509 | tatataGGATCCtcattactaGGCTTTTTCAACCTGGCTGA | PCR amplify nusG for cloning
7018 | CGCAGCGGCCGCAgtgaaacctgctgctcgtcg | PCR amplify nusB for cloning
7019 | GTAAGGATCCTTAtcactttttgttagggcgaa | PCR amplify nusB for cloning
7020 | CGCAGCGGCCGCAatgcagaaccaaagaatccg | PCR amplify rpsJ for cloning
7021 | GTAAGGATCCTTAttaacccaggctgatctgca | PCR amplify rpsJ for cloning
7545 | aaaaccgttgatgctctgNNNcgtctggatctggctgcc | randomize rpsJ codon 88
7546 | gatctggctgccggtgtaNNNgtgcagatcagcctgggt | randomize rpsJ codon 97
7547 | ctggctgccggtgtagacNNNcagatcagcctgggttaa | randomize rpsJ codon 98
7548 | ggtgtagacgtgcagatcNNNctgggttaataaggatcc | randomize rpsJ codon 101
8221 | tctggctgccggtgtagacgtgcagatcagcctgggttaatcaggagctaaggaagctaa | for recombination of cat gene downstream of rpsJ
8222 | ccaatcattgtttcaacctctcaatcgctcaatgacctgattacgccccgccctgccact | for recombination of cat gene downstream of rpsJ
8223 | catcgttgagccaaccgagaaaaccgttgatgctctgCCTcgtctggatctggctgccggtgtagacg | amplify cat gene with rpsJ(M88P) mutation for recombination
8224 | catcgttgagccaaccgagaaaaccgttgatgctctgCGGcgtctggatctggctgccggtgtagacg | amplify cat gene with rpsJ(M88R) mutation for recombination
8225 | tgatgctctgatgcgtctggatctggctgccggtgtaGTAgtgcagatcagcctgggtta | amplify cat gene with rpsJ(D97V) mutation for recombination
8226 | tgatgctctgatgcgtctggatctggctgccggtgtaCAGgtgcagatcagcctgggtta | amplify cat gene with rpsJ(D97Q) mutation for recombination
8227 | tgctctgatgcgtctggatctggctgccggtgtagacAAGcagatcagcctgggttaatc | amplify cat gene with rpsJ(V98K) mutation for recombination
8228 | tgctctgatgcgtctggatctggctgccggtgtagacGAAcagatcagcctgggttaatc | amplify cat gene with rpsJ(V98E) mutation for recombination
8229 | gcgtctggatctggctgccggtgtagacgtgcagatcGGGctgggttaatcaggagctaa | amplify cat gene with rpsJ(S101G) mutation for recombination
8230 | gcgtctggatctggctgccggtgtagacgtgcagatcAGActgggttaatcaggagctaa | amplify cat gene with rpsJ(S101R) mutation for recombination
8055 | TCGGGGAGCTACGTAAGAAC | amplify/sequence rpsJ alleles on chromosome
8056 | TCAACCTCTCAATCGCTCAA | amplify/sequence rpsJ alleles on chromosome

Acknowledgements

We thank Bryce Nickels (Rutgers), Ann Hochschild (Harvard), and Amy Banta (UW-Madison) for strains and plasmids.

References

Adhya S, Gottesman M, De Crombrugghe B. 1974. Termination and antitermination in transcription: control of gene expression. Basic Life Sci 3: 213-221.

Burmann BM, Schweimer K, Luo X, Wahl MC, Stitt BL, Gottesman ME, Rosch P. 2010. A NusE:NusG complex links transcription and translation. Science 328: 501-504.

Chalissery J, Muteeb G, Kalarickal NC, Mohan S, Jisha V, Sen R. 2011. Interaction surface of the transcription terminator Rho required to form a complex with the C-terminal domain of the antiterminator NusG. J Mol Biol 405: 49-64.

Dove SL, Hochschild A. 2004. A bacterial two-hybrid system based on transcription activation. Methods Mol Biol 261: 231-246.

Landick R, Carey J, Yanofsky C. 1985. Translation activates the paused transcription complex and restores transcription of the trp operon leader region. Proc Natl Acad Sci U S A 82: 4663-4667.

Luo X, Hsiao HH, Bubunenko M, Weber G, Court DL, Gottesman ME, Urlaub H, Wahl MC. 2008. Structural and functional analysis of the E. coli NusB-S10 transcription antitermination complex. Mol Cell 32: 791-802.

Mooney RA, Schweimer K, Rosch P, Gottesman M, Landick R. 2009. Two structurally independent domains of E. coli NusG create regulatory plasticity via distinct interactions with RNA polymerase and regulators. J Mol Biol 391: 341-358.

Nickels BE. 2009. Genetic assays to define and characterize protein-protein interactions involved in gene regulation. Methods 47: 53-62.

Proshkin S, Rahmouni AR, Mironov A, Nudler E. 2010. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science 328: 504-508.

Zhang J, Palangat M, Landick R. 2010. Role of the RNA polymerase trigger loop in catalysis and pausing. Nat Struct Mol Biol 17: 99-104.

Appendix D

Generating strains for use in genome-scale phenotypic mapping to investigate transcription

Anthony Shiver (UCSF) and I performed all of the experiments shown in this section.

Introduction

Genome-scale identification of mutant phenotypes (i.e., phenotypic mapping) is a rapid approach to determine gene function. There are two major methods for phenotype mapping in Escherichia coli: (i) chemical genomics, in which libraries of deletion mutants are subjected to a panel of stress conditions (e.g., antibiotics, bile salts, and EDTA/SDS; Nichols et al. 2011), and (ii) epistasis maps (eMAPs, a.k.a. GIANT-coli; Typas et al. 2008), in which two differentially marked deletion libraries are mated to form double mutants. In both cases, deletion mutant libraries are arrayed in multi-well plates and are manipulated robotically to increase throughput. The power of phenotypic mapping comes from its scale. Genes that have highly correlated “phenotypic signatures” across hundreds of conditions are likely to have similar functions (Typas et al. 2010).
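To make the idea of correlated phenotypic signatures concrete, the sketch below computes a Pearson correlation between the per-condition fitness scores of two mutants. The mutant names and scores are invented purely for illustration; real datasets span hundreds of conditions and involve processing steps that are not shown here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signatures must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical fitness scores across five stress conditions
# (negative = sick under that condition, near zero = no effect).
signatures = {
    "mutantA": [-2.0, 0.1, -1.5, 0.3, 0.0],
    "mutantB": [-1.8, 0.0, -1.2, 0.4, 0.1],
    "mutantC": [0.2, -1.9, 0.1, 0.0, -1.4],
}

r_ab = pearson(signatures["mutantA"], signatures["mutantB"])
r_ac = pearson(signatures["mutantA"], signatures["mutantC"])
print(f"A vs B: r = {r_ab:.2f}")
print(f"A vs C: r = {r_ac:.2f}")
```

In this toy example, mutants A and B respond to the same conditions and so would be candidates for a shared function, whereas mutant C responds to a different set of conditions.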

Previous phenotypic mapping experiments largely relied on an E. coli deletion mutant library (the Keio collection; Baba et al. 2008); only a few point mutants of essential genes were included in the analysis. The E. coli transcription machinery is encoded by several essential genes (Bubunenko et al. 2007) that were not represented in previous phenotypic mapping studies (Nichols et al. 2011). Genes that encode parts of the transcription apparatus make attractive targets for phenotypic analysis because they have been characterized extensively through biochemistry and because numerous mutations exist for each gene. Several important questions about the process of transcription and how it relates to other cellular functions can be answered through phenotypic mapping. What are the consequences to cell physiology of mutations in certain essential transcription factors? Do mutations that spatially cluster on the surface of RNA polymerase (RNAP) lead to similar phenotypes? Finally, do portions of RNAP that are not evolutionarily conserved have specific roles in cell physiology?

Results

Generation of mutant strains for phenotype mapping

We chose mutations for phenotype mapping based on three criteria: (1) the mutation was in a gene encoding a component of the transcription apparatus; (2) the mutation was previously known (either described in the literature or identified in directed evolution experiments at the Great Lakes Bioenergy Research Center); and (3) the mutation conferred an “interesting” phenotype (e.g., ethanol tolerance or transcription termination defects). If possible, we also selected mutations that would result in substitutions in different domains of multidomain proteins. Based on these criteria, we identified mutations in rho, nusA, rpoD (encoding σ70), and rpoB (encoding the β subunit of RNAP) to include in phenotype mapping experiments (Table D.1).

To generate strains compatible with phenotypic mapping technology, we used λRed-mediated homologous recombination (Datta et al. 2006) to insert linked antibiotic resistance markers immediately downstream of genes containing the desired mutations. Each mutated gene was marked with kan (kanamycin resistance) or cat (chloramphenicol resistance) to provide independent replicate strains, and to facilitate double mutant generation by mating (with the exception of rpoB mutations, which were only marked with cat). The nusA gene is found in an operon upstream of other essential genes (i.e., infB, encoding translation initiation factor IF2; Plumbridge et al. 1982). To avoid problems with transcriptional polarity due to insertion of an antibiotic marker, we inserted kan and cat as transcriptional fusions downstream of nusA so that the antibiotic-resistance genes would be transcribed and translated as members of the nusA operon. Attempts to insert a kan marker flanked by a dedicated promoter and Flippase Recognition Target sites (FRT sites) downstream of nusA resulted in duplications of the nusA locus, indicating that perturbing transcription through the nusA operon was lethal.

Table D.1. Mutations selected for phenotype mapping

Allele | Variant protein | Phenotype | Reference
rho(L270M) | Rho(L270M) | ethanol tolerant | M. Tremaine (unpublished)
rho4 | Rho(A243E) | termination deficient | (Morse and Guertin 1972)
rho(s-82) | Rho(L3F, D156N, T323I) | hyperactive termination | (Tsurushita et al. 1989)
rho115 | Rho(G99V) | termination deficient | (Sharp et al. 1986)
nusA(R258G) | NusA(R258G) | ethanol tolerant | M. Tremaine (unpublished)
nusAts11 | NusA(A181T) | temperature sensitive | (Nakamura et al. 1986)
rpoD(G106V) | σ70(G106V) | found in ethanol tolerant strain | M. Tremaine (unpublished)
rpoB(L184R) | β(L184R) | found in ethanol tolerant strain | M. Tremaine (unpublished)
rpoB(I112S) | β(I112S) | found in ethanol tolerant strain | M. Tremaine (unpublished)

Discussion

We have developed a straightforward method for generating mutant strains that are compatible with phenotypic analysis. Our method first generates an antibiotic-marked allele of a gene of interest using λRed-mediated recombination (Datta et al. 2006). The marked mutation is then moved into an isogenic strain background by P1vir-mediated transduction (Thomason et al. 2007), which can be performed robotically to increase throughput (Anthony Shiver, personal communication). Our strategy for marking strains can be expanded to essential genes outside of the transcription apparatus, or even to other organisms in which λRed-mediated recombination and transduction are functional (e.g., Salmonella enterica; Datta et al. 2006).

Materials and Methods

Strains and plasmids

Strains, plasmids, and primers are listed in Table D.2.

λRed-mediated recombination

λRed-mediated recombination was carried out as previously described (Bubunenko et al. 2007). PCR products to insert markers downstream of rpoD and rpoB were provided by Anthony Shiver (UCSF).
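The recombination primers listed in Table D.2 are written in mixed case. The uppercase portion appears to be the chromosomal homology arm (ending in the stop codon of the target gene), and the lowercase portion appears to prime on the antibiotic-resistance cassette. That reading is an inference from the table and is not stated in the text. Under that assumption, a primer can be split into its two parts and inspected as sketched below; the helper name is hypothetical.

```python
import re


def split_primer(primer: str):
    """Split a primer into (uppercase 5' part, lowercase 3' part).

    Assumes the primer is an uppercase run followed by a lowercase run,
    as in the forward recombination primers of Table D.2.
    """
    match = re.fullmatch(r"([ACGT]+)([acgt]+)", primer)
    if match is None:
        raise ValueError("primer is not of the form UPPERCASE + lowercase")
    return match.group(1), match.group(2)


# Primer 7637 as printed in Table D.2 (cat, downstream of nusA).
primer_7637 = (
    "TGATTATGGCTGCCCGTAATATTTGC"
    "TGGTTCGGTGACGAAGCGTAAtaatc"
    "aggagctaaggaagctaa"
)

homology_arm, priming_region = split_primer(primer_7637)
print("homology arm length:", len(homology_arm))
print("priming region length:", len(priming_region))
print("last codon of arm:", homology_arm[-3:])
```

Keeping the two parts separate makes it easy to reuse one priming region with different homology arms when the same marker is targeted downstream of another gene.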

Table D.2. Strains, plasmids, and primers used in this study.

Strain | Background | Genotype | Source | Reference
MG1655 | MG1655 | F- λ- rph-1 | F. Blattner | (Blattner et al. 1997)
RSW415 (RL2533) | MG1655 | Tn10(TetR) linked to rpoC nusAts11 | R. Washburn | -
MTA195 (RL2761) | MG1655 | rpoB(L184R), numerous additional mutations | M. Tremaine | -
MTA200 (RL2760) | MG1655 | rpoB(I112S), numerous additional mutations |  | -
MTA207 (RL2759) | MG1655 | rpoD(G106V), numerous additional mutations |  | -
RL2599 | MG1655 | nusA+-cat | This study | -
RL2600 | MG1655 | nusA+-kan | This study | -
RL2603 | MG1655 | rho+-cat | This study | -
RL2604 | MG1655 | rho+-kan | This study | -
RL2706 | MG1655 | rho(L270M)-cat | This study | -
RL2707 | MG1655 | rho(L270M)-kan | This study | -
RL2708 | MG1655 | nusA(R258G)-cat | This study | -
RL2709 | MG1655 | nusA(R258G)-kan | This study | -
RL2710 | MG1655 | rho(L345F)-cat | This study | -
RL2711 | MG1655 | rho(L345F)-kan | This study | -
RL2712 | MG1655 | rho(A243E)-cat | This study | -
RL2713 | MG1655 | rho(A243E)-kan | This study | -
RL2714 | MG1655 | rho(L3F, D156N, T323I)-cat | This study | -
RL2715 | MG1655 | rho(L3F, D156N, T323I)-kan | This study | -
RL2716 | MG1655 | rho(G99V)-cat | This study | -
RL2717 | MG1655 | rho(G99V)-kan | This study | -
RL2718 | RSW415 | nusAts11-cat | This study | -
RL2719 | RSW415 | nusAts11-kan | This study | -
RL2768 | MTA207 | rpoD(G106V)-cat | This study | -
RL2769 | MTA207 | rpoD(G106V)-kan | This study | -
RL2770 | MTA195 | rpoB(L184R)-cat | This study | -
RL2771 | MTA200 | rpoB(I112S)-cat | This study | -

Plasmid | Description | Source | Reference
pSIM6 | λRed recombination plasmid | D. Court | (Datta et al. 2006)

Primer | Sequence (5′-3′) | Purpose
7637 | TGATTATGGCTGCCCGTAATATTTGCTGGTTCGGTGACGAAGCGTAAtaatcaggagctaaggaagctaa | PCR amplify cat to recombine downstream of nusA
7638 | AGCGTTTTAATCGTTACATCTGTCATGCTGTTCCTTCCTGCTACAGTTTAttacgccccgccctgccact | PCR amplify cat to recombine downstream of nusA
7339 | TGATTATGGCTGCCCGTAATATTTGCTGGTTCGGTGACGAAGCGTAAtaaacaggatgaggatcgtttcg | PCR amplify kan to recombine downstream of nusA
7340 | AGCGTTTTAATCGTTACATCTGTCATGCTGTTCCTTCCTGCTACAGTTTAttagaagaactcgtcaagaa | PCR amplify kan to recombine downstream of nusA
7341 | TGACCAAGACCAATGACGATTTCTTCGAAATGATGAAACGCTCATAAtaatcaggagctaaggaagctaa | PCR amplify cat to recombine downstream of rho
7342 | AAGCAAAACGCCACGTAAACACGTGGCGTTTTTGGCATAAGACAAATTTAttacgccccgccctgccact | PCR amplify cat to recombine downstream of rho
7343 | TGACCAAGACCAATGACGATTTCTTCGAAATGATGAAACGCTCATAAtaaacaggatgaggatcgtttc | PCR amplify kan to recombine downstream of rho
7344 | AAGCAAAACGCCACGTAAACACGTGGCGTTTTTGGCATAAGACAAATTTAttagaagaactcgtcaagaa | PCR amplify kan to recombine downstream of rho

Acknowledgements

We thank Mary Tremaine and David Keating for strains, and Carol Gross for helpful discussions.

References

Baba T, Huang HC, Datsenko K, Wanner BL, Mori H. 2008. The applications of systematic in-frame, single-gene knockout mutant collection of Escherichia coli K-12. Methods Mol Biol 416: 183-194.

Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453-1462.

Bubunenko M, Baker T, Court DL. 2007. Essentiality of ribosomal and transcription antitermination proteins analyzed by systematic gene replacement in Escherichia coli. J Bacteriol 189: 2844-2853.

Datta S, Costantino N, Court DL. 2006. A set of recombineering plasmids for gram-negative bacteria. Gene 379: 109-115.

Morse DE, Guertin M. 1972. Amber suA mutations which relieve polarity. J Mol Biol 63: 605-608.

Nakamura Y, Mizusawa S, Court DL, Tsugawa A. 1986. Regulatory defects of a conditionally lethal nusAts mutant of Escherichia coli. Positive and negative modulator roles of NusA protein in vivo. J Mol Biol 189: 103-111.

Nichols RJ, Sen S, Choo YJ, Beltrao P, Zietek M, Chaba R, Lee S, Kazmierczak KM, Lee KJ, Wong A et al. 2011. Phenotypic landscape of a bacterial cell. Cell 144: 143-156.

Plumbridge JA, Howe JG, Springer M, Touati-Schwartz D, Hershey JW, Grunberg-Manago M. 1982. Cloning and mapping of a gene for translational initiation factor IF2 in Escherichia coli. Proc Natl Acad Sci U S A 79: 5033-5037.

Sharp JA, Guterman SK, Platt T. 1986. The rho-115 mutation in transcription termination factor rho affects its primary polynucleotide binding site. J Biol Chem 261: 2524-2528.

Thomason LC, Costantino N, Court DL. 2007. E. coli genome manipulation by P1 transduction. Curr Protoc Mol Biol Chapter 1: Unit 1 17.

Tsurushita N, Shigesada K, Imai M. 1989. Mutant rho factors with increased transcription termination activities. I. Functional correlations of the primary and secondary polynucleotide binding sites with the efficiency and site-selectivity of rho-dependent termination. J Mol Biol 210: 23-37.

Typas A, Banzhaf M, van den Berg van Saparoea B, Verheul J, Biboy J, Nichols RJ, Zietek M, Beilharz K, Kannenberg K, von Rechenberg M et al. 2010. Regulation of peptidoglycan synthesis by outer-membrane proteins. Cell 143: 1097-1109.

Typas A, Nichols RJ, Siegele DA, Shales M, Collins SR, Lim B, Braberg H, Yamamoto N, Takeuchi R, Wanner BL et al. 2008. High-throughput, quantitative analyses of genetic interactions in E. coli. Nat Methods 5: 781-787.

Appendix E

Investigating nusA essentiality in Escherichia coli

I performed all of the experiments shown in this section.

Introduction

The gene encoding the transcription elongation factor NusA is essential in wild-type Escherichia coli K-12 (MG1655 and W3110 strains). Several temperature-sensitive alleles of nusA have been isolated that are inviable either at 42 °C (Nakamura et al. 1986; Tsugawa et al. 1988) or 30 °C (Craven and Friedman 1991). Baba et al. (2006) failed to isolate a kanamycin-marked deletion of nusA during construction of an E. coli ORF deletion library. Systematic analysis of the essentiality of nusA and other transcription-related proteins showed that attempts to replace nusA with antibiotic resistance markers resulted in duplications of the nusA locus (Bubunenko et al. 2007). However, nusA could be deleted from the chromosome when nusA+ was expressed from a plasmid in trans, confirming that duplications did not result from polar effects on downstream essential genes in the nusA operon (Bubunenko et al. 2007). These experiments conclusively demonstrated that nusA is essential in wild-type E. coli.

Mutant strains that suppress partial loss of nusA function have been isolated. An undefined strain background carrying a loss-of-function mutation in rho [rho(E134D)] and at least one additional unknown mutation can suppress lethality caused by ΔnusA533::cam (called ΔnusA* throughout), which is thought to be a null mutation of nusA (Zheng and Friedman 1994). The rho mutation alone is insufficient to allow growth of ΔnusA* in a wild-type strain background (Cardinale et al. 2008; Max E. Gottesman, personal communication; Jason M. Peters, unpublished). ΔnusA* can be the sole source of nusA in the reduced genome strain MDS42 (Cardinale et al. 2008). Unfortunately, MDS42 contains 42 large deletions (~15% of the genome; Posfai et al. 2006), and the suppressor(s) of ΔnusA* was not mapped. Because ΔnusA* is presumed to be a null mutation, nusA is thought to be nonessential in MDS42 (Cardinale et al. 2008). However, my results demonstrate that nusA is essential in MDS42.

Results

A partial deletion of nusA (ΔnusA*) is not a null allele

Based on the description by Zheng and Friedman (1994), large portions of the nusA coding sequence remain intact in ΔnusA*. I used PCR followed by Sanger sequencing to determine the exact sequence of ΔnusA*. I found that the DNA encoding the entire NusA-NTD (residues 1-128) and NusA-CTD (residues 305-495) was present in ΔnusA*. If translated, the NusA-NTD would also have an extension of nine residues (PHISPKSAT) from sequences that were part of the chloramphenicol-resistance marker (CmR) insertion. I conclude that a large part of the nusA coding sequence is present in ΔnusA*.
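As a rough check on how much of NusA could still be encoded by ΔnusA*, the residue ranges reported above can be tallied against the full-length protein. This back-of-the-envelope sketch assumes a 495-residue NusA (taken from the NusA-CTD range ending at residue 495) and ignores the nine-residue extension; the variable names are illustrative.

```python
FULL_LENGTH = 495  # assumed full-length NusA, from the CTD range 305-495

# Residue ranges (inclusive) whose coding DNA was found in the ΔnusA* allele.
retained_domains = {
    "NusA-NTD": (1, 128),
    "NusA-CTD": (305, 495),
}


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


retained_residues = sum(span_length(*r) for r in retained_domains.values())
fraction_retained = retained_residues / FULL_LENGTH

print(f"{retained_residues} of {FULL_LENGTH} residues "
      f"({fraction_retained:.0%}) remain encoded")
```

Under these assumptions, roughly two-thirds of the NusA coding capacity is retained, which is consistent with the conclusion that ΔnusA* is unlikely to behave as a simple null allele.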

The NusA-NTD and NusA-CTD can bind to RNAP as isolated domains in vitro (Mah et al. 2000). Also, the NusA-NTD alone can enhance hairpin-dependent pausing and intrinsic termination in vitro (Ha et al. 2010). Because ΔnusA* may encode both the NusA-NTD and NusA-CTD, I hypothesized that ΔnusA* would be phenotypically distinct from a complete nusA deletion. To test this hypothesis, I attempted to construct a complete deletion of nusA in MDS42 by exactly replacing the nusA coding sequence with the kan gene, which confers resistance to kanamycin (KanR), using λRed-mediated homologous recombination (λRed). I could only obtain kanamycin-resistant colonies if nusA+ was expressed in trans from a multi-copy plasmid, suggesting that nusA is an essential gene in MDS42. Typically, targeting essential genes for deletion using λRed results in a partial diploid strain that contains both a wild-type and deleted copy of the targeted gene (e.g., nusA+/ΔnusA::kan). However, the efficiency of λRed in MDS42 was reduced compared to MG1655 for unknown reasons (data not shown). I improved λRed efficiency by increasing the length of homology to the chromosome on the ΔnusA::kan PCR product from 40 bp to >500 bp. With this longer product, I isolated KanR colonies, all of which had a duplication of the nusA locus (Fig. E.1). In contrast to the report of Cardinale et al. (2008), I conclude that nusA is an essential gene in MDS42.
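The scoring of recombinants by PCR with flanking primers can be summarized as a simple decision rule: a true replacement is expected to yield only the deletion-sized product, whereas a duplication (partial diploid) is expected to yield both the wild-type-sized and the deletion-sized products. The sketch below encodes that rule. The product sizes and the tolerance are placeholders chosen for illustration, not values from this study, and the function name is hypothetical.

```python
def classify_recombinant(bands_bp, wt_bp, deletion_bp, tolerance_bp=100):
    """Classify a recombinant from the PCR band sizes seen on a gel.

    bands_bp     -- observed product sizes (bp) for one colony
    wt_bp        -- expected size for the wild-type locus
    deletion_bp  -- expected size for the marker-replaced locus
    tolerance_bp -- allowed sizing error when matching a band
    """
    has_wt = any(abs(b - wt_bp) <= tolerance_bp for b in bands_bp)
    has_deletion = any(abs(b - deletion_bp) <= tolerance_bp for b in bands_bp)
    if has_wt and has_deletion:
        return "duplication"   # both alleles present (partial diploid)
    if has_deletion:
        return "replacement"   # only the marked deletion allele
    if has_wt:
        return "wild type"     # no recombination at the locus
    return "unclassified"


# Placeholder expected sizes (hypothetical, for illustration only).
WT_BP, DELETION_BP = 2000, 1400

print(classify_recombinant([2010, 1390], WT_BP, DELETION_BP))
print(classify_recombinant([1405], WT_BP, DELETION_BP))
```

The rule only works when the two expected sizes differ by more than twice the tolerance, so that a single band cannot match both alleles.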

Expression of the NusA-NTD alone is insufficient for MDS42 viability

The NusA-NTD shows significant activity in vitro and thus was the most likely cause for the difference in viability between ΔnusA* and a complete nusA deletion. To test this possibility, I constructed four nusA alleles marked with cat (CmR) in the presence of a plasmid expressing nusA+: nusA+-cat (a transcriptional fusion of cat downstream of wild-type nusA), ΔnusA*, nusA(Δ138-495)-cat (an allele that can only express the NusA-NTD), and ΔnusA::cat (a complete deletion). I amplified these alleles with 500 bp of flanking homology using PCR and attempted to recombine them into MDS42. CmR colonies were obtained for all four alleles (Table E.1); however, all nusA(Δ138-495)-cat and ΔnusA::cat colonies tested (10/10 for each allele) contained duplications of the nusA locus. Various attempts to delete nusA while expressing the NusA-NTD in trans also resulted in duplications or the absence of colonies (data not shown; Robert S. Washburn, personal communication). I conclude that the NusA-NTD alone is insufficient for viability in MDS42.

Figure E.1. PCR analysis of nusA recombinants. Agarose gel electrophoresis of PCR products generated from primers that anneal to chromosomal sequences that flank the nusA gene. Lanes 1-3 are size markers for various nusA alleles. Lanes 4-7 are representative PCR products from KanR recombinants. Lanes 8 and 9 are 100 bp and 1 kb DNA ladders, respectively (New England Biolabs).

Table E.1. Viability of nusA alleles in MDS42

Allele | # of CmR recombinants with duplications of nusA | Strain viable with allele as sole source of nusA?
nusA+-cat | 0/10 | yes
ΔnusA* | 0/10 | yes
nusA(Δ138-495)-cat | 10/10 | no
ΔnusA::cat | 10/10 | no

Unexpected properties of the rho(E134D) ΔnusA* strain

I chose to study nusA essentiality in MDS42, rather than in the rho(E134D) ΔnusA* strain described by Zheng and Friedman (1994), because the genotype of MDS42 had been determined by genome sequencing. As mentioned in the Introduction, the rho(E134D) ΔnusA* strain must have additional mutations that explain the viability of ΔnusA*. Zheng and Friedman reported that the ΔnusA* mutation causes auxotrophy (Zheng and Friedman 1994). However, MDS42 ΔnusA* is prototrophic, and the rho(E134D) ΔnusA* strain could be transduced to prototrophy while still containing the ΔnusA* allele. Also, the rho(E134D) ΔnusA* strain may harbor foreign genetic elements, such as a Φ80 prophage (Robert S. Washburn and Max E. Gottesman, personal communication). Finally, the strain is mildly KanR for reasons that are not clear from its published genotype. I am currently sequencing the genome of rho(E134D) ΔnusA* in hopes of identifying additional suppressor mutations of ΔnusA*.

Discussion

My genetic analysis of nusA essentiality reveals that nusA is an essential gene in MDS42, contrary to previous reports (Cardinale et al. 2008). The ΔnusA* allele, which potentially encodes both the NusA-NTD and NusA-CTD, can be the sole source of nusA in MDS42 for unknown reasons. Expression of the NusA-NTD alone was insufficient for viability in MDS42, suggesting either that the NTD is unstable when expressed alone, or that the NusA-CTD may also have some activity. Alternatively, the nine-residue extension onto the NusA-NTD encoded by ΔnusA* may stabilize the NusA-NTD against degradation by proteases, allowing it to provide the essential function of NusA. Further experiments producing different fragments of NusA in trans will be needed to determine which portions of NusA are required for viability in MDS42.

Materials and Methods

Strains and plasmids

Strains, plasmids, and primers are listed in Table E.2.

λRed-mediated recombination

λRed-mediated recombination was carried out as previously described (Bubunenko et al. 2007).

Table E.2. Strains, plasmids, and primers used in this study.

Strain | Background | Genotype | Source | Reference
MG1655 (RL1655) | MG1655 | F- λ- rph-1 | F. Blattner | (Blattner et al. 1997)
MDS42 (RL1961) | MG1655 | 42 deletions | F. Blattner | (Posfai et al. 2006)
K9619 (RL1567) | unknown | rho(E134D) ΔnusA* | D. Friedman | (Zheng and Friedman 1994)
RL1962 | MDS42 | 42 deletions ΔnusA* | M. Gottesman | (Cardinale et al. 2008)

Plasmid | Description | Source | Reference
pSIM5 | λRed recombination plasmid | D. Court | (Datta et al. 2006)
pNG1 | pTrc-nusA+ NusA expression plasmid | - | (Ha et al. 2010)

Primer | Sequence (5′-3′) | Purpose
6969 | tccccacttttaatagtctggatgaggtgaaaagcccgcgATGATTGAACAAGATGGATT | PCR amplify kan for nusA deletion
6970 | taatcgttacatctgtcatgctgttccttcctgctacagtTTATTAGAAGAACTCGTCAA | PCR amplify kan for nusA deletion
7306 | TTCGCGCTGAGTAATATCCA | PCR amplify nusA from flanking chromosomal sequence
7307 | AATTGCTGTACCAGGCGTTC | PCR amplify nusA from flanking chromosomal sequence
7304 | AGAATTTGCCACGTTTCAGG | PCR amplify nusA with >500 bp flanking homology
7305 | GTTCGCCAACGGTGATAGTT | PCR amplify nusA with >500 bp flanking homology

Acknowledgements

I thank Max Gottesman (Columbia University) and Robert Washburn (Columbia University) for strains and helpful discussions.

References

Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006 0008.

Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453-1462.

Bubunenko M, Baker T, Court DL. 2007. Essentiality of ribosomal and transcription antitermination proteins analyzed by systematic gene replacement in Escherichia coli. J Bacteriol 189: 2844-2853.

Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E. 2008. Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science 320: 935-938.

Craven MG, Friedman DI. 1991. Analysis of the Escherichia coli nusA10(Cs) allele: relating nucleotide changes to phenotypes. J Bacteriol 173: 1485-1491.

Datta S, Costantino N, Court DL. 2006. A set of recombineering plasmids for gram-negative bacteria. Gene 379: 109-115.

Ha KS, Toulokhonov I, Vassylyev DG, Landick R. 2010. The NusA N-terminal domain is necessary and sufficient for enhancement of transcriptional pausing via interaction with the RNA exit channel of RNA polymerase. J Mol Biol 401: 708-725.

Mah TF, Kuznedelov K, Mushegian A, Severinov K, Greenblatt J. 2000. The alpha subunit of E. coli RNA polymerase activates RNA binding by NusA. Genes Dev 14: 2664-2675.

Nakamura Y, Mizusawa S, Court DL, Tsugawa A. 1986. Regulatory defects of a conditionally lethal nusA Ts mutant of Escherichia coli. Positive and negative modulator roles of NusA protein in vivo. J Mol Biol 189: 103-111.

Posfai G, Plunkett G, 3rd, Feher T, Frisch D, Keil GM, Umenhoffer K, Kolisnychenko V, Stahl B, Sharma SS, de Arruda M et al. 2006. Emergent properties of reduced-genome Escherichia coli. Science 312: 1044-1046.

Tsugawa A, Saito M, Court DL, Nakamura Y. 1988. nusA amber mutation that causes temperature-sensitive growth of Escherichia coli. J Bacteriol 170: 908-915.

Zheng C, Friedman DI. 1994. Reduced Rho-dependent transcription termination permits NusA-independent growth of Escherichia coli. Proc Natl Acad Sci U S A 91: 7543-7547.