CHARACTERIZATION OF THE metK AND yitJ LEADER RNAs FROM THE Bacillus subtilis S BOX REGULON

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

in the Graduate School of The Ohio State University

By

Vineeta A. Pradhan, B.S.

Graduate Program in Microbiology

The Ohio State University

2012

Dissertation Committee:

Professor Tina M. Henkin, Advisor

Professor Charles J. Daniels

Professor Kurt L. Fredrick

Professor Joseph A. Krzycki

Copyright by

Vineeta A. Pradhan

2012

ABSTRACT

A variety of mechanisms that regulate gene expression have been uncovered in bacteria. Riboswitches are cis-acting regulatory sequences that reside typically in the untranslated regions of bacterial mRNAs. Riboswitches serve as genetic regulatory switches that sense and respond specifically to environmental signals to regulate expression of the downstream gene, typically in the absence of any protein factor. The S box is a termination control system found mostly in Gram-positive bacteria that regulates the expression of many genes involved in sulfur metabolism. The S box genes are characterized by the presence of a set of highly conserved primary sequence and secondary structural elements in the untranslated leader region upstream of the regulated coding sequence. SAM, the molecular effector of the S box riboswitch, is synthesized from methionine and ATP. Expression of the majority of the S box genes is induced during methionine starvation (when SAM pools are low) and is repressed in the presence of methionine (when SAM pools are high).

In spite of high sequence and structural conservation, a few S box leader RNAs from B. subtilis fail to exhibit typical S box gene regulation, and variation is seen in response to SAM both in vivo and in vitro. This work examines the leader RNA elements that contribute to the observed S box variability, with a special focus on the metK leader RNA.

Investigation of the metK leader RNA was performed using biochemical and genetic techniques. We modulated in vivo SAM pools without removing methionine from the growth medium and provided evidence for a SAM-dependent change in metK expression in vivo. Phylogenetic analyses revealed the presence of unique sequence elements, the Upstream (US) and Downstream (DS) boxes, that are highly conserved in the metK leader RNAs in several Firmicutes. Using RNase H assays, we showed that these regions are involved in a base-pairing interaction that is stabilized in the absence of SAM. Extensive mutagenic analysis of the US and DS box sequences confirmed the need for an intact US-DS base-pairing interaction for response to SAM in vivo. Transcript stability and abundance studies showed that the US-DS pairing is disrupted in the presence of SAM and that any alteration in the US box sequence reduces transcript stability significantly. A model for metK regulation was proposed in which the metK gene is regulated at the level of mRNA stability, in addition to being under the control of the S box regulon.

In vitro investigation of the B. subtilis yitJ SAM binding pocket was conducted to identify RNA determinants for ligand affinity and specificity. We attempted to generate yitJ variants that exhibit higher SAM affinity compared to wild-type yitJ or a change in ligand specificity. Extensive mutational analysis was conducted as part of the crystal structure study of the B. subtilis yitJ RNA in complex with SAM, and mutants were tested for effects on in vitro transcription and SAM binding. As expected, most of the mutants exhibited loss of SAM binding. However, some mutants resulted in constitutive high termination, indicating that the RNA was locked in a SAM-bound-like conformation in the absence of SAM. Selected mutants were also tested in response to a series of SAM analogs and compared to the response of the wild-type yitJ RNA.

Individual RNA elements critical for S box riboswitch function were examined using the metE and yusC leader RNAs. Our data suggest that both the SAM-binding and the terminator/antiterminator structures play a crucial role in the calibration of the S box regulatory system. The effect of the metK promoter and US box sequence on expression of yusC was also examined. We predict that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced RNA polymerase processivity in vitro. These studies suggest possible effects of the metK promoter and US box sequence on transcription and therefore on metK regulation.


This work is dedicated to my parents Neela and Prakash Kurlekar.


ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to my advisor, Dr. Tina Henkin, for her guidance, patience, and support throughout the years. It has been a privilege to work with such a gifted scientist and excellent mentor.

I am grateful to Dr. Frank Grundy for his guidance and support throughout the years. Dr. Grundy played an instrumental role in this work, particularly in the metK project.

I would like to extend a special thanks to my committee members, Dr. Charles Daniels, Dr. Kurt Fredrick and Dr. Joseph Krzycki, for their guidance, support and time over the years.

I also wish to thank my colleague and dear friend, Dr. Sharnise N. Mitchell, for being a great support, on both a scientific and a personal level, throughout the years. I really appreciate and value her friendship, guidance, and constant encouragement. I would like to thank the present and past members of the lab, particularly Dr. Brooke A. McDaniel, Dr. Jerneja Tomšič and Dr. Enrico Caserta, for their helpful discussions and encouragement during our time together. I would also like to thank Susan Tigert and Chris Woltjen for their technical assistance.

I would like to thank Mike Zianni from the Microbe and Genome Facility for his help with the qRT-PCR assays.

I wish to express my special thanks to Dr. Madhura Pradhan for her constant encouragement and support throughout the years.

Most of all, I am truly grateful to my family, especially my husband Ashish and my parents Neela and Prakash Kurlekar. Without their encouragement, patience and support this would not have been possible.


VITA

January 19, 1982 ...... Born – New Brunswick, New Jersey

2003...... B.S., Microbiology, University of Pune

2004-present ...... Graduate Teaching and Research Associate, Department of Microbiology, The Ohio State University

PUBLICATIONS

1. McDaniel BA, Grundy FJ, Kurlekar VP, Tomsic J, Henkin TM. 2006. Identification of a mutation in the Bacillus subtilis S-adenosylmethionine synthetase gene that results in derepression of S box gene expression. J Bacteriol 188: 3674-3681.

2. Lu C, Ding F, Chowdhury A, Pradhan V, Tomsic J, Holmes WM, Henkin TM, Ke A. 2010. SAM recognition and conformational switching mechanism in the Bacillus subtilis yitJ S box/SAM-I riboswitch. J Mol Biol 404: 803-818.

FIELDS OF STUDY

Major Field: Microbiology


TABLE OF CONTENTS

ABSTRACT ...... ii

DEDICATION ...... v

ACKNOWLEDGEMENTS ...... vi

VITA ...... viii

LIST OF TABLES ...... xv

LIST OF FIGURES ...... xvi

LIST OF ABBREVIATIONS ...... xx

CHAPTER 1 ...... 1

REGULATION OF GENE EXPRESSION BY RIBOSWITCHES ...... 1

1.1 Types of riboswitch classes ...... 7

1.2 Riboswitch classes ...... 10

1.2.1 RNA Thermosensors ...... 10

1.2.2 T box riboswitch ...... 15

1.2.3 Amino acid binding riboswitches ...... 19

1.2.3.1 L box riboswitch ...... 19

1.2.3.2 Glycine riboswitch ...... 21

1.2.3.3 Glutamine riboswitch ...... 24

1.2.4 -sensing riboswitches...... 26


1.2.4.1 Guanine riboswitch ...... 27

1.2.4.2 Adenine riboswitch ...... 28

1.2.4.3 2’-deoxyguanosine riboswitch ...... 30

1.2.4.4 PreQ1 riboswitch ...... 32

1.2.5 c-di-GMP ...... 33

1.2.6 glmS ...... 38

1.2.7 M box riboswitch ...... 40

1.2.8 Fluoride riboswitch ...... 43

1.2.9 B12 riboswitch ...... 45

1.2.10 TPP riboswitch ...... 47

1.2.11 FMN riboswitch ...... 50

1.2.12 THF riboswitch ...... 52

1.2.13 Moco and Tuco RNA elements...... 54

1.2.14 SAM-sensing riboswitches ...... 55

1.2.14.1 S box/SAM-I riboswitch ...... 56

1.2.14.2 SAM-II riboswitch ...... 61

1.2.14.3 SMK box/SAM-III riboswitch ...... 62

1.2.14.4 SAM-IV riboswitch ...... 65

1.2.14.5 SAM-V riboswitch ...... 67

1.2.14.6 SAH riboswitch ...... 68

1.3 Research goals ...... 69

CHAPTER 2 ...... 77

CHARACTERIZATION OF THE metK LEADER RNA: AN ATYPICAL MEMBER OF THE Bacillus subtilis S BOX REGULON ...... 77

2.1 Introduction ...... 77

2.2 Materials and Methods ...... 80

2.2.1 Bacterial strains and growth conditions ...... 80


2.2.2 Genetic techniques ...... 80

2.2.3 β-Galactosidase measurements ...... 85

2.2.4 Measurement of SAM pools in vivo ...... 85

2.2.5 In vitro transcription termination assays for determination of SAM pools ...... 86

2.2.6 In vitro transcription termination assays ...... 87

2.2.7 RNase H cleavage assay ...... 88

2.2.8 Total RNA extraction for primer extension analysis ...... 89

2.2.9 Primer extension analysis of the metK leader RNA ...... 89

2.2.10 Quantitative reverse transcriptase PCR (qRT-PCR) assay ...... 90

2.3 Results ...... 92

2.3.1 Response to varying SAM pools in vivo by the wild-type B. subtilis metK leader RNA ...... 92

2.3.2 Deletion mapping of the metK leader RNA ...... 97

2.3.3 Unique sequences are located on the 5' and 3' sides of the metK S box element ...... 102

2.3.4 The metK US and DS box regions are involved in a base-pairing interaction which is dependent on a functional SAM-binding domain ...... 108

2.3.5 Mutagenic analysis of the conserved metK US box element ...... 112

2.3.6 Physiological context of the G5U mutant ...... 115

2.3.7 Mutagenesis of the US box sequence ...... 117

2.3.8 Effect of the US box mutations on metK transcript stability and abundance ...... 121

2.3.9 In vitro analysis of the metK US and DS box mutants ...... 128

2.3.10 Conditions to generate a metK halted-complex during in vitro transcription ...... 132

2.4 Discussion ...... 133

CHAPTER 3 ...... 151

IN VITRO INVESTIGATION OF THE SAM BINDING POCKET OF THE Bacillus subtilis yitJ RIBOSWITCH ...... 151

3.1 Introduction ...... 151

3.2 Materials and methods ...... 157

3.2.1 Construction of DNA templates for in vitro selection ...... 157

3.2.2 Site-directed mutagenesis ...... 159

3.2.3 In vitro transcription assays ...... 160

3.2.4 RNase H cleavage assay ...... 161

3.3 Results ...... 162

3.3.1 SAM-dependent structural transition of the wild-type yitJ leader RNAs with distinct 3' end-points ...... 162

3.3.2 Mutagenesis of a region in helix P3 to generate a pool of yitJ variants ...... 166

3.3.3 Effect of leader region mutations on the SAM-dependent structural transition of the yitJ RNA ...... 168

3.3.4 Identification of yitJ leader RNA determinants for SAM affinity and recognition ...... 172

3.3.5 Disruption of G11 causes loss of SAM binding, yet high constitutive transcription termination in vitro ...... 176

3.3.6 Mutating residues in the P3 helix of the yitJ SAM-binding pocket results in a surprising stabilization of the domain ...... 180

3.3.7 Substitution of the U85-A109 base-pair within the pseudoknot weakens the SAM-binding ability without affecting the termination efficiency ...... 182

3.3.8 Termination efficiency of wild-type and U85-A109 variant yitJ constructs in response to SAM analogs ...... 184

3.4 Discussion ...... 192

CHAPTER 4 ...... 198

INVESTIGATION OF THE FUNCTION OF S BOX RIBOSWITCH STRUCTURAL ELEMENTS: INSIGHTS INTO FACTORS CONTRIBUTING TO S BOX RIBOSWITCH VARIABILITY ...... 198

4.1 Introduction ...... 198

4.2 Materials and methods ...... 204

4.2.1 Bacterial strains ...... 204

4.2.2 Genetic techniques ...... 204

4.2.3 Construction of hybrid leader RNAs ...... 205

4.2.3.1 metE and yusC hybrid leader RNAs ...... 205

4.2.3.2 metK and yusC hybrid constructs ...... 210

4.2.4 β-Galactosidase measurements ...... 211

4.2.5 In vitro transcription termination assay ...... 211

4.2.5.1 Conditions for the metE and yusC wild-type and hybrid constructs ...... 211

4.2.5.2 Conditions for the metK-yusC hybrid leader RNA constructs ...... 212


4.3 Results ...... 213

4.3.1 Repression of S box gene expression in the presence of methionine is dependent on the SAM-binding domain of the S box leader RNA ...... 213

4.3.2 The termination efficiency of S box genes is dictated by the terminator/antiterminator domains ...... 216

4.3.3 The length of the metK upstream (US) box sequence affects expression of the metK-yusC hybrid leader RNAs ...... 225

4.3.4 The metK US box sequence contributes to the termination efficiency of the metK-yusC hybrid constructs ...... 232

4.4 Discussion ...... 236

CHAPTER 5 ...... 247

SUMMARY AND DISCUSSION ...... 247

LIST OF REFERENCES ...... 257

APPENDIX A ...... 277

EFFECTS OF THE RelA MUTANT ALLELE ON B. subtilis S BOX GENE EXPRESSION ...... 277

A.1 Introduction ...... 277

A.2 Hypothesis ...... 282

A.3 Aim of study...... 282

A.4 Materials and methods ...... 283

A.4.1 Bacterial strains and growth conditions ...... 283

A.4.2 Genetic techniques ...... 283

A.4.3 β-Galactosidase measurements ...... 285

A.4.4 Determination of SAM pools in vivo ...... 285

A.5 Results ...... 287

A.5.1 Methionine prototrophic strains containing a wild-type or relA1 allele exhibit distinct S box-lacZ gene expression profiles, despite similar in vivo SAM pools ...... 287

A.5.2 The relA1 mutant allele does not affect S box-lacZ expression in a methionine auxotroph ...... 289

A.5.3 A relA null strain fails to exhibit high SAM pools during methionine starvation ...... 291

A.6 Discussion ...... 293


LIST OF TABLES

Table

1.1 Known classes of riboswitches based on ligand type ...... 7

2.1 Expression of wild-type metK-lacZ transcriptional fusion in vivo ...... 96

2.2 Expression of the metK-lacZ deletion mutants during IPTG assay ...... 100

3.1 Oligonucleotide primers for the B. subtilis yitJ leader RNA ...... 159

3.2 Mutational analysis of the B. subtilis yitJ S box riboswitch ...... 179

3.3 In vitro transcription termination of wild-type B. subtilis yitJ in the presence of SAM or SAM analogs ...... 186

3.4 In vitro transcription termination of B. subtilis yitJ variants in the presence of SAM or SAM analogs ...... 187

4.1 DNA oligonucleotides for metE and yusC leader RNA constructs ...... 207

4.2 DNA oligonucleotides for metK and yusC leader RNA constructs ...... 210

4.3 In vivo expression analysis of the wild-type and hybrid constructs...... 215

4.4 In vitro analysis of the wild-type and hybrid metE and yusC leader RNAs ...... 225

4.5 Sequences of the metK and yusC hybrid leader RNA constructs ...... 227


LIST OF FIGURES

Figure

1.1 Models for sensing of regulatory signals by leader RNAs ...... 3

1.2 RNA thermosensors ...... 11

1.3 The T box mechanism ...... 16

1.4 Glycine riboswitch ...... 23

1.5 variants and their ligand specificities ...... 37

1.6 Proposed mechanism for allosteric ribozyme-mediated gene control ...... 37

1.7 Model for regulation of S box gene expression in response to SAM ...... 57

1.8 Crystal structure of the B. subtilis yitJ leader RNA bound to SAM ...... 59

1.9 Crystal structure of the E. faecalis metK SMK box riboswitch ...... 63

2.1 Plasmid used to generate strain BR151 Pspac-metK ...... 83

2.2 Construction of the BR151 Pspac-metK strain ...... 84

2.3 Measurement of in vivo SAM pools and β-galactosidase activity during IPTG limitation ...... 94

2.4 In vivo expression of the wild-type metK-lacZ transcriptional fusion ...... 95


2.5 The predicted secondary structure of the B. subtilis metK leader RNA ...... 99

2.6 Alignment of S box sequences from metK genes in Firmicutes ...... 103

2.7 The alignment of the metK upstream (US) and downstream (DS) box sequences from Firmicutes ...... 104

2.8 Primer extension analysis of the B. subtilis metK leader RNA to map the 5’ transcriptional start-site ...... 107

2.9 Oligonucleotide-directed RNase H cleavage mapping of the B. subtilis metK leader RNA ...... 110

2.10 In vivo expression analysis of wild-type metK compared to the metK US box mutant constructs using the IPTG assay ...... 113

2.11 In vivo expression analysis of wild-type metK compared to the metK DS box mutant constructs and metK US-DS box double mutants using the IPTG assay ...... 119

2.12 Measurement of RNA abundance of metK transcripts ...... 124

2.13 Expression profiles of the wild-type and mutant metK-lacZ transcripts...... 128

2.14 In vitro analysis of the metK US box mutants ...... 130

2.15 In vitro analysis of the metK DS box mutants ...... 131

2.16 Proposed model for regulation of B. subtilis metK gene expression ...... 148

3.1 B. subtilis yitJ leader RNA structural model ...... 153

3.2 The B. subtilis yitJ leader RNA...... 164

3.3 SAM-dependent structural transition of the B. subtilis yitJ S box RNA using RNase H cleavage assay ...... 165

3.4 A close-up view of the B. subtilis yitJ SAM-binding pocket ...... 167


3.5 SAM-dependent structural transition of the pools of wild-type and mutant yitJ transcripts ...... 169

3.6 SAM-dependent structural transition of wild-type and yitJ variant RNAs...... 171

3.7 A SAM titration of the yitJ-AAT template in the RNase H cleavage assay ...... 173

3.8 The SELEX scheme ...... 174

3.9 within the SAM-binding pocket of the B. subtilis yitJ RNA ...... 177

3.10 Nucleotides within the SAM-binding pocket of the B. subtilis yitJ RNA ...... 181

3.11 In vitro transcription termination assay ...... 183

3.12 Chemical structures of SAM and SAM analogs ...... 185

4.1 Methionine pathways in B. subtilis...... 200

4.2 Predicted secondary structures of the yusC and metE leader RNAs ...... 208

4.3 In vivo expression assay of the wild-type and hybrid leader RNA constructs during methionine starvation ...... 214

4.4 SAM-dependent transcription termination of an S box riboswitch ...... 218

4.5 Direct comparison of the terminator/antiterminator competition ...... 221

4.6 Direct comparisons of the binding domains ...... 224

4.7 In vivo expression analysis of the wild-type metK and yusC leader RNAs ...... 226

4.8 In vivo analyses of the metK-yusC hybrid leader RNA fusions ...... 229

4.9 In vivo expression of metK-yusC hybrid leader constructs ...... 231

4.10 In vitro transcription termination analysis ...... 234

A.1 Measurement of in vivo SAM pools and β-galactosidase activity ...... 288

A.2 In vivo expression assay during methionine starvation ...... 290


A.3 Measurement of in vivo SAM pools and β-galactosidase activity ...... 292


LIST OF ABBREVIATIONS

Å angstrom

aa amino acid

AASD anti-anti-Shine-Dalgarno

AdoCbl 5’-deoxy-5’-adenosylcobalamin

aa-tRNA aminoacyl-tRNA

AEC aminoethylcysteine

ASD anti-Shine-Dalgarno

ATP adenosine triphosphate

bp base-pair

cDNA complementary DNA

CTP cytidine triphosphate

DAP diaminopimelate

DNA deoxyribonucleic acid

FMN flavin mononucleotide

GlcN6P glucosamine-6-phosphate


GTP guanosine triphosphate

IPTG isopropyl β-D-1-thiogalactopyranoside

ITC isothermal titration calorimetry

Kd dissociation constant

LB Luria Bertani

LysRS lysyl tRNA synthetase

mRNA messenger RNA

MTA methylthioadenosine

MTR methylthioribose

NAIM nucleotide analog interference mapping

nt nucleotide

NTP nucleotide triphosphate

NMR nuclear magnetic resonance

O.D. optical density

ORF open reading frame

ppGpp guanosine tetraphosphate

PAGE polyacrylamide gel electrophoresis

PCR polymerase chain reaction

preQ1 7-aminomethyl-7-deazaguanine

qRT-PCR quantitative real time-polymerase chain reaction


RBS ribosome binding site

RNA ribonucleic acid

RNAP RNA polymerase

RNase ribonuclease

ROSE repression of heat-shock gene expression

SAC S-adenosylcysteine

SAH S-adenosylhomocysteine

SAM S-adenosylmethionine

SAXS small angle X-ray scattering

SD Shine-Dalgarno

T0 time zero

t1/2 half-life

TBAB tryptose blood agar base

Term1/2 half-maximal termination

THF tetrahydrofolate

TIC translation initiation complex

TMP thiamin monophosphate

TPP thiamin pyrophosphate

Tris-HCl tris-(hydroxymethyl) aminomethane hydrochloride

tRNA transfer RNA


uORF upstream open reading frame

UTP uridine triphosphate

UTR untranslated region

X-gal 5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside


CHAPTER 1

REGULATION OF GENE EXPRESSION BY RIBOSWITCHES

Bacteria constantly modulate gene expression in response to fluctuating physical and chemical parameters, which in turn ensures that the cell utilizes valuable resources without wasteful energy expenditure. The central genetic dogma states that genetic information is transferred from deoxyribonucleic acid (DNA) via messenger ribonucleic acid (mRNA) to protein (Crick 1970). Regulation can occur at any stage during this process. A variety of mechanisms that regulate gene expression in bacteria have been uncovered to date. The versatility of RNA as a regulatory molecule allows RNA elements within a transcript to modulate gene expression at a multitude of levels such as transcription, translation and mRNA stability.

Over the past decade, studies have identified the presence of regulatory sequences that reside typically in the 5’-untranslated regions (5’UTRs) of bacterial mRNAs, also known as “leader” regions. The majority of these RNA elements, termed riboswitches (Nahvi et al. 2002), have been identified in Gram-positive bacteria. Riboswitches serve as genetic regulatory switches that sense and respond specifically to environmental signals to regulate expression of the downstream gene in the absence of any protein factor. The environmental signals include coenzymes, nucleotide derivatives, amino acids, sugars, metal ions, small RNAs and physical parameters such as temperature. From a physiological standpoint, bacterial cells need to sense these environmental signals and respond to their dynamic nature. Thus, regulation by riboswitches occurs such that response to the environmental signal either upregulates or downregulates gene expression, depending on whether the system is an “on” or “off” switch.

Riboswitches generate complex structures to perform two important functions, signal recognition and conformational switching (Breaker 2010). In the case of simple metabolite-binding riboswitches, the RNA consists of one aptamer domain and one regulatory expression platform. The aptamer domain forms a selective ligand-binding pocket that confers specific recognition of the cognate metabolite. Metabolite binding leads to a conformational shift in the RNA structure. This results in formation of one of two mutually exclusive structures that either permit or prevent expression of the downstream coding region. The key feature that connects the ligand-binding domain and the regulatory expression platform is sequence complementarity, which leads to the existence of the alternate RNA conformations.


Figure 1.1 Models for sensing of regulatory signals by leader RNAs. A. Transcription termination control. Interaction of a regulatory molecule (a small RNA or a small metabolite) modulates the structure of the nascent RNA as it emerges from RNAP. This structural modulation determines whether the transcript folds into the helix of an intrinsic terminator, resulting in premature termination of transcription, or a competing antiterminator that sequesters sequences necessary for formation of the terminator, resulting in continued transcription and expression of the downstream coding sequence. B. Translational control. Interaction of a regulatory molecule, or changes in temperature, determine whether the RNA transcript folds into a structure that sequesters the Shine-Dalgarno (SD) sequence of the downstream coding region. Sequestration of the SD results in inhibition of translation, while release of the SD, either directly or by folding of the RNA into a competing structure that allows access of the translational machinery to the SD, allows expression of the downstream coding sequence. Adapted from Grundy and Henkin (2006).


Riboswitch-mediated gene control in bacteria takes place mostly at the level of premature transcription termination (i.e., transcription attenuation) or translation initiation (Figure 1.1). Typically, transcription termination in bacteria is directed by intrinsic or factor-independent terminators. Intrinsic transcription termination involves folding of the nascent RNA transcript into a G-C-rich helix followed by a U-rich tract. While the stable G-C-rich hairpin induces pausing of the transcribing RNAP complex, the relatively weak binding between the poly U residues in the nascent RNA transcript and the corresponding poly A sequence in the DNA facilitates dissociation of the RNAP from the DNA template, releasing the nascent mRNA and terminating transcription. The poly U residues may also induce RNAP stalling, thereby providing time for the RNA hairpin to form (Nudler and Gottesman 2002, Peters et al. 2011).
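The two sequence features of an intrinsic terminator described above (a G-C-rich helix followed by a U-rich tract) can be illustrated with a short Python sketch. This is only a toy scan under stated assumptions and is not a method used in this work: the function name, the requirement for a perfectly paired stem, and all thresholds (stem length, loop size, G-C fraction, number of U residues) are hypothetical choices made for illustration. Real terminator prediction relies on thermodynamic folding rather than simple pattern matching.

```python
# Toy scan for intrinsic-terminator-like features: a perfectly paired,
# G-C-rich hairpin stem immediately followed by a U-rich tract.
# All thresholds below are illustrative assumptions, not measured values.

GC_PAIRS = {("G", "C"), ("C", "G")}
ALL_PAIRS = GC_PAIRS | {("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def find_terminator_candidates(rna, stem_len=6, loop_range=(3, 8),
                               min_gc_fraction=0.6, tract_len=8, min_u=6):
    """Return (start, loop_len) for each hairpin followed by a U-rich tract."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for start in range(len(rna)):
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            end = start + 2 * stem_len + loop_len  # first base after the hairpin
            if end + tract_len > len(rna):
                continue
            left = rna[start:start + stem_len]
            right = rna[end - stem_len:end]
            # The 5' arm pairs antiparallel with the 3' arm.
            pairs = list(zip(left, reversed(right)))
            if not all(p in ALL_PAIRS for p in pairs):
                continue
            gc_fraction = sum(p in GC_PAIRS for p in pairs) / stem_len
            tract = rna[end:end + tract_len]
            if gc_fraction >= min_gc_fraction and tract.count("U") >= min_u:
                hits.append((start, loop_len))
    return hits


# Made-up example sequence: 4 nt spacer, 6 bp G-C stem, GAAA loop, U8 tract.
toy = "AAAA" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUUUU" + "AA"
candidates = find_terminator_candidates(toy)
```

In this made-up sequence the stem begins at position 4 (0-based) and encloses a 4 nt loop; replacing the U tract with A residues removes the candidate, which mirrors the requirement for both elements.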

The majority of the riboswitches that regulate at the transcriptional level are “off” switches, in that the presence of a regulatory signal represses expression of the downstream gene. Ligand binding to the aptamer domain results in folding of the leader RNA into an intrinsic terminator (T) hairpin (Figure 1.1A). In the absence of the signal, the RNA forms an alternate structure termed the antiterminator (AT). The AT is more stable than the T and serves as the default state in the absence of a regulatory signal. Formation of the AT structure prevents formation of the T helix, because the AT sequesters a sequence that participates in the formation of the T. When the RNAP encounters the AT, the RNAP continues transcription into the downstream coding region, resulting in expression of the downstream gene. As the AT structure is the default state, the T helix can form only when a third structure termed the anti-antiterminator (AAT) forms. The AAT is stabilized in the presence of the cognate ligand, and as the name suggests, it prevents formation of the AT structure by sequestering a sequence involved in formation of the AT. The presence or absence of the signal therefore dictates the conformation of the RNA, which in turn controls expression of the downstream coding region.

Unlike “off” switches, some transcriptional riboswitches function as “on” switches (Mandal and Breaker 2004, Mandal et al. 2004). In this case, the default state of the riboswitch RNA is the T structure, which forms in the absence of ligand, resulting in low gene expression. Signal recognition by the RNA promotes formation of the AT structure and prevents formation of the T helix. This type of riboswitch results in upregulation of gene expression.
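The mutually exclusive structures described above for transcriptional “off” and “on” switches can be summarized as a small truth table. The Python sketch below is an idealized, all-or-none representation written only for illustration; the function name and the dictionary keys are hypothetical, and a real riboswitch population responds in a graded manner to ligand concentration rather than as a strict Boolean switch.

```python
# Idealized Boolean model of transcriptional riboswitch logic.
# AAT = anti-antiterminator, AT = antiterminator, T = terminator.
# The all-or-none behavior is a simplifying assumption for illustration.


def transcriptional_switch(ligand_bound, kind="off"):
    """Return which structures form and whether RNAP reads through."""
    if kind == "off":
        aat = ligand_bound   # the AAT is stabilized by the cognate ligand
        at = not aat         # the AT is the default unless the AAT sequesters it
        t = not at           # the T helix forms only when the AT cannot
    elif kind == "on":
        aat = False          # no AAT element is described for "on" switches
        at = ligand_bound    # ligand recognition promotes the AT
        t = not at           # the T helix is the default state
    else:
        raise ValueError("kind must be 'off' or 'on'")
    return {"AAT": aat, "AT": at, "T": t, "readthrough": not t}


# Print the four possible combinations of switch type and ligand state.
for kind in ("off", "on"):
    for ligand in (False, True):
        print(kind, ligand, transcriptional_switch(ligand, kind))
```

Writing the dependencies in this order (AAT, then AT, then T) follows the text: each structure forms only if the structure that competes with it does not.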

In contrast to the intrinsic termination described above, a second general mechanism of transcription termination employed by bacteria involves the presence of specific protein factors. Factor-dependent termination of transcription requires binding of a protein (designated Rho) to a Rho utilization (rut) site in the nascent transcript. The rut site is ~70 nt long and consists of a cytosine (C)-rich sequence that is relatively unstructured. The Rho protein functions as an RNA helicase and travels in an ATP-dependent manner towards the 3’ end of the RNA. The Rho protein contacts the paused RNAP and terminates transcription by dissociating the transcription elongation complex (TEC) (Boudvillain et al. 2010, Peters et al. 2011, Platt 1994). A recently identified riboswitch employs Rho-dependent termination of transcription (instead of the typical intrinsic termination) and represents the newest mechanism of riboswitch-mediated control (Hollands et al. 2012) (see section 1.2.7).

Riboswitch-mediated regulation at the level of translation initiation proceeds by occlusion of the Shine-Dalgarno (SD) sequence (Figure 1.1B). The SD sequence is located within the ribosome-binding site (RBS) and availability of the RBS for binding of the 30S ribosomal subunit is crucial for initiation of translation. For most of the riboswitches that regulate at the level of translation initiation, the presence of a regulatory signal downregulates gene expression. In these “off” switches, when gene expression is downregulated, the SD sequence is occluded through base-pairing interactions with a complementary upstream sequence, the anti-Shine-Dalgarno (ASD) sequence. In the absence of the signal, the ASD sequence pairs with a third sequence located upstream from the ASD called the anti-anti-Shine-Dalgarno (AASD). When the ASD-AASD pairing is favored, the SD sequence becomes free and available to interact with the translation initiation complex (TIC), which results in high expression of the downstream gene.
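The competition between SD-ASD and ASD-AASD pairing described above depends on antiparallel sequence complementarity, which can be checked with a few lines of Python. The sketch below is a simplified illustration: the sequences are invented examples rather than sequences from any gene discussed in this work, the helper names are hypothetical, and the model treats the “off” switch as all-or-none.

```python
# Toy check of SD / ASD / AASD complementarity for a translational "off" switch.
# Watson-Crick and G-U wobble pairs are allowed; sequences are invented examples.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(upstream, downstream):
    """True if the two segments can pair antiparallel at every position."""
    a = upstream.upper().replace("T", "U")
    b = downstream.upper().replace("T", "U")
    return len(a) == len(b) and all((x, y) in PAIRS for x, y in zip(a, reversed(b)))


def sd_is_free(aasd, asd, sd, signal_present):
    """Idealized 'off' switch: the signal favors ASD-SD pairing (SD occluded);
    without the signal the ASD pairs with the AASD and the SD is released."""
    if not (can_pair(asd, sd) and can_pair(aasd, asd)):
        raise ValueError("segments are not complementary enough for this model")
    return not signal_present


# Hypothetical 5 nt elements, listed 5' to 3' in their order on the transcript.
AASD, ASD, SD = "GGGGG", "CCUCC", "GGAGG"
free_without_signal = sd_is_free(AASD, ASD, SD, signal_present=False)
free_with_signal = sd_is_free(AASD, ASD, SD, signal_present=True)
```

Here the invented AASD pairs with the ASD through one G-U wobble pair, which illustrates that the two competing helices need not be identical in sequence or stability.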

Some translational switches function as “on” switches such that the default state of the RNA favors occlusion of the SD region, resulting in low gene expression. For translational riboswitches that function as “on” switches, the presence of a regulatory signal causes the RNA to fold into a conformation that promotes ASD-AASD pairing. This frees the SD region for binding of the translation initiation complex, leading to upregulation of gene expression.

In addition to the above-described regulatory mechanisms, riboswitch-mediated gene control has also been identified at the level of ribozyme self-destruction, mRNA splicing and mRNA stability (Cheah et al. 2007, Croft et al. 2007, Winkler et al. 2004).

This chapter will describe the known mechanisms of riboswitch-mediated gene control based on the specific ligands recognized by each class.

1.1 Types of riboswitch classes

Riboswitches modulate gene expression in response to environmental signals that range from metabolites and inorganic ions to temperature and small RNAs. A number of structures (solution as well as X-ray crystal) have revealed the architecture of the compact binding pockets that result in high affinity and specificity for the various ligands. Table 1.1 categorizes each riboswitch class based on the molecular recognition signal and highlights the structures that are currently available.

Table 1.1 Known classes of riboswitches based on ligand type

Riboswitch | Molecular signal | High-resolution structure

RNA thermosensors | Temperature | -
T box | Amino acids via uncharged transfer RNA (tRNA) | (Gerdeman et al. 2003, Wang et al. 2010)

Amino acids
Glycine | Glycine | (Butler et al. 2011)
L box | Lysine | (Garst et al. 2008; Serganov et al. 2008)
Glutamine | Glutamine | -

Nucleotide derivatives
G box; Adenine | Guanine; Adenine | (Batey et al. 2004; Noeske et al. 2005; Serganov et al. 2004)
2’-dG | 2’-deoxyguanosine | (Pikovskaya et al. 2011)
PreQ1 | 7-aminomethyl-7-deazaguanine | (Klein et al. 2009)
c-di-GMP-I | Cyclic-di-GMP | (Kulshina et al. 2009; Smith et al. 2009; Smith et al. 2010a)
c-di-GMP-II | Cyclic-di-GMP | (Smith et al. 2011)

Sugars
glmS ribozyme | Glucosamine-6-phosphate | (Klein and Ferre-D’Amare 2006; Cochrane et al. 2007)

Ions
M box | Magnesium | (Dann et al. 2007; Wakemann et al. 2009; Ramesh et al. 2011)
Fluoride | Fluoride | (Ren et al. 2012)

Coenzymes
B12 | 5’-deoxy-5’-adenosylcobalamin | -
THI box | Thiamin pyrophosphate | (Edwards and Ferre-D’Amare 2006; Noeske et al. 2006; Serganov et al. 2006; Thore et al. 2006; Thore et al. 2008)
FMN | Flavin mononucleotide | (Serganov et al. 2009; Vicens et al. 2011)
THF | Tetrahydrofolate | (Huang et al. 2011; Trausch et al. 2011)
MoCo | Molybdenum | -
WCo | Tungsten | -
S box (SAM-I) | S-adenosylmethionine | (Montange and Batey 2006; Lu et al. 2010; Stoddard et al. 2010)
SAM-II | S-adenosylmethionine | (Gilbert et al. 2008)
SMK box (SAM-III) | S-adenosylmethionine | (Lu et al. 2008)
SAM-IV | S-adenosylmethionine | -
SAM-V | S-adenosylmethionine | -
SAH | S-adenosylhomocysteine | (Edwards et al. 2010)


One of the earliest and most prevalent riboswitches to be identified is the THI box, which recognizes thiamin pyrophosphate (TPP). The TPP-responsive riboswitch has been found in all three domains of life (Miranda-Rios et al. 2001, Sudarsan et al. 2003,

Winkler et al. 2002a). A few early-discovered riboswitches also sense vitamin-derived coenzymes such as adenosylcobalamin ( or AdoCbl) and flavin mononucleotide (FMN) (Mironov et al. 2002, Nahvi et al. 2002, Winkler et al. 2002b).

Subsequently reported riboswitch classes include RNAs that recognize S-adenosylmethionine (SAM), lysine, guanine/adenine, glycine and the bacterial second messenger c-di-GMP (Grundy and Henkin 1998, Grundy et al. 2003, Mandal et al. 2003, Mandal et al. 2004, Sudarsan et al. 2008, Weinberg et al. 2007). Together, these represent the ten most common riboswitch classes currently known (Breaker 2011). Unlike most of the characterized riboswitches, RNA thermosensors do not bind a regulatory signal directly. Rather, this class responds to a change in temperature to form a simple thermo-responsive base-paired structure (Altuvia et al. 1989, Morita et al. 1999b, Narberhaus et al. 1998). However, similar to most other riboswitch classes, RNA thermosensors also employ Watson-Crick base-pairing to control gene expression.

Early discoveries of riboswitches were made by manually examining sequence alignments to generate predicted secondary structures (Grundy and Henkin 1993, Grundy and Henkin 1998). A few riboswitch classes are known to be highly conserved in sequence and structure and therefore can be relatively easy to identify. However, identification of several putative novel riboswitches has been complicated by low sequence conservation, making discovery much more tedious. For example, the T box leader RNAs display only a 14 nt T box consensus sequence and a few other small elements, but exhibit extensive secondary structural conservation (Grundy and Henkin 1993, Henkin et al. 1992). Recent advances in search tools have identified several putative structured RNA motifs within noncoding regions of mRNAs, resulting in an expansion of the riboswitch field (Bengert and Dandekar 2004, Chang et al. 2009, Clote et al. 2012, Singh et al. 2009).

1.2 Riboswitch classes

1.2.1 RNA Thermosensors

RNA thermosensors, also termed RNA thermometers, have one of the simplest architectures among riboswitches (Narberhaus et al. 2006). In most cases, these elements are located in the 5’UTRs of bacterial heat- and cold-shock genes as well as virulence genes (Giuliodori et al. 2010, Johansson et al. 2002, Morita et al. 1999a, Waldminghaus et al. 2007). RNA thermosensors differ from other riboswitches in that there is no need for an effector molecule-binding domain. To date, all known RNA thermosensors regulate gene expression at the level of translation initiation in response to a change in temperature. Typically, at low temperatures, the SD region is sequestered in a hairpin structure. Increasing temperatures destabilize the RNA structure, making the RBS accessible for translation to be initiated (Figure 1.2). Translational control results in a rapid response and ensures that the mRNA transcript is available for translation initiation as soon as the requirement for the gene products arises.


Figure 1.2 RNA thermosensors. A. At low temperature, the SD-ASD pairing results in inhibition of translation initiation. B. At higher temperature, the ASD–SD helix is disrupted and the SD sequence is available for translation initiation. Adapted from (Henkin 2008).
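The temperature dependence summarized in Figure 1.2 can be approximated, as a rough sketch only, by a two-state model in which the SD-containing hairpin is either folded (SD sequestered) or open (SD accessible). The unfolding enthalpy `dh_kcal` and the midpoint temperature `tm_c` used below are hypothetical illustration values, not measurements for any thermosensor discussed in this chapter, and real thermosensors need not melt in a strictly two-state manner.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_open(temp_c, tm_c=37.0, dh_kcal=50.0):
    """Fraction of hairpins in the open (SD-accessible) state.

    Two-state van 't Hoff model: K = exp(-(dH/R) * (1/T - 1/Tm)).
    tm_c and dh_kcal are assumed values chosen only for illustration.
    """
    temp_k = temp_c + 273.15
    tm_k = tm_c + 273.15
    k_open = math.exp(-(dh_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))
    return k_open / (1.0 + k_open)


if __name__ == "__main__":
    for temp in (25.0, 30.0, 37.0, 42.0):
        print(f"{temp:4.1f} C: fraction open = {fraction_open(temp):.3f}")
```

In this model the open fraction is 0.5 at the assumed midpoint and increases with temperature, which mirrors the heat-induced exposure of the SD sequence described above.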

The first RNA thermosensor was identified in the Escherichia coli rpoH gene, which encodes the alternate sigma factor σ32 or RpoH (Morita et al. 1999a, Morita et al. 1999b). Two regions within the rpoH mRNA form an extensive structure that blocks the entry of the ribosome to the SD sequence. An increase in temperature disrupts the RNA structure, thereby exposing the RBS for translation initiation. This enhances translation of the transcript for the alternate sigma factor, resulting in rapid induction of the heat-shock response.

Another heat-shock RNA thermosensor that uses RNA secondary structure to sense temperature is the repression of heat-shock gene expression (ROSE) element. This is the most prevalent RNA thermosensor, discovered initially in rhizobia (Narberhaus et al. 1998, Nocker et al. 2001a, Nocker et al. 2001b) and later identified in numerous alpha (α)- and gamma (γ)-proteobacteria such as E. coli and Salmonella (Waldminghaus et al. 2005). All known ROSE elements control the expression of small heat shock genes, such as the E. coli inclusion body-binding protein A (ibpA) gene. The ROSE-like element in the 5’UTR of the ibpA gene is termed ROSEibpA (Waldminghaus et al. 2009). The IbpA protein is induced dramatically (>100-fold) by heat shock conditions during growth of an E. coli biofilm (Kuczynska-Wisnik et al. 2010). RNA secondary structure predictions strongly suggest that the SD sequence in the ROSEibpA RNA is masked at low temperatures and that translation of this transcript is enhanced upon exposure to higher temperatures (Waldminghaus et al. 2005, Waldminghaus et al. 2009).

RNA thermometers also play a crucial role in the induction of translation of virulence gene transcripts. For pathogenic bacteria, an increase in temperature to 37°C indicates successful invasion of a mammalian host and triggers the expression of genes encoding virulence factors. The prfA gene, which encodes the key transcriptional activator of Listeria monocytogenes virulence genes, is transcribed at 30°C and 37°C, but is translated only at 37°C. The 5’UTR of the prfA gene from L. monocytogenes is folded in a structure that occludes the SD sequence at 30°C (Johansson et al. 2002). When the pathogen invades the mammalian host, the prfA transcript structure becomes destabilized at 37°C, which enables efficient translation of the prfA mRNA, resulting in activation of virulence gene expression. A similar mechanism has been proposed for the virulence gene transcriptional activator encoded by the lcrF gene in Yersinia pestis (Hoe and Goguen 1993).


RNA thermosensors have also been documented in the lytic-lysogenic decision of the bacteriophage lambda (λ). The λ cIII gene product regulates the lysogenic pathway by stabilizing the λ cII regulatory protein. Two alternate RNA structures (A and B) control translation of the cIII gene in a temperature-dependent manner (Altuvia et al. 1989). At optimal growth conditions (37°C), the RNA is in conformation B, in which the RBS of the cIII mRNA is accessible. The 30S ribosomal subunit binds the RNA in structure B, resulting in translation initiation. The lysogenic cycle is favored as the concentration of the cIII protein rises. During severe heat stress (45°C), the cIII RNA undergoes a conformational change in which tertiary interactions of structure B are altered, resulting in a structure in which the translational initiation region of the cIII mRNA is blocked (structure A) (Altuvia et al. 1989). Occlusion of the RBS in structure A blocks binding of the 30S ribosomal subunit, thereby preventing translation initiation of the cIII mRNA. As a result, cIII gene expression is downregulated at high temperatures, which leads to degradation of the phage λ cII protein and entry into the lytic pathway. Thus, the temperature-dependent equilibrium between cIII mRNA structures A and B depends on the binding of the 30S ribosomal subunit, which preferentially binds structure B and initiates translation. Entering the lytic cycle allows the phage to escape from the host during severe conditions (such as heat stress). The cIII RNA regulatory mechanism provides a unique example of a thermosensor in which gene expression is turned off in response to elevated temperatures.

It has been predicted that expression of the phage λ cIII gene is dependent on a sequence upstream of the cIII structural gene, which is recognized by the host RNase III enzyme. RNase III functions to expose the occluded cIII RBS, resulting in high expression of cIII (Altuvia et al. 1989, Altuvia et al. 1991). Binding (but not processing) by RNase III stabilizes the RNA in structure B at optimal growth conditions (37°C) and promotes translation initiation of the cIII mRNA. It is therefore possible that RNase III also controls the equilibrium between structures A and B. However, the exact mechanism by which RNase III regulates this equilibrium is not yet understood (Oppenheim et al. 1991).

Along with heat-shock RNA thermosensors, cold-induced genetic RNA switches also exist. The first such example was identified in the mRNA of the cspA gene in E. coli (Goldstein et al. 1990). The CspA protein functions as an RNA chaperone that binds and stabilizes single-stranded RNAs (Jiang et al. 1997). The cspA mRNA adopts one of two mutually exclusive conformations in a temperature-dependent manner. At optimal growth conditions (37°C), the 5’UTR of the cspA transcript forms a highly folded conformation that sequesters the RBS in a helical structure and prevents translation initiation. However, when the mRNA is exposed to lower temperatures (around 10°C), a structural reorganization of the mRNA exposes the SD sequence in the loop of an alternate stem, which results in ribosome binding and translational initiation (Giuliodori et al. 2010). Thus, increased expression of the CspA protein prevents single-stranded RNA molecules from folding into detrimental conformations that might be stabilized upon exposure to lower environmental temperatures (Giuliodori et al. 2010, Jiang et al. 1997).


1.2.2 T box riboswitch

The T box mechanism is widely used among Gram-positive bacteria to regulate expression of aminoacyl-tRNA synthetase (aaRS) genes and amino acid biosynthesis and uptake genes (Grundy and Henkin 1993, Grundy et al. 2002, Henkin et al. 1992, Henkin and Grundy 2006). More than 1000 T box genes have been identified through bioinformatics analyses (Gutierrez-Preciado et al. 2009). T box riboswitches most commonly utilize a mechanism of transcription termination control. However, some T box RNAs in Gram-negative bacteria and members of the Actinomycetes are predicted to regulate at the level of translation initiation (Gutierrez-Preciado et al. 2009).

The T box nascent transcript includes an element that serves as an intrinsic transcriptional terminator. Sequences on the 5’ side of the terminator can also participate in formation of an alternate, less stable antiterminator structure. Binding of a specific uncharged tRNA stabilizes the antiterminator and therefore prevents formation of the terminator helix. This leads to increased synthesis of the full-length mRNA (Figure 1.3B). Binding of a charged cognate tRNA promotes termination indirectly, by preventing binding of the uncharged tRNA (Figure 1.3A). Thus, the T box riboswitch monitors the ratio of charged vs. uncharged tRNA species. Regulation by the T box mechanism maintains appropriate pools of aminoacylated tRNAs (aa-tRNAs) that are essential for cell viability (Grundy and Henkin 1993, Grundy et al. 1994, Grundy et al. 2002).


Figure 1.3 The T box mechanism. Regulation by the T box riboswitch occurs primarily at the level of transcription termination in response to the charging state of the cognate tRNA. A. Aminoacylated tRNA binds only to the Specifier Loop. When the cognate tRNA is charged with the correct amino acid (aa), the leader RNA contacts the tRNA at its anticodon stem-loop but fails to make key contacts with the antiterminator element via the acceptor stem of the tRNA. Under these conditions, a terminator helix is favored (blue-black) and transcription ceases. B. Uncharged tRNA interacts at both the Specifier Loop and the antiterminator and results in structural changes throughout the leader RNA. Under these conditions, an antiterminator element (red-blue) is stabilized, which sequesters sequences (blue) that otherwise participate in formation of the terminator helix, and transcription continues into the downstream coding region. The tRNA is shown in cyan; the amino acid (aa) is shown as a yellow circle attached to the 3′ end of the charged tRNA. Positions of base-pairing between the leader RNA and the tRNA (Specifier Loop–tRNA anticodon, antiterminator bulge–tRNA acceptor end) are shown as green lines. Adapted from (Green et al. 2010).


The T box system was initially uncovered based on the analysis of the tyrS gene from B. subtilis (Henkin et al. 1992). This study showed that a long leader region preceding the tyrS coding region contains an intrinsic transcriptional terminator. Immediately upstream of this terminator is a 14 nt sequence, termed the T box sequence (Henkin et al. 1992). Further studies involved manual examination of 10 aaRS leader sequences (Grundy and Henkin 1993). The predicted secondary structure of a T box riboswitch consists of three helical domains (Stems I, II, III) and a pseudoknot element (Stem IIA/B) that are present within the leader RNA and precede the T box sequence and intrinsic terminator helix (Grundy and Henkin 1993). The identity of a single codon, termed the Specifier Sequence, within a specific internal loop of Stem I directs amino acid specificity for the T box mechanism. The leader RNA-tRNA interaction is facilitated by pairing of the cognate tRNA anticodon with the Specifier Sequence. Genetic analysis showed that mutating the tyrosine UAC codon to a phenylalanine UUC codon switches the response to phenylalanine limitation rather than tyrosine limitation (Grundy and Henkin 1993). Replacement of the tyrosine UAC codon with a nonsense codon resulted in transcription termination, and a nonsense suppressor tRNA containing a compensatory anticodon mutation was able to restore expression (Grundy and Henkin 1993). These studies established that base-pairing between the Specifier Sequence of the leader RNA and the anticodon of the corresponding tRNA is essential for antitermination.
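The Specifier Sequence-anticodon pairing described above can be illustrated with a minimal sketch that checks whether a codon-like Specifier Sequence is the Watson-Crick reverse complement of a tRNA anticodon. The anticodons listed (GUA for tRNATyr, GAA for tRNAPhe) are written as unmodified sequences for illustration; base modifications and G-U wobble pairing are ignored, so this is a simplification rather than a model of the actual leader RNA-tRNA interaction.

```python
# Strict Watson-Crick complements for RNA; G-U wobble pairs and modified
# bases are deliberately ignored in this simplified sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def specifier_matches(specifier, anticodon):
    """True if the Specifier Sequence can pair with the anticodon (both 5' to 3')."""
    return reverse_complement(specifier) == anticodon.upper()


if __name__ == "__main__":
    # Unmodified anticodon sequences, assumed here for illustration only.
    anticodons = {"tRNA-Tyr": "GUA", "tRNA-Phe": "GAA"}
    for specifier in ("UAC", "UUC"):
        hits = [name for name, ac in anticodons.items()
                if specifier_matches(specifier, ac)]
        print(specifier, "->", hits)
```

Under these assumptions, changing the Specifier Sequence from UAC to UUC changes the matching anticodon from that of tRNATyr to that of tRNAPhe, consistent with the switch in specificity reported by Grundy and Henkin (1993).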

The competing antiterminator helix includes a portion of the T box sequence within a 7-nt bulge region on the 5’ side (5’-UGGNACC-3’, where N is variable) (Grundy and Henkin 1993). A crucial base-pairing interaction between the acceptor end of the uncharged tRNA (5’-NCCA-3’) and residues in the antiterminator bulge (5’-UGGN-3’, where the N residues covary) stabilizes the antiterminator and prevents formation of the competing terminator helix (Grundy et al. 1994). This second tRNA-leader RNA interaction is blocked by charged tRNA (Figure 1.3A) (Grundy and Henkin 1993, Grundy et al. 1994, Yousef et al. 2003). Extensive mutational analysis of both the tyrS leader RNA sequence and tRNATyr demonstrated that the sequence and structural elements conserved in T box leader RNAs are important for antitermination in vivo (Grundy et al. 2002, Rollins et al. 1997).

As tRNA interactions with the tyrS leader RNA could not be shown in vitro, a different T box leader RNA was chosen for these analyses. The B. subtilis glyQS leader RNA is a natural variant that lacks two large structural elements common to most T box leader RNAs. Uncharged tRNAGly promotes readthrough of the B. subtilis glyQS leader RNA in an in vitro transcription assay in the absence of any additional factors, indicating that tRNA alone is capable of generating a regulatory response in vitro (Grundy et al. 2002). Additional mutational analysis showed that an intact tRNA is required for antitermination in vitro (Yousef et al. 2003), and addition of an extra nucleotide at the 3’ end of the tRNA inhibits antitermination (Grundy et al. 2005). RNase H structural studies were conducted in which an antisense DNA oligonucleotide was used to probe the 3’ side of the terminator helix. These studies demonstrated that complexes generated in the absence of tRNA or in the presence of a charged tRNA mimic are in the terminator configuration, whereas complexes generated in the presence of uncharged tRNA are in the antiterminator configuration (Yousef et al. 2005).

The nuclear magnetic resonance (NMR) solution structure of the tyrS T box antiterminator helix revealed extensive stacking of the upper helix and the 3’ portion of the internal bulge onto the bottom helix (Gerdeman et al. 2003). It was suggested that the stacking interaction allows the tRNA to sample a set of conformations during binding (Gerdeman et al. 2003). Phylogenetic studies and genetic analyses had revealed the presence and importance of highly conserved structural elements, such as the GA (or kink-turn) and S-turn (or E loop) motifs in T box leader RNAs (Rollins et al. 1997, Winkler et al. 2001). Recent NMR studies have confirmed the presence of these motifs in the tyrS T box leader RNA. The structures revealed that the S-turn loop is located adjacent to the Specifier Sequence, while the GA motif is conformationally independent of the Specifier Sequence and is not affected by its presence (Wang et al. 2010, Wang and Nikonowicz 2011).

1.2.3 Amino acid binding riboswitches

1.2.3.1 L box riboswitch

The lysine biosynthesis pathway is essential in bacteria for the production of the amino acid lysine for protein synthesis and cell wall biosynthesis, and for generation of diaminopimelate (DAP), a key cell wall component. The lysine biosynthetic pathway also generates dipicolinate, which is required for endospore formation. Phylogenetic analysis of lysine biosynthesis genes revealed a complex leader RNA structural array, conserved in low G+C Gram-positive bacteria (e.g., Firmicutes) and γ-proteobacteria (Grundy et al. 2003, Sudarsan et al. 2003). Typically, the lysine (or L box) riboswitches from Gram-positive bacteria utilize a mechanism of transcriptional control. Lysine specifically promotes a structural shift in the B. subtilis lysC leader RNA that favors the terminator structure (Grundy et al. 2003, Rodionov et al. 2003, Sudarsan et al. 2003). Mutants with sequence changes that disrupt the highly conserved regions in the B. subtilis lysC leader RNA exhibit loss of lysine repression in vivo and loss of lysine-dependent transcription termination in vitro (Grundy et al. 2003). Regulation of L box genes from Gram-negative bacteria appears to take place at the level of translation initiation.

The high-resolution crystallographic structures of the lysine regulatory mRNA element (Garst et al. 2008, Serganov et al. 2008) reveal a complex architecture in which the binding pocket completely encapsulates the lysine molecule. These results indicate that lysine is located within a 5-helical junctional core. Lysine recognition is stabilized by potassium (K+)-mediated hydrogen bonds to the lysine carboxyl oxygen atoms (Serganov et al. 2008). A recent study showed that K+ plays a key role in lysine-dependent termination by increasing the affinity of the lysC leader RNA for lysine, rather than having an impact on specificity (Wilson-Mitchell et al. 2012).

The L box aptamer domain binds L-lysine with an apparent dissociation constant (Kd) of ~1 µM and exhibits a high level of molecular discrimination against closely related analogs such as D-lysine, DAP and ornithine (Sudarsan et al. 2003). A recent mutagenic characterization of the lysC leader RNA revealed variants with altered ligand specificities. Ligand recognition was attributed to the specific molecular interactions of lysine or lysine analogs with the nucleotides within the ligand-binding pocket (Wilson-Mitchell et al. 2012).

Mutations that confer resistance to the lysine analog aminoethylcysteine (AEC) have been mapped to the 5’UTRs of the lysC genes of both B. subtilis and E. coli (Di Girolamo et al. 1988, Lu et al. 1991, Lu et al. 1992, Patte et al. 1998). However, AEC has 10-fold lower efficiency than lysine in promoting transcription termination (Grundy et al. 2003). Hence, AEC is unlikely to act as a repressor of L box expression in vivo. It has been shown that the lysyl-tRNA synthetase (LysRS) plays a role in misincorporating the lysine analog during translation, thereby acting as the primary cellular target for growth inhibition by AEC (Ataide et al. 2007). Resistance in L box mutants is likely due to derepression of expression of the downstream coding region, resulting in increased production of lysine, which outcompetes AEC during tRNA aminoacylation (Ataide et al. 2007).

1.2.3.2 Glycine Riboswitch

The glycine riboswitch elements usually reside in the 5’UTRs of glycine degradation genes. The RNA acts as a transcriptional “on” switch to regulate expression of glycine transport and cleavage genes in response to an increase in glycine concentration. The glycine binding riboswitch forms a unique structure in that two aptamer domains are present in tandem and gene expression is regulated through cooperative binding of the amino acid (Mandal et al. 2004). The ability of the RNA to bind glycine in a cooperative manner results in a transition from a fully off-state to a fully on-state across a narrow concentration range, thereby improving the sensitivity of regulation (Kwon and Strobel 2008, Mandal et al. 2004). Regulation by the glycine riboswitch results in greater flexibility of carbon metabolism within the cell (Mandal et al. 2004).
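The sharpening of the response by cooperative binding can be illustrated with the Hill equation. The sketch below compares a non-cooperative switch (Hill coefficient n = 1) with an idealized fully cooperative two-site switch (n = 2) and computes the fold change in ligand concentration needed to move from 10% to 90% of the maximal response. The value n = 2 is an assumed upper bound for two binding sites, not a measured Hill coefficient for the glycine riboswitch, and `k_half` is a hypothetical half-saturation constant.

```python
def hill_occupancy(conc, k_half, n):
    """Fractional response at ligand concentration conc (same units as k_half)."""
    return conc ** n / (k_half ** n + conc ** n)


def switching_range(n, low=0.1, high=0.9):
    """Fold change in ligand concentration needed to go from low to high response."""
    odds_ratio = (high / (1.0 - high)) / (low / (1.0 - low))
    return odds_ratio ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 2):
        print(f"n = {n}: {switching_range(n):.0f}-fold change in glycine "
              "for 10% -> 90% response")
```

Under the Hill model, an 81-fold change in ligand concentration is required for n = 1 but only a 9-fold change for n = 2, which is the sense in which cooperativity narrows the concentration range over which the switch operates.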

The glycine riboswitch from B. subtilis is part of the gcvT operon, which encodes proteins that form the glycine cleavage system (Mandal et al. 2004). The gcvT RNA motif exists in two forms, type I and type II, separated by a linker sequence (Figure 1.4). The glycine riboswitch is also present in the 5’UTR of a gene from Vibrio cholerae (VC1422) that encodes a putative amino acid transporter (Mandal et al. 2004). In-line probing, which reveals metabolite-induced changes in aptamer structure through spontaneous RNA cleavage, along with equilibrium dialysis, indicated specificity toward glycine over alanine or serine.


Figure 1.4 Glycine riboswitch. Secondary structure model of the glycine aptamer domains. Shown are the consensus sequences of the RNA motifs I and II separated by a linker. Each aptamer has two conserved paired regions (P1 and P3) and two stem-loops (P2 and P4) that are not conserved. Adapted from (Kwon and Strobel 2008).

Nucleotide analog interference mapping (NAIM) and mutagenesis studies explored the chemical basis for cooperativity by the glycine riboswitch from the glycine permease operon of Fusobacterium nucleatum (gene FN0328) (Kwon and Strobel 2008). These studies revealed that the minor groove of the P1 helix from aptamer 1 and the major groove of the P3a helix from both aptamers facilitate a cooperative tertiary interaction (Kwon and Strobel 2008). A recent 3.6 Å crystal structure of the glycine riboswitch from F. nucleatum revealed the ligand binding sites and confirmed an extensive network of tertiary interactions mediated largely by A-minor contacts (Butler et al. 2011). It was predicted that the interaptamer stacking interactions between the P1 helix from aptamer 1 and the P3 helix from aptamer 2 play a role in cooperativity.

Small-angle X-ray scattering (SAXS), which provides information about the global size and shape of the RNA in solution, indicated a two-state transition of the glycine aptamer in response to Mg2+ and glycine (Lipfert et al. 2009). In the presence of Mg2+ but the absence of glycine, the unbound RNA undergoes a significant conformational change. However, it is only in the presence of glycine that the RNA structure is further compacted by tertiary packing to form the biologically relevant ligand-bound state (Lipfert et al. 2009).

1.2.3.3 Glutamine aptamer

The glutamine riboswitch is the latest addition to the amino acid-recognizing class of aptamers. A structured non-coding RNA motif, glnA, was identified in the 5’UTRs of cyanobacterial genes encoding ammonium transporters and glutamine and glutamate synthetases (Ames and Breaker 2011). This RNA is approximately 60 nt in length and is predicted to function in metabolism, by specifically recognizing the amino acid L-glutamine. The 67 nt glnA motif from Synechococcus elongatus (67glnA) was subjected to in-line probing in the presence of L-glutamine, which revealed an apparent Kd of ~575 µM. With the exception of D-glutamine, which binds to the 67glnA motif with 10-fold lower affinity, all other analogs fail to bind to the aptamer (Ames and Breaker 2011). Disruptive mutations in conserved stem regions of the glnA RNA motif prevent the RNA from exhibiting a conformational change in the presence of L-glutamine, but compensatory mutations restore the response (Ames and Breaker 2011). These data suggest the importance of structure over the precise sequence. The possibility of glutamate as the natural metabolite (rather than glutamine) was tested based on the physiological concentrations of the two metabolites. Even though the intracellular level of glutamate is ~100 mM as opposed to ~4 mM for glutamine, the in-line probing assay fails to show any structural modulation even when 100 mM glutamate is added to the reaction.
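To relate the apparent Kd to the intracellular concentrations quoted above, the following sketch estimates aptamer occupancy with a simple single-site binding isotherm. It assumes non-cooperative binding and that the free ligand concentration equals the total concentration; both are simplifying assumptions made for illustration and are not claims about the behavior of the glnA RNA in vivo.

```python
def fraction_bound(ligand_mm, kd_mm):
    """Single-site occupancy: [L] / (Kd + [L]), with both values in mM."""
    return ligand_mm / (kd_mm + ligand_mm)


if __name__ == "__main__":
    glutamine_mm = 4.0   # approximate intracellular level quoted in the text
    kd_glna_mm = 0.575   # apparent Kd of the 67glnA motif (~575 uM)
    occupancy = fraction_bound(glutamine_mm, kd_glna_mm)
    print(f"67glnA occupancy at 4 mM L-glutamine: {occupancy:.2f}")
```

With these values the aptamer would be mostly occupied at ~4 mM L-glutamine, so an apparent Kd in the high micromolar range is compatible with sensing of glutamine at its reported intracellular level.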

The glnA RNA occurs in a tandem arrangement (similar to the glycine riboswitch, see section 1.2.3.2) with two or three aptamers separated by linker regions, suggesting cooperative binding of the amino acid. However, in-line probing assays have failed to confirm the suggested cooperativity for the glnA RNA aptamers (Ames and Breaker 2011).

A structural variant of the novel glutamine-responsive glnA RNA motif, termed the Downstream-peptide (DP) motif, was identified in marine metagenomic sequences. As marine environments often experience nitrogen limitation, it is predicted that glutamine sensing by the DP RNA regulates nitrogen metabolism in cyanobacteria (Ames and Breaker 2011). The 83 DP RNA from Synechococcus sp. CC9902 shows an apparent Kd value of ~5 mM for L-glutamine, which is roughly 10-fold higher than the Kd value of the glnA motif, and the 83 DP RNA displays no affinity toward D-glutamine. The DP RNA motif with disruptive mutations in a predicted pseudoknot structure fails to exhibit structural modulation of the RNA in the presence of L-glutamine, and compensatory mutations restore the response of the RNA (Ames and Breaker 2011).


The precise mechanism of gene regulation for nitrogen metabolism by the naturally occurring glnA or DP RNA aptamers is unclear, because no obvious expression platform has been identified downstream of either of the glutamine binding aptamers (Ames and Breaker 2011). However, the discovery of the third amino acid-binding RNA expands the scope of metabolites recognized by natural aptamers.

1.2.4 Purine-sensing riboswitches

Members of the purine-sensing riboswitch class specifically recognize purines or modified purines. Although three riboswitches from this class are similar in sequence and secondary structure, they each specifically recognize a distinct ligand: guanine (G), adenine (A) or 2’-deoxyguanosine (2’-dG) (Kim et al. 2007, Mandal et al. 2003, Mandal and Breaker 2004). The specificity towards each ligand is dependent on the identities of nucleotides within the ligand-binding pocket (Figure 1.5). The fourth member of this riboswitch class adopts a distinct secondary structure to selectively bind the guanine analog prequeuosine 1 (preQ1) or 7-aminomethyl-7-deazaguanine (Roth et al. 2007). These purine-sensing riboswitches share a common mechanism by which they recognize their respective ligands, namely Watson-Crick base-pairing.


Figure 1.5 Purine riboswitch variants and their ligand specificities. Depicted are the consensus sequence and secondary structure models for riboswitch aptamers that selectively respond to A. guanine, B. adenine or C. 2′-deoxyguanosine. Red nucleotides are present in greater than 90% of the guanine riboswitch representatives. Blue nucleotides in the adenine and 2′-deoxyguanosine aptamers differ from the guanine consensus. P, pairing; J, junction; L, loop region. Adapted from (Breaker 2011).

1.2.4.1 Guanine riboswitch

The guanine riboswitch was the first purine riboswitch to be identified (Mandal et al. 2003). The B. subtilis xpt-pbuX operon encodes a xanthine phosphoribosyltransferase and a xanthine-specific purine permease, and guanine was shown to repress the expression of this operon in vivo (Christiansen et al. 1997). It was proposed that a protein factor is responsible for the regulation of the xpt-pbuX operon. Failure to identify such a protein factor led to the investigation of a possible alternate regulatory mechanism. Phylogenetic analysis of the xpt-pbuX 5’UTR revealed an RNA motif with a conserved sequence and secondary structure, termed the G box (Mandal et al. 2003). The secondary structure of the G box is characterized by three stems (P1, P2 and P3) arranged in a tuning fork-like orientation. P1 serves as the anchor for the two parallel P2-L2 and P3-L3 hairpins (Figure 1.5A). In-line probing and equilibrium dialysis indicated that guanine binding by the riboswitch is specific and promotes formation of an intrinsic transcription terminator. Although purine analogs xanthine and hypoxanthine induce modulation of spontaneous cleavage during in-line probing analysis, they fail to outcompete guanine during equilibrium dialysis, i.e., only an excess of unlabeled guanine (and not unlabeled analogs) can redistribute the tritiated guanine that associates with the RNA. Adenine and several other guanine analogs show very low or no affinity towards the G box. The apparent Kd of the guanine binding aptamer for guanine is ~5 nM, and for xanthine and hypoxanthine is 10-fold higher (Mandal et al. 2003). Together, these results clearly indicate that the 5’UTR of xpt-pbuX RNA preferentially binds guanine over other purines and purine analogs (Mandal et al. 2003).
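A ratio of dissociation constants can be expressed as a difference in binding free energy, ΔΔG = RT ln(Kd,analog / Kd,guanine). The sketch below applies this standard relation to the 10-fold difference quoted above. The temperature of 25°C is an assumption made for illustration, as the assay temperature is not stated here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_kcal(kd_ratio, temp_c=25.0):
    """Binding free energy difference for a given Kd ratio: RT * ln(ratio)."""
    return R_KCAL * (temp_c + 273.15) * math.log(kd_ratio)


if __name__ == "__main__":
    # 10-fold weaker binding (xanthine or hypoxanthine relative to guanine).
    print(f"10-fold Kd difference: {ddg_kcal(10.0):.2f} kcal/mol")
```

Under this assumption, a 10-fold difference in Kd corresponds to roughly 1.4 kcal/mol, which is on the order of the contribution of a single hydrogen bond.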

1.2.4.2 Adenine riboswitch

The adenine riboswitch was identified as a variant of the G box motif (Mandal and Breaker 2004). Analysis of the leader RNA of the B. subtilis ydhL (now known as pbuE) gene, which encodes a purine efflux pump, revealed several notable differences relative to the guanine binding domain of the xpt RNA. The most important difference was observed at a strictly conserved nucleotide in the P1/P3 junction in the xpt sequence. The ydhL sequence carries a U at position 74 instead of a C observed in the xpt RNA (Figure 1.5B). Two additional adenine responsive riboswitches were identified in the 5’UTRs of the add (adenine deaminase) genes from Clostridium perfringens and Vibrio vulnificus (Mandal and Breaker 2004).

In-line probing analysis indicated that the three variant RNAs are specific for adenine. The ydhL RNA binds adenine with high affinity (apparent Kd ~300 nM) and is selective against most adenine analogs (Mandal and Breaker 2004). The ydhL RNA, similar to the xpt RNA from the guanine riboswitch, binds the ligand by Watson-Crick base-pairing. Hence, a simple change from a G-C base-pair in the xpt RNA to an A-U base-pair in the ydhL RNA switches the molecular recognition of the RNA from guanine to adenine (Mandal and Breaker 2004). In contrast to the guanine riboswitch, the default state of the adenine riboswitch is “off”, as the absence of ligand promotes formation of the intrinsic terminator within the ydhL leader RNA. The ydhL gene is predicted to encode a purine efflux pump that maintains the in vivo concentration of purines. Therefore, when adenine levels are high, expression of the ydhL gene is upregulated and purines are pumped out of the cell at a higher rate. The enzyme adenine deaminase functions to break down adenine into hypoxanthine and ammonia during purine metabolism, reducing the levels of excess purine in the cell. Upregulation of the add gene in response to high concentrations of adenine is therefore consistent with its physiological role.


The precise interactions of the two purine aptamers with their respective ligands and details of purine binding by the RNAs were revealed by high-resolution X-ray crystal structures and NMR analysis (Batey et al. 2004, Noeske et al. 2005, Serganov et al. 2004). These studies showed nearly identical three-dimensional (3-D) folds for the guanine and adenine riboswitches and suggested that the purines are enveloped completely within the ligand-binding pockets. The high-resolution analysis showed that the regulatory helix P1 is connected to the P2-L2 and P3-L3 hairpins and is stabilized by tertiary loop-loop interactions. The structural studies confirmed that a cytidine residue makes a Watson-Crick base-pair with the guanine ligand, and a uridine residue in the equivalent position creates a canonical base-pair with the adenine ligand (Batey et al. 2004, Noeske et al. 2005, Serganov et al. 2004). Although these interactions are critical for ligand recognition, they are not the only determinants for specificity. Additional residues within the aptamer junctions make up the conserved core, which results in tight binding of the ligand to the RNA.
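The discriminator rule described above can be summarized in a minimal sketch. The function name and the lookup table are illustrative assumptions, not part of the cited studies, and the sketch deliberately ignores the additional core residues that also contribute to specificity.

```python
# Simplified lookup: the discriminator residue of the purine aptamer and the
# purine ligand with which it forms a Watson-Crick base-pair (as described in
# the text). The names below are hypothetical and used only for illustration.
WATSON_CRICK_PARTNER = {"C": "guanine", "U": "adenine"}


def predicted_ligand(discriminator: str) -> str:
    """Return the purine expected to pair with the discriminator residue."""
    base = discriminator.upper()
    if base not in WATSON_CRICK_PARTNER:
        raise ValueError(f"no purine ligand is described for residue {base!r}")
    return WATSON_CRICK_PARTNER[base]


if __name__ == "__main__":
    for residue in ("C", "U"):
        print(residue, "->", predicted_ligand(residue))
```

The sketch captures only the first-order determinant of specificity; it should not be read as a predictor of binding affinity.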

1.2.4.3 2’-Deoxyguanosine riboswitch

The third class of purine riboswitches was identified as a variant of the guanine-sensing aptamers in a single organism, Mesoplasma florum, a nonparasitic member of the class Mollicutes (Kim et al. 2007). Twelve putative riboswitch variants are classified into types I-V, based on the sequence variations relative to the guanine consensus (Kim et al. 2007). Although the overall architecture of this riboswitch class is similar to that of the characterized purine aptamers, several sequence deviations have been observed in the critical core region and the L2 and L3 loops, suggesting that the M. florum RNA has an altered ligand-binding pocket resulting in recognition of a metabolite other than guanine (Figure 1.5C).

In-line probing in the presence of a series of guanine and guanosine derivatives revealed the ligand specificity of the variant RNAs. One RNA subclass, I-A, exhibits substantial structural modulation in the presence of 100 µM 2’-dG and reveals a pattern consistent with the formation of a three-way junction similar to the previously characterized purine riboswitches (Kim et al. 2007). The I-A RNA binds 2’-dG with high affinity (apparent Kd ~80 nM) and specificity, such that it discriminates against guanine as well as guanosine by approximately two orders of magnitude (Kim et al. 2007). Type I-A RNA binds 2’-dG using a core structure which is similar to that of the purine aptamers.

Additional biochemical and structural studies revealed that mutating the uridine at position 51 to a cytidine (U51C) in a guanine riboswitch results in a switch in selectivity from guanine to 2’-dG (Edwards and Batey 2009). Thus, the 2’-dG riboswitch achieves its specificity through modification of key interactions involving the nucleobase. Reorganization of the ligand-binding pocket accommodates the additional sugar moiety.

A recent 2.3 Å crystal structure of the M. florum 2’-dG riboswitch in complex with 2’-dG confirmed these findings (Pikovskaya et al. 2011). Bound 2’-dG is positioned in the center of the core where it forms a Watson-Crick base-pair with a cytidine equivalent to the discriminatory C74 in the guanine riboswitch (Batey et al. 2004, Noeske et al. 2005, Pikovskaya et al. 2011, Serganov et al. 2004). The J2-3 junction is predicted to be the specificity determinant for the 2’-dG riboswitch as it encapsulates 2’-dG and discriminates against related compounds.

1.2.4.4 PreQ1 riboswitch

The fourth class of purine riboswitches includes RNAs that specifically recognize the purine analog prequeuosine 1 (preQ1) (Meyer et al. 2008, Roth et al. 2007). PreQ1 is a precursor molecule for biosynthesis of queuosine, a hypermodified 7-deazaguanosine nucleoside found in the anticodon wobble position of certain tRNAs (Harada and Nishimura 1972). The majority of preQ1-binding riboswitches function at the level of transcription termination.

Two distinct classes of preQ1-binding riboswitches have been identified. The preQ1-I class (including subtypes 1 and 2) from B. subtilis was identified upstream of genes involved in biosynthesis of queuosine and was shown to specifically bind preQ1 with an affinity in the low nanomolar range (Roth et al. 2007). The most striking feature of this riboswitch is its small size (34 nt in length). This riboswitch consists of a simple stem-loop and a short, 3’ A-rich tail (Roth et al. 2007). The preQ1-II class, identified upstream of genes that encode hypothetical membrane proteins in the Streptococcaceae family, shows distinct structural features (Meyer et al. 2008). The aptamer domain of preQ1-II lacks primary sequence conservation with the preQ1-I class. In addition, the preQ1-II RNA is twice as long as the preQ1-I riboswitch and is predicted to have four helices (Meyer et al. 2008).

Crystal structure analysis of the preQ1-I queC RNA riboswitch (Class preQ1-I, subtype 2) from B. subtilis revealed the presence of an H-type pseudoknot structure (Klein et al. 2009). Comparative studies of crystal and solution structure data suggested identical conformations of the P1/L3 region in the lower part of the binding pocket. However, significant structural differences were observed in regions above the preQ1 binding pocket. The L1-P2 region is more compact in the crystal structure than in the solution structure, whereas the base-pairing interactions in the L2 region are well-defined in the solution structure. Together, these results indicate conformational heterogeneity in the preQ1 RNA (Klein et al. 2009, Zhang et al. 2011).

Structural modulation of the queC RNA in in-line probing assays is seen only in the presence of preQ1, and not in the presence of varying concentrations of a series of purine analogs (Roth et al. 2007). Loss of preQ1-dependent structural modulation is observed for a mutant that contains a uridine in place of the highly conserved cytidine (C34). This indicates that the C34 position is involved in canonical Watson-Crick pairing with the ligand, analogous to the discriminator cytidine at position 74 of the G box (Batey et al. 2004, Mandal et al. 2003, Roth et al. 2007, Serganov et al. 2004). It is predicted that discrimination between the closely related ligands preQ1 and guanine occurs through the 7’-aminomethyl group, unique to preQ1 (Kim and Breaker 2008, Roth et al. 2007).

1.2.5 c-di-GMP

Bis-(3’-5’)-cyclic dimeric guanosine monophosphate (c-di-GMP) is a circular RNA dinucleotide that functions as a second messenger to trigger wide-ranging processes, such as the switch between motile and biofilm lifestyles, pilus and flagellum formation, and virulence gene expression (Cotter and Stibitz 2007, Hengge 2009, Tamayo et al. 2007). c-di-GMP is generated from two guanosine-5’-triphosphates by the diguanylate cyclase (DGC) enzyme and is degraded by the phosphodiesterase (PDE) enzyme. A c-di-GMP-specific riboswitch element (c-di-GMP-I) was identified based on highly conserved RNA domains, such as the Genes for Environment, for Membrane and for Motility (GEMM) motif (Weinberg et al. 2007), upstream of the DGC and PDE sequences (Sudarsan et al. 2008). The V. cholerae riboswitch element (Vc2) is predicted to be an “on” switch with high expression in the presence of c-di-GMP to upregulate virulence genes. The Clostridium difficile (Cd1) riboswitch, however, works as an “off” switch to regulate the transcription of genes encoding flagellar proteins (Sudarsan et al. 2008).

In-line probing assays conducted on the V. cholerae riboswitch element (Vc2) show a 1:1 RNA:c-di-GMP stoichiometry, with a tight affinity for the ligand (apparent Kd ~1 nM). This RNA-ligand interaction is of particular importance as it shows the highest affinity of a c-di-GMP receptor and is one of the tightest RNA-ligand interactions (Smith et al. 2010a). The cellular pools of c-di-GMP are predicted to range from nanomolar to low micromolar concentrations (Hengge 2009). The Kd value of 1 nM would then suggest that this RNA is always in the bound or “on” state, resulting in no regulation of gene expression. However, it is predicted that regulation of the c-di-GMP riboswitch is kinetically controlled, based on the on- and off-rates of ligand interaction with the RNA. The off-rate is so slow that the ligand is effectively bound irreversibly to the RNA within the time-frame required for the switch to be triggered. It is therefore suggested that the activity of this riboswitch is modulated primarily by the on-rate of ligand binding and that high intracellular concentrations of c-di-GMP facilitate rapid binding (Smith et al. 2009, Smith et al. 2010a).
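The distinction between thermodynamic and kinetic control can be illustrated with a minimal sketch of a simple two-state binding model with the ligand in excess. Only the Kd of ~1 nM is taken from the text; the association rate constant (K_ON) and the decision window (WINDOW_S) are hypothetical values chosen purely for illustration and are not measured parameters of the Vc2 RNA.

```python
import math


def fraction_bound_equilibrium(ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium occupancy of a single-site receptor: [L] / ([L] + Kd)."""
    return ligand_molar / (ligand_molar + kd_molar)


def fraction_bound_at_time(ligand_molar: float, kd_molar: float,
                           k_on: float, t_seconds: float) -> float:
    """Occupancy reached after t_seconds, starting from free RNA.

    Assumes pseudo-first-order kinetics (ligand in excess), so that
    k_obs = k_on * [L] + k_off, with k_off = Kd * k_on.
    """
    k_off = kd_molar * k_on
    k_obs = k_on * ligand_molar + k_off
    f_eq = fraction_bound_equilibrium(ligand_molar, kd_molar)
    return f_eq * (1.0 - math.exp(-k_obs * t_seconds))


KD = 1e-9        # apparent Kd of ~1 nM (from the text)
K_ON = 1e5       # M^-1 s^-1, hypothetical association rate constant
WINDOW_S = 10.0  # s, hypothetical time available for the regulatory decision

if __name__ == "__main__":
    for ligand in (1e-8, 1e-7, 1e-6):  # 10 nM, 100 nM, 1 uM c-di-GMP
        eq = fraction_bound_equilibrium(ligand, KD)
        kin = fraction_bound_at_time(ligand, KD, K_ON, WINDOW_S)
        print(f"[L] = {ligand:.0e} M  equilibrium = {eq:.3f}  "
              f"within window = {kin:.3f}")
```

Under these assumed parameters, the equilibrium occupancy is near saturation across the entire concentration range, whereas the occupancy reached within the time window increases strongly with ligand concentration. This is the behavior expected of a switch governed by the on-rate rather than by Kd.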

Two independent and simultaneous high-resolution crystal structures of the V. cholerae riboswitch element established that the ligand is bound within a 3-helix junction and is recognized by canonical Watson-Crick and Hoogsteen base-pairing interactions (Kulshina et al. 2009, Smith et al. 2009). The crystal structures identified a new helix formed by flanking nucleotides, including a G-C base-pair that interacts with the c-di-GMP molecule. Stacking interactions were identified as critical determinants for ligand recognition and high affinity (Kulshina et al. 2009, Smith et al. 2009). A separate crystal structure revealed the presence of metal ions, which were absent from both the previous crystal structures (Smith et al. 2010a).

A second class of c-di-GMP riboswitch (c-di-GMP-II) from C. difficile (84 Cd aptamer) shows distinct structural characteristics (Lee et al. 2010). A high-resolution X-ray crystal structure analysis suggested that the class II riboswitch recognizes c-di-GMP using a pseudoknot element that is closely involved in molecular recognition of the ligand through stacking interactions (Smith et al. 2011).

The c-di-GMP-II riboswitch functions as a tandem RNA sensory system in which c-di-GMP binding by the RNA induces folding changes at atypical splice-site junctions to modulate alternative RNA processing (Lee et al. 2010). The 84 Cd aptamer is located at an unusually long distance (~600 bp) from its corresponding coding region. The long sequence between the aptamer and the coding region exhibits characteristics of a typical group I intron (Lee et al. 2010).

Group I introns are a class of self-splicing ribozymes that catalyze their own excision from RNA precursors (Cech 1990). Splicing by the group I intron follows a two-step sequential process. The first phosphodiester cleavage is at the 5’ splice-site (ss). An exogenous guanosine docks in the active site of the ribozyme and releases the 5’ exon. The precursor molecule then undergoes a conformational change, followed by a second attack on a phosphodiester linkage at the 3’ ss. The second cleavage is catalyzed by a different guanosine located in the terminal region of the intron. The two exons are ligated as the intron is released (Cech 1990).

While group I introns typically function as selfish elements, it has been speculated that the c-di-GMP-II RNA and the group I intron collaborate to function as an allosteric ribozyme, wherein splicing is controlled by c-di-GMP (Lee et al. 2010). The aptamer and ribozyme domains incubated in the presence of GTP yield characteristic group I ribozyme products and splicing occurs only at the 3’ ss. However, c-di-GMP, when present in addition to GTP, significantly increases production of the spliced exons by increasing the rate of attack (Lee et al. 2010).

Based on the sequences and structures of the precursor mRNA and processed RNAs, it is proposed that the c-di-GMP-II riboswitch functions at the translational level (Lee et al. 2010). In the precursor RNA, the start codon resides in a helical structure that restricts ribosome access and precludes translation (Figure 1.6A). In the presence of both c-di-GMP and GTP, the ribozyme action yields a processed mRNA in which the 5’ and 3’ exons are ligated such that an intact RBS is located at an optimal distance upstream of the exposed start codon (Figure 1.6B). Thus, allosteric activation of ribozyme self-splicing by c-di-GMP promotes translation. In the absence of c-di-GMP, the ribozyme action favors GTP attack only 4 nt upstream of the start codon in the 3’ exon, and cleaves a sequence that serves as an RBS (Figure 1.6C). This alternate condition inhibits translation initiation and gene expression is off.

Figure 1.6 Proposed mechanism for allosteric ribozyme-mediated gene control. A. Precursor mRNA with the start codon sequestered by the ribozyme stem (P10). B. RNA processed in the presence of GTP and c-di-GMP unmasks the start codon and creates a RBS. C. RNA processed in the presence of GTP alone lacks a RBS. Adapted from (Lee et al. 2010).

1.2.6 glmS ribozyme

The B. subtilis glmS gene encodes glucosamine-fructose-6-phosphate amidotransferase, which generates glucosamine-6-phosphate (GlcN6P). GlcN6P is an essential component of sugar metabolism and cell wall biosynthesis, and is recognized by the glmS ribozyme. The glmS ribozyme from several Gram-positive organisms represses expression of the glmS gene in response to increasing concentrations of GlcN6P (Winkler et al. 2004).

The glmS RNA functions as a catalytic ribozyme. Phylogenetic analysis revealed high sequence conservation in the glmS RNA secondary structural element that consists of four paired domains, P1-P4. The domains P1 and P2, in addition to a critical pseudoknot structure, form the essential components for ligand recognition. The dispensable P3 and P4 domains are involved mostly in catalytic rate-enhancement (Roth et al. 2006, Soukup 2006, Wilkinson and Been 2005, Winkler et al. 2004). The pseudoknot organizes the glmS core architecture by bringing the paired domains and the catalytic site in close proximity (Soukup 2006).

Effector binding by the glmS element does not result in structural rearrangement, suggesting that the glmS binding pocket is pre-organized in the absence of ligand (Hampel and Tinsley 2006). Comparison of the high-resolution crystal structures of the apo and activator-bound forms confirmed the rigidity of the RNA element (Cochrane et al. 2007, Klein and Ferre-D'Amare 2006).

The glmS cleavage reaction proceeds through a transesterification step and the cleavage products possess 5’-hydroxyl and 2’, 3’-cyclic phosphate termini (McCarthy et al. 2005, Winkler et al. 2004). GlcN6P specifically increases the cleavage rate by 1,000-fold (Winkler et al. 2004). Additional studies showed that spontaneous cleavage in the absence of GlcN6P takes place only in Tris buffers and that GlcN6P enhances the rate of cleavage ~100,000-fold in a HEPES buffer (McCarthy et al. 2005). The cleavage due to Tris was attributed to a dependence on amine analogs for RNA self-cleavage. Ligand-activated catalysis was shown to be a function of the acid dissociation constant (pKa) of the amine group. It was concluded that the RNA lacks catalytic function on its own and that GlcN6P, specifically the amine group, acts as a coenzyme in this system rather than an effector of structural change in the RNA (McCarthy et al. 2005).

A highly conserved, catalytically important guanosine residue is located in the active site (Cochrane et al. 2007). There is a strong interdependence between GlcN6P and the G residue as catalytic rate enhancement occurs only when both GlcN6P and G are present. The N1 of the G residue acts as a general base upon GlcN6P binding (Klein et al. 2007). Mechanistic studies of the glmS leader RNA indicated that the metabolite-induced self-cleavage of the glmS ribozyme results in intracellular degradation of the downstream transcript by RNase J1, which targets transcripts with 5’-OH termini, and ultimately lowers production of the GlmS enzyme (Collins et al. 2007).

The glmS ribozyme does not function as a metalloenzyme, as it lacks any active site metal ions (Cochrane et al. 2007, Klein and Ferre-D'Amare 2006). Characterization of the glmS ribozyme activity showed that divalent metal ions (such as Mg2+) play only a structural role, and that the lack of domains P3 and P4 results in an increased demand for Mg2+ (Roth et al. 2006, Soukup 2006).

1.2.7 M box riboswitch

Magnesium ion (Mg2+) is an essential divalent metal ion in cellular systems and is critical for the function of all physiological processes. The first cation-responsive RNA sensor was identified in Salmonella enterica (Cromie et al. 2006). In vivo genetic analyses, mutagenesis and RNA structural probing assays indicate that expression of the mgtA gene, which encodes the Mg2+ transporter (MgtA), is controlled by its 5’UTR. It was hypothesized that the mgtA leader RNA, which is present in one of two alternate conformations depending on the intracellular concentration of the divalent metal ion, controls transcription of the downstream coding region (Cromie et al. 2006).

Subsequent studies showed that mgtA expression is regulated at the level of transcription initiation by the PhoP/PhoQ two-component system by responding to periplasmic Mg2+, and that the Mg2+-responsive mgtA 5’UTR exhibits control at the level of transcription elongation (Cromie and Groisman 2010). It is therefore suggested that two independent mechanisms are involved in Mg2+-dependent transcriptional regulation of mgtA in Gram-negative bacteria. The PhoP/PhoQ two-component system responds to micromolar levels of environmental Mg2+ and activates transcription initiation or to millimolar concentrations of Mg2+ and represses transcription initiation (Garcia Vescovi et al. 1996). Once transcription is initiated, the 5’ leader region of the nascent mgtA transcript functions as an alternative Mg2+-sensing system. If the Mg2+ concentration increases in the bacterial cytoplasm, mgtA transcription is interrupted before reaching the downstream coding region (Cromie et al. 2006).

A recent study identified a novel component that controls the regulatory function of the Salmonella mgtA riboswitch (Zhao et al. 2011). A 17-residue proline-rich peptide, MgtL, is translated specifically under high Mg2+. The open reading frame (ORF) for MgtL is embedded in the Mg2+ riboswitch sequence. Data from structural probing and mutational studies indicate that the presence of high Mg2+ alters the base-pairing interaction in one riboswitch loop region to favor an alternate stem-loop by a stem-switching mechanism. Formation of this alternate structure subsequently reveals the RBS for mgtL translation. This prevents transcription from continuing into the downstream coding region (Zhao et al. 2011). Inhibition of mgtL translation (due to start codon mutations) under high Mg2+ conditions prevents premature termination of transcription, but leader peptide amino acid limitation does not prevent premature termination of transcription. Together, these observations suggest that the RNA conformational changes are independent of stalling of the translating ribosome (Zhao et al. 2011).

Although regulation is predicted at the level of transcription termination, the mgtA leader RNA lacks an intrinsic transcription terminator consisting of a GC-rich hairpin followed by a run of Us (Cromie et al. 2006, Peters et al. 2011). Results from a recent study provide evidence that the mgtA riboswitch is regulated by a unique mechanism, which employs Rho-dependent transcription termination (instead of intrinsic termination) under high Mg2+ concentrations (Hollands et al. 2012). A sequence required for binding of Rho (R1) is located in the mgtA leader region. When Mg2+ levels are high, the RNA undergoes a conformational change that allows Rho to interact with the R1 site and promote ATP hydrolysis, resulting in Rho-dependent termination (Hollands et al. 2012). The recent data suggest that the truncated product observed during the study conducted by Zhao and coworkers is an artifact of pausing rather than true termination.

Ribosome stalling is predicted to favor a conformation in which the R1 site is sequestered, thereby preventing it from interacting with Rho. When Mg2+ concentration is low, ribosome stalling during translation of the MgtL peptide promotes mgtA transcription elongation by inhibiting Rho-dependent termination. When Mg2+ concentration is high, a ribosome translating the full mgtL ORF favors formation of an alternate conformation in which the R1 site is free to interact with Rho. Thus, complete translation of mgtL promotes Rho-dependent transcription termination (Hollands et al. 2012).

A distinct Mg2+-responsive riboswitch, termed the M box riboswitch, was characterized in the Gram-positive bacterium B. subtilis (Dann et al. 2007). This metalloregulatory RNA functions to maintain Mg2+ homeostasis in the cell. The M box RNAs, identified originally in a bioinformatics search, regulate expression of three major families of Mg2+ transporters: CorA, MgtE and MgtA/MgtB P-type ATPases (Barrick et al. 2004). A detailed study involving genetic, biochemical and biophysical analyses showed that the M box RNA couples metal ion-induced RNA folding with genetic control (Dann et al. 2007).

The 5’UTR of the B. subtilis mgtE gene contains a putative transcriptional terminator structure (Dann et al. 2007). Expression of the wild-type mgtE-lacZ reporter fusion is repressed selectively by Mg2+ in vivo, and mutations that disrupt the terminator/antiterminator sequences result in loss of Mg2+-responsive regulation (Dann et al. 2007). Selective 2'-hydroxyl acylation and primer extension (SHAPE) and in-line probing analyses showed a lowered reactivity of the M box RNA in the presence of Mg2+, suggesting that the aptamer domain is rearranged substantially upon association with Mg2+ to create a more compacted architecture (Dann et al. 2007).

Structural models of the M box riboswitch reveal the presence of 6 individual Mg2+ ions associated within 3 closely packed, nearly parallel helices, and formation of the tertiary structure is cooperative (Dann et al. 2007, Wakeman et al. 2009). A recent high-resolution crystal structure of the M box RNA was generated by replacing Mg2+ ions with Mn2+ ions (Ramesh et al. 2011). The manganese-chelated structure reveals metal ion binding by the RNA, which is important to facilitate long-range tertiary interactions. It further confirms the metal-dependent conformational change in the RNA structure (Ramesh et al. 2011).

1.2.8 Fluoride riboswitch

The fluoride riboswitch is the second ion-responsive genetic switch to be described (Baker et al. 2012). Unlike Mg2+, fluoride is not a metal ion but a halide anion. Regulatory elements, such as the crcB RNA motif from the organism Pseudomonas syringae, were identified upstream of genes predicted to be involved in ion transport. The P. syringae crcB element is predicted to regulate at the level of translation initiation and activate expression of fluoride export genes. Activation of the transport genes can reduce the cellular concentration of fluoride, which can otherwise be toxic to the cell (Baker et al. 2012).

The fluoride riboswitch consists of two helical regions separated by an asymmetrical central loop. The 5’ terminal region of the RNA is predicted to base-pair with a sequence in the loop region through pseudoknot interactions. In-line probing experiments show that the crcB RNA motif binds free fluoride ions with an affinity of ~60 µM and has the ability to discriminate against other halogen ions (such as chloride, bromide and iodide) (Baker et al. 2012). The minimum inhibitory concentration (MIC) of fluoride ions in E. coli containing the crcB gene is ~200 mM, but a crcB deletion strain is highly sensitive to lower concentrations of fluoride (MIC ~1 mM) (Baker et al. 2012).

A subsequent study analyzed the crcB RNA from Thermotoga petrophila (Ren et al. 2012). Isothermal titration calorimetry (ITC) showed an apparent Kd value of ~140 µM. This study solved the crystal structure of the crcB RNA in complex with a fluoride ion at a resolution of 2.3 Å and revealed the ligand-RNA interactions. This was of particular interest as both the ligand and RNA are negatively charged. Other examples of riboswitches that bind negatively charged ligands are those that recognize thiamine pyrophosphate (see section 1.2.10) and flavin mononucleotide (see section 1.2.11).

The fluoride ion is located in a central core and is coordinated specifically to an inner shell of three Mg2+ ions. The metal ions are surrounded by, and coordinated to, an outer shell of five backbone phosphates and water molecules. Previous studies have examined the ability of nucleic acids to bind anions such as chloride. Results obtained from these studies reveal that electropositive groups within the RNA molecule (mainly amino, imino, and hydroxyl groups) create specific anion binding pockets (Auffinger et al. 2004). Along with metal ion coordination, fluoride-dependent pseudoknot formation, stacking and long-range interactions are critical in ligand recognition (Ren et al. 2012).

1.2.9 B12 riboswitch

The B12 riboswitch is a coenzyme-recognizing regulatory element that was identified upstream of cobalamin transport (btuB) and biosynthetic (cob operon) genes in E. coli and Salmonella (Nahvi et al. 2002). The B12 riboswitch recognizes 5’-deoxy-5’-adenosylcobalamin (AdoCbl or vitamin B12). Earlier studies showed that AdoCbl downregulates synthesis of the cobalamin transport protein BtuB by inhibiting ribosome binding (Nou and Kadner 2000). The B12 riboswitch (B12 box) has also been identified in several Gram-positive organisms, expanding the scope of this genetic element (Nahvi et al. 2004). Regulation by the B12 box is associated with transcriptional control of the metE gene from Streptomyces and Mycobacterium tuberculosis and of the ribonucleotide reductases from Streptomyces and Enterococcus faecalis (Baker and Perego 2011, Borovok et al. 2006, Warner et al. 2007).

The B12 riboswitch element from E. coli binds AdoCbl with an apparent Kd of ~300 nM, and related analogs such as cyanocobalamin and methylcobalamin do not exhibit any measurable binding (Nahvi et al. 2002). Ligand binding sequesters the RBS, thereby controlling gene expression at the level of translation initiation (Nahvi et al. 2002). Studies conducted with AdoCbl analogs showed that stereochemical modification of the corrin ring renders the ligand inactive, both in vivo and in vitro (Gallo et al. 2008, Nahvi et al. 2002).

The B12 element is present in two different tandem arrangements, such that two riboswitch elements are located adjacent to each other, upstream of one ORF (Sudarsan et al. 2006). The tandem arrangement carries either two B12 elements next to each other (as seen in Desulfitobacterium hafniense) or a B12 element adjacent to a distinct riboswitch class, such as the S-adenosylmethionine (SAM)-responsive S box riboswitch (seen in B. clausii; see section 1.2.14.1 for S box riboswitch). The former example is similar to the glycine riboswitch with respect to structural architecture and does not employ cooperative ligand binding (refer to section 1.2.3.1).

The second type of arrangement yields a composite gene control by having two distinct riboswitches arranged in tandem (Sudarsan et al. 2006). The B. clausii MetE enzyme is involved in methionine metabolism and generates methionine from homocysteine in the absence of AdoCbl. The alternate, more efficient isoenzyme, MetH, requires AdoCbl for the synthesis of methionine. Both metE and metH are regulated by S box riboswitches. In addition, metE is also regulated by a B12-responsive element. Therefore, under high AdoCbl levels, the metE gene is turned off, which results in the preferential use of MetH for methionine synthesis. When SAM levels are high, both genes are downregulated by the S box mechanism.

The metE 5’UTR contains the S box riboswitch element upstream of the B12 element. Each structural element within the tandem riboswitch contains a terminator helix and generates an independent response to the respective ligands (Sudarsan et al. 2006). The presence of either molecular effector (SAM or AdoCbl) is sufficient to confer repression, and the binding affinity for one ligand is unaffected by the presence of the other. Mutagenic analysis showed that disruption of the consensus sequence or secondary structure of either aptamer element affects the sensitivity only to the corresponding ligand, without affecting the function of the adjacent riboswitch (Borovok et al. 2006). This type of arrangement consequently achieves higher gene control.
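The composite control described above behaves, to a first approximation, like a Boolean logic gate. The following minimal sketch assumes an idealized binary model in which each ligand pool is either "high" or "low"; the function names are hypothetical, and the graded response of the real riboswitches is not represented.

```python
from itertools import product


def metE_expressed(sam_high: bool, adocbl_high: bool) -> bool:
    """Tandem S box + B12 control: either effector alone is sufficient to
    repress metE, so expression requires that both ligands are low."""
    return not (sam_high or adocbl_high)


def metH_expressed(sam_high: bool) -> bool:
    """metH carries only an S box riboswitch and responds to SAM alone."""
    return not sam_high


if __name__ == "__main__":
    print("SAM    AdoCbl  metE  metH")
    for sam_high, adocbl_high in product((False, True), repeat=2):
        print(f"{'high' if sam_high else 'low':<6} "
              f"{'high' if adocbl_high else 'low':<7} "
              f"{'on' if metE_expressed(sam_high, adocbl_high) else 'off':<5} "
              f"{'on' if metH_expressed(sam_high) else 'off'}")
```

In this simplified view, metE is expressed only when both SAM and AdoCbl are low, whereas metH remains expressed under high AdoCbl as long as SAM is low, consistent with the preferential use of MetH when AdoCbl is available.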

1.2.10 TPP riboswitch

The thiamine pyrophosphate (TPP) riboswitch (or THI box) was among the earliest riboswitches to be discovered (Miranda-Rios et al. 2001, Winkler et al. 2002a). This riboswitch downregulates expression of genes involved in the biosynthesis and transport of the essential cofactor thiamine (vitamin B1) and its pyrophosphate derivative TPP. The THI box sequence was identified in a variety of Gram-positive and Gram-negative bacteria, as well as in archaea, fungi and plants (Miranda-Rios et al. 2001, Sudarsan et al. 2003, Winkler et al. 2002a).

TPP-mediated riboswitch control has been observed at a variety of regulatory levels. In E. coli, the THI box is located in the 5’UTRs of the thiM and thiC genes that are downregulated by sequestration of the SD region. The B. subtilis thiM gene that contains a THI box in the 5’UTR is regulated only at the level of transcription termination (Mironov et al. 2002). The B. anthracis tenA gene contains two TPP riboswitches in tandem. Both elements respond independently to TPP, yet appear to function in concert with each other (Welz and Breaker 2007). TPP-mediated transcriptional regulation of a thiamine transporter was identified recently in an oral spirochete (Bian et al. 2011).

The TPP riboswitch also regulates at the level of mRNA splicing in certain filamentous fungi such as Neurospora crassa and Aspergillus oryzae (Cheah et al. 2007, Kubodera et al. 2003). TPP binding to the THI box riboswitch located in the NMT1 gene in N. crassa leads to increased production of an alternatively spliced product that contains upstream ORFs (uORFs). The uORFs compete for translation initiation and downregulate expression of the main ORF (Cheah et al. 2007). Arabidopsis thaliana and other plant species, as well as some photosynthetic algae, carry the TPP riboswitch in the 3’UTR of the thiC gene near the 3’-poly A tail, suggesting control at the level of mRNA processing and mRNA stability (Croft et al. 2007, Sudarsan et al. 2003, Wachter et al. 2007). TPP binding to a THI box element located in an intron downstream of the thiC gene in A. thaliana results in increased splicing and production of a transcript with decreased stability relative to the unprocessed transcript.

An extensive in vivo analysis of the E. coli thiM riboswitch showed that mutants can be categorized into two classes based on their expression profiles (Ontiveros-Palacios et al. 2008). Results from this investigation suggested that the mutations lock the RNA in one of two conformations that either inhibit or activate expression. RNase H structural mapping and 30S ribosomal subunit toeprinting assays suggested that TPP actively inhibits accessibility of the SD region (Ontiveros-Palacios et al. 2008).

The crystal structures of the TPP riboswitch from various phylogenetic backgrounds have revealed the intricate ligand-RNA interactions (Edwards and Ferre-D'Amare 2006, Noeske et al. 2006, Serganov et al. 2006, Thore et al. 2006, Thore et al. 2008). The structures of plant and bacterial origin are highly similar, indicating evolutionary conservation. The RNA employs a Y-shaped architecture, also seen in the purine riboswitches (refer to section 1.2.4). The structural studies examined the high specificity of the RNA towards its natural ligand TPP, in comparison to various analogs that lack either one or both of the phosphate moieties, such as thiamine monophosphate (TMP) or thiamine. The binding affinities of the RNA for these ligands are ~0.1 μM for TPP, 100 μM for TMP and 600 μM for thiamine (Winkler et al. 2002a).
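The affinities quoted above can be converted into fold-discrimination values and approximate differences in binding free energy. The following minimal sketch uses only the three Kd values given in the text; the temperature of 298 K and all function and variable names are assumptions made for illustration.

```python
import math

# Apparent Kd values from the text (Winkler et al. 2002a), in micromolar.
KD_MICROMOLAR = {"TPP": 0.1, "TMP": 100.0, "thiamine": 600.0}

GAS_CONSTANT_KCAL = 1.987e-3  # kcal mol^-1 K^-1
TEMPERATURE_K = 298.0         # assumed temperature, not stated in the text


def fold_discrimination(kd_analog: float, kd_reference: float) -> float:
    """Ratio of dissociation constants (analog relative to reference)."""
    return kd_analog / kd_reference


def delta_delta_g_kcal(kd_analog: float, kd_reference: float,
                       temperature_k: float = TEMPERATURE_K) -> float:
    """Loss in binding free energy for the analog: RT * ln(Kd_analog / Kd_ref)."""
    ratio = fold_discrimination(kd_analog, kd_reference)
    return GAS_CONSTANT_KCAL * temperature_k * math.log(ratio)


if __name__ == "__main__":
    kd_tpp = KD_MICROMOLAR["TPP"]
    for name in ("TMP", "thiamine"):
        kd = KD_MICROMOLAR[name]
        print(f"{name}: {fold_discrimination(kd, kd_tpp):.0f}-fold weaker, "
              f"ddG = {delta_delta_g_kcal(kd, kd_tpp):.2f} kcal/mol")
```

Because the free-energy difference depends on the logarithm of the Kd ratio, the large fold differences between TPP and its analogs correspond to only a few kcal/mol of binding energy attributable to the phosphate groups.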

The RNA makes direct contacts with two surfaces of TPP, the pyrimidine ring and the pyrophosphate moiety (Sudarsan et al. 2003). Recognition of the phosphate moiety contributes to ~100 to 1000-fold higher affinity for TPP compared to its analogs (Serganov et al. 2006, Thore et al. 2006, Winkler et al. 2002a). However, the central thiazole ring of the ligand is not recognized directly by the RNA (Serganov et al. 2006). The thiazole component of TPP bridges the two RNA domains together and stabilizes the overall RNA fold. This provides an explanation for the ability of the antimicrobial agent pyrithiamine pyrophosphate (PTPP), which contains a pyridine ring in place of the central thiazole moiety (Thore et al. 2008), to bind the TPP riboswitch and downregulate gene expression (Sudarsan et al. 2005). The ability of PTPP to bind to THI box elements is considered a major cause of PTPP-dependent toxicity in bacterial cells.

The high-resolution crystal structures of the THI box RNA confirmed the presence of at least two divalent metal ions. These cations are required to counteract the negative charges from the RNA phosphate backbone as well as from the pyrophosphate moiety of the ligand, resulting in a tight interaction (Edwards and Ferre-D'Amare 2006, Noeske et al. 2006, Thore et al. 2006). Results obtained from an ITC analysis supported the Mg2+-dependent folding of the RNA and provided evidence for the role of metal ions in ligand affinity (Kulshina et al. 2010).

1.2.11 FMN riboswitch

The FMN riboswitch (also termed the RFN element) directs expression of the biosynthetic and transport genes of riboflavin (vitamin B2), the precursor of flavin mononucleotide (FMN) (Mironov et al. 2002). At least two FMN elements have been identified in B. subtilis that bind the cofactor FMN and result in downregulation of the downstream genes. The two FMN elements control expression of essential downstream genes in distinct manners. The FMN biosynthetic gene ribD is regulated at the level of transcription termination, and regulation of the transport gene ribU (also known as ypaA) occurs by translation inhibition (Winkler et al. 2002b).

In-line probing and fluorescence studies show that FMN binds the RNA with high affinity (apparent Kd ~5-10 nM), but analogs flavin adenine dinucleotide (FAD) and riboflavin bind with much lower affinities (~300 nM and 3 µM, respectively) (Wickiser et al. 2005, Winkler et al. 2002b). RNA structure probing studies showed that the crucial phosphate moiety is necessary to discriminate between FMN and riboflavin. Riboflavin lacks the phosphate moiety and therefore is unable to generate a regulatory response (Winkler et al. 2002b). The negative charges contributed by the phosphates of the RNA and FMN are counteracted by divalent cations (such as Mg2+) (Wickiser et al. 2005, Winkler et al. 2002b) but the identity of these divalent cations is not crucial (Serganov et al. 2009).


The high-resolution crystal structures from F. nucleatum and B. subtilis revealed FMN to be enveloped completely by the RNA (Serganov et al. 2009, Vicens et al. 2011). The RNA is arranged in a butterfly-like scaffold, similar to one of the several architectural modules found in 23S ribosomal RNA (rRNA). These structures, from phylogenetically distinct organisms, show the direct recognition (through hydrogen bonding) of the phosphate moiety by the RNA and confirm the involvement of the chromophoric isoalloxazine ring. The uracil-like edge of the FMN ring is involved in Watson-Crick-like hydrogen bonding (Serganov et al. 2009, Vicens et al. 2011). The phosphate and ring structures of FMN are directed towards different domains of the RNA. A similar ligand-RNA interaction has been observed in the TPP riboswitch (see section 1.2.9 for TPP riboswitch). SAXS and evaluation of the free and bound crystal structures revealed little global switching of the FMN riboswitch, suggesting that there is no obvious ligand-dependent conformational change in the RNA structure (Baird and Ferre-D'Amare 2010, Vicens et al. 2011).

An investigation of the ribD FMN-responsive riboswitch revealed that regulation is dictated by the rate constant for FMN association, along with the rate at which the RNAP completes transcription of the terminator helix (Wickiser et al. 2005). It was suggested that during transcription, the RNA-ligand complex is unable to reach thermodynamic equilibrium before the RNAP commits to a regulatory decision. The discrepancy observed between the apparent Kd values for binding and the ligand concentrations required to induce transcription termination in vitro suggested that the FMN riboswitch operates at a kinetic level (Wickiser et al. 2005).
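The distinction between kinetic and thermodynamic control can be illustrated with a simple one-step binding model. The sketch below is not the analysis performed by Wickiser et al.; it assumes pseudo-first-order binding (ligand in excess over RNA), for which the fraction of RNA bound after time t is f(t) = f_eq (1 - exp(-(k_on[L] + k_off) t)), with f_eq = [L]/([L] + Kd) and k_off = Kd k_on. The association rate constant and the time window available before the polymerase reaches the terminator are hypothetical placeholder values chosen only for illustration; the Kd is set to 10 nM to match the range quoted above.

```python
import math


def fraction_bound_equilibrium(ligand_m, kd_m):
    """Fraction of RNA bound once binding has fully equilibrated."""
    return ligand_m / (ligand_m + kd_m)


def fraction_bound_at_time(ligand_m, kd_m, k_on, t_s):
    """Fraction of RNA bound after t_s seconds, starting from unbound RNA.

    Assumes one-step, pseudo-first-order binding with ligand in excess.
    """
    k_off = kd_m * k_on                 # from Kd = k_off / k_on
    k_obs = k_on * ligand_m + k_off     # observed approach-to-equilibrium rate
    f_eq = fraction_bound_equilibrium(ligand_m, kd_m)
    return f_eq * (1.0 - math.exp(-k_obs * t_s))


KD_M = 10e-9       # apparent Kd, upper end of the ~5-10 nM range in the text
K_ON = 1e5         # hypothetical association rate constant, M^-1 s^-1
WINDOW_S = 2.0     # hypothetical time before the terminator is transcribed

for ligand_m in (10e-9, 1e-6, 10e-6, 100e-6):
    f_eq = fraction_bound_equilibrium(ligand_m, KD_M)
    f_t = fraction_bound_at_time(ligand_m, KD_M, K_ON, WINDOW_S)
    print(f"[FMN] = {ligand_m:.0e} M: equilibrium {f_eq:.3f}, within window {f_t:.3f}")
```

Under these assumed parameters, a ligand concentration equal to the Kd gives half-maximal binding at equilibrium but almost no binding within the transcription window, which is the qualitative behavior expected of a kinetically controlled riboswitch.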


Additional biochemical and genetic studies have confirmed the ability of the FMN riboswitch to bind the chemical analog roseoflavin and downregulate FMN-dependent expression (Lee et al. 2009, Mansjo and Johansson 2011, Ott et al. 2009). Roseoflavin, a natural pigment synthesized by Streptomyces, is the only known FMN/riboflavin analog with antibacterial properties (Otani et al. 1974). Mutations in the ligand-binding pocket of the FMN riboswitch were identified in roseoflavin-resistant bacteria that exhibited derepression of reporter gene expression (Lee et al. 2009). Roseoflavin binds the FMN riboswitch with affinity lower than that of FMN (apparent Kd ~100 nM) but higher than that of riboflavin due to the presence of the dimethylamino group on the flavin ring structure of roseoflavin (Lee et al. 2009).

1.2.12 THF riboswitch

The THF regulatory system is a recent addition to the riboswitch family and expands the number of coenzymes that are sensed directly by RNA elements. The THF RNA motif is found primarily in Firmicutes and resides upstream of folate transport genes (folT) as well as biosynthesis genes (folC, folE and folQPBK) (Ames et al. 2010).

The THF receptor consists of a ~100 nt element predicted to form four helices (P1-P4) and an additional pseudoknot structure around a three-way junction. High nucleotide conservation is observed within the P2 helix and around the single-stranded junctions between the paired regions (Ames et al. 2010). Long-range tertiary interactions facilitated by the pseudoknot structure were confirmed by a high-resolution crystal structure (Huang et al. 2011). This study suggested that the pseudoknot interactions are crucial for the regulatory response of the THF riboswitch.

The THF riboswitch selectively binds derivatives of the vitamin folate, including tetrahydrofolate (THF) and dihydrofolate (DHF) (Ames et al. 2010). THF, the active form of folate, is an essential cofactor involved in 1-carbon transfer reactions. It is predicted that the THF riboswitch monitors only the active fraction of the total intracellular folate pool (Trausch et al. 2011). In-line probing assays showed that the apparent Kd of the RNA element for THF is ~70 nM, and ~300 nM for the THF analogs.

Mutating key base-pairs in the P2 helix increases spontaneous cleavage, making the apparent Kd ~1000-fold poorer than wild-type, and mutants with compensatory substitutions exhibit restored affinity towards THF (Ames et al. 2010). Additional assays revealed that the RNA binds to various 5- and 10-modified forms of THF but rejects folic acid from the binding pocket, indicating a preference towards reduced forms of the vitamin (Ames et al. 2010).

A recent high-resolution structure of a Streptococcus mutans THF element, bound to the THF analog folinic acid, sheds light on the recognition of ligand by the RNA (Trausch et al. 2011). This crystal structure revealed the presence of two separate ligand-binding pockets within a single structured domain. Despite different RNA motifs, the mode of ligand recognition by both binding sites is strikingly similar (Trausch et al. 2011). Under physiological Mg2+ concentrations, the riboswitch exhibits strong cooperative binding to the two ligands, although only one of these sites is required for a regulatory response (Trausch et al. 2011).

A Lactobacillus casei THF element shows a potential hairpin structure that functions as a translational “off” switch, such that in the presence of the ligand the anti-RBS sequence pairs with the RBS, preventing translation initiation (Ames et al. 2010). However, this mechanism has not been verified experimentally. A THF riboswitch identified from the human gut metagenome shows a distinct stem-loop structure that is predicted to function as an intrinsic transcriptional terminator (Ames et al. 2010).

1.2.13 Moco and Tuco RNA elements

A comparative genomics approach using computational analysis revealed several highly conserved RNA motifs (Weinberg et al. 2007). One such RNA motif was identified upstream of genes involved in molybdate transport, molybdenum cofactor (Moco) biosynthesis as well as proteins that employ Moco as a coenzyme (Weinberg et al. 2007). A variety of organisms, such as γ- and δ-Proteobacteria, Clostridia, Actinobacteria and Deinococcus, contain one or multiple copies of the Moco element. At least 8 Moco motifs have been identified in D. hafniense. The Moco elements are also present in a tandem arrangement. A few organisms that do not require molybdenum instead utilize tungsten and its cofactor (Tuco), and a few variants of the Moco element can be triggered by Tuco (Weinberg et al. 2007).

The E. coli moa operon is under the control of two promoters and expression is upregulated by two transcription factors (Regulski et al. 2008). In addition to the multi-leveled regulation by transcription factors, the Moco element likely functions as an “off” switch in response to Moco and discriminates against Moco analogs. Regulation of gene expression is either at the transcriptional or translational level, and some organisms that contain multiple Moco elements show evidence of both types of expression platforms (Regulski et al. 2008).

In-line probing of the Moco element upstream of the moa operon showed that the RNA forms a highly structured metabolite-sensing regulatory element (containing paired regions P1-P5) (Regulski et al. 2008). The Moco elements can be divided into two overall architectures (structures with or without P3), which likely correlates with the ability to utilize Moco, Tuco or both. Signature motifs (such as the GNRA tetraloop and a tetraloop receptor, where N is A, C, G or U and R is A or G) that stabilize the overall RNA fold have been identified in the Moco RNA (Regulski et al. 2008).

1.2.14 SAM-sensing riboswitches

S-adenosylmethionine (SAM) is an essential cellular metabolite and is intricately involved in physiological processes. Most importantly, it is used as a methyl-group donor in a variety of chemical reactions. As SAM is synthesized from methionine and ATP by SAM synthetase (encoded by the metK gene), growth in the presence of methionine leads to a high concentration of SAM in vivo and growth in the absence of methionine results in low in vivo SAM pools (Grundy and Henkin 1998, Tomsic et al. 2008).

The SAM-binding riboswitches represent the most diverse collection of regulatory elements that recognize the same effector molecule. Six different riboswitch classes (S box/SAM-I, SAM-II, SMK box/SAM-III, SAM-IV, SAM-V and SAM-I/IV) that recognize SAM as the molecular effector have been identified, with the S box being the most prevalent (Corbino et al. 2005, Fuchs et al. 2006, McDaniel et al. 2003, Poiata et al. 2009, Weinberg et al. 2008, Weinberg et al. 2010). These riboswitches bind SAM with high affinity and selectivity and discriminate against near-cognate derivatives, such as S-adenosylhomocysteine (SAH) and S-adenosylcysteine (SAC).

Regulation of the SAM-binding riboswitches is most commonly seen at the level of premature transcription termination. Riboswitches that recognize SAH and discriminate against SAM have also been characterized (Wang and Breaker 2008). The X-ray crystal structures of many SAM-binding riboswitches have deciphered the distinct architectural ligand-recognition properties. The following sections will describe the current literature regarding these riboswitch elements.

1.2.14.1 S box/SAM-I riboswitch

A highly conserved RNA motif termed the S box was originally identified upstream of 11 methionine, SAM and cysteine biosynthetic genes in B. subtilis and subsequently in other low G+C, Gram-positive bacteria (Grundy and Henkin 1998).

Increased S box gene expression is observed during methionine limitation (when SAM pools are low), and growth in the presence of methionine results in repression of gene expression (when SAM pools are high) (Grundy and Henkin 1998).

The S box leader RNAs show high conservation in primary sequence and secondary structure. The predicted secondary structure consists of helices P1-P4, with highly conserved residues in unpaired regions. Phylogenetic analysis revealed that a transcription terminator helix is located at the 3’ end of the leader region, suggesting regulation at the level of premature transcription termination. Mutational analysis showed that disruption of helix P1 leads to constitutive expression, suggesting that the RNA is unable to form the terminator conformation (Grundy and Henkin 1998). SAM was identified as the effector molecule that binds directly to S box leader RNAs and promotes premature transcription termination in vitro, in the absence of any auxiliary protein factor (McDaniel et al. 2003).


Figure 1.7 Model for regulation of S box gene expression in response to SAM. The antiterminator structure (AT, red-blue) forms in the absence of SAM, allowing expression of the downstream coding region(s). Binding of SAM (represented by the asterisk) stabilizes the anti-antiterminator structure (AAT), which sequesters sequences (red) required for formation of the antiterminator and frees sequences (blue) required for formation of the terminator helix (T), resulting in premature termination of transcription. Numbers indicate helices 1-4. SAM binding also promotes a tertiary interaction (dashed line) between residues in the terminal loop of helix 2 and the unpaired region between helices 3 and 4. Adapted from (Tomsic et al. 2008).

The S box model proposes that when the cells are starved for methionine, the SAM pools drop and the antiterminator structure forms (Figure 1.7, left panel). Formation of the antiterminator allows transcription to continue into the downstream coding region. In the presence of high SAM pools, the anti-antiterminator sequesters sequences required for formation of the antiterminator. This promotes formation of the terminator helix and therefore gene expression is turned off (Figure 1.7, right panel). The S box model was further supported by two additional studies that confirmed direct sensing of SAM by the S box RNA (Epshtein et al. 2003, Winkler et al. 2003). Extensive biochemical and genetic analyses further revealed that a highly conserved kink-turn motif (GA motif) and an essential pseudoknot element facilitate the appropriate folding of the S box RNAs and the subsequent SAM-dependent transcription termination (McDaniel et al. 2005, Winkler et al. 2001).

The yitJ S box leader RNA binds SAM with high affinity (Kd ~20 nM) (Tomsic et al. 2008), and binding of SAM by the yitJ RNA is highly specific. The yitJ RNA exhibits 100- and 10,000-fold lower affinities for SAH and SAC, respectively, as compared to the affinity for SAM (McDaniel et al. 2003, Winkler et al. 2003). The tertiary structure of the yitJ RNA in complex with SAM was revealed by two independent high-resolution structures from Thermoanaerobacter tengcongensis (Montange and Batey 2006) and B. subtilis (Lu et al. 2010) (Figure 1.8). (Chapter 3 will describe the mutagenic analysis of the B. subtilis yitJ RNA conducted as part of the crystal structure study in collaboration with Dr. Ailong Ke, Cornell University). The two nearly superimposable crystal structures revealed that the ligand is buried deep within the SAM-binding pocket, which is formed by helices P1-P4. A crystal structure study of the apo-form of the aptamer domain suggested that a sampling of a variety of intermediate RNA conformations results in the selection of the appropriate structure, based on the ligand concentration (Stoddard et al. 2010).
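The affinity and specificity figures quoted above for the yitJ RNA can be combined into a rough estimate of how well the RNA distinguishes SAM from its analogs at a given ligand concentration. The sketch below is a back-of-the-envelope illustration, not a reported result: it assumes that the fold differences can be applied directly to the ~20 nM Kd (the two sets of values come from different studies), uses a simple single-site occupancy model, and evaluates it at a hypothetical ligand concentration of 1 µM. All names are introduced here for illustration.

```python
KD_SAM_NM = 20.0  # apparent Kd for SAM quoted in the text (Tomsic et al. 2008)

# Fold reduction in affinity relative to SAM, as quoted in the text
FOLD_WEAKER = {"SAM": 1.0, "SAH": 100.0, "SAC": 10_000.0}


def implied_kd_nm(ligand):
    """Kd implied by applying the quoted fold difference to the SAM Kd (an assumption)."""
    return KD_SAM_NM * FOLD_WEAKER[ligand]


def occupancy(ligand_nm, kd_nm):
    """Single-site equilibrium occupancy: [L] / ([L] + Kd)."""
    return ligand_nm / (ligand_nm + kd_nm)


LIGAND_NM = 1000.0  # hypothetical ligand concentration (1 uM)

for ligand in FOLD_WEAKER:
    kd = implied_kd_nm(ligand)
    print(f"{ligand}: implied Kd = {kd:.0f} nM, occupancy at 1 uM = {occupancy(LIGAND_NM, kd):.3f}")
```

Under these assumptions the RNA would be almost fully occupied by SAM at 1 µM, partially occupied by SAH, and essentially unoccupied by SAC.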


Figure 1.8 Crystal structure of the B. subtilis yitJ leader RNA bound to SAM. A. Helices P1 through P4 are colored in gray, green, cyan, and yellow, respectively. Red dashes denote pseudoknot base-pairs. SAM-contacting bases are labeled in magenta. The sheared A46·U78 base-pair that recognizes the adenosine base of SAM is labeled with a red dot. B. Ribbon representation of the B. subtilis yitJ RNA structure. The color scheme is the same as that in A. SAM is shown as a ball-and-stick model overlaid with surface representations in gray. The assigned magnesium metal ion is shown in pink. Adapted from (Lu et al. 2010).


A comprehensive genetic and biochemical study comparing the S box riboswitches from B. subtilis revealed variability in the response to SAM both in vivo and in vitro (Tomsic et al. 2008). The S box gene-lacZ fusions show a 250-fold range in induction ratios after 4 h of methionine starvation (Tomsic et al. 2008). Variability is also observed both in the termination efficiency in the absence of SAM and in the concentration of SAM required for half-maximal termination in vitro. This study showed that genes involved in methionine biosynthesis are tightly repressed in the presence of SAM and show high induction of expression in the absence of SAM as compared to genes involved in methionine transport. Overall, it was concluded that the S box gene expression is finely tuned based on the physiological function of the genes (Tomsic et al. 2008). Chapter 4 will describe the importance of the S box leader RNA structural elements that play a role in the observed variability and Chapter 2 will describe the detailed characterization of the atypical metK S box leader RNA. The metK gene encodes SAM synthetase, the enzyme responsible for synthesizing the S box molecular effector, SAM.

Although most S box riboswitches regulate at the level of premature transcription termination, S box-mediated regulation has been predicted at the level of translation initiation in certain Gram-negative bacteria and Actinomycetes (Rodionov et al. 2004).

Antisense RNA interference has been documented for S box riboswitches in pathogenic organisms, where the riboswitch transcript is predicted to interact with the mRNA of a virulence factor in trans (Loh et al. 2009) or a sulfur operon in cis (Andre et al. 2008), to regulate gene expression. An S box riboswitch has also been identified upstream of the B. clausii metE gene, in a tandem arrangement adjacent to a B12-binding riboswitch element (refer to section 1.2.9) (Sudarsan et al. 2006). Both riboswitch elements (S box and B12 elements) function independently in response to their respective ligands and regulate gene expression at the level of premature transcription termination.

1.2.14.2 SAM-II riboswitch

The SAM-II riboswitch was first identified using comparative sequence and structural probing analyses (Corbino et al. 2005). The SAM-II element, found predominantly in α-proteobacteria (for example, in the metA leader RNA of Agrobacterium tumefaciens), is the smallest of the SAM-binding riboswitches and is distinct from the S box riboswitch in sequence and structure. The initial characterization of this regulatory RNA showed lower affinity for SAM compared to S box riboswitch elements; the RNA was shown to bind SAM with an apparent Kd of ~1 μM. In-line probing and equilibrium dialysis showed strong discrimination against SAM-related compounds (>1,000-fold for SAH).

The simple architecture of the SAM-II riboswitch consists of a single stem-loop structure with an H-type pseudoknot (Corbino et al. 2005). The high-resolution structure of a SAM-II element obtained from an environmental sample revealed the first structure of an entire riboswitch element, inclusive of both the aptamer domain and expression platform (Gilbert et al. 2008). This crystal structure predicted that repression by the SAM-II riboswitch takes place by blocking the translation initiation site, rather than structural switching with a complementary downstream sequence. Ligand-dependent stabilization of the pseudoknot structure sequesters the SD sequence at the 3’ end of the riboswitch element.

Although the SAM-II RNA is globally different from the S box motif, functional group recognition takes place in an analogous manner, which forms the basis for efficient discrimination against near-cognate analogs such as SAH (Gilbert et al. 2008). A variety of biophysical techniques (NMR, single-molecule fluorescence resonance energy transfer [smFRET], SAXS and molecular dynamics simulations) have shed light on the ligand-dependent conformational switching of the SAM-II riboswitch (Chen et al. 2011, Doshi et al. 2012, Haller et al. 2011, Kelley and Hamelberg 2010). Overall, these studies revealed that the essential divalent Mg2+ ions promote a compact RNA conformation, resulting in a structural preorganization that enables pseudoknot formation. However, it is only in the presence of the ligand that this crucial pseudoknot structure is formed fully (Haller et al. 2011).

1.2.14.3 SMK box/SAM-III riboswitch

The third class of SAM-binding riboswitches has a relatively simple architecture (unlike S box, but similar to SAM-II) such that both ligand binding and regulatory control encompass a single module. The SMK box riboswitch was identified upstream of metK genes from members of the Lactobacillales (Fuchs et al. 2006). The SMK RNA element from E. faecalis binds SAM directly, resulting in a structural rearrangement that regulates gene expression at the level of translation initiation, in the absence of auxiliary protein factors. Mutational and structural probing analyses showed that pairing between SD and anti-SD (ASD) regions is required for SAM binding (Fuchs et al. 2006). Ribosomal toeprinting assays subsequently revealed that the SD-ASD pairing blocks access of the ribosome to the SD region, inhibiting gene expression (Fuchs et al. 2007). Thus, the SMK box is a unique SAM-binding riboswitch as the SAM-binding domain, which participates directly in metabolite recognition, is not separable from the regulatory target.


Figure 1.9 Crystal structure of the E. faecalis metK SMK box riboswitch. A. Secondary structure of the SMK riboswitch RNA. Helices P1 through P4 are colored in cyan, green, silver and yellow, respectively. Gray shading, SD sequence; solid magenta lines, direct contacts between the RNA and the SAM molecule; dashed magenta lines, tertiary interactions between J3/2 and P2 and J2/4. B. Cartoon representation of the crystal structure of the SMK riboswitch. SAM is shown in magenta. The coloring scheme for the crystal structure is consistent with A. Adapted from (Lu et al. 2008).


Although initial genetic analyses for the E. faecalis SMK box RNA were conducted in B. subtilis, translational repression was observed during elevated SAM pools (during growth in the presence of methionine), supporting the model for SMK box regulation (Fuchs et al. 2006). SMK-mediated regulation was further confirmed in the native background during which in vivo SAM pools were modulated. Regulation at the translational level was inferred, as no significant change was observed in the overall transcript abundance (Smith et al. 2010b). As the RNA-SAM complex in vitro shows a half-life (7.8 s) that is ~20-fold shorter than the transcript half-life (3 min) in vivo, it was suggested that the SMK box functions as a reversible riboswitch (Smith et al. 2010b). In addition, this reversibility (or conformational switching) was shown using fluorescence spectroscopy (Smith et al. 2010b). This type of regulatory mechanism ensures that the cell is poised to respond rapidly to changing SAM pools.
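The argument for reversibility rests on comparing two half-lives, and the arithmetic can be checked directly. The sketch below uses only the two values quoted above (7.8 s for the RNA-SAM complex in vitro and 3 min for the transcript in vivo); it converts the complex half-life into a first-order dissociation rate constant via k = ln 2 / t½ and estimates how little of an initial population of complexes would remain intact over one transcript half-life. Treating dissociation as a simple first-order process is an assumption made here for illustration.

```python
import math

COMPLEX_HALF_LIFE_S = 7.8          # RNA-SAM complex half-life in vitro (from the text)
TRANSCRIPT_HALF_LIFE_S = 3 * 60.0  # transcript half-life in vivo, 3 min (from the text)


def rate_constant(half_life_s):
    """First-order rate constant corresponding to a half-life."""
    return math.log(2) / half_life_s


def fraction_remaining(half_life_s, t_s):
    """Fraction of a first-order population still intact after t_s seconds."""
    return 0.5 ** (t_s / half_life_s)


ratio = TRANSCRIPT_HALF_LIFE_S / COMPLEX_HALF_LIFE_S
k_off = rate_constant(COMPLEX_HALF_LIFE_S)
surviving = fraction_remaining(COMPLEX_HALF_LIFE_S, TRANSCRIPT_HALF_LIFE_S)

print(f"transcript half-life / complex half-life = {ratio:.1f}")
print(f"dissociation rate constant = {k_off:.3f} per second")
print(f"complexes intact after one transcript half-life: {surviving:.1e}")
```

The ratio of about 23 is consistent with the ~20-fold difference stated above, and it implies that a single transcript could pass through many cycles of SAM binding and release during its lifetime.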

X-ray crystallographic studies revealed a Y-shaped RNA with SAM intercalated within a three-way helical junction (Lu et al. 2008) (Figure 1.9). The crystal structure confirmed the direct involvement of the SD sequence in SAM recognition. Similar to SAM recognition by the S box riboswitch, the adenine moiety of SAM intercalates within the RNA resulting in continuous base stacking. The positive charge of the sulfonium ion is crucial for specific recognition of SAM over a near-cognate derivative such as SAH, which lacks the overall positive charge. Crystal structure studies showed that the selenium-derivative of SAM (Se-SAM) binds to the RNA in an identical manner, thereby confirming the location of the sulfonium ion. However, the SAH-bound SMK RNA exhibited only minimal contacts. Competition binding assays showed that the RNA binds SAM with high affinity (apparent Kd ~0.85 µM) and specificity (~100-fold preference over SAH). However, unlike the S box and SAM-II RNAs, the SMK RNA does not interact with the methionine side chain of SAM. These SAM-binding riboswitches recognize the same biological ligand, yet they exhibit distinct RNA folds suggesting independent evolution (Lu et al. 2008).

Chemical probing analyses (using SHAPE) in combination with mutational studies showed that the SMK box RNA samples one of three conformations depending on the concentration of the ligand sensed (Lu et al. 2011). Similar to the conformational switching seen for the S box RNA, the majority of the SMK RNA structures are present in the apo state (in the absence of ligand) that represents the “on” conformation. However, a subset of the RNA structures pre-organize into a SAM-bound-like or “ready” state. It is only upon exposure to SAM that the RNA is stabilized into the “off” conformation.

Similar results were obtained using NMR and SAXS analyses (Wilson et al. 2011). ITC analyses of SMK box variant RNAs showed that the reversible, translational riboswitch is controlled thermodynamically (Wilson et al. 2011).

1.2.14.4 SAM-IV riboswitch

The SAM-IV element, uncovered during a search for novel riboswitches, contains a set of elements similar to the S box riboswitch (Weinberg et al. 2008). This class is found primarily in Actinomycetales (such as M. tuberculosis) upstream of genes involved in sulfur metabolism. The SAM-IV riboswitch core shares five of the six key nucleotides that make direct contacts with the ligand in S box RNA, suggesting a similar mode of molecular recognition. However, the sulfonium ion binding site appears to be different (Weinberg et al. 2008).

The SAM-IV and S box RNAs show distinct scaffolds with significant differences in the peripheral tertiary architecture. The P4 helix in SAM-IV is located outside the core, the KT motif is absent in the P2 helix and an additional pseudoknot structure is predicted at the top of the P3 helix. Binding analyses by the Streptomyces coelicolor SAM-IV riboswitch showed analogous affinity for SAM and comparable discrimination against the near-cognate ligand SAH, relative to the yitJ S box riboswitch (Weinberg et al. 2008).

A precise regulatory mechanism for SAM-IV was not identified, although SAM-IV elements are observed upstream of a few rho-independent transcriptional terminators as well as many translational start-sites.

Recently, an S box/SAM-IV riboswitch class was predicted that is similar to the S box RNAs (Weinberg et al. 2010). Like SAM-IV, the S box/SAM-IV class contains conserved nucleotides within the ligand-binding core but it differs from S box in the global scaffold. The S box/SAM-IV riboswitch does not contain helix P1 or the KT motif and it appears to have only one pseudoknot structure located at the top of helix P3. The overall similarities between the S box, SAM-IV and S box/SAM-IV riboswitches suggest that these RNAs have evolved from a common ancestor to recognize the same effector molecule. It has been suggested that these SAM-binding variants together with the S box riboswitch constitute a SAM superfamily (Weinberg et al. 2008).


1.2.14.5 SAM-V riboswitch

The SAM-V element is the fifth class of SAM-binding riboswitches (Poiata et al. 2009). This motif was originally identified during a comparative sequence analysis of G-C-rich intergenic regions in the marine α-proteobacterium Candidatus Pelagibacter ubique HTCC 1062 (Meyer et al. 2009). The SAM-V elements have a consensus sequence and secondary structure similar to the SAM-II riboswitch. SAM-V is predicted to form an H-type pseudoknot structure with two stems and two loops. Most SAM-V elements have been identified upstream of SD sequences, suggesting regulation at the level of translational inhibition (by RBS occlusion). In-line probing and equilibrium dialysis have independently confirmed SAM binding by the aptamer, with an apparent Kd value of ~150 µM and have also shown discrimination against SAH (Poiata et al. 2009).

The SAM-V element has also been identified immediately downstream from a SAM-II riboswitch, in a tandem arrangement (Poiata et al. 2009). This is the first example of a tandem architecture in which two different riboswitch classes bind the same effector molecule. The two tandem riboswitches do not bind SAM cooperatively and appear to be regulated differently. It is predicted that SAM-II controls gene expression at the level of transcription termination, whereas the SAM-V element is regulated at the translation initiation level (Poiata et al. 2009). The SAM-II element fails to show a ligand-dependent structural modulation in the presence of the downstream SAM-V element, implying that SAM-II binds SAM before the SAM-V transcript is synthesized (Poiata et al. 2009). This double regulation can be advantageous for cells having slow mRNA turnover rates (such as marine organisms) (Poiata et al. 2009). If the cell detects high SAM concentrations, regulation can take place by premature transcription termination at the SAM-II element. However, if the full-length mRNA is synthesized, after which SAM concentration in the cell increases, then expression of the downstream coding region can be prevented by inhibition of translation (Poiata et al. 2009).
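The two-checkpoint logic proposed for this tandem arrangement can be written out as a small decision function. This is only a schematic restatement of the predicted model described above, with SAM levels reduced to a high/low flag at two time points; the function name and the outcome labels are introduced here for illustration.

```python
def tandem_outcome(sam_high_during_transcription, sam_high_after_transcription):
    """Schematic outcome for the predicted SAM-II/SAM-V tandem arrangement.

    Checkpoint 1 (SAM-II): high SAM while the leader is being transcribed
    causes premature termination, so no full-length mRNA is made.
    Checkpoint 2 (SAM-V): if full-length mRNA exists and SAM rises later,
    translation initiation is blocked.
    """
    if sam_high_during_transcription:
        return "terminated at SAM-II"
    if sam_high_after_transcription:
        return "translation blocked at SAM-V"
    return "expressed"


for during in (False, True):
    for after in (False, True):
        print(during, after, "->", tandem_outcome(during, after))
```

The second checkpoint matters only for transcripts that escaped the first, which is why such an arrangement would be most useful when mRNAs persist long enough for SAM levels to change during their lifetime.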

1.2.14.6 SAH riboswitch

A putative SAH regulatory motif was predicted originally during a comparative sequence analysis (Weinberg et al. 2007). The SAH-binding riboswitch was subsequently confirmed upstream of genes involved in SAH catabolism in a number of α-proteobacteria and actinobacteria (Wang et al. 2008). SAH is a natural by-product of the SAM demethylation reaction and it differs from SAM by the absence of a single methyl group. SAH can thus act as a competitive inhibitor for reactions that involve SAM. In addition, it is imperative that the SAH pools in the cell are regulated tightly, as increased SAH levels lead to toxicity.

Standard experimental procedures such as in-line probing and equilibrium dialysis show that the metH regulatory element from Dechloromonas aromatica binds SAH with high affinity (apparent Kd ~20 nM) and selectivity against SAM (>1,000-fold) (Wang et al. 2008). In-line probing analyses showed that almost all functional groups present on SAH are essential for molecular recognition by the RNA and the inability to bind SAM is due to the steric clash of the methyl side-chain. These findings were corroborated by the X-ray crystal structure of the SAH-bound riboswitch (Edwards et al. 2010). Modeling of SAM into the crystal structure revealed the steric interference, confirming the 1,000-fold discrimination against SAM. This discrimination suggests that the α-proteobacteria and actinobacteria induce expression of SAH catabolic genes only when high concentrations of SAH are attained (Wang et al. 2008).

The central core of the RNA is made up of highly conserved nucleotides, flanked by helices P1, P2 and P4; helix P3 is seen only in a few organisms. An unusual LL-type pseudoknot structure, which creates a shallow cleft on the RNA surface, is stabilized in the presence of SAH (Edwards et al. 2010, Wang et al. 2008). The SAH-stabilized conformation either prevents formation of the intrinsic transcription terminator or reveals the SD sequence resulting in ribosome binding. Regulation is predicted at the level of transcription antitermination for metH, and at the level of translation initiation for the ahcY gene from P. syringae. In vivo studies of the SAH hydrolase ahcY from P. syringae showed that in the presence of SAH the riboswitch is “on”. Any disruption of the aptamer domain results in downregulation of expression (Wang et al. 2008).

Biophysical analysis using ITC showed that the SAH riboswitch, like many other riboswitches, samples an ensemble of conformations (Edwards et al. 2010). In the ligand-free state, the RNA adopts a bound-like conformation that shifts easily to the fully bound conformation in the presence of SAH.

1.3 Research goals

The focus of this dissertation has been to investigate the S box (SAM-I) riboswitch from B. subtilis. The S box regulatory system was identified in the Henkin

laboratory (Grundy and Henkin 1998). S box genes involved in sulfur metabolism, as well as methionine, cysteine and SAM biosynthesis and transport pathways, show a high degree of conservation in primary sequence and secondary structure in the 5’UTRs

(Grundy and Henkin 1998). Extensive biochemical and genetic analyses showed that the

S box leader RNAs undergo a conformational change specifically in response to SAM

(McDaniel et al. 2003). SAM directs increased termination of transcription in a purified in vitro system, in the absence of any auxiliary protein factor.

MetK synthesizes SAM from methionine and ATP. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools. Expression of the majority of the S box genes is induced during methionine starvation (when SAM pools are low) and is repressed in the presence of methionine (when SAM pools are high) (Grundy and Henkin 1998, McDaniel et al. 2003,

Tomsic et al. 2008).

A mutagenic study confirmed the direct involvement of SAM in S box regulation

(McDaniel et al. 2006). A single trans-acting mutation in the metK gene resulted in derepression of S box gene expression in vivo. This mutation (SBD1; S box-derepressed mutation 1) resulted in reduced SAM synthetase activity and decreased SAM pools in the cell (McDaniel et al. 2006). I tested the effect of this mutation on the growth of methionine auxotrophic (BR151; metB10) and prototrophic (BR151MA; Met+) strains, as well as on yitJ-lacZ expression (McDaniel et al. 2006). The SBD1 allele was sufficient to confer loss of repression of S box gene expression during growth in the presence of methionine. This mutation also resulted in a modest growth rate defect in the prototrophic

background, suggesting that the metB10 allele present in the methionine auxotroph was able to suppress the growth rate reduction caused by the SBD1 mutation (McDaniel et al.

2006).

Although metK is a part of the S box regulon, a metK-lacZ transcriptional fusion fails to exhibit increased expression during methionine limitation (when SAM pools are low). SAM also fails to stimulate increased transcription termination in vitro at the metK leader region terminator. Northern blot and quantitative real-time polymerase chain reaction (qRT-PCR) have revealed only a transient increase in the amount of metK readthrough transcripts during methionine limitation in vivo (Tomsic et al. 2008). From a physiological standpoint, MetK utilizes methionine to synthesize SAM, while the rest of the S box gene products function to synthesize and transport methionine. Thus, the functional role of metK suggests the need for an additional level of regulation, which functions in conjunction with the S box system.

The major focus of my research has been to characterize the B. subtilis metK leader RNA, in order to gain further insight regarding its regulatory mechanism. Chapter

2 will describe the characterization of the metK leader RNA using genetic and biochemical techniques. As previous studies failed to show an increase in metK-lacZ expression under low SAM pools during methionine starvation conditions, we modulated in vivo SAM pools without removing methionine from the growth medium. For this purpose, we constructed a strain in which the chromosomal metK gene was placed under the control of an inducible promoter and total SAM pools were measured. The key result obtained from our studies provided evidence for a SAM-dependent change in metK gene

expression in vivo. Increased metK-lacZ expression was observed when the SAM pools were low, and metK-lacZ expression was repressed when the in vivo SAM pools were elevated.

Chapter 2 will further discuss the phylogenetic analyses that revealed unique sequences located upstream (US box) and downstream (the DS box) of the metK S box element. Using primer extension analysis, we mapped the metK transcriptional start-site

(+1) and established that the 5’ end of the US box sequence is located precisely at the +1 position of the metK transcript. The US and DS box sequences display significant complementarity, suggesting a base-pair interaction, which was confirmed using an

RNase H cleavage assay. The base-pairing interaction was stabilized only in the absence of SAM, and the US-DS pairing was dependent on a functional S box element. Extensive mutagenic analysis of the US and DS box sequences confirmed the need for an intact US-

DS base-pairing interaction for a wild-type response to SAM.

Chapter 2 will further discuss qRT-PCR analyses performed to study the metK transcript half-life and abundance. Significant reductions in transcript stability and abundance were observed when the US box sequence was altered. These results implied that the US box sequence plays a role in mRNA stability and that pairing of the US box region with the DS box sequence protects the 5’ end of the transcript from degradation.

Chapter 2 will also describe studies of the metK leader RNAs using in vitro transcription termination assays. Multiple-round transcription assays were performed to compare the termination efficiencies of wild-type and mutant metK constructs in response to SAM. Consistent with previous results, SAM did not stimulate increased termination at

the wild-type metK leader region terminator. Our results showed a variation in the total amount of transcript for mutant constructs relative to the wild-type control. Monitoring transcription in time-course analyses may help to show whether the US box sequence plays a role in the recruitment of RNAP, thereby affecting transcription initiation.

Attempts to generate a halted complex in order to perform single-round transcription using the metK DNA template have been technically challenging. However, we have identified the conditions necessary to generate such a halted complex. Results obtained from the in vitro studies will be discussed in this chapter.

Based on the results from Chapter 2, we proposed a model for the regulation of metK gene expression from B. subtilis. We hypothesized that the metK gene is regulated at the level of mRNA stability, in addition to being under the control of the S box regulon. It is also possible that regulation occurs at the level of transcription initiation, or involves a combination of both RNA stability and transcription initiation.

Chapter 3 will focus on the in vitro investigation of the B. subtilis yitJ S box riboswitch, a well-studied leader RNA from the S box regulon. yitJ encodes methylenetetrahydrofolate reductase and is closely involved in the methionine biosynthetic pathway (Grundy and Henkin 1998, Murphy et al. 2002). The yitJ leader

RNA shows high affinity for SAM and discriminates strongly against closely related natural analogs such as SAH (affinity ~100-fold lower than for SAM) and SAC (affinity nearly 10,000-fold lower than for SAM) (McDaniel et al. 2003, Winkler et al. 2003). This chapter will discuss the mutational analysis of the SAM binding pocket of the B. subtilis yitJ S box leader

RNA. The first part of the chapter will describe our attempts to isolate aptamers with high

affinity or altered specificity using Systematic Evolution of Ligands by Exponential enrichment (SELEX). Experimental procedures and preliminary data will be explained.

However, due to technical difficulties, yitJ variants with altered properties could not be obtained.

A more direct approach was explored which targeted the yitJ SAM-binding pocket using site-directed mutagenesis. This extensive mutational study was conducted

(by V. A. Pradhan and J. Tomšič) as part of the crystal structure analysis of the B. subtilis yitJ S box riboswitch in complex with ligand (Lu et al. 2010; collaboration with Dr.

Ailong Ke, Cornell University). Thirty-two mutants that targeted key residues in the yitJ sequence were generated. We analyzed the effects of these mutations on in vitro transcription termination (V. A. Pradhan) and SAM binding (J. Tomšič). Most of the mutations disrupted SAM binding (consistent with their position in the crystal structure), as these residues either make important contacts with SAM or are crucial for stabilizing the structural domains within the SAM-binding core. A majority of these mutants exhibited high constitutive transcription termination in the absence of SAM. These data suggest that the mutations lock the RNA into a conformation that resembles the SAM-bound form even in the absence of the ligand (Lu et al. 2010). Selected mutants were analyzed further in response to a series of SAM analogs and compared to the response of the wild-type yitJ RNA. These results will also be described in Chapter 3.

Characterization of individual RNA elements within the S box leader region critical for riboswitch function will be described in Chapter 4. These results provide insight into factors responsible for S box riboswitch variability. This chapter will focus

on the design and analysis of hybrid leader RNAs generated with metE and yusC. These leader RNAs served as good candidates for this study as they exhibit different expression profiles in response to limiting SAM levels. metE, which encodes methionine synthase, is associated with the methionine biosynthetic pathway and shows high affinity towards

SAM (Grundy and Henkin 1998, Murphy et al. 2002, Tomsic et al. 2008). metE-lacZ expression is highly induced during growth in the absence of methionine and tightly repressed in the presence of methionine (Tomsic et al. 2008). yusC, which encodes an

ABC-type methionine transporter, exhibits low induction during growth in the absence of methionine and expression is not repressed completely during growth in the presence of methionine (Grundy and Henkin 1998, Hullo et al. 2004, Murphy et al. 2002, Tomsic et al. 2008). Hybrid constructs of metE and yusC were analyzed in vivo and in vitro under low and high SAM conditions. Overall, the data suggest that even though the ability to respond to changing SAM levels is a function of the SAM binding domain, the ability to promote transcription termination efficiently in the presence of SAM is dictated by both the SAM-binding domain and the terminator/antiterminator structures. We therefore conclude that both structural domains play a crucial role in the calibration of the S box regulatory system.

In the second half of Chapter 4, a distinct hybrid leader RNA will be discussed which involves metK and yusC. We investigated the effect of the metK promoter on the transcription efficiency. The metK-lacZ expression was induced when SAM pools were low and methionine levels were high (Chapter 2), but high levels of SAM failed to promote transcription termination of metK in vitro (Chapter 2). Based on these results and

the US box mutagenesis (Chapter 2), we hypothesized that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced

RNAP processivity in vitro as well as reduced transcript stability. The effects of the metK promoter and US box sequence on expression of the yusC S box leader RNA have been examined both in vivo and in vitro. These studies suggest how the metK promoter and US box sequence may affect transcription and therefore metK regulation.

CHAPTER 2

CHARACTERIZATION OF THE metK LEADER RNA: AN ATYPICAL

MEMBER OF THE Bacillus subtilis S BOX REGULON

2.1 Introduction

The S box regulon, originally identified in the Gram-positive bacterium Bacillus subtilis, is characterized by high primary sequence and secondary structural conservation.

These conserved sequences are located upstream of genes involved in sulfur metabolism and methionine and SAM biosynthesis pathways (Grundy and Henkin 1998). SAM, the molecular effector of the B. subtilis S box regulon, is synthesized from methionine and

ATP by SAM synthetase, encoded by the metK gene. The S box genes are regulated at the level of premature transcription termination. SAM binds to the majority of S box leader RNAs from B. subtilis and promotes formation of an intrinsic terminator helix, leading to downregulation of gene expression (Grundy and Henkin 1998, McDaniel et al.

2003, McDaniel et al. 2005).

As SAM is synthesized from methionine, growth in the presence of methionine

results in high SAM pools, while growth in the absence of methionine results in low

SAM pools. The physiological concentration of SAM in a methionine auxotrophic strain grown in the presence of methionine is ~300 µM (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to ~50 μM and then below the limit of detection (25 µM) after 1 h of methionine limitation (Tomsic et al.

2008). Expression of an S box gene-lacZ transcriptional fusion is induced when cells are starved for methionine, and growth in the presence of methionine results in repression of

S box gene expression. The ratio of gene expression during growth in the absence of methionine to that in the presence of methionine is termed the induction ratio.
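As an illustration of the definition above, the induction ratio can be computed directly from two β-galactosidase activities. The following Python sketch is not part of the original methods; the function name and the numerical activities are hypothetical and chosen only for illustration.

```python
def induction_ratio(activity_minus_met: float, activity_plus_met: float) -> float:
    """Expression without methionine divided by expression with methionine."""
    if activity_plus_met <= 0:
        raise ValueError("activity in the presence of methionine must be positive")
    return activity_minus_met / activity_plus_met


# Hypothetical β-galactosidase activities (Miller units), for illustration only.
example_ratio = induction_ratio(activity_minus_met=250.0, activity_plus_met=5.0)
print(example_ratio)  # 50.0
```

A ratio close to 1 would correspond to a fusion whose expression does not respond to methionine starvation, as reported for metK-lacZ.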

An extensive in vivo and in vitro characterization of the 11 S box-regulated transcriptional units from B. subtilis revealed a high degree of variation in response to

SAM limitation (Tomsic et al. 2008). In spite of high sequence and structural conservation, a few S box leader RNAs fail to exhibit typical S box gene regulation.

Induction ratios of S box-lacZ transcriptional fusions exhibit a 250-fold range after 4 h of growth in the presence or absence of methionine. Variability is also observed both in the termination efficiency in the absence of SAM and in the concentration of SAM required for half-maximal termination in vitro. Overall, these studies concluded that S box gene expression is tuned physiologically to the functional roles of the S box genes and that expression is turned on only when necessary (Tomsic et al. 2008).

Although the B. subtilis metK leader RNA contains sequence and structural elements that have been observed in other S box leader RNAs, metK-lacZ expression is not induced in the presence of low SAM pools during methionine starvation and SAM

fails to stimulate transcription termination at the metK leader region terminator in vitro

(Tomsic et al. 2008). Northern blot analysis and quantitative real-time PCR (qRT-PCR) revealed only a transient increase in the metK readthrough transcript during methionine limitation in vivo (Tomsic et al. 2008). Although studies from other laboratories have demonstrated a decrease in metK expression during growth in the presence of methionine, these experiments were conducted under steady-state growth conditions (rather than during methionine starvation conditions) in a methionine prototroph (Auger et al. 2002,

Yocum et al. 1996). It is possible that, because a methionine prototrophic strain synthesizes methionine on its own, addition of exogenous methionine further increases the in vivo SAM levels, which results in the observed reduction of metK expression.

Starvation for methionine indirectly depletes the in vivo SAM pools (Grundy and

Henkin 1998, McDaniel et al. 2003, Murphy et al. 2002) and the reduction in SAM pools correlates with the increase in expression of most S box gene-lacZ transcriptional fusions

(Tomsic et al. 2008). However, methionine starvation in a methionine auxotroph results in inhibition of growth. The metK gene product specifically synthesizes SAM from methionine, while the rest of the S box gene products synthesize and transport methionine. The functional role of metK might explain why it does not show typical S box regulation during methionine starvation. We hypothesize that the B. subtilis metK S box gene is regulated by a mechanism that works in conjunction with the S box regulon.

In the current study, in vivo and in vitro assays were performed to examine why metK is unique compared to the other S box genes and how metK expression is regulated in response to SAM.

2.2 Materials and Methods

2.2.1 Bacterial strains and growth conditions

The B. subtilis strains used in this study were BR151 (lys-3 metB10 trpC2);

BR151MA (lys-3 trpC2); BR151 Pspac-metK (lys-3 metB10 trpC2 Pspac-metK) (Pspac promoter; Yansura and Henner 1984); ZB307A (SPβc2del2::Tn917::pSK10∆6) (Zuber and Losick 1987) and ZB449 (trpC2 pheA1 abrB703 SPβ-cured) (Nakano and Zuber

1989). B. subtilis strains were grown on tryptose blood agar base medium (TBAB; Difco,

Franklin Lakes, NJ), Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) and 2XYT broth (Miller 1972). Antibiotics were added as indicated at the following concentrations: chloramphenicol, 5 μg/ml; neomycin, 5 μg/ml. IPTG (isopropyl β-D-1-thiogalactopyranoside; Gold Biotechnologies, St. Louis, MO) was used at 0.2 mM and

1.0 mM. X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold

Biotechnologies) was used at 40 μg/ml as an indicator of β-galactosidase activity. All growth was at 37ºC.

2.2.2 Genetic techniques

Transformation of B. subtilis was carried out as described previously (Henkin et al. 1990). Chromosomal DNA was prepared using the DNeasy tissue kit (Qiagen,

Chatsworth, CA). Wizard columns (Promega, Madison, WI) were used for plasmid preparations. Oligonucleotide primers were purchased from Integrated DNA

Technologies (Coralville, IA). Restriction endonucleases and DNA-modifying enzymes

were purchased from New England Biolabs (Beverly, MA) and used as described by the manufacturer. Mutations were identified by DNA sequencing (Genewiz Inc., North

Brunswick, NJ). Transcriptional fusions to lacZ were generated in plasmid pFG328

(Grundy et al. 1993), which contains a cat gene that confers resistance to chloramphenicol. The metK-lacZ fusion constructs were introduced in single copy into the B. subtilis chromosome by recombination into the SPβ prophage carried in strain

ZB307A and purified by passage of the phage through strain ZB449 (Nakano and Zuber

1989, Zuber and Losick 1987). The phage carrying the fusion was then introduced into a

B. subtilis host strain. Strains containing lacZ fusions were grown in the presence of chloramphenicol.

Construction of the strain BR151 Pspac-metK was performed as follows (done by

F. Grundy). An integration vector pSpacINT (6.2 kilobase [kb]), which served to integrate the Pspac promoter (Yansura and Henner 1984) upstream of the B. subtilis chromosomal metK gene, was generated by ligating DNA fragments from three vectors

(pMUTIN4, pBEST502 and pDG148) that are used commonly in Gram-positive bacteria

(Figure 2.1). A 600 bp fragment from pMUTIN4 carried the Pspac promoter sequence, a

2.0 kb fragment from pBEST502 carried the resistance marker for neomycin and a 3.6 kb fragment from pDG148 carried the sequence for the lacI gene (LacI repressor) and the resistance marker for ampicillin. A B. subtilis DNA fragment (~400 bp) containing part of the metK sequence, with restriction sites for HindIII and BamHI, was cloned into the pSpacINT vector using HindIII-BamHI restriction endonucleases. The Pspac promoter sequence was located immediately upstream of the metK fragment. The Pspac-metK

sequence was then integrated at the BR151 chromosomal metK site by double crossover recombination. Thus, the native metK promoter was located upstream of a truncated metK gene, while the IPTG-dependent Pspac promoter was located upstream of the intact metK gene (Figure 2.2). The LacI repressor protein is constitutively expressed in the BR151

Pspac-metK strain. Growth of this strain in the absence of IPTG resulted in inhibition of expression of the intact metK gene. The addition of IPTG inhibited the DNA binding activity of the LacI repressor, resulting in expression of the full-length MetK protein.

Strains containing the metK gene under the control of the Pspac promoter were grown in the presence of neomycin and IPTG. The metK-lacZ transcriptional fusions were integrated into BR151 Pspac-metK via transduction, as described above.

Figure 2.1 Plasmid used to generate strain BR151 Pspac-metK. The pSpacINT vector was generated using fragments from three vectors as indicated. Restriction sites are shown as vertical lines. Locations of resistance cassettes and the lacI gene are shown. Pspac, spac promoter.

Figure 2.2 Construction of the BR151 Pspac-metK strain. The metK fragment was cloned into the pSpacINT vector and integrated into BR151 at the chromosomal metK site by double crossover recombination. This resulted in the intact metK gene under the control of the IPTG-dependent Pspac promoter.

2.2.3 β-Galactosidase measurements

Strains containing lacZ fusions were grown in Spizizen minimal medium containing the required amino acids at a concentration of 50 μg/ml until early exponential growth phase and were then harvested by centrifugation. Cells were resuspended in fresh

Spizizen minimal medium in the presence or absence of methionine. Samples were collected at 1-h intervals and assayed for β-galactosidase as described previously (Miller

1972) using toluene permeabilization. Strains containing the metK gene under the control of the IPTG-dependent Pspac promoter were grown in 10 ml of 2XYT broth containing

IPTG (0.2 mM). Cells were grown until mid-log phase, harvested by centrifugation and resuspended in an equal volume (10 ml) of 2XYT broth in the absence of IPTG. The cells were then diluted 100-fold into fresh 2XYT broth, in the presence or absence of 1.0 mM

IPTG. Samples were collected at 1-h intervals and assayed for β-galactosidase activity.

All starvation experiments and assays were conducted at least twice, and variation was

<10%.
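The β-galactosidase calculation itself is not spelled out in this section. As a reminder, a minimal Python sketch of the standard Miller unit formula (Miller 1972) is given below. It assumes that absorbance is read at 420 and 550 nm and that cell density is supplied as the optical density of the culture; the function name, parameter names and example readings are hypothetical.

```python
def miller_units(a420: float, a550: float, cell_density: float,
                 minutes: float, volume_ml: float) -> float:
    """Standard Miller formula: 1000 * (A420 - 1.75 * A550) / (t * v * OD).

    a420         absorbance of the reaction at 420 nm (o-nitrophenol)
    a550         absorbance at 550 nm (light scattering by cell debris)
    cell_density optical density of the culture at sampling (assumed input)
    minutes      reaction time in minutes
    volume_ml    volume of culture assayed, in ml
    """
    if minutes <= 0 or volume_ml <= 0 or cell_density <= 0:
        raise ValueError("time, volume and cell density must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * cell_density)


# Hypothetical readings, for illustration only.
print(round(miller_units(a420=0.45, a550=0.02, cell_density=0.5,
                         minutes=20.0, volume_ml=0.1), 1))  # 415.0
```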

2.2.4 Measurement of SAM pools in vivo

BR151 Pspac-metK containing a metK-lacZ fusion was grown in 2XYT broth in the presence of 0.2 mM IPTG until mid-log phase. Cells were harvested by centrifugation and resuspended in fresh 2XYT broth in the absence of IPTG. Cell samples were collected by filtration at the indicated time points and extracted with 1.0 ml 0.5 M formic acid; the formic acid was removed by lyophilization as described by Ochi and coworkers

(Ochi et al. 1981). Cell extracts were tested in an in vitro transcription termination assay

using a yitJ template that included the glyQS promoter sequence and compared to a SAM standard curve, as previously described (McDaniel et al. 2006). Samples were also harvested at each time point and assayed for β-galactosidase activity, as described above.

2.2.5 In vitro transcription termination assays for determination of SAM pools

Templates for in vitro transcription by B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the glyQS promoter sequence upstream of the leader region of the yitJ gene, to generate a transcription start-site 17 nt upstream of the start of helix 1 (McDaniel et al. 2003). The promoter sequences were designed to allow initiation with a dinucleotide (ApC) corresponding to the +1/+2 positions of the transcript and a halt in transcription at position +16, by omission of GTP (McDaniel et al.

2003). The PCR fragment was ~400 bp in length and included 92 bp downstream from the transcription terminator to allow resolution of terminated and readthrough products.

The PCR product was purified with a Qiagen PCR cleanup kit and sequenced by

Genewiz. Single-round transcription reactions were carried out as described (Grundy et al. 2002, McDaniel et al. 2003). A reaction mixture of 20 mM Tris–HCl, pH 8, 20 mM

NaCl, 10 mM MgCl2, 100 mM EDTA, 150 µM ApC (Sigma, St. Louis, MO), 2.5 µM

GTP and ATP, 0.75 µM UTP, 0.25 µM [α-32P]-UTP (800 Ci/mmol [30 TBq/mmol]; GE

Healthcare, Piscataway, NJ), DNA template (10 nM) and His-tagged purified B. subtilis

RNAP (6 nM) was incubated for 15 min at 37ºC and placed on ice. Heparin (20 µg/ml;

Sigma) was added to block subsequent reinitiation. Elongation was resumed by the addition of 10 µM rNTPs and 40 mM MgCl2 and reactions were incubated at 37°C for an

additional 15 min. Templates were transcribed in the presence of various concentrations of SAM or cellular extract, as indicated. The final volume of the reaction was 35 µl.

Transcription was terminated by extraction with phenol-chloroform. Transcription products were resolved by denaturing polyacrylamide gel electrophoresis (PAGE) and visualized by PhosphorImager (Molecular Dynamics) analysis. Efficiency of termination was calculated as the amount of termination product divided by the sum of the readthrough and termination products. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%. An average internal cell volume of 0.54 ± 0.13 µl A595⁻¹

(Wabiko et al. 1988) was used to calculate the intracellular SAM concentrations from the cell equivalents and A595 of cell cultures used in preparation of the extracts, as described previously (McDaniel et al. 2006).
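The calculations described in this section can be sketched in Python as follows. This is an illustrative outline rather than the procedure actually used: the standard curve is read by simple linear interpolation (the curve-fitting method is an assumption), the cell volume figure is assumed to apply per ml of culture per A595 unit, and all function names and numerical values are hypothetical.

```python
def percent_termination(terminated: float, readthrough: float) -> float:
    """Termination product divided by (readthrough + termination), in percent."""
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("no transcript signal")
    return 100.0 * terminated / total


def sam_from_standard_curve(pct_term, standards):
    """Estimate SAM (µM) by linear interpolation between standards.

    standards: list of (sam_uM, percent_termination) pairs (hypothetical data).
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c0, t0), (c1, t1) in zip(points, points[1:]):
        if t0 <= pct_term <= t1:
            if t1 == t0:
                return c0
            return c0 + (c1 - c0) * (pct_term - t0) / (t1 - t0)
    raise ValueError("value lies outside the standard curve")


# Average internal cell volume from the text (Wabiko et al. 1988).
# Assumption: µl of cell volume per ml of culture per A595 unit.
CELL_VOLUME_UL_PER_A595 = 0.54


def intracellular_conc_uM(sam_pmol: float, a595: float, culture_ml: float) -> float:
    """Convert the SAM content of an extract into an intracellular concentration."""
    cell_volume_ul = CELL_VOLUME_UL_PER_A595 * a595 * culture_ml
    return sam_pmol / cell_volume_ul  # pmol per µl equals µM


# Hypothetical band intensities and standards, for illustration only.
pct = percent_termination(terminated=300.0, readthrough=100.0)   # 75.0
curve = [(0.0, 20.0), (1.0, 40.0), (10.0, 80.0)]
print(pct, sam_from_standard_curve(60.0, curve))
print(round(intracellular_conc_uM(sam_pmol=162.0, a595=0.5, culture_ml=2.0), 1))
```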

2.2.6 In vitro transcription termination assays

The wild-type and mutant metK DNA templates containing the native metK promoter were generated by PCR (KOD DNA polymerase, EMD Biosciences, San

Diego, CA). PCR products were purified with a Qiagen PCR cleanup kit and sequenced by Genewiz. The PCR fragments were ~400 bp in length and included 121 bp downstream from the metK transcription terminator, to allow resolution of terminated and readthrough products. Multiple-round transcription reactions were carried out in the presence of a high concentration of GTP, which corresponds to the +1 position of the metK transcript. The reaction mixture was as described above, except that GTP was

added at a concentration of 0.29 mM. The reaction mixture also contained 7 mM KCl.

ApC was omitted from the reaction mixture. Transcription was conducted in the presence of 10 µM rNTPs. Templates were transcribed in the presence or absence of SAM, as indicated. The reaction volume was 35 µl. Transcription reactions were incubated at 37°C for 30 min and were terminated by extraction with phenol-chloroform. Transcription products were analyzed as described above. The reactions were performed in duplicate and reproducibility was ±5%.

2.2.7 RNase H cleavage assay

The transcription reaction contained 20 mM MgCl2, 40 mM Tris pH 8.0, 50 µg/ml

BSA (Sigma), 5.0 mM each of ATP, CTP and GTP, 0.50 mM UTP, 7.5 mM GMP.

Transcription was carried out in the presence of 0.5 mM [α-32P]-UTP (800 Ci/mmol [30

TBq/mmol], GE Healthcare) for radiolabeling. DNA template was added at a concentration of 10 ng/µl. The DNA template was generated by PCR (KOD polymerase) from a plasmid containing the T7 promoter sequence attached upstream of the metK sequence. RNAs were transcribed in the presence or absence of 2.5 mM SAM, or ligand, as indicated, for 25 min at 37ºC. After transcription, a DNA oligonucleotide (5 µM) complementary to a region in the DS box sequence of the B. subtilis metK leader RNA

(positions 262-269) was added and hybridized for 5 min at 37ºC. RNase H (1 µl, 10 U/µl;

Ambion, Austin, TX) was then added to the reaction and incubated for 10 min at 37ºC.

The final volume of the reaction was 20 µl. The reactions were stopped by phenol-chloroform extraction. The resulting RNA products were resolved by denaturing PAGE

and visualized by PhosphorImager analysis. The percentage of RNAs protected from cleavage was calculated as the amount of full-length RNAs relative to the total amount of

RNA in each reaction. The reactions were performed in duplicate and reproducibility was

±10%.
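The protection calculation can be sketched in Python as shown below. The sketch assumes that the total RNA in a lane is the sum of the full-length band and the RNase H cleavage products; the function name and band intensities are hypothetical.

```python
def percent_protected(full_length: float, cleavage_products: list) -> float:
    """Full-length RNA as a percentage of all RNA signal in the lane."""
    total = full_length + sum(cleavage_products)
    if total <= 0:
        raise ValueError("no RNA signal")
    return 100.0 * full_length / total


# Hypothetical PhosphorImager intensities, for illustration only.
print(percent_protected(600.0, [250.0, 150.0]))  # 60.0
```

A higher percentage indicates that the DS box region was less accessible to the oligonucleotide, as expected when it is paired with the US box.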

2.2.8 Total RNA extraction for primer extension analysis

BR151 cells were grown in 100 ml 2XYT broth until mid-exponential growth phase, harvested by centrifugation and resuspended in 6 ml LETS buffer (0.1 M LiCl, 10 mM EDTA, 10 mM Tris HCl [pH 7.4], 1% sodium dodecyl sulfate). 6 ml phenol:chloroform:isoamyl alcohol (25:24:1) and 3 ml washed glass beads (425-600 µm; Sigma) were added to the resuspended cells and the mixture was vortexed for 2 min. RNA extraction was performed as described by Wu and coworkers (Wu et al. 1989). The RNA concentration was measured using an ND-1000 spectrophotometer (NanoDrop Technologies, Inc.,

Wilmington, DE).

2.2.9 Primer extension analysis of the metK leader RNA

Primer extension analysis was conducted using three separate oligonucleotide primers, metK RC 6 (5’-GCACCTTGGTTGTCTCACTCAGTTG-3’), metK RC 6-1 (5’-ACCTTGGTTGTCTCACTCAGTTGAAC-3’) and metK RC 9-1 (5’-GAATCATACAACCTTGCAACAGGTTAGC-3’). The oligonucleotide primers were 5’ end-labeled by incubation at 37ºC for 1 h with T4 polynucleotide kinase (New England

Biolabs) in the presence of [γ-32P]-ATP (3000 Ci/mmol [259 TBq/mmol]; MP

Biomedicals, Solon, OH). 50 μg of total RNA was subjected to reverse transcription using 600 pmol 5’ end-labeled primers with Superscript III reverse transcriptase

(Invitrogen, Carlsbad, CA). The primer extension products were heated at 95ºC for 3 min and 3 μl samples were loaded onto an 8 M urea-6.0% polyacrylamide sequencing gel. A

DNA sequencing ladder (Sequenase 2.0 Kit, USB, Santa Clara, CA) was generated as a size standard by using the same oligonucleotide primers and wild-type metK plasmid

DNA as template. All products were visualized using PhosphorImager analysis.

2.2.10 Quantitative reverse transcriptase PCR (qRT-PCR) assay

BR151 Pspac-metK cells containing a metK-lacZ transcriptional fusion were grown in 10 ml 2XYT broth in the presence of 0.2 mM IPTG. Cells were grown until mid-log phase, harvested by centrifugation and resuspended in an equal volume (10 ml) of 2XYT broth in the absence of IPTG. The cells were then diluted 100-fold into fresh

2XYT broth, in the presence or absence of 1.0 mM IPTG, and growth was monitored.

Rifampicin (150 µg/ml; Sigma) was added 3 h after the cells reached an OD of 0.10 at

595 nm. Duplicate samples (5 ml) of the cultures with and without IPTG were collected at 0, 2.5, 5, 10 and 20 min after rifampicin addition. Cells were collected on

0.45 µm pore size nitrocellulose filters (Nalgene, Rochester, NY) using a vacuum manifold. Filters were frozen immediately on dry ice. Frozen cell culture samples were scraped from the filter and resuspended in 330 µl LETS buffer. 300 µl phenol:chloroform:isoamyl alcohol (25:24:1) and 150 µl washed glass beads were added to the resuspended cells and RNA extraction was performed as described above. 30 µl of

the RNA samples generated from duplicate cultures at each time point were treated with

3 µl RNase-free DNase I (Turbo DNA-free kit; Ambion), in a final reaction volume of

100 µl, to minimize genomic DNA contamination. This step was followed by acid phenol:chloroform extractions. Reverse transcription reactions were carried out using a

Thermoscript reverse transcriptase PCR (RT-PCR) system (Invitrogen). Each DNase-treated RNA sample (1 µl) was used to generate duplicate cDNA templates resulting in 4 cDNA samples per RNA sample (final reaction volume 25 µl). Oligonucleotide primers for cDNA synthesis, lacZ 290(a) RC (5’-CGTAACCGTGCATCTGCCAGTT-3’) and 5s rRNA RC (5’-TCCTACTCTCACAGGGGGAAAC-3’), were used at a final concentration of 50 µM. Quantitative PCRs (25 µl) were performed in duplicate using iQ

SYBR Green supermix (Bio-Rad, Hercules, CA). Data sets were collected on a Bio-Rad

CFX96 real-time PCR system. The cycling conditions were as follows: 3 min at 95°C

(activation of Taq polymerase and well factor collection), followed by 40 cycles consisting of 30 s at 95°C, 20 s at 56°C, and 30 s at 72°C. Fluorescence signal data were collected during the 72°C phase of each cycle. Melt curves from 55°C-95°C (in 0.5°C increments, measuring fluorescence at each temperature) were collected for all samples following the last cycle and showed the presence of only one product in each reaction.

DNA obtained by PCR, using the same oligonucleotide primers used for RT-PCR, was used as the standard DNA. Each cDNA sample was generated in duplicate resulting in a total of 8 replicates. Efficiencies of amplification of each gene were similar based on the slopes of the standard curves. The standard curves were used to derive the copy number of each transcript in each RNA sample, which was an average of 8 replicates. Starting


quantities, abundance and half-life measurements of the transcripts were calculated using the qRT-PCR software (Bio-Rad CFX Manager) and GraphPad Prism.
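The quantification steps described above can be illustrated with a minimal sketch. It is not the calculation performed by CFX Manager or GraphPad Prism; it assumes a log-linear standard curve (Cq against log10 copy number) and first-order decay of the transcript after rifampicin addition. All function names and numerical values below are hypothetical and serve only to show the arithmetic.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve Cq = slope * log10(copies) + intercept."""
    return 10 ** ((cq - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the standard-curve slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0

def half_life(times_min, copies):
    """Half-life (min) from a fit of ln(copies) against time, assuming first-order decay."""
    slope, _ = linear_fit(times_min, [math.log(c) for c in copies])
    return math.log(2) / -slope

# Hypothetical standard curve: 10-fold dilutions of the PCR-product standard.
std_log10_copies = [3, 4, 5, 6, 7]
std_cq = [30.0, 26.7, 23.4, 20.1, 16.8]  # made-up Cq values
slope, intercept = linear_fit(std_log10_copies, std_cq)

# Synthetic decay series at the sampling times used after rifampicin addition.
times = [0, 2.5, 5, 10, 20]
copies = [1e6 * 0.5 ** (t / 4.0) for t in times]  # constructed with a 4-min half-life

print("slope:", round(slope, 2))
print("efficiency:", round(amplification_efficiency(slope), 2))
print("half-life (min):", round(half_life(times, copies), 2))
```

Comparing the slopes of the standard curves for the two amplicons is one way to check that their amplification efficiencies are similar, as stated in the text.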

2.3 Results

2.3.1 Response to varying SAM pools in vivo by the wild-type B. subtilis metK leader RNA

Previous data from our laboratory have shown that expression of the wild-type metK-lacZ transcriptional fusion is not induced during methionine starvation in a methionine auxotroph and the presence of SAM does not promote metK transcription termination in vitro (Grundy and Henkin 1998, Tomsic et al. 2008). Northern blot and qRT-PCR analyses show only a transient increase in metK readthrough transcripts when cells are starved for methionine (Tomsic et al. 2008).

The majority of our earlier studies have focused on gene expression in a methionine auxotroph under limiting methionine conditions. The effects of methionine starvation on a methionine auxotrophic strain are two-fold. Starvation for methionine indirectly depletes the in vivo SAM pools (Grundy and Henkin 1998, McDaniel et al.

2003, Murphy et al. 2002) and results in inhibition of growth. We designed a system in which the in vivo SAM levels were modulated without changing the methionine levels, thereby not affecting the growth rate as severely as during a methionine starvation.

We constructed a BR151 Pspac-metK strain in which the chromosomal metK gene was under the control of an IPTG-dependent Pspac promoter (Yansura and Henner

1984). As this system was designed to deplete the cellular SAM pools without the need to

starve for methionine, cells were grown in rich medium. We predicted that growth in the presence of IPTG would result in high in vivo SAM pools (indicating an IPTG-dependent induction of metK expression), while growth in the absence of IPTG would result in low intracellular SAM pools (suggesting a reduction in metK gene expression).

The intracellular SAM pools of BR151 Pspac-metK were measured during growth in rich medium to validate the function of the IPTG-dependent Pspac promoter (Figure

2.3). Formic acid extracts were prepared from cells harvested at different time points during IPTG limitation. To calculate the concentration of SAM present in the cellular extracts, we measured the termination efficiency at the yitJ leader region terminator promoted by addition of extracts compared to the termination efficiency promoted by known concentrations of SAM (McDaniel et al. 2006, Tomsic et al. 2008).
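The comparison of extract-promoted termination to a SAM standard curve can be sketched as a simple interpolation. This is a minimal illustration under stated assumptions: the standard-curve values are invented, termination is assumed to rise monotonically with SAM, and any dilution of the extract in the transcription reaction is ignored. The function name `interpolate_sam` is hypothetical.

```python
def interpolate_sam(termination_pct, standard_curve):
    """Linearly interpolate a SAM concentration (µM) from % termination.

    standard_curve is a list of (sam_uM, termination_pct) pairs in which
    termination increases strictly with SAM. Returns None when the reading
    lies outside the calibrated range.
    """
    points = sorted(standard_curve, key=lambda p: p[1])
    for (sam_lo, term_lo), (sam_hi, term_hi) in zip(points, points[1:]):
        if term_lo <= termination_pct <= term_hi:
            fraction = (termination_pct - term_lo) / (term_hi - term_lo)
            return sam_lo + fraction * (sam_hi - sam_lo)
    return None

# Hypothetical standard curve: (SAM in µM, % termination at the yitJ terminator).
STANDARDS = [(0, 20.0), (25, 30.0), (50, 45.0), (100, 60.0), (200, 80.0)]

print(interpolate_sam(52.5, STANDARDS))  # reading between the 50 and 100 µM standards
print(interpolate_sam(10.0, STANDARDS))  # below the calibrated range
```

Readings that fall below the lowest standard are returned as `None` rather than extrapolated, which mirrors the idea of a limit of detection.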


Figure 2.3 Measurement of in vivo SAM pools and β-galactosidase activity during IPTG limitation. BR151 Pspac-metK cells containing the wild-type metK-lacZ transcriptional fusion were grown until mid-exponential phase in the presence of IPTG (0.2 mM), harvested and resuspended in 2XYT in the absence of IPTG. Samples were collected at the indicated times. Cell extracts were neutralized by the addition of 1 N KOH, and samples were added to a B. subtilis RNAP in vitro transcription termination reaction mixture containing a yitJ DNA template. Termination efficiency was compared to a standard curve generated using known concentrations of SAM, which was used to calculate the SAM concentration present in the cell extracts. The in vivo SAM levels are plotted on the left Y-axis (µM) and β-galactosidase activity is shown on the right Y-axis (Miller units). Dashed line with open squares, β-galactosidase activity; solid line with filled circles, in vivo SAM pools.

At the T0 time point, the SAM pools in the cell extracts were at ~150 µM (Figure

2.3, left Y-axis, solid line). The SAM levels dropped rapidly to ~40 µM within the first hour of IPTG limitation and eventually dropped below the limit of detection (to ~25 µM),

~2 h after removal of IPTG from the growth medium. Expression of the metK-lacZ

transcriptional fusion was monitored concurrently for samples collected in parallel, to corroborate the effect of SAM pool modulation (using IPTG limitation) on metK-lacZ derepression in BR151 Pspac-metK. β-Galactosidase activity started at ~40 Miller units and reached 90 Miller units after 4 h of IPTG removal (Figure 2.3, right Y-axis, dashed line). These data support a correlation between lowered SAM pools and increased metK-lacZ expression in this strain.

[Figure 2.4 plot. Y-axis: β-Galactosidase (MU); X-axis: Time (h); curves: -IPTG and +IPTG.]

Figure 2.4 In vivo expression of the wild-type metK-lacZ transcriptional fusion. The metK-lacZ fusion was integrated in single copy in strain BR151 Pspac-metK (lys-3 metB10 trpC2 Pspac-metK). Cells were grown in 2XYT broth containing IPTG (0.2 mM) and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of IPTG (1 mM). Samples were taken at 1-h intervals until 4 h after resuspension. Open squares, in the absence of IPTG; filled squares, in the presence of IPTG. MU, Miller units.

The expression of the wild-type metK-lacZ transcriptional fusion was measured during growth in the presence of IPTG and compared to expression during growth in the


absence of IPTG (Figure 2.4). The wild-type metK-lacZ fusion construct exhibited repression of gene expression when SAM pools were high (in the presence of IPTG). The

β-galactosidase activity for cells grown in the presence of IPTG was in the range of 30-40

Miller units. The metK-lacZ fusion construct exhibited a 3.5-fold increase in β-galactosidase activity (to ~110 Miller units) after 4 h of growth in the absence of IPTG, when SAM pools were low. Table 2.1 lists the values of the wild-type metK-lacZ β-galactosidase activity and induction ratios during methionine starvation and IPTG limitation.

Table 2.1 Expression of wild-type metK-lacZ transcriptional fusion in vivo

Assay                   Strain              β-Galactosidase activity (Miller units at T4) a    Induction ratio (T4) b
Methionine starvation   BR151               +Met: 20 ± 2.0      -Met: 17 ± 5.0                 0.85
IPTG assay              BR151 Pspac-metK    +IPTG: 31 ± 2.0     -IPTG: 110 ± 5.0               3.5

a T4 indicates the value after 4 h of methionine starvation or IPTG limitation. Values are reported as the means ± the standard deviations for three assays.

b Induction ratio at T4 indicates the ratio of values in the absence of IPTG to values in the presence of IPTG
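The induction ratios in Table 2.1 follow from dividing the activity under the derepressing condition by the activity under the repressing condition. The short sketch below reproduces that arithmetic from the tabulated means; the function and dictionary names are hypothetical, and the standard deviations are not propagated.

```python
def induction_ratio(derepressed, repressed):
    """Ratio of activity without the repressing supplement to activity with it."""
    return derepressed / repressed

# Mean Miller units at T4, taken from Table 2.1.
TABLE_2_1 = {
    "methionine starvation (BR151)": {"repressed": 20.0, "derepressed": 17.0},
    "IPTG assay (BR151 Pspac-metK)": {"repressed": 31.0, "derepressed": 110.0},
}

for assay, values in TABLE_2_1.items():
    ratio = induction_ratio(values["derepressed"], values["repressed"])
    print(assay, round(ratio, 2))
```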


2.3.2 Deletion mapping of the metK leader RNA

The predicted secondary structure of the metK leader RNA is made up of elements that resemble a typical S box RNA (Figure 2.5). Helices 1 through 4 are arranged around a conserved core. The kink-turn (or GA) motif in helix 2, as well as the potential tertiary interaction between the loop of helix 2 (L2) and the junction region between helices 3 and 4 (J3/4), are conserved in the metK leader sequence. The highly conserved sequences within the SAM-binding pocket in helix 3, as well as the AU-rich base-pairs at the top of helix 1 are also evident.


Figure 2.5 The predicted secondary structure of the B. subtilis metK leader RNA. Numbering is relative to the transcription start site (+1). The sequence is shown in the terminator conformation; red and blue residues illustrate the alternate pairing required for formation of the antiterminator, shown above the terminator. Helices 1 to 5 are identified by boxed numbers; T, terminator; AT, antiterminator; AAT, anti-antiterminator. Bold dashed line (black) indicates sequence downstream of the terminator. Arrows indicate endpoints of the deletion mutants; red, 5’ deletion; green, 3’ deletion; purple, S box deletion. The triangle and the purple dashed line indicate the sequence that has been deleted for the S box deletion mutant.


Table 2.2 Expression of the metK-lacZ deletion mutants during IPTG assay

Construct       β-Galactosidase activity (Miller units) a               Induction ratio (T4) d
                T0 b (+/-IPTG)    T4 c (+IPTG)    T4 c (-IPTG)
wild-type       43 ± 0.70         31 ± 2.0        110 ± 5.0              3.5
5’ deletion     7.0 ± 0           9.0 ± 1.0       9.0 ± 1.0              1.0
3’ deletion     32 ± 4.5          18 ± 5.5        26 ± 6.0               1.5
ΔS box          150 ± 51          98 ± 13         290 ± 40               3.0

a Values are reported as the means ± the standard deviations for two assays.

b T0 indicates the value at the start of the IPTG assay

c T4 indicates the value after 4 h of growth in the presence or absence of IPTG

d Induction ratio at T4 indicates the ratio of values in the absence of IPTG to values in the presence of IPTG

In order to identify features within the metK leader RNA that are critical for metK gene expression in response to SAM limitation (during IPTG limitation), we performed a systematic deletion mapping of the metK leader RNA. The 5’ deletion construct (in which positions +1 to +7 were deleted) failed to exhibit an induction of expression during growth in the absence of IPTG, and regulation in response to SAM limitation was lost. Deletion of residues at the 5’ end of the metK leader sequence resulted in a 6.0-fold reduction in β-galactosidase activity at time T0, relative to the wild-type metK-lacZ fusion construct

(Table 2.2). The β-galactosidase activity of the 5’ deletion fusion construct was reduced

~12-fold after 4 h (T4) of growth in the absence of IPTG, while β-galactosidase activity

was reduced ~3.5-fold after 4 h of growth in the presence of IPTG, as compared to the wild-type metK-lacZ fusion construct under the same conditions.

The 3’ deletion construct (in which the sequence from position +219 downstream of the terminator was deleted) also failed to exhibit an induction of expression after growth in the absence of IPTG. Deleting the 3’ end of the metK leader RNA resulted in a small decrease in β-galactosidase activity at T0 (1.3-fold) and T4 (1.7-fold) during growth in the presence of IPTG as compared to wild-type metK-lacZ. The expression of the 3’ deletion mutant at T4 in the absence of IPTG was reduced ~4.0-fold as compared to wild-type metK-lacZ (Table 2.2). These results showed that deleting either end of the metK leader RNA resulted in a loss of response to SAM limitation, as the constructs failed to show an induction of metK-lacZ expression upon IPTG removal. These results indicate that the first 7 nt at the 5’ end and the sequence downstream of the terminator region at the 3’ end of the metK leader RNA are important for metK gene expression.

Deletion of the S box riboswitch element (ΔS box, which deletes helices 1-4, as well as the terminator element, from positions +24 to +247) from the metK leader RNA resulted in a 3-fold induction of expression after 4 h of growth in the absence of IPTG compared to expression in the presence of IPTG. β-Galactosidase activity of the ΔS box mutant at T0 was 4.0-fold higher than that of the wild-type metK construct and activity at

T4 was ~3.0-fold higher than wild-type both in the presence and absence of IPTG (Table

2.2). Together, results obtained from the deletion mapping indicated that deleting the sequence at the 5’ and 3’ ends of the leader region had a negative effect on metK expression, while deleting the metK S box element increased overall expression without


affecting the response to the changing SAM pools in vivo. These results suggest that the metK S box element does not participate directly in the primary mode of regulation. This prompted us to look more closely at the 5’ and 3’ regions of the metK leader RNA sequence.
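The fold changes quoted in this section compare each deletion construct with the wild-type fusion under the same condition. A minimal sketch of that comparison, using the mean values from Table 2.2, is given here; the dictionary layout and function name are hypothetical, and because the tabulated means are rounded, the computed ratios may differ slightly from the figures quoted in the text.

```python
# Mean Miller units from Table 2.2 (T0, T4 with IPTG, T4 without IPTG).
TABLE_2_2 = {
    "wild-type":   {"T0": 43.0,  "T4_plus": 31.0, "T4_minus": 110.0},
    "5' deletion": {"T0": 7.0,   "T4_plus": 9.0,  "T4_minus": 9.0},
    "3' deletion": {"T0": 32.0,  "T4_plus": 18.0, "T4_minus": 26.0},
    "delta S box": {"T0": 150.0, "T4_plus": 98.0, "T4_minus": 290.0},
}

def fold_reduction_vs_wild_type(construct, condition):
    """Wild-type activity divided by construct activity.

    Values above 1 mean the construct is less active than wild type;
    values below 1 mean it is more active.
    """
    return TABLE_2_2["wild-type"][condition] / TABLE_2_2[construct][condition]

for construct in ("5' deletion", "3' deletion", "delta S box"):
    for condition in ("T0", "T4_plus", "T4_minus"):
        ratio = fold_reduction_vs_wild_type(construct, condition)
        print(construct, condition, round(ratio, 2))
```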

2.3.3 Unique sequences are located on the 5’ and 3’ sides of the metK S box element

A phylogenetic analysis of the metK leader RNAs from several Firmicutes was conducted by F. Grundy (Figure 2.6). Comparative sequence analysis revealed two highly conserved regions in the metK leader RNAs in Firmicutes in addition to the S box element (Figure 2.7). These sequences were unique to metK as they were absent from the other S box leader RNAs from B. subtilis. Several positions within the sequence elements were invariant, leaving little possibility for covariation. Significant sequence complementarity was observed between the two sequence elements, which suggested a base-pairing interaction. The base-pairing interaction resulted in a stem loop structure that was located upstream of the SD region, designated the ‘pre-SD’ structure (Figure

2.7C).
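The notion of invariant positions in an alignment can be made concrete with a small consensus routine that follows the convention used in the figure legends (upper case for 100% conservation). This is only an illustrative sketch: the first sequence uses the B. subtilis US box residues G1 to U8 named in the text, while the other two sequences are invented for the example and are not taken from the actual alignment.

```python
from collections import Counter

def consensus(aligned_seqs):
    """Consensus string: upper case where a column is invariant,
    lower case (most common residue) where it varies."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned_seqs):
        counts = Counter(base.upper() for base in column)
        base, n = counts.most_common(1)[0]
        out.append(base if n == len(aligned_seqs) else base.lower())
    return "".join(out)

def invariant_positions(aligned_seqs):
    """1-based positions at which every sequence carries the same residue."""
    return [i for i, ch in enumerate(consensus(aligned_seqs), start=1) if ch.isupper()]

# Toy alignment: row 1 follows the residues named in the text; rows 2-3 are hypothetical.
TOY_ALIGNMENT = ["GAGCGGAU", "AAGCGGAU", "GUACGGAU"]

print(consensus(TOY_ALIGNMENT))
print(invariant_positions(TOY_ALIGNMENT))
```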


Figure 2.6 Alignment of S box sequences from metK genes in Firmicutes. The sequences in green are the predicted -35 and -10 promoter regions. The sequence shown in red is the highly conserved US box region located near the 5’ end of the metK leader RNA. The nucleotides shown in blue indicate the conserved core region. Abbreviations are as follows: Afla, Anoxybacillus flavithermus; Bcoa, Bacillus coagulans; Bhal, B. halodurans; Blic, B. licheniformis; Bpum, B. pumilus; Bsel, B. selenitireducens; Bsp14911, B. species strain 14911; Bsp SG1, B. species strain SG1; Bste, B. stearothermophilus; Bsub, B. subtilis; Esib, Exiguobacterium sibiricum; Gsp WCH70, Geobacillus species strain WCH70; Lsph, Lysinibacillus sphaericus; Mcas, Macrococcus caseolyticus; Oihe, Oceanobacillus iheyensis; Saur, Staphylococcus aureus; Scar, S. carnosus.

[Figure 2.7, panels A-C. Panel C shows the pre-SD structure, with positions 1, 10 and 19 labeled opposite positions 268, 259 and 251.]

Figure 2.7 The alignment of the metK upstream (US) and downstream (DS) box sequences from Firmicutes. A. The metK US box sequence alignment (red sequences) with the conserved core sequence highlighted in blue. Underlined nucleotides overlap helix 1 of the metK S box leader RNA. The US box consensus sequence is shown at the bottom. Upper case letters, 100% conservation. B. The metK DS box sequence alignment (green sequences) with the conserved core shown in blue. The DS box consensus sequence is at the top of the alignment. Upper case letters, 100% conservation. Abbreviations are as in Figure 2.6. Additional abbreviations are as follows: Bant, B. anthracis; Sepi, S. epidermidis. C. Pre-SD, Pre-Shine-Dalgarno.

The sequence element located to the 5’ side of the metK S box riboswitch element was termed the Upstream (US) box sequence (Figure 2.7A). It overlapped helix 1 of the metK S box riboswitch and extended into the junction region between helices 1 and 2.

The 19-nucleotide consensus sequence contained 10 invariant positions. Primer extension analysis was conducted to map the metK transcriptional start-site (+1) in order to identify the location of the metK US box sequence relative to the 5’ end of the transcript. A primer extension product (~110 bases in length) was observed only in the reaction to which reverse transcriptase enzyme was added and corresponded to a guanine (G) in the sequencing ladder (Figure 2.8, lanes 2 & 7). Several complementary DNA oligonucleotides designed to hybridize to different regions in the RNA were employed to verify this result (data not shown). The primer extension analysis indicated that the 5’ end of the metK US box sequence is located precisely at the +1 position of the metK transcript.


Figure 2.8 Primer extension analysis of the B. subtilis metK leader RNA to map the 5’ transcriptional start-site. A DNA oligonucleotide (complementary to nucleotides 83-107) was hybridized to total RNA isolated from strain BR151 grown in 2XYT broth. Two concentrations of total RNA (10 µg and 50 µg) were tested in a reverse transcription (RT) reaction. The same primer was used for generating the sequencing ladder. The bold arrows indicate the band corresponding to the metK transcription start-site; the first transcribed nucleotide of the metK mRNA is indicated by the bold letter G with an asterisk. M, size marker.


Phylogenetic analysis revealed a second sequence element, designated the Downstream (DS) box sequence, that was located ~60 nt downstream of the metK S box terminator sequence and was significantly complementary to the US box sequence

(Figure 2.7B). Of the 16 nucleotides in the DS box consensus sequence, 5 nucleotides were 100% conserved. The B. subtilis metK DS box sequence was located ~30 nt upstream of the metK translational start-site. This distance appeared to vary, such that a few organisms showed longer insertions between the end of the DS box sequence and the metK start codon (data not shown). Deleting a portion of this region (positions 278-289) from the B. subtilis metK leader RNA resulted in expression similar to wild-type metK-lacZ under limiting SAM conditions (data not shown).

2.3.4 The metK US and DS box regions are involved in a base-pairing interaction and the SAM response for pairing is dependent on a functional SAM-binding domain

We performed an RNase H probing assay to investigate the formation of the pre-

SD structure. The RNase H enzyme targets only DNA-RNA hybrids. Accessibility of the

DS box sequence was probed using a DNA oligonucleotide complementary to a region in the DS box sequence (positions 262-269). The antisense oligonucleotide was designed such that it did not interfere with nucleotides in the US box sequence that overlapped helix 1. T7 RNA transcripts were generated in the presence or absence of SAM. The wild-type metK transcript control reaction, in the absence of antisense oligonucleotide and SAM, showed protection from RNase H cleavage (Figure 2.9A, lane 1). Addition of the oligonucleotide in the absence of SAM resulted in an increase (~14-fold) in the

cleavage product, while transcripts generated in the presence of SAM were highly sensitive to cleavage, such that cleavage increased 40-fold relative to lane 1 and 3.0-fold relative to lane 2 (Figure 2.9A, lanes 1, 2 and 3). These data indicated that the addition of

SAM to the wild-type metK RNA resulted in increased accessibility of the DS box sequence.

The specificity of the metK leader RNA for SAM was confirmed by testing the wild-type metK leader RNA in response to a closely related SAM analog or SAM precursors using the RNase H cleavage assay. The wild-type metK RNA exhibited cleavage only in response to SAM. Millimolar concentrations of S-adenosylhomocysteine

(SAH), methionine or dATP (either alone or in combination) failed to stimulate increased cleavage of the wild-type leader RNA by RNase H (data not shown). We also tested the effect of the stringent response alarmone (ppGpp) and failed to observe an increase in cleavage of the wild-type leader RNA (data not shown). Titration of the wild-type metK leader RNA in the presence of varying concentrations of SAM indicated that a minimum concentration of 50 µM was required for ~50% cleavage (Figure 2.9B).
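The percentage of cleavage used in the RNase H assay is the cleaved band expressed as a fraction of the total transcript (protected plus cleaved), as defined in the legend to Figure 2.9. The sketch below shows that calculation together with a scan of a titration series for the lowest SAM concentration that reaches a chosen threshold. The SAM concentrations are those listed in the figure legend; the band intensities are invented for illustration, and the function names are hypothetical.

```python
def percent_cleavage(protected, cleaved):
    """Cleaved transcript as a percentage of total (protected + cleaved) signal."""
    total = protected + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * cleaved / total

def lowest_conc_reaching(threshold_pct, concentrations, bands):
    """First concentration (in ascending order) whose % cleavage meets the threshold."""
    for conc, (protected, cleaved) in zip(concentrations, bands):
        if percent_cleavage(protected, cleaved) >= threshold_pct:
            return conc
    return None

# SAM concentrations (µM) from the Figure 2.9B titration.
SAM_UM = [0, 0.10, 1.0, 10, 50, 100, 500, 1000, 5000]
# Hypothetical (protected, cleaved) band intensities, one pair per lane.
BANDS = [(95, 5), (94, 6), (90, 10), (75, 25), (50, 50),
         (40, 60), (25, 75), (20, 80), (18, 82)]

for conc, (p, c) in zip(SAM_UM, BANDS):
    print(conc, round(percent_cleavage(p, c), 1))
print("lowest SAM giving >= 50% cleavage:", lowest_conc_reaching(50.0, SAM_UM, BANDS))
```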


Figure 2.9 Oligonucleotide-directed RNase H cleavage mapping of the B. subtilis metK leader RNA. A. RNase H cleavage of the wild-type and mutant metK leader RNAs in response to SAM. The US1 mutant contains a CGG→AUU change and the S box mutant contains a single U→C substitution in the SAM binding pocket (position +105). Denaturing PAGE analysis of RNase H cleavage products. Oligonucleotide was added to radiolabeled B. subtilis metK RNA generated in the presence (+) or absence (–) of SAM (1 mM). RNA-DNA hybrids were cleaved with RNase H and the products were visualized by autoradiography. B. SAM titration of the wild-type metK leader RNA. Increasing concentrations of SAM (0, 0.10, 1.0, 10, 50, 100, 500, 1000 and 5000 µM) were added to each reaction. P, protected transcript; C, cleavage transcript; % C, percentage of cleavage transcript relative to the total amount of transcript.


Three positions in the metK US box sequence that are predicted to be 100% conserved throughout all the Firmicutes were mutated simultaneously (C4A, G5U and

G6U) to generate the metK US1 mutant. The resulting triple mutant exhibited constitutive high cleavage in both the presence and absence of SAM (Figure 2.9A, lanes 5 and 6) relative to wild-type metK. Thus, disrupting the invariant US box nucleotides resulted in a loss of response to SAM in the RNase H cleavage assay.

The metK S box mutant, containing a U105C substitution in the SAM binding domain, resulted in a nearly equal ratio of cleavage to protection, regardless of whether

SAM was added to the reaction (Figure 2.9A, lanes 7, 8, 9). A similar U→C substitution in the SAM binding pocket of the yitJ leader RNA abolishes SAM binding (Lu et al.

2010). It is important to note that the metK S box mutant exhibited higher cleavage of the

DS box sequence in the absence of SAM, but increased protection in the presence of

SAM compared to that of the wild-type RNA under the same conditions. This result indicated that a point mutation in the SAM binding pocket (independent of the US and

DS box sequences) led to a general increase in accessibility of the DS box sequence.

The metK ΔS box RNA exhibited high protection from RNase H cleavage (~98%, data not shown) regardless of the presence of SAM. This result indicated a loss of response to SAM due to the deletion of the S box element. The metK ΔS box RNA containing the US1 mutation showed constitutive cleavage (~95%, data not shown). The metK ΔS box US1 RNA exhibited a cleavage pattern (data not shown) similar to that shown by the metK US1 RNA alone (Figure 2.9A, lanes 5 and 6), suggesting the consistency of the US1 phenotype in the RNase H cleavage assay.


2.3.5 Mutagenic analysis of the conserved metK US box element

Extensive site-directed mutagenesis was performed to study the unique sequences in the metK leader RNA that are predicted to be important for metK regulation. Guanine at the metK +1 site was left intact so as to not affect transcription initiation by the RNAP.

We have evidence that shows that a G1A-lacZ fusion construct exhibited loss of induction during IPTG limitation with an overall reduced β-galactosidase activity, compared to wild-type metK-lacZ (Woltjen and Grundy, data not shown). Nucleotides beyond position U8 in the metK US box sequence overlap helix 1 of the S box element

(Figure 2.7A) and hence were not altered. We therefore focused our attention on the US box positions A2-U8, which were mutated individually to every other nucleotide; U8 was also deleted. A total of 22 US box point mutants were examined in response to

SAM limitation in the BR151 Pspac-metK strain using in vivo expression assays. Figure

2.10 illustrates the results obtained from a comprehensive in vivo analysis of the US box point mutants.
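The count of 22 mutants follows from the design described above: seven positions (A2 to U8), each changed to the three alternative nucleotides, plus the U8 deletion. A short enumeration makes this explicit; the wild-type residues are those named in the text, while the naming scheme and function name are only illustrative.

```python
# Wild-type residues at the targeted US box positions, as named in the text.
US_BOX_TARGETS = {2: "A", 3: "G", 4: "C", 5: "G", 6: "G", 7: "A", 8: "U"}

def us_box_point_mutants():
    """All single substitutions at positions 2-8, plus the deletion of U8."""
    names = []
    for pos, wild_type in sorted(US_BOX_TARGETS.items()):
        names.extend(f"{wild_type}{pos}{alt}" for alt in "ACGU" if alt != wild_type)
    names.append("U8Δ")
    return names

mutants = us_box_point_mutants()
print(len(mutants))
print(", ".join(mutants))
```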


Figure 2.10 In vivo expression analysis of wild-type metK compared to the metK US box mutant constructs using the IPTG assay. BR151 Pspac-metK cells, containing the appropriate transcriptional fusion construct, were grown in the presence of IPTG (0.2 mM) up to mid-exponential phase, harvested and resuspended in the absence or presence of IPTG (1.0 mM). Samples were taken every 1 h until 4 h and tested for β-galactosidase activity. The graph shows data for the 4 h time point. MU, Miller units.

Mutant constructs with sequence changes in the US box region that disrupted the base-pairing between the US and DS boxes in the pre-SD structure exhibited an overall reduced gene expression with no response to the changing SAM levels; examples include the A2C/U, G3C/U, A7C/U and U8C mutants (Figure 2.10). Each nucleotide substitution weakened the base-pairing interaction and resulted in a loss of response to changing

concentrations of SAM (Figure 2.5, right panel). However, mutants in which the US-DS pairing was maintained, either by a newly formed Watson-Crick or a G∙U wobble base-pair, exhibited a response to SAM limitation with increased readthrough in the absence of

IPTG, consistent with expression of the wild-type metK-lacZ fusion construct.

Of the 8 selected US box sequence positions, 5 positions (C4, G5, G6, A7 and

U8) were predicted to be 100% conserved. However, the A7 and U8 positions tolerated a sequence change. The A7G substitution was tolerated as it formed a wobble base-pair with a U in the DS box sequence (Figure 2.10). Although the U8 position is located in a loop region and no obvious nucleotide in the DS box sequence is predicted to base-pair with it, mutation of this invariant position to any of the other three nucleotides (A, C or

G) or even deleting it (U8Δ) was tolerated (Figure 2.10).

Positions C4-G6 did not tolerate any sequence change, even though a potential to base-pair was maintained. Mutations C4A and C4G resulted in constitutive high readthrough. Substitutions at the G5 and G6 positions, as well as the mutation C4U, led to overall reduced β-galactosidase activity, with loss of response to SAM limitation relative to the wild-type metK-lacZ construct (Figure 2.10). The G5U mutant exhibited the lowest β-galactosidase activity among the 22 US box point mutants. These results imply that the C4, G5 and G6 residues are particularly crucial for the metK response in vivo, and indicate the importance of sequence over base-pairing interaction within this region. The three positions C4, G5 and G6, that are 100% conserved in the metK alignments, are part of a designated ‘conserved core’ region (Figures 2.7A and 2.7B, residues highlighted in blue). The results from the in vivo expression analysis of the C4-


G6 mutants are consistent with the in vitro results from the RNase H cleavage assay, in which the severe metK US1 triple mutant (C4G5G6→A4U5U6) exhibited constitutive high cleavage. Together, these results suggest that disrupting the US-DS pairing interaction results in a loss of response to SAM.

2.3.6 Physiological context of the G5U mutant

The G5U metK-lacZ transcriptional fusion construct exhibited the lowest β-galactosidase activity in BR151 Pspac-metK (~5 Miller units in the absence of IPTG, 22-fold lower than wild-type metK-lacZ expression, Figure 2.10) compared to the other metK

US box mutants. The G→U substitution at position G5 was predicted to disrupt a 100% conserved G-C base-pair with the DS box sequence. We introduced this mutation into the chromosomal metK copy of a methionine auxotroph (BR151; metB10) to generate strain BR151-G5U. Mutating the native metK gene resulted in a growth defect on TBAB plates, such that colonies appeared tiny compared to a strain with an intact metK gene

(Grundy, data not shown). The difference in growth seen on nutrient rich plates suggested that the mutation G5U resulted in decreased SAM synthetase activity.

We generated the strain BR151-G6A in which the US box point mutation G6A was introduced into the chromosomal metK sequence. The G6A metK point mutant exhibited an intermediate β-galactosidase activity in the BR151 Pspac-metK background, which was lower than wild-type (~4.4-fold lower in the absence of IPTG) but ~5-fold higher than G5U in the absence of IPTG (~22 Miller units; Figure 2.10). BR151-G6A accordingly showed a colony growth phenotype that was intermediate between

wild-type and BR151-G5U (Grundy, data not shown).

We tested the ability of the mutant strains to grow in the presence of ethionine, the S-ethyl analog of methionine. Growth of the mutant strains was compared to that of an isogenic strain containing the wild-type metK gene, BR151-ZKO. Methionine was included in the minimal medium to support growth of the auxotrophic strains. Growth was analyzed qualitatively (by observing the colony size) on minimal medium plates, with or without ethionine (25 µg/ml). In the absence of ethionine but in the presence of methionine, the wild-type BR151-ZKO strain showed bigger-sized colonies compared to the tiny colonies of BR151-G5U, while BR151-G6A showed an intermediate growth phenotype (data not shown). In the presence of both ethionine and methionine, the difference in colony sizes for the three strains was consistent such that the wild-type showed bigger sized colonies, BR151-G5U showed tiny colonies and the BR151-G6A showed medium sized colonies (data not shown). This result suggested that the ethionine present in the medium was outcompeted by methionine, leading to preferential uptake of methionine over ethionine.

We also tested the effect of the metK mutations in a methionine prototrophic

(BR151MA; Met+) background, during which exogenous methionine was omitted from the minimal medium and growth of the three strains was compared only in response to ethionine. In the absence of ethionine, BR151MA-ZKO (containing the wild-type metK allele) displayed bigger colonies compared to BR151MA-G5U, while BR151MA-G6A showed an intermediate colony size. As expected, BR151MA-ZKO showed no growth on minimal medium plates containing ethionine, indicating the toxic effect of ethionine.


These results suggested that the toxicity of ethionine for the prototrophic wild-type strain could not be overcome as methionine was not included in the medium. However, both

BR151MA-G5U and BR151MA-G6A were able to grow in the presence of ethionine, in contrast to the wild-type control (data not shown). We predicted that mutations G5U and G6A in the metK sequence resulted in significantly lower SAM synthetase activity compared to the wild-type strain, which led to very low amounts of ethionine incorporation into the mutant strains. The difference in SAM synthetase activity of the wild-type and mutant strains can explain the preferential toxicity of ethionine for the wild-type strain.

To corroborate the results obtained during the ethionine analysis, we compared the SAM pools at a single time point for the strains BR151-ZKO and BR151-G5U. Cells grown in rich medium or minimal medium containing methionine were harvested at mid- log phase, and the SAM pools were measured. Under both growth conditions, BR151-

ZKO showed ~80 µM SAM, while SAM pools for the strain BR151-G5U were below the level of detection (data not shown). These results validated our prediction that the G5U mutant resulted in lower SAM synthetase activity.

2.3.7 Mutagenesis of the metK DS box sequence

A subsequent mutagenic analysis was conducted to gain insight into the functional role of the DS box sequence. Unlike the US box mutagenesis, we did not target the entire DS box sequence for mutagenesis as a few substitutions on the 5’ side of the DS box sequence would have subsequently disrupted the 3’ portion of the US box

sequence, which overlaps helix 1 of the S box element. We therefore targeted specific positions within the DS box sequence based on the results obtained from the US box mutagenesis. Figure 2.11 illustrates the results from the comprehensive in vivo analysis of the DS box mutants.

The majority of the DS box point mutant constructs failed to exhibit a response to

SAM limitation, and showed 3-4-fold reduced β-galactosidase activity in the absence of

IPTG compared to wild-type metK-lacZ (Figure 2.11). The DS box mutants showed less variation in gene expression at each position compared to the US box mutants, although not every alternate nucleotide was tested in the DS box mutagenic analysis. Only two DS box mutants (U3pC and G16pA) showed a response to changing SAM levels. As seen for the US box point mutants, an intact US-DS base-pairing interaction appeared to maintain a response to limiting levels of SAM, consistent with the wild-type phenotype.

The wild-type nucleotide U265 in the DS box sequence makes a wobble base-pair with position G3 in the US box sequence. The U265C mutation in the DS box sequence formed a Watson-Crick interaction by pairing with position G3 in the US box. The single nucleotide substitution in the DS box sequence maintained the response to SAM, as it resulted in increased expression during SAM limitation. The DS box point mutation

G252A also showed a response to SAM, although expression in the absence of IPTG was

~2-fold higher than in the presence of IPTG. In the absence of IPTG, the β-galactosidase activity of the G16pA-lacZ construct was ~2-fold lower than the activity of the wild-type metK-lacZ fusion construct.


Figure 2.11 In vivo expression analysis of wild-type metK compared to the metK DS box mutant constructs and metK US-DS box double mutants using the IPTG assay. BR151 Pspac-metK cells, containing the appropriate transcriptional fusion construct, were grown in the presence of IPTG (0.2 mM) up to mid-exponential phase, harvested and resuspended in the absence or presence of IPTG (1.0 mM). Samples were taken every hour until 4 h and tested for β-galactosidase activity. The graph shows data for the 4 h time point. MU, Miller units.

Three positions within the DS box sequence, G264, C263 and C262 (C263 and

C262 are 100% conserved in the DS box consensus sequence) were predicted to base-pair with three positions C4, G5 and G6 from the US box sequence. The six GC-rich positions together make up the ‘conserved core’ region based on the phylogenetic analysis of the metK genes from Firmicutes. Similar to the mutations in the US box conserved core,


substitutions within the DS box conserved core region did not result in increased expression during SAM limitation, in spite of the potential to base-pair. This result strongly indicated sequence preference over base-pairing, as five out of the six conserved core nucleotides are invariant throughout the entire phylogeny.

Compensatory mutations were generated by simultaneously changing the US and

DS box positions within and outside of the conserved core region to maintain the base-pairing interaction between the two regions. We tested three double mutants within the conserved core, out of which only one double mutant (with the mutation C4U-G264A) showed increased gene expression upon limitation for SAM, suggesting that the compensatory mutant rescued the phenotype of the individual point mutants (Figure

2.11). However, wild-type expression levels were not achieved under these conditions as

β-galactosidase activity of the C4U-G264A double mutant was 1.6-fold lower than activity of the wild-type construct.

We also tested mutants with compensatory changes outside the conserved core region (G3A-U265C, A7U-U261A and A7G-U261C). The single nucleotide substitutions

G3A and U265C, in the US and DS box sequences respectively, maintained a response to changing concentrations of SAM (Figure 2.10). However, the compensatory mutation

G3A-U265C resulted in a loss of response to SAM limitation (Woltjen and Grundy, data not shown). The newly formed A-C mismatch disrupted the US-DS pairing, and the mutant was unable to exhibit high expression under low SAM conditions. This result further validated the prediction that the US-DS base-pairing interaction is required for a response to SAM.


The single nucleotide substitution A7U in the US box sequence resulted in a disruption of the US-DS base-pairing. This mutant exhibited low expression during SAM limitation (Figure 2.10). Introducing the compensatory change in the DS box sequence

(U261A) failed to rescue the US box mutant phenotype to wild-type expression levels

(Figure 2.11). A different double mutation A7G-U261C resulted in high expression under low SAM concentrations (Woltjen and Grundy, data not shown). This result suggested a preference for a purine at position 7, as only a G·U wobble (formed by the A7G point mutant) or a G-C Watson-Crick base-pair (formed by the A7G-U261C double mutant) was tolerated at this position. Based on the above results, we can conclude that the metK

US and DS box sequences are involved in a pairing interaction and that the US-DS pairing is required to obtain a response to SAM. However, at the same time, we can speculate based on the mutagenesis results for the conserved core that base-pairing is not always sufficient to achieve increased metK-lacZ expression during SAM limitation.

These results suggest that the conserved core region acts as a recruiting site for a potential factor that binds to the US-DS pairing interaction and further stabilizes the interaction. These results clearly indicate a sequence requirement for the conserved core region.
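The pairing outcomes described above for US box positions 3 and 7 can be tabulated with a short script. This is a minimal Python sketch and not part of the original analysis; the function name `classify_pair` and the three category labels are illustrative assumptions, and the listed pairs are only those named in the text.

```python
# Illustrative sketch only: classify a US-DS nucleotide pair by pairing type.
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(us_base: str, ds_base: str) -> str:
    """Return 'Watson-Crick', 'wobble', or 'mismatch' for two RNA bases."""
    pair = frozenset((us_base.upper(), ds_base.upper()))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# Pairs taken from the text: (construct, US box base, DS box base).
constructs = [
    ("wild-type G3/U265", "G", "U"),
    ("U265C", "G", "C"),
    ("G3A", "A", "U"),
    ("G3A-U265C", "A", "C"),
    ("A7U", "U", "U"),
    ("A7U-U261A", "U", "A"),
    ("A7G", "G", "U"),
    ("A7G-U261C", "G", "C"),
]

for name, us_base, ds_base in constructs:
    print(f"{name}: {us_base}-{ds_base} {classify_pair(us_base, ds_base)}")
```

The pairing class alone does not predict the phenotype: the A7U-U261A double mutant restores a Watson-Crick pair yet was not rescued, which is the observation behind the proposed purine preference at position 7.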

2.3.8 Effect of the US box mutations on metK transcript stability and abundance

Based on the results from the in vivo reporter assays, we wanted to analyze whether mutating the 5’ US box sequence had an effect on the metK transcript stability.

Addition of rifampicin followed by qRT-PCR was performed to measure the half-life

(t1/2) of metK-lacZ RNAs. We also measured the abundance of the transcripts from the same samples. Abundance of the 5S rRNA, which was measured for each RNA sample, served as the control. As expected, no significant decrease in 5S rRNA abundance was observed during growth.


Figure 2.12 Measurement of RNA abundance of metK transcripts. Cells were grown in rich medium either in the presence or in the absence of IPTG (0.2 mM). Growth was monitored until mid-exponential phase and cells were harvested at T3. Duplicate samples were filtered through a vacuum manifold at 0, 2.5, 5, 10 and 20-min intervals post-rifampicin (150 µg/ml) treatment and instantly frozen on dry ice. The transcript half-life (t1/2) was determined by measuring the decrease in transcript abundance over time. A. Wild-type metK-lacZ; B. G5U metK-lacZ; C. US1 metK-lacZ. Open symbols indicate transcripts in the absence of IPTG, filled symbols indicate transcripts in the presence of IPTG. SQ denotes the starting quantity of the RNA as determined by the qRT-PCR software (Bio-Rad CFX Manager).
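The caption states that t1/2 was determined from the decrease in transcript abundance over time but does not give the fitting procedure. One common approach, shown here as a hedged sketch, assumes first-order decay and fits ln(SQ) against time by least squares, so that t1/2 = ln(2)/k. The function name `half_life` and the synthetic values below are hypothetical and are not data from this study.

```python
import math


def half_life(times_min, quantities):
    """Estimate t1/2 (min) assuming first-order decay: ln(SQ) = ln(SQ0) - k*t."""
    if len(times_min) != len(quantities) or len(times_min) < 2:
        raise ValueError("need at least two matched time/quantity points")
    if any(q <= 0 for q in quantities):
        raise ValueError("quantities must be positive to take logarithms")

    logs = [math.log(q) for q in quantities]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n

    # Ordinary least-squares slope of ln(SQ) versus time.
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("no decay detected; half-life is undefined")

    return math.log(2) / -slope


# Synthetic example (assumed values): a transcript that halves every 5 min,
# sampled at the time points listed in the caption.
times = [0, 2.5, 5, 10, 20]
synthetic_sq = [1000 * 0.5 ** (t / 5.0) for t in times]
print(round(half_life(times, synthetic_sq), 3))
```

A log-linear fit uses all five time points rather than only the 0 and 20 min samples, so it is less sensitive to noise in a single measurement; it is only appropriate if decay is close to exponential over the sampled interval.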


The RNA abundance for the wild-type metK-lacZ fusion construct was compared to two mutant metK-lacZ fusion constructs, the US box G5U point mutant, and the more severe triple mutant, US1. In the absence of IPTG (when SAM levels are low), the wild-type metK-lacZ RNA abundance dropped ~4.0-fold after 20 min of IPTG limitation, relative to the 0 min time point (Figure 2.12A), while in the presence of IPTG (under high SAM levels), wild-type metK-lacZ RNA abundance dropped ~10-fold relative to the

0 min time point. The US1 mutant exhibited a drastic (~340-fold) reduction in transcript abundance after 20 min of IPTG limitation, relative to the 0 min time point (Figure

2.12C), while RNA abundance in the presence of IPTG dropped 12-fold over a period of

20 min. These results indicated the severe effect of the triple mutant on metK RNA abundance. The G5U construct showed an intermediate phenotype relative to the wild-type and US1 metK constructs. The G5U-lacZ transcript levels dropped ~24-fold after 20 min of IPTG limitation, while the RNA abundance in the presence of IPTG dropped ~7-fold over a period of 20 min (Figure 2.12B).

The t1/2 for the wild-type metK-lacZ RNA in the absence of IPTG was ~15 min, suggesting that the wild-type transcript is extremely stable when SAM levels are low. In the presence of IPTG (under high SAM levels), the t1/2 for the wild-type metK-lacZ RNA dropped ~12-fold to 1.2 min (Figure 2.12A). These results indicated that the wild-type transcript was stabilized when SAM levels were low. The US1 mutant exhibited a reduction in the transcript half-life when cells were grown in the absence of IPTG compared to wild-type metK-lacZ. The t1/2 value in the absence of IPTG was less than

0.82 min, ~18-fold lower than the wild-type t1/2 value, suggesting that the transcript


stability is reduced significantly due to the severe triple mutation. The transcript stability of the US1 mutant in the presence of IPTG was comparable to wild-type, with a t1/2 value of 1.9 min (Figure 2.12C). The transcript stability of the G5U mutant was intermediate relative to the wild-type and US1 metK constructs, with a t1/2 value of ~7.4 min (Figure

2.12B). This value is ~2-fold lower compared to the wild-type t1/2 under the same conditions, indicating a less severe disruption of the US-DS pairing interaction as compared to the US1 mutant. The G5U-lacZ transcript showed a t1/2 of 1.2 min in the presence of IPTG after 20 min, equivalent to that of wild-type metK-lacZ RNA. These results suggest that pairing of the US-DS boxes is important for transcript stability and that the presence of SAM reduces transcript stability.

We also measured the change in gene expression over time for the same rifampicin-treated samples, grown in the presence or in the absence of IPTG. Expression at each time point was normalized to the stable 5S rRNA control, which showed no change in expression under these conditions (Figure 2.13). All values are relative to the 0 min time point under each condition. Wild-type metK-lacZ exhibited a ~3.0-fold reduction in gene expression over a 20-min period in the absence of IPTG, and an 11-fold drop in expression over the same time period in the presence of IPTG (Figure 2.13, green squares). The overall expression of wild-type metK-lacZ during growth in the absence of

IPTG (when SAM pools are low) was ~4.0-fold higher than for cells grown in the presence of IPTG (when SAM pools are high). This further corroborated the observation that wild-type metK-lacZ expression increases when SAM pools are low during IPTG limitation. The US1-lacZ transcript showed a ~400-fold decrease in expression after 20


min during growth in the absence of IPTG, while expression dropped ~17-fold after cells were grown in the presence of IPTG (Figure 2.13, blue triangles). The G5U-lacZ RNA exhibited a 23-fold reduction in gene expression in the absence of IPTG, and a 10-fold drop in expression in the presence of IPTG, over a period of 20 min (Figure 2.13, orange circles). The G5U mutant exhibited an expression profile that was intermediate between that of the wild-type metK-lacZ and US1-lacZ constructs, suggesting that the single point mutation was less severe in its effect on gene expression. Expression for all three constructs in the presence of IPTG was nearly superimposable (Figure 2.13, filled symbols).

[Figure 2.13 plot: normalized fold expression (log scale, 0.001 to 1) versus time (0 to 25 min); series: WT metK, G5U metK, US1 metK]

Figure 2.13 Expression profiles of the wild-type and mutant metK-lacZ transcripts. Gene expression for the same rifampicin-treated samples (as in Figure 2.12), grown in the presence or in the absence of IPTG. Expression at each time point was normalized to the stable 5S rRNA control. Squares, wild-type metK-lacZ; circles, G5U metK-lacZ; triangles, US1 metK-lacZ. Open symbols, cells grown in the absence of IPTG; filled symbols, cells grown in the presence of IPTG.
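The normalization described in this caption (each time point normalized to 5S rRNA, then expressed relative to the 0 min sample) can be sketched as follows. This is an illustrative Python example under the assumption that starting quantities (SQ) for the target and the reference are divided directly; the function name `normalized_fold_expression` and the input values are hypothetical, and the qRT-PCR software may apply a different calculation.

```python
def normalized_fold_expression(target_sq, reference_sq):
    """Normalize target SQ to reference SQ, then scale to the first time point."""
    if len(target_sq) != len(reference_sq) or not target_sq:
        raise ValueError("target and reference need the same non-zero length")
    if any(r <= 0 for r in reference_sq) or target_sq[0] <= 0:
        raise ValueError("reference values and the 0 min target must be positive")

    # Step 1: correct each sample for input RNA using the stable reference.
    ratios = [t / r for t, r in zip(target_sq, reference_sq)]

    # Step 2: express every time point relative to the 0 min sample.
    baseline = ratios[0]
    return [ratio / baseline for ratio in ratios]


# Assumed example values: target drops 8-fold while the reference stays flat.
print(normalized_fold_expression([800, 400, 100], [2, 2, 2]))
```

With this definition the 0 min value is always 1, which matches the upper limit of the y-axis in Figure 2.13, and a value of 0.1 corresponds to a 10-fold drop.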

2.3.9 In vitro analysis of the metK US and DS box mutants

Transcription termination assays were performed to determine if SAM could promote premature termination of transcription for the wild-type and mutant metK constructs. Wild-type metK failed to show a response to SAM in vitro, i.e., there was no


increase in transcription termination in the presence of SAM (Figures 2.14 and 2.15).

(Note: The in vitro data have been expressed as percent readthrough instead of termination, for easier comparison with the in vivo readthrough expression analysis).

The majority of the metK US box mutants did not show a response to SAM in vitro (Figure 2.14). Only two US box mutants, A2C and G3A, showed a 1.5-fold reduction in percent readthrough (or increase in percent termination) in the presence of SAM. Not all substitutions at the same position showed the same amount of readthrough. For example, the A2 mutants showed a mixed phenotype, such that A2G exhibited high readthrough, A2U showed very low readthrough, while A2C responded to changing concentrations of SAM. Positions G3 and A7 exhibited different phenotypes for each nucleotide substitution. Out of the 3 mutants at position G5, 2 showed low readthrough, while mutants at position C4 exhibited constitutive high readthrough (Figure 2.14).


Figure 2.14 In vitro analysis of the metK US box mutants. Transcription termination assays were conducted in the absence or presence of SAM (2.5 mM). Constructs have been grouped according to nucleotide position. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Here, the Y-axis denotes the percent readthrough, which was calculated by subtracting the termination value from 100.
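The calculation given in this caption can be written out explicitly. The sketch below is a minimal Python illustration; the function name `percent_readthrough` and the band intensities are hypothetical and are not values from these experiments.

```python
def percent_readthrough(terminated, readthrough):
    """Percent readthrough = 100 - percent termination.

    Percent termination is the terminated product relative to the sum of the
    terminated and readthrough products, as defined in the figure caption.
    """
    total = terminated + readthrough
    if terminated < 0 or readthrough < 0 or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive sum")
    percent_termination = 100.0 * terminated / total
    return 100.0 - percent_termination


# Assumed band intensities (arbitrary units) without and with SAM.
minus_sam = percent_readthrough(terminated=40, readthrough=60)
plus_sam = percent_readthrough(terminated=60, readthrough=40)
print(minus_sam, plus_sam, minus_sam / plus_sam)
```

In this assumed example readthrough falls from 60% to 40% on addition of SAM, a 1.5-fold reduction, which is the size of response reported for the A2C and G3A mutants.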

The metK DS box mutants were also analyzed in response to SAM using the in vitro transcription termination assay (Figure 2.15). The metK DS box mutants showed a relatively consistent percent readthrough compared to the metK US box mutants

(comparing Figures 2.14 and 2.15). However, the DS box mutants failed to show any increase in percent termination in the presence of SAM (Figure 2.15).

Figure 2.15 In vitro analysis of the metK DS box mutants. Transcription termination assays were conducted in the absence or presence of SAM (2.5 mM). Constructs have been grouped according to nucleotide position. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Here, the Y-axis denotes the percent readthrough, which was calculated by subtracting the termination value from 100.

Three US-DS box double mutants exhibited a ~1.1-1.2-fold increase in the total transcription compared to wild-type metK and the DS box point mutants (Figure 2.15).

However, none of the double mutants responded to the presence of SAM in vitro.

Overall, unlike results from the in vivo reporter assays, the in vitro transcription termination analysis remained inconclusive. Additional studies will be required to understand metK regulation in vitro.

2.3.10 Conditions to generate a metK halted-complex during in vitro transcription

While testing the metK mutants using the multiple-round in vitro transcription termination assays, we simultaneously identified experimental conditions that would allow us to perform a single-round transcription assay using the metK template and were successful in generating a metK halted-complex. As the +1 site was identified using primer extension analysis, we used high GTP as the initiating nucleotide. Attempts to initiate transcription using the GpA dinucleotide were also successful. Most of the previous in vitro assays were performed using [α-32P]-UTP as the radionucleotide and by leaving out GTP to generate a halt during transcription. As the metK leader RNA sequence contains 2 Us within the first 12 nt, we omitted UTP (instead of GTP) and performed the experiment in the presence of an alternate radionucleotide, [α-32P]-CTP.

This was predicted to halt the transcription upon reaching position U8. Transcription was resumed by addition of all 4 nucleotides. We observed transcripts for the wild-type metK

RNA and the metK US box mutation C4A (data not shown) but SAM failed to stimulate termination of transcription at the leader region terminators for either construct.

Consistent with previous results (Figure 2.14), the wild-type RNA exhibited an equal ratio of terminated to readthrough products in the presence and absence of SAM, while the metK C4A RNA showed constitutive high readthrough (data not shown). Further optimization will be necessary to identify the precise conditions required to characterize


the metK leader RNA in a purified in vitro system and to obtain a SAM-dependent response.

2.4 Discussion

The B. subtilis S box regulon controls genes involved in sulfur metabolism pathways (Grundy and Henkin 1998). The S box genes exhibit significant variation in the amounts of induction and repression of expression in vivo as well as in the concentrations of SAM required to promote termination in vitro (Tomsic et al. 2008). The metK leader

RNA is an exception. The metK-lacZ fusion does not exhibit induction of expression during methionine starvation and SAM fails to promote increased termination at the metK leader region terminator (Tomsic et al. 2008). Although the metK leader RNA shows high conservation in the primary sequence and secondary structure compared to the other S box leader RNAs, the mechanism by which metK is regulated has been elusive. The objective of the current study was to characterize the B. subtilis metK leader RNA and identify the regulatory mechanism of the metK gene.

We constructed a strain, BR151 Pspac-metK, in which the native metK gene was under the control of an IPTG-dependent Pspac promoter. We measured the SAM pools during growth in the presence and absence of IPTG. It was predicted that cells grown in the presence of IPTG would have high SAM pools while cells grown in the absence of IPTG would have low in vivo SAM pools. Total SAM pools from the current study were at 150 µM concentration during growth in the presence of IPTG. This concentration is 2-fold lower than the previously measured B. subtilis SAM pools

(McDaniel et al. 2006, Tomsic et al. 2008, Wabiko et al. 1988). Earlier studies of a methionine auxotroph grown in minimal medium in the presence of methionine revealed in vivo SAM pools to be ~300 μM (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to ~50 μM and then below the limit of detection (25 µM) after 1 h of methionine limitation (Tomsic et al. 2008). We had expected the overall SAM pools to be equal or even higher during growth in the nutrient rich medium, as there would be a constant supply of methionine for SAM production.

The apparent difference in total SAM pools can be due to the different growth conditions employed during the two studies. In the current study, cells were grown in nutrient-rich medium, whereas previous SAM pool measurements were performed during methionine starvation using defined minimal medium. It is possible that actively growing cells in rich medium have a higher demand for SAM as compared to cells grown in minimal medium and hence result in a 2-fold reduced in vivo SAM concentration during growth in rich medium.

Along with measuring the SAM pools, the β-galactosidase activity of the metK-lacZ fusion was measured. The wild-type metK-lacZ construct exhibited induction of expression after SAM levels in the cell dropped. However, we report that the increase in metK-lacZ expression does not coincide precisely with the drop in SAM levels in BR151

Pspac-metK. A delay seen before the maximum β-galactosidase activity was reached suggests that the metK-lacZ expression is induced only after in vivo SAM drops to very low levels. This result is different from the yitJ-lacZ expression profile seen previously

(Tomsic et al. 2008), which shows a simultaneous increase of yitJ-lacZ expression as


SAM pools continue to drop. The peculiar expression profile of the metK gene sheds light on its physiological role. The B. subtilis metK gene differs from the rest of the S box genes in that it utilizes methionine to synthesize SAM, while the other S box genes (like yitJ) are closely involved in synthesis of methionine. It is possible that the metK gene is controlled by a global regulatory mechanism, in addition to being controlled by the existing S box mechanism.

We compared the expression of the wild-type metK-lacZ fusion during growth in the presence and absence of IPTG in the BR151 Pspac-metK strain. More than 3-fold induction of metK-lacZ expression was observed in the absence of IPTG (during low

SAM pools), whereas the presence of IPTG (when SAM pools are high) resulted in repression of metK expression. Unlike a typical S box gene, expression in the presence of high SAM was not completely off. As metK is an essential gene, the constitutive low expression (~30-40 Miller units) seen in the presence of IPTG ensures that SAM is always being synthesized.

Modulation of the in vivo SAM pools using IPTG was different from our previous method, in which SAM was indirectly depleted by removal of methionine. Starvation for methionine results in inhibition of cell growth. Our current results suggest that induction of the wild-type metK-lacZ fusion occurs when the SAM pools are limiting (by control of the chromosomal metK gene expression), and methionine levels are high. We speculate that along with sensing SAM pools, metK also senses in vivo methionine levels to ensure steady growth. This makes sense as SAM is synthesized from methionine by SAM synthetase, encoded by the metK gene.


Analysis of the metK leader RNA deletion mutants revealed the RNA elements necessary to achieve a wild-type response during growth in the presence and absence of IPTG. Deleting either end of the metK leader RNA was not tolerated, as the 5’ and 3’ deletion constructs exhibited a complete loss of metK induction under low SAM conditions (Table 2.2). These results indicate that intact 5’ and 3’ regions are important and play a direct role in the unique regulation of the metK gene.

Deleting the metK S box riboswitch element (helices 1 through 4, as well as the terminator element) was tolerated, as the ΔS box mutant showed a response to changing levels of SAM in vivo (Table 2.2). Removal of the terminator element can explain the increased expression shown by this mutant compared to wild-type metK. These preliminary results using the in vivo reporter assays suggested that the metK S box element does not participate directly in the primary mode of metK regulation and that regions at the 5’ and 3’ ends of the leader RNA are crucial.

Phylogenetic analysis of metK leader RNAs from several Firmicutes revealed two highly conserved sequence motifs, the US and DS boxes. These sequences were located on the 5’ and 3’ sides of the metK S box riboswitch element. These elements make the metK RNA unique among the S box leader RNAs. The importance of these sequences in the regulation of the metK gene was first highlighted using lacZ reporter assays of the deletion mutants. The close proximity of the US box sequence to the 5’ end of the RNA prompted us to map the metK RNA transcriptional start-site. Primer extension analysis demonstrated that the US box sequence started precisely at the metK +1, suggesting a role for the US box region in metK regulation, at the level of mRNA stability, transcription


initiation, and/or RNAP recognition.

Analysis of the metK leader RNAs revealed significant sequence complementarity of the US box region with a region located ~60 nt downstream of the terminator helix.

The DS box sequence was located ~30 nt upstream of the SD region and was predicted to form a pre-SD structure by base-pairing with the US box sequence. The consensus sequences for each of these regions showed 10 out of the 19 US box nucleotides and 5 out of the 16 DS box nucleotides to be 100% conserved. Out of the 6 nucleotides that are part of the GC-rich ‘conserved core’, 5 were invariant, suggesting a regulatory role for this sequence.

Accessibility of the DS box sequence to a DNA oligonucleotide, complementary to a region within the DS box sequence, was monitored by cleavage using RNase H. The

T7 transcribed wild-type metK RNA exhibited cleavage in the presence of SAM. This result suggested that SAM reduced the US-DS box pairing, which made the DS box sequence available for binding by the antisense oligonucleotide. This makes sense as the US box is involved in helix 1 formation, which is stabilized typically under high

SAM conditions. However, the absence of SAM resulted in protection of the DS box sequence from RNase H cleavage. In the absence of SAM, the US box sequence is not involved in helix 1 formation. The result indicates that in the absence of SAM, the US box sequence base-pairs with the DS box region and protects it from RNase H cleavage.

We can therefore conclude that the pairing of the US box sequence with the DS box sequence occurs under low SAM conditions.

The metK US1 triple mutant exhibited constitutive cleavage regardless of whether


SAM was present. This result indicated that the US1 mutation disrupted the base-pairing interaction and the DS box sequence was no longer protected from RNase H cleavage. It is possible that the conserved core region within the US box sequence recruits a protein factor that further stabilizes the US-DS box pairing under low SAM conditions. However, in the case of the metK US1 mutant, the recruiting site is disrupted which reduces the

US-DS base-pairing interaction.

The U105C substitution in the metK SAM binding pocket was independent of the

US and DS box sequences, and the resulting metK S box mutant exhibited equal protection and cleavage in the presence and absence of SAM. This result suggested that the SAM response for pairing of the US and DS box sequences is dependent on a functional S box riboswitch and requires an intact SAM binding pocket. We speculate that the point mutation in the SAM binding pocket created a floppy RNA molecule, which underwent a rapid transition between the paired and unpaired US-DS box interaction, leading to a constant access of the antisense oligonucleotide to the DS box sequence. A more likely explanation is that the RNA was stuck in a conformation which resulted in constant cleavage and protection regardless of the presence of SAM. This mutation is similar to that observed for yitJ SAM binding pocket mutants in which SAM binding is abolished, yet these mutants exhibit high constitutive termination in vitro (see

Chapter 3) (Lu et al. 2010). Based on the above RNase H probing results we conclude that the US box sequence base-pairs with the complementary DS box sequence in the absence of SAM.

The ΔS box mutant RNA was protected in the RNase H cleavage assay in the


presence and absence of SAM, suggesting that the US and DS box sequences were always paired in this construct. The ΔS box mutant fails to respond to changing SAM concentrations in vitro. The US1 ΔS box mutant RNA on the other hand showed constitutive high cleavage, suggesting that the US-DS pairing was disrupted due to the sequence change in the conserved core region (data not shown).

The unrestricted pairing of the US-DS boxes in the ΔS box mutant would suggest that gene expression is always on. However, the ΔS box mutant responds to changing concentrations of SAM as it exhibited increased expression only under low SAM conditions, using the IPTG limitation assay (Table 2.2). We can reason that the increased expression of the ΔS box mutant was only in part due to the US-DS pairing. It is possible that pairing alone is not sufficient for increased expression, and that an additional stabilizing factor is needed that is present in vivo under conditions when SAM concentration is low, but methionine levels are high.

The extensive in vivo analysis of the US and DS box mutants further validated that the US-DS pairing interaction is important for metK gene expression. Overall, changes in the US box sequence showed a greater variation in expression of the metK-lacZ fusion as compared to changes in the DS box sequence. Several US box mutants (for example, A2G, G3A, A7G, U8A, U8G and U8Δ) exhibited increased readthrough in the absence of IPTG, consistent with wild-type metK-lacZ. Some mutants exhibited constitutive high readthrough (C4A and C4G) while others showed constitutive low readthrough (A2C, A2U, G3U and mutations of G5 and G6 positions). The results indicated that maintenance of (or a change to) a purine residue was tolerated (with the


exception of the U8 position located within a loop region). The US box mutational analysis supported our observation that maintenance of the US-DS base-pairing results in increased metK expression when SAM pools are low.

Compared to sequence changes in the US box sequence, the majority of the DS box mutants exhibited 3-4-fold reduced β-galactosidase activity in the absence of IPTG compared to wild-type metK-lacZ. Only one mutant (U265C) exhibited increased β-galactosidase activity during growth in the absence of IPTG that was comparable to wild-type metK-lacZ. U265C-lacZ showed increased readthrough during growth in the absence of IPTG. U265C forms a Watson-Crick base-pair with the G3 position in the US box sequence. The US box mutation G3A, which formed a Watson-Crick base-pair with the position U265 in the DS box sequence, also exhibited increased readthrough during growth in the absence of IPTG. However, the compensatory mutation G3A-U265C, which disrupts the base-pairing interaction between the US and DS boxes, failed to show increased readthrough under low SAM pools (Woltjen and Grundy, data not shown).

The underlying conclusion from the mutagenesis was that maintenance of the pre-

SD structure (either through Watson-Crick or wobble base-pairing) results in increased expression during growth in the absence of IPTG. These data support the model that under low SAM conditions, when helix 1 of the S box element is not formed, the US box sequence base-pairs with the DS box sequence, leading to upregulation of gene expression.

There were two exceptions to this model. We failed to observe an increase in readthrough of the A7U-U261A double mutant under low SAM conditions. This result


was unexpected because the compensatory mutation was able to form a Watson-Crick base-pair. However, it is possible that a purine is required at position +7 in the US box sequence, as the only point mutation tolerated at this position was A7G. We also have evidence that shows that an A7G-U261C double mutant responded to IPTG limitation, and exhibited wild-type expression values (Woltjen and Grundy, data not shown).

The second exception was that any disruption of the conserved core region was not tolerated. Although the potential to base-pair was maintained, the majority of substitutions in the conserved core disrupted regulation of metK. Only one double mutant

(C4U-G264A) responded to IPTG limitation, but showed lower β-galactosidase activity than wild-type metK-lacZ (Figure 2.11).

We speculate that the strict sequence preference shown in this region is a recognition site for an alternate trans-acting factor that binds the US box sequence only in the presence of SAM. However, in the absence of SAM the pairing of the US-DS box sequences acts as a protector and prevents access to such a trans-acting factor(s). It is possible that SAM synthetase itself binds to this region to control gene expression as a feedback inhibition mechanism for metK gene regulation. Additional experiments will be required to test these possibilities.

Earlier studies have shown that ethionine, the S-ethyl analog of methionine, can be incorporated into proteins in place of methionine (Cheng et al. 1968, Gross and Tarver

1955). Only a few organisms, including B. subtilis, can further metabolize ethionine into

S-adenosylethionine (SAE) (Allen et al. 1986). SAE differs from SAM by a single methyl group, in that it replaces the methyl group of SAM with an ethyl group. SAE can


therefore be toxic if synthesized by the cell. Prototrophic strains BR151MA-G5U and

BR151MA-G6A exhibited a growth defect but were able to survive in the presence of ethionine as compared to the isogenic wild-type control strain BR151MA-ZKO which exhibited no growth. The absence of methionine from the growth medium suggested that ethionine could not be outcompeted, resulting in death of the wild-type strain. The preferential toxicity of ethionine for the wild-type strain suggested that normal SAM synthetase activity resulted in incorporation of ethionine leading to production of SAE, which eventually killed the cells. We predicted that the survival of the G5U and G6A mutants in the presence of ethionine was due to defective SAM synthetase activity, which in turn resulted in significantly lower levels of SAE in vivo.

The in vivo SAM pools from the auxotrophic mutant strain BR151-G5U were indeed below the level of detection when compared to the SAM pools from the isogenic wild-type strain. This suggested that the single nucleotide substitution was sufficient to lower the SAM synthetase activity. These results provided a physiological proof of the effect of changing the US box sequence on in vivo SAM pools and provided further evidence that the US box element is important for metK regulation.

It is important to note that the single nucleotide substitution in the leader RNA sequence of the essential metK gene was tolerated by the cell and resulted in a reduction in SAM synthetase activity (as reflected by the lowered SAM pools). Mutants with sequence changes in the metK coding region have been isolated that also exhibit a growth defect due to low SAM synthetase activity, which results in low SAM pools in vivo (McDaniel et al. 2006, Wabiko et al. 1988). One such mutant showed derepression of S box gene expression (McDaniel et al. 2006). Another study implied that a minimum SAM pool concentration of ~25 µM was optimal for growth (Wabiko et al. 1988). Although the G5U SAM pools were below the detection limit using our method, we can infer that the SAM pools are in a similar range (~25 µM) based on the study conducted by Wabiko and coworkers.

In vitro transcription assays were performed to test the effect of the US and DS box mutations on SAM-directed transcription termination. The in vitro results were not consistent with the results obtained from the in vivo reporter assays. Addition of SAM failed to stimulate termination at the wild-type metK leader region terminator. The metK mutants that showed a response to changing concentrations of SAM in vivo failed to exhibit a response in vitro. These results suggested that a factor(s) that has not yet been identified was missing from the purified in vitro system, and hence we were unable to observe an increase in SAM-directed termination. Overall, the in vitro analyses were inconclusive. Further studies are necessary to observe a SAM-dependent response in vitro. In vitro transcription can be performed in the presence of cellular extracts generated from exponentially growing cells to determine whether addition of the extract improves the response of the metK leader in vitro.

Upon examining the in vitro data closely, we found significant variation in the transcription yields of some mutants compared to the wild-type construct. Results from multiple repeats of the in vitro transcription experiments revealed that some metK mutants showed very high band intensities while others showed bands that were barely visible (data not shown). This pattern remained consistent even though the same concentration of template DNA was added to each reaction. A similar result was obtained with hybrid leader RNA constructs of metK and yusC (see Chapter 4). These results suggest that transcription initiation at the metK transcriptional start-site by the RNAP is inconsistent and that mutating the US box sequence interferes with the proper docking or recognition of the DNA sequence for efficient transcription by the RNAP. Another explanation is that altering the metK US box sequence affects the processivity of the RNAP and in turn affects transcription elongation. Time-course analyses could provide information regarding the rate of RNAP elongation. We have identified the experimental conditions to generate a halted complex in order to perform single-round transcription reactions using the metK template. However, additional studies are necessary to improve this in vitro transcription system.

Recent advances in the field of mRNA decay and turnover have revealed a number of important RNases in B. subtilis (Bechhofer 2011). The essential RNase J1 is a broad-specificity endonuclease that also possesses a 5’→3’ exonuclease activity, previously thought to be lacking in bacteria (Even et al. 2005). RNase J1 is involved in turnover of mRNA intermediates and is predicted to participate directly in the initiation of mRNA decay. The non-essential RNase J2 is also known to contribute to some of these events (Even et al. 2005). The recently identified RNase Y is an essential enzyme that exerts an effect on global mRNA stability in B. subtilis (Shahbabian et al. 2009). Together, these studies have shown that the 5’ end of a transcript plays a crucial role in overall RNA stability, such that transcripts with 5’ monophosphates are more susceptible to degradation by RNases than those with 5’ triphosphates.


In the second part of our study, we examined the effect of the US box sequence (which is located precisely at the 5’ end of the transcript) on metK leader RNA stability. The wild-type RNA, when compared to the two US box mutant transcripts, had the longest half-life (t1/2 ~15 min) in the absence of IPTG, while the half-life dropped to only ~1.2 min after 20 min in the presence of IPTG. This suggests that the US-DS pairing in the wild-type transcript is disrupted in the presence of SAM, leading to a significant decrease in transcript stability. These data are in good agreement with the results for the wild-type metK construct obtained using the RNase H cleavage assay. The RNase H results revealed that the DS box sequence was highly accessible in the presence of SAM (Figure 2.9, left panel), suggesting that the US-DS box interaction is disrupted under these conditions.

The ~18-fold reduction in t1/2 of the US1-lacZ RNA during growth in the absence of IPTG compared to wild-type metK-lacZ RNA indicated that the mutation disrupted the US-DS pairing, resulting in significantly lowered RNA stability in the absence of SAM. These results are in agreement with the results from the RNase H assays, in which the US1 mutant exhibited cleavage even in the absence of SAM (Figure 2.9, middle panel). The t1/2 of the G5U mutant was reduced only 2-fold relative to wild-type under low SAM conditions. This suggests that the single nucleotide substitution did not alter the base-pairing interaction as much as the triple base substitution of the US1 mutant and hence resulted in a less severe impact on transcript half-life.

All three constructs exhibited similar half-lives in the presence of high SAM pools (ranging from 1.2 to 2.0 min). These results suggested that the presence of SAM destabilizes the US-DS pairing in the case of the wild-type transcript, while sequence changes in the US box region destabilize the pairing regardless of the presence of SAM.
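To make the half-life comparison concrete, the following minimal sketch assumes simple first-order (single-exponential) decay of the transcript. This kinetic model and the function names are illustrative assumptions, not the analysis used in this study. The sketch converts a half-life into the fraction of RNA remaining at a given time, using the approximate wild-type half-lives quoted above.

```python
import math

def decay_constant(half_life_min):
    """First-order decay constant k (per min) from a half-life: k = ln(2) / t1/2."""
    return math.log(2) / half_life_min

def fraction_remaining(t_min, half_life_min):
    """Fraction of transcript left after t_min, assuming single-exponential decay."""
    return math.exp(-decay_constant(half_life_min) * t_min)

# Approximate wild-type half-lives quoted in the text (minutes).
half_lives = {"wild-type, low SAM": 15.0, "wild-type, high SAM": 1.2}

if __name__ == "__main__":
    for label, t_half in half_lives.items():
        remaining = fraction_remaining(5.0, t_half)
        print(f"{label}: {remaining:.3f} of the RNA remaining after 5 min")
```

Under this assumed model, a shorter half-life translates directly into a smaller fraction of transcript surviving any fixed interval, which is the sense in which the high-SAM transcript is described as less stable.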

Earlier studies have determined the half-life of the B. subtilis metK transcript as ~2.0 min (Smith et al. 2010b). This value is lower than the transcript half-life measured in the current study (15 min). The difference can be attributed to the different growth conditions used in the two studies. The previous analysis was conducted during methionine starvation in minimal medium and was monitored over much longer periods of growth. Our analysis, on the other hand, was performed in rich growth medium. It is possible that growth in a defined medium resulted in an overall shorter mRNA half-life. Additional RNA stability assays performed during methionine starvation conditions may help to explain the differences observed in metK-lacZ gene expression between the methionine starvation and IPTG-limitation assays.

The RNA abundance for all three constructs was nearly identical when grown in the presence of IPTG, suggesting minimal change in the total RNA during growth in the presence of high in vivo SAM pools. These results indicate that a steady growth rate leads to constant RNA abundance. The stability and abundance results together validate the importance of the US-DS pairing interaction and suggest that the US box is protected from degradation by pairing with the DS box sequence. The results support our model that the US-DS pairing interaction is stabilized under low SAM conditions. In addition, we predict that the US-DS pairing in the absence of SAM, but in the presence of methionine, is stabilized further by a separate protein factor, which binds to the potential recruiting site within the conserved core region.


Based on the data obtained from the current study, we propose a model for the regulation of metK gene expression from B. subtilis, in which the metK gene is subjected to regulation at the level of mRNA stability, in addition to being under the control of the S box regulon (Figure 2.16). It is also possible that regulation occurs at the level of transcription initiation, or involves a combination of both RNA stability and transcription initiation.


Figure 2.16 Proposed model for regulation of B. subtilis metK gene expression. The 5’ end of the metK leader RNA with the US box sequence (red line) is transcribed by RNAP (panel A). The RNA undergoes a conformational change depending on the concentration of SAM, such that in the presence of SAM, the RNA forms the terminator (T) (panel D) and gene expression is off. The RNA forms the antiterminator (AT) (blue-orange) under low SAM conditions. This leads to transcription of the DS box sequence (green line) with the potential for the US-DS pairing to occur (panel B). The next checkpoint is methionine dependent, such that under low SAM and low methionine (Met) conditions, the US-DS pairing is unstable and the metK RNA is susceptible to degradation by an RNase (brown pie), so that gene expression is off (panel C). Under conditions with low SAM and high methionine, a methionine-dependent factor (blue oval) stabilizes the US-DS pairing, resulting in stabilization of the RNA, so that gene expression is on (panel E). Under high SAM and high methionine concentrations, the RNA refolds into the terminator (T) conformation and unzips the US-DS pairing interaction. This causes the RNA to become susceptible to a potential degradation event, resulting in reduced RNA stability, and gene expression is off. SD, Shine-Dalgarno region; AUG, start codon.

[Figure 2.16, panels A-F: schematic of the metK leader RNA (US box, DS box, SD, AUG) in the antiterminator (AT) and terminator (T) conformations under - SAM, + SAM, - MET and + MET conditions, with RNase and factor binding and outcomes labeled DEGRADED, STABLE or TERMINATE and gene expression ON or OFF.]

The fact that the wild-type metK-lacZ construct exhibits induction of expression only under conditions in which SAM pools are low but methionine levels are high suggests the importance of growth rate for metK regulation. It is possible that the metK gene is under the control of a superimposing global regulatory event like the stringent response, which occurs under high stress conditions such as amino acid starvation. Further investigation using ppGpp, the stringent response alarmone, or ppGpp synthetase (RelA) (encoded by the relA gene) can shed light on such a mechanism. We have evidence to suggest that such a global regulatory mechanism might exist, as sequences resembling the metK US and DS boxes have been observed in a distinct regulatory element that is not part of the S box regulon (Grundy and Henkin, unpublished data). Additional studies will be necessary to verify whether these elements are regulated by a common mechanism that also regulates expression of the metK gene.


CHAPTER 3

IN VITRO INVESTIGATION OF THE SAM-BINDING POCKET

OF THE Bacillus subtilis yitJ S BOX RIBOSWITCH

3.1 Introduction

The S-adenosylmethionine (SAM)-binding S box riboswitch is a widespread class of riboswitches, originally identified upstream of 11 transcriptional units from the Gram-positive bacterium B. subtilis (Grundy and Henkin 1998). The S box regulon controls the expression of 26 genes involved in import, synthesis and recycling pathways for methionine, cysteine and SAM (Grundy and Henkin 1998). The B. subtilis S box riboswitch consists of a SAM-sensing “aptamer domain” followed by an “expression platform” which functions to regulate the downstream genes by transcription attenuation (Epshtein et al. 2003, McDaniel et al. 2003, Winkler et al. 2003). The intracellular concentration of SAM dictates whether the RNA folds into one of two mutually exclusive conformations, resulting in either continuation of transcription into the downstream coding region (through formation of the antiterminator) or termination at the leader region terminator (through formation of the terminator). Termination is dependent on formation of a third structure, the anti-antiterminator, which competes with the antiterminator. When SAM levels are low, the anti-antiterminator structure is destabilized, resulting in formation of the antiterminator structure and expression of the downstream coding regions. When SAM levels are high, the anti-antiterminator structure is stabilized, preventing formation of the antiterminator and allowing formation of the terminator and premature termination of transcription.


Figure 3.1. B. subtilis yitJ leader RNA structural model. The structural model is based on phylogenetic analyses (Grundy and Henkin 1998) and is shown in the terminator conformation. Red and blue residues indicate the alternate pairing for formation of the antiterminator, shown above the terminator. Boxed numbers indicate helices (or paired regions) 1–4; T, terminator; AT, antiterminator; AAT, anti-antiterminator. Numbering of residues is relative to the predicted transcription start-site. Adapted from (Grundy and Henkin 1998).


Phylogenetic analyses revealed that the S box aptamer domain consists of four helical segments (P1-P4) organized around a four-way junction (Figure 3.1). Support for the secondary structural model of the S box leader RNA was provided by data from extensive genetic studies, which confirmed a pseudoknot structure between the loop of helix P2 and the junction region between helices P3 and P4 (J3/4) (McDaniel et al. 2005). This tertiary structural element was validated by two independent high-resolution crystal structures from Thermoanaerobacter tengcongensis (Montange and Batey 2006) and B. subtilis (Lu et al. 2010). The crystal structures are nearly superimposable and establish the global architecture of the S box riboswitch aptamer domain. The structural analyses revealed that P1/P4 and P2/P3 form two sets of coaxially stacked helices that are packed together at a ~70° angle. The ligand-binding pocket is situated between the minor grooves of helices P1 and P3, enveloping SAM in a tightly packed interface (Lu et al. 2010, Montange and Batey 2006).

SAM, the molecular effector of the S box riboswitch, is synthesized from methionine and ATP by SAM synthetase, encoded by the metK gene. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools. The physiological concentration of SAM in a methionine auxotrophic strain grown in the presence of methionine is ~300 µM (Tomsic et al. 2008). Depleting methionine from the culture medium leads to a rapid drop in SAM pools to <50 μM and then below the limit of detection (25 µM) after 1 h of methionine limitation (Tomsic et al. 2008).

The yitJ gene encodes methylenetetrahydrofolate reductase, an enzyme in methionine biosynthesis (Grundy and Henkin 1998). The B. subtilis yitJ S box RNA has been the most carefully characterized S box riboswitch and many of the early studies on the S box regulon have been conducted using the yitJ leader RNA (Grundy and Henkin 1998, McDaniel et al. 2003, McDaniel et al. 2005, Winkler et al. 2001). The yitJ leader RNA is highly sensitive to changes in SAM concentrations and gene expression of a yitJ-lacZ transcriptional fusion is induced only when intracellular SAM pools are low (Tomsic et al. 2008). yitJ variants containing mutations in highly conserved regions in the leader RNA exhibit loss of repression during growth in the presence of methionine (McDaniel et al. 2003, McDaniel et al. 2005).

Several studies of the wild-type yitJ RNA have examined the response of the yitJ riboswitch to SAM using an in vitro transcription termination assay (McDaniel et al. 2003, McDaniel et al. 2005, Tomsic et al. 2008). yitJ exhibits half-maximal termination in vitro at a concentration of 0.35 µM SAM (Tomsic et al. 2008). Size-exclusion filtration assays were used to determine the affinity of the yitJ leader RNA for SAM. The wild-type yitJ RNA exhibits high affinity for SAM with an apparent Kd ~20 nM. This is consistent with data obtained from previous analyses that reported Kd values between 4 and 10 nM (Lim et al. 2006, Winkler et al. 2003). The yitJ leader RNA is highly specific for SAM and discriminates strongly against closely related natural analogs such as S-adenosyl-L-homocysteine (SAH) and S-adenosyl-L-cysteine (SAC). The yitJ RNA exhibits 100- and 10,000-fold lower affinities for SAH and SAC, respectively, as compared to the affinity for SAM (McDaniel et al. 2003, Winkler et al. 2003). yitJ variants containing mutations in highly conserved regions of the leader RNA exhibit loss of SAM binding and SAM-directed transcription termination in vitro (McDaniel et al. 2003, McDaniel et al. 2005).
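The affinity differences described above can be illustrated with a minimal sketch that assumes simple 1:1 equilibrium binding with ligand in excess over RNA, so that fractional occupancy is [L]/(Kd + [L]). The binding model, the function name, the 1 µM test concentration, and the derived analog Kd values (obtained by applying the reported 100- and 10,000-fold differences to the ~20 nM apparent Kd) are illustrative assumptions and not measurements from this study.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fractional occupancy for 1:1 binding, assuming free ligand ~ total ligand."""
    return ligand_nM / (kd_nM + ligand_nM)

KD_SAM_NM = 20.0  # apparent Kd for SAM quoted in the text (nM)

# Hypothetical analog Kd values derived from the reported fold differences.
kd_values = {
    "SAM": KD_SAM_NM,
    "SAH": KD_SAM_NM * 100,      # 100-fold lower affinity
    "SAC": KD_SAM_NM * 10_000,   # 10,000-fold lower affinity
}

if __name__ == "__main__":
    ligand_nM = 1000.0  # 1 µM ligand, chosen only for illustration
    for name, kd in kd_values.items():
        occupancy = fraction_bound(ligand_nM, kd)
        print(f"{name}: Kd = {kd:g} nM, fraction bound = {occupancy:.4f}")
```

By construction, occupancy is half-maximal when the ligand concentration equals the Kd, so under these assumptions each analog would require a correspondingly higher concentration to reach the same occupancy as SAM.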

Based on these results, we conducted a study to generate B. subtilis yitJ variants that exhibit higher SAM affinity compared to wild-type yitJ or a change in ligand specificity. We targeted residues in the SAM-binding pocket of the yitJ leader RNA that make direct and indirect contacts with the SAM molecule. First, we employed Systematic Evolution of Ligands by EXponential enrichment (SELEX) on the B. subtilis yitJ RNA. The goal of this study was to subject a pool of mutant yitJ transcripts to iterative rounds of selection in the presence of lower concentrations of SAM in order to obtain aptamers with either higher affinity or modified specificity. However, due to technical difficulties using SELEX, our attempts to obtain such aptamers were unsuccessful. The first part of this chapter will describe the methods as well as the preliminary results obtained during the SELEX study.

In a subsequent study, we used a more direct approach in which we mutated individual residues in the SAM-binding pocket. This targeted mutational analysis of the yitJ leader RNA was conducted as part of the crystal structure study of the B. subtilis yitJ S box riboswitch (in collaboration with the laboratory of Dr. Ailong Ke, Cornell University) (Lu et al. 2010). In this study, 32 mutants were tested for effects on in vitro transcription termination (V. A. Pradhan) and SAM binding (J. Tomšič) (Lu et al. 2010). The second part of this chapter will focus on the results obtained during these experiments.

A separate study was conducted that investigated the response of certain yitJ variants to a series of SAM analogs using in vitro transcription termination assays. We found that some of these mutants retained a response to SAM and a limited number of SAM analogs. However, we failed to obtain variants that exhibited specificity only toward a SAM analog. These results will be discussed in the last part of this chapter.

3.2 Materials and methods

3.2.1 Construction of DNA templates for in vitro selection

DNA templates used for T7 RNA polymerase (RNAP) transcription during the in vitro selection were generated using complementary pairs of overlapping DNA oligonucleotides (Integrated DNA Technologies, Coralville, IA). Briefly, the 5’ pair contained the phage T7 RNAP promoter sequence fused to the +14 position of the yitJ leader region. The remaining complementary pairs contained the wild-type yitJ sequence, each with a 4-nt 3’ overhang complementary to the 5’ region of the adjacent pair. The terminal pair contained the 3’ leader RNA sequence that ended at +169, near the 5’ side of the transcription terminator. This position was selected so that helix P1 formation could be monitored without the competing antiterminator structure. Each internal oligonucleotide pair was phosphorylated using T4 polynucleotide kinase (New England Biolabs, Beverly, MA). The pairs were then mixed, incubated at 95°C and slow-cooled to room temperature to permit annealing. The paired oligonucleotides were ligated using T4 DNA ligase (New England Biolabs) per the manufacturer’s instructions. The resulting DNA template was amplified using the flanking 5’ and 3’ DNA oligonucleotides as primers for PCR using Pfu DNA polymerase (Stratagene, La Jolla, CA). The final template DNA was purified with a Qiagen PCR cleanup kit (Qiagen, Chatsworth, CA) and sequenced by Genewiz (South Plainfield, NJ).

For construction of the pool of mutant yitJ DNA templates, the same procedure was followed except that oligonucleotides with the desired sequence changes were used in place of the wild-type oligonucleotides. The spiked oligonucleotides used to generate the mutant DNA templates are listed in Table 3.1. The percentage of spiking and the nucleotides that were mutated are indicated in the same table. The percentage of spiking was calculated based on the total number of positions that were targeted for mutagenesis, as well as the maximum number of mutations expected per output molecule. The following formula was used to determine the percentage of spiking per oligonucleotide:

$$P(x) = \frac{n!}{x!\,(n-x)!}\; C_m^{\,x}\,(1 - C_m)^{\,n-x}$$

where
P(x) = probability of x mutations per oligonucleotide
x = number of mutations per oligonucleotide
n = number of bases in oligonucleotide to be mutagenized
Cm = total fraction concentration of mutant nucleotides

We defined the value of ‘x’ as 4 mutations per molecule, ‘n’ as 7 and the value of ‘Cm’ as 0.50. A Cm value of 0.50 resulted in a percentage of ~17% for each of the 3 contaminating nucleotides. Therefore, the ratio of wild-type to mutant nucleotides at each targeted position was set to 49:51.
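As a check on the spiking calculation, the binomial expression for P(x) can be evaluated directly. The sketch below is a minimal illustration using the stated values (n = 7, Cm = 0.50); the function name is hypothetical and the code is not part of the original protocol.

```python
from math import comb

def mutation_probability(x, n, cm):
    """Binomial probability of exactly x mutations among n spiked positions,
    where cm is the total fraction of mutant nucleotides at each position."""
    return comb(n, x) * cm**x * (1 - cm) ** (n - x)

if __name__ == "__main__":
    n, cm = 7, 0.50
    for x in range(n + 1):
        print(f"P({x}) = {mutation_probability(x, n, cm):.4f}")
    # Under this model the mean number of mutations per molecule is n * cm.
    print("expected mutations per molecule:", n * cm)
```

With these values the distribution is symmetric, with 3 and 4 mutations per molecule being the most probable and equally likely outcomes.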

DNA fragments (180 bp) containing wild-type or mutant sequences were cloned into plasmid pGEM7Zf+ (Promega). The resulting constructs were introduced into E. coli strain DH5α (φ80dlacZΔM15 endA1 recA1 hsdR17 (rK− mK+) thi-1 gyrA96 relA1 Δ(lacZYA-argF)U169) by transformation and grown on Luria-Bertani (LB) medium (Miller 1972). Transformants were screened for formation of white colonies on LB medium containing X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies, St. Louis, MO), indicating disruption of the multiple cloning site within the vector. Plasmid DNA was isolated from the transformants using a Promega Wizard prep kit (Madison, WI) and sequenced to confirm the mutations.

Table 3.1 Oligonucleotide primers for the B. subtilis yitJ leader RNA

Primer               Sequence (5’→3’) a, b, c
yitJ (14-50)         AAAATTTCATATCCGTTCTTATCAAGAGAAGCAGAGG
yitJ (14-46)RC       TGCTTCTCTTGATAAGAACGGATATGAAATTTT
yitJ (51-89)         GACTGGCCCGACGAAGCTTCAG[CAAC]CGGTGTAATGGCG
yitJ (47-85)RC       ATTACACCG[GTTG]CTGAAGCTTCGTCGGGCCAGTCCCTC
yitJ (90-126)        ATCAGCCATGACCAAG[GTG]CTAAATCCAGCAAGCTCG
yitJ (86-122)RC      CTTGCTGGATTTAG[CAC]CTTGGTCATGGCTGATCGCC
yitJ (127-169)       AACAGCTTGGAAGATAAGAAGAGACAAAATCACTGACAAAGTC
yitJ (123-169)RC     GACTTTGTCAGTGATTTTGTCTCTTCTTATCTTCCAAGCTGTTCGAG

a Bold positions within brackets indicate variation from the wild-type sequence. Each position contained 49% wild-type sequence and 17% for each mutant nucleotide.
b The underlined sequence indicates the 4-nt overhang for each oligonucleotide pair designed for cassette ligation.
c RC, reverse complement.
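The cassette design can be checked computationally. The sketch below is a minimal illustration (the helper name is hypothetical) that takes the first oligonucleotide pair from Table 3.1 and tests whether the bottom-strand oligonucleotide is the reverse complement of positions 14-46 of the top strand, which would leave the last 4 nt of the top strand unpaired as the 3’ overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# First oligonucleotide pair from Table 3.1.
top = "AAAATTTCATATCCGTTCTTATCAAGAGAAGCAGAGG"  # yitJ (14-50)
bottom = "TGCTTCTCTTGATAAGAACGGATATGAAATTTT"   # yitJ (14-46)RC

paired = top[: len(bottom)]    # top-strand region covered by the bottom oligo
overhang = top[len(bottom):]   # unpaired 3' end of the top strand

if __name__ == "__main__":
    print("pair is complementary:", reverse_complement(paired) == bottom)
    print("3' overhang:", overhang, f"({len(overhang)} nt)")
```

The same check can be applied to the remaining pairs after removing the brackets that mark the spiked positions.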

3.2.2 Site-directed mutagenesis

The transcriptional fusion vector pFG328 (Grundy et al. 1993) containing wild-type yitJ DNA was used as a template for oligonucleotide-directed mutagenesis as described previously (McDaniel et al. 2003). For the purpose of the in vitro transcription termination assay, the strong B. subtilis glyQS promoter replaced the native yitJ promoter as described previously (McDaniel et al. 2003). The glyQS promoter was fused to the +14 position of the yitJ leader sequence. The yitJ sequence continued into the coding region and had an endpoint that was 92 bp downstream of the transcription terminator. DNA oligonucleotides containing the desired mutations were designed as primers for PCR amplification of pFG328-yitJ DNA using Pfu DNA polymerase. The resulting products were subjected to digestion with DpnI (New England Biolabs) to remove the starting wild-type template and introduced into XL-2 blue ultracompetent cells by transformation as per the manufacturer’s instructions (Stratagene). The plasmid DNA was isolated using a Promega Wizard prep kit and sequenced to confirm the mutations. DNA templates for in vitro transcription termination were generated from the plasmid DNA by PCR.

3.2.3 In vitro transcription assays

Templates for in vitro transcription reactions were constructed to generate a transcription start-site 17 nt upstream of the start of P1 as described above (section 3.2.2; Figure 3.1). ApC was used as the initiating dinucleotide and GTP was omitted to halt the transcription at position +16 (McDaniel et al. 2003). Single-round transcription reactions were carried out (as in Chapter 2) as described previously (Grundy et al. 2002, McDaniel et al. 2003). Templates were transcribed in the presence or absence of SAM (140 µM) or SAM analogs in a reaction volume of 35 µl. Transcription products were resolved by denaturing PAGE and visualized by PhosphorImager (Molecular Dynamics) analysis. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%.

3.2.4 RNase H cleavage assay

RNase H cleavage was carried out as described previously (McDaniel et al. 2003). The DNA template for T7 RNAP transcription was generated using complementary pairs of oligonucleotides as described above (section 3.2.1). Transcription was carried out using a MEGAshortscript T7 RNAP transcription kit (Ambion, Austin, TX) in the presence of 0.5 mM [α-32P]-UTP (800 Ci/mmol [30 TBq/mmol], GE Healthcare, Piscataway, NJ) for radiolabeling. RNAs were transcribed in the presence or absence of 2.5 mM SAM, as indicated, for 30 min at 37°C. After transcription, a DNA oligonucleotide (5 µM) complementary to positions 142-150 or 170-178 of the B. subtilis yitJ leader RNA was added and hybridized for 5 min at 37°C. RNase H (1 µl, 10 U/µl; Ambion) was then added to the reaction and incubated for 10 min at 37°C. The final volume of the reaction was 30 µl. The reactions were stopped by phenol–chloroform extraction. The resulting RNA products were resolved by denaturing PAGE and visualized by PhosphorImager analysis. The percentage of RNAs protected from cleavage was calculated as the amount of full-length RNAs relative to the total amount of RNA in each reaction. The reactions were performed in duplicate and reproducibility was ±10%.
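The protection calculation described above amounts to a simple ratio. The following is a minimal sketch with hypothetical band intensities (arbitrary PhosphorImager units); the function name and the numbers are illustrative only, and it assumes that total RNA is the sum of the full-length and cleaved signals.

```python
def percent_protected(full_length, cleaved):
    """Percentage of RNA protected from RNase H cleavage:
    full-length signal relative to total signal (full-length + cleaved)."""
    total = full_length + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * full_length / total

if __name__ == "__main__":
    # Hypothetical band intensities, not data from this study.
    print(f"{percent_protected(800.0, 200.0):.1f}% protected")
```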


3.3 Results

3.3.1 SAM-dependent structural transition of wild-type yitJ leader RNAs with distinct 3’ endpoints

The model for S box gene regulation predicts that the anti-antiterminator (helix P1) is stabilized in the presence of SAM, while the absence of SAM results in formation of the antiterminator structure. We probed the structure of helix P1 by using an oligonucleotide that is complementary to the 3’ side of helix P1 and detected annealing by sensitivity to cleavage with RNase H. The RNase H enzyme cleaves only the RNA strand of RNA-DNA hybrids. Formation of helix P1 in the presence of SAM prevents the complementary oligonucleotide from binding to the 3’ side of helix P1. This leads to protection from RNase H cleavage. In the absence of SAM, the 3’ side of helix P1 is single-stranded and available for the oligonucleotide to bind, resulting in cleavage by RNase H. Thus, the yitJ leader RNA is expected to exhibit a structural transition in a SAM-dependent manner.

We also probed the structure of the antiterminator by using a separate oligonucleotide that is complementary to the 3’ side of the antiterminator. As the antiterminator is formed in the absence of SAM, the 3’ side of the antiterminator is not available for the oligonucleotide to bind. Hence, protection of the RNA from RNase H cleavage is expected in the absence of SAM. In the presence of SAM, the 5’ side of the antiterminator is sequestered in formation of the alternate helix P1. This leaves the 3’ side of the antiterminator available for the RNase H oligonucleotide to bind, resulting in RNase H cleavage. Thus, sensitivity to RNase H cleavage was used as a measure of the SAM-dependent structural transition shown by the yitJ leader RNA.


We tested the RNase H cleavage sensitivity of two wild-type yitJ templates with distinct 3’ endpoints. Figure 3.2 shows the endpoints of the two yitJ templates and the positions of the two complementary oligonucleotide primers used to probe for SAM-dependent structural transitions. The shorter yitJ template (159 nt), used to test the SAM-dependent structural transition of helix P1, should form helix P1 in the presence of SAM and be protected from oligonucleotide 1-directed RNase H cleavage. The longer template (187 nt), used to detect the SAM-dependent structural transition of the antiterminator, should form the antiterminator in the absence of SAM and be protected from oligonucleotide 2-directed RNase H cleavage.


[Figure 3.2 labels: positions targeted for mutagenesis; SAM (*); RNase H oligo 1; RNase H oligo 2; endpoint of template yitJ-AAT (159 nt); endpoint of template yitJ-AT (187 nt)]

Figure 3.2 The B. subtilis yitJ leader RNA. The boxed nucleotides depict the 7 positions within the SAM-binding pocket targeted for mutagenesis; SAM is shown as an asterisk in magenta. Endpoints of the two templates are designated by arrows. RNase H oligonucleotides are shown by black lines; oligonucleotide 1 is complementary to sequence 142-150; oligonucleotide 2 is complementary to sequence 170-178.


A. Template yitJ-AAT

Lane       1   2   3   4
Oligo 1    -   -   +   +
RNase H    +   -   +   +
SAM        -   +   -   +

Bands: P (159 nt); C (~145 nt)

B. Template yitJ-AT

Lane       1   2   3   4
Oligo 2    -   -   +   +
RNase H    +   -   +   +
SAM        -   +   -   +

Bands: P (187 nt); C (~170 nt)

Figure 3.3 SAM-dependent structural transition of the B. subtilis yitJ S box RNA using RNase H cleavage assay. A. RNase H sensitivity of the yitJ-AAT transcript. B. RNase H sensitivity of the yitJ-AT transcript. SAM was added at a final concentration of 2.5 mM. The predicted sizes of the RNA products are shown to the left. P, Protection; C, Cleavage.


The addition of oligonucleotide 1 to the RNase H cleavage reaction triggered cleavage only in the absence of SAM, consistent with the S box model in which the region targeted by oligonucleotide 1 is sequestered only when SAM is present (Figure 3.3A, lane 3). In the absence of SAM, the yitJ-AAT transcript is available for oligonucleotide 1 to hybridize with the 3’ side of helix P1, resulting in cleavage. However, the yitJ-AAT transcript was protected in the presence of SAM, suggesting that the oligonucleotide was unable to bind to the 3’ side of helix P1, preventing RNase H cleavage (Figure 3.3A, lane 4).

Figure 3.3B illustrates the structural transition of the longer T7 RNAP-transcribed yitJ transcript, which includes the antiterminator sequence of the yitJ RNA (yitJ-AT). Addition of oligonucleotide 2 directed cleavage of the transcripts only in the presence of SAM. This indicates that the 5’ side of the antiterminator sequence is sequestered in formation of the terminator structure in the presence of SAM. As a result, the 3’ side of the antiterminator is available for oligonucleotide 2 to hybridize, resulting in RNase H-directed cleavage. These results are consistent with the S box model in which the antiterminator structure is formed only in the absence of SAM.

3.3.2 Mutagenesis of a region in helix P3 to generate a pool of yitJ variants

We developed an experimental system to identify S box leader sequence determinants important for ligand affinity and specificity. Direct contacts between the T. tengcongensis yitJ S box RNA and SAM were revealed in helix P3 in the crystal structure (Montange and Batey 2006). These ligand-RNA interactions were confirmed in a separate crystal structure study of the B. subtilis yitJ RNA in complex with SAM (Lu et al. 2010). We targeted nucleotides in the SAM-binding pocket of the B. subtilis yitJ RNA for mutagenesis, in an effort to obtain variants with either higher affinity for the natural ligand or altered specificity towards an analog (Figure 3.4). Positions C45, A46, A47 and C48 on the 5’ side and G77, U78 and G79 on the 3’ side of helix P3 were targeted for mutagenesis (Figure 3.4). Positions C45, A46, U78 and G79 are predicted to interact directly with the SAM molecule, while A47, C48 and G77 are predicted to be in close proximity to SAM.

Figure 3.4 A close-up view of the B. subtilis yitJ SAM-binding pocket. Helix P1 is depicted in grey; helix P3 is in cyan; SAM is in magenta; the nucleotides targeted for mutagenesis are highlighted in dark blue. This figure was generated using the PyMOL software; Protein Data Bank (PDB) accession code 3NPB (Lu et al. 2010).

Spiked DNA oligonucleotides were used to generate a pool of mutant DNA templates (Table 3.1). DNA sequencing of 27 representatives from the pool of mutant DNA templates revealed the presence of 1-6 changes, with an average of 4 changes per molecule (data not shown). These data confirmed the efficiency of mutagenesis. We simultaneously generated a wild-type yitJ template as a positive control. A T7 RNAP promoter sequence was attached to the 5’ end of the wild-type and mutant DNA templates to allow for T7 RNAP-mediated transcription.
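As a rough illustration of what an average of 4 changes across the 7 targeted positions implies, the distribution of substitutions per template can be sketched with a simple binomial model. This is a sketch only: the per-position substitution probability (`P_SUB`) is a hypothetical value chosen so that the expected mean equals 4, and the model assumes that positions are substituted independently. The actual doping ratio of the spiked oligonucleotides is not given here.

```python
from math import comb

N_POSITIONS = 7   # C45, A46, A47, C48, G77, U78, G79
P_SUB = 4 / 7     # hypothetical per-position substitution probability (assumption)


def prob_k_changes(k, n=N_POSITIONS, p=P_SUB):
    """Binomial probability that exactly k of the n targeted positions are substituted."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_changes(n=N_POSITIONS, p=P_SUB):
    """Expected number of substituted positions per template under the binomial model."""
    return n * p


# Probability of each possible number of changes (0 through 7) per template.
distribution = {k: prob_k_changes(k) for k in range(N_POSITIONS + 1)}
```

Under these assumptions the model also gives the expected fraction of unmutated templates (`distribution[0]`), which could be compared with the fraction of wild-type sequences recovered by sequencing.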

3.3.3 Effect of leader region mutations on the SAM-dependent structural transition of the yitJ RNA

The RNase H assay served as one of the experimental tools to characterize the various yitJ mutants. Based on the preliminary results for wild-type RNA (section 3.3.1), we tested the sensitivity of the wild-type and pool of mutant yitJ transcripts towards RNase H cleavage. The wild-type yitJ transcript (with the yitJ-AAT endpoint) exhibited cleavage in the absence of SAM (Figure 3.5, lane 2). Addition of 40 µM SAM also resulted in cleavage similar to that seen in the absence of SAM (Figure 3.5, lane 3). This was surprising considering that yitJ exhibits half-maximal termination at a SAM concentration of 0.35 µM and has an apparent Kd of ~20 nM (Tomsic et al. 2008). It is possible that the RNase H assay is not as sensitive as the in vitro transcription termination or the SAM binding assays and hence requires a much higher concentration of SAM to detect a conformational change in the RNA. As expected, addition of 2.5 mM SAM resulted in protection of the wild-type yitJ transcript from RNase H cleavage (Figure 3.5, lane 4). However, the pool containing mutant yitJ leader RNAs exhibited cleavage regardless of the presence of SAM (Figure 3.5, lanes 6-8). Even the addition of a high concentration of SAM (2.5 mM) did not change the response of the pool of yitJ mutants in the RNase H assay. This result was not surprising as the pool of yitJ mutants contains alterations within a highly conserved region of the leader sequence.

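[Figure 3.5 gel image. Lanes 1-4, wild-type yitJ transcripts; lanes 5-8, pool of mutant yitJ transcripts. Oligonucleotide and RNase H were omitted in lanes 1 and 5 and present in all other lanes. SAM was absent in lanes 1, 2, 5 and 6, present at 40 µM in lanes 3 and 7, and present at 2.5 mM in lanes 4 and 8. Arrowheads mark the protected (P) and cleaved (C) transcripts.]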

Figure 3.5 SAM-dependent structural transition of the pools of wild-type and mutant yitJ transcripts. SAM was added at the indicated concentrations. P, protected transcript; C, cleaved transcript.


We also tested a few individual mutants that were sequenced from the variant template pool (Figure 3.6). Lanes 1 and 2 depict wild-type yitJ RNA in the absence and presence of SAM, respectively (Figure 3.6). Mutant M1 showed constitutive cleavage regardless of the presence of SAM (Figure 3.6, lanes 3 and 4). Sequencing of mutant M1 revealed the presence of 4 nucleotide substitutions within the SAM-binding pocket: C45A, A46G, C48A and G77U. A different mutant (M3) with a point mutation, C45U, also resulted in high constitutive cleavage regardless of the presence of SAM, indicating that helix P1 cannot be formed (data not shown). These results suggest the importance of position C45, which is part of a conserved base-triple (G79-C45·G11) and makes a direct contact with the methionine moiety of SAM. It is possible that the newly formed wobble base-pair (U45·G79) at the bottom of helix P3 can no longer interact with SAM or with G11 to form the base-triple. Together, these results conclusively show that a change at position C45 abolishes the response to SAM in vitro.

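[Figure 3.6 gel image. Lanes 1-2, wild-type; lanes 3-4, M1; lanes 5-6, M2; lanes 7-8, M7. Oligonucleotide and RNase H were present in all lanes. SAM was absent in odd-numbered lanes and present in even-numbered lanes. Arrowheads mark the protected (P) and cleaved (C) transcripts.]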

Figure 3.6 SAM-dependent structural transition of wild-type and yitJ variant RNAs. SAM was added at a final concentration of 2.5 mM. M, mutant from pool of variant yitJ templates; P, protected transcript; C, cleaved transcript.

Point mutants M2 (G77C) and M7 (A47G) showed wild-type-like responses to RNase H in the presence of SAM (Figure 3.6). Unlike mutant M2, a separate mutant, M9, that contained G77C along with G79C, exhibited a complete loss of response to SAM (data not shown). The results for M2 and M9 suggested that the G77C mutation is tolerated on its own, but that its combination with a second substitution (G79C) abolished the response entirely. Positions G77 and A47 do not interact directly with the SAM molecule but are in close proximity. It is therefore possible that these positions tolerate a sequence change. As G79 is part of a base-triple that is involved in hydrogen bond interactions with the methionine moiety of SAM, the above results indicate that any disruption of the base-triple leads to a loss of SAM-dependent structural transition. This preliminary analysis provided insight into the varied responses that could be obtained from the pool of yitJ mutants.

3.3.4 Identification of yitJ leader RNA determinants for SAM affinity and recognition

We performed SELEX (Ellington and Szostak 1990, Tuerk and Gold 1990), or in vitro evolution, to obtain yitJ variants from the pool of mutant DNA templates using the RNase H assay as the primary selection step. SELEX allows simultaneous screening of diverse pools of DNA (or RNA) molecules for a specific functionality. As a complex pool of mutant templates is expected to contain only a small fraction of functional molecules, we hypothesized that by passing mutant sequences through iterative rounds of selection, only yitJ variants with a specific functionality would eventually be obtained.

To estimate the starting concentration of SAM to be used for the first round of in vitro selection, we performed a SAM titration on the wild-type yitJ leader RNA (Figure 3.7). A gradual increase in protection from RNase H cleavage was evident as the concentration of SAM increased from 5 µM to 2.5 mM (Figure 3.7). This result showed that SAM concentrations higher than 40 µM prevent the RNase H oligonucleotide from binding to the 3’ side of helix P1. This indicates that the RNA undergoes a structural transition that favors formation of helix P1 in the presence of SAM, which is in agreement with the S box model.

Figure 3.7 A SAM titration of the yitJ-AAT template in the RNase H cleavage assay. SAM was added in increasing concentrations from lanes 3 to 8. P, protected transcript; C, cleaved transcript; %C, percentage of cleavage. Percent cleavage is the amount of the cleaved transcript relative to the sum of the cleaved and protected transcripts.

We observed ~50% cleavage at a SAM concentration of 160 µM (Figure 3.7, lane 8). To ensure that the yitJ variants obtained during SELEX maintain the response to SAM, we selected a less stringent concentration of SAM that would result in at least 50% protection from RNase H cleavage. The first round of selection was performed with a starting concentration of SAM that was 2-fold higher (320 µM), as it would exhibit >50% protection. Subsequent assays in the presence of lower concentrations of SAM were used to detect higher affinity variant RNA molecules.
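The percent cleavage defined in the legend of Figure 3.7 (cleaved transcript relative to the sum of cleaved and protected transcripts) can be written as a short calculation. The sketch below assumes that band intensities for the cleaved and protected species have already been quantified in arbitrary units; the function names are introduced here for illustration only.

```python
def percent_cleavage(cleaved, protected):
    """Percent cleavage: cleaved signal relative to cleaved plus protected signal."""
    total = cleaved + protected
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * cleaved / total


def percent_protection(cleaved, protected):
    """Complement of percent cleavage for the same lane."""
    return 100.0 - percent_cleavage(cleaved, protected)
```

A lane in which the cleaved and protected bands have equal intensity gives 50% cleavage, which corresponds to the value observed at 160 µM SAM.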


Figure 3.8 The SELEX scheme. The selection cycle begins with the DNA pool (template 1, top right corner). The RNase H oligonucleotides are denoted by bold red (oligo 1) and blue (oligo 2) lines.

We employed a two-stage SELEX scheme as illustrated in Figure 3.8. In the first stage, the pool of mutant DNA templates (consisting of the 159 bp yitJ-AAT templates) was transcribed using T7 RNAP in the presence of 320 µM SAM. The RNA pool thus generated was used as the starting point for our in vitro selection. Transcripts with SAM-dependent helix P1 formation were selected using antisense DNA oligonucleotide 1 in the RNase H cleavage assay. The RNase H cleavage eliminates the primer-binding site required for the subsequent reverse-transcription step. Reverse-transcription was therefore a key step in this scheme, as it specifically selected uncleaved transcripts from the population. The cDNA products were amplified to generate a sub-pool of molecules with SAM-dependent helix P1 formation.

In the second stage of SELEX, we tested if the yitJ leader RNAs could form the antiterminator structure in the absence of SAM. For this purpose, the cDNA products generated during stage 1 were extended to include the antiterminator region downstream of helix P1 (187 bp yitJ-AT template). Transcripts with the ability to form the antiterminator structure in the absence of SAM were protected from RNase H cleavage in the presence of oligonucleotide 2 and selected specifically during the ensuing reverse-transcription step. The amplified cDNA products served as the template pool for the next selection round. The two RNase H assays, along with both the reverse-transcription reactions, formed one round of in vitro selection. We aimed to conduct iterative rounds of selection to obtain a pool of yitJ variants with specific functionality.
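The logic of one selection round can be summarized as two successive filters. The sketch below is an idealized, deterministic illustration and not a model of the actual experiment: each variant is reduced to two hypothetical yes/no properties, whereas a real selection enriches molecules probabilistically over several rounds. The class, field and variant names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    p1_forms_with_sam: bool      # protected from oligo 1 + RNase H when SAM is present
    at_forms_without_sam: bool   # protected from oligo 2 + RNase H when SAM is absent


def stage1(pool):
    """Stage 1: keep yitJ-AAT transcripts that stay uncleaved with oligo 1 in the presence of SAM."""
    return [v for v in pool if v.p1_forms_with_sam]


def stage2(pool):
    """Stage 2: keep yitJ-AT transcripts that stay uncleaved with oligo 2 in the absence of SAM."""
    return [v for v in pool if v.at_forms_without_sam]


def selection_round(pool):
    """One round: only uncleaved transcripts retain the primer-binding site and are reverse-transcribed."""
    return stage2(stage1(pool))


# Hypothetical starting pool with three idealized behaviors.
pool = [
    Variant("wild-type-like", True, True),
    Variant("constitutively cleaved by oligo 1", False, True),
    Variant("no antiterminator without SAM", True, False),
]
survivors = selection_round(pool)
```

In this idealized picture only variants that pass both stages are carried into the next round, which is why both RNase H assays and both reverse-transcription reactions together constitute one round.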

The selections were performed on pools of the wild-type and mutant yitJ templates, where wild-type yitJ served as the control in which no sequence changes were expected after multiple rounds of selection. Four selections for each starting template were performed in parallel in order to have multiple pools of molecules. In spite of optimizing experimental conditions prior to starting the actual selection, the reverse-transcription as well as the PCR amplification and extension steps were technically challenging. We modified several variables individually or in combination, including the source of DNA polymerase, annealing temperature, template dilution, concentration of oligonucleotide primers, and concentration of Mg2+. Overall, we found that the amplification products obtained using Pfu DNA polymerase exhibited a better yield than those obtained using Taq DNA polymerase (data not shown). After a year of unsuccessful attempts to improve the experimental conditions, the yitJ SELEX project was terminated.

We next studied the B. subtilis yitJ leader RNA using an alternative approach in which the yitJ SAM-binding pocket was targeted directly rather than using spiked oligonucleotide primers. Site-directed mutagenesis was performed on the wild-type yitJ construct to generate single and double mutants. These mutants were analyzed as part of a crystal structure study of the B. subtilis yitJ leader RNA (in collaboration with Dr. Ailong Ke, Cornell University) (Lu et al. 2010). The following sections will describe the results obtained using in vitro transcription termination assays in the presence of SAM and SAM analogs.

3.3.5 Disruption of G11 causes loss of SAM binding, yet high constitutive transcription termination in vitro

The G11 nucleotide is involved in a direct interaction with the methionine moiety of SAM and participates in a G79-C45·G11 base-triple that ties together the J1/2 and P3 within the yitJ RNA (Figure 3.9) (Lu et al. 2010). Mutating G11 to any of the other three nucleotides (A, C, or U) resulted in a complete loss of detectable SAM binding (SAM binding done by J. Tomšič, Lu et al. 2010). These results were consistent with the position of G11 in the crystal structure. However, each of the G11 mutants exhibited increased termination in the absence of SAM (see Table 3.2) (Lu et al. 2010).


Figure 3.9 Nucleotides within the SAM-binding pocket of the B. subtilis yitJ RNA. Shown is the interaction of SAM with the G79-C45·G11 base-triple. Distances are given in angstrom in magenta. Carbon, oxygen, nitrogen, sulfur, and phosphorus atoms are colored gray, red, blue, gold, and orange, respectively. Adapted from (Lu et al. 2010).

In particular, G11C exhibited the most dramatic effect, with >80% termination in the presence or absence of high SAM (140 µM). The G11U mutant also exhibited 70% termination regardless of the presence of SAM. The wild-type yitJ construct showed a maximal response (99% termination) at 0.2 µM SAM, 700-fold less than the maximum concentration of SAM used for the G11C mutant. These results indicate that mutating the highly conserved position (G11) in the yitJ leader RNA facilitates formation of an RNA structure that resembles the SAM-bound conformation, even when no SAM is present (Lu et al. 2010). The G11A mutant exhibited only a modest effect on termination in the absence of SAM, despite complete loss of SAM binding, indicating that the structural change that results in the stabilization of RNA into a form similar to the ligand-bound form is separable from the effects on SAM binding (Lu et al. 2010).

Table 3.2 Mutational analysis of the B. subtilis yitJ S box riboswitch.

a Values indicate the efficiency of termination in vitro performed in the absence (−SAM) or presence (+SAM) of SAM (140 µM). b ND, not detectable (Kd >200 µM).

3.3.6 Mutating residues in the P3 helix of the yitJ SAM-binding pocket results in a surprising stabilization of the aptamer domain

Substitutions of G79 and C45, the residues that participate in the base-triple with G11 (Figure 3.9), exhibited high constitutive transcription termination, regardless of the presence or absence of SAM (Table 3.2). These results were similar to those obtained for the G11 mutants (section 3.3.5). These data suggest that disrupting any nucleotide within the G79-C45·G11 base-triple results in a loss of response to SAM, similar to the results obtained during RNase H assays. It is possible that sequence changes stabilize the yitJ SAM-binding domain in a structure that resembles the SAM-bound conformation (Lu et al. 2010).

Figure 3.10 Nucleotides within the SAM-binding pocket of the B. subtilis yitJ RNA. A sheared A46·U78 pair recognizes the Watson–Crick and Hoogsteen faces of the adenosine moiety of SAM. Distances are given in angstrom in magenta. Carbon, oxygen, nitrogen, sulfur, and phosphorus atoms are colored gray, red, blue, gold, and orange, respectively. Adapted from (Lu et al. 2010).

Like the G79-C45·G11 base-triple, the A46·U78 base-pair is also critical. The A46 and U78 nucleotides interact directly with the adenine ring of SAM (Figure 3.10). This base-pair is highly sensitive to substitution, consistent with its position in the crystal structure. Mutants of A46·U78 exhibited high constitutive termination similar to the mutants of the G79-C45·G11 base-triple (Table 3.2) (Lu et al. 2010).

The adenine ring of SAM is further stabilized by a second base-triple, A47·C48-G77, which is stacked above the A46·U78 base-pair. Changing most of the positions within the A47·C48-G77 base-triple resulted in high constitutive termination regardless of the presence of SAM (Table 3.2). However, several notable substitutions like C48A and C48G retained a partial response in the presence of high SAM, indicating that SAM binding was not affected severely (Lu et al. 2010). It is possible that C48 is more tolerant to sequence changes compared to the other nucleotides within the base-triple as it does not participate in direct interactions with the SAM molecule.

3.3.7 Substitution of the U85-A109 base-pair within the pseudoknot weakens the SAM-binding ability without affecting the termination efficiency

The U85-A109 base-pair supports the crucial pseudoknot and creates a compact RNA structure (Lu et al. 2010, Montange and Batey 2006). A109 interacts with position A24 to form a base-triple that stabilizes the pseudoknot interaction (Lu et al. 2010). In contrast to the point mutations in helix P3 and the J1/2-P3 base-triple, mutants with a substitution of either nucleotide in the U85-A109 base-pair retained a response to SAM, with an increase in transcription termination only in the presence of SAM (Table 3.2) (Lu et al. 2010). Percent termination in the presence of SAM was similar to wild-type yitJ, and percent termination in the absence of SAM was ~1.6-fold higher for the U85 position and ~1.5-fold lower for the A109 position compared to wild-type (Figure 3.11). The U85C-A109G compensatory mutant also exhibited a wild-type-like response during transcription in both the presence and absence of SAM (Table 3.2). This indicates that altering the U85-A109 base-pair does not substantially affect termination in the absence of SAM, in contrast to the other point mutants (Table 3.2).

Figure 3.11 In vitro transcription termination assay. Comparison of in vitro transcription termination efficiencies of wild-type and variant yitJ constructs in response to SAM. SAM concentration (µM) is denoted on the X-axis and termination efficiency is denoted as percentage values on the Y-axis.

Although the termination efficiency of the U85-A109 base-pair mutants did not change significantly, the binding affinity for SAM was affected strongly (Lu et al. 2010). The U85C point mutant showed a ~220-fold increase in Kd value, and A109G resulted in a ~110-fold increase in Kd (Table 3.2, SAM binding, J. Tomšič). The U85C-A109G compensatory mutant partially restored the SAM-binding affinity relative to the point mutants alone, but its affinity was still 58-fold lower than that of the wild-type yitJ RNA. Changing the base-pair from U-A to C-G maintained the response to SAM in an in vitro transcription termination assay but reduced the affinity for SAM. The fact that the compensatory mutant was unable to rescue the phenotype of the individual point mutants in the binding assay suggests that these residues exhibit sequence specificity in addition to their ability to base-pair. This is consistent with the critical role that A109 plays in formation of the pseudoknot structure (Lu et al. 2010).

3.3.8 Termination efficiency of wild-type and U85-A109 variant yitJ constructs in response to SAM analogs

In order to examine the specificity of the yitJ S box riboswitch RNA, we tested the wild-type yitJ leader RNA in response to a series of SAM analogs using an in vitro transcription termination assay. Each SAM analog tested during this study differed from SAM at a single functional group and most of the modifications were located at the sulfur group or within the amino acid side chain (Figure 3.12). The concentration for half-maximal termination (Trm1/2) at the wild-type yitJ leader region terminator was determined for each SAM analog (Table 3.3). Several yitJ variants of the SAM-binding pocket were tested initially at one high concentration of each SAM analog. Mutants that showed a change in percent termination were further titrated using a range of analog concentrations and compared to the percent termination of wild-type yitJ. The concentrations necessary to reach Trm1/2 for the U85-A109 base-pair variants in the presence of the SAM analogs are listed in Table 3.4.
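One way to estimate a Trm1/2 value from a ligand titration is to interpolate between the two concentrations that bracket the half-maximal response. The sketch below is illustrative only. It assumes that half-maximal termination is the midpoint between termination without ligand and the maximum termination observed, and that simple linear interpolation is adequate; the fitting procedure actually used for Tables 3.3 and 3.4 is not described here, and the titration values in the usage example are hypothetical.

```python
def trm_half(concentrations, terminations):
    """Estimate the ligand concentration giving half-maximal termination.

    concentrations: ascending ligand concentrations (µM), starting with 0 (no ligand).
    terminations:   percent termination measured at each concentration.
    """
    if len(concentrations) != len(terminations) or len(concentrations) < 2:
        raise ValueError("need matching lists with at least two points")
    basal = terminations[0]
    maximum = max(terminations)
    # Assumed definition: midpoint between basal and maximal termination.
    target = basal + (maximum - basal) / 2.0
    points = list(zip(concentrations, terminations))
    for (c0, t0), (c1, t1) in zip(points, points[1:]):
        if t0 <= target <= t1 and t1 != t0:
            # Linear interpolation between the bracketing points.
            return c0 + (target - t0) * (c1 - c0) / (t1 - t0)
    raise ValueError("titration does not cross the half-maximal value")


# Hypothetical titration, for illustration only (not data from this study).
example_conc = [0.0, 0.1, 0.2, 0.4, 0.8]
example_term = [20.0, 40.0, 60.0, 90.0, 100.0]
example_trm_half = trm_half(example_conc, example_term)
```

For a variant that shows no change in termination across the titration, the function raises an error rather than returning a value, which mirrors the fact that a Trm1/2 cannot be assigned to an unresponsive RNA.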

Figure 3.12 Chemical structures of SAM and SAM analogs. Modifications for each analog are depicted as pink spheres relative to SAM (top left corner). Abbreviations for each compound are as follows: SAM, S-adenosyl-L-methionine; SeSAM, Se-adenosyl-L-selenomethionine; TeSAM, Te-adenosyl-L-telluromethionine; SAH, S-adenosylhomocysteine; SAEt, S-adenosyl-ethionine; OH-SAM, hydroxyl-SAM; ME-SAM, methyl-ethyl-SAM.


Table 3.3. In vitro transcription termination of wild-type B. subtilis yitJ in the presence of SAM or SAM analogs. Max% denotes the maximum percentage of termination observed in the presence of ligand. Trm1/2 denotes half-maximal termination and Trm1/2 (µM) indicates the concentration of ligand for half-maximal termination. Maximum concentration of SAM or SAM analogs used are as follows: SAM, 2.5 µM; SAH, 500 µM; Sinefungin, 5.0 mM; SAEt, 2.5 µM; SeSAM, 2.5 µM; TeSAM, 2.5 µM; 3’-deoxy SAM, 2.5 µM; OH-SAM, 2.5 µM; ME-SAM, 125 µM.


Table 3.4 In vitro transcription termination of B. subtilis yitJ variants in the presence of SAM or SAM analogs. Max% denotes the maximum percentage of termination in the presence of ligand. Trm1/2 denotes half-maximal termination and Trm1/2 (µM) indicates the concentration of ligand required for half-maximal termination. Maximum concentration of SAM or SAM analogs used are as follows: SAM, 2.5 µM; SAH, 500 µM; Sinefungin, 5.0 mM; SAEt, 2.5 µM; SeSAM, 2.5 µM; TeSAM, 2.5 µM; 3’-deoxy SAM, 2.5 µM; OH-SAM, 2.5 µM; ME-SAM, 125 µM.

SAH differs from SAM by the absence of a single methyl group and a positive charge. Wild-type yitJ exhibited the lowest maximum percent termination (Max%T) in the presence of SAH as compared to that in the presence of SAM and the other analogs (Table 3.3). SAH resulted in 78% termination even when added at a concentration of 500 µM, while 1.2 µM SAM was sufficient to stimulate 100% termination for wild-type yitJ (Table 3.3). The A109G and U85C mutants exhibited a maximum termination of 50% and required a SAH concentration of ~3000 µM (data not shown). The C85-G109 compensatory mutant rescued the phenotype of each single mutant. The compensatory mutant exhibited a maximum termination value similar to wild-type yitJ, but required a ~6-fold higher concentration of SAH (Table 3.4). These results are in agreement with the model that an overall positive charge is required for ligand recognition. SAH lacks the positive charge and binds with reduced affinity to wild-type and variant yitJ RNAs.

Sinefungin, a natural antibiotic produced by Streptomyces sp., is a SAM analog in which methionine is replaced by ornithine. Sinefungin lacks the sulfonium ion and contains an amine substitution in place of the methyl group (Figure 3.12). We found that this analog induced transcription termination at the wild-type yitJ terminator. A concentration of 500 µM was necessary to reach ~90% termination (Table 3.3). It is possible that the decreased affinity for this analog is due to the reduced net positive charge caused by the absence of the central sulfonium ion. Previous studies have demonstrated that a high concentration (1.5 mM) of sinefungin reduces the percent termination by increasing readthrough (~4-fold) at the ykrW leader region terminator during in vitro transcription termination (McDaniel et al. 2003). In the current study, a 10-fold higher concentration of sinefungin (~15 mM) was required to reach ~50% termination by the yitJ variants A109G and U85C (Table 3.4). The C85-G109 compensatory mutant rescued the phenotype of the individual mutants marginally. A ~3.0-fold lower concentration of sinefungin was required to achieve Trm1/2 for the compensatory mutant compared to the U85C mutant (Table 3.4).

Se-SAM and Te-SAM differ from SAM in the identity of the chemical element at the position of the sulfur atom. Se-SAM contains selenium in place of sulfur, while Te-SAM contains tellurium (Figure 3.12). The positive charge at the central position of SAM is crucial for specific ligand recognition (Lim et al. 2006, Lu et al. 2010, Montange and Batey 2006). Both Se-SAM and Te-SAM retain this positive charge. Similar to the termination efficiency in the presence of SAM, wild-type yitJ showed termination efficiencies reaching 100% and 97% in the presence of ~2.5 µM Se-SAM and Te-SAM, respectively (Table 3.3). Sulfur, selenium and tellurium belong to the same group of elements, termed chalcogens, and exhibit highly similar chemical properties. It is therefore not surprising that wild-type yitJ responds to these two analogs much as it does to SAM. The difference lies in the concentration required for Trm1/2. Wild-type yitJ needed a 2-fold lower concentration of Se-SAM and Te-SAM as compared to the concentration of SAM (Table 3.3). Altering the U85-A109 base-pair had a minor effect on termination efficiency in the presence of Se-SAM. Both point mutants exhibited similar Trm1/2 values in the presence of 2-fold and 7-fold higher concentrations of the analog (Table 3.4). However, the compensatory mutant restored the phenotype of the individual mutants. The double mutant achieved a Trm1/2 at 0.1 µM Se-SAM, similar to the value for SAM. The A109G and U85C mutants showed a maximum termination of ~50% in the presence of Te-SAM. U85C required a 22-fold higher concentration of Te-SAM compared to wild-type yitJ (Table 3.4). Again, the compensatory mutant rescued the phenotype exhibited by the point mutants. These results indicate that Se-SAM and Te-SAM resemble SAM closely, and are therefore able to generate a SAM-like response for wild-type yitJ. These results further validate the model that a positive charge, and not necessarily the identity of the charged moiety, is essential for ligand recognition (Lu et al. 2010, Montange and Batey 2006).

The response of the wild-type yitJ leader RNA to the SAM analog SAEt was nearly identical to SAM. SAEt substitutes the methyl group in SAM with an ethyl group (Figure 3.12). High-resolution analyses have revealed that the methyl group of SAM points towards the solvent cavity within the interior of the RNA and does not participate directly in ligand recognition (Lu et al. 2010, Montange and Batey 2006). Our results appear to be in agreement with this model. Relative to wild-type yitJ RNA, the A109G mutant needed a 10-fold higher concentration of ligand to reach Trm1/2 of <40% (Table 3.4), and the U85C mutant required an 8-fold higher concentration. The compensatory mutation C85-G109 rescued these phenotypes partially. These results, yet again, suggest that disruption of the U85-A109 base-pair leads to a severe defect in the response to the ligand by the leader RNA, indicating the importance of sequence over base-pairing.

3’-deoxy SAM differs from SAM by the absence of the 3’ hydroxyl group of the sugar moiety (Figure 3.12). Alteration of the sugar moiety of SAM did not affect the termination efficiency of wild-type yitJ (Table 3.3). Wild-type yitJ reached ~100% termination efficiency in the presence of 3’-deoxy SAM with a Trm1/2 concentration of 0.15 µM. These values were similar to those seen in the presence of SAM (Table 3.3). However, the yitJ point mutants exhibited barely 50% termination in the presence of the analog. Although the compensatory mutant rescued the phenotype of each point mutant, it required a 2-fold higher concentration of 3’-deoxy SAM compared to the wild-type yitJ RNA (Table 3.4).

Substituting the amine group in the methionine tail with a hydroxyl group did not alter the response of wild-type yitJ in vitro. This was surprising, as previous studies have indicated the importance of the amine group in correct recognition of SAM (Lim et al. 2006). Changing the NH3+ group to an H resulted in a Kd value of >1000 µM for wild-type yitJ (Lim et al. 2006). However, the carboxylate group of SAM involved in Watson-Crick base-pairing with the yitJ RNA is left intact in OH-SAM. OH-SAM promoted termination at the wild-type yitJ leader region terminator and a 2-fold higher concentration of OH-SAM was necessary to reach a Trm1/2 of ~60% (Table 3.3). The yitJ variants tested in the presence of this analog exhibited very low termination yield, and the compensatory mutant did not rescue the phenotypes of the point mutants (Table 3.4). This suggests a direct interaction of the amine group with the U85-A109 base-pair within the SAM-binding pocket and that any alteration at this sequence is detrimental to the termination efficiency.

Lastly, the SAM analog ME-SAM differs from SAM in that the α-carboxylate, α-amine and the α-carbon of methionine are missing (Figure 3.12). ME-SAM promoted transcription termination at the wild-type yitJ leader region terminator. However, a 50-fold higher concentration (5 µM) was required for maximum termination of 96%, compared to SAM (Table 3.3). The individual mutants did not exhibit high termination values even with analog concentrations of 125 µM. Also, the compensatory mutant exhibited a Trm1/2 concentration of 12.5 µM ME-SAM for 80% maximum termination (Table 3.4). These results are consistent with the crystal structure that highlights the importance of the functional groups in ligand recognition.

3.4 Discussion

The B. subtilis yitJ S box leader RNA is one of the best-studied S box riboswitches (Grundy and Henkin 1998). Regulation of this riboswitch occurs such that expression of the yitJ gene is upregulated when the cells are starved for methionine (when SAM pools are low) and gene expression is turned off in the presence of methionine (when SAM pools are high). Several genetic, biochemical and biophysical studies together provided the basis for the S box model and the ligand-dependent structural transitions of the RNA (McDaniel et al. 2003, McDaniel et al. 2005). Crystal structure analyses of the yitJ RNA have revealed the nucleotides that participate in SAM recognition (Lu et al. 2010, Montange and Batey 2006, Stoddard et al. 2010).

In this study, we examined the SAM-binding pocket of the yitJ leader RNA in an attempt to obtain variant aptamers with altered affinities or ligand specificities. We initially examined a pool of yitJ mutants with an altered sequence in helix P3 using SELEX. RNase H cleavage assays followed by reverse-transcription PCR served as the selection steps in the experiment. Wild-type yitJ exhibited a SAM-dependent structural transition with increased protection in the presence of SAM during the RNase H cleavage assay (Figure 3.5). However, the pool of mutant yitJ RNAs failed to show increased protection even in the presence of a high (2.5 mM) SAM concentration. This result was not surprising as a highly conserved region closely associated with SAM was targeted for mutagenesis. This result indicated that an alteration of the SAM-binding pocket affects the ability of the RNA to undergo a ligand-dependent structural transition. By testing a few yitJ mutants using the RNase H assay, we were able to explore the possible responses of representatives from a partially randomized pool of yitJ templates. These preliminary results established the design for the two-stage SELEX scheme.

The first half of the SELEX scheme selected for RNA molecules with SAM-dependent helix P1 formation, while the second half of SELEX selected RNAs able to form the antiterminator in the absence of SAM. Every step in the SELEX scheme was optimized before proceeding with the in vitro selection experiment. However, technical difficulties hampered our attempts and the investigation had to be terminated. An alternate experimental design might minimize the PCR-related technical problems and result in a more efficient selection system. Sites for restriction enzymes can be included in the two template sequences to bypass the PCR-based methods of modifying the template end-points at each step within the selection scheme. Treating the amplified cDNA products with the appropriate restriction enzyme can reduce the loss of DNA that occurred during the PCR amplification and extension steps.

A subsequent in vitro characterization of the yitJ riboswitch was performed as part of a larger study, which crystallized the SAM-bound yitJ RNA from B. subtilis (Lu et al. 2010). We characterized a set of binding domain mutants using an in vitro transcription termination assay. A few selected mutants were also tested for their ability to bind SAM using a filter exclusion assay. Most of the mutants exhibited loss of SAM binding (consistent with their position in the crystal structure), as these residues either make important contacts with SAM or are important for stabilization of crucial structural domains that form the SAM-binding pocket (Lu et al. 2010).

The more surprising mutations were those that resulted in high transcription termination in the absence of SAM, suggesting that they stabilize a structural arrangement in the aptamer domain similar to the SAM-bound form. The fact that many mutants retained the ability to respond to SAM suggested that formation of the SAM-bound conformation in the absence of SAM is separable from the ability to bind SAM. Mutational studies conducted with the THI-box riboswitch revealed similar results (Ontiveros-Palacios et al. 2008). These observations together suggest that a riboswitch undergoes significant evolutionary selection to regulate gene expression efficiently, only in response to the appropriate ligand (Lu et al. 2010).

The above results also suggest that it is advantageous for the aptamer domain to fold into a pre-bound conformation in the absence of the ligand. This can result in a quick ligand-dependent structural transition that can be coupled to gene regulation only in response to the correct ligand. Data obtained from additional analyses of the yitJ variants can provide a better understanding of how certain mutations preferentially stabilize the bound-like state of the RNA.

It is important to note that the in vitro transcription termination assay detects the efficiency of termination at the leader region terminator in response to a ligand, while the RNase H assay detects the ligand-dependent structural transition of the RNA. It is therefore possible that the RNase H assay is not as sensitive as the in vitro transcription assay and hence requires a much higher concentration of SAM to detect a SAM-dependent response in vitro. For example, the SAM concentration required to achieve half-maximal termination at the yitJ leader region terminator is ~0.35 µM, whereas the concentration of SAM required for 50% cleavage in the RNase H assay was ~160 µM.

Discrepancies were also observed in some results for the yitJ variants. For example, the C45U mutant exhibited high constitutive cleavage in the RNase H assay, suggesting an inability to form helix P1 (section 3.3.2). Based on the RNase H result, we would predict the C45U mutant to show high readthrough activity in an in vitro transcription termination assay. However, this mutant exhibited high constitutive termination in the in vitro transcription termination assay (Table 3.2). It is possible that single-nucleotide substitutions that trap the RNA in the SAM-bound-like conformation fail to exhibit a structural transition in the RNase H cleavage assay, leading to inconsistent results.

Lastly, we characterized the wild-type yitJ RNA and a few selected mutants in response to SAM analogs using in vitro transcription termination assays. Our results indicated that the yitJ leader RNA is highly specific for SAM, its natural ligand. In some cases, we found that the wild-type yitJ transcript exhibited increased termination in the presence of a few analogs. However, this was achievable only under high ligand concentrations that are not physiologically relevant. The analysis of U85C, A109G and C85-G109 indicated that the U85 position was highly critical and that alteration of this nucleotide severely affected the ability of the RNA to respond to SAM or SAM analogs in vitro. The compensatory mutations showed the importance of sequence over base-pairing. The U85-A109 base-pair supports the crucial pseudoknot and creates a compact RNA structure (Lu et al. 2010, Montange and Batey 2006). A109 interacts with the N1 position of A24 to further stabilize the pseudoknot interaction (Lu et al. 2010). In addition to stabilizing the pseudoknot structure, these tertiary interactions also stabilize the SAM-binding pocket through ribose-mediated hydrogen bonds in a SAM-independent manner.

Based on the above results, we can conclude that changing the ligand specificity of the B. subtilis yitJ leader RNA is challenging. Due to the highly conserved sequence of the yitJ SAM-binding pocket, generation of variants with altered affinities or specificities can be difficult. However, as seen in the case of the B. subtilis lysC leader RNA, nucleotide changes within the lysine-binding pocket are enough to modify the ligand specificity from lysine to a lysine analog (Wilson-Mitchell et al. 2012). We can speculate that the high conservation seen among the S box leader RNAs results in an evolutionary advantage, which prevents the RNA from recognizing closely-related ligands and inappropriately triggering gene regulation.

CHAPTER 4

INVESTIGATION OF THE FUNCTION OF S BOX RIBOSWITCH RNA

STRUCTURAL ELEMENTS: INSIGHTS INTO FACTORS CONTRIBUTING TO

S BOX RIBOSWITCH VARIABILITY

4.1 Introduction

The gene products regulated by the S box transcription termination control system in B. subtilis are involved in different steps of sulfur metabolism (Grundy and Henkin 2004, Grundy and Henkin 2006, Hullo et al. 2004, Murphy et al. 2002) (Figure 4.1). Expression of these transcriptional units is induced during starvation for methionine, in response to a drop in the intracellular concentration of the molecular effector, SAM (Grundy and Henkin 1998). SAM is synthesized from methionine and ATP by SAM synthetase, encoded by the metK gene. Growth in the presence of methionine results in high SAM pools, while growth in the absence of methionine results in low SAM pools.

The 11 S box genes from B. subtilis exhibit differential regulation. An extensive biochemical and genetic analysis revealed the correlation between the physiological roles of the S box genes and their sensitivity to SAM in vivo and in vitro (Tomsic et al. 2008). The S box gene-lacZ fusions show a 250-fold range in induction ratios after 4 h of methionine starvation (Tomsic et al. 2008). This study concluded that the genes directly involved in methionine biosynthesis (e.g., metE, yitJ and yjcI) (Figure 4.1) exhibit the tightest regulation in vivo, i.e., these genes exhibit the lowest expression level during growth in the presence of methionine and the greatest increase in expression during starvation for methionine. The metE and yitJ genes, in particular, exhibit a long delay before gene expression is induced. This pattern of gene expression suggests that the cell turns on methionine biosynthesis only after SAM levels become very low (Tomsic et al. 2008).

In contrast, the yusCBA operon (which encodes an ABC-type methionine transporter, Figure 4.1) exhibits a higher level of expression during growth in the presence of methionine and a lower magnitude of induction during methionine starvation. The yusC gene exhibits rapid induction of expression (Tomsic et al. 2008). This pattern of gene expression suggests that expression of the methionine transporter is turned on even before the SAM levels in the cell become very low. These properties are consistent with the role of yusC in methionine transport, as it is more efficient to take up exogenous methionine than to synthesize it in vivo. Unlike most other S box genes, the metK gene does not exhibit induction during methionine starvation (Tomsic et al. 2008).

Figure 4.1 Methionine biosynthesis pathways in B. subtilis. Enzymatic steps catalyzed by the S box gene products are shown. Genes with unknown function (yoaDCB, yxjG and yxjH) are positioned based on sequence similarity. SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine; MT, methylthio; THF, tetrahydrofolate. Adapted from Murphy et al. (2002).

Northern blot analysis revealed the relative levels of the terminated and readthrough transcripts in B. subtilis during methionine starvation (Tomsic et al. 2008). An RNA probe complementary to the 5’ UTR of each S box transcript was used to detect both the terminated and readthrough transcripts. As the primary sequence of S box leader RNAs is highly conserved, RNA probes that hybridize in the coding region of the S box gene were employed to ensure specific detection of the readthrough transcript. These experiments showed that for most S box genes (with the exception of cysH and metK) the terminated transcript is the major product during growth in the presence of high methionine, while starvation for methionine results in a decrease in the amount of terminated transcript and an increase in the readthrough transcript (Tomsic et al. 2008). The major product at the start of methionine starvation for both metE and yusC is the terminated product. However, a 72-fold increase in readthrough transcript is observed for metE, while yusC exhibits a 4.0-fold increase in the readthrough product (Tomsic et al. 2008). These results corroborate the in vivo expression data and are consistent with the physiological roles of these two S box genes (Tomsic et al. 2008).

Variability is also seen in the in vitro SAM-dependent transcription termination efficiency for the S box transcriptional units (except metK) (Tomsic et al. 2008). This analysis showed that genes closely associated with methionine biosynthesis (metE, yitJ, yjcI and ykrT) exhibit the highest sensitivity to SAM. These genes require very low concentrations (in the µM range) of SAM to achieve a half-maximal termination response. It was observed that these biosynthetic genes generally correspond to the genes that are most tightly repressed during growth in the presence of methionine (when SAM pools are high). These genes also exhibit similar termination efficiencies in the absence of SAM (10-15%) and similar maximal efficiencies of termination, which is consistent with tight regulation in vivo (Tomsic et al. 2008). The yusC gene requires a high concentration of SAM (15 µM) to reach a half-maximal response in vitro and shows high termination (~50%) in the absence of SAM (Tomsic et al. 2008). The high termination of yusC in the absence of SAM suggests that the antiterminator does not effectively compete with the terminator, thereby resulting in low expression even under inducing conditions. These results are consistent with yusC expression in vivo. Overall, the S box genes exhibit a 100-fold range in sensitivity to SAM in vitro. Together, these results show that genes that are tightly repressed in vivo during growth under conditions with high SAM pools are also highly sensitive to SAM during transcription in vitro (Tomsic et al. 2008).

A 250-fold range in affinity for SAM has been observed for the S box transcripts (Tomsic et al. 2008). The apparent Kd values of metE, yitJ and leader RNAs of other methionine biosynthetic genes are in the range of 14 to 25 nM, while the yusC leader RNA has the weakest affinity for SAM with an apparent Kd value of 3.5 µM (Tomsic et al. 2008). These experiments showed that the binding affinity of the leader RNAs for SAM correlates with the sensitivity to SAM in vitro (Tomsic et al. 2008). Genes with a lower apparent Kd value respond to low concentrations of SAM in the in vitro transcription termination assay. Genes with the greatest sensitivity to SAM in vitro also exhibited the highest induction ratio in vivo (Tomsic et al. 2008). Based on these results, it was proposed that the affinity of the SAM binding domain for SAM is a critical factor for regulation, and that other parameters, such as the mutually exclusive competing elements and the efficiency of termination, play a major role in the calibration of the system (Tomsic et al. 2008).
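The 250-fold range quoted above follows directly from the two extreme apparent Kd values once they are expressed in the same unit. The short Python sketch below only illustrates that arithmetic; the function name and unit constants are our own and are not part of the cited analysis.

```python
# Unit conversion factors to molar (illustrative helper constants).
NANOMOLAR = 1e-9
MICROMOLAR = 1e-6


def fold_range(weakest_kd_molar: float, tightest_kd_molar: float) -> float:
    """Return how many times larger the weakest Kd is than the tightest Kd."""
    if weakest_kd_molar <= 0 or tightest_kd_molar <= 0:
        raise ValueError("Kd values must be positive")
    return weakest_kd_molar / tightest_kd_molar


# Apparent Kd values quoted in the text (Tomsic et al. 2008).
tightest = 14 * NANOMOLAR    # lower end of the 14 to 25 nM range
weakest = 3.5 * MICROMOLAR   # yusC leader RNA

print(round(fold_range(weakest, tightest)))
```

Converting both values to molar before dividing avoids the common mistake of comparing a nanomolar value with a micromolar value directly.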

Based on the information gleaned from the above studies, we investigated the RNA elements that are responsible for S box riboswitch variability. The first part of this chapter focuses on in vivo and in vitro analyses of hybrid leader RNAs, generated with metE and yusC sequences. These leader RNAs served as good candidates for this study as they exhibit different expression profiles in response to limiting SAM levels. metE shows the greatest increase in expression when compared to other S box genes and exhibits high affinity for SAM, while yusC is less sensitive to changes in SAM in vivo and exhibits low affinity for SAM (Grundy and Henkin 1998, Hullo et al. 2004, Tomsic et al. 2008).

We also investigated a possible promoter effect on the efficiency of S box gene transcription. For this study, we chose the metK and yusC leader RNAs. Expression of metK-lacZ was induced when SAM pools were low and methionine levels were high (Chapter 2). High levels of SAM failed to promote transcription termination of wild-type metK in vitro. Phylogenetic analysis revealed highly conserved sequences upstream (US box) and downstream (DS box) of the metK S box riboswitch element (Chapter 2). The metK US box is located at the transcriptional start-site of the metK leader RNA. The majority of sequence changes within the US box region resulted in a loss of response to changing SAM pools in vivo. Alteration of the US box sequence also disrupted the predicted US-DS pairing interaction and reduced the transcript stability in vivo (Chapter 2). The in vitro analysis of the metK US box mutants revealed variation in the total amount of transcript generated relative to the wild-type metK RNA (Chapter 2). Based on these results and the US box mutagenesis (Chapter 2), we hypothesized that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced RNAP processivity in vitro. The effects of the metK promoter and US box sequence on expression of the yusC S box leader RNA were tested in vivo and in vitro. The results obtained from these studies are discussed in the second half of this chapter.

4.2 Materials and methods

4.2.1 Bacterial strains

The B. subtilis strains used in this study were BR151 (lys-3 metB10 trpC2); ZB307A (SPβc2del2::Tn917::pSK10∆6) (Zuber and Losick 1987); ZB449 (trpC2 pheA1 abrB703 SPβ-cured) (Nakano and Zuber 1989). B. subtilis strains were grown on tryptose blood agar base medium (TBAB; Difco, Franklin Lakes, NJ), Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) and 2XYT broth (Miller 1972). Chloramphenicol was added at a concentration of 5 μg/ml. X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies, St. Louis, MO) was used at 40 μg/ml as an indicator of β-galactosidase activity. All growth was at 37 °C.

4.2.2 Genetic techniques

Transformation of B. subtilis was carried out as described previously (Henkin et al. 1990). Chromosomal DNA was prepared using the DNeasy tissue kit (Qiagen, Chatsworth, CA). Wizard columns (Promega, Madison, WI) were used for plasmid preparations. Oligonucleotide primers were purchased from Integrated DNA Technologies (Coralville, IA). Restriction endonucleases and DNA-modifying enzymes were purchased from New England Biolabs (Beverly, MA) and used as described by the manufacturer. Mutations were identified by DNA sequencing (Genewiz Inc., North Brunswick, NJ). Transcriptional fusions were generated in plasmid pFG328 (Grundy et al. 1993), which contains a cat gene conferring resistance to chloramphenicol. The lacZ fusion constructs were introduced in single copy into the B. subtilis chromosome by recombination into the SPβ prophage carried in strain ZB307A and purified by passage of the phage through strain ZB449 (Nakano and Zuber 1989, Zuber and Losick 1987). The phage carrying the fusion was then introduced into the strain BR151. Strains containing lacZ fusions were grown in the presence of chloramphenicol.

4.2.3 Construction of hybrid leader RNAs

4.2.3.1 metE and yusC hybrid leader RNAs

The metE and yusC leader RNA hybrids were constructed such that the binding domain of one leader RNA was fused upstream of the terminator domain of the second leader RNA. For example, the metE-yusC leader RNA hybrid contained the binding domain of the metE leader RNA and the terminator domain of the yusC leader RNA. Likewise, the yusC-metE leader RNA hybrid construct contained the binding domain of the yusC leader RNA and the terminator domain of the metE leader RNA. The metE and yusC leader RNAs share a sequence (5’-AGAUGAGAGA-3’) between the J4/1 region and the 3’ side of helix 1 (anti-antiterminator) (Figure 4.2, sequence in green box). This common sequence was used to fuse the binding domain of the first leader RNA upstream of the terminator domain of the second leader RNA. The 3’ side of the antiterminator (Figure 4.2, red sequence) is complementary to the 5’ side of the terminator (Figure 4.2, blue sequence), and these sequences form the alternate antiterminator structure. For sequences that form such alternate structures, alteration in one part of the region of the RNA usually requires a compensatory change in the complementary region of the RNA to maintain base-pairing. However, as the common sequence between metE and yusC is located on the 3’ side of the anti-antiterminator, corresponding sequence changes in the terminator or antiterminator sequences were not required. Complementary pairs of DNA oligonucleotides (Table 4.1) were employed to generate the hybrid sequence for each construct. Template end-points for each pair of wild-type and hybrid constructs were identical. Templates for in vitro transcription by B. subtilis RNAP were generated by PCR (Pfu DNA Polymerase, Stratagene, La Jolla, CA) using oligonucleotide primers that contained the glyQS promoter sequence (McDaniel et al. 2003). The B. subtilis glyQS promoter is a strong, constitutive promoter that allows efficient transcription in vitro. Use of this promoter does not affect the response to SAM (McDaniel et al. 2003) and results in consistent transcription initiation for each construct.

Table 4.1 DNA oligonucleotides for metE and yusC leader RNA constructs

Oligonucleotide   Sequence (5’→3’) a, b, c, d
metE-yusC         CTAATTCCATCAGATTGTGTCTGAGAGATGAGAGAAAGGC
metE-yusC RC      GCCTTTCTCTCATCTCTCAGACACAATCTGATGGAATTAG
yusC-metE         CCAATTCACACGAAGCGTTCAGCTTTGAAAGATGAGAGAGGCAGTG
yusC-metE RC      CACTGCCTCTCTCATCTTTCAAAGCTGAACGCTTCGTGTGAATTGG
metE DSXba        ATTAATTTCTAGATGTAAAACACTCTCTTTC
yusC DSXba        ATTAATTTCTAGAAATCACCTGCCTTAACAC
metEgly US        GGTCCATCTTTTTATATGATCATTTACAAAAAATTAATAACATTT
yusCgly US        GGTCCATCTTTTTATATGATCATTTACTATATATTTCTCTTATCA
glyQ USKpnBam     TAAGGATCCGGTACCACGAAGAATATTCGGGATTGTA

a Sequences in bold indicate the common sequence between metE and yusC
b Underlined sequences indicate the transition into the second leader RNA
c RC indicates reverse complement
d US and DS indicate upstream and downstream, respectively
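Because each hybrid was built from a complementary oligonucleotide pair, the pairs in Table 4.1 can be checked computationally. The following Python sketch is an illustration only and was not part of the original workflow; the function and variable names are our own. It confirms that each RC oligonucleotide is the reverse complement of its partner and that both forward oligonucleotides contain the DNA form of the common sequence (5’-AGATGAGAGA-3’).

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Forward and RC oligonucleotides copied from Table 4.1.
PAIRS = {
    "metE-yusC": (
        "CTAATTCCATCAGATTGTGTCTGAGAGATGAGAGAAAGGC",
        "GCCTTTCTCTCATCTCTCAGACACAATCTGATGGAATTAG",
    ),
    "yusC-metE": (
        "CCAATTCACACGAAGCGTTCAGCTTTGAAAGATGAGAGAGGCAGTG",
        "CACTGCCTCTCTCATCTTTCAAAGCTGAACGCTTCGTGTGAATTGG",
    ),
}

# DNA form of the shared RNA sequence 5'-AGAUGAGAGA-3'.
COMMON_DNA = "AGATGAGAGA"

for name, (forward, rc) in PAIRS.items():
    is_rc = reverse_complement(forward) == rc
    has_common = COMMON_DNA in forward
    print(f"{name}: reverse complement matches = {is_rc}, "
          f"common sequence present = {has_common}")
```

The same check can be applied to any other complementary pair before ordering primers.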

[Figure 4.2A: predicted secondary structure of the wild-type yusC leader RNA; structure drawing not reproduced. Labels: SAM-binding domain, AT, AAT, T.]

[Figure 4.2B: predicted secondary structure of the wild-type metE leader RNA; structure drawing not reproduced. Labels: SAM-binding domain, AT, AAT, T.]

Figure 4.2 Predicted secondary structures of the yusC and metE leader RNAs. A. Wild-type yusC leader RNA. B. Wild-type metE leader RNA. Both RNAs are shown in the terminator conformation. Red and blue residues indicate the alternate pairing for formation of the antiterminator, shown above the terminator. T, terminator; AT, antiterminator; AAT, anti-antiterminator. The sequence highlighted by the green box served as the common sequence for the leader RNA hybrid constructs.

4.2.3.2 metK and yusC hybrid constructs

The metK and yusC hybrid leader RNAs were constructed such that the native yusC promoter sequence was replaced by the metK promoter sequence and the sequence upstream of the yusC helix 1 region was replaced by the metK US box sequence. Complementary pairs of DNA oligonucleotides (Table 4.2) were employed to generate the fusion sequence for each hybrid construct. Templates for in vitro transcription were generated as described above.

Table 4.2 DNA oligonucleotides for metK and yusC leader RNA constructs

Oligonucleotide   Sequence (5’→3’) a, b
metK-yusC 1 FD    GATATTTCATTGAGCGGATATTTCTCTTATCAAGAGAGG
metK-yusC 1 RC    CCTCTCTTGATAAGAGAAATATCCGCTCAATGAAATATC
metK-yusC 2 FD    GATATTTCATTGAGCGGATACTCTTATCAAGAGAGG
metK-yusC 2 RC    CCTCTCTTGATAAGAGTATCCGCTCAATGAAATATC
metK-yusC 3 FD    GATAAGATATTTCATTGTTTCTCTTATCAAGAGAGG
metK-yusC 3 RC    CCTCTCTTGATAAGAGAAACAATGAAATATCTTATC
metK-yusC 4 FD    GATATTTCATTGAGATTATATTTCTCTTATCAAGAGAGG
metK-yusC 4 RC    CCTCTCTTGATAAGAGAAATATAATCTCAATGAAATATC
metK-yusC 5 FD    GATATTTCATTGAGATTATACTCTTATCAAGAGAGG
metK-yusC 5 RC    CCTCTCTTGATAAGAGTATAATCTCAATGAAATATC
metK-yusC 6 FD    GATAAGATATTTCATTGAGTTTCTCTTATCAAGAGAGG
metK-yusC 6 RC    CCTCTCTTGATAAGAGAAACACAATGAAATATCTTATC
metK-yusC 7 FD    GATAAGATATTTCATTGAGTCTCTTATCAAGAGAGG
metK-yusC 7 RC    CCTCTCTTGATAAGAGACACAATGAAATATCTTATC
metK-yusC 8 FD    CGATAAGATATTTCATTGAGCGGTTTCTCTTATCAAGAGAGGT
metK-yusC 8 RC    ACCTCTCTTGATAAGAGAAACCGCTCAATGAAATATCTTATCG

a Underlined sequences denote the metK US1 mutation (C4G5G6→A4U5U6)
b FD, forward; RC, reverse complement

4.2.4 β-Galactosidase measurements

Strains containing lacZ fusions were grown in Spizizen minimal medium containing the required amino acids at a concentration of 50 μg/ml until early exponential phase and then harvested by centrifugation. Cells were resuspended in fresh Spizizen minimal medium in the presence or absence of methionine. Samples were collected at 1-h intervals and assayed for β-galactosidase as described previously (Miller 1972) using toluene permeabilization. All starvation experiments and assays were conducted at least twice, and variation was <10%.

4.2.5 In vitro transcription termination assay

4.2.5.1 Conditions for the metE and yusC wild-type and hybrid constructs

Templates for in vitro transcription by the B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the glyQS promoter sequence and hybridized within the leader region of the target gene, to generate a transcription start-site 16 nt upstream of the start of the metE helix 1 or 8 nt upstream of the start of the yusC helix 1. The promoter sequences were designed to allow initiation with a dinucleotide (ApC) corresponding to the +1/+2 positions of the transcript and a halt in transcription at position +29 for metE and +21 for yusC by omission of GTP (McDaniel et al. 2003). The PCR fragments were ~400 bp in length and included 67 and 63 bp downstream from the transcription terminator of metE and yusC, respectively, to allow resolution of terminated and readthrough products. PCR products were purified with a Qiagen cleanup kit and sequenced by Genewiz. Single-round transcription reactions were carried out as described in Chapter 2, section 2.2.5. Templates were transcribed in the presence or absence of SAM (as indicated). Transcription products were resolved by denaturing PAGE and visualized by PhosphorImager (Molecular Dynamics) analysis. Percent termination was plotted as a function of ligand concentration (GraphPad Prism). The reactions were performed in duplicate and reproducibility was ±5%.
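The SAM concentration giving a half-maximal termination response can be read off such a titration. The Python sketch below is a simplified illustration that uses linear interpolation between measured points; it is not the curve fitting performed in GraphPad Prism, and the function name and titration values are hypothetical. It assumes the first data point was measured without ligand and that the points are ordered by increasing concentration.

```python
def half_maximal_concentration(concs, pct_term):
    """Estimate the ligand concentration at the half-maximal response.

    Assumes concs is sorted in increasing order and that the first point
    was measured without ligand (baseline). Uses linear interpolation
    between the two data points that bracket the half-maximal level.
    """
    if len(concs) != len(pct_term) or len(concs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    baseline = pct_term[0]          # termination in the absence of ligand
    plateau = max(pct_term)         # highest termination observed
    target = baseline + (plateau - baseline) / 2
    points = list(zip(concs, pct_term))
    for (c0, t0), (c1, t1) in zip(points, points[1:]):
        if t0 <= target <= t1 and t1 != t0:
            return c0 + (target - t0) * (c1 - c0) / (t1 - t0)
    raise ValueError("response never crosses the half-maximal level")


# Hypothetical titration (µM SAM, percent termination); not measured data.
sam_um = [0.0, 0.1, 0.5, 1.0, 5.0]
termination = [10.0, 20.0, 40.0, 70.0, 90.0]

print(half_maximal_concentration(sam_um, termination))
```

Interpolation gives only a rough estimate when the titration points are widely spaced; a nonlinear fit of the full curve is preferable for reported values.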

4.2.5.2 Conditions for the metK-yusC hybrid leader RNA constructs

Templates for in vitro transcription by the B. subtilis RNAP were generated by PCR using oligonucleotide primers that contained the metK promoter sequence with or without the metK US box sequence and fused to the helix 1 or the J1/2 region of the yusC leader RNA. The metK transcription start-site was located 1-17 nt upstream of the start of the yusC helix 1, depending on the construct (refer to Table 4.5 for sequence details). The PCR fragments were ~400 bp in length and included 63 bp downstream from the yusC transcription terminator, to allow resolution of terminated and readthrough products. PCR products were purified and sequenced as described above. Multiple-round transcription reactions were carried out as described in Chapter 2, section 2.2.6. Templates were transcribed in the presence or absence of SAM (as indicated). Transcription products were analyzed and percent termination was determined as described above. The reactions were performed in duplicate and reproducibility was ±5%.

4.3 Results

4.3.1 Repression of S box gene expression in the presence of methionine is dependent on the SAM-binding domain of the S box leader RNA

The wild-type and hybrid constructs were tested using in vivo β-galactosidase expression assays during methionine starvation. Cells were grown in minimal medium in the presence of methionine until mid-exponential phase, after which the cells were harvested and resuspended in the presence or absence of methionine. The wild-type metE-lacZ fusion exhibited high β-galactosidase activity in the absence of methionine compared to the other three constructs (Figure 4.3, open squares), and very low expression in the presence of methionine (Figure 4.3, filled squares). The ratio of expression during growth in the absence of methionine to that in the presence of methionine is termed the repression ratio. metE exhibited a high repression ratio of 590, which suggests that the metE gene is tightly regulated in response to SAM (Table 4.3). The metE repression ratio is consistent with the functional role of metE in methionine biosynthesis.

The wild-type yusC-lacZ fusion exhibited ~2.3-fold lower β-galactosidase activity after 4 h of growth in the absence of methionine compared to wild-type metE-lacZ under the same conditions (Figure 4.3, open inverted triangles), while wild-type yusC-lacZ failed to show complete repression of β-galactosidase activity after 4 h of growth in the presence of methionine (Figure 4.3, filled inverted triangles). The repression ratio of wild-type yusC was 31, ~20-fold lower than the repression ratio of wild-type metE (Table 4.3). A low repression ratio indicates that the gene is less tightly regulated in response to SAM, which is consistent with the functional role of yusC as a methionine transport gene. These results were in agreement with previously reported data (Tomsic et al. 2008).
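The repression ratio defined above is a simple quotient of two β-galactosidase activities. The Python sketch below illustrates the calculation and the fold comparison between constructs; the function name is our own, the Miller unit values in the first call are hypothetical placeholders chosen only to show the arithmetic, and the repression ratios in the second comparison are the values quoted in the text.

```python
def repression_ratio(mu_minus_met: float, mu_plus_met: float) -> float:
    """Ratio of expression without methionine to expression with methionine."""
    if mu_plus_met <= 0:
        raise ValueError("activity in the presence of methionine must be positive")
    return mu_minus_met / mu_plus_met


# Hypothetical Miller unit values (not taken from Table 4.3).
example_ratio = repression_ratio(mu_minus_met=118.0, mu_plus_met=0.2)
print(f"example repression ratio: {example_ratio:.0f}")

# Repression ratios quoted in the text for the wild-type constructs.
ratio_metE = 590.0
ratio_yusC = 31.0
fold_difference = ratio_metE / ratio_yusC
print(f"metE is repressed {fold_difference:.0f}-fold more tightly than yusC")
```

With the quoted values the quotient is about 19, which the text rounds to ~20-fold.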

Figure 4.3 In vivo expression assay of the wild-type and hybrid leader RNA constructs during methionine starvation. S box leader gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 4 h after resuspension. Squares, wild-type metE-lacZ; inverted triangles, wild-type yusC-lacZ; diamonds, metE-yusC-lacZ; circles, yusC-metE-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

Table 4.3 In vivo expression analysis of the wild-type and hybrid constructs.

a β-Galactosidase activity is expressed as Miller Units (MU) and is reported for samples taken after 4 h of growth in the presence of methionine (+Met) or absence of methionine (-Met). Values are reported as the means ± the standard deviations for two assays. *, value is too low to measure with high accuracy.
b Repression ratio indicates the ratio of values in the absence of methionine to the values in the presence of methionine after 4 h of growth.

The metE-yusC hybrid leader construct consists of the SAM binding domain of the metE leader RNA and the terminator domain of the yusC leader RNA. Expression of the metE-yusC-lacZ hybrid construct was 4.5-fold lower compared to wild-type metE-lacZ after 4 h of growth in the absence of methionine, and expression of the hybrid construct was ~2.0-fold lower than that of wild-type yusC-lacZ under the same conditions (Figure 4.3, open diamonds). Expression of metE-yusC-lacZ was repressed completely in the presence of methionine, similar to wild-type metE-lacZ (Figure 4.3, filled diamonds). Also, the metE-yusC-lacZ hybrid showed a repression ratio of ~220, which was ~2.7-fold lower than wild-type metE-lacZ and 7.0-fold higher than wild-type yusC-lacZ (Table 4.3). These results suggest that sensitivity of the metE-yusC RNA to the changing concentration of SAM in vivo is a function of the metE binding domain. It is possible that expression of the metE-yusC hybrid is relatively low due to the ineffective competition between the yusC antiterminator and terminator domains, which is consistent with the wild-type yusC expression.

The yusC-metE-lacZ hybrid construct exhibited a delayed induction. β-Galactosidase activity of this hybrid construct was only 1.3-fold lower compared to that of wild-type yusC-lacZ after 4 h of growth in the absence of methionine (Figure 4.3, open circles). In the presence of methionine, expression was not completely repressed (Figure 4.3, filled circles). These results indicate that the expression profile for the yusC-metE-lacZ construct is very similar to that of wild-type yusC-lacZ. The yusC-metE-lacZ construct exhibited a repression ratio that was 1.6-fold lower than that of wild-type yusC-lacZ, suggesting that this hybrid construct is also less tightly regulated, similar to wild-type yusC. Again, these results suggest that the binding domain plays a critical role in sensing SAM. Taken together, the data indicate that the binding domain contributes to the degree of repression in the presence of methionine (when SAM pools are high), while the terminator/antiterminator domain dictates the level of expression.

4.3.2 The termination efficiency of S box genes is dictated by the terminator/antiterminator domains


In vitro transcription termination assays were employed to test the efficiency of termination for each wild-type and hybrid construct in response to varying SAM concentrations. A representative image of a polyacrylamide gel and the corresponding calculation of the percent termination are shown (Figure 4.4).


Figure 4.4 SAM-dependent transcription termination of an S box riboswitch. A. In vitro transcription of a representative B. subtilis S box template, yitJ, in the presence or absence of SAM. SAM was added at final concentrations ranging from 0 to 2.4 µM. Terminated (T, filled circles) and readthrough (RT, open circles) RNAs are labeled. B. Quantitation of the SAM response. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products.
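The definition of termination efficiency given in this legend can be written as a one-line calculation. The Python sketch below is illustrative only; the function name is our own and the band intensities are hypothetical values, not PhosphorImager measurements from this study.

```python
def percent_termination(terminated: float, readthrough: float) -> float:
    """Terminated product as a percentage of terminated plus readthrough."""
    if terminated < 0 or readthrough < 0:
        raise ValueError("band intensities cannot be negative")
    total = terminated + readthrough
    if total == 0:
        raise ValueError("no transcript signal detected")
    return 100.0 * terminated / total


# Hypothetical band intensities (arbitrary units) for two lanes.
without_sam = percent_termination(terminated=30.0, readthrough=70.0)
with_sam = percent_termination(terminated=80.0, readthrough=20.0)

print(f"-SAM: {without_sam:.1f}%  +SAM: {with_sam:.1f}%")
```

In practice, band intensities are often corrected for background before this ratio is taken; that step is omitted here.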


Analysis of the binding domains versus the terminator/antiterminator domains for a wild-type and hybrid construct pair was performed using in vitro transcription termination assays. In Figure 4.5, each wild-type and hybrid pair has the same binding domain. Wild-type yusC and the yusC-metE hybrid both needed a high concentration of SAM (150 µM) to achieve maximum termination of ~80%. Wild-type yusC exhibited a high percent termination of ~40% in the absence of SAM (Figure 4.5A, filled circles). This suggests that the antiterminator of the yusC leader RNA cannot effectively compete with the terminator. However, by changing the yusC terminator domain to the metE terminator domain, the yusC-metE hybrid construct exhibited a termination efficiency of ~15% in the absence of SAM (Figure 4.5A, open circles). The ~2.5-fold lowered percent termination compared to wild-type yusC indicated that the metE terminator domain contributes significantly to the reduced termination efficiency in the absence of SAM. The downshift in the curve of the hybrid leader construct compared to the curve of the wild-type construct illustrates the effect of the terminator domain.

Wild-type metE exhibited ~80% termination in the presence of ~1 µM SAM (Figure 4.5B, filled squares). The metE-yusC hybrid exhibited a higher percent termination (98%) at the same concentration of SAM (Figure 4.5B, open squares), indicating that the yusC terminator domain contributes to the overall increased termination compared to wild-type metE. However, as both constructs have the same binding domain, high termination in the presence of a low concentration of SAM indicates the sensitivity of the metE binding domain. In the absence of SAM, wild-type metE exhibited a low percent termination of 7.0%, consistent with the physiological role of metE (Figure 4.5B, filled squares). By changing the metE terminator domain to the yusC terminator domain, we observed an increase in termination efficiency to 66% in the absence of SAM (Figure 4.5B, open squares). The change in terminator domains resulted in a ~10-fold increase in termination efficiency in the absence of SAM compared to wild-type metE, which is seen as an upshift in the hybrid curve compared to the wild-type curve. Overall, these data indicate that the terminator/antiterminator domains contribute to the efficiency of termination in response to SAM, while the sensitivity to SAM is maintained by the binding domain.
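The fold changes quoted for the terminator-domain swaps follow from the reported termination efficiencies in the absence of SAM. A short Python check is sketched below; it uses only values stated in the text, and the helper name is hypothetical.

```python
def fold_change(value: float, reference: float) -> float:
    """Ratio of a value to its reference; >1 is an increase, <1 a decrease."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


# Percent termination in the absence of SAM, as reported in the text.
yusC_wt, yusC_metE = 40.0, 15.0
metE_wt, metE_yusC = 7.0, 66.0

# Swapping in the metE terminator domain lowers termination relative to yusC.
print(round(fold_change(yusC_wt, yusC_metE), 1))  # 2.7 (reported as ~2.5-fold lower)
# Swapping in the yusC terminator domain raises termination relative to metE.
print(round(fold_change(metE_yusC, metE_wt), 1))  # 9.4 (reported as ~10-fold)
```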

Figure 4.5 Direct comparison of the terminator/antiterminator competition. In vitro transcription termination analyses for the wild-type and hybrid constructs. A. Wild-type yusC, filled circles; yusC-metE hybrid, open circles. B. Wild-type metE, filled squares; metE-yusC hybrid, open squares. Each wild-type and hybrid pair has the same binding domain but different terminator domains. Titration of the leader RNAs was performed in the presence of increasing concentrations of SAM (X-axis). Termination efficiency (Y-axis) is the amount of the terminated product relative to the sum of the terminated and readthrough products.

The effect of the binding domain on termination efficiency was analyzed by comparing the wild-type and hybrid leader RNA pairs in which the terminator domains were constant (Figure 4.6). Wild-type yusC exhibited a termination efficiency of ~40% in the absence of SAM (Figure 4.6A, filled circles). A change in the binding domain from yusC to metE resulted in higher termination (~60%) by the metE-yusC hybrid compared to wild-type yusC under the same conditions (Figure 4.6A, open squares). The key observation was that the metE-yusC hybrid approached ~100% termination in the presence of low SAM (~1 µM), while wild-type yusC required at least 150 µM SAM to reach ~90% termination. The higher termination of the metE-yusC hybrid in the absence of SAM may be due to the ineffective competition of the yusC antiterminator domain with the terminator domain, while the increased termination efficiency in the presence of lower concentrations of SAM is attributed to the high sensitivity of the metE binding domain.

Wild-type metE exhibited low termination in the absence of SAM (~7%) and high termination (~100%) in the presence of very low concentrations of SAM (~1 µM) (Figure 4.6B, filled squares). A change in the binding domain from metE to yusC resulted in a modest 2-fold increase in termination (~14%) by the yusC-metE hybrid in the absence of SAM (Figure 4.6B, open circles). However, yusC-metE reached a termination efficiency of ~80% only in the presence of a much higher concentration of SAM (150 µM). These results suggest that a change in the binding domain resulted in a loss of SAM sensitivity, yielding a response similar to that of wild-type yusC, while the presence of the metE terminator resulted in low termination efficiency in the absence of SAM. Together, the in vitro data suggest that although the termination efficiency is dictated by the terminator/antiterminator domains, the sensitivity to SAM is a function of the binding domain. Table 4.4 summarizes the in vitro analyses for the wild-type and hybrid leader RNAs.
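One way to picture this division of labor is a simple saturation curve in which the terminator/antiterminator domain sets the termination level in the absence of SAM, while the binding domain sets the SAM concentration needed for a half-maximal response. The Python sketch below is an illustrative assumption and not a model fitted in this study: the hyperbolic form, the function name, and the half-maximal concentrations (`k_half`) are hypothetical, and only the approximate floor values are taken from the text.

```python
def termination_curve(sam_uM: float, floor: float, k_half: float) -> float:
    """Percent termination at a given SAM concentration (hyperbolic sketch).

    floor:  percent termination without SAM (attributed in the text to the
            terminator/antiterminator domain).
    k_half: SAM concentration (uM) giving a half-maximal response; a
            hypothetical stand-in for binding-domain sensitivity.
    """
    if sam_uM < 0 or k_half <= 0:
        raise ValueError("SAM must be non-negative and k_half positive")
    return floor + (100.0 - floor) * sam_uM / (sam_uM + k_half)


# Floors approximate values reported in the text; k_half values are invented.
METE_K_HALF = 0.05   # uM, hypothetical high-sensitivity binding domain
YUSC_K_HALF = 50.0   # uM, hypothetical low-sensitivity binding domain

constructs = {
    "metE (wt)": dict(floor=7.0, k_half=METE_K_HALF),
    "metE-yusC": dict(floor=60.0, k_half=METE_K_HALF),
    "yusC (wt)": dict(floor=40.0, k_half=YUSC_K_HALF),
    "yusC-metE": dict(floor=14.0, k_half=YUSC_K_HALF),
}

for name, params in constructs.items():
    values = [round(termination_curve(s, **params), 1) for s in (0.0, 1.0, 150.0)]
    print(name, values)  # percent termination at 0, 1 and 150 uM SAM
```

In this sketch, changing `floor` shifts the whole curve up or down, whereas changing `k_half` moves the concentration at which the curve rises, which mirrors the qualitative distinction drawn between the two domains.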

Figure 4.6 Direct comparisons of the binding domains. In vitro transcription termination analyses for the wild-type and hybrid constructs. A. Wild-type yusC, filled circles; metE-yusC hybrid, open squares. B. Wild-type metE, filled squares; yusC-metE hybrid, open circles. Each wild-type and hybrid pair has different binding domains but the same terminator domain. Titration of the leader RNAs was performed in the presence of increasing concentrations of SAM (X-axis). Termination efficiency (Y-axis) is the amount of the terminated product relative to the sum of the terminated and readthrough products.

Table 4.4 In vitro analysis of the wild-type and hybrid metE and yusC leader RNAs

a Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. Values are indicated as means ± the standard deviations for two assays.

b Termination efficiency in the absence of SAM.

c Termination efficiency in the presence of 1 µM SAM.
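The mean ± standard deviation summary described in the table footnote can be sketched in Python as follows. The replicate values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) for replicate measurements."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    return mean(replicates), stdev(replicates)


# Hypothetical percent-termination values from two independent assays.
m, sd = summarize([38.0, 42.0])
print(f"{m:.1f} ± {sd:.1f}")  # 40.0 ± 2.8
```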

4.3.3 The length of the metK upstream (US) box sequence affects expression of the metK-yusC hybrid leader RNAs

The metK US box sequence was fused upstream of the helix 1 region of the yusC leader RNA. The yusC promoter and transcriptional start-site (+1) were replaced by the metK promoter and transcriptional start-site. A total of 8 metK-yusC-lacZ fusion constructs (my1-my8, Table 4.5) were analyzed using in vivo expression assays and compared to the wild-type metK-lacZ and yusC-lacZ fusion constructs (Figures 4.7-4.9).

In agreement with previous results, expression of the wild-type metK-lacZ fusion was not repressed in minimal medium during growth in the presence of methionine (when SAM pools are high) (Figure 4.7, filled squares), and did not increase during methionine starvation (when SAM pools are low) (Figure 4.7, open squares; refer to section 2.3.1, Chapter 2). Expression of the wild-type yusC-lacZ fusion construct was induced during growth in the absence of methionine (Figure 4.7, open triangles), and was relatively high during growth in the presence of methionine (Figure 4.7, filled triangles). These results indicate that yusC is relatively less tightly repressed in the presence of SAM.

Figure 4.7 In vivo expression analysis of the wild-type metK and yusC leader RNAs. S box gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Squares, metK-lacZ; circles, yusC-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

Table 4.5 Sequences of the metK and yusC hybrid leader RNA constructs a, b, c

metK-yusC 1 (my1; with partial US box) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcgga tatttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgtt

metK-yusC 2 (my2; with longer US box) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcgga tactcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

metK-yusC 3 (my3; without US box) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgtttctc ttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatggt

metK-yusC 4 (my4; my1 with US1) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagatta tatttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgtt

metK-yusC 5 (my5; my2 with US1) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagatta tactcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

metK-yusC 6 (my6; my3 with ‘AG’ upstream of TTT at start of yusC helix 1) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagtttc tcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatg

metK-yusC 7 (my7; my3 with ‘AG’ upstream of T at start of yusC helix 1) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagtctc ttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaaatggt

metK-yusC 8 (my8; my6 + CGG) +1 taggtgcctatttcttctgaatcatattgacattgcaaacccttttacgataagatatttcattgagcggt ttctcttatcaagagaggtggagggaagtgccctatgaagcccggcaaccatcaacactgttgaa

a +1 indicates the metK transcriptional start-site, G

b underlined sequence indicates the metK US box region fused to the yusC sequence

c my, abbreviation for metK-yusC

The metK-yusC1 construct contained the first 9 residues of the metK US box sequence (including the metK +1) (Table 4.5, my1) and showed an expression profile similar to that of the wild-type metK-lacZ fusion during methionine starvation (comparing Figure 4.7, squares to Figure 4.8, triangles). β-Galactosidase activity for the metK-yusC1-lacZ fusion was ~2.0-fold higher than that of wild-type metK-lacZ after 5 h of growth in the presence and absence of methionine. The metK-yusC2 construct contained the first 17 residues of the metK US box sequence (Table 4.5, my2) and showed a ~1.2-fold increase in β-galactosidase activity during growth in the presence of methionine and a ~1.5-fold increase during growth in the absence of methionine, compared to metK-yusC1-lacZ (Figure 4.8, squares). Both hybrid constructs failed to exhibit an increase in expression during growth in the absence of methionine, consistent with the expression of wild-type metK-lacZ. These results indicate that including the metK US box sequence upstream of the yusC leader RNA and changing the native yusC promoter to the metK promoter sequence resulted in a loss of response to methionine starvation in vivo.

Figure 4.8 In vivo analyses of the metK-yusC hybrid leader RNA fusions. S box leader gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Triangles, metK-yusC1-lacZ; squares, metK-yusC2-lacZ; circles, metK-yusC3-lacZ; diamonds, metK-yusC4-lacZ; inverted triangles, metK-yusC5-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

The metK-yusC3 construct contained only the metK promoter sequence and the metK transcriptional start-site (G+1) upstream of the yusC leader RNA helix 1, while the metK US box sequence was omitted (Table 4.5, my3). The metK-yusC3-lacZ fusion exhibited very low β-galactosidase activity (1-2 Miller units) (Figure 4.8, circles). This result indicated that the metK US box sequence is important for expression in vivo, and that the metK promoter sequence, along with the metK transcriptional start-site, was not sufficient for metK-yusC3 expression in vivo.

The metK-yusC3 construct contained the +1 position fused to three consecutive uridines of the yusC helix 1 sequence. We speculated that this sequence caused the RNAP to dissociate from the template (due to an insufficient number of purines near the transcriptional start-site), resulting in no expression (Figure 4.8, circles). It is possible that the RNAP requires the metK A+2 and G+3 positions for proper transcription, in addition to the transcriptional start-site (G+1). We therefore generated hybrid constructs in which the first 3 positions of the metK US box sequence were fused upstream of either 3 or 1 uridines in the helix 1 region of yusC, generating the metK-yusC6 and metK-yusC7 constructs, respectively (Table 4.5). In vivo expression assays showed that adding the +2 and +3 positions (A and G) increased β-galactosidase activity by ~5.0-8.0-fold for the two hybrid constructs, relative to the metK-yusC3-lacZ fusion construct (Figure 4.9, inverted triangles and circles).
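The purine content near the transcriptional start-site can be tallied directly from the construct sequences. The Python sketch below counts purines in the first six transcribed positions. The ten-nucleotide strings were read from Table 4.5 under the assumption that +1 is the G immediately following the promoter sequence shared by all constructs; the window size and the function name are hypothetical choices for illustration.

```python
# First ten transcribed nucleotides (+1 to +10) of selected constructs,
# read from Table 4.5 assuming +1 is the G that follows the shared promoter.
INITIAL_TRANSCRIBED = {
    "my1": "gagcggatat",
    "my3": "gtttctctta",
    "my6": "gagtttctct",
    "my7": "gagtctctta",
    "my8": "gagcggtttc",
}

PURINES = set("ag")


def purine_count(seq: str, window: int = 6) -> int:
    """Count purines (A or G) in the first `window` positions of seq."""
    return sum(base in PURINES for base in seq.lower()[:window])


for name, seq in INITIAL_TRANSCRIBED.items():
    print(name, seq[:6], purine_count(seq))
```

Within this window, my3 has a single purine, my6 and my7 have three, and my1 and my8 have five. The count alone cannot separate a general purine requirement from a sequence-specific effect of the US box positions.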

Figure 4.9 In vivo expression of metK-yusC hybrid leader constructs. S box leader gene-lacZ fusions were integrated in single copy in strain BR151 (lys-3 metB10 trpC2). Cells were grown in Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) containing methionine and resuspended in fresh medium in the presence (filled symbols) or absence (open symbols) of methionine. Samples were taken at 1-h intervals until 5 h after resuspension. Inverted triangles, metK-yusC6-lacZ; circles, metK-yusC7-lacZ; diamonds, metK-yusC8-lacZ. MU, Miller units; -M, in the absence of methionine; +M, in the presence of methionine.

The metK-yusC8 construct was generated by adding 3 more nucleotides (to include the first 6 positions, 5’-GAGCGG-3’) of the metK US box sequence upstream of the 3 uridines of yusC. During growth in the absence of methionine, expression of metK-yusC8-lacZ increased ~20-fold compared to metK-yusC3-lacZ and ~5.0-fold compared to the metK-yusC6 and 7 fusion constructs (Figure 4.9, diamonds). These results suggest that the in vivo activity of the metK-yusC hybrid construct was restored by adding 3 additional metK US box positions. The metK phylogenetic analysis revealed that the invariant C4, G5 and G6 positions are part of a ‘conserved core’ region and are important for metK expression in vivo (see 2.3.3, Chapter 2). It is interesting to note that the expression of the metK-yusC8-lacZ construct in the absence of methionine was 1.5-fold higher than in the presence of methionine after 5 h of growth (Figure 4.9, comparing open diamonds to filled diamonds). This result suggests that, in contrast to wild-type metK, the metK-yusC8 construct exhibits some level of regulation during methionine starvation.

Mutating the conserved core region within the US box sequence (US1 mutation C4G5G6→A4U5U6; see Chapter 2) resulted in very low expression of the metK-yusC4-lacZ (3-4 Miller units) and metK-yusC5-lacZ (0 Miller units) fusion constructs during growth in the presence and absence of methionine (Figure 4.8, diamonds and inverted triangles). We have previously predicted that these positions from the metK US box conserved core region play a crucial role in metK regulation, most likely at the level of transcript stability (Chapter 2). Overall, the metK-yusC results indicate the requirement for the US box sequence for efficient expression in vivo.

4.3.4 The metK US box sequence contributes to the termination efficiency of the metK-yusC hybrid constructs

In vitro transcription termination assays were performed to analyze the efficiency of termination at the yusC leader region terminator for each hybrid construct in response to SAM. Consistent with the results described in Chapter 2, wild-type metK failed to respond to SAM in vitro (section 2.3.9, Chapter 2). Percent termination did not increase upon addition of SAM (Figure 4.10, lanes 1 and 2, bands RT1 and T1). metK-yusC1 and metK-yusC2 exhibited high constitutive readthrough in the absence and presence of SAM (Figure 4.10, lanes 3, 4 and 5, 6, respectively; RT2, T2). Like wild-type metK, metK-yusC1 and metK-yusC2 failed to respond to changing levels of SAM in vitro and showed very low termination efficiency (1%). A strong pause band (~112 nt) was observed for the metK-yusC hybrid constructs 1 and 2; this band was not observed for the wild-type metK control. Previous studies of wild-type yusC also showed no evidence of such a pause band (data not shown).

Figure 4.10 In vitro transcription termination analysis. In vitro transcription termination assays of the wild-type metK and metK-yusC hybrid constructs in response to SAM. SAM was added at a final concentration of 2.5 mM. Termination efficiency is the amount of the terminated product relative to the sum of the terminated and readthrough products. RT1, readthrough band of wild-type metK; T1, terminated band of wild-type metK; RT2, readthrough band of the metK-yusC hybrid transcripts; T2, terminated band of the metK-yusC hybrid transcripts; P, pause band.

In contrast to constructs metK-yusC1 and 2, metK-yusC3 (which contained the G+1 position fused to three consecutive uridines of the yusC helix 1 sequence) failed to show any transcripts (Figure 4.10, lanes 7 and 8). Attempts to improve the transcription yield (by changing the ratio of the nucleotides added to the reaction) were unsuccessful. Similarly, the metK-yusC6 and 7 constructs, which contained two additional positions of the metK US box sequence (A+2 and G+3) relative to metK-yusC3 (Table 4.5), showed no transcription (Figure 4.10, lanes 13-16). Overall, these results suggest that a longer US box sequence is required for efficient transcription, as the first three nucleotides of the metK US box sequence were not sufficient.

The metK-yusC8 construct, which contained the first 6 positions of the metK US box sequence, showed a marginal improvement in the total amount of transcription compared to the metK-yusC6 and 7 constructs and also showed the reappearance of the pause band. Inconsistent with the in vivo results, metK-yusC8 exhibited constitutive low termination (2%) and failed to respond to the changing concentrations of SAM in vitro (Figure 4.10, lanes 17 and 18). metK-yusC8 showed a low yield of total transcription compared to wild-type metK and the hybrid constructs metK-yusC1 and 2.

The metK-yusC4 and metK-yusC5 constructs (containing the US1 mutation) also showed low transcription yields when compared to metK-yusC1 and 2. Similar to the current in vitro results, previous in vitro investigation of the metK US box mutants also revealed significant differences in transcription yields (see Chapter 2). Together, these results suggest that disrupting the conserved core region or including just the first 6 positions of the US box sequence results in reduced transcription efficiency. It was interesting to note that despite the low intensity of transcripts, the metK-yusC4 and 5 constructs exhibited high percent termination in the presence of SAM in vitro (Figure 4.10, comparing lanes 9, 10 and 11, 12, respectively), although this was inconsistent with the in vivo expression analysis for these constructs.

4.4 Discussion

An extensive in vivo and in vitro analysis previously revealed variability in S box gene expression and termination efficiency and correlated the physiological function of each S box gene to the observed variability (Tomsic et al. 2008). The goal of the current study was to investigate the factors that contribute to the differential regulation of S box genes in B. subtilis. In the first half of this chapter, we compared the effects of the binding and terminator domains on gene expression in vivo and termination efficiency in vitro. The leader RNAs (metE and yusC) were selected based on two criteria. They exhibit distinct regulatory profiles and they contain an identical sequence on the 3’ side of helix 1. This sequence provided a seamless transition for hybridizing the binding domain of one leader RNA with the terminator domain of the other RNA.

The data for the wild-type metE and yusC genes from this study are consistent with the previously reported expression profiles for these S box genes (Tomsic et al. 2008). The metE-lacZ fusion exhibited the greatest increase in β-galactosidase activity, compared to the other three constructs, during starvation conditions (Figure 4.3, open squares). The metE gene encodes methionine synthase (Grundy and Henkin 1998). Expression of metE is tightly repressed in the presence of methionine (when SAM pools are high), while expression increases rapidly when the cells are starved for methionine (when SAM pools are low) (Tomsic et al. 2008). These features are consistent with the role that metE plays in methionine biosynthesis. A high repression ratio indicates that metE is regulated tightly in response to changing concentrations of SAM.
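The comparison of expression under the two growth conditions can be expressed as a single ratio. The Python sketch below assumes that the repression ratio is the β-galactosidase activity in the absence of methionine divided by the activity in its presence; the function name and the Miller unit values are hypothetical and are not data from this study.

```python
def repression_ratio(mu_minus_met: float, mu_plus_met: float) -> float:
    """Ratio of reporter activity without methionine to activity with it.

    Both arguments are beta-galactosidase activities in Miller units.
    A larger ratio corresponds to tighter repression when SAM pools are high.
    """
    if mu_minus_met < 0 or mu_plus_met <= 0:
        raise ValueError("activities must be positive")
    return mu_minus_met / mu_plus_met


# Hypothetical Miller unit values, for illustration only.
print(repression_ratio(120.0, 2.0))   # 60.0, a tightly repressed fusion
print(repression_ratio(90.0, 30.0))   # 3.0, a weakly repressed fusion
```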

The wild-type yusC-lacZ fusion exhibited a delayed onset of expression. Expression of the yusC gene was less tightly repressed in the presence of methionine and showed a lower magnitude of induction in the absence of methionine. The yusC gene product functions in methionine transport (Grundy and Henkin 1998, Hullo et al. 2004). Expression of yusC is induced even when SAM pools are only slightly reduced, consistent with its physiological role. The consequence of this differential response is that the cell can respond to an initial reduction in SAM by first inducing expression of a transporter that would scavenge any available extracellular methionine (Tomsic et al. 2008). Only if this attempt to increase the cellular SAM pools fails does the cell induce expression of the energetically more expensive biosynthetic pathway (Tomsic et al. 2008).

The metE-yusC-lacZ hybrid construct did not show the rapid increase in expression exhibited by wild-type metE-lacZ. However, the repression ratio of the metE-yusC hybrid construct was only ~2.5-fold lower than that of wild-type metE. These results suggest that the binding domain shared by metE and metE-yusC contributes to the high sensitivity to SAM in vivo. Although expression of the hybrid construct was repressed in the presence of methionine (similar to that of wild-type metE), the β-galactosidase activity did not reach wild-type metE-lacZ levels under inducing conditions. The expression of the hybrid construct was also lower than that of wild-type yusC-lacZ. These results suggest that the overall low expression of the metE-yusC hybrid construct during growth in the absence of methionine is a function of the yusC terminator domain and is consistent with the yusC expression profile during inducing conditions.

The yusC-metE hybrid was not repressed completely during growth in the presence of methionine. The yusC-metE hybrid showed a delayed response to methionine starvation and the β-galactosidase activity was comparable to that of wild-type yusC. These data suggest that the yusC-metE hybrid, which showed a repression ratio similar to that of wild-type yusC, is less sensitive to changing levels of SAM and is therefore predicted to be less tightly regulated. Overall, the in vivo data suggest that the binding domain contributes to the degree of repression during growth in the presence of methionine and the terminator domain contributes to the level of expression during growth in the absence of methionine.

The effect of exchanging the binding and terminator/antiterminator domains on termination efficiency was also tested using in vitro transcription termination assays. The wild-type metE construct exhibited very low termination in the absence of SAM and rapidly reached ~100% termination at low concentrations of SAM (1 µM). This result, consistent with previous data, indicates that in the presence of SAM, the metE leader RNA terminator structure is highly stable and prevents formation of the alternate antiterminator structure. In the absence of SAM, the antiterminator effectively competes with the terminator, resulting in readthrough. The wild-type yusC construct showed high termination in the absence of SAM and required high concentrations of SAM (150 µM) to achieve 80% termination. This suggests that the antiterminator structure of the yusC leader RNA cannot effectively compete with the terminator structure, resulting in termination even in the absence of SAM.

The terminator/antiterminator competition was analyzed by comparing the wild-type and hybrid leader RNAs containing common binding domains. A shift of the hybrid curve above or below the wild-type control reflected the distinct terminator domains, suggesting that the terminator and antiterminator structures contribute to the overall termination efficiency. A comparison of the wild-type and hybrid leader RNA pairs in which the binding domains were changed revealed differences in sensitivity to SAM. For example, even though the metE-yusC hybrid exhibited high termination in the absence of SAM (due to the yusC terminator domain), it appeared to be more sensitive to the concentration of SAM (due to the metE binding domain) as it exhibited ~100% termination in the presence of low concentrations of SAM compared to the wild-type yusC transcript. Similarly, the yusC-metE hybrid exhibited low percent termination in the absence of SAM (consistent with the metE terminator domain) but required very high concentrations of SAM (150 µM) to achieve ~80% termination (consistent with the yusC binding domain). Therefore, these data suggest that even though the termination efficiency is dictated by the terminator/antiterminator structures, the binding domain dictates the sensitivity to SAM. These results suggest that both domains contribute to the calibration of S box gene expression in response to SAM.

As the binding affinity for SAM correlates with sensitivity to SAM in vitro, additional information can be obtained by performing SAM binding assays on the wild-type and hybrid leader RNAs. This can provide an accurate comparison of Kd values for each hybrid leader RNA construct in relation to the wild-type RNAs. In addition to contributions made by the binding and terminator/antiterminator domains to S box gene expression, we hypothesized that parameters such as efficiency of transcription as well as processivity of the RNAP during transcription elongation play a role in the observed differences in S box gene expression. In order to investigate the effect of transcription on S box gene expression, we generated metK-yusC hybrid leader RNA constructs, and tested the effect of the metK promoter and metK US box sequence on expression and termination efficiency of yusC.

Compared to other S box genes, the metK S box gene is an exception as it fails to exhibit increased expression during methionine starvation. Chapter 2 described that metK-lacZ expression was regulated only when the SAM pools were modulated without changing the methionine levels. Chapter 2 also discussed the unique sequences located in the metK leader RNA that are important for metK expression in vivo. In addition to the unique expression profile during methionine starvation, wild-type metK also fails to exhibit increased transcription termination in the presence of SAM in vitro. We speculated that the metK US box sequence, located immediately downstream of the metK promoter, plays a critical role in the regulation of metK (Chapter 2). In the current investigation, hybrid leader RNAs in which the metK promoter sequence and the metK US box sequence were fused to the helix 1 region (anti-antiterminator) of yusC (an S box gene which is not tightly regulated in response to SAM) were generated. These constructs varied in the length and sequence of the US box region located upstream of the yusC helix 1.

The in vivo analysis of the metK-yusC hybrid constructs showed that expression of the hybrid construct was dependent on the length and sequence of the metK US box region (Figures 4.7-4.9). The metK-yusC1 hybrid contained the first 9 nucleotides of the metK US box sequence. This construct exhibited β-galactosidase activity in the range of 40-50 Miller units. Increasing the length of the US box sequence to include an additional 8 nucleotides in metK-yusC2 resulted in a further increase in β-galactosidase activity, to the range of 60-85 Miller units. Mutating a region (C4G5G6; US1 mutation) within the US box sequence resulted in a near-complete loss of β-galactosidase activity. The US1 mutation resulted in a ~6-fold reduction in metK-lacZ expression during methionine starvation (Chapter 2). These data corroborate the observation that the conserved core region within the US box sequence plays an important role in gene expression in vivo (as discussed in Chapter 2).

Expression was drastically reduced when the entire metK US box sequence was omitted (except the G+1) in metK-yusC3 (Figure 4.8, circles), suggesting that along with the need for an intact US box sequence, expression of metK-yusC is dependent on the length of the US box sequence. It is possible that the conserved core region within the US box sequence acts as a recruiting site for a protein factor (like an RNase) and mutating or deleting this site abolishes gene expression. These results are consistent with data obtained for metK US box mutants (Chapter 2). It is also interesting to note that replacing the native yusC promoter with the metK promoter abolished the regulatory response to changing concentrations of SAM. Gene expression was not induced in the absence of methionine (when SAM pools are low). This phenotype is consistent with the wild-type metK-lacZ response in vivo under methionine starvation conditions.

The sequence of the metK-yusC3 fusion contained only one purine at the +1 site, immediately upstream of 3 uridines of the yusC helix 1 sequence. As this fusion construct resulted in low gene expression in vivo, we speculated that the RNAP requires a higher number of purines near the transcriptional start-site for efficient transcription initiation.

Two variant metK-yusC hybrid leader RNAs (metK-yusC6 and metK-yusC7) included purines at the +2 and +3 positions of the metK US box sequence but differed in the site of attachment to the yusC leader RNA sequence. These constructs exhibited a 5-10-fold increase in expression of the β-galactosidase reporter gene compared to metK-yusC3 during growth in the absence of methionine. Although expression was not restored completely to that of metK-yusC1 or 2, these results support the need for a purine-rich sequence for initiation of metK transcription.

Expression of the metK-yusC8 construct was ~3.0-fold higher compared to metK-yusC6 and 7, and ~25-fold higher relative to metK-yusC3 during methionine starvation conditions. This result suggested that including the first six nucleotides of the US box sequence was sufficient to restore expression and that metK-directed expression is dependent on the length of the metK US box sequence, especially an intact conserved core region. It is interesting to note that only the metK-yusC8 hybrid construct exhibited a 1.5-fold increase in expression in the absence of methionine relative to expression in the presence of methionine, suggesting some level of gene regulation in response to SAM.

Hybrid constructs metK-yusC6, 8, 1 and 2 contain increasing lengths of the US box sequence fused to the yusC leader RNA (Table 4.5). It is possible that regulation shown by the metK-yusC8-lacZ construct is not only dependent on the first 6 nucleotides of the US box sequence, but also on the sequence at the fusion between the metK and yusC leader RNAs (Table 4.5 and Figures 4.8 and 4.9).

Along with in vivo analyses, the 8 hybrid constructs were tested in vitro using a transcription termination assay. The metK-yusC1 and 2 hybrid constructs exhibited high constitutive readthrough in the presence and absence of SAM. No response to changing concentrations of SAM was evident in vitro, consistent with the in vivo expression assays. The appearance of a pause band, roughly half the size of the readthrough band, was seen for the metK-yusC hybrids 1 and 2. This band was not observed previously for wild-type metK and yusC. It is possible that replacing the sequence near the 5’ end of the yusC leader RNA resulted in the unnatural slowing down of the RNAP at a region estimated to be near helix 4 of yusC. However, further investigation is necessary to validate this prediction.

Consistent with the in vivo results, the metK-yusC3 construct exhibited no transcription in vitro. We speculated that the purine at +1 was not sufficient for the

RNAP to efficiently initiate transcription. Two additional purines (as in the case of metK- yusC 6 and 7) helped to increase in vivo expression during growth in the absence of

243

methionine by 5-fold compared to metK-yusC3. However, including these purines did not seem to help in vitro. It is possible that the initiation complex falls off due to the high number of uridines and cytidines located near the transcription start-site (due to the fusion with the yusC leader RNA). Including the CGG of metK US box seemed to improve transcription, as transcript bands reappeared for the metK-yusC8 hybrid.

However, this construct exhibited no increase in termination in the presence of SAM, inconsistent with the in vivo results. It is possible that some additional factor, which has not yet been identified and hence could not be incorporated in the in vitro assay, is responsible for the observed discrepancy.

Surprisingly, the metK-yusC4 and metK-yusC5 hybrids showed a response to changing levels of SAM in vitro. High transcription termination was seen in the presence of SAM (Figure 4.10). This result was unexpected, as these constructs contained the US1 mutation in the metK US box sequence, which causes high constitutive readthrough in a metK construct (data not shown). The in vitro results for the metK-yusC4 and 5 constructs were inconsistent with the in vivo results. The in vitro assay is a more defined system than an in vivo assay. Again, as in the case of metK-yusC8, it is possible that an unknown in vivo factor is responsible for the discrepancies observed in the results for metK-yusC constructs 4 and 5. Another reason might be the difference in the rates of transcription in vitro compared to in vivo. The in vivo speed of the Escherichia coli RNAP is estimated at ~40 nt/s (Vogel and Jensen 1994). Our in vitro system artificially reduces the transcription speed, which could cause the RNA to fold differently than under in vivo conditions and thereby yield inconsistent results.
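As a rough illustration of the kinetic argument in the preceding paragraph, the time available for co-transcriptional folding scales inversely with the elongation rate. The sketch below uses the ~40 nt/s in vivo estimate cited in the text; the leader length (200 nt) and the slower in vitro rate (10 nt/s) are hypothetical placeholder values chosen only for illustration and are not measurements from this work.

```python
def transit_time_s(length_nt: float, rate_nt_per_s: float) -> float:
    """Seconds needed for RNAP to transcribe length_nt at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return length_nt / rate_nt_per_s


LEADER_LENGTH_NT = 200        # hypothetical leader length, for illustration only
IN_VIVO_RATE_NT_S = 40.0      # estimate cited in the text (Vogel and Jensen 1994)
IN_VITRO_RATE_NT_S = 10.0     # hypothetical reduced in vitro rate

in_vivo_s = transit_time_s(LEADER_LENGTH_NT, IN_VIVO_RATE_NT_S)
in_vitro_s = transit_time_s(LEADER_LENGTH_NT, IN_VITRO_RATE_NT_S)

print(f"in vivo:  {in_vivo_s:.1f} s to transcribe the leader")
print(f"in vitro: {in_vitro_s:.1f} s to transcribe the leader")
print(f"folding window is {in_vitro_s / in_vivo_s:.1f}x longer in vitro")
```

Under these assumed numbers, the RNA would have several times longer to sample alternative folds before the terminator is transcribed, which is the kind of difference that could change the outcome of a kinetically controlled switch.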

One point to note from the in vitro transcription reactions is the variation observed in overall transcription yield. Each reaction was performed with the same starting concentration of DNA template, yet some transcripts appeared to have higher-intensity bands. A similar observation was made during in vitro transcription analysis of the metK US box mutants, in which a drastic but reproducible difference in transcript yield was observed (Chapter 2). This property appears to be specific to metK, as such differences were not observed with yusC. We speculate that the metK US box sequence recruits a factor or factors that affect the processivity of the RNAP, resulting in variation in transcript yields. Further studies involving in vitro transcription time-course analyses will be necessary to address this possibility.

In view of the current results, we speculate that a number of factors are responsible for the observed variation in S box gene expression, both in vivo and in vitro.

The first part of this chapter compared the sensitivity of the SAM binding domain in relation to the SAM-dependent termination efficiency. It was concluded that both the binding domain and the mutually exclusive terminator and antiterminator structures are important factors in SAM-dependent S box regulation.

The second half of this chapter concludes that sequences located at the 5’ end of a transcript (like the US box sequence at the start of the metK leader RNA) are also responsible for variation in S box gene regulation. The 5’ end of the transcript can affect parameters such as efficiency of transcription initiation as well as processivity of the RNAP during transcription elongation and might play a role in the observed differences of S box gene expression. These are valid possibilities based on differences observed in the transcript yields for the metK-yusC transcripts.

We also predict that regulation occurs at the level of mRNA stability, based on results obtained during the metK investigation (Chapter 2). We have evidence that alteration of the metK US box sequence reduces metK transcript stability in the absence of SAM (Chapter 2). It is possible that the metK-yusC hybrid constructs are subject to a similar regulatory mechanism, and additional studies that investigate transcript stability and abundance for the metK-yusC hybrids will help to validate this prediction.


CHAPTER 5

SUMMARY AND FUTURE DIRECTIONS

The focus of this dissertation has been to investigate the S box riboswitch from the Gram-positive organism B. subtilis. The S box leader RNAs specifically recognize SAM and undergo a conformational change that results in the formation of an intrinsic transcription terminator helix. Regulation by the S box riboswitch occurs at the level of premature transcription termination. The majority of the S box leader RNAs undergo a structural modification in response to SAM. However, significant variation is observed in the SAM-dependent S box gene expression in vivo and in vitro.

The goals of the current investigation were threefold. The major part of the research focused on the characterization of the metK leader RNA and the identification of specific elements that make this RNA unique (Chapter 2). The second aim of this research was to identify specificity determinants required for SAM recognition by the yitJ leader RNA (Chapter 3). Finally, we also examined the factors that contribute to variability in S box regulation (Chapter 4).

Previous studies have shown that metK differs from the rest of the S box genes. SAM fails to promote increased termination at the metK leader region terminator in vitro and the metK-lacZ fusion construct fails to exhibit increased gene expression during methionine starvation conditions. Starvation for methionine indirectly depletes the in vivo SAM pools (Grundy and Henkin 1998, McDaniel et al. 2003, Murphy et al. 2002) and the reduction in SAM pools correlates directly with the increase in expression of most S box gene-lacZ transcriptional fusions (Tomsic et al. 2008). It is important to note that methionine starvation in a methionine auxotroph results in inhibition of growth. We therefore designed a system in which the SAM pools were modulated without affecting the methionine levels. We constructed a strain, BR151 Pspac-metK, in which the native metK gene was placed under the control of an IPTG-dependent Pspac promoter (Chapter 2). We measured the SAM pools and provided evidence of increased metK-lacZ gene expression during growth in the presence of low in vivo SAM pools and high methionine.

We speculate that as SAM is synthesized from methionine by SAM synthetase (encoded by the metK gene), metK senses in vivo methionine levels in addition to sensing SAM levels. This might be one of the reasons why metK expression is not induced during methionine starvation.

Phylogenetic analyses of the metK genes from Firmicutes revealed the presence of conserved US and DS box sequences, located on the 5’ and 3’ side of the S box element (Chapter 2). Using primer extension analysis, we mapped the metK transcriptional start-site (+1) and established that the 5’ end of the US box sequence, which overlaps helix 1 of the S box element, is located precisely at the +1 of the metK transcript. Deletion mapping of the metK leader RNA indicated that these sequences were important for wild-type expression in vivo. The US and DS box sequences displayed significant complementarity, which suggested a pairing interaction between the two regions. RNase H cleavage mapping was employed to examine the potential base-pairing interaction in metK wild-type and mutant constructs. An oligonucleotide primer that was complementary to a region within the DS box sequence was used. The wild-type RNA exhibited cleavage in the presence of SAM and protection in the absence of SAM, while disruption of the US box sequence resulted in loss of response to SAM. A mutation in the SAM binding pocket resulted in equal protection and cleavage regardless of the presence of SAM. Overall, the results from the RNase H assays were twofold. First, they showed that the base-pairing interaction was stabilized only in the absence of SAM, and second, the SAM response for US-DS pairing was dependent on a functional S box element.

The extensive in vivo analysis of the metK US and DS box mutants further validated that the two regions were indeed involved in a base-pairing interaction. We established that maintenance of the US-DS base-pairing interaction, either through Watson-Crick or wobble base-pairs, is important for a response to SAM in vivo (Chapter 2). These data suggest that under low SAM conditions, when helix 1 of the S box element is not formed, the US box sequence base-pairs with the DS box sequence, leading to upregulation of gene expression. However, this was not true for every position tested. In particular, the conserved core region was highly sensitive to sequence change, as these mutants exhibited loss of response to SAM in vivo. We speculate that the strict sequence preference shown in this region reflects a recognition site for an alternate trans-acting factor that binds the US-DS pairing only in the presence of methionine and further stabilizes the RNA, resulting in gene expression. In the absence of methionine, however, the pairing of the US and DS box sequences is not sufficient for transcript protection, resulting in repression of gene expression.

Two metK US box mutants (G5U and G6U) were particularly interesting. We introduced these mutations in the chromosomal metK sequence and tested the ability of the strains to grow in the presence of the toxic methionine analog, ethionine (Chapter 2). Strains BR151MA-G5U and BR151MA-G6A exhibited a growth defect but were able to survive in the presence of ethionine, in contrast to the isogenic wild-type control strain BR151MA-ZKO, which exhibited no growth. The preferential toxicity of ethionine for the wild-type strain suggested that normal SAM synthetase activity resulted in incorporation of ethionine leading to production of SAE, which eventually killed the cells. We predicted that the survival of the G5U and G6U mutants in the presence of ethionine was due to defective SAM synthetase activity, which resulted in low SAM pools and therefore significantly lower levels of SAE in vivo. We observed that the SAM pools measured from a methionine auxotrophic strain containing the G5U chromosomal mutation were indeed below the detection limit. These results provided physiological evidence of the importance of the metK US box sequence in metK regulation. Measuring the SAM synthetase activities of strains that contain these chromosomal mutations can validate our prediction further.

We also examined the effect of the US box sequence (which is located precisely at the 5’ end of the transcript) on metK leader RNA stability (Chapter 2). The wild-type RNA exhibited the longest half-life in the absence of IPTG (low SAM pools), while the half-life dropped significantly in the presence of IPTG (high SAM pools). These results suggested that the US-DS pairing in the wild-type transcript was disrupted in the presence of SAM, leading to a significant decrease in transcript stability. These data were consistent with the results obtained using the RNase H assay, which also showed that the US-DS box interaction was disrupted in the presence of SAM. Significant reductions in transcript stability and abundance were observed when the US box sequence was altered. These results implied that the US box sequence plays a role in mRNA stability and that pairing of the US box region with the DS box sequence protects the 5’ end of the transcript from degradation. Additional RNA abundance and stability assays need to be conducted in order to measure the half-life of metK transcripts during methionine starvation. The values obtained from these experiments can provide an explanation for the possible differences in metK-lacZ gene expression during methionine starvation and IPTG-limitation assays.
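For readers unfamiliar with how a transcript half-life is typically extracted from abundance measurements, the following minimal sketch assumes simple first-order decay, fits ln(abundance) against time by least squares, and converts the decay constant to a half-life. The function name, time points, and abundance values are hypothetical and are not data from this work; the actual analysis in Chapter 2 may have used a different procedure.

```python
import math


def half_life(times_min, abundances):
    """Estimate half-life (min) assuming first-order decay A(t) = A0 * exp(-k t).

    Fits ln(abundance) versus time by ordinary least squares and returns
    ln(2) / k, where -k is the fitted slope.
    """
    if len(times_min) != len(abundances) or len(times_min) < 2:
        raise ValueError("need at least two matched time/abundance points")
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("abundance does not decay over time")
    return math.log(2) / -slope


# Hypothetical relative transcript abundances after transcription is blocked.
times = [0, 2, 4, 8]                 # minutes
abundance = [100.0, 50.0, 25.0, 6.25]  # arbitrary units

print(f"estimated half-life: {half_life(times, abundance):.2f} min")
```

Comparing half-lives obtained this way for wild-type and US box mutant transcripts, under low and high SAM conditions, is one way the stability differences described here could be expressed quantitatively.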

Although an extensive analysis of the metK leader RNA was conducted using in vivo and in vitro techniques, much remains to be understood regarding the molecular basis of SAM recognition by the metK leader RNA. Our in vitro transcription termination assays were inconclusive, as we failed to observe a SAM-dependent response for wild-type or mutant metK transcripts. Multiple-round transcription assays do not generate synchronized transcription complexes. Preliminary experiments have been performed to identify the essential components for single-round transcription assays in order to generate such synchronized complexes. However, additional studies are necessary to identify the conditions for a purified in vitro system in which a SAM-dependent increase in termination can be detected.
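A SAM-dependent increase in termination is usually expressed as the change in percent termination between reactions with and without ligand. The sketch below assumes the common definition, terminated product divided by the sum of terminated and readthrough products, and ignores any correction for differences in label content between the two transcripts. The band intensities are hypothetical values for a generic responsive leader, not results from this work.

```python
def percent_termination(terminated: float, readthrough: float) -> float:
    """Percent of transcripts ending at the terminator: 100 * T / (T + RT)."""
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * terminated / total


# Hypothetical band intensities (arbitrary units), for illustration only.
minus_sam = percent_termination(terminated=20.0, readthrough=80.0)
plus_sam = percent_termination(terminated=75.0, readthrough=25.0)

print(f"-SAM: {minus_sam:.0f}% termination")
print(f"+SAM: {plus_sam:.0f}% termination")
print(f"SAM-dependent increase: {plus_sam - minus_sam:.0f} percentage points")
```

By this measure, a construct with no SAM response, as observed for metK in vitro, would show essentially the same percent termination in both reactions.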

Differences were also observed in total transcript yields between wild-type and mutant metK constructs. Conducting time-course analyses using single-round in vitro transcription termination assays will help to identify the effect of the US box sequence on transcription elongation. It is possible that an essential protein factor, which functions in vivo but has not yet been identified, is required in vitro. Cellular extracts can be generated from exponentially growing cells and added to in vitro transcription termination assays to see if termination efficiency improves. Cellular extracts from mutant metK strains (such as G5U or G6U) can also be used to compare the effect on transcription relative to wild-type extracts.

Based on the data obtained from the current study, we propose a model for the regulation of metK gene expression from B. subtilis, in which the metK gene is subjected to regulation at the level of mRNA stability, in addition to being under the control of the S box regulon (Chapter 2). It is also possible that regulation occurs at the level of transcription initiation, or involves a combination of both RNA stability and transcription initiation.

Additional studies need to be conducted to establish conditions that are sensitive enough to detect SAM binding by the metK leader RNA in vitro, as we failed to detect SAM binding by the metK leader RNA using filter-exclusion assays. A possible technique is isothermal titration calorimetry (ITC), which quantitatively detects the interaction of two molecules in solution and can determine the binding affinity of the RNA for SAM as well as the stoichiometry.

The second goal of this dissertation was to identify specificity determinants of the yitJ S box leader RNA required for recognition of SAM (Chapter 3). yitJ encodes methylenetetrahydrofolate reductase and is closely involved in the methionine biosynthetic pathway (Grundy and Henkin 1998, Murphy et al. 2002). The B. subtilis yitJ leader RNA has been extensively studied using genetic and biochemical techniques. The yitJ leader RNA shows high affinity for SAM and discriminates strongly against closely related natural analogs (McDaniel et al. 2003, Winkler et al. 2003). The crystal structure of the B. subtilis yitJ RNA in complex with SAM revealed important interactions between the ligand and the RNA (Lu et al. 2010). As part of this crystal structure study, we generated a series of mutants that contained sequence changes in the highly conserved SAM-binding pocket. Binding assays revealed that many mutations resulted in lowered SAM affinities, as these residues either make important contacts with SAM or are important for stabilization of crucial structural domains that form the SAM-binding pocket (Lu et al. 2010). In vitro transcription termination assays showed that the majority of yitJ mutants were locked in the terminator conformation, even in the absence of SAM. The fact that some mutants retained the ability to respond to SAM suggested that formation of the SAM-bound conformation in the absence of SAM is separable from the ability to bind SAM.

We also characterized the wild-type yitJ RNA and a few selected mutants in response to SAM analogs using in vitro transcription termination assays (Chapter 3). In some cases, we found that the wild-type yitJ transcript exhibited increased termination in the presence of a few analogs. However, this was achievable only under high ligand concentrations that are not physiologically relevant. We also observed that a few yitJ mutants responded to the presence of SAM analogs; however, we failed to obtain variants that exhibited specificity only toward a SAM analog. Structural mapping of these yitJ mutants in the presence of SAM and SAM analogs compared to wild-type yitJ can reveal differences in overall RNA folding and ligand accessibility. Overall, our results indicated that the yitJ leader RNA is highly specific for SAM, its natural ligand. We speculate that the high conservation seen among the S box leader RNAs confers an evolutionary advantage by preventing the RNA from recognizing closely related ligands and inappropriately triggering gene regulation.

The final aspect of the current dissertation was to identify structural elements of the leader RNA responsible for the variation seen in S box gene regulation (Chapter 4). We generated hybrid constructs using metE and yusC, as they exhibit different expression profiles in response to SAM, and tested them using in vitro transcription termination assays as well as in vivo lacZ reporter assays. These assays indicated that the binding pocket along with the terminator/antiterminator domains play an important role in the calibration of S box gene expression. Further studies such as SAM titrations of the metE and yusC hybrid leader RNAs using binding assays can reveal the affinity for SAM relative to the wild-type RNAs.

As part of the variability study, we also investigated the effect of the metK promoter on transcription efficiency (Chapter 4). Expression of metK-lacZ was induced when SAM pools were low and methionine levels were high, but high levels of SAM failed to promote transcription termination of metK in vitro (Chapter 2). Based on these results and the US box mutagenesis (Chapter 2), we hypothesized that the metK promoter, along with the US box sequence, is responsible for reduced transcription initiation or reduced RNAP processivity in vitro as well as reduced transcript stability.

For this purpose, a second type of hybrid leader RNA construct was generated using the metK and yusC leader RNAs. The effects of the metK promoter and US box sequence on expression of the yusC S box leader RNA were examined both in vivo and in vitro. It was concluded that in vivo expression was dependent on the metK US box sequence and that the conserved core sequence within the US box played an important role in the slight regulation observed for one of the constructs (Chapter 4). However, the in vitro transcription termination assays were inconclusive and inconsistent with the in vivo results. As explained above, it is possible that a factor that plays an important role in vivo might be missing from the purified in vitro system. Additional in vitro characterization of the metK-yusC hybrid leader RNAs can be conducted along with the metK analysis described above.

Despite sequence similarities among the S box leader RNAs investigated in the current study, significant differences in SAM-dependent regulation were evident. As metK functions to synthesize the molecular effector of the S box regulon, the cell exerts an additional level of regulation. In conclusion, we predict that the metK leader RNA is regulated by a unique mechanism that functions at the level of mRNA stability in conjunction with the S box regulatory mechanism. The involvement of a global regulator is also a valid possibility and additional investigation is necessary to elucidate the metK regulatory mechanism.

In summary, this work investigated various S box leader RNAs in response to SAM. Our major focus was the characterization of the atypical metK leader RNA. We used a variety of in vivo and in vitro assays and proposed a model for the possible regulatory mechanism of the metK gene. We also characterized the yitJ SAM binding pocket and investigated specificity determinants required for affinity and recognition. Lastly, we investigated the contribution of structural domains within the S box leader RNAs to the variability seen in S box regulation.


List of references

Allen ER, Orrego C, Wabiko H, Freese E. 1986. An ethA mutation in Bacillus subtilis 168 permits induction of sporulation by ethionine and increases DNA modification of bacteriophage phi 105. J Bacteriol 166: 1-8.

Altuvia S, Kornitzer D, Kobi S, Oppenheim AB. 1991. Functional and structural elements of the mRNA of the cIII gene of bacteriophage lambda. J Mol Biol 218: 723-733.

Altuvia S, Kornitzer D, Teff D, Oppenheim AB. 1989. Alternative mRNA structures of the cIII gene of bacteriophage lambda determine the rate of its translation initiation. J Mol Biol 210: 265-280.

Ames TD, Breaker RR. 2011. Bacterial aptamers that selectively bind glutamine. RNA Biol 8: 82-89.

Ames TD, Rodionov DA, Weinberg Z, Breaker RR. 2010. A eubacterial riboswitch class that senses the coenzyme tetrahydrofolate. Chem Biol 17: 681-685.

Anagnostopoulos C, Spizizen J. 1961. Requirements for Transformation in Bacillus subtilis. J Bacteriol 81: 741-746.

Andre G, Even S, Putzer H, Burguiere P, Croux C, Danchin A, Martin-Verstraete I, Soutourina O. 2008. S-box and T-box riboswitches and antisense RNA control a sulfur metabolic operon of Clostridium acetobutylicum. Nucleic Acids Res 36: 5955-5969.

Artsimovitch I, Patlan V, Sekine S, Vassylyeva MN, Hosaka T, Ochi K, Yokoyama S, Vassylyev DG. 2004. Structural basis for transcription regulation by alarmone ppGpp. Cell 117: 299-310.


Ataide SF, Wilson SN, Dang S, Rogers TE, Roy B, Banerjee R, Henkin TM, Ibba M. 2007. Mechanisms of resistance to an amino acid antibiotic that targets translation. ACS Chem Biol 2: 819-827.

Auffinger P, Bielecki L, Westhof E. 2004. Anion binding to nucleic acids. Structure 12: 379-388.

Auger S, Danchin A, Martin-Verstraete I. 2002. Global expression profile of Bacillus subtilis grown in the presence of sulfate or methionine. J Bacteriol 184: 5179-5186.

Baird NJ, Ferre-D'Amare AR. 2010. Idiosyncratically tuned switching behavior of riboswitch aptamer domains revealed by comparative small-angle X-ray scattering analysis. RNA 16: 598-609.

Baker JL, Sudarsan N, Weinberg Z, Roth A, Stockbridge RB, Breaker RR. 2012. Widespread genetic switches and toxicity resistance proteins for fluoride. Science 335: 233-235.

Baker KA, Perego M. 2011. Transcription antitermination by a phosphorylated response regulator and cobalamin-dependent termination at a B riboswitch contribute to ethanolamine utilization in Enterococcus faecalis. J Bacteriol 193: 2575-2586.

Barker MM, Gaal T, Gourse RL. 2001a. Mechanism of regulation of transcription initiation by ppGpp. II. Models for positive control based on properties of RNAP mutants and competition for RNAP. J Mol Biol 305: 689-702.

Barker MM, Gaal T, Josaitis CA, Gourse RL. 2001b. Mechanism of regulation of transcription initiation by ppGpp. I. Effects of ppGpp on transcription initiation in vivo and in vitro. J Mol Biol 305: 673-688.

Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee M, Roth A, Sudarsan N, Jona I et al. 2004. New RNA motifs suggest an expanded scope for riboswitches in bacterial genetic control. Proc Natl Acad Sci U S A 101: 6421-6426.

Batey RT, Gilbert SD, Montange RK. 2004. Structure of a natural guanine-responsive riboswitch complexed with the metabolite hypoxanthine. Nature 432: 411-415.

Bechhofer DH. 2011. Bacillus subtilis mRNA decay: new parts in the toolkit. Wiley Interdiscip Rev RNA 2: 387-394.

Bengert P, Dandekar T. 2004. Riboswitch finder--a tool for identification of riboswitch RNAs. Nucleic Acids Res 32: W154-9.

Bian J, Shen H, Tu Y, Yu A, Li C. 2011. The riboswitch regulates a thiamine pyrophosphate ABC transporter of the oral spirochete Treponema denticola. J Bacteriol 193: 3912-3922.

Borovok I, Gorovitz B, Schreiber R, Aharonowitz Y, Cohen G. 2006. Coenzyme B12 controls transcription of the Streptomyces class Ia ribonucleotide reductase nrdABS operon via a riboswitch mechanism. J Bacteriol 188: 2512-2520.

Boudvillain M, Nollmann M, Margeat E. 2010. Keeping up to speed with the transcription termination factor Rho motor. Transcription 1: 70-75.

Breaker RR. 2011. Prospects for riboswitch discovery and analysis. Mol Cell 43: 867-879.

Breaker RR. 2010. Riboswitches and the RNA World. Cold Spring Harb Perspect Biol.

Butler EB, Xiong Y, Wang J, Strobel SA. 2011. Structural basis of cooperative ligand binding by the glycine riboswitch. Chem Biol 18: 293-298.

Cashel M, Gentry DR, Hernandez VJ, Vinella D. 1996. Escherichia coli and Salmonella :cellular and . in (ed. FC Neidhardt), pp. 1458-1496. ASM Press, Washington, D.C.

Cashel M, Gallant J. 1969. Two compounds implicated in the function of the RC gene of Escherichia coli. Nature 221: 838-841.

Cech TR. 1990. Self-splicing of group I introns. Annu Rev Biochem 59: 543-568.

Chang TH, Huang HD, Wu LC, Yeh CT, Liu BJ, Horng JT. 2009. Computational identification of riboswitches based on RNA conserved functional sequences and conformations. RNA 15: 1426-1430.

Cheah MT, Wachter A, Sudarsan N, Breaker RR. 2007. Control of alternative RNA splicing and gene expression by eukaryotic riboswitches. Nature 447: 497-500.

Chen B, Zuo X, Wang YX, Dayie TK. 2011. Multiple conformations of SAM-II riboswitch detected with SAXS and NMR spectroscopy. Nucleic Acids Res.

Christiansen LC, Schou S, Nygaard P, Saxild HH. 1997. Xanthine metabolism in Bacillus subtilis: characterization of the xpt-pbuX operon and evidence for purine- and nitrogen-controlled expression of genes involved in xanthine salvage and catabolism. J Bacteriol 179: 2540-2550.

Clote P, Lou F, Lorenz WA. 2012. Maximum expected accuracy structural neighbors of an RNA secondary structure. BMC Bioinformatics 13 Suppl 5: S6.

Cochrane JC, Lipchock SV, Strobel SA. 2007. Structural investigation of the glmS ribozyme bound to its catalytic cofactor. Chem Biol 14: 97-105.

Collins JA, Irnov I, Baker S, Winkler WC. 2007. Mechanism of mRNA destabilization by the glmS ribozyme. Genes Dev 21: 3356-3368.

Corbino KA, Barrick JE, Lim J, Welz R, Tucker BJ, Puskarz I, Mandal M, Rudnick ND, Breaker RR. 2005. Evidence for a second class of S-adenosylmethionine riboswitches and other regulatory RNA motifs in alpha-proteobacteria. Genome Biol 6: R70.

Cotter PA, Stibitz S. 2007. c-di-GMP-mediated regulation of virulence and biofilm formation. Curr Opin Microbiol 10: 17-23.

Crick F. 1970. Central dogma of molecular biology. Nature 227: 561-563.

Croft MT, Moulin M, Webb ME, Smith AG. 2007. Thiamine biosynthesis in algae is regulated by riboswitches. Proc Natl Acad Sci U S A 104: 20770-20775.

Cromie MJ, Groisman EA. 2010. Promoter and riboswitch control of the Mg2+ transporter MgtA from Salmonella enterica. J Bacteriol 192: 604-607.

Cromie MJ, Shi Y, Latifi T, Groisman EA. 2006. An RNA sensor for intracellular Mg(2+). Cell 125: 71-84.

Dann CE,3rd, Wakeman CA, Sieling CL, Baker SC, Irnov I, Winkler WC. 2007. Structure and mechanism of a metal-sensing regulatory RNA. Cell 130: 878-892.

Di Girolamo M, Busiello V, Di Girolamo A, Foppoli C, De Marco C. 1988. Aspartokinase III repression in a thialysine-resistant mutant of E. coli. Biochem Int 17: 545-554.

Doshi U, Kelley JM, Hamelberg D. 2012. Atomic-level insights into metabolite recognition and specificity of the SAM-II riboswitch. RNA 18: 300-307.

Edwards AL, Batey RT. 2009. A structural basis for the recognition of 2'-deoxyguanosine by the purine riboswitch. J Mol Biol 385: 938-948.

Edwards AL, Reyes FE, Heroux A, Batey RT. 2010. Structural basis for recognition of S-adenosylhomocysteine by riboswitches. RNA 16: 2144-2155.

Edwards TE, Ferre-D'Amare AR. 2006. Crystal structures of the thi-box riboswitch bound to thiamine pyrophosphate analogs reveal adaptive RNA-small molecule recognition. Structure 14: 1459-1468.

Ellington AD, Szostak JW. 1990. In vitro selection of RNA molecules that bind specific ligands. Nature 346: 818-822.

Epshtein V, Mironov AS, Nudler E. 2003. The riboswitch-mediated control of sulfur metabolism in bacteria. Proc Natl Acad Sci U S A 100: 5052-5056.

Even S, Pellegrini O, Zig L, Labas V, Vinh J, Brechemmier-Baey D, Putzer H. 2005. Ribonucleases J1 and J2: two novel endoribonucleases in B.subtilis with functional homology to E.coli RNase E. Nucleic Acids Res 33: 2141-2152.

Eymann C, Homuth G, Scharf C, Hecker M. 2002. Bacillus subtilis functional genomics: global characterization of the stringent response by proteome and transcriptome analysis. J Bacteriol 184: 2500-2520.

Fuchs RT, Grundy FJ, Henkin TM. 2007. S-adenosylmethionine directly inhibits binding of 30S ribosomal subunits to the SMK box translational riboswitch RNA. Proc Natl Acad Sci U S A 104: 4876-4880.

Fuchs RT, Grundy FJ, Henkin TM. 2006. The S(MK) box is a new SAM-binding RNA for translational regulation of SAM synthetase. Nat Struct Mol Biol 13: 226-233.

Gallo S, Oberhuber M, Sigel RK, Krautler B. 2008. The corrin moiety of coenzyme B12 is the determinant for switching the btuB riboswitch of E. coli. Chembiochem 9: 1408-1414.

Garcia Vescovi E, Soncini FC, Groisman EA. 1996. Mg2+ as an extracellular signal: environmental regulation of Salmonella virulence. Cell 84: 165-174.

Garst AD, Heroux A, Rambo RP, Batey RT. 2008. Crystal structure of the regulatory mRNA element. J Biol Chem 283: 22347-22351.

Gerdeman MS, Henkin TM, Hines JV. 2003. Solution structure of the Bacillus subtilis T-box antiterminator RNA: seven nucleotide bulge characterized by stacking and flexibility. J Mol Biol 326: 189-201.

Gilbert SD, Rambo RP, Van Tyne D, Batey RT. 2008. Structure of the SAM-II riboswitch bound to S-adenosylmethionine. Nat Struct Mol Biol 15: 177-182.


Giuliodori AM, Di Pietro F, Marzi S, Masquida B, Wagner R, Romby P, Gualerzi CO, Pon CL. 2010. The cspA mRNA is a thermosensor that modulates translation of the cold-shock protein CspA. Mol Cell 37: 21-33.

Goldstein J, Pollitt NS, Inouye M. 1990. Major cold shock protein of Escherichia coli. Proc Natl Acad Sci U S A 87: 283-287.

Green NJ, Grundy FJ, Henkin TM. 2010. The T box mechanism: tRNA as a regulatory molecule. FEBS Lett 584: 318-324.

Grundy FJ, Henkin TM. 2006. From ribosome to riboswitch: control of gene expression in bacteria by RNA structural rearrangements. Crit Rev Biochem Mol Biol 41: 329-338.

Grundy FJ, Henkin TM. 2004. Regulation of gene expression by effectors that bind to RNA. Curr Opin Microbiol 7: 126-131.

Grundy FJ, Henkin TM. 1998. The S box regulon: a new global transcription termination control system for methionine and cysteine biosynthesis genes in gram-positive bacteria. Mol Microbiol 30: 737-749.

Grundy FJ, Henkin TM. 1993. tRNA as a positive regulator of transcription antitermination in B. subtilis. Cell 74: 475-482.

Grundy FJ, Lehman SC, Henkin TM. 2003. The L box regulon: lysine sensing by leader RNAs of bacterial lysine biosynthesis genes. Proc Natl Acad Sci U S A 100: 12057-12062.

Grundy FJ, Moir TR, Haldeman MT, Henkin TM. 2002. Sequence requirements for terminators and antiterminators in the T box transcription antitermination system: disparity between conservation and functional requirements. Nucleic Acids Res 30: 1646-1655.

Grundy FJ, Rollins SM, Henkin TM. 1994. Interaction between the acceptor end of tRNA and the T box stimulates antitermination in the Bacillus subtilis tyrS gene: a new role for the discriminator base. J Bacteriol 176: 4518-4526.

Grundy FJ, Waters DA, Allen SH, Henkin TM. 1993. Regulation of the Bacillus subtilis acetate kinase gene by CcpA. J Bacteriol 175: 7348-7355.

Grundy FJ, Winkler WC, Henkin TM. 2002. tRNA-mediated transcription antitermination in vitro: codon-anticodon pairing independent of the ribosome. Proc Natl Acad Sci U S A 99: 11121-11126.

Grundy FJ, Yousef MR, Henkin TM. 2005. Monitoring uncharged tRNA during transcription of the Bacillus subtilis glyQS gene. J Mol Biol 346: 73-81.

Gutierrez-Preciado A, Henkin TM, Grundy FJ, Yanofsky C, Merino E. 2009. Biochemical features and functional implications of the RNA-based T-box regulatory mechanism. Microbiol Mol Biol Rev 73: 36-61.

Haller A, Rieder U, Aigner M, Blanchard SC, Micura R. 2011. Conformational capture of the SAM-II riboswitch. Nat Chem Biol 7: 393-400.

Hampel KJ, Tinsley MM. 2006. Evidence for preorganization of the glmS ribozyme ligand binding pocket. Biochemistry 45: 7861-7871.

Harada F, Nishimura S. 1972. Possible anticodon sequences of tRNA(His), tRNA(Asn), and tRNA(Asp) from Escherichia coli B. Universal presence of nucleoside Q in the first position of the anticodons of these transfer ribonucleic acids. Biochemistry 11: 301-308.

Haseltine WA, Block R, Gilbert W, Weber K. 1972. MSI and MSII made on ribosome in idling step of protein synthesis. Nature 238: 381-384.

Haugen SP, Berkmen MB, Ross W, Gaal T, Ward C, Gourse RL. 2006. rRNA promoter regulation by nonoptimal binding of sigma region 1.2: an additional recognition element for RNA polymerase. Cell 125: 1069-1082.

Hengge R. 2009. Principles of c-di-GMP signalling in bacteria. Nat Rev Microbiol 7: 263-273.

Henkin TM. 2008. Riboswitch RNAs: using RNA to sense cellular metabolism. Genes Dev 22: 3383-3390.

Henkin TM. 2002. Bacillus subtilis and its closest relatives :. in (eds. AL Sonenshein, JA Hoch, R Losick), pp. 313-322. ASM Press, Washington, D.C.

Henkin TM, Chambliss GH, Grundy FJ. 1990. Bacillus subtilis mutants with alterations in ribosomal protein S4. J Bacteriol 172: 6380-6385.

Henkin TM, Glass BL, Grundy FJ. 1992. Analysis of the Bacillus subtilis tyrS gene: conservation of a regulatory sequence in multiple tRNA synthetase genes. J Bacteriol 174: 1299-1306.

Henkin TM, Grundy FJ. 2006. Sensing metabolic signals with nascent RNA transcripts: the T box and S box riboswitches as paradigms. Cold Spring Harb Symp Quant Biol 71: 231-237.

Hoe NP, Goguen JD. 1993. Temperature sensing in Yersinia pestis: translation of the LcrF activator protein is thermally regulated. J Bacteriol 175: 7901-7909.

Hogg T, Mechold U, Malke H, Cashel M, Hilgenfeld R. 2004. Conformational antagonism between opposing active sites in a bifunctional RelA/SpoT homolog modulates (p)ppGpp metabolism during the stringent response. Cell 117: 57-68.

Hollands K, Proshkin S, Sklyarova S, Epshtein V, Mironov A, Nudler E, Groisman EA. 2012. Riboswitch control of Rho-dependent transcription termination. Proc Natl Acad Sci U S A 109: 5376-5381.

Huang L, Ishibe-Murakami S, Patel DJ, Serganov A. 2011. Long-range pseudoknot interactions dictate the regulatory response in the tetrahydrofolate riboswitch. Proc Natl Acad Sci U S A 108: 14801-14806.

Hullo MF, Auger S, Dassa E, Danchin A, Martin-Verstraete I. 2004. The metNPQ operon of Bacillus subtilis encodes an ABC permease transporting methionine sulfoxide, D- and L-methionine. Res Microbiol 155: 80-86.

Inaoka T, Ochi K. 2002. RelA protein is involved in induction of genetic competence in certain Bacillus subtilis strains by moderating the level of intracellular GTP. J Bacteriol 184: 3923-3930.

Jiang W, Hou Y, Inouye M. 1997. CspA, the major cold-shock protein of Escherichia coli, is an RNA chaperone. J Biol Chem 272: 196-202.

Johansson J, Mandin P, Renzoni A, Chiaruttini C, Springer M, Cossart P. 2002. An RNA thermosensor controls expression of virulence genes in Listeria monocytogenes. Cell 110: 551-561.

Kelley JM, Hamelberg D. 2010. Atomistic basis for the on-off signaling mechanism in SAM-II riboswitch. Nucleic Acids Res 38: 1392-1400.

Kim JN, Breaker RR. 2008. Purine sensing by riboswitches. Biol Cell 100: 1-11.

Kim JN, Roth A, Breaker RR. 2007. Guanine riboswitch variants from Mesoplasma florum selectively recognize 2'-deoxyguanosine. Proc Natl Acad Sci U S A 104: 16092-16097.

Klein DJ, Been MD, Ferre-D'Amare AR. 2007. Essential role of an active-site guanine in glmS ribozyme catalysis. J Am Chem Soc 129: 14858-14859.

Klein DJ, Edwards TE, Ferre-D'Amare AR. 2009. Cocrystal structure of a class I preQ1 riboswitch reveals a pseudoknot recognizing an essential hypermodified nucleobase. Nat Struct Mol Biol 16: 343-344.

Klein DJ, Ferre-D'Amare AR. 2006. Structural basis of glmS ribozyme activation by glucosamine-6-phosphate. Science 313: 1752-1756.

Krasny L, Gourse RL. 2004. An alternative strategy for bacterial ribosome synthesis: Bacillus subtilis rRNA transcription regulation. EMBO J 23: 4473-4483.

Kubodera T, Watanabe M, Yoshiuchi K, Yamashita N, Nishimura A, Nakai S, Gomi K, Hanamoto H. 2003. Thiamine-regulated gene expression of Aspergillus oryzae thiA requires splicing of the intron containing a riboswitch-like domain in the 5'-UTR. FEBS Lett 555: 516-520.

Kuczynska-Wisnik D, Matuszewska E, Laskowska E. 2010. Escherichia coli heat-shock proteins IbpA and IbpB affect biofilm formation by influencing the level of extracellular indole. Microbiology 156: 148-157.

Kulshina N, Baird NJ, Ferre-D'Amare AR. 2009. Recognition of the bacterial second messenger cyclic diguanylate by its cognate riboswitch. Nat Struct Mol Biol 16: 1212-1217.

Kulshina N, Edwards TE, Ferre-D'Amare AR. 2010. Thermodynamic analysis of ligand binding and ligand binding-induced tertiary structure formation by the thiamine pyrophosphate riboswitch. RNA 16: 186-196.

Kwon M, Strobel SA. 2008. Chemical basis of glycine riboswitch cooperativity. RNA 14: 25-34.

Lee ER, Baker JL, Weinberg Z, Sudarsan N, Breaker RR. 2010. An allosteric self-splicing ribozyme triggered by a bacterial second messenger. Science 329: 845-848.

Lee ER, Blount KF, Breaker RR. 2009. Roseoflavin is a natural antibacterial compound that binds to FMN riboswitches and regulates gene expression. RNA Biol 6: 187-194.

Lim J, Winkler WC, Nakamura S, Scott V, Breaker RR. 2006. Molecular-recognition characteristics of SAM-binding riboswitches. Angew Chem Int Ed Engl 45: 964-968.

Lipfert J, Herschlag D, Doniach S. 2009. Riboswitch conformations revealed by small-angle X-ray scattering. Methods Mol Biol 540: 141-159.

Loh E, Dussurget O, Gripenland J, Vaitkevicius K, Tiensuu T, Mandin P, Repoila F, Buchrieser C, Cossart P, Johansson J. 2009. A trans-acting riboswitch controls expression of the virulence regulator PrfA in Listeria monocytogenes. Cell 139: 770-779.

Lopez JM, Dromerick A, Freese E. 1981. Response of guanosine 5'-triphosphate concentration to nutritional changes and its significance for Bacillus subtilis sporulation. J Bacteriol 146: 605-613.

Lu C, Ding F, Chowdhury A, Pradhan V, Tomsic J, Holmes WM, Henkin TM, Ke A. 2010. SAM recognition and conformational switching mechanism in the Bacillus subtilis yitJ S box/SAM-I riboswitch. J Mol Biol 404: 803-818.

Lu C, Smith AM, Ding F, Chowdhury A, Henkin TM, Ke A. 2011. Variable sequences outside the SAM-binding core critically influence the conformational dynamics of the SAM-III/SMK box riboswitch. J Mol Biol 409: 786-799.

Lu C, Smith AM, Fuchs RT, Ding F, Rajashankar K, Henkin TM, Ke A. 2008. Crystal structures of the SAM-III/S(MK) riboswitch reveal the SAM-dependent translation inhibition mechanism. Nat Struct Mol Biol 15: 1076-1083.

Lu Y, Chen NY, Paulus H. 1991. Identification of aecA mutations in Bacillus subtilis as nucleotide substitutions in the untranslated leader region of the aspartokinase II operon. J Gen Microbiol 137: 1135-1143.

Lu Y, Shevtchenko TN, Paulus H. 1992. Fine-structure mapping of cis-acting control sites in the lysC operon of Bacillus subtilis. FEMS Microbiol Lett 71: 23-27.

Magnusson LU, Farewell A, Nystrom T. 2005. ppGpp: a global regulator in Escherichia coli. Trends Microbiol 13: 236-242.

Mandal M, Boese B, Barrick JE, Winkler WC, Breaker RR. 2003. Riboswitches control fundamental biochemical pathways in Bacillus subtilis and other bacteria. Cell 113: 577-586.

Mandal M, Breaker RR. 2004. Adenine riboswitches and gene activation by disruption of a transcription terminator. Nat Struct Mol Biol 11: 29-35.

Mandal M, Lee M, Barrick JE, Weinberg Z, Emilsson GM, Ruzzo WL, Breaker RR. 2004. A glycine-dependent riboswitch that uses cooperative binding to control gene expression. Science 306: 275-279.

Mansjo M, Johansson J. 2011. The riboflavin analog roseoflavin targets an FMN-riboswitch and blocks Listeria monocytogenes growth, but also stimulates virulence gene-expression and infection. RNA Biol 8: 674-680.

McCarthy TJ, Plog MA, Floy SA, Jansen JA, Soukup JK, Soukup GA. 2005. Ligand requirements for glmS ribozyme self-cleavage. Chem Biol 12: 1221-1226.

McDaniel BA, Grundy FJ, Artsimovitch I, Henkin TM. 2003. Transcription termination control of the S box system: direct measurement of S-adenosylmethionine by the leader RNA. Proc Natl Acad Sci U S A 100: 3083-3088.

McDaniel BA, Grundy FJ, Henkin TM. 2005. A tertiary structural element in S box leader RNAs is required for S-adenosylmethionine-directed transcription termination. Mol Microbiol 57: 1008-1021.

McDaniel BA, Grundy FJ, Kurlekar VP, Tomsic J, Henkin TM. 2006. Identification of a mutation in the Bacillus subtilis S-adenosylmethionine synthetase gene that results in derepression of S-box gene expression. J Bacteriol 188: 3674-3681.

Meyer MM, Ames TD, Smith DP, Weinberg Z, Schwalbach MS, Giovannoni SJ, Breaker RR. 2009. Identification of candidate structured RNAs in the marine organism 'Candidatus Pelagibacter ubique'. BMC Genomics 10: 268.

Meyer MM, Roth A, Chervin SM, Garcia GA, Breaker RR. 2008. Confirmation of a second natural preQ1 aptamer class in Streptococcaceae bacteria. RNA 14: 685-695.

Miller JH. 1972. Experiments in molecular genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

Miranda-Rios J, Navarro M, Soberon M. 2001. A conserved RNA structure (thi box) is involved in regulation of thiamin biosynthetic gene expression in bacteria. Proc Natl Acad Sci U S A 98: 9736-9741.

Mironov AS, Gusarov I, Rafikov R, Lopez LE, Shatalin K, Kreneva RA, Perumov DA, Nudler E. 2002. Sensing small molecules by nascent RNA: a mechanism to control transcription in bacteria. Cell 111: 747-756.

Montange RK, Batey RT. 2006. Structure of the S-adenosylmethionine riboswitch regulatory mRNA element. Nature 441: 1172-1175.

Morita M, Kanemori M, Yanagi H, Yura T. 1999a. Heat-induced synthesis of sigma32 in Escherichia coli: structural and functional dissection of rpoH mRNA secondary structure. J Bacteriol 181: 401-410.

Morita MT, Tanaka Y, Kodama TS, Kyogoku Y, Yanagi H, Yura T. 1999b. Translational induction of heat shock transcription factor sigma32: evidence for a built-in RNA thermosensor. Genes Dev 13: 655-665.

Murphy BA, Grundy FJ, Henkin TM. 2002. Prediction of gene function in methylthioadenosine recycling from regulatory signals. J Bacteriol 184: 2314-2318.

Nahvi A, Barrick JE, Breaker RR. 2004. Coenzyme B12 riboswitches are widespread genetic control elements in prokaryotes. Nucleic Acids Res 32: 143-150.

Nahvi A, Sudarsan N, Ebert MS, Zou X, Brown KL, Breaker RR. 2002. Genetic control by a metabolite binding mRNA. Chem Biol 9: 1043.

Nakano MM, Zuber P. 1989. Cloning and characterization of srfB, a regulatory gene involved in surfactin production and competence in Bacillus subtilis. J Bacteriol 171: 5347-5353.

Nanamiya H, Kasai K, Nozawa A, Yun CS, Narisawa T, Murakami K, Natori Y, Kawamura F, Tozawa Y. 2008. Identification and functional analysis of novel (p)ppGpp synthetase genes in Bacillus subtilis. Mol Microbiol 67: 291-304.

Narberhaus F, Kaser R, Nocker A, Hennecke H. 1998. A novel DNA element that controls bacterial heat shock gene expression. Mol Microbiol 28: 315-323.

Narberhaus F, Waldminghaus T, Chowdhury S. 2006. RNA thermometers. FEMS Microbiol Rev 30: 3-16.

Nocker A, Hausherr T, Balsiger S, Krstulovic NP, Hennecke H, Narberhaus F. 2001a. A mRNA-based thermosensor controls expression of rhizobial heat shock genes. Nucleic Acids Res 29: 4800-4807.

Nocker A, Krstulovic NP, Perret X, Narberhaus F. 2001b. ROSE elements occur in disparate rhizobia and are functionally interchangeable between species. Arch Microbiol 176: 44-51.

Noeske J, Richter C, Grundl MA, Nasiri HR, Schwalbe H, Wohnert J. 2005. An intermolecular base triple as the basis of ligand specificity and affinity in the guanine- and adenine-sensing riboswitch RNAs. Proc Natl Acad Sci U S A 102: 1372-1377.

Noeske J, Richter C, Stirnal E, Schwalbe H, Wohnert J. 2006. Phosphate-group recognition by the aptamer domain of the thiamine pyrophosphate sensing riboswitch. Chembiochem 7: 1451-1456.

Nou X, Kadner RJ. 2000. Adenosylcobalamin inhibits ribosome binding to btuB RNA. Proc Natl Acad Sci U S A 97: 7190-7195.

Nudler E, Gottesman ME. 2002. Transcription termination and anti-termination in E. coli. Genes Cells 7: 755-768.

Ochi K. 2007. From microbial differentiation to ribosome engineering. Biosci Biotechnol Biochem 71: 1373-1386.

Ochi K, Kandala J, Freese E. 1982. Evidence that Bacillus subtilis sporulation induced by the stringent response is caused by the decrease in GTP or GDP. J Bacteriol 151: 1062-1065.

Ochi K, Kandala JC, Freese E. 1981. Initiation of Bacillus subtilis sporulation by the stringent response to partial amino acid deprivation. J Biol Chem 256: 6866-6875.

Ontiveros-Palacios N, Smith AM, Grundy FJ, Soberon M, Henkin TM, Miranda-Rios J. 2008. Molecular basis of gene regulation by the THI-box riboswitch. Mol Microbiol 67: 793-803.

Oppenheim A, Altuvia S, Kornitzer D, Teff D, Koby S. 1991. Translation control of gene expression. J Basic Clin Physiol Pharmacol 2: 223-231.

Otani S, Takatsu M, Nakano M, Kasai S, Miura R. 1974. Letter: Roseoflavin, a new antimicrobial pigment from Streptomyces. J Antibiot (Tokyo) 27: 86-87.

Ott E, Stolz J, Lehmann M, Mack M. 2009. The RFN riboswitch of Bacillus subtilis is a target for the antibiotic roseoflavin produced by Streptomyces davawensis. RNA Biol 6: 276-280.

Patte JC, Akrim M, Mejean V. 1998. The leader sequence of the Escherichia coli lysC gene is involved in the regulation of LysC synthesis. FEMS Microbiol Lett 169: 165-170.

Paul BJ, Barker MM, Ross W, Schneider DA, Webb C, Foster JW, Gourse RL. 2004. DksA: a critical component of the transcription initiation machinery that potentiates the regulation of rRNA promoters by ppGpp and the initiating NTP. Cell 118: 311-322.

Paul BJ, Berkmen MB, Gourse RL. 2005. DksA potentiates direct activation of amino acid promoters by ppGpp. Proc Natl Acad Sci U S A 102: 7823-7828.

Peters JM, Vangeloff AD, Landick R. 2011. Bacterial transcription terminators: the RNA 3'-end chronicles. J Mol Biol 412: 793-813.

Pikovskaya O, Polonskaia A, Patel DJ, Serganov A. 2011. Structural principles of nucleoside selectivity in a 2'-deoxyguanosine riboswitch. Nat Chem Biol 7: 748-755.

Platt T. 1994. Rho and RNA: models for recognition and response. Mol Microbiol 11: 983-990.

Poiata E, Meyer MM, Ames TD, Breaker RR. 2009. A variant riboswitch aptamer class for S-adenosylmethionine common in marine bacteria. RNA 15: 2046-2056.

Potrykus K, Cashel M. 2008. (p)ppGpp: still magical? Annu Rev Microbiol 62: 35-51.

Price VL, Gallant JA. 1982. A new relaxed mutant of Bacillus subtilis. J Bacteriol 149: 635-641.

Ramesh A, Wakeman CA, Winkler WC. 2011. Insights into metalloregulation by M-box riboswitch RNAs via structural analysis of manganese-bound complexes. J Mol Biol 407: 556-570.

Regulski EE, Moy RH, Weinberg Z, Barrick JE, Yao Z, Ruzzo WL, Breaker RR. 2008. A widespread riboswitch candidate that controls bacterial genes involved in molybdenum cofactor and tungsten cofactor metabolism. Mol Microbiol 68: 918-932.

Ren A, Rajashankar KR, Patel DJ. 2012. Fluoride ion encapsulation by Mg2+ ions and phosphates in a fluoride riboswitch. Nature 486: 85-89.

Rodionov DA, Vitreschak AG, Mironov AA, Gelfand MS. 2004. Comparative genomics of the methionine metabolism in Gram-positive bacteria: a variety of regulatory systems. Nucleic Acids Res 32: 3340-3353.

Rodionov DA, Vitreschak AG, Mironov AA, Gelfand MS. 2003. Regulation of lysine biosynthesis and transport genes in bacteria: yet another RNA riboswitch? Nucleic Acids Res 31: 6748-6757.

Rollins SM, Grundy FJ, Henkin TM. 1997. Analysis of cis-acting sequence and structural elements required for antitermination of the Bacillus subtilis tyrS gene. Mol Microbiol 25: 411-421.

Roth A, Nahvi A, Lee M, Jona I, Breaker RR. 2006. Characteristics of the glmS ribozyme suggest only structural roles for divalent metal ions. RNA 12: 607-619.

Roth A, Winkler WC, Regulski EE, Lee BW, Lim J, Jona I, Barrick JE, Ritwik A, Kim JN, Welz R et al. 2007. A riboswitch selective for the queuosine precursor preQ1 contains an unusually small aptamer domain. Nat Struct Mol Biol 14: 308-317.

Schlesinger S. 1967. Inhibition of growth of Escherichia coli and of homoserine O-transsuccinylase by alpha-methylmethionine. J Bacteriol 94: 327-332.

Serganov A, Huang L, Patel DJ. 2009. Coenzyme recognition and gene regulation by a flavin mononucleotide riboswitch. Nature 458: 233-237.

Serganov A, Huang L, Patel DJ. 2008. Structural insights into amino acid binding and gene control by a lysine riboswitch. Nature 455: 1263-1267.

Serganov A, Polonskaia A, Phan AT, Breaker RR, Patel DJ. 2006. Structural basis for gene regulation by a thiamine pyrophosphate-sensing riboswitch. Nature 441: 1167-1171.

Serganov A, Yuan YR, Pikovskaya O, Polonskaia A, Malinina L, Phan AT, Hobartner C, Micura R, Breaker RR, Patel DJ. 2004. Structural basis for discriminative regulation of gene expression by adenine- and guanine-sensing mRNAs. Chem Biol 11: 1729-1741.

Shahbabian K, Jamalli A, Zig L, Putzer H. 2009. RNase Y, a novel endoribonuclease, initiates riboswitch turnover in Bacillus subtilis. EMBO J 28: 3523-3533.

Singh P, Bandyopadhyay P, Bhattacharya S, Krishnamachari A, Sengupta S. 2009. Riboswitch detection using profile hidden Markov models. BMC Bioinformatics 10: 325.

Smith I, Paress P, Pestka S. 1978. Thiostrepton-resistant mutants exhibit relaxed synthesis of RNA. Proc Natl Acad Sci U S A 75: 5993-5997.

Smith KD, Lipchock SV, Ames TD, Wang J, Breaker RR, Strobel SA. 2009. Structural basis of ligand binding by a c-di-GMP riboswitch. Nat Struct Mol Biol 16: 1218-1223.

Smith KD, Shanahan CA, Moore EL, Simon AC, Strobel SA. 2011. Structural basis of differential ligand recognition by two classes of bis-(3'-5')-cyclic dimeric guanosine monophosphate-binding riboswitches. Proc Natl Acad Sci U S A 108: 7757-7762.

Sojka L, Kouba T, Barvik I, Sanderova H, Maderova Z, Jonak J, Krasny L. 2011. Rapid changes in gene expression: DNA determinants of promoter regulation by the concentration of the transcription initiating NTP in Bacillus subtilis. Nucleic Acids Res 39: 4598-4611.

Soukup GA. 2006. Core requirements for glmS ribozyme self-cleavage reveal a putative pseudoknot structure. Nucleic Acids Res 34: 968-975.

Spira B, Silberstein N, Yagil E. 1995. Guanosine 3',5'-bispyrophosphate (ppGpp) synthesis in cells of Escherichia coli starved for Pi. J Bacteriol 177: 4053-4058.

Srivatsan A, Wang JD. 2008. Control of bacterial transcription, translation and replication by (p)ppGpp. Curr Opin Microbiol 11: 100-105.

Stoddard CD, Montange RK, Hennelly SP, Rambo RP, Sanbonmatsu KY, Batey RT. 2010. Free state conformational sampling of the SAM-I riboswitch aptamer domain. Structure 18: 787-797.

Sudarsan N, Barrick JE, Breaker RR. 2003. Metabolite-binding RNA domains are present in the genes of eukaryotes. RNA 9: 644-647.

Sudarsan N, Cohen-Chalamish S, Nakamura S, Emilsson GM, Breaker RR. 2005. Thiamine pyrophosphate riboswitches are targets for the antimicrobial compound pyrithiamine. Chem Biol 12: 1325-1335.

Sudarsan N, Hammond MC, Block KF, Welz R, Barrick JE, Roth A, Breaker RR. 2006. Tandem riboswitch architectures exhibit complex gene control functions. Science 314: 300-304.

Sudarsan N, Lee ER, Weinberg Z, Moy RH, Kim JN, Link KH, Breaker RR. 2008. Riboswitches in eubacteria sense the second messenger cyclic di-GMP. Science 321: 411-413.

Sudarsan N, Wickiser JK, Nakamura S, Ebert MS, Breaker RR. 2003. An mRNA structure in bacteria that controls gene expression by binding lysine. Genes Dev 17: 2688-2697.

Swanton M, Edlin G. 1972. Isolation and characterization of an RNA relaxed mutant of B. subtilis. Biochem Biophys Res Commun 46: 583-588.

Tamayo R, Pratt JT, Camilli A. 2007. Roles of cyclic diguanylate in the regulation of bacterial pathogenesis. Annu Rev Microbiol 61: 131-148.

Thore S, Frick C, Ban N. 2008. Structural basis of thiamine pyrophosphate analogues binding to the eukaryotic riboswitch. J Am Chem Soc 130: 8116-8117.

Thore S, Leibundgut M, Ban N. 2006. Structure of the eukaryotic thiamine pyrophosphate riboswitch with its regulatory ligand. Science 312: 1208-1211.

Tomsic J, McDaniel BA, Grundy FJ, Henkin TM. 2008. Natural variability in S-adenosylmethionine (SAM)-dependent riboswitches: S-box elements in Bacillus subtilis exhibit differential sensitivity to SAM in vivo and in vitro. J Bacteriol 190: 823-833.

Trausch JJ, Ceres P, Reyes FE, Batey RT. 2011. The structure of a tetrahydrofolate-sensing riboswitch reveals two ligand binding sites in a single aptamer. Structure 19: 1413-1423.

Tuerk C, Gold L. 1990. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249: 505-510.

Vicens Q, Mondragon E, Batey RT. 2011. Molecular sensing by the aptamer domain of the FMN riboswitch: a general model for ligand binding by conformational selection. Nucleic Acids Res 39: 8586-8598.

Vinella D, Albrecht C, Cashel M, D'Ari R. 2005. Iron limitation induces SpoT-dependent accumulation of ppGpp in Escherichia coli. Mol Microbiol 56: 958-970.

Vogel U, Jensen KF. 1994. The RNA chain elongation rate in Escherichia coli depends on the growth rate. J Bacteriol 176: 2807-2813.

Vrentas CE, Gaal T, Berkmen MB, Rutherford ST, Haugen SP, Vassylyev DG, Ross W, Gourse RL. 2008. Still looking for the magic spot: the crystallographically defined binding site for ppGpp on RNA polymerase is unlikely to be responsible for rRNA transcription regulation. J Mol Biol 377: 551-564.

Wabiko H, Ochi K, Nguyen DM, Allen ER, Freese E. 1988. Genetic mapping and physiological consequences of metE mutations of Bacillus subtilis. J Bacteriol 170: 2705-2710.

Wachter A, Tunc-Ozdemir M, Grove BC, Green PJ, Shintani DK, Breaker RR. 2007. Riboswitch control of gene expression in plants by splicing and alternative 3' end processing of mRNAs. Plant Cell 19: 3437-3450.

Wagner R. 2002. Regulation of ribosomal RNA synthesis in E. coli: effects of the global regulator guanosine tetraphosphate (ppGpp). J Mol Microbiol Biotechnol 4: 331-340.

Wakeman CA, Ramesh A, Winkler WC. 2009. Multiple metal-binding cores are required for metalloregulation by M-box riboswitch RNAs. J Mol Biol 392: 723-735.

Waldminghaus T, Fippinger A, Alfsmann J, Narberhaus F. 2005. RNA thermometers are common in alpha- and gamma-proteobacteria. Biol Chem 386: 1279-1286.

Waldminghaus T, Gaubig LC, Klinkert B, Narberhaus F. 2009. The Escherichia coli ibpA thermometer is comprised of stable and unstable structural elements. RNA Biol 6: 455-463.

Waldminghaus T, Heidrich N, Brantl S, Narberhaus F. 2007. FourU: a novel type of RNA thermometer in Salmonella. Mol Microbiol 65: 413-424.

Wang J, Henkin TM, Nikonowicz EP. 2010. NMR structure and dynamics of the Specifier Loop domain from the Bacillus subtilis tyrS T box leader RNA. Nucleic Acids Res 38: 3388-3398.

Wang J, Nikonowicz EP. 2011. Solution structure of the K-turn and Specifier Loop domains from the Bacillus subtilis tyrS T-box leader RNA. J Mol Biol 408: 99-117.

Wang JX, Breaker RR. 2008. Riboswitches that sense S-adenosylmethionine and S-adenosylhomocysteine. Biochem Cell Biol 86: 157-168.

Wang JX, Lee ER, Morales DR, Lim J, Breaker RR. 2008. Riboswitches that sense S-adenosylhomocysteine and activate genes involved in coenzyme recycling. Mol Cell 29: 691-702.

Warner DF, Savvi S, Mizrahi V, Dawes SS. 2007. A riboswitch regulates expression of the coenzyme B12-independent methionine synthase in Mycobacterium tuberculosis: implications for differential methionine synthase function in strains H37Rv and CDC1551. J Bacteriol 189: 3655-3659.

Weinberg Z, Barrick JE, Yao Z, Roth A, Kim JN, Gore J, Wang JX, Lee ER, Block KF, Sudarsan N et al. 2007. Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline. Nucleic Acids Res 35: 4809-4819.

Weinberg Z, Regulski EE, Hammond MC, Barrick JE, Yao Z, Ruzzo WL, Breaker RR. 2008. The aptamer core of SAM-IV riboswitches mimics the ligand-binding site of SAM-I riboswitches. RNA 14: 822-828.

Weinberg Z, Wang JX, Bogue J, Yang J, Corbino K, Moy RH, Breaker RR. 2010. Comparative genomics reveals 104 candidate structured RNAs from bacteria, archaea, and their metagenomes. Genome Biol 11: R31.

Welz R, Breaker RR. 2007. Ligand binding and gene control characteristics of tandem riboswitches in Bacillus anthracis. RNA 13: 573-582.

Wendrich TM, Blaha G, Wilson DN, Marahiel MA, Nierhaus KH. 2002. Dissection of the mechanism for the stringent factor RelA. Mol Cell 10: 779-788.

Wendrich TM, Marahiel MA. 1997. Cloning and characterization of a relA/spoT homologue from Bacillus subtilis. Mol Microbiol 26: 65-79.

Wickiser JK, Winkler WC, Breaker RR, Crothers DM. 2005. The speed of RNA transcription and metabolite binding kinetics operate an FMN riboswitch. Mol Cell 18: 49-60.

Wilkinson SR, Been MD. 2005. A pseudoknot in the 3' non-core region of the glmS ribozyme enhances self-cleavage activity. RNA 11: 1788-1794.

Wilson RC, Smith AM, Fuchs RT, Kleckner IR, Henkin TM, Foster MP. 2011. Tuning riboswitch regulation through conformational selection. J Mol Biol 405: 926-938.

Wilson-Mitchell SN, Grundy FJ, Henkin TM. 2012. Analysis of lysine recognition and specificity of the Bacillus subtilis L box riboswitch. Nucleic Acids Res.

Winkler W, Nahvi A, Breaker RR. 2002a. Thiamine derivatives bind messenger RNAs directly to regulate bacterial gene expression. Nature 419: 952-956.

Winkler WC, Cohen-Chalamish S, Breaker RR. 2002b. An mRNA structure that controls gene expression by binding FMN. Proc Natl Acad Sci U S A 99: 15908-15913.

Winkler WC, Grundy FJ, Murphy BA, Henkin TM. 2001. The GA motif: an RNA element common to bacterial antitermination systems, rRNA, and eukaryotic RNAs. RNA 7: 1165-1172.

Winkler WC, Nahvi A, Roth A, Collins JA, Breaker RR. 2004. Control of gene expression by a natural metabolite-responsive ribozyme. Nature 428: 281-286.

Winkler WC, Nahvi A, Sudarsan N, Barrick JE, Breaker RR. 2003. An mRNA structure that controls gene expression by binding S-adenosylmethionine. Nat Struct Biol 10: 701-707.

Wu JJ, Howard MG, Piggot PJ. 1989. Regulation of transcription of the Bacillus subtilis spoIIA locus. J Bacteriol 171: 692-698.

Xiao H, Kalman M, Ikehara K, Zemel S, Glaser G, Cashel M. 1991. Residual guanosine 3',5'-bispyrophosphate synthetic activity of relA null mutants can be eliminated by spoT null mutations. J Biol Chem 266: 5980-5990.

Yansura DG, Henner DJ. 1984. Use of the Escherichia coli lac repressor and operator to control gene expression in Bacillus subtilis. Proc Natl Acad Sci U S A 81: 439-443.

Yocum RR, Perkins JB, Howitt CL, Pero J. 1996. Cloning and characterization of the metE gene encoding S-adenosylmethionine synthetase from Bacillus subtilis. J Bacteriol 178: 4604-4610.

Yousef MR, Grundy FJ, Henkin TM. 2005. Structural transitions induced by the interaction between tRNA(Gly) and the Bacillus subtilis glyQS T box leader RNA. J Mol Biol 349: 273-287.

Yousef MR, Grundy FJ, Henkin TM. 2003. tRNA requirements for glyQS antitermination: a new twist on tRNA. RNA 9: 1148-1156.

Zhang Q, Kang M, Peterson RD, Feigon J. 2011. Comparison of solution and crystal structures of preQ1 riboswitch reveals calcium-induced changes in conformation and dynamics. J Am Chem Soc 133: 5190-5193.

Zhao G, Kong W, Weatherspoon-Griffin N, Clark-Curtiss J, Shi Y. 2011. Mg2+ facilitates leader peptide translation to induce riboswitch-mediated transcription termination. EMBO J 30: 1485-1496.

Zuber P, Losick R. 1987. Role of AbrB in Spo0A- and Spo0B-dependent utilization of a sporulation promoter in Bacillus subtilis. J Bacteriol 169: 2223-2230.

APPENDIX A

EFFECTS OF THE relA MUTANT ALLELE ON B. subtilis S BOX GENE EXPRESSION

A.1 Introduction

Bacteria tightly control the expression of many genes and enzymes when they encounter adverse environmental conditions. The ‘stringent response’ is one of the most important adaptations by which bacteria survive harsh conditions such as nutrient starvation. Along with modulation of gene expression, the stringent response is characterized by growth arrest (Cashel et al. 1996, Magnusson et al. 2005).

One of the most prominent elements of the stringent response is the repression of synthesis of stable RNAs, i.e., ribosomal RNA (rRNA) and transfer RNA (tRNA). The stringent response also activates the expression of certain genes involved in amino acid biosynthesis. This strategy diverts the cell’s resources away from energy-consuming processes such as transcription and translation and promotes the amino acid biosynthetic pathways until nutrient conditions improve. The stringent response is also involved in morphological events of the cell that are predominant during the late growth phase such as sporulation (in B. subtilis) and synthesis of aerial mycelium (in Streptomyces spp.) (Ochi 2007).

The stringent response is associated with a transient increase in the levels of the bacterial alarmones, guanosine tetraphosphate (ppGpp) and guanosine pentaphosphate (pppGpp). These hyperphosphorylated guanosine nucleotides, collectively known as (p)ppGpp, were first identified in E. coli (Cashel and Gallant 1969). (p)ppGpp is synthesized from GTP and ATP by the RelA protein, encoded by the relA gene (Haseltine et al. 1972). In E. coli, two homologous enzymes, RelA and SpoT, are responsible for the control of intracellular concentrations of (p)ppGpp.

Amino acid limitation usually results in an increase in uncharged tRNA molecules. The RelA protein associates specifically with the ribosome and becomes activated when an uncharged tRNA enters the ribosomal A-site, in the presence of the 50S ribosomal protein L11 (encoded by relC) (Wendrich et al. 2002). SpoT is a bifunctional enzyme that catalyzes both hydrolysis and synthesis of ppGpp and mediates ppGpp turnover (Cashel et al. 1996, Potrykus and Cashel 2008). SpoT synthesizes ppGpp in response to carbon, iron and fatty acid starvation (Spira et al. 1995, Vinella et al. 2005, Xiao et al. 1991). Both activities of SpoT (ppGpp degradation and synthesis) are predicted to take place at the N-terminal domain of SpoT, while the C-terminal domain is predicted to regulate a transition between the degradative and synthetic activities of SpoT, although the regulatory mechanism is not understood clearly (Hogg et al. 2004).

In E. coli, the RNA polymerase (RNAP) is the target of (p)ppGpp binding (Artsimovitch et al. 2004); however, the precise site for ppGpp binding on the RNAP is still unclear, due to conflicting results from structural and mutational analyses (Artsimovitch et al. 2004, Vrentas et al. 2008). Binding of ppGpp to the RNAP reduces the half-life of the open promoter complex and results in inhibition of transcription initiation of the stable RNAs (Barker et al. 2001b, Cashel et al. 1996, Wagner 2002). The transcription factor DksA, which acts as a second effector for the stringent response in E. coli, stabilizes the interaction of RNAP with ppGpp and destabilizes open promoter complexes (Magnusson et al. 2005, Paul et al. 2004, Srivatsan and Wang 2008).

The E. coli rRNA (rrn) operon promoters contain a GC-rich sequence between the -10 and +1 positions, which results in a weak interaction with the RNAP. The rrn operon promoters are sensitive to the concentration of the initiating nucleotide (iNTP) and form unstable open complexes during transcription initiation (Haugen et al. 2006). Association of ppGpp/DksA lowers the stability of all open complexes. Therefore, the intrinsically unstable rrn operon promoter open complexes are destabilized further, resulting in inhibition of transcription initiation that subsequently abolishes transcription of rRNA genes (Barker et al. 2001b).

The E. coli amino acid biosynthetic promoters contain an AT-rich sequence between the -10 and +1 positions, which allows optimal interaction with the RNAP. Open complexes at the amino acid biosynthesis promoters are intrinsically more stable than those at rRNA operon promoters. It has been shown that ppGpp/DksA increases the rate of open complex formation specifically at the amino acid biosynthetic promoters and facilitates transcription initiation of amino acid biosynthetic genes (Paul et al. 2005). Thus, regulation occurs in a promoter-specific manner and is dependent on the inherent kinetic properties of the amino acid biosynthesis and rRNA operon promoters.

In addition to the direct effect, ppGpp can activate the amino acid biosynthetic promoters indirectly by freeing the RNAP from the rRNA operon promoters (Barker et al. 2001a). It has also been proposed that ppGpp acts at the level of transcription initiation by altering the use of sigma (σ) factors. ppGpp is predicted to shift the balance of the released RNAP from transcribing the house-keeping σ70-dependent genes towards genes dependent on alternate σ factors (Srivatsan and Wang 2008).

In the Gram-positive organism B. subtilis, only one gene encoding a RelA-SpoT homolog is present (Wendrich and Marahiel 1997). The relA gene in B. subtilis is implicated in amino acid auxotrophy (Wendrich and Marahiel 1997), competence development, antibiotic production (Inaoka and Ochi 2002) and spore formation (Ochi et al. 1981, Ochi et al. 1982). Recent studies have indicated the presence of two additional ppGpp synthetases in B. subtilis, YjbM (encoded by the yjbM gene) and YwaC (encoded by the ywaC gene) (Nanamiya et al. 2008). YwaC is triggered specifically upon exposure to alkaline environments. However, the role of these enzymes during the stringent response is still not understood (Nanamiya et al. 2008).

The B. subtilis genome contains 10 rRNA operons (as opposed to 7 in E. coli) that are transcribed from a pair of promoters (P1 and P2), both of which are sensitive to the iNTP (Henkin 2002, Krasny and Gourse 2004). The B. subtilis rRNA operon promoters initiate transcription exclusively with GTP (Krasny and Gourse 2004). A recent study compared the different sequence determinants for rRNA operon promoter recognition by E. coli and B. subtilis RNAP and identified that the 3’-region of the B. subtilis rRNA operon promoters is required for iNTP-sensitive transcription initiation by the RNAP in vitro (Sojka et al. 2011).

During the stringent response triggered by amino acid starvation, the intracellular concentration of GTP decreases, while that of ATP increases (Lopez et al. 1981). In addition, ppGpp inhibits production of GTP by targeting an enzyme (inosine monophosphate dehydrogenase) that catalyzes an early step in GTP biosynthesis (Krasny and Gourse 2004). An increase in ppGpp during the stringent response results in specific inhibition of the activity of rRNA operon promoters containing a G at the transcription start-site (+1G), while activity of a promoter with +1A is not affected. As B. subtilis rRNA operon promoters require a +1G to respond to ppGpp and the reduction in GTP concentration results in a specific downregulation of rRNA operon promoter activity, it is suggested that ppGpp’s effect on rRNA transcription initiation is indirect (Krasny and Gourse 2004).

Cells that are unable to repress stable RNA synthesis during amino acid starvation conditions have been termed ‘relaxed’ (rel) mutants. Several mutations that abolish ppGpp synthesis have been identified in the relA and relC genes in B. subtilis (Price and Gallant 1982, Smith et al. 1978, Swanton and Edlin 1972). B. subtilis relA mutants exhibit partial methionine auxotrophy and complete auxotrophy for the branched-chain amino acids isoleucine, leucine and valine (Wendrich and Marahiel 1997). Previous results from our laboratory indicated that a B. subtilis relA point mutant failed to exhibit a typical S box response. S box gene expression was not induced in the relA point mutant during growth in the absence of methionine, but was induced in a strain containing the wild-type relA allele under the same growth conditions (Henkin, unpublished results).

These observations together suggested that the stringent response might influence S box gene regulation. We hypothesized that the in vivo SAM levels in a relA mutant fail to drop, resulting in loss of S box gene expression. The following study describes the experiments conducted to test the hypothesis and the results obtained from the analyses.

A.2 Hypothesis

The relA mutant allele prevents the drop of in vivo SAM pools, thereby resulting in poor S box gene expression during methionine starvation conditions.

A.3 Aim of study

To measure the in vivo SAM pools in a relA mutant strain and compare to an isogenic wild-type strain.

A.4 Materials and methods

A.4.1 Bacterial strains and growth conditions

The B. subtilis strains used in this study were 1A765 (lys trpC2); 1A766 (lys trpC2 relA1); 1A765-MET (lys metB10; this study); 1A766-MET (lys metB10 relA1; this study); BR151 (lys-3 metB10 trpC2); BR151 T+ (lys-3 metB10); TW30 (trpC2 pheA1 ΔrelA::mls) (Wendrich and Marahiel 1997); BR151-DeltaREL (lys-3 metB10 trpC2 pheA1 ΔrelA::mls; this study); ZB307A (SPβc2del2::Tn917::pSK10∆6) (Zuber and Losick 1987); and ZB449 (trpC2 pheA1 abrB703 SPβ-cured) (Nakano and Zuber 1989).

B. subtilis strains were grown on tryptose blood agar base medium (TBAB; Difco, Franklin Lakes, NJ), Spizizen minimal medium (Anagnostopoulos and Spizizen 1961) and 2XYT broth (Miller 1972). Chloramphenicol was added at a concentration of 5 μg/ml. X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; Gold Biotechnologies, St. Louis, MO) was used at 40 μg/ml as an indicator of β-galactosidase activity. Alpha-methyl-methionine (αMM; Sigma, St. Louis, MO) was added at a concentration of 1 µg/ml. All growth was at 37°C.

A.4.2 Genetic techniques

Transformation of B. subtilis was carried out as described previously (Henkin et al. 1990). Chromosomal DNA was prepared using the DNeasy tissue kit (Qiagen, Chatsworth, CA). Wizard columns (Promega, Madison, WI) were used for plasmid preparations. As described previously (Chapter 2), a yitJ-lacZ transcriptional fusion construct in plasmid pFG328 (Grundy and Henkin 1993) was introduced in single copy into the B. subtilis chromosome by recombination into the SPβ prophage carried in strain ZB307A and purified by passage of the phage through strain ZB449 (Nakano and Zuber 1989, Zuber and Losick 1987). The phage carrying the fusion was then introduced into the appropriate host strain as indicated. Strains containing lacZ fusions were grown in the presence of chloramphenicol.

To analyze the effect of the relA1 allele during methionine starvation, we generated modified strains of 1A765 (wild-type) and 1A766 (relA1). We converted these strains to methionine auxotrophs by removing the trpC2 allele and selecting for a metB10 allele by congression (congression is defined as a co-transformation of a DNA fragment having a selectable marker and a separate DNA fragment encoding the phenotype of interest). Chromosomal DNA from the strain BR151 T+ (lys-3 metB10) was introduced into 1A765 and 1A766 by transformation. We selected transformants that were able to grow on Spizizen minimal medium plates containing both lysine and methionine. After scoring more than 300 colonies, we obtained a single transformant for each strain displaying the correct phenotype. The final strains were named 1A765-MET (wild-type) and 1A766-MET (relA1).

We also generated a relA null mutant in a methionine auxotrophic background.

Chromosomal DNA from TW30 (ΔrelA) (Wendrich and Marahiel 1997) was isolated and transformed into BR151 to generate the strain BR151-DeltaREL (ΔrelA). The yitJ-lacZ transcriptional fusion was integrated via transduction as described previously (Chapter 2). Cells were subjected to methionine starvation conditions. Samples were collected at the indicated times for SAM pool and β-galactosidase measurement.

A.4.3 β-Galactosidase measurements

Strains containing lacZ fusions were grown in Spizizen minimal medium containing the required amino acids at a concentration of 50 μg/ml until early exponential phase and were then harvested by centrifugation. Cells were resuspended in fresh Spizizen minimal medium in the presence or absence of methionine. Samples were collected at 1 h intervals and assayed for β-galactosidase activity as described previously (Miller 1972) using toluene permeabilization. All starvation experiments and assays were conducted at least twice, and variation was <10%.
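The calculation behind a β-galactosidase measurement can be sketched as follows. This is a minimal illustration that assumes the standard Miller unit formula from Miller (1972); the function names, the optical density readings and the definition of variation used here (difference between two replicates relative to their mean) are assumptions made for the example and are not taken from this study.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Standard Miller formula: 1000 * (OD420 - 1.75 * OD550) / (t * v * OD600).

    t is the reaction time in minutes and v is the culture volume assayed in ml.
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def percent_variation(a, b):
    """Difference between two replicate values as a percentage of their mean.

    Assumed definition; the text does not state how variation was computed.
    """
    mean = (a + b) / 2.0
    return abs(a - b) / mean * 100.0


# Hypothetical readings for two replicate assays of the same sample.
rep1 = miller_units(od420=0.450, od550=0.020, od600=0.500, minutes=20, volume_ml=0.1)
rep2 = miller_units(od420=0.470, od550=0.020, od600=0.500, minutes=20, volume_ml=0.1)

print(f"replicate 1: {rep1:.0f} Miller units")
print(f"replicate 2: {rep2:.0f} Miller units")
print(f"variation: {percent_variation(rep1, rep2):.1f}%")
```

With these illustrative readings the two replicates differ by less than 10%, which is the kind of check implied by the reproducibility criterion stated above.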

A.4.4 Determination of SAM pools in vivo

BR151, 1A765-MET (wild-type) and 1A766-MET (relA1) containing the yitJ-lacZ fusion were grown in Spizizen minimal medium containing methionine until mid-exponential growth. Cells were harvested by centrifugation and resuspended in fresh Spizizen minimal medium in the absence of methionine. Cell samples were collected by filtration at the indicated time points and extracted with 1.0 ml 0.5 M formic acid as described in Chapter 2. The formic acid was removed by lyophilization as described by Ochi and coworkers (Ochi et al. 1981). Cell extracts were tested in an in vitro transcription termination assay using a yitJ template that included the glyQS promoter sequence and compared to a SAM standard curve, as described in Chapter 2 (McDaniel et al. 2006). Samples were also harvested at each time point and assayed for β-galactosidase activity, as described above.

The 1A765 (wild-type) and 1A766 (relA1) methionine prototrophic strains, containing the yitJ-lacZ fusion, were grown in Spizizen minimal medium until mid-exponential phase. Cells were harvested by centrifugation and resuspended in fresh Spizizen minimal medium in the presence of the methionine analog, αMM (1 µg/ml) to mimic methionine starvation conditions. Cell samples for SAM pool determination and β-galactosidase activity were collected and analyzed, as described above.

For BR151-DeltaREL (relA null) containing the yitJ-lacZ fusion, Spizizen minimal medium contained isoleucine, leucine and valine in addition to lysine, tryptophan and methionine (50 µg/ml final concentration) to support growth of the strain carrying the relA deletion allele (Wendrich and Marahiel 1997). Deletion of the relA allele resulted in a severe growth defect. Therefore, the BR151-DeltaREL cultures were grown for ~7-8 h for the outgrowth phase, after which the cells grew for an additional ~12 h in order to reach the mid-exponential growth phase. At mid-log phase, the cells were harvested by centrifugation and resuspended in Spizizen minimal medium containing the required amino acids, except methionine. Samples were collected at the indicated times for measuring SAM pools and β-galactosidase activity, as described above. In vitro transcription termination assays used to calculate the in vivo SAM levels were performed as described in Chapter 2.
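The comparison of termination efficiency to a SAM standard curve can be illustrated with a short sketch. The standard values below are hypothetical, and piecewise-linear interpolation is an assumption made for simplicity; the actual curve and fitting procedure are those described in Chapter 2 and are not reproduced here. The function name sam_from_termination is likewise introduced only for this example.

```python
from bisect import bisect_left

# Hypothetical standard curve: (SAM concentration in uM, % termination).
# Real values would come from reactions run with known SAM concentrations.
STANDARDS = [
    (0.0, 20.0),
    (25.0, 35.0),
    (50.0, 48.0),
    (100.0, 65.0),
    (200.0, 80.0),
    (400.0, 90.0),
]


def sam_from_termination(pct_term, standards=STANDARDS):
    """Estimate SAM concentration (uM) from % termination.

    Uses linear interpolation between neighbouring standards and assumes that
    % termination increases strictly with SAM concentration. Returns None when
    the reading lies outside the range covered by the standards.
    """
    pts = sorted(standards, key=lambda p: p[1])
    terms = [t for _, t in pts]
    if pct_term < terms[0] or pct_term > terms[-1]:
        return None
    i = bisect_left(terms, pct_term)
    if terms[i] == pct_term:
        return pts[i][0]
    (c0, t0), (c1, t1) = pts[i - 1], pts[i]
    return c0 + (c1 - c0) * (pct_term - t0) / (t1 - t0)


# Example: a cell extract giving 72.5% termination in the assay.
print(sam_from_termination(72.5))
```

The value returned is the SAM concentration in the reaction mixture; converting it to an in vivo concentration additionally requires the dilution and cell volume corrections used in Chapter 2, which are outside the scope of this sketch.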

A.5 Results

A.5.1 Methionine prototrophic strains containing a wild-type or relA1 allele exhibit distinct S box-lacZ gene expression profiles, despite similar in vivo SAM pools

The 1A765 (wild-type) and 1A766 (relA1) strains, both containing the yitJ-lacZ S box transcriptional fusion, were grown in Spizizen minimal medium containing the amino acids lysine and tryptophan. As these strains are prototrophic for methionine, we used αMM, a methionine analog, to create methionine starvation-like conditions. Previous studies in E. coli have shown that αMM mimics methionine as a feedback inhibitor of the first enzyme in the methionine biosynthetic pathway (Schlesinger 1967), and this analog was previously shown to result in induction of S box gene expression in B. subtilis (Henkin, unpublished results). Samples were collected at the indicated times for SAM pool determination and β-galactosidase activity measurement.

Figure A.1 Measurement of in vivo SAM pools and β-galactosidase activity. 1A765 (wild-type) and 1A766 (relA1) cells containing a yitJ-lacZ transcriptional fusion were grown until mid-exponential phase, harvested and resuspended in minimal medium containing the methionine analog αMM. Samples were collected at the indicated times. Cell extracts were neutralized by the addition of 1 N KOH, and samples were added to a B. subtilis RNAP in vitro transcription termination reaction mixture containing the yitJ DNA template. Termination efficiency was compared to a standard curve generated using known concentrations of SAM, which was used to calculate the SAM concentration present in the cell extracts. The in vivo SAM levels are plotted on the left Y-axis (µM) and β-galactosidase activity is shown on the right Y-axis (Miller units). Open squares, 1A765 SAM pools; filled squares, 1A765 yitJ-lacZ β-galactosidase activity; open triangles, 1A766 SAM pools; filled triangles, 1A766 yitJ-lacZ β-galactosidase activity.

Consistent with previous unpublished reports from our laboratory, expression of the yitJ-lacZ fusion was induced only in the wild-type strain. The β-galactosidase activity of the yitJ-lacZ fusion in the relA1 background failed to show an increase in expression upon exposure to the methionine analog (Figure A.1, filled triangles and squares). These results show that the relA1 allele affects S box gene expression during growth in the presence of the methionine analog, suggesting that the stringent response plays some role in S box regulation in a methionine prototroph.

At T0, the SAM concentration in the wild-type strain was ~380 µM, 1.7-fold higher than that of the strain containing the relA1 allele. However, SAM pools from both the wild-type and the relA1 strains dropped significantly after 50 min of growth in the presence of the methionine analog (Figure A.1, open triangles and squares). The two strains therefore show similar SAM pool profiles during growth in the presence of αMM, indicating that the drop in in vivo SAM pools is independent of the relA1 allele. Together, the above results indicate that the stringent response affects S box gene expression without affecting the intracellular SAM levels.
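The T0 comparison above implies an approximate starting SAM concentration for the relA1 strain, which can be worked out directly. The short calculation below uses only the two figures given in the text (~380 µM and the 1.7-fold difference); the variable names are introduced for illustration, and the result is an estimate, not a measured value.

```python
# Values taken from the text; both are approximate.
wt_sam_t0_um = 380.0          # wild-type SAM pool at T0 (uM)
fold_wt_over_rela1 = 1.7      # wild-type pool relative to the relA1 pool

# Implied relA1 SAM pool at T0 (uM).
rela1_sam_t0_um = wt_sam_t0_um / fold_wt_over_rela1

print(f"Implied relA1 SAM pool at T0: ~{rela1_sam_t0_um:.0f} uM")
```

Under these figures the relA1 strain would start at roughly 220 µM, so both strains begin the experiment with SAM pools in the hundreds of micromolar before the drop.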

A.5.2 The relA1 mutant allele does not affect S box-lacZ expression in a methionine auxotroph

We subsequently examined the effect of the relA1 mutant allele on S box gene expression during true methionine starvation conditions, in the absence of the methionine analog. For this purpose, we introduced the metB10 allele into strains 1A765 and 1A766 by congression to generate the strains 1A765-MET (wild-type) and 1A766-MET (relA1).

Figure A.2 In vivo expression assay during methionine starvation. The methionine auxotrophic 1A765-MET (wild-type) and 1A766-MET (relA1) containing a yitJ-lacZ transcriptional fusion were grown in minimal medium until mid-exponential phase, harvested and resuspended in minimal medium with or without methionine. Samples were collected at the indicated times and β-galactosidase activity was measured (Miller units). Open symbols, activity in the absence of methionine; filled symbols, activity in the presence of methionine. Squares, 1A765-MET; circles, 1A766-MET (relA1).

Figure A.2 illustrates the increase in β-galactosidase activity of the yitJ-lacZ transcriptional fusion from the 1A765-MET and 1A766-MET (relA1) strains. These data indicated that the relA1 mutant did not exhibit loss of S box gene expression during methionine starvation. This result suggests that S box gene expression in a methionine auxotroph is independent of the stringent response. The increase in S box gene expression during methionine limitation suggested that the SAM pools would drop eventually in both the strains and hence we did not measure the SAM pools from these strains.

The β-galactosidase activity of the S box gene-lacZ fusion in a relA1 methionine auxotroph was different from the result obtained for the methionine prototroph during growth in the presence of the methionine analog. It is possible that the effect of the relA1 mutation on S box gene expression was suppressed in the methionine auxotroph.

Together, these results suggest that the relA1 allele specifically affects S box gene expression in a methionine prototroph.

A.5.3 A relA null strain fails to exhibit high SAM pools during methionine starvation

To test whether eliminating the relA gene from the B. subtilis chromosome would have any effect on total SAM pools or on S box gene expression, we generated the BR151-DeltaREL strain. Cells containing the yitJ-lacZ fusion were subjected to methionine starvation and samples were collected at the indicated times for SAM pool and β-galactosidase measurement.

Figure A.3 Measurement of in vivo SAM pools and β-galactosidase activity. BR151-DeltaREL cells containing a yitJ-lacZ transcriptional fusion were grown until mid-exponential phase, harvested and resuspended in minimal medium without methionine. Samples were collected at the indicated times. Cell extracts were neutralized by the addition of 1 N KOH, and samples were added to a B. subtilis RNAP in vitro transcription termination reaction mixture containing a yitJ DNA template. Termination efficiency was compared to a standard curve generated using known concentrations of SAM, which was used to calculate the SAM concentration present in the cell extracts. The in vivo SAM levels are plotted on the left Y-axis (µM) and β-galactosidase activity is shown on the right Y-axis (Miller units). Filled squares, in vivo SAM pools; filled triangles, β-galactosidase activity.

Figure A.3 illustrates that the SAM pools from the strain containing the relA deletion dropped ~1 h after removal of methionine and the β-galactosidase activity increased concurrently, consistent with previous results from a methionine auxotroph containing an intact relA gene (Tomsic et al. 2008). These results suggest that, like the relA1 allele, removal of the relA gene does not influence the overall SAM pools and therefore does not appear to affect S box gene expression in a methionine auxotroph under these conditions, as was hypothesized.

These results differed from those observed during growth in the presence of the methionine analog using a methionine prototroph (section A.5.1), suggesting that expression of the genes from the S box regulon is independent of the stringent response, specifically in a methionine auxotroph.

A.6 Discussion

The stringent response is exerted by bacterial cells during extremely harsh environmental conditions (such as nutrient starvation) and is associated with a transient increase in levels of the bacterial alarmone, (p)ppGpp, along with reduced levels of cellular GTP. During the stringent response, synthesis of stable RNAs (rRNA and tRNA) is repressed, while amino acid biosynthesis is activated. Cells demonstrating a relaxed phenotype fail to inhibit stable RNA synthesis, even under amino acid starvation conditions.

In the current study, we attempted to identify the effect of a relA mutant allele on S box gene expression from B. subtilis. Previous studies from our laboratory had demonstrated a lack of induction of S box gene expression in a relA point mutant during methionine starvation (Henkin, unpublished results). Another study showed that a B. subtilis relA mutant exhibits partial methionine auxotrophy (Wendrich and Marahiel 1997). These observations together suggested that the stringent response might influence S box gene regulation. One possibility was that the relA mutant allele prevents the drop of in vivo SAM pools.

We therefore measured the intracellular SAM levels in relA1 and relA null mutants and compared them to isogenic wild-type strains. We began by analyzing isogenic methionine prototrophic strains with and without the relA1 point mutation.

These strains exhibited a difference in the β-galactosidase activity of the yitJ-lacZ fusion, such that the reporter activity in the wild-type strain was derepressed in the presence of the methionine analog, while the reporter activity in the relA1 strain was repressed partially (Figure A.1). These results were consistent with previous studies performed using similar methods (Henkin, unpublished results). However, the SAM levels in both the strains dropped (Figure A.1, left Y-axis). The drop was seen ~1 h after growth in the presence of the methionine analog. The above results therefore indicate that the stringent response affects S box gene expression in a methionine prototroph without affecting the intracellular SAM levels.

We further examined the effect of the relA1 allele on S box gene expression under true methionine starvation conditions, without the interference of the methionine analog. We analyzed isogenic methionine auxotrophic strains (constructed specifically for this study) with and without the relA1 allele. Results from the β-galactosidase assay showed that yitJ-lacZ expression in both the wild-type and mutant strains increased during methionine limitation (Figure A.2). These results suggested that the SAM pools would drop eventually in both strains and so we did not measure the in vivo concentration of SAM. The above results indicate that the observed changes in S box gene expression and in vivo SAM concentrations in the methionine auxotrophic background are independent of the relA1 point mutation and therefore independent of the stringent response. The β-galactosidase activity was not affected in the relA1 point mutant during methionine starvation and was distinct from the β-galactosidase activity seen for the methionine prototroph strain containing the relA1 allele. It is possible that the relA1 allele affects S box gene expression only in a prototrophic background under the conditions tested.

Subsequently, we investigated the SAM pools in a methionine auxotrophic background with a relA deletion. This ruled out any possible influence of partial ppGpp synthesizing activity associated with the relA1 allele. Depleting methionine from the culture medium led to a rapid drop in SAM pools, below the limit of detection (25 µM), ~2 h after removal of methionine with a subsequent increase in β-galactosidase activity (Figure A.3). These results were consistent with previous data obtained from a methionine auxotroph containing an intact relA gene. The current results indicate that the SAM pools dropped even in a strain from which the relA gene was eliminated, suggesting that the lack of the RelA protein or ppGpp does not alter the in vivo SAM pools in the cell. We can therefore conclude that the S box gene expression in a methionine auxotroph is induced in a stringent response-independent manner.

A possible explanation for the responses seen for the wild-type and the relA null mutants is the difference in growth rates between the two strains. The relA null mutant in the current study showed a severe growth defect, consistent with previous reports (Wendrich and Marahiel 1997). As the relA null cells in the current study grew for ~20 h before reaching mid-exponential phase, the chance of obtaining spontaneous suppressor mutants that are able to overcome the growth rate defect increases. Consequently, such suppressor mutants could exhibit wild-type-like expression of S box genes. To verify the presence of such suppressor mutants, samples from wild-type and relA null cultures were collected at different times during the lag and log phases. These samples were serially diluted, plated on rich medium and incubated overnight at 37°C. As expected, the wild-type colony sizes appeared normal, while those of the relA deletion mutants were tiny (data not shown). There was no evidence of large colonies or colonies with unusual morphologies on the plates with the relA deletion culture, suggesting the absence of any suppressor mutants. Hence, we can conclude that the observed S box gene expression in the BR151-DeltaREL strain appears to be independent of the stringent response.

A previous study that characterized the B. subtilis stringent response using proteome and transcriptome analysis also reported that induction of S box gene expression exhibited by a wild-type and relA deletion mutant is independent of the RelA protein (Eymann et al. 2002). These results indicate the involvement of specific regulators that are independent of RelA or ppGpp. Additional analyses will be required to confirm the effect of the relA allele on S box gene expression. Identification and characterization of other factors such as the levels of (p)ppGpp (in the case of the relA1 mutant), GTP or the rate of transcription will help to further elucidate the effect of the stringent response on the S box regulatory system.
