Characterization of Two Small RNAs of Streptococcus mutans UA159

by

Andrew Latos

A thesis submitted in conformity with the requirements for the degree of Master of Science Graduate Department of Dentistry University of Toronto

© Copyright by Andrew Latos, 2016

Characterization of Two Small RNAs of Streptococcus mutans UA159

Andrew John Latos

Master of Science, Graduate Department of Dentistry University of Toronto 2016

I. Abstract

Small RNAs (sRNAs) regulate several processes including metalloregulation, acid tolerance, biofilm formation, and virulence. Streptococcus mutans UA159 is an etiological agent of dental caries. Several sRNAs were found in intergenic regions of S. mutans and two were characterized. The first characterized sRNA, SurC, was located within the locus between dnaK and dnaJ (~190 nucleotides) and was developmentally regulated. The second characterized sRNA, MrrC (~420 nucleotides), a metal-responsive RNA, was located between pyrG and fbaA. In THYE, no change in growth rate was observed for either ΔSurC or ΔMrrC. In chemically defined medium, the growth rate of ΔMrrC was significantly reduced at pH 5.0 and in 4.0% ethanol. Both ΔSurC and ΔMrrC had defects in biofilm formation, acid tolerance response, and transformation frequency. Identification of sRNAs and understanding their function within S. mutans is necessary to provide insight into the molecular control of virulence for this caries-forming organism.

II. Acknowledgments

I would like to thank my supervisor, Dr. Dennis Cvitkovitch, for providing me the opportunity to perform microbial research. I appreciate the challenges, support, and guidance given during the duration of my research.

I would like to thank my committee members, Dr. Tara Moriarty and Dr. Céline Lévesque for intellectually challenging me and for their mentorship.

I especially thank everyone in the Cvitkovitch lab for their support throughout the duration of this project. Milos Legner, Martha Cordova, Dilani Senadheera, Gursonika Binepal, Anca Serbanescu, Kamna Singh, Marie-Christine Kean, Kirsten Krastel, and Iwona Wenderska have all provided support, friendship, and guidance throughout my project.

I thank my family and friends for kindly listening to my commuting stories and supporting me.

I also thank my loving wife, Dr. Janice L. Strap. Her enthusiasm inspired me to go to graduate school and live my dream of performing research. She always believed in my efforts and pushed me to live up to the family motto, “excelsior scientia”.

Non progredi est regredi. Unknown

Sator Arepo tenet opera rotas. Unknown

Whatever may come And whatever may go That river's flowing Peter Brian Gabriel

All is quiet on New Year's Day A world in white gets underway I want to be with you, be with you night and day Nothing changes on New Year's Day Paul David Hewson

III. Table of Contents

I. Abstract
II. Acknowledgments
III. Table of Contents
IV. List of Tables
V. List of Figures
VI. List of Abbreviations

1 Literature Review
1.1 Introduction
1.2 Nontranscribed regions
1.3 Types of Transcriptionally Active Regions
1.3.1 Examples of Transcriptionally Active Regions
1.3.2 Small Proteins and Peptides
1.3.3 TARs Located Within Annotated Genes
1.3.4 Antisense RNA
1.3.5 Riboswitches
1.3.6 Small RNAs
1.4 Small RNAs Form Regulatory Complexes with Proteins
1.4.1 Small RNA Regulation by Hfq
1.4.2 Small RNA Regulation Involving Transcription and Sigma Factors
1.5 Regulation by sRNA-mRNA Complexes
1.5.1 RyhB Iron Concentration Regulation
1.5.2 OxyS Regulation by Oxidative Stress
1.5.3 SgrS Regulates Glucose-Phosphate Stress
1.5.4 RNAIII Regulates Virulence Genes
1.6 Streptococcus mutans as a Model Organism
1.6.1 The Oral Environment and S. mutans
1.6.2 Virulence Properties of S. mutans
1.6.3 S. mutans and Environmental Response Pathways
1.6.4 Expression Studies of sRNA in S. mutans UA159

2 Objective and Hypothesis
2.1 Statement of the problem
2.2 General Hypothesis
2.3 Primary Objectives

3 Materials and Methods
3.1 Bacterial Strains, Growth Conditions, and Plasmids
3.2 Intergenic Knockout Construction
3.3 Isolation of RNA, Northern Blot Detection, and qRT-PCR
3.4 Rapid Amplification of cDNA Ends (RACE)
3.5 Transformation Assay
3.6 Biofilm Quantification
3.7 Acid Tolerance Response Assay
3.8 Antibiotic Minimum Inhibitory Concentration Assay
3.9 Growth Rate analysis
3.10 Statistical analyses
3.11 Bioinformatics
3.11.1 Inverted Repeat Searches Using EMBOSS
3.11.2 Promoter and Terminator Predictions
3.11.3 Multiple Em for Motif Elicitation (MEME)

4 Results
4.1 Identification of Transcriptionally Active Regions
4.2 Analysis of SurC (smu82-83)
4.3 Analysis of the intergenic region between pyrG and tRNALeu19
4.3.1 The expression of the TARs located within IGRpyrG-tRNALeu19
4.3.2 5’ and 3’ RACE Results of IGRpyrG–tRNALeu19
4.3.3 IGRpyrG-tRNALeu19 in Other Strains of Streptococcus mutans
4.3.4 Growth Under Various Stressors
4.3.5 Growth Under Copper Stress
4.3.6 Bioinformatic Identification of Inverted Repeats and Transcription Termination
4.4 Combined assays and bioinformatics for Biofilm, Transformation, Acid Tolerance Response, motifs, and MIC for ΔSurC and ΔMrrC
4.4.1 Biofilm Assay
4.4.2 Transformation Assay
4.4.3 Acid Tolerance Response
4.4.4 Transcription Factor Motifs
4.4.1 Minimum Inhibitory Concentration

5 Discussion

6 Future Directions

7 References

8 Appendix
8.1 5’ RACE of smu770c-771c and smu1063-1064c

IV. List of Tables

Table 3-1 Bacterial strains and plasmids used in this study.
Table 3-2 Primers used in this study.
Table 4-1 Intergenic regions of S. mutans UA159 screened for expression by Northern Analysis in this study.
Table 4-2 Calculated Doubling Times (hours) of S. mutans UA159 and ΔMrrC in both THYE and MDM under various stressors
Table 4-5 MICs of antibiotics (µg/ml) for S. mutans UA159, ΔMrrC, and ΔSurC

V. List of Figures

Figure 1-1 Genetic map of Streptococcus mutans UA159.
Figure 4-1 Synteny map of the genomic organization of the heat shock chaperones in various
Figure 4-2 Conserved secondary structure of SurC.
Figure 4-3 Northern analysis of SurC (smu82-83) of S. mutans UA159 grown in THYE.
Figure 4-4 Identification of the transcription start and stop sites of smu82-83 (SurC) in S. mutans.
Figure 4-5 Genomic organization of rpoE and pyrG in various Streptococci.
Figure 4-6 Genomic organization of IGRpyrG-tRNALeu19 in S. mutans UA159.
Figure 4-7 Integrity of DNAse treated RNA.
Figure 4-8 Transcriptional analysis of intergenic region MrrC expressed when S. mutans was grown in THYE using np1 as a probe.
Figure 4-9 The expression of MrrC of S. mutans UA159 grown in MDM under pH stress visualized by A) Northern analysis and B) normalized expression.
Figure 4-10 Response of MrrC of S. mutans UA159 to metal stress in MDM.
Figure 4-11 Transcriptional analysis of smu97-99 using np2 of S. mutans UA159 grown in THYE.
Figure 4-12 Identification of transcription start sites for the intergenic region between smu.97 and tRNALeu19 of S. mutans identified by 3’ and 5’ RACE.
Figure 4-13 5’ and 3’ RACE identified transcription start and stop sites for IGRpyrG-tRNALeu19.
Figure 4-14 Synteny map of pyrG - fbaA in S. mutans.
Figure 4-15 T-Coffee alignment of the ORF1 peptide found in MrrCa for four strains of S. mutans.
Figure 4-16 T-Coffee alignment of the ORF2 peptide found in MrrCa for four strains of S. mutans.
Figure 4-17 Growth of S. mutans UA159 (wild-type) and ΔMrrC under pH 5.0 and 4.0 % ethanol stress in THYE and MDM.
Figure 4-18 Growth curves of UA159 and ΔMrrC grown in THYE in the presence of (A) 1.5 mM and (B) 3 mM copper.
Figure 4-19 Normalized final OD600 of UA159 and ΔMrrC grown in the presence of copper in MDM and THYE in both static and agitated cultures.
Figure 4-21 Normalized biofilm yield for UA159, ΔSurC and ΔMrrC grown in MDM.
Figure 4-22 Transformation efficiency of S. mutans wild-type and mutants.
Figure 4-23 Acid Tolerance Response assay of S. mutans UA159, ΔSurC, and ΔMrrC.
Figure 4-24 Meme motifs of the putative transcription factor sites in proximity to IGRpyrG-tRNALeu19 and smu82-83.
Figure 5-1 Regulatory map of the S. mutans UA159 intergenic region containing SurC based on experimental and MEME analysis.
Figure 5-2 Regulatory map of the S. mutans UA159 intergenic region containing MrrC and smu97-99 based on experimental and MEME analysis.

VI. List of Abbreviations

ATR    acid tolerance response
BCAA    branched chain amino acids
Ccp    catabolite control protein
CDM    chemically-defined medium
cDNA    complementary DNA
CFU    colony forming unit
CRISPR    clustered regularly interspaced short palindromic repeats
CSP    competence stimulating peptide
CSPD    disodium 3-(4-methoxyspiro {1,2-dioxetane-3,2'-(5'-chloro)tricyclo [3.3.1.13,7] decan}-4-yl) phenyl phosphate
DEPC    diethyl pyrocarbonate
DIG    digoxigenin
dNTP    deoxynucleoside triphosphate
EDTA    ethylenediaminetetraacetic acid
EMSA    electromobility shift assay
Erm    erythromycin
EtOH    ethanol
GFP    green fluorescent protein
Gtf    glucosyltransferase protein
H2O2    hydrogen peroxide
IGR    intergenic region
LB    Luria broth
MAB    maleic acid buffer
MDM    minimally defined medium
MEME    Multiple Em for Motif Elicitation webserver alignment tool
MIC    minimum inhibitory concentration
msRNA    micro-RNA-sized
NCBI    National Center for Biotechnology Information
nt    nucleotides
ORF    open reading frame
PBS    phosphate buffered saline
PCR    polymerase chain reaction
PPP    Prokaryotic Promoter Prediction
RACE    rapid amplification of cDNA ends
RNA-Seq    RNA sequencing
rRNA    ribosomal RNA
Rsm    repressor of secondary metabolites
SSC    saline sodium citrate buffer
SD    standard deviation
SE    standard error of the mean
Spec    spectinomycin
SRP    signal recognition particle
Sur    small, untranslated RNA
sRNA    small RNA
TCA    tricarboxylic acid
tRNA    transfer RNA
TAR    transcriptionally active region
TCS    two-component regulatory system
THYE    Todd Hewitt Yeast Extract
tracrRNA    trans-activating CRISPR RNA
TYE    tryptone yeast extract
UTR    untranslated region
w/v    weight per volume
XIP    sigX-induced peptide
RPM    revolutions per minute
TBE    tris-borate-EDTA buffer
TE    tris-EDTA buffer
UV    ultraviolet
3’-UTR    3’ untranslated region
5’-UTR    5’ untranslated region

1 Literature Review

1.1 Introduction

The response of a pathogen to the ever-changing host environment is critical to its survival. Bacterial response to external stimuli is, by necessity, regulated by complex and often redundant processes. We are only starting to understand the regulatory complexity involved in the colonization, virulence and pathogenicity of oral bacteria such as Streptococcus mutans. Our current grasp of the regulatory mechanisms that control cellular growth and survival responses has been made possible by the development of new tools and technologies, including advances in sequencing and bioinformatics.

The sequenced chromosomes of over 30,000 prokaryotic genomes are publicly available for analysis (1-4). The genomic sequences are deposited in well-maintained public databases such as the National Center for Biotechnology Information (NCBI) (2), the Genomes OnLine Database (3), and the European Laboratory (4). The sequences are annotated by the depositors using annotations that are biologically valid with names that follow conventional guidelines (5).

Using automated bioinformatics software tools to annotate genomic sequences provides valuable information regarding each species. Unfortunately, current bioinformatics techniques used to identify prokaryotic genes are not capable of identifying all of the transcriptionally active regions (TARs). Small proteins and biologically active peptides are overlooked because the size limits set by the benchmark requirements of the bioinformatics tools frequently fall above the size of the peptides, resulting in the mislabeling of loci (6). Many intergenic regions (IGRs) contain small RNA (sRNA) molecules that are also overlooked due to the lack of a universal bioinformatic method to detect all classes of sRNAs (7). Even when a bioinformatic algorithm is used to identify an sRNA, the expression of that sRNA still requires experimental verification. When comparative genomics are used to identify novel sRNAs through alignment with previously verified sRNAs, the results can be quite successful (8). Bioinformatic approaches have been undertaken with success rates ranging from as low as 3% to as high as 58% verified predictions (7). Although bioinformatics are not always successful in locating small proteins or sRNA, the software approach provides a useful starting point to inform bench-scale studies.

The use of newer technologies such as RNA sequencing (RNA-seq) has revealed that bacterial expression has an as yet unexplored complex richness (9). Measuring the number of expressed sequence reads has shown that many genes are not simply transcribed in one direction; instead, TARs may be expressed in both the sense and the antisense directions and may yield more than one transcript for the same region. After active unannotated TARs are found by RNA-seq, the region still requires verification and characterization. The standard method for confirming expression is Northern analysis.

1.2 Nontranscribed regions

Many of the chromosomal regions between genes in prokaryotes are composed of cryptic non-transcribed elements. Intergenic regions are populated by elements that are not transcribed but that function as genomic signals, such as promoter elements, transcription factor motifs, palindromes, and direct and inverted repeats.
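To make the notion of an inverted repeat concrete, the following is a minimal Python sketch that scans a DNA string for perfect inverted repeats, that is, a short arm followed after a spacer by its own reverse complement. It is an illustration only and is not the search method used in this thesis; the function names, the arm length, and the spacer range are assumptions chosen for the example, and the input is assumed to contain only A, C, G, and T.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq, arm=6, min_gap=3, max_gap=10):
    """Return a list of (left_start, right_start, arm_sequence) tuples.

    A hit is an arm of length `arm` followed, after a spacer of
    min_gap..max_gap bases, by the exact reverse complement of that arm.
    Positions are 0-based. The default parameters are illustrative
    assumptions, not values taken from the thesis.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm - min_gap + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for gap in range(min_gap, max_gap + 1):
            j = i + arm + gap
            if j + arm > len(seq):
                break  # right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# Toy sequence: arm GCGCCA, spacer AAAA, then the reverse complement TGGCGC.
example = "TTGCGCCAAAAATGGCGCTT"
print(find_inverted_repeats(example))
```

A perfect-match scan of this kind misses imperfect repeats with mismatches or bulges, so it should be read as a sketch of the concept and not as a substitute for a dedicated repeat-finding tool.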

Bioinformatic tools allow for the identification of promoters (5, 10) and rho-independent terminators (11) so that their locations can be verified experimentally. Online promoter search tools can be derived from the combined study of several bacterial strains, as seen with the Virtual Footprint promoter search tool (12). Other online search tools were derived from the study of a single strain, as in the cases of the Escherichia coli derived BPROM (13) and the Lactococcus lactis derived Prokaryotic Promoter Prediction (PPP) (14). The PPP webserver was developed to allow users to submit a query sequence and identify the conserved catabolite control protein (CcpA) motifs to predict the location of the -10 promoter binding sites (14).

Bioinformatics has also been used to search for transcription factor motifs in bacterial intergenic regions. One such approach, the webserver RegPrecise, uses manually curated motifs for users interested in transcription factor regulon construction (15). The curated collection of motifs allows for the prediction of the interaction of intergenic regions with known transcription factors by comparative analysis. These bioinformatic tools provide valuable information that can be used to design experiments to validate predicted interactions.

1.3 Types of Transcriptionally Active Regions

1.3.1 Examples of Transcriptionally Active Regions

Advances in sequencing technology have dramatically altered our perception of bacterial transcriptomes. Bacterial gene expression was initially believed to be simpler, as the majority of each genome was thought to consist of protein-coding genes. This belief was based on gene structure, which is the basis for the flow of information in the central dogma of molecular biology. Essentially, DNA is transcribed into mRNA and then the mRNA is translated into proteins by ribosomes. The coding sequences of genes are the regions of DNA that encode proteins and are often referred to as open reading frames (ORFs). The ORFs begin with a start codon of three nucleotides (nt) and end with a stop codon of three nucleotides. In genome sequencing projects, much of the work of genome annotation is performed by using bioinformatics tools (16). The annotation tools assess whether or not an open reading frame should be assigned as a protein coding region. Once the ORF is assigned as a protein coding region, the annotated regions may be examined by targeted nucleotide deletion or full gene deletions to further define their function. This view of bacterial expression has become much more complex as several RNA-seq projects have uncovered numerous TARs which are found in both annotated and unannotated regions (17). Many TARs were initially ignored since they lack defined features and some make poor deletion mutant candidates as the regions are so small (18).
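The ORF definition above (a start codon, a run of in-frame codons, and a stop codon) can be sketched in a few lines of Python. The example below is a simplified illustration and not the algorithm of any particular annotation tool: it scans only the forward strand, treats ATG as the only start codon by default, and uses a hypothetical `min_codons` cutoff of 50 to show how a length threshold can cause short ORFs to be skipped.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=50, start_codons=("ATG",)):
    """Return (start, end, n_codons) for forward-strand ORFs.

    start is the 0-based index of the start codon, end is exclusive and
    includes the stop codon, and n_codons counts the start codon but not
    the stop codon. ORFs shorter than min_codons are discarded; the
    default cutoff is an assumption made for illustration.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon in start_codons:
                start = pos  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, pos + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs


# A toy 6-codon ORF: found with a permissive cutoff, missed with the default.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_codons=5))
print(find_orfs(toy))
```

With the permissive cutoff the toy ORF is reported, while the default cutoff returns an empty list. This mirrors, in miniature, why small coding regions can go unannotated when a minimum length is imposed.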

Five categories of TARs that are frequently missed by sequencing projects include (but are not limited to): small IGR proteins, intragenic proteins, antisense RNAs (asRNAs), riboswitches, and sRNAs within IGRs.

1.3.2 Small Proteins and Peptides

It is a challenge to find and characterize proteins that are smaller than 50 amino acids in length. Two methods to find small peptides are two-dimensional (2-D) gel electrophoresis (18) and liquid-chromatography mass-spectrometry (LC-MS) (19). Both 2-D gel electrophoresis and LC-MS can identify expressed proteins and detect small peptides, but both have several limitations. Not all genes are translated into proteins under the tested conditions, which reduces the number of proteins identified. Also, each technique has a detection limit, so a protein or peptide that is produced but falls under the detection limit will not be identified. For example, by using LC-MS to examine cytosolic protein expression in E. coli, investigators were able to verify that approximately 1100 out of the 4300 genes were expressed as proteins (20). One of the major difficulties of verifying protein expression lies in identifying the unique conditions required to express the large number of hypothetical proteins; in the case of E. coli, only approximately 25% of the proteins are identified (19). Another example illustrates the problem observed with detection limits. In the case of 2-D gel electrophoresis, small proteins are frequently missed since the detection range for proteins lies between 30 kDa (approximately 270 amino acids) and 200 kDa (approximately 1800 amino acids) (6). A search for small proteins in E. coli examined the coding regions of putative small proteins to determine if small proteins were missed by previous proteomic studies (18). The screen probed 18 unannotated regions that the authors bioinformatically predicted would encode small proteins. The confirmation that all 18 regions indeed coded for small proteins demonstrated that many small proteins are missed by sequencing projects (18).
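The mass-to-length figures quoted above for the 2-D gel detection range can be checked with a short calculation. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the text; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed average mass per amino acid residue


def approx_residues(mass_kda, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate the number of amino acids in a protein of a given mass (kDa)."""
    return round(mass_kda * 1000 / residue_mass_da)


def approx_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate the mass in kDa of a protein with n_residues amino acids."""
    return n_residues * residue_mass_da / 1000


print(approx_residues(30))    # lower bound of the quoted 2-D gel range
print(approx_residues(200))   # upper bound of the quoted 2-D gel range
print(approx_mass_kda(50))    # a 50-residue small protein
```

Under this assumption the 30 kDa and 200 kDa bounds correspond to roughly 273 and 1818 residues, in line with the approximate values of 270 and 1800 given above, and a 50-residue protein comes out near 5.5 kDa, well below the lower bound.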

A novel, unannotated small peptide was identified in S. mutans UA159 by Mashburn-Warren et al. (21) in the search for a peptide that regulated transformation that was homologous to the ComS protein in Streptococcus thermophilus and Streptococcus salivarius (22). The location of an unannotated 17 residue protein (ComS) in S. mutans was found and a model was proposed based on peptide analysis experiments suggesting that ComS was post-translationally modified into a six amino acid peptide that induces the expression of sigX (also known as comX), an alternative sigma factor that regulates competence (21). The mature six amino acid form of the peptide within ComS was designated the sigX-inducing peptide (XIP). Investigations into the control XIP has over genetic transformation of S. mutans determined that the peptide was active only when cells were grown in chemically defined medium (23).

1.3.3 TARs Located Within Annotated Genes

The occurrence of overlapping genes has been reported numerous times in viruses and bacteria as a strategy to condense vital coding information in a minimal sequence space (24-26).

For example, the bacteriophage MS2 contains a lysis protein that overlaps the 5’ proximal portion of a replicase protein and the 3’ distal end of a coat protein, both of which are located in a different ORF (27). Overlapping genes are common within bacteria, with 1 – 4 overlapping nt commonly found for genes in the same direction and 5 – 30 overlapping nt for antiparallel genes (26). However, the arrangement of tandem genes overlapping the entire coding region is not commonly observed and was not predicted to occur in bacteria. Nevertheless, there are examples where a gene exists within a gene (28, 29). Two examples, one in the genus Thermus (28) and the second from S. mutans (29), show how an overlapping TAR structure could be missed during routine genome sequence annotation.

An example of a bacterial gene within a gene was described for Thermus thermophilus, in which the 50S ribosomal L34 (rpmH) gene is overlapped by the RNase P (rnpA) gene (30). The T. thermophilus gene rpmH encodes a protein 49 amino acids in length. The rnpA gene overlaps the same region as rpmH and encodes a protein 163 amino acids in length (30). In addition, the rnpA gene was transcribed out of frame with rpmH. Examination of other sequenced species of the genus Thermus showed that all had the same overlapping rnpA and rpmH structure (28, 30).

In S. mutans UA159 a TAR exists within the comX gene and was named as the comX regulatory peptide A (xrpA) gene (29). To understand the role of this peptide, a description of the bacterial stringent response provides some necessary background. The alarmone guanosine pentaphosphate or (p)ppGpp, made by the (p)ppGpp synthase (RelA) protein, accumulates as a result of nutrient limitation which then activates stringent response genes (31, 32). In S. mutans, the rel competence related (rcr) genes form an operon with a transcriptional regulator (rcrR) and two ATP binding cassette (ABC) exporters (rcrPQ) (33). It was determined that the 5’ region of comX was differentially expressed in a polar mutant of rcrR (ΔrcrR-P) compared to a non-polar mutant for the same region (ΔrcrR-NP) (29). The evidence suggested that the comX gene is capable of generating two separate transcripts from the same comX promoter which results in two separate proteins. The full 160 amino acid ComX protein is formed by the ΔrcrR-P mutant, while the 69 amino acid XrpA protein is formed by the ΔrcrR-NP mutant.

The xrpA and rnpA examples illustrate that although a gene has been annotated, the functional TARs that exist within the region may be more complex: an annotated gene may harbor a second unannotated TAR within the same coding region.

1.3.4 Antisense RNA

Regulation by asRNA adds another complex layer to gene regulation (34). Historically, antisense transcription was considered to belong to the background transcriptional noise within the cell (35). However, there is growing evidence that there are substantial numbers of distinct bacterial asRNAs (34). The cis-encoded asRNAs can be as short as 60-300 nt or much longer from 700-3500 nt (35). The abundance of asRNA seems to vary depending on the organism. In Helicobacter pylori, 46% of the annotated open reading frames contained asRNA (36), while less than 2% was reported for Staphylococcus aureus (37). There are few literature reports of asRNAs within bacteria; this paucity of information may be due to the difficulty in validating their presence as well as the difficulty in characterizing asRNA function (34, 38). A short list of examples of asRNA regulatory mechanisms includes altering RNA stability (39), blocking transcription and translation (40), and interfering with gene expression based on proximity to the adjacent gene (41). These three cases are discussed below.

One way in which asRNA can affect RNA stability is exemplified in the well-studied acid-resistance system in E. coli referred to as the glutamic acid decarboxylase (gad) system (42). The gad system is located on an acid fitness island which is comprised of twelve genes as well as the asRNA GadY (39). The GadY transcript overlaps the 3’ untranslated region (3’-UTR) of the gadX gene, an activator of the gad system. GadY alters the RNA stability of the gadXW bicistronic message through posttranscriptional control, which occurs when GadY nucleotides base pair with the 3’-UTR of the gadX mRNA (43). The increase in stability then allows for an accumulation of gadX mRNA, and as a result, the acid resistance genes located downstream are induced (43).

asRNA can also block transcription and translation, as described for the fish pathogen Vibrio anguillarum (40). V. anguillarum causes hemorrhagic septicemic disease in fish, a disease which relies on iron transport by the anguibactin siderophore encoded on the pJM1 plasmid (44). The ferric anguibactin transport (fat) system encodes the iron transport-biosynthesis operon (fatDCBA) which includes two iron-responsive asRNAs, RNAα and RNAβ, both located on the noncoding strand of the fatDCBA-angRT operon (45). The synthesis of RNAα utilizes high-iron concentration in order to stabilize the half-life of RNAα. The expression of RNAα was shown to reduce the translation of the ferric anguibactin receptor, fatA (40). The second asRNA, RNAβ, has been shown to regulate the transcription termination of the full length fatDCBA operon and leads to the termination of the fatA gene (46). The regulation of the proteins of the fatDCBA operon relies on both RNAα and RNAβ, and this antisense regulation affects the iron transport proteins and virulence of V. anguillarum.

A third mechanistic example of asRNA regulation is defined by the strength of the promoter for overlapping regions (47). The best characterized example has been shown for coliphage 186, which produces a regulatory effect referred to as sitting-duck interference (41). In this mechanism, the first RNA polymerase bound at the complex of a sensitive promoter is pushed away after colliding with a second elongating RNA polymerase complex before the first polymerase can complete elongation. Sitting-duck interference was found to regulate the lytic-phase promoter (pR) and the lysogenic promoter (pL) in coliphage 186 (41).

These three examples illustrate how asRNA can affect RNA stability, block transcription or translation, and interfere with promoter activity.

1.3.5 Riboswitches

Another class of TARs is the cis-acting riboswitches. Riboswitches contain receptor domains for ligands required for their activity and are generally found within the 5’ untranslated regions (5’-UTR) of bacterial transcripts (48). Riboswitches have been shown to regulate the expression of downstream genes by conformational changes that occur when binding to their cognate ligands. Several ligands have been identified including: adenosylcobalamin (49), S-adenosylmethionine (SAM) (50), flavin mononucleotide (51), purines (52), and metals (53). The following two examples illustrate the complexity of these regulatory RNAs. In the first example, the metal magnesium is the activating ligand for riboswitches in both Bacillus subtilis (50) and Salmonella enterica (50). In the second example, the SAM riboswitch binds SAM to regulate the biosynthesis of methionine or cysteine in Gram-positive bacteria (54).

Metal-ion-sensing riboswitches have been described in bacteria; they do not bind a biochemical compound but are instead sensitive to the concentration of metal ions within the cell. Much of our understanding of the regulatory role of the magnesium-sensing riboswitch for Gram-positive bacteria was derived from work done on B. subtilis (55) and S. enterica (56) where the M-Box motif is located in the 5’-UTR of Mg2+ transport (mgt) genes and is co-transcribed with the mgtA and mgtE families of genes (55). In Gram-negative bacteria, the magnesium-sensing riboswitch for S. enterica was described as located in the 5’-UTR of mgtA (57). The model proposed for the S. enterica mgtA riboswitch suggests that the riboswitch contains a stem-loop structure that is not formed under low Mg2+ conditions, which allows transcription of mgtA to occur so that magnesium can be transported into the cell. When the Mg2+ concentration within the cell increases, the formation of the stem-loop in the mgtA riboswitch prevents the formation of the mgtA transcript, which results in the reduction of Mg2+ intake. For both B. subtilis and S. enterica, RNA folding studies conducted in the presence and absence of Mg2+ demonstrated that the regulatory role of the magnesium-sensing riboswitch structure depends on the magnesium concentration (55, 57).

The investigation of sulphur metabolism in B. subtilis led to the identification of SAM riboswitches (50, 58). An S-box motif in B. subtilis located at the 5’-UTR of several cysteine and methionine biosynthesis genes was found to be a high-affinity aptamer for SAM. Binding studies showed that when SAM levels are low, an antitermination loop forms allowing for the transcription of downstream genes. When SAM levels are high, the SAM binding pocket forms an anti-antiterminator loop which results in transcriptional termination of the downstream genes (50).

The above examples describe two mechanisms that riboswitches use to act as RNA-based sensors. Riboswitches respond to the intracellular concentration of metals, coenzymes, and nucleobases by regulating the cis-encoded genes. Binding cofactors to regulate gene function was previously thought to occur only for proteins. It has been suggested that these TARs are evolutionarily derived from ancient RNA regulatory systems that allow protein functions to be replaced by RNA molecules.

As one can see, there are a number of TAR categories. The remainder of the literature review will focus on the sRNAs found within IGRs.

1.3.6 Small RNAs

Regulatory small RNAs are referred to as sRNAs and although they are ubiquitous in prokaryotes, the vast majority of sRNAs have not been annotated (59). These soluble molecules are heterogeneous in size (50 – 500 nt) and structure (9). Many sRNAs are non-coding molecules; however, the term “sRNA” is preferred over “non-coding RNA” since several sRNAs (for example SgrS and RNAIII) encode small proteins, and for many the ability to encode proteins has not yet been determined (38). The sRNAs form regulatory complexes that control cell functions and influence gene expression at several stages of cellular metabolism. After transcription, the sRNAs can actively base pair with mRNA, modulate protein translation or stability, as well as confer stability to mRNA (38). They are transcribed either by their own promoter element or an operon promoter that transcribes more than one gene. Most of the sRNAs described in the literature have only recently been functionally identified in bacteria as a result of advances in bioinformatic and experimental approaches (59, 60). It will require effort through multiple approaches to develop our understanding of these regulatory elements.

At one time, the prevailing dogma was that DNA sequences that were not derived by natural selection of the organism belonged to a category referred to as junk DNA (61, 62). The regions between genes were part of this scheme; however, the examination of IGRs has yielded numerous putative sRNAs, few of which have been characterized (8, 36, 63). There are many sRNAs that have captured the attention of researchers. Three examples will be discussed here to illustrate the search for functional TARs within IGRs, namely: RyhB in Escherichia coli (64), the tracrRNA (trans-activating CRISPR RNA) of S. pyogenes (65), and small unknown RNA (SurC) from B. subtilis (66). The study of these sRNAs has yielded insight into the functions of TARs that are located between functioning genes. Further examples and interactions are examined in Sections 1.4 and 1.5.

Some sRNAs found in the IGRs are well conserved, which is the case for the sRNA RyhB, which was first identified in E. coli (64) and then further characterized in both E. coli and S. enterica by searching for sRNAs that were bound to the host factor required for phage Qβ (Hfq) protein (67). In the initial studies of RyhB, its expression was strongly induced upon the addition of an iron chelator, suggesting that the sRNA was involved with metal regulation (64). Further tests revealed that the function of RyhB was the regulation of the ferric uptake regulator (Fur) protein; the presence of Fur repressed RyhB transcription only when iron was present (68). The same study identified several other genes down-regulated by RyhB, especially those encoding proteins that require iron (for example superoxide dismutase and the TCA cycle enzyme aconitase). The work done on RyhB demonstrates that an sRNA is capable of regulating cellular responses by targeting multiple genes, and that such sRNAs are frequently located within a conserved IGR.

Another conserved sRNA in prokaryotes is the tracrRNA. The tracrRNA is a regulatory component of the clustered regularly interspaced short palindromic repeats (CRISPR) and the CRISPR associated (cas) genes. The CRISPR/Cas system has been described as a nucleic-acid based bacterial immunity system in S. thermophilus and is found in over half of all tested bacteria (69). It is made up of spaced palindromic repeats and accompanying cas genes. In S. pyogenes, the CRISPR system requires the tracrRNA to assist with the cleavage of foreign DNA (65). The mature CRISPR RNAs (crRNA) are part of the bacterial immunity pathway. The CRISPR array is made up of palindromic repeat sequences separated by spacers derived from invading mobile genetic elements, and its transcript is referred to as precursor crRNA (pre-crRNA). The tracrRNA associates with the Cas9 protein and base pairs with the repeat sequences of the pre-crRNA, allowing RNase III to cleave the pre-crRNA (65). The existence of the tracrRNA has been shown previously in S. mutans (65). Little is known about its expression in response to external stressors (70), and little is known of the role of the CRISPR system in S. mutans (71); however, recent results have shown that the CRISPR system in S. mutans modulates tolerance to acid, oxidative stress, and heat, and that the system also inhibits plasmid transformation (70).

The third example of an IGR sRNA comes from the B. subtilis study which identified sRNAs that are involved with the regulation of sporulation (66). The study identified SurA and SurC (both named for small, untranslated RNA), as well as polC-ylxS (named for the two bordering genes). Two approaches were used to search for sRNAs. The first approach used a bioinformatics search for RNA structures in IGRs that were conserved across several Bacillus strains. The second approach used a microarray study of the intergenic regions >250 nt in B. subtilis in order to identify IGRs that were differentially regulated during sporulation. The combination of the two approaches yielded the sRNAs SurA, SurC and polC-ylxS. Expression analysis showed that neither SurC nor polC-ylxS was transcribed during the vegetative state of B. subtilis; rather, both were expressed during sporulation. It was also found that the polC-ylxS region contained three separate promoters responsible for six different transcripts. Although a link was identified between the sRNAs and the developmental regulation of sporulation, no function was uncovered for the sRNAs. This is a common concern in the study of sRNAs, as the literature reports several examples demonstrating that sRNAs perform subtle, cryptic roles in gene regulation (9, 38).
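The size-based selection of intergenic regions described above can be illustrated with a minimal sketch. It is not the procedure used in the cited study; the function name, the coordinate convention (1-based, inclusive), and the toy gene coordinates are assumptions introduced only for illustration, and strand is ignored.

```python
from typing import List, Tuple

# Hypothetical annotation record: (gene name, start, end), 1-based inclusive.
Gene = Tuple[str, int, int]
# (left gene, right gene, IGR start, IGR end, IGR length)
Region = Tuple[str, str, int, int, int]


def intergenic_regions(genes: List[Gene], min_length: int = 250) -> List[Region]:
    """Return the gaps between neighbouring genes that are longer than min_length nt."""
    regions: List[Region] = []
    if not genes:
        return regions
    ordered = sorted(genes, key=lambda g: g[1])  # sort by start coordinate
    prev_name, _, prev_end = ordered[0]
    for name, start, end in ordered[1:]:
        length = start - prev_end - 1  # zero or negative when genes abut or overlap
        if length > min_length:
            regions.append((prev_name, name, prev_end + 1, start - 1, length))
        if end > prev_end:  # keep the right-most gene end seen so far
            prev_name, prev_end = name, end
    return regions


# Toy coordinates, not taken from any real genome.
toy_genes = [("geneA", 1, 1000), ("geneB", 1301, 2000), ("geneC", 2100, 3000)]
print(intergenic_regions(toy_genes))
```

In this toy example only the 300 nt gap between geneA and geneB exceeds the 250 nt cutoff; the 99 nt gap between geneB and geneC is discarded. Candidate regions selected this way would still need expression data, such as the microarray analysis described above, before being considered putative sRNAs.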

1.4 Small RNAs Form Regulatory Complexes with Proteins

Regulatory sRNAs operate at the protein, the DNA, and the RNA level of gene regulation. In general, the functions of sRNAs fall within two categories: either the sRNA binds a target protein and as a result modifies the function of the protein (72), or the sRNA binds a targeted mRNA or DNA, which results in the activation or repression of gene expression (73-75). This section examines the interaction of sRNAs with proteins. The best studied sRNA-protein interactions occur between the Hfq chaperone protein and sRNAs (76-79). Other examples include several sRNAs that act as post-transcriptional regulators by interacting with the sigma factors RNA polymerase S and E (RpoS and RpoE) (80-82) or with transcriptional regulators such as CsgD (83, 84) and Lrp (85). The end result is the modification of gene expression by the sRNA.

1.4.1 Small RNA Regulation by Hfq

The RNA chaperone, Hfq, is a well characterized protein that forms a hexameric ring that is capable of binding both mRNA and sRNA transcripts (77, 86). Hfq provides a stabilizing scaffold for the base-pairing of the sRNA with antisense mRNA targets (87). The degradation or protection of sRNAs from ribonuclease activity often depends on the Hfq protein. For example, the E. coli sRNAs RprA and RyhB are protected from RNase E degradation by binding to Hfq (86). In contrast to the protective role provided by Hfq binding, it has also been demonstrated that the sRNAs OxyS (87) and MicC (88) undergo enhanced RNase E degradation upon binding with Hfq. In this manner, Hfq offers a stable location for the post-transcriptional control of sRNA degradation.

The intrinsic ability of Hfq to recruit sRNAs has been exploited to purify sRNA molecules (67). The procedure involves mixing bacterial lysate with the Hfq protein. The RNA that immunoprecipitates in a complex with Hfq is then recovered and processed to remove the protein, and the associated RNA is extracted. The RNA is then converted into cDNA and either hybridized to a microarray (86) or sequenced (67, 89). In both methods, the actively transcribed RNA is detected and compared with the coding genes for that organism, and the transcripts that do not match the previously determined coding regions comprise the pool of sRNAs for that organism. In S. enterica, Hfq has been shown to control nearly a fifth of the transcribed output (67). This immunoprecipitation technique was used to discover over sixty sRNA transcripts in S. enterica (67), twenty sRNAs in E. coli (86), and three sRNAs in Listeria monocytogenes (90).
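The comparison step described above, in which detected transcripts are checked against annotated coding regions, can be sketched as a simple interval filter. This is an illustrative assumption rather than the pipeline used in the cited studies: the function names and coordinates are hypothetical, strand is ignored, and any overlap with a coding region is treated as a match.

```python
from typing import List, Tuple

# Hypothetical record: (identifier, start, end), 1-based inclusive coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one position."""
    return a[1] <= b[2] and b[1] <= a[2]


def candidate_srnas(transcripts: List[Interval], coding: List[Interval]) -> List[Interval]:
    """Keep the transcripts that do not overlap any annotated coding region."""
    return [t for t in transcripts if not any(overlaps(t, c) for c in coding)]


# Toy data, not taken from any real experiment.
coding_regions = [("cdsA", 100, 900), ("cdsB", 1500, 2400)]
detected = [("t1", 150, 400), ("t2", 1000, 1200), ("t3", 1450, 1520)]
print(candidate_srnas(detected, coding_regions))
```

Here only the transcript t2 lies entirely between the two coding regions and is retained as a candidate. A stricter or looser overlap rule, or a strand-aware comparison that would also retain antisense transcripts, would change which transcripts are kept.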

1.4.2 Small RNA Regulation Involving Transcription and Sigma Factors

There are a growing number of studies that have identified sRNAs that act as post-transcriptional regulators that interact with transcription factor proteins and sigma factors (75, 91).

The highly conserved sRNA GcvB found in Gram-negative bacteria regulates genes involved in dealing with external acidic stress and is associated with Hfq (92). The activity of GcvB has been shown to regulate the sigma factor RpoS (91) as well as the global transcription factor Lrp (93). GcvB was initially described as a gene involved with glycine cleavage and was found to regulate two gene targets, oppA and dppA, which are genes that encode portions of the ABC transport system involved with amino acid uptake (92, 93). An interaction with the sigma factor revealed that GcvB regulated the cell’s response to pH by up-regulating levels of RpoS, a central regulator of cellular stress (92). In the GcvB interaction with the leucine-responsive transcription factor protein (Lrp) in S. enterica, it was determined that GcvB is involved in regulating Lrp, along with 20 amino acid transporters, by utilizing a double negative feedback loop (75). The Lrp transcription factor regulates as much as 10% of the genes in E. coli, and the conserved protein is a global regulator in Gram-negative bacteria. In S. enterica, regulation occurs through the addition of the sRNA GcvB to the Lrp network, where it base-pairs with the lrp mRNA in the 5’-UTR. The new model is described as a rewired Lrp circuit with a post-transcriptional level of control by GcvB (93, 94).


In E. coli, the switch between the motile planktonic state and the sessile adhesive biofilm lifestyle depends on the production of flagellar and curli genes (95). The generation and export of the curli polymers is mainly controlled by two curli-specific gene (csg) operons, csgBA and csgDEFG. The csgBA and csgDEFG operons are transcribed in opposite directions and are controlled by the transcriptional regulator CsgD (95). At least five sRNAs have been found to regulate CsgD: the outer membrane regulated OmrA and OmrB (96), the previously mentioned GcvB (97), the multi-cellular adhesive sRNA (McaS) (97), and the cell envelope sensitive RpoS regulator A (RprA) (83). The sRNAs listed here repress csgD expression by base-pairing with the 5’ region of the mRNA. These sRNAs also form a complex with Hfq to post-transcriptionally regulate csgD (84). The presence of multiple sRNA regulators has been hypothesized to offer finer control of the switch between the planktonic and biofilm lifestyle (84).

Bacterial stress response systems are often regulated due to the presence of stressors outside of the cytoplasm (98). In some bacteria, the delta subunit RNA polymerase E (RpoE), often referred to as sigmaE (σE), is an alternative sigma factor which assists in the binding of RNA polymerase to promoter regions and is intimately connected to the extracytoplasmic stress response monitored by sRNA (98). Examples of the extracytoplasmic stress response regulation by RpoE include metal shock (by cadmium, zinc, and copper) (99), oxidative stress (100), and heat shock (101). In E. coli the sRNA RybB requires σE for transcription and RybB negatively regulates σE promoters. This suggests that the sRNA response to extracytoplasmic stress takes the form of an autoregulatory loop (102). Another sRNA in E. coli, MicA, also strictly requires σE, indicating that MicA has a role in stress regulation (103). The two sRNAs, MicA and RybB, are conserved in S. enterica and are also involved in regulation triggered by extracellular stress. Under an envelope stress response, RybB is transcribed by σE (104). RybB also binds the mRNAs for outer membrane proteins, leading to their degradation (104) and preventing misfolded proteins from accumulating in the periplasm. MicA is also regulated in a similar σE-dependent manner.

It is increasingly apparent that one contribution of sRNA is to provide regulatory feedback to transcription and sigma factors. The effects of the sRNA interaction with both factors are complex and difficult to decipher due to the dynamic nature of the sRNA expression and cryptic functions of the sRNA.

1.5 Regulation by sRNA-mRNA Complexes

There are several regulatory mechanisms utilized by bacterial sRNAs to regulate mRNA expression including the control of transcription as well as translation. The majority of characterized sRNAs bind the target mRNA on the ribosome binding site (RBS) at the 5’ UTR, which obstructs the binding of the 30S ribosome, effectively blocking translation (105). However, there are examples where an sRNA binds the 5’-UTR to remove an inhibitory secondary structure, stimulating the translation of a targeted region, as seen with the sRNAs RprA and DsrA, which are responsible for an increase in the translation of the sigma factor RpoS (81).
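The antisense pairing between an sRNA and the RBS of its target can be illustrated with a minimal sketch that searches an mRNA leader for the longest stretch able to form uninterrupted Watson-Crick pairs with an sRNA. This is only a toy model: the sequences are hypothetical, G-U wobble pairs and secondary structure are ignored, and dedicated RNA-RNA interaction tools should be used for real target prediction.

```python
from typing import Tuple

# Watson-Crick complements for an RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_pairing(srna: str, mrna: str) -> Tuple[int, str]:
    """Find the longest mRNA segment that pairs contiguously with the sRNA.

    Returns the 0-based start of the segment in the mRNA and the segment itself.
    Implemented as a longest-common-substring search between the mRNA and the
    reverse complement of the sRNA.
    """
    target = reverse_complement(srna)
    best_len, best_end = 0, 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(mrna) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if mrna[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the current run of matches
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    start = best_end - best_len
    return start, mrna[start:best_end]


# Hypothetical sequences: an mRNA leader with an AGGAGG-like RBS and a start codon,
# and a short sRNA fragment carrying the antisense sequence.
toy_mrna = "GGAUCAAGGAGGUAACAUAUGGCU"
toy_srna = "GCACCUCCUGC"
print(longest_pairing(toy_srna, toy_mrna))
```

In this toy case the longest complementary stretch covers the AGGAGGU segment of the leader, which is the situation in which 30S binding would be expected to be occluded.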

Four examples are described here to illustrate select mechanisms of sRNA regulating their mRNA targets. The sRNA examples examined here include: the iron regulated RyhB (106), the hydrogen peroxide regulated OxyS (107), the phosphosugar stress response regulated sugar transport-regulating sRNA (SgrS) (108), and the virulence mediator RNAIII (109). These sRNAs are examined along with the sRNA-mRNA complexes formed when the sRNA binds selected target regions.

1.5.1 RyhB Iron Concentration Regulation

The sRNA RyhB represents an example of metal regulation by sRNA in E. coli (64). RyhB is 90 nt long and is conserved in several Gram-negative bacteria. RyhB stimulates rapid degradation of several iron storage and iron uptake proteins when the cell is under iron depletion, which allows essential processes that require iron to continue. The sRNA-mRNA pair is degraded by a complex formed with the RNase E ribonuclease, removing RyhB when iron levels are restored, which then allows the iron-using proteins to accumulate iron. RyhB pairs with its mRNA target at the ribosome binding site of the mRNA, accompanied by the Hfq chaperone protein, which is essential for the activity of RyhB (78). When iron is plentiful, RyhB is repressed by the ferric uptake regulator (Fur) protein (64); however, the regulation is a complex negative feedback loop, as RyhB also represses the expression of Fur (106). Although RyhB is well known for its repressor activity, it also acts as an activator of shikimate permease (shiA), using an anti-antisense mechanism to disrupt an inhibitory structure at the ribosome binding site (110). The ShiA protein is believed to be involved in the production of a siderophore required to acquire iron. The intricate control mechanism exhibited by RyhB in response to external, physical stimuli is comparable to the post-transcriptional response of protein transcription factors; however, sRNAs like RyhB can react more quickly due to the specific shutdown of expression (111).

1.5.2 OxyS Regulation by Oxidative Stress

In a manner similar to the RyhB response to iron depletion, the sRNA OxyS in E. coli senses an oxidative environment and regulates the cellular response to oxidative stress (112). OxyS is 109 nt in length and was given the name OxyS due to its upregulation by the adjacent oxidative stress response protein, OxyR. Immediately upon exposure to hydrogen peroxide, OxyR upregulates OxyS, which then regulates approximately 40 genes in response to oxidative stress (113). OxyS was one of the first sRNAs described that regulated a cellular response yet was not translated into protein. This was surprising as most regulatory functions were at the time thought to be protein-controlled. Like many other sRNAs in E. coli, OxyS requires the Hfq chaperone protein for its activity (113). An interesting example of OxyS regulation is the repression of the synthesis of formate hydrogen lyase (FhlA), a transcriptional activator of formate metabolism operons (107). The structure of the OxyS-fhlA binding complex is interesting as it has been described to form an unusual loop-loop kissing complex. There are several notable differences between OxyS regulation and regulation by other sRNAs. As previously mentioned for several RyhB-mRNA complexes, the transcripts are degraded by RNase E (78). In contrast, OxyS base pairs with its mRNA targets near the translation initiation site, which blocks translation but leaves the mRNA intact. Also, when comparing RprA and OxyS regulation of RpoS, RprA enhances RpoS translation by binding upstream of the start codon (78). For OxyS, it is the sequestration of Hfq by OxyS and not the binding of rpoS to OxyS that downregulates RpoS.

1.5.3 SgrS Regulates Glucose-Phosphate Stress

Bacteria utilize the phosphoenolpyruvate-dependent phosphotransferase system (PTS) to take up glucose, which can then be phosphorylated to the essential intermediate glucose-6-phosphate (G-6-P) (108). A buildup of G-6-P is toxic and is referred to as phosphosugar stress due to the destabilization of the PTS genes (108). In Gram-negative bacteria the conserved Hfq-binding sRNA SgrS (240 nt) accumulates because its synthesis is activated by SgrR, which in turn is activated by phosphosugar stress (114). The buildup of SgrS results in the degradation of ptsG mRNA by RNase E, which ultimately reduces the cellular level of G-6-P (108). The riboregulation occurring between SgrS and ptsG involves the formation of base pair interactions along the ribosome-binding site of ptsG, which results in the inhibition of translation as well as the degradation of ptsG mRNA. In S. enterica, SgrS has three other mRNA targets: the mannose uptake system (manXYZ), a virulence factor (sopD), and a putative phosphatase (yigL) (115). By base-pairing with SgrS, both sopD and manXYZ are degraded by RNase E in an Hfq complex similar to the one found to degrade ptsG. However, unlike the other three negatively regulated targets, yigL is activated by SgrS regulation (116). After binding Hfq, SgrS base-pairs with and stabilizes the bicistronic pldB-yigL mRNA, which is cleaved by RNase E, allowing yigL to be translated. The current model suggests yigL assists in cell detoxification by neutralizing sugars in preparation for their efflux.

SgrS also encodes the small 43 amino-acid protein SgrT, which is produced during G-6-P stress (117). SgrT was found to block glucose uptake without affecting PtsG levels, so the mechanisms of regulation for SgrT and SgrS were found to be different. The production of an sRNA and a small protein within the same locus occurs in other sRNA examples, such as the Staphylococcus aureus sRNA, RNAIII.

1.5.4 RNAIII Regulates Virulence Genes

Perhaps one of the longest sRNAs characterized from Gram-positive bacteria so far is the Staphylococcus aureus RNAIII at 514 nt (118). RNAIII is a component of a regulatory locus that monitors population density via a complex gene-regulated cell-signaling system, a process referred to as quorum sensing. RNAIII plays a key regulatory role in the control of several virulence genes (119). RNAIII also contains the virulence gene δ-hemolysin (hld), which encodes a 26 amino acid peptide (109). Data suggest that, similar to the dual functions of SgrS, RNAIII functions as an activator as well as a repressor, switching between the expression of secreted factors and the inhibition of surface proteins (118). Activation occurs for the virulence gene α-hemolysin (hla) when trans-encoded RNAIII base-pairs and forms a complex with hla (109). In the absence of RNAIII, the hla ribosome binding site binds an antisense region, inhibiting the translation of hla. The bound RNAIII-hla forms an anti-antisense complex that promotes ribosome binding and increases the translation of hla. An example of repression of virulence genes by RNAIII occurs with the transcription factor repressor of toxins gene (rot) (120). An RNAIII-rot complex is formed by base-pairing between two hairpin loops, which inhibits ribosome binding and prevents rot translation. The inhibitory complex is similar to the loop-loop kissing formation seen in the OxyS-fhlA complex. These examples demonstrate the regulatory influence RNAIII exerts over the pathogenic hemolysin proteins as well as bacterial toxin virulence genes.

These four sRNA examples of RyhB, OxyS, SgrS, and RNAIII illustrate how sRNAs respond to environmental changes that lead to the regulation of stress networks or control of metabolic pathways. Use of sRNA regulation may provide the cell faster regulation since the protein translation step is bypassed (121). Bacteria must adapt to quickly changing environments and sRNAs play a role in the coordinated adaptation to metal, external stressors, metabolism requirements, as well as regulating the complex virulence of pathogens.

1.6 Streptococcus mutans as a Model Organism

The human mouth provides a symbiotic home to a complex community of over 700 species of bacteria (122). Most of the bacteria reside in the dental plaque, due to the adherence to the salivary pellicle on the surface of the tooth (123). While the oral flora is an integral component of a healthy oral environment, oral dwelling bacteria are not entirely benign. Oral bacteria may exist as commensal organisms as well as opportunistic pathogens which are etiological agents of dental caries and periodontal disease (124, 125).

S. mutans is a Gram-positive, catalase negative, nonmotile streptococcus and is frequently isolated from human dental caries (124). The pathogen has been associated with the formation of caries as well as infective endocarditis (124, 126). S. mutans is a facultative anaerobe capable of respiration in the presence of oxygen (126). S. mutans is capable of fermenting various carbohydrates including sucrose, glucose, fructose, mannitol, sorbitol, and inulin (126). The ability of S. mutans to ferment mannitol and sorbitol sets it apart from other streptococci. Even though there are several other dominant oral bacterial strains, S. mutans has long been considered a dental pathogen as it has been found to reside in human carious lesions (124, 126-128).

S. mutans UA159 is considered to be a model organism as it is well characterized both genetically and in its physiology during stress responses. It has been investigated for its biofilm formation, involvement in dental caries, utilization of carbohydrates, the regulation of stress tolerance, and genetic competence (129). Three other strains of S. mutans also isolated from carious dental lesions, namely S. mutans NN2025 (130), S. mutans LJ23 (131), and S. mutans GS-5 (132), have recently been sequenced. All four strains of S. mutans have a low GC content (~37%). The complete chromosomal sequence of S. mutans UA159 was reported in 2002 (133). The sequence is over 2 million nt in length and contains 1963 predicted proteins and five rRNA operons. Approximately 54% of the genes are coded on the positive strand, as can be seen in Figure 1-1. Despite this microbe having been studied since it was originally isolated by Clarke in 1924 (134), much remains to be discovered.

1.6.1 The Oral Environment and S. mutans

Formation of biofilm is an essential trait for S. mutans to colonize the surface of teeth in the form of dental plaque. In this regard, S. mutans is not alone as there are a multitude of bacterial species found in dental plaque forming a diverse and complex habitat (135). Thousands of phylotypes form a multilayered biofilm (136), stratified according to optimal ecological niche (123). Understandably, the development of dental plaque communities is a complex process.


Dental caries is a common infection in humans. Caries is formed by enamel breakdown due to exposure to acid generated by acid tolerant bacteria fermenting low-molecular-weight carbohydrates (124). Early work on caries research in the 1940s by R. M. Stephan verified that oral pH levels would immediately drop following a glucose rinse (137). Stephan suggested that the drop in pH was due to the bacterial metabolism of carbohydrates (137). Later work confirmed the hypothesis and determined that the rates of demineralization and remineralization of tooth enamel were balanced, but could progress to the acidic demineralization state found in caries if the pH was held at or below the critical pH of ~5.5 (138). The pH level of 5.5 was referred to as the critical pH since bacteria that generate an acidic environment, a property referred to as acidogenicity, can lower the pH below this level and establish the conditions required to demineralize enamel. A plaque community that is acidogenic and acid-tolerant (aciduric) is capable of surviving the acidic conditions as well as outcompeting bacteria that are not equipped to survive in acidic conditions, furthering the establishment of an aciduric flora (124). Due to the prolonged exposure to low pH, the protective enamel layer of the tooth is slowly solubilized and eventually the damage forms a carious lesion.

As one of the middle colonizing members of the complex dental plaque community, S. mutans adheres to the dental pellicle found on the surface of the tooth and forms attachments to other early colonizing bacteria (139). Depending on pH and oxygen content, the carbohydrates consumed by S. mutans are metabolized to lactic acid, acetic acid, and formic acid (140-142). Although S. mutans is not the only bacterium isolated from caries, as previously mentioned, S. mutans has been associated with human caries in several studies.


[Figure: circular genetic map of Streptococcus mutans UA159 (2032925 bp), with a scale marked every 100K, the rRNA operons numbered 1-5, and isolated tRNAs labeled (for example t-Glu64, t-Arg65, t-Lys62).]

Figure 1-1 Genetic map of Streptococcus mutans UA159. The five rRNA operons are numbered in order along the positive strand as 1-5. Individual isolated tRNAs are identified by amino acid and sequence number. The outside ring represents genes transcribed on the positive strand and the inner ring represents the genes found on the complementary strand. The map was generated using MacVector 12.


1.6.2 Virulence Properties of S. mutans

S. mutans colonizes children soon after the emergence of the first tooth (143). The pH of the infant oral environment soon drops due to carbohydrate fermentation by the microflora and the resulting acidic end-products (124). The environmental change is linked to the virulence properties of S. mutans. The virulence properties include aciduricity, the ability to adhere to teeth, and the formation of biofilm. The acid tolerance of S. mutans was described in Section 1.6.1. A description of adhesion and biofilm formed by S. mutans follows.

The colonization of S. mutans on the tooth requires the bacteria to grow as a biofilm attached to the tooth surface. The adhesion occurs by both sucrose-independent and sucrose-dependent structures (142). An example of sucrose-independent adhesion to the acquired enamel pellicle is that mediated by the fibrillar surface protein antigen P1 (spaP). The SpaP protein belongs to a family of binding proteins that are structurally conserved with an alanine-rich binding domain, which is thought to be necessary for binding salivary proteins (144). The sucrose-dependent attachments by carbohydrate polymers are formed by the glucosyltransferase (Gtf) enzymes and glucan binding proteins (Gbp). The Gtf enzymes convert sucrose into insoluble and soluble glucans, which are polymerized into an exopolysaccharide matrix to build an underlying structural support (145). The Gbps function as glucan receptors at the cell surface and further strengthen the adhesive attachment (146). The attachment of S. mutans by Gtfs and Gbps to the acquired pellicle that coats the tooth is then built upon by attachments to other plaque bacteria. Streptococci use both the sucrose-independent and sucrose-dependent structures to form the adhesive matrix that attaches to oral bacteria as well as the acquired pellicle. The adhesive biofilm formed by S. mutans provides the stable ecological niche necessary for the bacteria to contribute to the development of caries.


S. mutans is found in both healthy plaque and in diseased carious lesions. The ecological plaque hypothesis attempts to explain how this may occur by examining the shift from a balanced healthy tooth surface to the selection of an aciduric caries-producing community (136). The ecological plaque hypothesis suggests that an imbalance within the oral plaque community occurs when aciduric oral pathogens are selected, leading to dental caries (136). The hypothesis also proposes that caries can be prevented. Three practical examples demonstrate how this may occur. First, using fluoride dental products will help maintain healthy enamel, inhibit glycolytic enzymes, and reduce the production of acid to maintain a higher pH and prevent the establishment of an aciduric plaque community (136). Second, if sugar-containing food/snacks consumed between meals are replaced with food/snacks containing non-fermentable sweeteners, the pH drop between meals will be avoided (136). And finally, if saliva flow is increased after meals then fermentable sugars are removed and the pH returns to resting levels (136). The hypothesis suggests that in a healthy tooth environment with a neutral pH, the aciduric bacteria are found as a smaller percentage of the overall community and less time is spent at the pH for demineralization due to maintaining a healthy flora; the rates of demineralization and remineralization are in equilibrium (136).

The ecological plaque hypothesis also provides an explanation for contradictory studies. There are examples where healthy teeth yield S. mutans colonies from the healthy flora (147). The opposite also occurs, where carious lesions are probed but S. mutans are not observed to be a significant portion of the diseased community (125). Although S. mutans are well represented in aciduric communities, their presence is not obligatory for the formation of a carious lesion. Oral bacteria are diverse enough to shift the balance to an aciduric community and cause a carious lesion without requiring the presence of S. mutans. Both healthy and diseased communities are established by the formation of a diverse biofilm environment.

The formation of a biofilm benefits the bacteria in many ways. The bacteria exist as a community and the close proximity to neighboring bacteria allows for metabolite sharing and the transfer of genetic material. Developing a biofilm allows bacteria to overcome the shear forces found in a fluid environment (148). By forming a dense biofilm, the bacteria are able to recover from antibiotic treatments (149).

The change from a planktonic to a biofilm state alters expression of several genes by S. mutans (150). Some of the genes upregulated as a result of the lifestyle change include: gtfABC, the response regulator vicR, and a two-component regulatory system (TCS) comDE. As pointed out above, the gtfs facilitate the colonization through the attachment of carbohydrate polymers formed by S. mutans (146). The TCS vicRK modulates adherence, biofilm formation as well as competence development (151). The vicRK regulation of gtfB, gtfC, and gtfD signals the change in carbohydrate utilization required for the shift from a planktonic lifestyle. The role of the TCS comDE in the biofilm lifestyle is examined below.

The study of quorum sensing is the basis of our understanding of the crowded biofilm lifestyle of S. mutans. The first quorum sensing system described for S. mutans comprised the genes comC, comD and comE (152). The comC gene encodes a 46 amino acid precursor protein that is processed into the 21 amino acid precursor competence stimulating peptide (CSP) (153). This precursor is further proteolytically cleaved into the active 18 amino acid peptide (154). The genes comD and comE form a TCS with a histidine kinase (comD) and its cognate response regulator (comE). In S. mutans the CSP-ComDE quorum sensing system regulates the bacteria’s ability to take up DNA via competence regulated genes. The system is also involved in the formation of biofilm and maintaining pH control (155).

Another quorum sensing system found in S. mutans was described for the comRS genes (21). This system was briefly described in Section 1.3.2 above. A proteomic comparison of culture supernatants with synthetic XIP in chemically defined medium showed that the peptide was secreted by the cell (23, 156). Genetic transformation and biofilm formation were also shown to be regulated by XIP when S. mutans was grown in chemically defined medium (23).

1.6.3 S. mutans and Environmental Response Pathways

Understanding the manner by which S. mutans reacts to stressful environmental conditions provides insights into the adaptive lifestyle of this bacterium. Considering its lifestyle and habitat, there are several environmental responses S. mutans undertakes, including environmental adaptation, acid tolerance, heat-shock, metal homeostasis and the stringent response. Examples of genes regulated by these environmental responses are examined below.

In S. mutans, RpoE is another monitor of global environmental responses of the cell (157). A microarray experiment using an rpoE deletion mutant of S. mutans showed that the expression of over 200 genes was altered upon exposure to H2O2, in acidic media, and during differing growth phases (157). RpoE was central to the regulation of malolactic acid fermentation as well as the synthesis of histidine (157). The rpoE gene is adjacent to the cytidine triphosphate synthetase gene (pyrG), which is involved in the formation of pyrimidines.

The expression of pyrG in B. subtilis is regulated by the formation of an antiterminator region between rpoE and pyrG in response to the concentration of pyrimidines within the cell (158). PyrG catalyzes the conversion of UTP to CTP. In B. subtilis pyrB is essential for forming pyrimidines (159). When pyrB is deleted, the expression of pyrG depends upon complementing the cellular pyrimidine requirement by supplementing the medium with the nucleoside cytidine, and no effect is observed on the expression of rpoE in the pyrB mutant. The intergenic inverted repeat located between rpoE and pyrG regulates the transcription of pyrG and is also sensitive to the concentration of pyrimidines (158). Growth in the presence of the pyrimidine cytidine represses the expression of pyrG in the absence of pyrB. Growth in the presence of orotic acid, a precursor molecule in pyrimidine biosynthesis, does not induce or repress the expression of pyrG. This pointed to a direct role of pyrG in nucleotide synthesis. A low concentration of pyrimidines resulted in derepression of pyrG and the addition of pyrimidines repressed pyrG. It was found that the regulation of pyrG required an antiterminator loop between pyrG and rpoE, which forms in the presence of a low pyrimidine concentration.

Although it has not been shown in S. mutans, in B. subtilis RpoE responds to environmental stress as well as nutrient limitations.

Another example of an environmental response occurs when cells are exposed to heat-shock. Proteins damaged by heat-shock must be refolded and repaired. In S. mutans the two heat-shock chaperone proteins, DnaK and DnaJ, are encoded in the dnaK operon, which includes the four genes hrcA, grpE, dnaK, and dnaJ (160). The operon is upregulated in response to changes in temperature and pH, which results in the repair/refolding of heat-shock damaged proteins and biogenesis/stabilization of the F-ATPase complex. The repair of the F-ATPase complex is necessary in order to survive at a lower pH. Protein repair by DnaK and DnaJ is required by bacteria as these chaperones maintain the quality and functionality of proteins required for survival (161).


Environmental nutritional limitation invokes the stringent response in bacteria. As previously mentioned for S. mutans, the stringent response is mediated by (p)ppGpp (31), which under stress conditions is synthesized by the (p)ppGpp synthetase proteins RelA, RelP, and RelQ (32). As shown by a microarray gene expression profile for an S. mutans CodY deletion mutant, the CodY transcription factor is a global regulator of gene expression under nutrient starvation (162). The study found that the CodY transcription factor is regulated by the presence of the branched chain amino acids (BCAA) leucine and valine. Based on deletion mutants for relA, relP, and relQ, BCAAs are also linked to (p)ppGpp regulation and in effect related to CodY activity (162). CodY was found to globally regulate the stringent response mediated by the presence of the bacterial alarmone (p)ppGpp produced by RelP and RelQ, which maintains the nutritional homeostatic environment (162).

Bacteria are required to maintain a metal homeostatic environment as well; they do so by adjusting the internal transition metal ion concentration. When metal ions are in excess, the ions may become toxic to the cell. Many regulatory processes require metal ion cofactors and the absence of cofactor ions also interferes with the regulatory processes. The metalloregulatory proteins and small molecules within the cell control the balance between scarcity and excess.

Two examples of small molecules interacting with a metal or a metalloregulatory gene have been described above in Sections 1.3.5 (MgtA riboswitch) and 1.5.1 (Fur-RyhB interaction). In both cases, the sRNAs act as metal sensors to mediate the regulation of the transport of ions. S. mutans has several metalloregulatory systems; the regulatory systems for manganese and copper are described below.

In S. mutans, Mn2+ homeostasis is controlled by the SloR transcriptional regulator (163). SloR is a metalloregulatory protein that binds both Mn2+ and Fe2+ to control gene expression and regulate virulence genes (164, 165). A microarray gene expression profile comparing wild-type with an S. mutans SloR deletion mutant was performed in the presence of 0.1 and 10 µM Mn2+. Bioinformatic analysis identified that the SloR recognition element (SRE), a palindromic 22 nt motif, was located in over 270 regions of the chromosome (163). The microarray analysis identified over 60 SloR modulated genes in close proximity to an SRE. A previous report identified a relationship between SloR and virulence genes (166). The location of palindromic SREs in close proximity to virulence genes suggested a new model for the mechanism of virulence for S. mutans, since several virulence genes are regulated by SloR.

Another metal homeostatic system in S. mutans maintains copper homeostasis and is encoded by the copYAZ operon (167). The three proteins of the operon are a copper responsive repressor (CopY), an ATPase (CopA), and a copper chaperone (CopZ) (168). A known cause of the toxicity of copper exposure in S. mutans is the disruption of membrane potential. Another cause may be the affinity that copper has for proteins that bind iron/manganese as an active cofactor, which may interrupt important cellular processes, especially peroxide regulation by perR and superoxide regulation by sod. Copper toxicity is linked to a reduced ability to form biofilm and to cope with pH and oxidative stress, and to reduced expression of competence genes (168). Although the transport of copper is thought to occur passively, removal of copper is important for cell survival.

1.6.4 Expression Studies of sRNA in S. mutans UA159

To date, several studies have focused on the expression of the intergenic regions of S. mutans UA159. Here are highlights of five reports that surveyed the expression of the intergenic transcripts.


Xue et al. (157) examined the expression of rpoE under normal and stressed conditions, as mentioned in Section 1.6.3. An rpoE deletion mutant was used in a microarray analysis to compare expression in the mutant against the wild-type under acidic and oxidative stress. The study concluded that rpoE upregulated over 300 putative sRNAs in S. mutans, of which the locations of nine were identified. Northern analyses to verify the results were not performed (157).

Examining the transcriptome of a cell population by performing a microarray experiment can be refined by cell-sorting methods. The combination of the two methods was used to examine the competence regulon in S. mutans UA159. The initial step sorted cells by flow-cytometry and was followed by microarray analysis of the separated populations to examine the expression of the transcriptome (169). The use of flow-cytometry allowed Lemme et al. to determine the CSP-dependent activation of competence in S. mutans UA159 for individual cells (169). In this example, the promoter of comX was fused with green fluorescent protein (GFP) on a plasmid in a transcriptional fusion study (169). The population of cells was sorted by up- or down-regulation of GFP using flow-cytometry and the two populations were further characterized by microarray analyses. Measuring the change in GFP expression showed that thirty intergenic regions were up-regulated. This suggested that a set of putative sRNAs belonging to the competence regulon were regulated by comX. Unfortunately, Northern analyses were not performed to verify the results, nor were transcription start and stop sites identified.

In a bioinformatics approach to finding sRNAs, the locations of forty putative sRNAs were found for S. mutans UA159 (170). One of the sRNAs was verified by following the expression of the L10-Leader using Northern analysis and quantitative real-time polymerase chain reaction (qRT-PCR). The L10-Leader was expressed during exponential growth and expression dropped significantly once the cells reached stationary phase, indicating that the L10-Leader likely contributes to developmental regulation.

In the search for a toxin-antitoxin (TA) module in S. mutans UA159, Koyanagi and Lévesque examined an intergenic region located between smu.219 and smu.220 (171). A protein toxin (fst-Sm) and its cognate sRNA antitoxin (srSm) were identified. The activity of an overexpression mutant for fst-Sm and srSm reduced the survivability of persister cells when exposed to the cell wall-acting antibiotic vancomycin. The TA modules with an sRNA antitoxin are referred to as Type I TAs, and suppress the toxin expression by base pairing with the toxin mRNA (172). Type I TAs are predicted to be broadly distributed within bacteria, but the antitoxin sRNAs are difficult to identify by bioinformatics (172).

Bioinformatics has led to the successful identification of several sRNAs. An alternative approach to finding sRNAs uses RNA-seq to identify the transcriptome under different conditions. For example, the RNA expressed during growth in the presence of two different carbon sources, glucose and galactose, was examined by an RNA-seq approach to characterize the transcriptome of S. mutans UA159 (173). Previous microarray results were compared with the RNA-seq method and the expression results were found to be consistent for both techniques, thus validating the RNA-seq technique. Microarray studies typically sample only the predicted transcribed genes, resulting in biased results as there is no detection of sRNA expression. The RNA-seq method improves coverage of the transcribed RNA pool, since sequenced sRNAs are detected. The RNA-seq study bioinformatically predicted 240 sRNAs to exist in S. mutans UA159, but reported that only 114 of the predicted sRNAs were transcribed. Of the sRNAs expressed, six were differentially regulated due to the change in carbon source (173).


These examples of the identification of expressed sRNA for S. mutans UA159 suggest that there are numerous regions that have yet to be characterized. As expected, there are several different sRNAs in S. mutans that respond to environmental changes. The description of the expression and characterization of two sRNAs in S. mutans will serve as the focus of this thesis.


2 Objective and Hypothesis

2.1 Statement of the problem

There are only a limited number of studies that characterize the expression of sRNAs in S. mutans UA159. In order to gain insight into the process of sRNA regulation, we identified sRNA candidates based on a Northern analysis screen to identify IGRs that are transcribed.

Given the importance of regulatory sRNAs in bacteria, it is necessary to differentiate the conditions under which the sRNAs are transcribed. Wild-type cells exposed to heat-shock, oxidative stress, pH change, or metal-stress have regions that are differentially transcribed due to the cellular response. If the sRNAs are regulated by environmental conditions, then exposure to conditions that are relevant to the health of the oral cavity will affect the sRNA.

2.2 General Hypothesis

In response to stress, the Streptococcus mutans sRNAs SurC and MrrC are differentially expressed to enhance survival. Deletion of the sRNAs affects biofilm formation and acid-tolerance.

2.3 Primary Objectives

The objectives were as follows:

1. Examine the expression of two TARs, SurC and MrrC, during various stages of growth, exposure to CSP, oxidative stress, and heat stress.

2. Generate deletion mutants for the two TARs, ΔSurC and ΔMrrC, and perform phenotypic analysis for growth, biofilm development, acid-tolerance, and transformation efficiency.


3 Materials and Methods

3.1 Bacterial Strains, Growth Conditions, and Plasmids

The bacterial strains and plasmids used in this study are listed in Table 3-1.

S. mutans UA159 was grown at 37°C with 5% CO2 in Bacto™ Todd Hewitt broth (BD, MD) with Yeast Extract (BioShop, Burlington, ON) (THYE), or in minimally defined medium (MDM) as described by Fujiwara et al. (174) unless otherwise indicated. MDM contained either 1% (w/v) sucrose or glucose. Erythromycin (10 µg/ml for S. mutans) was added to media when it was required. All media and solutions were prepared using distilled water and were sterilized by either autoclaving or by filtration through 0.2 µm filters (Millipore, Kankakee, IL).

Antibiotics and reagents were purchased through MedStore (University of Toronto, ON).

Bacterial cells were maintained in 15% glycerol stocks and stored at -80°C. Solid agar plates were prepared by adding 1.8% (w/v) agar to the various liquid media. Tests requiring metal used copper sulfate, zinc chloride, and sodium chloride (Sigma-Aldrich Canada Co., Oakville, ON).

The primers used for this study are listed in Table 3-2.

Table 3-1 Bacterial strains and plasmids used in this study.

Strain or plasmid       Characteristics    Sources or references
UA159                   Wild-type, ErmS    (133)
ΔSurC (IGR smu82-83)    ErmR               This study
ΔMrrC (IGR smu97-99)    ErmR               This study
pDL277                  SpecR              (175)

S = sensitive, R = Resistant


3.2 Intergenic Knockout Construction

The S. mutans intergenic knockout mutants were generated using polymerase chain reaction (PCR) ligation mutagenesis as previously described (176). PCR ligation mutagenesis was used to construct S. mutans deletion mutants designated ΔSurC and ΔMrrC.

Table 3-2 Primers used in this study.

Primer     Application/Function              Sequence                                             Source
108 RB     Erm sequencing primer             5'-GTCGTTAAATGCCCTTTACC-3'                           (176)
109 LB     Erm sequencing primer             5'-CCATACCACAGATGTTCCAG-3'                           (176)
Erm F      Mutagenesis primer for ERM f      5'-GGCGCGCCCCGGGCCCAAAATTTGTTTGAT-3'                 (176)
Erm R      Mutagenesis primer for ERM r      5'-GGCCGGCCAGTCGGCAGCGACTCATAGAAT-3'                 (176)
97P1f      Mutagenesis primers for smu97-99  5'-CCATCGTGGAACAGAAGGG-3'                            This study
97P2r      Mutagenesis primers for smu97-99  5'-GGCGCGCCCTGAAAGCCGTGTCAAGTAAC-3'                  This study
97P3f      Mutagenesis primers for smu97-99  5'-GGCCGGCCTGAAAAAAGTGAAAAAACTC-3'                   This study
97P4r      Mutagenesis primers for smu97-99  5'-ATTTGGCAGCACCCATAG-3'                             This study
82P1f      Mutagenesis primers for smu82-83  5'-TTCAAGGTGGTGTTATCACTGG-3'                         This study
82P2r      Mutagenesis primers for smu82-83  5'-GGCGCGCCCTCTCCTGATTTTCAAGCTCG-3'                  This study
82P3f      Mutagenesis primers for smu82-83  5'-GGCCGGCCGATTTCTTATTTTTCTTTGAG-3'                  This study
82P4r      Mutagenesis primers for smu82-83  5'-TTCCCCTTGTCCTGCCAACCG-3'                          This study
out82F     5' RACE outer primers             5'-TGGCGAATGTCCCAAACG-3'                             This study
out82R     5' RACE outer primers             5'-CCCAACGAAGATACCAGG-3'                             This study
out97F     5' RACE outer primers             5'-AGTTACTTGACACGGCTTTC-3'                           This study
out97R     5' RACE outer primers             5'-TCTCCTCATTACTCAAAGTGAC-3'                         This study
82-83 IF   5' RACE inner primers             5'-CGAAAATCTCACGTCATGAC-3'                           This study
82-83 IR   5' RACE inner primers             5'-TCACAGAGTCGTAGGCGTAT-3'                           This study
97-99 IF   5' RACE inner primers             5'-AGAATAAAGAGATTCCATGTAGA-3'                        This study
97-99 IR   5' RACE inner primers             5'-GCTATTTAATACAATAATAGCTGC-3'                       This study
CMT-180    5' RACE anchor primer             5'-GAATTCGAATTCCCCCCCCCCCCCC-3'                      (177)
CMT-181    5' RACE anchor primer             5'-GAATTCGAATTCAAAAAAAAAAAA-3'                       (177)
RLMRACE    5' RACE Adaptor                   5'-GCUGAUGGCGAUGAAUGAACACUGCGUUUGCUGGCUUUGAUGAAA-3'  Invitrogen
RLMRACE    5' RACE Inner Primer              5'-CGCGGATCCGAACACTGCGTTTGCTGGCTTTGATG-3'            Invitrogen
RLMRACE    3' RACE Adaptor                   5'-GCGAGCACAGAATTAATACGACTCACTATAGGT12VN-3'          Invitrogen
RLMRACE    3' RACE Outer Primer              5'-GCGAGCACAGAATTAATACGACT-3'                        Invitrogen
5SF        Northern probe of 5S              5'-AGCCTAGGGGAGACACCTGTA-3'                          This study
5SR        Northern probe of 5S              5'-TCCCTCTTGCTAAGCGACGACCCTA-3'                      This study
61-63cF    Northern probe of smu61-63c       5'-GTCCTGTTCTTTTTTTGAAGG-3'                          This study
61-63cR    Northern probe of smu61-63c       5'-TGATTGTAATGTGAGTCTCTG-3'                          This study
82-83npF   Northern probe of 82-83           5'-TCAGTGGACTATTTTATCCCGAGC-3'                       This study
82-83npR   Northern probe of 82-83           5'-ATGCACCAACCTAACGCCTGTG-3'                         This study
97-99np1F  Northern probe of MrrC            5'-TGAGTTACTTGACACGGCTTTCAG-3'                       This study
97-99np1R  Northern probe of MrrC            5'-TGTATTCGTTTTGTTGGCAGCC-3'                         This study
97-99np2F  Northern probe of smu97-99        5'-CCAACAAAACGAATACACGT-3'                           This study
97-99np2R  Northern probe of smu97-99        5'-TTAAAAAGGTCAATTGAAACCACC-3'                       This study
smu153F    Northern probe of smu153-154      5'-CAATCTGTACTGATCCCC-3'                             This study
smu153R    Northern probe of smu153-154      5'-TTGGCGAATCCTAGAACC-3'                             This study
smu217cF   Northern probe of smu217c-218     5'-TGTCCCTTAAAACAGACACCAC-3'                         This study
smu217cR   Northern probe of smu217c-218     5'-GCAGGCAGAACTGAATAACACAC-3'                        This study
smu259F    Northern probe of smu259-260      5'-CATTTGATACAGCGTGTTTTCTTG-3'                       This study
smu259R    Northern probe of smu259-260      5'-AGTAGGGATGATTTCCAAAAGG-3'                         This study
smu305F    Northern probe of smu305-307      5'-TTTGCGTGAAGCGGGTCAG-3'                            This study
smu305R    Northern probe of smu305-307      5'-AACTCCCACAGCGACAAG-3'                             This study
smu770cF   Northern probe of smu770c-771c    5'-GGGTTAATTGCATTATAGC-3'                            This study
smu770cR   Northern probe of smu770c-771c    5'-GGATGAAGAGCTTTAATC-3'                             This study
smu788F    Northern probe of smu788-789      5'-CCTTTGACAAATCGTAGC-3'                             This study
smu788R    Northern probe of smu788-789      5'-ACAGACTAATCACTCCGC-3'                             This study
smu1405cF  Northern probe of smu1405c-1406c  5'-TCATTCGAAACAACACAGCAAG-3'                         This study
smu1405cR  Northern probe of smu1405c-1406c  5'-GAATCGGGTGCGCACTTTTTC-3'                          This study

Underlined text = restriction digest site


3.3 Isolation of RNA, Northern Blot Detection, and qRT-PCR

RNA expression was validated using Northern blot assays. Total RNA was extracted from S. mutans UA159 incubated in THYE at various growth stages and after various shock/stress treatments. The three growth stages selected for study were early-log (OD600 0.1), mid-log (OD600 0.4), and stationary phase (OD600 1.2). Four stressors (CSP, H2O2, temperature, and pH) were also selected. A minimum of three biological replicates was performed. For the expression response to CSP, overnight cultures were diluted 1:20 into fresh THYE media. The cultures were allowed to grow until the OD600 reached 0.1, at which point CSP (0.2 µM; Advanced Protein Technology Centre of the Hospital for Sick Children, Toronto, ON; 80% purity) was added. Cultures were then incubated until the OD600 reached 0.4.

Expression response tests were performed using various treatments. Again, overnight cultures were diluted 1:20 into fresh media. The cultures grew until the test OD600 (0.4) was reached. In the THYE tests, the cells were pelleted and resuspended in THYE broth that was hot (50°C) or cold (4°C), contained H2O2 (0.003%) for oxidative stress, or was acidic (HCl added) or basic (NaOH added). Cells were exposed to the stressors for 20 min. The cells were then pelleted, flash frozen in liquid nitrogen and stored at -80°C until used. In the MDM metal tests, the cultures grew until mid-log OD600 (0.4) was reached. The cells were then pelleted and resuspended for 20 min in MDM adjusted with copper sulfate (250 µM), zinc chloride (100 µM), or sodium chloride (0.4 M), or adjusted to pH 3.5, pH 5.0 or pH 10.0, and then stored in the same manner as the THYE cultures.

The total RNA of each sample was extracted using Direct-zol™ RNA MiniPrep (Zymo Research). All glassware and solutions used for RNA work were baked overnight at 150°C and then washed with a 0.1% solution of diethyl pyrocarbonate (DEPC) made in RNase-free water.


Total RNA (5 µg per lane) was subjected to electrophoresis. The RNA was heated to 95°C for 10 min, set on ice, and loaded on an 8% polyacrylamide denaturing gel containing 7 M urea. The molecular weight marker used was the RiboRuler Low Range RNA Ladder (Thermo Scientific). An equivalent volume of 2X loading dye (95% formamide, 0.025% SDS, 0.025% bromophenol blue, 0.025% xylene cyanol, and 0.5 mM EDTA) was added to the RNA samples prior to loading. The gel was electrophoresed at 200 V for 90 min in 0.5X tris-borate-EDTA buffer (TBE) at 4°C. After electrophoresis, the RNA was electrotransferred to a SensiBlot Plus nylon membrane (Fermentas) at 40 V for 45 min in 0.5X TBE at 4°C. The transferred RNA was cross-linked to the membrane using ultraviolet light for 5 minutes. Nylon membranes were stained using methylene blue (0.03%) to validate the quality of the RNA and to ensure appropriate RNA transfer and visualization of the molecular weight markers. The membrane was pre-hybridized using DIG Easy Hyb (Roche) for 30 min at 42°C, followed by hybridization with DIG High Prime DNA probes (25 ng/ml) in DIG Easy Hyb solution at 42°C overnight. The probes for the assay were PCR amplified from UA159 genomic DNA with primers listed in Table 3-2 (designated as 5SF, 5SR, 97-99np1F, 97-99np1R, 97-99np2F, 97-99np2R, 82-83npF, 82-83npR). The 5S probe was used as the internal control to confirm uniform gel loading and also to normalize the RNA expression data. The PCR products were denatured by boiling for ten minutes, quickly chilled on ice and then labeled using the method described in the digoxigenin (DIG) High Prime Labeling Kit (Roche). The blot was rinsed in 2X saline sodium citrate (SSC) buffer, and again in 0.5X SSC at 60°C. The rest of the handling steps were performed at room temperature. The blot was then washed with 0.1 M maleic acid buffer (MAB), blocked and probed with DIG antibody (75 mU/ml) diluted 1:10,000 in blocking solution. The membrane was again washed in MAB, and further washed with a detection buffer of 0.1 M Tris-HCl and 0.1 M NaCl. The probed membrane was incubated with disodium 3-(4-methoxyspiro{1,2-dioxetane-3,2'-(5'-chloro)tricyclo[3.3.1.1^{3,7}]decan}-4-yl) phenyl phosphate (CSPD) and the chemiluminescent signal was visualized using a chemiluminescent detector (Bio-Rad) and photographed. The transcript expression levels within the detected bands were quantified using the ImageJ64 program (National Institutes of Health, Bethesda) (178). The relative 5S RNA expression was quantified using the densitometry tools of ImageJ64, which measured the pixel intensity of each band. To normalize the RNA expression, the ratio of the test band intensity to the 5S intensity in the same lane was calculated and then divided by the corresponding ratio for the unstressed mid-log (OD600 0.4) control, since all tests included a mid-log unstressed control. All Northerns were performed with a minimum of three biological replicates.
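The 5S normalization described above is a ratio of ratios, which can be illustrated with a short sketch. The function name and the intensity values are hypothetical and serve only as an illustration; they are not measurements from this study.

```python
def normalized_expression(test_band, test_5s, control_band, control_5s):
    """Expression of a test lane relative to the unstressed mid-log control.

    Each band intensity is divided by the 5S intensity of its own lane,
    and the test ratio is then divided by the control ratio.
    """
    if min(test_5s, control_band, control_5s) <= 0:
        raise ValueError("5S and control intensities must be positive")
    return (test_band / test_5s) / (control_band / control_5s)


# Hypothetical densitometry values (arbitrary units), for illustration only.
relative = normalized_expression(test_band=1800.0, test_5s=900.0,
                                 control_band=1000.0, control_5s=1000.0)
print(relative)  # 2.0
```

A value above 1 indicates higher expression than the unstressed mid-log control, and a value below 1 indicates lower expression.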

Expression was also measured by real-time reverse-transcriptase PCR (RT-PCR). The total RNA was DNase treated and converted into complementary DNA (cDNA) using the First-Strand cDNA synthesis kit (ThermoFisher Scientific) as described previously (151). The RT-PCR was performed on the cDNA using SYBR-Green PCR (ThermoFisher Scientific, MA) and the Mx3000p Thermal cycler (Agilent Technologies, CA). Negative controls omitted the reverse transcriptase in the synthesis of cDNA. The quantitative measure for real-time PCR is the threshold cycle (CT). Fold expression was calculated as previously described using the following equation (179):

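\[
\text{Fold change} = 2^{-\Delta\Delta C_T}
\]

\[
\text{Fold change} = 2^{-\left[\left(C_{T,\,\text{Gene, Treated}} - C_{T,\,\text{Reference, Treated}}\right) - \left(C_{T,\,\text{Gene, Untreated}} - C_{T,\,\text{Reference, Untreated}}\right)\right]}
\]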
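As a worked illustration of the equation above, the following sketch computes the fold change from four threshold cycle values. The function name and the CT values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def fold_change(ct_gene_treated, ct_ref_treated,
                ct_gene_untreated, ct_ref_untreated):
    """Fold change by the 2^(-ddCT) calculation.

    The reference CT is subtracted from the gene CT within each condition,
    and the untreated difference is subtracted from the treated difference.
    """
    delta_treated = ct_gene_treated - ct_ref_treated
    delta_untreated = ct_gene_untreated - ct_ref_untreated
    return 2.0 ** -(delta_treated - delta_untreated)


# Hypothetical CT values: the target crosses the threshold two cycles
# earlier after treatment while the reference is unchanged.
print(fold_change(22.0, 15.0, 24.0, 15.0))  # 4.0
```

Because each cycle corresponds to a doubling of product, a difference of two cycles gives a four-fold change.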


All qRT-PCR tests were performed with a minimum of three independent biological replicates and three technical replicates.

3.4 Rapid Amplification of cDNA Ends (RACE)

The transcription start and stop sites were mapped for several intergenic regions. The start sites were determined using two separate 5' rapid amplification of cDNA ends (RACE) techniques. For the first technique, cDNA was generated with specific outer primers designed to target the selected TARs. To generate the cDNA, pooled total RNA (10 µg) was isolated and DNase treated using Direct-zol™ RNA MiniPrep (Zymo Research, CA). The RNA was reverse transcribed using a region-specific primer to create the complementary DNA. The RNA was then removed using RNase H. Next, dGTP or dTTP was added to the remaining cDNA using terminal deoxynucleotidyl transferase (Invitrogen). The product was then PCR amplified using a nested inner primer, designed to target the transcribed region away from the 5' end of the outer primer, along with an adapter primer (CMT-180 or CMT-181). The resulting product was then column purified and sequenced by ACGT (Toronto, ON). The sequenced product was aligned with the NCBI sequence of the same region to locate the run of repeating nucleotides that identifies the 5' end. The alternative technique used the RLM-RACE kit (Thermo Fisher, MA). Briefly, 10 µg of total RNA was treated with calf intestinal phosphatase. The RNA was then treated with tobacco acid pyrophosphatase to leave a 5' monophosphate. The uncapped RNA was then ligated to an adapter sequence and reverse transcribed using an outer primer. PCR was then performed using an inner nested primer with a region-specific primer to amplify the region along the transcript; this DNA was then pooled, column purified, and sequenced by ACGT (Toronto, ON). Again, the sequenced result was aligned with the NCBI sequence of the same region to locate the ligated adapter, which identified the 5' end.

A search for 3' stop sites was also performed. The procedure used the RLM-RACE kit (Thermo Fisher, MA) according to the manufacturer's instructions. Briefly, 1 µg of total RNA was reverse transcribed in the presence of the 3' RACE adapter. The reaction was then amplified using PCR in the presence of the 3' RACE outer primer together with a region-specific outer primer. A second PCR reaction was then performed using an inner anchor primer along with a region-specific primer.

3.5 Transformation Assay

To perform genetic transformation assays, an overnight culture of S. mutans grown in THYE at 37°C was centrifuged at 4,500 RPM for 4 min at room temperature and resuspended in 1 volume of THYE. The washed culture was diluted 1:20 in THYE. Aliquots (200 µl) of the culture were transferred into a sterile microplate (Costar, Corning, Corning, NY). In test samples, 1 µg of plasmid (pDL277) was added. The samples were incubated at 37°C with 5% CO2 under static conditions. To test the effect of supplementation with synthetic CSP, 0.4 µM CSP was added to the 200 µl culture samples after 2 h of growth at 37°C with 5% CO2. After 5 hours, test samples (20 µl of serial dilutions in PBS) with or without plasmid were plated on THYE plates supplemented with spectinomycin (10 µg/ml). After 48 hours of incubation at 37°C, the transformation frequency was calculated as a percentage: the number of antibiotic-resistant colony forming units (CFUs) divided by the total number of viable CFUs, multiplied by 100. The assay was performed with a minimum of three biological replicates and three technical replicates.
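The calculation can be sketched as follows. The back-calculation from plate counts assumes that each plate received 20 µl of a known dilution; the function names, colony counts, and dilution factors are hypothetical and serve only as an illustration.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/ml of the undiluted culture from one plate count.

    dilution_factor is the total fold dilution, e.g. 1e4 for a 10^-4 dilution.
    """
    return colonies * dilution_factor / plated_volume_ml


def transformation_frequency(resistant_cfu_per_ml, viable_cfu_per_ml):
    """Percentage of viable cells that are antibiotic-resistant transformants."""
    return 100.0 * resistant_cfu_per_ml / viable_cfu_per_ml


# Hypothetical counts: 50 resistant colonies from a 10^-2 dilution and
# 100 total colonies from a 10^-6 dilution, with 20 µl (0.02 ml) plated.
resistant = cfu_per_ml(colonies=50, dilution_factor=1e2, plated_volume_ml=0.02)
viable = cfu_per_ml(colonies=100, dilution_factor=1e6, plated_volume_ml=0.02)
print(transformation_frequency(resistant, viable))
```

With these example counts the resistant and viable titres are approximately 2.5 x 10^5 and 5 x 10^9 CFU/ml, giving a frequency of about 0.005%.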

3.6 Biofilm Quantification

The biofilms of TAR deletion mutants were quantified using crystal violet (180). Biofilms were grown in MDM in 24-well polystyrene plates.

Overnight cells were centrifuged and resuspended in fresh MDM with 1% glucose or sucrose with the pH adjusted to either 7.0 or 5.0. The cells were diluted 1:20 and grown to an OD600 of 0.4. The wells were then inoculated with a 1:20 dilution of the culture in MDM, after which the cells were incubated at 37°C with 5% CO2 for 16 hours under static growth conditions. After 16 hours the supernatant was removed and the biofilms were rinsed with sterile distilled water. The biofilms were air dried prior to being stained with 0.1% crystal violet for 20 min. The supernatant was removed and the biofilms were rinsed with sterile distilled water and allowed to dry. The crystal violet was solubilized using 30% acetic acid. The biofilms were then quantified by measuring the OD600 of the solubilized crystal violet with a microplate reader (3550 Microplate Reader, Bio-Rad Laboratories, Richmond, CA). The biofilm was quantified with a minimum of three biological replicates and three technical replicates. Triplicate controls for the assays were performed using stained, uninoculated wells.

3.7 Acid Tolerance Response Assay

The involvement of sRNA regions in the acid tolerance response (ATR) was assayed. The ATR was quantified using a previously described method (152). Briefly, the assays were performed using 5 mL overnight cell cultures of S. mutans UA159 along with the respective mutant strains. The cultures were diluted 1:10 in Tryptone Yeast Extract (TYE) supplemented with glucose (0.5%) at pH 7.5 and then incubated at 37°C (152). Upon reaching mid-logarithmic phase (OD600 0.4), cell cultures were split into adapted and unadapted populations, pelleted by centrifugation, and resuspended in TYE supplemented with 0.5% glucose at pH 5.5 or pH 3.5, respectively. Cells at pH 5.5 were adapted for three hours prior to exposure to pH 3.5. Cells at pH 3.5 were serially diluted and plated on THYE agar plates at time 0 and after being incubated for 2 hours at 37°C. After 48 hours of incubation at 37°C, the acid tolerance was calculated as percent survival: the number of colony forming units (CFUs) recovered after the 2 hour incubation divided by the total number of viable CFUs at time 0, multiplied by 100. The ATR assay was performed with a minimum of three biological replicates and three technical replicates.
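A minimal sketch of the survival calculation is given below. It assumes that the denominator is the viable count at time 0 and the numerator is the count after 2 hours at pH 3.5; the function name and the CFU values are hypothetical.

```python
def percent_survival(cfu_after_acid, cfu_time_zero):
    """Percentage of cells still viable after the acid challenge."""
    if cfu_time_zero <= 0:
        raise ValueError("time-zero CFU count must be positive")
    return 100.0 * cfu_after_acid / cfu_time_zero


# Hypothetical CFU/ml values for one strain, for illustration only.
adapted = percent_survival(cfu_after_acid=2.0e6, cfu_time_zero=1.0e8)
unadapted = percent_survival(cfu_after_acid=2.0e4, cfu_time_zero=1.0e8)

# Ratio of adapted to unadapted survival; values above 1 indicate that
# pre-exposure to pH 5.5 improved survival at pH 3.5.
adaptation_ratio = adapted / unadapted
print(adapted, unadapted, adaptation_ratio)
```

Comparing the adapted and unadapted values for the wild-type and each mutant indicates whether a deletion impairs the adaptive response or acid survival in general.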

3.8 Antibiotic Minimum Inhibitory Concentration Assay

Cultures of UA159, ΔMrrC, and ΔSurC were grown overnight in THYE. The cultures were then diluted 1:20 in THYE and allowed to grow until they reached mid-log growth (OD600 0.4). The inoculum was then standardized to a starting OD600 of 0.01. The inoculum was added to MDM with the addition of 1% glucose and antibiotics.

Three antibiotics were tested against the cultures. Stocks of ampicillin (10 mg/ml), kanamycin (100 mg/ml), and tetracycline (100 mg/ml) were prepared. A two-fold dilution series of each antibiotic was set up in 96-well plates. The uninoculated medium was used as negative control. The positive control consisted of inoculated medium without antibiotic.

The minimum inhibitory concentration (MIC) was then determined by incubating the plate overnight at 37°C and 5% CO2. The MIC was observed as the lowest concentration that did not result in visible growth. The mutant strains were regarded as sensitive/resistant to an antibiotic if the MIC was two-fold lower/higher than that observed for UA159. The MICs were performed with a minimum of three biological replicates and three technical replicates.
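The dilution series and the MIC readout can be sketched as follows. The top concentration, the growth pattern, and the function names are hypothetical, and the classification treats a change of at least two-fold as the threshold, which is an assumption about how the criterion above was applied.

```python
def twofold_dilution_series(top_concentration, n_wells):
    """Concentrations for a two-fold series starting at top_concentration."""
    return [top_concentration / (2 ** i) for i in range(n_wells)]


def mic(concentrations, growth):
    """Lowest concentration with no visible growth, or None if all wells grew."""
    inhibitory = [c for c, grew in zip(concentrations, growth) if not grew]
    return min(inhibitory) if inhibitory else None


def classify(mutant_mic, wild_type_mic):
    """Compare a mutant MIC with the wild-type MIC (two-fold threshold assumed)."""
    if mutant_mic <= wild_type_mic / 2:
        return "sensitive"
    if mutant_mic >= wild_type_mic * 2:
        return "resistant"
    return "unchanged"


# Hypothetical plate row (µg/ml) and visible-growth readout after overnight incubation.
series = twofold_dilution_series(64.0, 8)
growth = [False, False, False, False, True, True, True, True]
print(series)               # [64.0, 32.0, 16.0, 8.0, 4.0, 2.0, 1.0, 0.5]
print(mic(series, growth))  # 8.0
print(classify(4.0, 8.0))   # sensitive
```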

3.9 Growth Rate analysis

Growth curves were performed using a Bioscreen C (Growth Curves USA, Piscataway, NJ) without 5% CO2. Exponential growth curves were fitted to the data using Prism6 (GraphPad, CA) to determine the growth rate and doubling times for the bacterial strains and assorted treatments. The data were fitted to the equations:

\[
y(t) = y_0 e^{kt}
\]

\[
\text{doubling time} = \frac{\ln(2)}{k}
\]

where $t$ is time and $k$ is the rate (hour$^{-1}$).

The growth rate experiments were performed with a minimum of three biological replicates and three technical replicates.
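The relationship between the fitted rate and the doubling time can be illustrated with a short sketch. It does not reproduce the nonlinear fit performed in Prism6; instead it uses a simpler substitute, an ordinary least-squares line through ln(y) against t, which recovers the same parameters for noise-free exponential data. The function names and the data are hypothetical.

```python
import math


def fit_exponential(times_h, od_values):
    """Estimate (y0, k) for y(t) = y0 * exp(k * t) by a log-linear fit."""
    logs = [math.log(y) for y in od_values]
    n = len(times_h)
    t_mean = sum(times_h) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times_h, logs))
    k = sxy / sxx                        # slope of ln(y) vs t is the rate
    y0 = math.exp(l_mean - k * t_mean)   # intercept gives the starting value
    return y0, k


def doubling_time(k):
    """Doubling time in hours for a rate k in hour^-1."""
    return math.log(2) / k


# Synthetic exponential-phase readings generated with y0 = 0.05 and k = 0.7 per hour.
times = [0.0, 1.0, 2.0, 3.0]
ods = [0.05 * math.exp(0.7 * t) for t in times]
y0, k = fit_exponential(times, ods)
print(round(y0, 3), round(k, 3), round(doubling_time(k), 2))
```

For the synthetic data the fit returns a rate of approximately 0.7 per hour, corresponding to a doubling time of roughly one hour.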

3.10 Statistical analyses

The statistical analyses were performed using Prism6 software (GraphPad). Comparisons were made using either Student's t-test or one-way analysis of variance (ANOVA). The ANOVA was followed by Tukey's multiple comparison test (significance set at either * p < 0.05 or ** p < 0.001). Error bars in the figures represent the standard error of the mean (SE).

3.11 Bioinformatics

Several bioinformatics tools were used to investigate the intergenic regions. Protein and RNA alignments were performed using the online webtools SyntTax (181), T-Coffee (182), and LocARNA-P (183). The prokaryotic gene order, or synteny, was visualized using the SyntTax alignment tool (181). Peptide sequences were aligned using the Tree-based Consistency Objective Function for Alignment Evaluation (T-Coffee) tool (182). Multiple RNA sequences were aligned using LocARNA-P (183).

3.11.1 Inverted Repeat Searches Using EMBOSS

Inverted repeats are often regulatory target regions for transcription start or stop sites; therefore, a bioinformatic search for direct and inverted repeats was conducted using the European Molecular Biology Open Software Suite (EMBOSS) (184). The EMBOSS einverted tool was used to scan the chromosomal sequence of S. mutans UA159. The software identified several inverted repeats in the nucleotide sequence (184).
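To make the idea of an inverted repeat concrete, the sketch below looks for perfect stem-loop candidates, where one arm is the reverse complement of the other. It is not the einverted algorithm, which scores alignments and tolerates mismatches and gaps; the function names and thresholds are assumptions. The example sequence is the first 42 nt of the row beginning at 99500 in Figure 4-13.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_inverted_repeats(seq, min_stem=6, max_loop=10):
    """Return (stem_start, stem_length, loop_length) for perfect inverted repeats."""
    hits = []
    n = len(seq)
    for loop_start in range(1, n):
        for loop_len in range(max_loop + 1):
            loop_end = loop_start + loop_len  # index of the first base after the loop
            if loop_end >= n:
                break
            # Skip loops whose end bases could pair, so each stem is reported at full length
            if loop_len >= 2 and COMPLEMENT.get(seq[loop_start]) == seq[loop_end - 1]:
                continue
            left, right, stem = loop_start - 1, loop_end, 0
            # Extend the stem outward while the flanking bases are complementary
            while left >= 0 and right < n and COMPLEMENT.get(seq[left]) == seq[right]:
                stem += 1
                left -= 1
                right += 1
            if stem >= min_stem:
                hits.append((left + 1, stem, loop_len))
    return hits


terminator = "AACACCTTCTTAAATCTCATTTACTAGATTTAAGAAGGTGTT"
hits = find_inverted_repeats(terminator)
```

For this sequence the hits include a 17 bp stem starting at index 0 with an 8 nt loop.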


3.11.2 Promoter and Terminator Predictions

Both Rho-independent terminators and promoters were identified bioinformatically. The search for Rho-independent terminators was performed using the ARNold website server software (11). The search for predicted promoters and terminators was used to map out the transcriptional motifs. To perform the search, the sequence containing the transcription start site and the downstream region up to the next adjacent gene were used. This sequence was submitted to ARNold to find Rho-independent terminators (11). The search for promoter elements was performed using the Prokaryotic Promoter Prediction (PPP) webserver:

(http://bioinformatics.biol.rug.nl/websoftware/ppp/ppp_start.php).

3.11.3 Multiple EM for Motif Elicitation (MEME)

A search for motifs was conducted using the MEME Suite (185). MEME was used to identify putative binding sites for a shared transcription factor that targets promoter regions, as well as to discover motifs that form other DNA signaling domains. The intergenic sequences containing inverted repeats found using the EMBOSS suite were used in an iterative search for transcription factor motifs using MEME 4.9.1. To reduce the number of false positives, MEME was set to find motifs that occurred with any number of repetitions in each fragment. To investigate motifs within the inverted repeats, the search was conducted using the default settings. For the search for motifs that match transcription factors, the minimum width was set to match the common motif, any number of repetitions was allowed, and the search was restricted to palindromes.


4 Results

4.1 Identification of Transcriptionally Active Regions

As stated earlier, the genome of S. mutans UA159 contains over 1900 genes (133). This study narrowed the number of intergenic regions to investigate by examining two pools of candidates. The first pool was selected by size, with probes designed for intergenic regions greater than 500 nt. The second pool was derived from a microarray study that inferred that 30 intergenic regions were comX regulated (169); the six largest of these regions were examined further. Using these two pools of candidates, this investigation screened the expression of over twenty intergenic regions as possible sRNA-encoding regions.

The Northern analysis screen was conducted to analyze expression of each of the putative intergenic regions. From this analysis it was determined that ten intergenic regions were expressed under the conditions used for the screen. The two main regions that were expressed and followed up with further experiments included the intergenic region between dnaK (smu.82) and dnaJ (smu.83), as well as the intergenic region between pyrG (smu.97) and the downstream leucine tRNA (IGRpyrG-tRNALeu19). The regions identified from the first pool of candidates were: smu60-61, smu61-63, smu82-83, IGRpyrG-tRNALeu19, smu153-154, smu217c-218, smu259-260, smu305-307, smu788-789, and smu1405c-1406c (Table 4-1). The results of the Northern assays that confirmed expression are summarized in Table 4-1.


Table 4-1 Intergenic regions of S. mutans UA159 screened for expression by Northern Analysis in this study.

| US gene | Dir | DS gene | Dir | Reported size (nt) | Northern expression | 5’ RACE | References |
|---|---|---|---|---|---|---|---|
| smu.60 | > | smu.61 | > | 120(2)/113 a | Yes | nd | This study and Zeng (173) |
| smu.61 | > | smu.63c | < | 350/nd b | Yes | nd | This study and Mashburn-Warren (21) |
| smu.76 | > | smu.78 | > | nd/229 a | No | nd | This study and Zeng (173) |
| smu.82 | > | smu.83 | > | 203/199 c | Yes | Yes | This study and Silvaggi (66) |
| smu.97 | > | smu.99 | > | ~400 (?) | Yes | Yes | This study |
| smu.97 | > | smu.99 | > | ~400 | Yes | Yes | This study |
| smu.97 | > | smu.99 | > | 449/418 a | Yes | Yes | This study and Zeng (173) |
| smu.97 | > | smu.99 | > | ~250/279 a | Yes | Yes | This study and Zeng (173) |
| smu.106c | < | smu.107 | > | nd/169 a | No | nd | This study and Zeng (173) |
| smu.153 | > | smu.154 | > | 240 | Yes | No | This study and Zeng (173) |
| smu.217c | < | smu.218 | > | 550 | Yes | nd | This study, O’Rourke (163), and Zeng (173) |
| smu.259 | > | smu.260 | > | ~300 | Yes | nd | This study |
| smu.305 | > | smu.307 | > | ~300 | Yes | nd | This study |
| smu.401c | < | smu.402 | > | nd | No | No* | This study and Lemme (169) |
| smu.636 | > | smu.637 | > | nd | No | No* | This study and Lemme (169) |
| smu.770c | < | smu.771c | < | 290/315 a/nd d | No | Yes* | This study, O’Rourke (163), Zeng (173), and Lemme (169) |
| smu.788 | > | smu.789 | > | 500/53 a | Yes | No | This study and Zeng (173) |
| smu.807 | > | smu.809 | > | nd | No | No* | This study and Lemme (169) |
| smu.1063 | < | smu.1064c | < | 133/nd d | No | Yes* | This study and Lemme (169) |
| smu.1405c | < | smu.1406c | < | 170/90 a/100 e | Yes | nd | This study, Zeng (173) and Deltcheva (65) |
| smu.2025 | < | smu.2026c | < | nd | No | No* | This study and Lemme (169) |

Expression was identified in this study by Northern analysis. US: upstream gene; Dir: direction; DS: downstream gene; nd: not determined. * 5’ RACE performed prior to confirmation by Northern analysis. Transcript sizes reported in: a) Zeng (173), b) Mashburn-Warren (21), c) Silvaggi (66), d) Lemme (169), e) Deltcheva (65).

The first positive transcript observed was for smu61-63c (Table 4-1). This region was also examined by Mashburn-Warren et al. (21) and contained the unannotated 17 residue peptide XIP. The transcript observed here was over 350 nt in size and was growth-phase regulated. This suggested that the 17 amino acid XIP peptide was derived from the processed 350 nt transcript. The transcript was observed during exponential growth, but was not observed once the cells reached stationary phase. Although transformation involving XIP requires chemically defined medium, the transcript was expressed in THYE.

RACE analysis was used to identify the 5’ start site as well as the 3’ end of the TARs. 3’ and 5’ RACE were performed for smu82-83, IGRpyrG-tRNALeu19, smu153-154, smu401c-402, smu636-637, smu770c-771c, smu807-809, smu1063-1064c, and smu2025-2026c. There were no RACE results for smu153-154, smu401c-402, smu636-637, smu807-809, and smu2025-2026c. The 5’ RACE results for smu770c-771c and smu1063-1064c can be seen in Appendix 8.1. The expression of two other TARs was examined, namely smu82-83 and IGRpyrG-tRNALeu19, as described in Sections 4.2 and 4.3, respectively. For both regions, the expression of RNA was examined during three phases of growth corresponding to early, mid-log, and stationary phase. Three other conditions were examined. The competence pathway regulated by CSP has been characterized; however, the expression response of sRNAs to CSP has not been well studied. Also, since S. mutans is sensitive to oxidative stress, the expression response to H2O2 exposure was tested. The third condition tested was the expression response to heat shock. Based on the results obtained from the expression for the TARs, deletion mutants for smu82-83 and IGRpyrG-tRNALeu19 were generated. The phenotypic assays for the deletion strains Δsmu82-83 (ΔSurC) and ΔMrrCIGRpyrG-tRNALeu19 (ΔMrrC) are described in Sections 4.3 and 4.4.

4.2 Analysis of SurC (smu82-83)

The heat shock operon is well conserved in bacteria with two heat shock chaperones, dnaK and dnaJ (186). A common structure of the dnaK operon in low G-C Gram-positive bacteria is orfA-grpE-dnaK-dnaJ (186). This operon is observed in S. mutans. The genetic organization of the proteins found in the dnaK operon of S. mutans was compared with that of other bacteria (Figure 4-1). This region is well conserved in bacteria, and in B. subtilis the intergenic region harbors an sRNA named SurC (66). The name SurC will be maintained here to be consistent with the current literature. A multiple RNA structural sequence alignment of the intergenic region between dnaK and dnaJ was performed and a conserved region was observed in selected Gram-positive and Gram-negative bacteria (Figure 4-2). The structural alignment of the RNA proximal to the rho-independent terminator of SurC (87,930 - 87,967) showed a conserved hairpin motif (Figure 4-2).

[Figure 4-1 image: synteny map showing SurC and the neighboring genes (including hrcA, grpE, dnaK, and dnaJ) for Streptococcus mutans UA159, Streptococcus mutans LJ23, Streptococcus pyogenes MGAS15252, Streptococcus pneumoniae D39, Lactobacillus salivarius CECT 5713, Bacillus subtilis 168, Staphylococcus aureus MRSA252, and Pseudomonas aeruginosa PAO1.]

Figure 4-1 Synteny map of the genomic organization of the heat shock chaperones in various bacteria. The organization of the heat shock operon highlighting the chaperone gene dnaK upstream of the small non-coding RNA SurC. The colored genes are conserved. Green – hrcA, yellow - grpE, purple - dnaK, pink – dnaJ, grey - SurC.


Structure (dot-bracket): ((((...... ((((((....))))))...... ))))

Streptococcus mutans UA159           AAAUACCUCCUAGAUACCCUGGUAUCUUCGUUG-GAUUU  38
Streptococcus mutans LJ23            AAAUACUUCCUAGAUACCCUGGUAUCUUCGUUG-GAUUU  38
Streptococcus pyogenes MGAS15252     AAAUCCGACAAAGAUACUCUCGUAUCUAGGAGG-UAUUU  38
Streptococcus pneumoniae TIGR4       AGUU-UUUUCUAGGACGUAAGCGUCCGUCGUCA-AAACU  37
Lactobacillus salivarius CECT5713    GGAUUGAGCCAUAUGCUUUUAUGCAUGUGGCUU-UAUCU  38
Bacillus subtilis spizizenii W23     UAAUCUAACAUAGUAAAACUUAAUACUAAAAAGUUAAGG  39
Staphylococcus aureus MRSA252        AAAGCCAAAGCCAAUGUUC---UAUUGACUUUG-ACUUU  35
Pseudomonas aeruginosa PAO1          GAUCCGCCACGCGGGAGCCUGCUCCCGCGUUGGCGUGUC  39

Position ruler: ...... 10...... 20...... 30......

Figure 4-2 Conserved secondary structure of SurC. Multiple sequence and structure alignment of the intergenic region between dnaK and dnaJ from selected bacteria calculated with LocARNA (183). The structural conservation is indicated with grey shading surrounding the text and shown with dot bracket annotation. The sequence conservation is shown by the grey bars at the bottom. Formation of a hairpin is predicted.

The expression of SurC in S. mutans UA159 was examined by Northern analysis. The conditions used included growth to an OD600 of 0.1, 0.4, and 1.2, corresponding to early, mid-log and stationary growth phases, respectively. Other conditions included the presence of competence stimulating peptide (CSP 0.2 µM), oxidative stress (0.003% H2O2), and heat-shock (50°C). RNA integrity was confirmed by gel electrophoresis. Exposure to stressors involved growth to mid-log phase, at which point the cultures were pelleted at 4000 RPM for 5 minutes and resuspended in the stress test media for 20 min. The expression levels for the various conditions were compared to the 5S control, and then normalized and compared with the level observed at stationary phase. Typically, expression levels were normalized to that observed at mid-log; however, the transcript for SurC was significantly reduced at the lower OD600s (p < 0.05), so the comparison was done using the stationary phase. The expression of SurC varied when exposed to various stressors as seen in Figure 4-3. Comparisons of the samples were made by ANOVA with Tukey post hoc test comparing the conditions. The expression of the transcript was significantly reduced when the cells were exposed to CSP and oxidative stress (p < 0.05). There was no significant change under heat-shock stress at the p < 0.05 threshold. The expression tests performed for exposure to CSP, H2O2, and increased temperature were performed at an OD600 of 0.4 to test whether or not expression would increase after exposure. The expression increased under heat exposure to a similar level as that observed for an OD600 of 1.2, but not for either CSP or oxidative stress.
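The two-step normalization described above (first to the 5S loading control, then to a reference condition) can be sketched as follows. The densitometry values and the function name are hypothetical and serve only to show the arithmetic; they are not measurements from this study.

```python
def normalize_expression(target, loading, reference):
    """Divide each band by its 5S loading control, then by the reference condition."""
    ratios = {cond: target[cond] / loading[cond] for cond in target}
    ref_ratio = ratios[reference]
    return {cond: ratio / ref_ratio for cond, ratio in ratios.items()}


# Hypothetical densitometry readings in arbitrary units
surc_band = {"OD 0.1": 1000.0, "OD 0.4": 2000.0, "OD 1.2": 9000.0}
five_s_band = {"OD 0.1": 10000.0, "OD 0.4": 10000.0, "OD 1.2": 12000.0}

relative = normalize_expression(surc_band, five_s_band, reference="OD 1.2")
```

By construction the reference condition has a relative expression of 1, and every other condition is expressed as a fraction or multiple of it.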

[Figure 4-3 image. Panel A: Northern blot with lanes OD 0.1, OD 0.4, OD 1.2, CSP, H2O2, and 50°C, showing SurC (190 bp) and 5S bands. Panel B: relative RNA expression by treatment. Panel C: normalized expression relative to no treatment at mid-log, by treatment.]

Figure 4-3 Northern analysis of SurC (smu82-83) of S. mutans UA159 grown in THYE. The expression of SurC (approximately 200 nt) varied due to its response to a variety of external stressors. A) The expression of SurC. B) Densitometry relative expression of the loading control 5S. C) Expression of SurC under various stressors relative to OD600 1.2. OD600 0.1; OD600 0.4; OD600 1.2; exposure to CSP (0.2 µM); exposure to H2O2 (0.003%); high temperature exposure (50°C). Data represents three independent experiments. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to normalized expression at an OD600 of 1.2; values represent mean +/- SE (* p < 0.05).

A map of the region in S. mutans including SurC, along with the 5’ RACE identified start site, can be seen in Figure 4-4. The SurC transcript is a component of the heat shock operon of the four genes hrcA, grpE, dnaK, and dnaJ (160). SurC is located between the heat shock chaperones dnaK and dnaJ (Figure 4-4). The 5’ RACE analysis of the sequenced transcript set the size of SurC at approximately 190 nt starting at 87,809. The transcript was on the positive strand and matched the orientation of the upstream and downstream genes. The ARNold (11) predicted Rho-independent terminator (87,969-87,999) was located within the 5’ untranslated region (UTR) for dnaJ.

[Figure 4-4 image. Map of positions 85,000 to 89,000 showing hrcA (smu.80), grpE (smu.81), dnaK (smu.82), SurC, and dnaJ (smu.83).]

Promoter region, annotated with -35, -10, and +1: AATATTACAA CCTTCTAAGT TTCAAAGAAA GGAACTGAAC CCGATCTTCA ATTCCTAGAA

Predicted rho-independent terminator: GGTATCTTCG TTGGATTTCT TATTTTTCTT TGAGTCGCTT AACGGCTTAA TATCTTATAT

Figure 4-4 Identification of the transcription start and stop sites of smu82-83 (SurC) in S. mutans. The small transcript smu82-83 (SurC) is located in the heat shock operon and is situated between two heat shock chaperones, dnaK (smu.82) and dnaJ (smu.83). Blue – positive strand; grey – putative small non-coding RNA, SurC; black – predicted rho-independent terminator. The 5’ RACE determined +1 start site is highlighted, with the direction of transcription identified by >>>. The putative promoter region was also identified.

A deletion mutant of SurC was generated and designated ΔSurC. The deletion was confirmed by sequencing the intergenic region and by PCR analysis. The mutant was used in phenotypic assays to characterize the effect of deleting SurC. Assays examining growth under the influence of various stressors (pH, H2O2, EtOH, NaCl), biofilm quantification, acid-tolerance response (ATR), natural competence, and antibiotic resistance were performed. The growth assays were performed in THYE and identified no significant differences (data not shown) as determined by ANOVA (p < 0.05). The expression of the adjacent genes in the wild-type and the mutant was compared using RT-PCR. There was no significant difference for dnaK (0.66 +/- 0.24). The downstream gene dnaJ was significantly upregulated in the mutant (7.81 +/- 3.66).

4.3 Analysis of the intergenic region between pyrG and tRNALeu19

4.3.1 The expression of the TARs located within IGRpyrG-tRNALeu19

The expression of the TARs between pyrG (smu.97) and fbaA (smu.99) was confirmed and quantified by Northern analysis under various conditions. The upstream arrangement of rpoE and pyrG is well-conserved in Gram-positive bacteria. However, the downstream region 3’ from pyrG is not as well conserved as shown by the gene organization in Figure 4-5. The intergenic region between smu.97 and smu.99 appears unique to S. mutans.


[Figure 4-5 image: synteny map showing tig, rpoE, pyrG, fbaA, and neighboring genes for Streptococcus mutans UA159, Streptococcus pasteurianus ATCC 43144, Streptococcus agalactiae 2603V/R, Streptococcus equi zooepidemicus MGCS10565, Streptococcus pyogenes MGAS15252, Streptococcus pneumoniae D39, and Streptococcus suis 05ZYH33.]

Figure 4-5 Genomic organization of rpoE and pyrG in various Streptococci. The organization of the genes encoding the RNA polymerase subunit delta (rpoE), cytidine triphosphate synthetase (pyrG), and fructose bisphosphate aldolase (fbaA) for several streptococci. Purple – trigger factor tig; pink – rpoE; blue – pyrG, diamonds – tRNA or rRNA; yellow – fbaA, white – unconserved proteins.

There are several key metabolic genes located in close proximity to the intergenic region between smu.97 and smu.99. The rpoE gene encodes the delta subunit of RNA polymerase. The pyrimidine synthesis gene pyrG encodes the cytidine triphosphate synthetase protein. A leucine encoding tRNA is the next component of the region (tRNALeu19). Finally, smu.99 is the fructose-bisphosphate aldolase gene (fbaA). The TARs examined in this region are located downstream of pyrG and upstream of the tRNALeu19; for that reason, the complete intergenic region was referred to as IGRpyrG-tRNALeu19. A detailed map of the intergenic region is shown in Figure 4-6. This study identified four transcripts within the IGRpyrG-tRNALeu19. More details on the sizes of the transcripts are given in section 4.3.2.


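[Figure 4-6 image. Map of IGRpyrG-tRNALeu19 (positions 99,500 to 100,250) showing pyrG, tRNALeu19, the inverted repeat (IR), Northern probes np1 and np2, transcripts MrrCa, MrrCb, MrrCc, and smu97-99, and ORF1 and ORF2.]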

Figure 4-6 Genomic organization of IGRpyrG-tRNALeu19 in S. mutans UA159. The organization of the intergenic region bound by pyrG and the tRNALeu19 was examined by two Northern probes, np1 and np2. Four transcripts were identified, MrrCa, MrrCb, MrrCc, and smu97-99. A rho-independent terminator is located at the 3’ end of MrrCa. The conserved ORF1 is located within MrrCa, and ORF2 is located within MrrCb. Blue, pyrG; red, tRNALeu19; orange, Northern probes; green, peptides; grey, examined RNA; black, inverted repeat.

An examination of the expression of IGRpyrG-tRNALeu19 was performed by Northern analysis on cultures grown in THYE with the Northern probe np1 (Figure 4-6 and Figure 4-8). The tests were similar to those performed with SurC, with the addition of 5°C, various pH, and metals. RNA integrity was confirmed by gel electrophoresis (Figure 4-7). The expression levels were compared with the 5S control, and then normalized to the level observed at OD600 of 0.4. A total of four transcripts were identified (Figure 4-6). The transcripts located within the first Northern probe (np1) were found to be a metal responsive RNA and named MrrC as a result. The expression results are shown in Figure 4-8. Interestingly, two transcripts were observed during early growth phase (OD600 0.1) as well as during acidic shock (pH 3.5). The RNA expression was analyzed by ANOVA with a Tukey post-hoc test comparing the conditions. The expression of the transcript was significantly increased when the cells were exposed to CSP and at 50°C (p < 0.05) and significantly decreased under pH 3.5 stress (p < 0.05).


[Figure 4-7 image: ethidium bromide-stained gel with lanes OD 0.1, OD 0.4, OD 1.2, CSP, H2O2, 50°C, 5°C, and pH 3.5.]

Figure 4-7 Integrity of DNase treated RNA. Ethidium bromide-stained gel to demonstrate RNA integrity. RNA was extracted for various conditions. OD600 0.1; OD600 0.4; OD600 1.2; exposure to CSP (0.2 µM); exposure to H2O2 (0.003%); heat-shock (50°C); cold-shock (5°C); acidic-shock (pH 3.5) exposure.

[Figure 4-8 image. Panel A: Northern blot with lanes OD 0.1, OD 0.4, OD 1.2, CSP, H2O2, 50°C, 5°C, and pH 3.5, showing MrrC (~420 bp) and 5S bands. Panel B: relative RNA expression by treatment. Panel C: normalized expression relative to no treatment at mid-log, by treatment.]

Figure 4-8 Transcriptional analysis of intergenic region MrrC expressed when S. mutans was grown in THYE using np1 as a probe. A) The expression of MrrC (approximately 400 nt) varied in its response to a variety of external stressors. Two transcripts were evident during early growth (OD600 0.1) and pH 3.5 conditions. B) Densitometry quantification of 5S RNA expression under various conditions. C) Expression of MrrC under various stressors normalized to the relative expression at an OD600 of 0.4. OD600 0.1; OD600 0.4; OD600 1.2; exposure to CSP (0.2 µM); exposure to H2O2 (0.003%); heat-shock (50°C); cold-shock (5°C); acidic-shock (pH 3.5) exposure. Data represents three independent experiments. Data was analyzed by ANOVA using the Tukey post-hoc test and compared to normalized expression at an OD600 of 0.4; values represent mean +/- SE (* p < 0.05).

[Figure 4-9 image. Panel A: Northern blot with lanes pH 3.0, pH 5.0, pH 7.0, and pH 10.0, showing MrrC and 5S bands. Panel B: normalized expression relative to no treatment at mid-log in MDM, by pH.]

Figure 4-9 The expression of MrrC of S. mutans UA159 grown in MDM under pH stress visualized by A) Northern analysis and B) normalized expression. The expression of MrrC displays variation due to its response to pH. A) Northern analysis of MrrC for pH 3.0, 5.0, 7.0, and 10.0. B) Expression of MrrC under various pH normalized to the expression of MrrC at pH 7.0 and OD600 0.4. Data is representative of three independent experiments. Data was analyzed by an ANOVA using the Tukey post-hoc test and compared to normalized expression at an OD600 of 0.4; values represent mean +/- SE (* p < 0.05, ** p < 0.001).

Previous work in B. subtilis (158), S. mutans (157), and E. coli (99) provided a rationale for the next experiment. A literature search revealed that in B. subtilis, pyrimidines altered the structure of the intergenic region between rpoE and pyrG (159); pyrG was regulated by pyrimidine concentration. Therefore, to determine whether or not pyrimidine concentration also regulated MrrC, the expression was observed in the presence and absence of the two pyrimidines, orotic acid and cytidine. Furthermore, reports in the literature show that rpoE is sensitive to metal stress in E. coli (99); in E. coli when cells are stressed with metals, the expression of rpoE increases. The expression of MrrC was therefore examined to determine if MrrC was also involved in metal stress in S. mutans.

Northern analysis was performed to measure the expression of MrrC in the presence of pyrimidines (Figure 4-10a) grown in MDM. As in the case of the pH stress analysis, MDM was used in order to have stricter control of the exposure conditions. Growth of S. mutans UA159 in the presence of the pyrimidines orotic acid and cytidine resulted in the significant reduction of the expression of MrrC relative to the no stress control (Figure 4-10, p < 0.05). This suggests that the presence of pyrimidines repressed the expression of MrrC.

Northern analysis was also used to measure the effect of metal stress on expression of MrrC; this provided the most surprising result (Figure 4-10a). The exposure of S. mutans UA159 to copper (250 µM), zinc (100 µM), and sodium chloride (0.4 M) all significantly reduced the expression of MrrC relative to the no stress controls (p < 0.001, Figure 4-10a and d). The addition of metals repressed the expression of MrrC, and furthermore, analysis using quantitative real time-PCR (qRT-PCR) showed that the repression was also dependent upon metal ion concentration. As can be seen in Figure 4-10b, as the copper concentration increases, the expression is derepressed. Thus, the MrrC region switches the repression based on the amount of metal introduced. This suggests that MrrC was very sensitive to sensing metal stress, and that the reaction is reversible.

To sum up the results for the use of np1, the expression of the TAR MrrC was shown to respond to several extracellular stressors. MrrC was upregulated by CSP and heat but was repressed by pH and the presence of pyrimidines, and was very sensitive to metal stress.


[Figure 4-10 image. Panel A: Northern blot with lanes Control, OA, Cyt, Cu, Zn, and Na, showing MrrC (~400 bp) and 5S (~120 bp) bands. Panel B: MrrC fold expression measured by qRT-PCR against copper concentration (0.5 mM, 1.0 mM, 2.0 mM). Panel C: relative 5S RNA expression by treatment (Wt, OA, Cyt, Cu, Zn, Na). Panel D: normalized MrrC expression by treatment (Wt, OA, Cyt, Cu, Zn, Na).]

Figure 4-10 Response of MrrC of S. mutans UA159 to metal stress in MDM. The expression of MrrC (approximately 400 nt) responds to nutrition and metals. A) The Northern analysis of the expression of MrrC when exposed to an excess of the pyrimidines orotic acid (OA) and cytidine (Cyt) for 2 hours, copper (Cu), zinc (Zn), and sodium (Na). B) The fold expression using quantitative real time PCR for copper exposure. Data is representative of three independent experiments. C) Densitometry quantification of 5S RNA expression under various conditions. D) Expression of MrrC under various stressors normalized to the relative expression at an OD600 of 0.4. A, C, and D) Lanes 1) UA159 control with no stressors, 2) orotic acid (100 µg/ml), 3) cytidine (200 µg/ml), 4) copper (250 µM), 5) zinc (100 µM), 6) NaCl (0.4 M). Data was analyzed by an ANOVA using the Tukey post-hoc test and compared to normalized expression at an OD600 of 0.4; values represent mean +/- SE (* p < 0.05, ** p < 0.001).

A second Northern probe (np2, 99930 to 100152 nt) was used to further map the upstream region within the IGRpyrG–tRNALeu19 (Figure 4-6). The Northern analyses were performed with a 200 nt probe under the same conditions used for SurC, and examined in THYE. The observed transcript was named smu97-99, and ranged in size from 200 – 300 nt (Figure 4-11). As can be seen in Figure 4-11, the expression was at the same level under the tested conditions. The transcript yielded poor hybridization and was difficult to visualize. This may suggest that the transcript for this region was unstable. This region does not share the MrrC title since the region was not tested against metal exposure.

[Figure 4-11 image. Panel A: Northern blot with lanes OD 0.1, OD 0.4, OD 1.2, CSP, H2O2, and 50°C, showing smu97-99 (~250 bp) and 5S bands. Panel B: relative 5S RNA expression by treatment. Panel C: normalized smu97-99 expression by treatment.]

Figure 4-11 Transcriptional analysis of smu97-99 using np2 of S. mutans UA159 grown in THYE. The expression of smu97-99 (approximately 200 nt) displayed variation due to its response to a variety of external stressors. A) Northern analysis of smu97-99. The transcript appears as different sized bands. B) Densitometry expression in relative units of the loading control 5S. C) Expression of smu97-99 under various stressors relative to the normalized expression at an OD600 of 0.4. Lanes for A), B) and C); Lane 1, OD600 0.1; Lane 2, OD600 0.4; Lane 3, OD600 1.2; Lane 4, exposure to CSP (0.2 µM); Lane 5, exposure to H2O2 (0.003%); Lane 6, high temperature (50°C) exposure. Data is representative of three independent experiments. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to normalized expression at an OD600 of 1.2; values represent mean +/- SE (* p < 0.05).

Although the expression of smu97-99 showed no significant differences, two observations became relevant for the IGRpyrG–tRNALeu19. Finding more than one transcript was expressed within the IGRpyrG–tRNALeu19 indicated that this region was complex. It was apparent that there were at least three transcripts produced in the intergenic region; the 5’ RACE confirmed three and found a fourth start site as well.

4.3.2 5’ and 3’ RACE Results of IGRpyrG–tRNALeu19

Four transcription start sites were identified between pyrG and fbaA by mapping the results obtained by 5’ RACE for the IGRpyrG–tRNALeu19. It is not unusual for a transcriptionally active site to contain multiple start sites, and there are numerous reports in the literature of this occurring (66, 187). Three transcripts were observed for the MrrC region and designated MrrCa, MrrCb, and MrrCc. As seen in Figure 4-12, the MrrCa (99602-99989) and MrrCb (99434-99836) start sites were located 100 base pairs apart. Both transcripts were encoded on the complement strand. The MrrCc (99,434-99,830, approx. 400 nt) and smu97-99 (99,930-100,100, approx. 200 nt) transcripts are on the positive strand. The promoter region for smu97-99 is within the MrrCa transcribed region, and the promoter region for MrrCa is in the smu97-99 transcribed region. The result of the 5’ RACE identified four +1 sites within the region between pyrG and fbaA.

Identification of the 3’ region by 3’-RACE confirmed the stop site for MrrCa, which coincided with an inverted repeat, as described in Section 4.3.6. The locations of the various features between smu.97 and smu.99 are shown in Figure 4-13.
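The coordinates reported above can be turned into transcript lengths and overlaps with simple interval arithmetic. The sketch below assumes the coordinates are inclusive and ignores strand; both are assumptions made for illustration, and the helper names are hypothetical.

```python
# Coordinates as reported in the text, assumed to be inclusive
transcripts = {
    "MrrCa": (99602, 99989),
    "MrrCb": (99434, 99836),
    "MrrCc": (99434, 99830),
    "smu97-99": (99930, 100100),
}


def span_length(span):
    """Number of positions covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


lengths = {name: span_length(span) for name, span in transcripts.items()}
shared = overlap(transcripts["MrrCa"], transcripts["smu97-99"])
```

Under these assumptions the MrrCa and smu97-99 spans share 60 positions, which is in line with each promoter region lying within the other transcribed region.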


[Figure 4-12 image. Map of positions 99,500 to 100,250 showing pyrG, smu_t19, MrrCa, MrrCb, MrrCc, and smu97-99, with the annotated sequences listed below.]

Sequence annotated -35, -10, +1: CTTTAAATAA ATATGGTTTC TATACCCGTT ATAGTTAAAG AAATTGGCTG CCAACAAAAC

Sequence annotated -35, -10, +1: GCGGTTGAAA ACAGTCAGGC ACATTAATTG ATAGAAAAAT GAGTTACTTG

Sequence annotated +1, -10, -35: AGGTTTCTAA CTAAATTTAT GAAGTTTTCG ATTTGATAAG TATGACTTAG

Inverted repeats: AAAGAAACAC CTTCTTAAAT CTCATTTACT AGATTTAAGA AGGTGTTTCT TTTTTTTGGA

Sequence annotated +1, -10, -35: CCTAAAATAGTAACA TTACTTTATA AATAATAAAT CTGTAATAAC

Figure 4-12 Identification of transcription start sites for the intergenic region between smu.97 and tRNALeu19 of S. mutans identified by 3’ and 5’ RACE. The transcribed region of MrrC and smu97-99 is located between pyrG (smu.97) and smu_t19. Expression mapping identified three overlapping transcripts MrrCa, MrrCb, and MrrCc. The three transcripts occur within the limits of Northern probe 1 (np1) and downstream of pyrG. The fourth transcript smu97-99 is located within the limits of Northern probe 2 (np2) and upstream of fbaA. Blue, positive strand; grey, putative small RNA; Black, rho-independent terminator; red, tRNALeu19.


[Figure 4-13 image: annotated top-strand sequence of IGRpyrG-tRNALeu19. Each row below starts with its coordinate; the features annotated on a row are listed before it.]

Features: SloR site; MrrCb stop (predicted); MrrCc 5’ +1; PyrG stop; MrrCc -10; CcpA site; IR02a
99400 AGTCAGGCACATTAATTGATAGAAAAATGAGTTACTTGACACGGCTTTCA GTAAAAAGAATAAAGAGATTCCATGTAGATTAAACTTTTGCTTATAAAGA

Features: MrrCa stop; 3’ rho-independent terminator; IR02a; IR02b
99500 AACACCTTCTTAAATCTCATTTACTAGATTTAAGAAGGTGTTTCTTTTTT TGGATCATTATTACAGCTAATCGTTCAATTCAAAAATGAATAAATCATAT

99600 TAGCTATTACAAAAATACATTCCAGAATAAAAATTCCAAAACAGAGAAGC TTGACAGTACCGCTTCGGCTATACTTAGAAATTAAGACTAATACTGCGAA

99700 ACTTAATAGTATGAACGAGTAAATAATAAACATCATATATTCAGCACAAT CTCTTTATAATGGCGTATTGTAACCAACCCCCAACAACTACTACGCCAAC

Features: smu97-99 -35; MrrCb -10; MrrCb -35; sloR; MrrCb 5’ +1
99800 CTTCACCTTGGATTTGTACAAAGCAATCAGCATCCAAAGATTTAAATACT TCAAAAGCTAAACTATTCATACTGAATCTCCTTTAAATAAATATGGTTTC

Features: smu97-99 5’ +1; sloR; smu97-99 -10; smu.921 (RcrR) site; MrrCa -10; MrrCa 5’ +1
99900 TATACCCGTTATAGTTAAAGAAATTGGCTGCCAACAAAACGAATACACGT ATAGTTTAGTTTTTTTCAAGAAAAATTCAAGAGGATTTTATCATTGTAAT

Features: MrrCa -35; codY site
100000 GAAATATTTATTATTTAGACATTATTGTGTTATTTTTCTGTTTTTTTAAG TTTTTTGCCATGCAAAATTATTTTTTAGCAGCTATTATTGTATTAAATAG

Features: comE site; smu97-99 stop (predicted)
100100 CTTAAAGTAATAACGAATTTGATTGTATTAAGTCACTTTGAGTAATGAGG AGATTACTATAATTTCAAGGGAAAAATGTTGACGTTTTATAGATATATTT

100200 TTCAGTATCAAAAGTTTTGATTTTACTTTGGTGGTTTCAATTGACCTTTT

Figure 4-13 5’ and 3’ RACE identified transcription start and stop sites for IGRpyrG-tRNALeu19. The predicted -35 and -10 sites are underlined and in bold. The 5’ RACE +1 start site is underlined and in bold with the direction of transcription indicated as >>> and with a shaded box. The 3’ RACE stop site for MrrCa is underlined and in bold. No 3’ site was found for the other transcripts. The putative regulatory motifs for ccpA, rcrR, sloR, codY, and comE described in Section 4.4.4 were identified as well. For clarity, only the top strand is shown.


4.3.3 IGRpyrG-tRNALeu19 in Other Strains of Streptococcus mutans

The IGRpyrG-tRNALeu19 in other strains of S. mutans was investigated bioinformatically.

The alignment of IGRpyrG-tRNALeu19 revealed that the regions were conserved in the S. mutans strains UA159, NN2025, GS-5, and LJ23 and were 917, 1391, 917, and 913 nt in size, respectively. The GenBank annotations for GS-5 and LJ23 identified one hypothetical protein between PyrG and tRNALeu19, whereas S. mutans NN2025 contained two overlapping hypothetical coding regions (Figure 4-6 and Figure 4-14). Upon further investigation, S. mutans UA159 was found to contain two putative ORFs in the same region (ORF1 and ORF2). ORF1 in S. mutans UA159 matched the annotated proteins SmuNN2025-0091, SMUGS-5_00410, and SMULJ23-0079.

ORF2 was identified using MacVector software (MacVector, NC) and did not match the GenBank-annotated ORFs. It has not yet been verified that either of the ORFs encodes a protein.

Efforts to verify the existence of either protein were unsuccessful. Two different attempts were made to construct a plasmid in E. coli containing either ORF1 or ORF2 along with its own promoter; however, no colonies bearing the plasmid construct were recovered. The first attempt incorporated the ORF's own promoter and added a histidine tag to the end of the protein construct. The second attempt ligated the putative peptide to a constitutive promoter. There are several possible reasons why clones carrying the peptides could not be generated. The region may be an ORF that produces unstable constructs. Alternatively, the constructs using IGRpyrG-tRNALeu19 may contain an sRNA that adversely affects the health of the cell when it is located in an overexpression vector. Although a clone encoding the peptide was not generated, examining the two hypothetical proteins between PyrG and tRNALeu19 revealed some similarities and differences between the four genomes.


[Figure 4-14 image: gene maps of the pyrG - fbaA region in Streptococcus mutans UA159, GS-5, LJ23, and NN2025, labelled with tig, rpoE, pyrG, the tRNA, fbaA, and locus tag numbers.]

Figure 4-14 Synteny map of pyrG - fbaA in S. mutans. The transcribed region of MrrC and smu97-99 is located between pyrG (smu.97) and tRNALeu19 (smu_t19) and is conserved in other strains of S. mutans. The regions between pyrG and tRNALeu in GS-5, LJ23, and NN2025 each contain at least one hypothetical protein-coding ORF. Pink – rpoE, Blue – pyrG, Hash – tRNA, Yellow – fbaA, white – other annotated genes.

A conserved hypothetical protein was found within the transcribed region of ORF1 (Figure 4-6 and Figure 4-14). When compared to ORF1 from UA159, this protein had high identity with SMUGS-5_00410 (89% identity), SMULJ23-0079 (87% identity), and SmuNN2025_0091 (84% identity). A sequence alignment was performed using T-Coffee (188), with the result shown in Figure 4-15. The predicted 45 amino acid protein of ORF1 is well conserved in S. mutans LJ23 and S. mutans GS-5. In S. mutans NN2025, the N-terminus of the protein is well conserved, but the C-terminus is not. The putative -35 and -10 sites were identified in S. mutans UA159 (Figure 4-13).


UA159-MrrCa (ORF1)    1 MNSLAFEVFKSLDADCFVQIQGEGWRSSCWGLVT------ 34
LJ23-0079             1 MNSLAFEKFESLDADCLVQFQGEGWRSSCWGWLQY------ 35
GS5-00410             1 MNSLAFEKFESLDADCLVQFQGEGWRSSCWGLVT------ 34
NN2025-0091           1 MNSLAFEKFESLDADCLVQFQGEGWVGAVAGGVAGFIGGAALGAEAAAPLALIPGFGWVT 60
                        ******* * ****** ** ***** . * .

UA159-MrrCa          35 I------RHYKEIVLNI 45
LJ23-0079            36 ------AIIK---RLC 42
GS5-00410            35 I------RHYKEVVLNI 45
NN2025-0091          61 DVGCATVGAISGAIGGAASGYKA--SAW 86
                        *

Figure 4-15 T-Coffee alignment of the ORF1 peptide found in MrrCa for four strains of S. mutans. Symbols * - conserved residue

Although it was not annotated in GenBank, ORF2 was also well conserved in all four strains. ORF2 is 54 residues long and contains four doubled amino acids: MM, II, LL, and LL. The significance of such repeats is unknown. A matching ORF was found in SMUGS-5 (98% identity), SMULJ23 (98% identity), and SmuNN2025 (100% identity), with the alignment shown in Figure 4-16. The hypothetical protein of S. mutans NN2025 (SmuNN2025_0090) was examined to determine whether it was conserved, but it was not found in other strains of S. mutans. As seen in Figure 4-13, the putative -35 and -10 sites were also identified in S. mutans UA159.

UA159 MrrCb (ORF2)   1 MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFILECIFVIANMIYSFLN* 54
LJ23-P2              1 MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFMLECIFVIANMIYSFLN* 54
GS5-P2               1 MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFMLECIFVIANMIYSFLN* 54
NN2025-P2            1 MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFILECIFVIANMIYSFLN* 54
                       ************************************.*****************

Figure 4-16 T-Coffee alignment of the ORF2 peptide found in MrrCa for four strains of S. mutans. Repeat residues are highlighted in bold, Symbols: * - conserved residue
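The identities quoted for ORF2 can be checked from the aligned sequences in Figure 4-16. The following is a minimal sketch, assuming a plain column-by-column comparison of pre-aligned sequences of equal length; the function name percent_identity is illustrative only and is not part of T-Coffee, and the trailing stop symbol is left out of the comparison.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns in which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# ORF2 sequences as shown in Figure 4-16 (stop symbol omitted).
UA159 = "MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFILECIFVIANMIYSFLN"
LJ23 = "MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFMLECIFVIANMIYSFLN"
NN2025 = "MMFIIYSFILLSFAVLVLISKYSRSGTVKLLCFGIFILECIFVIANMIYSFLN"

# UA159 and LJ23 differ at a single position (I versus M).
print(round(percent_identity(UA159, LJ23)))
# UA159 and NN2025 are identical over the whole peptide.
print(round(percent_identity(UA159, NN2025)))
```

Under these assumptions the rounded values agree with the 98% and 100% identities reported in the text.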

The TAR responsible for ORF1 and ORF2 can be deduced from its location with respect to MrrC. The second protein (ORF2) was not in frame with ORF1, which suggested that the hypothetical start site for ORF2 would be distinct from the start site for ORF1. The ORF1 promoter and start region were located inside MrrCa (Figure 4-6); thus MrrCb does not encode ORF1. Both ORF1 and ORF2 are located within the transcript of MrrCa. This suggested that MrrCa would generate ORF1, although ORF1 is located over 120 nt upstream of the MrrCa promoter. The presence of long 5’ untranslated regions proximal to expressed proteins has not yet been shown in S. mutans. Also, since the two ORFs were not in frame, ORF2 would likely be generated by MrrCb.

4.3.4 Growth Under Various Stressors

A deletion mutant of the IGRpyrG-tRNALeu19 (ΔMrrC) was generated. The deletion was confirmed both by PCR analysis and by sequencing of the intergenic region. The primers previously used to amplify the intergenic region no longer produced a PCR product, indicating the successful knock-out of the region; this was confirmed by sequencing the region. The expression of the adjacent genes in the wild-type and the mutant was compared using qRT-PCR. There was no significant difference for pyrG (1.50 +/- 0.76) or for the downstream gene fbaA (1.23 +/- 0.39). The mutant was then characterized in phenotypic assays. Assays examining growth under the influence of various stressors, biofilm quantification, acid-tolerance response (ATR), natural competence, and antibiotic resistance were performed.

The growth of the strains was examined in both complex medium (THYE) and minimally defined medium (MDM) (174) to compare rich and defined nutrient conditions, respectively. S. mutans grown in either THYE or MDM was stressed with pH 5.0, pH 3.5, 0.002 % H2O2, 0.004 % H2O2, 2.5 % ethanol (EtOH), 4.0 % EtOH, 0.8 mM NaCl, and increasing concentrations of copper.

In THYE, the growth of UA159 and ΔMrrC was not significantly different (ANOVA, p < 0.05) under the unstressed conditions measured using the Bioscreen C (Figure 4-17). The unstressed doubling times were calculated for UA159 (1.17 +/- 0.13 hours) and ΔMrrC (1.20 +/- 0.14 hours) (Table 4-2).

[Figure 4-17 plots: panels A (pH 5.0) and B (EtOH) in THYE, panels C and D in MDM; OD600nm (log scale, 0.01 to 1) versus time (0 to 14 hours).]

Figure 4-17 Growth of S. mutans UA159 (wild-type) and ΔMrrC under pH 5.0 and 4.0 % ethanol stress in THYE and MDM. The growth of the wild-type and ΔMrrC strains is compared in THYE (A, B) and MDM (C, D). Data are representative of three independent experiments. Colors: UA159, red; ΔMrrC, blue. Shapes: solid, unstressed control; open, stressed.

The growth of UA159 and ΔMrrC was also examined in MDM. The unstressed controls were not significantly different for the two strains (Figure 4-17, ANOVA, p < 0.05). The doubling times were calculated for UA159 and ΔMrrC to be 1.89 +/- 0.04 and 1.93 +/- 0.02 hours, respectively. Significant differences between the strains were observed for cells growing at pH 5.0 (UA159 4.15 +/- 0.61; ΔMrrC 5.57 +/- 0.61 hours) and in 4.0 % EtOH (UA159 4.94 +/- 0.49; ΔMrrC 6.56 +/- 0.67 hours) (p < 0.05). The growth under pH 5.0 and 4.0 % EtOH stress in MDM is shown in Figure 4-17. The doubling times for the various stressors were calculated for both THYE and MDM (Table 4-2).
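For readers who wish to reproduce doubling times from optical density readings, one common approach is sketched below. It assumes that the readings come from the exponential phase and that the doubling time is ln(2) divided by the slope of ln(OD600) against time; this section does not state the exact calculation used, so the method, the function name doubling_time, and the example readings are all illustrative assumptions.

```python
import math


def doubling_time(times_h, od600):
    """Doubling time (hours) from a least-squares fit of ln(OD600) against time."""
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD600 readings")
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    growth_rate = sxy / sxx  # specific growth rate, per hour
    return math.log(2) / growth_rate


# Hypothetical readings that double every 2 hours (illustration only, not thesis data).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
ods = [0.05 * 2 ** (t / 2.0) for t in times]
print(round(doubling_time(times, ods), 2))
```

Restricting the fit to the log-linear portion of the curve matters in practice, because lag and stationary phase readings would inflate the estimated doubling time.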

Table 4-2 Calculated Doubling Times (hours) of S. mutans UA159 and ΔMrrC in both THYE and MDM under various stressors

                    THYE                            MDM
                    UA159           ΔMrrC           UA159           ΔMrrC
Various Stressors   Mean    SD      Mean    SD      Mean    SD      Mean    SD
Control             1.17    0.13    1.2     0.14    1.89    0.04    1.93    0.02
pH 5                3.94    0.64    4.01    0.81    4.15    0.61    5.57*   0.61
pH 3.5              55.53   14.26   56.37   12.88   60.94   8.79    97.71   15.86
0.002 % H2O2        2.23    0.36    2.31    0.17    1.75    0.02    1.85    0.02
0.002 % H2O2        2.72    0.35    2.57    0.37    1.69    0.02    1.80    0.02
2.5 % EtOH          1.32    0.22    1.38    0.12    2.53    0.10    2.94    0.24
4.0 % EtOH          1.92    0.21    2.19    0.11    4.94    0.50    6.56*   0.67
0.8 mM NaCl         1.37    0.20    1.77    0.22    1.70    0.03    1.86    0.04

                    THYE                            MDM
                    UA159           ΔMrrC           UA159           ΔMrrC
Metal stress (mM)   Mean    SD      Mean    SD      Mean    SD      Mean    SD
0.0                 1.23    0.19    1.03    0.14    2.44    0.17    2.49    0.27
0.125               1.09    0.05    1.20    0.14    2.46    0.15    2.53    0.13
0.25                1.08    0.03    1.23    0.13    2.34    0.07    2.62    0.13
0.5                 1.26    0.02    1.23    0.12    2.64    0.19    2.79    0.06
1.0                 2.72    0.35    2.18    0.20    2.95    0.35    3.18    0.48
1.5                 1.32    0.22    2.10    0.22    4.50    0.30    5.21    0.65
2.0                 1.92    0.21    2.16    0.27    4.92    0.34    5.30    0.56
3.0                 1.37    0.2     3.79    0.68    7.90    0.47    6.87    0.87

Data represents three independent experiments. The data was analyzed by an ANOVA using the Tukey post-hoc test and compared to the control; values represent mean +/- standard deviation (SD, * p < 0.05, grey). Abbreviations: ethanol, EtOH; sodium chloride, NaCl.


4.3.5 Growth Under Copper Stress

As mentioned previously, the up-regulation of RpoE upon exposure to metal in E. coli has been reported in the literature (99). Since the MrrC region was in close proximity to RpoE, the copper regulation of MrrC was investigated. As shown in Section 4.3.1, the expression of MrrC was sensitive to the concentration of copper. The growth of UA159 and ΔMrrC was examined under copper stress ranging from 0 to 3 mM in both THYE and MDM cultures.

The growth kinetics of S. mutans in the presence of copper were examined. Cultures were agitated before each OD600 reading to ensure that the cells were suspended. As shown in Section 4.3.4, the doubling times of UA159 and ΔMrrC grown in THYE were not significantly different under unstressed conditions (Table 4-2, ANOVA, p < 0.05). In THYE, no significant change in lag phase was seen when comparing untreated and copper-treated cells (Figure 4-18). In the wild-type, the cell density increased with increasing concentrations of copper until a maximum was reached at 1.5 mM; beyond this concentration the OD600 began to decrease due to metal toxicity. Although the effect was not statistically significant (ANOVA, p < 0.05), the average final OD600 for ΔMrrC was lower than that of the wild-type in both THYE and MDM (Figure 4-18).


[Figure 4-18 plots: panels A (1.5 mM Cu) and B (3.0 mM Cu) in THYE, panels C and D in MDM; OD600nm (log scale, 0.01 to 1) versus time (0 to 14 hours).]

Figure 4-18 Growth curves of UA159 and ΔMrrC grown in THYE in the presence of (A) 1.5 mM and (B) 3 mM copper. In THYE, the growth of the wild-type is compared with that of ΔMrrC under copper stress. Data are representative of three independent experiments. Symbols used: UA159 ( ), UA159 with Cu ( ), ΔMrrC ( ), ΔMrrC with Cu ( ).

The growth of S. mutans in the presence of copper was also examined using static cultures to determine whether aeration via agitation was needed. The cultures were grown in 5% CO2 in both MDM and THYE, and the final OD600 was measured for the wild-type and ΔMrrC. The OD600 values were normalized to the final OD600 obtained for the untreated wild-type (0.0 mM). In static cultures grown in THYE, there were significant differences between wild-type and ΔMrrC at 1.5 and 3.0 mM; in MDM there were significant differences at 1.0, 1.5, 2.5, and 3.0 mM (ANOVA, p < 0.05). No significant difference was observed in agitated cultures. The highest normalized OD600 in both media was achieved by the wild-type strain. The lowest normalized OD600 values were observed with ΔMrrC.
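The normalization described above amounts to dividing each final OD600 by the value for the untreated wild-type. A minimal sketch follows; the function name normalize_final_od, the key layout, and the readings are hypothetical and are not taken from the thesis data.

```python
def normalize_final_od(final_od, reference_key):
    """Divide every final OD600 by the final OD600 of the reference condition."""
    reference = final_od[reference_key]
    if reference <= 0:
        raise ValueError("reference OD600 must be positive")
    return {key: value / reference for key, value in final_od.items()}


# Hypothetical final OD600 readings keyed by (strain, copper in mM); not thesis data.
readings = {
    ("UA159", 0.0): 0.80,
    ("UA159", 1.5): 0.88,
    ("dMrrC", 0.0): 0.76,
    ("dMrrC", 1.5): 0.60,
}
normalized = normalize_final_od(readings, ("UA159", 0.0))
for key in sorted(normalized):
    print(key, round(normalized[key], 2))
```

With this convention the untreated wild-type is always 1.0, so values above or below 1.0 read directly as gains or losses relative to that reference.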

[Figure 4-19 plots: static (A, C) and agitated (B, D) cultures in THYE (A, B) and MDM (C, D); final OD600 normalized to UA159 at 0 mM versus copper treatment (mM); asterisks mark significant differences.]

Figure 4-19 Normalized final OD600 of UA159 and ΔMrrC grown in the presence of copper in MDM and THYE in both static and agitated cultures. The optical densities at 600 nm for all other conditions and for ΔMrrC were normalized to the final measurement obtained for the untreated (0 mM) wild-type. Data are representative of four independent experiments. Symbols used: UA159, black; ΔMrrC, grey. No significant difference was observed. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to normalized expression of UA159 at a copper concentration of 0 mM (p < 0.05); values represent mean +/- SD.


4.3.6 Bioinformatic Identification of Inverted Repeats and Transcription Termination

Several sequence features can be identified using bioinformatic searches. Three such features are the direct repeat (DR), the inverted repeat (IR), and the rho-independent terminator. For example, a tandem DR in S. mutans was identified between the bacteriocin gene nlmA and comC. As the features of the MrrC region were unknown, a bioinformatic search for patterns within the entire chromosome was undertaken using the EMBOSS webserver (184). A search for DRs using the EMBOSS etandem webtool yielded no results within the intergenic regions investigated for this report. A genome-wide search for IRs was performed using the EMBOSS einverted webtool, and over 100 sites were identified along the S. mutans chromosome. The final 53 IRs for analysis were selected using two cutoff criteria: the identity of the repeat was required to be >85%, and the space between the two repeats was limited to <125 nt. This search yielded IGRs ranging from 50 to 200 nt long. IRs were found in two regions investigated in this report, IGRpyrG-tRNALeu19 and smu.807-809. Since the smu.807-809 region lacked expression data, only IGRpyrG-tRNALeu19 was followed up.
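The two cutoff criteria can be expressed as a simple filter. The sketch below assumes that each candidate repeat has already been reduced to a match count, an arm length, and a gap size; the InvertedRepeat record, its field names, and the candidate values are hypothetical and do not reproduce the einverted output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvertedRepeat:
    name: str
    matches: int      # matching positions between the two arms
    arm_length: int   # aligned length of one arm
    gap: int          # nucleotides separating the two arms

    @property
    def identity(self) -> float:
        """Percent identity between the two arms of the repeat."""
        return 100.0 * self.matches / self.arm_length


def passes_cutoffs(ir, min_identity=85.0, max_gap=125):
    """Keep repeats with identity above 85% and a gap below 125 nt."""
    return ir.identity > min_identity and ir.gap < max_gap


# Hypothetical candidates (illustration only, not thesis data).
candidates = [
    InvertedRepeat("IR_a", matches=16, arm_length=16, gap=10),   # kept
    InvertedRepeat("IR_b", matches=17, arm_length=20, gap=40),   # exactly 85%, rejected
    InvertedRepeat("IR_c", matches=19, arm_length=20, gap=300),  # gap too long, rejected
]
selected = [ir.name for ir in candidates if passes_cutoffs(ir)]
print(selected)
```

Both comparisons are strict, matching the ">85%" and "<125 nt" wording of the criteria.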

The commonality among the 53 inverted repeats was examined using the Regulatory Sequence Analysis Tools webtool Consensus (http://rsat.ulb.ac.be/consensus_form.cgi). All 53 sequences were included in the alignment. The aligned motif was determined to be 20 nt for both the upstream and downstream sites. The motif structure suggested that the upstream portion was characterized by several well-conserved adenines and the downstream end by weakly conserved thymines. Further analysis identified a limitation of the Consensus webtool: the alignments were not stringent enough to characterize the inverted repeats.

Another useful bioinformatic prediction tool identifies transcription termination by rho-independent terminators. One such tool for predicting putative transcription terminators is the ARNold webtool (11). When the IGRpyrG-tRNALeu19 region was examined for a putative rho-independent transcription stop site, the predicted location was at the end of the MrrCa transcript (99,502-99,542, shown in Figure 4-6 and Figure 4-13). This location coincided with the 3’ RACE result as well as with the location of the IR in the IGRpyrG-tRNALeu19 region.
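The overlap between the inverted repeat and the terminator can be illustrated on the sequence itself. The sketch below takes the first 50 nt of the sequence row labelled 99500 in Figure 4-13 and checks that a 16 nt arm is the reverse complement of a second arm 10 nt downstream, followed by a T-rich tail. The arm boundaries used here are an assumption chosen for illustration; they are not the ARNold or einverted output.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


# First 50 nt of the row labelled 99500 in Figure 4-13 (top strand).
region = "AACACCTTCTTAAATCTCATTTACTAGATTTAAGAAGGTGTTTCTTTTTT"

left_arm = region[:16]     # assumed upstream arm of the stem
loop = region[16:26]       # assumed loop between the arms
right_arm = region[26:42]  # assumed downstream arm of the stem
tail = region[42:]         # sequence immediately after the stem

# The downstream arm pairs perfectly with the upstream arm.
print(reverse_complement(left_arm) == right_arm)
# The stem is followed by a run dominated by T residues.
print(tail, tail.count("T"), "of", len(tail))
```

A perfectly paired stem followed by a T-rich stretch is the arrangement expected of a rho-independent terminator, which is consistent with the prediction and the 3’ RACE result described above.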

4.4 Combined assays and bioinformatics for Biofilm, Transformation, Acid Tolerance Response, motifs, and MIC for ΔSurC and ΔMrrC

4.4.1 Biofilm Assay

Biofilm formation was compared for wild-type, ΔSurC, and ΔMrrC using glucose and sucrose at pH 5.0 and 7.0. The data were normalized to the quantity of biofilm formed by the wild-type using 1% sugar (Figure 4-20). The biofilm biomass of ΔSurC was significantly different at pH 5.0 for both glucose and sucrose (p < 0.05). The biofilm biomass of the ΔMrrC mutant was not significantly different from that of the wild-type at pH 7.0. The growth rate reported in Section 4.3.4 for ΔMrrC at pH 5.0 in MDM was significantly lower. This matches the observation that the biofilm biomass formed under this condition was significantly reduced as well (p < 0.05).


[Figure 4-20 bar chart: normalized biomass yield relative to UA159 (0.00 to 1.50) for UA159, ΔSurC, and ΔMrrC under four biofilm conditions (glucose pH 7, sucrose pH 7, glucose pH 5, sucrose pH 5); asterisks mark significant differences.]

Figure 4-20 Normalized biofilm yield for UA159, ΔSurC and ΔMrrC grown in MDM. Biofilm yield was compared for cultures grown in 5% CO2 in MDM at varying pH and supplemented with glucose (1%) or sucrose (1%). The values were normalized to the yield obtained by wild-type. Colors used: UA159, black; ΔSurC, grey; ΔMrrC, white. Data represent four independent experiments in MDM. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to wild-type UA159 for different treatments (* p < 0.05); values represent mean +/- SE.

4.4.2 Transformation Assay

The transformation frequency of wild-type S. mutans UA159 was compared with that of ΔSurC and ΔMrrC. Cell cultures underwent two treatments to model natural transformation (no added CSP) and induced transformation (exogenously added CSP). The deletion strains were transformed with pDL277 with and without CSP, and the results were compared with the wild-type. Viable cells that were transformed acquired spectinomycin resistance. The frequency was plotted as the percentage of spectinomycin-resistant transformants divided by the total number of recipient cells (Figure 4-21). When compared with UA159, the ΔSurC strain had significantly more transformants in the absence of CSP and significantly fewer transformants in the presence of CSP (ANOVA, p < 0.05), suggesting that ΔSurC was no longer CSP responsive for transformation. In the absence of CSP, the ΔMrrC strain was not significantly different from the wild-type; however, in the presence of CSP its transformation frequency was significantly lower than that of the wild-type (Figure 4-21, ANOVA, p < 0.05). Therefore, in the presence of CSP, both ΔSurC and ΔMrrC were significantly less competent than the wild-type strain.
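The transformation frequency defined above is a simple ratio expressed as a percentage. The sketch below assumes that transformants and recipients are both counted as colony-forming units per ml; the function name transformation_frequency and the counts are hypothetical and are not thesis data.

```python
def transformation_frequency(transformant_cfu: float, recipient_cfu: float) -> float:
    """Spectinomycin-resistant transformants as a percentage of recipient cells."""
    if recipient_cfu <= 0:
        raise ValueError("recipient count must be positive")
    return 100.0 * transformant_cfu / recipient_cfu


# Hypothetical plate counts in CFU/ml (illustration only).
without_csp = transformation_frequency(2.0e4, 1.0e8)
with_csp = transformation_frequency(5.0e5, 1.0e8)
print(f"{without_csp:.3f} %")
print(f"{with_csp:.3f} %")
```

Because the frequencies span several orders of magnitude, they are usually plotted on a logarithmic axis, as in Figure 4-21.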

[Figure 4-21 bar chart: transformation frequency (%) on a log scale (10^-4 to 10^1) for UA159, ΔSurC, and ΔMrrC under two treatments (- CSP, + CSP); asterisks mark significant differences.]

Figure 4-21 Transformation frequency of S. mutans wild-type and mutants. The transformation of S. mutans with pDL277 with and without added CSP. The frequency was expressed as the percentage of spectinomycin transformants divided by total number of recipient cells. Performed in triplicate with three independent experiments. Colors used: UA159, black; ΔSurC, grey; ΔMrrC, white. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to wild-type UA159 for different treatments (* p < 0.05); values represent mean +/- SD.

4.4.3 Acid Tolerance Response

An ATR assay on UA159, ΔSurC, and ΔMrrC examined the role of SurC and MrrC in acid survival. The analysis of MrrC expression demonstrated that MrrC was down-regulated by both decreased and increased pH. An expression analysis of SurC under various pH conditions was not performed. The survival kinetics of ΔSurC indicated that pH 5.5 adapted ΔSurC cells had a higher % survival than UA159 (Figure 4-22). The % survival difference for ΔSurC (adapted % - unadapted %) was significant after 2.0 and 3.0 hours (ANOVA, p < 0.05). The adapted % and unadapted % survival of ΔMrrC were not different from those of the wild-type. However, the % survival difference for ΔMrrC was significantly different from that of the wild-type at all time points (ANOVA, p < 0.05). Although the adapted and unadapted survival of ΔMrrC were not significantly different from UA159, it was apparent that the ΔMrrC mutation led to a defect in the ATR. This result agreed with the doubling time of ΔMrrC at pH 5.0, as well as with the influence of pH on the expression of MrrC. These data provide supportive evidence for a role for both SurC and MrrC in acid tolerance.
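The % survival difference used above is the adapted % survival minus the unadapted % survival at the same time point. A minimal sketch, assuming that % survival is the number of survivors divided by the number of cells present before the acid challenge; the function names and the counts are hypothetical and are not thesis data.

```python
def percent_survival(survivor_cfu: float, initial_cfu: float) -> float:
    """Survivors after acid challenge as a percentage of the starting count."""
    if initial_cfu <= 0:
        raise ValueError("initial count must be positive")
    return 100.0 * survivor_cfu / initial_cfu


def survival_difference(adapted_pct: float, unadapted_pct: float) -> float:
    """Adapted % survival minus unadapted % survival at one time point."""
    return adapted_pct - unadapted_pct


# Hypothetical CFU/ml counts for one strain at one time point (illustration only).
adapted = percent_survival(5.0e6, 1.0e8)
unadapted = percent_survival(2.0e5, 1.0e8)
print(round(survival_difference(adapted, unadapted), 2))
```

A larger difference indicates a stronger benefit from pre-adaptation at pH 5.5, which is the quantity compared between strains in Figure 4-22B.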

[Figure 4-22 plots: A) % survival (log scale) versus time (0 to 3 hours) for adapted (+) and unadapted (-) UA159, ΔSurC, and ΔMrrC; B) adapted % - unadapted % (log scale) at 1, 2, and 3 hours for UA159, ΔSurC, and ΔMrrC; asterisks mark significant differences.]

Figure 4-22 Acid Tolerance Response assay of S. mutans UA159, ΔSurC, and ΔMrrC. A) The survival kinetics of S. mutans UA159, ΔSurC and ΔMrrC upon exposure to a killing pH of 3.5. Both pH 5.5 adapted (+) and pH 3.5 unadapted (-) cultures were sampled at 1, 2 and 3 hours after exposure to a killing pH of 3.5. Symbols used: solid line, adapted; dashed line, unadapted. Colors: UA159, black; ΔSurC, red; ΔMrrC, blue. B) The percent survival difference for S. mutans UA159, ΔSurC and ΔMrrC upon exposure to a killing pH of 3.5. Performed in triplicate with four independent experiments. Colors: UA159, black; ΔSurC, grey; ΔMrrC, white. Data analyzed by an ANOVA using the Tukey post-hoc test and compared to wild-type UA159 at each time point (* p < 0.05); values represent mean +/- SD.


4.4.4 Transcription Factor Motifs

Intergenic regions often contain regulatory motifs identified through the analysis of transcriptional regulatory networks (15). The location of putative motifs may be determined bioinformatically (15) by aligning the transcription factor motif with the intergenic region (185). These results can later be confirmed by various biochemical and genetic methods. For this purpose, the Multiple EM for Motif Elicitation (MEME) webserver alignment tool was used (185).

The MEME webserver software was used to examine the intergenic regions containing the inverted repeats for transcription factor motifs of S. mutans UA159. The search focused on the regions containing IGRpyrG-tRNALeu19 and smu82-83, as these regions were the focus of this investigation. The 42 transcription factor motifs from manually curated regulons (15) of S. mutans UA159 were used to carry out the search.

The search identified five putative transcription factor binding sites between PyrG and FbaA and two between DnaK and DnaJ. The matching sites were for the catabolite control protein CcpA (smu.1591), the manganese transport regulator protein and MntR homolog SloR (smu.182), the multidrug transport repressor and MarR homolog RcrR (smu.921), the amino acid metabolism transcription factor CodY (smu.1824c), and the two-component signal response regulator ComE (smu.1917). The first transcription factor binding site matched the ccpA motif (Figure 4-23a) and, as seen in the sequence map for MrrC, was located 20 nt downstream of the PyrG stop codon (Figure 4-13). The site overlaps the 5’ start site of MrrCc. The 16 nt ccpA motif was aligned with the other IR regions of S. mutans UA159, along with aligned gene regions and homologs from other bacteria (data not shown). The second transcription factor, sloR, matched two binding sites (Figure 4-23b). The putative binding sites were located directly on the +1 start site of MrrCc and on the -35 promoter region of smu97-99 (Figure 4-13). The 22 nt sloR motif was aligned with the matching IR regions from S. mutans and other bacteria (data not shown). The third site was an rcrR binding site (Figure 4-23c). The matching region (Figure 4-13) was located within the smu97-99 +1 start site. Another match was found within the intergenic region of smu82-83, located 280 nt upstream of the putative -35 site of SurC (data not shown). The fourth motif matched the consensus binding sequence for codY (Figure 4-23d). The codY binding site was found within the MrrCa -35 promoter site (Figure 4-13). The final transcription factor binding site matched the DR site for comE. The DR binding site was located within the putative sRNA smu97-99 (Figure 4-13) and was also found on the transcript for SurC. The alignment of the DR motifs of the comE binding site sequence was reported previously (189) and was compared with the IR regions. The weblogos for the MEME alignments are shown in Figure 4-23.

[Figure 4-23 sequence logos: panels A to E, information content (bits, 0 to 2) at each motif position; generated with MEME (no SSC), 18.2.2014.]

Figure 4-23 MEME motifs of the putative transcription factor sites in proximity to IGRpyrG-tRNALeu19 and smu82-83. A) CcpA B) SloR C) RcrR D) CodY E) ComE


4.4.5 Minimum Inhibitory Concentration

The ability of ΔSurC and ΔMrrC to resist antibiotics was tested. The MICs of kanamycin, tetracycline, and ampicillin were determined for UA159, ΔSurC, and ΔMrrC. Kanamycin and tetracycline were selected because both target the 30S subunit of the ribosome and can therefore be used to identify strains with a translation defect. Ampicillin was selected to identify mutants capable of resisting the disruption of peptidoglycan cross-linking caused by this antibiotic. The ΔMrrC mutant was found to be over five-fold more sensitive to kanamycin than the wild-type (Table 4-3). No difference was found for ampicillin or tetracycline. The ΔSurC mutant was not significantly different from wild-type.

Table 4-3 MICs of antibiotics (µg/ml) for S. mutans UA159, ΔMrrC, and ΔSurC

Strain    kanamycin   tetracycline   ampicillin
UA159     80          2              0.125
ΔMrrC     2.5         2              0.0625
ΔSurC     80          4              0.03125
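The size of the kanamycin effect can be read directly from Table 4-3. The short sketch below computes the ratio of the wild-type MIC to the mutant MIC and, assuming that the MICs were determined with a two-fold dilution series (an assumption not stated in this section), the corresponding number of dilution steps. The dictionary layout, the plain-text strain labels, and the function name fold_sensitivity are illustrative only.

```python
import math

# MIC values (µg/ml) copied from Table 4-3.
mic_ug_per_ml = {
    "UA159": {"kanamycin": 80.0, "tetracycline": 2.0, "ampicillin": 0.125},
    "dMrrC": {"kanamycin": 2.5, "tetracycline": 2.0, "ampicillin": 0.0625},
    "dSurC": {"kanamycin": 80.0, "tetracycline": 4.0, "ampicillin": 0.03125},
}


def fold_sensitivity(wild_type_mic: float, mutant_mic: float) -> float:
    """How many times lower the mutant MIC is than the wild-type MIC."""
    return wild_type_mic / mutant_mic


fold = fold_sensitivity(
    mic_ug_per_ml["UA159"]["kanamycin"], mic_ug_per_ml["dMrrC"]["kanamycin"]
)
steps = math.log2(fold)  # number of two-fold dilution steps, under the stated assumption
print(fold, steps)
```

On these values the kanamycin MIC of ΔMrrC is 32 times lower than that of UA159, which corresponds to five two-fold dilution steps.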


5 Discussion

Examining the transcripts from the intergenic regions of S. mutans UA159 revealed the presence of previously unexplored regulatory regions. The regulatory role of S. mutans sRNAs is currently under investigation by several groups and has yielded several interesting insights into its virulence (171), metabolism (173), and lifestyle (170). The regulatory factors found in this study to alter the expression of the TARs demonstrate that the intergenic regions of S. mutans UA159 are transcribed. Northern analysis demonstrated that the expression of the two sRNAs, SurC and MrrC, responded to environmental cues such as stage of growth, temperature, CSP, pH, and the concentration of various metals.

The TAR smu82-83 is located between the genes for two heat-shock chaperone proteins, dnaK and dnaJ. This region is well conserved among bacteria (Figure 4-1). In the study by Silvaggi et al. (66), the region between dnaK and dnaJ was observed to produce a transcript in B. subtilis, and a link between the sporulation of B. subtilis and the expression of the sRNA SurC was identified. This developmental expression of SurC matched the observation in this study, in which the expression of the transcript increased when growth reached stationary phase. The expression of SurC in S. mutans was also up-regulated under oxidative stress (0.003% H2O2) and heat-shock (50°C) conditions; however, the expression did not appear to change under CSP stress (Figure 4-3). This relates to an earlier study of the heat-shock chaperones (160) of S. mutans in which the expression of dnaK was examined. The region including grpE, dnaK, and dnaJ (Figure 4-3 and Figure 5-1) was expressed as a 4.4 kb operon that was up-regulated when cells were stressed by an increase in temperature to 42°C for 10 minutes. If the region bounded by dnaK and dnaJ is highly expressed as part of an operon, then it is reasonable to hypothesize that the transcript for SurC would also be highly expressed when the cell is undergoing shock. The unique finding in this study was that SurC is expressed as an individual transcript during exponential growth and when exposed to various stressors. One explanation for the change in transcript expression is that during early growth SurC was transcribed as a part of the grpE operon (Figure 4-3 and Figure 5-1), whereas during stationary phase and during oxidative or heat stress it was transcribed both as an individual unit and as part of an operon. It should be noted that in B. subtilis (66) the transcription of SurC was highest during the development of B. subtilis and not under any stressors as also seen in S. mutans. SurC was up-regulated during stationary phase, oxidative stress, and heat-shock. The expression of the region was repressed by the presence of CSP, and the mutant generated by deleting the region had a transformation defect (Figure 4-21). This region also contained a ComE regulatory site, which suggested that SurC also plays an as yet undetermined role in competence. The region also contained a putative binding site for the transcription factor RcrR. This suggests that although the heat-shock genes in this operon are well conserved among bacteria, the stress-response expression in this region requires more investigation.

One intergenic region was examined in far greater detail than the other TARs. The transcription of the intergenic region between the genes pyrG and fbaA revealed that the region contained at least four individual transcripts. Three transcripts were identified as MrrCa, MrrCb, and MrrCc, with the fourth downstream transcript referred to as smu97-99.


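[Figure 5-1 diagram: the hrcA, grpE, dnaK, SurC, and dnaJ region with regulatory inputs labelled RcrR (and RcrPQ, marked "?"), temperature (T°C), H2O2, OD600 1.2, and CSP acting through ComD and ComE.]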

Figure 5-1 Regulatory map of the S. mutans UA159 intergenic region containing SurC based on experimental and MEME analysis. The SurC transcript is located in the grpE heat-shock operon. The growth dependent transcription, heat-shock and oxidative responses are shown for SurC. The putative regulation by the RcrR and ComE proteins based on MEME predictions are also shown. Red - putative binding sites; black – experimentally confirmed regulatory effects; rounded rectangle – putative transcriptional regulatory proteins; bent arrow – transcriptional start site; grey arrow – transcriptionally active region; blue arrow – gene on positive coding strand.

Northern analysis showed that MrrC expression was highest during exponential growth phase (OD 0.4) for S. mutans in rich medium. This was interesting because the literature reports that the optimal timing for transformation of S. mutans in rich medium occurs during the exponential growth phase (190, 191). This link between exponential growth and natural transformation coincides with the observation that the expression of MrrC is higher when cells undergo CSP stress (Figure 4-8). Although transformation efficiency was significantly reduced for ΔMrrC in the presence of CSP (Figure 4-21), the connection between the expression of MrrC and natural transformation (Figure 4-8) would require further investigation. The ΔMrrC transformation defect was not anticipated since the MrrC region was not one of the thirty intergenic regions identified as connecting the competence regulon with the comX promoter (169). Evidently, some of the intergenic regions connected to natural transformation are still waiting to be identified and characterized.

Acid resistance for S. mutans is a phenotype that has been closely examined (192). By using both rich and defined media, the influence of pH on MrrC expression and growth rate was measured. The expression of MrrC was similar in both rich and defined media (Figure 4-8 and Figure 4-9). In rich medium, a twenty minute shock at pH 3.5 reduced the expression of MrrC by over half (Figure 4-8). This was also observed in defined medium, where a pH 3.5 shock reduced the expression of MrrC to almost half that observed at pH 7.0 (Figure 4-9). Although alkaline-shock was not tested in rich media, it was performed in MDM, where the expression of MrrC was ten times lower than the transcript levels observed at pH 7.0 (Figure 4-9). It is clear that pH change affects the expression of MrrC. Experiments in rich and defined media both showed that an increase in acidity resulted in a statistically significant reduction in expression, while an increase in alkalinity resulted in a significant reduction (p < 0.001) in the expression of MrrC (Figure 4-8 and Figure 4-9). As for the growth rate in rich medium, when wild-type and ΔMrrC cultures were compared in pH 5.0 or 3.5 media, there was no apparent difference (data not shown). However, when comparing the growth rate of wild-type and ΔMrrC in MDM at pH 5.0, there was a significant reduction in the growth rate of ΔMrrC (p < 0.05) relative to pH 7.0 (Table 4-2). Neither wild-type nor ΔMrrC grew reliably in defined medium in acidic conditions (pH 3.5). Since reliable growth does not occur at such a low pH, growth rates could not be compared. The growth defect at pH 5.0 was only observed in defined medium. This result indicated that the presence of the MrrC transcript provided a protective advantage to S. mutans when it encountered unfavorable pH conditions under defined nutrient conditions.
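Growth rate and doubling time move in opposite directions, so a growth defect appears as a lower growth rate and a longer doubling time. The following is a minimal sketch of that relationship, assuming two OD600 readings taken during exponential growth. The function names and the readings are hypothetical illustrations and are not data from this study.

```python
import math

def specific_growth_rate(od_start, od_end, minutes):
    """Specific growth rate (per minute) between two exponential-phase OD600 readings."""
    return math.log(od_end / od_start) / minutes

def doubling_time(od_start, od_end, minutes):
    """Doubling time in minutes, t_d = ln(2) / growth rate."""
    return math.log(2) / specific_growth_rate(od_start, od_end, minutes)

# Hypothetical readings for illustration only (not measurements from this thesis).
fast = doubling_time(0.1, 0.4, 120)  # culture doubles twice in 120 min
slow = doubling_time(0.1, 0.2, 120)  # culture doubles once in 120 min

print(f"faster culture: {fast:.1f} min per doubling")
print(f"slower culture: {slow:.1f} min per doubling")
```

In this sketch the slower culture has the lower specific growth rate and the longer doubling time.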

The 5’ proximal gene adjacent to MrrC is pyrG. The product of pyrG is cytidine triphosphate synthetase and, as mentioned previously, its role in pyrimidine metabolism is to convert UTP to CTP. The expression of MrrC in the presence of pyrimidines was examined. It was found that adding either cytidine or orotic acid to the medium repressed the expression of MrrC (Figure 4-10). Although the addition of pyrimidines repressed MrrC, the result did not directly link the expression of MrrC to pyrG. Ideally, a direct comparison with the work done in B. subtilis would require the examination of pyrimidine metabolism in a pyrB deletion mutant. This may prove to be problematic, as there are two corresponding genes in S. mutans UA159, pyrB (smu.858) and pyrAB (smu.860). It is unknown whether a deletion between pyrB and pyrAB in S. mutans would produce the same result as in B. subtilis, as this has not yet been done. Another complicating issue was the presence of smu.869 located between pyrB and pyrAB. All three genes may need to be removed for a meaningful comparison with the B. subtilis study (193).

Perhaps the most significant phenotype observed was that the expression of MrrC was reduced in the presence of metal, indicating its sensitivity to metal stress. Although trace amounts of metal ions are required by bacteria as enzyme cofactors, excessive metal concentrations are lethal to bacterial cells (194). Bacteria react to sublethal doses of metal by engaging metal transporters that actively export excess metals, returning the cell to more favorable conditions. The most interesting finding of this report was the response of the MrrC region of S. mutans to metal stress as seen in Figure 4-10. The strong repression of the MrrC transcripts by metals was novel. Examination of the repression revealed that the expression of the three transcripts (MrrCabc) was sensitive to copper, zinc and sodium shock in MDM. The exposure to the metal shock lasted twenty minutes, which is also informative as it revealed that transcription of the RNA was tightly controlled. The Northern analysis (Figure 4-10a) showed that exposure to copper (250 µM), zinc (100 µM), and sodium (400 mM) all strongly down-regulated MrrC, since the transcript was not evident. Therefore, this region was under metalloregulation for S. mutans in both rich and defined media.

Polar effects of the mutations were examined. In the ΔSurC mutant, the expression of dnaK was not significantly different. The dnaJ gene was significantly upregulated, which was likely a result of the insertion of the erythromycin gene to replace SurC. Since SurC was a part of the dnaK operon, replacement of SurC by the erythromycin cassette in the 5’-UTR of dnaJ would likely produce the observed upregulation. In the ΔMrrC deletion mutant, there were no significant expression differences for pyrG and fbaA. It would be difficult to ascertain if there are polar effects from the intergenic deletion on tRNALeu19. The tRNALeu19 is upstream of the MrrC deletion and there are five copies of tRNALeu on the chromosome (briefly described below). Examining the expression of tRNALeu would likely not be informative since the expression of the other four tRNALeus would interfere with such an analysis. The regulation of tRNALeu19 may be unique, but it would be difficult to determine if the deletion affects tRNALeu19 expression.

The 3’ RACE determined that the end of transcription for the MrrCa transcript is located adjacent to an inverted repeat (Figure 5-2). This coincides with the rho-independent terminator predicted by ARNold (11). It may also represent a rho-independent terminator for pyrG, as it is located 80 nt away from the pyrG stop codon, which would make it a bidirectional terminator. Inverted repeats and palindromes are also often signaling regions for transcription factors, the identity of which would need to be determined; however, it is known that SloR is a metal sensitive transcriptional factor activated by inverted repeats, making it a strong candidate to be a regulator of this region (163). Although it was beyond the scope of this study, it would also be interesting to investigate the interaction of an inactive SloR mutant with MrrC. The effect of the removal of the inverted repeat accompanied by the loss of MrrCa was also examined in this study. The deletion in the ΔMrrC strain removed the promoter of MrrCa as well as the promoter for smu97-99 (Figure 5-2).
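An inverted repeat of the kind discussed above is a stretch of DNA followed, after a short spacer, by its own reverse complement. The following is a minimal sketch of how such a motif could be located in a sequence. The function names, the stem and loop lengths, and the example sequence are hypothetical illustrations; the sequence is not taken from the MrrC region, and this is not the algorithm used by ARNold.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))

def find_inverted_repeats(seq, stem=6, min_loop=3, max_loop=8):
    """Find perfect inverted repeats: a stem, a spacer, then the stem's reverse complement.

    Returns a list of (left_start, right_start, left_stem, loop_length) tuples
    using 0-based positions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break  # right arm would run past the end of the sequence
            if seq[j:j + stem] == target:
                hits.append((i, j, left, loop))
    return hits

# Hypothetical example sequence (not from the S. mutans genome).
example = "CAGCCCGCTTTTGCGGGCCA"
hits = find_inverted_repeats(example)
for left_start, right_start, left_stem, loop in hits:
    print(left_start, right_start, left_stem, loop)
```

A perfect-match search such as this one is the simplest case; terminator prediction tools also score stem stability and the downstream T-rich tract, which this sketch does not attempt.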

[Figure 5-2 diagram. Labels: Mn2+, SloA, SloR, RcrPQ (?), RcrR, Leu (?), CcpA, CodY, Cu2+, Zn2+, T°C, CSP, ComD, ComE, pyrG, inverted repeat (< >), MrrCa, MrrCb, MrrCc, smu97-99, smu_t19]

Figure 5-2 Regulatory map of the S. mutans UA159 intergenic region containing MrrC and smu97-99 based on experimental and MEME analysis. The four transcriptionally active regions of the intergenic region bound by pyrG and tRNALeu19 (smu_t19) are shown. The metal and heat-shock regulation are shown for MrrC. The putative regulation by transcription factors involved with the regulation of MrrC based on MEME predictions are also shown. Red - putative binding sites; black – experimentally confirmed regulatory effects; rounded rectangle – putative transcriptional regulatory proteins; Leu – leucine; bent arrow – transcriptional start site; grey arrow – transcriptionally active regions; blue arrow – gene on positive coding strand; red arrow – tRNALeu19; and black rectangle < > – inverted repeat.


Examination of the palindromes led to the realization that transcription factors may interact with the intergenic regions containing the palindromes. Regulator profiles were therefore examined bioinformatically to identify putative transcription factor binding sites, and several putative binding regions were identified (185). These sites may bind the transcription factors for catabolite repression (CcpA), multidrug resistance (RcrR), iron homeostasis and oxidative stress response (SloR), BCAA accumulation and nutritional stress (CodY), as well as the response regulator of the two-component system for competence (ComE). The annotated transcription factor binding sites match regions within the MrrC area (Figure 5-2).

The protein CcpA regulates metabolism by repressing genes when the cell is in the presence of simple carbon sources such as glucose. This process is referred to as catabolite repression; it is prevalent in Gram-positive bacteria and has been closely examined in S. mutans UA159 (195). The promoters of the regulated genes contain a regulatory target site commonly referred to as a catabolite repression element (CRE) (196), or CRE box. A putative CRE box is located approximately 20 nt downstream of the pyrG stop codon (Figure 5-2).

In the model of gene activation and repression for B. subtilis, the activity of CcpA is directed by the location of the CRE box (197). When a CRE site is located in proximity to the -35 site, CcpA may act upon the gene as an activator. When the CRE site is located closer to the -10 site, CcpA acts as a repressor of gene expression. The model would suggest that the expression of MrrCc would be repressed by the activity of CcpA.

In Gram-positive bacteria, the CodY protein is a global transcription regulator of early stationary phase as well as virulence (198). This transcriptional repressor regulates metabolism genes required for the cell to adapt to limited nutrients and is regulated by the BCAAs. In S. mutans UA159, the RelA, RelP, and RelQ proteins were required for the production of the alarmone (p)ppGpp. In a triple gene deletion mutant, namely ΔrelAPQ, the cells would not grow in minimal medium unless valine and leucine were added (162). The intergenic region between pyrG and fbaA contains a putative CodY-binding box (Figure 5-2). This is an interesting location for a BCAA regulator as it is approximately 280 nt upstream of the tRNALeu19. S. mutans UA159 has seven tRNALeus, two of which are located outside of the five rRNA operons, smu_t19 and smu_t35. There are no tRNAs for valine or isoleucine located outside of the rRNA operons. In S. mutans UA159, there is a direct link between CcpA, CodY and acid adaptation as well as the interaction with the aminotransferase ilvE, and it is hypothesized that leucine may act as an effective nutritional indicator since the synthesis of the amino acid requires essential precursors including pyruvate (199). The location of a putative CodY-binding box and a CRE site within the promoters for the MrrC transcripts suggests that BCAA biosynthesis may in some way regulate the function of MrrC. The location of the putative binding sites for the BCAA transcription factor, CodY, adjacent to MrrC raises the question: if the MrrC region contains CodY binding sites, will the concentration of leucine regulate the expression of MrrC?

Examination of the tRNALeus located within the rRNA operons identified three codons for leucine: CTA (smu_t05, smu_t46, smu_t54), TTA (smu_t08, smu_t43), and TTG (smu_t33). The two external tRNALeus used two different codons for leucine: CTT (smu_t19) and CTG (smu_t35). The fact that the codon for tRNALeu19 is unique and that the gene is adjacent to MrrC suggests that this tRNA may possess regulatory functions that are unique to this complex regulatory region.
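The codon assignments listed above can be tabulated to see which leucine codons are served by a single listed tRNA gene. The following is a minimal sketch that uses only the gene identifiers and codons given in this paragraph; the variable and function names are hypothetical.

```python
from collections import defaultdict

# (gene, codon, location) triples transcribed from the paragraph above.
TRNA_LEU = [
    ("smu_t05", "CTA", "rRNA operon"),
    ("smu_t46", "CTA", "rRNA operon"),
    ("smu_t54", "CTA", "rRNA operon"),
    ("smu_t08", "TTA", "rRNA operon"),
    ("smu_t43", "TTA", "rRNA operon"),
    ("smu_t33", "TTG", "rRNA operon"),
    ("smu_t19", "CTT", "external"),
    ("smu_t35", "CTG", "external"),
]

def group_by_codon(entries):
    """Map each codon to the list of tRNA genes assigned to it."""
    grouped = defaultdict(list)
    for gene, codon, _location in entries:
        grouped[codon].append(gene)
    return dict(grouped)

genes_by_codon = group_by_codon(TRNA_LEU)
single_gene_codons = sorted(c for c, genes in genes_by_codon.items() if len(genes) == 1)

for codon in sorted(genes_by_codon):
    print(codon, ", ".join(genes_by_codon[codon]))
print("codons served by a single listed gene:", ", ".join(single_gene_codons))
```

In this tabulation, CTT is assigned only to smu_t19, the tRNALeu adjacent to MrrC.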

The rel competence-related (Rcr) operon in S. mutans UA159 encodes three proteins. The two ABC exporters RcrP and RcrQ are annotated as multidrug transport proteins and the RcrR protein is a MarR family homolog (33). The expression of the three genes is up-regulated in response to mupirocin, which indicated that the response depends on (p)ppGpp, linking the operon to the stringent response to amino acid shortage. Bacterial MarR regulators dimerize and adopt a winged helix-turn-helix DNA-binding fold (200). Repression occurs when MarR binds palindromic sequence motifs in the cognate promoters of its target genes. This is observed in E. coli, where the MarR protein binds two adjacent palindrome sites located between two genes that are transcribed in opposing directions (201). The end result is the repression of both of the adjacent genes. The MarR regulator is also known to activate a target gene by out-competing a repressor protein (202). In S. mutans UA159, RcrR regulates the rcrRPQ operon. It is also a link between the metabolism of (p)ppGpp, genetic competence, as well as acid and oxidative stress (33). A putative RcrR box is located immediately after the -10 site for smu97-99 and overlaps the +1 start site of smu97-99 (Figure 5-2). The presence of the putative RcrR-binding box within the promoter of smu97-99 suggests that the region is linked to RelA and the metabolism of (p)ppGpp and is involved with the competence of S. mutans. Although this hypothesis is supported by the observation that the ΔMrrC strain has a transformation defect (Figure 4-21), the biological evidence for the phenotype was not strong.

Although the analysis of the transcription factors and the putative binding boxes is based on data mining, the analysis presented here offers several hypotheses that can be tested experimentally. As seen in Figure 5-2, the sequences of MrrC and smu97-99 have multiple putative regulatory binding sites. The expression of MrrC and smu97-99 is also regulated by the concentration of CSP, oxidative stress, and heat-shock. The most interesting observation was that pH and metals regulate MrrC.


6 Future Directions

Significant progress has been made towards identifying unannotated sRNAs in this thesis. The expression of over ten TARs was monitored during various stages of growth. The characterization of two sRNAs (SurC, MrrC) has linked several external stressors with their expression in S. mutans UA159.

There have been several novel findings. First, the intergenic region between pyrG and fbaA is a complex region that contains a minimum of four transcribed RNAs. The polymerase or transcription factor responsible for the expression of MrrC still needs to be determined. Second, the expression of SurC and MrrC responds to extracellular stressors such as external peptides (CSP), oxidative stress, and temperature changes. Third, the phenotypes of the ΔSurC and ΔMrrC mutants indicate a role in acid-tolerance and biofilm formation. In addition, MrrC expression was shown to respond to acidic and alkaline conditions. Finally, perhaps the most novel finding was that the expression of MrrC was sensitive to micromolar concentrations of metal. This suggests that S. mutans is capable of detecting and immediately responding to alterations in the amount of trace metal available in the surrounding media. The metal regulation of intergenic sRNAs has not been extensively characterized and would likely provide valuable regulatory information for other bacteria. This report raises the question: if copper, zinc, and manganese are involved in the metalloregulation of the predicted sRNAs that contain an inverted repeat, what other metals regulate these sRNAs? It is very likely that there are a large number of metal regulated sRNAs that would require further characterization. This investigation represents the beginning of that effort.

Although the TARs have now been linked with external stressors, the substrates that are associated with regulated sites require further elucidation. For example, the regulation of MrrC may involve five different transcription factors as indicated in the present study. Also, the concentration of peptides or nucleotides may be involved in the regulation of the TARs. Testing the level of regulation for the five regions is currently underway in an attempt to identify the regulatory pathways involved with MrrC.

The finding of metal regulation within S. mutans may provide regulatory targets for oral health care products that couple fluoride with metal. For example, Crest® Pro-Health toothpaste uses stannous fluoride in its formulation. Recent work (203) suggests that other oral health care companies are exploring the use of zinc as an additive in their toothpaste formulations as well. Dental amalgams are also known to leach metals. It is not yet known exactly what regulatory role the metals are performing in the oral cavity with bacteria, but it is very clear that S. mutans has a sensitive mechanism to detect their presence. How the ability to sense and respond to metals in the oral cavity affects the virulence of S. mutans and its cariogenic ability remains to be investigated.


7 References

1. Land M, Hauser L, Jun S-R, Nookaew I, Leuze MR, Ahn T-H, Karpinets T, Lund O, Kora G, Wassenaar T, Poudel S, Ussery DW. 2015. Insights from 20 years of bacterial genome sequencing. Funct Integr Genomics 15:141–161.
2. NCBI Resource Coordinators. 2015. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 43:D6–17.
3. Reddy TBK, Thomas AD, Stamatis D, Bertsch J, Isbandi M, Jansson J, Mallajosyula J, Pagani I, Lobos EA, Kyrpides NC. 2015. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification. Nucleic Acids Res 43:D1099–106.
4. Kneale GG, Kennard O. 1984. The EMBL nucleotide sequence data library. Biochem Soc Trans 12:1011–1014.
5. Wheeler DL, Chappey C, Lash AE, Leipe DD, Madden TL, Schuler GD, Tatusova TA, Rapp BA. 2000. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 28:10–14.
6. Hemm MR, Paul BJ, Miranda-Ríos J, Zhang A, Soltanzad N, Storz G. 2010. Small stress response proteins in Escherichia coli: proteins missed by classical proteomic studies. J Bacteriol 192:46–58.
7. Backofen R, Hess WR. 2010. Computational prediction of sRNAs and their targets in bacteria. RNA Biol 7:33–42.
8. Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EG, Margalit H, Altuvia S. 2001. Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol 11:941–950.
9. Vogel J, Sharma CM. 2005. How to find small non-coding RNAs in bacteria. Biol Chem 386:1219–1238.
10. de Jong A, Pietersma H, Cordes M, Kuipers OP, Kok J. 2012. PePPER: a webserver for prediction of prokaryote promoter elements and regulons. BMC Genomics 13:299.
11. Naville M, Ghuillot-Gaudeffroy A, Marchais A, Gautheret D. 2011. ARNold: a web tool for the prediction of Rho-independent transcription terminators. RNA Biol 8:11–13.
12. Münch R, Hiller K, Grote A, Scheer M, Klein J, Schobert M, Jahn D. 2005. Virtual Footprint and PRODORIC: an integrative framework for regulon prediction in prokaryotes. Bioinformatics 21:4187–4189.
13. Solovyev V, Salamov A. 2011. Automatic annotation of microbial genomes and metagenomic sequences, pp. 61–78. In Metagenomics and its applications in agriculture, biomedicine and environmental studies, 1st ed. Nova Science Publishers, Inc.
14. Zomer AL, Buist G, Larsen R, Kok J, Kuipers OP. 2007. Time-resolved determination of the CcpA regulon of Lactococcus lactis subsp. cremoris MG1363. J Bacteriol 189:1366–1381.
15. Novichkov PS, Laikova ON, Novichkova ES, Gelfand MS, Arkin AP, Dubchak I, Rodionov DA. 2010. RegPrecise: a database of curated genomic inferences of transcriptional regulatory interactions in prokaryotes. Nucleic Acids Res 38:D111–8.

16. Delcher AL, Bratke KA, Powers EC, Salzberg SL. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673–679.
17. Sharma CM, Vogel J. 2009. Experimental approaches for the discovery and characterization of regulatory small RNA. Curr Opin Microbiol 12:536–546.
18. Hemm MR, Paul BJ, Schneider TD, Storz G, Rudd KE. 2008. Small membrane proteins found by comparative genomics and ribosome binding site models. Mol Microbiol 70:1487–1501.
19. Curreem SOT, Watt RM, Lau SKP, Woo PCY. 2012. Two-dimensional gel electrophoresis in bacterial proteomics. Protein Cell 3:346–363.
20. Ishihama Y, Schmidt T, Rappsilber J, Mann M, Hartl FU, Kerner MJ, Frishman D. 2008. Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics 9:102.
21. Mashburn-Warren L, Morrison DA, Federle MJ. 2010. A novel double-tryptophan peptide pheromone controls competence in Streptococcus spp. via an Rgg regulator. Mol Microbiol 78:589–606.
22. Fontaine L, Boutry C, de Frahan MH, Delplace B, Fremaux C, Horvath P, Boyaval P, Hols P. 2010. A novel pheromone quorum-sensing system controls the development of natural competence in Streptococcus thermophilus and Streptococcus salivarius. J Bacteriol 192:1444–1454.
23. Wenderska IB, Lukenda N, Cordova M, Magarvey N, Cvitkovitch DG, Senadheera DB. 2012. A novel function for the competence inducing peptide, XIP, as a cell death effector of Streptococcus mutans. FEMS Microbiol Lett 336:104–112.
24. Lamb RA, Lai CJ, Choppin PW. 1981. Sequences of mRNAs derived from genome RNA segment 7 of influenza virus: colinear and interrupted mRNAs code for overlapping proteins. Proc Natl Acad Sci 78:4170–4174.
25. Barrell BG, Air GM, Hutchison CA. 1976. Overlapping genes in bacteriophage phiX174. Nature 264:34–41.
26. Johnson ZI, Chisholm SW. 2004. Properties of overlapping genes are conserved across microbial genomes. Genome Res 14:2268–2272.
27. Berkhout B, de Smit MH, Spanjaard RA, Blom T, van Duin J. 1985. The amino terminal half of the MS2-coded lysis protein is dispensable for function: implications for our understanding of coding region overlaps. EMBO J 4:3315–3320.
28. Ellis JC, Brown JW. 2003. Genes within genes within bacteria. Trends Biochem Sci 28:521–523.
29. Kaspar J, Ahn S-J, Palmer SR, Choi SC, Stanhope MJ, Burne RA. 2015. A unique open reading frame within the comX gene of Streptococcus mutans regulates genetic competence and oxidative stress tolerance. Mol Microbiol 96:463–482.
30. Feltens R, Gossringer M, Willkomm DK, Urlaub H, Hartmann RK. 2003. An unusual mechanism of bacterial gene expression revealed for the RNase P protein of Thermus strains. Proc Natl Acad Sci 100:5724–5729.
31. Lemos JAC, Lin VK, Nascimento MM, Abranches J, Burne RA. 2007. Three gene products govern (p)ppGpp production by Streptococcus mutans. Mol Microbiol 65:1568–1581.

32. Lemos JAC, Brown TA, Burne RA. 2004. Effects of RelA on key virulence properties of planktonic and biofilm populations of Streptococcus mutans. Infect Immun 72:1431–1440.
33. Seaton K, Ahn S-J, Sagstetter AM, Burne RA. 2011. A transcriptional regulator and ABC transporters link stress tolerance, (p)ppGpp, and genetic competence in Streptococcus mutans. J Bacteriol 193:862–874.
34. Thomason MK, Storz G. 2010. Bacterial antisense RNAs: how many are there, and what are they doing? Annu Rev Genet 44:167–188.
35. Georg J, Hess WR. 2011. cis-antisense RNA, another level of gene regulation in bacteria. Microbiol Mol Biol Rev 75:286–300.
36. Sharma CM, Hoffmann S, Darfeuille F, Reignier J, Findeiss S, Sittka A, Chabas S, Reiche K, Hackermüller J, Reinhardt R, Stadler PF, Vogel J. 2010. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature 464:250–255.
37. Beaume M, Hernandez D, Farinelli L, Deluen C, Linder P, Gaspin C, Romby P, Schrenzel J, François P. 2010. Cartography of methicillin-resistant S. aureus transcripts: detection, orientation and temporal expression during growth phase and stress conditions. PLoS ONE 5:e10725.
38. Waters LS, Storz G. 2009. Regulatory RNAs in Bacteria. Cell 136:615–628.
39. Opdyke JA, Kang J-G, Storz G. 2004. GadY, a small-RNA regulator of acid response genes in Escherichia coli. J Bacteriol 186:6698–6705.
40. Chen Q, Crosa JH. 1996. Antisense RNA, fur, iron, and the regulation of iron transport genes in Vibrio anguillarum. J Biol Chem 271:18885–18891.
41. Callen BP, Shearwin KE, Egan JB. 2004. Transcriptional interference between convergent promoters caused by elongation over the promoter. Mol Cell 14:647–656.
42. Marcus M, Halpern YS. 1967. Genetic analysis of glutamate transport and glutamate decarboxylase in Escherichia coli. J Bacteriol 93:1409–1415.
43. Tramonti A, De Canio M, De Biase D. 2008. GadX/GadW-dependent regulation of the Escherichia coli acid fitness island: transcriptional control at the gadY-gadW divergent promoters and identification of four novel 42 bp GadX/GadW-specific binding sites. Mol Microbiol 70:965–982.
44. Crosa JH. 1980. A plasmid associated with virulence in the marine fish pathogen Vibrio anguillarum specifies an iron-sequestering system. Nature 284:566–568.
45. Crosa JH, Walsh CT. 2002. Genetics and assembly line enzymology of siderophore biosynthesis in bacteria. Microbiol Mol Biol Rev 66:223–249.
46. McIntosh-Tolle D, Stork M, Alice A, Crosa JH. 2012. Secondary structure of antisense RNAβ, an internal transcriptional terminator of the plasmid-encoded iron transport-biosynthesis operon of Vibrio anguillarum. Biometals 25:577–586.
47. Sesto N, Wurtzel O, Archambaud C, Sorek R, Cossart P. 2013. The excludon: a new concept in bacterial antisense RNA-mediated gene regulation. Nat Rev Microbiol 11:75–82.
48. Breaker RR. 2008. Complex riboswitches. Science 319:1795–1797.
49. Nahvi A, Sudarsan N, Ebert MS, Zou X, Brown KL, Breaker RR. 2002. Genetic control by a metabolite binding mRNA. Chem Biol 9:1043.

50. Winkler WC, Nahvi A, Sudarsan N, Barrick JE, Breaker RR. 2003. An mRNA structure that controls gene expression by binding S-adenosylmethionine. Nat Struct Biol 10:701–707.
51. Winkler WC, Cohen-Chalamish S, Breaker RR. 2002. An mRNA structure that controls gene expression by binding FMN. Proc Natl Acad Sci 99:15908–15913.
52. Kim JN, Breaker RR. 2008. Purine sensing by riboswitches. Biol Cell 100:1–11.
53. Ramesh A, Wakeman CA, Winkler WC. 2011. Insights into metalloregulation by M-box riboswitch RNAs via structural analysis of manganese-bound complexes. J Mol Biol 407:556–570.
54. Nahvi A, Barrick JE, Breaker RR. 2004. Coenzyme B12 riboswitches are widespread genetic control elements in prokaryotes. Nucleic Acids Res 32:143–150.
55. Dann CE, Wakeman CA, Sieling CL, Baker SC, Irnov I, Winkler WC. 2007. Structure and mechanism of a metal-sensing regulatory RNA. Cell 130:878–892.
56. Wakeman CA, Ramesh A, Winkler WC. 2009. Multiple metal-binding cores are required for metalloregulation by M-box riboswitch RNAs. J Mol Biol 392:723–735.
57. Cromie MJ, Shi Y, Latifi T, Groisman EA. 2006. An RNA sensor for intracellular Mg(2+). Cell 125:71–84.
58. Mandal M, Boese B, Barrick JE, Winkler WC, Breaker RR. 2003. Riboswitches control fundamental biochemical pathways in Bacillus subtilis and other bacteria. Cell 113:577–586.
59. Vogel J, Wagner EGH. 2007. Target identification of small noncoding RNAs in bacteria. Curr Opin Microbiol 10:262–270.
60. Gottesman S. 2005. Micros for microbes: non-coding regulatory RNAs in bacteria. Trends Genet 21:399–404.
61. Doolittle WF. 2013. Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci 110:5294–5300.
62. Gil R, Latorre A. 2012. Factors behind junk DNA in bacteria. Genes (Basel) 3:634–650.
63. Irnov I, Sharma CM, Vogel J, Winkler WC. 2010. Identification of regulatory RNAs in Bacillus subtilis. Nucleic Acids Res 38:6637–6651.
64. Massé E, Gottesman S. 2002. A small RNA regulates the expression of genes involved in iron metabolism in Escherichia coli. Proc Natl Acad Sci 99:4620–4625.
65. Deltcheva E, Chylinski K, Sharma CM, Gonzales K, Chao Y, Pirzada ZA, Eckert MR, Vogel J, Charpentier E. 2011. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471:602–607.
66. Silvaggi JM, Perkins JB, Losick R. 2006. Genes for small, noncoding RNAs under sporulation control in Bacillus subtilis. J Bacteriol 188:532–541.
67. Sittka A, Lucchini S, Papenfort K, Sharma CM, Rolle K, Binnewies TT, Hinton JCD, Vogel J. 2008. Deep sequencing analysis of small noncoding RNA and mRNA targets of the global post-transcriptional regulator, Hfq. PLoS Genet 4:e1000163.
68. Massé E, Arguin M. 2005. Ironing out the problem: new mechanisms of iron homeostasis. Trends Biochem Sci 30:462–468.


69. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315:1709–1712.
70. Serbanescu MA, Cordova M, Krastel K, Flick R, Beloglazova N, Latos AJ, Yakunin AF, Senadheera DB, Cvitkovitch DG. 2015. Role of the Streptococcus mutans CRISPR-Cas systems in immunity and cell physiology. J Bacteriol 197:749–761.
71. Van der Ploeg J. 2009. Analysis of CRISPR in Streptococcus mutans suggests frequent occurrence of acquired immunity against infection by M102-like bacteriophages. Microbiology 155:1966–1976.
72. Pichon C, Felden B. 2007. Proteins that interact with bacterial small RNA regulators. FEMS Microbiol Rev 31:614–625.
73. Fröhlich KS, Vogel J. 2009. Activation of gene expression by small RNA. Curr Opin Microbiol 12:674–682.
74. Storz G, Opdyke JA, Zhang A. 2004. Controlling mRNA stability and translation with small, noncoding RNAs. Curr Opin Microbiol 7:140–144.
75. Mandin P, Guillier M. 2013. Expanding control in bacteria: interplay between small RNAs and transcriptional regulators to control gene expression. Curr Opin Microbiol 16:125–132.
76. Chao Y, Vogel J. 2010. The role of Hfq in bacterial pathogens. Curr Opin Microbiol 13:24–33.
77. Vogel J, Luisi BF. 2011. Hfq and its constellation of RNA. Nat Rev Microbiol 9:578–589.
78. Updegrove TB, Wartell RM. 2011. The influence of Escherichia coli Hfq mutations on RNA binding and sRNA•mRNA duplex formation in rpoS riboregulation. Biochim Biophys Acta 1809:532–540.
79. Geissmann TA, Touati D. 2004. Hfq, a new chaperoning role: binding to messenger RNA determines access for small RNA regulator. EMBO J 23:396–405.
80. Du H, Wang M, Luo Z, Ni B, Wang F, Meng Y, Xu S, Huang X. 2011. Coregulation of gene expression by sigma factors RpoE and RpoS in Salmonella enterica serovar Typhi during hyperosmotic stress. Curr Microbiol 62:1483–1489.
81. McCullen CA, Benhammou JN, Majdalani N, Gottesman S. 2010. Mechanism of positive regulation by DsrA and RprA small noncoding RNAs: pairing increases translation and protects rpoS mRNA from degradation. J Bacteriol 192:5559–5571.
82. Mika F, Hengge R. 2014. Small RNAs in the control of RpoS, CsgD, and biofilm architecture of Escherichia coli. RNA Biol 11:494–507.
83. Mika F, Busse S, Possling A, Berkholz J, Tschowri N, Sommerfeldt N, Pruteanu M, Hengge R. 2012. Targeting of csgD by the small regulatory RNA RprA links stationary phase, biofilm formation and cell envelope stress in Escherichia coli. Mol Microbiol 84:51–65.
84. Boehm A, Vogel J. 2012. The csgD mRNA as a hub for signal integration via multiple small RNAs. Mol Microbiol 84:1–5.
85. Holmqvist E, Unoson C, Reimegård J, Wagner EGH. 2012. A mixed double negative feedback loop between the sRNA MicF and the global regulator Lrp. Mol Microbiol 84:414–427.
86. Zhang A, Wassarman KM, Rosenow C, Tjaden BC, Storz G, Gottesman S. 2003. Global analysis of small RNA and mRNA targets of Hfq. Mol Microbiol 50:1111–1124.
87. Henderson CA, Vincent HA, Casamento A, Stone CM, Phillips JO, Cary PD, Sobott F, Gowers DM, Taylor JE, Callaghan AJ. 2013. Hfq binding changes the structure of Escherichia coli small noncoding RNAs OxyS and RprA, which are involved in the riboregulation of rpoS. RNA 19:1089–1104.
88. Bandyra KJ, Said N, Pfeiffer V, Górna MW, Vogel J, Luisi BF. 2012. The seed region of a small RNA drives the controlled destruction of the target mRNA by the endoribonuclease RNase E. Mol Cell 47:943–953.
89. Bilusic I, Popitsch N, Rescheneder P, Schroeder R, Lybecker M. 2014. Revisiting the coding potential of the E. coli genome through Hfq co-immunoprecipitation. RNA Biol 11:641–654.
90. Christiansen JK, Nielsen JS, Ebersbach T, Valentin-Hansen P, Søgaard-Andersen L, Kallipolitis BH. 2006. Identification of small Hfq-binding RNAs in Listeria monocytogenes. RNA 12:1383–1396.
91. Majdalani N, Chen S, Murrow J, St John K, Gottesman S. 2001. Regulation of RpoS by a novel small RNA: the characterization of RprA. Mol Microbiol 39:1382–1394.
92. Jin Y, Watt RM, Danchin A, Huang J-D. 2009. Small noncoding RNA GcvB is a novel regulator of acid resistance in Escherichia coli. BMC Genomics 10:165.
93. Sharma CM, Papenfort K, Pernitzsch SR, Mollenkopf H-J, Hinton JCD, Vogel J. 2011. Pervasive post-transcriptional control of genes involved in amino acid metabolism by the Hfq-dependent GcvB small RNA. Mol Microbiol 81:1144–1165.
94. Papenfort K, Vogel J. 2009. Multiple target regulation by small noncoding RNAs rewires gene expression at the post-transcriptional level. Res Microbiol 160:278–287.
95. Barnhart MM, Chapman MR. 2006. Curli biogenesis and function. Annu Rev Microbiol 60:131–147.
96. Guillier M, Gottesman S. 2006. Remodelling of the Escherichia coli outer membrane by two small regulatory RNAs. Mol Microbiol 59:231–247.
97. Jørgensen MG, Nielsen JS, Boysen A, Franch T, Møller-Jensen J, Valentin-Hansen P. 2012. Small regulatory RNAs control the multi-cellular adhesive lifestyle of Escherichia coli. Mol Microbiol 84:36–50.
98. Rowley G, Spector M, Kormanec J, Roberts M. 2006. Pushing the envelope: extracytoplasmic stress responses in bacterial pathogens. Nat Rev Microbiol 4:383–394.
99. Egler M, Grosse C, Grass G, Nies DH. 2005. Role of the extracytoplasmic function protein family sigma factor RpoE in metal resistance of Escherichia coli. J Bacteriol 187:2297–2307.
100. Hild E, Takayama K, Olsson RM, Kjelleberg S. 2000. Evidence for a role of rpoE in stressed and unstressed cells of marine Vibrio angustum strain S14. J Bacteriol 182:6964–6974.
101. Rouvière PE, Las Peñas De A, Mecsas J, Lu CZ, Rudd KE, Gross CA. 1995. rpoE, the gene encoding the second heat-shock sigma factor, sigma E, in Escherichia coli. EMBO J 14:1032–1042.

102. Thompson KM, Rhodius VA, Gottesman S. 2007. SigmaE regulates and is regulated by a small RNA in Escherichia coli. J Bacteriol 189:4243–4256.
103. Udekwu KI, Wagner EGH. 2007. Sigma E controls biogenesis of the antisense RNA MicA. Nucleic Acids Res 35:1279–1288.
104. Papenfort K, Pfeiffer V, Mika F, Lucchini S, Hinton JCD, Vogel J. 2006. SigmaE-dependent small RNAs of Salmonella respond to membrane stress by accelerating global omp mRNA decay. Mol Microbiol 62:1674–1688.
105. Lalaouna D, Simoneau-Roy M, Lafontaine D, Massé E. 2013. Regulatory RNAs and target mRNA decay in prokaryotes. Biochim Biophys Acta 1829:742–747.
106. Vecerek B, Moll I, Bläsi U. 2007. Control of Fur synthesis by the non-coding RNA RyhB and iron-responsive decoding. EMBO J 26:965–975.
107. Altuvia S, Zhang A, Argaman L, Tiwari A, Storz G. 1998. The Escherichia coli OxyS regulatory RNA represses fhlA translation by blocking ribosome binding. EMBO J 17:6069–6075.
108. Vanderpool CK, Gottesman S. 2004. Involvement of a novel transcriptional activator and small RNA in post-transcriptional regulation of the glucose phosphoenolpyruvate phosphotransferase system. Mol Microbiol 54:1076–1089.
109. Morfeldt E, Taylor D, Gabain von A, Arvidson S. 1995. Activation of alpha-toxin translation in Staphylococcus aureus by the trans-encoded antisense RNA, RNAIII. EMBO J 14:4569–4577.
110. Prévost K, Salvail H, Desnoyers G, Jacques J-F, Phaneuf E, Massé E. 2007. The small RNA RyhB activates the translation of shiA mRNA encoding a permease of shikimate, a compound involved in siderophore synthesis. Mol Microbiol 64:1260–1273.
111. Massé E, Salvail H, Desnoyers G, Arguin M. 2007. Small RNAs controlling iron metabolism. Curr Opin Microbiol 10:140–145.
112. Altuvia S, Weinstein-Fischer D, Zhang A, Postow L, Storz G. 1997. A small, stable RNA induced by oxidative stress: role as a pleiotropic regulator and antimutator. Cell 90:43–53.
113. Zhang A, Altuvia S, Tiwari A, Argaman L, Hengge-Aronis R, Storz G. 1998. The OxyS regulatory RNA represses rpoS translation and binds the Hfq (HF-I) protein. EMBO J 17:6061–6068.
114. Wadler CS, Vanderpool CK. 2009. Characterization of homologs of the small RNA SgrS reveals diversity in function. Nucleic Acids Res 37:5477–5485.
115. Papenfort K, Podkaminski D, Hinton JCD, Vogel J. 2012. The ancestral SgrS RNA discriminates horizontally acquired Salmonella mRNAs through a single G-U wobble pair. Proc Natl Acad Sci 109:E757–64.
116. Papenfort K, Sun Y, Miyakoshi M, Vanderpool CK, Vogel J. 2013. Small RNA-mediated activation of sugar phosphatase mRNA regulates glucose homeostasis. Cell 153:426–437.
117. Wadler CS, Vanderpool CK. 2007. A dual function for a bacterial small RNA: SgrS performs base pairing-dependent regulation and encodes a functional polypeptide. Proc Natl Acad Sci 104:20454–20459.
118. Novick RP, Ross HF, Projan SJ, Kornblum J, Kreiswirth B, Moghazeh S. 1993. Synthesis of staphylococcal virulence factors is controlled by a regulatory RNA molecule. EMBO J 12:3967–3975.
119. Queck SY, Jameson-Lee M, Villaruz AE, Bach T-HL, Khan BA, Sturdevant DE, Ricklefs SM, Li M, Otto M. 2008. RNAIII-independent target gene control by the agr quorum-sensing system: insight into the evolution of virulence regulation in Staphylococcus aureus. Mol Cell 32:150–158.
120. Boisset S, Geissmann T, Huntzinger E, Fechter P, Bendridi N, Possedko M, Chevalier C, Helfer A-C, Benito Y, Jacquier A, Gaspin C, Vandenesch F, Romby P. 2007. Staphylococcus aureus RNAIII coordinately represses the synthesis of virulence factors and the transcription regulator Rot by an antisense mechanism. Genes Dev 21:1353–1366.
121. Michaux C, Verneuil N, Hartke A, Giard J-C. 2014. Physiological roles of small RNA molecules. Microbiology 160:1007–1019.
122. Paster BJ, Olsen I, Aas JA, Dewhirst FE. 2006. The breadth of bacterial diversity in the human periodontal pocket and other oral sites. Periodontology 2000 42:80–87.
123. Kolenbrander PE, Palmer RJ, Rickard AH, Jakubovics NS, Chalmers NI, Diaz PI. 2006. Bacterial interactions and successions during plaque development. Periodontology 2000 42:47–79.
124. Loesche WJ. 1986. Role of Streptococcus mutans in human dental decay. Microbiol Rev 50:353–380.
125. Belda-Ferre P, Alcaraz LD, Cabrera-Rubio R, Romero H, Simón-Soro A, Pignatelli M, Mira A. 2012. The oral metagenome in health and disease. The ISME Journal 6:46–56.
126. Hamada S, Slade HD. 1980. Biology, immunology, and cariogenicity of Streptococcus mutans. Microbiol Rev 44:331–384.
127. Ellen RP, Banting DW, Fillery ED. 1985. Streptococcus mutans and Lactobacillus detection in the assessment of dental root surface caries risk. J Dent Res 64:1245–1249.
128. Ellen RP, Banting DW, Fillery ED. 1985. Longitudinal microbiological investigation of a hospitalized population of older adults with a high root surface caries risk. J Dent Res 64:1377–1381.
129. Lemos JAC, Burne RA. 2008. A model of efficiency: stress tolerance by Streptococcus mutans. Microbiology 154:3247–3255.
130. Maruyama F, Kobata M, Kurokawa K, Nishida K, Sakurai A, Nakano K, Nomura R, Kawabata S, Ooshima T, Nakai K, Hattori M, Hamada S, Nakagawa I. 2009. Comparative genomic analyses of Streptococcus mutans provide insights into chromosomal shuffling and species-specific content. BMC Genomics 10:358.
131. Aikawa C, Furukawa N, Watanabe T, Minegishi K, Furukawa A, Eishi Y, Oshima K, Kurokawa K, Hattori M, Nakano K, Maruyama F, Nakagawa I, Ooshima T. 2012. Complete genome sequence of the serotype k Streptococcus mutans strain LJ23. J Bacteriol 194:2754–2755.
132. Biswas S, Biswas I. 2012. Complete genome sequence of Streptococcus mutans GS-5, a serotype c strain. J Bacteriol 194:4787–4788.
133. Ajdić D, McShan WM, McLaughlin RE, Savić G, Chang J, Carson MB, Primeaux C, Tian R, Kenton S, Jia H, Lin S, Qian Y, Li S, Zhu H, Najar F, Lai H, White J, Roe BA, Ferretti JJ. 2002. Genome sequence of Streptococcus mutans UA159, a cariogenic dental pathogen. Proc Natl Acad Sci 99:14434–14439.
134. Clarke JK. 1924. On the bacterial factor in the etiology of dental caries. Br J Exp Pathol 5:141–147.
135. Human Microbiome Project Consortium. 2012. Structure, function and diversity of the healthy human microbiome. Nature 486:207–214.
136. Marsh PD. 2006. Dental plaque as a biofilm and a microbial community - implications for health and disease. BMC Oral Health 6 Suppl 1:S14.
137. Stephan RM. 1944. Intra-oral hydrogen-ion concentrations associated with dental caries activity. J Dent Res 23:257–266.
138. Takahashi N, Nyvad B. 2008. Caries ecology revisited: microbial dynamics and the caries process. Caries Res 42:409–418.
139. Kolenbrander PE, Palmer RJ, Periasamy S, Jakubovics NS. 2010. Oral multispecies biofilm development and the key role of cell-cell distance. Nat Rev Microbiol 8:471–480.
140. Dashper SG, Reynolds EC. 2000. Effects of organic acid anions on growth, glycolysis, and intracellular pH of oral streptococci. J Dent Res 79:90–96.
141. Iwami Y, Abbe K, Takahashi-Abbe S, Yamada T. 1992. Acid production by streptococci growing at low pH in a chemostat under anaerobic conditions. Oral Microbiol Immunol 7:304–308.
142. Banas JA. 2004. Virulence properties of Streptococcus mutans. Front Biosci 9:1267–1277.
143. Berkowitz RJ, Jordan HV, White G. 1975. The early establishment of Streptococcus mutans in the mouths of infants. Arch Oral Biol 20:171–174.
144. Demuth DR, Irvine DC. 2002. Structural and functional variation within the alanine-rich repetitive domain of streptococcal antigen I/II. Infect Immun 70:6389–6398.
145. Ooshima T, Matsumura M, Hoshino T, Kawabata S, Sobue S, Fujiwara T. 2001. Contributions of three glycosyltransferases to sucrose-dependent adherence of Streptococcus mutans. J Dent Res 80:1672–1677.
146. Banas JA, Vickerman MM. 2003. Glucan-binding proteins of the oral streptococci. Crit Rev Oral Biol Med 14:89–99.
147. Kilian M, Thylstrup A, Fejerskov O. 1979. Predominant plaque flora of Tanzanian children exposed to high and low water fluoride concentrations. Caries Res 13:330–343.
148. Vinogradov AM, Winston M, Rupp CJ, Stoodley P. 2004. Rheology of biofilms formed from the dental plaque pathogen Streptococcus mutans. Biofilms 1:49–56.
149. Anwar H, Strap JL, Costerton JW. 1992. Establishment of aging biofilms: possible mechanism of bacterial resistance to antimicrobial therapy. Antimicrob Agents Chemother 36:1347–1351.
150. Shemesh M, Tam A, Feldman M, Steinberg D. 2006. Differential expression profiles of Streptococcus mutans ftf, gtf and vicR genes in the presence of dietary carbohydrates at early and late exponential growth phases. Carbohydr Res 341:2090–2097.


151. Senadheera MD, Guggenheim B, Spatafora GA, Huang Y-CC, Choi J, Hung DCI, Treglown JS, Goodman SD, Ellen RP, Cvitkovitch DG. 2005. A VicRK signal transduction system in Streptococcus mutans affects gtfBCD, gbpB, and ftf expression, biofilm formation, and genetic competence development. J Bacteriol 187:4064–4076.
152. Li YH, Hanna MN, Svensäter G, Ellen RP, Cvitkovitch DG. 2001. Cell density modulates acid adaptation in Streptococcus mutans: implications for survival in biofilms. J Bacteriol 183:6875–6884.
153. Li Y-H, Lau PCY, Lee J, Ellen R, Cvitkovitch DG. 2001. Natural genetic transformation of Streptococcus mutans growing in biofilms. J Bacteriol 183:897–908.
154. Hossain MS, Biswas I. 2012. An extracelluar protease, SepM, generates functional competence-stimulating peptide in Streptococcus mutans UA159. J Bacteriol 194:5886–5896.
155. Li Y-H, Lau PCY, Tang N, Svensäter G, Ellen RP, Cvitkovitch DG. 2002. Novel two-component regulatory system involved in biofilm formation and acid resistance in Streptococcus mutans. J Bacteriol 184:6333–6342.
156. Khan R, Rukke HV, Ricomini Filho AP, Fimland G, Arntzen MØ, Thiede B, Petersen FC. 2012. Extracellular identification of a processed type II ComR/ComS pheromone of Streptococcus mutans. J Bacteriol 194:3781–3788.
157. Xue X, Tomasch J, Sztajer H, Wagner-Döbler I. 2010. The delta subunit of RNA polymerase, RpoE, is a global modulator of Streptococcus mutans environmental adaptation. J Bacteriol 192:5081–5092.
158. Meng Q, Turnbough CL, Switzer RL. 2004. Attenuation control of pyrG expression in Bacillus subtilis is mediated by CTP-sensitive reiterative transcription. Proc Natl Acad Sci 101:10943–10948.
159. Turnbough CL, Switzer RL. 2008. Regulation of pyrimidine biosynthetic gene expression in bacteria: repression without repressors. Microbiol Mol Biol Rev 72:266–300.
160. Lemos JAC, Chen YY, Burne RA. 2001. Genetic and physiologic analysis of the groE operon and role of the HrcA repressor in stress gene regulation and acid tolerance in Streptococcus mutans. J Bacteriol 183:6074–6084.
161. Yura T, Nagai H, Mori H. 1993. Regulation of the heat-shock response in bacteria. Annu Rev Microbiol 47:321–350.
162. Lemos JAC, Nascimento MM, Lin VK, Abranches J, Burne RA. 2008. Global regulation by (p)ppGpp and CodY in Streptococcus mutans. J Bacteriol 190:5291–5299.
163. O'Rourke KP, Shaw JD, Pesesky MW, Cook BT, Roberts SM, Bond JP, Spatafora GA. 2010. Genome-wide characterization of the SloR metalloregulome in Streptococcus mutans. J Bacteriol 192:1433–1443.
164. Kitten T, Munro CL, Michalek SM, Macrina FL. 2000. Genetic characterization of a Streptococcus mutans LraI family operon and role in virulence. Infect Immun 68:4441–4451.
165. Rolerson E, Swick A, Newlon L, Palmer C, Pan Y, Keeshan B, Spatafora GA. 2006. The SloR/Dlg metalloregulator modulates Streptococcus mutans virulence gene expression. J Bacteriol 188:5033–5044.


166. Roberts SA, Arirachakaran P, Patenge N, Billion A, Benjavongkulchai E, Scott JR, Luengpailin S, Raasch P, Ajdić D, Normann J, Wisniewska-Kucper A, Banas JA, Retey J, Boisguérin V, Hartsch T, Hain T, Kreikemeyer B. 2007. Manganese affects Streptococcus mutans virulence gene expression. Caries Res 41:503–511.
167. Vats N, Lee SF. 2001. Characterization of a copper-transport operon, copYAZ, from Streptococcus mutans. Microbiology 147:653–662.
168. Singh K, Senadheera DB, Levesque CM, Cvitkovitch DG. 2015. The copYAZ operon functions in copper efflux, biofilm formation, genetic transformation, and stress tolerance in Streptococcus mutans. J Bacteriol 197:2545–2557.
169. Lemme A, Grobe L, Reck M, Tomasch J, Wagner-Döbler I. 2011. Subpopulation-specific transcriptome analysis of competence-stimulating-peptide-induced Streptococcus mutans. J Bacteriol 193:1863–1877.
170. Xia L, Xia W, Li S, Li W, Liu J, Ding H, Li J, Li H, Chen Y, Su X, Wang W, Sun L, Wang C, Shao N, Chu B. 2012. Identification and expression of small non-coding RNA, L10-Leader, in different growth phases of Streptococcus mutans. Nucleic Acid Ther 22:177–186.
171. Koyanagi S, Levesque CM. 2013. Characterization of a Streptococcus mutans intergenic region containing a small toxic peptide and its cis-encoded antisense small RNA antitoxin. PLoS ONE 8:e54291.
172. Fozo EM, Makarova KS, Shabalina SA, Yutin N, Koonin EV, Storz G. 2010. Abundance of type I toxin-antitoxin systems in bacteria: searches for new candidates and discovery of novel families. Nucleic Acids Res 38:3743–3759.
173. Zeng L, Choi SC, Danko CG, Siepel A, Stanhope MJ, Burne RA. 2013. Gene regulation by CcpA and catabolite repression explored by RNA-Seq in Streptococcus mutans. PLoS ONE 8:e60465.
174. Fujiwara S, Kobayashi S, Nakayama H. 1978. Development of a minimal medium for Streptococcus mutans. Arch Oral Biol 23:601–602.
175. LeBlanc DJ, Lee LN, Abu-Al-Jaibat A. 1992. Molecular, genetic, and functional analysis of the basic replicon of pVA380-1, a plasmid of oral streptococcal origin. Plasmid 28:130–145.
176. Lau PCY, Sung CK, Lee JH, Morrison DA, Cvitkovitch DG. 2002. PCR ligation mutagenesis in transformable streptococci: application and efficiency. J Micro Meth 49:193–205.
177. Syed MA, Koyanagi S, Sharma E, Jobin M-C, Yakunin AF, Levesque CM. 2011. The chromosomal mazEF locus of Streptococcus mutans encodes a functional type II toxin-antitoxin addiction system. J Bacteriol 193:1122–1130.
178. Abràmoff MD, Magalhães PJ, Ram SJ. 2004. Image processing with ImageJ. Biophotonics international 11:36–43.
179. Schmittgen TD, Livak KJ. 2008. Analyzing real-time PCR data by the comparative C(T) method. Nat Protoc 3:1101–1108.
180. O'Toole GA. 2011. Microtiter dish biofilm formation assay. J Vis Exp.
181. Oberto J. 2013. SyntTax: a web server linking synteny to prokaryotic taxonomy. BMC Bioinformatics 14:4.

182. Di Tommaso P, Moretti S, Xenarios I, Orobitg M, Montanyola A, Chang J-M, Taly J-F, Notredame C. 2011. T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Res 39:W13–7.
183. Will S, Joshi T, Hofacker IL, Stadler PF, Backofen R. 2012. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA 18:900–914.
184. Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16:276–277.
185. Bailey TL, Williams N, Misleh C, Li WW. 2006. MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res 34:W369–73.
186. Segal R, Ron EZ. 1996. Regulation and organization of the groE and dnaK operons in Eubacteria. FEMS Microbiol Lett 138:1–10.
187. Fouquier d'Hérouel A, Wessner F, Halpern D, Ly-Vu J, Kennedy SP, Serror P, Aurell E, Repoila F. 2011. A simple and efficient method to search for selected primary transcripts: non-coding and antisense RNAs in the human pathogen Enterococcus faecalis. Nucleic Acids Res 39:e46.
188. Notredame C, Higgins DG, Heringa J. 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Bio 302:205–217.
189. Hung DCI, Downey JS, Ayala EA, Kreth J, Mair R, Senadheera D, Qi F, Cvitkovitch DG, Shi W, Goodman SD. 2011. Characterization of DNA binding sites of the ComE response regulator from Streptococcus mutans. J Bacteriol 193:3642–3652.
190. Petersen FC, Scheie AA. 2010. Natural transformation of oral streptococci. Methods Mol Biol 666:167–180.
191. Reck M, Tomasch J, Wagner-Döbler I. 2015. The alternative sigma factor SigX controls bacteriocin synthesis and competence, the two quorum sensing regulated traits in Streptococcus mutans. PLoS Genet 11:e1005353.
192. Welin J, Wilkins JC, Beighton D, Wrzesinski K, Fey SJ, Mose-Larsen P, Hamilton IR, Svensäter G. 2003. Effect of acid shock on protein expression by biofilm cells of Streptococcus mutans. FEMS Microbiol Lett 227:287–293.
193. Meng Q, Switzer RL. 2001. Regulation of transcription of the Bacillus subtilis pyrG gene, encoding cytidine triphosphate synthetase. J Bacteriol 183:5513–5522.
194. Ma Z, Jacobsen FE, Giedroc DP. 2009. Coordination chemistry of bacterial metal transport and sensing. Chem Rev 109:4644–4681.
195. Abranches J, Nascimento MM, Zeng L, Browngardt CM, Wen ZT, Rivera MF, Burne RA. 2008. CcpA regulates central metabolism and virulence gene expression in Streptococcus mutans. J Bacteriol 190:2340–2349.
196. Hueck CJ, Hillen W. 1995. Catabolite repression in Bacillus subtilis: a global regulatory mechanism for the gram-positive bacteria? Mol Microbiol 15:395–401.
197. Henkin TM. 1996. The role of the CcpA transcriptional regulator in carbon metabolism in Bacillus subtilis. FEMS Microbiol Lett 135:9–15.
198. Stenz L, François P, Whiteson K, Wolz C, Linder P, Schrenzel J. 2011. The CodY pleiotropic repressor controls virulence in gram-positive pathogens. FEMS Immunol Med Microbiol 62:123–139.

199. Santiago B, Marek M, Faustoferri RC, Quivey RG. 2013. The Streptococcus mutans aminotransferase encoded by ilvE is regulated by CodY and CcpA. J Bacteriol 195:3552–3562.
200. Grove A. 2013. MarR family transcription factors. Curr Biol 23:R142–3.
201. Martin RG, Nyantakyi PS, Rosner JL. 1995. Regulation of the multiple antibiotic resistance (mar) regulon by marORA sequences in Escherichia coli. J Bacteriol 177:4176–4178.
202. Oh S-Y, Shin J-H, Roe J-H. 2007. Dual role of OhrR as a repressor and an activator in response to organic hydroperoxides in Streptomyces coelicolor. J Bacteriol 189:6284–6292.
203. Pizzey RL, Marquis RE, Bradshaw DJ. 2011. Antimicrobial effects of o-cymen-5-ol and zinc, alone & in combination in simple solutions and toothpaste formulations. International Dental Journal 61:33–40.


8 Appendix

8.1 5’ RACE of smu770c-771c and smu1063-1064c

Northern blotting did not detect transcripts from the smu770c-771c and smu1063-1064c intergenic regions, so 5’ RACE was performed on both. The transcripts may have escaped detection by Northern blotting because their expression levels fall below the detection limit of the assay.

The sequenced smu770c-771c transcript was approximately 290 nt long, which closely matches the 315 nt reported in the RNA-seq experiment (173). The transcript lies on the complementary strand, in the same orientation as the upstream and downstream genes. The start site mapped to around 719,052 and the transcript ended at approximately 718,763. The sequence contains a SloR recognition element (SRE) and lies adjacent to a hypothetical manganese transporter (smu.770c) (163).
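As a quick consistency check, the reported length can be recomputed from the two coordinates. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive, and the helper names `transcript_length` and `strand` are hypothetical. Because both coordinates are approximate in the text, the result is also approximate.

```python
def transcript_length(start: int, end: int) -> int:
    """Inclusive length (nt) between two genomic coordinates.

    Assumes 1-based, inclusive coordinates. On the complementary
    strand the start is numerically larger than the end, so the
    absolute difference is used.
    """
    return abs(start - end) + 1


def strand(start: int, end: int) -> str:
    """Infer the strand from the direction of the coordinates."""
    return "+" if start <= end else "-"


# Approximate coordinates reported for the smu770c-771c transcript.
start, end = 719_052, 718_763
print(transcript_length(start, end), strand(start, end))  # -> 290 -
```

The decreasing coordinates agree with the transcript lying on the complementary strand, and the computed length agrees with the approximately 290 nt obtained by sequencing.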

The sequenced smu1063-1064c transcript was approximately 130 nt long. Although the smu1063-1064c TAR was reported in the comX study, the authors did not report the sizes of the intergenic transcripts (169), and the transcript had no match in the published RNA-seq experiment. The smu1063-1064c TAR was located on the positive strand, opposite to the orientation of the adjacent upstream and downstream genes. The start site was located at 1,008,320, and termination of the transcript by a rho-independent terminator (1,008,520–1,008,541) was predicted using ARNold.


smu770c-771c

718700 GGGTTAATTG CATTATAGCA TAAAAAACAG CATATGTTAA GTATGCTTAA
718750 AAATTAAAAA AGGTGTACCT ATTATTACTG AATATAAAGA GTTTGTAAGA
       Predicted transcription stop
718800 ATTACTGACA GTATAAAACA CAATAGAGAT TATGGAAAAT TAATCAGAGA
718850 TGTCTAAACA ATCAGATTGC GAAAAGAAAA ATACTTGAAT TTACAGTTAA
718900 TTCCTCAAGT TCTTGATTTG AGTTTGTATG CTAAAAGGAG CATGTATATT
718950 AGCAGAAAAC GGTTATCATC TCATTGAATC AAATGAAAAT TTTAAAATGA
719000 TGAAATGACT GTTAGAAGTC TGTGAAGAAA TCAAAGCAAG TAAAGGTTAT
719050 GAATTATTGA TAATAATGCT TGAACATATT TTTGTCAAGA TGATGAAGTG
       +1<< -10 -35

smu1063-1064c

1008270 GATGTTTTAT TTCAAGTTTA TTCAACTTTA TTCCTCTTTT CTAATCATAA
        -35 -10
1008320 TTAAAAACCA ACTGACCATA TCATTATATA ATAGTGTGGT CAGTTTGTCA
        +1>>
1008370 AAAAAGAAGG AGAGCAAAAG AGAACCTTAA AAAGCAAAAA TCAAGGTTTT
1008420 CTTTTGCTCC CAGATAGTCT CATTTAAAGT GCATTATTAT CTAAAAAGTA
1008470 GAGACTATCA GTCTTCCTAA TCATGTATAA GTAAACACAA CAAATTAAAA
1008520 CTTTGGAACA GTTCTCCCAA AG
        rho-independent terminator

Figure Transcription initiation sites identified by 5’ RACE for the intergenic regions between smu.770c - smu.771c and smu.1063 - smu.1064c. The predicted -35 and -10 sites are underlined and in bold. The predicted transcription stop is indicated for smu770c-771c, and the predicted rho-independent terminator is shown for smu1063-1064c. The 5’ RACE +1 start site is underlined and in bold, with the direction of transcription indicated as >> or <<.
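The 22 nt annotated as the rho-independent terminator of smu1063-1064c (1,008,520–1,008,541) can be inspected for the inverted repeat expected of a terminator hairpin. The sketch below is a minimal illustration and not the algorithm used by ARNold: it assumes perfect Watson-Crick pairing only (no G-U wobble, bulges, or energy model), and the helper names `reverse_complement` and `hairpin_stem` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_stem(seq: str, min_stem: int = 4):
    """Find the longest perfect stem that closes the two ends of seq.

    Returns (stem, loop), or None if fewer than min_stem bases pair.
    Simplification: Watson-Crick pairs only, no mismatches or gaps.
    """
    seq = seq.replace(" ", "").upper()
    stem_len = 0
    # Extend inward while the next 5' base pairs with the next 3' base.
    while (2 * (stem_len + 1) <= len(seq)
           and seq[stem_len] == reverse_complement(seq[-(stem_len + 1)])):
        stem_len += 1
    if stem_len < min_stem:
        return None
    return seq[:stem_len], seq[stem_len:len(seq) - stem_len]


# Terminator row as printed in the figure for smu1063-1064c.
print(hairpin_stem("CTTTGGAACA GTTCTCCCAA AG"))  # -> ('CTTTGG', 'AACAGTTCTC')
```

Under these assumptions the first six bases (CTTTGG) pair with the last six (CCAAAG), leaving a 10 nt loop, so the annotated span is consistent with a stem-loop structure.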
