<<

Determining the mechanism of P

-dependent regulation of

DISSERTATION*

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Sara Elgamal

Graduate Program in Microbiology

The Ohio State University

2015

Dissertation Committee:

Dr. Michael Ibba, Advisor

Dr. Irina Artsimovitch

Dr. Kurt Fredrick

Dr. Jane Jackman

Copyright by

Sara Elgamal

2015

Abstract

Elongation factor P (EF-P) is required for the efficient synthesis of with stretches of consecutive prolines and other motifs that would otherwise lead to pausing.

However, previous reports also demonstrated that levels of most diprolyl-containing proteins are not altered by the deletion of efp. To define the particular sequences that trigger ribosome stalling at diprolyl (PPX) motifs, we used ribosome profiling to monitor global ribosome occupancy in strains lacking EF-P. Only 2.8% of PPX motifs caused significant ribosomal pausing in the Δefp strain, with up to a 45-fold increase in ribosome density observed at the pausing site. The unexpectedly low fraction of PPX motifs that produce a pause in led us to investigate the possible role of sequences upstream of PPX. Our data indicate that EF-P dependent pauses are strongly affected by sequences upstream of the PPX pattern. We found that residues as far as 3 codons upstream of the ribosomal peptidyl-tRNA site had a dramatic effect on whether or not a particular PPX motif triggered a , while internal Shine Dalgarno sequences upstream of the motif had no effect on EF-P dependent translation efficiency.

Increased ribosome occupancy at particular stall sites did not reliably correlate with a decrease in total levels, suggesting that in many cases other factors compensate for the potentially deleterious effects of stalling on protein synthesis. These findings indicate that the ability of a given PPX motif to initiate an EF-P-alleviated stall is

ii

strongly influenced by its local context, and that other indirect post-transcriptional effects determine the influence of such stalls on protein levels within the cell.

Transcription terminators contribute to gene regulation. Terminators are either intrinsic or

Rho-mediated. Besides regulating transcriptional punctuation of bacterial genomes, Rho can act intragenically under conditions that perturb coupling of translation and transcription. Rho- mediated transcription termination results from the action of the Rho protein, which must bind to specific sequences, called rut (Rho utilization) sites, on the nascent RNA that are typically C-rich but have no traceable consensus sequence. Since translational stalls can potentially reduce translational speed, hence create a gap between translation and transcription, this can perturb coupling of transcription and translation. In this work we investigate the role of EF-P, an elongation factor required for efficient translation of consecutive proline-containing mRNAs, to maintain translational speed for proper coupling during translation of these mRNAs. Utilizing in vivo reporter systems we investigate the effect of an oligo-proline motif upstream of a transcription termination site. Transcript levels of the downstream gene indicated reduced transcript level by northern blot. This occurred for both types of transcription terminators, Rho dependent and intrinsic. Furthermore, similar experiments were conducted in a faster and slower

RNA polymerase background (Rpoβ2 and Rpoβ8, respectively) to detect differences in the transcription termination efficiencies upon alteration of RNAP speed. We propose that EF-P can play an additional role in gene expression at the level of transcription by maintaining coupled transcription-transcription during translation of oligo-proline motifs.

We name this new role EF-P Altered Transcription Termination (EATT).

iii

Dedication

This document is dedicated to my family.

iv

Acknowledgments

I would like to express my immense gratitude to my mentor Dr. Michael Ibba. Since I’ve joined his lab, Dr. Ibba has provided me with continuous assistance and motivation throughout my studies. He has given me exciting and challenging projects without a doubt that I could do them. I am grateful for his great confidence; such confidence helped me explore my potential. Dr. Ibba also gave me the freedom to test my own ideas, which has helped me think as an independent researcher. His valuable feedback had a fundamental role in the development of my projects.

I am also thankful to my graduate committee, Drs. Irina Artsimovitch, Kurt Fredrick and

Jane Jackman for their knowledgeable contributions, insightful suggestions and genuine support.

I am thankful to Dr. Medha Raina, who initially trained me. I am very grateful to Dr.

Assaf Katz who performed the majority of the bioinformatics for the ribosome profiling data (Chapter 2 and 3). Our collaborator, Dr. William Navarre (University of Toronto,

Canada) and Steven Hersch who contributed to the work in Chapter 3. Dr. Kurt Fredrick helped me design the initiation experiment in chapter 3. Dr. Irina Artsimovitch helped me in the research plan and reporter design in chapter 4.

Special thanks to members of the Ibba lab (past and present) for all their help, scientific inputs and support throughout the years: Dr. Jennifer Shepherd, Dr. Tammy Bullwinkle,

v

Andrei Rajkovic, Adil Moghal, Kyle Mohler, Anne Witzky and Paul Kelly. I am also grateful for the productive and encouraging environment on the second floor of Aronoff.

Furthermore, I am very grateful for meeting the outstanding professors in the Advanced

Bacterial Genetics course at Cold Spring Harbor. Drs. Beth Lazazzera, Diarmaid Hughes and Fitnat Yildiz, I have learned a great deal from their discussions.

Last but not least, I owe the deepest gratitude to my beloved family; my parents and sisters for their unconditional love, constant support and endless encouragement every step of the way.

vi

Vita

1999- 2004 ...... B.Sc. Pharmaceutical Sciences, Cairo

University, Egypt

2005- 2009 ...... M.Sc. Pharmaceutical Sciences

“Microbiology and Immunology”, Cairo

University, Egypt

2010 - 2015 ...... PhD in Microbiology, Department of

Microbiology, The Ohio State University

Publications

Elgamal, S., Artsimovitch I., Figueroa-Bossi N., Bossi L. and Ibba M. maintains transcription-translation coupling during oligo-proline synthesis. 2016, In preparation.

Katz, A., Rajkovic, A., Elgamal, S., and Ibba, M. (2016) Non-canonical roles of tRNA and tRNA mimics in bacterial cell biology: Hijacking of ancient metabolic actors. Mol Microbiol. In preparation.

Hersch, S.J., Elgamal, S., Katz, A., Ibba, M., and Navarre, W.W. (2014) Translation Initiation Rate Determines the Impact of Ribosome Stalling on Bacterial Protein Synthesis. J Biol Chem 289: 28160-28171

Elgamal, S., Katz, A., Hersch, S.J., Newsom, D., White, P., Navarre, W.W., and Ibba, M. (2014) EF-P Dependent Pauses Integrate Proximal and Distal Signals during Translation. PLoS Genet 10: e1004553.

vii

Bullwinkle, T.J., Zou, S.B., Rajkovic, A., Hersch, S.J., Elgamal, S., Robinson, N., Smil, D., Bolshan, Y., Navarre, W.W., Ibba, M. (2013) (R)-beta-lysine-modified elongation factor P functions in translation elongation. J Biol Chem 288: 4416-4423

Raina, M., Elgamal, S., Santangelo, T.J., and Ibba, M. Association of a multi-synthetase complex with translating in the archaeon Thermococcus kodakarensis. FEBS Lett, 2012. 586(16): p. 2232-8.

Fields of Study

Major Field: Microbiology

viii

Table of Contents Abstract(...... (ii

Dedication(...... (iv

Acknowledgments(...... (v

Vita(...... (vii

Publications(...... (vii

Fields(of(Study(...... (viii

Table(of(Contents(...... (ix

List(of(Tables(...... (xiii

List(of(Figures(...... (xiv

List(of(Symbols/Abbreviation(...... (xvi

Chapter(1:(Introduction(...... (1

1.1(Translation(...... (1

1.2(Ribosome(rescue(mechanisms(...... (2

1.3(Elongation(factorKP(...... (5

1.3.1.$Crystal$structure$...... $5

1.3.2.$EF3P$Modification$...... $7

1.3.3$EF3P$homologs$...... $9

ix

1.3.4$Mutant$phenotypes$...... $10

1.4(EFKP(alleviates(ribosome(pausing(at(oligoKprolines(...... (11

1.5(Proteomic(analysis(of(Salmonella(efp(mutants(...... (12

1.5.1$2D3gel$analysis$...... $12

1.5.2$Stable$Isotope$Labeling$of$Amino$acids$in$Cell$culture$(SILAC)$...... $13

1.6(EFKP(and(regulation(of(gene(expression(...... (14

1.8(Transcription(...... (16

1.9(Modes(of(transcription(termination(in(E.,coli(...... (17

1.9.1$Intrinsic$termination$...... $18

1.9.2$Rho$dependent$transcription$termination$...... $20

1.9.3$Rho$inhibitors:$Psu$and$bicyclomycin$(BCM)$...... $21

Chapter(2:(Investigation(of(the(role(of(EFKP(using(ribosome(profiling(...... (23

2.1.Introduction(...... (23

2.2.(Results(...... (25

2.3.Discussion(...... (35

2.3.1$Ribosome$profiling$and$protein$synthesis$...... $35

2.3.2$The$role$of$EF3P$in$integrating$different$signals$to$regulate$translation$...... $38

2.4.Materials(and(methods(...... (40

2.4.1$Strains$...... $40

2.4.2$Ribosome$profiling$(ribo3seq)$...... $40

2.4.3$Preparation$of$ribosome$footprint$samples$...... $41

2.4.4$Preparation$of$fragmented$mRNA$samples$(RNA3seq$samples)$...... $41

2.4.5$Library$Generation$...... $42

x

2.4.6$Next$Generation$Sequencing$and$alignment$to$E.#coli$genome$...... $45

2.4.7$Analysis$of$ribosome$densities$at$PPX$sequences$and$Pausing$sites$detection$...... $46

Chapter(3:(Mining(the(Ribosome(Profiling(data(...... (47

3.1.Introduction(...... (47

3.2.(Results(...... (48

3.2.1$Non3PPX$pausing$motifs$...... $48

3.2.2$Common$patterns$in$PPX$pausing$motifs$...... $49

3.2.3$Exploring$the$role$of$the$weak$internal$Shine3Dalgarno$sequences$...... $52

3.2.4$The$Effect$of$the$amino$acid$immediately$upstream$of$the$diprolyl$motif$...... $54

3.2.5$Effects$of$distal$upstream$sequences$on$PPX$translation$...... $61

3.2.6$Effect$of$translation$initiation$on$EF3P$dependent$pausing$...... $61

3.3.Discussion(...... (63

3.4.(Materials(and(methods(...... (67

3.4.1$GFP$fluorescence$assay$in$E.#coli$...... $67

3.4.2$Anti$Shine3Dalgarno$hybridization$free3energy$prediction$...... $67

3.4.3$Other$sequences$analysis$and$curve$fitting$...... $68

Chapter(4:(EFKP(maintains(transcriptionKtranslation(coupling(during(oligoKproline( synthesis(...... (69

4.1(Introduction(...... (69

4.2(Results(...... (73

4.2.1$EF3P$does$not$regulate$Rho$Synthesis$...... $73

4.2.2$Effects$of$Oligo3Pro$synthesis$on$Rho3dependent$transcription$termination$...... $74

4.2.3$Effects$of$Oligo3Pro$synthesis$on$transcription$through$intrinsic$terminator$sites$.....$77

xi

4.2.4$Investigating$the$effect$of$different$RNAP$mutants$on$EF3P$dependent$coupling$...... $78

4.2.5$In$Silico$search$for$EF3P$suppressed$transcription$terminators$...... $81

4.4(Materials(and(Methods(...... (92

4.4.1$Strains$...... $92

4.4.2$$Plasmids$...... $93

4.4.3$Fluorescence$assay$in$E.#coli$...... $94

4.4.4$Isolation$and$analyses$of$in#vivo#RNA$transcripts$...... $94

Chapter(5:(Conclusions(and(Perspectives(...... (96

References(...... (102

xii

List of Tables

Table 1. EF-P dependent pauses that contain a PPX sequence ...... 30

Table 2. List of PPX sequences that do not produce a translation pause...... 31

Table 3. List of PPX-containing proteins identified by SILAC to be 3-fold or more abundant in the wild-type vs. ∆efp strain and their corresponding pausing index from ribo-seq ...... 34

Table 4. List of EF-P dependent pauses that do not contain a PPX sequence ...... 35

Table 5. Indexed library PCR primers ...... 44

Table 6. Combinations of amino acids at A site (X, #10) and 2 positions upstream the P site (Z) in pausing and non-pausing ZPPX sequences ...... 60

Table 7. Common genes in both SILAC [81] and RNA-seq [20] that are down-regulated in ∆efp strain in comparison to WT and contain at least one PPX. These genes have at least a 1.5 fold difference in RNA total reads in the RNA-seq dataset ...... 83

Table 8. Prediction of intrinsic terminators for genes with RNA reads more than 2 fold higher in WT versus ∆efp ...... 85

Table 9. Prediction of intrinsic terminators for genes with RNA reads more than 2 fold higher in WT versus ∆efp ...... 88

Table 10. DNA oligonucleotides utilized ...... 95

$

xiii

List of Figures

Figure 1. Triggers for translation stalling and ribosome rescue...... 4

Figure 2. Structure of EF-P...... 6

Figure 3. The structure of EF-P bound to the ribosome ...... 7

Figure 4. Modification of EF-P...... 8

Figure 5. Venn diagram of E. coli SILAC data ...... 15

Figure 6. Overview of ribosome profiling steps ...... 24

Figure 7. Correlation for biological replicates of ribosome profiling data ...... 26

Figure 8. Translation pausing at PPX sequences...... 27

Figure 9. Comparison of pausing index for each PPX...... 29

Figure 10. Comparison of SILAC and rib-seq data...... 37

Figure 11. The GFP/mCherry reporter...... 50

Figure 12. Alignment of sequences from genes with pausing or non-pausing PPX...... 51

Figure 13. Correlations between mRNA sequence and pause strength...... 53

Figure 14. Effects on translation pausing of aSD affinity and the identity of the preceding PPX...... 56

Figure 15. Effect of amino acids preceding PPX on pausing efficiency...... 57

Figure 16. Effect of codon usage at A site position. Dr. Assaf Katz prepared this figure.58

Figure 17. Translation initiation and elongation stall strength influence protein level. .... 63

xiv

Figure 18. Model of the interplay between translation initiation rate and elongation stalls.

...... 66

Figure 19. Rho is produced equally in WT and ∆efp Salmonella...... 74

Figure 20. Reporter with a rut site downstream of oligo-proline motifs...... 76

Figure 21. Reporter with an intrinsic terminator downstream of an oligo-proline motif. . 78

Figure 22. Reporter with a rut site downstream of oligo-proline motifs in different RNAP backgrounds...... 80

$

xv

List of Symbols/Abbreviation

PTC Peptidyl centre aa-tRNA Aminoacyl-tRNA tmRNA Transfer-messenger RNA EF-P Elongation factor P E. coli Escherichia coli 2D-DIGE Two-dimensional difference in gel electrophoresis P or Pro Proline Rut site Rho utilization site bp Base pairs cDNA Complementary DNA P-site peptidyl-tRNA site A-site Aminoacyl -tRNA site SILAC Stable Isotope Labeling of Amino acids in Cell culture RNAP RNA Polymerase TEC Transcription elongation complex Psu Polarity suppression protein BCM Bicyclomycin Ribo-seq Ribosome profiling SD Shine Dalgarno sequence aSD Anti Shine Dalgarno sequence PPX Diprolyl motifs EATT EF-P Altered Transcription Termination

xvi

Chapter 1: Introduction

1.1 Translation

Bacteria dedicate ~50% of their energy to protein synthesis. In Escherichia coli 20–40% of protein synthesis is directed to produce ribosomes and other translation factors [1].

Protein is produced from an mRNA sequence through the process of translation. The translation cycle can be subdivided into four stages: initiation, elongation, termination and ribosome recycling [2]. Translation initiation is the rate-limiting step for most proteins. In translation is carried out by the 70S ribosome, which has a small subunit (30S) and a large subunit (50S). For translation to initiate the mRNA, translation 1 (IF1), IF2, IF3, and the initiator tRNA charged with N- formylmethionine (fMet-tRNAfMet) assemble on the 30S subunit to form the pre- initiation complex. IF1, IF2 and IF3 dissociate and the 50S subunit binds to form the initiation complex [3]. The elongation stage begins with the addition of each following amino acid sequentially to the nascent chain. Each amino acid is delivered to the ribosome in a ternary complex including an aminoacyl-tRNA (aa-tRNA), elongation factor Tu (EF-Tu) and GTP. If the tRNA anticodon pairs with the codon in the decoding center of the A-site, GTP is hydrolyzed and the aminoacyl end of the tRNA moves into the centre (PTC) and is ready for bond formation. Peptide bond formation proceeds such that the α-amino group in the A-site tRNA initiates a nucleophilic attack on the ester carbonyl carbon of the P-site tRNA resulting in the 1

breakage of that ester bond and the formation of a peptide bond. At this stage the peptidyl-tRNA in the A-site is one amino acid longer and the P-site is occupied with a deacylated tRNA. After peptide bond formation, the tRNAs are transiently shifted to A/P and P/E hybrid states which are resolved by elongation factor G (EF-G), a GTPase that catalyzes the translocation of the tRNAs and mRNA relative to the ribosome. The elongation stage continues until a is encountered. With a stop codon in the decoding center, it is recognized by either peptide chain 1 (RF1) or RF2.

Binding of the release factor leads to termination by cleavage of the nascent peptide chain from the P-site tRNA and release of the nascent peptide from the ribosome.

Following termination, the GTPase RF3 binds the ribosome to release RF1/2, leaving only the mRNA and a deacylated P-site tRNA. Finally, ribosome recycling factor (RRF) and EF-G promote dissociation of the 50S subunit and 30S subunit. The disassociated

30S subunit is still bound to deacylated tRNA and mRNA. IF3 binds to the 30S leading to its full dissociation, thus preparing it for a new round of protein synthesis [2].

1.2 Ribosome rescue mechanisms

During both elongation and termination the translation cycle can be arrested if an mRNA codon is not recognized in the decoding center. Bacteria need to rescue the ribosomes stalled on mRNA to maintain cellular translation capacity at an adequate level. There are two possible outcomes of a stalled ribosome (Figure1) either the mRNA is cleaved to make a non-stop complex or elongation and termination eventually resume [4].

Accumulation of non-stop complexes is toxic, and the failure to rescue these ribosomes will diminish the ability of the cell to make proteins [5, 6]. In a non-stop translation

2

complex, if the peptidyl-tRNA carries a short nascent peptide (less than five amino acids), it can exit through the ribosomal exit site (E-site), this process is known as drop- off. However, if the peptidyl-tRNA carries a chain longer than five amino acids it is stably associated with the the ribosome [7, 8] and cannot exit without hydrolysis. Thus bacteria need to be able to hydrolyze peptidyl-tRNA in non-stop complexes while not interfering with elongating ribosomes. The main mechanism bacteria use for rescuing non-stop complexes is the trans-translation pathway [6, 9, 10]. This pathway involves a particular RNA molecule called transfer-messenger RNA (tmRNA) and a small protein B

(SmpB) [11]. The tmRNA-SmpB complex structurally mimics a tRNA and it can be aminoacylated with alanine and interact with the ribosome [12, 13]. Furthermore, there is a within tmRNA so when it is in the decoding center, the ribosome can resume translation on this reading frame until it terminates at a stop codon at the end of this reading frame. Thus all components of the stalled complex are removed at the end of the reaction and the nascent polypeptide and stalled mRNA are targeted for degradation

[12]. There are also backup systems for trans-translation like the alternative ribosome- rescue factor A (ArfA) and ArfB [9].

The alternative outcome of elongation and termination, resumption, is more common [4,

14-16]. The decision between these outcomes seems to be complicated, and many of the involved factors are yet to be identified. Furthermore, stalling is affected by the sequence and structure of both the mRNA and the nascent polypeptide [15, 17]. Elongation through certain oligo- proline sequences is promoted by the conserved translation elongation factor P (EF-P) [18-20]. It has been proposed that EF-P stimulates elongation by binding

3

adjacent to the peptidyl-tRNA in the P-site facilitating peptidyl transfer by correctly positioning the tRNA in the P-site [19, 21]. Finally, if elongation fails to resume the mRNA is cleaved in the A-site by an unknown mechanism, and ribosome rescue is triggered [22-27]. On the other hand, under severe stress conditions [28] most mRNAs are cleaved in the A-site by ribosome-dependent endonuclease toxins (e.g., RelE) and upon stress relief, the trans-translation system is required to rescue ribosomes for the cells to resume growth [29, 30].

Figure 1. Triggers for translation stalling and ribosome rescue. Figure from [4]. Bacterial ribosomes can initiate translation before transcription is complete. Most translation cycles produce full-length protein (part a). Non-stop complex can result from: Premature transcription termination (part b), damage to the mRNA (part c) or RNase activity (part d), frameshifting (part e) and readthrough (part f). Elongation of some ribosomes is stimulated by elongation factor P.

4

1.3 Elongation factor-P

EF-P was first identified in 1975 and from in vitro translation assays. It was thought to be a translation factor that facilitated the formation of the first peptide bond [31]. EF-P is a

20-kDa protein and there are approximately 0.1 – 0.2 copies of it per ribosome in E. coli

[32, 33]. EF-P was initially thought to be essential for viability of E. coli [34, 35]. This was supported by the fact that all bacteria encode at least one efp homolog. However, a systematic high-throughput attempt to disrupt all nonessential genes in E. coli found that at least one strain of E. coli with an efp disruption could remain viable [36, 37]. EF-P mimics tRNA both in size and shape (Figure 2). In order for EF-P to function it requires post-translational modification [38-40], as do its homologs the eukaryotic and archaeal initiation factor 5A (eIF5A and aIF5A, respectively). According to an in vitro model of peptide bond synthesis, EF-P was shown to stimulate the formation of a peptide bond between N-formyl- Met-tRNA and puromycin. This led to the general conclusion that

EF-P had a central role in protein translation by facilitating the formation of the first peptide bond, although it was known that EF-P was not strictly required to reconstitute protein synthesis in vitro [41, 42].

1.3.1. Crystal structure

EF-P is a protein that mimics a tRNA [38, 40] in both shape and size (Figure 2A). A crystal structure of EF-P from Thermus thermophilus HB8 at a 1.65 Å resolution [43] shows that the protein consists of three β-barrel domains (domains I, II, and III), whilst eIF-5A has only two domains (Figure 2B). Another crystal structure, at 3.5 Å

5

Figure 2. Structure of EF-P. A. Arrows indicate sites of aminoacylation and β-lysylation for tRNA and EF-P, respectively [44]. B: Ribbon model of EF-P of E. coli. EF-P is composed of three beta- barrel domains [39].

resolution, of Thermus thermophilus EF-P in complex with the 70S ribosome along with

fMet the initiator transfer RNA N-formyl-methionyl-tRNAi (fMet-tRNAi ) and a short piece of mRNA [41] revealed that EF-P binds between the for the peptidyl tRNA

(P site) and the exiting tRNA (E site). It spans both of the ribosomal subunits having its amino terminal domain positioned adjacent to the aminoacyl acceptor stem and its carboxyl terminal domain positioned next to the anticodon stem-loop of the initiator tRNA bound in the P site. Also, domain II of EF-P interacts with the

L1 (Figure 3). From this structure, it was thought that EF-P could facilitate the

fMet appropriate positioning of fMet-tRNAi during translation initiation for first peptide bond formation [41]. This interaction between the modified region of EF-P and the 3`- end of initiator tRNA in the P-site suggests that EF-P might have a direct effect on translation initiation [41, 42].

6

Figure 3. The structure of EF-P bound to the ribosome A: Overview of the E- and P-site tRNAs bound to the 70S ribosome, PTC is peptidyl transferase center. B: Overview of EF-P and P-site tRNA bound to 70S. EF-P is shown as a surface representation in shades of magenta with the different domains indicated [41].

Moreover, EF-P could play a role in modulating ribosomal mRNA binding and optimize the efficiency of translation, a role that has recently been found for eIF5a [45, 46]

1.3.2. EF-P Modification

Early studies on E. coli EF-P showed that it had a post-translational modification with a mass consistent with that of lysine [47]. Two unrelated , YjeK and PoxA, were shown to be required for this modification pathway [38, 40]. PoxA is the first aaRS paralog that has been found to modify a protein with an amino acid [39]. PoxA adds β- lysine onto lysine 34 of E. coli EF-P in an ATP-dependent manner [44, 48]. Also, from the crystal structure of EF-P in complex with PoxA, it can be seen that their interaction mimics that of an aaRS with its cognate tRNA [49]. This interaction favors the idea that

EF-P modification is analogous to tRNA aminoacylation [39].

For lysylation of EF-P by PoxA, (R)-β-lysine was shown to be an 100-fold more efficient a substrate than either (S)-β-lysine or L-α-lysine. Synthesis of β-lysine is achieved by

7

YjeK (2,3-β-lysine aminomutase) which catalyzes the conversion of L-α-lysine to (R)- β- lysine [49]. Fully modified EF-P requires an additional , YfcM. YfcM hydroxylates modified EF-P on the same Lysine 34 [50]. This results in the fully modified EF-P that carries a unique ε(R)-β-lysylhydroxylysine modification of ~144 Da

(Figure 4).

Bioinformatic analysis indicated that both poxA and yjeK were closely linked with the efp gene [38, 40]. Moreover, in some Gram-negative bacteria, the three genes are in the same [44]. Since the modification of EF-P is required for its activity, it is noteworthy that some organisms lack homologs of PoxA and YjeK [40]. It is not know in these organisms whether EF-P is modified or how this is achieved.

Figure 4. Modification of EF-P. Post-translational modification of EF-P requires PoxA, YjeK [44] and YfcM [50]. Figure adapted from [51].

It has been recently shown in Pseudomonas aeruginosa that EF-P is Posttranslational modified by glycosylation with rhamnose at a position equivalent to that at which β- 8

lysine is attached to E. coli EF-P [52]. Also, It has been observed that not all EF-P proteins have the conserved lysine 34 residue. Also, some bacteria encode an EF-P homolog (EF-P- like protein, EF-P2, yeiP) [36, 53]. In EF-P2, the corresponding Lys34 is replaced by arginine. Some organisms, such as E. coli and Salmonella enterica, have both

EF-P2 and EF-P, but in other cases EF-P2 is the only EF-P present in the genome [40].

1.3.3 EF-P homologs

At least one EF-P homolog has been found in all sequenced bacterial genomes to date. In addition many genomes have additional paralogs [40]. The eukaryotic homolog of EF-P is eIF5A and in its counterpart is aIF5A. Moreover, both eIF5A and aIF5A are modified at a highly conserved lysine residue with the unique amino acid .

Knockout studies in yeast revealed that the hypusine modification is absolutely required for eIF5A activity [54, 55]. Furthermore, this residue maps to the analogous location that is modified in EF-P [44]. eIF5A is highly conserved and essential in [56, 57].

It is the only protein known to contain the unusual amino acid residue hypusine (Nε- (4- amino-2-hydroxybutyl)-lysine) [58, 59]. Hypusylation requires two enzymatic steps catalysed by deoxyhypusine synthase (DHS) and deoxyhypusine hydroxylase (DOHH) enzymes [55]. In eIF-5A the added spermidine moiety is hydroxylated by DOHH while in the case of EF-P the final modification step which involves hydroxylation by YfcM, is performed on the C4 or C5 position of Lysine 34 of EF-P not on the added β-lysine [50]. eIF5A was first purified from ribosomes and was believed to be involved in the formation of the first peptide bond [60, 61]. Recently, yeast studies have indicated that eIF5A promotes translation elongation rather than translation initiation [45, 46, 62] and like EF-

9

P, eIF5A was shown to promote the translation of polyproline motifs [63]. e/aIF5A share both sequence and structural similarity (Figure 4) with the first two domains of EF-P, but lack domain III [41, 43]. As deletion of eIF5A is lethal, but efp deletion is not, the bacterial protein is an ideal model to understand the physiological role of this family of proteins.

The phenotypes associated with lack of eIF5A were involved with stress adaptation, cell cycle control and apoptosis [55, 64-66]. Upon induction of oxidative stress, eIF5A is essential for polysome disassembly and stress granule formation. This might be due to its role in enhancing translation elongation [67]. Furthermore, eIF5A has been reported to selectively bind certain RNA molecules. This supports the idea that these factors may have a modulating role in the translation of specific transcripts rather than being general translation factors [68-70].

1.3.4 Mutant phenotypes

Salmonella mutants lacking poxA, yjeK or efp have pleiotropic and overlapping phenotypes. These phenotypes include: increased susceptibility to numerous unrelated stress conditions including antibiotics from different pharmacological classes, increased respiration under nutrient-limiting conditions, reduced growth rates, impaired motility, attenuated virulence in a mouse infection model [38] and reduced colonization in a swine host [71]. Bacillus subtilis strains lacking efp have impaired motility [72] together with impaired ability to sporulate [73]. In Agrobacterium tumefaciens, an efp homolog, chvH, was identified in a screen for mutants that were defective in plant virulence. chvH mutants showed reduced expression of virulence genes and hypersusceptibility to

10

detergents. Also, the phenotypes could be rescued by complementation with the efp gene from E. coli [74].

1.4 EF-P alleviates ribosome pausing at oligo-prolines

Although the basic peptide bond formation reaction is the same for all amino acids, the speed is not uniform. Factors affecting translation speed include the abundance of each aminoacyl-tRNA, the structure of the incorporated amino acid, and structural features of the mRNA and the nascent peptide [15]. Changes in translation speed can have key physiological roles such as regulating protein synthesis and folding. This not only controls the fraction of active protein, but also can potentially provide new functionality through alternative folds [75]. When comparing amino acids according to their speed of incorporation during peptide bond formation, proline is by far the slowest [76, 77].

Proline is the only natural cyclic amino acid [78] and the pyrrolidine ring of proline restricts the accessible conformations for this residue imposing structural constraints on the positioning of the amino acid in the PTC, resulting in slow peptide bond formation and ribosome stalling [79]. To overcome this potential bottleneck, robust translation of sequences with consecutive prolines requires the presence of elongation factor P (EF-P)

[18-20, 80, 81]. The β-Lysine modification of EF-P can protrude deeply into the PTC, facilitating peptide bond formation by stabilizing peptidyl-tRNA on the ribosome and promoting optimal positioning of the substrates. Recent work has shown that EF-P acts by entropic steering of Pro-tRNA towards a catalytically productive orientation in the

PTC [21].

11

1.5 Proteomic analysis of Salmonella efp mutants

1.5.1 2D-gel analysis

Differences in protein expression patterns of Salmonella wild-type, ΔpoxA and ΔyjeK mutant strains were investigated by 2D-gel analysis [38]. It was found that the differences in the soluble proteomes were relatively modest. Half of misregulated proteins appeared to be more abundant in the mutant strains while the majority of the downregulated proteins appeared to be related to metabolic functions such as nutrient acquisition and

ATP generation. This suggested that poxA and yjeK were required for the expression of proteins necessary for the utilization of appropriate alternative energy sources upon nutrient deprivation [38]. This finding was supported by the known function of PoxA as a regulator of pyruvate oxidase activity [82]. The misregulation of these genes in

Salmonella would affect its survival upon encountering the intracellular environment of the macrophage. The most overrepresented class of proteins identified upregulated in the mutant strains, was linked with the SPI-1 pathogenicity island. These proteins are required for Salmonella to invade host cells. Upon internalization of Salmonella into the macrophage, these proteins are downregulated and Salmonella turns on the expression of effectors, which are secreted by the type III secretion system encoded on SPI-2 [83, 84].

The rapid termination of SPI-1 expression upon macrophage internalization is necessary otherwise continued expression of those proteins would induce cytokines and lead to macrophage death [85, 86]. Misregulated SPI-1 expression in the macrophage environment would ultimately be detrimental to bacterial growth within the host [87, 88] and could help explain the attenuated virulence displayed by these mutants [38].

12

The inability of ΔpoxA, ΔyjeK, and Δefp mutants to grow on low osmolarity medium, hypersensitivity to detergents, increased susceptibility to numerous antibiotics of different pharmacological classes and migration defects suggest that these mutants may have altered cell envelope, problems in membrane biogenesis or defects in membrane permeability. Comparison of the membrane proteomes of Salmonella wild-type and Δefp showed few changes, but included the prominent overexpression of a single porin,

KdgM, in the outer membrane of the efp mutant. Removal of KdgM in the efp mutant background rescued the cell surface-associated phenotypes of efp mutants [51].

All of these observations suggest that EF-P might be responsible for the translational regulation of a limited number of proteins rather than being a global translation factor.

Part of this work aims to identify by ribosome profiling what these specific proteins are and to try to understand how they are similar and how EF-P regulates their expression.

1.5.2 Stable Isotope Labeling of Amino acids in Cell culture (SILAC)

SILAC is a quantitative proteomic method based on mass spectrometry (MS) [89]. Cells are either grown on normal medium (unlabeled amino acids) or on a medium that includes one or more stable isotope labeled amino acid(s). In the latter, while cells are growing their proteome is labeled by the incorporation of the modified amino acid(s).

Equal amounts of labeled and unlabeled cells are combined followed by to generate . Upon MS analysis, there is a distinct mass shift of labeled peptides depending on the type and number of the heavy amino acid used [90].

The altered proteome in the Δefp mutant of Salmonella was investigated by SILAC, which revealed that only a small subset of proteins were significantly misregulated in the

13

absence of EF-P [80]. The most highly misregulated groups had genes involved in metabolic pathways, two-component (stimulus- response) systems, and motility [80]. A similar SILAC experiment was done using a Δefp mutant of E. coli [81] and found that absence of active EF-P leads to down-regulation of proteins related to translation and

ATP synthesis, and upregulation of proteins related to chemotaxis, motility, and amino acid biogenesis. Figure 5 is a Venn diagram of protein categories obtained from the E. coli SILAC data. Only ~ 600 PPXs were identified in the SILAC experiment (this represents ~ 42.4% of the total PPX proteins in E. coli). From these identified PPXs ~ 8.2

% were downregulated in the ∆efp strain while ~ 9% were upregulated in the ∆efp strain.

On the other hand, ~ 82.8 % of the identified PPXs were unaffected by lack of EF-P. This proteomic data indicates that although - on the molecular level - EF-P is needed to promote oligo-proline synthesis, it is not actually altering the production of most of these proteins.

1.6 EF-P and regulation of gene expression

Gene expression is a multistep process, which ultimately converts the sequence encoded in DNA to a corresponding chain of amino acids that form proteins.

At the translational level ribosome stalling is a universally conserved mechanism for regulating gene expression by fine-tuning the rate of protein production for proper protein folding. The amino acids in the stalling sequences interact in the PTC and the ribosomal exit tunnel to decrease the elongation speed or termination rate. This interaction depends on the nature of their side-chains [91, 92]. Some stalling motifs impede peptidyl transfer, such as SecM- like peptides, the FXXYXIWPP sequence [93], arginine attenuator

14

peptides and C-terminal Pro residues [92]. For Pro residues, the prolyl-tRNA, and peptidyl-prolyl- tRNA at the A and P sites exhibit poor reactivity since they hinder peptide bond formation or block termination [92]. An interesting aspect of gene regulation in bacteria is the simultaneous occurrence of transcription and translation such that the translation of an mRNA can start as soon as it is transcribed. One of the consequences of coupling transcription and translation is that changes affecting one of the processes automatically affect the other. This relationship provides a further example of a gene regulatory mechanism unique to bacteria.

Figure 5. Venn diagram of E. coli SILAC data The SILAC data from Δefp mutant of E. coli was used [81] to make this venn diagram. The Box represents the entire E. coli proteome which includes ~ 4300 proteins.

1.7 Coupled transcription and translation in bacteria

In bacteria efficient translation elongation is not only necessary for proper production of proteins, but also plays a key role in maintaining proper coupling of translation to transcription. Coupling ensures transcriptional processivity by preventing RNAP

15

backtracking, serving as an energy control mechanism that both limits the number of extra transcripts and prevents premature termination [94].

Although translation-transcription coupling has been known for decades, its function was ascribed only in specific cases of transcription attenuation and polarity [95]. When a ribosome slows down owing to, for example, amino acid deficiency, the growing gap between moving RNAP and lagging ribosome provides the termination factor Rho with access to the nascent RNA, which results in premature termination of down- stream genes—a phenomenon known as the polarity effect [96]. Such cooperation between the two macromolecules explains the precise match of translational and transcriptional rates under various growth conditions [94, 95]. The ribosome can also be responsible for early transcription termination at certain metabolic , by allowing the formation of an intrinsic termination hairpin in the leader RNA sequences that leads to transcription attenuation [95]. Notably, both attenuation and polarity occur when the ribosome and

RNAP are physically separated. In each case, the effect of a ribosome on RNAP is indirect and mediated by either Rho factor or RNA secondary structures.

1.8 Transcription

Transcription is the process in which RNA polymerase (RNAP) synthesizes an RNA copy of the template DNA. It is the first and often critical step in gene expression [97].

The synthesized RNA can be structural (e.g. rRNA, tRNA) or protein-coding (mRNA).

Both bacterial RNAP and the replisome move along the same DNA template, but RNAP is 20-fold slower. Furthermore, their movement is sometimes in opposite directions [98] making conflict inevitable [99]. Transcription can further be hindered by interactions of

16

RNAP or transcription factors [100-104] with nascent RNA hairpins or DNA-bound proteins [97, 105, 106]. Although these delays are usually temporary [107], nevertheless, they can also lead to backtracking resulting in arrested complexes [108], which can block replication in both directions [97]. Preventing backtracking is required to inhibit double- stranded DNA breaks (DSB) that can form upon codirectional collisions. Since the first trailing ribosome directly assists RNAP, coupling serves as an anti-backtracking mechanism [109]. Consequently, the cell needs to employ various mechanisms to prevent excessive RNAP backtracking when translation is inefficient or absent. Backtracked complexes can be restarted by Gre factors [110], this gives RNAP another chance to pass the roadblock. As a final option, the stalled RNAP can be removed by the action of Mfd or Rho [111]. Rho factor relives backtracking by terminating transcription once it becomes uncoupled from translation [112-114]. If Rho fails, cells also rely on the backup termination factor Mfd which can disrupt ECs arrested by several different mechanisms

[113, 115-117]. Finally, DksA has been shown to be involved in replication fork progression and the DNA-damage response during starvation [118, 119]. DksA could suppress backtracking by plugging the exit for the RNA 3′-OH terminus. [120]

1.9 Modes of transcription termination in E. coli

The choice of transcription stop site depends on features of the nascent RNA sequence

[121]. Two types of mechanisms cause RNAP to stop and dissociate from the template

DNA, intrinsic (Rho-independent) termination and Rho-mediated termination.

17

1.9.1 Intrinsic termination

An intrinsic termination signal is characterized by a hairpin followed by a U-rich sequence in the RNA and does not require auxiliary factors. An intrinsic terminator typically contains a short G:C-rich stem-loop structure, followed by a run of 6–8 U residues [122]. Terminators at the end of genes prevent unintentional transcription into the downstream genes, whereas terminators in the upstream regulatory leader regions can regulate expression of the structural genes in response to metabolic and environmental signals [123, 124]. The mechanism of gene regulation by transcription termination- antitermination is called transcription attenuation. This mechanism enables bacteria to direct RNAP to either terminate transcription or transcribe the downstream genes of an operon in response to a certain metabolic signal [125]. This happens through the selective arrangement of one of two mutually exclusive RNA-secondary structures in the nascent transcript, the antiterminator and the terminator [126, 127]. One of the paradigm systems for control by leader peptide translation occurs in the trp operon in E. coli where the elongating transcription complex is subject to control by transcription attenuation. The trp leader transcript can form three overlapping RNA secondary structures, the pause structure, the antiterminator and an intrinsic transcription terminator. While translating a small open reading frame in the trp leader, the influence of the translating ribosome is critical. At the start of trp operon transcription, RNAP pauses. This pausing allows time for a ribosome to initiate translation of this leader peptide. The ribosome disrupts the paused complex and transcription resumes with the ribosome closely following. This state of coupled transcription and translation is necessary to sense the level of available tryptophan and determine the fate of the operon’s transcription. Two tandem Trp codons 18

strategically placed within the leader peptide will either stall the ribosome (low tryptophan) or translation will continue to the end of the leader peptide (tryptophan excess). In the former scenario, the ribosome stall will effectively uncouple transcription and translation and upon transcription continuing an antiterminator structure forms and prevents the formation of the alternative overlapping terminator resulting in transcriptional read through into the trp structural genes. In the latter scenario with excess tryptophan, efficient translation proceeds and the ribosome continues to the end of the leader peptide thus blocking the formation of the antiterminator structure thereby promoting terminator formation. This terminates transcription prior to the trp structural genes. Another example of transcription attenuation in E. coli is the expression of the pyrimidine biosynthetic operon, pyrBI [128, 129]. The pyr system utilizes coupling of translation to transcription of a leader peptide but it does not involve an alternative antiterminator structure. The limiting parameter is the speed of release from a transcriptional pause. The release rate is related to UTP availability, such that it happens rapidly if UTP is available, and slowly if UTP is limiting. A prolonged pause allows enough time for the leader peptide coding region to initiate translation. This translating ribosome not only releases the paused RNAP to resume transcription but furthermore this tightly coupled ribosome prevents formation of the terminator. On the other hand, if UTP is abundant, the pause is short not permitting translation initiation and in absence of a coupled translating ribosome, the terminator can form and terminate transcription in the leader region preceding the structural genes [130].

19

1.9.2 Rho dependent transcription termination

Rho is a hexameric protein, with an RNA-dependent ATPase and helicase activity. It can bind to RNA emerging out of the elongation complex (EC) at Rho-binding sites, also known as Rho utilization (rut) sites. Upon binding, Rho translocates along the RNA using ATP hydrolysis until it reaches the EC, where it dissociates the EC by either pushing RNAP forward or unwinding the RNA:DNA hybrid using its helicase activity

[131]. Rho-dependent terminators are located at the end of transcriptional units, in regulatory regions preceding genes and within coding sequences. Most identified Rho- dependent terminators are regulated while some are constitutive [132]. Rut sites show a high content of C residues and few Gs. This low G content minimizes RNA secondary structure thus permitting Rho loading onto the transcript. These seem to be the only features required for Rho–RNA binding [121]. Rho also requires a ribosome-free transcript of at least 85–97 nt for its binding [132, 133].

The polarity effect is due to intragenic Rho-mediated transcription termination [134]. The model for polarity proposes that rut sites occur commonly in mRNA sequences but are inactive because translating ribosomes hinder access to Rho but when translation is terminated or slowed down upstream of a rut site, then Rho can bind and promote termination. Therefore under optimal translation conditions the leading ribosome covers the rut site before Rho can anchor to it [121]. Polarity suppression can occur if Rho is defective [135].

20

1.9.3 Rho inhibitors: Psu and bicyclomycin (BCM)

Bicyclomycin (BCM)

BCM is a specific inhibitor of Rho [136]. Biochemical and structural analyses indicate that BCM binds adjacent to the ATPase of Rho [137] . BCM inhibits Rho-dependent termination in vivo [138] and in vitro [137] by noncompetitive inhibition of the RNA- dependent ATPase activity of Rho [139]. ATP hydrolysis is blocked by interfering with a key glutamic acid residue that participates in catalysis [140]. Although high concentrations of BCM is lethal to E. coli K-12 [136], due to the essentiality of rho

[141], sublethal doses of BCM are sufficient to disturb Rho termination in vivo [138].

Polarity suppression protein, Psu

Psu is a unique 21 kDa protein from bacteriophage P4. In P4 it is a non-essential capsid decoration protein [142] that inhibits Rho-dependent termination specifically and efficiently both in vivo [143] and in vitro [144]. Psu interacts with Rho and disturbs its

ATP binding and RNA-dependent ATPase activity, leading to the decrease of RNA release rate from the elongation complex [144]. Psu has no effect on intrinsic termination

[145].

In this work we investigated in more detail the role of EF-P in gene expression first by using the approach of ribosome profiling (ribo-seq). This technique can monitor local translation rates on a genome-wide scale. We utilized it to search for novel EF-P dependent targets (chapter 2). In chapter 3, the results obtained by ribo-seq were further validated by an in vivo reporter system and other aspects of translation were investigated.

21

Since translation is coupled to transcription in bacteria, we investigated in chapter 4 the effect EF-P can have on transcription due to reduced translation elongation across EF-P dependent motifs.

22

Chapter 2: Investigation of the role of EF-P using ribosome profiling

2.1.Introduction

Ribosome profiling (ribo-seq) is a genome-wide, quantitative analysis of in vivo translation by deep sequencing. It can map the precise positions of ribosomes on transcripts by nuclease footprinting. The mRNA fragments protected from nuclease are converted into a DNA library suitable for deep sequencing [146, 147]. The level of translation of a gene is indicated by the abundance of footprint fragments corresponding to that gene in the deep sequencing data sets. Moreover, these footprints reveal the precise regions of the that are translated. Although overall mRNA levels can be obtained using microarrays and RNA-seq, those methods only provide an indication of transcriptional level regulation and fail to offer information regarding translational control. Ribosome profiling can directly monitor translation since a translating ribosome occupies an ~28-nt portion of the mRNA transcript and will protect it upon nuclease digestion. The produced footprint fragments indicate which mRNA was being translated and provides positional information on the precise location of the ribosome on that transcript. The process includes many steps (Figure 6) including cell harvesting, cell lysis, nuclease digestion to produce footprints, footprint purification and ligation to linkers, reverse transcription, circularization and finally PCR amplification to create a deep sequencing library [148]. mRNA abundance measurements also need to be

23

obtained to distinguish changes in protein expression that cannot be explained solely by changes in transcript levels (translational regulation). This is done by processing a randomly fragmented complete mRNA sample in parallel with the ribosome footprint

[146, 147].

The high precision tool of ribosome profiling was utilized to monitor gene expression at the level of protein synthesis. Investigation of the genome-wide differences of local translation rates between wild-type and the Δefp strain could help explain the varying expression levels of protein and could identify additional transcripts that were not know to be affected by EF-P

Figure 6. Overview of ribosome profiling steps Ribosome profiling involves many steps. Rapid cell collection, cell lysis, nuclease digestion, footprint fragment purification, linker ligation, reverse transcription, circularization, PCR amplification and finally sequencing and analysis of the cDNA library (Adapted from [148])

24

2.2. Results

Ribo-seq was performed for wild type E. coli, ∆efp and ∆efp complemented strains (∆efp pEF-P, complemented with a plasmid expressing efp). Cells were harvested at mid-log phase and collected by rapid filtration followed by rapid freezing in liquid nitrogen [149].

Nuclease-treated (footprints) and untreated (total) mRNA samples were processed for each of the strains. The correlation between the two biological replicates for each strain was between 96–98% (Figure 7). EF-P prevents translational pausing during synthesis of some polyproline-containing proteins [19]. E coli has over 2000 PPX motifs encoded in its genome, of which 913 had significant reads in our ribo-seq data (i.e. with a coverage of at least 3 sequencing reads per codon). Translational pauses cause the accumulation of ribosomes at the pausing site and increased density in ribosome profiling (i.e. a significant increase in sequence reads at the pause site compared to reads obtained at neighboring regions of the same transcript). It has been previously observed that at strong pauses such accumulations produce at least a ten fold increase in ribosome density at the pause site when compared to the full gene ribosome density [15].

25

Figure 7. Correlation for biological replicates of ribosome profiling data Samples from Wt, Δefp and the complemented strains (Δefp pEF-P). Our experiment resulted in an average of 12.2 million reads/sample. Scatter plots show correlation of total footprints/gene (log10) for the biological replicates. Sara Elgamal conducted the Experiment. The data analysis was started by Dr. Peter White and completed by Dr. Assaf Katz.

We analyzed the pausing tendency of each PPX site by measuring the ratio of ribosome density between the PPX and the full gene and refer to this as the pausing index . In the wild-type strain only 14.6% of the PPX motifs had a pausing index above 2 compared to

50.4% of the PPX motifs in the ∆efp strain (Figure 8A).

26

Figure 8. Translation pausing at PPX sequences. A) Histogram of pausing index of PPX sequences in WT (dark gray) and Δefp strains (light gray). B) Histogram of changes in pausing index of PPX after efp deletion. C) Example of two genes bearing different ribosome occupancies for a similar PPX sequence. Left, malE contains a PNPPK sequence that does not produce translation pausing and right ubiD contains a PEPPK sequence that produces a pause in translation. Insets show the ribosome occupancies for the full genes. Black and red are used for the two WT strain replicates, green and blue for the Δefp replicates, while cyan and magenta are used for the complemented strain replicates. Values in the figure are normalized to the corresponding gene ribosome occupancy average. Dr. Assaf Katz did data analysis for this figure.

27

By more stringent criteria, only 0.22% of PPX motifs had a pausing index higher than 10 in the wild-type strain compared to 2.8 % in the ∆efp strain. Table 1 shows the 26 PPX motifs where a diprolyl or triprolyl sequence had a pausing index higher than 10 in the

∆efp strain. Table 2 shows non-pausing motifs, defined as PPXs that had ribosome densities equal to or lower than the gene average.

Proteomic data from SILAC showed that not all of these proteins had a significant difference in protein levels between WT and ∆efp strains [81]. Although both experiments were performed using different growth media, comparison of these data sets suggest that E. coli can compensate for decreased translation efficiency by other mechanisms related to changes in mRNA levels or protein stability. Table 3 shows the pausing index for the 16 proteins that both contain a PPX sequence and displayed at least three fold higher protein abundance in wild-type vs. the ∆efp strain in the E. coli SILAC dataset [81].

Although ∆efp strains showed increased ribosome occupancy at most of the PPX- encoding sequences (Figure 8B), only a small subset of these genes had a pausing index high enough above the threshold to be considered as strong pauses (i.e. having a pausing index 10 fold above the gene average [150] (Figure 8A)). This variability holds true for all PPX patterns including many of the PPP or PPG sequences that have been reported to produce strong translation pauses in the absence of EF-P [18, 19, 80]] (Figure 9).

Notable examples include ubiD and malE which both have a PPK sequence, but only translation of ubiD pauses at this position in the ∆efp strain (Figure 8C). This indicates that other factors influence the tendency of the ribosome to pause at a particular PPX

28

sequence. To investigate what other sequence determinants contribute to pausing we compared the strong EF-P dependent pauses (defined as regions with a pausing index of at least 10 [150] in the ∆efp strain) with the PPX sequences that have the lowest pausing index in the ∆efp strain (with a pausing index equal to or below 1). 31 EF-P dependent pausing sites were identified, 26 of which (distributed in 22 genes) contained a PPX motif (Table 1). The five other EF-P alleviated pauses contained no PPX motif (Table 4), consistent with our previous observation of EF-P mediated relief of non-PPX pauses such as the GSCGPG motif in the poxB gene [80].

Figure 9. Comparison of pausing index for each PPX. Box plot comparing the PPX pausing indices of every possible PPX combination. Pausing indices was calculated for all PPX with significant amount of reads in the ribo- seq. Distribution of all the pausing indices found for each PPX are shown in a box plot (average as a small square, median as the middle line of the box, box limits represent 25th and 75th percentiles, higher and lower values are marked with an “x” symbol). Dr. Assaf Katz did data analysis for this figure.

29

Table 1. EF-P dependent pauses that contain a PPX sequence Pausing index *2 SILAC Gene *1 PPX at average E. Upstream sequence Complemented pause WT ∆efp coli WT/ (∆efp pEF-P) ∆efp*3 sgrR PPD MRGLRMNTLGWFDFKSAWFA 10.75 15.77 13.79 NA pcnB PPD ERNAELQRLVKWWGEFQVSA 4.03 21.09 11.09 1.3 dnaE PPD DVGRVLGHPYGFVDRISKLI 0.79 12.47 8.66 1.10 ydcR PPG NRSLAQVSKTATAMSVIENL 1.61 10.95 4.62 NA rsxC PPE LKLFSAFRKNKIWDFNGGIH 3 32.97 17.87 NA fliI PPS AMAQREIALAIGEPPATKGY 3.87 18.78 14.82 1.93 fliP PPN FTRIIIVFGLLRNALGTPSA 0 16.65 11.46 1.09 lepA PPE SAKTGVGVQDVLERLVRDIP 2.12 13.3 5.1 3.59 lepA PPP CSAKTGVGVQDVLERLVRDI 2.17 13.35 5.18 visC PPQ SGLRVAVLEQRVQEPLAANA 2.93 12.39 8.61 2.03 trmL PPN ------MLNIVLYEPEI 0.69 11.62 5.17 NA rfaP PPD TEDLTPTISLEDYCADWAVN 0.55 10.79 3.57 1.52 PPP PPELSQGMMTLPEALRTLHR 1.07 18.57 6.62 recG 2.08 PPT PELSQGMMTLPEALRTLHRP 1.07 18.45 6.85 mnmG PPS FLDGKIHIGLDNYSGGRAGD 2.72 10.57 6.5 2.54 rffT PPN LRAVHQQFGDTVKVVVPMGY 0.63 12.67 2.86 NA PPD EQSMIEALKTILGKMHQDAA 1.21 11.78 2.89 cyaA 1.17 PPK ETQRHYLNELELYRGMSVQD 3.46 11.32 7.5 ubiB PPD AFFNRDYRKVAELHVDSGWV 0.98 20.93 8.45 1.39 ubiD PPK EDVSALREVGKLLAFLKEPE 1.05 13.88 7.62 1.53 nfi PPD RAQQIELASSVIREDRLDKD 1.35 11.24 9.51 NA rnr PPD EAGVGFVVPDDSRLSFDILI 1.52 12.1 5.2 0.69 PPP IREGLKALGYYQPTIEFDLR 1.23 23.2 7.92 ytfM 3.15 PPK REGLKALGYYQPTIEFDLRP 1.4 23.11 8.04 mgtA PPS SRLVHRDPLPGAQQTVNTVV 6.86 16.32 12.1 0.94 yjhB PPQ ------MATAWYKQVN 10.56 45 47.34 NA

1 * : Genes with a PPPX pattern are introduced twice to account for the ribosome density at P1P2P3 and 2 3 P2P3X. * : Values correspond to averages of two independent samples.* : Values from SILAC [151].

30

Table 2. List of PPX sequences that do not produce a translation pause. Pausing index *3 Gene *1 PPX at *2 Upstream sequence Complemented pause WT ∆efp (∆efp pEF-P) yaaA PPQ TLPELLDNSQQLIHEARKLT 0.27 0.45 0.67 frr PPL MASDLGLNPNSAGSDIRVPL 0.65 0.42 0.58 lpxA PPY FCIIGAHVMVGGCSGVAQDV 0.15 0.56 0.51 gpt PPI VDIPQDTWIEQPWDMGVVFV 0.75 0.70 0.68 tesB PPV PAPDGLPSETQIAQSLAHLL 0.50 0.88 0.69 aroG PPV ---MNYQNDDLRIKEIKELL 0.16 0.67 0.41 ybiB PPL QAQAKLDEHQPVFMPVGAFC 0.68 0.58 0.95 pflA PPK ELGKHKWVAMGEEYKLDGVK 0.58 0.93 0.84 rlmI PPA VAVRKKEGMELTQGPVTGEL 0.24 0.93 0.21 cbpA PPK KGLVSKKQTGDLYAVLKIVM 0.62 0.49 1.22 ycdY PPY DMTQVSADYNALFIGDECAV 0.31 0.89 0.55 flgA PPR RGGKLEAGNVKLKRGRLDTL 0.39 0.89 0.29 ndh PPR PDIYAIGDCASCPRPEGGFV 0.21 0.32 0.56 phoP PPF ARMQALMRRNSGLASQVISL 0.46 0.51 0.70 galU PPH EAMLEKRVKRQLLDEVQSIC 1.07 0.71 0.74 topA PPK KMGIRTASTGVFLGCSGYAL 0.52 0.35 0.92 tyrR PPL QKGMFREDLYYRLNVLTLNL 0.17 0.89 0.90 rstA PPA DMNHILALEMGACDYILKTT 0.08 0.34 0.09 yniA PPR PKVWAVGADRDYSFLVMDYL 1.19 0.36 0.16 nadE PPI PQQVARTIENWYLKTEHKRR 0.18 0.83 0.37 yeaK PPF ASLASPAEVDELTGCVFGAI 0.06 0.20 0.07 aspS PPH NEEEQREKFGFLLDALKYGT 0.32 0.89 0.84 gyrA PPH IPNLLVNGSSGIAVGMATNI 0.31 0.61 0.70 glpT PPL KERGGIVSVWNCAHNVGGGI 0.11 0.04 0.00 glpT PPC SIAVMFVLLFLCGWFQGMGW 0.05 0.27 0.20 glpT PPI ILVALFAFAMMRDTPQSCGL 0.12 0.02 0.02 yfbV PPA ALSLPMQGLWWLGKRSVTPL 0.73 0.59 0.80 pta PPL QDEVDGLVSGAVHTTANTIR 1.03 0.30 0.80 Continued

31

Table 2 continued prmB PPY IRSDLFRDLPKVQYDLIVTN 0.42 0.87 0.56 sixA PPM VISHLPLVGYLVAELCPGET 1.33 0.57 0.77 maeB PPL KKMARAPMILALANPEPEIL 0.32 0.81 0.67 dapE PPF LAFAGHTDVVPPGDADRWIN 0.66 0.69 0.64 srmB PPA LEALQDKGFTRPTAIQAAAI 0.01 0.57 0.32 ung PPQ YFLNTLQTVASERQSGVTIY 0.38 0.83 1.04 sdaC PPP YSVAITNTVESFMSHQLGMT 1.65 0.75 1.11 sdaC PPR SVAITNTVESFMSHQLGMTP 1.46 0.59 0.86 ygfZ PPR ------MAFTPF 0.57 0.35 0.18 gcvP PPV ASRNKRFTSYIGMGYTAVQL 0.44 0.80 0.84 speA PPM YIDGDGIATTMPMPEYDPEN 0.47 0.72 0.32 hybO PPV MGLSSKAAAEMAESVTNPQR 0.21 0.92 0.87 yqhD PPR HSAHVQPVFAVLDPVYTYTL 0.04 0.75 0.35 parE PPL ALFVKHFRALVKHGHVYVAL 0.14 0.59 0.13 yqiC PPT ISELENRSTEIKKQPDPETL 0.09 0.59 0.40 gltB PPF KLLETAEPHPGKALYCTENN 0.19 0.30 0.20 gltB PPI SSQPRIIYDYFRQQFAQVTN 0.17 0.92 0.38 aroK PPR EKQLARTQRDKKRPLLHVET 0.09 0.42 0.42 pck PPV SKAGHATKVIFLTADAFGVL 0.69 0.87 0.43 yhiR PPY FQQLKAKLPPVSRRGLILID 0.18 0.99 0.22 yibT PPR DKAVDFMASSQAFREYLKKL 0.34 0.33 0.23 lldD PPF ---MIISAASDYRAAAQRIL 0.42 0.63 0.44 dfp PPF AAARRGANVTLVSGPVSLPT 1.00 0.70 0.71 ibpA PPY GFDRLFNHLENNQSQSNGGY 1.03 0.87 1.40 ilvE PPF YISEGAGENLFEVKDGVLFT 0.70 0.77 0.20 rho PPK PSQIRRFNLRTGDTISGKIR 0.64 0.64 0.67 fdhE PPL IRIIPQDELGSSEKRTADMI 0.04 0.46 0.15 cytR PPM MLLLGSRLPFDASIEEQRNL 0.33 0.72 0.51 cytR PPT GDFTFEAGSKAMQQLLDLPQ 0.44 0.68 0.87 rplK PPA PIPVVITVYADRSFTFVTKT 0.76 0.41 0.43 Continued

32

Table 2 continued

malE PPK AYPIAVEALSLIYNKDLLPN 0.14 0.47 0.27

malK PPA GLETITSGDLFIGEKRMNDT 0.83 0.73 0.53

ssb PPM FSGGAQSRPQQSAPAAPSNE 0.52 0.71 0.82

ytfB PPM KPTLEKVWHAPDNFRFMDPL 0.85 0.47 0.58

fklB PPF WELTIPQELAYGERGAGASI 0.88 0.84 0.97 nrdD PPL HLDVEKKVNPYDKIDFEAPY 0.39 0.24 0.37 yjjK PPK -----MAQFVYTMHRVGKVV 0.40 0.88 0.62 rob PPL FAQTPALYRRSPEWSAFGIR 0.28 0.78 0.56

*1: Only PPX sequences that present ribosome densities equal or lower than the gene average were considered in this list. Also genes where the PPX sequence was immediately down stream of a translation pause were eliminated from the list to prevent the inclusion of genes where a pausing PPX was masked by the inability of ribosomes to pass a previous pause. 2 * : Genes with a PPPX pattern are introduced twice to account for the ribosome density at P1P2P3 and P2P3X *3: Values correspond to averages of two independent samples Dr. Assaf Katz assembled this table

To investigate what other sequence determinants contribute to pausing we compared the strong EF-P dependent pauses (defined as regions with a pausing index of at least 10

[150] in the ∆efp strain) with the PPX sequences that have the lowest pausing index in the

∆efp strain (with a pausing index equal or below 1). 31 EF-P dependent pausing sites were identified, 26 of which (distributed in 22 genes) contained a PPX motif (Table 1).

The five other EF-P alleviated pauses contained no PPX motif (Table 4), consistent with our previous observation of EF-P mediated relief of non-PPX pauses such as the

GSCGPG motif in the poxB gene [80].

33

Table 3. List of PPX-containing proteins identified by SILAC to be 3-fold or more abundant in the wild-type vs. ∆efp strain and their corresponding pausing index from ribo-seq

Pausing index *1 SILAC average PPX at Gene E. coli WT/ Upstream sequence pause Complemented ∆efp*1 WT ∆efp (∆efp pEF-P)

atpD PPG 5.18 EMTDSNVIDKVSLVYGQMNE 0.62 5.04 3.17 cadA PPL AEDIANKIKQTTDEYINTIL NA NA NA 6.78 PPG EEVYLDEMVGRINANMILPY hrpA PPA AQMLKVENVQQAWQQWINKL 0.00 0.00 0.00 3.73 PPK YTGARNARFSIFPGSGLFKK 1.17 3.80 2.39 lepA PPE SAKTGVGVQDVLERLVRDIP 2.12 13.3 5.1 3.59 PPP CSAKTGVGVQDVLERLVRDI 2.17 13.35 5.18 malQ PPD GGAETWCDRELYCLKASVGA 2.12 4.20 4.97 3.49 PPM KASVGAPPDILGPLGQNWGL 0.23 0.69 0.38 mcrB PPM MLDNIINDYKLIFNSGKSVI 1.20 0.52 1.34 3.01 PPG TTIETILKRLTIKKNIILQG 0.00 1.63 0.20 nadC PPR 3.42 ------M 1.99 1.79 1.81 pdxB PPR 3.36 VGRRLQARLEALGIKTLLCD 0.94 4.23 3.60 pncA PPR 4.04 ------M 0.35 1.59 2.43 pyrC PPV 5.32 VVPYTSEIYGRAIVMPNLAA 0.35 2.77 1.45 rnb PPQ 7.01 ATEKGFGFLEVDAQKSYFIP 1.24 7.47 2.93 ycgR PPT TVEQLQQSEYLQLPAFITVP 9.19 1.29 1.10 6.58 PPY PPTLWFVQRRRYFRISAPLH 0.00 1.47 1.42 ygdH PPE 3.28 ENFDINVLRRERGVKLELIN 0.15 3.13 1.09 PPN QRYKDSRFIGMTEPSIIAAE 0.46 3.60 3.02 yhgF PPK 3.48 RGRNEGVLQLSLNADPQFDE 0.61 3.81 3.00 yjjK PPG FEELNSTEYQKRNETNELFI 0.23 3.10 1.47 3.47 PPK -----MAQFVYTMHRVGKVV 0.40 0.88 0.62 ytfM PPP IREGLKALGYYQPTIEFDLR 1.23 23.2 7.92 3.15 PPK REGLKALGYYQPTIEFDLRP 1.4 23.11 8.04

*1:These values from SILAC by Peil et al., [81] 34

Table 4. List of EF-P dependent pauses that do not contain a PPX sequence Pausing index *3 SILAC average 1 Sequences examined Gene * 2 Upstream sequences WT ∆efp Complemented E. coli at pause * (∆efp pEF-P) WT/ ∆efp *4

dcuA *5 MLVVE ------12.28 20.2 10.18 NA

nanT SAAWLG TTTQNIPWYRHLNRAQWRAF 6.89 13.6 2.01 NA AQWRAFSAAW MSTTTQNIPWYRHLNR

putA QIAAALAA PLGPVVCISPWNFPLAIFTG 0.58 18.92 8.79 0.26 IFTGQIAAAL ETHRPLGPVVCISPWNFPLA

valS WEKQGYFKPN MEKTYNPQDIEQPLY 0.68 11.12 2.77 5.75 EHWEKQGYFK MEKTYNPQDIEQPLYEH

ycbZ WFIPN# AQASQQEGRHRFPWPLRWLN 4.96 11.91 8.27 1.14 LRWLNWFIPN# IQERIAQASQQEGRHRFPWP

*1: Only genes where both samples from the ∆efp strain show a pause and where the complemented strain present a decrease of ribosome occupancies at the pausing site were included. *2: “#” represents a stop codon. *3: Values correspond to averages of two independent samples. *4: These values from SILAC by Peil et al., [81] *5: dcuA pausing site includes the translation and thus, it doesn't have any upstream amino acid sequence.

2.3.Discussion

2.3.1 Ribosome profiling and protein synthesis

A challenging aspect of analyzing ribosome profiling data is that increased density of ribosome footprints can indicate many ribosomes actively translating a transcript or an increased translation time [148]. The ribosome profiling data introduced here was compared with the available proteomic data obtained by SILAC [81] of wild type E. coli and ∆efp strains. Upon comparing total footprints/gene ratios for ∆efp/WT strains

35

(obtained by ribo-seq) to differences in protein abundance for ∆efp/WT strains detected by SILAC, most proteins (77%) seem unaffected, with only 2 % of proteins showing a greater than 2-fold increase in both datasets (Figure 10).

This comparison also showed that 5.5 % of the proteins having 2-fold higher protein abundance as detected by SILAC show decreased or unchanged ribosome occupancies from ribo-seq. This could be a result of many factors including differences in protein half-life or mRNA abundance between WT and ∆efp strains. It is particularly interesting to note that synthesis of RNase II, which plays a critical role in mRNA turnover, is highly

EF-P dependent. Ribo-seq showed that the PPQ motif in rnb (encoding RNAse II) had a

7.4 pausing index (Table 3) while SILAC detected RNase II to be 7 times more abundant in the WT versus the ∆efp strain [81].

The differences between Ribo-Seq and SILAC may also be influenced by inherent biases of the ribosomal profiling method [152]. The protocol involved in generating footprints captures short mRNA fragments covered by exactly one ribosome. It remains possible that mRNA fragments with very closely located ribosomes (as we expect near pausing sites) could be lost as has been previously shown for other pauses [153]. Moreover, determining pausing sites by computing motif reads divided by gene average [150] can be misleading when the gene has more than one pausing motif or when the pausing motif is at the start of the gene. In both of those cases the average reads/gene would probably be inaccurate.

36

Figure 10. Comparison of SILAC and rib-seq data. Pie chart comparing proteins identified in our ribo-seq experiment (cutoff 70 footprint reads/gene) and also present in Peil et al., SILAC dataset .Ribo-seq data (ratio between Δefp and WT footprints/gene) was compared with SILAC data (protein abundance ratio between Δefp and WT). In 77% (800 out of 1039) of the proteins, the ratio for both datasets was between 0.5–2. For 7.5% of the genes, there was more than 2 fold higher total footprints/gene in Δefp vs. WT, about one fourth of them also had above 2 folds more protein abundance in Δefp vs. WT. While 7.3% of the genes had less had less than

37

2.3.2 The role of EF-P in integrating different signals to regulate translation

We were surprised that only 2.8 % of the PPX motifs detected by ribosome profiling had a pausing index of 10 or more (the threshold considered as strong pausing) [150].

Although, the remainder of the PPX motifs may not be pausing translation, the observed increase in ribosome density likely reflects that many PPX motifs can still slow translation. This might offer some explanation for the increased polysome retention previously observed for E. coli ∆efp strains [154]. Similar effects are also observed after addition of to E. coli cultures [154] or depletion of eIF5A (EF-P paralog) in eukaryotic cells [155-157]. These observations suggest that EF-P enhances the translation speed of several slightly slower segments of mRNA that collectively would have an important effect on global translation dynamics. Part of this translation enhancement may not come directly from EF-P binding to the ribosome, but from the release of tRNAPro that is trapped in other stalled ribosomes.

Both ribosome profiling and SILAC data show that EF-P directly affects the synthesis of several key components of translation, and therefore the loss of EF-P may have a broad but indirect impact on protein synthesis. YjjK (or EttA, Energy-dependent translational throttle A [158]) has recently been shown to be sensitive to the ATP/ADP ratio in the cell. EttA can control the progression of 70S ribosome initiation complexes into translation elongation and thus alter protein synthesis in energy-depleted cells [158, 159].

EttA has 2 PPX motifs, PPG and PPK, and in the ribo-seq data PPG caused a pausing index of 3.1 in the ∆efp strain, a ~14 fold increase compared to WT (Table 3). Similarly,

SILAC data showed that Etta is 3.5 and 15.4 times more abundant in WT than in efp deletion strains in E. coli and Salmonella, respectively [80, 81]. In view of the fact that 38

components of ATP synthase and also Etta are affected by the loss of EF-P [80, 81], it is conceivable that loss of EF-P may perturb the energy state of the cell, which may in turn contribute to the growth defect of the ∆efp mutant. Our ribo-seq data also shows that another translation factor, LepA, has 2 PPXs that both caused a pausing index higher than

13.3, and SILAC data showed a significant change in the WT/∆efp ratio of 3.6. (Table 3

[81]). Changes in the levels of other proteins observed in the E. coli SILAC data [81] could further affect protein translation. Examples of this are RaiA (a translation inhibitor and ribosomal stability enhancer [160] that shows 8 fold increase in the ∆efp strain), Sra

(a protein of unknown function that binds 30S ribosomal subunits during stationary phase

[161] and shows a ~2 fold increase in the ∆efp strain) and several proteins involved in tRNA processing (RNaseII (Rnb) [162]), modification (MnmE, MnmG, SelU [163, 164]) or aminoacylation (LysU, ValS [165]) that present 2- to 7-fold reduced levels in the ∆efp strain. In addition, SILAC data shows a ~2 fold decrease in the levels of the chaperone

HslU in the ∆efp strain. As this chaperone is also part of the HslVU [166], changes in its levels could affect protein stability. Additionally, a 3 fold increase in the levels of HchA(Hsp31) in the ∆efp strain could also produce changes in protein stability as this chaperone has been proposed to have some proteolytic activity [167].

Altogether, changes in the levels of proteins involved in protein synthesis and stability could explain some of the differences we observe between ribo-seq and SILAC data.

Additionally, the effects observed here on cellular levels of translation factors such as

EttA and LepA, on proteins modifying protein stability such as HslU and on proteins expected to influence mRNA turnover such as RNaseII, suggest a broad role for EF-P in

39

integrating and balancing different inputs that determine the efficiency of protein synthesis.

2.4.Materials and methods

2.4.1 Strains

E. coli BW25113 (Wild type) and Δefp E. coli strains were obtained from the Keio collection. For the Δefp strain, the kanamycin cassette was removed via pCP20-encoded FLP recombinase and was confirmed by PCR [36, 168].

The Δefp E. coli complemented strain was constructed by introducing the efp open reading frame in trans on the arabinose-inducible vector pBAD (Δefp pEF-P strain).

2.4.2 Ribosome profiling (ribo-seq)

Cell Growth and Harvest :Saturated cultures of wild type E. coli BW25113, ∆efp and the complemented strain were diluted to an OD600 nm of 0.01 in 200 mL of Luria broth medium. The media for the ∆efp complemented strain was supplemented with 0.02 % arabinose. Strains were grown at 37 °C, 250 rpm to OD600 nm 0.4-0.5. Cells were harvested by rapid filtration [149] through a prewarmed 0.45 µm nitrocellulose membrane. The cells were scraped onto a pre-warmed spatula then directly submerged in liquid nitrogen. The frozen cells were dislodged into 0.65 ml of cold lysis buffer (20 mM

Tris-HCl pH 8.0, 10.5 mM MgCl2, 40 U/µl RNase Inhibitor (Roche), and 100 U/ml

Turbo® DNase (NEB)) [154] and re-chilled in liquid nitrogen.

40

Lysate Preparation: The harvested cells were lysed by freeze/thaw 3 times then spun down at full speed for 10 min at 4ºC. The clarified supernatant was immediately frozen in liquid nitrogen and stored at 80ºC. This lysate was used to prepare footprint and total mRNA samples.

2.4.3 Preparation of ribosome footprint samples

As previously described [149], the frozen lysate after thawing on ice was supplemented with 5 mM CaCl2. Digestion was carried out at 25ºC in a thermomixer (Eppendorf) shaken at 550 rpm with 750 U micrococcal nuclease (Roche, catalog no. 10107921001) for each 0.5 mg RNA. Digestions were quenched after 1 hr with EGTA to a final concentration of 6 mM. Both digested and control samples were loaded onto 10/45% sucrose gradients (buffered in 10 mM MgCl2, 100 mM NH4Cl, 20 mM Tris pH 8.0, 2 mM DTT). The gradients were separated by ultracentrifugation at 35,000 rpm with SW41 rotor for 2.5 hr at 4ºC. Fractions were harvested from the gradient and monosomes were manually collected by monitoring OD254 nm.

2.4.4 Preparation of fragmented mRNA samples (RNA-seq samples)

As previously described [148], total RNA for each strain was prepared from the same lysate used for ribosome footprinting. Ribosomal RNA was removed from the total RNA with MICROBExpress (Ambion). Small RNA was removed with MEGAclear (Ambion).

The mRNA was randomly fragmented by partial alkaline hydrolysis as previously described [146].

41

2.4.5 Library Generation

Ribosome footprints and the fragmented mRNA samples were converted to a cDNA library as previously described [146, 149]. Samples were first denatured using SDS to a final concentration of 1% (w/v). Following that, the denatured samples were extracted once with 1 volume of hot acid phenol then once with 1 volume of acid phenol and finally with 1 volume of chloroform–IAA (24:1). The precipitated yield was resuspended in 20 µl of 10 mM Tris pH 7.0. To resolve, 25 mg of RNA was mixed with 2x TBE-urea sample loading buffer (Invitrogen) and loaded on a 15% TBE-urea gel. The gel was run in 1x TBE at 200V then stained for 3 min in SYBR Gold. With the 10 bp ladder

(Invitrogen), a band between 28 to 42 bp was excised, gel purified and resuspended in 10 mM Tris pH 7.0. Dephosphorylation of the 3 ́ ends of the RNA was done by adding T4

PNK (NEB) for 1 hr at 37ºC, followed by heat inactivation. The precipitated RNA was resuspended in 10 µl of 10 mM Tris pH 7.0. From this, 5 pmol (quantified by Bio-

Analyzer) of RNA was diluted to 5 µl in 10 mM Tris pH 7.0 and ligated for 2.5 hr at

37ºC to 1 µl of 1 µg/µl Linker-1 (5 ́_App/CTGTAGGCACCATCAAT/3 ddC_ 3 ́,

IDTDNA) with 8 µl of 50% sterile filtered PEG MW 8000, 2 µl of 100% DMSO

(Sigma), 2 µl of 10x T4 RNA buffer (NEB), 1 µl of 40 U/µl RNase Inhibitor

(Roche), and 1 µl of T4 ligase 2, truncated (NEB). The ligated products were precipitated and resolved on a 10% TBE-urea gel in 1x TBE at 200V. Using the 10 bp ladder, a band between 45 and 60 bp was excised. After gel extraction, the products were reverse transcribed using Superscript III (Invitrogen) in a 20 µl reaction volume at 50°C for 30 min [148] with no more than five molar excess of Reverse transcription primer, 5′-

(Phos)- 42

AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGC-

(SpC18)-CACTCA-(SpC18)-TTCAGACGTGTGCTCTTCCGATCTATTGATGGTGC

CTACAG-3′. NaOH was added to hydrolyze RNA products were (final concentration of

0.1 mM) and incubated at 95°C C for 15 min. A 10% TBE-urea gel was used to resolve the reverse transcribed cDNA. A band between 165 to 180 bp was excised using 10 bp ladder as the standard. After gel purification, the product was resuspended in 15 µl of 10 mM Tris pH 8.0 and circularized with CircLigase (EPICENTRE) at 60°C for 1 hr. To inactivate the enzyme, it was heated at 80°C for 10 min. The circDNA was PCR amplified as previously described [148] with the Forward library PCR primer, 5′-

AATGATACGGCGACCACCGAGATCTACAC-3′ and an Indexed reverse library PCR primers, 5′-CAAGCAGAAGACGGCATACGAGATNNNNNN GTGACTG GAGTTCA

GACGTGTGCTCTTCCG-3′ (NNNNNN indicates the reverse complement of the index sequence discovered during sequencing. The twelve forward-strand barcode sequences in

Table 5). The PCR was done for 7 to 12 cycles using Phusion polymerase (NEB) and the products were resolved on an 8% polyacrylamide gel in 1x TBE at 180V. Using the 10 bp ladder as the standard, a band between 165 to 180 bp was excised. The purified product was resuspended in 10 µl of 10 mM Tris pH 8.0 and quantified by BioAnalyzer (high sensitivity DNA kit, Agilent). The libraries were sequenced on HiSeq2500 platform from

Illumina.

43

Table 5. Indexed library PCR primers

Forward Indexed reverse library PCR primer (5`→3`) * index (5`→3`)

ACGACT CAAGCAGAAGACGGCATACGAGATAGTCGTGTGACTGGAGTTCAGACGTGTGCTCTTCCG

ATCAGT CAAGCAGAAGACGGCATACGAGATACTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCG

CAGCAT CAAGCAGAAGACGGCATACGAGATATGCTGGTGACTGGAGTTCAGACGTGTGCTCTTCCG

CGACGT CAAGCAGAAGACGGCATACGAGATACGTCGGTGACTGGAGTTCAGACGTGTGCTCTTCCG

GCAGCT CAAGCAGAAGACGGCATACGAGATAGCTGCGTGACTGGAGTTCAGACGTGTGCTCTTCCG

TACGAT CAAGCAGAAGACGGCATACGAGATATCGTAGTGACTGGAGTTCAGACGTGTGCTCTTCCG

CTGACG CAAGCAGAAGACGGCATACGAGATCGTCAGGTGACTGGAGTTCAGACGTGTGCTCTTCCG

GCTACG CAAGCAGAAGACGGCATACGAGATCGTAGCGTGACTGGAGTTCAGACGTGTGCTCTTCCG

TTAGGC CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCG

TGACCA CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCG

ACAGTG CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCG

GCCAAT CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCG

* Underlined sequence indicates the reverse complement of the index sequence recognized during Illumina sequencing [148].

44

2.4.6 Next Generation Sequencing and alignment to E. coli genome

The final DNA libraries were validated with Agilent 2100 Bioanalyzer using Agilent

High Sensitivity DNA Kit, and the library concentrations were determined by Q-PCR using KAPA SYBRR Fast qPCR kit. The libraries were then run on Single End flowcell on HiSeq 2000 (Illumina) and 52 bp reads were generated. Between 8 to 15 million raw sequence reads were generated for each sample.

All reads were trimmed in a two step process to first remove the adapter sequence

“CTGTAGGCACCATCAAT”, discarding reads shorter than 25 nucleotides, and second to remove the first base from every read which was added during the library construction process. This was performed using the FASTX-Toolkit

(http://hannonlab.cshl.edu/fastx_toolkit/). This trimming process removed an average of

24% of the reads per sample. The remaining 7-11 million reads were aligned using bwa

(0.6.2) to the E. coli K12 MG1655 reference genome (Genebank version U00096.2)

[169]. All reads that could not be uniquely mapped to the reference sequence were removed from subsequent analysis, leaving 1.6 – 2.7 million uniquely mapped reads per sample. A TDF file was created for each sample for visualization in IGV, which was scaled to reads per 10 million data using bedtools (2.17.0) [170] and igvtools (2.3.3)

[171]. A coverage file, describing the coverage for each feature in the K12 genome, was created using bedtools. Only ORF that present an average of at least one read per position were considered for further analysis.

For RNA-Seq analysis, coverage was normalized and the means of the WT, ∆efp and complemented groups were tested for significant differences using the binomial test in the R package DESeq (1.10.1), producing fold changes and adjusted p-values for each 45

feature [172]. Resulting p-values were adjusted for multiple testing with the Benjamin-

Hochberg procedure, which controls false discovery rate (FDR). The data has been deposited in the NCBI Sequence Read Archive (SRA) BioProject no. PRJNA241328.

2.4.7 Analysis of ribosome densities at PPX sequences and Pausing sites detection

Translation pausing sites were searched on the ribosome profiling data using costume analysis scripts written in Perl. In outline, ribosome occupancies were normalized as a ratio of the full ORF average ribosome occupancy. Strong pauses were considered to be segments where the ribosome occupancy was at least 10 fold the gene average [150].

Similarly PPX occupancies were normalized as a ratio to the ORF average ribosome occupancies. Strong PPX pauses were considered to have at least 10 fold higher occupancies than the gene average and non pausing PPX were considered to have at most an occupancy equal to the gene average. In addition, non-pausing PPX sequences located just down stream of a translation pause were excluded of the list of non-pausing PPX to prevent including sequences that have low reads because ribosomes were unable to cross a previous point of the mRNA.

46

Chapter 3: Mining the Ribosome Profiling data 3.1.Introduction

During protein synthesis each amino acid is detached from an aminoacyl-tRNA and incorporated into the nascent peptide. Although the basic peptidyl transfer reaction is the same for all amino acids, the speed of incorporation is not uniform. It is affected by several factors including the abundance of each individual aminoacyl-tRNA, the structure of the incorporated amino acid, and structural features of the mRNA and the nascent peptide. For example, mRNA sequences upstream of the peptidyl-tRNA site (P site) codon that interact with the anti Shine Dalgarno sequence (aSD) from 16S rRNA [150] or regions from the nascent peptide that interact with the ribosome exit tunnel, have been shown to slow translation [91, 92, 173].

Decreasing the speed of translation, or even pausing it, can have important roles in protein synthesis. For instance, sequence context dependent pausing during translation of secM is known to regulate synthesis of the membrane protein SecA [174]. More broadly, changes in translation speed can affect co-translational folding of proteins, controlling not only the fraction of active protein [175], but potentially also providing new functionality through alternative folds [75]. Although in these cases translation pausing has beneficial physiological roles, in other cases it could be detrimental if it significantly decreases the efficiency of protein synthesis. Accordingly, patterns that induce ribosome pausing are often excluded from coding regions [91, 150]. Exceptions to this include PPP and PPG sequences. This is most probably due to the presence of elongation factor P (EF-P) [91], a 47

protein that has been recently shown to prevent the pauses produced by these and other sequences, most of which contain a PP motif [18, 19, 80, 81]. It has been described that in the absence of EF-P, mRNA coding for PPG will pause with Gly-tRNAGly located at the A site of the ribosome and peptidyl-tRNAPro at the P site [19]. A similar effect has been reported for PPP sequences that pause with the second Pro at the P site [91].

Our data indicate that pausing-potential is largely influenced by the local context of the

PPX pattern, and that specific amino acids upstream of the PPX motif can modulate whether or not a particular A site residue can trigger a stall. Wide variations in EF-P dependent translation of these PPXs led us to investigate the effect of sequences upstream of diproline-containing motifs. Using an in vivo strategy described in detail below we found that amino acids encoded upstream of PPX play a key role in EF-P-dependent translation.

Furthermore, we also investigated the role of translation initiation on EF-P dependent pausing to help explain why both components of ATP synthase, the α- and β-subunits

(AtpA and AtpD) have different EF-P dependences - in both ribo-seq and proteomic data- even though both share common features. Specifically, they both contain a PPG motif and are translated from the same mRNA transcript.

3.2. Results

3.2.1 Non-PPX pausing motifs

The five non-PPX containing genes (in Table 4) were further investigated in vivo by introducing the sequence coding for the pausing segment into a GFP reporter system. In this reporter GFP is in a transcriptional fusion to mCherry, which has a separate Shine-

48

Dalgarno (SD) sequence and serves as an internal control (Figure 11A, [80]). After inserting these non-PPX motifs into the reporter system (at the amino terminus of GFP, between codons 3 and 4), the EF-P dependency could not be reproduced (Figure 11B).

Other longer sequences were additionally tested without positive results (Table 4), suggesting that these pauses might depend at least in part on sequence features outside the cloned segments.

3.2.2 Common patterns in PPX pausing motifs

The large variability in PPX-mediated pausing patterns revealed by the ribo-seq data

(Figure 9) led us to search for additional sequence features that might affect pausing at

PPX sequences. We compared the sets of well-defined pausing and non-pausing motifs

(Tables 1 and 2, respectively). Some patterns such as PPD or PPN were only found in the pausing PPX sequences while PPQ or PPK were present in both gene sets. When comparing alignments of the amino acid or nucleotide sequences, we were unable to identify any common patterns within either the pausing or the non-pausing PPX sequences (Figure 12). It has been proposed that several translation pauses do not depend purely on one mechanism, but instead integrate different signals that slow down translation [91].

To investigate if other known mechanisms might contribute to pausing, the role of Shine-

Dalgarno (SD) like sequences upstream of the PPX sequence, the utilization of low usage tRNAs at the A site codon, and combinations of specific amino acids at the A site and upstream of PPX were tested.

49

Figure 11. The GFP/mCherry reporter. A) Schematic of the GFP/mCherry reporter: GFP is in a transcriptional fusion to mCherry that has a separate Shine-Dalgarno sequence. mCherry serves as an internal control for variations in transcription and plasmid copy number [80]. Tested motifs were inserted in- frame at the fourth codon of gfp [176]. Fluorescence ratio (GFP/mCherry) is measured for WT and Δefp strains harboring the tested plasmid. The fold difference in fluorescence ratios between WT and Δefp strains is then normalized to the values obtained from a no insert control. B) The reporter construct with the 5 non-PPX motifs described in Table 4. The five motifs have fluorescence ratios lower than no insert, PPPPPP is the positive control.

50

Figure 12. Alignment of sequences from genes with pausing or non-pausing PPX. Figure shows a Logo representation [177] of alignments of nucleotide (A and B) or amino acid (C and D) sequences of pausing (A and C) or non pausing (B and D) PPX containing genes. PPX sequences were manually aligned together with the first 20 upstream codons/amino acids. No gaps were introduced in the sequence. Dr. Assaf Katz prepared this figure.

51

3.2.3 Exploring the role of the weak internal Shine-Dalgarno sequences

Most translation pauses in wild type E. coli are due to interactions of the mRNA's coding region with the anti-SD (aSD) sequence of the 16S rRNA [150]. We reasoned that having a motif capable of interacting with the aSD upstream of a PPX might contribute to pausing. The RNAsubopt program in the Vienna RNA package [178] was used to search for the presence of nucleotide sequences upstream of the PPX coding region that are predicted to have affinity for the aSD sequence (5′-CACCUCCU-3′), referred to here as aSD-weak binding sequences. Several paused PPX sequences also contain a sequence 7 to 9 bases upstream of the third position of the X codon from PPX predicted to weakly bind the aSD. The median affinity of these sequences for the aSD was ~-2 Kcal/mol, about half of the minimum affinity found to produce an increased pausing index by itself in previous studies (4 to 12 Kcal/mol) [150]. It is possible that these low affinities could enhance the ability of PPX sequences to produce a pause in translation, a hypothesis that was supported by the absence of these aSD-weak binding sequences upstream of PPXs that do not produce a pause (Figure 13A).

Another possible feature of pausing patterns might be the use of rare codons that could slow translation and increase the strength of pauses [8, 179]. Consistent with this, when analyzing the codons used for PPX patterns, the stronger EF-P dependent pauses frequently use rare tRNAs for decoding the codons at the A site (Figure 13B).

To further determine the possible role of the aSD-weak binding sequences and the use of rare codons several sequences were introduced at the amino terminus of a GFP reporter

(Figure 11A) and tested for their effect on translation in WT and ∆efp E. coli strains. GFP fluorescence values were normalized against the fluorescence of mCherry encoded on the 52

same transcript immediately downstream of gfp. As the diverse sequence patterns are introduced at the beginning

Figure 13. Correlations between mRNA sequence and pause strength. A) Figure shows estimated affinity for ribosomal aSD sequence in sequences upstream of PPX in pausing and non-pausing genes. Position o correspond to the third position of the X codon at PPX (average as a small square, median as the middle line of the box, box limits represent 25th and 75th percentiles, higher and lower values are marked with an “x” symbol). B) Comparison of codon usage for all PPX sequences and their corresponding ribosome occupancies in the strain. Values correspond to average of two biological replicates and are normalized by the average occupancies of the corresponding genes. Dr. Assaf Katz prepared this figure.

of the gfp sequence (between the 3rd and 4th codons) we do not expect that pauses have substantial effects on protein folding. Thus, the expectation is that most of the changes in

53

GFP production associated with efp deletion will come from the reduction in the number of ribosomes able to cross the pausing site. This is in accordance with previous reports where comparable experiments correlated well with changes in the level of protein [80,

81]. Several codon variations coding for PEPPK were tested, a translation pause site in ubiD, and PNPPK found at a non-pausing segment of malE (Figs. 8C, 2 and Tables 1 and

2). Sequences encoding PEPPK are predicted using RNAsubopt to bind to the aSD with affinities ranging from -5 to 0 kcal/mol. Conversely, all PNPPK variants present a binding energy of 0 kcal/mol and are predicted to be easily translated by the ribosome

(Figure 14A). In addition, plasmids bearing PDPPK and PQPPK sequences were constructed as controls for the role of an acidic versus an amide containing amino acid 2 positions upstream of the P site amino acid of pausing ribosomes. GFP levels did not correlate with the sequence affinity for the ribosome aSD, indicating that this does not play a significant role in EF-P dependent pausing for P(E/D/Q/N)PPK sequences (Figure

14A). Also, similar constructs using diverse codons at the pausing A site position did not show any effect of low usage codons (Figure 16). Instead, a consistent tendency was observed of decreased translation efficiency for clones bearing an acidic amino acid at position -2 with regard to the Pro at the ribosome P site independent of aSD affinity

(Figure 14B). This pausing was also observed with other basic amino acids (Arg or His) at the A site position (Figure 15A) and was independent of codon usage (Figure 16).

3.2.4 The Effect of the amino acid immediately upstream of the diprolyl motif

The finding that PP-basic pausing depends on the identity of the amino acid 2 positions upstream of the P site position (Z-2 on Z-2P-1PPXA), suggests a possible role of this

54

position in determining the A site selectivity for EF-P relieved translation pausing. A similar effect has been previously observed for the macrolide dependent pausing of ermAL1 translation, at the leader sequence of ermA. In this example, the presence of an

Ala two positions upstream of the P-site amino acid will pause translation only in the presence of certain A site amino acids such as Glu. Conversely, the presence of Phe or

Gly in the -2 position produces a non-selective ribosome that either pauses (Phe) or continues translation (Gly) irrespective of the A site amino acid [180].

Amino acids at -2 have also been shown to be important in other translation pausing examples [91] and Peil et al. have also recently suggested that some Z-2P-1PX patterns

(with Z and X representing any ) could also induce EF-P relieved pauses [81].

In order to determine if there is a general role of the -2 amino acid on EF-P dependent pauses, the PPX ribosome densities for all possible Z-2P-1PX amino acid combinations were analyzed (Figs. 1, Table 5). When exclusively comparing the well-defined pausing and non-pausing sequences (Tables 1 and 2, respectively) acidic amino acids at the A site

(X position on Z-2P-1PX) were found to stall translation independent of the identity of the

-2 amino acid (Z on Z-2P-1PX). Similarly, hydrophobic or aromatic amino acids at the A site do not produce a pause independent of the identity of the amino acid at the -2 position. Some examples of these were confirmed using the GFP/mCherry system described above (Figure 15B).

55

Figure 14. Effects on translation pausing of aSD affinity and the identity of the amino acid preceding PPX. A) Diverse sequences coding for similar amino acid patterns, but with varying aSD affinities, were introduced at the N-terminus of GFP. GFP production in WT and Δefp strains was measured and normalized against mCherry, which is cloned as a transcription fusion. aSD affinities 6 positions upstream of the third nucleotide of the X codon are shown. B) Distribution of the effects of all the clones that code for the same amino acid pattern (irrespective of their specific aSD affinities) are represented as box plots in dark lines (average as a small square, mean as the middle line of the box, box limits represent 25th and 75th percentiles). Average values for each specific clone are shown in light gray. The mean of at least three biological replicates is shown and error bars (which indicate one standard deviation) are only shown in A for clarity. Dr. Assaf Katz prepared this figure.

56

Figure 15. Effect of amino acids preceding PPX on pausing efficiency. Various sequences were introduced at the area coding for the N-terminus of GFP. When possible, sequences of genes found to either pause or not pause in E. coli were used (names in parenthesis). A) PP-basic, B) PP-acid and PP-hydrophobic, C) PP-amide, D) PP-hydroxy and E) PP-Pro. The mean of at least three biological replicates is shown and error bars indicate one standard deviation.

57

Figure 16. Effect of codon usage at A site position. Dr. Assaf Katz prepared this figure.

58

Conversely to what was observed for acidic, hydrophobic and aromatic moieties, other amino acids at the A site have a pausing behavior that is context dependent. Four examples of this variable PPX behavior were further investigated: PP-basic, PP-amide,

PP-OH and PPP. Similar to previous results with PPK patterns, pausing was only observed at PP-basic motifs when the −2 residue was acidic. In contrast, PP-amides always pause with the exception of some specific cases where there is an OH containing amino acid at the −2 position. These activities were confirmed for both patterns in the

GFP/mCherry system, although acidic-PP-basic patterns have a weak effect on GFP translation as compared to the other patterns analyzed (Figs. 15A and 15C).

In ribosome profiling, PP-OH was only observed to pause when the A site was occupied by a Ser, whereas the presence of Tyr or Thr did not cause a translation pause. In the

GFP/mCherry reporter system, PP-S and PP-T produced some decrease of translation in the ∆efp strain. In some cases (with Gln or Val preceding PPX) Ser in the A site produced a stronger effect than Thr, but in others (with Asp preceding PPX) no difference was observed (Figure 15D). Contrary to predictions based on previous reports, PPP sequences were only observed to pause with an Arg or Ile at -2 in the ribosome profiling data. The effect of amino acids at the -2 position was also studied using the GFP system

(Figure15E). All PPP motifs produced at least a 4-fold decrease in GFP production in the

∆efp strain compared to WT. Previous studies have suggested that longer Pro stretches will induce stronger pauses.

59

Table 6. Combinations of amino acids at A site (X, #10) and 2 positions upstream the P site (Z) in pausing and non-pausing ZPPX sequences

{-2} (Z) *1 {A site} (X, No- Identity -2 Identity A site Pause Identity -2 Identity #10) *1 pause cases A site cases Hydrophobic Acidic 0 6 AAIAVI DDDDD D Basic Acidic 0 1 H E Acidic Acidic 0 1 D D Amide Acidic 0 1 N D Hydrophobic Amide 0 4 AAII NQN OH Amide 2 TY QQ 1 Y N Amide Amide 0 1 N Q Hydrophobic Aromatic 4 LILI FFFF 0 OH Aromatic 2 TT FF 0 Amide Aromatic 3 NNN YFF 0 Hydrophobic Basic 9 MLVLLILLV KRRKRHRRK 0 Basic Basic 2 KR KK 0 Acidic Basic 0 2 DE KK OH Basic 2 TT HR 0 Amide Basic 1 N K 0 Cys Basic 1 C H 0 Aromatic Basic 1 F R 0 Hydrophobic 0 1 L G Hydrophobic Hydrophobic 17 LVLLLLILLLIL LIVVALLIALA 0 LLILL VLVLMM Acidic Hydrophobic 1 E M 0 Basic Hydrophobic 4 RRRR ILVL 0 Cys Hydrophobic 1 C L 0 OH Hydrophobic 5 TTTTY AMAAL 0 Amide Hydrophobic 2 NN MI 0 Acidic OH 1 D Y 1 D S OH OH 1 Y Y 1 Y S Amide OH 1 Q T 0 Aromatic OH 0 0 Hydrophobic OH 3 LVV TYY 1 V S Aromatic SH 1 W C 0 Basic Pro 0 2 RR P OH Pro 1 T P 0 Hydrophobic Pro 0 1 I P *1 Pauses with PPP are not included in the analysis to prevent errors in interpreting A site and -2 positions. Dr. Assaf Katz prepared this table. 60

For instance, we have previously shown that a 6 Pro stretch will reduce GFP translation

3- to 4-fold more than a 3 Pro stretch [80]. By contrast, the addition of only one Pro before PPP (PPPP) does not have any effect on either GFP expression or ribosome occupancies in ribosome profiling experiments (Figs. 15E, Table 5) indicating that addition of a single prolyl residue is not enough to significantly reduce translation efficiency.

3.2.5 Effects of distal upstream sequences on PPX translation

The finding that all ZPPP motifs produced a ~4 fold effect in the GFP/mCherry reporter system was unexpected, as in the ribo-seq data only RPPP and IPPP were observed to produce a strong pause. Moreover Val, that only appeared preceding non-pausing PPP in our ribo-seq data, had the strongest effect compared to the other ZPPP patterns (Figs.

15E). The finding that some of the patterns tested in the reporter system were unable to reproduce the pausing tendency observed in ribo-seq suggests that other sequence features might have additional effects on the pausing of PPX. No obvious correlation was observed between the “X” amino acid and up to 12 codon positions further upstream of

PPX in our set of validated in vivo pausing sequences (data not shown).

3.2.6 Effect of translation initiation on EF-P dependent pausing

Proteomic analysis of Salmonella showed that AtpD, but not AtpA, synthesis appeared to be affected by efp deletion (20.6 fold difference for AtpD contrasting with 1.05 for AtpA)

[80]. A similar trend, although less dramatic, was recently observed in a SILAC experiment performed with E. coli (5.18 fold difference in synthesis for AtpD and 1.88 for AtpA) [81]. Using AtpA and AtpD as model proteins, we examined the effect of

61

varying the level of translation initiation through altering elements in the 5′-untranslated region of the mRNA. This was done by either mutating the Shine-Dalgarno sequence or the start codon [181].

To investigate the relation between translation initiation and EF-P dependence with proteins other than AtpD and AtpA, we employed the Keio collection E. coli efp mutant.

We conducted a similar plasmid-based translational fusion assay wherein specific polyproline motifs were inserted directly into the 4th codon position of GFP. We varied the efficient AUG initiation codon to GUG or UUG (Class I codons that support efficient translation) or to AUC or CUG (Class IIA codons that support translation at levels only

1–3% that of AUG) [182]. Mutation of the start codons of these constructs revealed a similar correlation between expression in wild-type cells (initiation strength) and EF-P dependence for multiple different polyproline motifs (Figure 17). Fluorescence plotting and curve fitting follow a trend towards a maximum expression in the efp mutant and, interestingly, the strongest polyproline motif tested (six consecutive prolines) appeared to be rate limiting (reached maximum expression in the efp mutant) even with the weakest start codon tested, CTG. This supports the idea that the relation between initiation and

EF-P dependence is not restricted to AtpD and AtpA in Salmonella, but is generally applicable to other EF-P dependent motifs and in other species.

62

Figure 17. Translation initiation and elongation stall strength influence protein level. Fluorescence data from E. coli containing the pBAD30XS plasmid with the indicated poly-proline motifs inserted at the 4th codon of GFP. For each motif construct, four start codon mutations were generated: From highest to lowest expression in wild-type: AUG (wt), GUG, UUG, AUC, CUG. Start codon mutant constructs with the same poly-proline motif were plotted as a group with a corresponding ‘1-exp’ curve of best fit. ‘No motif’ has no inserted motif and is included as an EF-P independent control. Shown is a fluorescence ratio of GFP normalized to mCherry expressed from the same mRNA but with its own ribosome binding site [181]. Steven Hersch prepared this figure.

3.3.Discussion

The results presented here confirm previous observations that most PPX motifs do not require EF-P for proper translation [80, 81]. Instead, potential EF-P alleviated pauses are restricted to a small subset of these proteins. With the exception of PPG and PPP that

63

have special structural features, all the strong EF-P dependent pauses found in the ribosome profiling data (Table 1) have a polar amino acid at the A site. Trp at the A site has been previously found to produce translation pausing in vitro [81, 91], but could not be detected in our in vivo ribosome profiling data. Similar results have been previously obtained through proteomic data [81]. This suggests a possible effect of some polar groups on the positioning of the amino acid moiety of aminoacyl-tRNA at the A site of ribosomes. More importantly, our data show a dependency of PPX pauses on the identity of amino acids located N-terminal of the pausing site. This is in accordance with recent reports from Peil et al. indicating that amino acids like Asp, Ala or Ile can stimulate EF-P dependent pauses if located just upstream of a PPX sequence [81]. In contrast to their report, our results indicate that the effects of preceding amino acids are highly dependent on the context in which they are located. For instance, an Asn will prevent pausing when located before PPK, but allow it when located next to PP (N/Q) (Figure 15). More striking is a comparison of the pausing strength of (Q/V)PPS to (Q/V)PPT or of QPPS to

NPPS (Figure 15D). In both cases a single methyl group is enough to determine whether or not translation pauses. This shows an exquisite level of selectivity at either the PTC or the exit tunnel, both of which are usually expected to be fairly non selective in order to facilitate synthesis of all proteins.

Our finding that amino acids located two residues upstream of PPX (the Z-2 and Z-3 positions) influence pausing is supported by our previous finding that the motif generating the EF-P alleviated pause in PoxB requires a stretch of 6 amino acids [80].

One possible explanation for these results is that interactions between the nascent peptide

64

and the exit tunnel modulate the selectivity of the A site, similar to what has been observed for the macrolide relieved pausing of ermAL1 translation [180] or the translocon relieved translation of secM [183]. In these two cases the role of the amino acid two positions upstream of the P site is as important as that observed here for most EF-P relieved pauses. In contrast, pausing during translation of TnaC seems to depend on the -

3` position [184], similar to what we observed for atpD translation. Effects from positions further upstream also have relevant roles for some EF-P relieved pauses, as we have previously observed for the non-PPX pause on PoxB translation that depends on 6 continuous amino acids [80]. Thus, EF-P relived pauses depend on an array of diverse amino acid sequence contexts that interact with the PTC or the exit channel. Additional local effects of mRNA structure or interactions with the ribosome can not be ruled out as we did observe some variability in GFP expression depending on codon usage in the regions upstream of PPK (Figure 14A). Nevertheless, these effects were usually small and did not correlate with either codon usage or affinity for the ribosomal aSD.

EF-P dependence was found to directly correlate with the rate of translation initiation.

Our findings demonstrate that polyproline-induced stalls exert a net effect on protein levels only if they limit translation significantly more than initiation [181]. This model

(Figure 18) can be generalized to explain why sequences that induce pauses in translation elongation to, for example, facilitate folding do not necessarily exact a penalty on the overall production of the protein.

Finally, our results also suggest that polyproline dependent stalls in the absence of EF-P are eventually resolved and that protein synthesis proceeds to completion in most cases.

65

If the ribosome disengaged from polyproline induced stalls we would expect that proteins translated with low rates of initiation would show a degree of EF-P dependence similar to that of proteins derived from transcripts with much higher initiation rates. This supposition is supported by our SILAC data, where we do not observe an increase in peptide counts before the stall sequence compared to sequences downstream of the stall.

Figure 18. Model of the interplay between translation initiation rate and elongation stalls. Figure adapted from [181] Depiction of translation of an mRNA transcript with low initiation rate (left) or high initiation rate (right) and the subsequent effect of no elongation stall (top), a strong stall (middle) or a weak stall (bottom). Black line, mRNA transcript. Purple box, start codon. Brown box, stop codon. Blue ovals, translating ribosomes. Yellow stop sign, weak stall motif. Red stop sign, strong stall motif. ++++, high degree of protein synthesis per transcript. ++, medium protein synthesis per transcript. +, low level of protein synthesis per transcript.

66

3.4. Materials and methods

Plasmids used for motif verification were derivatives of pBAD30 [185]. As previously described [80], the plasmid contained a tandem fluorescent fusion cassette composed of green fluorescent protein (gfp) followed directly by mCherry. A cloning site was added to that construct after the 3rd codon of gfp. This plasmid was subsequently designated pBAD30XS. Patterns were inserted into pBAD30XS by double strand oligo hybridization.

3.4.1 GFP fluorescence assay in E. coli

As described previously, overnight cultures of E. coli harboring pBAD30XS constructs were used to inoculate M9 media to an OD600 of 0.05 (24, 36). The M9 media was supplemented with 0.2% glycerol, 5 µg/l thiamine, 0.2% arabinose and 100 µg/ml ampicillin. Cultures were grown at 37°C with shaking and fluorescence was measured at

8h using a spectrofluorometer (Horiba) with excitation at 481nm and emission at 507nm for GFP or excitation at 587nm and emission at 610nm for mCherry. Blank medium measurements were subtracted as background

3.4.2 Anti Shine-Dalgarno hybridization free-energy prediction

The free-energy of hybridization to anti shine-Dalgarno was estimated with a method similar to what has previously been described [150]. A sliding window of 8 nucleotides was selected form each mRNA sequence and the energy of binding to the sequence 5`-

CACCUCCU-3` (E. coli aSD) was calculated using RNAsubopt program in the Vienna

RNA package [178] considering a temperature of 37°C. The predicted free-energy was assigned to the position of the last nucleotide in the sliding window.

67

3.4.3 Other sequences analysis and curve fitting

Alignment of sequences upstream of pausing and non-pausing PPX was performed manually. These alignments were analyzed using Weblogo 3.3 [177]. Codon usage for codons in the “X” position of PPX was analyzed using costume Perl scripts, based on codon usage of highly expressed genes from E. coli [186].

Curve fitting: Data was fit to a ‘1-exp’ curve of the form ! = !(1 − !!!") by the minimum sum of chi-squares method: For each observed wild-type fluorescence value

(x-coordinate), chi-square values were generated by comparing the corresponding observed ∆efp fluorescence to the y-value calculated by the equation. The chi-squares for all constructs with the same ORF were summed and the values of A and B yielding the minimal sum of chi-squares was solved using Microsoft Excel’s ‘Solver’ function.

Coefficient of determination (R2) comparing observed ∆efp data and y-values predicted by the solved 1-exp equation was calculated using Excel’s ‘RSQ’ function.

68

Chapter 4: EF-P maintains transcription-translation coupling during oligo-proline synthesis

4.1 Introduction

During protein synthesis each amino acid is incorporated into the nascent peptide according to the order dictated by the mRNA. Although the basic peptidyl transfer reaction is the same for all amino acids, the speed of incorporation is not uniform and ribosomes elongate at a nonuniform rate [187]. Variations in elongation rates have been widely identified in gene regulation [188], mRNA decay [4, 189], codon bias [190] and protein folding [191]. Upon comparing amino acids according to their speed of incorporation during peptide bond formation, proline is by far the slowest [76, 77].

Proline is the only natural cyclic amino acid [78] and its pyrrolidine ring imposes structural constraints on the positioning of the amino acid in the peptidyl transferase center (PTC), resulting in slow peptide bond formation and ribosome stalling [79]. To overcome this potential bottleneck, robust translation of sequences with consecutive prolines requires the presence of elongation factor P (EF-P) [18-20, 80, 81]. EF-P in bacteria and the homolog (e/a)IF5A in eukarya and archaea are modified post- translationally to their active forms [19, 38, 63]. The post-translationally modified forms of EF-P and IF5A appear to increase the rate of peptide bond formation entropically by effecting the positioning and stabilization of the peptidyl-Pro-tRNA on the ribosome [21].

69

In addition to the importance of translation elongation for protein production in bacteria, elongation progress is also important for maintaining the transcription of the mRNA itself. This crosstalk between the ribosome and RNA polymerase (RNAP) is possible because transcription and translation are coupled in bacteria [192]. Translation of an mRNA can initiate as soon as the ribosome-binding site (RBS) emerges from RNA polymerase (RNAP) [193]. There are various consequences of the cross talk between the transcriptional and translational machineries, including gene expression regulation by transcription attenuation [125] and Rho-mediated polarity [132, 194]. Direct coupling of transcription and translation has been reported in vitro [195] and in vivo [95]. In vivo work has revealed that for the majority of the coding region, the first trailing ribosome directly assists RNAP during elongation thus matching translational and transcriptional rates under various growth conditions [95, 196]. Furthermore, coupling also preserves genome integrity through an anti-backtracking mechanism [109]. Backtracking of

RNAPs occurs due to collisions between RNAP and the replisome since both occupy the same DNA track but the replisome moves 20 times more faster than RNAP [197].

Collisions between bacterial RNAP and the replisome are predominantly co-directional since the majority of essential and ribosomal RNA genes in bacteria are oriented in the same direction as replication [198, 199]. Some collisions can interrupt and restart replication [200], while extensive RNAP backtracking introduces single-strand and double-strand breaks (DSBs) leading to increased mutation rates and jeopardized genome stability [109].

70

When translation slows or is stalled, a spatial gap can form between the transcribing RNAP and the trailing ribosome potentially allowing premature transcription termination [94]. Bacteria have two main modes of transcription termination: Rho- independent (intrinsic termination), and Rho-mediated (factor-mediated) termination

[121, 132]. Intrinsic termination requires features on the mRNA and does not require any auxiliary factors. These mRNA structural features include a stable hairpin, which is followed by a run of uridine residues [201, 202]. The mechanism of intrinsic termination involves pausing of the transcription elongation complex (TEC) at the termination site along with the formation of the RNA terminator hairpin inside RNA polymerase (RNAP)

[202-204]. Stable RNA hairpin formation is accompanied by the dissociation of a weak

U-rich RNA:DNA hybrid duplex [205, 206] causing destabilization and dissociating the

TEC [207, 208]. Intrinsic terminators, besides occurring at the end of a gene, frequently exist just downstream of promoters or intergenically in an operon and are implicated in gene regulation [209]. On the other hand, Rho-mediated transcription termination requires the factor Rho and accounts for around half of the identified transcription terminators in E. coli. Rho-dependent terminators can proceed genes, exist within coding sequences and are also found at the end of transcriptional units [132]. Rho is a homohexameric protein with RNA-dependent ATPase and helicase activities [210]. It constitutes 0.1 % of the total soluble protein of E. coli and Salmonella typhimurium

[211]. It is essential in E. coli [135] but dispensable in Bacillus subtilis [212] and

Staphylococcus aureus [213]. Analysis of bacterial sequences suggests that rho-like homologues are highly conserved in bacteria [213, 214]. Rho binds to mRNA at Rho

71

utilization (rut) sites. Upon reaching RNAP, Rho can dissociate from the DNA template and release the RNA transcript [96, 215, 216]. As with intrinsic termination, details of these final events are still unclear [121]. Rut sites have been defined as sequences of

70–80 nt in length that are rich in cytosines and lack secondary structure [217]. C richness is the only known feature common to Rho-dependent terminators and is the main requirement for Rho binding [132]. In addition, the attachment of Rho requires a ribosome-free transcript of at least 85–97 nt [218].

Beyond altering transcription termination efficiency, ribosomal stalling could result in mRNA cleavage. It has been established that ribosome stalling caused by certain peptides leads to mRNA cleavage around the stop codon, resulting in nonstop mRNAs

[24, 26]. Sunohara et al. further investigated whether the cleavage of mRNA could occur at sense codons when translation elongation is prevented. They connected the coding regions of two proteins by a short nucleotide sequence corresponding to the SecM arrest sequence. They found that this arrest peptide strongly induced mRNA cleavage, resulting in nonstop mRNAs that are recognized by tmRNA. This cleavage did not involve RelE nor the known major endoribonucleases [25].

In this work we investigate the impact of translational pausing at EF-P dependent motifs on the coupling of transcription to translation. Utilizing a reporter with an oligo- proline motif upstream of a potential terminator (rho dependent and independent) we find that oligo-proline motifs can promote uncoupling and decrease the transcription of a downstream gene. The classical model for polarity [134] states that rut sites occur frequently in messenger RNA (mRNA) sequences but are inactived by translating

72

ribosomes hindering access to Rho. Our findings now show that EF-P plays an important role in this polarity model during oligo-proline synthesis. This suggests that the presence of EF-P during the synthesis of oligo-prolines can alter the outcome of potential transcription termination sites (TTS), which we refer to as EF-P Altered Transcription

Termination Sites (EATTS).

4.2 Results

4.2.1 EF-P does not regulate Rho Synthesis

Since Rho has 2 potential EF-P dependent motifs we compared the available proteomic datasets from SILAC (stable isotope labeling of amino acids in cell culture) for evidence of regulation. From the Salmonella SILAC dataset [80] rho was reported to be produced

~ 4.7 fold more in WT Salmonella than in a ∆efp mutant. In the E. coli SILAC dataset

[81] rho was reported to be produced similarly in both WT E. coli and a ∆efp mutant.

Since the E. coli and S. Typhimurium rho proteins are ~ 99.5 % identity we investigated

Rho production in WT and ∆efp Salmonella strains. Our data indicate that EF-P does not affect Rho synthesis (Figure 19).

73

Figure 19. Rho is produced equally in WT and ∆efp Salmonella. A) Rho internally labeled with sfGFP. B) Coomassie staining of the samples loaded in A. C)Western Blot, each strain carries two FLAG epitopes, a 3xFLAG inserted internally in the rho gene and a 1xFLAG fused to the C-end of a chromosomal cat cassette (internal standard). This experiment was performed in the Bossi Lab in France.

4.2.2 Effects of Oligo-Pro synthesis on Rho-dependent transcription termination

To investigate whether ribosomal stalling at PPX motifs in a ∆efp E. coli strain can uncouple translation from transcription in the presence of a downstream rut site, the reporter pBAD30HATC was used. In this reporter sfgfp is in a transcriptional fusion to mCherry, which has a separate Shine-Dalgarno sequence. A rut site is introduced downstream of the inserted motifs within sfgfp (Figure 20A). This rut site is a simple repeat of TCs that can function as an efficient Rho-dependent terminator [217]. Guérin et al., showed that these TC repeats downstream from the chloramphenicol acetyl- transferase (CAT) gene start codon, led to Rho-dependent transcription termination only

74

in the absence of protein synthesis, which they attributed to interference between translation and transcription termination [217].

We hypothesized that upon including a 30 bp TC repeat downstream from an oligo- proline motif, if the stall at that motif in ∆efp E.coli cells is sufficient to cause uncoupling between transcription and translation, lower transcript levels of the downstream mcherry will result in the efp mutant but not in WT. GFP and mCherry fluorescence were measured (Figure 20B) and northern blot analyses of transcript levels were performed

(Figure 20C). To confirm that the transcription termination in our reporter was Rho dependent, the Rho inhibitor Psu was utilized. Psu is a unique capsid decoration protein from bacteriophage P4 [142, 219] which inhibits Rho termination with efficiency and specificity both in vivo [143, 220, 221] and in vitro [145]. To confirm the specificity of

Psu for Rho in our assays, a mutant Psu (E56K) that is unable to bind Rho [144, 222] was also used as a negative control. In WT all motifs were able to produce the fluorescent proteins similarly with active Rho (Psu - ), and upon inhibiting Rho with Psu (Psu +) all motifs showed ~ 5 fold improved mCherry fluorescence (Fig 20B). On the other hand, for the ∆efp mutant with active Rho (Psu - ) production of the fluorescent proteins was severely limited when the motif was EF-P dependent (PPPPPP or PPG). Upon inhibiting

Rho with Psu (Psu +) strains encoding either PPPPPP or PPG insert showed improved mCherry fluorescence by ~ 36 & 32.5 fold, respectively, compared to the corresponding strains producing inactive Psu (Psu-). In contrast, production of the reporter with the GN insert in the presence of active Psu, increased ~ 5 fold in the ∆efp mutant, similar to the

75

observed increase for all the motifs in WT. This indicated that the drastic effects observed for the oligo-proline motifs was due to their EF-P dependency.

Figure 20. Reporter with a rut site downstream of oligo-proline motifs. A) pBAD30HATC (rut) reporter with 3 different inserts GN, PPPPPP and PPG. B) mCherry and GFP fluorescence/OD for WT and ∆efp strains harboring both the rut reporter and a psu plasmid producing either active (+) or inactive (-) psu. C) Northern blot for gfp-mCherry with PPPPPP insert for both WT and ∆efp strain. Dr. Irina Artsimovitch assisted with the selection and design of rut site.

76

To confirm that the observed effect on fluorescent protein production was transcriptional, we investigated transcript levels by northern blotting using a probe against mCherry (Figure 20C). For the PPPPPP motif in WT, transcript level for gfp- mcherry was 1.6 fold higher in the presence of active Psu. On the other hand, for the ∆efp mutant the transcript level for gfp-mcherry was 5.3 fold higher in the presence of active

Psu. In Fig 20C, the higher band represents the full length gfp-mcherry transcript (~ 2000 nts) while the lower band (~1200 nts) could be explained by the presence of a predicted terminator stem-loop structure identified by ARNold [223, 224]

(AGTGACTAAAGGCGGCCCGCTGCCTTTTGCGTGGGA), which using RNAfold

[225] had a predicted free energy of -8.40 kcal/mol.

4.2.3 Effects of Oligo-Pro synthesis on transcription through intrinsic terminator sites

To determine whether ribosomal stalling at PPX motifs in a ∆efp E. coli strain could uncouple translation from transcription in the presence of a downstream intrinsic terminator, we utilized the reporter pBAD30HAPYT. To investigate the affect of an olio- proline motif, we inserted ahead of the hairpin a PPPPPP motif. In WT replacing GN with PPPPPP caused a 1.6 fold decrease in mCherry fluorescence, while in ∆efp there was 26.3 fold decrease (Figure 21B). Investigation of transcript levels by dot blot analysis (Figure 21C) indicated similar transcript levels of pyrL- mCherry in WT with the

GN and PPPPPP motifs while in ∆efp there is 2 fold more detected pyrL- mCherry for the

GN compared to the PPPPPP motif (2.4). This indicates that pausing at the PPPPPP motif ahead of the pyrL intrinsic terminator increased termination efficiency in the absence of

EF-P. 77

Figure 21. Reporter with an intrinsic terminator downstream of an oligo-proline motif. A) pBAD30HAPYT plasmid with PPPPPP or GN insert. In hairpin (+) reporter the insert is followed by pyrL. 3HA is 3 repeats of HA tag B) mCherry (or pyrL-mCherry) fluorescence/OD for WT and ∆efp strains harboring pBAD30HAT or pBAD30HAPYT plasmid with PPPPPP or GN insert. For each strain 2 additional rpoB chromosomal backgrounds were tested, rpoB2 and rpoB8. C) Dot blot for mCherry/pyrL-mCherry for the samples in panel B, value of mCherry probe is normalized to HA probe, *ratio is the average of 3 repeats, **standard deviation (SD) of the repeats. Dr. Irina Artsimovitch contributed to the selection and design of the intrinsic terminator.

4.2.4 Investigating the effect of different RNAP mutants on EF-P dependent coupling

Our above data indicate that in the ∆efp strain, ribosome stalling at oligo-proline motifs can uncouple transcription from translation. To further corroborate the proposed role of

EF-P in maintaining coupling, we investigated the effects on termination of altering the speed of RNAP transcription elongation. To investigate the effects of slower or faster 78

transcription elongation when translation elongation is hindered in absence of EF-P, two rif R RNAP chromosomal mutations were utilized, rpoB8 (B subunit, Q513P) and rpoB2

(B subunit, H526Y). Compared to wild type RNAP (rpoB), the mutant rpoB2 has a faster transcription elongation rate while rpoB8 has a lower transcription elongation rate [222,

226-228] due to a defect in NTP incorporation. The slow transcription elongation rate for rpoB8 makes it more termination prone at hairpin - dependent terminators [229]. For

Rho-dependent transcription terminators, rpoB8 behaves similar to wild type RNAP

(rpoB) [222]. On the other hand, rpoB2 reads through intrinsic terminators more efficiently than does wild type RNAP (rpoB) [227]. In a ∆efp mutant background for rpoB and rpoB2, upon replacing GN with PPPPPP there was a 26 and 32 fold decrease in mCherry fluorescence, respectively (Figure 21B). The corresponding activity decreased

45 fold for rpoB8, consistent with this strain being more prone to intrinsic termination

[229]. The dot blot analysis (Figure 21C), which shows the transcript level ratios between the 3 different RNAP variants, indicated that the rpoB8 ∆efp chromosomal background was the most sensitive to the intrinsic terminator, and rpoB2 the least in presence of an upstream oligo-proline motif.

Investigation of the different RNAP chromosomal backgrounds in the rut reporter

(Figure 22) indicated that in rpoB, for both WT and ∆efp strains, the mcherry production improvement in the presence of active Psu (psu +) was greater than for rpoB2. This may result from a faster transcription elongation rate in rpoB2, making it generally more resilient to Rho, thus inhibition of Rho with Psu would be less significant. This

79

observations is in accordance with the work of Jin et al. that demonstrated increased readthrough for rpoB2 relative to rpoB at a rho-dependent TTS [230].

Figure 22. Reporter with a rut site downstream of oligo-proline motifs in different RNAP backgrounds. A) mCherry fluorescence/OD for WT and ∆efp strains harboring rut reporter (Figure 20A) with PPPPPP and psu plasmid with either active (+) or inactive (-) psu. For each strain 2 additional rpoB chromosomal backgrounds were tested, rpoB2 and rpoB8. The mCherry fluorescence /OD for rpoB chromosomal backgrounds is the same as Figure 20B. B) Dot blot for gfp-mCherry for the samples in panel A, value of mCherry probe is normalized to HA probe, *ratio is the average of 3 repeats, **standard deviation (SD) of the repeats. Dr. Irina Artsimovitch contributed to the experimental design

80

On the other hand, in the rpoB8 chromosomal background, it is noteworthy that for the

∆efp strain, the mCherry fluorescence in absence of Psu (Psu-) was higher than its counterpart in rpoB or rpoB2 for the PPPPPP insert. This suggested that the slower transcription elongation rate of rpoB8 had a positive effect on maintaining coupling with the ribosome hindered by an oligo-proline motif.

4.2.5 In Silico search for EF-P suppressed transcription terminators

Although the major known function of EF-P is to facilitate the robust synthesis of oligo- proline motifs, this does not provide a complete picture of the role of EF-P in gene expression. In particular, the proteomic data from SILAC experiments performed both in

S. Typhimurium [80] and E. coli [81] indicated that only a minority of PPX-containing proteins were downregulated in the ∆efp strain. For E. coli SILAC,~13 % of the proteins had at least 2-fold misregulation between the WT and ∆efp strains. Furthermore, RNA- seq data in E. coli [20] showed that ~21 % had at least 2-fold misregulation between WT and ∆efp. In both datasets, PPX-containing genes are almost equally represented in the upregulated and downregulated genes, suggesting that EF-P might have an effect on transcription. In bacteria, since transcription and translation are coupled, EF-P might influence transcription by maintaining proper transcription- translation coupling, as shown above. This would be particularly critical when the ribosome is translating an EF-

P dependent motif while RNAP is transcribing a potential transcription termination site

(TTS). Transcription termination in bacteria is either dependent on Rho or intrinsic. The estimated total number of transcription units in E. coli is 3538 while the number of Rho- independent terminators is estimated to be 257 [231]. In this work, we investigated the

81

RNA seq data for WT and ∆efp E. coli strains [20] to search for PPX-containing genes that contain potential TTSs.

4.2.5.1 Potential rut sites

Bioinformatic prediction of Rho-dependent terminators is not accessible since, other than

C-richness, there are very few other conserved features [232]. Although individual Rho- dependent terminators have been studied extensively, less is known on a genome-wide scale. Recent work by Peters et al. [233] used two strand-specific, global RNA-profiling techniques to obtain high-resolution maps of Rho-dependent termination in E. coli using the Rho-specific inhibitor bicyclomycin (BCM). They found that the vast majority of

Rho-dependent terminators controlled antisense transcription. Since, these experiments were done with WT E.coli cells that have active EF-P, they would not detect EF-P

Altered Transcription Termination Sites (EATTS). As an alternative approach, we compared RNA-seq data with SILAC data to identify PPX-proteins that were down- regulated in both datasets in the ∆efp strain in comparison to WT (Table 7). It is noteworthy that cadA was identified to include a Rho-dependent terminator that starts

40bps before the stop codon of cadA [233]. Furthermore, the PPG motif is only 144 bps upstream of this rut site. From this table, several genes have rich C>G regions and have weak secondary structure prediction. In future work the genes cadA, hslU, mqo and frmB will be further investigated. Transcript levels will be compared for these genes with and without Psu induction for WT and ∆efp E. coli.

82

Table 7. Common genes in both SILAC [81] and RNA-seq [20] that are down-regulated in ∆efp strain in comparison to WT and contain at least one PPX. These genes have at least a 1.5 fold difference in RNA total reads in the RNA-seq dataset Wt/∆efp ratio Within 200 bps after PPX Oligo-pro Gene Highest initial ∆G SILAC RNA motif Highest C:G% (kcal/mol) for predicted score * seq ** ratio *** RNA secondary structure ****

PPA 62.5:0 -56.9 2.73 2.77 acnB PPG 62.5:12.5 -59.8 PPT 62.5:0 -68.9 PPD 50:12.5 -61.1 malQ 3.49 1.63 PPM 37.5:12.5 -60.4

amiB 2.00 1.61 PPPPPPPP 37.5:12.5 -79.3 PPL 37.5:0 -54.8 cadA 6.78 3.24 PPG 37.5:12.5 -55.6 PPE 62.5:12.5 -51.2 ygdH 3.28 2.54 PPN 62.5:12.5 -59.1 PPV 50:0 -62.8 yghJ 2.25 4.43 PPR 37.5:0 -66.8 PPA 62.5:12.5 -63.8 hslU 2.07 2.63 PPG 37.5:12.5 -56.5 sdaC 2.60 1.58 PPP 37.5:12.5 -68 tyrB 2.15 1.58 PPN 37.5:12.5 -66.1 recJ 2.13 1.95 PPL 37.5:0 -73.2 lysU 2.10 1.54 PPT 62.5:0 -26.7 frmB 2.04 1.92 PPK 50:12.5 -26 mqo 2.98 6.58 PPM 62.5:0 -47.6 ybiU 5.09 3.62 PPG 50:0 -70.1

* These values from SILAC by Peil et al., from [81] ** RNA seq from Elgamal et al., [20] *** C and G %s were calculated within windows of 8 bps **** ∆G prediction for RNA secondary by Mfold [234]

83

4.2.5.2 Potential intrinsic termination sites

An intrinsic terminator when transcribed into RNA, forms a GC-rich stem–loop structure

(or hairpin) followed by a 7- to 8 -nt stretch that is U-rich [232]. For all the PPX containing genes with greater than 2 fold more RNA reads in WT/∆efp the transcript sequence was fed into the ARNold program to detect rho-independent terminators [201,

223-225]. From a total of 131 genes, 30 where predicted to form a hairpin (Table 8).

Similarly, for all the PPX containing genes with greater than 2 fold more RNA reads in

∆efp/WT rho-independent terminators were predicted [201, 223-225]. From a total of 124 genes, 19 where predicted to form a hairpin (Table 9). The presence of an intrinsic terminator downstream -of at least one of the PPXs - for the genes in Table 8 was detected for ~ 43.3 % of the genes while the corresponding percent in Table 9 is ~ 31.6 % of the genes. The presence of an intrinsic terminator downstream of PPX in absence of

EF-P could cause uncoupling of translation from transcription. In future work transcription of yahG, xanP, brnQ and proY will be further investigated by northern blot to detect if truncated transcripts at these predicted TTS can be detected in WT or the ∆efp

E. coli strain. In Table 9 about 63 % of the genes have the PPX downstream of a predicted intrinsic TTS or within. This suggests that if transcription terminations occurred in those genes due to these predicted TTS, the uncoupling would occur even before the ribosome is hindered by EF-P. The gene nrdD is particularly interesting since one of its

PPXs (PPL) is within the predicted TTS. In future work transcription of nrdD will be investigated by northern blot to detect if this truncated transcript is formed, and to what extent, in WT and ∆efp E. coli strains.

84

Table 8. Prediction of intrinsic terminators for genes with RNA reads more than 2 fold higher in WT versus ∆efp bps PPX 5 TTS Detected Secondary structures and free energy of stem-loop region between Gene ` PPX Before or (kcal/mol) * TTS & PPX end After TTS **

yegT 156 GTCGCCAATTCTGGTTGGCTCCATCACTGACCGcTTTTTCTCGGCG -5.6 PPA After 351

yfjR 115 GTGTCAGTGCGCACGCTGCGGCGTGaTTTTCGTGAACG -7.6 PPE After 538

PPG Before 90 yahG 211 GAGCGTATTATCCAGTCGCATCCGGTGCTGATTGGTTTTGATCAGGC -5.6 PPI After 49 PPF After PPM 80 aslA 352 GACGCCGTTGCCAGCCAGGGGCTGaTTTTAACTTCGG -6.6 PPH After PPR 85 PPQ Before 373 citT 473 CCAACACCGCGCGTACCGGGGGTACGgTTTTTCCGGTCA -7.2 PPL After 11 PPS cysJ 1426 GCCGACGAAGCGCCAGGTAAAAACTGGCTGTTCTTTGGTA -8.9 Before PPE 1047 ugpE 531 GCGGATCGACGGCGCATCGCCAATGCGCTTCTTTTGCGAC -8.2 PPV After 207 efeO 18 CCGTAACGCATTGCAGTTGAGCGTGGCTGCGcTGTTTTCTTCTG -7.5 PPS After 704 fruA 844 GTGCCGGTACTGGCAGGTTATATTGCCTTTTCCATTGCC -6.1 PPL After 435 107 TGTTTGCCGCCTGTCAGCATCTGCTGGCGaTGTTCGTTGCGG -9.6 PPL Before 10 xanP 253 ATTCAAATTAAGGCCTGGGGTCCGGTTGGCTCCGGGCTgTTGTCTATTCAGG -12.1 PPE After 1070 PPT 817 mdtO 124 CTGGTGATTCTGATCTCGATGACCTTTGAGATCccTTTTGTGGCGTT -9.6 After PPS

Continued

85

Table 8 continued PPX bps 5' TTS Detected Secondary structures and free energy of stem-loop region Gene PPX Before or between end (kcal/mol) * After TTS TTS & PPX ** 25 GAAAAGGACAAAATCAATCCGGTGGTGTTTTACACCTCCGCCGGACTGATTT -2.5 PPE After 281 betT TGTTGTTTTCCC 1185 CTTCTACAGCCTGCTGGCGCAGTATCCGGCGTTTACCTTTAGC -6.7 PPE After 495 PPM Before 166

brnQ 263 TTTGTTACCTGGCGGTGGGGCCGCTTTTCGCTACGC -6.6 PPC 746 After PPM fhuC 47 PPS After 96 ATATCTCCTTTCGTGTGCCCGGGCGCACGcTTTTGCATCCGC -11.8 PPA PPQ Before 86 hyfR 1478 AAATGGTTGAAGATCGCCAGTTTCGCAGCGATCTCTTTTATCGCC -7.6 PPP 1336 PPL After

potF 209 GCAAATTAATGGCCGGGAGTACCGGCTTTGATCTGGTG -10.4 PPA After 762

proY 1209 AACGACCATCGGCGGGCTGATTTTCCTGCTCTTTATTATCG -6.3 PPE Before 41

253 AAAATGGATGTCCTGTTTGACGGTGCGGGTTTTATCTTCGC -5.2 PPG Before 96 psuT 355 GGCGTGATGGGGCTGCTGATTCGCATCCTTGGCAGCaTTTTCCAGAAAG -9.7 PPK After 314

sfmD 2291 TGCCGACGCAAGGCGCGTTGGTTCGTGCCaaTTTTGATACCCG -11 PPE Before 1777

ycjM 265 TTTTATCCATGGTCATCTGATGATGGCTTTTCGGTAATT -7.2 PPL After 591

PPL 1129 yhdP 1419 TAAAGCCAAAGCCGTCCATGCGCGCGGCGGTTTTCGTTACCT -5 After PPM

Continued 86

Table 8 continued PPX bps between 5' TTS Detected Secondary structures and free energy of stem-loop region Gene PPX Before or TTS & PPX end (kcal/mol) * After TTS ** PPE Before yhhJ 770 CGCTGTTTATGCTGGGCGTGGCGCTCAGTCTGTTTGCCAC -8.7 PPN 457 yihQ 1953 TAATGCGCCCATCGGCAAGCCGCCGGTCTTTTATCGCG -6.9 PPV Within PPPP Before 514 entS 1125 ACCGGTTGCTTCCGCAAGCGCGAGCGGTTTTGGTTTGTT -5.7 PPQ After 58

fdrA 530 CGTCGATGATTGCCGGCACACCGCTGGCTTTTGCTAACGT -10.7 PPA After 214

PPK Before ymfE 573 TTTTGTAATTTTATCATTCTCAGTGAGTGGTATTATCTTTGCTT -7.4 PPI 416

87 yphG 3223 TGGCCGAAAGCGGCATTATTCACCACCGTGATGCCTTTTATTTTTAA -9.6 PPD Before 2598 ytfF 205 ATGCTCACTATGATGGGCAACCTCATCTATTACTTCTGC -5.5 PPL After 615 rsmJ 17 TTGATGAAACAGGCACCGGAGACGGTGCCTTATCTGTTCTG -17.9 PPL After 612

ycjR 20 ATCAGGCGTTCTTTCCGGAAAACATTCTGGAGAaaTTTCGTTATATC -8.8 PPM After 261

* Prediction of intrinsic terminators done with ARNold [201, 223-225]. Secondary structures colored such that loops are in red and stems in blue. Lowercase letters in RNAmotif predictions indicate the spacer element, between the stem-loop and T-rich region. ** This column shows the # of nts between the predicted transcription terminator site (TTS) and the closest PPX

87

Table 9. Prediction of intrinsic terminators for genes with RNA reads more than 2 fold higher in WT versus ∆efp bps PPX 5' TTS Detected Secondary structures and free energy of stem-loop region between Gene PPX Before or end (kcal/mol) * TTS & After TTS PPX ** PPF 191 fliI 194 TAGAAAGCGAAGTCGTTGGCTTTAACGGTCAACGGCTGTTTTTAATGC -9.6 PPA After PPS 195 GAAAAACCTGCTCAGTGTGGAAGGATTACACTGGTTTTTACCCAAT -7.5 abgT PPM After 261 TTTTGCTCCACTTGGTGCGATCCTGGCGCTGGTTTTAGGTGCCG -5.5 140 PPW clcB 1130 TATGTGAAATGACCGGGGAGTATCAGCTACTCCCCGGTTTATTGATTGCCT -14.8 Before PPL 247 cydC 474 TGATTTCACCCTCGCCTTTACGCTGGGCGGcaTTATGTTACTGA -7.7 PPL After 14 88

-7.1 PPK Before 233 flhC 396 GCAACTTTCCAGCTGCAACTGCTGCGGCGGCaaTTTTATTACCCA PPS After 43 gfcD 1374 TACCGACCACCTGCTGCTTGATGGCGGTATTTTCACCAATA -7.9 PPT After 344 metE 1223 ATGTCTATGAAGTGCGTGCTGAAGCCCAGCGTGCGCgTTTTAAACTGCC -9.4 PPI After 283 mtlA 75 CGCGTTTATCGCGTGGGGTATCATCACCGCGTTATTTATTCCA -6.3 PPP After 1218 PPK Before 1183 nrdD 1781 AAGCGCCTTACCCGCCGCTGGCGAACGGTGGTTTCATTTGCTA -9.3 PPL Within 89 ACGGTACACCCCTGCGCCAGGTACCCCTGGCGCAGcgTTTTGTTATTCC -15.1 16 rsxC PPE Before 1005 GATTATGGGTGGCCCGCTAATGGGCTTTACCTTGCCA -7.1

Continued

88

Table 9 continued PPX 5' TTS Detected Secondary structures and free energy of stem-loop region bps between Gene PPX Before or end (kcal/mol) * TTS & PPX ** After TTS PPG sapC 789 GTGGACTGTCATGCTGCCAGGTGCGGCAaTTATGATTAGCG -8.2 Before 587 PPS PPF 108 speC 148 TTGCTTAAGCGCACCGGTTTTCATCTACCGGTGTTTTTGTATTCC -11.5 After PPV

ACCAGCTCCACTGCGGGAACGTCTTACTTCCCGTATATCTTTATG tnaB 547 -10.6 PPA After 403 GC PPV Before treF 1597 GGCTGGACTAACGGTGTGGTACGCCGTTTAATTGGTTT -6.7 PPN 663

385 TTACTGGCTCCCTGGCAAGGTCTGAATGCCAGcTTTTCGCCTGTG -7.6 After 89

yfiQ PPL CTCAGGAGTTGCGGGTTGTGGTTGAGCACGATCCGgTTTTCGGG 1745 CCGT -10.6 62

PPF 550 ygiS 311 CCGACGGTCAGCCTCTGACGGCAGAGGaTTTTGTCCTCGG -9 After PPE AAAACTGCCTACCGATGCGCAAAAAGTGGCGTTGGgTTTTGCGC PPT After 149 yhjA 534 -9.1 TGTA PPL Before 208 TCCGGCATATCGCCATATTGGCGGTCATGGATATGGCTCTATTTA ykgF 966 -8.9 PPS Before 512 TCCA GGGCGATGCCTTCGCAGAACGCAACGGCTGCGGTTTTACCTTTG narY 1398 -6.6 PPL Before 323 G

* Prediction of intrinsic terminators done with ARNold [201, 223-225]. Secondary structures colored such that loops are in red and stems in blue. Lowercase letters in RNAmotif predictions indicate the spacer element, between the stem-loop and T- rich region. ** This column shows the # of nts between the predicted transcription terminator site (TTS) and the closest PPX 89

4.3 Discussion

In this work, we showed that an EF-P dependent motif upstream of a rut site or an intrinsic terminator caused decreased transcription of a downstream gene only in the absence of EF-P. This indicated that by stalling at oligo-proline motifs, ribosomes in ∆efp cells could uncouple transcription from translation at these downstream TTS. Therefore, we propose that EF-P could affect gene expression at the level of transcription, and not only at the translational level. Control of gene expression at the level of transcription termination is a common mechanism in bacteria with a range of molecular mechanisms moderating the termination activity of RNA polymerase. RNAP responds to two different sets of signals for termination: Rho-dependent terminators or intrinsic terminators [124].

Rho utilization (rut) sites are characterized by high C/low G content, and relatively little secondary structure [121, 132]. In this work we utilized a rut site of 30 nt composed of pure rUrC. This simple polypyrimidine repeat was reported by Guérin et al. to act as an artificial Rho-dependent terminator in vivo and in vitro [217]. They observed termination only in the absence of protein synthesis due to uncoupling translation from transcription. Similar to their observations, we detected Rho-dependent transcription termination only in ∆efp cells that included an oligo-proline motif upstream of the TC repeats. This termination was not detected when the upstream motif was proline-free

(Figure 20). This indicated that the translational stall caused by an oligo-proline motif was sufficient to cause uncoupling in the absence of EF-P.

Similarly, we investigated intrinsic terminators using the pyrL intrinsic terminator upstream of mCherry. Regulation of the pyrimidine biosynthetic operon (pyrBI) is an

90

example of transcription attenuation in E. coli [151] that utilizes coupling of translation to transcription of a leader peptide. Tight coupling across the leader peptide is necessary to prevent formation of the terminator [130]. With this reporter, we detected a significant increase in termination efficiency in ∆efp cells that included an oligo-proline motif upstream of pyrL compared to a proline-free motif. This increase in termination efficiency was not detected in WT (Figure 3).

Transcription and translation are coupled in both bacteria and archaea, raising the possibility that this new role for EF-P may also be detectable for the archaeal EF-P homologue, aIF5A. In eukarya, although there is no coupling, it is possible that eIF5A has a role in the no-go decay (NGD) mechanism which recognizes mRNAs with translation elongation stalls and targets them for endonucleolytic cleavage [189].

In this work we describe two examples where the presence of EF-P decreases transcription termination, and it is conceivable that the opposite may also happen in other instances. It is well known that termination control mechanisms have been shown to either increase or decrease termination efficiency [124]. For intrinsic termination sites, ribosomes stalling at oligo-proline motifs may favor formation of a terminator or antiterminator RNA structure, thus enhancing or decreasing transcriptional termination efficiency. For rut sites, although the classical scenario with a stalled ribosome is to cause transcriptional termination, there are reports of the opposite. For example, in the E. coli tna operon, the stalled ribosome occupies the overlapping rut sequences, preventing Rho binding and increasing transcription of the tna operon [235]. Similarly, in the riboswitch mgtA in Salmonella, when proline levels are limiting, the ribosome stalls during

91

translation of a short proline rich leader (mgtL). This stall favors the formation of a stemloop that hides the region where Rho interacts. In contrast, complete translation of mgtL would promote Rho-dependent transcription by exposing this region.[236].

Interestingly, rut sites and PPX motifs share in common C richness. Since the amino acid proline is coded by CCN, having a PPX guarantees at least 50% C.

Furthermore, C-rich mRNA regions as short as 8 nts have been identified as rut sites [96,

236]. Thus, it would not be surprising for a rut site itself to include a PPX, which would protect it from being accessed by Rho during translation [237, 238].

We propose that the presence of EF-P during translation of oligo-proline motifs plays a dual role, first is to maintain a proper translation elongation rate across this mRNA region and second to maintain coupled transcription-transcription across cryptic

(potential) TTS. We name this new role EF-P Altered Transcription Termination

(EATT). Extensive research guided by bioinformatics will be necessary to identify these potential EATT sites and understand their contribution to gene expression.

4.4 Materials and Methods

4.4.1 Strains

Salmonella wild-type (SL1344)

E. coli wild-type (BW25113) and efp mutant were from the Keio knockout collection and kanamycin cassettes were removed using FLP recombinase [36, 168].

BW25113 B2: BW25113, rpoB2, RifR

BW25113 B8: BW25113, rpoB8, RifR

Salmonella Rho internal labeling: Done in Bossi Lab in France, described in [239]

92

4.4.2 Plasmids

For EF-P dependent expression assays in E. coli, the pBAD30XS translation fusion plasmid was used as described previously [20, 80]. The plasmid contains an arabinose- inducible tandem fluorescence cassette consisting of GFP followed immediately by the mCherry SD sequence and mCherry. pBAD30XS was modified by replacing the gfp with sfgfp while maintaining the xhoI-speI cloning site. mCherry was improved by introducing an i-tag before mcherry [240]. Also a 3HA tag was placed directly after the start codon of sfgfp. The new plasmid was termed pBAD30HA.

-rut reporter, pBAD30HATC: Cloned by inserting in pBAD30HA a linker (SGSGSGSG) followed by (TC)15 using the speI site in gfp to insert the double strand oligo hybridization product of primers STC1 and STC2. pBAD30HATC was further double digested with XhoI and speI and PPPPPP and PPG were inserted by double strand oligo hybridization to insert the double strand oligo hybridization product of primer pairs (PG1

& PG2)and (6P1& 6P2), respectively.

-Hairpin reporter, pBAD30HAPYT: A PCR product amplifying only itag-mcherry from pBAD30HA with a forward primer introducing SpeI-Pcil (SPT1) and a reverse primer with XbaI (SPT2) was ligated back into pBAD30HA (double digested with SpeI and

XbaI) to make pBAD30HAT. To introduce hairpin, pyrBI operon leader peptide

(ATGGTTCAGTGTGTTCGACATTTTGTCTTACCGCGTCTGAAAAAAGACGCTG

GCCTGCCGTTTTTCTTCCCGTTGATCACCCATTCCCAGCCCCTCAATCGAGGG

GCTTTTTTTTGCCCAGGCGTCAGGAGA) amplified from E. coli with primers introducing SpeI (PFS) and Pcil (PRP) was inserted inframe into pBAD30HAT cut with

SpeI and Pcil. This plasmid called pBAD30HAPYT was further double digested with 93

XhoI and speI and PPPPPP was inserted by double strand oligo hybridization with primer pairs (6P1& 6P2) to produce pBAD30HAPYT2

-Psu production from pIA280PSU

Psu production was driven by Ptrc. pIA280PSU was cloned by double digesting plasmid pIA280 with bglII and HindIII to insert psu (ordered gene block from IDT). A mutant psu, E56K (unable to bind Rho or inhibit Rho-dependent termination [144] ) was also cloned (pIA280PSU56)

4.4.3 Fluorescence assay in E. coli

As described previously, overnight cultures of E. coli harboring pBAD30XS constructs were used to inoculate M9 media to an OD600 of 0.05 [20, 80]. The M9 media was supplemented with 0.2% glycerol, 0.5 g/l, tryptone, 0.3 g/l thiamine, 0.2% arabinose, 50

µg/ml kanamycin and 100 µg/ml ampicillin. Cultures were grown at 37°C with shaking.

After 2hrs cultures were induced with 0.2 mM IPTG. After 5 more hrs fluorescence was measured using a spectrofluorometer (Horiba) with excitation at 481nm and emission at

507nm for GFP or excitation at 587nm and emission at 610nm for mCherry.

4.4.4 Isolation and analyses of in vivo RNA transcripts

After fluorescence measurements, 4 ml of culture was pelleted and stored in RNAlater®

Stabilization Solution (Ambion) at 4ºC. Total bacterial RNA was extracted as follows:

RNAlater® was removed and the pellet was resuspended in REB buffer and extracted as described in [241]. Northern blot analyses was performed by fractionating 20 µg of total

RNA on a 1.5% agarose gel containing 6% formaldehyde [242, 243]. The RNAs were transferred onto to Zeta-probe cationized, nylon membrane (Bio-Rad) by Capillary gel-

94

transfer [244]. RNA was then UV cross-linked to the membrane followed by prehybridization for 2-4 h at 50ºC in a solution containing 5X SSC, 20mM Na2HPO4 pH

7.2, 1X Denhardt solution, 7% SDS and 100 µg/ml denatured salmon sperm DNA. After addition of the denatured [32P]-5´-end labeled DNA probe, the hybridization was allowed to proceed for 18 h at 50ºC. The membrane was washed at 50ºC for 30 min: twice in 3X

SSC, 10X Denhardt solution, 25mM NaH2PO4 pH 7.5 & 5% SDS and once in 1% SSC &

1% SDS. For dot blos: The RNA was prepared as described in [245] then loaded to the nylon membrane using the Bio-Dot® Microfiltration Apparatus. The membrane was subsequent processed as mentioned above.

Table 10. DNA oligonucleotides utilized Name Sequence (5'-3')

STC1 CTAGTGGCAGCGGCAGCGGCAGCGGCAGCGACGTCTCTCTCTCTCTCT CTCTCTCTCTCTCTCGGCCGG STC2 CTAGCCGGCCGAGAGAGAGAGAGAGAGAGAGAGAGAGAGACGTCGC TGCCGCTGCCGCTGCCGCTGCCA PG1 TCGAGCCGCCGGGTA

PG2 CTAGTACCCGGCGGC

6P1 TCGAGCCGCCGCCGCCGCCGCCGA

6P2 CTAGTCGGCGGCGGCGGCGGCGGC SPT1 CCTAACACTAGTGGCAACACATGTCCGACATTAGAAATAGCACAAAA AAAAGTGAGC SPT2 CAGAATTTGCCTGGCGGCAGTA

PFS GGCAACACTAGTATGGTTCAGTGTGTTCGACATTTTGTC

PRP ATGTCGGACATGTTCTCCTGACGCCTGGGCAAAAAAAAGC

mCherry probe TGTTATCTTCTTCGCCCTTGCTCACTTTTTTTTGTGCTATTTCTAATGTC

HA probe GTTGCCCTCGAGTTTGCTAGCGTAATCTGGAACGTCATATGGATAGG 5S probe) (ppB10, [246]) ACACTACCATCGGCGCTACG

95

Chapter 5: Conclusions and Perspectives

Gene expression is an intricate multistep process. The production of a protein starts with transcription of the genetic code on DNA into mRNA. The mRNA is translated by ribosomes to produce the nascent peptide chain that must fold correctly in order to produce an active protein. Every stage of this journey is subjected to regulation and fine-tuning. Cross talk between transcription and translation enables the bacterial cell to adjust transcription according to its translation capacity, this conserves cellular resources by preventing accumulation of mRNAs that cannot be translated.

Translation proceeds rapidly (~ 1 to 20 amino acids/ second, depending on organism and conditions [247]), and with great fidelity [248]. With the incorporation of amino acids, the nascent polypeptide must pass through a 100 Å channel through the large ribosomal exit tunnel. This narrow exit can only accommodate α-helical conformations. This entails that the nascent protein cannot fold into tertiary structure until it is released [249]. Also, protein folding is kinetically coupled to the emergence of the nascent peptide and in many cases the folding outcome can be manipulated by ribosome effects [250], polypeptide elongation rate [251] and cotranslational interactions with chaperones [252]. All of these factors can bias kinetically competing folding events to produce alternative stable structures with different functional properties [253, 254].

Translational stalling at polyproline-stretches has been shown in bacteria [19, 20,

80] and eukaryotes [255]. Relief of such stalls in bacteria requires the translation 96

elongation factor, EF-P. In chapter 2 we performed ribosome profiling to monitor local translational rates on a genome wide level in wild type and ∆efp E. coli cells. Prior to conducting the experiment, the available proteomic data [80] indicated that PPP, PPG and

APP were overrepresented in the proteins downregulated in the efp mutant strain. Our ribosome profiling data confirmed the dependency of some oligo-proline motifs on EF-P for proper translation. This was evident from increased ribosome density at these motifs in the ∆efp mutant. While analyzing the ribo-seq data we determined a strong pause site to be a site where ribosome density was at least 10 fold greater than the average reads/bp of the gene [150]. The percent of PPX-containing genes that met this criterion was only

2.8 % for ∆efp mutant. Recently Woolstenhulme et al. [256] reported that -with a higher resolution data analysis by 3` assignment strategy– they detected 47% of PPX instances to have a pause score above 10 in the ∆efp mutant. Interestingly, they detected in WT 88 instances of PPXs that had a pause score above 10. In our ribo-seq data we only detect 2 of those instances.

Furthermore, they detect pausing scores in the ∆efp mutant close to 1000 and in

WT as high as 100. In our data set, the highest score we detected in ∆efp mutant was 45 and for WT was 10.75. This indicates that the 3` assignment strategy [257] provides much higher resolution and emphasizes that Pro has a relatively high pause score in the P site even in WT [256]. Another difference between our and Woolstenhulme et al.’s ribo- seq data sets is the footprint size selection. Our data was generated using footprint sizes of 28-42 bps [149] while they selected 20-40bps. It is worth noting that in the standard ribosome profiling method, only monosome-protected fragments are sequenced. This has

97

been reported by Subramaniam et al. to underestimate the in vivo extent of ribosome occupancy in the incidence of ribosome traffic jams [187]. Deep sequencing of longer mRNA fragments (50–80 nt) exposed a rise in average disome occupancy up to 90 nt upstream of CUA codons during leucine starvation which was suggestive of a traffic jam between two to three ribosomes behind the paused ribosome. Subramaniam et al.concluded that the paused ribosome and the trailing ribosomes both had an equal probability of occurring either as monosomes or as disomes upon nuclease treatment

[187].

In chapter 3, the data obtained from our ribo-seq experiments were further validated utilizing an in vivo reporter system. These experiments showed differences between EF-P dependencies of various ZPPX’s. While a ZPPX may show high EF-P dependency in the in vivo reporter, the ribo-seq data sometimes showed varying pausing scores for this same ZPPX. Another example of this discrepancy is the α- and β-subunits of ATP synthase, AtpA and AtpD. Although both are translated from a polycistronic mRNA and contain a PPG motif,, proteomic analysis revealed that AtpD levels were strongly dependent on EF-P, whereas AtpA levels were independent. This led us to investigate the first stage of translation. The stage of translation initiation is typically the primary regulatory step. Our work demonstrated that oligoproline-induced stalls exerted a net effect on protein levels only if they limited translation significantly more than initiation.

Although the available body of knowledge regarding EF-P is increasing, we still lack insight on why a higher percent of proteins are upregulated rather than

98

downregulated in the absence of EF-P. We also lack any understanding on how/if EF-P affects transcription. In chapter 4, we investigated if pausing at oligo-proline motifs upstream of transcription termination sites (Rho-dependent or intrinsic) could uncouple transcription from translation in the absence of EF-P. Our data suggested uncoupling evident from the reduction of downstream transcription level and protein production.

Perspectives

Although the information necessary to produce a protein is encrypted in its amino acid sequence, a delicate balance between different components of the cellular gene expression machinery is needed to achieve its proper 3D structure. For example, translation is coupled to transcription in bacteria, this requires that bacteria maintain translation elongation progress to maintain transcription. On the other hand, translation elongation must slow down at certain locations to allow for co-translational folding of specific peptide regions. Goldman et al. [258] showed that having a folding domain near the ribosomal exit tunnel could exert force on the stalled ribosomal nascent chain such that translation could resume. This indicated that protein folding could modulate translational dynamics. They concluded that force generated by either nascent chain folding or chaperone binding could constitute an important mechanism in regulating translation elongation. Cymer et al. [259] investigated whether translational stalling at a diproline stretch in the absence of EF-P could be overcome by pulling on the nascent chain. They found that, similar to the arrest peptide in SecM, stalling at diproline could only be relieved by a very strong pulling force and that the potency of the PP stalling

99

motif was affected by the identity of the preceding amino acid. Furthermore, they found that translational stalling induced by the most potent diproline stretches in the absence of

EF-P could be overcome by very strong pulling forces.

Upon considering all these intricate factors, it seems EF-P could be very critical for production of proteins that contain diprolines, are highly translated and lack an upstream domain whose folding would generate a directional force driving translation completion.

On the other hand, from the prospective of translation-transcription coupling, stalling at diprolines could expose a transcription termination site (TTS) that is usually inactive due to efficient translation elongation. Another possibility is that increased ribosome density at EF-P dependent motifs on the mRNA could suppress a TSS if it overlaps with, or is very close to, it. Overall, EF-P might be playing an important role both in the cross talk between transcription & translation on one hand, and on the other hand between translation and protein folding. Extensive research guided by bioinformatics will be necessary to elucidate the impact of this factor on various aspects of gene expression regulation.

Not only does EF-P serve as a promising drug target for anti-infective agents, but it is serves as a model for eIF5A. eIF5A, the eukaryotic homolog of EF-P, has also been shown to stimulate the synthesis of proteins containing polyproline motifs [255]. eIF5A is involved in promoting translation of particular mRNAs encoding signaling proteins involved in cell cycle progression and inflammation, linking this factor to both cancer and diabetes. eIF5A has been shown in yeast to play a role in mRNA turnover [260] and

100

although there is no coupling of transcription to translation in eukaryotes, it is possible that eIF5A has a role in the no-go decay (NGD) mechanism. NGD recognizes mRNAs with translation elongation stalls and targets them for endonucleolytic cleavage [189].

The initial evidence for NGD came from experiments using a reporter transcript that had an RNA stem-loop structure in the middle of the open reading frame of a budding yeast gene [189]. The ribosome stalled during translation elongation due to the formation of an RNA-stem loop structure [261], which was then followed by endonucleolytic cleavage. Furthermore, the transcript cleavage appeared to be translation-dependent, since in the absence of translation, the cleavage was prevented

[189]. A review by Harigaya and Parker [262] stated a variety of reasons that could induce elongation stalls, these included insertion of two consecutive proline residues ahead of a stop codon.

In summary, although the main (and possibly only) role of EF-P is to facilitate translation of proteins with oligo-proline motifs, nevertheless its effect on gene regulation can be implemented through various mechanisms. These mechanisms may be related to mRNA (maintaining coupling of translation to transcription, preventing mRNA cleavage at the A-site due to translational stalls, and preventing mRNA folding during translational stalls) or related to the produced protein (capability/speed of production, correct stoichiometry of protein subunits in protein complexes or correct folding) and ribosome recycling. All these possible mechanisms of action suggest that EF-P acts as global factor in gene regulation.

101

References

1. Russell JB, Cook GM (1995) Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiol Rev 59: 48-62 2. Schmeing TM, Ramakrishnan V (2009) What recent ribosome structures have revealed about the mechanism of translation. Nature 461: 1234-42 3. Laursen BS, Sorensen HP, Mortensen KK, Sperling-Petersen HU (2005) Initiation of protein synthesis in bacteria. Microbiol Mol Biol Rev 69: 101-23 4. Keiler KC (2015) Mechanisms of ribosome rescue in bacteria. Nature Reviews Microbiology 13: 285-297 5. Moore SD, Sauer RT (2005) Ribosome rescue: tmRNA tagging activity and capacity in Escherichia coli. Mol Microbiol 58: 456-66 6. Chadani Y, Ono K, Ozawa S, Takahashi Y, Takai K, Nanamiya H, Tozawa Y, Kutsukake K, Abo T (2010) Ribosome rescue by Escherichia coli ArfA (YhdL) in the absence of trans-translation system. Mol Microbiol 78: 796-808 7. Gonzalez de Valdivia EI, Isaksson LA (2005) Abortive translation caused by peptidyl-tRNA drop-off at NGG codons in the early coding region of mRNA. FEBS J 272: 5306-16 8. Cruz-Vera LR, Magos-Castro MA, Zamora-Romo E, Guarneros G (2004) Ribosome stalling and peptidyl-tRNA drop-off during translational delay at AGA codons. Nucleic Acids Res 32: 4462-8 9. Chadani Y, Ono K, Kutsukake K, Abo T (2011) Escherichia coli YaeJ protein mediates a novel ribosome-rescue pathway distinct from SsrA- and ArfA-mediated pathways. Mol Microbiol 80: 772-85 10. Kurita D, Chadani Y, Muto A, Abo T, Himeno H (2014) ArfA recognizes the lack of mRNA in the mRNA channel after RF2 binding for ribosome rescue. Nucleic Acids Res 42: 13339-52 11. Karzai AW, Susskind MM, Sauer RT (1999) SmpB, a unique RNA-binding protein essential for the peptide-tagging activity of SsrA (tmRNA). EMBO J 18: 3793-9 12. Keiler KC, Waller PR, Sauer RT (1996) Role of a peptide tagging system in degradation of proteins synthesized from damaged messenger RNA. Science 271: 990-3 13. Bessho Y, Shibata R, Sekine S, Murayama K, Higashijima K, Hori-Takemoto C, Shirouzu M, Kuramitsu S, Yokoyama S (2007) Structural basis for functional mimicry of long-variable-arm tRNA by transfer-messenger RNA. Proc Natl Acad Sci U S A 104: 8293-8

102

14. Shoji S, Janssen BD, Hayes CS, Fredrick K (2010) Translation factor LepA contributes to tellurite resistance in Escherichia coli but plays no apparent role in the fidelity of protein synthesis. Biochimie 92: 157-63 15. Li G-W, Oh E, Weissman JS (2012) The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484: 538-541 16. Schrader JM, Zhou B, Li GW, Lasker K, Childers WS, Williams B, Long T, Crosson S, McAdams HH, Weissman JS, et al. (2014) The coding and noncoding architecture of the Caulobacter crescentus genome. PLoS Genet 10: e1004463 17. Cruz-Vera LR, Sachs MS, Squires CL, Yanofsky C (2011) Nascent polypeptide sequences that influence ribosome function. Curr Opin Microbiol 14: 160-6 18. Ude S, Lassak J, Starosta AL, Kraxenberger T, Wilson DN, Jung K (2013) Translation elongation factor EF-P alleviates ribosome stalling at polyproline stretches. Science 339: 82-5 19. Doerfel LK, Wohlgemuth I, Kothe C, Peske F, Urlaub H, Rodnina MV (2013) EF-P is essential for rapid synthesis of proteins containing consecutive proline residues. Science 339: 85-8 20. Elgamal S, Katz A, Hersch SJ, Newsom D, White P, Navarre WW, Ibba M (2014) EF-P dependent pauses integrate proximal and distal signals during translation. PLoS Genet 10: e1004553 21. Doerfel LK, Wohlgemuth I, Kubyshkin V, Starosta AL, Wilson DN, Budisa N, Rodnina MV (2015) Entropic contribution of elongation factor P to proline positioning at the catalytic center of the ribosome. Journal of the American Chemical Society 22. Garza-Sánchez F, Shoji S, Fredrick K, Hayes CS (2009) RNase II is important for A-site mRNA cleavage during ribosome pausing. Molecular Microbiology 73: 882-897 23. Janssen BD, Garza-Sánchez F, Hayes CS (2013) A-site mRNA cleavage is not required for tmRNA-mediated ssrA-peptide tagging. PloS One 8: e81319 24. Hayes CS, Sauer RT (2003) Cleavage of the A site mRNA codon during ribosome pausing provides a mechanism for translational quality control. Molecular Cell 12: 903- 911 25. Sunohara T, Jojima K, Tagami H, Inada T, Aiba H (2004) Ribosome stalling during translation elongation induces cleavage of mRNA being translated in Escherichia coli. The Journal of Biological Chemistry 279: 15368-15375 26. Sunohara T, Jojima K, Yamamoto Y, Inada T, Aiba H (2004) Nascent-peptide- mediated ribosome stalling at a stop codon induces mRNA cleavage resulting in nonstop mRNA that is recognized by tmRNA. RNA (New York, NY) 10: 378-386 27. Garza-Sánchez F, Gin JG, Hayes CS (2008) Amino Acid Starvation and Colicin D Treatment Induce A-site mRNA Cleavage in Escherichia coli. Journal of Molecular Biology 378: 505-519 28. Pedersen K, Zavialov AV, Pavlov MY, Elf J, Gerdes K, Ehrenberg M (2003) The bacterial toxin RelE displays codon-specific cleavage of mRNAs in the ribosomal A site. Cell 112: 131-40 29. Christensen SK, Gerdes K (2003) RelE toxins from bacteria and Archaea cleave mRNAs on translating ribosomes, which are rescued by tmRNA. Molecular Microbiology 48: 1389-1400

103

30. Christensen SK, Pedersen K, Hansen FG, Gerdes K (2003) Toxin-antitoxin loci as stress-response-elements: ChpAK/MazF and ChpBK cleave translated RNAs and are counteracted by tmRNA. Journal of Molecular Biology 332: 809-819 31. Glick BR, Ganoza MC (1975) Identification of a soluble protein that stimulates peptide bond synthesis. Proc Natl Acad Sci U S A 72: 4257-60 32. Cole JR, Olsson CL, Hershey JW, Grunberg-Manago M, Nomura M (1987) Feedback regulation of rRNA synthesis in Escherichia coli. Requirement for initiation factor IF2. J Mol Biol 198: 383-92 33. An G, Glick BR, Friesen JD, Ganoza MC (1980) Identification and quantitation of elongation factor EF-P in Escherichia coli cell-free extracts. Can J Biochem 58: 1312- 4 34. Gerdes SY, Scholle MD, Campbell JW, Balazsi G, Ravasz E, Daugherty MD, Somera AL, Kyrpides NC, Anderson I, Gelfand MS, et al. (2003) Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. J Bacteriol 185: 5673-84 35. Aoki H, Adams SL, Turner MA, Ganoza MC (1997) Molecular characterization of the prokaryotic efp gene product involved in a peptidyltransferase reaction. Biochimie 79: 7-11 36. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2: 2006-0008 37. Yamamoto N, Nakahigashi K, Nakamichi T, Yoshino M, Takai Y, Touda Y, Furubayashi A, Kinjyo S, Dose H, Hasegawa M, et al. (2009) Update on the Keio collection of Escherichia coli single-gene deletion mutants. Mol Syst Biol 5 38. Navarre WW, Zou SB, Roy H, Xie JL, Savchenko A, Singer A, Edvokimova E, Prost LR, Kumar R, Ibba M, et al. (2010) PoxA, yjeK, and elongation factor P coordinately modulate virulence and drug resistance in Salmonella enterica. Mol Cell 39: 209-21 39. Yanagisawa T, Sumida T, Ishii R, Takemoto C, Yokoyama S (2010) A paralog of lysyl-tRNA synthetase aminoacylates a conserved lysine residue in translation elongation factor P. Nat Struct Mol Biol 17: 1136-43 40. Bailly M, de Crecy-Lagard V (2010) Predicting the pathway involved in post- translational modification of elongation factor P in a subset of bacterial species. Biol Direct 5 41. Blaha G, Stanley RE, Steitz TA (2009) Formation of the first peptide bond: the structure of EF-P bound to the 70S ribosome. Science 325: 966-70 42. Glick BR, Chladek S, Ganoza MC (1979) Peptide bond formation stimulated by protein synthesis factor EF-P depends on the aminoacyl moiety of the acceptor. Eur J Biochem 97: 23-8 43. Hanawa-Suetsugu K, Sekine S, Sakai H, Hori-Takemoto C, Terada T, Unzai S, Tame JR, Kuramitsu S, Shirouzu M, Yokoyama S (2004) Crystal structure of elongation factor P from Thermus thermophilus HB8. Proc Natl Acad Sci U S A 101: 9595-600

104

44. Zou SB, Roy H, Ibba M, Navarre WW (2011) Elongation factor P mediates a novel post-transcriptional regulatory pathway critical for bacterial virulence. Virulence 2: 147-51 45. Saini P, Eyler DE, Green R, Dever TE (2009) Hypusine-containing protein eIF5A promotes translation elongation. Nature 459: 118-21 46. Gregio AP, Cano VP, Avaca JS, Valentini SR, Zanelli CF (2009) eIF5A has a function in the elongation step of translation in yeast. Biochem Biophys Res Commun 380: 785-90 47. Aoki H, Xu J, Emili A, Chosay JG, Golshani A, Ganoza MC (2008) Interactions of elongation factor EF-P with the Escherichia coli ribosome. FEBS J 275: 671-81 48. Gilreath MS, Roy H, Bullwinkle TJ, Katz A, Navarre WW, Ibba M (2011) beta- Lysine discrimination by lysyl-tRNA synthetase. FEBS Lett 585: 3284-8 49. Roy H, Zou SB, Bullwinkle TJ, Wolfe BS, Gilreath MS, Forsyth CJ, Navarre WW, Ibba M (2011) The tRNA synthetase paralog PoxA modifies elongation factor-P with (R)-beta-lysine. Nat Chem Biol 7: 667-9 50. Peil L, Starosta AL, Virumae K, Atkinson GC, Tenson T, Remme J, Wilson DN (2012) Lys34 of translation elongation factor EF-P is hydroxylated by YfcM. Nat Chem Biol 51. Zou SB, Hersch SJ, Roy H, Wiggers JB, Leung AS, Buranyi S, Xie JL, Dare K, Ibba M, Navarre WW (2012) Loss of elongation factor P disrupts bacterial outer membrane integrity. J Bacteriol 194: 413-25 52. Rajkovic A, Erickson S, Witzky A, Branson OE, Seo J, Gafken PR, Frietas MA, Whitelegge JP, Faull KF, Navarre W, et al. (2015) Cyclic Rhamnosylated Elongation Factor P Establishes Antibiotic Resistance in Pseudomonas aeruginosa. MBio 6: e00823 53. Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, et al. (2006) Escherichia coli K-12: a cooperatively developed annotation snapshot--2005. Nucleic Acids Res 34: 1-9 54. Park JH, Aravind L, Wolff EC, Kaevel J, Kim YS, Park MH (2006) Molecular cloning, expression, and structural prediction of deoxyhypusine hydroxylase: a HEAT- repeat-containing metalloenzyme. Proc Natl Acad Sci U S A 103: 51-6 55. Park MH, Nishimura K, Zanelli CF, Valentini SR (2010) Functional significance of eIF5A and its hypusine modification in eukaryotes. Amino Acids 38: 491-500 56. Lee SB, Park JH, Kaevel J, Sramkova M, Weigert R, Park MH (2009) The effect of hypusine modification on the intracellular localization of eIF5A. Biochem Biophys Res Commun 383: 497-502 57. Nishimura K, Lee SB, Park JH, Park MH (2012) Essential role of eIF5A-1 and deoxyhypusine synthase in mouse embryonic development. Amino Acids 42: 703-10 58. Chen KY, Liu AY (1997) Biochemistry and function of hypusine formation on eukaryotic initiation factor 5A. Biol Signals 6: 105-9 59. Wolff EC, Kang KR, Kim YS, Park MH (2007) Posttranslational synthesis of hypusine: evolutionary progression and specificity of the hypusine modification. Amino Acids 33: 341-50

105

60. Benne R, Brown-Luedi ML, Hershey JW (1978) Purification and characterization of protein synthesis initiation factors eIF-1, eIF-4C, eIF-4D, and eIF-5 from rabbit reticulocytes. J Biol Chem 253: 3070-7 61. Park MH (1989) The essential role of hypusine in initiation factor 4D (eIF-4D). Purification of eIF-4D and its precursors and comparison of their activities. J Biol Chem 264: 18531-5 62. Zanelli CF, Maragno AL, Gregio AP, Komili S, Pandolfi JR, Mestriner CA, Lustri WR, Valentini SR (2006) eIF5A binds to translational machinery components and affects translation in yeast. Biochem Biophys Res Commun 348: 1358-66 63. Gutierrez E, Shin BS, Woolstenhulme CJ, Kim JR, Saini P, Buskirk AR, Dever TE (2013) eIF5A promotes translation of polyproline motifs. Mol Cell 51: 35-45 64. Eun B, Cho B, Moon Y, Kim SY, Kim K, Kim H, Sun W (2010) Induction of neuronal apoptosis by expression of Hes6 via p53-dependent pathway. Brain Res 1313: 1-8 65. Sun Z, Cheng Z, Taylor CA, McConkey BJ, Thompson JE (2010) Apoptosis induction by eIF5A1 involves activation of the intrinsic mitochondrial pathway. J Cell Physiol 223: 798-809 66. Rahman-Roblick R, Roblick UJ, Hellman U, Conrotto P, Liu T, Becker S, Hirschberg D, Jornvall H, Auer G, Wiman KG (2007) p53 targets identified by protein expression profiling. Proc Natl Acad Sci U S A 104: 5401-6 67. Aitken CE, Puglisi JD (2010) Following the intersubunit conformation of the ribosome during translation in real time. Nat Struct Mol Biol 17: 793-800 68. Clement PM, Henderson CA, Jenkins ZA, Smit-McBride Z, Wolff EC, Hershey JW, Park MH, Johansson HE (2003) Identification and characterization of eukaryotic initiation factor 5A-2. Eur J Biochem 270: 4254-63 69. Clement PM, Johansson HE, Wolff EC, Park MH (2006) Differential expression of eIF5A-1 and eIF5A-2 in human cancer cells. FEBS J 273: 1102-14 70. Xu A, Jao DL, Chen KY (2004) Identification of mRNA that binds to eukaryotic initiation factor 5A by affinity co-purification and differential display. Biochem J 384: 585-90 71. Bearson SM, Bearson BL, Brunelle BW, Sharma VK, Lee IS (2011) A mutation in the poxA gene of Salmonella enterica serovar Typhimurium alters protein production, elevates susceptibility to environmental challenges, and decreases swine colonization. Foodborne Pathog Dis 8: 725-32 72. Kearns DB, Chu F, Rudner R, Losick R (2004) Genes governing swarming in Bacillus subtilis and evidence for a phase variation mechanism controlling surface motility. Mol Microbiol 52: 357-69 73. Ohashi Y, Inaoka T, Kasai K, Ito Y, Okamoto S, Satsu H, Tozawa Y, Kawamura F, Ochi K (2003) Expression profiling of translation-associated genes in sporulating Bacillus subtilis and consequence of sporulation by gene inactivation. Biosci Biotechnol Biochem 67: 2245-53 74. Charles TC, Nester EW (1993) A chromosomally encoded two-component sensory transduction system is required for virulence of Agrobacterium tumefaciens. J Bacteriol 175: 6614-25

106

75. Tsai CJ, Sauna ZE, Kimchi-Sarfaty C, Ambudkar SV, Gottesman MM, Nussinov R (2008) Synonymous mutations and ribosome stalling can lead to altered folding pathways and distinct minima. J Mol Biol 383: 281-91 76. Muto H, Ito K (2008) Peptidyl-prolyl-tRNA at the ribosomal P-site reacts poorly with puromycin. Biochemical and Biophysical Research Communications 366: 1043- 1047 77. Wohlgemuth I, Brenner S, Beringer M, Rodnina MV (2008) Modulation of the Rate of Peptidyl Transfer on the Ribosome by the Nature of Substrates. Journal of Biological Chemistry 283: 32229-32235 78. Lummis SC, Beene DL, Lee LW, Lester HA, Broadhurst RW, Dougherty DA (2005) Cis-trans isomerization at a proline opens the pore of a neurotransmitter-gated ion channel. Nature 438: 248-52 79. Doerfel LK, Rodnina MV (2013) Elongation factor P: Function and effects on bacterial fitness. Biopolymers 99: 837-845 80. Hersch SJ, Wang M, Zou SB, Moon KM, Foster LJ, Ibba M, Navarre WW (2013) Divergent protein motifs direct elongation factor P-mediated translational regulation in Salmonella enterica and Escherichia coli. MBio 4: e00180-13 81. Peil L, Starosta AL, Lassak J, Atkinson GC, Virumae K, Spitzer M, Tenson T, Jung K, Remme J, Wilson DN (2013) Distinct XPPX sequence motifs induce ribosome stalling, which is rescued by the translation elongation factor EF-P. Proc Natl Acad Sci U S A 110: 15265-70 82. Chang YY, Cronan JE, Jr. (1982) Mapping nonselectable genes of Escherichia coli by using transposon Tn10: location of a gene affecting pyruvate oxidase. J Bacteriol 151: 1279-89 83. Bustamante VH, Martinez LC, Santana FJ, Knodler LA, Steele-Mortimer O, Puente JL (2008) HilD-mediated transcriptional cross-talk between SPI-1 and SPI-2. Proc Natl Acad Sci U S A 105: 14591-6 84. Haraga A, Ohlson MB, Miller SI (2008) Salmonellae interplay with host cells. Nat Rev Microbiol 6: 53-66 85. Chen LM, Kaniga K, Galan JE (1996) Salmonella spp. are cytotoxic for cultured macrophages. Mol Microbiol 21: 1101-15 86. Hersh D, Monack DM, Smith MR, Ghori N, Falkow S, Zychlinsky A (1999) The Salmonella invasin SipB induces macrophage apoptosis by binding to caspase-1. Proc Natl Acad Sci U S A 96: 2396-401 87. Fink SL, Cookson BT (2007) Pyroptosis and host cell death responses during Salmonella infection. Cell Microbiol 9: 2562-70 88. Lara-Tejero M, Sutterwala FS, Ogura Y, Grant EP, Bertin J, Coyle AJ, Flavell RA, Galan JE (2006) Role of the caspase-1 inflammasome in Salmonella typhimurium pathogenesis. J Exp Med 203: 1407-12 89. Harsha HC, Molina H, Pandey A (2008) Quantitative proteomics using stable isotope labeling with amino acids in cell culture. Nat Protoc 3: 505-16 90. Gruhler S, Kratchmarova I (2008) Stable isotope labeling by amino acids in cell culture (SILAC). Methods Mol Biol 424: 101-11

107

91. Woolstenhulme CJ, Parajuli S, Healey DW, Valverde DP, Petersen EN, Starosta AL, Guydosh NR, Johnson WE, Wilson DN, Buskirk AR (2013) Nascent peptides that block protein synthesis in bacteria. Proc Natl Acad Sci U S A 110: E878-87 92. Ito K, Chiba S, Pogliano K (2010) Divergent stalling sequences sense and control cellular physiology. Biochem Biophys Res Commun 393: 1-5 93. Tanner DR, Cariello DA, Woolstenhulme CJ, Broadbent MA, Buskirk AR (2009) Genetic identification of nascent peptides that induce ribosome stalling. J Biol Chem 284: 34809-18 94. Zhang Y, Mooney RA, Grass JA, Sivaramakrishnan P, Herman C, Landick R, Wang JD (2014) DksA guards elongating RNA polymerase against ribosome-stalling- induced arrest. Molecular Cell 53: 766-778 95. Proshkin S, Rahmouni AR, Mironov A, Nudler E (2010) Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science (New York, NY) 328: 504-508 96. Bossi L, Schwartz A, Guillemardet B, Boudvillain M, Figueroa-Bossi N (2012) A role for Rho-dependent polarity in gene regulation by a noncoding small RNA. Genes Dev 26: 1864-73 97. Belogurov GA, Artsimovitch I (2015) Regulation of Transcript Elongation. Annu Rev Microbiol 98. Mirkin EV, Mirkin SM (2007) Replication fork stalling at natural impediments. Microbiol Mol Biol Rev 71: 13-35 99. Gupta MK, Guy CP, Yeeles JTP, Atkinson J, Bell H, Lloyd RG, Marians KJ, McGlynn P (2013) Protein-DNA complexes are the primary sources of replication fork pausing in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America 110: 7252-7257 100. Bochkareva A, Yuzenkova Y, Tadigotla VR, Zenkin N (2012) Factor-independent transcription pausing caused by recognition of the RNA-DNA hybrid sequence. The EMBO journal 31: 630-639 101. Kireeva ML, Kashlev M (2009) Mechanism of sequence-specific pausing of bacterial RNA polymerase. Proceedings of the National Academy of Sciences of the United States of America 106: 8900-8905 102. Artsimovitch I, Landick R (2002) The transcriptional regulator RfaH stimulates RNA chain synthesis after recruitment to elongation complexes by the exposed nontemplate DNA strand. Cell 109: 193-203 103. Ring BZ, Yarnell WS, Roberts JW (1996) Function of E. coli RNA polymerase sigma factor sigma 70 in -proximal pausing. Cell 86: 485-493 104. Yakhnin AV, Babitzke P (2010) Mechanism of NusG-stimulated pausing, hairpin-dependent pause site selection and intrinsic termination at overlapping pause and termination sites in the Bacillus subtilis trp leader. Molecular Microbiology 76: 690-705 105. Chan CL, Wang D, Landick R (1997) Multiple interactions stabilize a single paused transcription intermediate in which hairpin to 3' end spacing distinguishes pause and termination pathways. Journal of Molecular Biology 268: 54-68

108

106. Guérin M, Leng M, Rahmouni AR (1996) High resolution mapping of E.coli transcription elongation complex in situ reveals protein interactions with the non- transcribed strand. The EMBO journal 15: 5397-5407 107. Landick R (2006) The regulatory roles and mechanism of transcriptional pausing. Biochemical Society Transactions 34: 1062-1066 108. Nudler E (2012) RNA Polymerase Backtracking in Gene Regulation and Genome Instability. Cell 149: 1438-1445 109. Dutta D, Shatalin K, Epshtein V, Gottesman ME, Nudler E (2011) Linking RNA polymerase backtracking to genome instability in E. coli. Cell 146: 533-543 110. Toulmé F, Mosrin-Huaman C, Sparkowski J, Das A, Leng M, Rahmouni AR (2000) GreA and GreB proteins revive backtracked RNA polymerase in vivo by promoting transcript trimming. The EMBO journal 19: 6853-6859 111. Washburn RS, Gottesman ME (2011) Transcription termination maintains chromosome integrity. Proceedings of the National Academy of Sciences of the United States of America 108: 792-797 112. Richardson JP (1991) Preventing the synthesis of unused transcripts by Rho factor. Cell 64: 1047-9 113. Roberts JW, Shankar S, Filter JJ (2008) RNA polymerase elongation factors. Annu Rev Microbiol 62: 211-33 114. Cardinale CJ, Washburn RS, Tadigotla VR, Brown LM, Gottesman ME, Nudler E (2008) Termination factor Rho and its cofactors NusA and NusG silence foreign DNA in E. coli. Science (New York, NY) 320: 935-938 115. Washburn RS, Wang Y, Gottesman ME (2003) Role of E.coli transcription-repair coupling factor Mfd in Nun-mediated transcription termination. J Mol Biol 329: 655-62 116. Selby CP, Sancar A (1994) Mechanisms of transcription-repair coupling and mutation frequency decline. Microbiol Rev 58: 317-29 117. Park JS, Marr MT, Roberts JW (2002) E. coli Transcription repair coupling factor (Mfd protein) rescues arrested complexes by promoting forward translocation. Cell 109: 757-67 118. Tehranchi AK, Blankschien MD, Zhang Y, Halliday JA, Srivatsan A, Peng J, Herman C, Wang JD (2010) The transcription factor DksA prevents conflicts between DNA replication and transcription machinery. Cell 141: 595-605 119. Trautinger BW, Jaktaji RP, Rusakova E, Lloyd RG (2005) RNA polymerase modulators and DNA repair activities resolve conflicts between DNA replication and transcription. Molecular Cell 19: 247-258 120. Furman R, Sevostyanova A, Artsimovitch I (2012) Transcription initiation factor DksA has diverse effects on RNA chain elongation. Nucleic Acids Res 40: 3392-402 121. Boudvillain M, Figueroa-Bossi N, Bossi L (2013) Terminator still moving forward: expanding roles for Rho factor. Current Opinion in Microbiology 16: 118-124 122. Wilson KS, von Hippel PH (1995) Transcription termination at intrinsic terminators: the role of the RNA hairpin. Proc Natl Acad Sci U S A 92: 8793-7 123. Santangelo TJ, Artsimovitch I (2011) Termination and antitermination: RNA polymerase runs a stop sign. Nat Rev Microbiol 9: 319-29

109

124. Henkin TM (1996) Control of transcription termination in prokaryotes. Annu Rev Genet 30: 35-57 125. Merino E, Yanofsky C (2005) Transcription attenuation: a highly conserved regulatory strategy used by bacteria. Trends Genet 21: 260-4 126. Henkin TM, Yanofsky C (2002) Regulation by transcription attenuation in bacteria: how RNA provides instructions for transcription termination/antitermination decisions. Bioessays 24: 700-7 127. Winkler W, Nahvi A, Breaker RR (2002) Thiamine derivatives bind messenger RNAs directly to regulate bacterial gene expression. Nature 419: 952-6 128. Roland KL, Powell FE, Turnbough CL, Jr. (1985) Role of translation and attenuation in the control of pyrBI operon expression in Escherichia coli K-12. J Bacteriol 163: 991-9 129. Lynn SP, Burton WS, Donohue TJ, Gould RM, Gumport RI, Gardner JF (1987) Specificity of the attenuation response of the operon of Escherichia coli is determined by the threonine and isoleucine codons in the leader transcript. J Mol Biol 194: 59-69 130. Gollnick P, Babitzke P (2002) Transcription attenuation. Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression 1577: 240-250 131. Epshtein V, Dutta D, Wade J, Nudler E (2010) An allosteric mechanism of Rho- dependent transcription termination. Nature 463: 245-9 132. Ciampi MS (2006) Rho-dependent terminators and transcription termination. Microbiology 152: 2515-28 133. Graham JE, Richardson JP (1998) rut Sites in the Nascent Transcript Mediate Rho-dependent Transcription Termination in Vivo. Journal of Biological Chemistry 273: 20764-20769 134. Richardson JP, Grimley C, Lowery C (1975) Transcription termination factor rho activity is altered in Escherichia coli with suA gene mutations. Proc Natl Acad Sci U S A 72: 1725-8 135. Das A, Court D, Adhya S (1976) Isolation and characterization of conditional lethal mutants of Escherichia coli defective in transcription termination factor rho. Proc Natl Acad Sci U S A 73: 1959-63 136. Zwiefka A, Kohn H, Widger WR (1993) Transcription termination factor rho: the site of bicyclomycin inhibition in Escherichia coli. Biochemistry 32: 3564-70 137. Magyar A, Zhang X, Kohn H, Widger WR (1996) The antibiotic bicyclomycin affects the secondary RNA binding site of Escherichia coli transcription termination factor Rho. J Biol Chem 271: 25369-74 138. Yanofsky C, Horn V (1995) Bicyclomycin sensitivity and resistance affect Rho factor-mediated transcription termination in the tna operon of Escherichia coli. J Bacteriol 177: 4451-6 139. Park HG, Zhang X, Moon HS, Zwiefka A, Cox K, Gaskell SJ, Widger WR, Kohn H (1995) Bicyclomycin and dihydrobicyclomycin inhibition kinetics of Escherichia coli rho-dependent transcription termination factor ATPase activity. Arch Biochem Biophys 323: 447-54

110

140. Skordalakes E, Brogan AP, Park BS, Kohn H, Berger JM (2005) Structural mechanism of inhibition of the Rho transcription termination factor by the antibiotic bicyclomycin. Structure 13: 99-109 141. Bubunenko M, Baker T, Court DL (2007) Essentiality of ribosomal and transcription antitermination proteins analyzed by systematic gene replacement in Escherichia coli. J Bacteriol 189: 2844-53 142. Dokland T, Isaksen ML, Fuller SD, Lindqvist BH (1993) Capsid localization of the bacteriophage P4 Psu protein. Virology 194: 682-7 143. Linderoth NA, Tang G, Calendar R (1997) In vivo and in vitro evidence for an anti-Rho activity induced by the phage P4 polarity suppressor protein Psu. Virology 227: 131-41 144. Pani B, Ranjan A, Sen R (2009) Interaction surface of bacteriophage P4 protein Psu required for complex formation with the transcription terminator Rho. J Mol Biol 389: 647-60 145. Pani B, Banerjee S, Chalissery J, Muralimohan A, Loganathan RM, Suganthan RB, Sen R (2006) Mechanism of inhibition of Rho-dependent transcription termination by bacteriophage P4 protein Psu. J Biol Chem 281: 26491-500 146. Ingolia NT (2010) Genome-wide translational profiling by ribosome footprinting. Methods Enzymol 470: 119-42 147. Ingolia NT, Ghaemmaghami S, Newman JR, Weissman JS (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324: 218-23 148. Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS (2012) The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat Protoc 7: 1534-50 149. Oh E, Becker AH, Sandikci A, Huber D, Chaba R, Gloge F, Nichols RJ, Typas A, Gross CA, Kramer G, et al. (2011) Selective ribosome profiling reveals the cotranslational chaperone action of trigger factor in vivo. Cell 147: 1295-308 150. Li GW, Oh E, Weissman JS (2012) The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484: 538-41 151. 152. Dana A, Tuller T (2012) Determinants of translation elongation speed and ribosomal profiling biases in mouse embryonic stem cells. PLoS Comput Biol 8 153. Wolin SL, Walter P (1988) Ribosome pausing and stacking during translation of a eukaryotic mRNA. EMBO J 7: 3559-69 154. Bullwinkle TJ, Zou SB, Rajkovic A, Hersch SJ, Elgamal S, Robinson N, Smil D, Bolshan Y, Navarre WW, Ibba M (2013) (R)-beta-lysine-modified elongation factor P functions in translation elongation. J Biol Chem 288: 4416-23 155. Cano VS, Jeon GA, Johansson HE, Henderson CA, Park JH, Valentini SR, Hershey JW, Park MH (2008) Mutational analyses of human eIF5A-1--identification of amino acid residues critical for eIF5A activity and hypusine modification. FEBS J 275: 44-58 156. Dias CA, Cano VS, Rangel SM, Apponi LH, Frigieri MC, Muniz JR, Garcia W, Park MH, Garratt RC, Zanelli CF, et al. (2008) Structural modeling and mutational

111

analysis of yeast eukaryotic translation initiation factor 5A reveal new critical residues and reinforce its involvement in protein synthesis. FEBS J 275: 1874-88 157. Rossi D, Kuroshu R, Zanelli CF, Valentini SR (2014) eIF5A and EF-P: two unique translation factors are now traveling the same road. Wiley Interdiscip Rev RNA 5: 209-22 158. Boël G, Smith PC, Ning W, Englander MT, Chen B, Hashem Y, Testa AJ, Fischer JJ, Wieden H-J, Frank J, et al. (2014) The ABC-F protein EttA gates ribosome entry into the translation elongation cycle. Nat Struct Mol Biol 159. Chen B, Boel G, Hashem Y, Ning W, Fei J, Wang C, Gonzalez RL, Jr., Hunt JF, Frank J (2014) EttA regulates translation by binding the ribosomal E site and restricting ribosome-tRNA dynamics. Nat Struct Mol Biol 21: 152-9 160. Vila-Sanjurjo A, Schuwirth B-S, Hau CW, Cate JHD (2004) Structural basis for the control of translation initiation during stress. Nature Structural & Molecular Biology 11: 1054-1059 161. Izutsu K, Wada C, Komine Y, Sako T, Ueguchi C, Nakura S, Wada A (2001) Escherichia coli ribosome-associated protein SRA, whose copy number increases during stationary phase. Journal of Bacteriology 183: 2765-2773 162. Reuven NB, Deutscher MP (1993) Multiple exoribonucleases are required for the 3' processing of Escherichia coli tRNA precursors in vivo. FASEB journal: official publication of the Federation of American Societies for Experimental Biology 7: 143-148 163. Elseviers D, Petrullo LA, Gallagher PJ (1984) Novel E. coli mutants deficient in biosynthesis of 5-methylaminomethyl-2-thiouridine. Nucleic Acids Research 12: 3521- 3534 164. Wolfe MD, Ahmed F, Lacourciere GM, Lauhon CT, Stadtman TC, Larson TJ (2004) Functional diversity of the rhodanese homology domain: the Escherichia coli ybbB gene encodes a selenophosphate-dependent tRNA 2-selenouridine synthase. J Biol Chem 279: 1801-9 165. Ibba M, Söll D (2000) Aminoacyl-tRNA Synthesis. Annual Review of Biochemistry 69: 617-650 166. Seong IS, Oh JY, Lee JW, Tanaka K, Chung CH (2000) The HslU ATPase acts as a molecular chaperone in prevention of aggregation of SulA, an inhibitor of cell division in Escherichia coli. FEBS letters 477: 224-229 167. Malki A, Caldas T, Abdallah J, Kern R, Eckey V, Kim SJ, Cha S-S, Mori H, Richarme G (2005) Peptidase activity of the Escherichia coli Hsp31 chaperone. The Journal of Biological Chemistry 280: 14420-14426 168. Datsenko KA, Wanner BL (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A 97: 6640-5 169. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows- Wheeler transform. Bioinformatics 25: 1754-60 170. Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics (Oxford, England) 26: 841-842 171. Thorvaldsdóttir H, Robinson JT, Mesirov JP (2013) Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 14: 178-192

112

172. Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biology 11: R106 173. Lu J, Deutsch C (2008) Electrostatics in the ribosomal tunnel modulate chain elongation rates. J Mol Biol 384: 73-86 174. Yap M-N, Bernstein HD (2009) The plasticity of a translation arrest motif yields insights into nascent polypeptide recognition inside the ribosome tunnel. Molecular Cell 34: 201-211 175. Komar AA, Lesnik T, Reiss C (1999) Synonymous codon substitutions affect ribosome traffic and protein folding during in vitro translation. FEBS Lett 462: 387-91 176. Letzring DP, Dean KM, Grayhack EJ (2010) Control of translation efficiency in yeast by codon-anticodon interactions. RNA 16: 2516-28 177. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188-90 178. Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL (2011) ViennaRNA Package 2.0. Algorithms Mol Biol 6 179. Fredrick K, Ibba M (2010) How the sequence of a gene can tune its translation. Cell 141: 227-9 180. Ramu H, Vazquez-Laslop N, Klepacki D, Dai Q, Piccirilli J, Micura R, Mankin AS (2011) Nascent peptide in the ribosome exit tunnel affects functional properties of the A-site of the peptidyl transferase center. Mol Cell 41: 321-30 181. Hersch SJ, Elgamal S, Katz A, Ibba M, Navarre WW (2014) Translation initiation rate determines the impact of ribosome stalling on bacterial protein synthesis. The Journal of Biological Chemistry 289: 28160-28171 182. Sussman JK, Simons EL, Simons RW (1996) Escherichia coli translation initiation factor 3 discriminates the initiation codon in vivo. Mol Microbiol 21: 347-60 183. Rychkova A, Mukherjee S, Bora RP, Warshel A (2013) Simulating the pulling of stalled elongated peptide from the ribosome by the translocon. Proceedings of the National Academy of Sciences of the United States of America 110: 10195-10200 184. Wilson DN, Beckmann R (2011) The ribosomal tunnel as a functional environment for nascent polypeptide folding and translational stalling. Curr Opin Struct Biol 21: 274-82 185. Guzman LM, Belin D, Carson MJ, Beckwith J (1995) Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J Bacteriol 177: 4121-30 186. Thanaraj TA, Argos P (1996) Ribosome-mediated translational pause and organization. Protein Sci 5: 1594-612 187. Subramaniam AR, Zid BM, O'Shea EK (2014) An integrated approach reveals regulatory controls on elongation. Cell 159: 1200-11 188. Ito K, Chiba S (2013) Arrest peptides: cis-acting modulators of translation. Annu Rev Biochem 82: 171-202 189. Doma MK, Parker R (2006) Endonucleolytic cleavage of eukaryotic mRNAs with stalls in translation elongation. Nature 440: 561-4 190. Plotkin JB, Kudla G (2011) Synonymous but not the same: the causes and consequences of codon bias. Nat Rev Genet 12: 32-42

113

191. Zhang G, Hubalewska M, Ignatova Z (2009) Transient ribosomal attenuation coordinates protein synthesis and co-translational folding. Nat Struct Mol Biol 16: 274-80 192. Castro-Roa D, Zenkin N (2015) Methods for the Assembly and Analysis of In Vitro Transcription-Coupled-to-Translation Systems. In Bacterial Transcriptional Control, Artsimovitch I, Santangelo TJ (eds) pp 81-99. Springer New York 193. McGary K, Nudler E (2013) RNA polymerase and the ribosome: the close relationship. Curr Opin Microbiol 16: 112-7 194. Tomar SK, Artsimovitch I (2013) NusG-Spt5 proteins-Universal tools for transcription modification and communication. Chem Rev 113: 8604-19 195. Castro-Roa D, Zenkin N (2012) In vitro experimental system for analysis of transcription-translation coupling. Nucleic Acids Research 40: e45-e45 196. Burmann BM, Schweimer K, Luo X, Wahl MC, Stitt BL, Gottesman ME, Rosch P (2010) A NusE:NusG complex links transcription and translation. Science 328: 501-4 197. Fradkin LG, Kornberg A (1992) Prereplicative complexes of components of DNA polymerase III holoenzyme of Escherichia coli. J Biol Chem 267: 10318-22 198. Guy L, Roten CA (2004) Genometric analyses of the organization of circular chromosomes: a universal pressure determines the direction of ribosomal RNA genes transcription relative to chromosome replication. Gene 340: 45-52 199. Srivatsan A, Tehranchi A, MacAlpine DM, Wang JD (2010) Co-orientation of replication and transcription preserves genome integrity. PLoS Genet 6: e1000810 200. Merrikh H, Machon C, Grainger WH, Grossman AD, Soultanas P (2011) Co- directional replication-transcription conflicts lead to replication restart. Nature 470: 554-7 201. Lesnik EA, Sampath R, Levene HB, Henderson TJ, McNeil JA, Ecker DJ (2001) Prediction of rho-independent transcriptional terminators in Escherichia coli. Nucleic Acids Res 29: 3583-94 202. Gusarov I, Nudler E (1999) The mechanism of intrinsic transcription termination. Mol Cell 3: 495-504 203. von Hippel PH, Yager TD (1992) The elongation-termination decision in transcription. Science 255: 809-12 204. Yarnell WS, Roberts JW (1999) Mechanism of intrinsic transcription termination and antitermination. Science 284: 611-5 205. Korzheva N, Mustaev A, Kozlov M, Malhotra A, Nikiforov V, Goldfarb A, Darst SA (2000) A structural model of transcription elongation. Science 289: 619-25 206. Martin FH, Tinoco I, Jr. (1980) DNA-RNA hybrid duplexes containing oligo(dA:rU) sequences are exceptionally unstable and may facilitate termination of transcription. Nucleic Acids Res 8: 2295-9 207. Landick R (1997) RNA polymerase slides home: pause and termination site recognition. Cell 88: 741-4 208. Nudler E, Gusarov I, Avetissova E, Kozlov M, Goldfarb A (1998) Spatial organization of transcription elongation complex in Escherichia coli. Science 281: 424-8 209. Friedman DI, Court DL (1995) Transcription antitermination: the lambda paradigm updated. Mol Microbiol 18: 191-200 210. Burgess BR, Richardson JP (2001) RNA passes through the hole of the protein hexamer in the complex with the Escherichia coli Rho factor. J Biol Chem 276: 4182-9

114

211. Housley PR, Whitfield HJ (1982) Transcription termination factor rho from wild type and rho-111 strains of Salmonella typhimurium. J Biol Chem 257: 2569-77 212. Quirk PG, Dunkley EA, Jr., Lee P, Krulwich TA (1993) Identification of a putative Bacillus subtilis rho gene. J Bacteriol 175: 8053 213. Washburn RS, Marra A, Bryant AP, Rosenberg M, Gentry DR (2001) rho is not essential for viability or virulence in Staphylococcus aureus. Antimicrob Agents Chemother 45: 1099-103 214. Opperman T, Richardson JP (1994) Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene. J Bacteriol 176: 5033- 43 215. Richardson LV, Richardson JP (1996) Rho-dependent termination of transcription is governed primarily by the upstream Rho utilization (rut) sequences of a terminator. J Biol Chem 271: 21597-603 216. Zalatan F, Platt T (1992) Effects of decreased cytosine content on rho interaction with the rho-dependent terminator trp t' in Escherichia coli. J Biol Chem 267: 19082-8 217. Guerin M, Robichon N, Geiselmann J, Rahmouni AR (1998) A simple polypyrimidine repeat acts as an artificial Rho-dependent terminator in vivo and in vitro. Nucleic Acids Res 26: 4895-900 218. Hart CM, Roberts JW (1994) Deletion analysis of the lambda tR1 termination region. Effect of sequences near the transcript release sites, and the minimum length of rho-dependent transcripts. J Mol Biol 237: 255-65 219. Isaksen ML, Rishovd ST, Calendar R, Lindqvist BH (1992) The polarity suppression factor of bacteriophage P4 is also a decoration protein of the P4 capsid. Virology 188: 831-9 220. Sunshine M, Six E (1976) Relief of P2 bacteriophage amber mutant polarity by the satellite bacteriophage P4. J Mol Biol 106: 673-82 221. Belogurov GA, Vassylyeva MN, Svetlov V, Klyuyev S, Grishin NV, Vassylyev DG, Artsimovitch I (2007) Structural basis for converting a general transcription factor into an operon-specific virulence regulator. Mol Cell 26: 117-29 222. Shashni R, Mishra S, Kalayani BS, Sen R (2012) Suppression of in vivo Rho- dependent transcription termination defects: evidence for kinetically controlled steps. Microbiology 158: 1468-81 223. Macke TJ, Ecker DJ, Gutell RR, Gautheret D, Case DA, Sampath R (2001) RNAMotif, an RNA secondary structure definition and search algorithm. Nucleic Acids Res 29: 4724-35 224. Gautheret D, Lambert A (2001) Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles. J Mol Biol 313: 1003-11 225. Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P (1994) Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie / Chemical Monthly 125: 167-188 226. Kogoma T (1994) Escherichia coli RNA polymerase mutants that enhance or diminish the SOS response constitutively expressed in the absence of RNase HI activity. J Bacteriol 176: 1521-3

115

227. McDowell JC, Roberts JW, Jin DJ, Gross C (1994) Determination of intrinsic transcription termination efficiency by RNA polymerase elongation rate. Science 266: 822-5 228. Jin DJ, Gross CA (1991) RpoB8, a rifampicin-resistant termination-proficient RNA polymerase, has an increased Km for purine nucleotides during transcription elongation. J Biol Chem 266: 14478-85 229. von Hippel PH, Pasman Z (2002) Reaction pathways in transcript elongation. Biophys Chem 101-102: 401-23 230. Jin DJ, Walter WA, Gross CA (1988) Characterization of the termination phenotypes of rifampicin-resistant mutants. J Mol Biol 202: 245-53 231. Salgado H, Peralta-Gil M, Gama-Castro S, Santos-Zavaleta A, Muniz-Rascado L, Garcia-Sotelo JS, Weiss V, Solano-Lira H, Martinez-Flores I, Medina-Rivera A, et al. (2013) RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more. Nucleic Acids Res 41: D203-13 232. Peters JM, Vangeloff AD, Landick R (2011) Bacterial transcription terminators: the RNA 3'-end chronicles. J Mol Biol 412: 793-813 233. Peters JM, Mooney RA, Grass JA, Jessen ED, Tran F, Landick R (2012) Rho and NusG suppress pervasive antisense transcription in Escherichia coli. Genes Dev 26: 2621- 33 234. Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31: 3406-15 235. Gong F, Yanofsky C (2002) Analysis of tryptophanase operon expression in vitro: accumulation of TnaC-peptidyl-tRNA in a release factor 2-depleted S-30 extract prevents Rho factor action, simulating induction. J Biol Chem 277: 17095-100 236. Hollands K, Proshkin S, Sklyarova S, Epshtein V, Mironov A, Nudler E, Groisman EA (2012) Riboswitch control of Rho-dependent transcription termination. Proc Natl Acad Sci U S A 109: 5376-81 237. Gong F, Yanofsky C (2003) Rho's role in transcription attenuation in the tna operon of E. coli. Methods Enzymol 371: 383-91 238. Cruz-Vera LR, Rajagopal S, Squires C, Yanofsky C (2005) Features of ribosome- peptidyl-tRNA interactions essential for tryptophan induction of tna operon expression. Mol Cell 19: 333-43 239. Elgamal S, Artsimovitch I, Figueroa-Bossi N, Bossi L, Ibba M (2015) Elongation Factor P maintains transcription-translation coupling during oligo-proline synthesis. In preparation, 240. Henriques MX, Catalão MJ, Figueiredo J, Gomes JP, Filipe SR (2013) Construction of Improved Tools for Protein Localization Studies in Streptococcus pneumoniae. PLoS ONE 8 241. Figueroa N, Wills N, Bossi L (1991) Common sequence determinants of the response of a prokaryotic promoter to DNA bending and supercoiling. EMBO J 10: 941-9 242. Sambrook J, Russell DW (2006) Separation of RNA According to Size: Electrophoresis of RNA through Agarose Gels Containing Formaldehyde. CSH Protoc 2006

116

243. Rio DC (2015) Denaturation and electrophoresis of RNA with formaldehyde. Cold Spring Harb Protoc 2015: 219-22 244. Rio DC (2015) Northern blots: capillary transfer of RNA from agarose gels and filter hybridization using standard stringency conditions. Cold Spring Harb Protoc 2015: 306-13 245. Brown T, Mackey K, Du T (2004) Analysis of RNA by northern and slot blot hybridization. Curr Protoc Mol Biol Chapter 4: Unit 4 9 246. Figueroa-Bossi N, Schwartz A, Guillemardet B, D'Heygere F, Bossi L, Boudvillain M (2014) RNA remodeling by bacterial global regulator CsrA promotes Rho-dependent transcription termination. Genes Dev 28: 1239-51 247. Puglisi JD (2015) Protein synthesis. The delicate dance of translation and folding. Science 348: 399-400 248. Yadavalli SS, Ibba M (2012) Quality control in aminoacyl-tRNA synthesis its role in translational fidelity. Adv Protein Chem Struct Biol 86: 1-43 249. Kim SJ, Yoon JS, Shishido H, Yang Z, Rooney LA, Barral JM, Skach WR (2015) Protein folding. Translational tuning optimizes nascent protein folding in cells. Science 348: 444-8 250. Kosolapov A, Deutsch C (2009) Tertiary interactions within the ribosomal exit tunnel. Nat Struct Mol Biol 16: 405-11 251. Spencer PS, Siller E, Anderson JF, Barral JM (2012) Silent substitutions predictably alter translation elongation rates and protein folding efficiencies. J Mol Biol 422: 328-35 252. Mashaghi A, Kramer G, Bechtluft P, Zachmann-Brand B, Driessen AJ, Bukau B, Tans SJ (2013) Reshaping of the conformational search of a protein by the chaperone trigger factor. Nature 500: 98-101 253. Zhou M, Guo J, Cha J, Chae M, Chen S, Barral JM, Sachs MS, Liu Y (2013) Non-optimal codon usage affects expression, structure and function of clock protein FRQ. Nature 495: 111-5 254. Sander IM, Chaney JL, Clark PL (2014) Expanding Anfinsen's principle: contributions of synonymous codon selection to rational protein design. J Am Chem Soc 136: 858-61 255. Gutierrez E, Shin BS, Woolstenhulme CJ, Kim JR, Saini P, Buskirk AR, Dever TE (2013) eIF5A Promotes Translation of Polyproline Motifs. Mol Cell 256. Woolstenhulme Christopher J, Guydosh Nicholas R, Green R, Buskirk Allen R (2015) High-Precision Analysis of Translational Pausing by Ribosome Profiling in Bacteria Lacking EFP. Cell Reports 11: 13-21 257. Balakrishnan R, Oman K, Shoji S, Bundschuh R, Fredrick K (2014) The conserved GTPase LepA contributes mainly to translation initiation in Escherichia coli. Nucleic Acids Res 42: 13370-83 258. Goldman DH, Kaiser CM, Milin A, Righini M, Tinoco I, Jr., Bustamante C (2015) Ribosome. Mechanical force releases nascent chain-mediated ribosome arrest in vitro and in vivo. Science 348: 457-60

117

259. Cymer F, Hedman R, Ismail N, Heijne Gv (2015) Exploration of the Arrest Peptide Sequence Space Reveals Arrest-enhanced Variants. Journal of Biological Chemistry 290: 10208-10215 260. Zuk D, Jacobson A (1998) A single amino acid substitution in yeast eIF-5A results in mRNA stabilization. The EMBO journal 17: 2914-2925 261. Hosoda N, Kobayashi T, Uchida N, Funakoshi Y, Kikuchi Y, Hoshino S, Katada T (2003) Translation termination factor eRF3 mediates mRNA decay through the regulation of deadenylation. J Biol Chem 278: 38287-91 262. Harigaya Y, Parker R (2010) No-go decay: a quality control mechanism for RNA in translation. Wiley Interdiscip Rev RNA 1: 132-41

118