
Probing the Regulation of P-Mediated

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science

in the Graduate School of The Ohio State University

By

Mengchi Wang, B.S.

Graduate Program in Microbiology

The Ohio State University

2013

Thesis Committee:

Dr. Michael Ibba, Advisor

Dr. Kurt Fredrick

Dr. Irina Artsimovitch

Copyright by

Mengchi Wang

2013

ABSTRACT

Elongation factor P (EF-P) is a universally conserved bacterial translation factor homologous to eukaryotic/archaeal initiation factor 5A. However, the mechanism by which EF-P regulates translation is still largely unclear. Previous studies show that EF-P facilitates peptide bond formation in vitro. The crystal structures of EF-P revealed that the protein contains three domains and an overall structure that mimics a tRNA, binding between the P-site and E-site of the ribosome during translation. However, only a limited number of proteins are affected by the loss of EF-P. In E. coli and Salmonella, deletion of the efp gene results in pleiotropic phenotypes, including increased susceptibility to numerous cellular stressors.

Based on these results, we hypothesize that EF-P mediates translation of a subset of mRNAs, which share common characteristics in their sequences. We conducted an unbiased in vivo investigation of the specific targets of EF-P by employing stable isotope labeling of amino acids in cell culture (SILAC) to compare the proteomes of wild-type and efp mutant Salmonella. We found that metabolic and motility genes are prominent among the subset of proteins with decreased production in the efp mutant.

Furthermore, particular tripeptide motifs are statistically overrepresented among the proteins downregulated in efp mutant strains. These include PPP, PPG, APP and YIRYIR, all of which were confirmed to induce EF-P dependence by a translational fusion assay. Notably, we found that many proteins containing these identified motifs are not misregulated in an EF-P-deficient background, suggesting that the factors that govern EF-P-mediated regulation are complex. The possibility of a structural feature that underlies EF-P mediated translation is discussed. In summary, this work established a productive bioinformatics strategy for screening EF-P target motifs. The discoveries made in this study help provide important clues regarding the mechanism of EF-P regulated translation.


ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my advisor, Dr. Michael Ibba, for his trust, support and guidance at every step along the way. He personified the aptitude of a brilliant scientist and the charisma of a true leader, and provided wise mentorship that inspires me to live up to my full potential in research and my career. I am also indebted to my advisory committee members, Dr. Kurt Fredrick and Dr. Irina Artsimovitch, for their unreserved support and insightful critiques of my project.

I am grateful for my friends and fellow lab members, in particular, Dr. Hervé Roy for guiding me into the lab and helping me start the project, Dr. Tammy Bullwinkle and Medha Raina for their keen advice on my experiments, Dr. Assaf Katz and Andrei Rajkovic for their helpful insights, Sara Elgamal and Jing Li for their great editorial help with this thesis, and finally, James Bardeen and Ellen Bardeen for their unconditional love and warm friendship.

Last but not least, this work would not have been possible without the collaborative efforts of our colleagues in Dr. William Navarre’s lab at the University of Toronto, especially Steven Hersch.


VITA

June 2006...... Nanjing Jinling High School, China

July 2010...... B.S. Biology, Nanjing Agricultural University, China.

2010 to present ...... Graduate Research Assistant, Department of Microbiology, The Ohio State University

PUBLICATIONS

Hersch, S. J.*, Wang, M.*, Zou, S. B., Moon, K. M., Foster, L. J., Ibba, M. & Navarre, W. W. (2013) Divergent Protein Motifs Direct Elongation Factor P-Mediated Translational Regulation in Salmonella enterica and Escherichia coli, mBio. 4.

* Co-first Author.

FIELDS OF STUDY

Major Field: Microbiology


TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGEMENTS

VITA

PUBLICATIONS

FIELDS OF STUDY

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1. INTRODUCTION

CHAPTER 2. MATERIALS AND METHODS

CHAPTER 3. PREDICTING SEQUENCE MOTIFS TARGETS OF EF-P

CHAPTER 4. VERIFICATION OF PREDICTED EF-P TARGET MOTIFS

CHAPTER 5. EF-P TARGET: BEYOND SEQUENCE MOTIFS

REFERENCES


LIST OF TABLES

Table 1. Bioinformatics analysis predicts EF-P regulated tripeptide motifs

Table 2. Verification of predicted EF-P dependent motifs

Table 3. Identity of the second has marginal influence on EF-P mediated translation

Table 4. Motif targets and structural rigidity


LIST OF FIGURES

Figure 1. Structure of the ribosome

Figure 2. Overview of bacterial translation

Figure 3. The crystal structure of EF-P bound to the 70S ribosome

Figure 4. A subset of proteins is significantly misregulated in Δefp Salmonella

Figure 5. Target motifs verified by a fluorescent reporter

Figure 6. Translation of poly-proline motifs with defects in EF-P modification

Figure 7. Comparison of proteins identified in SILAC and those with EF-P target motifs

Figure 8. Tripeptide "R1 R2 R3" with Cα and Cβ positions


CHAPTER 1. INTRODUCTION

1.1 The Mechanism of Translation

1.1.1 Overview of Translation

Translation is the crucial step of protein synthesis in which genetic information encoded on mRNA is converted to the corresponding polypeptide sequence. In bacteria, this takes place on the 70S ribosome, using messenger RNA (mRNA) as the template and amino acids as substrates, which are delivered by transfer RNAs (tRNAs). There are four phases of translation, namely initiation, elongation, termination and ribosome recycling. Additional factors are required for each of these four steps, and the accuracy and speed of translation are ensured by the fine-tuned cooperation of these components (1, 2).

1.1.2 The Translation Machinery

The ribosome consists of two subunits in all species. In bacteria, the 70S ribosome is composed of a large (50S) and a small (30S) subunit, both of which contain three binding sites for tRNA, namely, the A (aminoacyl) site, the P (peptidyl) site and the E (exit) site. The 30S subunit contributes to the fidelity of translation by binding to mRNA together with the anticodon stem-loops of tRNA and hence monitoring base pairing between the two. The 50S subunit binds to the acceptor arms of the tRNA and catalyzes peptide bond formation between the incoming amino acid on the A-site tRNA and the nascent peptide chain on the P-site tRNA.

Understanding of the molecular mechanism by which the ribosome catalyzes protein synthesis was considerably expanded by the solving of a series of high resolution crystal structures, starting with an archaeal 50S subunit from Haloarcula marismortui and a bacterial 30S subunit from Thermus thermophilus published in 2000, followed by structures of the 70S ribosome (3, 4, 5, 6), the bacterial 50S subunit (7), as well as the mobile elements of the 50S including the L1 or L7/L12 stalks that have been solved individually (8, 9).

These results revealed that the 70S ribosome (Figure 1) includes an interface between the 30S and the 50S subunits that consists mainly of RNA. The A (aminoacyl) site accepts the incoming charged tRNA, the P (peptidyl) site holds the tRNA carrying the nascent peptide chain, and the E (exit) site is where the deacylated P-site tRNA moves after peptide bond formation, before leaving the ribosome. Between the “head” and “body” of the 30S subunit is a cleft that binds mRNA, where codons of the mRNA interact with the anticodons of tRNA. The 50S subunit contains the peptidyl-transferase center (PTC), which is located between the P-site and the A-site tRNAs, whereas the E-site tRNA is ~50 Å away from the PTC.


Figure 1. Structure of the ribosome. Reproduced from (2).


1.1.3 Initiation, elongation, termination, ribosome recycling

The process of translation is generally divided into four phases, namely initiation, elongation, termination and ribosome recycling (Figure 2).

Figure 2. Overview of bacterial translation. Reproduced from (2).

In bacteria, the initiation of translation is marked by the formation of a complex by mRNA, initiator tRNA, three initiation factors IF1, IF2 and IF3, and the 30S and the 50S subunits. First, IF1 and IF3 bind to the 30S subunits recycled by ribosome recycling factor and EF-G after termination. IF3 facilitates release of subunit-bound mRNA and deacylated tRNA from the previous translation cycle, preventing the 30S subunit from prematurely re-associating with the 50S subunit (2). IF1 prevents initiator tRNA from binding to the A site of the 30S subunit (1, 10), and assists IF3 by increasing the dissociation rate between the 30S and 50S subunits (11, 12, 13). The Shine-Dalgarno sequence on mRNA base pairs with the 3’ end of the 16S rRNA on the 30S subunit. The Shine-Dalgarno sequence is a highly conserved 3-10 nucleotide purine-rich sequence 5’ from the AUG start codon, which binds to a highly conserved pyrimidine-rich region of the 16S rRNA, resulting in proper alignment of the AUG start codon in the P site of the 30S subunit (2, 14). The initiator fMet-tRNAfMet must simultaneously be positioned precisely in the P site; it is escorted by IF2-GTP, resulting in base pairing between the AUG start codon of mRNA and the anticodon loop of the initiator tRNA (14, 15). This formation of the 30S pre-initiation complex is followed by binding of the 50S subunit and the eventual formation of the 70S initiation complex (1, 2). A series of conformational changes is proposed upon GTP hydrolysis catalyzed by IF2, resulting in release of IF1, IF3 and then IF2 (16, 17, 18, 19).

The elongation phase of translation requires two factors, EF-Tu and EF-G (20, 21). Starting with fMet-tRNAfMet in the P site, the next aminoacylated tRNA is delivered to the unoccupied A site of the 70S initiation complex as a ternary complex with EF-Tu and GTP (1, 2). EF-Tu proofreads for the correct interaction between the codon on mRNA and the anticodon on the aminoacylated tRNA (22, 23), upon which a conformational change in the ribosome triggers GTP hydrolysis by EF-Tu, which releases the A-site tRNA and swings it into the PTC of the 50S subunit (2, 24).

Peptide bond formation involves the spontaneous deacylation of P-site tRNA and the subsequent transfer of the peptide chain to A-site tRNA (2, 25, 26, 27). This peptidyl transfer is followed by translocation, which is facilitated by GTP hydrolysis by EF-G, causing the tRNAs and mRNA to move with respect to the ribosome; specifically, the deacylated tRNA in the P-site and the peptidyl-tRNA in the A-site move to the E-site and the P-site, respectively (28, 29). This results in the elongation of the polypeptide chain by one amino acid, and an empty A-site ready to accept the next ternary complex (1, 2).

Termination begins when a stop codon on the mRNA is encountered in the A site, signaling the end of the coding sequence (29, 30). In bacteria, two release factors, RF1 and RF2, both recognize UAA, whereas UAG and UGA are recognized only by RF1 or RF2, respectively (29, 31). RF1 and RF2 induce the hydrolysis of peptidyl-tRNA, resulting in the release of the newly synthesized protein from the ribosome (29, 31). Then RF3, a GTPase, binds to the A-site. Upon GTP hydrolysis, RF3 drives the terminal mRNA codon into the P-site and moves the last tRNA into the E-site, resulting in the release of the polypeptide chain (32, 33).

After RF3 dissociates from the ribosome, mRNA and a deacylated tRNA are left in the P-site. The 70S ribosome must be dissociated for a new round of protein synthesis to begin. This process requires ribosome recycling factor (RRF) to work together with EF-G associated with GTP hydrolysis, as well as involvement of IF3 (34, 35, 36, 37).


1.1.4 The role of mRNA in Translation regulation and Ribosome Stalling

Bacterial regulation of gene expression can occur at many points, from synthesis of an mRNA transcript to production of a mature protein. A particular event of interest during this process is ribosome stalling, where translation pauses at certain sites on the mRNA for different reasons. While some ribosome arrests are temporary and serve to calibrate the pace of translation, others lead to aborted protein synthesis and have been shown to be associated with quality control (38, 39). Though a number of proteins have been characterized with translational stalls, each is unique, and common sequence motifs have not been well defined, although certain themes have emerged (40).

The rate at which a protein is synthesized by the ribosome depends in part on the usage of specific codons in the mRNA. Codons are sequences of three nucleotides that specify the amino acid to be incorporated during protein synthesis. Multiple synonymous codons can be translated as the same amino acid, while multiple isoacceptor tRNA species can be used to read the codons for the same amino acid. Therefore, the abundance of the tRNA that recognizes a codon will affect the speed at which ribosomes decode that codon (41). Unequal distribution of codons along genes has been reported to control ribosome speed and translation stability. For example, in bacteria the speed of translation is reduced during the first 30–50 codons, which are recognized by low-abundance tRNAs, presumably resulting in a more stable elongation at the beginning of the gene and reducing the frequency of ribosome stalling (42).

It is also known that mRNA structure influences translation efficiency. Formation of strong hairpin loops around the Shine-Dalgarno sequence can hinder the pairing between the mRNA and the 16S rRNA, resulting in a significantly decreased expression level. One example is riboswitches, which are sensor elements on the mRNA; some can sequester the translation start site in the absence of specific ligands (43). Pausing of ribosomes by local mRNA structure has also been found to correlate with translation of certain proteins, suggesting it is required for the proper folding of the nascent peptide (41). Recent discoveries facilitated by ribosome profiling have revealed previously uncharacterized pausing and stalling sites on the transcript, including Shine-Dalgarno like sequences (44, 45).

The efficiency of translation can also be affected by the identity of the amino acid being translated. For example, proline, as an N-alkylamino acid, has a significantly slower than average peptide bond formation rate due to prolonged accommodation and peptidyl transfer (46). In addition, the structure of the nascent peptide can also cause stalling of the ribosome by interfering with its core functions. Such interference includes obstructing the exit channel with certain N-terminal motifs, interacting with the peptidyl-transferase center with conserved C-terminal motifs, as well as poor peptidyl transfer activity induced by certain peptide patterns (40, 47, 48, 49). A well-studied example is the SecM transcript, which stalls its own translation in a nascent polypeptide-dependent manner via a series of complex interactions with the ribosome (47, 50). This stall leads to remodeling of an mRNA hairpin, thereby revealing the ribosome binding site of the downstream secA gene, which results in increased production of SecA (51). SecA in turn forms a feedback loop by targeting the SecM-stalled ribosome to the protein export machinery, which allows translation of SecM to resume (52).

Molecular mechanisms underlying ribosome stalling have been studied by exploiting a well-established tmRNA quality control system, where tmRNA is a chimera that acts as both tRNA and mRNA. During rescue of the stalled ribosome, the tmRNA, which is delivered by a complex of SmpB and EF-Tu-GTP, recognizes paused ribosomes. It first acts as a tRNA to bind to the A-site in the ribosome and then acts as an mRNA, allowing the translation machinery to switch template. As a result, the ssrA peptide tag is added to the C-terminus of the nascent polypeptide chain, leading to its proteolysis. Thus, the potentially deleterious truncated polypeptide is degraded and the damaged mRNA responsible for translational arrest is turned over (53, 54).

Genetic selection for stalling motifs in many previous studies has exploited this tmRNA system, for example, by linking antibiotic resistance to tmRNA tagging and screening for cellular survival (40). However, this method depends on the function of an antibiotic reporter, and may result in limited selection of motifs. A recent improvement on this system used a two-hybrid system that employs conditional translation of HIS3 downstream of a weak (38).

1.2 EF-P overview

Elongation factor P (EF-P) is a universally conserved bacterial translation factor originally characterized for its ability to facilitate protein synthesis. EF-P is strictly conserved in all bacterial genomes, and is the homologue of initiation factor 5A in archaea and eukaryotes (namely, aIF5A and eIF5A) (55, 56). EF-P shares no significant similarities in amino acid sequence with other prokaryotic translation factors (57), nor is it capable of substituting for EF-G or EF-Ts in stimulating the reaction (58).

Earlier studies focused on the role of EF-P in facilitating peptide bond formation in vitro, predominately by using fMet-tRNAifMet and (58). Studies in vitro using suggested the binding site of EF-P may overlap the PTC of the ribosome (59). It was further proposed that the ability of EF-P to stimulate peptide bond synthesis negatively correlates with the size of the amino acid side chain of the acceptor, as demonstrated in vitro by the synthesis of a number of fMet-initiated dipeptides from CCA amino-acyl acceptors (60).


Figure 3. The crystal structure of EF-P bound to the 70S ribosome. Reproduced from (61).


The crystal structure of EF-P from Thermus thermophilus HB8 showed that the protein contains three domains and an overall structure that mimics a tRNA (62). A detailed structure capturing EF-P bound to the 70S T. thermophilus ribosome (Figure 3) showed that the binding site is located between the P-site and E-site, with a position that is close to the P-site tRNA (61). These data together led to a proposed role of EF-P to facilitate the proper positioning of the fMet-tRNAifMet for the formation of the first peptide bond during translation initiation (61).

It has been shown that for EF-P to function effectively in E. coli and Salmonella, it has to undergo a post-translational modification, wherein a unique β-lysine residue is added to a conserved lysyl residue by the combined activities of PoxA and YjeK (the PYE pathway) (63, 64). In vitro studies revealed that yjeK encodes a 2,3-β-lysine aminomutase that converts α-lysine to β-lysine, whereas PoxA post-translationally adds (R)-β-lysine to the unmodified EF-P (65, 66). The β-lysylated residue of EF-P is oriented in the ribosome such that it projects into the peptidyl transferase center (PTC), presumably to modulate peptide bond formation (63).

Although β-lysylated EF-P has further been shown to undergo hydroxylation at Lys34, catalyzed by YfcM, phenotypic and in vivo analyses have yet to show that this additional modification is critical for EF-P function (64, 67).

Despite its universal presence and consequential role in translation, EF-P is not essential for bacterial viability, but is responsible for a variety of distinctive but related phenotypes in vivo. Deletion of efp in E. coli does not result in cell death, although a substantial loss of peptide-bond forming activity has been shown (59, 68).

Perhaps most significantly, EF-P, rather than acting as a global regulator, specifically targets certain groups of proteins, many of which are relevant to virulence and responses to cellular stressors. In vivo, 2D SDS-PAGE assays of Salmonella poxA and yjeK mutants revealed similar protein expression profiles that differ from wild-type Salmonella (69). Proteomic analysis in E. coli, Salmonella and Agrobacterium confirmed that deletion of efp or any of the PYE genes results in pleiotropic phenotypes, including attenuated virulence in mice and swine, altered expression of SPI-1 proteins, outer membrane defects, increased susceptibility to antibiotics, motility defects and impaired nutrient utilization (69, 70, 71, 72).

Although much progress has been made towards understanding the structural and functional roles of EF-P, it remains to be determined by what mechanism EF-P discriminatorily regulates translation of a subset of transcripts, and how that connects to the bacterial phenotypes of pathogenicity and stress responses in vivo.

1.3 Bioinformatics approaches to study EF-P function

Based upon previous studies, we hypothesize the existence of one or more shared characteristics or elements among the group of transcripts which are recognized by

EF-P for . Therefore, it would be conceivable to utilize a high throughput database to first specify the proteins that are post-transcriptionally regulated by EF-P, and then analyze the enrichment of certain features.

In this study, we present an unbiased in vivo comprehensive analysis of the Salmonella proteins affected by EF-P using stable-isotope labeling by amino acids in cell culture (SILAC). SILAC is a quantitative proteomics technique that adopts mass spectrometry to detect the difference in protein abundance between samples by isotopic labeling (73, 74). Recognition of proteins whose synthesis is regulated by EF-P is then made possible by comparing Salmonella wild-type and Δefp mutant strains labeled with heavy and light isotopes followed by mass spectrometry. Relative abundance of a protein in wild-type and Δefp strains can then be used to predict the importance of EF-P during synthesis of that protein.

To identify the possible targets of EF-P on mRNA, we screened for over-representation of motifs. Motifs are specific patterns in nucleotide or amino acid sequences with biological significance, such as facilitating DNA binding, structural importance and protein recognition; while some motifs are consecutive and short, others can be discontinuous and extended (75). Motif discovery has been an important field in bioinformatics research. While an extensive collection of tools is currently available for motif screening, including MEME, word-seeker and GSEA, all have limitations and are insufficiently customized for extracting signals from noise in a SILAC dataset. In this work, we devised original bioinformatics tools programmed in Perl and Linux shell script to better fit SILAC data. Subsequent results were subjected to statistical models and experimental validation.

The information on the EF-P target motifs gained from this study will shed light on the nature of the translation stalls alleviated by EF-P, furthering our understanding of the mechanism by which EF-P regulates translation of certain proteins.


CHAPTER 2. MATERIALS AND METHODS

2.1 Bacterial strains and plasmids

For SILAC we employed an arginine and lysine auxotroph (ΔargH ΔlysA) derivative of Salmonella enterica sv. Typhimurium str. SL1344 utilized in an earlier study (76). The Δefp mutation from strain WN934 was transferred into the auxotrophic strain by transduction using the HT105/1 int-201 derivative of phage P22 (77) to generate strain WN1308.

The E. coli strains used for motif verification were derivatives from the Keio knockout collection (68), including ΔyfcM, ΔyjeK, ΔpoxA, Δefp and BW25113 (wild-type). Deletion of genes and removal of kanamycin cassettes using FLP recombinase (78) were confirmed by PCR. Plasmids used for motif verification were derivatives of pBAD30 (79) containing a tandem fluorescent fusion cassette composed of green fluorescent protein (gfp) followed immediately by the mCherry Shine-Dalgarno sequence and mCherry, both of which are optimized for synthesis in E. coli (80). This construct is designated pBAD30mw700 and served as the PCR template for construction of other reporters (81). Unless otherwise stated, the codons used for motif inserts were optimized according to tRNA abundance in E. coli (82).

Using the red-gam recombinase protocol described above, we generated strain WN1405 by deleting the majority of the efp gene (base-pairs 145-424) from S. Typhimurium str. 14028s such that the new mutation would not influence the promoter of the upstream antiparallel yjeK gene. The mutant allele was transduced to a fresh strain background prior to experimentation. Deletions of efp were confirmed by PCR amplification using the primers WNp582 and WNp583 (81). WN1405 and the previously used Δefp mutant, WN934 (83), behave identically under all conditions tested (data not shown).

2.2 Stable isotope labeling of amino acids in cell culture (SILAC)

For SILAC, we supplemented MOPS minimal media with amino acids at the concentrations previously described (76) and used 0.2% (w/v) glycerol as a carbon source. For heavy isotope labeled samples, arginine and lysine were replaced with ¹³C₆-Arg and ²H₄-Lys isotopes at the equivalent molar concentration. WN1269 (wild-type) was grown in heavy arginine and lysine isotopes and WN1308 (Δefp) was grown in light isotopes in this media for 16h to ensure complete labeling of all proteins. Strains were subsequently subcultured 1:200 into the same media and grown to an optical density (600nm) of 0.5 (mid-log phase), at which point the cells were harvested by centrifugation. The pellets were subjected to lysis by heating to 99°C for 5 minutes in fresh lysis buffer (1% deoxycholate [DOC] in 50mM ammonium bicarbonate [NH4HCO3] at pH 8). Cell debris was removed by centrifugation at 13,000 x g for 15 minutes and the supernatant was frozen at -80°C until used.


2.3 Mass spectrometry and proteomic data analysis

For analysis of the isotope-labeled lysates, 30μg of protein from each of WN1269 (heavy) and WN1308 (light) were combined, fractionated into 12 pieces by gel electrophoresis and in-gel trypsin digested following a previously outlined procedure (84). The resulting peptides were subjected to liquid chromatography coupled to tandem mass spectrometry using an Orbitrap XL (Thermo Scientific) as described previously (85). Data analysis was conducted using the MaxQuant software (86) to generate an average normalized heavy/light ratio over three biological replicates, and Significance B values were calculated using Perseus (87). To determine significance, we used the cutoff of a Significance B score of less than 0.01 in at least one trial. This statistic measures significance within a single trial even for proteins that were only identified in one of the three biological replicates. When we assess significance using the average of Significance B scores across all three trials (excluding scores where the protein of interest was not identified in a given trial), we obtain similar percentages of significantly misregulated proteins containing EF-P target motifs. For example, using an average Significance B score of less than 0.05, we identified 107 significantly misregulated proteins. Of these, 61 are more than two-fold downregulated in the efp mutant, and 25 of these 61 contain a PPP, PPG or APP motif. All expression ratios and Significance B scores can be referenced in (81).


2.4 DAVID analysis

Gene lists were generated based on the SILAC data and on the presence of EF-P target motifs in Salmonella proteins as described in the body text and figure legends. Gene lists were uploaded to the Database for Annotation, Visualization and Integrated Discovery (DAVID) online analysis software and we employed the Functional Annotation Tools to determine overrepresented annotation groupings amongst our input genes (88). DAVID employs a one-tailed Fisher Exact Probability Test to calculate P values of individual annotation groups and cluster P values are generated as the geometric mean of the P values of all constituent groups. Functional annotation clusters with P values less than 0.05 are shown in figures. Full breakdown of clusters can be found in the supplemental tables in (81).
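The two statistics named in this section can be illustrated with a short Python sketch. It is not DAVID's implementation, and DAVID's own scoring may differ in detail. The function and argument names (hits, list_size, annotated, background) are hypothetical and were chosen only for this illustration.

```python
import math

def fisher_one_tailed(hits, list_size, annotated, background):
    """Upper-tail hypergeometric probability of seeing at least `hits`
    annotated genes in a list of `list_size` genes, when `annotated` of
    the `background` genes carry the annotation (hypothetical names)."""
    total = math.comb(background, list_size)
    upper = min(list_size, annotated)
    tail = sum(
        math.comb(annotated, k) * math.comb(background - annotated, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

def cluster_p_value(p_values):
    """Geometric mean of the P values of the groups in one cluster.
    Assumes every P value is strictly positive."""
    return math.exp(sum(math.log(p) for p in p_values) / len(p_values))
```

Under these assumptions, a cluster would be reported when `cluster_p_value(...)` falls below 0.05, mirroring the cutoff stated in the text.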

2.5 Bioinformatic identification of prominent tripeptide motifs

Of the 1517 Salmonella proteins identified by SILAC, sequences for 1294 were retrieved from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/), parsed, and processed for further analysis of prominent tripeptide motifs using a customized Perl script. All possible three-residue combinations of the 20 common amino acids (20^3 = 8,000 tripeptides) were enumerated and counted for their respective occurrences in three designated groups, namely high 10%, middle 80% and low 10%, ranked by the protein's WN1269 / WN1308 SILAC ratio.
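The counting step can be sketched as follows. The original analysis used a customized Perl script that is not reproduced here; this Python version is an illustration only, and the function names, the (sequence, ratio) input format and the rounding used to size the 10% groups are assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 common amino acids

def all_tripeptides():
    """Enumerate all 20**3 = 8000 possible tripeptides."""
    return ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]

def count_tripeptides(sequences):
    """Count every overlapping three-residue window in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1
    return counts

def split_by_ratio(proteins, fraction=0.10):
    """Split (sequence, ratio) pairs into high, middle and low groups.

    Proteins are ranked by ratio, highest first; the top and bottom
    `fraction` form the high and low groups (group size is rounded,
    which is an assumption of this sketch).
    """
    ranked = sorted(proteins, key=lambda p: p[1], reverse=True)
    n = max(1, round(len(ranked) * fraction))
    groups = (("high", ranked[:n]), ("middle", ranked[n:-n]), ("low", ranked[-n:]))
    return {name: [seq for seq, _ in grp] for name, grp in groups}
```

Calling `count_tripeptides` on each of the three lists returned by `split_by_ratio` gives the per-group occurrences of every tripeptide.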


Expected occurrences in each group were calculated by first determining the rate at which each amino acid occurred in the retrieved database (e.g. Ala accounts for 10.00% of all retrieved amino acids) and then calculating the accumulated probability of each specific tripeptide motif, assuming all combinations are completely unbiased (e.g. Ala-Ala-Ala is expected to occur in (10.00%)^3 = 0.1% of all tripeptides inspected), and finally multiplying this probability by the number of three-residue sliding windows inspected in each group (e.g. if the high 10% group contains 55,650 sliding windows of three consecutive amino acids, then the expected occurrence of Ala-Ala-Ala is 55,650 x 0.1% = 55.65).
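The expectation calculation can be reproduced with a few lines of Python. This is a sketch under the stated independence assumption, not the original Perl script; the function names are hypothetical.

```python
from collections import Counter

def residue_frequencies(sequences):
    """Fraction of all residues in `sequences` accounted for by each amino acid."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def expected_count(tripeptide, freqs, n_windows):
    """Expected occurrences of `tripeptide` among `n_windows` sliding windows,
    assuming residues occur independently at their background frequencies."""
    prob = 1.0
    for aa in tripeptide:
        prob *= freqs.get(aa, 0.0)
    return prob * n_windows

# Worked example from the text: Ala at 10.00% and 55,650 windows
# give an expectation of approximately 55.65 for Ala-Ala-Ala.
example = expected_count("AAA", {"A": 0.10}, 55650)
```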

The enrichment index of each tripeptide in a group was calculated as the ratio of actual over expected occurrences of that tripeptide in that group, normalized by the actual over expected occurrence of that tripeptide in total (total, in this case, refers to the 1294 sequences retrieved from the SILAC data):

$$\mathrm{Enrichment}_{\mathrm{group}} = \frac{\text{Occurrence in one group} \,/\, \text{Expectation in that group}}{\text{Occurrence in all groups} \,/\, \text{Expectation in all groups}}$$
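As a minimal sketch, the enrichment index can be computed as below. The function name and the numbers in the example are made up for illustration and are not results from this study.

```python
def enrichment(obs_group, exp_group, obs_all, exp_all):
    """Observed/expected ratio in one group, normalized by the
    observed/expected ratio over all retrieved sequences."""
    if exp_group <= 0 or exp_all <= 0 or obs_all <= 0:
        raise ValueError("expectations and total occurrence must be positive")
    return (obs_group / exp_group) / (obs_all / exp_all)

# Hypothetical counts: 30 observed vs 10 expected in one group,
# 60 observed vs 40 expected overall.
example = enrichment(30, 10, 60, 40)
```

A value above 1 indicates that the tripeptide is more over-represented in that group than in the dataset as a whole.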

The P value of each tripeptide was calculated using a Chi-square test with Yates’ correction as follows:

$$\chi^{2} = \sum \frac{\left(\left|\text{Occurrence} - \text{Expectation}\right| - 0.5\right)^{2}}{\text{Expectation}}$$
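The Yates-corrected statistic can be sketched in Python as follows. The text does not state which categories are summed or the degrees of freedom used, so the conversion to a P value below assumes one degree of freedom; that choice, and the function names, are assumptions of this sketch.

```python
import math

def yates_chi_square(observed, expected):
    """Sum of (|O - E| - 0.5)^2 / E over paired observed/expected counts."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p_value_1df(statistic):
    """Upper-tail P value of a chi-square statistic, assuming one degree
    of freedom (an assumption; the source does not specify this)."""
    return math.erfc(math.sqrt(statistic / 2.0))
```

For one degree of freedom the upper tail of the chi-square distribution equals the two-sided tail of a standard normal at the square root of the statistic, which is why `math.erfc` suffices here.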

Top candidate motifs most enriched in proteins downregulated in the efp mutant were screened with a P value < 0.0005 and an Enrichment_{lowest 10%} value of < 1, and then ranked by Enrichment_{highest 10%}. Similarly, top candidate motifs enriched in upregulated genes were screened with a P value < 0.0005 and an Enrichment_{highest 10%} value of < 1, and then ranked by Enrichment_{lowest 10%}.
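The screening rule for motifs enriched among downregulated proteins can be sketched as a filter followed by a sort. The dictionary layout, key names and example values are hypothetical; the sketch reads the highest 10% group as the proteins with the highest WN1269 / WN1308 ratio, that is, those reduced in the efp mutant.

```python
def top_downregulated_motifs(stats, p_cutoff=0.0005, n=10):
    """Return up to `n` motifs passing the screen, ranked by enrichment
    in the highest 10% group.

    `stats` maps motif -> {"p": ..., "enrich_high": ..., "enrich_low": ...}
    (a hypothetical layout chosen for this sketch).
    """
    kept = [
        (motif, s) for motif, s in stats.items()
        if s["p"] < p_cutoff and s["enrich_low"] < 1
    ]
    kept.sort(key=lambda item: item[1]["enrich_high"], reverse=True)
    return [motif for motif, _ in kept[:n]]

# Made-up values for illustration only.
example_stats = {
    "PPP": {"p": 1e-6, "enrich_high": 5.0, "enrich_low": 0.2},
    "PPG": {"p": 1e-5, "enrich_high": 3.0, "enrich_low": 0.5},
    "AAA": {"p": 0.2, "enrich_high": 4.0, "enrich_low": 0.5},
    "GGG": {"p": 1e-6, "enrich_high": 6.0, "enrich_low": 1.5},
}
example_hits = top_downregulated_motifs(example_stats)
```

The screen for motifs enriched among upregulated proteins follows by swapping the roles of the two enrichment values.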

2.6 Motif verification

Luria-Bertani (LB) broth and base M9 minimal salts media were of standard composition (89). Media were supplemented with 200μg/ml ampicillin when required. All cultures were incubated at 37°C. Overnight cultures of E. coli strains harboring pBAD30 constructs in LB were diluted to an OD600 of 0.05 in M9 media supplemented with 0.2% glycerol. After two hours, or when the OD600 reached 0.1 - 0.15, the culture was supplemented with 0.2% arabinose to induce synthesis of the GFP and mCherry reporter proteins, and fluorescence was assessed using a spectrofluorometer (Horiba) at designated time points. Cells were analyzed for GFP using excitation at 481nm and emission at 507nm, and for mCherry with excitation at 587nm and emission at 610nm. The background from blank media was subtracted and the ratio of GFP fluorescence over that of mCherry was calculated. Reported values represent averages and standard deviations determined from three independent experimental replicates.
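The readout calculation can be sketched as below. The function names and input layout are hypothetical, the example readings are invented, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

def gfp_mcherry_ratio(gfp, mcherry, gfp_blank, mcherry_blank):
    """Background-subtracted GFP signal divided by background-subtracted
    mCherry signal for one culture."""
    return (gfp - gfp_blank) / (mcherry - mcherry_blank)

def summarize(replicates):
    """Mean and sample standard deviation of the ratio across replicates.

    `replicates` is a list of (gfp, mcherry, gfp_blank, mcherry_blank)
    tuples, one per independent experiment.
    """
    ratios = [gfp_mcherry_ratio(*r) for r in replicates]
    return mean(ratios), stdev(ratios)

# Invented readings for three replicates (arbitrary fluorescence units).
example_mean, example_sd = summarize(
    [(110, 60, 10, 10), (210, 110, 10, 10), (310, 110, 10, 10)]
)
```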


CHAPTER 3. PREDICTING SEQUENCE MOTIFS TARGETS OF EF-P

3.1 Identification of EF-P regulated proteins by SILAC

We have previously examined with our collaborators the proteome of the Salmonella poxA mutant using 2D-DIGE. In that study, total cellular proteins from wild-type and poxA mutants were labeled with fluorescent dyes, mixed in equal amounts and then separated by 2D-PAGE prior to analysis (90). The results suggested that a relatively small subset of proteins was affected by perturbations in the PYE pathway, a finding in agreement with earlier work on the efp mutant of Agrobacterium (91). However, only a small number of these proteins were unambiguously identified by mass spectrometry due to crowding on the 2D gel.

To gain a more comprehensive view of the effect of EF-P on protein levels, we employed stable isotope labeling of amino acids in cell culture (SILAC) in conjunction with quantitative mass spectrometry-based proteomics to examine the proteome of an efp mutant of Salmonella enterica Sv. Typhimurium strain SL1344 (strain WN1308). We examined the profiles of three biological replicates and were able to detect, quantify and identify a total of 1517 proteins, or approximately 34% of the 4514 proteins predicted to be encoded in the S. Typhimurium strain SL1344 genome (Figure 4). To identify proteins showing altered levels, we first selected candidates that showed a significant difference between the wild-type and efp mutant strains in at least one of the three biological replicates with a Significance B cutoff of 0.01 (see methods). By this criterion, 87 proteins showed changes of two-fold or greater and 28 displayed a change of greater than ten-fold. Of the 87 significantly misregulated proteins, 49 showed decreased steady-state levels in the efp mutant strain and are more likely to be direct targets of EF-P owing to its characterized stimulatory effect on translation (92, 93).


Figure 4. A subset of proteins is significantly misregulated in Δefp Salmonella. Histogram outlining the distribution of protein synthesis ratios identified in SILAC. Columns indicate the number of proteins with average synthesis ratio between two neighboring x-axis values. Underlined values in the x-axis indicate a change in scale. Inset table shows the number of SILAC hits demonstrating a greater than two-fold difference in protein level between the efp+ (WN1269) and ∆efp (WN1308) strains. The second column further indicates the proteins with a Significance B value of less than 0.01 in at least one trial. ‘Total’ indicates the number of proteins identified in at least one replicate regardless of expression ratio and the second column of this row includes proteins with any expression ratio that had a Significance B value of less than 0.01 in at least one trial. Synthesis ratios shown are the average normalized heavy / light ratios of three biological replicates (81).


3.2 Identification of amino acid motifs enriched in EF-P regulated proteins

To determine if misregulation of specific proteins in efp mutants can be attributed to certain motifs, we analyzed the SILAC data for the prevalence of specific peptide sequences in proteins showing a strong decrease or increase in the Salmonella efp mutant.

Preliminary examination showed that the analysis of tripeptides is feasible. Searching tetrapeptides produced largely unreliable statistics due to the lower number of occurrences of each combination, whereas dipeptide targets were statistically insignificant. In addition, probing nucleotides rather than peptides revealed no considerably over-represented targets. Based on these initial results (data not shown), we decided to focus on tripeptides. We searched for the frequency of each possible tripeptide motif in the 10% of proteins most strongly affected (either up or down) by EF-P. We compared the actual frequency of each motif to its expected frequency and to the frequency observed in the remaining 90% of proteins identified in SILAC. An enrichment score was calculated as the ratio of observed to expected occurrences of a tripeptide within this group, normalized against the ratio of observed to expected occurrences in all retrieved SILAC hits (see methods). Using this method we identified a number of tripeptide motifs that were enriched in the 10% of proteins that were most downregulated in the efp mutant (Table 1).
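The enrichment score can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes that "all retrieved SILAC hits" corresponds to the sum of the high, mid and low subgroups; under that reading, the calculation reproduces the enrichment values reported for PPP in Table 1.

```python
def enrichment_index(occ_group, exp_group, occ_all, exp_all):
    """(observed/expected in a subgroup) divided by (observed/expected overall)."""
    if exp_group <= 0 or exp_all <= 0 or occ_all <= 0:
        raise ValueError("expected counts and overall occurrences must be positive")
    return (occ_group / exp_group) / (occ_all / exp_all)


# PPP row of Table 1: (occurrences, expected occurrences) per subgroup.
ppp = {"high": (18, 4.96), "mid": (15, 29.1), "low": (0, 3.16)}

occ_all = sum(occ for occ, _ in ppp.values())
exp_all = sum(exp for _, exp in ppp.values())

enrich_high = enrichment_index(*ppp["high"], occ_all, exp_all)
enrich_low = enrichment_index(*ppp["low"], occ_all, exp_all)
print(round(enrich_high, 2), round(enrich_low, 2))  # 4.09 0.0
```

Normalizing by the overall observed-to-expected ratio corrects for motifs that are globally rarer or more common than amino acid prevalence alone would predict.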

The two top scoring tripeptide sequences identified were PPP and PPG, consistent with recent findings regarding EF-P regulation and serving as a proof of principle for our analysis (92, 93). Additionally, a number of previously uncharacterized sequence motifs were identified, including the third highest scoring sequence, APP, implicating them as potential EF-P dependent motifs. While this thesis was in preparation, several APP-containing motifs such as MRAPP and WAPP were identified as sequences capable of stalling ribosomes (38).

Motif (b)  Occ_high  Exp_high  Occ_mid  Exp_mid  Occ_low  Exp_low  Enrich_high  Enrich_low  P value (c)
PPP        18        4.96      15       29.1     0        3.16     4.09         0           2.10E-10
PPG        28        8.48      26       49.7     1        5.4      3.82         0.21        5.60E-14
APP        18        11.1      22       65.1     0        7.07     3.38         0           7.60E-10
RME        12        5.3       16       31.1     2        3.38     3            0.79        1.30E-04
YIR        10        4.64      14       27.2     1        2.96     3            0.47        4.70E-04
TQM        11        3.64      17       21.4     0        2.32     2.95         0           1.40E-04
PFF        10        2.84      15       16.6     2        1.81     2.78         0.87        7.40E-05
DPP        5         6.73      8        39.4     1        4.29     2.68         0.84        2.00E-07
FFL        5         6.15      8        36.1     1        3.92     2.68         0.84        1.40E-06
QNA        22        10        43       58.8     5        6.39     2.36         0.84        3.00E-05

Table 1. Bioinformatics analysis predicts EF-P regulated tripeptide motifs (81). a Prevalence among the SILAC hits of all possible tripeptide combinations of the 20 common amino acids. Occ, motif occurrences in the indicated subgroup of the SILAC data; Exp, expected motif occurrences in the indicated subgroup based on amino acid prevalence; Enrich, calculated motif enrichment index in the indicated subgroup of the SILAC data; high, low, and mid, subgroups of SILAC data encompassing the 10% of proteins with the highest (high) or lowest (low) WN1269/WN1308 ratios; “mid” incorporates the remaining 80%. b Motifs shown are those that demonstrated the greatest enrichment index within the 10% of proteins most downregulated in the efp mutant (highest WN1269/WN1308 ratio). c P values were calculated using the chi-square test with Yates’ correction as described in Materials and Methods.


3.3 Extending motif candidates beyond tripeptide

To explore potential motif targets longer than three amino acids, we investigated all combinations of motif peptides with four amino acids using the same protocol (data not shown). Results showed generally higher P values due to lower occurrences of each tetrapeptide motif in all three groups, resulting in a dataset more susceptible to noise. Nevertheless, this analysis did not identify any significantly over-represented tetrapeptide motifs as targets of EF-P.

To further investigate the possible expansion of target tripeptides, we looked at the local peptide environment 10 amino acids upstream and downstream of the selected tripeptide candidates. Bioinformatics analyses using MEME and LOGO were performed using the top three predicted tripeptide motifs PPP, PPG and APP in proteins whose synthesis is at least two-fold up- or downregulated by EF-P (data not shown). However, both results lack robust statistical significance. Together, these results led us to focus on tripeptides as the major sequence motif targets of EF-P.


CHAPTER 4. VERIFICATION OF PREDICTED EF-P TARGET MOTIFS

4.1 Verification of motif targets using a fluorescent reporter

To test the hypothesis that EF-P plays a role in the translation of predicted motifs, we constructed a number of translational fusion plasmids wherein putative target sequences were inserted in-frame at the fourth codon of gfp (94), followed by an mCherry as an internal control for plasmid copy number variation (Figure 5).

Figure 5. Target motifs verified by a fluorescent reporter. Plasmids used for motif verification were derivatives of pBAD30 containing a tandem fluorescent fusion cassette composed of green fluorescent protein (GFP) followed immediately by the mCherry Shine-Dalgarno sequence and mCherry, both of which are optimized for synthesis in E. coli. Target motifs are inserted in-frame at the fourth codon of gfp (94). Unless otherwise stated, the codons used for motif inserts were optimized according to tRNA abundance in E. coli (82).

Relative fluorescence (GFP/mCherry) was significantly decreased in an efp mutant relative to wild-type E. coli when any of the PPP, PPG or APP motifs were inserted into GFP (Table 2). Interestingly, insertion of a YIRYIR motif also caused production of the GFP reporter to be dependent on EF-P. This is the only EF-P target motif identified that lacks a proline residue. In contrast, none of the other predicted motifs – including a single YIR sequence – yielded a significant difference in fluorescence.

To further examine whether translation of the characterized motifs is affected by the identity of the nucleotides, we randomized the codon usage of a motif of six consecutive prolines. Despite the codon alteration, the WT/Δefp fluorescence ratios of these polyproline stretches are relatively similar to each other, and all are much higher than those of the control and the tri-proline motif (Table 2). This result indicates that the putative motif-induced EF-P regulation is determined by peptide sequence rather than codon usage.

Motif (a)     GFP fluorescence (WT/Δefp) (b)
Null (c)      0.95±0.09
PPPPPP0 (d)   20.68±0.38
PPPPPP1 (d)   18.55±0.38
PPPPPP2 (d)   18.12±0.20
PPPPPP3 (d)   15.97±0.08
PPP           4.72±0.23
PPG           10.39±0.66
APP           5.88±0.20
RME           1.34±0.07
YIR           0.96±0.13
YIRYIR        7.61±2.54
PFF           1.58±0.19

Table 2. Verification of predicted EF-P dependent motifs (81). a Motifs were assayed for EF-P dependence by insertion into the 4th codon position of GFP. b Values at 21 h post-induction were normalized to co-transcriptionally expressed mCherry and are shown as a ratio of wild-type and efp mutant strains expressing the same construct. All values are the averages ± standard deviations for three biological replicates. c No motif inserted into GFP. d Sequence of six optimal (CCG; 0) or random (1 to 3) proline codons.


To investigate whether this effect depends on EF-P modification, we transferred fluorescent reporter plasmids containing target motifs into E. coli strains with defects in EF-P modification. When the motif target PPPPPP is inserted, a dramatic decrease in relative fluorescence is seen in the ΔpoxA and ΔyjeK backgrounds. However, this level is still somewhat higher than that seen with complete depletion of EF-P, suggesting slight retention of its function (Figure 6). In contrast, in the ΔyfcM background translation of PPPPPP is comparable to that seen in a fully modified EF-P background, suggesting a secondary role for the final hydroxylation on EF-P. These results not only confirm the previous discovery that modification of β-lysine on EF-P by YjeK and PoxA is essential for EF-P function in facilitating peptide bond formation, but also validate the predicted motifs as regulated targets of EF-P.

Figure 6. Translation of poly-proline motifs with defects in EF-P modification. Data represent the average of three independent experiments. a No motif inserted into GFP.

4.2 The presence of a PPP, PPG or APP motif is not sufficient to confer dependence on EF-P

To investigate whether the characterized target motifs are sufficient for EF-P regulation in vivo, we compared the occurrences of PPP, PPG or APP motifs in proteins with their SILAC ratios. Of the 422 proteins that contain one of the three target motifs, 100 were conclusively identified in SILAC (Figure 7A). Only 20 of these were found to be significantly downregulated in the efp mutant strain. A more conservative analysis of proteins with SILAC ratios of less than two (without regard for the significance score) found that there were 45 proteins identified in SILAC that were not downregulated in S. Typhimurium WN1308 yet have a PPP, PPG or APP motif (12, 22 and 16 proteins, respectively, contain each motif; 5 proteins have two of the three motifs). This demonstrates that a large percentage of proteins that contain a characterized EF-P target motif are not misregulated in the efp deletion strain.

Examples include ZipA, SseA and YtfM, which were identified to have average expression ratios of 1.07, 0.74 and 0.84 respectively with Significance B scores that were far from significant in all three replicates (81). Although they were not misregulated in the efp mutant, ZipA has two distinct APP motifs as well as an APPP motif, SseA has an APPG motif, and YtfM contains a PPP motif. Taken together, these data suggest that the presence of particular tripeptide motifs may not, in and of itself, be sufficient to cause translational stalling in the absence of EF-P.
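The comparison described above amounts to intersecting the set of motif-containing proteins with the SILAC results. A minimal Python sketch of that bookkeeping is given below; the function names, protein names and sequences are hypothetical toy data and do not come from the SL1344 genome.

```python
TARGET_MOTIFS = ("PPP", "PPG", "APP")


def motifs_in(sequence, motifs=TARGET_MOTIFS):
    """Return the set of target motifs present anywhere in a protein sequence."""
    return {m for m in motifs if m in sequence}


def cross_reference(proteome, detected, downregulated):
    """Partition motif-containing proteins by their SILAC outcome."""
    with_motif = {name for name, seq in proteome.items() if motifs_in(seq)}
    motif_detected = with_motif & detected
    return {
        "with_motif": with_motif,
        "motif_and_detected": motif_detected,
        "motif_and_downregulated": motif_detected & downregulated,
        "motif_not_downregulated": motif_detected - downregulated,
    }


# Hypothetical toy data; names and sequences are invented for illustration.
proteome = {
    "protA": "MKAPPGLT",  # contains APP and PPG
    "protB": "MSTPPPKV",  # contains PPP
    "protC": "MKLVIRNE",  # no target motif
    "protD": "MAPPTTQL",  # contains APP
}
detected = {"protA", "protB", "protC"}  # quantified in SILAC
downregulated = {"protB"}               # significantly lower in the mutant

groups = cross_reference(proteome, detected, downregulated)
print(sorted(groups["motif_not_downregulated"]))
```

In this toy example protA plays the role of proteins such as ZipA: it carries target motifs and was quantified, yet is not among the downregulated proteins.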


Figure 7. Comparison of proteins identified in SILAC and those with EF-P target motifs (81). (A) Venn diagram outlining the overlap of: proteins conclusively identified in SILAC, proteins significantly downregulated in WN1308, and proteins containing a PPP, PPG, or APP motif as annotated in the Salmonella serovar Typhimurium strain SL1344 genome (GenBank accession no. FQ312003.1). (B) DAVID analysis of SILAC hits showing a decreased protein level in the Δefp strain (ratio > 2) and a significance B score of less than 0.01 in at least one trial. Functional annotation clusters showing significant overrepresentation (P value < 0.05) are shown. (C) DAVID analysis showing the most significantly overrepresented clusters among the 422 proteins that contain an EF-P target motif. For clarity, only groups with P values < 0.001 are shown here. (D) DAVID analysis showing the only significantly overrepresented cluster (P value < 0.05) among the 20 proteins that fall into all three categories (contain an EF-P target motif and were significantly downregulated in the Δefp strain in SILAC).

4.3 EF-P disproportionately affects the synthesis of signaling and nucleotide-binding/metabolic proteins

To examine whether EF-P affected the synthesis of all classes of proteins similarly or if there was a particular bias toward a specific subset of proteins, the SILAC data were subjected to a functional annotation analysis by the Navarre lab using the Database for Annotation, Visualization and Integrated Discovery (DAVID) software package (95, 96, 97). The program compares a list of genes with functional annotation databases including GO terms, KEGG pathways and SP-PIR keywords amongst others. By comparing the prevalence of proteins belonging to these categories in their respective database and in the input gene list, DAVID generates a P value highlighting significantly overrepresented annotation terms.

Upon analysis of the 49 significantly downregulated proteins identified in SILAC, we find that four clusters demonstrate overrepresentation with a cluster P value of less than 0.05 (Figure 7B). Most prominent amongst these terms were two-component regulatory systems, with particular emphasis on proteins involved in chemotaxis and motility. Furthermore, metabolic proteins were also abundantly downregulated as annotated by functions in nucleotide binding, oxidative , or proteolysis.

We subsequently examined the S. Typhimurium SL1344 genome for proteins containing EF-P dependent sequences and found 422 ORFs encoding a PPP, PPG or APP motif. These motifs occurred in 112 (PPP), 195 (PPG), and 185 (APP) proteins, with 70 of those proteins containing more than one of these motifs. In contrast, the sequence YIRYIR is not present in any Salmonella protein. To examine the physiological processes involving proteins with EF-P dependent motifs, we conducted a DAVID cluster analysis and found that the overrepresented functional groups were similar to those identified amongst proteins that were significantly downregulated in SILAC (Figure 7C). Many of the identified proteins with a PPP/PPG/APP motif are predicted membrane proteins, a group we previously investigated by comparative gel electrophoresis. We previously identified the outer membrane porin, KdgM, as significantly upregulated in the efp mutant (71). The SILAC analysis confirms KdgM as a protein that is 38-fold more abundant in the efp mutant (81).

Twenty proteins were identified that both contained a target motif and were significantly downregulated in SILAC. Of these, seven were identified as nucleotide binding proteins, the only overrepresented category in this subset of 20 proteins (Figure 7D). Some of these proteins are involved in central metabolism. AtpD, the catalytic subunit of the F0F1 ATPase, had previously been identified as downregulated in the proteome of poxA mutants (69). PfkB is a phosphofructokinase that functions in glycolysis. Three of the other EF-P dependent nucleotide binding proteins (HflB/FtsH, HslU, and ClpB) play a role in protein stability and turnover. The role of the polyproline motifs in these proteins is unlikely to be universally conserved. For example, in AtpD (50), ClpB (98), and PfkB (PDB ID: 3UMP) the putative EF-P dependent motifs are not proximal to the region of the protein that interacts with ATP, whereas in HflB/FtsH the putative EF-P dependent motif (GPPG) makes contact with AMP (99).

Some proteins identified as EF-P-dependent by SILAC or by the presence of a polyproline motif can provide a parsimonious explanation for previously described phenotypes of Salmonella strains lacking PoxA or EF-P. We observed, for example, that the gamma-glutamyl transferase (Ggt) is present at a level approximately 16-fold lower in the efp mutant compared to wild-type Salmonella (81). Ggt contains a short polyproline motif (residues 291-293) and its strongly reduced synthesis in mutant cells likely explains why Salmonella strains lacking EF-P, PoxA, or YjeK are simultaneously unable to utilize γ-glutamyl-glycine as a nitrogen source and are resistant to the compound GSNO (S-nitrosoglutathione) (71, 90). The misregulation of proteins involved in motility and chemotaxis, such as CheA, which is downregulated approximately four-fold in the efp mutant, likely contributes to the observed motility defect in poxA, yjeK and efp mutant strains (71).

4.4 Role of EF-P is not specific to facilitating the first peptide bond formation

We modified our fluorescence reporter system to address the previously proposed function of EF-P in facilitating the first peptide bond (58) as well as the theory that the property of the second amino acid can affect the regulatory role of EF-P (60). In this reporter system, we added a leader region of the fliC gene, which contains its Shine-Dalgarno sequence and first 9 amino acids, upstream of the start codon of GFP. SILAC data suggest that fliC is 3.7-fold downregulated in the efp mutant, confirming the previous report that fliC is positively regulated by EF-P (69). Regardless, if the property of the first peptide bond specifies the regulation of translation by EF-P, it is plausible to expect a change when the second amino acid is substituted.

However, systematic alteration of the second amino acid in fliC showed no obvious interference with the ability of EF-P to facilitate translation (Table 3). Although it has been previously proposed from in vitro assays using puromycin that the size of the side chain of the second amino acid could influence the ability of EF-P to facilitate the first peptide bond, this correlation is not observed here in vivo (60). Our results demonstrate that neither altering the identity of the second amino acid, using residues of various sizes and properties, nor changing its codon resulted in significant changes in relative fluorescence ratios. Further scrutiny of the SILAC data revealed no over-representation of specific second amino acids in the proteins misregulated by EF-P (data not shown). Taken together, these results strongly challenge the previously proposed mechanism in which regulation by EF-P focuses on the first peptide bond and might be influenced by the second amino acid.


Second Codon (Amino Acid)   Relative Fluorescence (WT/Δefp)
GCA (Ala)                   1.02±0.09
UGG (Trp)                   1.40±0.03
UUC (Phe)                   0.78±0.15
UUU (Phe)                   1.02±0.01
AAG (Lys)                   1.23±0.10
AAA (Lys)                   1.06±0.13
AGC (Ser)                   0.75±0.19
UCU (Ser)                   1.17±0.09
GGU (Gly)                   1.16±0.08
GGC (Gly)                   0.92±0.07

Table 3. Identity of the second amino acid has marginal influence on EF-P mediated translation. Data represent the average of three independent experiments.


CHAPTER 5. EF-P TARGET: BEYOND SEQUENCE MOTIFS

In the previous chapters, we assessed the steady-state levels of over 1500 Salmonella proteins using SILAC and found that less than six percent were significantly misregulated by more than two fold in a strain lacking EF-P. A functional annotation analysis of the corresponding downregulated genes revealed that proteins involved in responding to environmental stimuli, motility, metabolism and proteolysis were overrepresented among the group affected by the loss of EF-P. These data lend insight into the causes underlying the pleiotropic phenotypes exhibited by efp mutants by implicating particular pathway components. For instance, the finding that two-component systems were the most highly overrepresented group amongst the downregulated proteins suggests that impaired responses to external stimuli contribute to the observed sensitivity of the efp mutant to a range of pharmacologically unrelated antibiotics. Determining the specific defective pathways in future studies may point to potential targets for novel antimicrobials that would act synergistically with existing drugs.

Following the hypothesis that proteins misregulated by EF-P are characterized by certain sequence motifs, we probed the proteomic data for an abundance of such sequences in downregulated proteins. Several tripeptides prominent amongst downregulated proteins were confirmed by using a translational fusion assay, suggesting a novel mechanism where EF-P rescues ribosomes stalled at particular target motifs. Notably, identified tripeptide motifs such as PPP, PPG and APP are characterized by high proline content. This is consistent with the recent findings that EF-P alleviates stalls caused by motifs containing polyproline stretches that optionally end with a glycine during translational elongation (92, 93), although APP had not been identified before this study. In addition, our SILAC data led to the identification of the first characterized EF-P target motif that lacks a proline: YIRYIR. Furthermore, recent analysis of PoxB identified the motif GSCGPG as responsible for its regulation by EF-P (81).

These data not only suggest a wider array of EF-P target motifs which are not limited to three amino acids or polyprolines, but also provide a unifying mechanism for the physiological consequences that are triggered by the loss of EF-P. This is further reinforced by the fact that the tripeptide motifs can be relatively common in the proteome and yet the overall number of proteins affected strongly by the loss of EF-P is considerably smaller. Indeed, of the one hundred target motif-containing proteins identified in SILAC, only twenty were significantly downregulated. The data suggest that tripeptide motifs including polyproline tracts may not always be sufficient on their own to mediate a translational stall and may depend on upstream interactions within the exit tunnel as well. Conversely, some weaker tripeptide motifs such as the GPG of PoxB may instigate ribosomal stalls if sufficiently strengthened by upstream interactions.

These results cumulatively suggest that structural features may underlie the EF-P target motifs. To explore this possibility, we looked to existing databases, literature and prediction tools for insights. One particularly interesting discovery is the link between EF-P motif targets and their unusually low residue flexibility, which is defined as a small standard deviation in the relative distances between C1 (Cα1 and Cβ1) and C3 (Cα3 and Cβ3) in a tripeptide motif (Figure 8).

High resolution (≤ 2.0 Å) structures from the Protein Data Bank (PDB) were used to determine the local structure of all 8000 possible combinations of tripeptides. Based on various distances between Cα and Cβ atoms and their respective standard deviations, 18% of the tripeptides tested are classified as rigid, 4% as non-rigid and 78% as intermediate (100). Interestingly, all the verified EF-P motif targets (with the exception of YIRYIR) are categorized as rigid tripeptides (Table 4).


Figure 8. Tripeptide "R1 R2 R3" with Cα and Cβ positions. Reproduced from (100).

Tripeptide Motif   GFP Fluorescence (WT / Δefp)   Enrich_high   P value    Predicted Flexibility
null†              0.95 ± 0.09                    NA            NA         NA
PPP                4.72 ± 0.23                    4.09          2.1E-10    Rigid
PPG                10.39 ± 0.66                   3.82          5.6E-14    Rigid
APP                5.88 ± 0.20                    3.38          7.6E-10    Rigid
RME                1.34 ± 0.07                    3             0.00013    Rigid
YIR                0.96 ± 0.13                    3             0.00047    Rigid
YIRYIR             7.61 ± 2.54                    NA            NA         NA

Table 4. Motif targets and structural rigidity (100).


These results, while tentative, agree with the notion that the structure of particular residues in a protein can play important regulatory roles in its translation (49, 101). In particular, the conformation of nascent peptides in the ribosome tunnel can affect protein folding and signal recognition (102). Previous studies have also linked proline to rigid local peptide structure as well as ribosome stalling. Indeed, rigidity due to proline is well understood, since the side chain forms a covalent bond with the backbone amine, forming a ring (46, 103, 104). Further, proline is involved in several characterized types of ribosome stalling caused by nascent peptides (49, 105). Finally, it has also been shown that conformational flexibility is limited within the ribosomal exit tunnel, so local motifs with strong structural rigidity can potentially interfere with the PTC and cause ribosome stalling (49, 106, 107). Taken together, it is conceivable that short peptide motifs with specific structures characterized by low local flexibility could cause blockages in the immediate exit tunnel, resulting in ribosomal stalling. The binding of EF-P between the E-site and P-site could then serve to alleviate this blockage.

It is clear that not all stalls are created equal and we note that previously reported peptides that stall during elongation function in wild-type circumstances where EF-P is present and (presumably) active. Multiple groups have reported on proteins that instigate a translational stall via not only a particular sequence in the vicinity of the PTC but also upstream residues in the nascent polypeptide chain that interact with the ribosomal exit tunnel. These stalls may be dependent on external factors such as the antibiotic in the case of ermAL1 and tryptophan for TnaC, or may be self-mediated as for SecM and MifM (49, 51, 108, 109, 110, 111). The laboratory of Allen Buskirk identified a number of stall sequences containing polyproline tracts near the PTC that also require upstream residues for efficient stalling (40, 112).

Future work will delineate the specific structural features that enable certain nascent chain induced stalls to be alleviated by EF-P while others, as in the case of SecM or TnaC, require alternate factors to rescue the stall. Also unresolved is whether EF-P preferentially targets particular subsets of proteins (e.g. nucleotide binding proteins) for the purpose of regulation or whether this EF-P-dependent subset of proteins merely requires difficult-to-translate structural features such as polyprolines for functional purposes.


REFERENCES 1. Ramakrishnan, V. (2002) Ribosome structure and the mechanism of translation, Cell. 108, 557-72. 2. Schmeing, T. M. & Ramakrishnan, V. (2009) What recent ribosome structures have revealed about the mechanism of translation, Nature. 461, 1234-42. 3. Gao, H., Sengupta, J., Valle, M., Korostelev, A., Eswar, N., Stagg, S. M., Van Roey, P., Agrawal, R. K., Harvey, S. C., Sali, A., Chapman, M. S. & Frank, J. (2003) Study of the structural dynamics of the E coli 70S ribosome using real-space refinement, Cell. 113, 789-801. 4. Yusupov, M. M., Yusupova, G. Z., Baucom, A., Lieberman, K., Earnest, T. N., Cate, J. H. & Noller, H. F. (2001) Crystal structure of the ribosome at 5.5 A resolution, Science. 292, 883-96. 5. Schuwirth, B. S., Borovinskaya, M. A., Hau, C. W., Zhang, W., Vila-Sanjurjo, A., Holton, J. M. & Cate, J. H. (2005) Structures of the bacterial ribosome at 3.5 A resolution, Science. 310, 827-34. 6. Selmer, M., Dunham, C. M., Murphy, F. V. t., Weixlbaumer, A., Petry, S., Kelley, A. C., Weir, J. R. & Ramakrishnan, V. (2006) Structure of the 70S ribosome complexed with mRNA and tRNA, Science. 313, 1935-42. 7. Harms, J., Schluenzen, F., Zarivach, R., Bashan, A., Gat, S., Agmon, I., Bartels, H., Franceschi, F. & Yonath, A. (2001) High resolution structure of the large ribosomal subunit from a mesophilic eubacterium, Cell. 107, 679-88. 8. Nikulin, A., Eliseikina, I., Tishchenko, S., Nevskaya, N., Davydova, N., Platonova, O., Piendl, W., Selmer, M., Liljas, A., Drygin, D., Zimmermann, R., Garber, M. & Nikonov, S. (2003) Structure of the L1 protuberance in the ribosome, Nature structural biology. 10, 104-8. 9. Diaconu, M., Kothe, U., Schlunzen, F., Fischer, N., Harms, J. M., Tonevitsky, A. G., Stark, H., Rodnina, M. V. & Wahl, M. C. (2005) Structural basis for the function of the ribosomal L7/12 stalk in factor binding and GTPase activation, Cell. 121, 991-1004. 10. Simonetti, A., Marzi, S., Jenner, L., Myasnikov, A., Romby, P., Yusupova, G., Klaholz, B. P. 
& Yusupov, M. (2009) A structural view of translation initiation in bacteria, Cellular and molecular sciences : CMLS. 66, 423-36. 11. Antoun, A., Pavlov, M. Y., Lovmar, M. & Ehrenberg, M. (2006) How initiation factors maximize the accuracy of tRNA selection in initiation of bacterial protein synthesis, Molecular cell. 23, 183-93. 12. Antoun, A., Pavlov, M. Y., Lovmar, M. & Ehrenberg, M. (2006) How initiation factors tune the rate of initiation of protein synthesis in bacteria, The EMBO journal. 25, 2539-50. 13. Pavlov, M. Y., Antoun, A., Lovmar, M. & Ehrenberg, M. (2008) Complementary 43 roles of initiation factor 1 and ribosome recycling factor in 70S ribosome splitting, The EMBO journal. 27, 1706-17. 14. Kaminishi, T., Wilson, D. N., Takemoto, C., Harms, J. M., Kawazoe, M., Schluenzen, F., Hanawa-Suetsugu, K., Shirouzu, M., Fucini, P. & Yokoyama, S. (2007) A snapshot of the 30S ribosomal subunit capturing mRNA via the Shine-Dalgarno interaction, Structure. 15, 289-97. 15. Gotz, F., Dabbs, E. R. & Gualerzi, C. O. (1990) Escherichia coli 30S mutants lacking protein S20 are defective in translation initiation, Biochimica et biophysica acta. 1050, 93-7. 16. Hennelly, S. P., Antoun, A., Ehrenberg, M., Gualerzi, C. O., Knight, W., Lodmell, J. S. & Hill, W. E. (2005) A time-resolved investigation of ribosomal subunit association, Journal of . 346, 1243-58. 17. Allen, G. S., Zavialov, A., Gursky, R., Ehrenberg, M. & Frank, J. (2005) The cryo-EM structure of a translation initiation complex from Escherichia coli, Cell. 121, 703-12. 18. Grigoriadou, C., Marzi, S., Pan, D., Gualerzi, C. O. & Cooperman, B. S. (2007) The translational fidelity function of IF3 during transition from the 30 S initiation complex to the 70 S initiation complex, Journal of molecular biology. 373, 551-61. 19. Grigoriadou, C., Marzi, S., Kirillov, S., Gualerzi, C. O. & Cooperman, B. S. 
(2007) A quantitative kinetic scheme for 70 S translation initiation complex formation, Journal of molecular biology. 373, 562-72. 20. Noble, C. G. & Song, H. (2008) Structural studies of elongation and release factors, Cellular and molecular life sciences : CMLS. 65, 1335-46. 21. Schmeing, T. M., Voorhees, R. M., Kelley, A. C., Gao, Y. G., Murphy, F. V. t., Weir, J. R. & Ramakrishnan, V. (2009) The crystal structure of the ribosome bound to EF-Tu and aminoacyl-tRNA, Science. 326, 688-94. 22. Pingoud, A., Urbanke, C., Krauss, G., Peters, F. & Maass, G. (1977) Ternary complex formation between elongation factor Tu, GTP and aminoacyl-tRNA: an equilibrium study, European journal of biochemistry / FEBS. 78, 403-9. 23. Blanchard, S. C., Gonzalez, R. L., Kim, H. D., Chu, S. & Puglisi, J. D. (2004) tRNA selection and kinetic proofreading in translation, Nature structural & molecular biology. 11, 1008-14. 24. Kavaliauskas, D., Nissen, P. & Knudsen, C. R. (2012) The busiest of all ribosomal assistants: elongation factor Tu, Biochemistry. 51, 2642-51. 25. Beringer, M. & Rodnina, M. V. (2007) The ribosomal peptidyl transferase, Molecular cell. 26, 311-21. 26. Frank, J., Gao, H., Sengupta, J., Gao, N. & Taylor, D. J. (2007) The process of mRNA-tRNA translocation, Proceedings of the National Academy of Sciences of the United States of America. 104, 19671-8. 27. Simonovic, M. & Steitz, T. A. (2009) A structural view on the mechanism of the 44 ribosome-catalyzed peptide bond formation, Biochimica et biophysica acta. 1789, 612-23. 28. Dunkle, J. A. & Cate, J. H. (2010) Ribosome structure and dynamics during translocation and termination, Annual review of biophysics. 39, 227-44. 29. Petry, S., Weixlbaumer, A. & Ramakrishnan, V. (2008) The termination of translation, Current opinion in structural biology. 18, 70-7. 30. Kisselev, L. L. & Buckingham, R. H. (2000) Translational termination comes of age, Trends in biochemical sciences. 25, 561-6. 31. 
Weixlbaumer, A., Jin, H., Neubauer, C., Voorhees, R. M., Petry, S., Kelley, A. C. & Ramakrishnan, V. (2008) Insights into translational termination from the structure of RF2 bound to the ribosome, Science. 322, 953-6. 32. Zavialov, A. V., Buckingham, R. H. & Ehrenberg, M. (2001) A posttermination ribosomal complex is the guanine nucleotide exchange factor for peptide release factor RF3, Cell. 107, 115-24. 33. Gao, H., Zhou, Z., Rawat, U., Huang, C., Bouakaz, L., Wang, C., Cheng, Z., Liu, Y., Zavialov, A., Gursky, R., Sanyal, S., Ehrenberg, M., Frank, J. & Song, H. (2007) RF3 induces ribosomal conformational changes responsible for dissociation of class I release factors, Cell. 129, 929-41. 34. Karimi, R., Pavlov, M. Y., Buckingham, R. H. & Ehrenberg, M. (1999) Novel roles for classical factors at the interface between translation termination and initiation, Molecular cell. 3, 601-9. 35. Peske, F., Rodnina, M. V. & Wintermeyer, W. (2005) Sequence of steps in ribosome recycling as defined by kinetic analysis, Molecular cell. 18, 403-12. 36. Gao, N., Zavialov, A. V., Li, W., Sengupta, J., Valle, M., Gursky, R. P., Ehrenberg, M. & Frank, J. (2005) Mechanism for the disassembly of the posttermination complex inferred from cryo-EM studies, Molecular cell. 18, 663-74. 37. Hirokawa, G., Demeshkina, N., Iwakura, N., Kaji, H. & Kaji, A. (2006) The ribosome-recycling step: consensus or controversy?, Trends in biochemical sciences. 31, 143-9. 38. Woolstenhulme, C. J., Parajuli, S., Healey, D. W., Valverde, D. P., Petersen, E. N., Starosta, A. L., Guydosh, N. R., Johnson, W. E., Wilson, D. N. & Buskirk, A. R. (2013) Nascent peptides that block protein synthesis in bacteria, Proceedings of the National Academy of Sciences of the United States of America. 110, E878-87. 39. Papatriantafyllou, M. (2012) Freeing stalled ribosomes, Nature reviews Molecular cell biology. 13, 280-280. 40. Tanner, D. R., Cariello, D. A., Woolstenhulme, C. J., Broadbent, M. A. & Buskirk, A. R. 
(2009) Genetic identification of nascent peptides that induce ribosome stalling, The Journal of biological chemistry. 284, 34809-18. 41. Novoa, E. M. & Ribas de Pouplana, L. (2012) Speeding with control: codon usage, tRNAs, and ribosomes, Trends in genetics : TIG. 28, 574-81.

42. Fredrick, K. & Ibba, M. (2010) How the sequence of a gene can tune its translation, Cell. 141, 227-9. 43. Serganov, A. & Nudler, E. (2013) A decade of riboswitches, Cell. 152, 17-24. 44. Li, G. W., Oh, E. & Weissman, J. S. (2012) The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria, Nature. 484, 538-41. 45. Dana, A. & Tuller, T. (2012) Determinants of translation elongation speed and ribosomal profiling biases in mouse embryonic stem cells, PLoS computational biology. 8, e1002755. 46. Pavlov, M. Y., Watts, R. E., Tan, Z., Cornish, V. W., Ehrenberg, M. & Forster, A. C. (2009) Slow peptide bond formation by proline and other N-alkylamino acids in translation, Proceedings of the National Academy of Sciences of the United States of America. 106, 50-4. 47. Nakatogawa, H. & Ito, K. (2002) The ribosomal exit tunnel functions as a discriminating gate, Cell. 108, 629-36. 48. Cruz-Vera, L. R., Rajagopal, S., Squires, C. & Yanofsky, C. (2005) Features of ribosome-peptidyl-tRNA interactions essential for tryptophan induction of tna expression, Molecular cell. 19, 333-43. 49. Ramu, H., Vazquez-Laslop, N., Klepacki, D., Dai, Q., Piccirilli, J., Micura, R. & Mankin, A. S. (2011) Nascent peptide in the ribosome exit tunnel affects functional properties of the A-site of the peptidyl transferase center, Molecular cell. 41, 321-30. 50. Bhushan, S., Hoffmann, T., Seidelt, B., Frauenfeld, J., Mielke, T., Berninghausen, O., Wilson, D. N. & Beckmann, R. (2011) SecM-stalled ribosomes adopt an altered geometry at the peptidyl transferase center, PLoS biology. 9, e1000581. 51. Murakami, A., Nakatogawa, H. & Ito, K. (2004) Translation arrest of SecM is essential for the basal and regulated expression of SecA, Proceedings of the National Academy of Sciences of the United States of America. 101, 12330-5. 52. Nakatogawa, H. & Ito, K. (2001) Secretion monitor, SecM, undergoes self-translation arrest in the cytosol, Molecular cell. 7, 185-92. 53. 
Barends, S., Kraal, B. & van Wezel, G. P. (2011) The tmRNA-tagging mechanism and the control of gene expression: a review, Wiley interdisciplinary reviews RNA. 2, 233-46. 54. Janssen, B. D. & Hayes, C. S. (2012) The tmRNA ribosome-rescue system, Advances in protein chemistry and structural biology. 86, 151-91. 55. Bailly, M. & de Crecy-Lagard, V. (2010) Predicting the pathway involved in post-translational modification of elongation factor P in a subset of bacterial species, Biology direct. 5, 3. 56. Kyrpides, N. C. & Woese, C. R. (1998) Universally conserved translation initiation factors, Proceedings of the National Academy of Sciences of the United States of America. 95, 224-8.

57. Aoki, H., Adams, S. L., Chung, D. G., Yaguchi, M., Chuang, S. E. & Ganoza, M. C. (1991) Cloning, sequencing and overexpression of the gene for prokaryotic factor EF-P involved in peptide bond synthesis, Nucleic acids research. 19, 6215-20. 58. Glick, B. R., Chladek, S. & Ganoza, M. C. (1979) Peptide bond formation stimulated by protein synthesis factor EF-P depends on the aminoacyl moiety of the acceptor, European journal of biochemistry / FEBS. 97, 23-8. 59. Aoki, H., Dekany, K., Adams, S. L. & Ganoza, M. C. (1997) The gene encoding the elongation factor P protein is essential for viability and is required for protein synthesis, The Journal of biological chemistry. 272, 32254-9. 60. Ganoza, M. C. & Aoki, H. (2000) Peptide bond synthesis: function of the efp gene product, Biological chemistry. 381, 553-9. 61. Blaha, G., Stanley, R. E. & Steitz, T. A. (2009) Formation of the first peptide bond: the structure of EF-P bound to the 70S ribosome, Science. 325, 966-70. 62. Hanawa-Suetsugu, K., Sekine, S., Sakai, H., Hori-Takemoto, C., Terada, T., Unzai, S., Tame, J. R., Kuramitsu, S., Shirouzu, M. & Yokoyama, S. (2004) Crystal structure of elongation factor P from Thermus thermophilus HB8, Proceedings of the National Academy of Sciences of the United States of America. 101, 9595-600. 63. Park, J. H., Johansson, H. E., Aoki, H., Huang, B. X., Kim, H. Y., Ganoza, M. C. & Park, M. H. (2012) Post-translational modification by beta-lysylation is required for activity of Escherichia coli elongation factor P (EF-P), The Journal of biological chemistry. 287, 2579-90. 64. Bullwinkle, T. J., Zou, S. B., Rajkovic, A., Hersch, S. J., Elgamal, S., Robinson, N., Smil, D., Bolshan, Y., Navarre, W. W. & Ibba, M. (2013) (R)-beta-lysine-modified elongation factor P functions in translation elongation, The Journal of biological chemistry. 288, 4416-23. 65. Behshad, E., Ruzicka, F. J., Mansoorabadi, S. O., Chen, D., Reed, G. H. & Frey, P. A. 
(2006) Enantiomeric free radicals and enzymatic control of stereochemistry in a radical mechanism: the case of lysine 2,3-aminomutases, Biochemistry. 45, 12639-46. 66. Roy, H., Zou, S. B., Bullwinkle, T. J., Wolfe, B. S., Gilreath, M. S., Forsyth, C. J., Navarre, W. W. & Ibba, M. (2011) The tRNA synthetase paralog PoxA modifies elongation factor-P with (R)-beta-lysine, Nature chemical biology. 7, 667-9. 67. Peil, L., Starosta, A. L., Virumae, K., Atkinson, G. C., Tenson, T., Remme, J. & Wilson, D. N. (2012) Lys34 of translation elongation factor EF-P is hydroxylated by YfcM, Nature chemical biology. 8, 695-7. 68. Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., Datsenko, K. A., Tomita, M., Wanner, B. L. & Mori, H. (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection, Molecular systems biology. 2, 2006 0008. 69. Navarre, W. W., Zou, S. B., Roy, H., Xie, J. L., Savchenko, A., Singer, A., Edvokimova, E., Prost, L. R., Kumar, R., Ibba, M. & Fang, F. C. (2010) PoxA, yjeK, and elongation factor P coordinately modulate virulence and drug resistance in Salmonella enterica, Molecular cell. 39, 209-21. 70. Bearson, S. M., Bearson, B. L., Brunelle, B. W., Sharma, V. K. & Lee, I. S. (2011) A mutation in the poxA gene of Salmonella enterica serovar Typhimurium alters protein production, elevates susceptibility to environmental challenges, and decreases swine colonization, Foodborne pathogens and disease. 8, 725-32. 71. Zou, S. B., Hersch, S. J., Roy, H., Wiggers, J. B., Leung, A. S., Buranyi, S., Xie, J. L., Dare, K., Ibba, M. & Navarre, W. W. (2012) Loss of elongation factor P disrupts bacterial outer membrane integrity, Journal of bacteriology. 194, 413-25. 72. Van Dyk, T. K., Smulski, D. R. & Chang, Y. Y. 
(1987) Pleiotropic effects of poxA regulatory mutations of Escherichia coli and Salmonella typhimurium, mutations conferring sulfometuron methyl and alpha-ketobutyrate hypersensitivity, Journal of bacteriology. 169, 4540-6. 73. Ong, S. E., Blagoev, B., Kratchmarova, I., Kristensen, D. B., Steen, H., Pandey, A. & Mann, M. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics, Molecular & cellular proteomics : MCP. 1, 376-86. 74. Ong, S. E. (2012) The expanding field of SILAC, Analytical and bioanalytical chemistry. 404, 967-76. 75. Sahu, T. K., Rao, A. R., Vasisht, S., Singh, N. & Singh, U. P. (2012) Computational approaches, databases and tools for in silico motif discovery, Interdisciplinary sciences, computational life sciences. 4, 239-55. 76. Cooper, C. A., Zhang, K., Andres, S. N., Fang, Y., Kaniuk, N. A., Hannemann, M., Brumell, J. H., Foster, L. J., Junop, M. S. & Coombes, B. K. (2010) Structural and biochemical characterization of SrcA, a multi-cargo type III secretion chaperone in Salmonella required for pathogenic association with a host, PLoS Pathog. 6, e1000751. 77. Schmieger, H. (1971) A method for detection of phage mutants with altered transducing ability, Mol Gen Genet. 110, 378-81. 78. Datsenko, K. A. & Wanner, B. L. (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products, Proceedings of the National Academy of Sciences of the United States of America. 97, 6640-5. 79. Guzman, L. M., Belin, D., Carson, M. J. & Beckwith, J. (1995) Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter, Journal of bacteriology. 177, 4121-30. 80. Shaner, N. C., Campbell, R. E., Steinbach, P. A., Giepmans, B. N., Palmer, A. E. & Tsien, R. Y. (2004) Improved monomeric red, orange and yellow fluorescent proteins derived from Discosoma sp. red fluorescent protein, Nature biotechnology. 22, 1567-72. 81. Hersch, S. 
J., Wang, M., Zou, S. B., Moon, K. M., Foster, L. J., Ibba, M. &

Navarre, W. W. (2013) Divergent Protein Motifs Direct Elongation Factor P-Mediated Translational Regulation in Salmonella enterica and Escherichia coli, mBio. 4. 82. Sharp, P. M., Cowe, E., Higgins, D. G., Shields, D. C., Wolfe, K. H. & Wright, F. (1988) Codon usage patterns in Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster and Homo sapiens; a review of the considerable within-species diversity, Nucleic acids research. 16, 8207-11. 83. Zou, S. B., Roy, H., Ibba, M. & Navarre, W. W. (2011) Elongation factor P mediates a novel post-transcriptional regulatory pathway critical for bacterial virulence, Virulence. 2, 147-51. 84. Shevchenko, A., Wilm, M., Vorm, O. & Mann, M. (1996) Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels, Anal Chem. 68, 850-8. 85. Chan, Q. W., Howes, C. G. & Foster, L. J. (2006) Quantitative comparison of caste differences in honeybee hemolymph, Molecular & cellular proteomics : MCP. 5, 2252-62. 86. de Godoy, L. M., Olsen, J. V., Cox, J., Nielsen, M. L., Hubner, N. C., Frohlich, F., Walther, T. C. & Mann, M. (2008) Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast, Nature. 455, 1251-4. 87. Cox, J. & Mann, M. (2011) Quantitative, high-resolution proteomics for data-driven systems biology, Annu Rev Biochem. 80, 273-99. 88. Dennis, G., Jr., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C. & Lempicki, R. A. (2003) DAVID: Database for Annotation, Visualization, and Integrated Discovery, Genome Biol. 4, P3. 89. Miller, J. (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 90. Navarre, W. W., Halsey, T. A., Walthers, D., Frye, J., McClelland, M., Potter, J. L., Kenney, L. J., Gunn, J. S., Fang, F. C. & Libby, S. J. 
(2005) Co-regulation of Salmonella enterica genes required for virulence and resistance to antimicrobial peptides by SlyA and PhoP/PhoQ, Mol Microbiol. 56, 492-508. 91. Peng, W. T., Banta, L. M., Charles, T. C. & Nester, E. W. (2001) The chvH locus of Agrobacterium encodes a homologue of an elongation factor involved in protein synthesis, Journal of bacteriology. 183, 36-45. 92. Doerfel, L. K., Wohlgemuth, I., Kothe, C., Peske, F., Urlaub, H. & Rodnina, M. V. (2013) EF-P is essential for rapid synthesis of proteins containing consecutive proline residues, Science. 339, 85-8. 93. Ude, S., Lassak, J., Starosta, A. L., Kraxenberger, T., Wilson, D. N. & Jung, K. (2013) Translation elongation factor EF-P alleviates ribosome stalling at polyproline stretches, Science. 339, 82-5. 94. Letzring, D. P., Dean, K. M. & Grayhack, E. J. (2010) Control of translation efficiency in yeast by codon-anticodon interactions, Rna. 16, 2516-28.

95. Dennis, G., Jr., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C. & Lempicki, R. A. (2003) DAVID: Database for Annotation, Visualization, and Integrated Discovery, Genome Biol. 4, P3. 96. Huang da, W., Sherman, B. T. & Lempicki, R. A. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists, Nucleic acids research. 37, 1-13. 97. Huang da, W., Sherman, B. T. & Lempicki, R. A. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nat Protoc. 4, 44-57. 98. Rees, D. M., Montgomery, M. G., Leslie, A. G. & Walker, J. E. (2012) Structural evidence of a new catalytic intermediate in the pathway of ATP hydrolysis by F1-ATPase from bovine heart mitochondria, Proceedings of the National Academy of Sciences of the United States of America. 109, 11139-43. 99. Niwa, H., Tsuchiya, D., Makyio, H., Yoshida, M. & Morikawa, K. (2002) Hexameric ring structure of the ATPase domain of the membrane-integrated metalloprotease FtsH from Thermus thermophilus HB8, Structure. 10, 1415-23. 100. Anishetty, S., Pennathur, G. & Anishetty, R. (2002) Tripeptide analysis of protein structures, BMC Structural Biology. 2, 9. 101. Seidelt, B., Innis, C. A., Wilson, D. N., Gartmann, M., Armache, J. P., Villa, E., Trabuco, L. G., Becker, T., Mielke, T., Schulten, K., Steitz, T. A. & Beckmann, R. (2009) Structural insight into nascent polypeptide chain-mediated translational stalling, Science. 326, 1412-5. 102. Peterson, J. H., Woolhead, C. A. & Bernstein, H. D. (2010) The conformation of a nascent polypeptide inside the ribosome tunnel affects protein targeting and protein folding, Mol Microbiol. 78, 203-17. 103. Simons, K. T., Bonneau, R., Ruczinski, I. & Baker, D. (1999) Ab initio protein structure prediction of CASP III targets using ROSETTA, Proteins. Suppl 3, 171-6. 104. Bonneau, R., Tsai, J., Ruczinski, I., Chivian, D., Rohl, C., Strauss, C. E. & Baker, D. 
(2001) Rosetta in CASP4: progress in ab initio protein structure prediction, Proteins. Suppl 5, 119-26. 105. Sunohara, T., Jojima, K., Yamamoto, Y., Inada, T. & Aiba, H. (2004) Nascent-peptide-mediated ribosome stalling at a stop codon induces mRNA cleavage resulting in nonstop mRNA that is recognized by tmRNA, Rna. 10, 378-86. 106. Choi, K. M., Atkins, J. F., Gesteland, R. F. & Brimacombe, R. (1998) Flexibility of the nascent polypeptide chain within the ribosome--contacts from the peptide N-terminus to a specific region of the 30S subunit, European journal of biochemistry / FEBS. 255, 409-13. 107. Wilson, D., Bhushan, S., Becker, T. & Beckmann, R. (2011) Nascent polypeptide chains within the ribosomal tunnel analyzed by cryo-EM in Ribosomes (Rodnina, M., Wintermeyer, W. & Green, R., eds) pp. 393-404, Springer Vienna.

108. Gong, F. & Yanofsky, C. (2002) Instruction of translating ribosome by nascent peptide, Science. 297, 1864-7. 109. Bhushan, S., Hoffmann, T., Seidelt, B., Frauenfeld, J., Mielke, T., Berninghausen, O., Wilson, D. N. & Beckmann, R. (2011) SecM-stalled ribosomes adopt an altered geometry at the peptidyl transferase center, PLoS Biol. 9, e1000581. 110. Sandler, P. & Weisblum, B. (1989) Erythromycin-induced ribosome stall in the ermA leader: a barricade to 5'-to-3' nucleolytic cleavage of the ermA transcript, Journal of bacteriology. 171, 6680-8. 111. Chiba, S. & Ito, K. (2012) Multisite ribosomal stalling: a unique mode of regulatory nascent chain action revealed for MifM, Molecular cell. 47, 863-72. 112. Woolstenhulme, C. J., Parajuli, S., Healey, D. W., Valverde, D. P., Petersen, E. N., Starosta, A. L., Guydosh, N. R., Johnson, W. E., Wilson, D. N. & Buskirk, A. R. (2013) Nascent peptides that block protein synthesis in bacteria, Proc Natl Acad Sci U S A.
