

ISBN 978-91-7911-464-0

Department of Biochemistry and Biophysics

Doctoral Thesis in Biochemistry at Stockholm University, Sweden 2021

Production and folding of proteins in the periplasm of Escherichia coli

Rageia Elfageih

Academic dissertation for the Degree of Doctor of Philosophy in Biochemistry at Stockholm University to be publicly defended on Friday 14 May 2021 at 10.00 online via Zoom. The public link is available at the department website.

Abstract

The Gram-negative bacterium E. coli is the most widely used host for the production of recombinant proteins. Disulfide bond containing recombinant proteins are usually produced in the periplasm of E. coli since in this compartment of the cell - in contrast to the cytoplasm - disulfide bond formation is promoted. To reach the periplasm, recombinant proteins have to be translocated across the cytoplasmic membrane by the protein translocation machinery. Obtaining sufficient yields of active recombinant protein in the periplasm is always challenging. The Ph.D. studies aimed at developing strategies to enhance recombinant protein production yields in the periplasm, to better understand what happens when a protein is produced in the periplasm, and to shed light on the protein folding process in the periplasm. It has been shown that evolving translation initiation regions (TIRs) can enhance the periplasmic production yields of a variety of proteins. Furthermore, it has been shown that the protein translocation machinery can adapt for enhanced periplasmic recombinant protein production. Force profile analysis was used to study co-translational folding of the periplasmic disulfide bond containing protein PhoA in the periplasm. It was shown that folding-induced forces can be transmitted via the nascent chain from the periplasm to the peptidyl transferase center in the ribosome and that PhoA appears to fold co-translationally via disulfide-stabilized folding intermediates. Finally, the S. pneumoniae neuraminidases NanA, NanB, and NanC were produced in E. coli and subsequently isolated. The activity of these neuraminidases was monitored at different pH values, and their oligomeric state was studied.

Keywords: Escherichia coli, periplasm, recombinant protein production, disulfide bond containing proteins, translation initiation region, protein translocation machinery, co-translational folding, neuraminidases.

Stockholm 2021 http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-191529

ISBN 978-91-7911-464-0 ISBN 978-91-7911-465-7

Department of Biochemistry and Biophysics

Stockholm University, 106 91 Stockholm


©Rageia Elfageih, Stockholm University 2021

ISBN print 978-91-7911-464-0 ISBN PDF 978-91-7911-465-7

The cover image is created using BioRender.com. Images and image modifications in comprehensive summary by Rageia Elfageih

Printed in Sweden by Universitetsservice US-AB, Stockholm 2021

“Anything you can imagine you can create.”
Oprah Winfrey

Dedication

To my parents and family
For their endless love, support, and encouragement

List of papers:

I. Elfageih R, Karyolaimos A, Kemp G, de Gier J-W, von Heijne G, Kudva R (2020). Cotranslational folding of alkaline phosphatase in the periplasm of Escherichia coli. Protein Science 29(10):2028–37.

II. Karyolaimos A*, Dolata KM*, Antelo-Varela M*, Mestre Borras A, Elfageih R, Sievers S, Becher D, Riedel K and de Gier J-W (2020). Escherichia coli can adapt its protein translocation machinery for enhanced periplasmic recombinant protein production. Frontiers in Bioengineering and Biotechnology 7:465. *Shared first author

III. Mirzadeh K*, Shilling PJ*, Elfageih R, Cumming AJ, Cui HL, Rennig M, Nørholm MHH, Daley DO (2020). Increased production of periplasmic proteins in Escherichia coli by directed evolution of the translation initiation region. Microbial Cell Factories 19(1):85. *Shared first author

IV. Elfageih R, de Gier J-W, Daniels R (2021). Characterization of the Streptococcus pneumoniae neuraminidases NanA, NanB, and NanC. Manuscript in preparation

Contents

Introduction
1. Gram-negative bacteria
1.1. The cytoplasm
1.2. The inner membrane
1.3. The periplasm
1.4. The peptidoglycan layer
1.5. The outer membrane
2. Gram-positive bacteria
3. Biogenesis of cell envelope proteins
3.1. The ribosome and the translation initiation region in mRNA
3.1.1. Initiation of translation
3.1.2. Elongation of translation
3.1.3. Termination of translation and ribosomal recycling
4. Biogenesis of bacterial cell envelope proteins
4.1. Protein targeting
4.2. Protein translocation
4.2.1. Regulation of secA expression
4.3. Formation of disulfide bonds in the periplasm
5. Neuraminidases
5.1. Streptococcus pneumoniae and influenza virus neuraminidases
5.1.2. Catalytic site characteristics, substrate specificity and product formation
6. Recombinant protein production in E. coli
6.1. Production strains and expression vectors
6.1.1. The T7 RNA polymerase/promoter
6.1.2. The tac promoter
6.1.3. The rhaBAD promoter
6.1.4. The araBAD promoter
7. Summaries of chapters I-IV
8. Future perspectives
Sammanfattning på svenska (Summary in Swedish)
الملخص (باللغة العربية) Summary in Arabic
Acknowledgments
References

Introduction

Bacteria are single-celled organisms that can have different shapes, e.g., spherical or rod-shaped (1). On average, bacteria are 1 – 2 µm in diameter/length, and their mass (dry weight) can range from 1 – 10 pg (1). The bacterial chromosome is localized in the central region of the cell. Bacteria lack a membrane-enclosed nucleus; the region within the bacterial cell containing its genetic information is often referred to as the nucleoid (2). In addition, bacteria can contain plasmids, which are independently replicating pieces of DNA that carry additional genetic information (3). There are Gram-negative and Gram-positive bacteria (4) (Figure 1). Gram-negative bacteria have two membranes, i.e., the inner or cytoplasmic membrane and the outer membrane. Between the two membranes is the periplasm, which contains a thin layer of peptidoglycan (5).

Figure 1. Schematic representation of the cell envelope of a Gram-negative bacterium and that of a Gram-positive bacterium. a) The basic setup (from the outside to the inside) of the Gram-negative bacterial cell envelope consists of the outer membrane, the periplasm and the inner/cytoplasmic membrane. In the periplasm, there is a thin layer of peptidoglycan that is anchored to the outer membrane via the lipoprotein Lpp (for the sake of simplicity not specified in the cartoon). The outer membrane contains lipopolysaccharide (LPS) in the outer leaflet and lipids in the inner leaflet. The inner/cytoplasmic membrane consists of a lipid bilayer. Both membrane systems contain (integral) membrane proteins, including lipoproteins. b) In Gram-positive bacteria, the basic setup of the cell envelope (from the outside to the inside) consists of a thick layer of peptidoglycan, which contains an anionic polymer called teichoic acid and to which surface proteins are attached, and the cytoplasmic membrane. The cytoplasmic membrane consists of a lipid bilayer which also contains integral membrane proteins, including lipoproteins, and other membrane-bound components. Gram-positive bacteria lack an outer membrane and they are usually surrounded by a polysaccharide capsule (not shown).

The outer membrane protects the Gram-negative bacterium against the external, often hostile environment. The inner or cytoplasmic membrane encloses the cytoplasm. Gram-positive bacteria have only a cytoplasmic membrane, which is surrounded by a thick layer of peptidoglycan (5, 6). An example of a Gram-negative bacterium is Escherichia coli (E. coli) and an example of a Gram-positive bacterium is Streptococcus pneumoniae (S. pneumoniae). I worked with both these bacteria during the Ph.D. studies.

E. coli is a rod-shaped, facultatively anaerobic enteric bacterium (5). It is commonly found in the gastrointestinal tract of warm-blooded organisms, including humans (7). E. coli is a very well-studied Gram-negative bacterium and it has been used extensively as a model organism in biological studies (8, 9). It is also the ‘workhorse’ of molecular biology and biotechnology (10). In the Ph.D. studies, I have used E. coli for the production of recombinant proteins (chapters II, III and IV), and to study the folding of proteins in the periplasm (chapter I).

S. pneumoniae is a bacterium that is part of the normal upper respiratory tract flora in humans (11). It can become pathogenic under certain conditions, in particular when the host immune system is compromised (11–13). By colonizing the air sacs of the lungs it can stimulate an inflammatory response, which can cause plasma, blood and white blood cells to fill the alveoli of the lungs (14). This condition is better known as pneumonia (15). S. pneumoniae can also cause meningitis, sepsis, otitis media and bacteraemia (16). S. pneumoniae is a lancet-shaped, facultative anaerobe that mainly occurs in pairs or short chains. It has many different virulence factors (Figure 2) (12). I have been studying one group of its virulence factors, the so-called neuraminidases (chapter IV).

The neuraminidases facilitate bacterial adhesion to and invasion of the tracheal epithelial cells via cleavage of sialic acid from host glycoproteins (17–19). The free sialic acid is subsequently imported into S. pneumoniae so that it can be used as a carbon and energy source (20). It has been shown that the attachment of S. pneumoniae to the epithelial cells can be enhanced by an influenza virus infection (21). Influenza virus neuraminidases cleave sialic acid from glycoconjugates in human lung tissue (22). Removal of sialic acid leads to disruption of the epithelial layer and exposure of specific receptors that may promote invasion by S. pneumoniae (17, 23). More specifically, I have contributed to a study aiming to characterize the properties of S. pneumoniae and influenza virus neuraminidases.


Figure 2. Schematic representation of the bacterium S. pneumoniae and its virulence factors. S. pneumoniae has a cytoplasmic membrane which encloses the cytoplasm. Furthermore, it has a thick layer of peptidoglycan with lipoteichoic acid and it has a polysaccharide capsule. The major pneumococcal virulence factors are pneumolysin, which is released via autolysis export (24); the neuraminidases (A, B and C), where A is anchored to the cell surface and both B and C are secreted into the extracellular environment; the cell-surface proteins PspA and PspC; and the autolysin LytA (25, 26). The metal ion-binding protein PsaA is the pneumococcal surface adhesin A (12, 27, 28). PiaA is required for iron acquisition (29, 30), PiuA is involved in iron uptake (29, 30) and PitA is an iron transporter (29, 31). The IgA protease is an immunoglobulin A protease (32, 33). S. pneumoniae makes bacteriocins and has a polysaccharide capsule and pili (12, 34). All these factors play roles in respiratory colonization and disease (12).

In the following sections, I will give a more detailed overview of the compartments that Gram-negative and Gram-positive bacteria consist of. I will also describe how protein biogenesis occurs in bacteria, with a focus on the biogenesis of secretory proteins in E. coli. Notably, all cell envelope and extracellular proteins, both in Gram-negative and Gram-positive bacteria, are synthesized in the cytoplasm and have to be targeted to the proper location (35).

1. Gram-negative bacteria

1.1. The cytoplasm

The cytoplasm of both Gram-negative and Gram-positive bacteria is a gel-like environment and it is roughly organized into three ‘zones’ (36). In brief, and without claiming completeness, there is the ‘nucleoid zone’, the ‘structural zone’, and the ‘metabolic zone’. The ‘nucleoid zone’ comprises the chromosome, which contains the genetic information, and many proteins/protein complexes. The ‘structural zone’ comprises cytoskeleton proteins, and the space between the ‘nucleoid zone’ and the ‘structural zone’ is the ‘metabolic zone’ (36, 37). In contrast to the eukaryotic cytoplasm, the bacterial cytoplasm lacks organelles and its organization is less complex. In bacteria, all proteins are synthesized in the cytoplasm. Many proteins have the cell envelope or the extracellular milieu as their final destination and therefore have to be guided there (5, 38).

1.2. The inner membrane

The inner membrane (a.k.a. the cytoplasmic membrane) is the innermost membrane of Gram-negative bacteria. It surrounds the cytoplasm, and it consists of lipids and proteins (Figure 1a). More specifically, the inner membrane consists of phospholipids like phosphatidylethanolamine, phosphatidylglycerol and cardiolipin (155). The inner membrane is a selective barrier that controls the passage of e.g., ions and many other molecules in and out of the cell, and it does this with the help of channels and transporters that reside in the membrane (40). Inner membrane proteins are either integral or peripheral (41). Integral membrane proteins are either embedded in the membrane or covalently linked to a lipid that is part of the inner membrane. Membrane-embedded integral inner membrane proteins contain one or more hydrophobic membrane-spanning stretches (α-helices). The hydrophobic stretches of multispanning membrane proteins are connected by loops (42–44).

1.3. The periplasm

The compartment that is localised between the outer and the inner membrane of the cell envelope is called the periplasm. It is a gel-like matrix because it is densely packed with proteins, and it contains a thin layer of peptidoglycan (5). The peptidoglycan layer is attached to the outer membrane by the lipoprotein Lpp (5) (Figure 1a). The periplasm contains e.g., chaperones, proteases, nucleases, substrate binding proteins, and proteins involved in the biogenesis of the cell envelope (45, 46). The periplasm does not contain any ATP and, in contrast to the cytoplasm, it is an oxidizing environment (47).

1.4. The peptidoglycan layer

The peptidoglycan layer is a unique and essential structural component of the bacterial cell envelope (48, 49). It is a net-like polymer that serves as a scaffold to which other polymers and proteins are attached. The peptidoglycan layer is a rigid exoskeleton, but it is porous and flexible enough to allow passage of e.g., nutrients, chemical signals and virulence factors; protein structures, like TolC, can also ‘go through’ the peptidoglycan layer (48–52). The peptidoglycan layer has diverse functions: it protects the cell from e.g., bursting due to osmotic instability (5, 49), and it maintains the integrity, morphology, and shape of the cells (53). It also plays a key role in cell division (54–56).

1.5. The outer membrane

The outer membrane is a selective barrier that protects the Gram-negative bacterium from harmful and toxic compounds such as antibiotics (57). The outer membrane is an asymmetric bilayer: the outer leaflet consists of lipopolysaccharide (LPS), and the inner leaflet is composed of phospholipids (5). The outer membrane also contains integral membrane proteins and peripheral membrane proteins (58).

2. Gram-positive bacteria

The cytoplasmic membrane of Gram-positive bacteria like S. pneumoniae is composed of a phospholipid bilayer similar to that of the inner/cytoplasmic membrane of Gram-negative bacteria (59). The composition of both the head groups and the fatty acyl chains can vary in response to environmental stresses, such as a low pH or osmotic stress (59).

S. pneumoniae and many other Gram-positive bacteria have a thick and multilayered peptidoglycan layer that surrounds the cytoplasmic membrane (34). Anionic polymers called teichoic acid thread through the glycan strands (5). Teichoic acid is composed of glycerol phosphate and glucosyl phosphate repeats and it is covalently linked to the peptidoglycan (5). It can also be linked to lipid anchor components in the cytoplasmic membrane (15, 59, 60). Gram-positive bacteria usually have a capsular polysaccharide that is anchored to the outer cell surface (59, 61). The capsular polysaccharide is a protective layer against harmful substances and it plays a vital role in pathogenesis by promoting adhesion of S. pneumoniae to, and colonization of, the nasopharyngeal cavity (61). The structure of the capsular polysaccharide is diverse because of differences in sugar composition and linkages (61). This is nicely illustrated by the 98 known capsular polysaccharide-based serotypes of S. pneumoniae (62). S. pneumoniae has a variety of proteins with diverse functions that decorate the surface of the cells. Generally, the surface proteins are either covalently attached to the peptidoglycan layer or non-covalently attached to the cell surface (63).

3. Biogenesis of cell envelope proteins

A gene is transcribed into messenger RNA (mRNA) during the transcription process, and the mRNA is translated into a protein during the translation process (64) (Figure 3). In bacteria, all proteins are synthesized in the cytoplasm. In the case of non-cytoplasmic proteins, there are different pathways that can guide these proteins to their final destination (Figure 3). All these proteins carry a ‘zip code’ at their N-terminus, i.e., a signal anchor sequence (inner/cytoplasmic membrane proteins) or a signal peptide (secretory proteins) (35, 38, 65–68).

Figure 3. From gene to protein in a Gram-negative bacterium. A gene (DNA) is transcribed into messenger RNA (mRNA) during the transcription process, and the mRNA is translated into a protein by the ribosome. Many bacterial mRNAs are polycistronic, i.e., they encode more than one protein. All proteins are synthesized in the cytoplasm. In the case of non-cytoplasmic proteins, there are different pathways that can guide these proteins to their final destination.

3.1. The ribosome and the translation initiation region in mRNA

The ribosome is a highly conserved nucleoprotein complex that consists of proteins and RNA molecules (69). In bacteria, the ribosome (70S) consists of two subunits, the small (30S) subunit and the large (50S) subunit (69). The small subunit consists of the 16S ribosomal RNA and 21 proteins, while the large subunit consists of two ribosomal RNAs and 33 proteins (69, 70). The mRNA is decoded at the decoding centre of the ribosome. The polypeptide chain grows by the formation of peptide bonds at the peptidyl transferase centre of the ribosome (69). The ribosome contains three sites that are key for protein synthesis, i.e., the aminoacyl (A), peptidyl (P) and exit (E) sites. The A-site receives the aminoacyl-tRNA (aa-tRNA), the P-site is the site where the peptide bonds are formed between incoming amino acids and the elongating polypeptide chain, and at the E-site the uncharged tRNAs are released. The small subunit of the ribosome interacts with the mRNA and the aa-tRNA anticodon stem-loop. The large subunit interacts with the aa-tRNA acceptor arms and it catalyses the formation of peptide bonds at the peptidyl transferase centre (71). The growing polypeptide chain leaves the ribosome through the exit tunnel of the large subunit (69).

mRNAs possess characteristic features for the initiation of protein synthesis in the translation initiation region (TIR) (72) (Chapter III). The TIR is the region of an mRNA molecule that spans approximately from position −20 to position +15 relative to the start codon (63). The efficiency of translation initiation is affected by numerous factors: the start codon (mostly AUG, i.e., ATG in the DNA, but other start codons can also be used); a consensus sequence approximately 8–10 nucleotides upstream of the start codon, which is often referred to as the Shine–Dalgarno (SD) sequence or the ribosome binding site (RBS), and which pairs with a complementary sequence at the 3´ end of the 16S rRNA, the anti-Shine–Dalgarno (aSD) sequence; the stability of the mRNA fold near the start codon; and A/U-rich elements in the mRNA that are recognized by the S1 protein of the 30S subunit (73). All these factors contribute to the efficiency of mRNA recruitment to the ribosome; thus, the efficiency of translation depends to a great extent on the overall structure of the TIR (as well as on the other characteristics of an mRNA molecule) (74, 75). Protein translation is a dynamic process that comprises four steps: initiation (see also above), elongation, termination and ribosome recycling (76).
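As a rough illustration of the TIR definition given above, the following Python sketch extracts the window from position −20 to +15 around a start codon and measures the spacer between an SD-like motif and the start codon. The motif GGAGG, the example sequence, and the function names extract_tir and sd_spacing are assumptions made for illustration only; they are not taken from Chapter III.

```python
# Illustrative sketch. The -20..+15 window follows the text; the SD-like
# motif "GGAGG" and the example mRNA are assumptions, not data from the thesis.

def extract_tir(mrna: str, start_index: int, upstream: int = 20, downstream: int = 15) -> str:
    """Return the TIR window around the start codon.

    start_index is the 0-based index of the first nucleotide of the start
    codon (position +1), so the window covers positions -upstream..+downstream.
    """
    begin = max(0, start_index - upstream)
    end = min(len(mrna), start_index + downstream)
    return mrna[begin:end]


def sd_spacing(mrna: str, start_index: int, motif: str = "GGAGG", window: int = 20):
    """Return the number of nucleotides between the SD-like motif and the
    start codon, or None if the motif is absent from the upstream window."""
    begin = max(0, start_index - window)
    upstream = mrna[begin:start_index]
    pos = upstream.rfind(motif)  # motif occurrence closest to the start codon
    if pos == -1:
        return None
    return len(upstream) - (pos + len(motif))


# Hypothetical mRNA: 5' leader containing an SD-like motif, then an ORF.
leader = "UUAACUUUAAGAA" + "GGAGG" + "AUAUACAU"
orf = "AUGAAACAAAGCACUAUU"
example_mrna = leader + orf
start = len(leader)  # index of the A in the AUG start codon

tir = extract_tir(example_mrna, start)
spacer = sd_spacing(example_mrna, start)
```

In this hypothetical example the spacer between the motif and the start codon is 8 nucleotides, which lies within the 8–10 nucleotide range mentioned above. A real analysis would also have to consider mRNA folding, which this sketch ignores.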

3.1.1. Initiation of translation

In bacteria, the initiation step of the protein synthesis process depends on the formation of the translation initiation complex (77, 78). During the formation of the translation initiation complex, the 30S subunit of the ribosome binds to the mRNA via interactions between the SD sequence/RBS in the mRNA molecule and the aSD sequence at the 3´ end of the 16S rRNA (70, 79). The efficiency of the interaction depends on the sequence within and around the RBS. The initiation of translation is promoted by initiation factors IF1, IF2 and IF3 (80). The initiation factors promote the accommodation of the start codon (usually AUG) at the P-site of the ribosome and contribute to the fidelity of translation initiation (78). At the end of the initiation step, the small subunit and the large subunit form a complex, the P-site is loaded with the aminoacylated initiator tRNA (fMet-tRNAfMet), and elongation can start (Figure 4).


Figure 4. Formation of the translation initiation complex. The formation of the translation initiation complex occurs in three steps. In the first step, the small ribosomal (30S) subunit binds IF-1 and IF-3 and subsequently the mRNA. The Shine-Dalgarno sequence of the mRNA interacts with the anti-Shine-Dalgarno sequence of the 16S rRNA. In the second step, IF-2-GTP binds the 30S subunit and recruits fMet-tRNAfMet to the peptidyl site (P-site). In the third step, the 50S subunit associates, IF-2 hydrolyses GTP, and IF-1, IF-2 and IF-3 dissociate. The translation initiation complex is now ready to enter the elongation phase. In the figure, ‘A’ represents the aminoacyl site, ‘P’ represents the peptidyl site and ‘E’ represents the exit site of the ribosome.

3.1.2. Elongation of translation

After the 30S and 50S subunits have joined and the initiator tRNA is bound at the P-site, the empty A-site is ready to receive the aminoacylated tRNA specified by the second codon in the mRNA (79). It is recruited to the A-site of the ribosome in a complex with elongation factor Tu (EF-Tu) and GTP (79, 81, 82). Upon hydrolysis of GTP, EF-Tu mediates the transfer of the methionine from the initiator tRNA to the α-amino group of the second aminoacyl-tRNA, and EF-Tu and GDP are released (79, 81, 82). This reaction occurs at the peptidyl transferase centre, is catalysed by the 23S rRNA, and results in the formation of dipeptidyl-tRNA in the A-site and deacylated tRNA in the P-site (81). In addition, elongation factor P (EF-P) is thought to potentiate the first peptide bond formation at the peptidyl transferase centre (83). The deacylated tRNA at the P-site moves to the E-site to eventually leave the ribosome. The peptidyl-tRNA is moved from the A-site to the P-site and the mRNA is moved with respect to the small subunit of the ribosome. The translocation reaction is promoted by elongation factor G (EF-G), and hydrolysis of GTP provides the energy required to complete the translocation reaction (76, 79). This reaction vacates the A-site, and the ribosome is ready for the next round of elongation (70, 71) (Figure 5).
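The elongation cycle described above can be summarized as a simple bookkeeping exercise over the E-, P- and A-sites. The Python sketch below is a deliberately minimal toy model, not a kinetic or structural description: it only tracks which codon's tRNA occupies each site and how long the nascent chain is. The function name elongate and the tuple layout are assumptions introduced for illustration.

```python
# Toy model of the elongation cycle: decoding -> peptidyl transfer -> translocation.
# Each trace entry is (step, E-site codon, P-site codon, A-site codon, chain length).
# A site holding None is empty. The names and layout are illustrative assumptions.

def elongate(codons):
    """Trace E/P/A site occupancy while the given sense codons are translated."""
    e_site, p_site, a_site = None, codons[0], None  # initiator tRNA sits in the P-site
    chain_length = 1
    trace = [("initiation", e_site, p_site, a_site, chain_length)]
    for codon in codons[1:]:
        # Decoding: the aa-tRNA arrives at the A-site in complex with EF-Tu and GTP.
        a_site = codon
        trace.append(("decoding", e_site, p_site, a_site, chain_length))
        # Peptidyl transfer: the chain is now attached to the A-site tRNA,
        # and the P-site tRNA is deacylated.
        chain_length += 1
        trace.append(("peptidyl transfer", e_site, p_site, a_site, chain_length))
        # Translocation (EF-G, GTP): deacylated tRNA moves P -> E,
        # peptidyl-tRNA moves A -> P, and the A-site is vacated.
        e_site, p_site, a_site = p_site, a_site, None
        trace.append(("translocation", e_site, p_site, a_site, chain_length))
    return trace


example_trace = elongate(["AUG", "AAA", "CUG"])
```

After each translocation step the A-site is empty again, which is the condition for the next round of elongation mentioned in the text.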


Figure 5. Protein synthesis in bacteria. -1- Initiation step (orange arrows). This step begins with the recruitment of the ribosomal 30S and 50S subunits from the cytoplasmic pool by initiation factors (IFs). The 30S subunit binds to the mRNA and fMet-tRNAfMet and assembles with the large (50S) subunit to form the translation initiation complex. -2- Elongation step (green arrows). This step is initiated by the recruitment of elongation factor P (EF-P) and the second charged tRNA complexed with elongation factor Tu (EF-Tu) to the A-site; the first peptide bond is then formed at the peptidyl transferase centre and this is coupled to the hydrolysis of GTP. EF-P and EF-Tu are subsequently released, while elongation factor G associates to empty the A-site for the next round of elongation. The translocation reaction is mediated by the hydrolysis of GTP. -3- Termination and -4- ribosome recycling steps (red arrows). Termination of translation begins with the recruitment of release factor 1 or 2 (RF-1 or 2), depending on the stop codon used. The polypeptide chain is released and RF-1 or 2 dissociates with the aid of RF-3, powered by hydrolysis of GTP. The ribosome is disassembled into its subunits by the ribosome recycling factor (RRF) and EF-G, at the expense of GTP hydrolysis, and the two subunits are returned to their cytoplasmic pools. All the protein shapes in the figure resemble their three-dimensional structures. This figure is adapted with permission from (76).

3.1.3. Termination of translation and ribosomal recycling

Translation is terminated when the ribosome reaches a stop codon in the mRNA (UAA, UAG or UGA) (79). A stop codon is recognized by release factor 1 or 2 (RF1/RF2) (79, 84). RF1 promotes termination at the stop codons UAA and UAG, while RF2 promotes termination at UAA and UGA. Hydrolysis and release of the polypeptide chain are triggered by RF1/RF2 (85). Upon hydrolysis of the peptidyl-tRNA bond, the dissociation of RF1/RF2 from the A-site is driven by hydrolysis of GTP (79, 86). This reaction is accelerated by release factor 3 (RF3), which is also a GTPase (87, 88). After polypeptide chain release, the ribosome needs to be disassembled into its subunits so that they can be used for the synthesis of other proteins. The dissociation of the ribosomal subunits requires the ribosome recycling factor (RRF) along with EF-G (89). IF3 plays a role in ribosomal recycling by replacing the deacylated tRNA on the 30S subunit. It also allows the mRNA to detach or, in case the mRNA is polycistronic, to form a new SD–aSD interaction with a downstream RBS (90) (Figure 5).
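The stop codon specificity of RF1 and RF2 stated above can be written down as a small lookup table. The Python sketch below scans an open reading frame codon by codon and reports the first in-frame stop codon together with the release factor(s) that recognize it. The function name find_termination and the example sequences are hypothetical and only serve as an illustration.

```python
# Stop codon recognition as described in the text:
# RF1 recognizes UAA and UAG, RF2 recognizes UAA and UGA.
STOP_CODON_TO_RELEASE_FACTORS = {
    "UAA": ("RF1", "RF2"),
    "UAG": ("RF1",),
    "UGA": ("RF2",),
}


def find_termination(mrna: str, start_index: int = 0):
    """Return (index, stop_codon, release_factors) for the first in-frame stop
    codon at or after start_index, or None if no in-frame stop codon is found."""
    for i in range(start_index, len(mrna) - 2, 3):  # step through the reading frame
        codon = mrna[i:i + 3]
        if codon in STOP_CODON_TO_RELEASE_FACTORS:
            return i, codon, STOP_CODON_TO_RELEASE_FACTORS[codon]
    return None


# Hypothetical ORF: the out-of-frame "UGA" at index 1 is skipped because only
# codons in the reading frame set by the start codon are inspected.
hit = find_termination("AUGAAAUGA")
```

Reading in frame matters here: the same nucleotide triplet only acts as a stop signal when it is presented in the A-site as a codon.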

4. Biogenesis of bacterial cell envelope proteins

All bacterial cell envelope proteins and proteins secreted into the extracellular milieu are synthesized in the cytoplasm. How do these proteins end up at their final location? For the sake of clarity, it may be good to mention that in this section I will mainly focus on the cell envelope of the Gram-negative bacterium, and in particular that of E. coli.

In E. coli, most of these proteins are either inserted into the cytoplasmic membrane (cytoplasmic membrane proteins) or translocated across it (secretory proteins) via a hetero-oligomeric protein complex, the so-called Sec-translocon (91, 92). The core of the Sec-translocon comprises three integral membrane proteins, i.e., SecY, SecE and SecG, which assemble into a trimer that makes up a protein-conducting channel. SecA is a peripheral subunit of the Sec-translocon (93). It is an ATPase and mediates targeting and (stepwise) translocation of secretory proteins (94). SecDF-YajC and YidC are auxiliary translocon components that can interact with the SecYEG core complex and facilitate protein translocation across / insertion into the cytoplasmic membrane (95, 96). It has been shown that YidC by itself can also function as an insertase for a subset of small integral inner membrane proteins (96, 97).

Inner membrane proteins have at the N-terminus a signal anchor sequence that guides them to the inner membrane, and secretory proteins have a cleavable signal peptide that guides them to the inner membrane (only a handful of inner membrane proteins in E. coli also have a cleavable signal peptide (see below)) (98–100). A cleavable signal peptide is 15–40 amino acids long and its structure is tripartite. At the N-terminus it has positively charged amino acids, the helical hydrophobic core is made up of 8–12 residues, and the C-terminus of a signal peptide is slightly polar (98, 99). The C-terminal domain contains the cleavage site that is recognized by signal peptidase (98, 101). The signal peptide is cleaved upon translocation (102) (Figure 6). The Sec-translocon translocates proteins in a mostly unfolded state (35, 38, 103). There is also the TAT-translocon, which can translocate folded proteins (103–105). Further discussion of the TAT-translocon is beyond the scope of this thesis.
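To make the tripartite organization of a cleavable signal peptide more concrete, the following Python sketch splits a peptide into an N-terminal region, a hydrophobic core and a C-terminal region, and checks the length and charge features listed above. This is a crude heuristic under stated assumptions: the set of residues counted as hydrophobic, the use of the longest uninterrupted hydrophobic run as the core, and the example peptide are all choices made for illustration. It is not a validated signal peptide predictor.

```python
# Crude heuristic for the tripartite signal peptide layout described in the text.
# ASSUMPTIONS: which residues count as hydrophobic, and that the core is the
# longest uninterrupted hydrophobic run. The example peptide is hypothetical.
HYDROPHOBIC = set("AVLIMFW")


def split_signal_peptide(peptide: str):
    """Split a peptide into (n_region, h_region, c_region) around its longest
    uninterrupted run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start = None
    for i, residue in enumerate(peptide + "-"):  # sentinel closes a trailing run
        if residue in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    h_end = best_start + best_len
    return peptide[:best_start], peptide[best_start:h_end], peptide[h_end:]


def looks_like_signal_peptide(peptide: str) -> bool:
    """Check the features named in the text: 15-40 residues in total, a net
    positive charge in the N-region, and a hydrophobic core of 8-12 residues."""
    n_region, h_region, _c_region = split_signal_peptide(peptide)
    n_charge = sum(r in "KR" for r in n_region) - sum(r in "DE" for r in n_region)
    return 15 <= len(peptide) <= 40 and n_charge > 0 and 8 <= len(h_region) <= 12


# Hypothetical signal peptide: positive N-region, 10-residue core, polar C-region.
regions = split_signal_peptide("MKKTALAVLLALLAGSAQA")
```

For the hypothetical peptide above, the three regions are MKKT, ALAVLLALLA and GSAQA. Real signal peptides often have cores interrupted by single polar residues, so a practical tool would need a more tolerant definition of the core.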

4.1. Protein targeting

Proteins can be targeted co-translationally or post-translationally to the inner membrane (103). Inner membrane proteins are most often targeted in a co-translational fashion (68, 96). The hydrophobicity of the signal peptide of secretory proteins appears to be a major determinant for the mode of targeting used (106). There are also secretory proteins that can be targeted both co-translationally and post-translationally (106, 107).

Co-translational targeting is mediated by the signal recognition particle (SRP), which is a ribonucleoprotein, and its receptor, FtsY (66) (Figure 6). The SRP identifies highly hydrophobic signal peptides of secretory proteins or the N-terminal transmembrane helix (signal anchor sequence) of an inner membrane protein as it emerges from the ribosomal exit tunnel (66). The ribosome-nascent chain complex is targeted to the inner membrane via the interaction of the SRP with FtsY (108). FtsY can interact with inner membrane lipids and the Sec-translocon, and it delivers the protein to the Sec-translocon (109, 110). SRP-mediated protein targeting is driven by the hydrolysis of GTP (111).

Post-translational protein targeting can be mediated by cytoplasmic chaperones; e.g., SecB and SecA can be involved (38). Secretory polypeptide chains emerging from the ribosome interact with trigger factor, which is a chaperone (112), or with ribosome-bound SecA (113, 114). Secretory proteins leave the ribosome and bind to SecB, a cytoplasmic chaperone that has holdase activity (115). SecB keeps the precursor protein in an unfolded and soluble state, and it prevents misfolding and/or aggregation of the precursor protein (116). The (mostly) unfolded precursor is delivered to the Sec-translocon and subsequently translocated across the inner membrane (Figure 6).
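The idea that the hydrophobicity of a signal peptide influences the targeting mode can be illustrated with a simple hydropathy calculation. The Python sketch below computes the mean Kyte-Doolittle hydropathy of a hydrophobic core and compares it with a cut-off. Both the cut-off value of 2.5 and the binary outcome are assumptions made purely for illustration; the text above does not give a numerical threshold, and some proteins can use both targeting modes.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Return the average Kyte-Doolittle hydropathy of a peptide segment."""
    if not segment:
        raise ValueError("segment must contain at least one residue")
    return sum(KYTE_DOOLITTLE[residue] for residue in segment) / len(segment)


def suggest_targeting(core: str, threshold: float = 2.5) -> str:
    """Suggest a targeting mode from core hydrophobicity.

    ASSUMPTION: the threshold is a hypothetical value chosen for this
    illustration, not a measured cut-off from the cited literature.
    """
    if mean_hydropathy(core) >= threshold:
        return "co-translational (SRP)"
    return "post-translational (e.g., SecB/SecA)"


# Hypothetical cores: one strongly hydrophobic, one only weakly hydrophobic.
strong_core = "LLILLVLLIL"
weak_core = "ASAGATAGAS"
```

With these hypothetical inputs, the strongly hydrophobic core is assigned to the SRP route and the weakly hydrophobic core to the post-translational route. The sketch only captures the qualitative trend described in the text.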

Figure 6. Protein targeting to and translocation across the inner membrane of E. coli. The envelope proteins are exported from the cytoplasm to the periplasm in roughly three stages: sorting and targeting, translocation and release, and maturation of the proteins. In the sorting and targeting stage, the precursor protein is targeted co-translationally or post-translationally to the Sec-translocon (SecYEG). Co-translational targeting is mediated by the signal recognition particle (SRP) and its receptor FtsY, while post-translational targeting is mediated either in a chaperone-dependent manner, where e.g., trigger factor, SecB, SecA or SecB-SecA are involved, or in a chaperone-independent manner. In the translocation/release stage, pre-proteins can be translocated via the SecYEG translocon co- or post-translationally. These processes are ‘powered’ by the ATPase motor protein SecA via ATP hydrolysis as well as by the proton motive force (PMF). YidC and SecDF-YajC are auxiliary Sec-translocon components that assist the biogenesis of membrane proteins and enhance translocation proficiency, respectively. During the maturation stage, the signal peptide is cleaved off at (probably) a late stage of translocation by signal peptidase I (LepB) and the processed protein is released into the periplasm.

4.2. Protein translocation

In bacteria, there are two main pathways that mediate the translocation of the secretory proteins, i.e., the Sec-pathway and the twin-arginine pathway (TAT-pathway). The Sec-pathway translocates (mostly) unfolded proteins, while the TAT-pathway mediates the translocation of folded proteins. As mentioned before, the TAT-pathway is not dealt with here in any further detail (Figure 6)(38).

Co-translational protein targeting is characterized by the coupling of protein synthesis and protein insertion into or protein translocation across the cytoplasmic membrane (117). During the translation of inner membrane proteins, the hydrophobic transmembrane helices go directly from the ribosomal exit tunnel to the Sec-translocon channel and they are then laterally inserted into the membrane via the lateral gate of the Sec-translocon (118). YidC can form a complex with the Sec-translocon and it is localized adjacent to its lateral gate. It appears to be involved in the transfer of transmembrane segments into the lipid bilayer as well as the folding of inner membrane proteins (118). It can also function as an insertase of small inner membrane proteins (119). SecA is recruited during co-translational protein translocation to the Sec-translocon when the hydrophilic loops of the inner membrane proteins need to be translocated (120). SecD, SecF and YajC are auxiliary Sec-translocon components; they form a complex that can interact with the Sec-translocon, and this SecDF-YajC complex enhances the proficiency of translocation (38).

Post-translational protein translocation is characterized by the (nearly) complete synthesis of the polypeptide in the cytoplasm before delivering it to the Sec-translocon. During post-translational protein translocation, SecA drives protein translocation by repeated cycles of ATP hydrolysis and with the help of the proton motive force (121). The signal sequence of a secretory protein is cleaved off by signal peptidase I (LepB) and the protein is subsequently released into the periplasm (102).

4.2.1. Regulation of secA expression

As mentioned before, SecA is the motor protein driving the translocation of secretory proteins and sizeable periplasmic domains of inner membrane proteins across the cytoplasmic membrane in a stepwise manner through cycles of translocon insertion and de-insertion coupled with the hydrolysis of ATP (122). The accumulation levels of SecA are modulated according to the ability of the cell to translocate proteins via the Sec-translocon; a decreased ability to translocate proteins leads to upregulation of the synthesis of SecA (123–125). How does this work? The gene encoding SecA is in an operon with the gene encoding SecM, i.e., secM secA. SecM is a periplasmic monitor that regulates the expression of secA in response to the Sec-translocon capacity of the cell (123). Secreted SecM has no function in the periplasm and it is rapidly degraded by the periplasmic tail-specific protease (126).

The gene encoding SecM is located upstream of the gene encoding SecA and they are, as mentioned before, both part of the same operon. The region between the two genes can form a stem-loop structure that masks the SD sequence of secA (123). Translation of SecM is subjected to elongation arrest because SecM contains the sequence FX4WIX4GIRAGP (residues 150 to 166), where X is any amino acid and X4 denotes four consecutive residues (127). This sequence, which is a so-called arrest peptide, interacts with the ribosomal exit tunnel and the ribosome is stalled at P166, a position close to the C terminus of the nascent peptide, which transiently arrests translation (35, 123). The signal sequence of the SecM nascent polypeptide chain is recognized by the SRP and the protein is targeted in an SRP-dependent fashion to the Sec-translocon (35, 123). The translocation of SecM generates a pulling force that ‘travels’ along the nascent chain to the PTC to resume the translation process of the next gene, secA (122).

During the time window of ribosome stalling, the stem-loop structure is disrupted and the SD sequence is exposed to allow secA translation by other ribosome(s). When the Sec-translocon capacity is limiting, the translation arrest of secM (and hence the exposure of the SD sequence) is prolonged, leading to higher frequencies of secA translation. Thus, if there are Sec-translocon capacity problems in E. coli, it will respond by synthesizing more of the motor protein SecA.
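The arrest motif quoted above lends itself to a simple pattern search. The snippet below is a minimal sketch, assuming that the motif is read literally as F, four arbitrary residues, WI, four arbitrary residues, GIRAGP; the function name and the toy sequence are hypothetical and the toy sequence is not the SecM sequence.

```python
import re

# Arrest motif as quoted in the text: F-X4-W-I-X4-G-I-R-A-G-P (X = any residue).
ARREST_MOTIF = re.compile(r"F.{4}WI.{4}GIRAGP")


def find_arrest_peptides(protein: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of every non-overlapping motif match."""
    return [(m.start() + 1, m.end()) for m in ARREST_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence, invented for illustration only:
# 13 leading residues, one 17-residue motif instance, 4 trailing residues.
toy = "MKA" + "A" * 10 + "FSTPVWISQAQGIRAGP" + "QRLT"
hits = find_arrest_peptides(toy)
```

In this toy case the motif spans 17 residues, which agrees with the 150 to 166 numbering given in the text; the stalling proline corresponds to the last residue of each match.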

4.3. Formation of disulfide bonds in the periplasm

Once a protein is secreted into the periplasm, it is either trafficked to the outer membrane (or beyond), or it folds in the periplasm. Folding of proteins in the periplasm is often aided by periplasmic chaperones and folding catalysts (45, 46, 128, 129). In the following section, I will briefly describe the Dsb (disulfide bond) system of E. coli, which is involved in the formation of disulfide bonds in the periplasm (130). This system has played a key role in the Ph.D. studies (chapters I, II and III).

A disulfide bond (-S-S-) is a covalent bond formed between the -SH groups of two cysteine residues in a polypeptide chain (130, 131). It stabilizes the folded state of a protein. Disulfide bond formation can occur spontaneously in the presence of molecular oxygen (131, 132). However, the rate of this spontaneous reaction is very slow and it actually takes too long to support the formation of disulfide bonds needed by the cell (128). Catalysts can enhance the speed of this reaction (111). In Gram-negative bacteria, many periplasmic proteins and outer membrane proteins contain one or more disulfide bonds (133). A notable example of a periplasmic disulfide bond containing protein in E. coli is alkaline phosphatase (chapter I). Failure to form correct disulfide bonds can lead to protein aggregation and/or degradation by periplasmic proteases (134). Notably, many recombinant proteins like hormones and antibody fragments contain disulfide bonds and are therefore usually produced in the periplasm of E. coli.

E. coli has the Dsb system that catalyses the formation of disulfide bonds (Figure 7). This system consists of different proteins, including DsbA, which with the help of DsbB catalyses the disulfide bond formation reaction, and DsbC, which with the help of DsbD mediates disulfide bond isomerisation (130). DsbA is a thiol disulfide oxidoreductase and is thought also to have some chaperone activity (128, 130, 135). It is a monomeric protein with a molecular weight of 21 kDa; it has a domain with a thioredoxin (Trx)-fold and a helical domain that folds around the aforementioned domain (136). DsbA is characterized by a Cys-X-X-Cys motif (128), where X is any amino acid, and an uncharged groove that facilitates the interaction between DsbA and its unfolded substrate via hydrophobic interactions (137, 138).

In the crystal structure of DsbA representing the oxidized form, the cysteine at position 30 is exposed at the surface of the protein, which allows it to be attacked by reduced cysteines in the substrate (139). The cysteine at position 33 in the same structure is embedded inside the protein, and it is not involved in the initial step of forming a mixed disulfide bond complex with a substrate (139). The thiol:disulfide exchange reaction mediated by DsbA is started by a nucleophilic attack of the thiol group from the substrate on the disulfide bond at the active site of DsbA (128, 130, 139). The oxidized form of DsbA is not a favourable state because Cys30 has a low pKa (3.5)(139), which makes it an excellent leaving group. This enhances the oxidation of a substrate by DsbA and, in doing so, converts the active site of DsbA to its reduced state (130)(Figure 7).

The resulting reduced state of DsbA needs to be re-oxidized to regain its oxidase activity. This reaction is catalysed by the integral membrane protein DsbB, which spans the cytoplasmic membrane via four transmembrane helices, and has an active site that contains two pairs of cysteine residues in its periplasmic loops (140, 141). The first cysteine pair is at positions 41 and 44 and the second cysteine pair is at positions 104 and 130, which make a disulfide bridge at the active site of DsbB (142). DsbB extracts electrons from the reduced form of DsbA and funnels them into the respiratory chain via quinones in the inner membrane (130, 140). Consequently, both DsbA and DsbB regain their oxidized state and are ready for another cycle of protein oxidation (Figure 7).
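To illustrate what a pKa of 3.5 means for Cys30, the fraction of a thiol that is present as thiolate can be estimated with the Henderson-Hasselbalch relation. The snippet below is a minimal sketch: the pH of 7.0 and the pKa of 8.5 used for an unperturbed cysteine are assumptions chosen for illustration and are not values taken from this thesis.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol present as thiolate (S-) at a given pH.

    Henderson-Hasselbalch: [S-] / ([SH] + [S-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.0               # assumed illustrative pH, not a value from the text
PKA_CYS30 = 3.5        # pKa of DsbA Cys30 as quoted in the text
PKA_TYPICAL_CYS = 8.5  # assumed value for an unperturbed cysteine side chain

frac_cys30 = thiolate_fraction(PKA_CYS30, PH)
frac_typical = thiolate_fraction(PKA_TYPICAL_CYS, PH)

print(f"Cys30 thiolate fraction at pH {PH}: {frac_cys30:.4f}")
print(f"Unperturbed cysteine at pH {PH}:   {frac_typical:.4f}")
```

Under these assumptions Cys30 is almost completely deprotonated, whereas an unperturbed cysteine is mostly protonated, which is consistent with the description of Cys30 as an excellent leaving group.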

Figure 7. The formation of disulfide bonds in the periplasm of E. coli. 1) The protein that has cysteines that can form a disulfide bond is exported in an unfolded (reduced) state. A thiol group of the substrate attacks the Cys30 of DsbA (oxidized state) to form a substrate-DsbA mixed disulfide complex. This complex is resolved by the attack of the second thiol group of the substrate on the substrate-DsbA mixed disulfide complex. 2) The disulfide bond containing protein (oxidized state) and reduced DsbA are generated. A mis-oxidized form of the protein can also be produced. 3) The reduced DsbA (inactive state) is recycled back to its oxidized and active state by the inner membrane protein DsbB. 4) I to IV are the steps of electron transfer from DsbA to DsbB; the electrons are then transferred to the quinones in the inner membrane and thereby funnelled into the respiratory chain. As a result, both DsbA and DsbB are recycled back to their active state and can start another round of protein oxidation. 5) DsbC can sense a mis-oxidized protein that can be produced during protein oxidation and reduce it. Then the reduced protein undergoes another cycle of the oxidation process. 6) DsbC is recycled back to its active state with the aid of DsbD and cytoplasmic thioredoxin (TrxA) (7).

During protein folding processes, proteins can misfold due to an error in the formation of disulfide bonds. DsbC, an isomerase, is able to sense erroneously formed disulfide bonds and will reduce them. Then, the substrate can get another chance to fold properly and undergo another round of disulfide bond formation by DsbA. DsbC is recycled back to its active state by the inner membrane protein DsbD in a process that requires cytoplasmic thioredoxin (140, 143)(Figure 7).


5. Neuraminidases

Neuraminidases or sialidases (N-acylneuraminosyl glycohydrolases EC 3.2.1.18) were first discovered in the 1940s as receptor-destroying enzymes from Vibrio cholerae and the influenza virus (144, 145). In 1996, neuraminidases were defined as glycosyl hydrolases and they are widely distributed in nature (146). In the SWISS-PROT protein sequence database there is a list of different neuraminidases from a variety of organisms, including bacteria, viruses, bacteriophages, fungi, protozoa, mycoplasmas and some eukaryotes (147). Neuraminidases catalyze the removal of sialic acid from various glycoconjugates (148).

Sialic acid is a generic name of a large family of naturally occurring analogues of N-acetyl neuraminic acid (Neu5Ac) and it is located at the termini of carbohydrate complexes in eukaryotes (149). The occurrence of different analogues is linked to species, cell type, cell age, and tissue type (149). Some analogues have a role in protecting glycoconjugates from ‘attacks’ by neuraminidases (149). Pathogens can have proteins that recognize sialic acid to promote their attachment to the host, and many pathogens can remove sialic acid from the surface of the host cell to aid pathogenesis and/or to meet nutritional requirements (146).

In this section, I will briefly discuss bacterial neuraminidases with a focus on the ones from S. pneumoniae and I will briefly touch upon influenza virus neuraminidases.

5.1. Streptococcus pneumoniae and influenza virus neuraminidases

Many neuraminidase-producing bacteria can use sialic acid as a carbon and energy source because they have both a sialic acid transporter (SatABC) to import the sialic acid into the cell and enzymes that can catabolize it (146)(Figure 8). The S. pneumoniae genome encodes up to three neuraminidases, i.e., NanA, NanB and NanC. It uses the neuraminidases to unmask adhesive receptors on the host via hydrolytic removal of α-glycosidically linked sialic acids, attached via either O-glycosidic or N-glycosidic bonds, from sialylated glycoconjugates (Figure 8)(14, 150). The hydrolytic reaction is an essential step for S. pneumoniae colonization and pathogenesis (151, 152). It has been suggested that NanA is responsible for the sialic acid removal (148), while NanB is essential for the survival of the bacterium (148, 153) and NanC is thought to be a regulator of NanA (148, 154).

The influenza virus neuraminidases (IV-NAs) belong to the exosialidase enzymes just like the Sp-NAs (EC 3.2.1.18)(155, 156). They cleave the α-glycosidic linkage between the N-acetylneuraminic acid and the sugar residue of glycoconjugates. IV-NAs are divided into ten subtypes; nine subtypes are from the influenza A virus and one subtype is from the influenza B and C viruses. The nine subtypes of influenza A are further divided into two phylogenetic groups. The first subtype group includes the neuraminidases N1, N4, N5 and N8, while the second subtype group includes the neuraminidases N2, N3, N6, N7 and N9 (155, 157).

The crystal structures of domains from the Sp-NAs have been determined (154, 158, 159). They are monomeric in their active state and they do not have disulfide bonds (154, 158, 159). NanA, NanB and NanC show domain conservation: they have an N-terminal signal sequence, an N-terminal carbohydrate-binding module (CBM), and a catalytic β-propeller domain with an irregular inserted (I) domain that protrudes from the catalytic domain (150)(Figure 8). While NanC shows similarity to the overall topology of NanA and NanB, it has a critical difference at the active site that causes the specificity of NanC toward its substrate (154). In contrast to NanB and NanC, NanA has a C-terminal domain that contains an LPETG anchor motif. Upon translocation of NanA across the cytoplasmic membrane, this motif is recognized and cleaved between the threonine (T) and the glycine (G) by the inner membrane protein sortase A (SrtA). After that, it undergoes transglycosylation and transpeptidation reactions that tether NanA to the bacterial surface (160, 161)(Figure 8).

The sequence identity between NanB and NanC is approximately 50%, and both share about 25% identity with NanA (154). The catalytic domains of the neuraminidases have a six-bladed β-propeller topology that contains conserved key catalytic amino acids (146). Moreover, they contain conserved sequences, known as bacterial neuraminidase repeats (BNRs), that are repeated between one and five times along the Sp-NA sequences (158). The presence of CBMs enhances the catalytic efficiency of Sp-NAs toward their substrates (162). NanA is the largest of the Sp-NAs (115 kDa) because it has an additional C-terminal membrane-anchor domain. NanB and NanC are similar in size, 78 and 82 kDa, respectively.

5.1.2. Catalytic site characteristics, substrate specificity and product formation

The Sp-NAs show diversity in substrate specificities, catalytic mechanism and kinetic parameters (148, 150, 163). The active site of Sp-NAs shares conserved catalytic residues including a tri-arginine cluster that interacts with carboxylic groups of sialic acids, a nucleophilic tyrosine with its associated glutamic acid residue, an aspartic acid as an acid/base, and a hydrophobic pocket that accommodates an acetamido group (148, 150, 164). However, they vary in their specificity toward the substrate. This variation arises from a different topology around the active site: the catalytic cavity of NanA is flat and open due to the presence of a glycine at position 674 and a large insertion beyond this region, which allows access of a wide range of substrates and of water molecules to support substrate hydrolysis (148).


Figure 8. Buildup, biogenesis and function of S. pneumoniae neuraminidases. a) Schematic representation of the buildup of the Sp-NA proteins NanA, NanB and NanC. They have a similar buildup consisting of an N-terminal signal peptide with variable length, a carbohydrate-binding module (CBM) domain, a β-propeller catalytic domain with an irregular inserted (I) domain and a repetitive sequence of bacterial neuraminidase repeats (BNRs). At the C-terminus of NanA, there is an LPETG anchor motif to tether the protein to the cell surface. b) All Sp-NAs are synthesized in the cytoplasm, targeted to the inner membrane pathway and subsequently translocated to the trans side of the cytoplasmic membrane, where the signal peptide is cleaved by signal peptidase and the Sp-NAs mature to their native functional state. c) The sortase A (SrtA) scans the C-termini of the translocated proteins to identify an LPXTG anchor motif; in this case, this is the LPETG anchor motif of NanA. Subsequently, SrtA cleaves between the threonine (T) and the glycine (G) of the LPETG motif. An acyl-NanA intermediate is formed between the thiol group of the active-site cysteine of SrtA and the carboxyl group of the threonine at the C-terminal end of NanA. This intermediate undergoes transglycosylation and transpeptidation reactions that mediate the incorporation of NanA into the cell wall. d) Sp-NAs attack the glycoconjugates of the respiratory epithelium mucosal layer; they cleave the terminal sialic acids, where NanB and NanC cleave sialic acid with O-linked and N-linked α-glycosidic bonds respectively, while NanA cleaves the terminal sialic acid of glycosphingolipid. e) The free sialic acids are imported into the bacterial cell via the sialic acid transporter (SatABC), where they are catabolized and used as a carbon and energy source (f).

In contrast to NanA, the catalytic cavity of NanB and NanC is a narrow cleft because of the presence of a bulky tryptophan at position 674 and 716, respectively (148, 158). As a result of the catalytic site topology variations, the Sp-NAs differ in their specificity, catalytic reaction and product formation (148). NanA is a hydrolytic sialidase with non-selective substrate specificity. It can cleave α-2,3-, α-2,6-, and α-2,8-linked sialic acids and release N-acetylneuraminic acid (Neu5Ac). NanB is an intramolecular trans-sialidase that shows selective substrate specificity toward α-2,3-linked substrates and produces 2,7-anhydro-Neu5Ac. Similar to NanB, NanC has selective substrate specificity toward α-2,3-linked sialosides, but it initially releases 2-deoxy-2,3-didehydro-N-acetylneuraminic acid (DANA, Neu5Ac2en), a molecule that inhibits NanA. NanC can hydrate Neu5Ac2en to Neu5Ac upon α-2,3-linked substrate depletion (148, 150, 165). Moreover, the pH can affect the enzymatic activity of Sp-NAs (150). It has been reported that the catalytic activity of NanA, NanB and NanC is optimal in the pH ranges 5.5–6.5, 5–5.5 and 5–6, respectively (163).

6. Recombinant protein production in E. coli

Recombinant protein production plays a central role in basic biochemical/molecular biology research and many biotechnological applications (166). Recombinant proteins can be produced either in vivo using prokaryotic hosts, such as E. coli and Bacillus subtilis, or eukaryotic hosts, such as yeast, insect cells and mammalian cells, or in vitro in cell-free systems. E. coli is usually the ‘first-choice’ host for recombinant protein production in academia and industry. It has become a popular recombinant protein production platform because E. coli is very well characterized, grows rapidly, is easy to manipulate, and is cost-effective to use (166, 167, 170). Below, I will focus on the promoter systems and E. coli strains that are most relevant for the experimental work presented in this thesis.


6.1. Production strains and expression vectors

The commonly used E. coli strains for the production of recombinant proteins are B- and K-derived strains (171). The B-strains are used to produce recombinant proteins because of their rapid biomass formation, their reduced production of acetate, which inhibits biomass formation at high concentrations, and their high capacity for amino acid synthesis (172). The widely used recombinant protein production strain BL21(DE3) and its derivatives are B-derived and they all have the strong T7 promoter system that drives target gene expression (167)(see below). Studier and Moffatt generated the BL21(DE3) strain by integrating the gene encoding the RNA polymerase from bacteriophage T7 into the genome with the help of a lambda-based vector (173, 174). E. coli BL21(DE3) is characterized by the absence of the cytoplasmic protease Lon and the outer membrane protease OmpT (171, 175). Absence of these two proteases often leads to an increase in the stability of the produced recombinant protein (174). In particular OmpT plays a role in this, since it can degrade endogenous and recombinant proteins after cell lysis and it is (close to) impossible to inhibit (175). All these characteristics make E. coli BL21(DE3) the most popular host, in academia and in industry, for producing recombinant proteins in the cytoplasm and the cell envelope at lab scale. E. coli strain W3110 (and derivatives thereof) is often (for historical reasons) used by industry to produce recombinant proteins at a large(r) scale (176, 177).

E. coli naturally contains plasmids (see the very beginning of the Introduction). A plasmid can give the cell the ability to produce e.g., proteins that make the cell resistant to antibiotics (178). Genetic manipulation and cloning methods make it possible to introduce any desired gene into a plasmid and express this gene. The list of expression plasmids is huge: they can have different origins of replication, promoters, selection markers, multiple cloning sites and genetic information encoding affinity/protein isolation tags (170). All these have to be considered when choosing an expression vector for the production of a recombinant protein (Figure 9).

The initiation of transcription depends on the recruitment of RNA polymerase to a defined nucleotide sequence, known as a promoter sequence, located upstream of the gene encoding the recombinant protein. In E. coli, promoters consist of two regions of six nucleotides each, located 10 and 35 nucleotides upstream of the transcriptional initiation site, at which RNA polymerase binds (179). Phage promoters, like the T7 promoter, can consist of one stretch of nucleotides (180, 181). Promoters vary in their transcription initiation frequency and in their basal expression, which affects both endogenous and recombinant protein production levels. The binding of RNA polymerase to the promoter usually requires a specific factor or condition, such as the presence or absence of a metabolite or an increase or decrease in temperature (182, 183). Most of the promoters used for recombinant protein production are derived from operons involved in sugar utilization or they are from bacteriophages (170, 181).

Figure 9. Major characteristics of an expression vector used for recombinant protein production in E. coli. The origin of replication (Ori) is in light green. The Ori determines the copy number of the vector per cell. The promoter region is in light orange. Affinity tags, in light blue, and sequences encoding for their removal are in light pink. Here, they are positioned upstream of the multiple cloning site (MCS), but they can also be positioned downstream of it. The affinity tags are either peptides, such as a poly-His-tag or HA-tag, or fusion proteins, such as GST (glutathione-S-transferase) and MBP (maltose-binding protein) (184). Both tag and cleavable site are usually required in the purification process. The striped box is the coding sequence for the target protein and also a transcription terminator site, in red, is included. The selection marker, in violet, is usually a gene encoding an antibiotic resistance marker required to select the cells that carry the desired expression vector.

6.1.1. The T7 RNA polymerase/promoter

The T7 promoter in combination with the T7 RNA polymerase is often used to drive the expression of genes encoding recombinant proteins in E. coli (170). Both the T7 promoter and the T7 RNA polymerase originate from bacteriophage T7 (181, 185). The T7 promoter is specifically recognized by the very powerful T7 RNA polymerase and not by E. coli RNA polymerase (conversely, the T7 RNA polymerase does not recognize E. coli promoters) (174, 181). The gene encoding the T7 RNA polymerase is placed in the bacterial genome using a prophage (λDE3) and it is under the transcriptional control of the lacUV5 promoter, which is a variant of the lactose promoter that is not affected by catabolite repression and is more powerful than the wild-type promoter (174, 186). Therefore, the T7-based gene expression can be induced by lactose or its non-hydrolyzable analogue isopropyl β-D-1-thiogalactopyranoside (IPTG) (Figure 10).

In spite of the lac operators (to which the LacI repressor can bind) that are present in most T7-based expression vectors (T7/lac), non-induced target gene expression is often a limitation of this system (in particular when the recombinant protein is toxic) (181, 187). Various measures can reduce it, including e.g., the use of lacIq, a promoter mutation that leads to more lac repressor (LacI) and thereby tighter repression of the lac-inducible T7 RNA polymerase gene, and T7 lysozyme co-production (28). T7 lysozyme is a natural inhibitor of the T7 RNA polymerase (188, 189).

In chapter III, we used an expression vector with a T7-based promoter system (pET28a; pBR322) and ‘evolved’ the six nucleotides upstream and downstream of its AUG start codon using degenerate primers that cover all possible nucleotide combinations around the start codon without changing the amino acid sequence, in order to enhance recombinant protein production yields.
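The size of such a library can be estimated by enumeration. The snippet below is a minimal sketch under stated assumptions: the six upstream nucleotides are taken to be fully randomized, and the six downstream nucleotides are treated as two codons that may only be exchanged for synonymous codons of the standard genetic code. The function names and the example codons (AAA CAA) are hypothetical, and the degenerate primers actually used in chapter III may encode a different set of sequences.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon: str) -> list[str]:
    """All codons (including the input) that encode the same amino acid."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def downstream_options(downstream: str) -> list[str]:
    """Synonymous variants of the two codons that follow the start codon."""
    if len(downstream) != 6:
        raise ValueError("expected exactly six nucleotides (two codons)")
    first, second = downstream[:3], downstream[3:]
    return [a + b for a in synonymous_codons(first) for b in synonymous_codons(second)]


def tir_variants(downstream: str):
    """Yield (upstream6, downstream6) pairs; the start codon itself is untouched."""
    options = downstream_options(downstream)
    for up in product("ACGT", repeat=6):
        for down in options:
            yield "".join(up), down


def library_size(downstream: str) -> int:
    """Number of distinct variants: 4**6 upstream x synonymous downstream options."""
    return 4 ** 6 * len(downstream_options(downstream))


# Hypothetical example: Lys (AAA) and Gln (CAA) directly after the start codon.
size = library_size("AAACAA")
```

Because the upstream region contributes 4^6 = 4096 combinations on its own, the theoretical library size is dominated by this region, while the downstream contribution depends on the codon degeneracy of the second and third amino acid.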

Figure 10. T7-promoter based system for recombinant protein production. The lacUV5 promoter, in light orange, is blocked by the LacI repressor protein, in green, that can bind to the lac operator site(s) in the lacUV5 promoter. When IPTG, in red, is added, the repressor protein binds the inducer, which leads to a conformational change and dissociation of the LacI protein from its binding site(s). Then RNA polymerase, in light violet, binds and starts transcription of the t7rnap gene, in light brown. The T7 RNA polymerase, in light blue, is produced and begins the transcription of the target gene from the T7 promoter of the pET vector, resulting in the production of the target protein, in yellow. The activity of T7 RNA polymerase can be modulated by T7 lysozyme, in pink, which is a natural inhibitor of the T7 RNA polymerase.

6.1.2. The tac promoter

The tac promoter system is a hybrid of the E. coli trp promoter and the aforementioned lacUV5 promoter (190). Its activity is less sensitive to the intracellular level of cAMP (190). Target gene expression using this promoter is induced with lactose or IPTG (191, 192). This promoter is suitable for the high level expression of foreign genes in E. coli. In contrast to expression vectors with a T7-based promoter system, expression vectors with a tac promoter can be used in a much wider variety of E. coli recombinant protein production hosts since they depend on the E. coli RNA polymerase rather than the T7 RNA polymerase (192).

In chapter IV, we have used the tac-promoter based pGEX expression vector (pGEX-6p1)(193). This vector has a pBR322 origin of replication and it is characterized by the presence of the aforementioned lacIq. The LacI repressor tightly regulates the tac promoter by preventing transcription from it in the absence of lactose/IPTG. Also, it is a suitable vector for production and efficient purification of GST-tagged proteins (194). Thus, in chapter IV we used this expression vector to recombinantly produce S. pneumoniae neuraminidases in the E. coli Rosetta (DE3) strain. This strain is a BL21-derivative designed to enhance the production of recombinant proteins in E. coli by synthesizing extra tRNAs for the AGG, AGA, AUA, CUA, CCC and GGA codons (170, 195, 196).
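Whether a target gene is likely to benefit from these extra tRNAs can be judged by counting how often the six codons listed above occur in its coding sequence. The snippet below is a minimal sketch; the function name and the toy coding sequence are hypothetical and were invented for illustration only.

```python
from collections import Counter

# Codons for which the Rosetta (DE3) strain supplies extra tRNAs, as listed in the text.
RARE_CODONS = {"AGG", "AGA", "AUA", "CUA", "CCC", "GGA"}


def rare_codon_usage(cds: str) -> Counter:
    """Count in-frame occurrences of the listed codons in a coding sequence.

    Accepts DNA or RNA; reading starts at the first nucleotide, and an
    incomplete trailing codon is ignored.
    """
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return Counter(c for c in codons if c in RARE_CODONS)


# Hypothetical toy coding sequence: ATG, each listed codon once, then a stop codon.
toy_cds = "ATGAGGAGAATACTACCCGGATAA"
usage = rare_codon_usage(toy_cds)
total_rare = sum(usage.values())
```

Only in-frame codons are counted here, so the sequence passed to the function should start at the start codon of the gene.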

6.1.3. The rhaBAD promoter

The L-rhamnose-inducible rhaBAD promoter that belongs to the rha operon of E. coli is also widely used to control the expression of genes encoding heterologous/recombinant proteins in E. coli. The rhaBAD promoter system belongs to the positively regulated expression systems known as the AraC-XylS family (197, 198). The rhaBAD promoter is capable of relatively high levels of expression (199) and it displays undetectable/very low baseline gene expression in the absence of its inducer (200, 201). It is in particular useful for the expression of genes encoding toxic proteins such as secretory and membrane proteins (202).

L-rhamnose is a naturally occurring deoxyhexose, and it is non-toxic to bacterial cells (199, 203, 204). E. coli is able to utilize L-rhamnose as an energy and carbon source because its rha operon encodes the rhamnose-proton symporter (RhaT), which transports rhamnose into the cell, three enzymes (RhaB, RhaA and RhaD) required for rhamnose catabolism, and the regulatory proteins (RhaS and RhaR)(205)(Figure 11). Upon addition of rhamnose, the RhaR protein activates the transcription from the rhaSR promoter and the produced RhaS causes activation of transcription from the rhaT and rhaBAD promoters (Figure 11). Also, an excessive amount of RhaS acts as a negative autoregulator, thereby downregulating transcription from its own promoter (205). For recombinant protein production, the gene encoding the recombinant protein is usually inserted downstream of the rhaBAD promoter; thus the level of the transcribed gene from this promoter depends on the amount of rhamnose added to the culture medium (202).

In chapter II, we used an E. coli strain that is deficient in the rha operon (including the rhaT and rhaB,A,D genes that are involved in rhamnose transport and catabolism). Importantly, the rha operon deletion generates a truly titratable protein production setup for E. coli that can be used to very precisely tune recombinant protein production rates in a rhamnose concentration-dependent manner (206). Such a system makes it possible to, e.g., harmonize the production rate of a secretory recombinant protein with the protein translocation capacity of the cell.

Figure 11. a. Schematic view of the E. coli rhamnose operon. The operon encodes the regulatory proteins (RhaR and RhaS), the L-rhamnose transporter (RhaT) and the catabolic enzymes RhaB, RhaA and RhaD. Upon addition of L-rhamnose the regulator protein RhaR activates the transcription from the rhaSR promoter, while RhaS activates the transcription from both the rhaT and rhaBAD promoters. b. Schematic view of the rhaBAD promoter for recombinant protein production, where the target gene is introduced downstream of the promoter and the RhaS protein regulates its transcription in response to L-rhamnose.

6.1.4. The araBAD promoter

The arabinose inducible promoter system is a relatively weak promoter system that is regulated by AraC (207). L-arabinose induces the expression of genes under control of the araBAD promoter and the araBAD promoter exhibits an ‘On and Off’ expression phenotype (208, 209). AraC activates or inhibits the transcription in response to the presence and absence of L-arabinose (210). E. coli cells can transport the inducer via two transporters: the low-affinity transporter, AraE, and the high-affinity transporter, AraFGH (210). In chapter I, we have used the arabinose inducible promoter system for the production of derivatives of alkaline phosphatase, a disulfide bond containing protein in E. coli, in order to study its folding.

7. Summaries of chapters I-IV

Chapter I: Co-translational folding of alkaline phosphatase in the periplasm of Escherichia coli

Force-profile analysis is a recently developed method that has been used to study co-translational protein folding of cytoplasmic proteins, both in vitro and in vivo (Figure 12). Force-profile analysis takes advantage of the sensitivity of the SecM-family of translational arrest peptides to pulling forces acting on the nascent chain: the higher the pulling force, the less efficient is the translational stall induced by the arrest peptide (221, 222)(see section 4.2.1.). Many co-translational events, including protein folding, can generate force on the nascent chain, and are hence amenable to force-profile analysis. The use of force profile analysis has so far been mainly limited to studying the co-translational folding of small domains of cytosolic proteins that fold in close proximity to the translating ribosome.

Figure 12. Principle of force profile analysis. The protein of interest is synthesized as a series of truncates of variable length, in which 2 to 10 amino acids are removed from the C-terminus of the protein in a stepwise manner (the resulting construct lengths are referred to as ‘N’). The constructs are then translated, radioactively labelled with 35S-methionine and separated by SDS-PAGE. The radioactive intensity of the bands is measured by autoradiography and quantified using the EasyQuant software. The fraction full length (fFL) of each construct is determined from the intensities of the full-length (FL) protein, translated up to the stop codon, and the arrested (A) protein, translated up to the critical proline residue of SecM. The force profile is generated by plotting the calculated fraction full length (fFL) against the construct length (N). The pulling force (F) represents the N-terminal protein folding event acting on the nascent chain that causes translation to resume.
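The calculation behind a force profile can be sketched in a few lines of Python. The sketch assumes the commonly used definition fFL = I_FL / (I_FL + I_A), where I_FL and I_A are the quantified band intensities of the full-length and arrested species; the caption above does not spell out this formula. The names `Construct`, `fraction_full_length` and `force_profile` are hypothetical, and the intensities are made-up numbers for illustration, not measured data.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    length: int            # N: length of the truncated construct (illustrative values)
    i_full_length: float   # band intensity of the full-length (FL) species
    i_arrested: float      # band intensity of the arrested (A) species


def fraction_full_length(i_full_length: float, i_arrested: float) -> float:
    """Assumed definition: fFL = I_FL / (I_FL + I_A)."""
    total = i_full_length + i_arrested
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return i_full_length / total


def force_profile(constructs):
    """Return (N, fFL) pairs sorted by construct length N."""
    return sorted(
        (c.length, fraction_full_length(c.i_full_length, c.i_arrested))
        for c in constructs
    )


# Made-up band intensities for three hypothetical constructs.
demo = [
    Construct(length=120, i_full_length=30.0, i_arrested=70.0),
    Construct(length=100, i_full_length=10.0, i_arrested=90.0),
    Construct(length=110, i_full_length=60.0, i_arrested=40.0),
]

profile = force_profile(demo)
for n, f_fl in profile:
    print(f"N = {n:4d}   fFL = {f_fl:.2f}")
```

Plotting the second column against the first gives the force profile; a high fFL at a given N means that the arrest peptide stalled translation inefficiently at that length.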

In chapter I, we explored the use of force profile analysis to investigate, in E. coli cells, the co-translational folding of a periplasmic protein, i.e., alkaline phosphatase (PhoA). It has been shown that PhoA can be targeted and translocated in a co-translational fashion (107). The folding of PhoA in the periplasm is known to be stabilized by two consecutive disulfide bonds, formed in a reaction catalyzed by DsbA in the periplasm (107, 223). The force profile for PhoA shows numerous peaks at lengths that correspond to exposure of the cysteine pairs within the PhoA sequence to the periplasm. These peaks are sensitive to the absence of DsbA (see section 4.3.), indicating that protein folding driven by disulfide bond formation can generate a pulling force. Thus, our data show that force profile analysis can be used to study co-translational folding of proteins in an extra-cytosolic compartment. In the light of the mounting evidence that many periplasmic proteins can be translocated co-translationally across the cytoplasmic membrane in E. coli, force profile analysis appears to be a powerful tool to study the folding of these proteins.

Chapter II: Escherichia coli can adapt its protein translocation machinery for enhanced periplasmic recombinant protein production

A tunable rhamnose promoter-based setup for the production of recombinant proteins in E. coli has been engineered (206, 224). This setup makes it possible to set the production rate of a secretory recombinant protein precisely, which is critical for enhancing protein production yields in the periplasm. The idea is that the production rate of a secretory recombinant protein has to be set precisely in order to harmonize it with the protein translocation capacity of the cell. In chapter II, using proteome analysis, we have shown, to our surprise, that enhancing periplasmic production of human growth hormone (hGH) using the tunable rhamnose promoter-based setup is accompanied by increased accumulation levels of at least three key players in protein translocation: the peripheral motor of the Sec-translocon (SecA), leader peptidase (LepB), and the cytoplasmic membrane protein integrase/chaperone (YidC) (for more information about SecA, LepB and YidC, see sections 4.2. and 4.2.1.). Thus, enhancing periplasmic hGH production appears to lead to increased Sec-translocon capacity, an increased capacity to cleave signal peptides from secretory proteins and an increased capacity of an alternative membrane protein biogenesis pathway, which frees up Sec-translocon capacity for protein secretion. To test whether the increased accumulation levels of SecA, LepB and YidC in cells with enhanced periplasmic hGH production yields were due to adaptation or to the accumulation of mutations leading to permanently high SecA, LepB and YidC levels, cells were harvested and subsequently cultured in the absence of inducer. SecA, LepB, and YidC levels went down again when the cells were cultured in the absence of inducer. This indicates that, when the tunable rhamnose promoter system is used to enhance the production of a protein in the periplasm, E. coli can adapt its protein translocation machinery for enhanced recombinant protein production in the periplasm.

Chapter III: Increased production of periplasmic proteins in Escherichia coli by directed evolution of the translation initiation region

In bacteria, many recombinant proteins are produced in the oxidising environment of the periplasm (Gram-negative bacteria) or the culture supernatant (Gram-positive bacteria). A commonly encountered problem is that the signal peptide used to translocate the recombinant protein across the cytoplasmic membrane influences the synthesis and secretion of the recombinant protein in an unpredictable manner. Understanding why this is the case could lead to improved methods for producing secreted recombinant proteins. In chapter III, using the T7 RNA polymerase/promoter setup (see section 6.1.1.), it is demonstrated that signal peptides contribute to an unpredictable translation initiation region (see section 3.1.1.). A directed evolution approach that selects a new translation initiation region, whilst leaving the amino acid sequence of the signal peptide unchanged, can increase production yields of secreted recombinant proteins. The approach can be used to increase the production yields of single-chain antibody fragments, hormones and other recombinant proteins in the periplasm of E. coli. The study demonstrates that signal peptide performance appears to be coupled to the efficiency of the translation initiation region.
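One way a translation initiation region can be varied while the signal peptide keeps its amino acid sequence is through synonymous codon changes just downstream of the start codon. The Python sketch below enumerates such variants. It is only an illustration of that constraint and not a description of the library design used in chapter III: the positions varied, the helper names (`translate`, `synonymous_variants`) and the 12-nucleotide demo sequence are all assumptions, and changes to the untranslated region upstream of the start codon are not modelled.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)

# Map each amino acid to all codons that encode it.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_variants(cds: str, first_codon: int = 1, n_codons: int = 2):
    """Enumerate sequences that differ only by synonymous codons in a window.

    The window starts at codon index `first_codon` (0 is the start codon)
    and spans `n_codons` codons; everything else is left untouched.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    window = codons[first_codon:first_codon + n_codons]
    choices = [SYNONYMS[CODON_TABLE[c]] for c in window]
    head, tail = codons[:first_codon], codons[first_codon + n_codons:]
    return ["".join(head + list(combo) + tail) for combo in product(*choices)]


# Illustrative sequence encoding Met-Lys-Gln-Ser (not taken from the thesis).
DEMO_CDS = "ATGAAACAAAGC"
variants = synonymous_variants(DEMO_CDS)
for v in variants:
    print(v, translate(v))
```

Every variant translates to the same peptide, so any difference in production yield between such variants would have to arise at the nucleotide level, for example through effects on translation initiation.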

Chapter IV: Characterization of the Streptococcus pneumoniae neuraminidases NanA, NanB and NanC

Neuraminidases, or sialidases, can be synthesized by pathogenic prokaryotes and some eukaryotes (147). In prokaryotes, neuraminidases cleave the terminal sialic acid from carbohydrate complexes lining the host cell surface. Streptococcus pneumoniae is a Gram-positive bacterium that is part of the respiratory tract flora (225). It can become pathogenic under specific conditions, in particular when the immune system of the host is compromised. The S. pneumoniae genome encodes three different neuraminidases, i.e., NanA, NanB, and NanC. The genome of S. pneumoniae also encodes a sialic acid transporter and catabolic enzymes involved in sialic acid metabolism (20, 148, 149). Thus, S. pneumoniae can use sialic acid as a carbon and energy source (see section 5.). In chapter IV, I produced NanA, NanB and NanC in the cytoplasm of E. coli (i.e., without their signal peptides) as GST fusions (between the Nans and GST there is a TEV protease cleavage site). The Nans were recovered from the GST fusions with the help of the TEV protease. Using a MUNANA-based activity assay, the activity of NanA, NanB, and NanC was monitored at different pH. The optimum pH for NanA and NanB activity is 6.6 and the optimum pH for NanC activity is 7. In addition, the oligomeric state of NanA, NanB, and NanC was monitored using size exclusion chromatography and BN-PAGE. The three Nans seem to occur mostly, if not exclusively, as monomers.

8. Future perspectives

The main aim of the Ph.D. studies was to further our understanding of the production of proteins in the periplasm of E. coli and the folding of proteins in this compartment.

In chapter I, force profile analysis (FPA) was used to study the co-translational folding of a periplasmic protein and, in particular, the role that disulfide bond formation plays in this process. The periplasmic protein PhoA, which contains two consecutive disulfide bonds, was used. It would be interesting to use FPA to monitor the folding of proteins with more than two consecutive disulfide bonds and of proteins with non-consecutive disulfide bonds. Furthermore, the periplasm of E. coli seems to contain many different chaperones (45, 129, 134, 226). For many of these chaperones it is not very clear what their substrates are and how they assist protein folding. FPA may be a powerful tool to identify substrates of periplasmic chaperones and to study their mode of action. It is envisaged that a better understanding of periplasmic chaperones may pave the way for designing strategies (chaperone co-production/engineering) to enhance the production of recombinant proteins in the periplasm.

Saturation of the protein translocation machinery can hamper recombinant protein production in the periplasm of E. coli. It can lead to the aggregation of the target protein and other proteins in the cytoplasm, which can negatively impact fitness and biomass formation. It is assumed that harmonizing the production rate of the target protein with the capacity of the protein translocation machinery can help to enhance periplasmic protein production yields (206, 224). In chapter II, we show that in E. coli ‘harmonizing the production rate of the target protein with the capacity of the protein translocation machinery’ can actually lead to the upregulation of (at least) three key players in protein translocation (SecA, LepB, and YidC).
The mechanism behind the upregulation of these components, besides SecA, is mostly unclear (38, 227, 228). It would be interesting to elucidate how LepB and YidC levels are upregulated and whether more components are upregulated. At any rate, our study shows that E. coli apparently can adapt to the production of recombinant proteins in the periplasm, which suggests that it may be worth revisiting the (well-controlled) co-production of components of the protein translocation machinery to enhance the production of recombinant proteins in the periplasm (229, 230).

An incompatibility between the 5' UTR of an expression vector and the 5' end of the cloned protein-coding sequence of a recombinant protein can generate problems at the level of the translation initiation region (TIR). The ‘directed’ evolution of the TIR may help to improve recombinant protein production yields. In chapter III, it is shown that the TIR can be successfully evolved to enhance the periplasmic production yield of recombinant proteins. We thought that the synthetic evolution of the TIR might influence the mRNA secondary structure, thereby enhancing the translation initiation efficiency. However, it remains unclear to what extent evolving a TIR can influence mRNA structure/stability and translation initiation rates. Monitoring, e.g., mRNA levels and mRNA structures in vivo may help to further understand how evolving TIRs can enhance recombinant protein production (231, 232).

Some characteristics of the recombinantly produced S. pneumoniae neuraminidases NanA, NanB, and NanC were elucidated. It would be very interesting to characterize them further, for instance with specific inhibitors, and, more importantly, to compare their characteristics with those of viral neuraminidases. This may shed light on the relationship between S. pneumoniae and the flu virus during co-infections.


Summary in Swedish

The Gram-negative bacterium Escherichia coli is the most widely used bacterium for the production of recombinant proteins. Proteins that contain disulfide bonds are usually produced in the periplasm of E. coli, since disulfide bond formation is promoted in this compartment of the cell, in contrast to the cytoplasm. To reach the periplasm, recombinant proteins have to be moved across the cytoplasmic membrane by the protein translocation machinery. Obtaining sufficient yields of active recombinant protein in the periplasm is always challenging. My doctoral project has aimed at developing strategies to improve yields in the production of recombinant proteins in the periplasm, to better understand what happens when a protein is produced in the periplasm, and to shed light on the protein folding process in the periplasm.

In chapter I, we explored the use of force-profile analysis to investigate, in E. coli cells, the co-translational folding of a periplasmic protein (alkaline phosphatase (PhoA)). The folding of PhoA in the periplasm is known to be stabilized by two disulfide bonds, in a reaction catalyzed by the protein DsbA in the periplasm. The force profile for PhoA shows several peaks at lengths that correspond to exposure of cysteine pairs within the PhoA sequence to the periplasm. These peaks are sensitive to the absence of DsbA, which indicates that protein folding based on disulfide bond formation can generate a pulling force. Thus, our data show that force-profile analysis can be used to study co-translational folding of proteins outside the cytoplasm. In light of the growing evidence that many periplasmic proteins can be transported co-translationally across the cytoplasmic membrane in E. coli, force-profile analysis appears to be a powerful tool for studying the folding of these proteins.
In chapter II, we show, using proteome analysis, that periplasmic production of human growth hormone (hGH) controlled by the rhamnose promoter gives rise to elevated levels of at least three key factors in protein translocation: the peripheral motor of the Sec-translocon (SecA), leader peptidase (LepB) and the cytoplasmic membrane protein integrase (YidC). Enhancing periplasmic hGH production thus appears to lead to increased Sec-translocon capacity, an increased capacity to cleave signal peptides from secretory proteins and an increased capacity of an alternative pathway for membrane protein biogenesis, which frees up Sec-translocon capacity for protein secretion. In addition, we have shown that E. coli can adapt its protein translocation machinery for improved production of recombinant protein in the periplasm when the rhamnose promoter system is used to enhance the production of a protein in the periplasm.

In chapter III, we use the T7 RNA polymerase/promoter setup to show that signal peptides contribute to an unpredictable translation initiation region. Using directed evolution to select a new translation initiation region, we were able to increase the yield of secreted recombinant proteins. The approach can be used to increase the yield of single-chain antibody fragments, hormones and other recombinant proteins in the periplasm of E. coli.

Neuraminidases, or sialidases, can be synthesized by prokaryotic and eukaryotic pathogens. They cleave the terminal sialic acid from carbohydrate complexes on the surface of the host cell. Streptococcus pneumoniae is a Gram-positive bacterium whose genome encodes three different neuraminidases (NanA, NanB and NanC). In chapter IV, I produced NanA, NanB and NanC in the cytoplasm of E. coli. Using a MUNANA-based activity assay, the activity of NanA, NanB and NanC was monitored at different pH. In addition, the oligomeric state of NanA, NanB and NanC was monitored using size exclusion chromatography and BN-PAGE.


Summary in Arabic:

The Gram-negative bacterium Escherichia coli is the bacterium most widely used for the production of proteins in academic laboratories as well as in the pharmaceutical industry. Proteins with disulfide bonds are synthesized in the cell fluid of the bacterial cell, known as the cytoplasm, and are then sent through the cell membrane, via special channels for protein transport, to the region enclosed between the cell membrane and the outer membrane of the bacterial cell. This region is called the periplasm. In contrast to the cell fluid, the periplasm is characterized by an oxidizing environment, which supports the formation of disulfide bonds for the folding of proteins and supports their stability and activity. There are many challenges in understanding, developing and improving the production of proteins, especially proteins of a pharmaceutical nature, in bacterial cells in abundant quantities and with stable and active properties. This doctoral thesis therefore comprises a number of research studies that shed light on the effectiveness of new approaches for understanding how to improve the production and transport of proteins in a way that does not harm the cell, as well as an attempt to understand the mechanism of their folding in the oxidizing environment of the bacterial cell.

One of these research studies succeeded in enhancing the production of these proteins by evolving the genes in the region that is bound by the machinery responsible for translating genes into proteins, known as the ribosome. This region is an essential basis for initiating the translation of the genetic code for amino acids into proteins in the cell fluid. In another research study in this thesis, it was observed that the channel responsible for transporting proteins to the oxidizing environment adapts to increased protein production in the cell fluid: the bacterial cells stimulate the formation of helper proteins that support this channel, so that proteins are transported at a rate that matches the production rate in the cell. This prevents the channel from becoming blocked and ceasing to function, which would lead to the death of the bacterial cells. A further study in this thesis shed light on the effectiveness of analyzing the force that results from the simultaneous synthesis, transport and folding of proteins, in an attempt to understand the folding mechanism of proteins that contain disulfide bonds in the oxidizing environment. In addition, this thesis includes a study of enzymes from the Gram-positive bacterium Streptococcus pneumoniae. These enzymes are among the important means that the bacteria use to grow, enter and spread in the tissues of the human respiratory system, and they cause acute pneumonia, especially in people with weakened immunity. It is worth mentioning that in this study the production of these enzymes was induced in the cell fluid of the Gram-negative bacterium Escherichia coli, after which they were isolated and purified and some of their properties were studied.

This study also aims to understand the mechanism of action of these enzymes, especially their cooperation with the influenza virus in causing acute pneumonia, which can claim the lives of many patients around the world. It is expected that understanding the mechanism of this cooperation may contribute effectively to the development of defense systems against both.

Acknowledgment

I would like to thank my supervisor, Jan-Willem, for giving me the chance to join your laboratory in order to finish the Ph.D. It was challenging for both of us, particularly during the COVID-19 pandemic. However, I appreciate the opportunity, the time and the scientific support you offered me, so that this Ph.D. could be achieved.

I would like to thank Gunnar von Heijne as well, for being my co-supervisor and for the opportunity to collaborate. I appreciate your scientific support in achieving my goals.

My gratitude also goes to Robert Daniels for all the support and advice you have shared with me to proceed with projects. You have never been hesitant to discuss project issues, to give suggestions and to share thoughts. It was also a pleasure to teach with you, and I am glad that you were awarded the title of teacher of the year. I have learned so much from you, and I am very grateful for this experience.

Special thanks go to Renuka Kudva for both your scientific and personal support. It was a pleasure to work with you, and I am grateful for your significant input on the projects.

Many thanks go to Grant Kemp as well, for the scientific discussions and advice you have shared. I am grateful for your time and support.

Thanks to Daniel Daley and his former and current group members: Stephen, Kiavash, Claudio, Patrick, Aurelie, and Zoe. James and Diana, I wish you both good luck with your spectacular future ahead.

I would also like to thank both the department's current and former heads, Martin Högbom and Lena Mäler, and the current and former Ph.D. directors, Pia Ädelroth and Stefan Nordlund, for their efforts in the management of the Ph.D. program.

Thanks to all former and current Jan-Willem group members, and special thanks go to Alex. It was a pleasure to work with you. I appreciate the time you took to share thoughts, organize meetings, create a healthy work environment, and take care of the lab. Henrik, it was a pleasure to teach with you, and thanks for all the valuable discussions. I wish you both good luck with the Ph.D. studies.

I would like to thank the Gunnar von Heijne community for the journal clubs and the lab seminar arrangements, as well as for the collaborative attitude that generated a friendly work environment. I wish much luck to all Masters and Ph.D. students and all Post-Docs in your future careers.

As well, I would like to thank the entire DBB staff, especially the Einar Hallberg group; the former groups of Tara Hessa, Robert Daniels, and Elzbieta Glaser; and the administrative and technical staff. And to Daniel Lundin, thank you for reviewing the Swedish version of the thesis summary.

Many thanks go to the Student Union and the SULF organization at Stockholm University, particularly Sara Elg and Ingrid Lander, for your advice and support in all the circumstances I have been through.

Special thanks go to the Ministry of Higher Education-Libya, which financially supported me for two and a half years of the Ph.D. period. To the Minister of Higher Education-Libya, it was a pleasure to meet you at the Embassy of Libya in Stockholm. I appreciate your understanding and support of all students' circumstances. A big thank you goes to the Embassy of Libya in Stockholm. You do a great job of supporting students in Scandinavian countries.

Former supervisors Farag Elshaari, Dhastagir Sheriff, Abdalla Jarari and Mustafa Elfakhri, and colleagues at the Department of Biochemistry, Faculty of Medicine, University of Benghazi: thank you all for the chance you offered me to join the department and for the opportunity to get the scholarship to pursue the Ph.D. abroad.

All friends in Sweden, Libya, Sudan, Turkey, and Dubai, you are fantastic guys! I am lucky to have you around; I will never forget your great support under all the circumstances I have been through, as well as the fun times I have had with all of you.

To my parents. Dad, it is almost five years since you passed away. Sorry I did not have a chance to see you again since I left home seven years ago, but I hope you are in a better place with no more pain. I know you have always been worrying about me, and you used to remind me in every single call that your support sustained me throughout. I am here today to tell you I made it, Dad, and I miss you at this moment. Unfortunately, you are no longer in our life, but you are always alive in my heart. "O, Allah! Forgive my father and have mercy on him. Accept his deeds and grant him Jannah" Ameen. Mom, I am grateful to have you supporting me all the time. You are the strongest woman I have ever seen in my life. Thanks, Mom, for each Dua and prayer you did and still do for me. Words can't express how much I love you. Without you, Mom and Dad, I most certainly would not be where I am today.

Finally, special and profound thanks go to my brothers (Ihab, Luay, and Ahmed), my sisters (Eman and Asma), and the rest of my great family. You have offered me invaluable support, humor, and love over all these years.

References:

1. Schroeder E, Wuertz S. 2003. 3 - Bacteria. In Handbook of Water and Wastewater, ed D Mara, N Horan, pp. 57–68. London: Academic Press
2. Krogh TJ, Møller-Jensen J, Kaleta C. 2018. Impact of chromosomal architecture on the function and evolution of bacterial genomes. Front. Microbiol. 9:2019
3. Shintani M, Sanchez ZK, Kimbara K. 2015. Genomics of microbial plasmids: classification and identification based on replication and transfer systems and host taxonomy. Front Microbiol. 6:
4. Coico R. 2005. Gram staining. Curr Protoc Microbiol. Appendix 3:Appendix 3C
5. Silhavy TJ, Kahne D, Walker S. 2010. The bacterial cell envelope. Cold Spring Harb Perspect Biol. 2(5):a000414
6. Beeby M, Gumbart JC, Roux B, Jensen GJ. 2013. Architecture and assembly of the Gram-positive cell wall. Mol Microbiol. 88(4):664–72
7. Katouli M. 2010. Population structure of gut Escherichia coli and its role in development of extra-intestinal infections. Iran J Microbiol. 2(2):59–72
8. Blount ZD. 2015. The unexhausted potential of E. coli. eLife. 4:e05826
9. Lee PS, Lee KH. 2003. Escherichia coli - a model system that benefits from and contributes to the evolution of proteomics. Biotechnol Bioeng. 84(7):801–14
10. Tenaillon O, Skurnik D, Picard B, Denamur E. 2010. The population genetics of commensal Escherichia coli. Nat Rev Microbiol. 8(3):207–17
11. Donkor ES. 2013. Understanding the pneumococcus: transmission and evolution. Front Cell Infect Microbiol. 3:(7)
12. Brooks LRK, Mias GI. 2018. Streptococcus pneumoniae’s virulence and host immunity: aging, diagnostics, and prevention. Front. Immunol. 9:1366
13. Weight CM, Venturini C, Pojar S, Jochems SP, Reiné J, et al. 2019. Microinvasion by Streptococcus pneumoniae induces epithelial innate immunity during colonisation at the human mucosal surface. Nature Communications. 10(1):3060
14. Weiser JN, Ferreira DM, Paton JC. 2018. Streptococcus pneumoniae: transmission, colonization and invasion. Nat Rev Microbiol. 16(6):355–67
15. Ramirez M. 2015. Chapter 86 - Streptococcus pneumoniae. In Molecular Medical Microbiology (Second Edition), ed Y-W Tang, M Sussman, D Liu, I Poxton, J Schwartzman, pp. 1529–46. Boston: Academic Press
16. Mizrachi Nebenzahl Y, Blau K, Kushnir T, Shagan M, Portnoi M, et al. 2016. Streptococcus pneumoniae cell-wall-localized phosphoenolpyruvate protein phosphotransferase can function as an adhesin: identification of its host target molecules and evaluation of its potential as a vaccine. PLoS ONE. 11(3):e0150320
17. McCullers JA, Bartmess KC. 2003. Role of neuraminidase in lethal synergism between influenza virus and Streptococcus pneumoniae. J. Infect. Dis. 187(6):1000–1009
18. Uchiyama S, Carlin AF, Khosravi A, Weiman S, Banerjee A, et al. 2009. The surface-anchored NanA protein promotes pneumococcal brain endothelial cell invasion. J. Exp. Med. 206(9):1845–52
19. Banerjee A, Van Sorge NM, Sheen TR, Uchiyama S, Mitchell TJ, Doran KS. 2010. Activation of brain endothelium by pneumococcal neuraminidase NanA promotes bacterial internalization. Cell. Microbiol. 12(11):1576–88
20. Marion C, Burnaugh AM, Woodiga SA, King SJ. 2011. Sialic acid transport contributes to pneumococcal colonization. Infect. Immun. 79(3):1262–69
21. Bosch AATM, Biesbroek G, Trzcinski K, Sanders EAM, Bogaert D. 2013. Viral and bacterial interactions in the upper respiratory tract. PLOS Pathogens. 9(1):e1003057

22. Cohen M, Zhang X-Q, Senaati HP, Chen H-W, Varki NM, et al. 2013. Influenza A penetrates host mucus by cleaving sialic acids with neuraminidase. Virology Journal. 10(1):321
23. Peltola VT, McCullers JA. 2004. Respiratory viruses predisposing to bacterial infections: role of neuraminidase. Pediatr. Infect. Dis. J. 23(1 Suppl):S87-97
24. Price KE, Greene NG, Camilli A. 2012. Export requirements of pneumolysin in Streptococcus pneumoniae. Journal of Bacteriology. 194(14):3651–60
25. Jedrzejas MJ. 2001. Pneumococcal virulence factors: structure and function. Microbiol Mol Biol Rev. 65(2):187–207
26. Gholamhosseini-Moghaddam T, Rad M, Mousavi SF, Ghazvini K. 2015. Detection of lytA, pspC, and rrgA genes in Streptococcus pneumoniae isolated from healthy children. Iran J Microbiol. 7(3):156–60
27. Johnston JW, Myers LE, Ochs MM, Benjamin WH, Briles DE, Hollingshead SK. 2004. Lipoprotein PsaA in virulence of Streptococcus pneumoniae: Surface accessibility and role in protection from superoxide. Infect Immun. 72(10):5858–67
28. Li N, Yang X-Y, Guo Z, Zhang J, Cao K, et al. 2014. Varied metal-binding properties of lipoprotein PsaA in Streptococcus pneumoniae. J Biol Inorg Chem. 19(6):829–38
29. Miao X, He J, Zhang L, Zhao X, Ge R, et al. 2018. A novel iron transporter SPD_1590 in Streptococcus pneumoniae Contributing to Bacterial Virulence Properties. Front. Microbiol. 9:
30. Whalan RH, Funnell SGP, Bowler LD, Hudson MJ, Robinson A, Dowson CG. 2005. PiuA and PiaA, iron uptake lipoproteins of Streptococcus pneumoniae, elicit serotype independent antibody responses following human pneumococcal septicaemia. FEMS Immunol Med Microbiol. 43(1):73–80
31. Cao K, Zhang J, Miao X-Y, Wei Q-X, Zhao X-L, et al. 2018. Evolution and molecular mechanism of PitAs in iron transport of Streptococcus species. J Inorg Biochem. 182:113–23
32. Chi Y-C, Rahkola JT, Kendrick AA, Holliday MJ, Paukovich N, et al. 2017. Streptococcus pneumoniae IgA1 protease: A metalloprotease that can catalyze in a split manner in vitro. Protein Science. 26(3):600–610
33. Janoff EN, Rubins JB, Fasching C, Charboneau D, Rahkola JT, et al. 2014. Pneumococcal IgA1 Protease Subverts Specific Protection By Human IgA1. Mucosal Immunol. 7(2):249–56
34. Vollmer W, Massidda O, Tomasz A. 2019. The cell wall of Streptococcus pneumoniae. Microbiol Spectr. 7(3):GPP3-0018-2018
35. Crane JM, Randall LL. 2017. The Sec system: protein export in Escherichia coli. EcoSal Plus. 7(2):10.1128/ecosalplus.ESP-0002-2017
36. Trevors JT. 2011. The Composition and organization of cytoplasm in prebiotic Cells. Int J Mol Sci. 12(3):1650–59
37. Vendeville A, Larivière D, Fourmentin E. 2011. An inventory of the bacterial macromolecular components and their spatial organization. FEMS Microbiol. Rev. 35(2):395–414
38. Tsirigotaki A, De Geyter J, Šoštarić N, Economou A, Karamanou S. 2017. Protein export through the bacterial Sec pathway. Nature Reviews Microbiology. 15(1):21–36
39. Raetz CR, Dowhan W. 1990. Biosynthesis and function of phospholipids in Escherichia coli. J. Biol. Chem. 265(3):1235–38
40. Lin T-Y, Weibel DB. 2016. Organization and function of anionic phospholipids in bacteria. Appl. Microbiol. Biotechnol. 100(10):4255–67
41. Klein P, Kanehisa M, DeLisi C. 1985. The detection and classification of membrane-spanning proteins. Biochim. Biophys. Acta. 815(3):468–76

42. Facey SJ, Kuhn A. 2010. Biogenesis of bacterial inner-membrane proteins. Cell. Mol. Life Sci. 67(14):2343–62
43. Dalbey RE, Wang P, Kuhn A. 2011. Assembly of bacterial inner membrane proteins. Annual Review of Biochemistry. 80(1):161–87
44. von Heijne G. 2006. Membrane-protein topology. Nature Reviews Molecular Cell Biology. 7(12):909–18
45. Goemans C, Denoncin K, Collet J-F. 2014. Folding mechanisms of periplasmic proteins. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 1843(8):1517–28
46. Duguay AR, Silhavy TJ. 2004. Quality control in the bacterial periplasm. Biochim. Biophys. Acta. 1694(1–3):121–34
47. Miller SI, Salama NR. 2018. The gram-negative bacterial periplasm: Size matters. PLOS Biology. 16(1):e2004935
48. Vollmer W, Bertsche U. 2008. Murein (peptidoglycan) structure, architecture and biosynthesis in Escherichia coli. Biochimica et Biophysica Acta (BBA) - Biomembranes. 1778(9):1714–34
49. Vollmer W. 2015. Chapter 6 - Peptidoglycan. In Molecular Medical Microbiology (Second Edition), ed Y-W Tang, M Sussman, D Liu, I Poxton, J Schwartzman, pp. 105–24. Boston: Academic Press
50. Zgurskaya HI, Krishnamoorthy G, Ntreh A, Lu S. 2011. Mechanism and function of the outer membrane channel TolC in multidrug resistance and physiology of enterobacteria. Front Microbiol. 2:189
51. Vollmer W, Blanot D, de Pedro MA. 2008. Peptidoglycan structure and architecture. FEMS Microbiol. Rev. 32(2):149–67
52. Zgurskaya HI, López CA, Gnanakaran S. 2015. Permeability barrier of Gram-negative cell envelopes and approaches to bypass It. ACS Infect Dis. 1(11):512–22
53. Yadav AK, Espaillat A, Cava F. 2018. Bacterial strategies to preserve cell wall integrity against environmental threats. Front. Microbiol. 9:2064
54. Egan AJF, Cleverley RM, Peters K, Lewis RJ, Vollmer W. 2017. Regulation of bacterial cell wall growth. The FEBS Journal. 284(6):851–67
55. Typas A, Banzhaf M, Gross CA, Vollmer W. 2011. From the regulation of peptidoglycan synthesis to bacterial growth and morphology. Nat Rev Microbiol. 10(2):123–36
56. Yakhnina AA, Bernhardt TG. 2020. The Tol-Pal system is required for peptidoglycan-cleaving enzymes to complete bacterial cell division. PNAS. 117(12):6777–83
57. Ebbensgaard A, Mordhorst H, Aarestrup FM, Hansen EB. 2018. The Role of Outer Membrane Proteins and Lipopolysaccharides for the Sensitivity of Escherichia coli to Antimicrobial Peptides. Front Microbiol. 9:2153
58. Sankaran K, Wu HC. 1994. Lipid modification of bacterial prolipoprotein. Transfer of diacylglyceryl moiety from phosphatidylglycerol. J. Biol. Chem. 269(31):19701–6
59. Rajagopal M, Walker S. 2017. Envelope structures of Gram-positive bacteria. Curr. Top. Microbiol. Immunol. 404:1–44
60. Poxton IR. 2015. Chapter 5 - Teichoic Acids, lipoteichoic acids and other secondary cell wall and membrane polysaccharides of Gram-positive bacteria. In Molecular Medical Microbiology (Second Edition), ed Y-W Tang, M Sussman, D Liu, I Poxton, J Schwartzman, pp. 91–103. Boston: Academic Press
61. Larson TR, Yother J. 2017. Streptococcus pneumoniae capsular polysaccharide is linked to peptidoglycan via a direct glycosidic bond to β-D-N-acetylglucosamine. PNAS. 114(22):5695–5700

37 62. Geno KA, Saad JS, Nahm MH. 2017. Discovery of Novel Pneumococcal Serotype 35D, a Natural WciG-Deficient Variant of Serotype 35B. J. Clin. Microbiol. 55(5):1416–25 63. Suits MD, Boraston AB. 2013. Structure of the Streptococcus pneumoniae Surface Protein and Adhesin PfbA. PLoS One. 8(7):e67190 64. Crick F. 1970. Central Dogma of Molecular Biology. Nature. 227(5258):561–63 65. Fekkes P, Driessen AJM. 1999. Protein targeting to the bacterial cytoplasmic Mem- brane. Microbiol Mol Biol Rev. 63(1):161–73 66. Akopian D, Shen K, Zhang X, Shan S. 2013. Signal recognition particle: An essen- tial protein targeting machine. Annu Rev Biochem. 82:693–721 67. Duong F, Eichler J, Price A, Leonard MR, Wickner W. 1997. Biogenesis of the Gram-negative bacterial envelope. Cell. 91(5):567–73 68. Facey SJ, Kuhn A. 2010. Biogenesis of bacterial inner-membrane proteins. Cell. Mol. Life Sci. 67(14):2343–62 69. Steitz TA. 2008. A structural understanding of the dynamic ribosome machine. Nat. Rev. Mol. Cell Biol. 9(3):242–53 70. Kaczanowska M, Rydén-Aulin M. 2007. Ribosome biogenesis and the translation process in Escherichia coli. Microbiol Mol Biol Rev. 71(3):477–94 71. Schmeing TM, Ramakrishnan V. 2009. What recent ribosome structures have re- vealed about the mechanism of translation. Nature. 461(7268):1234–42 72. Dreyfus M. 1988. What constitutes the signal for the initiation of protein synthesis on Escherichia coli mRNAs? J. Mol. Biol. 204(1):79–94 73. Milón P, Rodnina MV. 2012. Kinetic control of translation initiation in bacteria. Crit. Rev. Biochem. Mol. Biol. 47(4):334–48 74. Gualerzi CO, Brandi L, Caserta E, Garofalo C, Lammi M, et al. 2001. Initiation factors in the early events of mRNA translation in bacteria. Cold Spring Harb. Symp. Quant. Biol. 66:363–76 75. Jin H, Zhao Q, Gonzalez de Valdivia EI, Ardell DH, Stenström M, Isaksson LA. 2006. Influences on gene expression in vivo by a Shine-Dalgarno sequence. Mol. Microbiol. 60(2):480–92 76. Moore PB. 2012. 
How should we think about the ribosome? Annu Rev Biophys. 41:1–19 77. Simonetti A, Marzi S, Myasnikov AG, Fabbretti A, Yusupov M, et al. 2008. Struc- ture of the 30S translation initiation complex. Nature. 455(7211):416–20 78. Gualerzi CO, Brandi L, Caserta E, Teana AL, Spurio R, et al. 2000. Translation initiation in bacteria. The Ribosome, pp. 475–94 79. Rodnina MV. 2018. Translation in prokaryotes. Cold Spring Harb Perspect Biol, p. a032664 80. Laursen BS, Sørensen HP, Mortensen KK, Sperling-Petersen HU. 2005. Initiation of protein synthesis in bacteria. Microbiol. Mol. Biol. Rev. 69(1):101–23 81. Pape T, Wintermeyer W, Rodnina MV. 1998. Complete kinetic mechanism of elon- gation factor Tu-dependent binding of aminoacyl-tRNA to the A site of the E. coli ribosome. EMBO J. 17(24):7490–97 82. Hughes D. 2013. Elongation factors: translation. In Brenner’s Encyclopedia of Ge- netics (Second Edition), ed S Maloy, K Hughes, pp. 466–68. San Diego: Aca- demic Press 83. Hummels KR, Kearns DB. 2020. Translation elongation factor P (EF-P). FEMS Microbiol Rev. 44(2):208–18 84. Korostelev AA. 2011. Structural aspects of translation termination on the ribosome. RNA. 17(8):1409–21 85. Kisselev L, Ehrenberg M, Frolova L. 2003. Termination of translation: interplay of mRNA, rRNAs and release factors? EMBO J. 22(2):175–82

38 86. Petropoulos AD, McDonald ME, Green R, Zaher HS. 2014. Distinct Roles for Re- lease Factor 1 and Release Factor 2 in Translational quality control. J Biol Chem. 289(25):17589–96 87. Freistroffer DV, Pavlov MY, MacDougall J, Buckingham RH, Ehrenberg M. 1997. Release factor RF3 in E. coli accelerates the dissociation of release factors RF1 and RF2 from the ribosome in a GTP-dependent manner. EMBO J. 16(13):4126– 33 88. Klaholz BP, Myasnikov AG, Heel M van. 2004. Visualization of release factor 3 on the ribosome during termination of protein synthesis. Nature. 427(6977):862–65 89. Kiel MC, Raj VS, Kaji H, Kaji A. 2003. Release of ribosome-bound ribosome re- cycling factor by elongation factor G *. Journal of Biological Chemistry. 278(48):48041–50 90. Karimi R, Pavlov MY, Buckingham RH, Ehrenberg M. 1999. Novel roles for clas- sical factors at the interface between translation termination and initiation. Molec- ular Cell. 3(5):601–9 91. Veenendaal AKJ, van der Does C, Driessen AJM. 2004. The protein-conducting channel SecYEG. Biochimica et Biophysica Acta (BBA) - Molecular Cell Re- search. 1694(1):81–95 92. Lycklama a Nijeholt JA, Driessen AJM. 2012. The bacterial Sec-translocase: struc- ture and mechanism. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 367(1592):1016– 28 93. Vrontou E, Economou A. 2004. Structure and function of SecA, the preprotein translocase nanomotor. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 1694(1):67–80 94. Chatzi KE, Sardis MF, Economou A, Karamanou S. 2014. SecA-mediated targeting and translocation of secretory proteins. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 1843(8):1466–74 95. Schulze RJ, Komar J, Botte M, Allen WJ, Whitehouse S, et al. 2014. Membrane protein insertion and proton-motive-force-dependent secretion through the bacte- rial holo-translocon SecYEG–SecDF–YajC–YidC. PNAS. 111(13):4844–49 96. Komar J, Alvira S, Schulze RJ, Martin R, Lycklama a Nijeholt JA, et al. 2016. 
Membrane protein insertion and assembly by the bacterial holo-translocon SecYEG–SecDF–YajC–YidC. Biochem J. 473(19):3341–54 97. Dalbey RE, Kuhn A, Zhu L, Kiefer D. 2014. The membrane insertase YidC. Bio- chimica et Biophysica Acta (BBA) - Molecular Cell Research. 1843(8):1489–96 98. von Heijne G. 1990. The signal peptide. J. Membrain Biol. 115(3):195–201 99. de Souza GA, Leversen NA, Målen H, Wiker HG. 2011. Bacterial proteins with cleaved or uncleaved signal peptides of the general secretory pathway. Journal of Proteomics. 75(2):502–10 100. Ivankov DN, Payne SH, Galperin MY, Bonissone S, Pevzner PA, Frishman D. 2013. How many signal peptides are there in bacteria? Environ Microbiol. 15(4):983–90 101. Chatzi KE, Sardis MF, Karamanou S, Economou A. 2013. Breaking on through to the other side: protein export through the bacterial Sec system. Biochem. J. 449(1):25–37 102. Auclair SM, Bhanu MK, Kendall DA. 2012. Signal peptidase I: Cleaving the way to mature proteins. Protein Science. 21(1):13–25 103. Kudva R, Denks K, Kuhn P, Vogt A, Müller M, Koch H-G. 2013. Protein translo- cation across the inner membrane of Gram-negative bacteria: the Sec and Tat de- pendent protein transport pathways. Research in Microbiology. 164(6):505–34 104. Palmer T, Berks BC. 2012. The twin-arginine translocation (Tat) protein export pathway. Nature Reviews Microbiology. 10(7):483–96

39 105. Frain KM, Robinson C, van Dijl JM. 2019. Transport of folded proteins by the Tat System. Protein J. 38(4):377–88 106. Lee HC, Bernstein HD. 2001. The targeting pathway of Escherichia coli presecre- tory and integral membrane proteins is specified by the hydrophobicity of the tar- geting signal. PNAS. 98(6):3471–76 107. Kadokura H, Beckwith J. 2009. Detecting folding intermediates of a protein as It passes through the bacterial translocation channel. Cell. 138(6):1164–73 108. Draycheva A, Bornemann T, Ryazanov S, Lakomek N-A, Wintermeyer W. 2016. The bacterial SRP receptor, FtsY, is activated on binding to the translocon. Mo- lecular Microbiology. 102(1):152–67 109. Saraogi I, Akopian D, Shan S. 2014. Regulation of cargo recognition, commitment, and unloading drives cotranslational protein targeting. J Cell Biol. 205(5):693– 706 110. Shen K, Arslan S, Akopian D, Ha T, Shan S. 2012. Activated GTPase movement on an RNA scaffold drives co-translational protein targeting. Nature. 492(7428):271–75 111. Shan S, Chandrasekar S, Walter P. 2007. Conformational changes in the GTPase modules of the signal reception particle and its receptor drive initiation of protein translocation. J Cell Biol. 178(4):611–20 112. Castanié-Cornet M-P, Bruel N, Genevaux P. 2014. Chaperone networking facili- tates protein targeting to the bacterial cytoplasmic membrane. Biochimica et Bio- physica Acta (BBA) - Molecular Cell Research. 1843(8):1442–56 113. Singh R, Kraft C, Jaiswal R, Sejwal K, Kasaragod VB, et al. 2014. Cryo-electron Microscopic Structure of SecA Protein Bound to the 70S Ribosome. J. Biol. Chem. 289(10):7190–99 114. Huber D, Rajagopalan N, Preissler S, Rocco MA, Merz F, et al. 2011. SecA interacts with ribosomes in order to facilitate posttranslational translocation in Bacteria. Molecular Cell. 41(3):343–53 115. Huang C, Rossi P, Saio T, Kalodimos CG. 2016. Structural basis for the antifolding activity of a molecular chaperone. Nature. 537(7619):202–6 116. 
Sala A, Bordes P, Genevaux P. 2014. Multitasking SecB chaperones in bacteria. Front. Microbiol. 5:666 117. Valent QA, Scotti PA, High S, de Gier J-WL, von Heijne G, et al. 1998. The Esch- erichia coli SRP and SecB targeting pathways converge at the translocon. The EMBO Journal. 17(9):2504–12 118. Sachelaru I, Petriman NA, Kudva R, Kuhn P, Welte T, et al. 2013. YidC occupies the lateral gate of the SecYEG translocon and is sequentially displaced by a nas- cent membrane protein *. Journal of Biological Chemistry. 288(23):16295–307 119. Kiefer D, Kuhn A. 2018. YidC-mediated membrane insertion. FEMS Microbiol Lett. 365(12):fny106 120. Deitermann S, Sprie GS, Koch H-G. 2005. A dual function for SecA in the assembly of single spanning membrane proteins in Escherichia coli. J. Biol. Chem. 280(47):39077–85 121. Schiebel E, Driessen AJM, Hartl F-U, Wickner W. 1991. ΔμH+ and ATP function at different steps of the catalytic cycle of preprotein translocase. Cell. 64(5):927– 39 122. Butkus ME, Prundeanu LB, Oliver DB. 2003. Translocon “Pulling” of Nascent SecM controls the duration of its translational pause and secretion-responsive secA regulation. Journal of Bacteriology. 185(22):6719–22 123. Nakatogawa H, Ito K. 2001. Secretion Monitor, SecM, undergoes self-translation arrest in the . Molecular Cell. 7(1):185–92

40 124. Schatz PJ, Beckwith J. 1990. Genetic analysis of protein export in Escherichia coli. Annu. Rev. Genet. 24:215–48 125. Oliver D, Norman J, Sarker S. 1998. Regulation of Escherichia coli secA by cellular protein secretion proficiency requires an intact gene X signal sequence and an ac- tive translocon. Journal of Bacteriology. 180(19):5240–42 126. Silber KR, Keiler KC, Sauer RT. 1992. Tsp: a tail-specific protease that selectively degrades proteins with nonpolar C termini. Proc Natl Acad Sci U S A. 89(1):295– 99 127. Murakami A, Nakatogawa H, Ito K. 2004. Translation arrest of SecM is essential for the basal and regulated expression of SecA. PNAS. 101(33):12330–35 128. Nakamoto H, Bardwell JCA. 2004. Catalysis of disulfide bond formation and isom- erization in the Escherichia coli periplasm. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 1694(1):111–19 129. De Geyter J, Tsirigotaki A, Orfanoudaki G, Zorzini V, Economou A, Karamanou S. 2016. Protein folding in the cell envelope of Escherichia coli. Nature Microbi- ology. 1(8):1–13 130. Manta B, Boyd D, Berkmen M. 2019. Disulfide bond formation in the periplasm of Escherichia coli. EcoSal Plus. 8(2) 131. Rajpal G, Arvan P. 2013. Chapter 236 - Disulfide bond formation. In Handbook of Biologically Active Peptides (Second Edition), ed AJ Kastin, pp. 1721–29. Bos- ton: Academic Press 132. Bechtel TJ, Weerapana E. 2017. From structure to redox: the diverse functional roles of disulfides and implications in disease. Proteomics. 17(6). 133. Ruiz N, Chng S-S, Hiniker A, Kahne D, Silhavy TJ. 2010. Nonconsecutive disulfide bond formation in an essential integral outer membrane protein. Proc Natl Acad Sci U S A. 107(27):12245–50 134. Merdanovic M, Clausen T, Kaiser M, Huber R, Ehrmann M. 2011. Protein quality control in the bacterial periplasm. Annu. Rev. Microbiol. 65:149–68 135. Bocian-Ostrzycka KM, Grzeszczuk MJ, Banaś AM, Jagusztyn-Krynicka EK. 2017. 
Bacterial thiol oxidoreductases — from basic research to new antibacterial strate- gies. Appl Microbiol Biotechnol. 101(10):3977–89 136. Atkinson HJ, Babbitt PC. 2009. An atlas of the thioredoxin fold class reveals the complexity of function-enabling adaptations. PLOS Computational Biology. 5(10):e1000541 137. Guddat LW, Martin JL, Bardwell JCA, Zander T. 1997. The uncharged surface fea- tures surrounding the active site of Escherichia coli DsbA are conserved and are implicated in peptide binding. Protein Science. 6(6):1148–56 138. Frech C, Wunderlich M, Glockshuber R, Schmid FX. 1996. Preferential binding of an unfolded protein to DsbA. EMBO J. 15(2):392–98 139. Nelson JW, Creighton TE. 1994. Reactivity and ionization of the active site cysteine residues of DsbA, a protein required for disulfide bond formation in vivo. Bio- chemistry. 33(19):5974–83 140. Bushweller JH. 2020. Protein disulfide exchange by the intramembrane enzymes DsbB, DsbD, and CcdA. Journal of Molecular Biology. 432(18):5091–5103 141. Tang M, Nesbitt AE, Sperling LJ, Berthold DA, Schwieters CD, et al. 2013. Struc- ture of the disulfide bond generating membrane protein DsbB in the Lipid Bilayer. J Mol Biol. 425(10):1670–82 142. Jander G, Martin N l., Beckwith J. 1994. Two cysteines in each periplasmic domain of the membrane protein DsbB are required for its function in protein disulfide bond formation. The EMBO Journal. 13(21):5121–27 143. Berkmen M. 2012. Production of disulfide-bonded proteins in Escherichia coli. Pro- tein Expression and Purification. 82(1):240–51

41 144. Gottschalk A, Lind PE. 1949. Product of interaction between influenza virus en- zyme and ovomucin. Nature. 164(4162):232–33 145. Hirst GK. 1941. The Agglutination of red cells by allantoic fluid of chick embryos infected with influenza virus. Science. 94(2427):22–23 146. Taylor G. 1996. Sialidases: structures, biological significance and therapeutic po- tential. Current Opinion in Structural Biology. 6(6):830–37 147. Bairoch A, Apweiler R. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28(1):45–48 148. Xu G, Kiefel MJ, Wilson JC, Andrew PW, Oggioni MR, Taylor GL. 2011. Three Streptococcus pneumoniae Sialidases: Three Different Products. J. Am. Chem. Soc. 133(6):1718–21 149. Varki A. 1992. Diversity in the sialic acids. Glycobiology. 2(1):25–40 150. Xiao K, Wang X, Yu H. 2019. Comparative studies of catalytic pathways for Strep- tococcus pneumoniae sialidases NanA, NanB and NanC. Scientific Reports. 9(1):2157 151. Manco S, Hernon F, Yesilkaya H, Paton JC, Andrew PW, Kadioglu A. 2006. Pneu- mococcal neuraminidases A and B both have essential roles during Infection of the respiratory tract and sepsis. Infection and Immunity. 74(7):4014–20 152. Janesch P, Rouha H, Badarau A, Stulik L, Mirkina I, et al. 2018. Assessing the function of pneumococcal neuraminidases NanA, NanB and NanC in in vitro and in vivo lung infection models using monoclonal antibodies. Virulence. 9(1):1521– 38 153. Chandrasekaran A, Srinivasan A, Raman R, Viswanathan K, Raguram S, et al. 2008. Glycan topology determines human adaptation of avian H5N1 virus hemag- glutinin. Nature Biotechnology. 26(1):107–13 154. Owen CD, Lukacik P, Potter JA, Sleator O, Taylor GL, Walsh MA. 2015. Strepto- pneumoniae NanC structural insights into the specificity and mechanism of a sialidase that produces a sialidase inhibitor. J. Biol. Chem. 290(46):27736–48 155. McAuley JL, Gilbertson BP, Trifkovic S, Brown LE, McKimm-Breschkin JL. 2019. 
Influenza virus neuraminidase structure and functions. Front Microbiol. 10:39 156. Colman PM, Tulip WR, Varghese JN, Tulloch PA, Baker AT, et al. 1989. Three- dimensional structures of influenza virus neuraminidase-antibody complexes. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 323(1217):511–18 157. Shtyrya YA, Mochalova LV, Bovin NV. 2009. Influenza virus Neuraminidase: structure and function. Acta Naturae. 1(2):26–32 158. Xu G, Potter JA, Russell RJM, Oggioni MR, Andrew PW, Taylor GL. 2008. Crystal structure of the NanB sialidase from Streptococcus pneumoniae. J. Mol. Biol. 384(2):436–49 159. Gut H, Xu G, Taylor GL, Walsh MA. 2011. Structural basis for Streptococcus pneu- moniae NanA inhibition by influenza antivirals zanamivir and oseltamivir carbox- ylate. J. Mol. Biol. 409(4):496–503 160. Schneewind O, Missiakas DM. 2012. Protein secretion and surface display in Gram- positive bacteria. Philos Trans R Soc Lond B Biol Sci. 367(1592):1123–39 161. Cámara M, Boulnois GJ, Andrew PW, Mitchell TJ. 1994. A neuraminidase from Streptococcus pneumoniae has the features of a surface protein. Infect Immun. 62(9):3688–95 162. Thobhani S, Ember B, Siriwardena A, Boons G-J. 2003. Multivalency and the Mode of Action of Bacterial Sialidases. J. Am. Chem. Soc. 125(24):7154–55 163. Hayre JK, Xu G, Borgianni L, Taylor GL, Andrew PW, et al. 2012. Optimization of a direct spectrophotometric method to investigate the kinetics and inhibition of sialidases. BMC Biochemistry. 13(1):19

42 164. Yang L, Connaris H, Potter JA, Taylor GL. 2015. Structural characterization of the carbohydrate-binding module of NanA sialidase, a pneumococcal virulence fac- tor. BMC Structural Biology. 15(1):15 165. Gut H, King SJ, Walsh MA. 2008. Structural and functional studies of Streptococ- cus pneumoniae neuraminidase B: An intramolecular trans-sialidase. FEBS Let- ters. 582(23):3348–52 166. Baeshen MN, Al-Hejin AM, Bora RS, Ahmed MMM, Ramadan HAI, et al. 2015. Production of Biopharmaceuticals in E. coli: Current Scenario and Future Per- spectives. J. Microbiol. Biotechnol. 25(7):953–62 167. Rosano GL, Morales ES, Ceccarelli EA. 2019. New tools for recombinant protein production in Escherichia coli: A 5-year update. Protein Science. 28(8):1412–22 168. Jia B, Jeon CO. High-throughput recombinant protein expression in Escherichia coli: current status and future perspectives. Open Biology. 6(8):160196 169. Khow O, Suntrarachun S. 2012. Strategies for production of active eukaryotic pro- teins in bacterial expression system. Asian Pacific Journal of Tropical Biomedi- cine. 2(2):159–62 170. Rosano GL, Ceccarelli EA. 2014. Recombinant protein expression in Escherichia coli: advances and challenges. Front Microbiol. 5:172 171. Selas Castiñeiras T, Williams SG, Hitchcock AG, Smith DC. 2018. E. coli strain engineering for the production of advanced biopharmaceutical products. FEMS Microbiol Lett. 365(15):fny162 172. Yoon SH, Han M-J, Jeong H, Lee CH, Xia X-X, et al. 2012. Comparative multi- omics systems analysis of Escherichia coli strains B and K-12. Genome Biology. 13(5):R37 173. Chamberlin M, Mcgrath J, Waskell L. 1970. New RNA Polymerase from Esche- richia coli infected with Bacteriophage T7. Nature. 228(5268):227–31 174. Studier FW, Moffatt BA. 1986. Use of bacteriophage T7 RNA polymerase to direct selective high-level expression of cloned genes. Journal of Molecular Biology. 189(1):113–30 175. Grodberg J, Dunn JJ. 1988. 
ompT encodes the Escherichia coli outer membrane protease that cleaves T7 RNA polymerase during purification. J. Bacteriol. 170(3):1245–53 176. Marisch K, Bayer K, Cserjan-Puschmann M, Luchner M, Striedner G. 2013. Eval- uation of three industrial Escherichia coli strains in fed-batch cultivations during high-level SOD protein production. Microbial Cell Factories. 12(1):58 177. Choi JH, Keum KC, Lee SY. 2006. Production of recombinant proteins by high cell density culture of Escherichia coli. Chemical Engineering Science. 61(3):876–85 178. Carattoli A. 2009. Resistance Plasmid Families in Enterobacteriaceae. Antimicro- bial Agents and Chemotherapy. 53(6):2227–38 179. Hawley DK, McClure WR. 1983. Compilation and analysis of Escherichia coli pro- moter DNA sequences. Nucleic Acids Res. 11(8):2237–55 180. Dunn JJ, Studier FW, Gottesman M. 1983. Complete nucleotide sequence of bacte- riophage T7 DNA and the locations of T7 genetic elements. Journal of Molecular Biology. 166(4):477–535 181. Tegel H, Ottosson J, Hober S. 2011. Enhancing the protein production levels in Escherichia coli with a strong promoter. The FEBS Journal. 278(5):729–39 182. Henderson KL, Evensen CE, Molzahn CM, Felth LC, Dyke S, et al. 2019. RNA Polymerase: Step-by-Step Kinetics and Mechanism of Transcription Initiation. Bi- ochemistry. 58(18):2339–52 183. Deuschle U, Gentz R, Bujard H. 1986. lac Repressor blocks transcribing RNA pol- ymerase and terminates transcription. PNAS. 83(12):4134–37

43 184. Young CL, Britton ZT, Robinson AS. 2012. Recombinant protein expression and purification: A comprehensive review of affinity tags and microbial applications. Biotechnology Journal. 7(5):620–34 185. Sousa R. 2013. T7 RNA Polymerase. In encyclopedia of biological chemistry (Sec- ond Edition), ed WJ Lennarz, MD Lane, pp. 355–59. Waltham: Academic Press 186. Angius F, Ilioaia O, Amrani A, Suisse A, Rosset L, et al. 2018. A novel regulation mechanism of the T7 RNA polymerase based expression system improves over- production and folding of membrane proteins. Scientific Reports. 8(1):8572 187. Hannig G, Makrides SC. 1998. Strategies for optimizing heterologous protein ex- pression in Escherichia coli. Trends in Biotechnology. 16(2):54–60 188. Zhang X, Studier FW. 2004. Multiple roles of T7 RNA polymerase and T7 lyso- zyme during bacteriophage T7 infection. J Mol Biol. 340(4):707–30 189. Studier FW. 1991. Use of bacteriophage T7 lysozyme to improve an inducible T7 expression system. Journal of Molecular Biology. 219(1):37–44 190. de Boer HA, Comstock LJ, Vasser M. 1983. The tac promoter: a functional hybrid derived from the trp and lac promoters. Proc. Natl. Acad. Sci. U.S.A. 80(1):21– 25 191. Neubauer P, Hofmann K, Holst O, Mattiasson B, Kruschke P. 1992. Maximizing the expression of a recombinant gene in Escherichia coli by manipulation of in- duction time using lactose as inducer. Appl Microbiol Biotechnol. 36(6):739–44 192. Yamabhai M, Buranabanyat B, Jaruseranee N, Songsiriritthigul C. 2011. Efficient E. coli expression systems for the production of recombinant β-mannanases and other bacterial extracellular enzymes. Bioeng Bugs. 2(1):45–49 193. Harper S, Speicher DW. 2008. Expression and purification of GST fusion proteins. Curr Protoc Protein Sci. Chapter 6:Unit 6.6 194. Din RU, Khan MI, Jan A, Khan SA, Ali I. 2020. A novel approach for high-level expression and purification of GST-fused highly thermostable Taq DNA polymer- ase in Escherichia coli. 
Arch Microbiol. 202(6):1449–58 195. Fathi-Roudsari M, Akhavian-Tehrani A, Maghsoudi N. 2016. Comparison of Three Escherichia coli strains in recombinant production of reteplase. Avicenna J Med Biotechnol. 8(1):16–22 196. Makino T, Skretas G, Georgiou G. 2011. Strain engineering for improved expres- sion of recombinant proteins in bacteria. Microb Cell Fact. 10:32 197. Englesberg E, Irr J, Power J, Lee N. 1965. Positive Control of Enzyme Synthesis by Gene C in the l-Arabinose System. Journal of Bacteriology. 90(4):946 198. Lee N, Englesberg E. 1962. Dual effects of structural genes in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America. 48(3):335 199. Terpe K. 2006. Overview of bacterial expression systems for heterologous protein production: from molecular and biochemical fundamentals to commercial sys- tems. Appl Microbiol Biotechnol. 72(2):211–22 200. Haldimann A, Daniels LL, Wanner BL. 1998. Use of New Methods for Construc- tion of tightly regulated arabinose and rhamnose promoter fusions in Studies of the Escherichia coli Phosphate Regulon. Journal of Bacteriology. 180(5):1277 201. Jf T, Rf S. 1987. Positive regulation of the Escherichia coli L-rhamnose operon is mediated by the products of tandemly repeated regulatory genes. J Mol Biol. 196(4):789–99 202. Giacalone MJ, Gentile AM, Lovitt BT, Berkley NL, Gunderson CW, Surber MW. 2006. Toxic protein expression in Escherichia coli using a rhamnose-based tightly regulated and tunable promoter system. BioTechniques. 40(3):355–64

44 203. Dvorak P, Chrast L, Nikel PI, Fedr R, Soucek K, et al. 2015. Exacerbation of sub- strate toxicity by IPTG in Escherichia coli BL21(DE3) carrying a synthetic met- abolic pathway. Microbial Cell Factories. 14(1):201 204. Kosinski MJ, Rinas U, Bailey JE. 1992. Isopropyl-β-d-thiogalactopyranoside influ- ences the metabolism of Escherichia coli. Appl Microbiol Biotechnol. 36(6):782– 84 205. Wickstrum JR, Skredenske JM, Balasubramaniam V, Jones K, Egan SM. 2010. The AraC/XylS family activator RhaS negatively autoregulates rhaSR expression by preventing cyclic AMP receptor protein activation. J. Bacteriol. 192(1):225–32 206. Hjelm A, Karyolaimos A, Zhang Z, Rujas E, Vikström D, et al. 2017. Tailoring Escherichia coli for the l-Rhamnose PBAD Promoter-Based Production of Mem- brane and Secretory Proteins. ACS Synth. Biol. 6(6):985–94 207. Guzman LM, Belin D, Carson MJ, Beckwith J. 1995. Tight regulation, modulation, and high-level expression by vectors containing the arabinose pBAD promoter. Journal of Bacteriology. 177(14):4121–30 208. Siegele DA, Hu JC. 1997. Gene expression from plasmids containing the araBAD promoter at subsaturating inducer concentrations represents mixed populations. PNAS. 94(15):8168–72 209. Megerle JA, Fritz G, Gerland U, Jung K, Rädler JO. 2008. Timing and Dynamics of single cell gene expression in the arabinose utilization system. Biophysical Journal. 95(4):2103–15 210. Schleif R. 2010. AraC protein, regulation of the L-arabinose operon in Escherichia coli, and the light switch mechanism of AraC action. FEMS Microbiol Rev. 34(5):779–96 211. Nilsson OB, Hedman R, Marino J, Wickles S, Bischoff L, et al. 2015. Cotransla- tional protein folding inside the ribosome exit tunnel. Cell Rep. 12(10):1533–40 212. Nilsson OB, Müller-Lucks A, Kramer G, Bukau B, von Heijne G. 2016. Trigger factor reduces the force exerted on the nascent chain by a cotranslationally Folding Protein. J. Mol. Biol. 428(6):1356–64 213. 
Nilsson OB, Nickson AA, Hollins JJ, Wickles S, Steward A, et al. 2017. Cotransla- tional folding of spectrin domains via partially structured states. Nat. Struct. Mol. Biol. 24(3):221–25 214. Farías‐Rico JA, Goetz SK, Marino J, Heijne G von. 2017. Mutational analysis of protein folding inside the ribosome exit tunnel. FEBS Letters. 591(1):155–63 215. Farías-Rico JA, Selin FR, Myronidi I, Frühauf M, Heijne G von. 2018. Effects of protein size, thermodynamic stability, and net charge on cotranslational folding on the ribosome. PNAS. 115(40):E9280–87 216. Tian P, Steward A, Kudva R, Su T, Shilling PJ, et al. 2018. Folding pathway of an Ig domain is conserved on and off the ribosome. Proc. Natl. Acad. Sci. U.S.A. 115(48):E11284–93 217. Kudva R, Tian P, Pardo-Avila F, Carroni M, Best RB, et al. 2018. The shape of the bacterial ribosome exit tunnel affects cotranslational protein folding. eLife. 7:e36326 218. Kemp G, Kudva R, de la Rosa A, von Heijne G. 2019. Force-Profile Analysis of the Cotranslational Folding of HemK and Filamin Domains: Comparison of Biochem- ical and Biophysical Folding Assays. J Mol Biol. 431(6):1308–14 219. Marsden AP, Hollins JJ, O’Neill C, Ryzhov P, Higson S, et al. 2018. Investigating the effect of chain connectivity on the folding of a beta-sheet protein on and off the ribosome. Journal of Molecular Biology. 430(24):5207–16 220. Jensen MK, Samelson AJ, Steward A, Clarke J, Marqusee S. 2020. The folding and unfolding behavior of ribonuclease H on the ribosome. Journal of biologival chemistry. 295(33):11410-11417

45 221. Ismail N, Hedman R, Schiller N, von Heijne G. 2012. A biphasic pulling force acts on transmembrane helices during translocon-mediated membrane integration. Na- ture Structural & Molecular Biology. 19(10):1018–22 222. Goldman DH, Kaiser CM, Milin A, Righini M, Tinoco I, Bustamante C. 2015. Me- chanical force releases nascent chain–mediated ribosome arrest in vitro and in vivo. Science. 348(6233):457–60 223. Sone M, Kishigami S, Yoshihisa T, Ito K. 1997. Roles of disulfide bonds in bacterial alkaline phosphatase. J. Biol. Chem. 272(10):6174–78 224. Karyolaimos A, Ampah-Korsah H, Hillenaar T, Mestre Borras A, Dolata KM, et al. 2019. Enhancing recombinant protein yields in the E. coli periplasm by combining signal peptide and production rate screening. Front Microbiol. 10:1511 225. Weiser JN, Ferreira DM, Paton JC. 2018. Streptococcus pneumoniae: transmission, colonization and invasion. Nat Rev Microbiol. 16(6):355–67 226. Mamipour M, Yousefi M, Hasanzadeh M. 2017. An overview on molecular chap- erones enhancing solubility of expressed recombinant proteins with correct fold- ing. Int J Biol Macromol. 102:367–75 227. Baumgarten T, Ytterberg AJ, Zubarev RA, Gier J-W de. 2018. Optimizing Recom- binant protein production in the Escherichia coli periplasm alleviates Stress. Appl. Environ. Microbiol. 84(12):e00270-18 228. Nakatogawa H, Ito K. 2002. The ribosomal exit tunnel functions as a discriminating Gate. Cell. 108(5):629–36 229. Makino T, Skretas G, Georgiou G. 2011. Strain engineering for improved expres- sion of recombinant proteins in bacteria. Microb Cell Fact. 10:32 230. Sonoda H, Kumada Y, Katsuda T, Yamaji H. 2011. Effects of cytoplasmic and periplasmic chaperones on secretory production of single-chain Fv antibody in Escherichia coli. Journal of Bioscience and Bioengineering. 111(4):465–70 231. Rouskin S, Zubradt M, Washietl S, Kellis M, Weissman JS. 2014. Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature. 
505(7485):701–5 232. Burkhardt DH, Rouskin S, Zhang Y, Li G-W, Weissman JS, Gross CA. 2017. Op- eron mRNAs are organized into ORF-centric structures that predict translation ef- ficiency. eLife. 6:e22037

46