Structure-Function and Substrate-Specificity Studies of Escherichia coli YidC

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

in the Graduate School of The Ohio State University

By

Balasubramani Hariharan

Graduate Program in Biophysics

The Ohio State University

2018

Dissertation Committee

Professor Ross E. Dalbey, Advisor

Professor Natividad Ruiz

Professor Charles Bell

Professor Karin Musier-Forsyth


Copyrighted by

Balasubramani Hariharan

2018


Abstract

This dissertation examines the substrate properties of M13 Procoat-Lep that determine the pathway it takes for membrane insertion. It also describes electron paramagnetic resonance (EPR) methods used to explore the solvent exposure of YidC greasy slide residues, to characterize the microenvironment of the hydrophilic groove and thereby probe the role of the strictly conserved positively charged residue, and to measure the distance(s) between two strategic positions within YidC that were not resolved in the structure.

Chapter one of this dissertation reviews targeting and protein translocation pathways in prokaryotes and eukaryotes. Secretory and membrane proteins are synthesized in the cytoplasm and targeted to their destined membranes in multiple ways. Co-translational targeting is achieved either by a direct interaction of ribosomes with the translocation machinery or by ribosome nascent chain complexes that are targeted to the Sec translocon by the signal recognition particle and its receptor FtsY. Post-translational targeting to the SecYEG translocase embedded in the membrane is achieved by cytosolic chaperones and SecA. The heterotrimeric SecYEG moves the majority of substrates into or across the membrane. The cytosolic motor ATPase SecA works along with the trimeric SecYEG complex, providing energy for protein translocation. The ancillary proteins SecDF/YajC also associate with SecYEG to improve translocation efficiency. The substrates that are independent of SecYEG insert via the YidC membrane insertase, which is the primary focus of this dissertation. YidC functions within the cell either as an independent insertase or in concert with the Sec translocon to insert, fold and assemble proteins into the membrane. In addition to the Sec translocase and YidC insertase systems, proteins can be exported by the twin arginine translocation (Tat) pathway, in which proteins are folded prior to export. In this chapter, the structural components and mechanism of insertion are reviewed for each of the export pathways.

Chapter two of this dissertation focuses on the insertion components YidC and SecYEG in the inner membrane of E. coli. Both of these insertases are essential and universally conserved, and together they insert most proteins into the plasma (cytoplasmic) membrane in bacteria. The structure of SecYEG revealed an hourglass-shaped channel with a central pore ring and a lateral gate. Proteins enter the SecYEG complex through the cytoplasmic region; the hydrophilic regions of the substrate are translocated across the channel through the pore ring, while the transmembrane segments of membrane proteins exit the channel through the lateral gate to integrate into the lipid bilayer. The structure of YidC revealed that it possesses a novel hydrophilic groove exposed to both the cytoplasm and the lipid bilayer. Based on structure-function studies of a single-spanning membrane protein, YidC was proposed to catalyze membrane insertion by recruiting the hydrophilic region of the substrate into the hydrophilic groove via an electrostatic interaction. This mechanism would effectively reduce the membrane crossing distance for translocation.

Exactly how YidC would insert more complicated membrane proteins and how it would cooperate with the Sec translocase was not clear based on the structure. Previously, the polarity and charge of the periplasmic regions of membrane proteins were proposed to determine the YidC and Sec translocase requirements for insertion. We have further tested this polarity/charge hypothesis using the bacteriophage M13 coat protein, which is synthesized in a precursor form called procoat with two hydrophobic regions. We found that M13 procoat becomes increasingly YidC/Sec dependent as the periplasmic loop is made increasingly polar in the absence of charged residues. In addition, we discovered that increasing the loop hydrophilicity beyond a certain threshold blocked translocation even with Sec/YidC. However, translocation can be restored by adding hydrophobic residues to the transmembrane segment to increase the driving force for membrane insertion. We also demonstrated that the length of the Procoat-Lep loop is a determinant for Sec dependence. Based on these results, we envision that insertion of proteins occurs at the interface of SecY and the YidC insertase. This hypothesis was corroborated by photo-crosslinking studies, in which the interactions of the substrate with the translocases can be detected. Our results on the YidC and SecYEG biomachineries highlight the universal principles that guide membrane protein biogenesis in all domains of life.

Chapter three of this dissertation focuses on using the biophysical technique electron paramagnetic resonance (EPR) to study YidC. Employing EPR and site-directed spin labeling, we showed that the greasy slide residues 427, 432 and 434 are at the interface of water and lipid. This corroborates our solvent-exposure results based on in vivo NEM alkylation studies of YidC in E. coli but differs from the results of molecular dynamics simulations. Additionally, we used continuous wave EPR and power saturation EPR experiments to probe the spatial hydrophilic environment of YidC residues in the aqueous groove. The groove possesses a conserved positive charge (R366 in E. coli YidC), which was shown to be essential for the translocase activity of YidC in Bacillus subtilis and was proposed to recruit the substrate tail into the groove by an electrostatic attraction mechanism, but the exact function of this positive charge is controversial. Chen et al. provided data that the hydrophilic nature of the positively charged residue is more essential than its electrostatic nature and that the positively charged residue is surprisingly dispensable for translocase activity. Using continuous wave EPR and site-directed spin labeling, we provide data suggesting that the positively charged residue 366 is essential to maintain the hydrophilic nature of the groove when the groove has an apolar residue at its top, at position 517. Lastly, the EPR double electron-electron resonance (DEER) technique was used to estimate distance(s) between residues that are not resolved in the crystal structure of YidC.


Dedicated to my Father

T.S. Hariharan


Acknowledgments

First, I would like to thank my advisor, Dr. Ross E. Dalbey, for his patience, guidance, encouragement and trust throughout the course of my graduate studies.

I would like to thank our collaborator, Professor Andreas Kuhn, for his help and advice on my projects. I would also like to thank all my graduate study committee members, Dr. Charles Bell, Dr. Karin Musier-Forsyth and Dr. Natividad Ruiz, for their valuable advice and guidance. I am also grateful to my former lab members, especially Dr. Seth Hennon, Dr. Lu Zhu and Dr. Raunak Soman, for their advice and training during the initial course of my lab research.

I would like to thank my current lab members Dr. Yuanyuan Chen, Sri Karthika Shanmugam, Haoze He and Margaret Steward for their advice and the camaraderie developed during my stay in the lab. I will always be grateful to my friends in Columbus, Dr. Shibi Likhite and Dr. Krishna Patel, and back home in Chennai, Dr. Aloysius Wilfred Raj, Vinod Ravishankar, Anand Kumar and Deepika Govindarajulu, for their trust and encouragement.

Lastly, but most importantly, I would like to thank my family, Kamatchi Hariharan, Sridevi Hariharan and Karthika Hariharan, who have been my pillars of strength, and my other family members for their love and support. I love you all!


Vita

October 1989 ...... Born in Chennai, India

2011 ...... B.S. Industrial Biotechnology, Anna University

2012 - Present ...... Graduate Teaching and Research Associate, Department of Biophysics, The Ohio State University

Fields of Study: Biophysics


Table of Contents

Abstract

Dedication

Acknowledgments

Vita

Table of Contents

List of Tables

List of Figures

Chapter 1

1.1 Introduction

1.1 Biological membranes

1.2 Membranes in Eukaryotic organisms

1.3 Membranes in Prokaryotic organisms

1.4 Membrane proteins

1.5 Protein targeting

1.5.1 Co-translational targeting

1.5.2 Post-translational targeting

1.6 Sec Translocon

1.6.1 The SecYEG

1.6.3 Mechanism of Co-translational Pathway

1.6.4 Molecular motor protein SecA

1.6.5 Accessory unit SecDF/YajC

1.7 Membrane Insertase YidC

1.7.1 Mitochondrial Oxa1

1.7.2 Plastids Alb3/Alb4

1.7.3 Membrane insertase YidC

1.7.4 Archaeal Duf106

1.7.5 Eukaryotic homologues of YidC

1.8 Twin arginine translocase pathway

1.8.1 Translocase components

1.8.2 Translocase mechanism

1.9 Figures

Chapter 2

Polarity of the translocated region is the primary translocase determinant for M13 procoat membrane protein insertion

2.1 Introduction

2.2 Results

2.2.1 Increasing the polarity of the uncharged PCLep loop causes the protein to become YidC/Sec dependent for insertion

2.2.2 Addition of hydrophobic residues to the highly polar loop decreases the Sec-dependence of insertion

2.2.3 Increasing the hydrophobicity of TM2 rescues membrane insertion of PCLep mutants with highly polar periplasmic loops

2.2.4 Increasing the length of the translocated region increases the Sec dependence of membrane insertion

2.2.5 The Sec-dependent PCLep inserts at the interface of YidC and SecYEG

2.3 Discussion

2.4 Materials and Methods

2.4.1 Materials

2.4.2 Strains, plasmids and growth conditions

2.4.3 Proteolysis Accessibility

2.4.4 Mutagenesis

2.4.5 In vivo photocrosslinking

2.5 Figures

Chapter 3

Structural and Functional studies of YidC using Electron Spin Resonance

3.1 Introduction

3.1.1 Continuous wave-EPR and Pulsed EPR

3.1.2 Double Electron Electron Resonance

3.2 Results

3.2.1 Accessibility studies of YidC residues using EPR

3.2.2 Probing the microenvironment of the hydrophilic groove by SDSL

3.2.3 DEER-EPR can be used to study distances not resolved by crystal structures

3.3 Discussion

3.4 Materials and Methods

3.4.1 Overexpression and purification of YidC

3.4.2 Reconstitution into Liposomes

3.4.3 CW-EPR power saturation experiments

3.4.4 EPR Spectral Simulations

3.4.5 CW Power Saturation Experiments

3.4.6 DEER studies

3.5 Figures

Chapter 4

4.1 Primary findings from work

4.2 Future experiments

List of References

List of Tables

Table 2.1 GES values of the translocation loops for all the mutants tested in the study along with their dependency on YidC and SecYEG for membrane insertion

Table 2.2 Translocation efficiency for membrane insertion of PCLep mutants

Table 3.1 Membrane depth parameter for all the spin-labeled cysteine YidC mutants tested by EPR

List of Figures

Figure 1.1 Simplified representation of a eukaryotic cell showing different organelles

Figure 1.2 Representation of bacterial membranes and proteins in various locations

Figure 1.3 Schematic of the protein translocation and insertion pathways of E. coli

Figure 1.4 Cartoon representation of the molecular motor SecA

Figure 1.5 Cartoon representation of SecYEG from T. thermophilus

Figure 1.6 Ribbon and Space Filling models of E. coli YidC

Figure 1.7 Model of the Holo translocon

Figure 1.8 Structure of the Twin arginine translocon TatABC components

Figure 2.1 Schematic representation of Proteinase K accessibility assay

Figure 2.2 Increase in polarity of the periplasmic loop makes it more Sec-dependent

Figure 2.3 Increase in hydrophobicity of the translocated loop decreases its Sec dependence for membrane insertion

Figure 2.4 Increasing hydrophobicity of TM2 segments rescues translocation of the highly polar periplasmic loops of PCLep

Figure 2.5 Length of the periplasmic loop acts as a positive determinant for membrane protein insertases

Figure 2.6 Pictorial representation of Amber codon TAG mutants of YidC and SecYEG

Figure 2.7 Photo crosslinking studies showing the substrate PCLep interacts with YidC and SecY during translocation

Figure 2.8 Mechanism of insertion of Procoat-Lep at the interface of YidC and SecYEG

Figure 3.1 A schematic representation of the EPR instrument

Figure 3.2 Spin label MTSL structure and reactivity, and EPR lineshape

Figure 3.3 A cartoon representation of the YidC structure highlighting residues tested using EPR

Figure 3.4 A cartoon representation of the structure of YidC highlighting aromatic residues tested using EPR

Figure 3.5 CW EPR spectrum for the spin-labeled YidC mutants tested in DOPC proteoliposomes

Figure 3.6 Power saturation curves of the spin-labeled YidC cysteine mutants in DOPC bilayer at 298K

Figure 3.7 DEER spectrum of the double spin labeled 375/542 YidC mutant in DOPC proteoliposomes

Figure 3.8 Purification and Spin labeling procedure of YidC mutants with MTSL

CHAPTER 1

Introduction

1.0 Biological membranes

Life originated in an aqueous environment in which chemical reactions and cellular and subcellular processes evolved. Membranes had already developed in the earliest life forms, since they are necessary to sequester cellular processes.

Membranes, in biology, are thin, pliable, sheet-like layers that form the outer boundary of a living cell or of internal cellular compartments. The outer boundary is called the plasma membrane, and the internal cellular compartments are enclosed by organellar membranes. Notably, some organelles have more than one membrane. The cell membrane has two fundamental functions: first, to be a barrier keeping the constituents of the cell in and toxic substances out and, second, to be a gate allowing transport of essential nutrients into the cell and movement of waste products out of the cell.

A typical membrane consists of lipid molecules with proteins embedded in it or associated with it. These lipid moieties are often amphipathic, signifying that they possess both polar and nonpolar groups. Bilayer formation is energetically favorable because the polar domains of the lipid molecules lie at the surfaces of the membrane, where they interact with water, while the nonpolar domains orient themselves facing each other, away from the aqueous environment on the two sides (in the case of the plasma membrane, the cytoplasm and the external milieu). The proteins in membranes can play many different roles depending on the biological function of the membrane. For example, membranes of the myelin sheath act as insulators, the mitochondrial inner membrane plays key roles in energy production, and the exterior membrane of the cell is particularly crucial for signal transduction and for maintaining cellular integrity. Membrane proteins are of two types: peripheral membrane proteins, which are associated with the membrane surface, and integral membrane proteins, which are embedded in the membrane.

In living organisms, cells are broadly divided based on whether they have a membrane-enclosed nucleus. Eukaryotic cells contain a membrane-bound nucleus as well as several organelles, while prokaryotic cells do not. Other differences in the cellular structure of prokaryotes and eukaryotes include, among other things, the presence of mitochondria, chloroplasts, the Golgi apparatus and lysosomes, and the structure of chromosomal DNA in eukaryotes.

1.1 Membranes in Eukaryotic organisms

1.1.1 Nuclear Membrane

The nucleus distinguishes multicellular and unicellular eukaryotic organisms from prokaryotic eubacteria and archaea. The boundary of the nucleus is formed by the nuclear envelope (NE). It consists of two membranes: the outer nuclear membrane (ONM), which is continuous with the endoplasmic reticulum (ER) and faces the cytoplasm, and the inner nuclear membrane (INM), which encloses the nucleoplasm (Fig. 1.1). These two membranes are fused at various sites by the nuclear pore complexes (NPC) (1). Transport of proteins into the nucleus occurs through the NPC in a diffusion-dependent manner assisted by interactions with the FG (phenylalanine-glycine) repeats (2).

1.1.2 Endoplasmic Reticulum

The endoplasmic reticulum (ER) is the largest membrane-delineated intracellular compartment within eukaryotic cells. It is divided into two regions: the rough ER, which has the capacity to bind ribosomes, and the smooth ER, which lacks the ability to bind ribosomes. The ER performs many essential cellular functions, including lipid biosynthesis, signaling, and the biogenesis of the outer nuclear membrane (see above). It also has roles in the biogenesis of the Golgi apparatus and in the transport of lipids to mitochondria through contact sites. Membrane proteins and secreted proteins are synthesized at the rough ER and are transported to their destined membranes either via mitochondrial contact sites or via the vesicular trafficking route from the smooth ER through the Golgi apparatus. The rough ER is also involved in glycosylation and quality control of proteins.

1.1.3 Golgi apparatus

The Golgi apparatus is composed of distinct compartments called cisternae that are layered on top of one another to form the Golgi stack. In most organisms the cisternae are stacked, while in certain eukaryotes they exist as discrete units (3) (Fig. 1.1). The Golgi network, at the heart of the secretory pathway, receives proteins from the smooth ER by a vesicle-mediated mechanism and delivers them to their destined location. The vesicles, upon reaching the cisternae, fuse with the cis side of the stack. Following delivery of the material to the cis-Golgi cisternae, proteins transit across the Golgi stack, where they undergo modifications, and are then further sorted and packaged for delivery to various post-Golgi compartments. This can include sorting to the secretory vesicles, the plasma membrane, or the endosomal/lysosomal/vacuolar system (4).

1.1.4 Mitochondria

Mitochondria and plastids (chloroplasts) evolved from free-living bacteria via symbiosis within a eukaryotic host cell (Margulis 1970). Mitochondria have two membranes, the outer and inner mitochondrial membranes, and two aqueous spaces: the internal space called the matrix and the intermembrane space, as depicted in Fig. 1.1. The outer membrane has been shown to contact the endoplasmic reticulum at certain sites that facilitate transport of lipids (5). Most mitochondrial proteins are nuclear encoded; they are targeted to mitochondria and imported using special protein complexes in the membranes, namely TOM and TIM (translocase in the outer membrane/translocase in the inner membrane). Targeted to the mitochondria by their N-terminal signal sequence, the nuclear encoded cargo is translocated from the cytosol into the intermembrane space of mitochondria by the channel-forming TOM-complex proteins (6). After translocation the pathway diversifies: β-barrel proteins are sorted to the SAM complex, which inserts the cargo into the outer membrane; proteins with six alpha-helical TM segments (the carrier proteins) are sorted to the TIM22 complex for insertion into the inner membrane; other proteins are sorted to the TIM23 complex, where they can insert into the inner membrane if they possess a membrane anchor or be imported into the aqueous matrix (7); and proteins characterized by cysteine motifs are directed to the mitochondrial intermembrane space assembly (MIA) pathway, which folds substrates using Mia40 and the sulfhydryl oxidase Erv1 (8). The remaining few mitochondrially encoded proteins are synthesized in the matrix, targeted to the mitochondrial inner membrane co-translationally, and inserted into the membrane by a membrane insertase such as Oxa1, which will be reviewed later in this chapter (7,9). It should be noted that TIM23-dependent import of proteins into the matrix requires both ATP hydrolysis and the membrane electrical potential.

1.1.5 Chloroplast

As explained above, the endosymbiotic cyanobacterium living inside a eukaryotic cell eventually evolved into a true organelle, the chloroplast. Almost 98% of chloroplast proteins are nuclear encoded, synthesized in the cytoplasm and then transported to the organelle. Chaperones and molecular machineries responsible for recognition and transport of the chloroplast proteins are found in the cytoplasm as well as in the chloroplast membrane. Transport to the chloroplast is highly regulated to avoid promiscuous targeting to mitochondria or to the ER after being recognized by the cytosolic signal recognition particle (10). Chloroplasts have three types of membranes: the outer envelope membrane (OEM) and the inner envelope membrane (IEM), which are separated by the intermembrane space, and the thylakoid membrane. Photosynthesis is carried out at the thylakoid membrane, an energy-transducing internal membrane system that possesses photosynthetic membrane protein complexes (11). Substrates transported to the thylakoid lumen have to traverse five compartments, namely the OEM, intermembrane space, IEM, stroma and thylakoid membrane (10). Cytosolic chaperones of the Hsp70 and Hsp90 families facilitate the transport of unfolded proteins from the cytosol to the TOC complex on the outer membrane of the chloroplast (12,13). At the OEM, preproteins interact with the essential TOC receptors Toc34/Toc159 and are imported into the intermembrane space by the essential channel protein Toc75 (14). The Tic20/21 proteins of the TIC complex form a channel at the inner membrane to import the proteins into the stroma, with the Tic22 protein chaperoning the preprotein in the intermembrane space (15). The TOC and TIC complexes physically interact in a dynamic manner to import the cargo into the stroma in an ATP-dependent manner involving Hsp70 and Hsp93. After delivery to the stroma, the preprotein is cleaved by the stromal processing peptidase, releasing the stromal signal peptide, and the thylakoid-destined proteins are inserted by special insertases such as Alb3/Alb4 and SecYE, which are reviewed in a later section.

1.1.6 Peroxisomes

Peroxisomes are small organelles, ubiquitous in eukaryotic cells, that are surrounded by a single membrane and specialized in oxidative metabolic reactions such as the production and degradation of hydrogen peroxide. They are devoid of nucleic acids and ribosomes and therefore must import both the peroxisomal matrix and membrane proteins from the cytosol. Peroxisomes can multiply by growth and division from pre-existing peroxisomes (16) and are also capable of forming de novo from the ER in cells lacking peroxisomes. Peroxisome membrane proteins (PMPs) are inserted into the peroxisomal membrane with the help of three peroxin proteins. PEX19, essential for peroxisome protein sorting, transports cargo from the cytosol and interacts with its membrane-bound receptor PEX3 (17). The universal protein PEX3, together with PEX19, inserts tail-anchored proteins (18) and other PMPs into the peroxisome membrane. The third protein, PEX16, which is present in mammals but absent in S. cerevisiae, is responsible for insertion of PMPs into peroxisome membranes with the help of PEX19 (19). Notably, import of the matrix proteins from the cytosol into the peroxisome is facilitated by the PEX5 (20) and PEX7 proteins. The cargo interacts with these proteins via the PTS1 and PTS2 signals (peroxisomal targeting signals) present at their C- and N-termini, respectively, and is targeted to the membrane, where PMPs facilitate the transport of cargo into the luminal compartment in an ATP-independent manner (19).

1.2 Membranes in prokaryotes

Bacteria lack an enclosed nucleus, but they have a plasma membrane surrounding the cytoplasm. They are broadly defined by the physical properties of the outer layer of their cell structure. In most cases, bacteria are classified by the outcome of Gram staining. The stain is either retained by the peptidoglycan, giving cells a purple color, or washed out. In Gram-positive bacteria the stain is retained, while in Gram-negative bacteria it is not. Gram-positive bacteria have a single plasma membrane surrounded by a thick outer cell wall composed of peptidoglycan (monoderm). In contrast, Gram-negative bacteria have two membranes with a peptidoglycan layer in between (diderm cell envelope) (Fig. 1.2). The outer membrane of Gram-negative bacteria displays a variety of proteins that are involved in drug resistance and flagellar motion, as well as porins that import several essential nutrients.

1.3 Membrane proteins


Proteins synthesized in a cell are broadly classified into soluble and membrane proteins. The structures of membrane proteins differ from those of soluble proteins with regard to the distribution of hydrophobic and hydrophilic groups. Soluble proteins perform their functions in the cytoplasmic milieu (Gram-positive and Gram-negative bacteria) and in the periplasm (Gram-negative bacteria), whereas membrane proteins can either associate with membranes via electrostatic interactions with polar groups on the membrane surface or be embedded to various extents in the hydrophobic core of the membrane.

Almost 30% of the cellular proteome is composed of membrane proteins. These proteins are responsible for executing essential functions in signal transduction, energy production, and nutrient transport. The membrane-embedded segments of a typical membrane protein consist of either transmembrane alpha helices, which assemble into a helix bundle, or antiparallel β-strands, which form barrel-shaped units. β-barrel structures are found in the outer membranes of bacteria, mitochondria and chloroplasts. All other membranes have proteins of the helical type. The number of transmembrane segments (TMS) within membrane proteins varies from 1 to 12 in alpha-helical membrane proteins and from 8 to 22 β-strands in β-barrel integral membrane proteins.

Except for proteins synthesized within mitochondria and chloroplasts, almost all proteins are synthesized in the cytoplasm and are targeted to their destined membranes to function. There are special proteins within each membrane that help in the transport of polypeptide chains across membranes. These special proteins can be organized into three categories: 1) channels, through which small molecules are passively or actively transported across the membrane; 2) exporters, with channel-like structures, that export or secrete proteins across the membrane into the periplasm (Gram-negative), the intermembrane space (mitochondria, chloroplast) or the extracellular space (Gram-positive) (Fig. 1.2); and 3) insertases, which catalyze the insertion of proteins into the membrane. This thesis focuses on the events occurring at the inner membrane of the Gram-negative bacterium E. coli, in which all the membrane proteins are alpha-helical.

In order for membrane proteins to function, they have to be properly inserted and folded in the membrane. This process occurs in three stages: targeting, insertion, and folding. In the cytosol, as the protein is being synthesized, the amino-terminal end of the protein exits from the ribosome. This ribosome-protein complex is called the ribosome nascent chain complex (RNC). The nascent polypeptide chain typically possesses a signal peptide (if it is an exported protein) or a TM segment (if it is a membrane protein), which is then recognized by the signal recognition particle (SRP). The complex is usually in close proximity to the membrane, where the SRP receptor contacts the SRP-RNC complex. Depending on the targeting sequence, the protein is either translocated across the membrane or inserted into the lipid bilayer. Once in the membrane, the protein folds into its native conformation before it assembles into a quaternary structure if it is part of a multisubunit protein. Notably, there are quality control pathways in the cell that recognize and degrade misfolded proteins in case any step fails.

1.4 Protein Targeting

A major challenge for cells is to ensure the proper folding and targeting of newly synthesized proteins to the different cellular compartments. Membrane proteins are synthesized in the cytosol and need to be targeted to their destined membranes to carry out their respective functions. There are special membrane protein complexes, termed translocases, which either export proteins into the periplasm or insert them into the inner membrane. We will discuss in detail the mechanism by which membrane targeting is achieved and the nascent chain is handed over to the translocases.

Targeting is achieved by a signal peptide, which is usually 20-30 amino acids in length and has three distinct regions: 1) an N-terminal region (N-domain), which has 1-3 positively charged amino acids; 2) a central hydrophobic region containing 10-15 residues (H-domain); and 3) a hydrophilic polar region at the C-terminus containing the residues important for signal peptide cleavage (C-domain) (21). The signal sequence is usually conserved and can be predicted from its characteristic tripartite structure.

Most signal peptides attached to exported proteins span the membrane prior to being proteolytically removed by leader peptidase (signal peptidase), a novel membrane-bound serine protease whose C-terminal catalytic domain is located in the periplasm (22,23). Leader peptidase substrates usually have characteristic small, neutral residues at the -1 and -3 positions, such as an Ala-X-Ala site, at the end of their signal peptide. While most secretory proteins have a cleavable signal peptide, many integral membrane proteins do not possess one, but rather have a hydrophobic TM segment that serves as an uncleaved signal peptide for targeting.
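The -1/-3 rule and the positively charged N-domain described above can be illustrated with a minimal Python sketch. The set of residues treated as "small and neutral" (Ala, Gly, Ser), the five-residue N-domain window, the function names and the example sequence are all assumptions made for illustration; they are not taken from the cited studies and this is not a validated signal peptide predictor.

```python
# Illustrative sketch only. The residue set and the N-domain window below are
# assumptions chosen for this example, not values from the text.
SMALL_NEUTRAL = set("AGS")   # assumed "small, neutral" residues
POSITIVE = set("KR")         # lysine and arginine


def follows_minus3_minus1_rule(signal_peptide: str) -> bool:
    """Return True if positions -3 and -1 (the last and third-to-last
    residues of the signal peptide) are both small and neutral."""
    if len(signal_peptide) < 3:
        return False
    return (signal_peptide[-3] in SMALL_NEUTRAL
            and signal_peptide[-1] in SMALL_NEUTRAL)


def n_domain_positive_charges(signal_peptide: str, n_domain_length: int = 5) -> int:
    """Count Lys/Arg residues in the first n_domain_length residues,
    which are taken here (by assumption) to be the N-domain."""
    return sum(residue in POSITIVE for residue in signal_peptide[:n_domain_length])


# Hypothetical 21-residue signal peptide ending in Ala-Gln-Ala.
example = "MKKTAIAIAVALAGFATVAQA"
print(follows_minus3_minus1_rule(example))
print(n_domain_positive_charges(example))
```

A real analysis would locate the N-, H- and C-domain boundaries from the sequence itself rather than use a fixed window.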

Proteins are targeted to the membrane in two ways: 1) by co-translational targeting, where the proteins are translocated across the membrane as they are being synthesized; and 2) by post-translational targeting, where the proteins are synthesized in the cytosol, interact with a cytosolic chaperone such as SecB, and are then targeted to the membrane (24). When the nascent protein is being synthesized and the signal peptide is exposed from the exit tunnel of the ribosome, the signal recognition particle (SRP) and trigger factor compete to bind to the signal peptide (25,26). If the peptide is hydrophobic enough, SRP dominates and targets the protein to the translocon co-translationally. On the other hand, if the signal peptide is not hydrophobic enough, trigger factor takes over and, as the nascent chain grows longer, it binds to the SecB chaperone, which keeps the protein in a translocation-competent conformation such that it can be delivered post-translationally to the translocon for its transport (27). We will look at both of these targeting pathways in detail in the following sections.

1.5 Co-translational targeting

1.5.1 Signal Recognition Particle

The signal recognition particle (SRP) involved in targeting is a universally conserved ribonucleoprotein complex that has both a protein component and RNA. It mediates the delivery of ribosome-nascent chain complexes (RNCs) from the cytosol to the protein translocation machineries in the endoplasmic reticulum membrane in eukaryotes (Sec61αβγ complex) or plasma membrane in prokaryotes (SecYEG complex).

As stated previously, it recognizes a signal peptide of the exported protein or a hydrophobic segment of the membrane protein after it is exposed out of the ribosome exit tunnel. In eukaryotes, SRP is composed of six proteins and one 7S RNA molecule, while in bacteria SRP has a single protein, Ffh (homologous to SRP54 in eukaryotes) bound to either a 4.5S RNA in Gram-negative bacteria or 6S RNA in Gram-positive bacteria.

SRP scans the ribosome with low affinity even before the nascent chain reaches the exit tunnel and interacts with the ribosomal proteins L23 and L29. This is called the stand-by mode. When the nascent chain reaches the exit site, SRP forms a high affinity complex by rearranging itself to bind the signal peptide with a low Kd. Soon after forming this high affinity complex, the nascent chain is delivered to its receptor at the membrane to form a quaternary complex. The receptor SR (SRP Receptor in eukaryotes) or FtsY (in prokaryotes) then transfers the ribosome nascent chain to the translocon by a series of steps involving the catalysis of GTP hydrolysis.

Bacterial SRP consists of Ffh (SRP54 homolog) and a hairpin structured 4.5S RNA. Ffh has three regions: the N-terminal domain (N), the GTPase domain (G) and a methionine-rich (M) domain. The Ffh N/G domains interact with the L23 and L29 proteins within the ribosome, and its M domain extends into the ribosomal exit tunnel to form a complex with the hydrophobic region of the signal peptide. The ability of SRP to switch between low- and high-affinity binding to the ribosome enables it to scan all the ribosomes even as the concentration of ribosomes reaches a hundred times that of SRP in a cell. The recruitment of the SRP receptor, FtsY, additionally regulates the translating complex. FtsY consists of an N domain, a GTPase domain, and an A domain (consisting of an N-terminal acidic region in the E. coli FtsY). Stable intermediate complexes have been shown between SRP-FtsY-RNC with the Sec translocon. SRP and FtsY interaction through their N/G domains requires bound GTP and, following GTP hydrolysis, the RNC is transferred from the targeting complex to the translocon. The exact mechanism of the latter step is still unclear. What is evident, however, is that the complexes have mutual interacting regions with the translocon and it is a concerted process where FtsY and SRP come off the RNC and the nascent chain binds to the translocon for membrane protein insertion/secretion. The binding of the hydrophobic


TM of membrane protein substrate and the ribosome to the translocon triggers movement of the plug domain and allows opening of the channel. The hydrophobic domains of the membrane protein can then be translocated across the channel to the periplasm or the protein substrate can laterally exit into the membrane through the lateral gate. It should be noted that some Sec independent membrane proteins have been shown to be targeted independent of the SRP/FtsY system to YidC although its exact mechanism is still unclear.

One of the substrates called Procoat-Lep gets inserted in this fashion and will be discussed in detail later in Chapter 2.

1.5.2 Post translational targeting

Post-translational targeting occurs when the protein is targeted to the membrane after most or all of the protein has been synthesized by the ribosome. In a well characterized organism like E. coli, there are about 400-500 secretory proteins that are targeted to the Sec translocon post-translationally. They are targeted to the membrane with the help of the motor ATPase SecA, and targeting is achieved, depending on the exported protein, with or without the assistance of cytosolic chaperones such as trigger factor or SecB.

1.5.2.1 Trigger Factor

Trigger factor (TF) is a non-essential ATP-independent chaperone expressed ubiquitously in bacteria. TF is dimeric in nature, composed of an N-terminal domain (that contacts the L23 region of the ribosome), a peptidyl-prolyl isomerase (PPIase) domain and a C-terminal domain (that extends into the exit tunnel interacting with the nascent chain of the protein being translated). The recognition sites for TF are extended hydrophobic sequences flanked by positively charged residues on both sides. This same sequence is recognized by SRP although with low affinity. Hence both SRP and TF scan all the actively translating nascent chains simultaneously. Multiple TFs can bind the hydrophobic patches of a continuously synthesized protein. Binding is achieved by the PPIase and C-terminal domains of trigger factor. Binding prevents the protein from aggregating and keeps it in a soluble translocation-competent state. TF interacts with outer membrane proteins (OMPs) and periplasmic proteins. In the absence of TF, the export of several OMPs and periplasmic proteins is significantly perturbed (28). Remarkably, a substantial number of these TF-interacting exported substrates are substrates for SecB. This includes precursors of OmpA, OmpC, OmpF, LamB, PhoE, TolC, DegP, FkpA, OppA, Bla, and MBP (29). It is still unclear how TF hands over the substrates to cytosolic SecB and from SecB to SecA.

1.5.2.2 Chaperone SecB

SecB is a non-essential ATPase independent foldase that keeps the pre-proteins in a soluble and unfolded state by interacting with the hydrophobic motifs. It is only found in proteobacteria. SecB structurally exists as a dimer of dimers and rearranges itself upon binding the exported protein in an extended fashion, preventing the preprotein from aggregating. It has a higher affinity towards the RNC, which accounts for the exchange from trigger factor to SecB, and has low affinity for the tertiary structures of proteins. This difference in affinity serves as a first round of screening for the non-specific substrates that reach SecA.

The SecB monomer is composed of four stranded antiparallel β-sheets (the first two strands being at opposite sides and connected by a cross over loop) and two α-helices separated by a helix connecting loop. The SecB dimer is formed via interactions between strands β1 and helices α1 of two monomers. The tetramer forms by packing the helices α1 of four monomers in between the eight-stranded antiparallel β-sheets formed by each dimer. The packing is facilitated by polar interactions. Two peptide binding grooves are present on each side of the SecB tetramer, each allowing the binding of ∼20 amino acids in an extended conformation. The fact that SecB can bind approximately 150 residues of the pre-protein substrates (30) suggests that the bound polypeptide might wrap around the chaperone using several possible routes.

1.5.2.3 Chaperone DnaKJ

DnaK is one of the most abundantly expressed chaperones in the cytosol whose expression is induced in response to several chemical and physical stresses. DnaK is a 638 amino acid long protein that chaperones almost 600 proteins in E. coli. DnaK has three domains: N-terminal nucleotide binding domain (NBD) where the ATPase activity resides; a C-terminal substrate binding domain (SBD) and a linker that relays allosteric information across both domains. The open/closed state of SBD is controlled by the occupancy of nucleotide at NBD. The nucleotide binding is further regulated by a protein called DnaJ.

This family of proteins contains a 70 amino acid consensus sequence known as the J domain. The J domain of DnaJ interacts with DnaK and stimulates its ATPase activity.

DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Another co-chaperone called GrpE mediates the dissociation of ADP and the subsequent binding of a new ATP that triggers substrate release from DnaK and resets the chaperone cycle.


Mutations in SecB and DnaK (or DnaJ) exhibit synthetic lethality, and the expression of DnaK is upregulated in the absence of SecB, and vice versa (31,32). Studies performed on SecB substrates like OmpT, OmpA, OmpF, and DegP using null strains of SecB show that these proteins aggregate in the cytosol when DnaKJ are depleted. Both SecB and DnaK have been shown to have common binding domains on proteins. These data suggest that both chaperones could work in concert to assist in the post-translational export of Sec substrates (33). The physical interaction between SecB and DnaK in vivo also corroborates this hypothesis.

1.5.2.4 Chaperone GroEL-GroES

The ATP-dependent chaperonin GroEL/GroES is an essential and well-characterized member of the Hsp60/Hsp10 chaperone family. Together they provide a protected environment and assist in the folding of polypeptides, generally up to 60 kDa, so they can attain their native structures. GroEL forms a barrel-shaped complex composed of two heptameric rings assembled back-to-back (34). GroEL is made up of three domains: 1) the equatorial domain, which is responsible for intra- and inter-subunit interactions and for nucleotide binding; 2) the apical domain, which is involved in both substrate and GroES binding; and 3) the intermediate domain, which relays conformational changes between the other two domains. The two rings form a wide cavity where the chaperone interacts with the protein via its hydrophobic patches. The GroEL folding cavity can be closed by a heptameric GroES co-chaperone lid, which allows confinement of the polypeptide as shown in Figure 1.1. Two main folding models are proposed in the field: the Anfinsen cage model, in which the central cavity acts as a passive cage so that folding can occur unperturbed and free of aggregation, and the iterative annealing model, which proposes an active role of the cavity through repeated unfolding events (successive binding and release cycles) to reverse kinetically trapped folding intermediates and thereby enhance folding.

The substrates of SecB were identified to interact with GroEL in vivo. These include the outer membrane proteins such as OmpA, OmpC and OmpF, and periplasmic proteins OppA and YncE (35,36). Apart from these experiments, over-expression of GroEL and GroES rescues secB null strains and vice versa, even though mutations in GroES/GroEL do not exhibit any defects with the chaperoning of substrates. These results highlight the interactions between SecB and GroEL/GroES with proteins to be exported, where they prevent them from aggregating in the cytoplasm. Also, the discovery that GroEL is efficiently targeted to membrane-bound SecA in vitro strongly supports a link between the chaperone and the Sec translocon (34,37).

1.6 The Sec Translocon

The Sec translocon is an essential, universally conserved heterotrimeric complex consisting of three proteins: SecYEG in prokaryotes and Sec61αβγ in eukaryotes. In addition, the Sec translocon has accessory subunits such as SecA and the SecDF/YajC complex. SecA is essential for translocation of many exported proteins in bacteria while SecDF/YajC improves the efficiency of translocation. These proteins were identified using a genetic screen where the translocase function was mutated (i.e. the sec or prl mutations). The Sec translocon can perform two kinds of functions: 1) the translocase function can transport the substrate completely across the membrane or a hydrophilic domain of a membrane protein from the cytoplasm and 2) the insertase function can insert substrates into the lipid bilayer. SecYEG inserts/translocates substrate in either an energy independent fashion by itself or in an ATP-dependent fashion with the help of SecA. We will review the structure of the individual proteins and how they work in concert to perform translocation in the following sections.

1.6.1 The SecYEG

The first crystal structure of SecYEG was solved by Rapoport and colleagues in the year 2004 from Methanococcus jannaschii at 3.2 Å resolution. The structure revealed that a single protomer of SecYEG forms a channel. SecY forms the main channel forming unit that has a classic hourglass structure where TM 1-5 and TM 6-10 form two symmetric bundles held together by a linker. It forms a channel that is open to both the periplasm and the cytoplasm. At the center of the hour glass is a pore ring with a diameter of 4 Å. The pore ring is formed by 6 hydrophobic residues. A short helix TM2a keeps the pore ring closed and is called a plug domain. The plug functions to maintain the integrity of the membrane and preserve the permeability barrier. It has been shown that deleting the plug domain does not result in significant defects in protein translocation (38). However, electrophysiology experiments have shown that plug deletion mutants are perturbed in their membrane permeability and reveal fluctuations between the open and closed state of the translocon (39). Studies by Li et al. show that in the absence of the plug domain, neighboring SecY loops can partly substitute for it so that there is only a small perturbation.

However, permanently displacing the plug domain is toxic to E. coli as was shown by disulphide cross-linking experiments (40).


There are a total of 12 structures of the Sec translocon solved so far using X-ray crystallography and Cryo-EM with different partners and from various organisms. In the year 2008, both Rapoport and coworkers (Figure 1.3) and Ito and coworkers published structures of SecY in a new conformation from Thermotoga maritima and Thermus thermophilus respectively. In 2014, Park et al. solved the structure of an active Sec translocon with a polypeptide embedded within the channel using Cryo-EM and molecular modeling (41). In this study, SecY along with a ribosome nascent chain was purified and frozen to generate density maps of 11 Å resolution using Cryo-EM. It was shown that the signal peptide gets placed at the lateral gate. This causes significant rigid body movements in the two halves of the SecY channel that allow further substrate to enter the channel. These structures revealed a possible mechanism of the Sec translocon: in the case of translocation, the substrate enters the channel through the pore ring and exits the translocase towards the periplasm, whereas in the case of insertion, the substrate's TM segment gets recognized due to its hydrophobicity and leaves SecY via the lateral gate between TM2 and TM7. The structure of SecYEG with SecA, solved at 4.5 Å resolution by Zimmer et al., showed that SecA interacts with the SecY cytoplasmic loops on the membrane surface and that TM2 and TM7 of SecY are slightly apart from their original position in the structure of a resting SecYEG. Similar movement of the TM segments was observed in the structure of Sec with a translating ribosome by Jomaa et al. (42). The cytoplasmic loops of SecY between TM4/5 and TM5/6 are exposed and were shown to contact ribosomes and trigger factors.


The second subunit, SecE, is a 14 kDa essential 3TM protein that forms a clamp around SecY by wrapping around the two sides via its TM segment and cytoplasmic tail to stabilize the complex. Much of the N-terminal part of E. coli SecE can be deleted without compromising its function (43,44), which suggests that at least one half of the proposed clamp is not required for translocon function.

The third subunit of the Sec translocon in eukaryotes and archaea is Sec61β, while a distinct protein SecG constitutes the third subunit of the bacterial Sec translocon (45).

The SecG subunit in E. coli is a 12 kDa protein with 2 TM helices that occupies a position close to the N-terminal half of SecY (46,47). It is not an essential protein, but it was shown to improve translocation efficiency when the proton motive force is compromised.

1.6.2 Oligomeric state of SecYEG

The functional unit of the Sec translocon has been shown to be a single SecYEG protomer both in vitro and in vivo. Nevertheless, higher oligomeric states have been reported using crosslinking and electron microscopic studies. The relevance of these states is not yet completely understood. Significantly, two forms of dimers have been reported. One dimer has two SecY units interacting at the interface with the SecE units facing in a “back to back” conformation as observed by cryo-EM. Another dimer shows two SecY units interacting along their lateral gate with TM2 and TM7 potentially forming a large pore as shown using crosslinking studies (48). Park and Rapoport in 2012 have shown that in a resting state both the conformations are possible with similar probabilities.

1.6.3 Mechanism of Co-translational pathway


Once RNCs are targeted to the inner membrane via the SRP-FtsY pathway, the translating ribosome aligns with the SecY channel (49-51) via the C4 and C6 loops of SecY as explained previously (52,53) and the ribosomal proteins L23, L24 and L29. Additionally, contacts have been shown between the C4 and C5 loops of SecY and the 23S rRNA in the ribosomal exit tunnel. Interestingly, ribosomes contact lipids, which prepare the membrane environment in front of the lateral gate of the SecY translocon for protein insertion. These contacts are believed to result in the switching of the translocon from the closed to the pre-open state, where the lateral gate partially opens, but the central channel remains closed by the plug domain. This process is called priming of the Sec translocon (54).

As the protein inserts into the SecYEG channel, the central pore in SecY also starts to widen (55). The signal anchor (SA) sequence of the inner membrane protein (IMP) then inserts into the SecY channel and gets inserted into the lateral gate at helices 2b and 7 in SecY. These changes result in displacement of the plug domain and switch the translocon to the open state. The SA of the protein then exits via the lateral gate and inserts into the lipid bilayer laterally.

1.6.4 Molecular motor protein SecA

SecA is a 100 kDa molecular motor protein that is essential for the translocation of proteins across the inner membrane. It belongs to the Superfamily 2 of DExH/D (Asp–Glu–X–His/Asp) proteins, which include various helicases and nucleic acid modifying enzymes. It exists in the cytosol as a soluble chaperone as well as a membrane bound complex. SecA binding to the membrane is facilitated by the negatively charged lipids, similar to the binding of FtsY (34). It interacts with the cytosolic loops of SecY with much higher affinity than the pre-protein and is known as the cytosolic receptor for the Sec translocon. It has a characteristic ATPase domain by which it performs repeated hydrolysis steps providing the energy for translocation of substrates across the membrane.

Structurally, SecA has six distinct domains as shown in Figure 1.2 (56,57). The N-terminal nucleotide-binding domain 1 (NBD1) together with a second nucleotide-binding domain 2 (NBD2) performs the ATPase function of SecA (58). The NBD1/NBD2 interface has a binding site for an ATP molecule, which upon hydrolysis triggers conformational changes in the motor domain as well as in the peptide-binding domain (PBD; also called pre-protein cross-linking domain, PPXD). This PBD has an antiparallel β-strand and a globular region, both of which are involved in substrate and Sec translocon binding (33). There is a helical scaffold domain (HSD), which together with NBD2 forms a substrate binding clamp at the interface. The HSD is followed by the helical wing domain (HWD) and finally a non-essential flexible C-terminal domain (CTD). The first half of the C-tail extends into the core of SecA and the second half interacts with lipids and the chaperone SecB. All SecA domains, with the exception of HWD, are involved in the interaction with SecY (59,60).

In the crystal structure of the SecA–SecYEG complex, a single SecA protein is bound to a single SecYEG protomer creating a groove for the preprotein to pass through. The functionality of this complex has been intensively investigated (61). In a current model, after initial ATP binding to SecA and pre-protein release from SecB, the signal sequence of the pre-protein acquires an α-helical conformation (62) that probably binds to the substrate binding clamp of SecA (63). Simultaneous conformational changes due to this binding cause SecA to penetrate deeper into the channel, where the signal sequence of the substrate is intercalated at the lateral gate of SecY (64), while the downstream segment of the substrate is still inside the pore. The interaction of the signal sequence with the lateral gate induces conformational changes and movement of the plug domain, paving the way for the substrate to move up the pore toward the periplasm by repeated hydrolysis of ATP. Each hydrolysis cycle is predicted to move around 20-25 amino acids into the translocon. Binding of SecA to SecYEG is also believed to prevent the substrate from back sliding during translocation.
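As a rough illustration of the step size quoted above, the following Python sketch estimates how many ATP hydrolysis cycles would be needed to thread a preprotein of a given length if every cycle moved a fixed 20-25 residues. The function names and the 300-residue example are hypothetical, and the assumption of a constant step with no back sliding is a simplification made for this sketch, not a claim about the actual mechanism.

```python
import math


def atp_cycles_needed(preprotein_length: int, residues_per_cycle: int) -> int:
    """Number of hydrolysis cycles needed if each cycle advances the chain
    by a fixed number of residues (a simplifying assumption)."""
    if preprotein_length < 0 or residues_per_cycle <= 0:
        raise ValueError("length must be >= 0 and step size must be > 0")
    return math.ceil(preprotein_length / residues_per_cycle)


def atp_cycle_range(preprotein_length: int,
                    min_step: int = 20,
                    max_step: int = 25) -> tuple[int, int]:
    """Return (fewest, most) cycles for the 20-25 residue step range."""
    fewest = atp_cycles_needed(preprotein_length, max_step)
    most = atp_cycles_needed(preprotein_length, min_step)
    return fewest, most


# Hypothetical 300-residue preprotein.
print(atp_cycle_range(300))  # (12, 15)
```

Under these assumptions a 300-residue substrate would need on the order of 12 to 15 cycles; the true number would depend on the substrate and on contributions from the PMF and SecDF.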

1.6.5 Accessory unit SecDF/YajC

The heterodimeric protein SecDF is an accessory protein of the Sec translocon.

These proteins were identified using a genetic screen with the reporter assay for translocation. Several functions in the late stages of protein translocation have been proposed for SecDF. One function is that SecDF interacts with the preprotein in the periplasm thus promoting translocation. Alternatively, it functions in the release of exported protein from the translocon. Although deleting SecD or SecF resulted in a severe defect in protein translocation in vivo, the absence of SecD or SecF did not have much effect using an in vitro translocation system. Further in vitro studies suggested that SecDF prevents back sliding of the substrate-peptide chain during membrane insertion, since when

SecDF is present there is a higher affinity of SecA with the membrane.

A major breakthrough in the field was the determination of the crystal structure of T. thermophilus SecDF, which provided insight into its function. The structure revealed the presence of 12 TM segments with 2 large periplasmic domains. Additionally, the atomic structure of the isolated periplasmic domain of SecD (P1) was solved. Interestingly, the conformation of P1 in solution showed a rotation of its head domain compared with P1 in the full-length structure, thereby implying a functional change. The current model suggests the top part of the P1 domain (also called the head domain) swivels in the presence of the PMF about 120 degrees thereby resulting in a movement of 75 Å (25 amino acids). Further evidence supporting this theory was shown by crosslinking the preprotein with the P1 domain. Also, immobilization of the head domain resulted in translocation deficiency.

Because an energy source like ATP is absent in the periplasm, it was hypothesized that the

PMF would assist in such movements. In fact, the TM arrangement of SecDF is similar to that of a member of the RND superfamily, multi-drug efflux transporter AcrB23 with charged residues lining the TM segments. These charged residues lining the channel assist in proton transfer. Intriguingly, the mutants in which the charged residues in E. coli SecDF were replaced by uncharged residues lacked the SecDF activity.
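The figures quoted above for the head-domain swivel (75 Å, about 25 amino acids) imply a particular rise per residue, which can be checked with a short calculation. The function name below is hypothetical, and the comparison to a fully extended chain (commonly quoted as roughly 3.3 to 3.5 Å per residue) is an outside rule of thumb rather than a value from the text.

```python
def implied_rise_per_residue(displacement_angstrom: float, residues: int) -> float:
    """Displacement per residue implied by a given movement, in angstroms."""
    if residues <= 0:
        raise ValueError("residues must be a positive integer")
    return displacement_angstrom / residues


# Values quoted in the text for one swivel of the SecD head domain.
rise = implied_rise_per_residue(75.0, 25)
print(f"{rise:.1f} angstrom per residue")  # 3.0 angstrom per residue
```

A value of 3.0 Å per residue is close to that of an extended polypeptide, which is consistent with the substrate being pulled through in a largely unfolded state.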

The proposed mechanism by which SecDF assists in translocation is that it first binds to the preprotein once it emerges from SecY and prevents back sliding. Then it swivels its head domain to translocate about 25 amino acids across the membrane with the help of the PMF. The SecDF complex should be in close proximity to SecYEG for such a mechanism to be possible. Indeed, crosslinking and co-immunoprecipitation studies have shown that SecDF is in close proximity to SecYEG. Additionally, SecDF has been shown to interact with the insertase YidC and is believed to be the scaffold that brings YidC and

SecYEG together to form a unified complex. The membrane protein YajC in the complex is not essential for cell viability or protein translocation. Its function is not yet known.


1.7 The membrane insertase YidC/Oxa1/Alb3 family

The Escherichia coli YidC is a membrane protein insertase at the inner (cytoplasmic) membrane and is composed of 6 TM segments with a large periplasmic domain in between TM1 and TM2. It is an essential protein ubiquitous in nature and belongs to a family of proteins found in all domains of life ranging from archaeal (Duf106) and bacterial (YidC) to mitochondrial (Oxa1/2) and thylakoid membranes (Alb3/4). Recent studies have identified homologues in eukaryotic organisms, such as the Get system in yeast, the ER membrane protein complex subunit 3 (EMC3) and the transmembrane and coiled coil domains 1 (TMCO1). The TMCO1 family members can function as an insertase independently and in concert with the Sec translocon, catalyzing translocation of newly synthesized membrane proteins.

In the bacterial, chloroplast and mitochondrial homologs, the 5 TM segment core region exhibits the insertase activity of the proteins. In particular, the first and second of these TM domains (i.e. TM 2 and TM 3 of the E. coli YidC) share several conserved residues that contact substrate proteins, and mutations of these residues cause loss-of-function phenotypes (65). This family of proteins most likely derives its essentiality from inserting energy transducing complexes (66). The bacterial YidC and mitochondrial Oxa1 primarily function to insert and assemble the protein complexes involved with respiration (67,68), whereas the chloroplast paralogs Alb3/4 are necessary for photosynthesis and thylakoid biogenesis. In addition to their role in protein insertion, members of the YidC/Oxa1/Alb3 family contribute to the folding and assembly of membrane proteins (69,70). It was proposed that they function as membrane-embedded chaperones which interact with folding intermediates (71). The function of this family of proteins is evolutionarily conserved as Oxa1 and YidC can replace each other. Also, Alb3/4 can replace YidC to insert proteins into the bacterial inner membrane (72). We will discuss each member of the family in the subsequent sections and present some recent results on the discovery of other Oxa1 homologs.

1.7.1 Mitochondrial Oxa1

Mutations affecting the biogenesis of the respiratory complex cytochrome c oxidase (hence the name Oxa, for oxidase assembly) (73) and the formation of the F1F0 ATP synthase led to the discovery that Oxa1 in mitochondria was involved in membrane protein biogenesis. Oxa1 can insert both mitochondrial and nuclear encoded proteins into the mitochondrial inner membrane. The nuclear encoded proteins are transported across the mitochondrial outer membrane with the help of TOM (Translocase in the outer membrane) proteins and into the inner membrane with the TIM (Translocase in the inner membrane) proteins of mitochondria. This was shown for the protein Mdl1, an ATP dependent permease of mitochondria that transports iron from the matrix and ultimately into the cytoplasm. Mdl1 was shown to require both TIM 22 and Oxa1 for its inner membrane insertion showing that both proteins work in concert (74).

Alternatively, the mitochondrial encoded proteins are inserted co-translationally by Oxa1 from the matrix into the inner membrane. The flexible positively charged C-tail of Oxa1 acts as a ribosome receptor with great affinity towards ribosomes. It has been crosslinked to the ribosomal proteins MRP40 and MRP20. These ribosomal proteins are homologous to L23 of bacterial ribosomes, which is bound to the ribosome near the exit tunnel.

Oxa2 is a paralog that is also present in the inner membrane of mitochondria. It has been shown to be important in the posttranslational biogenesis of Cox2, a mitochondrial encoded protein which is essential for the electron transfer of the cytochrome c oxidase complex. Oxa2 can be functionally replaced with Oxa1 when it is over expressed in the cell.

1.7.2 Plastids Alb3/Alb4

The finding of Oxa1 led to the search of other homologues and resulted in the discovery of Alb3 and YidC in the later years. Alb3 got its name from the albino-like appearance of the plant when the gene was knocked down (75). Alb3 was crucial for the insertion of light harvesting chlorophyll proteins (LHCP), which play an essential role in photosynthesis. LHCP is inserted post-translationally into the thylakoid after it is imported into the stroma from the cytoplasm. TIC/TOC (translocase in the inner/outer envelope membrane of chloroplast) help bring the LHCP precursor into the stroma. After it is imported, the signal peptide is cleaved from the precursor by the stromal peptidase (76).

LHCP then interacts with cpSRP and then gets targeted to Alb3 by the cpFtsY.

Alb3 has been shown to interact with the Sec complex of chloroplast cpSecYE using crosslinking studies and co-immunoprecipitation studies. Substrates that require the

Sec translocon for insertion have also been shown to interact with Alb3 for folding. Until now, an Alb3/cpSecY dependent substrate has not been identified except for D1. Some of the substrates that insert via Alb3 are D1, D2, CP43, PS1-A, and ATPase subunit CF0III.


Alb4 is the paralog of Alb3, which is important for the thylakoid biogenesis. It is involved in the assembly of CF1CF0ATP synthase (77).

1.7.3 The membrane insertase YidC

YidC is a 61 kDa integral membrane protein that was shown to catalyze membrane protein insertion. It was first identified as a protein insertion machinery that cooperates with the Sec translocon for the integration of proteins into the E. coli inner membrane

(78,79). Scotti et al. showed that YidC can be cross-linked to newly synthesized FtsQ during membrane protein insertion and that YidC co-purifies with SecYEG and

SecDF/YajC.

In the year 2000, YidC was demonstrated to facilitate the insertion of M13 phage coat protein (PC), as depleting it resulted in the accumulation of the procoat precursor in the cytoplasm. M13 procoat was previously thought to be inserted by an un-assisted mechanism. Another single TM phage protein Pf3 coat was also shown to be a YidC-only substrate. Moreover, crosslinking studies showed YidC interacts with the inserting Pf3 coat

(80). While the reason why YidC is essential for the cell is not completely clear, it may be due to the fact that YidC is required for the biogenesis of respiratory complexes. Subunit c of F1F0 ATPase was shown to be dependent on YidC for membrane insertion in vitro using proteoliposomes.

Other substrates that use the YidC-only pathway are MscL (mechanosensitive channel of large conductance), which inserts co-translationally, and TssL, a tail-anchored membrane protein. In eukaryotes, tail-anchored membrane proteins are inserted by the Get pathway, which involves five Get proteins (81). Interestingly, Get1 is an Oxa1 homolog and was shown to have insertase activity in an in vitro study. One of the unique attributes of YidC is its concerted role with SecYEG in inserting large and complex membrane protein substrates. CyoA, a lipoprotein, utilizes both YidC and SecYEG for its biogenesis (82). Its short N-terminal region is inserted by the YidC-only pathway, while its large periplasmic domain is translocated by the SecYEG machinery working with the SecA motor ATPase.

Proteins such as subunits a and b of the F1F0 ATP synthase, CyoA (a subunit of cytochrome bo3 ubiquinol oxidase) and TatC (twin arginine translocase subunit) require both YidC and SecYEG for their insertion into the membrane. How are proteins inserted by the concerted YidC-SecYEG pathway? Urbanus et al. answered this question for the single-spanning FtsQ using ribosome nascent chains of various lengths. They discovered that during insertion the TM segment of FtsQ first makes contact with SecY and subsequently with YidC. This led to the hypothesis that YidC is located in close proximity to the Sec translocon and may assist in the lateral integration of a TM segment into the membrane (83). Further studies provided information on how YidC contacts SecY and SecDF. Li et al. developed a lethal genetic screen, which showed that G355 and M471 of YidC contact SecY and that SecDF is essential for this interaction. Later crosslinking studies by Koch et al. showed that YidC can contact the lateral gate (TM2b, TM7) of SecY even in the absence of SecDF (84). The authors proposed that the complex between YidC and SecYEG is dynamic, with TM segments of YidC (TM3 and TM5 in E. coli) interacting deep inside the lateral gate and reducing the pore diameter of SecY from 8 Å to 5 Å (85). Such a complex would account for a large hydrophilic groove that can accommodate at least two TM segments of the inserting membrane protein.

Recently, Collinson and co-workers showed that YidC forms a complex with SecYEG and SecDF/YajC, termed the Holo translocon (HTL). The ACEMBL plasmid system (86) was used to overexpress all the individual components from a single plasmid, yielding a complex of about 250 kDa. The complex was functional, as translated substrates inserted into the membrane when the HTL was reconstituted into liposomes. The HTL interacted with ribosomes with similar or higher affinity than the individual components. It also stimulated the ATPase activity of SecA to a similar extent as SecYEG alone, confirming the efficacy of such a complex (87).

The HTL can work in two capacities, depending on its precise makeup. One version of the HTL contains one copy each of SecYEG, YidC, SecDF and YajC and functions in membrane protein insertion. The second is a SecYEG dimer that functions with SecA to promote protein export. Interactions were also determined by dithiobis(succinimidyl propionate) chemical cross-linking and occurred between SecD and YidC as well as between SecY, SecE and SecG. There are around 2500 copies of YidC in the cell, compared with around 500 copies of SecYEG. Because both major components, YidC and SecYEG, have been shown to function independently, the physiological relevance of the HTL in vivo is still debated. The exact ratios of the components are also still controversial, as their cellular concentrations are difficult to determine accurately.
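The copy numbers quoted above put a simple upper bound on how many HTL complexes could exist at once. The sketch below works through that arithmetic in Python. It assumes the one-copy-each stoichiometry described for the insertion-competent HTL and treats the quoted copy numbers as rough estimates; the function and variable names are purely illustrative.

```python
# Approximate copy numbers per cell quoted in the text (rough estimates).
YIDC_COPIES = 2500
SECYEG_COPIES = 500


def max_htl_complexes(yidc: int, secyeg: int,
                      yidc_per_htl: int = 1, secyeg_per_htl: int = 1) -> int:
    """Upper bound on HTL complexes if the limiting component were fully used.

    The default 1:1 stoichiometry is the assumption stated in the lead-in.
    """
    return min(yidc // yidc_per_htl, secyeg // secyeg_per_htl)


htl_max = max_htl_complexes(YIDC_COPIES, SECYEG_COPIES)
free_yidc_min = YIDC_COPIES - htl_max  # YidC that cannot be part of an HTL

print(f"YidC:SecYEG ratio     = {YIDC_COPIES / SECYEG_COPIES:.0f}:1")
print(f"Maximum HTL complexes = {htl_max}")
print(f"Minimum free YidC     = {free_yidc_min} "
      f"({free_yidc_min / YIDC_COPIES:.0%} of total)")
```

Under these assumptions, most YidC in the cell would not be part of an HTL even if every SecYEG were, which is consistent with YidC also functioning as an independent insertase.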

1.7.3.1 YidC as foldase

One of the unique features attributed to YidC is its role as a chaperone and assembler of multi-transmembrane complexes. Studies of LacY biogenesis showed that YidC was crucial for the correct folding of the protein but was not required for its insertion. Monoclonal antibodies that recognize the folded conformation of LacY were used to study its biogenesis, and antibody recognition was disrupted when YidC was depleted. Zhu et al. tested the insertion of individual periplasmic loops under YidC depletion, and all the loops were inserted into the membrane in the correct orientation (88). This study confirmed that YidC acts in the later stages of LacY biogenesis, ensuring that the protein is correctly folded. Wagner et al. showed a similar trend with MalF, a subunit of the maltose binding complex. Upon YidC depletion, the stability of the complex was affected without compromising the insertion of the TM segments of MalF (70).

1.7.3.2 Oligomeric state of YidC

The oligomeric state of YidC has been debated for over a decade. Cryo-EM studies performed by Kohler et al. showed that YidC forms a homodimeric channel and can bind to the ribosome in close proximity to the L23 protein (89). Another cryo-EM structure by Seitl et al. showed that YidC exists as a monomer when bound to the ribosome, which was confirmed by in vivo studies in later years (90). Recent studies by Spann et al. showed that each protomer in an artificially designed dimeric YidC functions as a separate insertase and has the capacity to bind substrate (91).

1.7.3.3 Structure of YidC

E. coli has a single YidC protein, which has a core domain of 5 transmembrane segments. It also has an additional N-terminal TM segment and a large periplasmic domain, P1, between TM1 and TM2. X-ray crystal structures of the P1 domain revealed a β-sandwich structure resembling a sugar-binding domain. The structures did not reveal functional insights into the role of the P1 domain in the mechanism of translocation. The P1 domain is not essential for translocation, as deletion studies showed that it is not needed for the function of YidC. Further studies showed that the P1 domain contacts the periplasmic region of SecF and may facilitate the formation of the HTL complex (92).

In 2014, Kumazaki et al. solved the first set of high-resolution crystal structures of YidC. These were YidC2 from the Gram-positive bacterium Bacillus halodurans (BhYidC) and YidC from the Gram-negative bacterium E. coli, at 2.4 Å and 3.2 Å resolution, respectively (93,94). Both structures were obtained in the lipidic cubic phase, and YidC crystallized as a monomer. One of the unique features revealed by the structures is a hydrophilic cavity within the 5 TM core region. The aqueous groove is about 2000 Å3 in volume and extends almost 20 Å from the cytoplasmic surface to the center of the membrane. The groove is open to both the cytoplasm and the lipid bilayer but closed on the periplasmic side. Another important feature identified was the cytoplasmic helical hairpin-like domain, which was predicted to be involved in the initial recruitment of the substrate.

A striking revelation from the crystal structure of BhYidC was the conserved positively charged residue (R73) in the groove of YidC, which plays a critical role in membrane insertion of the single-spanning MifM protein, a substrate with two negatively charged residues in its translocated tail. Replacing the conserved arginine of SpoIIIJ (a YidC homolog) with lysine rescued the SpoIIIJ depletion strain, whereas replacing it with a neutral or negatively charged residue did not. This suggests an electrostatic mechanism in which the positive charge in YidC attracts the negative charge in the tail of the substrate into the groove (95).

Chen et al. further investigated the role of the conserved arginine at position 366 in the membrane insertase function of E. coli YidC. YidC was fully functional when a polar or an apolar amino acid was substituted at this position but lost its function when a negatively charged residue was substituted. In contrast, the positive residue was found to be critical for the Gram-positive S. mutans YidC2. In the case of chloroplast Alb3, the protein was still functional when the conserved positive charge (lysine) was substituted with an amino acid with a neutral side chain and lost its function only when substituted with a negatively charged amino acid. Recent studies by Chen et al. (Chen and Dalbey) have shown that suppressor mutations can make even a negative residue functional when it replaces the strictly conserved positive charge in the groove of E. coli and S. mutans YidC.

Another intriguing feature of the YidC crystal structure was the hairpin-like helix located in the cytoplasmic region. This cytoplasmic hairpin helix is important for YidC activity in E. coli, since a deletion in the C-terminal part of the helix impairs activity. The arrangement of the two antiparallel helices in the C1 region of EcYidC is rotated by 35° with respect to the core region, as compared with the BhYidC structure. Also, the B factor for this region is high in both structures, indicating the flexibility of the C1 cytoplasmic loop region. Although the hairpin helix is involved in the translocation mechanism, its precise sequence is not critical. Crosslinking studies performed by Koch et al. showed that the C1 loop contacts the targeting proteins SRP and FtsY and the Sec translocon (96).

1.7.3.4 Mechanism of insertion

Based on the crystal structure, Kumazaki et al. proposed a mechanism for membrane insertion of a single-spanning membrane protein in which the N-terminal tail of the substrate contacts YidC in the hydrophilic cavity. In their model, the negatively charged residues in the N-tail make an ionic interaction with the conserved positively charged residue in the groove. YidC catalyzes translocation of the tail because the hydrophilic groove extends almost half the distance into the membrane, thereby reducing the energy cost of translocation (93).

The second step of translocation is the insertion of the hydrophobic TM segment of the substrate into the greasy slide of YidC, located between TM3 and TM5. Studies by Klenner et al. showed that the inserting hydrophobic region of the substrate contacts multiple hydrophobic side chains of TM3 and TM5 (97). Once the TM segment moves up the slide, the N-tail is translocated to the periplasm with the help of the proton motive force (PMF). One limitation of the proposed mechanism is that it does not reveal how substrates with 2 or more TM segments are inserted or how YidC cooperates with SecYEG to insert proteins.

1.7.3.5 From the substrate point of view

Previous studies showed that YidC can insert proteins with short periplasmic loops but requires the assistance of SecYEG to translocate larger loops. Kuhn and coworkers systematically increased the periplasmic region of the M13 phage protein to 80 amino acids and discovered that it then required the Sec translocon for its insertion (98). The largest substrate known so far to be inserted by the YidC-only pathway is MscL, which has a loop of 29 amino acids. Gray et al. proposed that substrates with charge-unbalanced TM segments were significantly more likely to depend on YidC for insertion (99). On the other hand, Zhu et al. proposed that the charge composition of the translocated periplasmic domain could determine which pathway a single-span membrane protein prefers for insertion. For instance, increasing the hydrophobicity of the TM segments could make the substrate insert independently, and decreasing the hydrophobicity could lead to a dependence on YidC, SecYEG, or both (100). A loop with a positive charge would require YidC/SecYEG for translocation, whereas a loop with a negative charge would require only YidC. This latter finding is in line with the mechanism proposed by Kumazaki et al., in which a small loop is accommodated within the hydrophilic cavity by being attracted to the arginine in the groove.

For the M13 procoat protein, Soman et al. revealed that the periplasmic loop of the substrate determines the pathway of membrane insertion (101). In that study, the authors predictably changed the pathway of procoat by increasing or decreasing the polarity of the periplasmic loop. Remarkably, studying a LacY construct with the M13 phage protein sequence inserted between TM6 and TM7, Zhu et al. found that YidC was required for the insertion of the phage protein, whereas the 12 TM segments of LacY were only Sec dependent (88). This study showed that other parts of a protein also dictate the translocase requirement. Clearly, the features of membrane proteins that dictate the translocase requirements need to be determined in future studies.

1.7.4 Archaeal Duf106

Proteins with low sequence similarity to the YidC/Oxa1/Alb3 family have previously been identified in archaea (102-104). These proteins are annotated as Domain of Unknown Function 106 (Duf106) because the function of the proteins in this family is unknown.

In 2015, Keenan et al. solved the structure of an archaeal Duf106 protein from Methanocaldococcus jannaschii (Mj0480) in a detergent-solubilized state at a resolution of 3.5 Å (105). The Duf106 protein oligomerizes to form a tetramer in the crystal structure and possesses only 3 TM segments, which correspond in location to TM2, TM3, and TM6 of E. coli YidC. The Mj0480 structure superimposes on the analogous region of BhYidC with a root-mean-square deviation of 3.9 Å (over 105 core residues), showing a striking similarity in the secondary structure despite just 14% sequence similarity. The key features are a hydrophilic groove and a predicted coiled-coil structure in the cytoplasm (not seen in the structure), confirming the close resemblance of Duf106 to its homologs in bacteria.
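For reference, the root-mean-square deviation quoted for the superposition is the square root of the mean squared distance between paired atoms. The following Python sketch computes it for two coordinate sets that are assumed to be already superimposed and matched position by position. It does not perform the superposition itself, and the coordinates are made-up values for illustration, not taken from the Mj0480 or BhYidC structures.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two pre-aligned coordinate lists.

    Each list holds (x, y, z) tuples; position i in one list is assumed to be
    paired with position i in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical coordinates in Å: every atom in set B is shifted by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

print(f"RMSD = {rmsd(set_a, set_b):.2f} Å")
```

A value of 3.9 Å over 105 residues therefore means that, after optimal superposition, the paired positions differ by about 3.9 Å in the root-mean-square sense.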

Further studies by Keenan et al. showed that Mj0480 binds selectively to stalled RNCs and can be crosslinked to the nascent chain at the lipid-exposed hydrophilic surface. These studies provide evidence of YidC-like proteins that may facilitate insertion into the archaeal membrane. Homologs of Duf106 are found in other Euryarchaeota, Crenarchaeota and Korarchaeota, suggesting that YidC-like proteins are present in all three domains of life. Future experiments with Duf106 should help shed light on the mechanism of protein insertion by this universally conserved family of proteins.

1.7.5 Eukaryotic homologs of YidC

Recent studies by the Keenan and Hegde groups showed that the ER membrane protein complex subunit EMC3 and transmembrane and coiled-coil domains 1 (TMCO1) share a common ancestry with the Oxa1/Alb3/YidC family proteins. These ER proteins are homologous to the Duf106 protein. Glycosylation mapping studies showed that these proteins have a 3 TM segment core with the N-terminus facing the ER lumen and the C-terminus facing the cytoplasm, as observed for the Duf106 protein. Anghel et al. and Guna et al. have shown that TMCO1 and EMC, respectively, interact with the ribosome and the Sec61 complex using fractionation and immunoprecipitation studies (106,107). Reconstitution studies performed with EMC showed that it could insert the ER-resident enzyme squalene synthase (SQS) with nearly 50% of the efficiency observed with native membranes. Future experiments on these ER homologs will reveal their precise functions in membrane protein evolution.

1.8 Twin arginine translocase pathway

The twin arginine translocase (Tat) pathway exists in archaea, bacteria and plant chloroplasts. In bacteria, it exports folded proteins, unlike the Sec translocon. Many of its substrates contain metal cofactors and are important for energy metabolism, formation of the cell envelope, biofilm formation, heavy metal resistance, nitrogen-fixing symbiosis, and bacterial pathogenesis (108). Protein transport does not require ATP as an energy source but relies on the proton motive force (PMF) (109).

Bacteria use the Tat pathway to different extents. The Gram-positive bacteria Bacillus subtilis and Staphylococcus aureus export only a few Tat substrates, while enteric bacteria export about 20-30 substrates. Tat substrates typically bind cofactors or metals in the cytoplasm, and the proteins are folded and active in the cytoplasm prior to export. Accomplishing these tasks in the cytoplasm avoids the need to transport or maintain a separate pool of cofactors and then assemble them with the protein in the external milieu (110,111).

Targeting to the Tat translocase is dictated by an N-terminal signal peptide that is very similar to the general signal peptide, consisting of a polar amino-terminal (N) domain, a hydrophobic core (H) region and a polar carboxyl (C) domain. Despite this shared basic structure, the key feature of the Tat signal peptide is a conserved SRRxFLK motif located at the junction of the N- and H-domains (112). The twin arginines are essential for Tat-dependent export and give the pathway its name. The presence of basic residues in the C domain prevents export by the Sec pathway, eliminating promiscuous export by both the Tat and Sec pathways.
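The SRRxFLK consensus can be expressed directly as a pattern, with x standing for any residue. The Python sketch below searches a sequence for the strict consensus only. This is a simplification, since real Tat signal peptides can deviate from the consensus at positions other than the twin arginines, and both example sequences are hypothetical ones invented for illustration.

```python
import re

# Consensus Tat motif from the text: S-R-R-x-F-L-K, where x is any residue.
TAT_MOTIF = re.compile(r"SRR.FLK")


def find_tat_motif(sequence: str):
    """Return (start_index, matched_text) for the first strict-consensus
    match in a one-letter amino acid sequence, or None if there is none."""
    match = TAT_MOTIF.search(sequence.upper())
    if match is None:
        return None
    return match.start(), match.group()


# Hypothetical signal peptide sequences (not real proteins).
with_motif = "MNDKLSRRSFLKAAGLAALAGA"
without_motif = "MKKLLAVAGLLSAQAFA"

print(find_tat_motif(with_motif))     # (5, 'SRRSFLK')
print(find_tat_motif(without_motif))  # None
```

A pattern match of this kind only flags candidate signal peptides; whether a protein is actually exported by the Tat pathway has to be established experimentally.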

Richter et al. (113) showed that small, unstructured hydrophilic proteins with phenylalanine-glycine (FG) repeats could be exported by the Tat system, and that the presence of hydrophobic surface patches was sufficient to abort transport, raising the possibility that the Tat system screens proteins based on their surface hydrophobicity. It has also been reported that the length of the unstructured FG repeat polypeptide dramatically affects Tat export, with longer regions abolishing Tat export altogether (114).

Conversely, Jones et al. (115) recently reported that the Tat system was surprisingly tolerant of hydrophobic patches on the surface of structured single-chain variable fragment proteins, and that export efficiency increased with greater structural rigidity. The exact substrate features recognized by the Tat pathway are still under investigation. Chaperones prevent export of a protein until cofactor insertion has taken place, and mutants incapable of cofactor binding are rapidly degraded once in contact with the Tat machinery (116).

1.8.1 Translocase components

Three integral membrane proteins form the minimal set of components for the assembly of the Tat translocase in E. coli: TatA, TatB and TatC. These are expressed from the tatABC operon and assembled in the cytoplasmic membrane as a Tat(A)BC complex and a cytoplasmic TatA pool. Exactly how the complex is assembled and the reason for the TatA pool have been extensively studied but are still debated.

TatA is an 89 amino acid protein that consists of a short periplasmic N-terminal region, a transmembrane helix linked via a hinge region to a cytosolically exposed amphipathic helix (APH), and a highly unstructured, cytoplasmically exposed C-terminal region (117). There is about 20% sequence similarity between TatA and TatB, but the two have distinct functions in the translocation pathway. TatC is a 238 amino acid protein with 6 TM segments, with the N- and C-termini located in the cytoplasm. TatB and TatC are present in a 1:1 ratio in a complex with varying amounts of TatA attached, which makes the complex size vary from 100 to 500 kDa. As previously mentioned, TatA is 50 times more abundant in the cell and is present in two separate pools (118). E. coli also has another Tat protein, TatE, which is found only in enterobacteria and a couple of Gram-positive bacteria and has 50% sequence similarity to TatA (119). TatE can complement a tatA knockout if continuously expressed, but its function is still under debate (120).

1.8.2 Translocase mechanism

Early cross-linking studies conducted in chloroplasts demonstrated that a Tat-dependent substrate could be cross-linked to the cpTatC complex (equivalent to TatC in bacteria) (121), demonstrating that this is the initial receptor for the Tat pathway. Similar binding characteristics were observed for the TatBC complex of E. coli. In both instances, full translocation was prevented by the absence of a PMF, showing that TatBC is functional as a unit and PMF dependent. The crystal structure solved by Ramasamy et al. presented a “glovelike structure” (122), in which TatC assembles into a concave structure that can accommodate a TM segment of TatB or of the neighboring TatC. Also, using deletion mutants, the periplasmic loops of TatC, P1 and P2, were shown to be essential for translocation but not for insertion or assembly of the complex into the membrane.

Crosslinking studies show that the N-terminal region of TatC makes contact with the RR-signal peptide, and data showing that TatB can directly interact with RR-signal peptides suggest that TatB functionally cooperates with TatC by forming part of the signal peptide binding pocket. The insertion of the signal peptide into the membrane after proofreading forms the first step in translocation (123-125).

In the second step, the translocation complex is formed. A large number of TatA proteins are recruited to the translocation site in a manner that depends on the proton-motive force. TatA protomers are predicted to form a ‘pore’ of varying diameter in the cytoplasmic membrane, permitting the passage of fully folded proteins into the periplasm (126).

The third step is the actual process of translocation itself. The exact mechanism is still controversial, and two hypotheses are discussed here. The first is called the trap door model. The APH domain of TatA on the cytoplasmic membrane surface comes into contact with the substrate. With the help of the PMF, the flexible APH domains are flipped into the membrane, opening the trap door and allowing substrate translocation. The second model proposes that the lipid bilayer is weakened when TatA oligomerizes, with its polar N-tail destabilizing the membrane, which allows translocation of the Tat substrate. This model is gaining more support from recent NMR structures and single-particle electron microscopy studies of TatA.

1.9 Figures

Figure 1.1: Simplified representation of a eukaryotic cell showing different organelles

Representation of eukaryotic cell showing targeted proteins directed to different organelles or sites. Targeting of nuclear encoded proteins ( ) to various sites via the rough endoplasmic reticulum represents a major pathway. The newly synthesized proteins are inserted in the ER and trafficked via Golgi, lysosomes to plasma membrane and the external milieu in the form of secretory vesicles. The organelles chloroplasts and mitochondria have small genomes coding for a small number of proteins localized either to the stroma and inner membranes for chloroplasts ( ) or to matrix and inner membrane for mitochondria ( ). All other proteins in these organelles along with peroxisomes and lysosomes are encoded by the nucleus and imported from the cytoplasm as shown by the arrows. Proteins are also imported from outside the cell via endocytosis ( ). This figure was modified from Fig. 1.2 from the Protein Targeting book by Anthony Pugsley.

Figure 1.2: Representation of bacterial membranes and proteins in various locations

The diagram is divided into two parts representing Gram-negative and Gram-positive bacterial cellular structure. Proteins are indicated according to their localization as cytoplasmic, periplasmic and extracellular. Membrane proteins are indicated in the outer membrane as beta barrels and in the inner membrane as alpha helices. Integral membrane proteins (TM segments embedded within the bilayer) with multiple topologies (Type 1 with N-out/C-in, Type 2 with N-in/C-out and polytopic membrane proteins) are indicated in green. Peripheral membrane proteins are indicated on the surface of the inner membrane. Additional structures that are attached to the membrane, such as lipopolysaccharides, flagellum and surface proteins, are indicated. The peptidoglycan cell wall, indicated in blue, forms a rigid layer between the outer and inner membranes (Gram-negative) and on the exterior of the cytoplasmic membrane (Gram-positive). This figure was modified from Fig. 1.1 from the Protein Targeting book by Anthony Pugsley.

Figure 1.3: Schematic of the protein translocation and insertion pathways of E. coli.

Newly synthesized proteins can be targeted either co- or post-translationally to the SecYEG translocon and to the YidC insertase in bacteria. (A) Targeting is initiated by the co-translational binding of SRP to ribosomes translating a membrane protein (black line exiting the ribosome). The SRP–ribosome-nascent chain (RNC) complex is then targeted to the membrane-bound SRP receptor FtsY (magenta). FtsY and SRP interact in a GTP-dependent fashion (2). Upon GTP hydrolysis, FtsY hands over the RNC to either the integral SecYEG translocon (2a) or the membrane protein insertase YidC (2b). Translation resumes and the transmembrane segments are partitioned into the membrane. The heterotrimeric SecYEG translocon forms a protein-conducting channel and associates with additional proteins, such as SecA or the SecDFYajC complex. (B) Secretory proteins are first contacted by the chaperone trigger factor, which is in close proximity to the ribosome. In certain cases, the substrate nascent chain is either bound directly by free soluble SecA, which translocates the protein post-translationally in ATP-dependent steps through the SecYEG channel (1a), or bound by the chaperone SecB and then brought into contact with SecYEG-bound SecA (1b).

Figure 1.4: Cartoon representation of the molecular motor SecA.

Cartoon ribbon structure of T. maritima SecA is shown (PDB: 2FSF). NBD1 and NBD2 are shown in blue and yellow; the two-helix finger, part of the HSD, is shown in orange; the transducer helix in red-orange; the PPXD in green; and the HWD in maroon. The bound ADP molecule is shown in stick representation.

Figure 1.5: Cartoon representation of SecYEG from T. thermophilus.

The crystal structure of SecYEG from Thermus thermophilus (PDB 5AWW) reveals an hourglass-shaped transmembrane channel formed by the 10 TM segments of SecY. In the resting state, the channel is blocked by a plug domain (green) formed by a short helix in TM2a. A lateral gate between TM2b (cyan), TM3 (blue) and TM7 (red) allows the TM region of substrates to exit laterally into the lipid bilayer. SecE, which wraps around the rim of SecY, is shown in yellow, and SecG is shown in purple. A. Side view showing the hourglass structure of SecYEG. B. Front view showing the separation of the TM segments of the lateral gate.

Figure 1.6: Ribbon and space filling models of E. coli YidC

The crystal structure of YidC from E. coli (PDB 3WVF) reveals a novel hydrophilic groove, as indicated, within the inner leaflet of the membrane. The groove is exposed to the cytoplasm and the lipid bilayer but is sealed on the periplasmic side. In the center of the groove resides a strictly conserved positively charged residue (R366, shown in red). The non-essential periplasmic domain is indicated as P1. TM3 and TM5, which form the greasy slide known to contact the TM segment of inserting substrates, are shown in green and yellow, respectively.

Figure 1.7: Model of the Holo translocon.

Structure of the Holo translocon assembled using crystal structure coordinates and cryo-EM maps of various proteins. The structure shows that YidC (cyan) is located in close proximity to SecY (blue) and SecF (purple). The periplasmic region of YidC interacts with that of SecF and acts as a clamp to keep the complex together. SecD (green) sits between YidC and SecF, with its periplasmic domain tilted towards the central cavity formed by the complex. SecY (blue) and SecE (yellow) are shown to interact with YidC at the lateral gate and with SecF, respectively.

Figure 1.8: Structure of the Twin arginine translocon TatABC components.

The crystal structure of TatC from Aquifex aeolicus (PDB 4HTS), shown in blue, reveals a “cupped-hand” shape formed by the six TM segments. In the solution NMR structure of the oligomeric TatA complex (PDB 2LZS), the subunits come together to form a pore-like polymer. Both a cytoplasmic view and a view from the edge of the membrane are shown for the TatA polymer (in pink). Shown in yellow is the solution NMR structure of TatB, which consists of four α-helices and adopts an extended L-shaped conformation. The TMH segment of TatB and the amphipathic helices of the TatA oligomer are shown interacting with TatC.

CHAPTER 2

Polarity of the translocated region is the primary translocase determinant for M13 procoat membrane protein insertion

2.0 Contributions

The work reported in this chapter was carried out by Balasubramani Hariharan, who will be the first author on the paper. Raunak Soman, who will be the third author on the paper, created several mutants used in this study and contributed to the preparation of the manuscript. Balasubramani Hariharan prepared all the figures.

2.1 Introduction

Protein targeting and transport is a complex process that has evolved over millions of years to localize proteins precisely to their appropriate destinations. About one third of the cellular proteome consists of membrane proteins, and their transport across or into membranes is catalyzed by specialized molecular devices. In bacteria, two machineries catalyze most membrane protein insertion into the inner (cytoplasmic) membrane (127-129). The Sec translocon is a universally conserved heterotrimeric complex that is involved both in protein translocation across the membrane and in the insertion of proteins into the inner membrane. The membrane protein insertase YidC, another ubiquitous protein, can function both independently and in concert with the Sec translocon to insert membrane proteins into the cytoplasmic membrane (130). Homologs of YidC have been identified in mitochondria, chloroplasts and, more recently, in archaea and the endoplasmic reticulum (102,107,131-133).

In 2014, high-resolution crystal structures of YidC from B. halodurans and E. coli were solved by X-ray crystallography (93,94). The structures revealed an aqueous cavity that is open to both the cytoplasm and the lipid bilayer but closed on the periplasmic side. Another important feature revealed by the structures was an essential and conserved arginine (R366 in E. coli, R73 in B. subtilis) within the aqueous groove. A mechanism of insertion was proposed based on the structure, in which the conserved positively charged residue attracts the negatively charged residue in the N-tail of the substrate, and the transmembrane (TM) segments then partition into the lipid bilayer after they exit the greasy slide of YidC. Previous studies have shown that the hydrophobic TM segments of substrates contact YidC mainly at TM1, TM3 (134), TM4 (135) and TM5 (TM3/5 is also called the “greasy slide”) during membrane insertion (97,136). From the crystal structures it is apparent how a substrate's TM segments can be placed between TM3 and TM5 (TM2 and TM4 of B. subtilis) while its periplasmic loop is within the hydrophilic cavity.

One of the important questions in the field is to identify the physicochemical translocase determinants for membrane protein insertion (134,135). Previous studies have highlighted that all of YidC's substrates have small periplasmic domains, whereas substrates with larger periplasmic loops require the assistance of the Sec machinery (137). Further studies have shown that a substrate's primary structure can be predictably altered to change its translocase requirement from YidC-only to YidC/Sec dependent. Negative charges in the translocated regions (138) and transmembrane segments (139) are determining features for YidC, whereas the opposite charge in these domains favors the requirement for the Sec machinery for membrane insertion. Other key features are low hydrophobicity of the TM segments (139,140) and the charge/polarity of the translocated regions (141).

In this report, we have systematically investigated the polarity/charge hypothesis as a translocase determinant for membrane insertion of the M13 procoat-lep protein. We show that the substrate procoat-lep is YidC dependent and Sec independent when the polarity of the periplasmic loop is low, whereas increasing the polarity of the loop increases its requirement for the Sec machinery for insertion. Furthermore, addition of hydrophobic residues to a strictly Sec-dependent periplasmic loop decreased its requirement for the Sec translocon. We also report that the length of the periplasmic loop is a positive determinant for the Sec machinery. Taken together, we show that the polarity of the translocated region is the primary determinant of the translocase requirement for membrane insertion.

2.2 Results

The polarity/charge hypothesis predicts which translocases a substrate requires for membrane insertion from the properties of its translocated domain. Substrates whose translocated domain has a polarity below a certain threshold are inserted by the YidC-only mechanism; increasing the polarity further makes insertion require both YidC and the Sec translocon, and insertion is blocked altogether if the polarity is too high (141). In contrast, decreasing the polarity can allow insertion independent of both translocases. We tested this hypothesis using the M13 coat protein, which is synthesized as a precursor (procoat) with two transmembrane segments and a 20-amino acid periplasmic loop. In our study, procoat carried a C-terminal extension of 103 amino acids derived from the C-terminus of leader peptidase (also called Lep (142)) so that the protein could be immunoprecipitated with Lep antiserum.


To elucidate the dependence of the M13 procoat protein on the membrane insertase YidC and the Sec translocon, we employed the depletion strains JS7131 and CM124, respectively. JS7131 has the endogenous yidC gene knocked out and another copy of yidC introduced at the lambda attachment site under the araBAD promoter (137). To study membrane insertion under YidC-depleted conditions, cells were grown in media supplemented with glucose. JS7131 cells expressing PCLep mutants were grown in the presence of arabinose (YidC expressed) or glucose (YidC depleted) and labeled with [35S]-methionine for 1 min after induction with IPTG. Membrane insertion was examined using the protease accessibility assay. If procoat inserts into the membrane, signal peptidase 1 cleaves the protein, generating the signal peptide and the mature coat protein. Upon addition of proteinase K (PK), the periplasmic loop of the coat protein is digested, which produces a shifted band on the gel, the "fragment" (Fig. 2.1). Uninserted procoat remains in the cytosol as full-length protein. When YidC is depleted (cells grown in 0.2% glucose), procoat accumulates in the cytosol and is resistant to PK (Fig. 2.2, A). Dependence on SecYEG was tested using CM124 cells, in which the secE gene is under the control of the araBAD promoter (143). When these cells are grown in the presence of glucose, substrates that depend on the Sec translocon for insertion are blocked, because depleting SecE results in the disintegration of the entire Sec complex (144,145). PCLep mutants were expressed in cells with SecE expressed (0.2% arabinose and 0.4% glucose) or depleted (0.4% glucose) and radiolabeled with [35S]-methionine for 1 min. Membrane insertion was studied by the protease accessibility assay as described for JS7131 cells.
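
The readout logic of the protease accessibility assay can be written as a small lookup. The following Python sketch only illustrates how the precursor, mature and fragment bands described in this paragraph are interpreted. The function name and the returned strings are hypothetical and are not part of the original analysis.

```python
# Illustrative sketch: interpret one band from a protease accessibility lane.
# Band names follow the text: "precursor" (full-length procoat), "mature"
# (signal peptide removed by signal peptidase 1) and "fragment" (periplasmic
# loop digested by proteinase K).

def interpret_band(band: str, pk_added: bool) -> str:
    """Return a qualitative interpretation of a band (hypothetical helper)."""
    if band == "fragment":
        # The fragment arises only when proteinase K reaches the translocated loop.
        return "inserted, loop accessible to proteinase K"
    if band == "mature":
        # Signal peptidase cleavage indicates insertion into the membrane.
        return "inserted, processed by signal peptidase"
    if band == "precursor":
        if pk_added:
            return "not inserted, protected from proteinase K in the cytosol"
        return "not inserted, unprocessed"
    raise ValueError(f"unknown band: {band!r}")


if __name__ == "__main__":
    for band, pk in [("mature", False), ("fragment", True), ("precursor", True)]:
        print(band, "PK+" if pk else "PK-", "->", interpret_band(band, pk))
```

In this scheme a translocase-depleted lane dominated by PK-resistant precursor is read as blocked insertion, which is the pattern described for procoat under YidC depletion.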


For all experiments, we analyzed outer membrane protein A (OmpA) as a positive control for SecYEG depletion and as a negative control for YidC dependence. The Sec-dependent OmpA accumulates in a cytoplasmic precursor form when SecE is depleted, whereas it is exported across the membrane under YidC-depleted conditions because its insertion is YidC-independent.

2.2.1 Increasing the polarity of the uncharged PCLep loop causes the protein to become YidC/Sec dependent for insertion.

The polarity/charge hypothesis predicts that charge and polarity are the primary determinants of the translocase requirement for membrane proteins. To test whether polarity by itself is a determinant, we substituted the charged residues with asparagine residues (polar but uncharged). If polarity alone is sufficient, a mutant with a highly polar but uncharged periplasmic loop should still require both insertases for membrane insertion. To test this, we increased the hydrophilicity of the loop by adding more asparagine residues to the Sec-independent mutant ANGNN PCLep. Signal peptidase cleavage and proteinase K accessibility show that both wild-type PCLep (Fig 2.1 A) and ANGNN PCLep are strictly YidC-dependent and Sec-independent for insertion (141). As seen in Fig 2.1 B, substituting positions Ala7-Lys8-Ala9-Ala10 with asparagine residues yields the mutant 7N PCLep, which inserts in a YidC-dependent and slightly Sec-dependent fashion. Substituting one more asparagine residue at position +20 (8N PCLep, 237 kJ/mol GES) makes the protein require the Sec translocon to a greater extent for membrane insertion (Fig. 2.1, C). The protein becomes completely YidC/Sec-dependent when two more asparagine residues are added to the 8N mutant at positions +16 and +18. The resulting mutant, 10N PCLep (264 kJ/mol GES), was strictly dependent on both translocases, in accordance with our hypothesis (Fig 2.1, D).

2.2.2 Addition of hydrophobic residues to the highly polar loop decreases the Sec-dependence of insertion.

The results shown in Fig. 2.1 indicate that as the polarity of the PCLep periplasmic domain is increased, the protein becomes increasingly dependent on the Sec machinery for insertion, in addition to YidC. However, translocation is blocked when the polarity of the loop reaches a limit. No insertion is observed with the 11N PCLep protein (316 kJ/mol GES), which has one more asparagine added (at +19) (Fig. 2.2, A), suggesting a polarity threshold beyond which a loop cannot be translocated even in the presence of both translocases.

If the polarity of the translocated region is the primary translocase determinant, adding hydrophobic residues to the loop should decrease its dependence on the translocases. We substituted either two or three phenylalanine residues into the periplasmic loop, generating the PCLep11N+2F and PCLep11N+3F mutants, respectively. The results in Fig. 2.2 show that these mutants can insert into the membrane. Remarkably, while the PCLep11N+2F mutant depends on SecYEG for insertion, PCLep11N+3F inserts completely independently of the Sec translocon (Fig. 2.2 B, C). This demonstrates that increasing the hydrophobicity of a neutral periplasmic loop rescues the insertion of the 11N PCLep mutant, and that increasing it further makes the mutant completely independent of the Sec translocon.

To determine whether increased hydrophobicity has a similar effect on a charged loop, we substituted hydrophobic residues into highly charged loops and studied their translocase dependence. We substituted three phenylalanine residues into the periplasmic loop of the strictly Sec-dependent 3R and 5E PCLep mutants (see schematic in Fig 2.2, A). Strikingly, like 11N PCLep with 3F added, these mutants insert in a YidC-dependent and Sec-independent manner (Fig. 2.2, E and 2.2, G). Taken together, these results support the polarity/charge hypothesis that the overall polarity of the periplasmic loop dictates the translocase requirement for protein insertion.
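
The effect of the three-phenylalanine substitutions on the calculated transfer free energy can be tabulated from the GES values listed in Table 1. The following Python sketch is a minimal illustration. The dictionary layout and variable names are ours, and the numbers are copied from Table 1 rather than recomputed from the sequences.

```python
# GES transfer free energies (kJ/mol) of the periplasmic loops, copied from Table 1.
ges = {
    "11N": 316, "11N+3F": 286,
    "5E": 317, "5E+3F": 287,
    "3R": 282, "3R+3F": 252,
}

# Change in the tabulated GES value when three phenylalanines are added
# to each parent loop.
delta_3f = {
    parent: ges[f"{parent}+3F"] - ges[parent]
    for parent in ("11N", "5E", "3R")
}

for parent, delta in delta_3f.items():
    print(f"{parent} -> {parent}+3F: {delta:+d} kJ/mol")
```

In all three series the tabulated value drops by 30 kJ/mol, and Table 1 lists each 3F mutant as having no SecYEG requirement.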

2.2.3 Increasing the hydrophobicity of TM2 rescues membrane insertion of PCLep mutants with highly polar periplasmic loops.

Our hypothesis is that increasing the hydrophilicity of the loop beyond a certain limit prevents translocation even when YidC and SecYEG cooperate in membrane insertion. This explains why the 11N (Fig. 2.2, A), 7E (Fig. 2.3, C; (146)) and 4R (Fig. 2.3, A) mutants are blocked in membrane insertion. We asked whether we could promote translocation of these highly polar periplasmic loops by increasing the driving force for insertion through the introduction of four leucines into TM2. As shown in Fig. 2.3, translocation of 4R PCLep was completely rescued by the 4L mutation in TM2, with both YidC and Sec promoting translocation (Fig. 2.3, compare panels A and B). In addition, translocation of 7E PCLep and 11N PCLep was restored to some extent by substituting the four leucines into TM2 (Fig. 2.3, see panels D-E). These results show that increasing the hydrophobicity of TM2 increases the capacity of PCLep to translocate highly polar periplasmic loops.

2.2.4 Increasing the length of the translocated region increases the Sec dependence of membrane insertion.

Previous studies have found that YidC-only substrates have short periplasmic loops (137). This can be related to the polarity/charge hypothesis: as the length of the translocated loop increases, its polarity may exceed the energy threshold for insertion by YidC alone, so that insertion requires assistance from the Sec translocon. The longest loop identified so far that is inserted by the YidC-only pathway is that of the MscL protein, which is 29 residues long. We tested the translocase dependence while increasing the loop size of procoat by adding 5, 10, or 15 alanine residues after the phenylalanine at position +11 (see Fig. 2.4). Adding alanine residues does not significantly increase the hydrophilicity of the loop, as the GES value of an alanine residue is -1.7 kJ/mol. As shown in Figure 2.4, PCLep with 5 added alanine residues inserts Sec-independently but still in a strictly YidC-dependent manner (Fig. 2.4, A). Increasing the loop length further by adding 10 alanine residues resulted in membrane insertion that is markedly dependent on the Sec machinery, in addition to YidC (Fig. 2.4, B). Interestingly, increasing the driving force for membrane insertion by adding 4 leucine residues to TM2 results in very efficient insertion under SecE-depletion conditions (Fig. 2.4, B right panel). Increasing the loop size further, to a total of 15 added alanine residues, made insertion very inefficient, suggesting that there is a size limit even with SecYEG. The addition of 4 leucines to the TM segment did not improve insertion (Fig. 2.4 C, right panel). In conclusion, a loop length of 30 residues seems to be the maximum that can be translocated.
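
The claim that the alanine extensions do not raise the transfer free energy can be checked with simple arithmetic. The sketch below is a hedged illustration in Python. It assumes that the alanines are added to the wild-type loop (238 kJ/mol in Table 1) and that GES contributions are additive per residue; the function name is hypothetical.

```python
# GES value of the wild-type PCLep periplasmic loop (kJ/mol), taken from Table 1.
WT_LOOP_GES = 238.0
# GES contribution of one alanine residue (kJ/mol), as stated in the text.
ALA_GES = -1.7


def extended_loop_ges(n_ala: int) -> float:
    """Estimate the loop GES value after adding n_ala alanines.

    Assumes per-residue additivity, which is an assumption of this sketch.
    """
    return WT_LOOP_GES + n_ala * ALA_GES


for n in (5, 10, 15):
    print(f"{n} Ala: {extended_loop_ges(n):.1f} kJ/mol")
```

Under these assumptions the estimates agree, to within rounding, with the values of 230, 221 and 213 kJ/mol listed in Table 1 for the 5A, 10A and 15A mutants, so the increasing Sec dependence in this series cannot be attributed to a higher GES value.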

2.2.5 The Sec-dependent PCLep inserts at the interface of YidC and SecYEG.

In support of our hypothesis that membrane protein insertion occurs at the interface of YidC and SecYEG, we show physical crosslinking of the substrate PCLep to both YidC and SecYEG using an in vivo site-directed photo-crosslinking approach. Amber stop codons (TAG) were incorporated at different positions in either YidC or SecY as shown in Figure 2.5. These constructs were then expressed along with the PCLep substrate in cells that carried the pSup-BpaRS-6TRN plasmid. The pSup plasmid encodes an amber suppressor tyrosyl tRNA and a tyrosyl-tRNA synthetase mutated to incorporate pBpa at the amber codon. Using this method, we show that wild-type PCLep (YidC dependent) and 3R PCLep (YidC/Sec dependent) crosslink to YidC and SecY when expressed with the Psup-YidC plasmid (YidC subcloned into the pSup-BpaRS-6TRN plasmid). The Psup-YidC plasmid has YidC under its endogenous constitutive promoter. Cells carrying both the substrate and Psup-YidC plasmids were grown to an OD of 0.6 and labeled with [35S]methionine after IPTG induction. Each sample was then divided into two aliquots; one was exposed to UV for 20 minutes at room temperature to activate the pBpa photocrosslinker, and the other was kept in the dark. Both aliquots were then TCA precipitated, acetone washed, and dissolved in Tris-SDS buffer. The samples were purified using Co2+ affinity resin and either left untreated or immunoprecipitated using anti-Lep antibody. The samples were then analyzed by SDS-PAGE and phosphorimaging. The results show that wild-type PCLep contacts the YidC insertase during insertion, as is evident from the crosslinked band that is precipitated by both the resin and anti-Lep (Fig. 2.6, A).

Similarly, we show crosslinks between SecYEG and PCLep in MC1060 cells bearing three plasmids: pISI SecY408EG, pSup-BpaRS-6TRN, and a third expressing either wild-type PCLep or 3R PCLep. From Figure 2.6, it is evident that 3R PCLep, a Sec-dependent protein, interacts with the SecY complex. These crosslinking results support our hypothesis that PCLep inserts at the YidC/Sec interface, with both insertases functioning in concert.

2.3 Discussion

In this chapter, we provide evidence for the polarity/charge hypothesis, which states that the polarity of the translocated region is the primary translocase determinant for membrane protein insertion. We examined the YidC/Sec dependence of the M13 procoat periplasmic loop by replacing the charged residues with polar uncharged residues (i.e., asparagines), because studying the effect of charged residues on translocation is complicated by the contribution of the proton motive force (pmf) (146,147). The pmf promotes the translocation of negatively charged residues and hinders the translocation of positively charged residues (148). As predicted by our hypothesis, when the polarity of the uncharged loop is low, procoat is YidC-dependent for insertion, but when the polarity is increased it requires the assistance of both YidC and Sec for membrane insertion (Fig. 2.1). This suggests that charged residues are not required to make the substrate YidC/Sec dependent. We further substantiated the polarity hypothesis by showing that the Sec dependence of positively charged, negatively charged, and highly polar uncharged periplasmic loops decreases when the overall hydrophilicity of the loop is decreased by the addition of hydrophobic residues (Fig 2.2).

As summarized in Table 1, we found a good correlation between the Sec-dependence of membrane insertion and the standard free energy needed to transfer the different polar loops of the PCLep proteins studied here across the membrane. The GES values for each amino acid that are used to determine the standard free energy expense include the contribution of the peptide bond (149). For each series of mutants, the protein becomes more and more dependent on Sec as the GES value increases. However, beyond a certain value (>316 kJ/mol for 11N PCLep, >296 kJ/mol for 4R PCLep, and >396 kJ/mol for 7E PCLep), insertion did not occur. The threshold appears highest for the negatively charged mutant and lowest for the positively charged mutant, probably because the membrane potential favors the transfer of negatively charged residues over positively charged residues, a factor that is not considered in the GES values.

Recent structures of YidC from both B. subtilis and E. coli revealed the presence of a conserved positively charged residue within the aqueous cavity (94,138). The conserved arginine was proposed to attract the negatively charged residue in the N-tail of the substrate. This explains the requirement of the Sec machinery for a positively charged loop of PCLep, as charge repulsion with the conserved arginine would hinder the ability of YidC to insert this region by itself. Our hypothesis is that PCLep is still targeted to YidC, as all the mutants are YidC-dependent for insertion. As the polarity is increased, YidC cannot insert the substrate by itself, and the Sec translocon, which is in close proximity to YidC, assists in the insertion process.

While translocation of a strongly polar loop of PCLep can occur with the help of both YidC and SecYEG, there is a limit to the polarity that can be translocated. When 4 arginines, 7 negatively charged residues, or 11 asparagine residues are in the loop, translocation does not occur because the polarity of the loop exceeds the threshold (Fig. 2.3). However, we find that by increasing the hydrophobicity of TM2 (see the corresponding 4-leucine mutants), these highly polar regions can be translocated to a certain extent. We attribute this to an increased driving force for membrane insertion.

Can the polarity/charge hypothesis explain the length requirement for translocation of a domain? Previous studies have shown that the length of the periplasmic region is a positive determinant of Sec-dependent insertion (150,151). One factor unaccounted for in those studies is that increasing the length of a polar region also increases the membrane transfer expense. To avoid such an increase in the GES value (152), we increased the length of the periplasmic region by adding alanine residues after the phenylalanine at position +11. We found that insertion of PCLep is still Sec-independent after the addition of 5 alanines, but the addition of 10 alanines makes translocation Sec-dependent, suggesting that the length of the chain is indeed a translocase determinant. Interestingly, when we increased the driving force for membrane insertion by incorporating 4 leucine residues into the TM segment, translocation of the 10-Ala PCLep mutant became almost completely Sec-independent. Thus, a periplasmic loop of 30 residues was completely translocated by the YidC insertase, which is similar to the size of the MscL periplasmic loop, which inserts by the YidC-only pathway (153). The length limit for translocation by the YidC-only pathway stems from the limited capacity of the aqueous cavity of YidC (154). The groove can hold a maximum of 30 residues, and a longer loop would require assistance from the Sec translocon to insert the substrate into the membrane. This explains why YidC-only substrates have only small translocated domains.

Notably, there are many membrane proteins, both single-spanning and multispanning, that have translocated loops longer than 30 residues and are inserted in bacteria by the Sec machinery. We hypothesize that PCLep, on the other hand, is targeted to the YidC insertase, and that this limits its ability to insert highly polar loops or loops of 30 or 35 residues even when YidC cooperates with SecYEG. Although the reason for this is not clear, the same behavior was seen in a previous study (155). For example, translocation of PCLep with an OmpA fragment inserted into the periplasmic loop was efficient when the fragment was short but strongly inhibited or blocked when it was 40, 60, or 80 residues long. However, when the periplasmic loop was longer than 80 residues it was translocated quite well, probably because the protein was targeted to the Sec machinery in a different fashion and inserted by SecYEG using the SecA motor ATPase for translocation.

Based on our finding that PCLep requires both membrane translocases to insert a highly polar loop, and the fact that YidC has previously been shown to associate with SecYEG (156), we hypothesize that the insertion of PCLep occurs at the interface of YidC and SecYEG (Fig. 2.7), with both insertases interacting with the substrate in concert. Our crosslinking studies between PCLep and the translocases corroborate this hypothesis.

2.4 Materials and Methods

2.4.1 Materials

Sodium azide was purchased from Sigma. Isopropyl 1-thio-β-D-galactopyranoside (IPTG) was from Research Products International Corp. Trans [35S]-label, a mixture of 85% [35S]-methionine and 15% [35S]-cysteine, 1000 Ci/mmol, was from PerkinElmer Life Sciences. Antisera to leader peptidase (anti-Lep) and outer membrane protein A (anti-OmpA) were from our own laboratory collection.

2.4.2 Strains and Growth Conditions

JS7131 and MC1060 strains were from our lab collection, and CM124 was a generous gift from Beth Traxler (143). The yidC and secE genes are under the control of the araBAD promoter in JS7131 and CM124 cells, respectively. The pSup-BpaRS-6TRN, pLZ1 and pMS119 plasmids were from our lab collection.


Overnight cultures of the YidC depletion strain JS7131 were back-diluted 1:100 and cultured in LB media for 3 h at 37 °C, supplemented with either 0.2% arabinose (YidC expression) or 0.2% glucose (YidC depletion) (141). The SecE depletion strain CM124 was grown in M9 minimal media with either 0.2% arabinose and 0.4% glucose (SecE expression) or 0.4% glucose (SecE depletion) (157) for 8–9 h to deplete SecE. Cells of both strains were exchanged into fresh M9 media without methionine and shaken for 30 min at 37 °C. To express the mutant PCLep proteins in CM124, JS7131, and MC1060, the genes were cloned into the pLZ1 and pMS119 vectors under the control of the lacUV5 promoter (158). For crosslinking studies, MC1060 cells were cultured in LB media supplemented with the unnatural amino acid pBpa at 37 °C in the dark until the culture reached the optimum OD prior to induction.

2.4.3 Protease accessibility assay

PCLep mutants encoded on the vector were expressed in the respective strains; protein expression was induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside (IPTG) for 5 min. Cells were then labeled with [35S]-methionine for 1 min, collected by centrifugation and resuspended in spheroplast buffer (33 mM Tris-HCl, pH 8.0, 40% (m/v) sucrose). The resuspended cells were converted into spheroplasts by treatment with 1 mM EDTA and 10 µg/ml lysozyme on ice for 30 min. The spheroplasts were then split into two aliquots, and proteinase K (0.75 mg/ml) was added to one of the aliquots for 1 h on ice. The reaction was quenched by the addition of 5 mM phenylmethylsulfonyl fluoride (PMSF) for 5 min. The samples were TCA precipitated and centrifuged at 14,000 × g for 10 min, and the pellet was washed with 1 ml of ice-cold acetone. The protein pellet was then solubilized in Tris-SDS buffer (10 mM Tris-HCl, pH 8.0, 2% (m/v) SDS). The samples were immunoprecipitated with antiserum to leader peptidase (Lep) to precipitate the PCLep derivatives, or with antiserum to OmpA as a control. The samples were finally analyzed by SDS-PAGE and phosphorimaging.

2.4.4 Mutagenesis of PCLep to create mutants

All of the mutants were constructed using site-directed PCR mutagenesis, and their sequences were verified by DNA sequencing.

2.4.5 In vivo pBpa cross-linking

E. coli JS7131 cells carrying pSup-YidC, which has a TAG amber stop codon at position 497 in YidC, were grown at 37 °C in LB media in the presence of 1 mM pBpa. After reaching log phase, cells were induced with 1 mM IPTG for 5 min and labeled with [35S]-methionine (70 µCi/mL) for 2 min. Cells were separated into two aliquots: one was exposed to UV for 20 min at room temperature and the other was kept in the dark. Both aliquots were TCA precipitated, acetone washed and solubilized in Tris-SDS buffer, pH 8.0. The samples were incubated with Co2+ resin to enrich YidC and YidC-crosslinked proteins. The elution fraction was treated with anti-Lep antiserum to precipitate the substrate PCLep or PCLep-crosslinked proteins. The samples were run on an SDS polyacrylamide gel and analyzed by phosphorimaging. Similarly, MC1060 cells bearing three plasmids (pISI SecY408EG, pSup-BpaRS-6TRN, and pLZ1 3R PCLep) were grown to log phase and induced for 30 min with IPTG. The cells were then [35S]-labeled, separated into two aliquots and UV exposed as described above. The samples were immunoprecipitated using anti-His antibody to detect the His-tagged SecY and anti-Lep antibody to probe the substrate PCLep. The samples were analyzed by SDS-PAGE on a 15% polyacrylamide gel followed by phosphorimaging.


2.5 Figures

Figure 2.1: Schematic representation of the proteinase K accessibility assay.

Protease accessibility was used to examine membrane insertion of procoat. Substrate expression is induced by the addition of IPTG for 5 min, and the substrates are radioactively labeled with [35S]methionine for 1 min. Once labeled, the substrates can either insert into the membrane or remain uninserted in the cytosol. If the procoat substrate inserts into the membrane, it is cleaved by signal peptidase 1 and converted to the mature coat protein (M). Subsequent addition of proteinase K (PK) cleaves the translocated loop of PCLep, resulting in an additional shift on the gel (fragment, F). When the translocase is depleted, the precursor form (P) of the substrate protein accumulates in the cytoplasm and is resistant to PK digestion. OmpA (violet), a Sec-dependent outer membrane protein, serves as a positive control for the Sec translocon and a negative control for the YidC insertase.

Figure 2.2: Increase in polarity of the periplasmic loop makes it more Sec dependent.

At the top of the figure are the amino acid sequences of PCLep mutants, with the asparagine residues introduced into the periplasmic region marked in red. The sequences of the two transmembrane segments are marked with green letters. The YidC and SecE requirements for membrane insertion of WT (A), 7N (B), 8N (C), and 10N (D) PCLep were tested. Representative data for OmpA indicate inhibition of OmpA export under SecE depletion conditions (E). E. coli JS7131 (YidC depletion strain) expressing the various PCLep proteins was grown for 3 h under YidC expression (0.2% arabinose) or YidC depletion conditions (0.2% glucose), labeled with [35S]methionine for 1 min, and analyzed by the protease-accessibility assay, as described under “Experimental Procedures.” The Sec dependence of membrane insertion was tested using CM124, the SecE depletion strain. CM124 transformed with the respective pLZ1 plasmid was grown under SecE expression (0.2% arabinose and 0.4% glucose) or SecE depletion conditions (0.4% glucose), labeled with [35S]methionine, and analyzed as described above. P, M and F denote the precursor, mature and fragment forms of the PCLep substrate. PK is proteinase K, a protease with broad specificity.


Figure 2.3: Increase in hydrophobicity of the translocated loop decreases its Sec dependence for membrane insertion.

At the top of the figure are the amino acid sequences of Sec-dependent PCLep mutants (substitutions are in red letters) in which the polarity of the loop was decreased by the introduction of hydrophobic phenylalanines (blue letters). The YidC and SecE requirements for membrane insertion of 11N (A), 11N+2F (B), 11N+3F (C), 5E (D), 5E+3F (E), 3R (F), and 3R+3F PCLep (G) are shown. Membrane insertion, labeling and protease mapping studies were performed as described in Fig. 2.1 and under “Experimental Procedures.”


Figure 2.4: Increasing hydrophobicity of TM2 segments rescues translocation of the highly polar periplasmic loops of PCLep.

At the top of the figure are the amino acid sequences of Sec-dependent PCLep mutants (substitutions are in red letters) without and with four added leucine residues (in red letters) introduced into TM2. The YidC and SecE requirements for insertion of the 4R (A), 4R + 4L (B), 7E (C), 7E + 4L (D) and 11N + 4L (E) mutants are shown. Membrane insertion, labeling and protease-accessibility studies were performed as described in Fig. 2.1 and under “Experimental Procedures.”

Figure 2.5: Length of the periplasmic loop acts as a positive determinant for membrane protein insertases.

At the top of the figure are the amino acid sequences of PCLep mutants with loops extended by 5, 10, or 15 alanine residues (in blue letters) at the indicated positions within the periplasmic loop. Where indicated, 4 leucine residues (in red letters) were added to TM2. The YidC and SecE dependence of the 5 Ala (A), 10 Ala (B), and 15 Ala (C) PCLep mutants is shown. The PCLep mutants with the 4L mutation are shown in the right panels of A-C. The YidC and SecE dependence of membrane insertion was examined as described in Fig. 2.1 and under “Experimental Procedures.”

Figure 2.6: Pictorial representation of amber codon (TAG) mutants of YidC and SecYEG.

A. Cartoon ribbon diagram of YidC highlighting the TAG mutation at position 497 which faces the greasy slide. B. Cartoon ribbon diagram of SecY with the surface map highlighting residue 408 which was used for photo crosslinking studies.


Figure 2.7: Photo crosslinking studies showing the substrate PCLep interacts with YidC and SecY during translocation.

Cells expressing either the YidC derivative with pBpa introduced at position 497 or the SecY derivative with pBpa introduced at position 408, along with PCLep mutants, were harvested and exposed to UV light. In panel A, E. coli JS7131 cells bearing two plasmids, the Psup-YidC plasmid, which expresses YidC with an amber codon at position 497, and a plasmid expressing wild-type PCLep, were grown. Expression of the YidC substrate wild-type PCLep was induced, and the cells were labeled with [35S]-methionine and UV exposed as mentioned above. The samples were then purified with Co2+ resin to enrich the C-terminally His-tagged YidC and immunoprecipitated with anti-Lep to probe the substrate. Lane 1 shows purified YidC serving as a molecular weight marker. Lanes 2 and 3 show the presence of YidC and a crosslinked band around 75 kDa when exposed to UV light. In panel B, MC1060 cells bearing three plasmids, one expressing SecY with an amber codon at position 408, one expressing the 3R PCLep mutant, and the third being pSup-BpaRS-6TRN, were grown, induced and radiolabeled as described in the methods section. The sample was split into two aliquots: one was exposed to UV and the other was kept in the dark. The samples were probed with anti-His antibody to pull down the N-terminally His-tagged SecY, and with anti-Lep to immunoprecipitate PCLep. Samples in lanes 1 and 6 in panel B are purified proteins serving as molecular weight marker controls for YidC and SecY, respectively. The crosslinked bands of PCLep-SecY and PCLep-YidC are indicated with (*) and (#), respectively.


Mutants            Membrane insertase   Sec translocon   GES value
                   (YidC)               (SecYEG)         (kJ/mol)
Fig. 1
Wild type          +++                  -                238
7N PCLep           +++                  +                251
8N PCLep           +++                  ++               237
10N PCLep          +++                  +++              291
Fig. 2
11N PCLep          NI                   NI               316
11N PCLep + 2F     +++                  ++               288
11N PCLep + 3F     +++                  -                286
5E PCLep           +++                  +++              317
5E PCLep + 3F      +++                  -                287
3R PCLep           +++                  +++              282
3R PCLep + 3F      +++                  -                252
Fig. 3
4R PCLep           NI                   NI               296
4R PCLep + 4L      +++                  +++              296
7E PCLep           NI                   NI               396
7E PCLep + 4L      +++                  +++              396
11N PCLep + 4L     +++                  +++              316
Fig. 4
5A PCLep           +++                  -                230
10A PCLep          +++                  +                221
15A PCLep          +++                  +++              213
5A PCLep + 4L      +++                  -                230
10A PCLep + 4L     +++                  -                221
15A PCLep + 4L     +++                  +++              213

Table 1: GES values of the translocation loops for all the mutants tested in the study along with their dependency on YidC and SecYEG for membrane insertion

The membrane transfer expense for translocation of the periplasmic region of the PCLep constructs was calculated using the GES scale (159). The standard free energy contribution of the membrane potential to the transfer of charged residues is not considered here. ‘+++’ indicates a strict translocase requirement for insertion; ‘++’ indicates a partial translocase requirement; ‘+’ indicates a weak translocase requirement; ‘-’ indicates no translocase requirement.

Mutant             YidC+     YidC-     SecE+   SecE-
Fig. 1
Wild type          98        1         100     100
7N PCLep           78        1         83      72
8N PCLep           85        1         86      41
10N PCLep          90        2         76      4
Fig. 2
11N PCLep          5         0         8       2
11N PCLep + 2F     58        2         62      54
11N PCLep + 3F     92        3         89      87
5E PCLep           93        8         89      12
5E PCLep + 3F      96        4         94      95
3R PCLep           93        6         62      7
3R PCLep + 3F      86        0         92      94
Fig. 3
4R PCLep           0         0         0       0
4R PCLep + 4L      96        0         88      58
7E PCLep           0         0         0       0
7E PCLep + 4L      55        0         60      3
11N PCLep + 4L     19        3         38      23
Fig. 4
5A PCLep           91        8         92      80
10A PCLep          95        0         86      54
15A PCLep          23        0         30      14
5A PCLep + 4L      96(NS)    98(NS)    98      99
10A PCLep + 4L     97(NS)    98(NS)    96      90
15A PCLep + 4L     21(NS)    0(NS)     53      40

Table 2: Translocation efficiency for membrane insertion of PCLep mutants.

The table shows the percentage of protein inserted into the membrane for all strains and conditions shown in this chapter. For quantitation, only lanes 2 and 4 were quantified for signal peptide processing. The raw data were quantified using ImageJ software, developed by the NIH, after which the signal peptide cleavage percentage was calculated using the method described in Schuenemann et al. (160).
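
One way to read Table 2 is to express insertion under depletion as a fraction of insertion under expression. The Python sketch below does this for the asparagine series under SecE expression and depletion. The ratio is only an illustrative summary statistic introduced here, not a quantity reported in the original analysis, and the function name is hypothetical.

```python
# Percent insertion under SecE expression and SecE depletion, copied from Table 2.
sec_data = {
    "Wild type": (100, 100),
    "7N PCLep": (83, 72),
    "8N PCLep": (86, 41),
    "10N PCLep": (76, 4),
}


def retained_fraction(expressed: float, depleted: float) -> float:
    """Fraction of insertion that persists when the translocase is depleted."""
    if expressed <= 0:
        raise ValueError("no insertion under expression conditions")
    return depleted / expressed


for mutant, (plus, minus) in sec_data.items():
    print(f"{mutant}: {retained_fraction(plus, minus):.2f}")
```

For this series the retained fraction falls as asparagines are added, in line with the increasing Sec dependence described in section 2.2.1.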


Figure 2.7: Mechanism of insertion of Procoat-Lep at the interface of YidC and SecYEG.

Cartoon representation of the YidC/SecYEG complex with the greasy slide of YidC and the lateral gate of SecYEG facing each other. The mechanism of PCLep insertion at the interface of these two proteins is shown in steps. A. The protein arrives at the inner membrane of E. coli. B. The signal peptide H1 enters the hydrophilic groove of YidC as a hairpin with the N-tail facing the cytoplasm. C. The mature domain of PCLep (H2) interacts with SecYEG, and both TM segments of the substrate partition into the membrane with the periplasmic loop in the hydrophilic groove. D. The transmembrane segments partition fully into the membrane and the loop is translocated into the periplasm. E. The inserted protein moves laterally away from the membrane insertase complex.


CHAPTER 3

Structural and Functional Studies of YidC Using Electron Spin Resonance

3.0 Contributions

The work reported in this chapter was carried out by Balasubramani Hariharan from The Ohio State University and Indra Sahu from Miami University. Balasubramani created, purified and reconstituted all the YidC constructs, and Indra performed the EPR and DEER experiments. Balasubramani wrote the manuscript, and both contributed equally to the preparation of the figures.


3.1 Introduction

The field of structural biology had a landmark event in 1985 when the first X-ray crystal structure of a membrane protein was solved. Since then, over 650 structures of membrane proteins have been determined using various techniques. Even though the number of atomic structures of membrane proteins available from X-ray, NMR and cryo-EM methods has increased exponentially, these structures alone cannot fully reveal the dynamics of residues within membrane proteins (161). In 1989, Hubbell and coworkers developed a method called site-directed spin labeling (SDSL), in which a paramagnetic molecule carrying an unpaired electron is used to study protein dynamics. In SDSL experiments, all native non-disulfide-bonded cysteines are replaced with alanine or serine (162). A unique cysteine residue is then introduced into the recombinant protein by site-directed mutagenesis and subsequently reacted with a sulfhydryl-specific nitroxide reagent to generate a stable spin label side chain (163,164).

These experiments are in principle similar to NMR experiments, except that an electron spin is excited instead of a nuclear spin. They are called Electron Paramagnetic Resonance (EPR) experiments and can be employed to elucidate the dynamics of the site of interest in membrane proteins.

The fundamental principle of EPR spectroscopy is that it measures the absorption of microwave radiation corresponding to the energy splitting of an unpaired electron when it is placed in a strong magnetic field. The electron possesses a magnetic moment and spin quantum number S = 1/2, with magnetic spin components ms = +1/2 and ms = −1/2. In the absence of a magnetic field, these two states are degenerate and have the same energy.


However, when an external magnetic field (B0) is applied, the degeneracy is lifted and the magnetic moment of the electron aligns itself either parallel (ms = −1/2) or antiparallel (ms = +1/2) to the field, giving rise to two energy states. The energy difference between the two states is proportional to the applied magnetic field, with the g-factor as the proportionality constant, which plays a role similar to the chemical shift in an NMR experiment. In a typical EPR experiment, the microwave frequency or the magnetic field is varied until the energy difference between the two electron states matches the energy of the microwave radiation. This phenomenon is called resonance (165). The output of this experiment is recorded as a first-derivative lineshape, as shown in Figure 3.2. Parameters such as the amplitude and peak width can be extracted from the lineshape and used in the calculation of the membrane depth parameter Φ (phi), which provides information about the solvent accessibility of a spin label within a protein.
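As a rough numerical illustration of the resonance condition described above, the following sketch assumes the standard Zeeman relation hν = g·μB·B0 (not written out in the text) and computes the field at which a spin with a free-electron-like g value would resonate at a given microwave frequency. The function name, the default g value of 2.0023 and the example frequency of 9.5 GHz are illustrative assumptions, not values taken from the experiments in this chapter.

```python
# Physical constants (CODATA 2018 values)
PLANCK_H = 6.62607015e-34          # Planck constant, J*s
BOHR_MAGNETON = 9.2740100783e-24   # Bohr magneton, J/T


def resonance_field_tesla(frequency_hz, g_factor=2.0023):
    """Return the field B0 (tesla) at which h*nu = g*mu_B*B0 holds.

    g_factor defaults to an assumed free-electron-like value.
    """
    return PLANCK_H * frequency_hz / (g_factor * BOHR_MAGNETON)


if __name__ == "__main__":
    # Hypothetical X-band microwave frequency of 9.5 GHz
    b0 = resonance_field_tesla(9.5e9)
    print(f"Resonance field: {b0:.4f} T ({b0 * 1e4:.0f} G)")
```

For an X-band frequency this gives a field of roughly 0.34 T, which is the same order as the central field used for the CW-EPR scans in this chapter.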

An advantage of EPR spectroscopy is that it is three times more sensitive than nuclear magnetic resonance (NMR) spectroscopy. It is not restricted by size constraints and does not rely on expensive isotope labeling, as NMR experiments typically do. Nor is it influenced by the optical properties of the sample. EPR experiments can be performed on a wide range of samples, from proteins in solution to highly packed membrane suspensions, tissue samples, ammonium sulfate-precipitated solids, and frozen samples maintained at cryogenic temperatures (166). The sample volume can range from as little as 70 nL to several mL, or even a small animal (167,168). Most importantly, EPR spectroscopy can answer structural and dynamic questions about proteins in solution and in membrane bilayers that are challenging for most other biophysical techniques (169).

3.1.1 EPR: Continuous wave and Pulsed EPR


There are two methodologies for spin measurements with an EPR instrument: continuous wave EPR (CW-EPR) and pulsed EPR. In CW-EPR, a constant microwave frequency is maintained while the magnetic field (B0) is swept through resonance.

In the experimental set-up, the microwave (MW) field builds up in a resonator into which the sample tube is placed (Figure 3.1). The MW irradiation is delivered to the resonator through a waveguide; at resonance the sample absorbs part of it, which changes the MW power reflected back from the resonator. Recording this reflected MW power as a function of the magnetic field yields the CW-EPR spectrum. In pulse EPR, the spectrum is generated by exciting a large frequency range simultaneously with a single high-power MW pulse of frequency ν at a constant magnetic field B0. Most EPR applications use continuous wave methods, since pulse EPR requires more sophisticated equipment. One limitation of pulse EPR is that it has to be carried out at low temperature because of the short transverse relaxation times of the electron spins, especially for transition metal ions. CW-EPR spectra, on the other hand, can be recorded at room temperature for a number of spin systems, including radicals and transition metal ions, and CW-EPR is significantly more sensitive than pulsed EPR (170,171).

Pulse EPR techniques are emerging as the method of choice for measuring interspin distances between two spin-labeled residues. The measurable distances range from 2 to 8 nm, matching the distance range typically of interest in proteins and multiprotein assemblies. We employed the pulse EPR method called double electron-electron resonance (DEER) to study the distances between specific residues within YidC. The major goal of this DEER study was to establish the technique in the lab, with the long-term objective of measuring distances between spin labels attached to YidC and SecYEG complexes, respectively (172-174). Such distances can show how the YidC and SecYEG machineries work together in the membrane insertion process.

3.1.2 Double Electron-Electron Resonance

Double electron-electron resonance (DEER), also known as pulsed electron double resonance (PELDOR) EPR spectroscopy, in combination with site-directed spin labeling, is a powerful structural biology technique used to obtain long-range distances of ~20-80 Å in biomolecules. The distances are determined by measuring the dipolar coupling between two unpaired electron spins. In DEER, the coupling between the two spins is measured by monitoring one set of spins while exciting another set of spins with a second microwave frequency, which allows the distance between them to be determined (175). These distance measurements provide valuable structural information on systems for which other techniques such as solution NMR, X-ray crystallography or FRET prove difficult or impossible (176). However, the application of DEER spectroscopy to membrane proteins can still be difficult because of much shorter transverse relaxation times (T2) or phase memory times (Tm) and poor DEER modulation in biologically relevant proteoliposomes compared with water-soluble proteins or membrane proteins in detergent micelles. Additionally, a highly concentrated protein sample in liposomes introduces a strong background contribution due to crowding, which severely limits sensitivity, distance range, and experimental throughput. To circumvent these issues, DEER experiments with membrane proteins are conducted with at least one of the spin labels located outside the membrane. Other approaches include the use of lipodisq nanoparticles or nanodiscs, reconstitution in the presence of unlabeled proteins, bicelles, or restricted spin label probes. Lipodisq nanoparticles and nanodiscs are gaining popularity because they possess a small scaffold protein that surrounds a bilayer of phospholipids. With this nanodisc system, artifacts due to crowding are avoided (177). DEER experiments have historically been carried out at X-band (~9.5 GHz) microwave frequency. However, the trend is now moving towards Q-band excitation to increase sensitivity, since X-band DEER experiments can suffer from poor signal-to-noise and extended data collection times. Studies by Sahu et al. and others have reported an increase in sensitivity in DEER measurements on proteins or peptides when the experiment is performed at Q-band rather than X-band (178,179).
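To give a feel for how steeply the dipolar coupling mentioned above depends on distance, the sketch below assumes the standard point-dipole expression for two electron spins, ν = μ0·μB²·g1·g2 / (4π·h·r³), evaluated for the perpendicular orientation of the interspin vector. This expression, the default g values and the function names are assumptions introduced for illustration; they are not taken from the text or from the analysis software used in this work.

```python
import math

# Physical constants (CODATA 2018 values)
MU_0 = 1.25663706212e-6            # vacuum permeability, T*m/A
BOHR_MAGNETON = 9.2740100783e-24   # Bohr magneton, J/T
PLANCK_H = 6.62607015e-34          # Planck constant, J*s


def dipolar_frequency_mhz(distance_nm, g1=2.0023, g2=2.0023):
    """Point-dipole coupling frequency (MHz) for two spins r nm apart.

    Assumes the perpendicular orientation and free-electron-like g values.
    """
    r_m = distance_nm * 1e-9
    nu_hz = (MU_0 * BOHR_MAGNETON ** 2 * g1 * g2
             / (4.0 * math.pi * PLANCK_H * r_m ** 3))
    return nu_hz / 1e6


def distance_nm_from_frequency(nu_mhz, g1=2.0023, g2=2.0023):
    """Invert the relation above: distance (nm) for a given frequency (MHz)."""
    return (dipolar_frequency_mhz(1.0, g1, g2) / nu_mhz) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Ends of the ~20-80 Angstrom range quoted in the text
    for r_nm in (2.0, 8.0):
        print(f"r = {r_nm:.1f} nm -> {dipolar_frequency_mhz(r_nm):.3f} MHz")
```

Because the frequency scales as 1/r³, a fourfold increase in distance lowers the coupling 64-fold, which is one way to see why long distances demand long phase memory times.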

In summary, EPR spectroscopy is a powerful technique for addressing many structural questions dealing with membrane proteins. In particular, it can examine the side-chain mobility of the spin-labeled residue, its solvent accessibility, the polarity of its immediate environment, and intra- or intermolecular distances between two nitroxide spin labels attached to the protein. Among the various spin labels available, the (1-oxyl-2,2,5,5-tetramethylpyrroline-3-methyl) methanethiosulfonate spin label (MTSL) (180) is typically used because it is the best characterized, it attaches specifically to the sulfhydryl group of cysteines, and it has a relatively small molecular volume, similar to that of a tryptophan side chain (Fig. 3.2). MTSL is covalently attached to the protein by formation of a disulfide bond with the sulfhydryl group of a cysteine engineered at a specific location in YidC. One of the advantages of this spin label is its rather long and flexible side chain, characterized by five rotatable bonds (12 Å in length); this intrinsic flexibility makes MTSL a non-perturbing probe at most sites in proteins (181).

In 2014, the structures of the membrane protein YidC from Bacillus halodurans and E. coli were solved at high resolution by X-ray crystallography. The structures revealed a unique hydrophilic cavity within the inner leaflet of the membrane, which is exposed to the cytoplasm and lipid bilayer but sealed on the periplasmic side. The cavity contains a strictly conserved positively charged residue that is essential in B. subtilis and has been proposed to interact transiently with negatively charged residues in the hydrophilic region of membrane protein substrates that are to be translocated across the membrane. Indeed, the negatively charged residues in the N-terminal tail of MifM are necessary for translocation, consistent with an electrostatic attraction model (118-121). However, the conserved positive charge has been shown not to be essential for the insertase activity of E. coli YidC or of an A. thaliana Alb3 derivative.

Recent studies by Chen et al. (Chen and Dalbey) investigated the role of the positively charged residue in E. coli YidC and in the Gram-positive S. mutans YidC2 to determine whether the charge of the residue or its hydrophilicity is important for the insertase activity of YidC. The results show that the positive charge at position 366 of E. coli YidC and position 73 of S. mutans YidC2 is necessary if the adjacent residue is apolar, suggesting that it is the hydrophilic microenvironment that is important for the operation of the cavity. The arginine loses its essential function if the neighboring residue at position 517 (for E. coli YidC) or 234 (for S. mutans YidC2) is hydrophilic.


In this chapter, we employed EPR to study the hydrophilic nature of the E. coli YidC groove, the environment near the positively charged residue, and residues near the top of the cavity. In addition, we examined the solvent accessibility of four residues that previously gave conflicting solvent exposure results in cysteine alkylation studies with N-ethylmaleimide and in molecular dynamics simulations (182). These four residues are part of the YidC “greasy slide”, along which the TM segment of the substrate moves during insertion (183). Finally, as a proof of concept, we measured the distance between two residues within YidC to see whether YidC reconstituted in lipid vesicles is amenable to DEER analysis.

3.2 Results

3.2.1 Accessibility studies of YidC residues.

The structural dynamics and solvent accessibility of YidC residues in DOPC lipid bilayers were investigated using SDSL coupled with EPR spectroscopy. Both CW-EPR line shape analysis and CW-EPR power saturation methods were used to analyze the immediate spatial environment of YidC residues. CW-EPR spectra are very sensitive to the nitroxide side-chain motion of spin-labeled proteins incorporated into different environments (i.e., aqueous exposed vs lipid exposed). As controls, we studied two YidC residues (Figure 3.3): 501C, which is buried in the hydrophobic lipid bilayer environment, and 416C, which is in the cytoplasmic loop and hence exposed to an aqueous environment. Figure 3.5 shows the CW-EPR spectra of both mutants at room temperature. The amplitude and peak width of each CW-EPR spectrum were measured, and the membrane depth parameter was calculated as explained in the methods section (Table 3.1).


Power saturation accessibility studies were performed on both mutants as described in the methods section. If the spin label is more accessible to a water-soluble quencher than to a lipid-soluble quencher, more power is required to saturate the sample in the presence of the water-soluble quencher. The increase in longitudinal relaxation caused by collisions with the quencher is directly proportional to the increase in the power required to saturate the signal. As can be seen, spin-labeled 416C is more accessible to solvent, because the power saturation curves show greater accessibility to Ni-EDDA than to oxygen; conversely, spin-labeled 501C is more accessible to lipid, because more power is required to saturate the signal with oxygen than with Ni-EDDA.

Having established the CW-EPR technique, we examined the solvent accessibility of YidC 427C, 432C and 434C, three of the four residues (L427, I432, L434, and V500) for which the molecular dynamics simulations and the cysteine accessibility studies using N-ethylmaleimide disagreed (182). To clarify these results, the three YidC mutants were purified and labeled with the MTSL spin label. CW-EPR experiments were performed on the mutants at room temperature, and the spectra are shown in Figure 3.5. The power saturation EPR results suggested that the spin-labeled cysteines at 427 (Fig. 3.6, L) and 434 (Fig. 3.6, H) are water accessible even though, according to the crystal structure, they are part of the greasy slide and predicted to face lipid. The power saturation curves with Ni-EDDA require a higher power to saturate than those with air (21% O2), meaning that the spin label is more accessible to the quencher in the aqueous environment than to the quencher in lipid. On the other hand, the spin label at 432 was lipid exposed, contradicting the in vivo NEM alkylation studies. The membrane depth parameter was calculated as −0.2 for 427 and 434, suggesting that they are water exposed, and 0.3 for 432 (Fig. 3.6, K), suggesting that it is lipid exposed. All three mutants have depth parameters close to zero, indicating that the residues are at the interface of water and lipid.

Additionally, to confirm the efficacy of the EPR analysis, we studied residues 371 and 473, which are part of TM2 and TM3, respectively. Previous NEM accessibility and MD simulation results suggested that 371 is water accessible and 473 is inaccessible. In Figure 3.6 (C), the Ni-EDDA accessibility of spin-labeled Cys 371 was only slightly higher than that of air, showing that the residue is exposed to both water and lipid. This is due to the dynamic nature of YidC, in which the TM segments are in constant transient motion, a property previously shown by Hennon et al. (136). The power saturation curves for spin-labeled Cys 473 show that it is more accessible to oxygen, suggesting that it is lipid exposed, as predicted by the crystal structure of YidC.

3.2.2 Probing the microenvironment of the hydrophilic groove by SDSL

To probe the microenvironment in the groove, 366R was mutated to a cysteine residue, spin labeled and reconstituted into proteoliposomes. The power saturation EPR studies (Fig. 3.6, E) show that the nitroxide side chain at 366 is accessible to water, with a membrane depth value of −0.8 (Table 3.1). The adjacent residue on top of 366R is a tyrosine at position 517. This residue, along with 516, was shown to shield the positive charge (366) in the molecular dynamics simulations performed by Chen et al. To study the influence of residue 517 on the solvent accessibility of residue 366, we performed power saturation experiments on spin-labeled 366C YidC with its immediate environment varied. First, to make the spatial region around 366 polar and hydrophilic, we generated the mutant 366C/517N. Figure 3.6, F shows that the spin label at 366 is still water accessible when a polar hydrophilic residue is at position 517. We then examined the spin label at 366 with the Y517 residue mutated to an apolar isoleucine. The power saturation curves for 366C/517I (Fig. 3.6, I) show that the spin label is buried inside the lipids, with a membrane depth parameter of 0.9.

To test the accessibility of position 517 with respect to the conserved residue at 366, we created the mutants 517C, 366N/517C and 366I/517C (with the residues at 366 and 517 swapped). The proteins were purified and modified with spin label, and the accessibility of the spin label at position 517 was probed. The EPR results in Fig. 3.6, M show that the spin label at 517 is well accessible to Ni-EDDA in the wild-type background (366R). When the adjacent residue below (366) was changed to an asparagine (Fig. 3.6, G), the accessibility of the spin label at 517 was reduced four-fold (compared to 517C), with a membrane depth parameter of −0.2. To determine whether the accessibility of the spin label at 517 can be reduced further by placing an apolar isoleucine at 366, we studied the spin-labeled 366I/517C YidC protein. The power saturation EPR results (Fig. 3.6, J) show that the solvent accessibility of the spin label at 517C is the same with isoleucine or asparagine at position 366. This shows that the reduction in accessibility is probably due to decreased hydrophilicity or steric hindrance at 366 caused by asparagine and isoleucine. Taken together, our experiments reveal that the positive charge is essential to keep the groove hydrated, especially when a hydrophobic residue is located at the top of the cavity.

3.2.3 DEER EPR can be used to study distances not resolved by crystal structures

To demonstrate that DEER can be used to determine distances within YidC, we studied YidC with spin labels incorporated at both 375 and 542. Figure 3.7 shows the DEER data for MTSL spin-labeled YidC (375/542) in DOPC proteoliposomes. A Fourier deconvolution method was applied to obtain the inter-spin distance and its distribution (see methods section). The distance spectrum (Fig. 3.6) reveals two major peaks, one at 47 Å and the other at 20 Å, most likely representing two different conformations of the C-tail and reflecting the fluctuation dynamics of the protein and possibly the flexibility of the spin label side chain. The modulation depth of the curve is in good agreement with standard DEER measurements, with a phase memory time of 1.1 microseconds.

3.3 Discussion

In this study, we performed CW-EPR line shape analysis and CW-EPR power saturation studies on YidC residues to elucidate the mobility of a spin-labeled cysteine and its solvent accessibility. Some residues were chosen because the crystal structure predicted that they were exposed to the lipid or aqueous region. Other residues were chosen because of conflicting solvent exposure results between cysteine accessibility studies with N-ethylmaleimide and the MD simulations previously performed in our lab (182). All the residues were replaced with cysteines and spin labeled. The YidC proteins were purified to homogeneity, concentrated and reconstituted into DOPC liposomes. The data reported in this study provide a better understanding of the side-chain mobility, solvent exposure and spatial environment of different residues of YidC.

To confirm that the EPR method can be used effectively to probe the immediate environment of a YidC residue, two residues were selected. The mutant F501C YidC, with the cysteine buried inside the membrane, and K416C YidC, with the cysteine exposed to the aqueous environment, were spin labeled, purified and reconstituted into proteoliposomes. The EPR spectral lineshape (Fig. 3.5) is narrower for 416C YidC than for 501C YidC, showing that the residue is more exposed to the aqueous environment. Also, the power saturation spectra (Fig. 3.6) for 416C YidC indicate that the spin label is more accessible to Ni-EDDA than to oxygen. This is evident from the fact that the power required to saturate the sample is higher in the presence of Ni-EDDA than in the presence of oxygen. In contrast, the trend reverses for 501C YidC, where the power required to saturate the sample is higher in the presence of oxygen, indicating that the spin label is buried within the membrane and much less accessible to Ni-EDDA.

So far, we have analyzed three of the four residues that showed contradictory solvent exposure results between NEM modification and MD simulations. YidC 427C, 432C and 434C were shown to face the greasy slide and to be involved in crosslinking with the TM segment of substrates (97). The cysteine accessibility study with NEM showed that all three residues, 427, 432 and 434, were accessible to water to various extents, but surprisingly the MD studies did not show any water molecules within 6 Å of any of these residues. Remarkably, our study matches the in vivo cysteine accessibility results, showing that the residues are accessible to water but to a lesser extent. This is possible if the residues are at the interface of lipid and water. Power saturation curves show that 427 (Fig. 3.6, L) and 434 (Fig. 3.6, H) are more accessible to Ni-EDDA than to oxygen, and that 432 (Fig. 3.6, K) is lipid exposed, as the spin label is quenched more by oxygen than by Ni-EDDA. Another parameter calculated was the membrane depth parameter Φ (phi). It is calculated from the half-saturation power values (P1/2) in the presence of the water-soluble and lipid-soluble quenchers, as described in the methods section. The values of the membrane depth parameter lie between −1 and 1. A higher positive value means the residue is buried deeper within the membrane, and a negative value means the residue is outside the membrane or water exposed. The depth parameters for 427 and 434 were calculated as −0.2, and that for 432 as 0.3, indicating that these residues are at the interface of water and lipid. One probable reason for the contradictory results between the studies is that the MD simulations (185 ns) were not run long enough to capture all the dynamic movements of the TM segments.

The CW-EPR results for the spin-labeled groove residues shed light on the function of the arginine in keeping the microenvironment of the groove fully hydrated. This explains why the arginine is essential for the operation of the groove when there is an apolar residue at the top of the groove (at 517 for E. coli YidC and at 234 for S. mutans YidC2). The power saturation spectrum for the spin-labeled 366C mutant revealed that the spin label was well accessible to Ni-EDDA. When the adjacent residue 517 was changed from an aromatic tyrosine to a polar asparagine, the spin-labeled 366C was still solvent accessible, as predicted. But when 517 was mutated to an isoleucine, the accessibility of 366 was greatly decreased, as is evident from the depth parameter Φ. We hypothesize that when residue 517 is made hydrophobic, the groove constricts at its periplasmic side, making it less solvent exposed and making it difficult for Ni-EDDA to reach the spin label.

Finally, we performed DEER experiments on a double-cysteine spin-labeled mutant of YidC. We selected one cysteine in a TM segment (375) and the other on the C-tail (542) in the cytoplasm. The C-tail is not resolved in the crystal structure, most likely due to its high flexibility. The DEER spectrum revealed two major distances, 47 Å and 20 Å, consistent with two very different conformations of the protein. These data demonstrate that the DEER approach is capable of providing spatial information between strategically placed spin labels and of detecting different conformations of the protein.

In conclusion, we have employed EPR spectroscopy and SDSL to study the YidC aqueous groove residues and a residue within the greasy slide, and to measure distances within YidC. We show that the strictly conserved arginine in the groove is required to keep the cavity hydrophilic when there is a hydrophobic residue at the top of the cavity. We also confirmed that the greasy slide residue 434 is partly accessible to water, in agreement with the solvent exposure data derived from cysteine modification by N-ethylmaleimide (182). Finally, the DEER study showed that this method is capable of measuring distances between two spin labels attached to YidC.

3.4 Methods and Materials

Kanamycin and lysozyme were purchased from Sigma. Proteinase K (PK) was purchased from Qiagen. Isopropyl 1-thio-β-D-galactopyranoside (IPTG) was from Research Products International Corp. PMSF was purchased from United States Biochemical (Affymetrix). Phosphate-buffered saline, pH 7.2 (PBS), was purchased from Thermo Scientific. (1-Oxyl-2,2,5,5-tetramethyl-Δ3-pyrroline-3-methyl) methanethiosulfonate (MTSL) was purchased from Toronto Research Chemicals (Canada). DOPC and the extrusion accessories were purchased from Avanti Polar Lipids. The E. coli BL21 strain was from our lab stock. YidC single-cysteine mutants (all with a C-terminal His6 tag) were generated by site-directed mutagenesis using the QuikChange kit in a cysteine-less background in which the natural cysteine at position 423 had been changed to serine.

3.4.1 Overexpression and Purification of YidC

The overexpression and purification of YidC expressed from pEH1 were carried out in E. coli BL21 cells. Individual colonies were picked and grown overnight in 5 ml cultures containing 50 µg/ml kanamycin. The overnight culture was used to inoculate 1 L of LB medium, which was grown at 37 °C until it reached an OD600 of 0.6, at which point protein expression was induced with 1 mM IPTG. Cells were harvested 3 h post induction by centrifugation (3200 g for 20 min, 4 °C) and stored at −80 °C.

Cell pellets were thawed on ice and resuspended in PBS buffer, pH 7.2, containing lysozyme (1 mg/ml). The cells were sonicated on ice for several cycles at 65% power output to break open the membranes. The samples were centrifuged at 40,000 g for 50 min at 4 °C to remove unlysed cells and inclusion bodies. To isolate the membranes, the supernatant was then subjected to ultracentrifugation at 160,000 g for 50 min at 4 °C. Membrane pellets were solubilized in PBS containing 1% DDM (Anatrace) overnight by stirring with magnetic beads. The sample was again centrifuged for 25 min to remove all non-solubilized components, and the supernatant was incubated with the Co2+-NTA matrix (Qiagen) for 3 h at 4 °C. After washing with low-imidazole buffer [PBS (pH 7.2), 20 mM imidazole, 10% glycerol and 0.2% (w/v) DDM], 3 mg of the spin label MTSL was added to the YidC sample on the column and shaken for 24 h at 4 °C. YidC was eluted from the column with high-imidazole buffer [PBS (pH 7.2), 400 mM imidazole, 0.2% (w/v) DDM] and collected in 1 mL fractions. The samples were analyzed by 15% SDS-PAGE, and the fractions containing pure YidC were pooled and dialyzed overnight against buffer [PBS pH 7.2, 0.02% (w/v) DDM] to remove glycerol, imidazole and free spin label. The YidC protein was concentrated by centrifugation using a 50 kDa Amicon spin concentrator.

3.4.2 Reconstitution into Proteoliposomes

1,2-Dioleoyl-sn-glycero-3-phosphocholine (DOPC) was purchased from Avanti Polar Lipids (Alabaster, AL), and lipid vesicles were prepared as described in (184). DOPC lipid systems have been well characterized for the reconstitution of YidC, and studies have shown that in the presence of DOPC, YidC reconstitutes in the right conformation (with the periplasmic domain inside the liposomes) with maximum efficiency (185). The dry lipid film was resuspended in HEPES buffer (pH 8.0) containing 50 mM KCl and 0.02% (w/v) DDM and incubated at 37 °C for 1 h. The sample was vortexed at low speed to form multilamellar vesicles. Unilamellar vesicles were generated by extrusion (Mini-Extruder, Avanti Polar Lipids Inc). Specifically, 1 ml of the lipid suspension was extruded 7-11 times through a membrane with a pore size of 0.4 µm until a semi-clear solution was obtained. To prepare proteoliposomes, a concentrated spin-labeled YidC sample (at a final concentration of 20 µM) was added to the lipid sample and extruded as described above. The final protein:lipid molar ratio was set to 1:400. SM2-Biobeads were prepared according to the protocol (186) and added to the proteoliposomes to remove excess DDM as described in Kusters et al. Removal of DDM with the Biobeads aids further reconstitution and was achieved by slow rotary shaking of the sample overnight at 4 °C. The Biobeads were removed by centrifugation at 10,000 rpm for 5 min. The proteoliposomes were concentrated by centrifugation at 200,000 g to form a pellet and resuspended in 15 µl of HEPES buffer for EPR measurements.
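The arithmetic behind the 1:400 protein:lipid molar ratio can be sketched as follows. This is only an illustration: the DOPC molar mass of about 786.1 g/mol and the 1 ml sample volume used in the example are assumptions, and the function names are hypothetical rather than part of the protocol.

```python
# Assumed approximate molar mass of DOPC (g/mol); not stated in the text.
DOPC_MOLAR_MASS_G_PER_MOL = 786.1


def lipid_concentration_um(protein_um, lipids_per_protein):
    """Lipid concentration (micromolar) needed for a given molar ratio."""
    return protein_um * lipids_per_protein


def lipid_mass_mg(lipid_um, volume_ml,
                  molar_mass=DOPC_MOLAR_MASS_G_PER_MOL):
    """Mass of lipid (mg) present at lipid_um micromolar in volume_ml."""
    moles = lipid_um * 1e-6 * volume_ml * 1e-3   # mol/L times L
    return moles * molar_mass * 1e3               # g converted to mg


if __name__ == "__main__":
    # 20 micromolar YidC at 1:400 protein:lipid, hypothetical 1 ml sample
    lipid_um = lipid_concentration_um(20.0, 400)
    print(f"Lipid concentration: {lipid_um / 1000:.1f} mM")
    print(f"DOPC needed for 1 ml: {lipid_mass_mg(lipid_um, 1.0):.2f} mg")
```

Under these assumptions, 20 µM YidC corresponds to 8 mM lipid, or roughly 6.3 mg of DOPC per ml.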

3.4.3 CW-EPR Spectroscopic Measurements

EPR experiments were conducted with the assistance of the Lorigan Group at the Ohio Advanced EPR Laboratory at Miami University. CW-EPR spectra were collected at X-band on a Bruker EMX CW-EPR spectrometer using an ER041xG microwave bridge and ER4119-HS cavity coupled with a BVT 3000 nitrogen gas temperature controller. The spin concentration of the YidC samples was ~40-80 μM. Each spin-labeled CW-EPR spectrum was acquired by signal averaging ten 42-s field scans with a central field of 3315 G, a sweep width of 100 G, a modulation frequency of 100 kHz, a modulation amplitude of 1 G, and a microwave power of 10 mW at 295 K. The side-chain mobility of the spin label was determined by calculating the inverse central line width (peak width) from each CW-EPR spectrum. An empirical motional parameter (τ0) was determined from the CW-EPR spectra using the equation shown below (169,187).

Eq 1: $\tau_0 = K \, \Delta H \left[ \left( h_0 / h_{-1} \right)^{1/2} - 1 \right]$

where K = 6.5 × 10^−10 s, ΔH is the width of the center line, and h0 and h−1 are the heights of the center and high-field lines, respectively.
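A minimal sketch of how the empirical motional parameter could be evaluated from measured line parameters is given below. It assumes the commonly used form τ0 = K·ΔH·[(h0/h−1)^1/2 − 1] with ΔH in gauss, which matches the symbol definitions above; the function name and the example numbers are hypothetical and do not come from the YidC spectra.

```python
import math

K_SECONDS = 6.5e-10  # constant K as given in the text


def motional_parameter(delta_h_gauss, h0, h_minus1):
    """Empirical motional parameter tau_0 (seconds).

    delta_h_gauss: peak-to-peak width of the center line (assumed in gauss)
    h0, h_minus1:  heights of the center and high-field lines
    """
    if h0 <= 0 or h_minus1 <= 0:
        raise ValueError("line heights must be positive")
    return K_SECONDS * delta_h_gauss * (math.sqrt(h0 / h_minus1) - 1.0)


if __name__ == "__main__":
    # Hypothetical line parameters, for illustration only
    tau0 = motional_parameter(delta_h_gauss=2.0, h0=4.0, h_minus1=1.0)
    print(f"tau_0 = {tau0 * 1e9:.2f} ns")
```

A broader center line or a more strongly attenuated high-field line both give a larger τ0, corresponding to slower motion of the spin label.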

3.4.4 EPR Spectral Simulations


EPR spectra were simulated using the Multicomponent LabVIEW program written by Dr. Christian Altenbach (188), which includes the microscopic order macroscopic disorder (MOMD) model developed by the Freed group (189). The principal components of the hyperfine interaction tensor, A = [5.5 G, 5.5 G, 34.8 G], and of the g-tensor, g = [2.0088, 2.0063, 2.0023], were obtained from a least-squares fit to the spectrum. During the simulation process, the A and g tensors were held constant and the rotational diffusion tensors were varied. A two-site fit was used to account for both the rigid (slower) and the more mobile (faster) motional components of the EPR spectrum. The best-fit rotational correlation times and relative populations of both components were determined using a Brownian diffusion model.

Power saturation experiments were performed on a Bruker EMX X-band CW-EPR spectrometer consisting of an ER 041XG microwave bridge coupled with an ER 4123D CW-Resonator (Bruker BioSpin) (190). Samples were loaded into gas-permeable TPX capillary tubes with a total volume of 8-10 μL at a spin label concentration of 40–80 μM. EPR data collection was carried out using a modulation amplitude of 1 G and a varying microwave power of 0.4–100 mW. The scan range of all spectra was 100 G, and the final spectra were obtained by signal averaging 50 scans.

3.4.5 CW Power Saturation Experiments

CW-EPR power saturation curves were obtained for all the spin labeled YidC Cys mutants under three conditions: (1) equilibrated with nitrogen as a control; (2) equilibrated with a lipid-soluble paramagnetic reagent, air (20% oxygen); and (3) equilibrated with nitrogen in the presence of a water-soluble paramagnetic reagent, nickel(II) ethylenediaminediacetate (NiEDDA) chelate (1 mM), as described (191). The samples were purged with gas for at least 60 min at a rate of 10 mL/min before performing each EPR measurement. High purity nitrogen and house supply compressed air lines were used. The resonator remained connected to the gas line during all measurements, and the sample temperature was held at 295 K. The peak-to-peak amplitude (A) of the first derivative mI = 0 resonance line was measured and plotted against the square root of the incident microwave power. The data points were then fit with a MATLAB script using equation 2:

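Eq 2: $A = I \sqrt{P} \left[ 1 + \left( 2^{1/\varepsilon} - 1 \right) P / P_{1/2} \right]^{-\varepsilon}$

where A is the peak-to-peak amplitude, P is the incident microwave power, I is a scaling factor, P1/2 is the power where the first derivative amplitude is reduced to half of its unsaturated value, and ε is a measure of the homogeneity of saturation of the resonance line. In the above equation, I, ε, and P1/2 are adjustable parameters, and the fit yields a characteristic P1/2 value.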
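To make the role of P1/2 and ε concrete, the following Python sketch codes a saturation model and a very simple grid-search fit. It assumes the commonly used form A = I·√P·[1 + (2^(1/ε) − 1)·P/P1/2]^(−ε). This is not the MATLAB script used in this work: the function names, the grid ranges, and the synthetic data are all hypothetical, and a real analysis would normally use a proper nonlinear least-squares routine.

```python
import math


def saturation_amplitude(power_mw, scale, p_half_mw, epsilon):
    """Peak-to-peak first-derivative amplitude at a given microwave power.

    Assumed model: A = I * sqrt(P) * [1 + (2**(1/eps) - 1) * P / P_half]**(-eps)
    """
    term = 1.0 + (2.0 ** (1.0 / epsilon) - 1.0) * power_mw / p_half_mw
    return scale * math.sqrt(power_mw) * term ** (-epsilon)


def fit_saturation_curve(powers_mw, amplitudes, p_half_grid=None,
                         epsilon_grid=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Coarse grid search over P_half and epsilon.

    For each (P_half, epsilon) pair the scaling factor I enters linearly,
    so its least-squares value is computed in closed form.
    The epsilon grid spans the conventional 0.5 to 1.5 range (an assumption).
    Returns (scale, p_half, epsilon) with the smallest squared error.
    """
    if p_half_grid is None:
        p_half_grid = [0.5 * k for k in range(1, 401)]  # 0.5 to 200 mW, hypothetical
    best = None
    for p_half in p_half_grid:
        for epsilon in epsilon_grid:
            shape = [saturation_amplitude(p, 1.0, p_half, epsilon) for p in powers_mw]
            scale = (sum(a * s for a, s in zip(amplitudes, shape))
                     / sum(s * s for s in shape))
            sse = sum((a - scale * s) ** 2 for a, s in zip(amplitudes, shape))
            if best is None or sse < best[0]:
                best = (sse, scale, p_half, epsilon)
    return best[1], best[2], best[3]


# Synthetic, noise-free data over the 0.4-100 mW range used for data collection.
powers = [0.4, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
data = [saturation_amplitude(p, 2.0, 20.0, 1.0) for p in powers]
fitted_scale, fitted_p_half, fitted_epsilon = fit_saturation_curve(powers, data)
print(fitted_scale, fitted_p_half, fitted_epsilon)
```

In this model, setting P = P1/2 makes the bracketed term equal to 2^(1/ε), so the amplitude is exactly half of the unsaturated value I·√P for any ε, which matches the definition of P1/2 given in the text.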

The corresponding Φ depth parameters were calculated using the following equation:

Eq. 3: $\Phi = \ln \left[ \Delta P_{1/2}(\mathrm{O_2}) / \Delta P_{1/2}(\mathrm{NiEDDA}) \right]$

where ΔP1/2(O2) is the difference in the P1/2 values for air- and nitrogen-exposed samples, and ΔP1/2(NiEDDA) is the difference in the P1/2 values for NiEDDA- and nitrogen-exposed samples (190).
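The depth parameter calculation can be sketched in a few lines of Python. The sketch assumes the commonly used form Φ = ln[ΔP1/2(O2)/ΔP1/2(NiEDDA)], with each ΔP1/2 taken relative to the nitrogen control. The function name and the P1/2 values in the example are hypothetical and are not data from this study.

```python
import math


def depth_parameter(p_half_nitrogen, p_half_air, p_half_niedda):
    """Membrane depth parameter Phi from three P1/2 values (same units, e.g. mW).

    Assumed form: Phi = ln(dP_half(O2) / dP_half(NiEDDA)), where each
    difference is measured against the nitrogen-equilibrated control.
    """
    delta_oxygen = p_half_air - p_half_nitrogen
    delta_niedda = p_half_niedda - p_half_nitrogen
    if delta_oxygen <= 0 or delta_niedda <= 0:
        # The logarithm is undefined unless both relaxers raise P1/2 above the control.
        raise ValueError("both delta P1/2 values must be positive")
    return math.log(delta_oxygen / delta_niedda)


# Hypothetical P1/2 values in mW: nitrogen control, air, and NiEDDA.
phi = depth_parameter(10.0, 30.0, 18.0)
print(f"Phi = {phi:.2f}")
```

Under this convention, a positive Φ means that oxygen enhances relaxation more than NiEDDA does (a more lipid-exposed site), a negative Φ means the reverse (a more water-exposed site), and values near zero are expected near the lipid-water interface.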


3.4.6 DEER Studies

Pulse DEER experiments were performed using a Bruker ELEXSYS E580 spectrometer equipped with a SuperQ-FT pulse Q-band system with a 10 W amplifier and EN5107D2 resonator. The DEER sample was prepared at a spin concentration of ~80 μM. 30% (w/w) deuterated glycerol was used as a cryoprotectant. The sample was loaded into a 1.1 mm inner diameter quartz capillary (Wilmad LabGlass, Buena, NJ) and mounted into the sample holder (plastic rod) inserted into the resonator. DEER data were collected using the standard four-pulse sequence [(π/2)ν1-τ1-(π)ν1-t-(π)ν2-(τ1+τ2-t)-(π)ν1-τ2–echo] at Q-band with a probe pulse width of 10/20 ns, a pump pulse width of 24 ns, an 80 MHz frequency difference between the probe and pump pulses, a shot repetition time determined by the spin-lattice relaxation time (T1), 100 echoes/point, and 2-step phase cycling at 80 K. Data were collected out to ~2.0 μs, with an overnight data acquisition time (12 h).
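As a rough guide to what a ~2.0 μs DEER trace can resolve, it helps to convert inter-spin distances into dipolar frequencies. The Python sketch below assumes two nitroxide-like electron spins with g ≈ 2.0023, for which the perpendicular dipolar frequency is approximately 52.04 MHz·nm³/r³. The function names are illustrative, and the example distances are the 20 Å and 47 Å values reported in Chapter 4.

```python
# Approximate dipolar constant for two g ~ 2.0023 electron spins (MHz * nm^3).
# This value is an assumption of the sketch, not a parameter taken from the text.
DIPOLAR_CONSTANT_MHZ_NM3 = 52.04


def dipolar_frequency_mhz(distance_nm):
    """Perpendicular dipolar frequency (MHz) for an inter-spin distance in nm."""
    if distance_nm <= 0:
        raise ValueError("distance must be positive")
    return DIPOLAR_CONSTANT_MHZ_NM3 / distance_nm ** 3


def dipolar_period_us(distance_nm):
    """Period of one dipolar oscillation in microseconds (1 / frequency in MHz)."""
    return 1.0 / dipolar_frequency_mhz(distance_nm)


for distance_nm in (2.0, 4.7):
    print(distance_nm,
          round(dipolar_frequency_mhz(distance_nm), 3),
          round(dipolar_period_us(distance_nm), 3))
```

Under this approximation, a 2.0 nm distance corresponds to about 6.5 MHz, while a 4.7 nm distance corresponds to about 0.5 MHz, so a ~2.0 μs trace covers roughly one full oscillation period for the longer distance.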

DEER data were analyzed using DeerAnalysis2011. The distance distributions P(r) were obtained by Tikhonov regularization in the distance domain, incorporating the constraint P(r) > 0. A homogeneous three-dimensional model for micelle samples and a homogeneous two-dimensional model for proteoliposome samples were used for background correction. The regularization parameter in the L-curve was optimized by examining the fit of the time domain. Transverse relaxation data were collected by using the standard Hahn echo pulse sequence [(π/2)-τ1-(π)-τ1-echo] at Q-band with 10/20 ns pulse widths, an initial τ1 of 200 ns and an increment of 16 ns, 100 echoes/point, and 2-step phase cycling at 80 K. The transverse relaxation time (T2) or phase memory time (Tm) was determined by fitting the data with a single exponential decay.
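The single-exponential analysis of the echo decay can be illustrated with a small Python sketch. It assumes a decay of the form V(t) = V0·exp(−t/Tm) and estimates Tm by linear regression on ln V, which is a simplification: a nonlinear least-squares fit would weight noisy late points differently. The function name and the synthetic data are hypothetical; only the 200 ns starting delay and the 16 ns increment are taken from the text.

```python
import math


def fit_single_exponential(times_ns, amplitudes):
    """Estimate (V0, Tm) for V(t) = V0 * exp(-t / Tm) by a log-linear fit.

    times_ns   : time axis in ns
    amplitudes : echo amplitudes, all strictly positive
    Returns (v0, tm_ns).
    """
    if len(times_ns) != len(amplitudes) or len(times_ns) < 2:
        raise ValueError("need at least two matching (time, amplitude) points")
    if any(a <= 0 for a in amplitudes):
        raise ValueError("amplitudes must be positive for a log-linear fit")
    logs = [math.log(a) for a in amplitudes]
    n = len(times_ns)
    mean_t = sum(times_ns) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ns)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ns, logs))
    slope = sxy / sxx                  # equals -1 / Tm
    intercept = mean_y - slope * mean_t  # equals ln(V0)
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free decay sampled from 200 ns in 16 ns steps (Tm = 1500 ns is hypothetical).
times = [200 + 16 * k for k in range(50)]
echo = [0.8 * math.exp(-t / 1500.0) for t in times]
v0, tm = fit_single_exponential(times, echo)
print(f"V0 = {v0:.3f}, Tm = {tm:.1f} ns")
```

Whether the time axis should be τ1 or the total evolution time 2τ1 depends on the convention used for reporting Tm; the sketch simply fits whatever axis it is given.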


3.5 Figures


Fig 3.1: A schematic representation of the EPR instrument

An EPR spectrometer consists of a microwave source, a coaxial transmission cell placed in the field of a magnet, and a microwave detector. The microwave source combines a 100 kHz signal with a digital signal generator, giving an output of 800-2700 MHz after up-conversion using an amplifier. The arbitrary microwave produced in the vector network analyzer (VNA) is sent either directly or via a broadband amplifier to the transmission cell placed in the electromagnet where the samples are placed. The detector receives the signal, down-converts it, and attenuates it if necessary. The source and detector are integrated together with the computer, forming a complete digital VNA. The scanning electromagnets are regulated using the time base units of the standard spectrometers, whose start pulses are fed into the controlling program running on the embedded computer, and the power of the field is regulated by the magnet power unit (192).


Figure 3.2: Spin label MTSL structure and reactivity, and EPR lineshape.

A. The structure of the MTSL spin label and its reactivity with a cysteine residue of the protein are shown. The structure shows the length and flexibility of the side chain with its five rotatable bonds. B. A typical CW-EPR spectral lineshape is shown. The amplitude and width of the peak are extracted to calculate the membrane depth parameter as explained in the methods section.


Fig. 3.3: A cartoon representation of the YidC structure highlighting residues tested using EPR

A cartoon of E. coli YidC (PDB: 3WVF) highlighting residues that were tested to study their respective spatial environments. Control residues 416K (pink), 501 (cyan), 373 (magenta), and 471 (orange) are shown in the cytoplasm and the greasy slide. The positively charged residue 366R is shown in red within the groove. The residues with conflicting data, 434 (light green) and 515 (dark green), are indicated in TM3 and TM5, respectively.


Figure 3.4 A cartoon representation of the structure of YidC highlighting aromatic residues tested using EPR

Space-filling cartoon of YidC showing the residues 516 (yellow), 517 (green), and 433 (pink) shielding arginine 366 (red) in the hydrophilic groove. The three phenyl residues that form an aromatic ring at the periplasmic half of YidC are shown here as spheres.


Figure 3.5: CW-EPR spectra for the spin labeled YidC mutants in DOPC proteoliposomes

Shown here are the CW-EPR data for the YidC mutants that were analyzed in this study, plotted against magnetic field (G). The data collection was carried out using a modulation amplitude of 1 G and a varying microwave power of 0.4–100 mW. The scan range of all spectra was 100 G, and the final spectra were obtained by signal averaging 50 scans.


Site          Membrane depth parameter (Φ)
501            0.9
416           -0.8
371           -0.8
473            0.7
434           -0.2
432            0.3
427           -0.2
366           -0.8
366C/517N     -0.8
366N/517C     -0.2
366I/517C     -0.2
366C/517I      0.9

Table 3.1: Membrane depth parameter for all the spin labeled cysteine YidC mutants tested by EPR. The values of the membrane depth parameter Φ for the various spin labeled residues were calculated from their respective power saturation curves as explained in the methods section.


Figure 3.6: Power saturation curves of the spin labeled YidC cysteine mutants in DOPC bilayer at 295 K.

EPR power saturation curves of the spin labeled YidC Cys proteins in DOPC proteoliposomes at 295 K. K416C (A) and P373C (C) are at sites outside the lipid bilayer, while the F501C (B) and 473C (D) sites are located within the transmembrane domain. The solvent accessibility of spin labeled 366C (F), 366C/517N (E), 366N/517C (G), 366C/517I (I), 366I/517C (J), and 517C (M) was probed to study the hydrophilic groove. Spin labeled 434C (H), 432C (K), and 427C (L) were probed to determine solvent accessibility because the solvent exposure determined for these residues by NEM alkylation contradicted that predicted by MD simulations. The low power amplitude for each of the conditions was rescaled to a common value. The membrane depth was determined from values extracted from the curves as explained in the methods section.


Figure 3.7: DEER spectrum for the double spin labeled 375/542 YidC mutant in DOPC liposomes

Q-band DEER data of the YidC 375/542 double Cys mutant bearing two spin labels, one on TM2 and the other in the C-tail of YidC. The background-subtracted dipolar evolution of the indicated spin labeled mutant (left) and the corresponding distance probability distribution from Tikhonov regularization (right) are shown for DOPC proteoliposomes.


Figure 3.8: Purification and spin labeling procedure of YidC mutants with MTSL.

(A) Shown in the figure are the steps in the purification of spin labeled YidC Cys mutants, starting with cell growth, induction of YidC, sonication of cells, and membrane extraction. The extracted proteins are incubated with Co-resin and spin labeled for 24 h. (B) The His-tagged spin-labeled YidC protein is eluted using imidazole, and pure fractions are collected. The pooled fractions are dialyzed (C) and concentrated (D). (E) They are mixed with DOPC lipids and extruded. (F) Biobeads are added to remove excess detergent, and the sample is analyzed (G). Samples at various stages are analyzed by SDS-PAGE to confirm the purity and concentration of YidC (B, D, G). The samples are analyzed in Dr. Gary Lorigan's laboratory at Miami University, Ohio.


CHAPTER 4

CONCLUSIONS

4.1 Primary findings of the work

In chapter two, the physicochemical properties of the translocated loop of Procoat-Lep were determined and varied to see whether the properties of the loop correlate with the requirement of YidC and SecYEG for membrane insertion. Our studies showed that the polarity and charge of the translocated region are crucial determinants of whether YidC alone or both YidC and SecYEG are needed for membrane insertion. When the periplasmic loop of the substrate was made increasingly polar in the absence of charged residues, the Sec-independent PCLep became increasingly Sec dependent for translocation of the periplasmic loop. Additionally, when the high polarity of the loop was decreased by substituting hydrophobic amino acids into the loop, the YidC/Sec-dependent protein inserted in a less Sec-dependent fashion. We also show that the length of the loop is a positive translocase determinant. Based on our findings, we hypothesize that the hydrophilicity of the loop is the factor that determines which translocase is required. The hydrophobicity of the transmembrane segments also assists in the translocation of the loops: increasing the hydrophobicity of the substrate's TM segments rescued insertion of mutants that previously could not insert. We believe this rescue of translocation is achieved because the addition of hydrophobic amino acids to the TM segments increases the driving force for membrane insertion, allowing the highly polar loops to be translocated.

The combined results reinforce the idea that YidC on its own can translocate a loop only up to a certain polarity threshold. Above this threshold the loop cannot be translocated, most likely because, after being released from the groove, it cannot cross the outer leaflet of the membrane. The length requirement for translocation also makes sense because the groove in the inner leaflet region can only accommodate a certain length of the peptide chain. Therefore, SecYEG would be needed to assist in translocating a loop whose polarity or length exceeds a certain threshold.

How can YidC and SecYEG work together to insert membrane proteins and translocate polar loops across the membrane? One intriguing idea is that membrane insertion always occurs at the YidC/SecYEG interface. Clearly, the YidC insertase can form a dynamic complex with SecYEG, and it is likely that the greasy slide region of YidC is in proximity to the SecYEG lateral gate. Thus, SecYEG can assist in the insertion of certain YidC substrates when needed.

In chapter three, we elucidated the hydrophilic nature of the aqueous groove present in YidC using electron paramagnetic resonance (EPR). Our studies show that a spin label attached to YidC, coupled with EPR, can be effectively employed to study the immediate spatial environment of residues both embedded within the membrane and exposed to the aqueous milieu. Biochemical studies performed by Chen et al. on the conserved positively charged residue in YidC showed that its hydrophilic nature is essential to keep the aqueous groove hydrated and that its electrostatic nature is dispensable for both viability and function. Using continuous wave EPR and power saturation experiments, we provide further evidence that the water accessibility of the spin label at 366 is reduced when the residue above it, 517, is hydrophobic rather than hydrophilic. Our findings confirm that the positively charged residue 366R maintains the hydrophilic nature of the groove through which the substrate enters for membrane insertion.

In addition to these groove studies, EPR was used to study the solvent accessibility of specific greasy slide residues (L427, I432, and L434) for which MD simulations and in vivo NEM alkylation studies gave conflicting solvent exposure data. Power saturation EPR analysis of spin labeled cysteines at 427 and 434 showed that these residues were water accessible, whereas the spin label at 432 was lipid exposed. The extent of water accessibility was assessed by determining the membrane depth parameter for each mutant. All of these residues had Φ values close to zero, indicating that they were at the interface of lipid and water. These data reinforce the solvent exposure data determined by NEM modification of YidC residues 427 and 434, which were modified to various extents, indicating that the respective cysteines were solvent exposed; lipid-exposed cysteines are unreactive to NEM since the thiol group is protonated (193).

Lastly, we performed DEER to measure distances between two spin labels on YidC, one incorporated at residue 375 after TM2 and the other at residue 542 in the C-tail. The analysis determined that there are two major distances between these locations, one at 47 Å and the other at 20 Å. The two measured distances are consistent with two different conformations of the C-tail arising from the fluctuation dynamics of the protein. These promising results show that the pulsed EPR method can provide quantitative information that is complementary to FRET methods, with the added advantage that it can measure shorter distances. In the future, we plan on using DEER to measure distances between the YidC greasy slide region and the SecYEG lateral gate to help characterize the holo-insertase complex.

4.2 Future directions

The projects described above provide useful insights into YidC structure and the mechanism of insertion of multi-transmembrane substrates at the inner membrane of E. coli. However, many questions remain to be answered: (i) How does the substrate N-tail or the periplasmic loop enter the aqueous groove of YidC? (ii) What is the rate-limiting step in the catalysis of membrane insertion? Is it partitioning of TM segments into the greasy slide or the translocation of the hydrophilic N-tail/periplasmic loop across the outer leaflet of the membrane? (iii) What is the composition of a YidC/SecYEG complex in vivo, and what percentage of the respective proteins form the complex? (iv) How exactly do YidC and SecYEG cooperate in membrane insertion? (v) Do the YidC greasy slide region and the SecYEG lateral gate form a consolidated site for YidC/Sec substrates? Answers to these questions would provide a clear picture of events during membrane insertion at the inner membrane of E. coli.


List of References

1. Frey, S., Rees, R., Schunemann, J., Ng, S. C., Funfgeld, K., Huyton, T., and Gorlich, D. (2018) Cell 174, 202-+
2. De Magistris, P., and Antonin, W. (2018) Curr Biol 28, R487-R497
3. Nakamura, N., Wei, J. H., and Seemann, J. (2012) Curr Opin Cell Biol 24, 467-474
4. Guo, Y., Sirkis, D. W., and Schekman, R. (2014) Annual review of cell and developmental biology 30, 169-206
5. Ellenrieder, L., Rampelt, H., and Becker, T. (2017) Journal of molecular biology 429, 2148-2160
6. van Wilpe, S., Ryan, M. T., Hill, K., Maarse, A. C., Meisinger, C., Brix, J., Dekker, P. J., Moczko, M., Wagner, R., Meijer, M., Guiard, B., Honlinger, A., and Pfanner, N. (1999) Nature 401, 485-489
7. Bohnert, M., Pfanner, N., and van der Laan, M. (2007) FEBS letters 581, 2802-2810
8. Chacinska, A., Pfannschmidt, S., Wiedemann, N., Kozjak, V., Sanjuan Szklarz, L. K., Schulze-Specking, A., Truscott, K. N., Guiard, B., Meisinger, C., and Pfanner, N. (2004) The EMBO journal 23, 3735-3746
9. Ott, M., and Herrmann, J. M. (2010) Biochimica et biophysica acta 1803, 767-775
10. Lee, D. W., Yoo, Y. J., Razzak, M. A., and Hwang, I. (2018) Plant physiology 176, 663-677
11. Li, H. M., and Chiu, C. C. (2010) Annual review of plant biology 61, 157-180
12. Qbadou, S., Becker, T., Mirus, O., Tews, I., Soll, J., and Schleiff, E. (2006) The EMBO journal 25, 1836-1847
13. Flores-Perez, U., and Jarvis, P. (2013) Biochimica et biophysica acta 1833, 332-340
14. Chang, W. L., Soll, J., and Bolter, B. (2012) Biological chemistry 393, 1263-1277
15. Paila, Y. D., Richardson, L. G. L., and Schnell, D. J. (2015) Journal of molecular biology 427, 1038-1060
16. Motley, A. M., and Hettema, E. H. (2007) The Journal of cell biology 178, 399-410
17. Schueller, N., Holton, S. J., Fodor, K., Milewski, M., Konarev, P., Stanley, W. A., Wolf, J., Erdmann, R., Schliebs, W., Song, Y. H., and Wilmanns, M. (2010) The EMBO journal 29, 2491-2500
18. Chen, Y., Pieuchot, L., Loh, R. A., Yang, J., Kari, T. M., Wong, J. Y., and Jedd, G. (2014) Nature communications 5, 5790
19. Cross, L. L., Ebeed, H. T., and Baker, A. (2016) Biochimica et biophysica acta 1863, 850-862
20. Einwachter, H., Sowinski, S., Kunau, W. H., and Schliebs, W. (2001) EMBO reports 2, 1035-1039
21. Nielsen, H., Engelbrecht, J., Brunak, S., and vonHeijne, G. (1997) Protein Eng 10, 1-6
22. Paetzel, M., Dalbey, R. E., and Strynadka, N. C. J. (1998) Nature 396, 707-707

23. Blobel, G. (1980) Eur J Cell Biol 22, 153-153
24. Koch, H. G., Hengelage, T., Neumann-Haefelin, C., MacFarlane, J., Hoffschulte, H. K., Schimz, K. L., Mechler, B., and Muller, M. (1999) Molecular biology of the cell 10, 2163-2173
25. Beck, K., Wu, L. F., Brunner, J., and Muller, M. (2000) Embo Journal 19, 134-143
26. Angelini, S., Deitermann, S., and Koch, H. G. (2005) EMBO reports 6, 476-481
27. High, S., and Dobberstein, B. (1991) Journal of Cell Biology 113, 229-233
28. Oh, E., Becker, A. H., Sandikci, A., Huber, D., Chaba, R., Gloge, F., Nichols, R. J., Typas, A., Gross, C. A., Kramer, G., Weissman, J. S., and Bukau, B. (2011) Cell 147, 1295-1308
29. Castanie-Cornet, M. P., Bruel, N., and Genevaux, P. (2014) Bba-Mol Cell Res 1843, 1442-1456
30. Khisty, V. J., Munske, G. R., and Randall, L. L. (1995) Journal of Biological Chemistry 270, 25920-25927
31. Muller, J. P. (1996) Journal of bacteriology 178, 6097-6104
32. Ullers, R. S., Ang, D., Schwager, F., Georgopoulos, C., and Genevaux, P. (2007) Proceedings of the National Academy of Sciences of the United States of America 104, 3101-3106
33. Sakr, S., Cirinesi, A. M., Ullers, R. S., Schwager, F., Georgopoulos, C., and Genevaux, P. (2010) Journal of Biological Chemistry 285, 23504-23512
34. Saibil, H. R., Fenton, W. A., Clare, D. K., and Horwich, A. L. (2013) Journal of molecular biology 425, 1476-1487
35. Watanabe, T., Hayashi, S., and Wu, H. C. (1988) Journal of bacteriology 170, 4001-4007
36. Hoppel, C., Kerner, J., and Distler, A. M. (2005) Febs Journal 272, 261-261
37. He, S. C., and Fox, T. D. (1997) Molecular biology of the cell 8, 1449-1460
38. Maillard, A. P., Lalani, S., Silva, F., Belin, D., and Duong, F. (2007) Journal of Biological Chemistry 282, 1281-1287
39. Saparov, S. M., Erlandson, K., Cannon, K., Schaletzky, J., Schulman, S., Rapoport, T. A., and Pohl, P. (2007) Molecular cell 26, 501-509
40. Harris, C. R., and Silhavy, T. J. (1999) Journal of bacteriology 181, 3438-3444
41. Van den Berg, B., Clemons, W. M., Jr., Collinson, I., Modis, Y., Hartmann, E., Harrison, S. C., and Rapoport, T. A. (2004) Nature 427, 36-44
42. Jomaa, A., Boehringer, D., Leibundgut, M., and Ban, N. (2016) Nature communications 7, 10471
43. Schatz, P. J., Bieker, K. L., Ottemann, K. M., Silhavy, T. J., and Beckwith, J. (1991) Embo Journal 10, 1749-1757
44. Nishiyama, K., Mizushima, S., and Tokuda, H. (1992) Journal of Biological Chemistry 267, 7170-7176
45. Schroder, P. A., and Moore, M. J. (2005) Rna-a Publication of the Rna Society 11, 1521-1529
46. van der Sluis, E. O., Nouwen, N., and Driessen, A. J. M. (2002) FEBS letters 527, 159-165
47. Satoh, Y., Mori, H., and Ito, K. (2003) 42, 7442-7447

48. Zheng, Z., Blum, A., Banerjee, T., Wang, Q., Dantis, V., and Oliver, D. (2016) The Journal of biological chemistry 291, 5997-6010
49. Beckmann, R., Bubeck, D., Grassucci, R., Penczek, P., Verschoor, A., Blobel, G., and Frank, J. (1997) Science 278, 2123-2126
50. Becker, T., Bhushan, S., Jarasch, A., Armache, J. P., Funes, S., Jossinet, F., Gumbart, J., Mielke, T., Berninghausen, O., Schulten, K., Westhof, E., Gilmore, R., Mandon, E. C., and Beckmann, R. (2009) Science 326, 1369-1373
51. Frauenfeld, J., Gumbart, J., van der Sluis, E. O., Funes, S., Gartmann, M., Beatrix, B., Mielke, T., Berninghausen, O., Becker, T., Schulten, K., and Beckmann, R. (2011) Nature structural & molecular biology 18, 614-U127
52. Cheng, Z. L., Jiang, Y., Mandon, E. C., and Gilmore, R. (2005) Journal of Cell Biology 168, 67-77
53. Menetret, J. F., Schaletzky, J., Clemons, W. M., Osborne, A. R., Skanland, S. S., Denison, C., Gygi, S. P., Kirkpatrick, D. S., Park, E., Ludtke, S. J., Rapoport, T. A., and Akey, C. W. (2007) Molecular cell 28, 1083-1092
54. Kusters, I., van den Bogaart, G., Kedrov, A., Krasnikov, V., Fulyani, F., Poolman, B., and Driessen, A. J. (2011) Structure 19, 430-439
55. Egea, P. F., and Stroud, R. M. (2010) Proceedings of the National Academy of Sciences of the United States of America 107, 17182-17187
56. Kusters, I., and Driessen, A. J. M. (2011) Cellular and Molecular Life Sciences 68, 2053-2066
57. Zimmer, J., and Rapoport, T. A. (2009) Journal of molecular biology 394, 606-612
58. Papanikolau, Y., Papadovasilaki, M., Ravelli, R. B. G., McCarthy, A. A., Cusack, S., Economou, A., and Petratos, K. (2007) Journal of molecular biology 366, 1545-1557
59. Mori, H., and Ito, K. (2006) Journal of Biological Chemistry 281, 36249-36256
60. Das, S., and Oliver, D. B. (2011) Journal of Biological Chemistry 286
61. Banerjee, T., Zheng, Z., Abolafia, J., Harper, S., and Oliver, D. (2017) The Journal of biological chemistry 292, 19693-19707
62. Chou, Y. T., and Gierasch, L. M. (2005) Journal of Biological Chemistry 280, 32753-32760
63. Zimmer, J., Nam, Y. S., and Rapoport, T. A. (2008) Nature 455, 936-U932
64. Karamanou, S., Gouridis, G., Papanikou, E., Sianidis, G., Gelis, I., Keramisanou, D., Vrontou, E., Kalodimos, C. G., and Economou, A. (2007) Embo Journal 26, 2904-2914
65. Lemaire, C., Guibet-Grandmougin, F., Angles, D., Dujardin, G., and Bonnefoy, N. (2004) Journal of Biological Chemistry 279, 47464-47472
66. Chen, M. Y., Xie, K., Jiang, F. L., Yi, L., and Dalbey, R. E. (2002) Biological chemistry 383, 1565-1572
67. van der Laan, M., Nouwen, N. P., and Driessen, A. J. M. (2005) Curr Opin Microbiol 8, 182-187
68. van der Laan, M., Urbanus, M. L., ten Hagen-Jongman, C. M., Nouwen, N., Oudega, B., Harms, N., Driessen, A. J. M., and Luirink, J. (2003) Proceedings of the National Academy of Sciences of the United States of America 100, 5801-5806

69. Nagamori, S., Smirnova, I. N., and Kaback, H. R. (2004) Journal of Cell Biology 165, 53-62
70. Wagner, S., Pop, O., Haan, G. J., Baars, L., Koningstein, G., Klepsch, M. M., Genevaux, P., Luirink, J., and de Gier, J. W. (2008) Journal of Biological Chemistry 283, 17881-17890
71. Dalbey, R. E., and Kuhn, A. (2004) The Journal of cell biology 166, 769-774
72. Jiang, F. L., Yi, L., Moore, M., Chen, M. Y., Rohl, T., van Wijk, K. J., de Gier, J. W. L., Henry, R., and Dalbey, R. E. (2002) Journal of Biological Chemistry 277, 19281-19288
73. Bonnefoy, N., Kermorgant, M., Groudinsky, O., Minet, M., Slonimski, P. P., and Dujardin, G. (1994) Proceedings of the National Academy of Sciences of the United States of America 91, 11978-11982
74. Bohnert, M., Rehling, P., Guiard, B., Herrmann, J. M., Pfanner, N., and van der Laan, M. (2010) Curr Biol 20, 1227-1232
75. Sundberg, E., Slagter, J. G., Fridborg, I., Cleary, S. P., Robinson, C., and Coupland, G. (1997) Plant Cell 9, 717-730
76. Bellafiore, S., Ferris, P., Naver, H., Gohre, V., and Rochiax, J. D. (2002) Plant Cell 14, 2303-2314
77. Benz, M., Bals, T., Gugel, I. L., Piotrowski, M., Kuhn, A., Schunemann, D., Soll, J., and Ankele, E. (2009) Mol Plant 2, 1410-1424
78. Saaf, A., Monne, M., de Gier, J. W., and von Heijne, G. (1998) Journal of Biological Chemistry 273, 30415-30418
79. Samuelson, J. C., Chen, M. Y., Jiang, F. L., Moller, I., Wiedmann, M., Kuhn, A., Phillips, G. J., and Dalbey, R. E. (2000) Nature 406, 637-641
80. Chen, M. Y., Samuelson, J. C., Jiang, F. L., Muller, M., Kuhn, A., and Dalbey, R. E. (2002) Journal of Biological Chemistry 277, 7670-7675
81. Samuelson, J. C., Jiang, F. L., Yi, L., Chen, M. Y., de Gier, J. W., Kuhn, A., and Dalbey, R. E. (2001) Journal of Biological Chemistry 276, 34847-34852
82. van Bloois, E., Haan, G. J., de Gier, J. W., Oudega, B., and Luirink, J. (2006) Journal of Biological Chemistry 281, 10002-10009
83. Urbanus, M. L., Scotti, P. A., Froderberg, L., Saaf, A., de Gier, J. W. L., Brunner, J., Samuelson, J. C., Dalbey, R. E., Oudega, B., and Luirink, J. (2001) EMBO reports 2, 524-529
84. Sachelaru, I., Petriman, N. A., Kudva, R., Kuhn, P., Welte, T., Knapp, B., Drepper, F., Warscheid, B., and Koch, H. G. (2013) Journal of Biological Chemistry 288, 16295-16307
85. Sachelaru, I., Winter, L., Knyazev, D. G., Zimmermann, M., Vogt, A., Kuttner, R., Ollinger, N., Siligan, C., Pohl, P., and Koch, H. G. (2017) Scientific reports 7
86. Nie, Y., Chaillet, M., Becke, C., Haffke, M., Pelosse, M., Fitzgerald, D., Collinson, I., Schaffitzel, C., and Berger, I. (2016) Advanced Technologies for Protein Complex Production and Characterization 896, 27-42
87. Schulze, R. J., Komar, J., Botte, M., Allen, W. J., Whitehouse, S., Gold, V. A. M., Nijeholtb, J. A. L. A., Huard, K., Berger, I., Schaffitzel, C., and Collinson, I. (2014) Proceedings of the National Academy of Sciences of the United States of America 111, 4844-4849
88. Zhu, L., Kaback, H. R., and Dalbey, R. E. (2013) Journal of Biological Chemistry 288, 28180-28194
89. Kohler, R., Boehringer, D., Greber, B., Bingel-Erienmeyer, R., Collinson, I., Schaffitzel, C., and Ban, N. (2009) Molecular cell 34, 344-353
90. Seitl, I., Wickles, S., Beckmann, R., Kuhn, A., and Kiefer, D. (2014) Molecular microbiology 91, 408-421
91. Spann, D., Pross, E., Chen, Y. Y., Dalbey, R. E., and Kuhn, A. (2018) Scientific reports 8
92. Ravaud, S., Stjepanovic, G., Wild, K., and Sinning, I. (2008) Journal of Biological Chemistry 283, 9350-9358
93. Kumazaki, K., Kishimoto, T., Furukawa, A., Mori, H., Tanaka, Y., Dohmae, N., Ishitani, R., Tsukazaki, T., and Nureki, O. (2014) Scientific reports 4, 7299
94. Kumazaki, K., Chiba, S., Takemoto, M., Furukawa, A., Nishiyama, K., Sugano, Y., Mori, T., Dohmae, N., Hirata, K., Nakada-Nakura, Y., Maturana, A. D., Tanaka, Y., Mori, H., Sugita, Y., Arisaka, F., Ito, K., Ishitani, R., Tsukazaki, T., and Nureki, O. (2014) Nature 509, 516-520
95. Chen, Y., Soman, R., Shanmugam, S. K., Kuhn, A., and Dalbey, R. E. (2014) The Journal of biological chemistry 289, 35656-35667
96. Sachelaru, I., Winter, L., Knyazev, D. G., Zimmermann, M., Vogt, A., Kuttner, R., Ollinger, N., Siligan, C., Pohl, P., and Koch, H. G. (2017) Scientific reports 7, 101
97. Klenner, C., and Kuhn, A. (2012) The Journal of biological chemistry 287, 3769-3776
98. Facey, S. J., and Kuhn, A. (2004) Bba-Mol Cell Res 1694, 55-66
99. Gray, A. N., Henderson-Frost, J. M., Boyd, D., Sharafi, S., Niki, H., and Goldberg, M. B. (2012) Mbio 3
100. Zhu, L., Wasey, A., White, S. H., and Dalbey, R. E. (2013) Journal of Biological Chemistry 288, 7704-7716
101. Soman, R., Yuan, J. J., Kuhn, A., and Dalbey, R. E. (2014) Journal of Biological Chemistry 289, 1023-1032
102. Luirink, J., Samuelsson, T., and de Gier, J. W. (2001) FEBS letters 501, 1-5
103. Zhang, Y. J., Tian, H. F., and Wen, J. F. (2009) Bmc Evol Biol 9
104. Makarova, K. S., Galperin, M. Y., and Koonin, E. V. (2015) Biochimie 118, 302-312
105. Borowska, M. T., Dominik, P. K., Anghel, S. A., Kossiakoff, A. A., and Keenan, R. J. (2015) Structure 23, 1715-1724
106. Guna, A., Volkmar, N., Christianson, J. C., and Hegde, R. S. (2017) Molecular biology of the cell 28
107. Anghel, S. A., McGilvray, P. T., Hegde, R. S., and Keenan, R. J. (2017) Cell reports 21, 3708-3716
108. Berks, B. C. (1996) Molecular microbiology 22, 393-404
109. Yahr, T. L., and Wickner, W. T. (2001) Embo Journal 20, 2472-2479


110. Tottey, S., Waldron, K. J., Firbank, S. J., Reale, B., Bessant, C., Sato, K., Cheek, T. R., Gray, J., Banfield, M. J., Dennison, C., and Robinson, N. J. (2008) Nature 455, 1138-U1117
111. Sargent, F., Berks, B. C., and Palmer, T. (2002) Archives of microbiology 178, 77-84
112. Berks, B. C., Sargent, F., and Palmer, T. (2000) Molecular microbiology 35, 260-274
113. Richter, S., and Bruser, T. (2005) Journal of Biological Chemistry 280, 42723-42730
114. Sutherland, G. A., Grayson, K. J., Adams, N. B. P., Mermans, D. M. J., Jones, A. S., Robertson, A. J., Auman, D. B., Brindley, A. A., Sterpone, F., Tuffery, P., Derreumaux, P., Dutton, P. L., Robinson, C., Hitchcock, A., and Hunter, C. N. (2018) Journal of Biological Chemistry 293, 6672-6681
115. Walker, K. L., Jones, A. S., and Robinson, C. (2015) Pharm Bioprocess 3, 387-396
116. Ize, B., Gerard, F., Zhang, M., Chanal, A., Voulhoux, R., Palmer, T., Filloux, A., and Wu, L. F. (2002) Journal of molecular biology 317, 327-335
117. Weiner, J. H., Bilous, P. T., Shaw, G. M., Lubitz, S. P., Frost, L., Thomas, G. H., Cole, J. A., and Turner, R. J. (1998) Cell 93, 93-101
118. Sargent, F., Bogsch, E. G., Stanley, N. R., Wexler, M., Robinson, C., Berks, B. C., and Palmer, T. (1998) Embo Journal 17, 3640-3650
119. Oates, J., Barrett, C. M., Barnett, J. P., Byrne, K. G., Bolhuis, A., and Robinson, C. (2005) Journal of molecular biology 346, 295-305
120. Sargent, F., Stanley, N. R., Berks, B. C., and Palmer, T. (1999) Journal of Biological Chemistry 274, 36073-36082
121. Mori, H., and Cline, K. (2002) Journal of Cell Biology 157, 205-210
122. Ramasamy, S., Abrol, R., Suloway, C. J. M., and Clemons, W. M. (2013) Structure 21, 777-788
123. Cline, K., and Mori, H. (2001) Journal of Cell Biology 154, 719-729
124. Pradel, N., Santini, C. L., Ye, C. Y., Fevat, L., Gerard, F., Alami, M., and Wu, L. F. (2003) Biochemical and biophysical research communications 306, 786-791
125. Robinson, C., Matos, C. F. R. O., Beck, D., Ren, C., Lawrence, J., Vasisht, N., and Mendel, S. (2011) Bba-Biomembranes 1808, 876-884
126. Lausberg, F., Fleckenstein, S., Kreutzenbeck, P., Frobel, J., Rose, P., Muller, M., and Freudl, R. (2012) PloS one 7
127. Dalbey, R. E., Wang, P., and Kuhn, A. (2011) Annual review of biochemistry 80, 161-187
128. Kudva, R., Denks, K., Kuhn, P., Vogt, A., Muller, M., and Koch, H. G. (2013) Research in microbiology 164, 505-534
129. Kuhn, A., Koch, H. G., and Dalbey, R. E. (2017) EcoSal Plus 7
130. Pohlschroder, M., Hartmann, E., Hand, N. J., Dilks, K., and Haddad, A. (2005) Annual review of microbiology 59, 91-111
131. Dalbey, R. E., Kuhn, A., Zhu, L., and Kiefer, D. (2014) Biochimica et biophysica acta 1843, 1489-1496

126

132. Yen, M. R., Harley, K. T., Tseng, Y. H., and Saier, M. H., Jr. (2001) FEMS microbiology letters 204, 223-231
133. Guna, A., Volkmar, N., Christianson, J. C., and Hegde, R. S. (2018) Science 359, 470-473
134. Yu, Z., Koningstein, G., Pop, A., and Luirink, J. (2008) The Journal of biological chemistry 283, 34635-34642
135. Klenner, C., Yuan, J., Dalbey, R. E., and Kuhn, A. (2008) FEBS letters 582, 3967-3972
136. Hennon, S. W., and Dalbey, R. E. (2014) Biochemistry 53, 3278-3286
137. Samuelson, J. C., Chen, M., Jiang, F., Moller, I., Wiedmann, M., Kuhn, A., Phillips, G. J., and Dalbey, R. E. (2000) Nature 406, 637-641
138. Price, C. E., and Driessen, A. J. (2010) The Journal of biological chemistry 285, 3575-3581
139. Zhu, L., Wasey, A., White, S. H., and Dalbey, R. E. (2013) The Journal of biological chemistry 288, 7704-7716
140. Ernst, S., Schonbauer, A. K., Bar, G., Borsch, M., and Kuhn, A. (2011) Journal of molecular biology 412, 165-175
141. Soman, R., Yuan, J., Kuhn, A., and Dalbey, R. E. (2014) The Journal of biological chemistry 289, 1023-1032
142. Kuhn, A., Zhu, H. Y., and Dalbey, R. E. (1990) The EMBO journal 9, 2385-2389
143. Traxler, B., and Murphy, C. (1996) The Journal of biological chemistry 271, 12394-12400
144. Yang, Y. B., Yu, N., and Tai, P. C. (1997) The Journal of biological chemistry 272, 13660-13665
145. Kihara, A., Akiyama, Y., and Ito, K. (1995) Proceedings of the National Academy of Sciences of the United States of America 92, 4532-4536
146. Cao, G., Kuhn, A., and Dalbey, R. E. (1995) The EMBO journal 14, 866-875
147. Andersson, H., and von Heijne, G. (1994) FEBS letters 347, 169-172
148. Schuenemann, T. A., Delgado-Nixon, V. M., and Dalbey, R. E. (1999) The Journal of biological chemistry 274, 6855-6864
149. White, S. H., and Wimley, W. C. (1999) Annual review of biophysics and biomolecular structure 28, 319-365
150. Kuhn, A. (1988) European journal of biochemistry 177, 267-271
151. Andersson, H., and von Heijne, G. (1993) The EMBO journal 12, 683-691
152. Facey, S. J., and Kuhn, A. (2004) Biochimica et biophysica acta 1694, 55-66
153. Neugebauer, S. A., Baulig, A., Kuhn, A., and Facey, S. J. (2012) Journal of molecular biology 417, 375-386
154. Dalbey, R. E., and Kuhn, A. (2014) Nature structural & molecular biology 21, 435-436
155. Roos, T., Kiefer, D., Hugenschmidt, S., Economou, A., and Kuhn, A. (2001) The Journal of biological chemistry 276, 37909-37915
156. Sachelaru, I., Petriman, N. A., Kudva, R., Kuhn, P., Welte, T., Knapp, B., Drepper, F., Warscheid, B., and Koch, H. G. (2013) The Journal of biological chemistry 288, 16295-16307
157. Randall, L. L., and Hardy, S. J. (2002) Cellular and molecular life sciences : CMLS 59, 1617-1623
158. Zhu, L., Klenner, C., Kuhn, A., and Dalbey, R. E. (2012) Journal of molecular biology 424, 354-367
159. Engelman, D. M., Steitz, T. A., and Goldman, A. (1986) Annu Rev Biophys Bio 15, 321-353
160. Schuenemann, T. A., Delgado-Nixon, V. M., and Dalbey, R. E. (1999) Journal of Biological Chemistry 274, 6855-6864
161. Altenbach, C., Froncisz, W., Hyde, J. S., and Hubbell, W. L. (1989) Biophysical journal 56, 1183-1191
162. Altenbach, C., Flitsch, S. L., Khorana, H. G., and Hubbell, W. L. (1989) Biochemistry 28, 7806-7812
163. Altenbach, C., Marti, T., Khorana, H. G., and Hubbell, W. L. (1990) Science 248, 1088-1092
164. Cornish, V. W., Benson, D. R., Altenbach, C. A., Hideg, K., Hubbell, W. L., and Schultz, P. G. (1994) Proceedings of the National Academy of Sciences of the United States of America 91, 2910-2914
165. Berliner, L. J. (2010) Eur Biophys J Biophy 39, 579-588
166. Hustedt, E. J., and Beth, A. H. (1999) Annu Rev Bioph Biom 28, 129-153
167. Liu, K. J., Gast, P., Moussavi, M., Norby, S. W., Vahidi, N., Walczak, T., Wu, M., and Swartz, H. M. (1993) Proceedings of the National Academy of Sciences of the United States of America 90, 5438-5442
168. Khan, N., Hou, H. G., Swartz, H. M., and Kuppusamy, P. (2015) Electron Paramagnetic Resonance Investigations of Biological Systems by Using Spin Labels, Spin Probes, and Intrinsic Metal Ions, Pt B 564, 529-552
169. Klug, C. S., and Feix, J. B. (2008) Method Cell Biol 84, 617-658
170. Sahu, I. D., Craig, A. F., Dunagan, M. M., Troxel, K. R., Zhang, R. F., Meiberg, A. G., Harmon, C. N., McCarrick, R. M., Kroncke, B. M., Sanders, C. R., and Lorigan, G. A. (2015) Biochemistry 54, 6402-6412
171. Sahu, I. D., Zhang, R. F., Dunagan, M. M., Craig, A. F., and Lorigan, G. A. (2017) Journal of Physical Chemistry B 121, 5312-5321
172. Wagener, D. J. T., and Hossfeld, D. K. (2001) European Journal of Cancer 37, Ix-Ix
173. Perozo, E., Cortes, D. M., and Cuello, L. G. (1998) Biophysical journal 74, A44-A44
174. Hilger, D., Jung, H., Padan, E., Wegener, C., Vogel, K. P., Steinhoff, H. J., and Jeschke, G. (2005) Biophysical journal 89, 1328-1338
175. Borbat, P. P., Mchaourab, H., and Freed, J. H. (2002) Biophysical journal 82, 360a-360a
176. Mchaourab, H. S., Steed, P. R., and Kazmier, K. (2011) Structure 19, 1549-1561
177. Jeschke, G. (2012) Annu Rev Phys Chem 63, 419-446
178. Sahu, I. D., Craig, A. F., Dunagum, M. M., McCarrick, R. M., and Lorigan, G. A. (2017) Journal of Physical Chemistry B 121, 9185-9195
179. Sahu, I. D., and Lorigan, G. A. (2018) Biomed Res Int
180. Berliner, L. J., Grunwald, J., Hankovszky, H. O., and Hideg, K. (1982) Anal Biochem 119, 450-455
181. Bordignon, E. (2012) Top Curr Chem 321, 121-157
182. Chen, Y. Y., Capponi, S., Zhu, L., Gellenbeck, P., Freites, J. A., White, S. H., and Dalbey, R. E. (2017) Structure 25, 1403-+
183. Klenner, C., and Kuhn, A. (2012) Journal of Biological Chemistry 287, 3769-3776
184. Winterfeld, S., Ernst, S., Borsch, M., Gerken, U., and Kuhn, A. (2013) PloS one 8
185. Serek, J., Bauer-Manz, G., Struhalla, G., van den Berg, L., Kiefer, D., Dalbey, R., and Kuhn, A. (2004) Embo Journal 23, 294-301
186. Kusters, I., van den Bogaart, G., de Wit, J., Krasnikov, V., Poolman, B., and Driessen, A. (2010) Protein Secretion: Methods and Protocols 619, 131-143
187. Eletr, S., and Keith, A. D. (1972) Proceedings of the National Academy of Sciences of the United States of America 69, 1353-&
188. Warshaviak, D. T., Khramtsov, V. V., Cascio, D., Altenbach, C., and Hubbell, W. L. (2013) Journal of Magnetic Resonance 232, 53-61
189. Budil, D. E., Lee, S., Saxena, S., and Freed, J. H. (1996) J Magn Reson Ser A 120, 155-189
190. Coey, A. T., Sahu, I. D., Gunasekera, T. S., Troxel, K. R., Hawn, J. M., Swartz, M. S., Wickenheiser, M. R., Reid, R. J., Welch, R. C., Vanoye, C. G., Kang, C. B., Sanders, C. R., and Lorigan, G. A. (2011) Biochemistry 50, 10851-10859
191. Averill, D. F., Smith, D. L., and Legg, J. I. (1972) Inorg Chem 11, 2344-&
192. Hagen, W. R. (2013) PloS one 8
193. Chen, Y., Capponi, S., Zhu, L., Gellenbeck, P., Freites, J. A., White, S. H., and Dalbey, R. E. (2017) Structure 25, 1403-1414 e1403