
Characterization of Cytochrome c Synthetase CcsBA and Latex Clearing Protein


Inaugural Dissertation

for the attainment of the doctoral degree of the Faculty of Chemistry and Pharmacy of the Albert-Ludwigs-Universität Freiburg im Breisgau

submitted by

Lorena Ilcu from Botosani, Romania, 2018

Chair of the doctoral committee: Prof. Dr. Stefan Weber
Dean: Prof. Dr. Oliver Einsle
Referee: Prof. Dr. Oliver Einsle
Co-referee: Prof. Dr. Susana Andrade
Chair of the doctoral examination: Prof. Dr. Hans-Georg Koch
Date of the oral examination: 5 December 2018

This work was carried out at the Institute of Biochemistry of the Albert-Ludwigs-Universität Freiburg im Breisgau under the supervision of Prof. Dr. Oliver Einsle.

Summary

The c-type cytochromes are a wide group of metalloproteins, characterized by the covalent attachment of the heme group to the surrounding polypeptide. Their assembly requires assistance from dedicated systems that bring the two substrates together and catalyse the thioether bond formation between the vinyl groups of the heme group and two cysteine residues found in a signature motif, most commonly CXXCH. At least three maturation pathways have evolved [1]. System II or cytochrome c synthesis (Ccs), found in several bacterial species and in chloroplasts, comprises two proteins responsible for the reduction of the heme-binding motif of apocytochrome c, and the CcsBA complex, dedicated to cytochrome c production. CcsA displays a highly conserved region, referred to as the WWD motif, which together with two flanking histidine residues orients and positions the heme. The first part of the current study focused on screening different homologs of CcsBA (or just CcsA) by heterologous production, purification and crystallization in order to elucidate the three-dimensional structure via X-ray crystallography. The CcsBA/CcsA homologs were isolated with a b-type heme in the oxidized state (Fe3+), as indicated by UV/Vis spectral analysis. The CcsBA fusion protein from Bacteroides thetaiotaomicron in complex with n-dodecyl β-D-maltoside (DDM) was the only homolog successfully crystallized. However, crystal optimization by intensive screening of different conditions did not improve the quality of the diffraction experiments.

The second part of this thesis focuses on the structural characterization by X-ray crystallography of latex clearing protein (Lcp) from Streptomyces sp. strain K30, a b-type cytochrome involved in rubber biodegradation. The crystal structure of Lcp was solved by exploiting the anomalous signal of the heme iron. The three-dimensional model revealed a globular protein, with the core of the protein structurally related to the members of the globin family. Two different structural conformations were observed, depending on the nature of the sixth axial ligand of the heme iron. In the closed state, a lysine residue of the polypeptide chain unusually binds the iron atom at the distal site, suggesting a role in the enzymatic reaction, while in the open state the same position is occupied by an imidazole molecule that originated from the crystal growth condition. Additional electron paramagnetic resonance experiments on LcpK30 were conducted to investigate intermediate states of the heme group.


Summary (Zusammenfassung)

C-type cytochromes are a diverse group of metalloproteins. They are characterized by a covalent bond between the heme group and the surrounding polypeptide. Formation of this thioether bond requires dedicated enzymes that catalyse the reaction of the heme vinyl groups with two cysteine residues of a heme-binding motif, most commonly CXXCH. At least three different systems for c-type cytochrome maturation have evolved. System II or cytochrome c synthesis (Ccs), found in diverse bacteria and in chloroplasts, consists of two proteins responsible for the reduction of the heme-binding motif of the apocytochrome, and the CcsBA complex, which enables the attachment of the heme to the cytochrome and thus cytochrome c maturation. CcsA contains a highly conserved region, the so-called WWD motif, which together with two flanking histidines is responsible for orienting and positioning the heme group. The first part of this work deals with the investigation of different homologs of CcsBA (or only CcsA) by heterologous production in E. coli, purification and crystallization, with the aim of elucidating the three-dimensional structure of CcsBA by X-ray diffraction. An oxidized heme b (Fe3+) was found in the isolated CcsBA/CcsA homologs, as shown by UV/Vis spectroscopy. CcsBA from Bacteroides thetaiotaomicron was the only homolog that could be crystallized successfully. However, intensive screening of the crystallization conditions did not improve the quality of the diffraction experiments.

The second part of this work deals with the structural characterization by X-ray diffraction of the latex clearing protein (Lcp) from Streptomyces sp. strain K30, a b-type cytochrome involved in the biodegradation of rubber. The structure of the crystallized Lcp was solved using the anomalous signal of the heme iron. The three-dimensional model revealed a globular structure whose core is structurally related to that of the globins. Two different structural conformations were observed, depending on the nature of the sixth ligand of the heme iron. In the closed state, a lysine from the peptide backbone binds, unusually, to the iron atom at the distal site, whereas in the open state this position is occupied by an imidazole molecule originating from the crystallization condition. This points to a role of the lysine in the enzymatic reaction. In addition, electron paramagnetic resonance experiments were carried out with LcpK30 to investigate the intermediate states of the heme group.

Table of Contents

Characterization of the CcsBA Synthetase from System II of Cytochrome c Assembly

1. Introduction

1.1 Cytochromes c

1.2 Heme synthesis

1.3 Apocytochrome c translocation over various membranes

1.4 Cytochrome c maturation systems

1.5 System IV and V

1.6 System III

1.7 System I

1.7.1 Bacterial cytochrome c maturation system

1.7.2 Mitochondrial cytochrome c maturation system

1.8 System II

1.8.1 Bacterial cytochrome c synthesis system

1.8.2 Chloroplastidial cytochrome c synthesis system

1.9 Bacterial species with System II, selected for CcsBA or CcsA characterization

1.9.1 Aquifex aeolicus

1.9.2 Bacillus megaterium

1.9.3 Bacteroides thetaiotaomicron

1.9.4 Geobacter sulfurreducens

1.9.5 Helicobacter hepaticus

1.9.6 Hydrogenobacter thermophilus

1.9.7 Micrococcus luteus

1.9.8 Symbiobacterium thermophilum

1.10 Human pathogenicity

1.11 Evolution of cytochrome c biogenesis systems

1.12 X-Ray Crystallography

1.12.1 Protein Crystallization

1.12.2 X-Ray diffraction

1.12.3 Molecular replacement

1.12.4 Single-wavelength anomalous diffraction (SAD)

2. Scope of the study

3. Materials and methods

3.1 Materials

3.1.1 Cultivation media

3.1.2 Bacterial strains

3.1.3 Vectors

3.2 Molecular biology methods

3.2.1 Restriction enzyme cloning / Gibson assembly / Site directed mutagenesis

3.2.2 Polymerase chain reaction (PCR)

3.2.3 Transformation of competent cells

3.2.4 Agarose gel electrophoresis

3.2.5 Gel extraction of DNA / PCR product purification

3.2.6 Isolation of plasmid and sequencing

3.2.7 Knock-out of the ccm operon from E. coli genomic DNA

3.3 Microbiology methods

3.3.1 E. coli cultivation

3.3.2 Cell disruption, membrane isolation and solubilisation

3.4 Biochemical methods

3.4.1 Affinity purification

3.4.2 Detergent exchange

3.4.3 Size exclusion chromatography (SEC)

3.4.4 Analytical chromatography

3.4.5 SDS-PAGE

3.4.6 Western Blot

3.4.7 Heme staining

3.4.8 BCA assay

3.5 Spectroscopic methods

3.5.1 UV/Vis spectroscopy

3.6 Crystallization methods

3.6.1 Protein crystallization

3.6.2 Data collection

4. Results

4.1 Characterization of Helicobacter hepaticus CcsBA

4.2 Characterization of different bacterial homologs of CcsA

4.2.1 Sequence identity between the CcsA homologs

4.2.2 Characterization of Aquifex aeolicus CcsA

4.2.3 Isolation of different bacterial homologs of CcsA

4.3 Characterization of Bacteroides thetaiotaomicron CcsBA

4.3.1 Production and isolation of BtCcsBA

4.3.2 Analytical chromatography of BtCcsBA

4.3.3 Detergent or buffer exchange

4.3.4 UV/Vis absorption spectra of BtCcsBA

4.3.5 Crystallization of BtCcsBA

4.3.6 Cytochrome c production by the BtCcsBA synthetase

5. Discussions

5.1 Characterization of H. hepaticus CcsBA

5.2 Characterization of CcsA homologs

5.3 Characterization of B. thetaiotaomicron CcsBA

5.4 UV/Vis spectra of CcsBA/CcsA homologs

5.5 CcsA vs. CcmF

5.6 Cytochrome c assembly in System II

Structural Characterization of Latex Clearing Protein (Lcp) from Streptomyces sp. Strain K30

6. Introduction

6.1 Streptomyces sp.

6.2 Natural rubber, latex and synthetic rubber

6.3 Latex biodegradation

6.4 Rubber oxygenase A (RoxA)

6.5 Latex clearing protein (Lcp)

6.5.1 Lcp characteristics and comparison to RoxA

6.6 LcpK30 overproduction

7. Scope of the study

8. Materials and methods

8.1 Spectroscopic methods

8.1.1 EPR

8.2 Crystallization methods

8.2.1 Protein crystallization

8.2.2 Diffraction data collection

8.2.3 Data processing, phase determination and structure building

9. Results and discussions

9.1 LcpK30 crystallization

9.2 LcpK30 structure determination

9.3 LcpK30 structure

9.4 EPR analysis

10. Outlook

11. References

12. Appendix

12.1 Primers

12.2 Amino acid sequences

12.3 DNA sequences

12.4 Overview of CcsBA/CcsA homologs

12.5 Secondary structure predictions (made by Protter)

12.6 Abbreviations

12.7 Units

12.8 Prefixes

12.9 Nucleobases

12.10 Amino acids

Acknowledgements

Characterization of the CcsBA Synthetase from System II of Cytochrome c Assembly


1. Introduction

1.1 Cytochromes c

Cytochromes or ‘cellular pigments’, discovered in 1920 by David Keilin [2], are an abundant protein family containing heme as a prosthetic cofactor. The heme group consists of an aromatic porphyrin ring, called protoporphyrin IX, and a central, coordinated metal ion, most commonly iron. Heme proteins function as electron carriers, cycling between the reduced (ferrous) and oxidized (ferric) state of the iron atom. Depending on the presence or absence of different modifications in their chemical structure, heme groups are classified as a, b, c and d, among others [3]. C-type cytochromes comprise a widespread and diverse group within the cytochrome family, essential for energy production pathways. They form key components of aerobic and anaerobic respiration and are implicated in various cellular processes such as photosynthesis, apoptosis or detoxification. In bacteria, they are also involved in the nitrogen and sulphur cycles [4, 5]. Cytochromes c are distinguished from other heme proteins by the covalent linkages between the porphyrin vinyl side chains and the reduced thiol groups of cysteine residues, most commonly found in a CXXCH sequence motif within the apocytochrome c polypeptide. In all cytochrome c structures, the heme has the same orientation with respect to the binding site: the 2-vinyl group is bonded to the first cysteine, while the 4-vinyl group is always attached to the second cysteine of the motif (Figure 1.1) [6]. The histidine residue at the end of the conserved signature serves as an axial ligand to the heme iron and was shown to be essential for cytochrome c production [7]. Some prokaryotic organisms encode a multitude of c-type cytochromes in their genomes, many of which include more than one CXXCH motif in their polypeptide chain or feature variations of the heme-binding site such as CXXXCH, CXXXXCH, CX15CH or CXXCK. In the Euglenozoa group, cytochromes c have the heme attached via a single thioether bond, with the first cysteine residue of the sequence motif replaced by a different amino acid such as alanine, phenylalanine or serine [8, 9].
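The motif variants listed above lend themselves to a simple sequence scan. The following is a minimal sketch, assuming that "X" stands for any residue. The names `HEME_MOTIFS` and `find_heme_motifs` and the toy sequence are illustrative and not taken from this work. A pattern match only flags a candidate site; it does not show that heme is actually attached there.

```python
import re

# Heme-binding motif variants named in the text, written as regular
# expressions. "X" (any residue) becomes "." in regex syntax.
HEME_MOTIFS = {
    "CXXCH": r"C.{2}CH",
    "CXXXCH": r"C.{3}CH",
    "CXXXXCH": r"C.{4}CH",
    "CX15CH": r"C.{15}CH",
    "CXXCK": r"C.{2}CK",
}


def find_heme_motifs(sequence):
    """Return sorted (start, motif_name, matched_text) tuples.

    Positions are 1-based. A lookahead is used so that overlapping
    occurrences are reported as well.
    """
    sequence = sequence.upper()
    hits = []
    for name, pattern in HEME_MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)


# Hypothetical toy sequence (not a real protein). It contains the CASCH
# site mentioned for RoxA in Figure 1.1 and one CXXCK variant.
toy = "MKTAGCASCHGNDLLPCEKCKAA"
for start, name, text in find_heme_motifs(toy):
    print(start, name, text)
```

For real proteomes, long-spacer patterns such as CX15CH are likely to produce chance matches, so hits of this kind would need to be checked against experimental evidence.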


Figure 1.1: C-type heme with the heme-binding site CASCH of a cytochrome c, RoxA, from Xanthomonas sp. 35Y (PDB-ID: 4B2N).

In the Eukaryota domain, where cellular processes are tightly regulated and proteins are highly specialized, cytochromes c are not as versatile as in prokaryotes. In eukaryotic cells, c-type cytochromes are found in chloroplasts, where they are implicated in photosynthesis, and in mitochondria, where they are involved in respiration and in triggering cellular apoptosis [10]. C-type cytochromes are grouped into four classes, depending on their heme number, three-dimensional structure, function and redox potential [5]. Class I comprises small, soluble c-type proteins with a globular fold and one heme-binding motif found at the N-terminus. The sixth axial ligand is usually a methionine residue. Class II contains cytochromes c with a four-helix bundle fold and the heme signature located at the C-terminus. All the low redox potential multi-heme cytochromes c with the iron atom in a bis-histidine coordination are grouped into class III, while the last group (class IV) includes all c-type proteins with additional non-heme cofactors [11].

1.2 Heme synthesis

Heme, a member of the porphyrin family, is essential for all cells with few exceptions. In bacteria, heme is usually synthesized inside the cell, in the cytoplasm, but it can also be acquired from the environment through specific receptors located in the outer membrane of Gram-negative bacteria. From the periplasm, the heme is transported across the inner membrane into the cytoplasm by ATP-binding-cassette permeases [12, 13]. In fungi and animal cells, heme biosynthesis starts and finishes in mitochondria, but some intermediate steps take place in the cytoplasm [14, 15]. In parasitic eukaryotes, like some trypanosomatids and kinetoplastids, the genes for heme synthesis are missing from the genomic DNA and the heme is acquired from their hosts [16-18]. In some anaerobic protists, which are also completely deficient in heme biosynthesis, energy is generated in other ways than oxidative phosphorylation [19].

Figure 1.2: Heme biosynthesis [20]. The universal heme precursor ALA can be synthesized in one of two ways: A. Shemin pathway or B. C5 pathway. C. The succession of reactions from ALA, catalysed by different enzymes (red colour, with the corresponding genes written in brackets).


In all living organisms, the enzymatic process that produces heme is initiated with the synthesis of 5-aminolevulinic acid (ALA). This reaction depends on the central metabolism of each organism. For all photosynthetic eukaryotes and prokaryotes, except α-Proteobacteria, ALA formation requires glutamyl-tRNA, a route known as the C5 pathway (Figure 1.2, B). For all the other species, ALA is synthesized by the condensation of succinyl coenzyme A and glycine, referred to as the Shemin pathway (Figure 1.2, A). The three-step construction of the first cyclic tetrapyrrole, uroporphyrinogen III (UROGEN), starts with the condensation of two ALA molecules to form the pyrrole porphobilinogen (PBG). Four PBG molecules are then joined into a linear tetrapyrrole, followed by ring closure. The conversion of UROGEN into heme is executed in four consecutive enzymatic steps with the following intermediates: coproporphyrinogen III (COPROGEN), protoporphyrinogen IX (PROTOGEN) and, after aromatization of the ring system, protoporphyrin IX (PROTO). The final step consists of the insertion of ferrous iron into protoporphyrin IX, accomplished by the enzyme ferrochelatase (Figure 1.2, C) [21, 22]. In bacteria, the monomeric form of ferrochelatase is membrane-associated and located on the cytoplasmic side of the plasma membrane [23]. In some Gram-positive bacteria it is found as a soluble protein in the cytoplasm [24]. In animals and yeast, ferrochelatase is a membrane-associated homodimer located on the mitochondrial inner membrane, with its active site exposed to the matrix side [25]. In plants, all the proteins involved in heme synthesis are localized in chloroplasts, but the enzymes from the last three steps (Figure 1.2) have isoforms which are dually targeted to plastids and the mitochondrial matrix [26]. However, their activity was associated with the chloroplasts and it is still unknown whether heme can be transported from the place of its synthesis to mitochondria [27, 28].
All land plants analysed so far have two ferrochelatase isoforms in their chloroplasts, FC1 and FC2, both localized in the thylakoid as well as the plastid envelope membranes [29]. Even though the proteins share 83% similarity and 69% sequence identity, it was reported that they play different roles [30]. In the green alga Chlamydomonas reinhardtii, just one form of ferrochelatase was found, associated with the membranes of chloroplasts (possibly thylakoid membranes, judging by the signal sequence), from where heme is transported to the cytosol and mitochondria [31]. In all photosynthetic organisms, the biosynthetic route divides at protoporphyrin IX into four pathways, dedicated to the production of chlorophyll, heme, siroheme and phytochromobilin, respectively [32].
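The precursor stoichiometry described in this section (two ALA per PBG, four PBG per cyclic tetrapyrrole) can be followed through to the amount of precursor needed per heme. The sketch below is a simple bookkeeping aid, not part of the original work. It assumes that every step after UROGEN is a 1:1 conversion without losses, and that each ALA consumes one glycine and one succinyl-CoA (Shemin pathway) or one glutamyl-tRNA (C5 pathway). The function name is illustrative.

```python
# Stoichiometry taken from the text above.
ALA_PER_PBG = 2          # two ALA condense into one porphobilinogen
PBG_PER_TETRAPYRROLE = 4  # four PBG form one uroporphyrinogen III


def precursors_per_heme(n_heme=1):
    """Count precursor molecules needed for n_heme heme molecules.

    Assumes loss-free 1:1 conversion from UROGEN to heme and one
    precursor molecule of each kind per ALA (labelled assumptions).
    """
    pbg = PBG_PER_TETRAPYRROLE * n_heme
    ala = ALA_PER_PBG * pbg
    return {
        "PBG": pbg,
        "ALA": ala,
        "glycine (Shemin)": ala,
        "succinyl-CoA (Shemin)": ala,
        "glutamyl-tRNA (C5)": ala,
    }


for name, count in precursors_per_heme(1).items():
    print(f"{name}: {count}")
```

Under these assumptions, one heme requires four PBG and eight ALA molecules, regardless of which of the two pathways supplies the ALA.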

1.3 Apocytochrome c translocation over various membranes

C-type cytochromes are generated in a post-translational multistep process in confined compartments of the cell, where the translocation of the prosthetic group heme b and of the apocytochrome c over various lipid bilayers occurs separately through independent mechanisms [33]. In bacteria, the cytochrome c precursors are kept unfolded by cytosolic chaperones such as the motor protein SecA until they are secreted across the inner membrane via the SecYEG translocon during cycles of ATP hydrolysis [34]. Once in the periplasm, after cleavage of the signal peptide by leader peptidase I, the oxidizing protein complex DsbAB (in Gram-negative organisms) or BdbCD (in Gram-positive organisms) forms a disulfide bridge between the conserved cysteine residues [35, 36]. Prior to heme attachment, the disulfide bonds are broken by a reduction mechanism composed of the transmembrane electron transporter CcdA (or its homologue DsbD) and other thiol-reductase proteins, which differ for each cytochrome c assembly system [37]. It is suggested that the effort of first creating and then breaking intramolecular disulfide bonds by specialized machineries protects the apocytochromes from proteolytic degradation or aggregation, and prevents them from cross-linking to unspecific thiol groups found in other environmental components [33].

In mitochondria, the assembly of cytochromes c takes place in the intermembrane space (IMS). The apocytochrome c1 is synthesised in the cytosol with a cleavable bipartite signal sequence at the N-terminus. The first part of this presequence is proteolytically removed after potential-independent translocation into the mitochondrial matrix via the TOM and TIM complexes. Subsequently, the second part of the presequence, a hydrophobic domain, is inserted into the inner membrane. The C-terminal domain of the precursor, exposed to the IMS, has a transmembrane domain which serves as an anchor in the inner membrane. Finally, after the covalent attachment of the heme group, a process catalysed by cytochrome c1 heme lyase, the IMS peptidase cleaves the targeting signal [38, 39]. The apocytochrome c, on the other hand, does not possess any signal sequence; its transport is mediated by Tom40, Tom22 and cytochrome c heme lyase (CCHL) acting as a receptor [40, 41]. Mitochondria depleted of CCHL activity are unable to import the apocytochrome c from the cytoplasm [42].

In chloroplasts, depending on gene localization (plastid or nuclear), apocytochromes are secreted with one or two targeting signals for the thylakoid lumen. Synthesized on the n-side of the thylakoid membrane, apocytochrome f has one signal sequence that targets the thylakoid lumen in a Sec-dependent pathway [43]. The cytosolically translated cytochrome c6 polypeptide contains a bipartite signal for import into thylakoids, crossing the envelope via the TIC-TOC complex and the thylakoid membrane via the Sec translocon [44]. If the precursors do not successfully attain their tertiary structure as a holocytochrome c, they are rapidly degraded.

1.4 Cytochrome c maturation systems

At first sight, the construction of cytochromes c requires a deceptively simple post-translational modification: the covalent ligation between the heme and the surrounding protein. However, it is not a particularly easy task to bring two specific substrates together, both in the appropriate conformation and chemical state, so that a correct stereospecific attachment can be achieved [1]. It is notable that the mechanism of bond catalysis must be very strict, because the stereochemistry and orientation of the heme group relative to the cytochrome c polypeptide is universally conserved [45]. It is not clear whether other special conditions are imposed in vivo for the chemical reaction forming the thioether bonds, but under specific circumstances in vitro, without any catalysis, the mere proximity between the apocytochrome c and ferrous heme results in spontaneous covalent bond formation followed by protein folding. However, the production of cytochromes c has to occur much faster in the cell than what has been observed in vitro [46, 47]. At the same time, the assembly machineries are specific for Fe-protoporphyrins, as other metal porphyrins could not be attached to the binding site of the apocytochrome [48]. Several remarkably diverse systems for assembling cytochromes c have been identified in different cell types and organelles. Three major biogenesis pathways have been described so far, with the addition of two minor systems, designated System IV and V [49].

1.5 System IV and V

Usually, the post-translational process of covalent attachment of the heme cofactor to the apocytochrome c takes place on the positive side of biological membranes. One system, designated System IV or ‘cofactor assembly on complex C subunit B’ (CCB), is found on the negative side of the chloroplast membrane in all organisms with oxygenic photosynthesis. Four genes in the unicellular green alga Chlamydomonas reinhardtii have been identified as part of the system. It is an additional pathway in the chloroplasts, complementary to System II, and it catalyses the formation of a single thioether bond of a heme c (heme ci) to a cysteine residue in subunit b6 of the cytochrome b6f complex. Heme ci has a high-spin configuration, with a pentacoordinated iron atom that does not have an axial ligand from the surrounding protein, but is instead coordinated by a water molecule [50-52]. Another system that forms a single thioether bond is System V, found in mitochondria of Euglena, Trypanosoma and Leishmania species. Their cytochromes c and c1 have a single cysteine residue in their heme-binding motifs (XXXCH). However, no genes belonging to this unique cytochrome c apparatus have been discovered so far [53].

1.6 System III

The simplest form of the maturation systems is System III, designated cytochrome c heme lyase (CCHL) or, more recently, holocytochrome c synthase (HCCS). It consists of a single membrane-associated protein, located in the intermembrane space of the mitochondria of a wide range of eukaryotes: fungi, vertebrate and invertebrate animals, protists and some algae [54]. The origin of this protein cannot be traced in prokaryotes [54]. Through mutagenesis studies and subsequent activity measurements, it was proposed that CCHL has two heme binding sites and a potentially analogous WWD motif involved in heme orientation, represented by a highly conserved sequence designated as domain II [55]. One of the many curiosities in this system is how the thiol groups of the cysteine residues on the apoprotein are maintained in a reduced state. The heme is synthesized in a reduced state (Fe2+), and because the environment in the IMS might promote reductive characteristics, it is not clear whether a reduction mechanism like in the other systems is needed. However, an inner membrane-bound flavoprotein called Cyc2 was described in fungi, which is involved in the reduction of the apocytochrome c and of the heme. Reduced levels of cytochrome c production were observed in yeast variants deficient in the Cyc2 protein. No conservation of the corresponding gene could be traced in the chromosomes of higher eukaryotes [56]. Heme transport over the mitochondrial membrane is not yet understood. One possibility is that the cofactor, as a hydrophobic molecule, diffuses through the lipid bilayer and is intercepted on the other side by CCHL; another is that a flippase or a specific transporter translocates it through the membrane [57].

Figure 1.3: System III of cytochrome c heme lyase (CCHL). The only protein of the system, CCHL, found in the intermembrane space of mitochondria, catalyses the insertion of the heme b into the apocytochrome c, which is translocated from the cytoplasm via TOM40/TOM22. Ferrochelatase, the last enzyme in the heme biosynthesis pathway, found in the mitochondrial matrix, releases the heme b into the lipid bilayer, where it is intercepted by CCHL and used in cytochrome c assembly. The 3-D structure of human cytochrome c (PDB-ID: 3ZCF) is shown in cartoon representation, coloured in brown, while the heme (red colour) is represented by sticks.

Two kinds of cytochromes c are produced in mitochondria: cytochrome c, which shuttles electrons from cytochrome c reductase (complex III) to cytochrome c oxidase (complex IV) of the respiratory chain, and cytochrome c1, integrated in complex III, where it transfers electrons from ubiquinol to cytochrome c. In fungi, a specific heme lyase is assigned to each type of cytochrome c, CCHL or CC1HL, and the two are very similar in some respects. In mammals, both kinds of cytochromes c are assembled by a single maturase, CCHL [58, 59]. However, in the absence of the CC1HL gene in fungi, cytochrome c1 was produced by the other heme lyase [60].

A CXXCH motif is not sufficient for recognition by the cytochrome c heme lyase [61], and indeed a consensus sequence, K/AGXXL/IFXXXCXXCH, was identified at the N-terminus of the mitochondrial apocytochrome c [62]. It was further determined that the phenylalanine residue (6th residue from the apocytochrome binding site) is one of the crucial amino acids for recognition by CCHL [63]. Even though the wild-type CCHL from Saccharomyces cerevisiae is not able to mature bacterial cytochromes c, the substitution with a leucine and the insertion of a lysine before and after the phenylalanine residue in the heme recognition site resulted in the heme lyase producing these engineered cytochromes c [64]. Variants of apocytochromes c with one of the cysteines substituted were trapped in a complex with the cytochrome c synthetase. This result suggested that there is no specific order for thioether bond formation and that the only condition is that both covalent attachments must be formed before the cytochrome c can be released from the system [65].

Four steps of cytochrome c biogenesis by the heme lyase were proposed, starting with the non-covalent axial ligation of the heme group to CCHL through a histidine residue. The second step consists of the interaction between the apocytochrome and the enzyme, resulting in the formation of a second axial bond between the iron atom and the histidine residue of the heme-binding motif. In the third step, the thioether bonds are formed, which triggers a distortion of the heme and diminishes the interaction between the prosthetic group and CCHL. This leads to the release of the holocytochrome c, which represents the last step of the mechanism. The displacement of the cytochrome c is likely to be mediated either by the axial bond formation with the methionine residue or by the folding of the protein [65, 66].
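The difference between a bare CXXCH motif and the longer consensus K/AGXXL/IFXXXCXXCH described above can be illustrated with two regular expressions. This is a minimal sketch, reading "/" as alternatives and "X" as any residue. The function name is illustrative, the horse heart cytochrome c N-terminal fragment is quoted from general knowledge rather than from this work, and the second sequence is a hypothetical toy example. A pattern match is only a sequence-level check and does not predict maturation by CCHL.

```python
import re

# K/AGXXL/IFXXXCXXCH: "/" marks alternative residues, "X" is any residue.
CCHL_CONSENSUS = re.compile(r"[KA]G.{2}[LI]F.{3}C.{2}CH")
# The bare heme-binding motif, which alone is reported to be insufficient.
BARE_MOTIF = re.compile(r"C.{2}CH")


def recognition_report(sequence):
    """Report whether a sequence carries CXXCH and the longer consensus."""
    sequence = sequence.upper()
    consensus = CCHL_CONSENSUS.search(sequence)
    return {
        "has_cxxch": BARE_MOTIF.search(sequence) is not None,
        "has_consensus": consensus is not None,
        # 1-based start of the consensus match, if any
        "consensus_start": consensus.start() + 1 if consensus else None,
    }


# Assumption: N-terminal fragment of horse heart cytochrome c.
mitochondrial_like = "GDVEKGKKIFVQKCAQCHTVEK"
# Hypothetical toy fragment with a CXXCH site but no upstream consensus.
bacterial_like = "MKTAGCASCHGNDLLP"

print(recognition_report(mitochondrial_like))
print(recognition_report(bacterial_like))
```

In this sketch the first fragment satisfies both patterns, while the second carries only the bare motif, which mirrors the observation that CXXCH alone is not sufficient for recognition.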

1.7 System I

The most elaborate and sophisticated cytochrome c biogenesis machinery is System I, also called cytochrome c maturation (Ccm). Ccm is widespread among Gram-negative bacteria of the α-, β-, γ- and δ-Proteobacteria, the Chloroflexia (green non-sulphur bacteria), Deinococcus-Thermus and Spirochaetes phyla, and in the Cytophaga genus. In eukaryotic cells, it is found in the mitochondria of plants, red algae and some protists [1].

1.7.1 Bacterial cytochrome c maturation system

The bacterial system comprises up to ten proteins, CcmABCDEFGH(I), encoded by eight or nine genes clustered in the same operon on the chromosome. DsbD (or the homologue CcdA), also part of the system, is responsible for the transfer of reducing equivalents from the cytosol to the machinery [35, 67].


Figure 1.4: Schematic representation of System I or cytochrome c maturation (Ccm) in bacteria. First, the heme is attached to CcmE with the help of an ABC transporter, CcmABCD. Then the prosthetic group is transferred to the highly conserved region of CcmF, the WWD motif, and two histidine residues which axially ligate the heme. Next, spontaneous covalent bonds are formed between the heme vinyl groups and the reduced thiols of an apocytochrome c, imported into the periplasm by the Sec translocon. Finally, the holocytochrome c is released from the machinery. CcmG and CcmH are responsible for breaking the disulfide bridges formed in the heme binding site of the apocytochrome c, while CcmI or the C-terminal domain of CcmH (in some bacteria) is involved in apocytochrome recognition. The scissors represent sites for enzymatic cleavage of the signal sequence. The 3-D structure of cytochrome c’ from Rhodobacter sphaeroides (PDB-ID: 1GQA) is shown in cartoon representation in cyan, while the heme is represented in red by sticks.

The Ccm proteins were proposed to form a multi-subunit supercomplex, where the cofactor is elegantly transferred from one complex to another. The first complex CcmABCD covalently attach the heme b to the heme CcmE, which further delivers the cofactor to another complex formed by CcmFH(I) [68]. CcmABCD features an ABC type-transporter with an unknown function, composed of CcmA, the cytoplasmic subunit with an ATP-binding cassette in conjunction with CcmB, the membrane subunit [69]. Even though it was suggested earlier that the transporter might be a candidate for heme export, the theory was abandoned following heme uptake assays in reversed membrane vesicles [70, 71]. However, if the ATPase activity is lost, the covalent bond between CcmE and heme is still maintained, but there are no cytochromes c produced, so there must exist an important role that remains to be elucidated. These findings lead to many controversial interpretations. Maybe the ATP hydrolysis is translocating an essential compound needed for the release of CcmE bound to the heme in a complex with CcmC or possibly the energy from


ATP hydrolysis is required to remove CcmC from the heme coordination sphere and thus allow breakage of the covalent bond between the heme and CcmE [72, 73]. CcmC is a member of the heme-handling family, with a conserved domain rich in tryptophan residues (WWD domain) and two conserved, periplasmic histidine residues required for delivering the heme to CcmE [70, 74]. As stated before, CcmC forms an intermediate complex with holoCcmE and gets locked in this arrangement following the inactivation of the ABC transporter [75, 76]. CcmD is a single-helix protein, not essential for heme attachment to CcmE but rather involved in the release of holoCcmE from the intermediate complex, CcmCDE [35, 77]. One of the main players in this system is CcmE, a monotopic membrane protein with a soluble domain comprising six antiparallel β-strands [78]. CcmE has been identified as a heme chaperone that guides the heme group from one heme-handling complex to another, and is therefore involved in at least two intermediate complexes: CcmCDE, and CcmEFH(I) together with the apocytochrome c. On the surface of the periplasmic domain of CcmE lies an almost strictly conserved histidine residue, part of a highly conserved LAKHDENY motif and implicated in an unusual covalent bond to the heme cofactor. The bond is formed between a nitrogen of the histidine and the β-carbon of the heme’s vinyl-2 group. The tyrosine residue of the motif axially coordinates the heme iron [79, 80]. The heme b is found in an oxidized state in holoCcmE alone, but also in the CcmCDE complex [81]. Mutagenesis studies reveal that replacing the histidine of the binding motif with an alanine or a cysteine residue leads to loss of cytochrome c production [82]. In some species from Archaea (where a modified System I is used) or in sulfate-reducing bacteria (Desulfovibrio genus), the covalent linkage between CcmE and the heme is made through a cysteine residue in a CXXXY motif [83].
Replacement of this amino acid with a histidine or an alanine residue resulted in a deficit of cytochrome c maturation [84]. In vitro studies show the transfer of heme from CcmE to the apocytochrome c. As expected, if an alternative polypeptide lacking the CXXCH motif was introduced, the heme transfer did not take place [85]. HoloCcmE was detected in a complex with CXXCH variants of cytochrome b562, in which one vinyl group of the heme was covalently bound to the histidine residue of CcmE and the other vinyl group was attached to the newly introduced cysteine of the constructed heme-binding motif in cytochrome c-b562 [86]. This represents an intermediate state in the process of heme attachment to the apocytochrome c. It was proposed that a reverse Michael addition reaction might release the heme from its attachment to CcmE, but only after the first covalent bond to the apocytochrome has been formed [81].

Experiments with inhibition of the last enzyme of heme synthesis, the ferrochelatase, still shows a 38 % activity of cytochrome c production by the Ccm system, compared with wild type activity, due to a heme storage mechanism. It was concluded that CcmE actually supplies the system with the cofactor, functioning as a ‘heme reservoir’ [87]. As a result, the heme storage mechanism protects the cell from free heme, which is known to be cytotoxic, particularly in the presence of oxidants [80]. The complex of CcmF together with CcmH and CcmI is needed for the breakage of the covalent bond between CcmE and heme. It is also specialized in heme-apocytochrome c ligation [88]. It was reported that only E. coli CcmFH (CcmI is part of CcmH) together with CcmG, was sufficient for holocytochrome c production, albeit at levels more than ten times lower than the capacity of the whole system. It was proposed that CcmFH/I represents the actual cytochrome c synthetase, while the function of CcmG can be replaced by an external reductant [89]. CcmF is a large integral membrane protein, structured into 15 transmembrane helices, with the typical conserved tryptophan domain, WWD motif, required for heme orientation [74]. In the periplasm, CcmF displays a soluble domain without any recognizable specific sequences [33]. There was evidence that CcmF also interacts with holoCcmE via two conserved periplasmic histidines, involved in heme coordination. The interaction is not possible with the apoCcmE, suggesting that this complex formation is heme dependent [90]. The interaction of CcmF with the apocytochrome c is not direct, but occurs through the complex partner, CcmI or in some organisms the C-terminus of CcmH [91]. Purified CcmF includes its own heme b, one molecule per protein, buried inside the protein and axially coordinated by two histidine residues from the transmembrane helices. 
It was suggested through midpoint potential measurements of CcmF and holoCcmE that reducing power is supplied from the membrane heme to the oxidized heme of holoCcmE, which needs to be attached to the apocytochrome c in a ferrous state [92, 93]. The electron flow is proposed to come from the quinol pool, through a quinol oxidation site (a specific sequence, SPF, close to the heme b in CcmF), although mutagenesis in this binding site did not produce any modifications in cytochrome c production [94, 95]. Recently, the first three-dimensional structure of a WWD family member was solved, CcmF from Thermus thermophilus (data not published, A. Brausemann doctoral dissertation) [96]. The overall structure shows a unique fold of 15 transmembrane helices with interconnecting loops and a large periplasmic β-sheet domain (Figure 1. 5).


Figure 1.5: PyMOL representation of the CcmF X-ray structure from T. thermophilus. Cartoon illustration in rainbow colours from blue at the N-terminus to red at the C-terminus. The heme b, the axial ligands and the WWD motif are depicted as sticks. The three-dimensional model reveals 15 helices embedded in the membrane, with a unique fold and a large periplasmic β-sheet domain.

One heme b group is present in the protein core, axially coordinated by two histidine residues, H259 and H493, as predicted in previous works. The heme is oriented with its propionate groups towards the cytoplasmic side. A cavity could be observed in the middle of the protein, above the heme group and below the highly conserved WWD motif (Figure 1.6). Also, the putative quinone binding site mentioned above (SPF) was found to lie at a very large distance from the heme group, around 30 Å, and is therefore unlikely to be the electron transfer gate.


Figure 1.6: Periplasmic view of the CcmF structure from T. thermophilus. Cartoon representation in rainbow colours from blue at the N-terminus to red at the C-terminus. The heme b, the axial ligands, the WWD motif and a conserved periplasmic histidine (H172) are depicted as sticks and annotated.

CcmG transfers reducing power from DsbD or CcdA to CcmH and further to the apocytochrome c. Its thioredoxin-like motif (CXXC) faces the positive side of the membrane. It was found that this pair of cysteines targets not just the intramolecular disulfide bond in CcmH, but also a mixed disulfide complex between CcmH and the apocytochrome [97, 98]. In some Archaea species, the C-terminus of the enzyme is fused with CcdA [99]. CcmH, a membrane-associated protein, has the characteristics of a thiol-disulfide oxidoreductase in charge of transferring the reductant to the target cysteine residues of the apocytochrome c, prior to heme attachment. In some organisms, like E. coli, CcmI is part of CcmH as its C-terminal domain [100]. The conserved CXXC motif at the N-terminus, essential for reductant provision, is absolutely required for cytochrome c maturation. However, during anaerobic growth only the second cysteine residue of the motif is required [101]. CcmH and CcmG are missing in archaeal genomes as well as in the sulfate-reducing bacteria, although in Desulfovibrio species a candidate for CcmG is encoded in another part of the genome [102].


In some organisms, like E. coli, CcmI represents the C-terminus of CcmH. As part of the CcmFHI complex, the enzyme acts as a chaperone for the apocytochrome c, first recognizing its C-terminal region and then binding to the polypeptide. CcmI features two domains: one at the N-terminus, located in a cytoplasmic loop that includes a leucine-zipper-like motif and interacts with CcmF and CcmH, and a second at the C-terminus, where a large periplasmic extension containing tetratricopeptide repeat (TPR) motifs facilitates protein-protein interaction [103-105]. CcdA is a homolog of the central part of DsbD, which has two additional thioredoxin-like motifs in the periplasm at both termini. Depending on the organism, one of the homologs provides reducing power from the cytoplasmic thioredoxin TrxA to the periplasmic CcmG via a thiol:disulfide cascade [106, 107].

1.7.2 Mitochondrial cytochrome c maturation system

In higher eukaryotes, System I is employed by plant mitochondria for cytochrome c biogenesis and was inherited from their endosymbiotic α-proteobacterial ancestors [67]. The ccm genes are not clustered in a single operon as in bacteria, but are distributed between the nuclear and mitochondrial genomes. The genes identified so far in nuclear genomes correspond to the bacterial ccmA, ccmE and ccmH, while those in mitochondrial genomes correspond to the bacterial ccmB, ccmC and ccmF. Homologs of ccmD, ccmG and ccmI are missing, or at least were not recognized in sequence alignments [28]. Also, no dsbD (or ccdA) gene analogue could be identified. If we consider the IMS a reducing environment, then there is no need for a reduction pathway. CcmH may be preserved because of its interaction with apocytochrome c. It was speculated that CcmH might also be implicated in apocytochrome c import into mitochondria [28]. In most plants, the analogue of CcmF is divided into CcmFN and CcmFC, while in Arabidopsis thaliana it is split into three proteins: CcmFN1, CcmFN2 (which carries the highly conserved WWD motif) and CcmFN3 [108]. The separation differs among species. AtCcmH forms a complex with all three proteins corresponding to CcmF, and together they are part of a larger, unidentified complex of around 500 kDa, proposed to be the heme synthase [109, 110]. Additionally, in the mitochondrial DNA of Nicotiana tabacum, Beta vulgaris and Oryza sativa there is just one gene encoding CcmFN, while in Marchantia polymorpha CcmFC is divided in two: CcmFC1 and CcmFC2 [43].


1.8 System II

System II or cytochrome c synthesis (Ccs) was discovered through studies on chloroplasts from the unicellular green alga Chlamydomonas reinhardtii, and also from analyses of a Gram-positive bacterium, Bacillus subtilis. It is found in organisms from the β-, δ- and ε-Proteobacteria, the Aquificae, Chlorobi (green sulphur bacteria), Cyanobacteria, Actinobacteria and Firmicutes phyla, and in the Bacteroides genus. In eukaryotes, the system is localized in the chloroplasts of plants and algae [1, 33].

1.8.1 Bacterial cytochrome c synthesis system

The bacterial Ccs system (Figure 1.8) is composed of three or four membrane-bound proteins: CcsA (also known as ResC), CcsB (also known as ResB or Ccs1), CcsX (also known as ResA) and CcdA (or the homolog DsbD) [33]. The main players in this system, CcsA and CcsB, are integral membrane proteins found in a tight complex or, in some species, fused into a single polypeptide, and are responsible for the covalent ligation between the heme group and the apocytochrome c [87, 111]. CcsA is evolutionarily related to CcmC and CcmF from System I. All are members of the heme-handling family, characterized by the conserved WWD motif, which is always located in the periplasm and is proposed to serve as a platform for heme orientation [74]. Secondary structure prediction for multiple CcsA homologs yields models with 7 to 13 transmembrane helices. However, experiments based on fusions of Helicobacter hepaticus CcsBA with PhoA and the resulting alkaline phosphatase activities suggest that some of the predicted transmembrane helices might be hydrophobic patches located in the periplasm (Figure 1.7) [112]. The same arrangement was proposed for the System I CcmF protein, for which conflicting secondary structure predictions were published, all of them disproved when the three-dimensional structure revealed no hydrophobic patches, only transmembrane helices [96]. CcsB contains around 4 to 6 transmembrane helices and a large soluble domain oriented towards the periplasmic side (Figure 1.7), with no conserved motifs, but proposed to act as a guiding scaffold for apocytochromes c [81]. In the absence of its complex partner, CcsB is completely degraded in Bordetella pertussis [111], whereas CcsA is not detectable in ΔccsB Chlamydomonas mutants [113], indicating a close co-dependency of these two proteins.


However, in some species like Burkholderia mallei, in which 19 cytochromes c were detected, the ccsA gene was found alone on the chromosome [99].

Figure 1.7: Topology prediction of CcsBA from Helicobacter hepaticus after Frawley and Kranz [112]. CcsB comprises four transmembrane helices, while CcsA includes six helices and two ‘hydrophobic patches’ highlighted in blue. Conserved residues are displayed in red.

Similar to CcmF, the CcsBA complex contains a heme b cofactor in a 1:1 stoichiometry. According to Frawley and Kranz [112], the heme group, found in the Fe2+ state, is captured from the cytoplasm via two histidine residues located in two transmembrane helices of CcsA and CcsB. Next it is transported across the lipid bilayer through a channel formed in CcsA and finally intercepted by the WWD domain and two flanking histidine residues in the periplasmic loops of CcsA (Figure 1.7). Both sets of conserved histidine residues, also identified in CcsBA-type isoenzymes from Wolinella succinogenes [114], establish two heme binding domains, one cytoplasmic and one periplasmic. Through mutagenesis studies and UV/Vis spectral analysis, they proposed that the two heme binding sites are also involved in protecting the heme group from oxidation [112]. However, the absence of the Ccs system does not affect the presence of cytochromes b in the periplasm [115], suggesting that CcsBA might not be a heme translocase and that transport requires another carrier, which may or may not be enzymatic. It is also possible that different heme export pathways exist for different final targets, or that the absence of CcsBA triggers another heme delivery system. Interesting results were observed for ResB and ResC from B. subtilis (CcsB and CcsA, part of the resABCDE cluster on the chromosome). When the target proteins were recombinantly


overproduced in E. coli and subsequently analysed by SDS-PAGE followed by staining for polypeptides with covalently linked heme, two bands were observed at 62 and 27 kDa, identified as ResB and a fragment containing the N-terminus of ResB, respectively. Treatment with acid acetone, a high concentration of urea or boiling prior to heme staining did not diminish the visibility of the bands, suggesting tight bonding of the heme. On the other hand, even though enzymes with non-covalently bound heme generally lose the prosthetic group in the presence of SDS or organic solvents, and especially under harsh treatments like the ones mentioned above, there are cases where heme b is deeply buried inside the protein and does not dissociate during SDS-PAGE. The most relevant example is CcmF from T. thermophilus, which is a b-type cytochrome and could be observed on heme-stained gels [96]. UV/Vis spectra of the reduced pyridine hemochromogen of ResB were measured to confirm the type of cytochrome. The results were, however, inconclusive, with an absorption maximum at 554 nm, which lies between 550 nm and 557 nm, the absorption maxima of the reduced pyridine hemochromogens of heme c and of heme b, respectively [116]. Nevertheless, the authors of the experiments suggested that the heme is bound via one covalent linkage to a cysteine residue located at the N-terminus of ResB and conserved in CcsB homologs [117]. The substitution of this amino acid with an alanine residue resulted in no band being detected after the gel was stained for covalently bound heme. Nonetheless, the protein still contained heme, but the content was reduced to 50% of the wild-type level, while the UV/Vis spectrum was similar to that of the wild-type. Also, it was determined that the cysteine knock-out variant of ResB does not influence cytochrome c production.
Things became even more intriguing after the resB and resC (ccsA) genes were overexpressed in B. subtilis wild-type cells. Even though the same heme-stained polypeptides were detected by SDS-PAGE, their analysis by mass spectrometry confirmed the 62 kDa band as ResB, while the 27 kDa band was identified as ResC [118]. It was also reported that a band of CcsA, part of the fused protein CcsBA from H. hepaticus, was visible on a gel stained for polypeptides with covalently attached heme, although the prosthetic group was lost upon boiling [112]. Heterologous cytochromes c that are naturally assembled via System I or III could be produced by the System II CcsBA synthetase, with the exception of AXXCH variants, identified in the Euglenozoa group, which uses an unknown cytochrome c maturase [61, 119]. As mentioned above, three orthologues of a CcsBA fusion protein have been identified in W. succinogenes, specific for different substrates. Of these, NrfI recognizes the CXXCK motif, a heme binding site found in the nitrite reductase NrfA. However, Campylobacter jejuni

NrfA could not be matured by W. succinogenes NrfI, even though E. coli NrfA was successfully produced by the same heme synthetase. Another dedicated biogenesis protein in W. succinogenes, called CcsA1, recognizes an unusual binding site, a CX15CH motif, specific to the multiheme cytochrome c MccA. The last W. succinogenes homolog, CcsA2, catalyses thioether bond formation between the heme and the standard binding site, CXXCH [9, 120].
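
The three heme binding motifs discussed above (CXXCH, CXXCK and CX15CH, where X is any residue) can be located in a protein sequence with simple pattern matching. The following is a minimal sketch in Python and is not part of the methods of this work; the function name `find_heme_motifs` and the toy sequence are hypothetical and were invented purely for illustration.

```python
import re

# Heme binding motifs named in the text; "." stands for X (any residue).
MOTIFS = {
    "CXXCH": r"C..CH",
    "CXXCK": r"C..CK",
    "CX15CH": r"C.{15}CH",
}

def find_heme_motifs(sequence):
    """Return sorted (position, motif_name, matched_text) tuples.

    Positions are 1-based, as is usual for protein sequences.
    """
    seq = sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        # The lookahead makes overlapping occurrences visible as well.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)

# Toy sequence (hypothetical, not a real protein).
toy = "MKAAGCAACHGKLLCPDCKTT"
for position, name, text in find_heme_motifs(toy):
    print(position, name, text)
```

Such a scan only flags candidate sites. Whether a given motif is actually matured depends on the synthetase, as the NrfI, CcsA1 and CcsA2 examples show.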

Figure 1.8: Bacterial System II of cytochrome c synthesis (Ccs). The components of the system, CcsA/ResC, CcsB/ResB, CcsX/ResA and CcdA/DsbD, are membrane proteins. CcsB and CcsA ligate the heme b from the cytoplasm via two histidine residues. CcsA coordinates heme b via the WWD domain and two histidine residues located in the periplasm. The cytoplasmic apocytochrome c is translocated through the Sec complex after signal sequence recognition. The heme binding site on the protein precursor is reduced by the thiol-disulfide redox module composed of CcdA and CcsX. The scissors represent sites for enzymatic cleavage of the signal sequence. The 3-D structure of cytochrome c peroxidase from Geobacter sulfurreducens (PDB-ID: 3HQ6) is represented as a cartoon, coloured in green, while the heme is represented as sticks.

A reducing mechanism is required to prepare the apocytochrome c for heme insertion. The intramolecular disulfide bonds formed within the CXXCH motif need to be broken, so that the thiol groups become accessible for nucleophilic attack on the α-carbon of the vinyl groups in heme b. In System II, two proteins are dedicated to bringing reducing power to the cytochrome c polypeptide. An integral membrane protein, called CcdA (or the homolog DsbD), translocates reducing equivalents from a thioredoxin protein in the cytoplasm to reduce the other partner in this thiol-disulfide redox module, designated CcsX (or ResA) [121]. CcsX is anchored in the membrane through a helix and displays a large periplasmic domain. Biochemical and structural studies show that upon reduction, CcsX forms a cavity, close to the active site,


that catches the histidine residue of the heme binding motif on the apocytochrome c [122, 123]. Production of cytochromes c is inhibited or strongly diminished upon inactivation of CcdA in B. subtilis or DsbD in E. coli [61, 124]. Its absence can be compensated by external addition of disulfide reductants [125, 126]. A different thio-reduction pathway was reported in Campylobacter jejuni, where the cytochrome c6 (CccA) seems to be required for the maintenance of the apocytochrome cysteine sulfhydryls in a reduced state [127].

1.8.2 Chloroplastidial cytochrome c synthesis system

Plastidial cytochrome f, a subunit of the b6f complex encoded by the petA gene on chloroplast DNA, is an unusual cytochrome c composed mainly of β-sheets and anchored to the thylakoid membrane through a C-terminal helix. The heme group is axially coordinated by the amino group of a tyrosine residue in the N-terminal part [128]. The photosynthetic b6f complex shows extensive analogy to the bc1 complex from mitochondria and transfers electrons from plastoquinol to plastocyanin. Cytochrome c6, nuclear encoded and found only in cyanobacteria and algae, is usually produced under stress conditions such as copper deficiency or hypoxia [44, 129].

A cytochrome similar to cytochrome c6 was discovered in the plastid lumen of vascular plants and green algae. However, structural data show differences in surface properties between the novel cytochrome c6 and plastocyanin; thus, unlike cytochrome c6, it cannot be considered a functional substitute for the copper-containing protein [130]. As in the bacterial system, a complex between CCSA and CCSB (or CCS1) represents the actual heme maturase. The CCSA gene is the only one of the system’s components encoded in the chloroplast DNA and is expressed as a thylakoid membrane protein with three domains of conserved residues, of which the most conserved is the WWD motif found at the C-terminus [131]. Apart from its role in heme binding and orientation, it was also proposed, similar to the bacterial model, to function in heme transport by forming a channel in the thylakoid membrane [43]. The complex partner CCSB is nuclear encoded and thought to be involved in apocytochrome chaperoning, as well as in heme binding. The secondary structure of CCSB is predicted to have a transmembrane domain at the N-terminus composed of three helices and a lumenal soluble domain at the C-terminus. This soluble domain, a loop from the plastid stroma, as well as a conserved histidine from the last transmembrane helix, seem to be necessary for the protein function [44]. In Chlamydomonas, a 200 kDa complex containing CCSB in the wild-type is no longer present in CCSA knock-out variants, suggesting a tight complex between the two proteins, as well as the association of other, unknown components with the system [131].

Figure 1.9: Plastidial System II of cytochrome c synthesis (CCS). The protein components of the system, CCSB, CCSA, CCS5/HCF164, CCDA and CCS4/HCF153, are associated with the thylakoid membrane. CCSB and CCSA non-covalently ligate a heme b from the stroma via two histidine residues. The heme is coordinated in the thylakoid lumen by CCSA through the WWD domain and two histidine residues. The cytoplasmic apocytochrome c6 with a bipartite signal is first translocated by the TIC-TOC complex into the chloroplast stroma and then by the Sec secretory pathway into the thylakoid lumen. The heme binding site of the cytochrome precursor is reduced by the CCDA and CCS5/HCF164 thiol-disulfide oxidoreductases. The scissors represent the enzymatic cleavage of the signal sequences. The 3-D structure of cytochrome c6 from Chlamydomonas reinhardtii (PDB-ID: 1CYJ) is represented as a cartoon (blue), while the heme is represented as sticks (red).

A similar pathway as in bacteria for maintaining the thiols reduced on the heme binding motif was proposed for the plastidial counterpart. The homologs for the bacterial DsbD/CcdA and CcsX/ResA thiol-disulfide oxidoreductases are designated CCDA and CCS5 (in Chlamydomonas reinhardtii) or HCF164 (high chlorophyll fluorescence phenotype, in Arabidopsis thaliana) [132, 133]. The thiol-reducing equivalents are proposed to come from a source in stroma, perhaps thioredoxin-m [134], then passing to the p-side in the thylakoid lumen through the CCDA transducer, then to the membrane anchored CCS5/HCF164 and finally transferred to the disulfide bridges on the apocytochrome c. Plastid CCDA, nuclear encoded, is similar in its central part to DipZ/DsbD, with the N-terminus acting as a signal sequence [43]. It is believed that the N-terminus of CCS5/HCF164 is anchored in the thylakoid


membrane, while the C-terminus encompasses a large soluble domain containing a typical thioredoxin motif, WCXXC, found to physically interact with apocytochromes [132]. A protein called CCS4 in Chlamydomonas reinhardtii and HCF153 in Arabidopsis thaliana was identified as being involved in the disulfide-reducing pathway, but without displaying a thioredoxin motif. It is anchored in the thylakoid membrane, with a stromal soluble domain, and is suggested to be implicated in cytochrome b6f accumulation [135, 136].

1.9 Bacterial species with System II, selected for CcsBA or CcsA characterization

1.9.1 Aquifex aeolicus

The two members of the genus Aquifex, A. aeolicus and A. pyrophilus, are among the most thermophilic bacteria, growing optimally at 85 °C and surviving up to 95 °C, the extreme thermal limit for microbes. The genus name Aquifex comes from Latin, where it means “water maker”, referring to the bacteria’s ability to produce water by oxidizing hydrogen through a complex respiratory process. Aquifex species are microaerophilic and obligately chemolithoautotrophic bacteria, capable of growing on hydrogen, oxygen, carbon dioxide and mineral salts, using them as carbon and energy sources for biosynthesis. A. aeolicus was first isolated from an underwater volcanic vent in the Aeolian Islands, north of Sicily, hence the name of the species, but it is also encountered in hot springs. It is a Gram-negative, rod-shaped bacterium that does not form spores and is motile by means of monopolar, polytrichous flagella. The A. aeolicus genome is about a third of the size of that of Escherichia coli. About 16% of its genes have their origin in the Archaea domain. These similarities suggest that Aquifex species could be among the oldest members of the Eubacteria domain [137]. Its extreme thermophily makes this organism a great candidate for recombinant expression of its System II proteins, because thermophilic membrane proteins can be produced in high yields and with good stability.

1.9.2 Bacillus megaterium

B. megaterium is one of the biggest bacteria known, 1.5-4 μm in length, hence the name, derived from Greek and meaning “big animal”. It is a Gram-positive, rod-shaped, aerobic, spore-forming soil bacterium. Due to its large cell size it is well suited for research in molecular biology, biochemistry, cell morphology, cellular organization, protein localization, sporulation and spore structure. Also, because it is a non-pathogenic bacterium that grows easily on a variety of carbon sources in simple media and is able to produce a large variety of enzymes, B. megaterium is of great interest for the food and pharmaceutical industries. Currently, it is used biotechnologically for the production of numerous enzymes and substances, including vitamin B12, penicillin acylase and α- and β-amylases [138]. Interesting results were published on the System II proteins CcsA and CcsB (ResC and ResB) from B. subtilis [118, 139]; therefore a close homolog (CcsA) from B. megaterium was placed under investigation.

1.9.3 Bacteroides thetaiotaomicron

B. thetaiotaomicron is one of the dominant endosymbionts of the human gut and the most studied member of the Bacteroides genus. It is a Gram-negative, obligately anaerobic bacterium that uses carbohydrates as a source of carbon and energy [140]. In the course of evolution, the bacterium acquired the capacity to take up and hydrolyze a wide variety of non-digestible polysaccharides. The acquisition of carbohydrates is achieved by multiple outer membrane proteins (OMPs). The proteome of B. thetaiotaomicron includes more predicted glycosylhydrolases than that of any other sequenced bacterium. The microorganism has developed strategies to survive and proliferate by sensing the surrounding environment through signal transduction mechanisms, by manipulating host gene expression such that the symbionts maintain a beneficial partnership, and even by DNA transfer between itself and other members of the human gut microbiota [141]. It is considered an opportunistic pathogen, being associated with infections that are becoming more difficult to treat because of increased resistance to antimicrobial agents. It spreads antibiotic resistance genes to other intestinal colonizers through self-transmissible mobile genetic elements like conjugative transposons, as well as conjugative and mobilizable plasmids [142]. The unique metabolic processes in this organism require many electron transport proteins like cytochromes c. Accordingly, sequencing of the genomic DNA of this species revealed two orthologs of the fusion gene ccsBA and two homologous sets of the single gene ccsA, each with its ccsB complex partner.


1.9.4 Geobacter sulfurreducens

The Geobacter genus is a unique group of bacteria with anaerobic respiration. G. sulfurreducens, one of the main representatives of this group, is a rod-shaped, Gram-negative proteobacterium with a flagellum. It was first isolated from a soil sample in Oklahoma that was contaminated with hydrocarbons. The bacterium’s capacity for metal or sulphur reduction is possible because of the presence of 111 genes encoding various forms of c-type cytochromes, the highest number ever found in an organism’s genome. Insights into the metabolic mechanisms gained through genomic analysis show that the bacteria can generate electricity from the environment [143]. A multitude of genes specialized in sensing environmental conditions allows the organism to regulate its own metabolism in response to changes in its surroundings. For example, it was discovered that the microorganism is able to move towards metallic compounds. Further, genes were found that allow the bacteria to survive in the presence of oxygen under certain conditions [144]. G. sulfurreducens also has environmental restorative capabilities, converting uranium that is dissolved in water to a solid compound called uraninite, which can then be removed [145]. All these particularities of G. sulfurreducens are possible because of an unusually high number of different cytochrome c classes present in the cell, which led to the development of several copies of the cytochrome c assembly machinery. Accordingly, six homologous sets of individual ccsA and ccsB genes were identified on the bacterial chromosome.

1.9.5 Helicobacter hepaticus

H. hepaticus is a Gram-negative, spiral-shaped bacterium, which can be found in the mucosal layer of the gastrointestinal tract or in liver tissue. It grows under aerobic conditions, but can survive at lower oxic levels. For motility, this species uses a bipolar, sheathed flagellum, but it lacks the periplasmic fibers that envelop the bacterial cell in the case of H. pylori. In mice, infection with H. hepaticus causes chronic hepatitis, liver cancer and inflammatory bowel disease, and it can also trigger mammary carcinoma. It is not yet fully understood whether or how it can affect humans, but infection with this species was associated with cholecystitis, cholelithiasis and gallbladder cancer [146].


H. hepaticus has a strong urease activity, resulting in gastric acid neutralization, which allows the bacteria to survive in and colonize the acidic environment of the gastrointestinal tract. It can also reduce sulphur to hydrogen sulphide and nitrate to nitrite. The genomic DNA contains regions with a GC content different from the rest of the chromosome, implying the acquisition of such sequences by horizontal gene transfer. In one of these genomic islands a type IV secretion system was found, which is a conjugation-like system for DNA and protein transport [147]. The analysis of its genome reveals a unique mixture of features similar to H. pylori and Campylobacter jejuni. Like these two microorganisms, H. hepaticus possesses a fused ccsBA gene in a single ORF on the chromosome. Very interesting studies were conducted on the CcsBA fusion protein from this species [112].

1.9.6 Hydrogenobacter thermophilus

Another extremely thermophilic member of the Aquificaceae family is Hydrogenobacter thermophilus, a rod-shaped, Gram-negative bacterium. Non-motile and non-sporulating, this species lives in geothermal and even hot saline springs, with an optimal growth temperature of 70-75 °C. It is the only obligate chemolithoautotroph among all aerobic hydrogen-oxidizing bacteria reported so far. Compared to other bacteria, this organism has a unique fatty acid composition, with acid chains longer by two carbon atoms [148, 149]. The genomic DNA of this bacterium was chosen for overexpression of the ccsA gene from the cytochrome c assembly system mainly because of its thermophilic properties.

1.9.7 Micrococcus luteus

Micrococcus luteus is a Gram-positive, obligately aerobic bacterium that can be found in soil, water and air and also as part of the human intestinal flora. It was discovered by Sir Alexander Fleming. The analysis of its genome revealed that it has one of the smallest chromosomes of known actinobacteria, with a GC content of 73%. The capacity of M. luteus to concentrate heavy metals could have applications in bioremediation. It is generally considered a harmless species, but it can act as an opportunistic pathogen and is treated as a contaminant in sick patients. Although it is a non-spore-forming bacterium, it can enter a profoundly dormant state [150].

Besides the fact that this organism is a representative of the Actinobacteria phylum and of the Gram-positive bacteria, its genomic DNA has a high GC content similar to thermophilic organisms and therefore it was chosen as a model for the overproduction of the CcsA protein.

1.9.8 Symbiobacterium thermophilum

S. thermophilum is a rod-shaped, thermophilic bacterium whose growth depends on co-culture with a Bacillus strain in liquid media. It can proliferate under both aerobic and anaerobic conditions. The Gram staining reaction is negative, and the production of tryptophanase and tyrosinase also suggests that this species is Gram-negative. However, the result of the 16S ribosomal DNA phylogenetic study and the lack of major Gram-negative membrane biosynthesis proteins indicate that S. thermophilum belongs to the Gram-positive group [151]. Despite its high GC content (68.7%), genome analysis shows that this bacterium is most closely related to the Firmicutes phylum, which consists of Gram-positive bacteria with a low GC content of the genomic DNA. Another surprise was the finding of genes involved in endospore formation, even though the organism had previously been considered non-spore-forming. This makes it the first high-GC bacterium known to be capable of forming endospores [152]. As a thermophilic organism whose Gram classification is still debated, S. thermophilum is an excellent choice for the study of its native cytochrome c maturation system, which comprises two sets of CcsA isoenzymes with their complex partner, CcsB.

1.10 Human pathogenicity

Cytochrome c biogenesis can be linked to human pathogenicity either directly, when a person's DNA carries errors in the corresponding gene, or indirectly, through bacterial infections affected by anomalies in the Ccm or Ccs systems. For several pathogens, cytochromes c are essential for respiration or for less common processes like iron acquisition (through siderophore production) or cell elongation in response to NO [153, 154]. Mutations in the CCHL gene, which encodes the cytochrome c heme lyase (System III), cause diseases in humans such as microphthalmia with linear skin defects syndrome (MLS). Because the CCHL gene is located on the X chromosome, only female patients present with this genetic disorder, while in males it is lethal [155].

A unique anti-apoptotic mechanism, represented by the interaction between the CCHL protein and the neuronal glutamate transporter EAAC1 (in the process altering the association of CCHL with the X-linked inhibitor of apoptosis protein, maintaining its activity), rescues motor neurons from NGF (nerve growth factor) deprivation and nerve injury [156]. The growth of Mycobacterium tuberculosis, responsible for lung tuberculosis, is impaired if the Ccs proteins or some c-type cytochromes are inactivated [157]. On the other hand, in Bacillus anthracis, known for causing anthrax infections, the loss of the ccsB gene, which inactivates production of two types of cytochromes c, results in an increase of anthrax toxin production [158]. A similar case was reported for B. cereus, known for releasing cell-damaging toxins, where the knock-out of the ccsBA genes contributed to toxin overexpression [159]. The loss of the ccmC gene from Legionella pneumophila altered the ability to infect macrophages in humans and increased iron assimilation [160]. Some c-type cytochromes are crucial for the production of biofilms, which contribute to human health issues like lung infections caused by Pseudomonas aeruginosa in patients with cystic fibrosis or gonorrhoea induced by Neisseria gonorrhoeae [161, 162]. Inactivation of the genes encoding these cytochromes c blocks biofilm maturation [163]. Another indirect way of affecting human health is through mutations in the heme synthesis pathway, which can cause serious metabolic disorders. For example, distinct mutations in the ferrochelatase gene lead to erythropoietic protoporphyria (EPP), with signs of cutaneous photosensitivity, or to acute liver damage through overproduction of protoporphyrin IX [164, 165].

1.11 Evolution of cytochrome c biogenesis systems

Evolution has produced multiple distinct cytochrome c biogenesis pathways, and the reason behind this remains a mystery. The presence of System I genes in the mitochondrial genome of the protozoan Reclinomonas, a relative of primitive eukaryotes, suggests that the first eukaryotes may have employed System I. It is thought that approximately 800 million years ago, a new simplified mechanism of cytochrome c production, CCHL (System III), may have evolved from something more primitive, not long after the Euglenozoa group diverged [166]. In contrast to other eukaryotes in the Excavata phylum, where System I prevails, Euglenozoa developed an unknown system, designated as System V, that catalyses the heme linkage to an AXXCH motif (see section 1.5). The Euglenozoa species may use an ancestor of System III, with no sequence homology but mechanistically and structurally related to CCHL [66]. A missing piece of the evolutionary chain was discovered recently: an early eukaryotic species, Ancoracysta twista (a single-cell predatory flagellate), with both System I (incomplete, with just four subunits, CcmABCF) and System III encoded in its DNA. It shows a state of redundancy, where the System I genes are in the process of being lost [167]. This evidence also supports the notion that System III does not descend from the other systems. Developing a simpler mechanism in mitochondria, where only two specialized cytochromes c are produced, was critical for the organism to avoid the laborious synthesis of at least eight integral membrane proteins and the energy consumption involved in a system like Ccm. It is speculated that plants and some protozoans maintain an incomplete System I in their mitochondria because of the reduction mechanism found within this machinery, which can be considered a great advantage in a more oxidizing environment [81]. System I and System II display many similarities, and both bacterial systems were integrated into plant mitochondria (CCM) and chloroplasts (CCS) during the endosymbiosis events deep in evolutionary history. All species in the Archaea domain contain a modified version of System I, labelled System I*, with no CcmH or CcmG and with a homologue of CcmE (see section 1.7.1). So far there is no evidence, but System I* might be a precursor of System I. Alternatively, System I may be the older version and may have been acquired by Archaea through horizontal gene transfer, early in the evolution of Ccm [99]. There is a possibility that System II descended from System I, but with serious reorganization and reduction of obsolete components.
If this is not the case, then either both systems have a common origin, evolving from the same ancestor [74], or Ccm evolved from Ccs. This is not an impossible scenario, considering the distribution of System II in ancient groups of bacteria, like Firmicutes and Actinobacteria [99]. The questions are why the need arose to develop another bacterial system and what the triggering conditions might have been. In species like Bordetella bronchiseptica, Bordetella parapertussis, Desulfitobacterium hafniense, Ralstonia eutropha and Vibrio cholerae, where both bacterial systems are encoded in the genome, it is expected that over time one of the systems will be selected and the other lost [99]. However, these organisms harbour nitrite reductase proteins bearing CXXCK motifs in their sequence; it is therefore thought that Ccm is used for the production of bacterial c-type cytochromes with the standard CXXCH motif, while the Ccs machinery is only employed when the cells use nitrate as a source of nitrogen, enabling the production of nitrite reductase proteins [81]. There is also the example of Desulfovibrio species where, given the presence of nitrite reductase genes in the genome, the ccm operon is found alongside a gene encoding the CcsA protein from System II (Ccs) [102]. As expected, the Ccm system from D. desulfuricans could not mature a CXXCK variant of cytochrome c550 from Pseudomonas denitrificans in E. coli [84]. In E. coli, the maturation of the periplasmic nitrite reductase, NrfA, is assigned to a specific paralog of the CcmFH complex, labelled NrfEFG, where NrfE is similar to CcmF, NrfF to the N-terminus of CcmH and NrfG to the C-terminus of CcmH [168, 169]. In Wolinella succinogenes, three types of CcsBA isoenzymes were developed, each specialized in the recognition of a different heme binding motif [120]. The fusion protein CcsBA is found in species belonging to bacterial groups that evolved later, possibly suggesting an intermediate state in the progress of System II. A curious and interesting case is the genomic DNA of the malarial mosquito, Anopheles gambiae, where elements from all three systems were identified: CcmEGH from System I, CcsBA from System II and CCHL from System III. However, this aberration might be the result of horizontal gene transfer from different bacteria [80].

1.12 X-Ray Crystallography

1.12.1 Protein Crystallization

For a better understanding of biological processes and a comprehensive image of the cellular universe, it is crucial to visualize the three-dimensional structure of enzymes and other macromolecules at atomic resolution. The fascination with the crystallization of biological components began in 1934, with the observation of X-ray diffraction patterns of pepsin crystals found in solution. Bernal and Crowfoot concluded that the pepsin molecules are ‘dense globular bodies’ and contain large amounts of water, around 50% [170]. The pioneering work of Kendrew and Perutz on the first X-ray structural analyses of myoglobin [171] and hemoglobin [172] in 1956-1959 opened the path for single-crystal X-ray crystallography. Much later, in 1985, the first crystal structure of a membrane protein was solved, the bacterial photosynthetic reaction center complex [173]. The least understood step in the determination of the three-dimensional structure of a protein is obtaining suitable single crystals, which is regarded as a trial-and-error process. Crystallization is driven forward when the entropy gained by releasing the hydration shell exceeds the entropy lost through crystal formation. Producing good quality crystals requires a highly pure protein sample and a suitable aqueous buffer solution, plus an appropriate detergent in the case of membrane proteins. A precipitant is added to the solution at concentrations sufficient to reach supersaturation; as the solubility of the protein decreases, small nuclei form to minimize the free energy [174]. The precipitant can be a salt, an organic solvent or a versatile polymer (polyethylene glycols). Other ways to decrease protein solubility are changing the pH of the solution or the temperature. At a lower level of supersaturation, crystal growth starts spontaneously (Figure 1. 10) [175]. The solvent content in a protein crystal is much higher than in an inorganic crystal, ranging from 27% to 65%, with an average of 43% [176].
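The solvent content quoted above is usually estimated from the unit cell volume and the protein mass. The following minimal sketch illustrates this under stated assumptions: the relation solvent fraction ≈ 1 − 1.23/V_M (with the Matthews coefficient V_M in Å³/Da) is the commonly used approximation and is not spelled out in the text, and the cell dimensions, molecular mass and number of molecules per cell are hypothetical values.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (Å^3) of a general unit cell; lengths in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def matthews_coefficient(volume, mass_da, z):
    """V_M in Å^3/Da for z molecules of mass mass_da (Da) in the unit cell."""
    return volume / (mass_da * z)

def solvent_fraction(vm):
    """Approximate solvent fraction, assuming protein occupies about 1.23 Å^3/Da."""
    return 1.0 - 1.23 / vm

# Hypothetical example: orthorhombic cell containing 4 molecules of 40 kDa
volume = unit_cell_volume(60.0, 70.0, 90.0, 90.0, 90.0, 90.0)
vm = matthews_coefficient(volume, 40_000.0, 4)
print(f"V = {volume:.0f} Å^3, V_M = {vm:.2f} Å^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

For these illustrative numbers the estimate falls inside the 27% to 65% range given above; with a real data set, the number of molecules per cell is typically varied until a plausible solvent content is obtained.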

Figure 1. 10: Schematic phase diagram of protein crystallization. The unsaturated zone is where no crystals can grow and the crystallization drop stays clear. The supersaturated zone consists of three main areas. (1) Precipitation zone is where the protein aggregates. (2) Nucleation zone is crucial in protein crystallization, where the nuclei are formed. (3) Crystals are growing at a lower supersaturation level, in the metastable zone. However, if there are crystal seeds in the system, the crystallization process starts directly in the nucleation area. The line between undersaturated zone and metastable zone is called the solubility curve of the protein. The line dividing the metastable zone from nucleation and precipitation zone is the precipitation curve.

Although various crystallization techniques have been implemented over time, the most popular approach is vapour diffusion with the hanging or sitting drop method. The two methods work on the same principle: the protein sample is mixed in a uniform drop with precipitant solution taken from the reservoir. The chamber is sealed with oil or a transparent cover slip, so that the concentration of the drop solution can equilibrate with the reservoir concentration through vapour diffusion (Figure 1. 11) [175].

Figure 1. 11: Left: The hanging drop method: the protein drop is suspended from a cover slip above the precipitant solution. Right: The sitting drop method: the protein drop sits in a depression of the tray, surrounded by precipitant solution. The advantage of this method becomes apparent when the surface tension of the drop is low, because the drop cannot spread as it would in a hanging drop.

1.12.2 X-Ray diffraction

In the following sections, the principles of protein crystallography are summarized from the detailed book by G. Rhodes, ‘Crystallography made crystal clear’ [177]. X-rays with energies that correspond to wavelengths between 0.7 and 2.1 Å are normally used for diffraction experiments in protein crystallography, in order to resolve distances between two bonded atoms in a protein or other molecules. Protein crystals are exposed to X-ray photons that interact with the electron shell of each atom of the molecules. Due to constructive interference, the scattered X-rays can be recorded on a detector as reflections with different intensities and specific patterns. The position of a reflection encodes the direction of the beam that was diffracted by the crystal. A crystal lattice is built of repeated structural units translated in all three directions of space (Figure 1. 12, B). A unit cell in a crystal is defined by its axes a, b, c and its angles α, β, γ (Figure 1. 12, A). One unit cell can contain one or more protein molecules. The smallest part that contains all the structural information needed to describe the protein crystal by symmetry operations is called the asymmetric unit. A reciprocal lattice, inverse to the crystal lattice, can be constructed virtually from the sets of lattice planes that intersect the cell axes at distinct points. The position of a lattice plane is defined by the Miller indices h, k and l (Figure 1. 12, C). The three-dimensional space spanned by the Miller indices of the respective lattice planes is called reciprocal space, an imaginary space in which each reflection of the diffraction pattern is described. X-ray beams are reflected in all directions by different lattice planes, separated by interplanar distances. Scattered waves with the same phase angle interfere constructively and produce an intensified diffracted beam. This condition is described by Bragg’s law:

$$n\lambda = 2d\sin\theta \qquad \text{(Equation 1)}$$

where n is an integer, λ is the wavelength, d is the interplanar distance and θ is the angle of incidence (Figure 1. 12, D). These constructive interferences are recorded as reflections on a detector.
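As a small numerical illustration of Bragg's law, the sketch below converts between the interplanar distance d and the angle θ for a given wavelength. The function names and the chosen wavelength of 1.0 Å are assumptions made for this example only.

```python
import math

def bragg_angle(d, wavelength, n=1):
    """Return the Bragg angle θ in degrees for spacing d and wavelength (both in Å)."""
    s = n * wavelength / (2.0 * d)
    if s > 1.0:
        raise ValueError("no reflection possible: nλ exceeds 2d")
    return math.degrees(math.asin(s))

def d_spacing(theta_deg, wavelength, n=1):
    """Return the interplanar distance d (Å) that diffracts at angle θ (degrees)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

wavelength = 1.0  # Å, an illustrative value inside the 0.7 to 2.1 Å range
for d in (10.0, 3.0, 2.0):
    print(f"d = {d:4.1f} Å  ->  θ = {bragg_angle(d, wavelength):5.2f}°")
```

Smaller interplanar distances diffract at larger angles, which is why high-resolution reflections are found far from the center of the detector, and no reflection is possible for d below λ/2.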

Figure 1. 12: A. Unit cell with its axes a, b, c and its angles α, β, γ. B. A three-dimensional crystal lattice. C. Lattice planes in a two-dimensional scheme (h = 2, k = 1). D. Geometric illustration of Bragg’s law: two parallel electromagnetic waves with identical phases before reflection will be reflected with the same phase only if the difference in their path lengths is an integer multiple of the wavelength.

Electromagnetic waves are characterized by their wavelength, amplitude and phase angle (Figure 1. 13, left). In a typical diffraction experiment, however, monochromatic X-rays are used, so the wavelength is constant and the wave function is described by its amplitude |Fhkl| and phase angle φhkl, shown as a vector in an Argand diagram. The resulting vector is called the structure factor F(h, k, l) of the wave (Figure 1. 13, right).

$$F_{hkl} = |F_{hkl}| \exp(i\varphi_{hkl}) \qquad \text{(Equation 2)}$$

Figure 1. 13: Left: A wave function is described by three parameters: wavelength (λ), amplitude (|F|) and phase angle (φ). Right: Argand diagram representing the structure factor as a vector whose length equals the amplitude and whose phase is given by the angle made with the real axis, when the origin of the vector is placed at the origin of the complex plane.

The structure factor for a certain reflection (h, k, l) in a unit cell with n atoms can be expressed as a Fourier sum over the contributions of the individual atoms.

$$F(h, k, l) = \sum_{j=1}^{n} f_j \, e^{2\pi i (h x_j + k y_j + l z_j)} \qquad \text{(Equation 3)}$$

where fj is the scattering factor of atom j with coordinates (xj, yj, zj) in real space, while the Miller indices (h, k, l) are the coordinates of a reflection in reciprocal space. The structure factor F(h, k, l) can also be described as an integration of the electron density in the unit cell.
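The atomic sum of Equation 3 and the amplitude and phase representation of Equation 2 can be sketched in a few lines. This is only an illustration: the two-atom cell is hypothetical, and the scattering factors are approximated by constant electron counts, whereas real scattering factors fall off with the diffraction angle.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f_j * exp(2πi(h x_j + k y_j + l z_j)) over all atoms.

    atoms: list of (f_j, x_j, y_j, z_j) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

# Hypothetical two-atom cell; f_j approximated by constant electron counts
atoms = [(6.0, 0.10, 0.20, 0.30),   # a carbon-like scatterer
         (8.0, 0.40, 0.15, 0.75)]   # an oxygen-like scatterer

F = structure_factor((1, 2, 0), atoms)
amplitude, phase = cmath.polar(F)   # |F_hkl| and φ_hkl (radians)
print(f"|F| = {amplitude:.3f}, φ = {math.degrees(phase):.1f}°")
```

With purely real scattering factors, the reflection (−h, −k, −l) is the complex conjugate of (h, k, l), so both share the same amplitude and have opposite phases.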

$$F(h, k, l) = \int_x \int_y \int_z \rho(x, y, z) \, e^{2\pi i (h x + k y + l z)} \, dx \, dy \, dz \qquad \text{(Equation 4)}$$

From this equation, the electron density ρ(x, y, z), the result of the diffraction experiment, can be calculated as the inverse Fourier transform of F(h, k, l).

$$\rho(x, y, z) = \frac{1}{V} \sum_{h,k,l} F(h, k, l) \, e^{-2\pi i (h x + k y + l z)} \qquad \text{(Equation 5)}$$

where V is the volume of the unit cell. In diffraction experiments, only the intensities of the reflections can be measured. They are proportional to the square of the structure factor amplitude, while the phase angle is lost in a native data set. However, there are several approaches to recover the phase, including molecular replacement, single- or multiple-wavelength anomalous dispersion and isomorphous replacement.
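The consequence of losing the phases can be made concrete with a one-dimensional toy model. The sketch below uses a discrete analogue of the Fourier synthesis, with the number of grid points N taking the place of the volume V; the sampled density is hypothetical and the helper names are chosen for this example.

```python
import cmath
import math

def forward(rho):
    """F(h) = sum_x rho(x) exp(+2πi h x / N) for a density sampled on N grid points."""
    n = len(rho)
    return [sum(r * cmath.exp(2j * math.pi * h * x / n) for x, r in enumerate(rho))
            for h in range(n)]

def synthesis(F):
    """rho(x) = (1/N) sum_h F(h) exp(-2πi h x / N), the discrete analogue of Equation 5."""
    n = len(F)
    return [(sum(f * cmath.exp(-2j * math.pi * h * x / n)
                 for h, f in enumerate(F)) / n).real
            for x in range(n)]

# Hypothetical one-dimensional "unit cell" with two peaks of different height
rho = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 2.0, 0.0]
F = forward(rho)

recovered = synthesis(F)                           # amplitudes and phases
amplitudes_only = synthesis([abs(f) for f in F])   # phases discarded (set to zero)

print([round(v, 3) for v in recovered])
print([round(v, 3) for v in amplitudes_only])
```

With amplitudes and phases the original density is restored, whereas the amplitudes alone give a different map. This is the situation after a native data collection and the reason why the phasing methods of the following sections are needed.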

1.12.3 Molecular replacement

One of the most popular methods to recover the phase is to borrow it from an already known protein structure; the only condition is a high structural similarity between the phasing model and the target protein. Usually, if the sequence identity is greater than 30%, the protein folds can be expected to be closely related. The strategy is to superimpose the model protein onto the target protein in orientation and position through rotation and translation operations. To simplify the process, the search for orientation and position is separated, first using Patterson maps, calculated with the Patterson function, and then comparing structure factors to find the best location of the phasing model. The Patterson function is a Fourier sum without phases, which depends only on the structure factor amplitudes. The agreement between the structure factor amplitudes calculated from the model in a certain position and the measured amplitudes of the target is expressed in a parameter called the R-factor, which establishes the quality of a possible solution.

$$R = \frac{\sum \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum |F_{obs}|} \qquad \text{(Equation 6)}$$

A good solution has a low R-factor [178]. Once the model has been properly placed, the resulting modified phase of the model is used together with the experimental structure factors of the target protein to obtain an initial electron density map.
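The R-factor of Equation 6 is straightforward to evaluate once observed and calculated amplitudes are available for the same set of reflections. The sketch below uses hypothetical amplitudes and, as a simplification, assumes that both lists are already on the same scale.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |F_obs| - |F_calc| |) / sum(|F_obs|) over matching reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical amplitudes for five reflections
f_obs = [120.0, 85.0, 40.0, 62.0, 15.0]
good_placement = [115.0, 90.0, 38.0, 60.0, 17.0]
poor_placement = [60.0, 130.0, 75.0, 20.0, 44.0]

print(f"R (good placement) = {r_factor(f_obs, good_placement):.3f}")
print(f"R (poor placement) = {r_factor(f_obs, poor_placement):.3f}")
```

A correctly placed model reproduces the measured amplitudes more closely and therefore yields the lower value.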

1.12.4 Single-wavelength anomalous diffraction (SAD)

If there is no known protein structure related to the target protein, the phase information has to be obtained de novo. If the protein naturally includes a cofactor with a heavy metal, the capacity of the heavy atom to absorb X-rays at specific wavelengths can be exploited. The absorption of X-rays by an atom drops sharply at wavelengths just below its characteristic emission wavelength, and this sudden change appears as an absorption edge. Anomalous scattering takes place only when the X-ray wavelength is close to the absorption edge of an element. Light elements like C, N or O have absorption edges at lower frequencies than the radiation used in crystallography. Heavy atoms, however, due to their distinctive electron shell, absorb more energy, and their absorption edges are close to the X-ray wavelengths available from synchrotron sources. This causes a delay in re-emitting the X-rays, which induces a phase shift in all the reflections, known as anomalous dispersion. The total scattering factor of a heavy atom includes three components: the normal scattering factor (f0), which is independent of the wavelength, and the anomalous scattering factors (f’, f’’), which do not depend on the diffraction angle but change with the wavelength.

$$f(\lambda) = f_0 + f'(\lambda) + i f''(\lambda) \qquad \text{(Equation 7)}$$

In the presence of anomalous dispersion, Friedel’s law breaks down: the reflections (h, k, l) and (−h, −k, −l) no longer have the same intensities, and their structure factors are no longer related by equal amplitudes and opposite phases. This difference is called the anomalous difference.

$$I_{h,k,l} \neq I_{-h,-k,-l} \qquad |F_{h,k,l}| \neq |F_{-h,-k,-l}| \qquad \varphi_{h,k,l} \neq -\varphi_{-h,-k,-l} \qquad \text{(Equation 8)}$$
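The breakdown of Friedel's law can be reproduced numerically by giving one atom a complex scattering factor, as in Equation 7. In the sketch below the atomic positions and the values f' = −1 and f'' = 4 for an iron-like scatterer are hypothetical and only serve to show the effect.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """atoms: list of (f, x, y, z); f may be complex (f0 + f' + i f'')."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

def friedel_amplitudes(hkl, atoms):
    """Return |F(h,k,l)| and |F(-h,-k,-l)|."""
    minus = tuple(-i for i in hkl)
    return abs(structure_factor(hkl, atoms)), abs(structure_factor(minus, atoms))

light = [(6.0, 0.10, 0.20, 0.30), (8.0, 0.40, 0.15, 0.75)]
iron_normal = (26.0, 0.25, 0.60, 0.05)                    # f0 only
iron_anomalous = (26.0 - 1.0 + 4.0j, 0.25, 0.60, 0.05)    # hypothetical f' = -1, f'' = 4

print(friedel_amplitudes((1, 2, 3), light + [iron_normal]))
print(friedel_amplitudes((1, 2, 3), light + [iron_anomalous]))
```

Without the imaginary component the two amplitudes of the Friedel pair are identical. With it they differ slightly, and this small anomalous difference is the signal used for phasing.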

Using Argand diagrams, the contributions to the structure factors of the Friedel pairs can be drawn as vectors to construct a Harker diagram (Figure 1. 14). The structure factor of the light atoms (Fp) and the contribution of the heavy atom (Fh) sum up to give the structure factor of the whole structure (Fph).

Figure 1. 14: Under anomalous scattering, Friedel’s law is broken, with Fp+ no longer the mirror image of Fp−. Fph is the anomalous dispersion derivative or the heavy atom derivative.

The crucial information needed for solving the phase is the position of the heavy atom. The anomalous difference can be used in Patterson maps or direct methods to locate the anomalous scatterers. By taking the reverse phase of the Friedel partner (Fph−*), one obtains the two anomalous heavy atom contributions (Fh+, Fh−*) (Figure 1. 14, middle). Circles drawn around both structure factors, together with the amplitudes of the anomalous Friedel pair, lead to a unique solution for the phase (Figure 1. 14, right) [179].

2. Scope of the study

The involvement of cytochromes c in energy conservation and other cellular processes in both prokaryotes and eukaryotes makes them essential for life as we know it. Their special properties arise from the covalently attached heme cofactor, which can transfer electrons by cycling between the ferrous and the ferric state of the iron atom. Although the assembly of the cofactor in the heme binding pocket of the apocytochrome c looks deceptively simple, it requires a series of steps whose mechanisms are still unknown. The fact that inhibition of cytochrome c production can stop the activity of some human pathogens could, in a time of increasing antibiotic resistance, provide a relevant and unique target for future drug development [118]. Besides these clinical implications, cytochrome c proteins are part of the nitrogen cycle in nature, in particular in the activation of small-molecule compounds such as NO and N2O, which are closely investigated for their negative impact on our climate; there is thus a potential value in biotechnological applications [180, 181]. Three-dimensional models of proteins belonging to the cytochrome c biogenesis systems could provide molecular insights into and mechanistic explanations of the different machineries. Even though structures of several components from System I have been published over the years, until recently there was no structural information about any member of the heme-handling membrane protein family with the common WWD motif [74]. The three-dimensional structure of CcmF from System I could provide some ideas for a mechanistic interpretation of cytochrome c maturation [96]. So far, investigations on System II have not revealed much information about this machinery. Previous work on CcsBA from H. hepaticus suggested that the natural fusion protein translocates the reduced heme group into the periplasm through a channel and protects it from oxidation [112]. Studies on CcsB (ResB) from B.
subtilis heterologously produced in E. coli indicate that the protein contains a covalently bound heme, attached via one thioether bond [118]. The scope of this study was to gain a better understanding of the cytochrome c biogenesis type II machinery. The main goal was to structurally characterize the CcsBA complex, the actual cytochrome c maturase in System II, through crystallographic studies. In bacteria where the CcsBA complex is not represented by a natural fusion protein, only CcsA was selected for further characterization, on the basis that it belongs to the heme-handling family. The present study describes the biochemical and spectroscopic properties of CcsBA or CcsA homologs from different bacterial species belonging to various taxonomic groups, together with the crystallization trials carried out to obtain highly ordered crystals. Different detergents and buffers were screened for some of the homologs to initiate protein crystallization or to improve crystal diffraction.

3. Materials and methods

3.1 Materials

3.1.1 Cultivation media

All liquid and solid media used for the growth of E. coli cells were prepared according to Table 3. 1 and autoclaved at 121 °C for approximately 1.5 hours. A phosphate buffer solution at pH 7.2 ± 0.2 (final concentration 17 mM KH2PO4 and 72 mM K2HPO4) was prepared and autoclaved separately before addition to the TB medium.

Table 3. 1: Composition of growth media for E. coli cells cultivation

| Component | LB (Lysogeny Broth) medium | LB – Agar medium | TB (Terrific Broth) medium |
|---|---|---|---|
| Tryptone | 1% (w/v) | 1% (w/v) | 1.2% (w/v) |
| Yeast extract | 0.5% (w/v) | 0.5% (w/v) | 2.4% (w/v) |
| NaCl | 1% (w/v) | 1% (w/v) | |
| Agar | | 1.2% (w/v) | |
| Glycerol | | | 0.4% (w/v) |

3.1.2 Bacterial strains

Table 3. 2: Strains of Escherichia coli used for plasmid or protein production

| Strain | Chromosomal genotype | Phenotype | Source |
|---|---|---|---|
| XL1-Blue | recA1 endA1 gyrA96 thi-1 hsdR17 supE44 relA1 lac [F´ proAB lacIqZΔM15 Tn10 (Tetr)] | Tetracycline resistant | Stratagene |
| XL10-Gold | TetrΔ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac Hte [F´ proAB lacIqZΔM15 Tn10 (Tetr) Amy Camr] | Tetracycline and chloramphenicol resistant | Stratagene |
| BL21-CodonPlus (DE3)-RIPL | F– ompT hsdS(rB– mB–) dcm+ Tetr gal λ(DE3) endA Hte [argU proL Camr] [argU ileY leuW Strep/Specr] | Chloramphenicol and streptomycin resistant | Agilent |
| C43 (DE3) | F– ompT gal dcm hsdSB(rB– mB–)(DE3) | | BioCat GmbH |
| BW25113 | lacI+ rrnBT14 ΔlacZWJ16 hsdR514 ΔaraBADAH33 ΔrhaBADLD78 rph-1 Δ(araB–D)567 Δ(rhaD–B)568 ΔlacZ4787(::rrnB-3) hsdR514 rph-1 | | |

3.1.3 Vectors

Table 3. 3: Expression vectors used in cloning experiments

| Vector | Resistance | Promoter | Affinity Tag | Origin of replication |
|---|---|---|---|---|
| pASK-IBA5plus | Ampr | tet | N-terminus Strep-Tag II | ColE1 |
| pGEX3 | Ampr | tac | N-terminus GST Tag | ColE1 |
| pETSTN (derived from pET21a) | Ampr | T7 | N-terminus Strep-Tag II | pMB-1 |
| pGEXSN (derived from pGEX-3x) | Ampr | tac | N-terminus Strep-Tag II | ColE1 |
| pBAD202OSN (derived from pBAD202-Topo) | Ampr | Arabinose | N-terminus Strep-Tag II | pMB-1 |
| pHGST | Ampr | T7 | C-terminus GST | ColE1 |
| pETTSC | Ampr | T7 | C-terminus Strep-Tag II | ColE1 |
| pvpHA | Kanr | T7 | N-terminus Hisactophilin | ColA |
| pASG-IBA_TSC_gfp (derived from pASG-IBA103) | Ampr | tet | C-terminus GFP and Strep-Tag II | ColE1 |
| pASG-IBA_2S_gfp (derived from pASG-IBA103) | Ampr | tet | N and C-termini Strep-Tag II and C-terminus GFP | ColE1 |
| pEC86 | Chlr, Tetr | tet | - | p15A |

3.2 Molecular biology methods

3.2.1 Restriction enzyme cloning / Gibson assembly / Site-directed mutagenesis

Different ccsBA, ccsA and ccsB gene homologs were cloned into different expression vectors (Table 3. 3) using several approaches. Restriction enzyme cloning is the classical cloning technique. First, the insert (in this case various gene homologs of ccsBA, ccsA and ccsB) and the chosen vectors were amplified via polymerase chain reaction (PCR, see section 3.2.2) with specific primers containing a complementary DNA sequence and specific restriction recognition sites. The PCR products were then loaded on an agarose gel (see section 3.2.4), and DNA bands of the right size were excised and purified according to protocol (see section 3.2.5). After measuring the DNA concentration at 260 nm, the pure DNA was mixed with the recommended buffer and the endonucleases (Thermo Fisher Scientific) corresponding to the inserted restriction sites, and the mixture was incubated for 1 hour at 37 °C. Next, the digested vector and insert were combined in the same reaction tube in a ratio of 1:3 or 1:1, T4 DNA ligase (Thermo Fisher Scientific) and its specific buffer were added, and the reaction was incubated overnight at room temperature. The next day, competent E. coli cells of strain XL10-Gold or XL1-Blue were transformed with the ligation product (see section 3.2.3). Gibson assembly was the preferred method for cloning purposes, starting with the amplification of the desired vector and insert via PCR. The unique primers (>35 bp) were designed with a sequence complementary to the target DNA and an overhang complementary to the DNA to be joined. This method is endonuclease-free, but uses the activities of T5 exonuclease, which degrades the 5’ strand, and of Phusion DNA polymerase and Taq DNA ligase, which repair and seal the newly formed double strand after the overhangs have annealed [182]. The PCR products were first incubated for 1 hour at 37 °C with the restriction enzyme DpnI to digest the methylated DNA template, followed by DNA purification (see section 3.2.5). Then the DNA concentration was determined at 260 nm, and all the components from Table 3. 4 were mixed in a total volume of 40 μl and incubated for 1 hour at 50 °C. Afterwards, 50 μl of chemically competent cells were transformed with 20 μl of the Gibson mixture, spread on LB-agar plates and incubated overnight at 37 °C.
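The amounts of vector and insert for a ligation or Gibson reaction follow from the measured absorbance and the fragment lengths. The sketch below is an illustration under two assumptions that are not stated in the text: the usual conversion of one A260 unit to 50 ng/µl of double-stranded DNA at 1 cm path length, and the reading of the 1:3 and 1:1 ratios as molar vector:insert ratios. All fragment sizes and masses are hypothetical.

```python
def dsdna_concentration(a260, dilution=1.0, path_cm=1.0):
    """Concentration in ng/µl, assuming A260 = 1 corresponds to 50 ng/µl dsDNA."""
    return a260 * 50.0 * dilution / path_cm

def insert_mass(vector_ng, vector_bp, insert_bp, insert_per_vector=3.0):
    """Insert mass (ng) giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector

def volume_needed(mass_ng, conc_ng_per_ul):
    """Volume (µl) of a DNA stock that contains mass_ng."""
    return mass_ng / conc_ng_per_ul

# Hypothetical values: 50 ng of a 5.4 kb vector and a 1.8 kb insert
vector_ng, vector_bp, insert_bp = 50.0, 5400, 1800
insert_conc = dsdna_concentration(a260=0.8)
for ratio in (3.0, 1.0):  # vector:insert of 1:3 and 1:1
    mass = insert_mass(vector_ng, vector_bp, insert_bp, ratio)
    print(f"1:{ratio:.0f} -> {mass:.1f} ng insert = {volume_needed(mass, insert_conc):.2f} µl")
```

Because the insert is one third of the vector length in this example, a 1:3 molar ratio corresponds to equal masses of both fragments.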

Table 3. 4: Gibson assembly reaction components and their final concentration

| Component | Final concentration |
|---|---|
| 5X Buffer IRB (25% PEG 8000, 500 mM Tris/HCl pH 7.5, 50 mM MgCl2, 50 mM DTT, 5 mM NAD) | 1% |
| Phusion DNA polymerase | 1 U |
| T5 exonuclease (NEB) | 0.2 U |
| Taq DNA ligase (NEB) | 160 U |
| dNTPs (Thermo Fisher Scientific) | 10 mM |
| Vector DNA + insert DNA (1:3 or 1:1 ratio) | 2 – 200 ng |

Site-directed mutagenesis was used to introduce alterations such as insertions, deletions or substitutions into plasmid DNA by amplifying the DNA via PCR with specific primers containing the desired modifications. After amplification, the PCR product was digested for 1 hour at 37 °C with the restriction enzyme DpnI to remove the template DNA. Then, chemically competent cells were transformed with the digested mixture, spread on LB-agar plates and incubated overnight at 37 °C.

3.2.2 Polymerase chain reaction (PCR)

Table 3. 5: Composition of a typical PCR reaction

| Component | Final concentration |
|---|---|
| 5X HF or GC Buffer (NEB) | 1X |
| Phusion DNA polymerase | 1 U |
| dNTPs (Thermo Fisher Scientific) | 10 mM |
| DMSO | 5% |
| Primer forward | 100 – 250 pmol |
| Primer reverse | 100 – 250 pmol |
| Template DNA | 2 – 200 ng |

All the components from Table 3. 5 were mixed in a PCR tube to a final volume of 50 μl and placed in the thermal cycler, which ran a program such as the one in Table 3. 6. The PCR product was checked on an agarose gel (see section 3.2.4).


Table 3. 6: Typical PCR program

| Step | Temperature | Time |
|---|---|---|
| Initial denaturation | 98° C | 1 – 5 min |
| Denaturation | 98° C | 30 s |
| Annealing | 45 – 72° C | 30 s |
| Elongation | 72° C | 15 – 30 s per kb |
| Final elongation | 72° C | 7 – 10 min |

The denaturation, annealing and elongation steps were cycled 30 times.
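Since the elongation time scales with amplicon length, the duration of a run can be estimated from the program above. The sketch below is only an illustration: the default values are picked from within the ranges of Table 3. 6, the function name is hypothetical, and the temperature ramping times of the cycler are ignored.

```python
def pcr_runtime_min(amplicon_kb, cycles=30, initial_denat_s=60, denat_s=30,
                    anneal_s=30, elong_s_per_kb=30, final_elong_s=420):
    """Estimated hold time of a PCR program in minutes (ramping ignored)."""
    if amplicon_kb <= 0 or cycles <= 0:
        raise ValueError("amplicon length and cycle number must be positive")
    # One cycle = denaturation + annealing + length-dependent elongation.
    cycle_s = denat_s + anneal_s + elong_s_per_kb * amplicon_kb
    return (initial_denat_s + cycles * cycle_s + final_elong_s) / 60.0


# Hypothetical 2 kb amplicon with the default settings.
print(f"{pcr_runtime_min(2.0):.0f} min")
```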

3.2.3 Transformation of competent cells

Chemically competent transformation protocol: different strains of competent E. coli cells, stored at -80° C, were thawed for 2 minutes on ice and then mixed with 10-100 ng DNA. The mixture was incubated on ice for another 20 minutes, followed by a heat shock at 42° C for 45 seconds and another 2 minutes on ice. Apart from XL 10-Gold cells, all strains of competent cells were then incubated with 500 μl LB media at 37° C for 30 minutes. The transformed cells were plated on LB-agar Petri dishes and incubated overnight at 37° C.

Electrocompetent transformation protocol: competent cells of E. coli BW25113, stored at -80° C, were thawed on ice and then transferred to an electroporation cuvette with a 1 mm gap. 1 μl of plasmid DNA was added to the electrocompetent cells and mixed gently by pipetting up and down. An electric pulse was applied for 2-3 seconds with a voltage of 1.7 kV. The cells were removed and placed into a microcentrifuge tube, 1 ml of LB was added and the suspension was shaken at 37° C for 60 minutes. The transformed cells were then plated on LB-agar Petri dishes and incubated overnight at 37° C.

3.2.4 Agarose gel electrophoresis

The agarose gels were prepared with 1% (v/v) Tris-acetate-EDTA (TAE) buffer and agarose to a final concentration of 1% (w/v). The mixture was microwaved until the agarose had completely dissolved and, after cooling down, the solution was poured into a gel tray with a comb. After the gel had solidified, the DNA samples were mixed with 0.5 μl SYBR Green (1:1000 dilution, Biozym) and loading dye (Thermo Fisher Scientific) and loaded on the gel. In addition, a 1 kb ladder (Thermo Fisher Scientific) was loaded to compare the lengths of the DNA samples. The gel was submerged in 1x TAE buffer in the electrophoresis chamber (BioRad) and the run was performed at 100 V for 50 minutes.

3.2.5 Gel extraction of DNA / PCR product purification

The aim of both methods was the isolation of the amplified target DNA sequence. Both were performed using commercial kits from Thermo Scientific (GeneJET Gel Extraction Kit / GeneJET PCR Purification Kit). For DNA gel extraction, the PCR samples were first analysed on an agarose gel, then the corresponding bands were cut out and weighed, and the protocol from the kit was followed until the clean DNA was eluted. For PCR product purification, a small volume of the sample DNA was first analysed on an agarose gel to check the success of the amplification, then the PCR product was incubated for one hour at 37° C with the restriction enzyme DpnI (Thermo Fisher Scientific) and the protocol from the kit was followed to obtain clean DNA. The DNA concentration of the sample was measured at 260 nm using a nanodrop cuvette in a GeneQuant1300 spectrometer (GE Healthcare).

3.2.6 Isolation of plasmid and sequencing

Selected colonies from an agar plate were picked and incubated overnight at 37° C in LB media with the respective antibiotic. The cell cultures were transferred into 2 ml tubes and centrifuged for 3 minutes at 12 000 x g. The plasmid DNA was then isolated with a plasmid DNA purification kit from Macherey-Nagel. The absorbance of the eluted DNA was measured at 260 nm using a nanodrop cuvette in a GeneQuant1300 spectrometer (GE Healthcare) to check the concentration and the purity of the sample. To analyse the plasmid sequence, 60-100 μl of the sample were sent to GATC Biotech.
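The conversion from absorbance at 260 nm to DNA concentration can be sketched as follows. It relies on the standard factor of 50 ng/μl per absorbance unit for double-stranded DNA at a path length of 1 cm; the path length of the cuvette, the optional dilution factor and the A260/A280 purity check are assumptions for illustration, since the text does not specify how the spectrometer was configured.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard factor for dsDNA, 1 cm path length


def dsdna_conc_ng_per_ul(a260, path_length_cm=1.0, dilution_factor=1.0):
    """dsDNA concentration (ng/ul) from the absorbance at 260 nm."""
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    # Normalise the reading to a 1 cm path before applying the factor.
    return a260 / path_length_cm * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def a260_a280_ratio(a260, a280):
    """Purity indicator; values near 1.8 are generally taken as pure DNA."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


# Hypothetical readings.
print(dsdna_conc_ng_per_ul(0.5), round(a260_a280_ratio(0.9, 0.5), 2))
```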

3.2.7 Knock-out of the ccm operon from E. coli genomic DNA

The eight genes forming the ccm operon, which encodes System I of cytochrome c maturation, were deleted from the genomic DNA of E. coli K-12 BW25113 and replaced by a kanamycin resistance cassette following the procedure developed by Wanner and colleagues [183]. E. coli BW25113 cells were electroporated with purified pKD46 plasmid, which confers ampicillin resistance and is temperature-sensitive (≤ 30° C). The cells were regenerated and the agar plates were incubated at 30° C. The positive clones were selected, cultured at 30° C until OD 0.2 and induced with 10 mM arabinose for expression of the λ phage Red recombinase encoded on pKD46. The cultured cells were then made electrocompetent. PCR products of the kanamycin resistance cassette were generated with primers designed to contain a sequence complementary to the ends of the resistance cassette and a part complementary to the 5’ and 3’ ends of the ccm operon. Subsequently, the PCR products, digested for one hour with DpnI and then purified, were introduced into the previously prepared electrocompetent E. coli cells (containing the λ phage Red recombinase). After cell regeneration, the bacteria were spread on kanamycin agar plates and incubated at 37° C overnight. The resulting colonies were screened by PCR with oligonucleotide primers flanking the kanamycin resistance cassette to check whether the genomic DNA contained the resistance gene. Positive colonies were cultured and made electrocompetent for further use.

3.3 Microbiology methods

3.3.1 E. coli cultivation

Different strains of Escherichia coli (Table 3. 2) were used as expression hosts for all the CcsBA, CcsA and CcsB homologs. One colony of E. coli carrying the target plasmid was transferred into a 300 ml flask containing 100 ml of LB media and ampicillin at a final concentration of 100 μg/ml, and shaken for 6-7 hours at 180 rpm and 37° C. Then, 9 flasks (2 L, with baffles) with 1 L autoclaved LB or TB media each were inoculated with 10 ml of the preculture. Ampicillin was added to a final concentration of 100 μg/ml. For cultures of E. coli strain BL21 (DE3) CodonPlus RIPL, chloramphenicol was also added at a final concentration of 20 μg/ml. The cell suspension was shaken at 180 rpm and 37° C for around 4 hours until the OD600 reached 0.6 – 0.8 for LB media cultures, or 1.5 – 2 for TB media cultures. Protein production was then initiated by adding 1 mM IPTG for T7 expression systems, or 200 μg anhydrotetracycline for tetracycline-controlled gene expression systems. Afterwards the cultures were incubated overnight (around 13-14 hours) at 20° C.
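The final concentrations given above translate into pipetting volumes through the usual dilution relation C1·V1 = C2·V2. The sketch below is purely illustrative: the stock concentrations are hypothetical, as the text does not state which stock solutions were used, and the added volume is assumed to be negligible compared with the culture volume.

```python
def stock_volume_ml(final_conc, culture_volume_ml, stock_conc):
    """Volume of stock (ml) to add; final_conc and stock_conc share one unit."""
    if stock_conc <= 0 or final_conc < 0 or culture_volume_ml < 0:
        raise ValueError("concentrations and volume must be non-negative")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc * culture_volume_ml / stock_conc


# Hypothetical stocks: ampicillin 100 mg/ml (= 100000 ug/ml), IPTG 1 M (= 1000 mM).
print(stock_volume_ml(100, 1000, 100_000))  # ampicillin, 100 ug/ml in 1 L
print(stock_volume_ml(1, 1000, 1000))       # IPTG, 1 mM in 1 L
```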


A different protocol was implemented for producing cytochromes c. A colony of an E. coli BW25113 Δccm strain bearing the selected plasmids was cultured in 100 ml LB media overnight at 37° C, 180 rpm. Kanamycin was added to a final concentration of 50 μg/ml along with ampicillin (final concentration 100 μg/ml) and chloramphenicol (final concentration 20 μg/ml). Nine flasks (1 L, without baffles) containing 500 ml LB media and proper antibiotics were inoculated with 5 ml each of the overnight culture and incubated at 30° C, 180 rpm. After around 5 hours, the cultures reached OD 0.6 and expression was induced with 200 μg anhydrotetracycline. The temperature was kept at 30° C for the whole incubation time, around 20 hours, until the cultures were harvested.

3.3.2 Cell disruption, membrane isolation and solubilisation

The cells were harvested at 5 000 x g at 4° C for 13 minutes (Rotor JA – 8.1000, Avanti J – 26 XP, Beckman Coulter), and the pellet was resuspended at 4° C in 2 ml lysis buffer (Table 3. 7) per 1 g of cells. The cells were disrupted either with a microfluidizer (Microfluidics) in six passes at a pressure of 1000 atm, or with a sonifier, which applies high-frequency ultrasonic energy, for 15 minutes in cycles of 3-second bursts with 7-second cooling intervals (70% amplitude, Branson Sonifier 450 W). The resulting crude extracts were centrifuged at 30 000 x g for 30 minutes at 4° C (Rotor JA – 30.50, Avanti J – 30I, Beckman Coulter) to remove cell debris. The supernatant was then ultracentrifuged at 300 000 x g, 4° C for 1 hour (Rotor 70 TI Optima L – 80 XP Ultracentrifuge, Beckman Coulter) to separate the membranes from the cytosolic fraction. The pellet was resuspended in 8 ml buffer per 1 g of membranes with a potter homogenizer. The suspension was then flash-frozen in liquid nitrogen and stored at -80° C. The stored membranes were thawed at 4° C, solubilized with 1% n-Dodecyl β-D-Maltopyranoside (D12M) for 1 hour and then centrifuged at 100 000 x g, 4° C, for 30 minutes (Rotor JA – 30.50, Avanti J – 30I, Beckman Coulter) to remove the unsolubilized material. The solubilized membranes were afterwards passed through a 0.2 μm filter (Filtropur S 0.2, Sarstedt) and used for subsequent purification.


3.4 Biochemical methods

3.4.1 Affinity purification

For the purification of the CcsA and CcsBA homologs, a Strep-tag II, a peptide specially engineered to bind tightly to a modified streptavidin protein, was attached at the C-terminus of the proteins. A 5 ml Streptactin Superflow high capacity prepacked column (IBA Lifesciences) was connected to an ÄKTA chromatography system (GE Healthcare) and used for protein purification.

Table 3. 7: Composition of the buffers used in protein isolation

| Buffer | Component | Final concentration |
|---|---|---|
| Lysis buffer | Tris/HCl pH 7.5 | 50 mM |
| | NaCl | 500 mM |
| | Glycerol | 5% (v/v) |
| Loading/Wash buffer | Tris/HCl pH 7.5 | 50 mM |
| | NaCl | 500 mM |
| | Glycerol | 5% (v/v) |
| | D12M | 0.03% |
| Elution buffer | Tris/HCl pH 7.5 | 50 mM |
| | NaCl | 500 mM |
| | Glycerol | 5% (v/v) |
| | D12M | 0.03% |
| | D-desthiobiotin | 5 mM |
| Regeneration buffer | Tris/HCl pH 7.5 | 20 mM |
| | NaCl | 700 mM |
| | HABA | 1 mM |
| SEC buffer | Tris/HCl pH 7.5 | 20 mM |
| | NaCl | 150 mM |
| | Glycerol | 5% (v/v) |
| | D12M | 0.03% |


After column equilibration, first with 50 - 100 ml 100 mM Tris base and then with 50 ml loading buffer, the filtered supernatant was loaded on the column at 1 ml/min. Next, unbound proteins were washed off with wash buffer (Table 3. 7), while the tagged proteins were subsequently eluted with elution buffer (Table 3. 7) and collected in 2 ml fractions. A further washing step with 50 – 100 ml of 2 M NaCl was then included before the column was regenerated and stored in regeneration buffer (Table 3. 7). The fractions containing the target protein were pooled and concentrated in VivaSpin concentrators (50 – 100 kDa cut-off) at 4000 rpm (Beckman Coulter Allegra X-15R) to a volume of approximately 1 ml. Finally, the sample was centrifuged at 16 000 x g, 4° C for 10 minutes (Centrifuge 5415 R, Eppendorf).

3.4.2 Detergent exchange

When detergents other than DDM were tested for crystallization, the buffer was exchanged gradually during the washing step on the Strep-Tactin column, after the solubilized membranes had been loaded on the column. The gradient length was 10 times the column volume, with a slow flow of 0.1 ml/min. Afterwards, the purification continued as described in section 3.4.1. The detergent concentration used in the buffers was always three times higher than the specific critical micelle concentration (CMC) of each detergent (Table 3. 8).

Table 3. 8: Types of detergents tested for protein crystallization

| Detergent | Abbreviation | CMC (H2O) [%] | Buffer concentration [%] |
|---|---|---|---|
| n-Decyl β-D-Maltopyranoside | D10M | 0.087 | 0.3 |
| n-Undecyl β-D-Maltopyranoside | D11M | 0.029 | 0.09 |
| n-Dodecyl β-D-Maltopyranoside | D12M or DDM | 0.0087 | 0.03 |
| n-Tridecyl β-D-Maltopyranoside | D13M | 0.0017 | 0.05 |
| Fos-Choline-12 | - | 0.047 | 0.14 |
| Fos-Choline-14 | - | 0.0046 | 0.014 |
| Lauryl Maltose Neopentyl Glycol | LMNG or NG310 | 0.001 | 0.003 |
| Octaethylene Glycol Monododecyl Ether | C12E8 | 0.023 | 0.07 |
| N,N-Dimethyldodecylamine N-oxide | LDAO | 0.023 | 0.07 |
| Octyl β-D-glucopyranoside | OGP | 0.53 | 1.6 |
| HEGA-10 | - | 0.26 | 0.8 |
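The rule of using three times the CMC can be written as a one-line calculation. The following sketch takes a few CMC values from Table 3. 8; the function and variable names are hypothetical, and the results are unrounded, whereas the buffer concentrations listed in the table appear to be rounded.

```python
# CMC values in % taken from Table 3. 8 (a subset, for illustration).
CMC_PERCENT = {
    "D10M": 0.087,
    "D11M": 0.029,
    "DDM": 0.0087,
    "LDAO": 0.023,
    "OGP": 0.53,
}


def buffer_concentration(cmc_percent, factor=3.0):
    """Detergent concentration in the buffer as a multiple of the CMC."""
    if cmc_percent <= 0 or factor <= 0:
        raise ValueError("CMC and factor must be positive")
    return cmc_percent * factor


for name, cmc in CMC_PERCENT.items():
    print(f"{name}: CMC {cmc}% -> {buffer_concentration(cmc):.3g}% in buffer")
```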

3.4.3 Size exclusion chromatography (SEC)

The second step in protein isolation consists of gel filtration chromatography, in which the samples are separated according to differences in their hydrodynamic volume. The purity is thereby increased by removing remaining contaminants, and the oligomeric states of the protein can be monitored. In addition, the buffer can be exchanged during the procedure for one more suitable for crystallization. The protein eluted from the affinity purification was injected at a flow rate of 0.5 ml/min onto a gel filtration column (HiLoad Superdex 200 16/60, GE Healthcare). The column had previously been equilibrated with SEC buffer (Table 3. 7) at a flow of 1 ml/min. The same buffer was used for the elution step. The fractions (2 ml volume) containing pure protein, indicated by the elution peak recorded at 280 nm on the chromatogram, were collected and concentrated. The protein sample was used for further experiments or flash-frozen in liquid nitrogen and stored at -80° C.

3.4.4 Analytical chromatography

Size exclusion chromatography coupled to light scattering, absorbance at 280 nm and refractive index detectors is a technique that allows the determination of the absolute molecular mass of membrane proteins in a detergent solution [184]. The gel permeation column physically separates aggregates and empty detergent micelles from the protein of interest. 100 μl of the protein sample at a concentration of 1 mg/ml was injected automatically using a Viscotek GPCmax VE-2001 system from Malvern onto a Superdex 200 10/300 column (GE Healthcare) with a volume of 24 ml. The column had previously been equilibrated with SEC buffer (Table 3. 7). The recorded data were analysed with OmniSEC software. Because no membrane protein-detergent complexes with known molecular mass are available, BSA diluted in SEC buffer was used for calibration.


3.4.5 SDS-PAGE

The proteins were monitored and analysed by discontinuous SDS-polyacrylamide gel electrophoresis [185, 186]. In this method, the proteins are denatured by the addition of the anionic detergent SDS, which at the same time masks the intrinsic charges of the proteins with a uniform negative charge. Therefore, all the samples in the polyacrylamide gel matrix migrate towards the anode, with the smaller proteins running faster and the bigger ones moving more slowly. This results in a separation by molecular weight, although membrane proteins do not bind the detergent to the same degree, which leads to anomalous migration on SDS-PAGE. After casting the gels, the samples were first mixed with loading dye (Table 3. 9), then loaded together with a molecular weight marker (Thermo Fisher Scientific) for comparison and run for 60 minutes at 40 mA per gel in a miniVE electrophoresis chamber (Amersham Biosciences). The gels were incubated with staining solution (Table 3. 9) to visualise the proteins as bands. After 30 minutes the solution was discarded and destaining solution (Table 3. 9) was added until the background was removed. The procedure was repeated once more, followed by scanning of the gel.

Table 3. 9: Composition of solutions used for SDS-PAGE

| Solution / Buffer | Component | Final concentration |
|---|---|---|
| 5X loading dye | Bromophenol blue | 0.25 M |
| | Glycerol | 20% (v/v) |
| | SDS | 5% (w/v) |
| | Tris/HCl pH 6.8 | 160 mM |
| Stacking gel | Acrylamide/bis-acrylamide (37.5:1) | 5% (v/v) |
| | Tris/HCl pH 6.5 | 125 mM |
| | SDS | 0.1% (w/v) |
| | APS | 0.1% (w/v) |
| | TEMED | 0.1% (w/v) |
| Resolving gel | Acrylamide/bis-acrylamide (37.5:1) | 12.5% (v/v) |
| | Tris/HCl pH 8.8 | 375 mM |
| | SDS | 0.1% (w/v) |
| | APS | 0.1% (w/v) |
| | TEMED | 0.1% (w/v) |
| Running buffer | Tris | 25 mM |
| | Glycine | 192 mM |
| | SDS | 0.1% (w/v) |
| Staining solution | Coomassie Brilliant Blue | 0.1% (w/v) |
| | Ethanol | 20% (v/v) |
| | Acetic acid | 10% (v/v) |
| Destaining solution | Ethanol | 10% (v/v) |

3.4.6 Western Blot

To analyse the protein expression or the protein profile, the proteins on the SDS-PAGE gels were transferred to blotting membranes prior to staining with Coomassie Brilliant Blue. A prestained molecular weight marker (Thermo Fisher Scientific) was used for these gels. The transfer took place in a semi-dry blotter (V20-SDB, Scie-Plas) for 1 hour at 50 mA per blot. The nitrocellulose or polyvinylidene difluoride (PVDF) membrane was first incubated in 100% (v/v) methanol for 5 minutes and then equilibrated in transfer buffer (Table 3. 10) together with the gel. Both were then placed between 4 Whatman filter papers, also soaked in transfer buffer. After the transfer, the membrane was first incubated for 20 minutes in 20 ml blocking buffer (Table 3. 10), then for another 20 minutes with 5 μl of Strep-Tactin AP conjugate (Strep-Tactin labeled with alkaline phosphatase, 1:4000, IBA Lifesciences), then washed twice with 25 ml wash buffer (Table 3. 10) and finally incubated in 10 ml reaction buffer (Table 3. 10) mixed with 37.5 μl NBT solution (Table 3. 10) and 37.5 μl BCIP solution (Table 3. 10). The chromogenic reaction was continued until an optimal signal-to-background ratio was achieved. The colour reaction was stopped by washing several times with distilled water. All the steps after the transfer were performed at room temperature with gentle shaking of the membrane.

Table 3. 10: Composition of solutions and buffers used in Western Blot protocol for Strep-tagged proteins detection

| Solution / Buffer | Component | Final concentration |
|---|---|---|
| Transfer | Tris | 25 mM |
| | Glycine | 192 mM |
| | SDS | 0.1% (v/v) |
| | Methanol | 0.05% (v/v) |
| Blocking | Tris/HCl pH 7.4 | 50 mM |
| | NaCl | 440 mM |
| | Tween 20 | 0.1% (v/v) |
| | BSA | 0.05% (v/v) |
| Washing | Tris/HCl pH 7.4 | 100 mM |
| | NaCl | 440 mM |
| Reaction | Tris/HCl pH 8.8 | 100 mM |
| | NaCl | 100 mM |
| | MgCl2 | 5 mM |
| NBT | Nitro blue tetrazolium | 5% (w/v) |
| | Dimethylformamide | 70% (v/v) |
| BCIP | 5-bromo-4-chloro-3-indolyl phosphate | 5% (w/v) |
| | Dimethylformamide | 100% (v/v) |

3.4.7 Heme staining

Prior to Coomassie Brilliant Blue staining, the SDS-PAGE was incubated in heme staining solution (Table 3. 11) for 30 minutes at room temperature. Then, 300 μl of 30% (v/v) H2O2 was added and incubated for another 10 minutes. When coloured bands appeared the staining solution was discarded and the gel was incubated in fixing solution (Table 3. 11) before scanning [187].

Table 3. 11: Composition of solutions used in the heme staining method

| Solution | Component | Final concentration |
|---|---|---|
| Staining | TMBZ (3,3’,5,5’-tetramethylbenzidine) | 0.1 mM |
| | Methanol | 30% (v/v) |
| | Sodium acetate pH 5.0 | 70% (v/v) |
| Fixing | Isopropanol | 30% (v/v) |
| | Sodium acetate pH 5.0 | 70% (v/v) |


3.4.8 BCA assay

The protein concentration was determined by the bicinchoninic acid assay. The principle of this method is the reduction of Cu2+ to Cu+ by a protein sample in an alkaline medium. The Cu+ then binds to the bicinchoninic acid and forms a complex with a strong absorbance at 562 nm. The measured intensity is therefore proportional to the amount of protein in the sample, which can be calculated by comparison to a standard calibration curve. 1 ml of reagent solution A (Pierce BCA protein assay, Thermo Fisher Scientific), mixed with a solution of 4% CuSO4 (w/v) in a 50:1 ratio, was added to 50 μl of protein sample. The mixture was incubated for 30 minutes at 60° C and then for another 2 minutes on ice before measuring on a spectrophotometer (GeneQuant 1300, GE Healthcare).
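The comparison with a standard calibration curve can be illustrated with a simple linear fit. This is a minimal sketch under stated assumptions: the standard concentrations and absorbance values are invented for illustration, a linear response over the range is assumed, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must differ in concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_a562(a562, slope, intercept, dilution_factor=1.0):
    """Protein concentration read off the calibration line."""
    return (a562 - intercept) / slope * dilution_factor


# Invented standards (mg/ml) and absorbances at 562 nm, for illustration only.
standards = [0.0, 0.25, 0.5, 1.0, 2.0]
absorbances = [0.05, 0.30, 0.55, 1.05, 2.05]
slope, intercept = fit_line(standards, absorbances)
print(f"{concentration_from_a562(0.80, slope, intercept):.2f} mg/ml")
```

In practice the response of the assay flattens at high protein concentrations, so samples should fall within the range covered by the standards.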

3.5 Spectroscopic methods

3.5.1 UV/Vis spectroscopy

The UV/Vis spectra of 50 μl purified protein sample, placed in corresponding cuvettes (50 μl- 1000 μl, 2 mm, Eppendorf), were recorded using a USB4000 spectrophotometer (Ocean Optics). The measurements took place in an anoxic chamber, with 95% nitrogen and 5% hydrogen. Sample reduction was achieved with an excess of sodium dithionite. A few grains of ammonium persulfate were added to the sample to generate oxidized spectra. The results were analysed with Origin software (OriginLab).

3.6 Crystallization methods

3.6.1 Protein crystallization

Crystallization trials for obtaining single crystals suitable for X-ray diffraction experiments were carried out using the sitting-drop vapour diffusion method. Initially, the protein samples were screened against hundreds of conditions that are statistically the most successful in producing crystals, mixed automatically by a liquid handling system (Rigaku Alchemist II). 50 μl of these solutions, containing salts or organic compounds, were pipetted into 96-well MRC plates with two wells (Molecular Dimensions), and a high-throughput robot, OryxNano (Douglas Instruments), then dispensed the protein and mixed it with the reservoir solution in different ratios in a 0.6 μl drop. Owing to its precision, the robot was also used for matrix microseeding, by pipetting 0.1 μl of a microseed solution into the drop. The microseed solution was prepared from crushed crystals by vortexing them with a small polytetrafluoroethylene bead (Hampton Research). In order to obtain crystals of sufficient quality for X-ray data collection, fine screens were prepared by varying the concentrations and the pH of the condition in small steps. For additive and detergent screens (Hampton Research), two different procedures were used. In the first, the crystallization sitting drop was set up manually together with a small amount of additive or detergent solution, generally 0.1 μl. In the second, 5 μl of the additive or detergent solution were mixed with 45 μl of one preferred condition in the plate reservoir, and the drop was then dispensed and mixed automatically by the Oryx robot mentioned above. The plates were incubated in rooms with constant temperatures of 8 or 20° C. The progress of crystallization was monitored by checking the plates under a microscope (ZEISS SteREO Discovery.V20/Leica M165C), and pictures could be taken with or without polarized light with a digital camera (Canon EOS 600D). The obtained crystals were harvested with a nylon loop, soaked in a cryoprotectant solution if necessary, then flash-frozen and stored in liquid nitrogen.

3.6.2 Data collection

Diffraction data were collected at 100 K on beamlines X06SA and X06DA of the Swiss Light Source (Villigen, Switzerland) using PILATUS 6M and PILATUS 2M pixel detectors (Dectris), respectively.


4. Results

4.1 Characterization of Helicobacter hepaticus CcsBA

The naturally fused ccsBA gene was amplified from the genomic DNA of Helicobacter hepaticus (DSMZ) and subsequently cloned into different expression vectors, illustrated in Figure 4. 1.


Figure 4. 1: Vector maps containing the H. hepaticus ccsBA gene.

The ccsBA gene cloned in vectors A - H depicted in Figure 4. 1 could not be heterologously expressed in different strains of E. coli. Overexpression of the target gene was finally achieved when it was inserted in a construct with a tet promoter, a gene encoding GFP fused to the 3’ end of ccsBA, and Strep-tags attached at both ends of the gene (Figure 4. 1, I). The recombinant protein was produced in E. coli BL21-CodonPlus (DE3)-RIPL cells. The gfp gene was exchanged for other fusion genes such as trxA, encoding thioredoxin from E. coli (Figure 4. 1, K), or mistic from Bacillus subtilis (Figure 4. 1, L). The Strep-tag at the 5’ end of ccsBA was removed in these two new constructs. ccsBA could also be overexpressed without any fused helper genes (Figure 4. 1, J). The target protein was obtained in small yields (Figure 4. 2, left). Different oligomeric states of HhCcsBA are observed in the purification chromatogram of the protein fused with mistic. The elution volume shifts depending on the size of the different combinations of HhCcsBA and the helper proteins. Detergents such as LDAO (data not available), D11M, DDM or Fos-choline-14 were tested for HhCcsBA fused with thioredoxin (Figure 4. 2, right). The gel filtration chromatogram obtained with a buffer system containing Fos-choline-14, a lipid-like detergent, indicates protein polydispersity. In addition, in affinity purifications where DDM was exchanged for D10M, OGP or HEGA-10, the protein was eluted in very low yields.

Figure 4. 2: Left: Size exclusion chromatograms of H. hepaticus CcsBA fused with different helper proteins. Right: Size exclusion chromatograms of H. hepaticus CcsBA fused with thioredoxin purified with different detergents.

The SDS-PAGE pictures (Figure 4. 3) of HhCcsBA fused with GFP as purified (lanes 1, 3, 7) and reduced with sodium dithionite (lanes 2, 4) show the full-length protein (HhCcsBA + GFP) at around 130 kDa, which disappears upon addition of the reducing agent. The theoretical molecular weight of HhCcsBA was calculated as 106 kDa, while GFP has 27 kDa. Two close bands at around 70 kDa shift their intensity from lane 1 to lane 2. They represent the C-terminus of HhCcsB with HhCcsA fused to GFP and tagged with the Strep-tag. The upper band could contain this fragment of the fusion protein completely linearized, with no remaining tertiary structure, while the lower band might be the same fragment of the fusion protein, but with intermolecular disulfide bonds still formed between the existing cysteine residues (1.1% of the total amino acids). Therefore, the protein as purified contains a mixture of two populations in different conformations. The remaining part of HhCcsB (N-terminus) could be the polypeptide at ~35 kDa. The second SDS-PAGE (Figure 4. 3), also scanned under UV light, shows that the reducing environment quenches the fluorescence of the GFP chromophore (lane 6). It also confirms that the upper band close to 70 kDa is the reduced form of the protein, with no visible band on the gel scanned under UV light (lane 5). Upon boiling the protein sample at 95° C for 5 minutes (lane 8), the lower polypeptide around 70 kDa disappears completely, suggesting that the protein corresponding to this band was previously not completely denatured. The SDS-PAGE profiles of the other fusion combinations, HhCcsBA fused with mistic (lane 9), HhCcsBA fused with thioredoxin (lane 10) and HhCcsBA without any fusion partner (lane 11), always display the same cleavage event in the periplasmic domain of HhCcsB, which splits the proteins into two parts (66 kDa and 35 kDa), alongside the full-length protein at around 130 kDa. No major differences between protein sizes are noticeable because mistic and thioredoxin have molecular masses of 13 kDa and 12 kDa, respectively.

Figure 4. 3: SDS-PAGEs of H. hepaticus HhCcsBA 1. HhCcsBA + GFP as purified 2. HhCcsBA + GFP reduced with sodium dithionite 3. HhCcsBA + GFP as purified 4. HhCcsBA + GFP reduced with sodium dithionite 5. HhCcsBA + GFP as purified 6. HhCcsBA + GFP reduced with sodium dithionite 7. HhCcsBA + GFP as purified 8. HhCcsBA + GFP boiled sample 9. HhCcsBA + mistic as purified 10. HhCcsBA + thioredoxin as purified 11. HhCcsBA as purified. The samples as purified were loaded on the gel after the SEC purification, with no boiling or reducing.

The recombinant HhCcsBA fused with GFP or thioredoxin shows absorption spectra typical of oxidized b-type cytochromes (Figure 4. 4). The proteins as purified exhibit a Soret maximum at 413 nm, which upon reduction with an excess of sodium dithionite shifts to 425 nm, with pronounced α (559 nm) and β (530 nm) absorption peaks. Upon oxidation of HhCcsBA fused with GFP with ammonium persulfate, the Soret maximum shifts to 411 nm, while in the spectra of the protein fused with thioredoxin the Soret band remains at the same wavelength. The absorption peak of the GFP variant appears at 489 nm.

Figure 4. 4: UV/Vis spectra of HhCcsBA + GFP and HhCcsBA + thioredoxin as purified (black line), reduced with sodium dithionite (purple line) and oxidized with ammonium persulfate (red line).

4.2 Characterization of different bacterial homologs of CcsA

4.2.1 Sequence identity between the CcsA homologs

Only a few bacteria carry ccsA and ccsB fused into one single open reading frame in their genomic DNA. On the bacterial chromosome of most organisms, ccsA and ccsB are separate but neighbouring genes. Even though CcsA and CcsB work as a complex and are both needed for cytochrome c production, only CcsA, harboring the highly conserved WWD domain, was chosen for further characterization. Several homologs from organisms belonging to different taxonomic groups are described in this chapter. Aquifex aeolicus AaCcsA (36 kDa), Bacillus megaterium BmCcsA (44 kDa), Bacteroides thetaiotaomicron BtCcsA (30 kDa), Geobacter sulfurreducens GsCcsA (26 kDa), Hydrogenobacter thermophilus HtCcsA (35 kDa), Micrococcus luteus MlCcsA (39 kDa) and Symbiobacterium thermophilum StCcsA (48 kDa) are the homologs selected for the intended experiments. The percentage of identical residues between these CcsA homologs is low, between roughly 15% and 28%, except between AaCcsA and HtCcsA, which share 61.4% identity because both organisms belong to the same family, Aquificaceae (Table 4. 1).


Table 4. 1: Percent identity between the CcsA homologs characterized in this chapter (determined with Geneious)

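| | AaCcsA | BmCcsA | BtCcsA | GsCcsA | HtCcsA | MlCcsA | StCcsA |
|---|---|---|---|---|---|---|---|
| AaCcsA | | 22 | 21.4 | 22.4 | 61.4 | 28.2 | 23.4 |
| BmCcsA | 22 | | 16.2 | 17.7 | 21.8 | 18.6 | 26.6 |
| BtCcsA | 21.4 | 16.2 | | 24.4 | 23.4 | 18.9 | 15.2 |
| GsCcsA | 22.4 | 17.7 | 24.4 | | 27.8 | 24.5 | 20.6 |
| HtCcsA | 61.4 | 21.8 | 23.4 | 27.8 | | 28.4 | 22.3 |
| MlCcsA | 28.2 | 18.6 | 18.9 | 24.5 | 28.4 | | 22.7 |
| StCcsA | 23.4 | 26.6 | 15.2 | 20.6 | 22.3 | 22.7 | |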
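To make the quantity in Table 4. 1 concrete, percent identity between two sequences that are already aligned can be computed as sketched below. This is an illustration only: the values in the table were determined with Geneious, whose exact definition (for example, how gap columns are counted) may differ from the one assumed here, and the example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over all aligned columns.

    Columns in which both sequences have a gap are skipped; columns with a
    gap in only one sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Invented toy alignment: 4 identical residues in 6 compared columns.
print(f"{percent_identity('WWD-GH', 'WWDAGK'):.1f}%")
```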

4.2.2 Characterization of Aquifex aeolicus CcsA

The ccsA gene was amplified via PCR from the genomic DNA of Aquifex aeolicus and then cloned into the pASG-IBA_2S_gfp and pETTSC vectors, which contain different promoters, tet and T7, respectively. The gene was expressed in E. coli BL21-CodonPlus (DE3)-RIPL cells. The SEC chromatograms of AaCcsA and of AaCcsA fused with GFP at the C-terminus show an elution difference of 10 ml (Figure 4. 5, left). The SDS-PAGE (Figure 4. 5, right) of AaCcsA with GFP shows at least five bands of different sizes. The theoretical molecular weights are 36 kDa for AaCcsA and 27 kDa for GFP. However, GFP dimerization is resistant to SDS and can be visualised on the SDS-PAGE gels. Therefore, the band at ~100 kDa could be the full-length protein (AaCcsA + GFP) in a completely unfolded state. The band at ~70 kDa also represents the full-length protein with a GFP dimer, but not completely denatured. This band is fluorescent when exposed to UV light (picture not shown), indicating that the chromophore is still packed within the core of the GFP β-barrel structure. Upon boiling at 95° C for 5 minutes, the band at ~70 kDa almost disappears because the polypeptide unfolds and shifts to ~100 kDa. The ~130 kDa and ~250 kDa bands indicate the formation of higher oligomers, while the lower band at ~35 kDa, which also vanishes after boiling, might be free GFP. AaCcsA expressed without GFP is represented on the gel (Figure 4. 5, right) by a band slightly higher than 25 kDa and a faint band at ~45 kDa, perhaps due to dimerization of the protein. Boiling the sample causes no visible changes, suggesting that the protein is completely denatured with or without heating.

Figure 4. 5: Left: SEC chromatograms of CcsA + GFP and CcsA from A. aeolicus. Right: SDS-PAGEs of CcsA + GFP and CcsA from A. aeolicus. The star lanes represent the samples boiled at 95°C for 5 minutes.

The UV/Vis spectra of the protein confirm the presence of a b-type heme group (Figure 4. 6). The absorbance spectra vary with the different states of the iron atom. The protein as purified indicates an oxidized state of the cofactor. Upon reduction with sodium dithionite, α and β bands are visible at wavelengths of 559 nm and 528 nm respectively, while the Soret peak shifts its absorbance maximum from 413 nm to 425 nm. The absorption peak of the GFP variant has a wavelength of 489 nm.

Figure 4. 6: UV/Vis spectra of CcsA + GFP from A. aeolicus as purified (black line) and reduced with sodium dithionite (purple line).


4.2.3 Isolation of different bacterial homologs of CcsA

Several gene homologs of ccsA were amplified from genomic DNA samples and cloned into pETTSC vectors. The genes were expressed in E. coli BL21-CodonPlus (DE3)-RIPL cells. All the homologs were successfully isolated with high stability, but most of them showed polydisperse profiles (Figure 4. 7). In the chromatograms of BmCcsA (Figure 4. 7, A) or GsCcsA (Figure 4. 7, C), the elution profiles reveal a mixture of protein populations in distinct oligomeric states. The most stable homologs isolated in their monomeric forms are BtCcsA, HtCcsA and StCcsA. Of these, the thermophilic homologs HtCcsA and StCcsA were produced in high yields. The SEC elution profile of MlCcsA shows a small aggregated protein peak in the void volume of the column and a small shoulder eluting prior to the main peak, which corresponds to the target protein (Figure 4. 7, E). The colour of the homologs ranges from bright red for StCcsA (Figure 4. 7, F) to light yellow for HtCcsA (Figure 4. 7, D), even though the comparison was made at similar protein concentrations.

All the isolated homologs were subjected to SDS-PAGE analysis (Figure 4. 7). BmCcsA, with a calculated molecular mass of 44 kDa, is represented by a polypeptide at ~45 kDa and a lower impurity. Even though the gel filtration profile shows high polydispersity for this homolog, the oligomers dissociate on the SDS-PAGE (Figure 4. 7, A). BtCcsA has a close double band at ~30 kDa, the same size as the theoretical molecular weight. The double band could imply different folded states of the protein (Figure 4. 7, B). The homolog GsCcsA tends to form high oligomers, which do not dissociate upon addition of SDS. The theoretical molecular mass is 26 kDa, which corresponds on the gel to a band below the 25 kDa marker (Figure 4. 7, C). HtCcsA is represented by a double band at ~30 kDa, while its theoretical size is 35 kDa. The faint upper band at ~60 kDa could be a dimer of the protein (Figure 4. 7, D). The shoulder preceding the elution peak in the chromatogram of the MlCcsA purification is represented on the gel as a faint upper band at ~45 kDa. The main polypeptide at ~35 kDa is the target protein, which has a calculated molecular mass of 39 kDa. Another small contamination, eluting at ~70 ml on the chromatogram, is shown on the SDS-PAGE by a faint band at 18.4 kDa (Figure 4. 7, E). Even though StCcsA elutes in a single peak on the SEC chromatogram, with no other visible peaks or shoulders, the gel reveals a small contamination at ~25 kDa. The theoretical mass of the protein is 48 kDa, while the gel profile shows a polypeptide below 35 kDa (Figure 4. 7, F).
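The comparison between calculated and apparent masses described above can be tabulated in a few lines. The following is only an illustrative sketch: the numbers are the approximate values quoted in the text, the theoretical mass of BtCcsA is taken as ~30 kDa because the text describes it as equal to the band position, and bands described only as running below a marker are entered as upper limits. The names `HOMOLOGS` and `apparent_to_theoretical` are introduced here for illustration.

```python
# Theoretical masses (from sequence) and apparent SDS-PAGE masses in kDa,
# as quoted in the text. upper_bound=True marks bands that are described
# only as running below a marker, so the apparent value is an upper limit.
HOMOLOGS = [
    ("BmCcsA", 44, 45, False),
    ("BtCcsA", 30, 30, False),  # theoretical mass assumed ~30 kDa (see lead-in)
    ("GsCcsA", 26, 25, True),
    ("HtCcsA", 35, 30, False),
    ("MlCcsA", 39, 35, False),
    ("StCcsA", 48, 35, True),
]


def apparent_to_theoretical(rows):
    """Return the ratio apparent/theoretical mass for each homolog."""
    return {name: apparent / theoretical
            for name, theoretical, apparent, _ in rows}


ratios = apparent_to_theoretical(HOMOLOGS)

for name, theoretical, apparent, upper_bound in HOMOLOGS:
    prefix = "<" if upper_bound else "~"
    print(f"{name}: theoretical {theoretical} kDa, "
          f"apparent {prefix}{apparent} kDa, ratio {ratios[name]:.2f}")
```

With these values, StCcsA shows the largest deviation between calculated and apparent mass, in line with the description above.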

Figure 4. 7: SEC chromatograms and SDS-PAGEs of different CcsA homologs.

4.3 Characterization of Bacteroides thetaiotaomicron CcsBA

4.3.1 Production and isolation of BtCcsBA

The gene encoding the natural fusion protein BtCcsBA was amplified via PCR from the genomic DNA of B. thetaiotaomicron and subsequently cloned into the pETTSC vector. The gene was heterologously expressed in E. coli BL21-CodonPlus (DE3)-RIPL or C43 (DE3) strains. The SEC chromatogram (Figure 4. 8, left), monitored at 280 nm, shows that BtCcsBA eluted in a single symmetrical peak. The protein profile on SDS-PAGE presents two major polypeptides and a faint band at ~90 kDa, considered to be the full-length protein, equivalent to the calculated theoretical molecular mass of 91 kDa. The band at ~55 kDa is thought to be the C-terminus of CcsB together with CcsA, fused with a Strep-tag at the C-terminus. This interpretation is supported by the detection of a strong signal corresponding to this band on a Western Blot with Strep-Tactin AP-conjugates. Because the lower 25 kDa band is not detected on the blot, it likely corresponds to the N-terminus of the CcsB protein, which is tag-free (Figure 4. 8, right).

Figure 4. 8: Left: SEC chromatogram of BtCcsBA. Right: SDS-PAGE and Western Blot (for detection of Strep-tagged proteins) of BtCcsBA.

4.3.2 Analytical chromatography of BtCcsBA

BtCcsBA, first purified by gel filtration chromatography in a detergent buffer (20 mM Tris pH 7.5, 150 mM NaCl, 0.03% DDM) and concentrated to 1 mg/ml, was analysed by SEC-LS. The chromatogram shows a main peak registered by all three detectors at 11.3 ml, corresponding to a BtCcsBA-DDM complex. The molecular mass of BtCcsBA in this peak was determined to be 94.5 kDa using the three-detector method [184]. This value corresponds to the monomer of the protein, which has a theoretical molecular mass of 91 kDa as calculated from the amino acid sequence. The refractive index (green line), right angle light scattering (red line) and UV (purple line) detectors register a large second peak at 13.9 ml, caused by an excess of detergent micelles (Figure 4. 9).
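The assignment of the main peak to a monomer follows from simple arithmetic, sketched below. This is only an illustration of the reasoning, not the three-detector analysis itself; the function name `nearest_oligomer` is introduced here for illustration, and the two masses are the values quoted above.

```python
def nearest_oligomer(measured_kda, monomer_kda):
    """Return the closest integer oligomeric state and the relative
    deviation (in percent) of the measured mass from that state."""
    n = max(1, round(measured_kda / monomer_kda))
    expected = n * monomer_kda
    deviation_percent = (measured_kda - expected) / expected * 100
    return n, deviation_percent


# Values from the text: 94.5 kDa measured by SEC-LS, 91 kDa from sequence.
state, deviation = nearest_oligomer(94.5, 91.0)
print(f"oligomeric state: {state}, deviation: {deviation:+.1f} %")
```

The measured mass lies within a few percent of the calculated monomer mass and far from the 182 kDa expected for a dimer.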

Figure 4. 9: SEC-LS chromatogram of BtCcsBA in SEC buffer (Table 3. 7). The chromatogram shows the readings of the refractive index in green line, right angle light scattering in red line and the absorbance at 280 nm in purple. The black lines represent the calculated molecular masses of the respective elution probes.

4.3.3 Detergent or buffer exchange

Finding the most suitable buffer and detergent for a specific membrane protein is a crucial step in its crystallization. Other buffer compositions and detergents were therefore screened during protein purification in order to enable or improve protein crystallization. Besides the usual buffer system for SEC purification (0.02 M Tris/HCl pH 7.5, 150 mM NaCl, 5% Glycerol, 0.03% DDM), the following buffers were tested:

1. 0.1 M HEPES/NaOH pH 7.2, 150 mM NaCl, 0.03% DDM
2. 0.1 M Sodium Phosphate buffer pH 7, 150 mM NaCl, 0.03% DDM
3. 0.02 M Sodium Phosphate dibasic/Citric acid pH 7, 150 mM NaCl, 0.03% DDM
4. 0.1 M Tri-Sodium Citrate/Citric acid pH 6.2, 150 mM NaCl, 0.03% DDM

The new buffers were used from protein solubilisation until the last step of the purification. The protein purification profiles with different buffers (Figure 4. 10, left) did not indicate differences in protein dispersity or stability, although no aggregation peak was visible in the void volume of the column when the HEPES/NaOH pH 7.2 buffer was employed.

The target protein purified with buffers composed of Na2HPO4/citric acid at pH 7 or trisodium citrate/citric acid at pH 6.2 was highly unstable. A similar result was observed upon detergent exchange with HEGA-10 or Fos-choline-12.

The SEC profiles of the protein purified in maltoside detergents show single symmetrical elution peaks corresponding to the monomeric form of BtCcsBA, except for the protein complexed with D10M (purple), where the peak is tailed. The purification with LMNG (wine colour) shows an aggregate peak in front of an early-eluting peak of the target protein, which can only be a multimer, possibly a tetramer. The purification with Fos-choline-14 (red) leads to high polydispersity of the enzyme, indicating different oligomerization states (Figure 4. 10, right).

Figure 4. 10: SEC chromatograms of BtCcsBA purified in different buffers (left) or detergents (right).

4.3.4 UV/Vis absorption spectra of BtCcsBA

Figure 4. 11: UV/Vis spectra of CcsBA from B. thetaiotaomicron as purified (black line) and reduced with sodium dithionite (purple line).

A visible absorption spectrum of BtCcsBA was recorded to examine the heme group. The protein as purified exhibits a spectrum with a Soret maximum at 412 nm, indicating that the b-type heme is oxidized. Upon reduction with an excess of sodium dithionite, the Soret peak shifts to 425 nm, while α and β bands can be observed at 559 nm and 530 nm, respectively (Figure 4. 11).

4.3.5 Crystallization of BtCcsBA

Crystallization trials of the SEC-purified and concentrated BtCcsBA protein were carried out with all initial sparse matrix screens available in the laboratory, using the sitting drop vapour diffusion method. The plates were incubated at 20 °C. The protein initially yielded small crystals in condition B11 of the Footprint screen (Figure 4. 12, A), with 1.2 M trisodium citrate and 0.01 M sodium tetraborate at pH 8.5. Over time, further initial screen conditions supported crystallization; all are shown in Figure 4. 12. Crystal growth took between four and ten days.

A. Footprint B11: 1.2 M Trisodium citrate, 0.01 M Sodium tetraborate pH 8.5
B. Footprint F11: 2% PEG 2000, 1.8 M NaH2PO4/K2HPO4
C. Index C5: 60% Tacsimate pH 7
D. MemGold A10: 1.2 M Trisodium citrate, 0.01 M Tris/HCl pH 8
E. MemGold A2: 1.1 M Trisodium citrate, 0.066 M HEPES/NaOH pH 7.5
F. Index B8: 1.4 M Trisodium citrate, 0.01 M HEPES/NaOH pH 7.5

Figure 4. 12: All the initial screen crystal hits. The incubation temperature was 20 °C. The protein was purified in 0.02 M Tris/HCl pH 7.5, 0.15 M NaCl, 0.03% DDM (SEC buffer). The protein concentration ranged between 8 and 10 mg/ml. The majority of conditions contain high salt concentrations, especially trisodium citrate.

Optimization of the initial hit conditions and screening for new conditions began with microseeding, which was applied in all further crystallization attempts, but no improvement was observed. Fine-gradient screens of the original hit conditions were set up, but the crystals could neither be reproduced as in the initial screen plates nor further optimized. The crystals obtained were too small, too condensed and badly shaped, with rounded edges (Figure 4. 13, A, B, C, F). Crystallization plates were also set up manually with solutions from the hit conditions combined with small amounts of different additives or supplementary detergents. However, the additives did not improve the crystal quality (Figure 4. 13, D, E). Lowering the incubation temperature of the crystallization plates to 8 °C made no difference.

A. 1.3 M Trisodium citrate, 0.066 M HEPES/NaOH pH 7.5
B. 1.2 M Trisodium citrate, 0.066 M HEPES/NaOH pH 6.6
C. 1.3 M Trisodium citrate, 0.01 M Tris/HCl pH 8
D. 1.1 M Trisodium citrate, 0.066 M HEPES/NaOH pH 7.5, 0.1 M Sodium malonate
E. 1.1 M Trisodium citrate, 0.066 M HEPES/NaOH pH 7.5, 0.5% (w/v) Polyvinylpyrrolidone K15
F. 1.5 M Trisodium citrate, 0.1 M HEPES pH 6.6

Figure 4. 13: Fine-screen and additive screen crystal hits. The upper pictures (A, B, C) were photographed under polarized light. Pictures D and E show the crystals obtained from an additive screen, while the others show crystals obtained from fine screens.

The best diffraction obtained from BtCcsBA crystals at the Swiss Light Source (Villigen, Switzerland) extended to a resolution of 13 Å (Figure 4. 14), from a crystal grown in condition A2 of the MemGold initial screen. Crystals obtained in trisodium citrate solutions diffracted to higher resolution and gave more observable reflections on the detector. The calculated unit cell dimensions are a = 203.92 Å, b = 203.92 Å, c = 249.72 Å, α = 90°, β = 90°, γ = 120°. The space group is P6₃22.

Figure 4. 14: Left: Picture of two crystals mounted in a nylon loop taken at Swiss Light Source (Villigen, Switzerland). Right: Diffraction pattern of the left thin crystal.

To improve crystal quality, different buffers and detergents were tested in the purification setup (see chapter 4.3.3). Exchanging the buffer to 0.1 M sodium phosphate pH 7, 150 mM NaCl, 0.03% DDM resulted in the formation of two large crystals in the A2 and A10 conditions of MemGold (Figure 4. 15). However, they diffracted X-rays only to 20 Å resolution.

MemGold A2: 1.1 M Trisodium citrate, 0.066 M HEPES/NaOH pH 7.5
MemGold A10: 1.2 M Trisodium citrate, 0.01 M Tris/HCl pH 8

Figure 4. 15: Crystal hits in MemGold initial screen using BtCcsBA protein purified in a different buffer: 0.1 M Sodium Phosphate buffer pH 7, 150 mM NaCl, 0.03% DDM.

4.3.6 Cytochrome c production by the BtCcsBA synthetase

In order to test the activity of the recombinant BtCcsBA in the assembly of cytochromes c, a system was engineered in which the endogenous ccm operon was genetically removed and the BtCcsBA synthetase was introduced. A derivative of E. coli K-12 strain BW25113 was constructed by replacing its native ccm operon (encoding System I of cytochrome c maturation) on the genomic DNA with a kanamycin resistance gene, following the procedure of Wanner and colleagues [183]. The exchange was confirmed by PCR (picture not shown). Further, a reporter cytochrome c gene, macA from Geobacter sulfurreducens, which encodes a di-heme peroxidase [188], was cloned into a pASGIBA plasmid with ampicillin resistance, a Strep-tag at the C-terminus and the tet promoter. The construct was introduced into the E. coli BW25113 Δccm strain. The resulting strain bearing the reporter cytochrome c plasmid was then transformed with a second vector: either pEC86 bearing the E. coli ccm operon or pEC86 containing BtCcsBA. The pEC86 vector confers chloramphenicol resistance, and the target gene is also under the control of the tet promoter. After co-production of the proteins, MacA was isolated by affinity chromatography on a Strep-Tactin column. The yield of MacA produced by the EcCcm system was around 10 times higher than the yield produced by BtCcsBA. The discrepancy could already be seen in the purification chromatogram and in the colour difference between the two samples concentrated to the same volume (Figure 4. 16, right), and was confirmed by BCA assay measurements: around 5 mg of MacA was obtained with the Ccm system proteins, while less than 0.5 mg of the same enzyme was produced with BtCcsBA. The samples were also analysed by heme staining of an SDS-PAGE gel and by Western Blot (Figure 4. 16, left). The protein amounts can be compared because the same volumes were loaded on the gel in both cases and the cytochromes c retain their hemes quantitatively. The same volume of a MacA sample produced in E. coli C41 pEC86::ccm cells (Figure 4. 16, left, lane 7) was applied to the gel as a positive control. The detection of Strep-tagged proteins by Western Blot is highly sensitive; therefore, the band corresponding to MacA produced by BtCcsBA is stronger on the blot. In conclusion, BtCcsBA was able to produce cytochromes c in vivo in a bacterium where they are natively assembled by System I. However, the protein yield was very low, possibly because of the lack of a reducing system for the heme binding motif on the apocytochrome c.

Figure 4. 16: Left: Heme staining of SDS-PAGE together with the Western Blot corresponding to the same gel. Lanes 1-3 are: flow-through, wash step and elution peak samples from the affinity purification of MacA produced by BtCcsBA in E. coli BW25113 Δccm strain. Lanes 4-6 are: flow-through, wash step and elution peak samples from the affinity purification of MacA produced by the Ccm system in E. coli BW25113 Δccm strain. Lane 7 represents a positive control of MacA produced by the Ccm system in E. coli C41 pEC86::ccm. Lane 8 represents a prestained protein standard (Thermo Fisher Scientific PageRuler Prestained Protein Ladder). MacA (36 kDa) bands are highlighted in red squares. Right: MacA samples produced by Ccm proteins (intense red colour) and BtCcsBA (light yellow colour).

5. Discussions

5.1 Characterization of H. hepaticus CcsBA

Generally, one of the main impediments to structural studies of membrane proteins is the difficulty of heterologous production of the target enzyme in large amounts. The inclusion of the green fluorescent protein (GFP) tag at the C-terminus of HhCcsBA was first implemented as an approach to enable expression of the target gene. In previous attempts, glutathione S-transferase (GST, 26 kDa) or the hisactophilin tag (17 kDa) were fused at the N-terminus of HhCcsBA, but the protein could not be produced. Different promoters for controlling gene expression were also tested; only the tetracycline (tet) inducible promoter was able to drive production of the protein, in modest yields. The use of the E. coli strain BL21-CodonPlus (DE3)-RIPL as an expression system (it includes plasmids with extra copies of genes encoding tRNAs that are rare in E. coli) might have helped with the gene overexpression. These results do not agree with the literature, where the same homolog, HhCcsBA, was produced in high yields in E. coli C43(DE3) using a system with the tac promoter (pGEX-4T-1 vector) and a GST tag at the N-terminus [112]. Attempts to remove GFP together with the Strep-tag from the C-terminus of HhCcsBA with TEV or PreScission proteases resulted in aggregation of the protein (Figure 5. 1). Even though small amounts of HhCcsBA could be produced without any helper protein (except the short sequence of the Strep-tag together with the TEV cleavage site), the protein showed increased structural instability and a tendency to aggregate. It was concluded that the HhCcsBA C-terminus might be an intrinsically disordered region which, without any fused peptide, destabilizes the protein conformation. A promising helper protein fused with HhCcsBA was the Mistic membrane protein from B. subtilis. It has been reported that Mistic is independent of the translocon machinery and inserts spontaneously into the membrane, and it was therefore suggested that production levels of foreign membrane proteins are increased by fusion with Mistic [189]. The fusion of Mistic at the N-terminus of HhCcsBA did not give any expression of the target gene. When Mistic was moved to the C-terminus, the protein could be produced, but with no higher yields than the other variants and with a highly polydisperse profile on the SEC chromatogram (Figure 4. 2).

Figure 5. 1: SEC chromatograms of HhCcsBA_GFP after digestion with TEV and PreScission proteases.

A similar approach was to fuse an endogenous, soluble thioredoxin protein from E. coli to the C-terminus of HhCcsBA. Although the purification resulted in a homogeneous and monodisperse protein (Figure 4. 2), the yield was not improved. In theory, besides enhancing the overexpression levels, the fusion partner could also aid protein crystallization by promoting the intermolecular contacts required for crystal formation through polar surfaces. Unfortunately, in this case no crystals could be obtained for HhCcsBA with any of the helper proteins.

5.2 Characterization of CcsA homologs

Ccs homologs from representative species of almost all the major taxonomic bacterial groups with System II (see section 12.4) were heterologously produced and screened for crystallization. Because in most organisms the CcsBA complex is encoded by individual genes on the chromosome, only CcsA was selected for further characterization. All the purified Ccs proteins were analysed by SDS-PAGE. In the case of CcsA orthologs, the gels usually show the target protein as a single band or, in some cases, two close bands, indicating different folding states (Figure 4. 7). Sometimes faint impurities or weak dimer bands were also detected. Even though CcsA homologs are relatively small, around 25-50 kDa, crystallization attempts were unsuccessful, possibly due to the high polydispersity in some cases (Figure 4. 7) or because, without the complex partner, the structural conformation is not maintained as under physiological conditions.

5.3 Characterization of B. thetaiotaomicron CcsBA

Production of BtCcsBA

The overexpression of ccsBA from B. thetaiotaomicron under the strong T7 promoter in E. coli C43(DE3) cultured in TB medium showed a considerable increase in protein yield. The purification chromatogram displays homogeneous and monodisperse protein of high purity (Figure 4. 8). Two major bands were detected on the SDS-PAGE, corresponding to different parts of BtCcsBA, cleaved in the periplasmic domain. In addition to the separated peptide fragments, the full-length protein could also be observed. Ahuja et al. showed that, when overproduced in E. coli, CcsB (ResB) from B. subtilis displays a band at around 27 kDa, representing a truncated version of the protein (just the N-terminus), and a full-length polypeptide at 62 kDa, the same size as the theoretical molecular mass calculated from the amino acid sequence. The authors suggested that E. coli has difficulties in secreting or folding large periplasmic domains of membrane proteins [118]. Frawley and Kranz proposed a proteolytic site in the periplasmic domain of CcsB from the CcsBA fusion protein of H. hepaticus produced in E. coli [112]. In both papers, bands corresponding to CcsB and CcsA were detected after the gels were stained for polypeptides covalently attached to hemes. Ahuja et al. concluded that CcsB binds the heme covalently through a cysteine at the N-terminus, while Frawley and Kranz found that the heme dissociates from the CcsA polypeptide upon boiling, implying that it is non-covalently attached. Also, in the doctoral thesis of Anton Brausemann [96], CcmF retains the heme on the SDS-PAGE, due to tight connection with the surrounding amino acids. However, in the present work, three heme-staining procedures were performed with CcsBA homologs from H. hepaticus and B. thetaiotaomicron, but no bands were visible on the gel. It is concluded that CcsBA is a b-type cytochrome with the heme bound non-covalently, whereas CcsB from B. subtilis might undergo a conformational change when purified in vitro that places a cysteine in the right position to form a spontaneous thioether bond with the heme.

Crystallization of BtCcsBA

The purified CcsBA protein from B. thetaiotaomicron, in DDM detergent, was successfully crystallized, but the crystals obtained showed poor diffraction quality and were difficult to reproduce or optimize. Even though the periplasmic soluble domain of CcsB comprises about two thirds of the fusion protein and influences the crystal packing, there may be a lack of protein-protein contacts within the crystal and more detergent-mediated crystal contacts. Adding different additives or supplementary detergents at concentrations below their CMC values, in the hope of reducing the size of the detergent regions and thereby inducing better crystal packing, did not improve the crystal quality. The excess of free detergent micelles, illustrated in the SEC-LS chromatogram for BtCcsBA (Figure 4. 9, section 4.3.2), can increase the amount of phase separation that occurs in the crystallization trials, which can inhibit crystal formation. The presence or absence of native lipids could also contribute to the poor diffraction of the crystals. Post-translational modifications (glycosylation) or chemical modification of surface residues (lysine methylation) could also influence the orderly packing of the crystal. For the crystallization attempts, the protein was concentrated to between 8 and 10 mg/ml. However, determination of the concentration by BCA assay is a biased evaluation, so variations in the protein amount could impair crystal reproducibility. Changing the incubation temperature of the crystallization plates from 20 °C to 8 °C, to influence the degree of supersaturation in the drop, did not produce any differences. In our laboratory, the solutions in the initial screens are normally mixed and prepared to the final concentration and pH with an automated liquid handling system (Rigaku Alchemist II), whereas the fine screens for the optimization of the original conditions were usually prepared manually. Therefore, to reproduce the initial conditions as closely as possible, four fine screens were assembled using the Alchemist dispenser in a deep-well block and used for crystallization experiments. Unfortunately, no crystals were obtained from the fine screens prepared by this approach, indicating that the automation of the screen preparation was not the decisive factor for reproducing the initial crystals. The only common denominator among most of the conditions in which BtCcsBA initially crystallized is the presence of trisodium citrate at high concentrations (Figure 5. 2). Although typical membrane proteins prefer low ionic strength in the crystallization conditions, trisodium citrate at high concentrations seems to be favourable in some cases, hence the presence of two such conditions (A2 and A10) in the standard MemGold crystallization screen, which is specific for membrane proteins. Other citrate salts, such as tripotassium citrate, trilithium citrate or diammonium hydrogen citrate, were tried in fine screens without any success.

Figure 5. 2: Trisodium citrate

The general shape of BtCcsBA crystals is a hexagonal prism with imperfect pyramids at both ends (Figure 5. 3, A). On rare occasions, parallelepiped-shaped crystals, elongated at the ends, were spotted (Figure 5. 3, B). In one case, a large crystal was found that was asymmetrical and lacked well-defined straight edges (Figure 5. 3, C). Rarely, there were crystals with no expansions at the top (Figure 5. 3, D) or crystals with very sharp edges at the ends (Figure 5. 3, E).

Figure 5. 3: Different shapes of BtCcsBA crystals (panels A to E).

Calculation of the Matthews coefficient from the unit cell parameters and the mass of the protein indicates a solvent content of 40-55% and around 3-4 monomers in the asymmetric unit.
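The estimate can be reproduced with a short calculation, sketched here under stated assumptions: the unit cell parameters are those reported in section 4.3.5, the mass is the theoretical 91 kDa of the protein alone (bound detergent is ignored), space group P6₃22 has 12 asymmetric units per unit cell, and the solvent fraction uses the common approximation 1 - 1.23/V_M. The function names are introduced here for illustration.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """General unit cell volume in cubic angstroms (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, mass_da, asym_units, copies):
    """V_M in cubic angstroms per dalton for `copies` molecules per asymmetric unit."""
    return volume / (asym_units * copies * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1 - 1.23 / v_m


VOLUME = unit_cell_volume(203.92, 203.92, 249.72, 90, 90, 120)
MASS_DA = 91_000      # theoretical mass of BtCcsBA, detergent not included
ASYM_UNITS = 12       # asymmetric units per unit cell in P6(3)22

results = {}
for copies in (3, 4):
    v_m = matthews_coefficient(VOLUME, MASS_DA, ASYM_UNITS, copies)
    results[copies] = (v_m, solvent_fraction(v_m))
    print(f"{copies} monomers: V_M = {v_m:.2f} A^3/Da, "
          f"solvent = {results[copies][1] * 100:.0f} %")
```

With three and four monomers per asymmetric unit, the calculation gives solvent contents of about 55% and 40%, respectively, matching the range stated above.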

Detergent exchange for BtCcsBA crystallization

The biggest challenge in membrane protein production and crystallization is their hydrophobic nature, which can only be overcome by finding the right detergent to extract the transmembrane helices from the lipid bilayer and support the formation of stable, monodisperse protein-detergent complexes. In order to promote successful crystallization and obtain the best-quality crystals, screening a wide range of detergents with different properties is imperative. DDM, normally used for membrane solubilisation at a concentration of 1%, was exchanged in the washing step of the affinity purification. Various types of detergents were tested, from maltosides with different chain lengths to glucosides, zwitterionic or lipid-like detergents. However, few detergents provided monodisperse, stable protein. Detergents with higher CMC (e.g. OGP, D10M) were not suited for CcsBA, possibly because they bind only weakly to the hydrophobic regions and therefore do not properly mimic the natural lipid bilayer. No crystals could be obtained from CcsBA in complex with any detergent other than DDM.

Buffer exchange for BtCcsBA crystallization

Variation of the pH is an important strategy in protein crystallization, as pH is a critical parameter for the conformation, activity, electrostatic interactions and stability of the protein in solution. The typical buffer used for BtCcsBA was Tris at a physiological pH (7.5), with high salt concentrations during affinity purification (300-500 mM NaCl) that were lowered to a concentration suitable for protein crystallization (150 mM NaCl) in the gel filtration step. Other buffer systems with pH values between 6.2 and 7.5 were screened to improve crystallization. Buffers titrated to the desired pH with citric acid rendered the protein highly unstable. The other buffers tested did not change the elution profile, oligomeric state or stability of the protein. The crystals obtained from the protein purified with 0.1 M sodium phosphate buffer at pH 7 showed no improvement in diffraction pattern or resolution.

5.4 UV/Vis spectra of CcsBA/CcsA homologs

A universal observation was that all the CcsBA/CcsA homologs contain b-type heme, with the colour of the protein samples varying from pale yellow to bright red at similar concentrations (Figure 4. 4, Figure 4. 6, Figure 4. 7, Figure 4. 11). The light colours suggest a low heme content in some of the homologs because of too few interactions between the prosthetic group and the surrounding protein. For thioether bonds to form between heme b and the apocytochrome c, the iron atom must be in the ferrous state (Fe2+). However, all recombinant CcsA or CcsBA homologs purified in this work have the same spectral properties, with an intense Soret band at 413 nm, suggesting the presence of an oxidized heme b (Fe3+). The state of the heme proteins in vivo is unknown, but because the purifications were performed under aerobic conditions, it is very possible that the proteins became oxidized during the isolation steps. On the other hand, Frawley and Kranz purified wild-type CcsBA from H. hepaticus under oxic conditions, and the UV/Vis spectrum shows an α-band maximum at 562 nm and a Soret peak at 416 nm, typical for a reduced b-type cytochrome (Figure 5. 4, left) [112]. Also, CcsB from S. thermophilum was purified and used for spectroscopic characterization in the master project of Sarah Müller. The UV/Vis spectrum of CcsB as purified shows a mixture of two populations of heme b proteins with iron in different oxidation states (Figure 5. 4, right). The Soret peak in the spectrum of the protein as purified is not symmetrical, with a shoulder at around 420 nm (maximum for reduced heme iron), and there is also an absorption corresponding to the α-band at 561 nm. This may suggest that the protein cofactor was initially in a reduced state but became oxidized over time through exposure to air.

Figure 5. 4: Left: UV/Vis spectra of HhCcsBA as purified (dotted line), reduced with sodium dithionite (black line) and oxidized with ammonium persulfate (interrupted line). Image taken from Frawley and Kranz [112]. Right: UV/Vis spectra analysis of StCcsB as purified (black line) and reduced with sodium dithionite (green line). Wavelength of absorption maxima are indicated above each peak. Image taken from the master thesis of S. Müller.

5.5 CcsA vs. CcmF

CcsA from System II and CcmF from System I belong to the same family of heme-handling proteins, displaying a highly conserved motif within their sequences, the WWD domain. Unfortunately, a three-dimensional model of CcsBA or CcsA could not be obtained as a consequence of the poor diffraction quality of the crystals. However, a very accurate computationally generated CcsA structure provided a detailed view of the conformation of the protein. It was recently reported that very accurate models of 614 protein families with unknown three-dimensional structures could be generated de novo, using metagenome data combined with Rosetta structure-prediction calculations guided by coevolution-based residue-residue contact predictions produced by Gremlin [190]. In this list of protein families, two files were presented, each with one half of the predicted CcsA structure from Cyanothece sp. The complete model was merged using Coot and then visualized in PyMOL (Figure 5. 5). The structural topology of the CcsA model shows a transmembrane domain comprised of eight helices with an additional, small cytoplasmic domain composed of loops and short helices. The arrangement of the transmembrane helices shows two highly conserved histidine residues, one in the transmembrane part at the interface of CcsA and the other in the periplasmic region (represented as sticks in Figure 5. 5).

Figure 5. 5: Cartoon picture illustrating the predicted structure of CcsA from Cyanothece sp., coloured from blue at the N-terminus to red at the C-terminus. The transmembrane helices are numbered from 1 to 8.

A structural alignment between the predicted structure of CcsA from Cyanothece sp. and a truncated structure of CcmF from Thermus thermophilus (only the first 292 residues) displays a strong structural homology in the folding of the proteins and a remarkable similarity in the topology and orientation of the transmembrane helices (Figure 5. 6), while not sharing any remarkable degree of sequence identity (~20%).
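For readers unfamiliar with how a sequence identity value is derived from an alignment, the following minimal sketch may help. It is not the procedure used in this work: the function `percent_identity` and the two short aligned strings are purely hypothetical, and the choice of denominator (here, aligned positions without gaps) is one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100 * identical / compared if compared else 0.0


# Hypothetical toy alignment, not taken from CcsA or CcmF.
toy_a = "MKT-AWWDLG"
toy_b = "MRSQAWWD-A"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f} % identity over non-gap positions")
```

In the toy alignment, five of the eight compared positions are identical. A value near 20%, as found for CcsA and CcmF, means that only about one aligned residue in five is shared despite the similar fold.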


Figure 5. 6: Cartoon illustrating the alignment between the predicted structure of CcsA from Cyanothece sp. (rainbow colour) and the CcmF crystal structure (truncated version: 292 residues) from T. thermophilus (pale blue colour). The left picture represents the lateral view while the right picture represents the frontal view of the alignment. The heme and the conserved histidines are depicted as sticks.

The structural alignment between both proteins indicates a similar position for the conserved WWD domain, located in a periplasmic loop (Figure 5. 7, left). A highly conserved histidine residue from the transmembrane region, which represents one of the heme axial ligands in CcmF, is aligned in both structures (Figure 5. 7, right). The similarity indicates that the heme-binding site in CsCcsA is located in the same place as in TtCcmF. For CcmF, it was proposed that this transmembrane heme might have a role in the reduction of the other heme group, placed in the WWD region and delivered by the CcmE chaperone, which participates in the assembly of cytochromes c [81]. Structural measurements show that the distance between the transmembrane heme and the WWD motif is around 12 Å, but a cavity is observed between them where two conserved tryptophan residues can be spotted, which might act as a bridge for the electron flux. Therefore, given the high similarity between both proteins, this heme group in CcsA is likely to have the same role as in CcmF.


Previous studies suggest that CcsA forms a channel to transport the heme group for further attachment [112]. However, the close structural homology of CcsA with CcmF implies that the heme involved in the catalysis process is acquired in a different way than the proposed strategy. TtCcmF has 15 helices (Figure 1. 5) while HhCcsBA and BtCcsBA each have 14 helices (indicated by secondary structure prediction tools, see section 12.5). It is quite possible that CcsB resembles the C-terminal region of CcmF, except that the periplasmic β-sheet domain is much larger in CcsB.

Figure 5. 7: Left picture: Cartoon representing the alignment between the predicted CcsA structure from Cyanothece sp. (rainbow colour) and the CcmF crystal structure (truncated version: 292 residues) from T. thermophilus (pale blue colour) with the conserved WWD domain depicted in black. Right: Closer view of the heme binding site of the predicted CsCcsA structure and the TtCcmF crystal structure. The conserved histidine residues, acting as axial ligands of heme b, are represented as sticks.

The WWD motif in CcsBA and CcmF

Highly conserved and rich in tryptophan residues, the WWD domain represents a specific signature for the heme-handling proteins involved in cytochrome c assembly. However, it is not universally conserved in both systems, and as a result, alignments between CcsBA (or CcsA) and CcmF homologs reveal differences between both general motifs (Figure 5. 8). Representative homologs for both types of proteins from the major taxonomic groups were chosen for sequence analysis. The universally conserved amino acids common to both CcsA and CcmF seem to be four tryptophan residues and two charged amino acids, an aspartate (from the motif’s name) and a glutamate two positions further along. The histidine residue included in the alignment (Figure 5. 8) represents the axial ligand for the heme group found in the transmembrane domain. A proper WWD domain, as mapped for CcmF, comprises WXYXXLGWGGXWXWDPVENAS/AFM/IPWL. An extended WWD motif for CcmC, the other heme-handling protein in System I, was proposed by Ahuja and Thöny-Meyer to be WGXφWXWDXRLT, where φ is a charged amino acid [191]. Although it resembles the motif in CcmF, the glutamate (common for CcmF and CcsA) is replaced by a leucine, but the loss of a charged amino acid is compensated by changing the valine (close to the glutamate in CcmF) to an arginine. The sequence region of the CcsA WWD pattern is not as well conserved as in CcmF. While at least 19 fully conserved residues are part of the WWD motif in CcmF, just 9 amino acids seem to be completely conserved in CcsA proteins, with the caveat that exceptions exist and more sequences need to be aligned in order to gain a complete overview. An example of such an exception is one of the six homologs from G. sulfurreducens, in which the aspartate of the WWD motif seems to be replaced by a threonine. The extended WWD sequence proposed for CcsA comprises WAXXS/AWG/TXY/FWXWDXKET/VW. Another difference between the motifs is their location within the protein sequence. While for CcmF the WWD site is positioned relatively close to the beginning of the protein, for CcsA it is located towards the end of the protein sequence.
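The consensus notation used for the WWD motifs can be turned into a search pattern for scanning other sequences. The following is a minimal sketch, not the procedure used in this work: it assumes that X stands for any residue, that letters joined by a slash (for example S/A) are alternatives at one position, and that φ covers the residues D, E, K, R and H. The function names and the test sequence are hypothetical.

```python
import re

# Assumption: residues treated as "charged" for the φ placeholder.
CHARGED = "DEKRH"


def motif_to_regex(motif: str, phi_class: str = CHARGED) -> str:
    """Translate a consensus motif such as 'WGXφWXWDXRLT' into a regex."""
    parts = []
    i = 0
    while i < len(motif):
        options = [motif[i]]
        i += 1
        # Collect alternatives written as 'S/A' (one position, several residues).
        while i + 1 < len(motif) and motif[i] == "/":
            options.append(motif[i + 1])
            i += 2
        if options == ["X"]:
            parts.append(".")                      # any residue
        elif options == ["φ"]:
            parts.append("[" + phi_class + "]")    # residue class for φ
        elif len(options) == 1:
            parts.append(options[0])               # fixed residue
        else:
            parts.append("[" + "".join(options) + "]")
    return "".join(parts)


def find_motif(sequence: str, motif: str):
    """Return (start index, matched segment) for every motif hit."""
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    for name, motif in [
        ("CcmF", "WXYXXLGWGGXWXWDPVENAS/AFM/IPWL"),
        ("CcmC", "WGXφWXWDXRLT"),
        ("CcsA", "WAXXS/AWG/TXY/FWXWDXKET/VW"),
    ]:
        print(name, motif_to_regex(motif))
    # Hypothetical toy sequence, not taken from any real protein.
    print(find_motif("MKWGSRWAWDPRLTAA", "WGXφWXWDXRLT"))
```

Because a strict pattern misses the exceptions mentioned above (such as the aspartate-to-threonine variant), such a search can only complement a full sequence alignment.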

Figure 5. 8: Sequence alignment of WWD motifs in CcmF and CcsBA (CcsA) proteins. Picture made with CLC Main Workbench. CcsA from Cyanothece sp. (Cyanobacteria), Aquifex aeolicus (Aquificae), Symbiobacterium thermophilum (Firmicutes), Helicobacter hepaticus (ε-Proteobacteria), Bacteroides thetaiotaomicron (Bacteroidetes), Micrococcus luteus (Actinobacteria) and Bacillus megaterium (Firmicutes) were the homologs chosen for the alignment. CcmF from Pseudomonas aeruginosa (γ-Proteobacteria), Escherichia coli (γ-Proteobacteria), Ralstonia pickettii (β-Proteobacteria), Rhodobacter sphaeroides (α-Proteobacteria), Thermus thermophilus (Deinococcus-Thermales), Deinococcus radiodurans (Deinococcus-Thermales) and Chloroflexi bacterium (Chloroflexi) were the homologs aligned at their WWD motif.


5.6 Cytochrome c assembly in System II

One of the most controversial questions in cytochrome c biogenesis is how the heme reaches the place of assembly. It is not yet clear whether the heme groups involved in the maturation of cytochromes c are translocated across the various membranes by passive transport because of their hydrophobicity or whether they require a flippase or a specific transporter. However, the potential toxicity of free heme argues against the hypothesis that the prosthetic groups can diffuse freely without a protection mechanism. It is very tempting to suggest that the cytochrome c assembly pathways might be involved, but so far there is no hard evidence for a heme transport function in these systems. There is a possibility that the ferrochelatases discharge fully matured hemes into the lipid bilayer, from where they are intercepted by the cytochrome c systems. This may be the reason why plants keep a ferrochelatase homolog in the mitochondrial matrix although heme biosynthesis takes place in the chloroplasts. For System II it was proposed that the heme moves across the membrane through a channel formed in CcsBA. It was also suggested that the histidine residues, which ligate and chaperone the heme across the lipid bilayer, additionally keep the heme iron in a reduced state [112]. However, in this work, spectral analyses of Ccs protein homologs show the typical characteristics of an oxidized b-type heme. A hypothesis consistent with the findings of this thesis is that the CcsBA complex requires a permanent heme b cofactor buried inside a hydrophobic pocket at the lateral interface of CcsA and CcsB and axially coordinated by two histidine residues found in the transmembrane helices of both complex partners. This heme might oscillate between oxidation states and therefore be involved in electron transfer.
It is more likely that a second heme b, which is directly implicated in the cytochrome c assembly, is ligated non-covalently by the periplasmic histidine residues from CcsA that flank the WWD motif. This heme should be bound only transiently, until the binding motif on the apocytochrome c is in place, close enough for the thioether bonds to form. It was suggested already that the heme binding site on the apocytochrome c could be recognized and guided by the large periplasmic domain of CcsB. The high flexibility of one of the histidine residues found on the periplasmic loop of CcsA could produce a ligand switch at the proximal axial site of the heme with the histidine residue from the heme binding motif on the apocytochrome. The distal axial coordination could be formed upon folding of the cytochrome c into its mature form, in this way providing the energy needed for heme release from CcsA.


Structural Characterization of Latex Clearing Protein (Lcp) from Streptomyces sp. Strain K30

6. Introduction

6.1 Streptomyces sp.

Gram-positive bacteria of the Streptomyces genus form the largest group within the Actinobacteria phylum, with more than 500 different species investigated and described. Similar to fungi, they grow filaments that form aerial mycelia from which branches with chains of spores emerge [192]. They are widely distributed as saprophytes in the soil (40%), but also populate diverse natural environments, terrestrial and aquatic, while some species are plant or animal pathogens [193]. Their ability to produce secondary metabolites such as antifungals, antivirals, antitumorals, anti-hypertensives, immunosuppressants and especially antibiotics makes them the most important source for medical, veterinary and agricultural products [194, 195]. Streptomyces bacteria also perform a tremendously important ecological role in the degradation of organic matter [193].

6.2 Natural rubber, latex and synthetic rubber

Natural rubber, or caoutchouc, is a hydrocarbon biopolymer [poly (cis-1,4-isoprene)], mainly synthesized by the rubber tree Hevea brasiliensis, but also by more than 2000 types of plants and even some fungi. Natural rubber is the main component of latex milk and it can be extracted by coagulation or precipitation. The latex sap produced by the rubber tree consists of 25-35% rubber particles, around 50-70% water and small amounts of proteins, resins, ash, sugars or other impurities. It is secreted as a protective fluid to seal mechanical injuries of the bark or to immobilize invading insects [196].

Figure 6. 1: Poly (cis-1,4-isoprene)


Natural rubber is composed of isoprene (C5H8) units, each presenting a double bond in cis configuration (Figure 6. 1). The biopolymer derived from the rubber tree has three trans-isoprene units at one end of the chain followed by hundreds of cis-isoprene units. Plant species that synthesize rubber with only trans-isoprene units have also been discovered [197]. Synthetic rubber is a polymer manufactured from petroleum, coal, oil, natural gas or acetylene. It consists mainly of cis-isoprene units joined by 1,4-linkages and up to 10% trans-isoprene units including 3,4-linkages [198]. A breakthrough in the field was the discovery of natural rubber vulcanization by Charles Goodyear in 1839, which considerably improved the material properties but at the same time made the material recalcitrant to chemical and biological degradation [196, 199].

6.3 Latex biodegradation

Latex was discovered more than 100 years ago, and about 5 billion kilograms of natural rubber and twice that amount of synthetic rubber are now produced every year. Although the quality of natural rubber is preferred, nowadays about 65-70% of the total global production consists of synthetic rubber. There is a huge industry manufacturing countless rubber products, especially tires (75% of the world’s rubber production, both natural and synthetic), but also medical, household and industrial items such as gloves, footwear, balloons, condoms, seals, cables, car parts, adhesives, paints and many other products. In consequence, large amounts of rubber have been released into our environment as waste or, in the case of tires, by abrasion [200]. Even though latex is biodegradable, its complete degradation in nature is rarely observed, resulting in huge accumulations of waste [201]. To date, several microorganisms have been isolated that can use both natural and synthetic rubber as a carbon source [202-210]. They have been divided into two groups that follow different strategies for degrading rubber. Members of the first group form a clear zone around colonies grown on solid media with latex particles (Figure 6. 2). Most of them are mycelium-forming members of the Actinomycetes (Actinomycetales) order, including Actinoplanes, Streptomyces or Micromonospora, but the group also contains Xanthomonas sp. strain 35Y, a Gram-negative proteobacterium. Members of the second group include Actinomycetes such as Gordonia, Mycobacterium or Nocardia, as well as some Pseudomonas species [211]. They do not proliferate on latex plates because they require more direct contact with the polymer, growing adhesively at the surface of latex particles suspended in liquid cultures [212-214].


Figure 6. 2: Halo created around S. lividans colony harbouring the vector pIJ702::lcp [212].

Rubber is a high-molecular-weight, water-insoluble polymer, which makes it impossible for cells to take it up directly. Therefore, bacteria evolved a special ability to synthesise and secrete extracellular enzymes that catalyse oxidative reactions to break down the polymer into low-molecular-weight products with terminal keto and aldehyde groups that can then be taken up and processed [215]. Until now, two different types of rubber-degrading enzymes that cleave the isoprene chain at different lengths have been discovered and characterized. The rubber oxygenase protein, RoxA, was first detected in Xanthomonas sp. strain 35Y [216]. It was also identified in soil and marine myxobacteria, all of them Gram-negative [217]. The second type of enzyme is known as latex clearing protein, Lcp, and was first discovered in Streptomyces sp. strain K30 [218]. It is found exclusively in Gram-positive rubber-degrading actinomycetes [210, 219] and is not related to RoxA in any aspect investigated so far [200, 215].

6.4 Rubber oxygenase A (RoxA)

RoxA isolated from Xanthomonas sp. strain 35Y was investigated in more detail, with the crystal structure determined in 2013 [220]. The rubber oxygenase is a 75 kDa c-type cytochrome with two binding sites for the covalent attachment of the heme cofactors. It catalyses the cleavage of the polyisoprene chain to a preferred end product, 12-oxo-4,8-dimethyltrideca-4,8-diene-1-al (ODTD, C15-tri-isoprenoid). This behaviour is evidence for an internal molecular ruler [200]. The three-dimensional structure of RoxA displays an unusually low amount of secondary structure, with predominant loop regions and just one third of the protein organized as α-helices and β-sheets (Figure 6. 3). The unusual presence of two disulfide bridges between different loops of the protein further stabilizes the fold of the enzyme [220]. One of the heme groups, located in the N-terminal region, was found to bind oxygen as a distal ligand. Through mutagenesis, spectroscopic and functional studies, it was discovered that a phenylalanine residue in position 317 is involved in interactions with the substrate and is essential for polyisoprene cleavage [221]. Therefore, the dioxygen-bound heme, together with Phe317, was proposed to represent the active site of RoxA. The second heme cofactor, located at the C-terminus, has its iron atom in a bis-histidine coordination and is implicated in electron transfer. Considering the large interheme distance (21.4 Å), a conserved tryptophan residue in position 302 could facilitate electron flow between the two iron atoms, similar to the case of the MauG protein [222]. The physiological mechanism is not yet completely understood, but it was suggested that RoxA exhibits an exo-type cleaving mechanism, with electron transfer from the active site to the substrate through a network of aromatic residues [220].

Figure 6. 3: Three-dimensional structure of RoxA from Xanthomonas sp. strain 35Y (PDB-ID: 4B2N), illustrated as cartoon in rainbow colour, from blue at the N-terminus to red at the C-terminus. Both heme groups (red), their heme-binding motifs (CXXCH) and the dioxygen molecule are represented as sticks.


RoxA is structurally similar to the peroxidases of the CcpA family and the maturation factor for methylamine dehydrogenase, MauG [220]. An isoform of RoxA, designated RoxB, was discovered in the same organism, Xanthomonas sp. strain 35Y, with a similar molecular mass and 38% sequence identity. The major difference between the homologs is the length of the cleavage products resulting from their activity. As stated above, RoxA catalyses the cleavage of the polyisoprene chain into one major end product, C15-tri-isoprenoid (ODTD), while RoxB divides the isoprene chain randomly, with end products of different lengths (C20, C25, C30, etc.), but always longer than ODTD (Figure 6. 5, right). The bacteria were also not able to take up and metabolize the products of RoxB (Figure 6. 4) [223]. Recently, two new homologs of RoxA and RoxB, respectively, from the Gram-negative bacterium Rhizobacter gummiphilus strain NS21, were biochemically characterized. The novelty in this study was the discovery of a collaborative effect between the two rubber-degrading oxygenases, where RoxA not only catalyses the cleavage of the polyisoprene chain, but also cleaves the RoxB-generated high-molecular-weight end products (C35 and higher) into C15-tri-isoprenoids [224].

Figure 6. 4: Cells of ΔroxA Xanthomonas sp. strain 35Y, overexpressing the roxA or roxB genes, were cultivated on polyisoprene latex overlay (LOV) agar plates. Clear zone formations were observed for the bacteria producing RoxA35Y or RoxB35Y, but not for the negative control, represented by the ΔroxA Xanthomonas sp. strain 35Y cells. The same colonies stained with fuchsin solution show the formation of a large purple halo around the colony harbouring RoxB35Y, indicating the presence of products with aldehyde groups. The colony expressing RoxA35Y lacks the purple halo, because presumably the ODTD products resulting from the degrading activity were small enough to be taken up and metabolized by the bacteria. Picture taken from Birke et al. [223].


6.5 Latex clearing protein (Lcp)

6.5.1 Lcp characteristics and comparison to RoxA

Even though Lcp and RoxA accomplish the same physiological role in nature as rubber degraders, structurally and biochemically they present many differences. In general, latex clearing proteins possess only half of the molecular mass of the rubber oxygenases. There are no heme-binding motifs for covalent attachment present in the amino acid sequences of different Lcp homologs. Many controversial reports were released about the nature of Lcp. First, an ortholog from Gordonia polyisoprenivorans strain VH2 (two lcp paralogous genes on the chromosome) was determined to be a Cu(II)-containing oxygenase, a conclusion that resulted from metal analysis and spectroscopic experiments. The enzyme, purified with a polyhistidine tag, was considered a member of the copper-containing white laccase family [225]. Lcp from Streptomyces sp. strain K30 produced in Xanthomonas sp. ΔroxA strain, isolated also with a polyhistidine tag, was presumed to possess no cofactor and to be part of the so-called ‘cofactor-independent oxygenases’ family. Even though for this study, metal analysis showed 0.8 mol of Fe per mol of Lcp, the results were overlooked because of the low concentration of the enzyme [215]. However, when the polyhistidine tag was exchanged for a Strep-tag (for the Lcp from Streptomyces sp. strain K30), spectroscopic analysis revealed the presence of a b-type heme (Figure 6. 5, left) and repeated metal analysis showed around 1 mol Fe per mol of Lcp, indicating one iron heme group per protein. Absorption spectra of Lcp (Figure 6. 5) show that the iron heme is found in the oxidized state, Fe3+, in contrast to the reduced heme group found in the active site of RoxA. Also, the production host was changed to E. coli JM109, which increased the protein yield considerably, by around 40-fold [200]. 
Because the maturation of RoxA takes place in the periplasm of the bacteria, the precursor of the protein is translocated into this compartment via the Sec translocon, where the cytochrome c assembly system catalyses the formation of covalent bonds between the heme group and the polypeptide. On the other hand, as a b-type cytochrome, fully matured Lcp is directly secreted via the Twin-arginine translocation pathway [212].

Specific activities of 1.5 U/mg and 4.6 U/mg were determined for LcpK30 at 23°C and 37°C, respectively [200]. Similar to RoxB, cleavage products derived from LcpK30 activity are a mixture of oligoisoprenoids with different lengths, always higher than the preferred C15-tri-isoprenoid (ODTD) generated by RoxA (Figure 6. 5, right) [215].

Figure 6. 5: Left: UV/Vis spectra of Lcp from Streptomyces sp. strain K30 as isolated and reduced with sodium dithionite. Wavelength values for the absorption maxima are indicated above the peaks. Right: Latex degradation products from LcpK30, RoxA and RoxB activity, extracted with ethyl acetate and separated by HPLC. Pictures taken from Birke et al. [200, 223].

Figure 6. 6: The cleavage of the polyisoprene chain by Lcp from Streptomyces sp. strain K30 into low-molecular-weight oligoisoprenoids with terminal keto and aldehyde groups that can be further taken up and processed by the bacteria [226].

6.6 LcpK30 overproduction

The lcp gene from Streptomyces sp. strain K30 was cloned into the p4782 vector and the TAT signal peptide sequence was substituted with a Strep-tag. LcpK30 was overproduced in E. coli JM109 and purified by three subsequent chromatographic steps. First, the protein was isolated by affinity chromatography through a Strep-tactin column, then the purification buffer was exchanged to 1 mM potassium phosphate buffer, pH 7.0, via a HiPrep Sephadex G25 26/10 column to avoid protein precipitation. The concentrated protein was further applied on a Superdex 200 16/600 column as a last polishing step (Figure 6. 7, right). The purity was checked on SDS-PAGE (Figure 6. 7, left) [200].

Figure 6. 7: Left: SDS-PAGE of Lcp preparations: lane M, molecular marker; lane 1, soluble crude extract; lane 2, elution peak after affinity chromatography; lane 3, elution peak after SEC. Right: SEC chromatogram of the LcpK30 purification, monitored at 280 nm and 412 nm for protein and heme detection, respectively. Picture taken from Birke et al. [200].


7. Scope of the study

Worldwide consumption of natural and synthetic rubber has grown enormously in the last decades, with many applications in industry and in daily life, resulting in tons of waste every year and few solutions for disposing of or recycling the products. Therefore, extensive research has been conducted into the microbial degradation of rubber by seeking microorganisms with the ability to secrete latex-degrading enzymes. Many bacteria were isolated and characterized, especially from the Actinomycetes group, with only two rubber-degrading enzymes identified: the rubber oxygenase, RoxA (with the homolog RoxB), discovered only in Gram-negative bacteria, and the latex clearing protein, Lcp, found exclusively in Gram-positive bacteria. However, despite numerous studies, the biodegradation mechanism of these materials is not entirely known. RoxA has been well described biochemically and structurally, with its three-dimensional crystal structure solved a few years ago. The research carried out on two different homologs of Lcp resulted in conflicting data and divergent views on the protein, which was assigned to several different protein families. Thus, to better understand the molecular architecture and to gain mechanistic insights, the aim of this study was the crystallization and subsequent three-dimensional structure determination of Lcp from Streptomyces sp. strain K30. Another powerful tool, electron paramagnetic resonance spectroscopy, could provide additional insight into complex intermediate states of the heme group during catalysis. The pure, active LcpK30 was heterologously produced and isolated by the Jendrossek group in Stuttgart.


8. Materials and methods

8.1 Spectroscopic methods

8.1.1 EPR

Electron paramagnetic resonance (EPR) spectroscopy measurements of cytochromes show signals that belong to the unpaired electrons of the iron atoms. These signals give information about the ligands present, the spin state and the redox state. EPR spectra were recorded by Dr. Julia Netzer on a Bruker Elexsys E500 instrument with a 10n ER073 electromagnet and a Super High Q resonator cavity. The measurements were performed at 10 K, a temperature achieved using an Oxford Instruments er 41112HV continuous flow liquid helium cryostat controlled by an ITC 503 temperature device. Quartz tubes of 4 mm outer diameter (705-PQ-9.50, Wilmad) were filled with 300 μl sample and measured at a power of 10 mW at ~9.38 GHz, with a modulation amplitude of 6 G and a receiver gain of 60 dB.
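For orientation, the resonance condition hν = g·μB·B links the microwave frequency given above to the magnetic field at which a signal with a given g-value appears. The following is a minimal sketch of that relation, not part of the measurement protocol; the function names are hypothetical, and the g-values in the example (around 6 and 2, as commonly quoted for high-spin ferric heme) are illustrative assumptions rather than results of this work.

```python
# Physical constants (CODATA 2018 values)
PLANCK_H = 6.62607015e-34          # J s
BOHR_MAGNETON = 9.2740100783e-24   # J / T


def resonance_field_mT(g_value: float, frequency_hz: float) -> float:
    """Field (in mT) at which a species with this g-value resonates."""
    return PLANCK_H * frequency_hz / (g_value * BOHR_MAGNETON) * 1e3


def g_from_field(field_mT: float, frequency_hz: float) -> float:
    """g-value corresponding to a resonance observed at field_mT."""
    return PLANCK_H * frequency_hz / (BOHR_MAGNETON * field_mT * 1e-3)


if __name__ == "__main__":
    frequency = 9.38e9  # Hz, the X-band frequency stated in the text
    for g in (6.0, 2.0):  # illustrative g-values only
        field = resonance_field_mT(g, frequency)
        print(f"g = {g:.1f} -> B = {field:.1f} mT")
```

Reading a spectrum works the other way round: the field position of a feature is converted into its g-value with `g_from_field`.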

8.2 Crystallization methods

8.2.1 Protein crystallization

The protein crystallization procedures are described in chapter 3.6.1. The optimal protein concentration for Lcp crystallization was between 5 and 10 mg/ml, in a 0.6 μl sitting drop with a 1:1 ratio. MRC crystallization plates (Swissci) with two drops per well were used, containing 50 μl of reservoir solution. The incubation temperature of the crystallization plates was set to 20°C.

8.2.2 Diffraction data collection

The diffraction experiments are described in section 3.6.2. Single crystals were harvested from the drop with nylon loops. For cryo-protection, the crystals were incubated for one second in the reservoir buffer containing 10% (v/v) (2R,3R)-butanediol and then flash-cooled and stored in liquid nitrogen.


8.2.3 Data processing, phase determination and structure building

Diffraction data were indexed and integrated with XDS [227], and scaled and merged with AIMLESS [228] from the CCP4 suite [229]. Phasing, density modification and initial model building were performed in AutoSol from the Phenix suite [230]. Completion and correction of the initial model were carried out in the experimental electron density map using Coot [231] with iterative refinement cycles in REFMAC [232]. The final model from SAD-phased data was used as a search model for molecular replacement with other data sets in Phaser [233] and refined with the same procedure as described above. The structures were validated using MolProbity [234]. Three-dimensional structure pictures were generated using Pymol [235]. PDB coordinate files for two conformations of LcpK30 are available in the Protein Data Bank in Europe (PDBe) under the accession codes 5O1L (for the open state) and 5O1M (for the closed state).


9. Results and discussions

9.1 LcpK30 crystallization

Crystallization trials were set up at 20°C under aerobic conditions, by scanning all initial screens available in the laboratory, using fresh or thawed LcpK30. After a few days, clusters of needle-like crystals could be observed. However, over a period of two to three weeks, crystals of different shapes with sharp edges started to grow in different conditions. Fine-screen optimization of the hit conditions with fresh microseeds was then set up. Most of the conditions suitable for crystal production contained imidazole buffer. In the few conditions that yielded suitable single crystals, different additives were tested, revealing a preference of the protein for manganese chloride. Various crystal shapes were obtained, with a range of colours from light yellow to bright red, depending on heme occupancy in the protein crystal (Figure 9. 1). In general, the crystals grew to a considerable size, with an average side length of around 100 μm. They were usually found fixed at the bottom of the well drop, making crystal harvesting quite a challenge.

Figure 9. 1: LcpK30 crystal shapes

Two structure models were solved from crystals grown in different conditions, assigned as the open and closed conformation because of their structural differences. The high resolution structure (open state) was obtained from LcpK30 crystals produced in 4% (w/v) polyethylene glycol 4000 and 0.2 M malate/imidazole pH 7.5, while the low resolution structure of LcpK30 (closed state) is derived from the protein crystallized in 16% (w/v) polyethylene glycol 3350, 0.1 M HEPES pH 8.5 and 0.2 M L-proline.


9.2 LcpK30 structure determination

Table 9. 1: Statistics of the data collection and structures refinement. The values in parentheses represent the highest resolution shell.

                              Structure 1 - 1.48 Å           Structure 2 - 2.2 Å
PDB ID                        5O1L                           5O1M
Resolution (Å)                60.37 - 1.48 (1.51 - 1.48)     39.22 - 2.20 (2.27 - 2.20)
Cell constants a, b, c (Å)    56.7, 62.8, 64.4               54.9, 56.6, 63.9
α, β, γ (°)                   85.4, 66.1, 74.2               74.2, 86.1, 71.0
Space group                   P 1                            P 1
Reflections (unique)          117,382 (5,807)                30,914 (2,566)
Multiplicity                  3.6 (3.7)                      3.7 (3.7)
Completeness (%)              90.0 (89.2)                    87.2 (83.7)
Mean (I) / sd (I)             8.1 (2.1)                      12.3 (3.8)
CC (½)                        0.994 (0.730)                  0.997 (0.932)
R merge                       0.086 (0.596)                  0.065 (0.299)
R pim                         0.053 (0.358)                  0.039 (0.180)

Refinement
Rcryst                        0.1649                         0.1746
Rfree                         0.1867                         0.2253
R.m.s.d. bond lengths (Å)     0.0202                         0.0184
R.m.s.d. bond angles (°)      2.0085                         1.9487
Non-hydrogen atoms            6,374                          5,845

Table 9. 2: Summary statistics on LcpK30 structures with 1.48 Å and 2.2 Å resolution (calculations provided by MolProbity)

Protein geometry              Structure 1 - 1.48 Å           Structure 2 - 2.2 Å
Poor rotamers                 4 (0.66%)                      22 (3.77%)
Favored rotamers              580 (95.87%)                   522 (89.38%)
Ramachandran outliers         1 (0.13%)                      1 (0.14%)
Ramachandran favored          727 (97.72%)                   711 (96.34%)
Cβ deviations > 0.25 Å        16 (2.28%)                     5 (0.74%)
Bad bonds                     60 / 6098 (0.98%)              12 / 5930 (0.2%)
Bad angles                    27 / 8366 (0.32%)              6 / 8114 (0.07%)
Cis prolines                  0 / 46 (0.00%)                 0 / 44 (0.00%)

Sequence homology between LcpK30 and other proteins with known structure could not be identified, so molecular replacement could not be used to solve the phase problem. However, the presence of the heme group inside the protein offered the opportunity to obtain the phase information from the anomalous scattering of the iron atom. Therefore, SAD experiments were carried out at beamline X06DA at the Swiss Light Source (Paul Scherrer Institute), equipped with a PILATUS 2M-F detector. Data sets were collected at the iron edge, at a peak absorption energy of 7130 eV (wavelength 1.73 Å) determined from a fluorescence scan. The diffraction images were collected from the crystal in rotation steps of 0.1°, with an exposure of 0.1 second per image and a beam transmission of 50%. The anomalous signal present in the diffraction data was maximized by collecting several data sets at different χ-angles (0°, 5° and 10°) of the PRIGo multi-axis goniometer [201].
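The photon energy and the wavelength quoted for the iron edge are related by λ = hc/E. The short sketch below illustrates the conversion; the function names are hypothetical and the constant hc ≈ 12398.42 eV·Å is the only input besides the energy stated in the text.

```python
# h*c expressed in eV * Angstrom, so that wavelength(Å) = HC / energy(eV)
HC_EV_ANGSTROM = 12398.42


def energy_to_wavelength(energy_ev: float) -> float:
    """Convert an X-ray photon energy in eV to a wavelength in Å."""
    return HC_EV_ANGSTROM / energy_ev


def wavelength_to_energy(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Å to a photon energy in eV."""
    return HC_EV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    peak_energy = 7130.0  # eV, peak energy from the fluorescence scan
    print(f"{peak_energy:.0f} eV corresponds to "
          f"{energy_to_wavelength(peak_energy):.3f} Å")
```

The result lies close to the wavelength quoted above; small differences in the last digit come from rounding.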

Figure 9. 2: PRIGo multi-axis goniometer. Picture from Waltersperger et al. [201].

The model derived from SAD experiments was used as a phase model for molecular replacement for other data sets. The best resolution of a data set obtained was 1.48 Å with an R factor of 16.5%. From another data set, a different conformation of the LcpK30 structure was identified. This structure was refined to a lower resolution, 2.2 Å, with an R value of 17.5%.

All data collection and refinement statistics for both LcpK30 structures are summarized in Table 9. 1 and Table 9. 2. The space group was determined to be triclinic P 1 with the asymmetric unit containing two monomers of the protein. The crystal solvent content was calculated to be 50% via the Matthews coefficient.
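The solvent content estimate can be retraced from the cell constants in Table 9. 1. The sketch below computes the triclinic cell volume, the Matthews coefficient and the solvent fraction using the conventional approximation 1 - 1.23/VM. It is an illustration under stated assumptions, not the calculation performed for this work: the molecular mass of 40 kDa per monomer is a hypothetical placeholder, and the function names are invented for this example.

```python
import math


def triclinic_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in Å^3 from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, mass_da, n_per_asu, n_sym_ops=1):
    """Matthews coefficient VM in Å^3/Da (n_sym_ops is 1 for space group P 1)."""
    return volume / (mass_da * n_per_asu * n_sym_ops)


def solvent_fraction(vm):
    """Approximate solvent fraction from VM (conventional constant 1.23)."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # Cell constants of structure 1 (Table 9. 1)
    volume = triclinic_cell_volume(56.7, 62.8, 64.4, 85.4, 66.1, 74.2)
    assumed_mass = 40_000  # Da per monomer, hypothetical value for illustration
    vm = matthews_coefficient(volume, assumed_mass, n_per_asu=2)
    print(f"V = {volume:.0f} Å^3, VM = {vm:.2f} Å^3/Da, "
          f"solvent = {100 * solvent_fraction(vm):.0f}%")
```

With two monomers per asymmetric unit, the assumed mass gives a solvent fraction near one half, in line with the value reported above; a different monomer mass shifts the result accordingly.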


9.3 LcpK30 structure

The molecular architecture of LcpK30 reveals a globular enzyme. The three-dimensional structure includes mostly α-helices (n = 15) with a proportion of 63% and with no β-sheets observed (Figure 9. 3). The overall structure of LcpK30 is secured by several salt bridges (Asp56-Arg195, Asp60-Arg202) and hydrogen bonds (His203-Glu68, Glu148-Thr230).

Figure 9. 3: Left and middle: Front and back view of LcpK30, depicted in Pymol as cartoon, rainbow coloured from blue at the N-terminus to red at the C-terminus. The heme groups with the proximal ligand, H198, are shown as sticks. All the helices are numbered from 1 to 15. Right: Molecular surface view of LcpK30 shows a globular protein. The heme group and the proximal ligand, H198, are represented as sticks.

LcpK30 exhibits close structural homology with proteins from the globin family [236]. The greatest structural similarity was found between the core of the latex clearing protein and the Geobacter sulfurreducens globin-coupled sensor protein (PDB-ID: 2W31, r.m.s.d. = 2.97 Å) (Figure 9.4, left). During evolution, LcpK30 acquired two additional domains, one at each terminus, located on opposite sides of the protein, while the globin core was preserved with a classical 3/3 fold (Figure 9.4, right). Helix 8 of LcpK30 could not be identified in other globular proteins; it was therefore proposed to be specific to latex clearing proteins and designated ‘Helix L’. The position of helix 8 completely closes the access to the heme group from the lateral side, creating a cage-like heme pocket.
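The r.m.s.d. value quoted for the structural comparison can be illustrated with a minimal sketch. It assumes that the two sets of equivalent atoms (for example Cα positions) have already been paired and superposed by an alignment program; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two paired coordinate lists.

    Each list holds (x, y, z) tuples for equivalent atoms; the structures are
    assumed to be superposed already, so no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Two toy atom pairs, each displaced by 3 Å along z (illustrative values only).
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} Å")
```

Because the value depends on which atoms are paired, r.m.s.d. figures are only comparable when the same set of equivalent positions is used.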

Figure 9.4: Left: Superposition of the LcpK30 central domain (orange) and the Geobacter sulfurreducens globin-coupled sensor protein (PDB-ID: 2W31, teal). Besides the 3/3 fold typical for globins, note that the heme binding pockets lie in the same region. Right: LcpK30 structure with the N-terminus in blue, the C-terminus in red and the core protein in orange. Pictures are represented as cartoons, with heme groups depicted as sticks.

Each monomer of LcpK30 contains one non-covalently bound b-type heme, with the iron atom axially coordinated on the proximal side by a histidine residue at position 198. At the distal position, we observed two different types of ligands, which we attribute to different states of the protein (Figure 9.5). In the high-resolution structure (1.48 Å), the distal axial ligand of the heme iron is a molecule of imidazole from the crystal growth condition; this state is designated the open conformation of LcpK30 (Figure 9.5, right). The imidazole ring substitutes for the natural substrate, in this case a dioxygen molecule. In the low-resolution structure (2.2 Å), however, the very same coordination site is unusually occupied by the lysine residue at position 167 of the protein chain, and this state is therefore designated the closed state of LcpK30 (Figure 9.5, left). Until now, two other cases have been reported in which a lysine residue acts as a ligand to the heme iron in a cytochrome b: the truncated haemoglobin THB1 from Chlamydomonas reinhardtii [237] and the putative globin-coupled sensor protein HGbRL from Methylacidiphilum infernorum [238]. For HGbRL, the authors also proposed two conformations of the protein: a closed state in which a lysine residue from the surrounding amino acids is bound to the heme iron, and an open state in which, as in the case of LcpK30, an imidazole molecule acts as a surrogate ligand at the distal binding site of the heme.

Figure 9.5: Different conformations of the overall LcpK30 structure (upper pictures) and close-up views of the active sites (lower pictures): closed state (left), where a lysine residue (K167) acts as the distal ligand to the heme iron, and open state (right), where the lysine residue is displaced, breaking the helix, while an imidazole molecule (IMD) occupies the distal position. The proximal ligand is a histidine residue (H198). A charged amino acid, glutamate E148, shifts position in the open state, moving closer to the putative substrate. Red arrows indicate the conformational changes that take place from the closed to the open state.

In the closed state of LcpK30, the unique hexa-coordination by both ligands restricts the access of the natural substrates to the heme iron: the oxygen molecule required for catalytic activity and the polyisoprene chain that must be cleaved at the double bond. Additionally, a loop adjacent to the heme group moves and obstructs the entry point of a tunnel passing through the heme pocket (Figure 9.6, left). The whole protein conformation becomes more rigid, but also more stable, especially in the C-terminal region and in the helix that contains Lys167 (Figure 9.5, left). The proximal ligand, His198, appears to be highly conserved in all Lcp sequence alignments, while the distal ligand, Lys167, is only partly conserved (59%), implying that another type of amino acid might occupy this critical position.

When a signal triggers the open conformation of the protein, the lysine residue (Lys167) within the 6th α-helix retreats while the neighbouring threonine (Thr168) shifts its position, causing a break in the secondary structure and a rearrangement of the α-helix. At the same time, a large motion of an adjacent, highly flexible loop located in the vicinity of the heme is initiated (Figure 9.5, right). This extensive reorganization opens the passage and widens the channel, which has an entry and an exit point, exposing the heme cofactor to the putative substrates (Figure 9.6, right). The cavity is lined exclusively with hydrophobic residues, with one exception: a glutamate residue at position 148, which might act as a base during catalysis (Figure 9.7). The hydrophobicity of the channel prevents the ligation of water molecules to the penta-coordinate heme iron.

Figure 9.6: Comparison of LcpK30 in the closed (left) and open state (right). The molecular surface of LcpK30 shows the presence of a channel passing through the active site with the heme pocket. The propionate groups of the heme group are directed towards the entry point of the cavity. In the closed state, the tunnel is obstructed, but not completely closed.

The active site of LcpK30 was proposed to comprise the heme group and several amino acids found in the heme binding pocket. Through mutagenesis studies, these were identified as the distal ligand Lys167, the neighbouring Thr168, Arg164 and Glu148. Glu148 might deprotonate the substrate by abstracting a proton from the double bond of the polyisoprene chain, allowing one of the carbon atoms to form a bond with one oxygen atom. The other oxygen atom then performs a nucleophilic attack on the adjacent carbon atom of the former double bond, and an unstable cyclic dioxetane intermediate is formed, which is further cleaved into final products with keto and aldehyde groups [239]. Snapshots of different conformations show a high flexibility of the protein (Figure 9.8), with significant changes in the molecular architecture, especially in the active site. The conformational switch could represent an internal protection mechanism against the generation of reactive oxygen species from the reduced heme iron when the natural substrate is not present.

Possibly, the affinity for the polyisoprene chain triggers the open state when the chain is inserted into the hydrophobic channel and, similar to monooxygenases [240], enables the displacement of the lysine and the binding of a dioxygen molecule to the heme iron.

Figure 9.7: Surface hydrophobicity profile of the channel in the open state of LcpK30. Red patches indicate hydrophobic residues. Glu148 might act as a base during substrate cleavage. The heme group, the imidazole molecule and E148 are represented as sticks.

The different conformations of the protein, with distal ligands of distinct nature, identify LcpK30 as a member of the hexa-coordinate haemoglobin subfamily. The binding of an exogenous ligand to such a globin is described as a multi-step process: first the endogenous distal ligand of the heme iron, usually an amino acid of the protein, is removed, resulting in a transient, reactive penta-coordinated species; the exogenous ligand then occupies the vacant sixth axial coordination site [241].

Figure 9.8: B-factor distribution of the closed (left) and open (right) state of LcpK30. In both structures, the C-terminus is a highly flexible region. Helix 6 and the adjacent loop show high flexibility, especially in the open structure. The pictures were made in PyMOL as putty models, coloured from low B-factors in blue to high B-factors in red.
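To relate the B-factors shown in the figure to atomic mobility, the isotropic relation B = 8π²⟨u²⟩ can be used, where ⟨u²⟩ is the mean-square displacement of an atom. The sketch below is a minimal illustration of this standard relation; the function name and the example B-factor are hypothetical and are not taken from the refined LcpK30 models.

```python
import math


def rms_displacement(b_factor):
    """Root-mean-square atomic displacement (Å) for an isotropic B-factor (Å^2).

    Uses B = 8 * pi^2 * <u^2>, so sqrt(<u^2>) = sqrt(B / (8 * pi^2)).
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    # Illustrative value only: a B-factor of 20 Å^2 corresponds to about 0.5 Å.
    print(f"B = 20 Å^2 -> displacement of {rms_displacement(20.0):.2f} Å")
```

High B-factors therefore indicate regions whose atoms are spread over a larger volume, either through genuine flexibility or through static disorder in the crystal.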

X-ray diffraction experiments showed that crystals grown in conditions containing imidazole or manganese chloride produced higher-resolution data sets with more reflections than crystals obtained from growth conditions lacking these components. A data set collected at the manganese edge (6551 eV) from a crystal grown in a condition containing both components was processed, and the resulting structure revealed the electron density of a manganese atom bound at the surface of the protein, in the N-terminal region (Figure 9.9, left). The octahedral coordination was achieved through three charged amino acids of the protein (an arginine, a glutamate and a histidine residue) and two exogenous ligands, a water and an imidazole molecule (Figure 9.9, right). The heavy metal atom seems to stabilize the flexible N-terminus and consequently to improve the crystal packing. No physiological role could be attributed to this interaction.

Figure 9.9: LcpK30 structure (rainbow colour, left picture) comprising one manganese atom (purple colour) in octahedral coordination (zoomed in, right picture).

9.4 EPR analysis

Continuous-wave X-band EPR spectra of LcpK30 as isolated revealed three distinct paramagnetic species (Figure 9.10, blue line). One protein population, with a rhombic signal containing resonances at gx = 1.39, gy = 2.29 and gz = 3.06, is typical for a low spin heme (Fe3+, S = 1/2) and represents the closed state of LcpK30 with the heme iron in octahedral coordination (Figure 9.11, middle). The signal at g = 6.16 originated from a second protein species in which the iron atom is in the high spin state (Fe3+, S = 5/2), as is commonly observed for five-coordinated heme (Figure 9.11, left). A minor signal was noticed at g = 4.29 and attributed to a six-coordinate high-spin state of the heme iron. EPR spectra were also recorded for the protein in the presence of imidazole, to mimic the conditions under which LcpK30 was crystallized in the open state (Figure 9.10, red line). The absence of the signal at g = 6.16 confirmed that imidazole is a suitable ligand, occupying the empty distal position of the iron atom in all protein molecules that previously had a five-coordinate heme. The low spin signal with shifted resonances at gx = 1.52, gy = 2.27 and gz = 2.97 implies that the heme iron exchanged its distal ligand from lysine to imidazole. Therefore, this sample contains a single species of the protein, in the open state with an imidazole-bound heme cofactor (Figure 9.11, right).
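The g-values reported above map onto magnetic field positions through the resonance condition hν = gμB·B. The following sketch illustrates this conversion; it assumes a microwave frequency of 9.4 GHz, a typical X-band value that is not stated in the text, and the function name is hypothetical.

```python
# Minimal sketch: resonance field for a given g-value, from h*nu = g*mu_B*B.
PLANCK = 6.62607015e-34           # Planck constant in J s
BOHR_MAGNETON = 9.2740100783e-24  # Bohr magneton in J/T

ASSUMED_FREQ_GHZ = 9.4  # assumption: typical X-band frequency, not from the text


def resonance_field_mt(g_value, freq_ghz=ASSUMED_FREQ_GHZ):
    """Return the resonance field in mT for a g-value at a microwave frequency in GHz."""
    if g_value <= 0 or freq_ghz <= 0:
        raise ValueError("g-value and frequency must be positive")
    field_tesla = PLANCK * freq_ghz * 1e9 / (g_value * BOHR_MAGNETON)
    return field_tesla * 1e3


if __name__ == "__main__":
    # g-values of LcpK30 as isolated, taken from the paragraph above.
    for g in (6.16, 4.29, 3.06, 2.29, 1.39):
        print(f"g = {g:.2f} -> {resonance_field_mt(g):.1f} mT")
```

Larger g-values appear at lower field, which is why the high spin signal near g = 6 lies far to the low-field side of the low spin resonances.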

Figure 9.10: EPR analysis of LcpK30. Continuous-wave EPR spectra were recorded for LcpK30 as isolated (blue line), LcpK30 with imidazole (red line) and the LcpK30 Lys167Ala variant (grey line).

EPR spectra were recorded in the absence of imidazole for a variant of LcpK30, Lys167Ala, in which the lysine residue at the distal axial site of the heme group is replaced by an alanine residue (Figure 9.10, grey line). A strong signal was registered at g = 5.83, with an additional axial contribution at g = 2.00. The strong peak was attributed to a five-coordinated high spin Fe3+ heme protein, the species that dominates in the sample (Figure 9.11, left). However, the protein sample was not homogeneous: a second state was indicated by weak low spin signals at gx = 1.60, gy = 2.33 and gz = 2.83, marking an octahedrally coordinated heme protein population. The sixth axial ligand in this case could be a water molecule that, despite the channel’s hydrophobicity, found its way inside and bound to the heme iron.

Figure 9.11: Different coordinations of the heme iron in LcpK30. Left: Five-coordinated heme iron. Middle: Octahedral coordination of the heme iron with an endogenous axial ligand. Right: Octahedral coordination of the heme iron with an exogenous axial ligand, an imidazole molecule.

10. Outlook

Cytochrome c synthesis complex protein CcsBA

The production and purification of several recombinant CcsBA/CcsA homologs have been well established. The natural CcsBA fusion protein from B. thetaiotaomicron was successfully crystallized, especially in conditions containing high concentrations of trisodium citrate. However, since the best diffraction data reached a resolution of only about 13 Å, further optimization of the crystals has to be carried out to determine the three-dimensional structure at a suitable resolution. First, the reproducibility of the crystals needs to be established. Different detergent or buffer systems did not improve the crystallization process. Approaches other than detergent-based protein solubilisation, such as lipidic bicelles [242] or nanodiscs [243], might enhance crystal contacts or the close packing of the crystal. As for the phase problem, the high structural homology between CcmF from T. thermophilus and BtCcsBA was discussed; the CcmF structure could therefore possibly be used as a search model for molecular replacement. E. coli is believed to have difficulty correctly folding large periplasmic domains of membrane proteins [118]. As the periplasmic domain of CcsB makes up almost two thirds of the protein, producing CcsBA in a bacterium that natively encodes System II proteins in its chromosome might help the complex protein to attain a better conformation and improve crystallization into well-ordered crystals. CcsBA, a complex protein with a molecular mass of around 100 kDa, is also a suitable candidate for cryo-electron microscopy, one of the major current techniques in structural biology, which has benefited from recent advances in cameras and software for resolving high-resolution structures.

Latex clearing protein

The two conformations of LcpK30 offer insights into the rubber cleavage mechanism, but only a structure in complex with the natural substrates could confirm and validate the steps of the catalytic process. However, co-crystallization trials and crystal soaking experiments with the enzyme and the rubber polymer were unsuccessful. The substrate used for the experiments was either a pure latex emulsion, with the polymer particles dissolved in different solvents at different concentrations, or the cleaved end products derived from LcpK30 activity, solubilized in ethyl acetate or 100% (v/v) methanol. The LcpK30 crystals were soaked for different periods of time. Other strategies for co-crystallizing LcpK30 with the substrate included the addition of inhibitors (α-tocopherol) or reducing agents (TCEP). The hydrophobicity of the biopolymer complicated these processes, and the majority of the crystals obtained were of poor quality, while for those with sufficient diffraction resolution the structures did not reveal any new observations. New strategies have to be implemented for binding oxygen to the reduced heme and subsequently trapping the polymer chain in the channel of the protein. Crystal structures of further Lcp homologs may also provide new insights into the cleavage process.

11. References

1. Kranz, R., et al., Molecular mechanisms of cytochrome c biogenesis: three distinct systems. Mol Microbiol, 1998. 29(2): p. 383-96.
2. Keilin, D., On cytochrome, a respiratory pigment, common to animals, yeast, and higher plants. Proceedings of the Royal Society of London. Series B, Containing Papers of a Biological Character, 1925. 98(690): p. 312-339.
3. Reedy, C.J. and B.R. Gibney, Heme protein assemblies. Chem Rev, 2004. 104(2): p. 617-49.
4. Jiang, X.J. and X.D. Wang, Cytochrome C-mediated apoptosis. Annual Review of Biochemistry, 2004. 73: p. 87-106.
5. Bertini, I., G. Cavallaro, and A. Rosato, Cytochrome c: Occurrence and functions. Chemical Reviews, 2006. 106(1): p. 90-115.
6. Barker, P.D. and S.J. Ferguson, Still a puzzle: why is haem covalently attached in c-type cytochromes? Structure, 1999. 7(12): p. R281-90.
7. Allen, J.W.A., N. Leach, and S.J. Ferguson, The histidine of the c-type cytochrome CXXCH haem-binding motif essential for haem attachment by the Escherichia coli cytochrome maturation (Ccm) apparatus. Biochemical Journal, 2005. 389: p. 587-592.
8. Ferguson, S.J., et al., Cytochrome c assembly: A tale of ever increasing variation and mystery? Biochimica Et Biophysica Acta-Bioenergetics, 2008. 1777(7-8): p. 980-984.
9. Hartshorne, R.S., et al., A dedicated haem lyase is required for the maturation of a novel bacterial cytochrome c with unconventional covalent haem binding. Molecular Microbiology, 2007. 64(4): p. 1049-1060.
10. Liu, X.S., et al., Induction of apoptotic program in cell-free extracts: Requirement for dATP and cytochrome c. Cell, 1996. 86(1): p. 147-157.
11. Ambler, R.P., Sequence Variability in Bacterial Cytochromes-C. Biochimica Et Biophysica Acta, 1991. 1058(1): p. 42-47.
12. Letoffe, S., P. Delepelaire, and C. Wandersman, The housekeeping dipeptide permease is the Escherichia coli heme transporter and functions with two optional peptide binding proteins. Proceedings of the National Academy of Sciences of the United States of America, 2006. 103(34): p. 12891-12896.
13. Woo, J.S., et al., X-ray structure of the Yersinia pestis heme transporter HmuUV. Nature Structural & Molecular Biology, 2012. 19(12): p. 1310-+.
14. Hamza, I. and H.A. Dailey, One ring to rule them all: Trafficking of heme and heme synthesis intermediates in the metazoans. Biochimica Et Biophysica Acta-Molecular Cell Research, 2012. 1823(9): p. 1617-1632.
15. Franken, A.C., et al., Heme biosynthesis and its regulation: towards understanding and improvement of heme biosynthesis in filamentous fungi. Appl Microbiol Biotechnol, 2011. 91(3): p. 447-60.
16. Koreny, L., J. Lukes, and M. Obornik, Evolution of the haem synthetic pathway in kinetoplastid flagellates: An essential pathway that is not essential after all? International Journal for Parasitology, 2010. 40(2): p. 149-156.
17. Tripodi, K.E., S.M. Menendez Bravo, and J.A. Cricco, Role of heme and heme-proteins in trypanosomatid essential metabolic pathways. Enzyme Res, 2011. 2011: p. 873230.
18. Koreny, L., M. Obornik, and J. Lukes, Make It, Take It, or Leave It: Heme Metabolism of Parasites. Plos Pathogens, 2013. 9(1).
19. Embley, T.M., Multiple secondary origins of the anaerobic lifestyle in eukaryotes. Philos Trans R Soc Lond B Biol Sci, 2006. 361(1470): p. 1055-67.
20. Anzaldi, L.L. and E.P. Skaar, Overcoming the heme paradox: heme toxicity and tolerance in bacterial pathogens. Infect Immun, 2010. 78(12): p. 4977-89.
21. Layer, G., et al., Structure and function of enzymes in heme biosynthesis. Protein Science, 2010. 19(6): p. 1137-1161.

22. Heinemann, I.U., M. Jahn, and D. Jahn, The biochemistry of heme biosynthesis. Archives of Biochemistry and Biophysics, 2008. 474(2): p. 238-251.
23. Dailey, H.A., et al., Prokaryotic Heme Biosynthesis: Multiple Pathways to a Common Essential Product. Microbiology and Molecular Biology Reviews, 2017. 81(1).
24. Hansson, M. and L. Hederstedt, Purification and Characterization of a Water-Soluble Ferrochelatase from Bacillus-Subtilis. European Journal of Biochemistry, 1994. 220(1): p. 201-208.
25. Harbin, B.M. and H.A. Dailey, Orientation of Ferrochelatase in Bovine Liver-Mitochondria. Biochemistry, 1985. 24(2): p. 366-370.
26. Obornik, M. and B.R. Green, Mosaic origin of the heme biosynthesis pathway in photosynthetic eukaryotes. Molecular Biology and Evolution, 2005. 22(12): p. 2343-2353.
27. Tanaka, R. and A. Tanaka, Tetrapyrrole biosynthesis in higher plants. Annual Review of Plant Biology, 2007. 58: p. 321-346.
28. Giege, P., J.M. Grienenberger, and G. Bonnard, Cytochrome c biogenesis in mitochondria. Mitochondrion, 2008. 8(1): p. 61-73.
29. Suzuki, T., et al., Two types of Ferrochelatase in photosynthetic and nonphotosynthetic tissues of cucumber - Their difference in phylogeny, gene expression, and localization. Journal of Biological Chemistry, 2002. 277(7): p. 4731-4737.
30. Scharfenberg, M., et al., Functional characterization of the two ferrochelatases in Arabidopsis thaliana. Plant Cell and Environment, 2015. 38(2): p. 280-298.
31. van Lis, R., et al., Subcellular localization and light-regulated expression of protoporphyrinogen IX oxidase and ferrochelatase in Chlamydomonas reinhardtii. Plant Physiology, 2005. 139(4): p. 1946-1958.
32. Tanaka, R., K. Kobayashi, and T. Masuda, Tetrapyrrole Metabolism in Arabidopsis thaliana. Arabidopsis Book, 2011. 9: p. e0145.
33. Travaglini-Allocatelli, C., Protein Machineries Involved in the Attachment of Heme to Cytochrome c: Protein Structures and Molecular Mechanisms. Scientifica (Cairo), 2013. 2013: p. 505714.
34. Denks, K., et al., The Sec translocon mediated protein transport in prokaryotes and eukaryotes. Mol Membr Biol, 2014. 31(2-3): p. 58-84.
35. Sanders, C., et al., Cytochrome c biogenesis: the Ccm system. Trends Microbiol, 2010. 18(6): p. 266-74.
36. ThonyMeyer, L. and P. Kunzler, Translocation to the periplasm and signal sequence cleavage of preapocytochrome c depend on sec and lep, but not on the ccm gene products. European Journal of Biochemistry, 1997. 246(3): p. 794-799.
37. Katzen, F., et al., Evolutionary domain fusion expanded the substrate specificity of the transmembrane electron transporter DsbD. EMBO J, 2002. 21(15): p. 3960-9.
38. Arnold, I., et al., Two distinct and independent mitochondrial targeting signals function in the sorting of an inner membrane protein, cytochrome c(1). Journal of Biological Chemistry, 1998. 273(3): p. 1469-1476.
39. Nicholson, D.W., R.A. Stuart, and W. Neupert, Biogenesis of Cytochrome-C1 - Role of Cytochrome-C1 Heme Lyase and of the 2 Proteolytic Processing Steps during Import into Mitochondria. Journal of Biological Chemistry, 1989. 264(17): p. 10156-10168.
40. Diekert, K., et al., Apocytochrome c requires the TOM complex for translocation across the mitochondrial outer membrane. Embo Journal, 2001. 20(20): p. 5626-5635.
41. Wiedemann, N., et al., Biogenesis of yeast mitochondrial cytochrome c: A unique relationship to the TOM machinery. Journal of Molecular Biology, 2003. 327(2): p. 465-474.
42. Nargang, F.E., et al., A mutant of Neurospora crassa deficient in cytochrome c heme lyase activity cannot import cytochrome c into mitochondria. J Biol Chem, 1988. 263(19): p. 9388-94.
43. Rurek, M., Proteins involved in maturation pathways of plant mitochondrial and plastid c-type cytochromes. Acta Biochim Pol, 2008. 55(3): p. 417-33.
44. Nakamoto, S.S., P. Hamel, and S. Merchant, Assembly of chloroplast cytochromes b and c. Biochimie, 2000. 82(6-7): p. 603-614.

45. Bowman, S.E. and K.L. Bren, The chemistry and biochemistry of heme c: functional bases for covalent attachment. Nat Prod Rep, 2008. 25(6): p. 1118-30.
46. Daltrop, O., et al., In vitro formation of a c-type cytochrome. Proceedings of the National Academy of Sciences of the United States of America, 2002. 99(12): p. 7872-7876.
47. Daltrop, O. and S.J. Ferguson, Cytochrome c maturation - The in vitro reactions of horse apocytochrome c and Paracoccus denitrificans apocytochrome c(550) with heme. Journal of Biological Chemistry, 2003. 278(7): p. 4404-4409.
48. Richard-Fogal, C.L., et al., Heme concentration dependence and metalloporphyrin inhibition of the system I and II cytochrome c assembly pathways. Journal of Bacteriology, 2007. 189(2): p. 455-463.
49. Mavridou, D.A., S.J. Ferguson, and J.M. Stevens, Cytochrome c assembly. IUBMB Life, 2013. 65(3): p. 209-16.
50. Stroebel, D., et al., An atypical haem in the cytochrome b(6)f complex. Nature, 2003. 426(6965): p. 413-418.
51. de Vitry, C., Cytochrome c maturation system on the negative side of bioenergetic membranes: CCB or System IV. The FEBS journal, 2011. 278(22): p. 4189-97.
52. Alric, J., et al., Spectral and redox characterization of the heme ci of the cytochrome b6f complex. Proc Natl Acad Sci U S A, 2005. 102(44): p. 15860-5.
53. Allen, J.W., Cytochrome c biogenesis in mitochondria--Systems III and V. The FEBS journal, 2011. 278(22): p. 4198-216.
54. Allen, J.W.A., et al., Order within a mosaic distribution of mitochondrial c-type cytochrome biogenesis systems? The FEBS Journal, 2008. 275(10): p. 2385-2402.
55. Babbitt, S.E., et al., Conserved Residues of the Human Mitochondrial Holocytochrome c Synthase Mediate Interactions with Heme. Biochemistry, 2014. 53(32): p. 5261-5271.
56. Bernard, D.G., et al., Cyc2p, a membrane-bound flavoprotein involved in the maturation of mitochondrial c-type cytochromes. Journal of Biological Chemistry, 2005. 280(48): p. 39852-39859.
57. Hamel, P., et al., Biochemical requirements for the maturation of mitochondrial c-type cytochromes. Biochim Biophys Acta, 2009. 1793(1): p. 125-38.
58. Dumont, M.E., et al., Identification and Sequence of the Gene Encoding Cytochrome-C Heme Lyase in the Yeast Saccharomyces-Cerevisiae. Embo Journal, 1987. 6(1): p. 235-241.
59. Zollner, A., G. Rodel, and A. Haid, Molecular-Cloning and Characterization of the Saccharomyces-Cerevisiae Cyt2-Gene Encoding Cytochrome-C1 -Heme Lyase. European Journal of Biochemistry, 1992. 207(3): p. 1093-1100.
60. Bernard, D.G., et al., Overlapping specificities of the mitochondrial cytochrome c and c1 heme lyases. Journal of Biological Chemistry, 2003. 278(50): p. 49732-49742.
61. Richard-Fogal, C.L., et al., Thiol redox requirements and substrate specificities of recombinant cytochrome c assembly systems II and III. Biochimica Et Biophysica Acta-Bioenergetics, 2012. 1817(6): p. 911-919.
62. Stevens, J.M., et al., The mitochondrial cytochrome c N-terminal region is critical for maturation by holocytochrome c synthase. Febs Letters, 2011. 585(12): p. 1891-1896.
63. Kleingardner, J.G. and K.L. Bren, Comparing substrate specificity between cytochrome c maturation and cytochrome c heme lyase systems for cytochrome c biogenesis. Metallomics, 2011. 3(4): p. 396-403.
64. Verissimo, A.F., et al., Engineering a prokaryotic apocytochrome c as an efficient substrate for Saccharomyces cerevisiae cytochrome c heme lyase. Biochemical and Biophysical Research Communications, 2012. 424(1): p. 130-135.
65. San Francisco, B., E.C. Bretsnyder, and R.G. Kranz, Human mitochondrial holocytochrome c synthase's heme binding, maturation determinants, and complex formation with cytochrome c. Proc Natl Acad Sci U S A, 2013. 110(9): p. E788-97.
66. Babbitt, S.E., et al., Mitochondrial cytochrome c biogenesis: no longer an enigma. Trends Biochem Sci, 2015. 40(8): p. 446-55.
67. Thony-Meyer, L., et al., Escherichia coli genes required for cytochrome c maturation. J Bacteriol, 1995. 177(15): p. 4321-6.

68. Verissimo, A.F. and F. Daldal, Cytochrome c biogenesis System I: An intricate process catalyzed by a maturase supercomplex? Biochimica Et Biophysica Acta-Bioenergetics, 2014. 1837(7): p. 989-998.
69. Goldman, B.S. and R.G. Kranz, ABC transporters associated with cytochrome c biogenesis. Research in Microbiology, 2001. 152(3-4): p. 323-329.
70. Stevens, J.M., et al., Cytochrome c biogenesis System I. FEBS J, 2011. 278(22): p. 4170-8.
71. Cook, G.M. and R.K. Poole, Oxidase and periplasmic cytochrome assembly in Escherichia coli K-12: CydDC and CcmAB are not required for haem-membrane association. Microbiology-Sgm, 2000. 146: p. 527-536.
72. Feissner, R.E., et al., ABC transporter-mediated release of a haem chaperone allows cytochrome c biogenesis. Mol Microbiol, 2006. 61(1): p. 219-31.
73. Christensen, O., et al., Loss of ATP hydrolysis activity by CcmAB results in loss of c-type cytochrome synthesis and incomplete processing of CcmE. Febs Journal, 2007. 274(9): p. 2322-2332.
74. Lee, J.H., et al., Evolutionary origins of members of a superfamily of integral membrane cytochrome c biogenesis proteins. Biochimica Et Biophysica Acta, 2007. 1768(9): p. 2164-81.
75. Richard-Fogal, C. and R.G. Kranz, The CcmC:Heme:CcmE Complex in Heme Trafficking and Cytochrome c Biosynthesis. Journal of Molecular Biology, 2010. 401(3): p. 350-362.
76. Ren, Q. and L. Thony-Meyer, Physical interaction of CcmC with heme and the heme chaperone CcmE during cytochrome c maturation. Journal of Biological Chemistry, 2001. 276(35): p. 32591-32596.
77. Richard-Fogal, C.L., E.R. Frawley, and R.G. Kranz, Topology and function of CcmD in cytochrome c maturation. Journal of Bacteriology, 2008. 190(10): p. 3489-3493.
78. Enggist, E., et al., NMR structure of the heme chaperone CcmE reveals a novel functional motif. Structure, 2002. 10(11): p. 1551-1557.
79. Schulz, H., H. Hennecke, and L. Thony-Meyer, Prototype of a heme chaperone essential for cytochrome c maturation. Science, 1998. 281(5380): p. 1197-1200.
80. Thony-Meyer, L., A heme chaperone for cytochrome c biosynthesis. Biochemistry, 2003. 42(45): p. 13099-13105.
81. Kranz, R.G., et al., Cytochrome c biogenesis: mechanisms for covalent modifications and trafficking of heme and for heme-iron redox control. Microbiol Mol Biol Rev, 2009. 73(3): p. 510-28, Table of Contents.
82. Stevens, J.M., et al., Interaction of heme with variants of the heme chaperone CcmE carrying active site mutations and a cleavable N-terminal His tag. Journal of Biological Chemistry, 2003. 278(23): p. 20500-20506.
83. Aramini, J.M., et al., Solution NMR Structure, Backbone Dynamics, and Heme-Binding Properties of a Novel Cytochrome c Maturation Protein CcmE from Desulfovibrio vulgaris. Biochemistry, 2012. 51(18): p. 3705-3707.
84. Goddard, A.D., et al., c-Type Cytochrome Biogenesis Can Occur via a Natural Ccm System Lacking CcmH, CcmG, and the Heme-binding Histidine of CcmE. Journal of Biological Chemistry, 2010. 285(30): p. 22880-22887.
85. Daltrop, O., et al., The CcmE protein of the c-type cytochrome biogenesis system: Unusual in vitro heme incorporation into apo-CcmE and transfer from holo-CcmE to apocytochrome. Proceedings of the National Academy of Sciences of the United States of America, 2002. 99(15): p. 9703-9708.
86. Mavridou, D.A.I., et al., A Pivotal Heme-transfer Reaction Intermediate in Cytochrome c Biogenesis. Journal of Biological Chemistry, 2012. 287(4): p. 2342-2352.
87. Feissner, R.E., et al., Recombinant cytochromes c biogenesis systems I and II and analysis of haem delivery pathways in Escherichia coli. Mol Microbiol, 2006. 60(3): p. 563-77.
88. Sanders, C., et al., The Cytochrome c Maturation Components CcmF, CcmH, and CcmI Form a Membrane-integral Multisubunit Heme Ligation Complex. Journal of Biological Chemistry, 2008. 283(44): p. 29715-29722.

89. San Francisco, B., M.C. Sutherland, and R.G. Kranz, The CcmFH complex is the system I holocytochrome c synthetase: engineering cytochrome c maturation independent of CcmABCDE. Molecular Microbiology, 2014. 91(5): p. 996-1008. 90. San Francisco, B. and R.G. Kranz, Interaction of HoloCcmE with CcmF in Heme Trafficking and Cytochrome c Biosynthesis. Journal of Molecular Biology, 2014. 426(3): p. 570-585. 91. Ren, Q., U. Ahuja, and L. Thony-Meyer, A bacterial cytochrome c heme lyase - CcmF forms a complex with the heme chaperone CcmE and CcmH but not with apocytochrome c. Journal of Biological Chemistry, 2002. 277(10): p. 7657-7663. 92. Francisco, B.S., et al., Heme Ligand Identification and Redox Properties of the Cytochrome c Synthetase, CcmF. Biochemistry, 2011. 50(50): p. 10974-10985. 93. Sutherland, M.C., J.A. Rankin, and R.G. Kranz, Heme Trafficking and Modifications during System I Cytochrome c Biogenesis: Insights from Heme Redox Potentials of Ccm Proteins. Biochemistry, 2016. 55(22): p. 3150-3156. 94. Richard-Fogal, C.L., et al., A conserved haem redox and trafficking pathway for cofactor attachment. Embo Journal, 2009. 28(16): p. 2349-2359. 95. Mavridou, D.A.I., et al., Probing Heme Delivery Processes in Cytochrome c Biogenesis System I. Biochemistry, 2013. 52(41): p. 7262-7270. 96. Brausemann, A., Crystal structure of CcmF : the cytochrome c maturation heme lyase. 2017. 97. Edeling, M.A., et al., Structure of CcmG/DsbE at 1.14 angstrom resolution: High-fidelity reducing activity in an indiscriminately oxidizing environment. Structure, 2002. 10(7): p. 973- 979. 98. Di Matteo, A., et al., Structural and functional characterization of CcmG from Pseudomonas aeruginosa, a key component of the bacterial cytochrome c maturation apparatus. Proteins- Structure Function and Bioinformatics, 2010. 78(10): p. 2213-2221. 99. Bertini, I., G. Cavallaro, and A. Rosato, Evolution of mitochondrial-type cytochrome c domains and of the protein machinery for their assembly. 
Journal of Inorganic Biochemistry, 2007. 101(11-12): p. 1798-1811. 100. Zheng, X.M., et al., Biochemical properties and catalytic domain structure of the CcmH protein from Escherichia coli. Biochimica Et Biophysica Acta-Proteins and Proteomics, 2012. 1824(12): p. 1394-1400. 101. Fabianek, R.A., T. Hofer, and L. Thony-Meyer, Characterization of the Escherichia coli CcmH protein reveals new insights into the redox pathway required for cytochrome c maturation. Archives of Microbiology, 1999. 171(2): p. 92-100. 102. Allen, J.W.A., et al., A variant System I for cytochrome c biogenesis in archaea and some bacteria has a novel CcmE and no CcmH. Febs Letters, 2006. 580(20): p. 4827-4834. 103. Sanders, C., et al., Overproduction of CcmG and CcmFH(Rc) fully suppresses the c-type cytochrome biogenesis defect of Rhodobacter capsulatus CcmI-Null mutants. Journal of Bacteriology, 2005. 187(12): p. 4245-4256. 104. Di Silvio, E., et al., Recognition and binding of apocytochrome c to P. aeruginosa CcmI, a component of cytochrome c maturation machinery. Biochimica Et Biophysica Acta-Proteins and Proteomics, 2013. 1834(8): p. 1554-1561. 105. Verissimo, A.F., et al., CcmI subunit of CcmFHI heme ligation complex functions as an apocytochrome c chaperone during c-type cytochrome maturation. J Biol Chem, 2011. 286(47): p. 40452-63. 106. Mavridou, D.A.I., et al., Oxidation State-dependent Protein-Protein Interactions in Disulfide Cascades. Journal of Biological Chemistry, 2011. 286(28): p. 24943-24956. 107. Zhou, Y.P. and J.H. Bushweller, Solution structure and elevator mechanism of the membrane electron transporter CcdA. Nature Structural & Molecular Biology, 2018. 25(2): p. 163-+. 108. Unseld, M., et al., The mitochondrial genome of Arabidopsis thaliana contains 57 genes in 366,924 nucleotides. Nature Genetics, 1997. 15(1): p. 57-61. 109. Meyer, Y., J.P. Reichheld, and F. Vignols, Thioredoxins in Arabidopsis and other plants. Photosynthesis Research, 2005. 86(3): p. 419-433. 110. 
Rayapuram, N., et al., The three mitochondrial encoded CcmF proteins form a complex that interacts with CCMH and c-type apocytochromes in Arabidopsis. Journal of Biological Chemistry, 2008. 283(37): p. 25200-25208.

119

111. Feissner, R.E., et al., Mutations in cytochrome assembly and periplasmic redox pathways in Bordetella pertussis. J Bacteriol, 2005. 187(12): p. 3941-9. 112. Frawley, E.R. and R.G. Kranz, CcsBA is a cytochrome c synthetase that also functions in heme transport. Proceedings of the National Academy of Sciences of the United States of America, 2009. 106(25): p. 10201-6. 113. Xie, Z., et al., Genetic analysis of chloroplast c-type cytochrome assembly in Chlamydomonas reinhardtii: One chloroplast locus and at least four nuclear loci are required for heme attachment. Genetics, 1998. 148(2): p. 681-92. 114. Kern, M., et al., Essential histidine pairs indicate conserved haem binding in epsilonproteobacterial cytochrome c haem lyases. Microbiology, 2010. 156(Pt 12): p. 3773- 81. 115. Simon, J. and L. Hederstedt, Composition and function of cytochrome c biogenesis System II. FEBS J, 2011. 278(22): p. 4179-88. 116. Barr, I. and F. Guo, Pyridine Hemochromagen Assay for Determining the Concentration of Heme in Purified Protein Solutions. Bio Protoc, 2015. 5(18). 117. Dreyfuss, B.W., et al., Functional analysis of a divergent system II protein, Ccs1, involved in c-type cytochrome biogenesis. J Biol Chem, 2003. 278(4): p. 2604-13. 118. Ahuja, U., et al., Haem-delivery proteins in cytochrome c maturation System II. Molecular Microbiology, 2009. 73(6): p. 1058-71. 119. Goddard, A.D., et al., Comparing the substrate specificities of cytochrome c biogenesis Systems I and II: bioenergetics. The FEBS journal, 2010. 277(3): p. 726-37. 120. Kern, M., et al., Substrate specificity of three cytochrome c haem lyase isoenzymes from Wolinella succinogenes: unconventional haem c binding motifs are not sufficient for haem c attachment by NrfI and CcsA1. Mol Microbiol, 2010. 75(1): p. 122-37. 121. Schiott, T., C. vonWachenfeldt, and L. Hederstedt, Identification and characterization of the ccdA gene, required for cytochrome c synthesis in Bacillus subtilis. Journal of Bacteriology, 1997. 
179(6): p. 1962-1973. 122. Crow, A., et al., Structural basis of redox-coupled protein substrate selection by the cytochrome c biosynthesis protein ResA. Journal of Biological Chemistry, 2004. 279(22): p. 23654-23660. 123. Colbert, C.L., et al., Mechanism of substrate specificity in Bacillus subtilis ResA, a thioredoxin-like protein involved in cytochrome c maturation. Proceedings of the National Academy of Sciences of the United States of America, 2006. 103(12): p. 4410-4415. 124. Schiott, T., M. ThroneHolst, and L. Hederstedt, Bacillus subtilis CcdA-defective mutants are blocked in a late step of cytochrome c biogenesis. Journal of Bacteriology, 1997. 179(14): p. 4523-4529. 125. Erlendsson, L.S., et al., Bacillus subtilis ResA is a thiol-disulfide oxidoreductase involved in cytochrome c synthesis. Journal of Biological Chemistry, 2003. 278(20): p. 17852-17858. 126. Beckett, C.S., et al., Four genes are required for the system II cytochrome c biogenesis pathway in Bordetella pertussis, a unique bacterial model. Molecular Microbiology, 2000. 38(3): p. 465-481. 127. Liu, Y.W. and D.J. Kelly, Cytochrome c biogenesis in Campylobacter jejuni requires cytochrome c6 (CccA; Cj1153) to maintain apocytochrome cysteine thiols in a reduced state for haem attachment. Mol Microbiol, 2015. 96(6): p. 1298-317. 128. Gray, J.C., Cytochrome f: Structure, function and biosynthesis. Photosynth Res, 1992. 34(3): p. 359-74. 129. Quinn, J.M., et al., Coordinate copper- and oxygen-responsive Cyc6 and Cpx1 expression in Chlamydomonas is mediated by the same element. Journal of Biological Chemistry, 2000. 275(9): p. 6080-6089. 130. Howe, C.J., et al., The novel cytochrome c(6) of chloroplasts: a case of evolutionary bricolage? Journal of Experimental Botany, 2006. 57(1): p. 13-22. 131. Hamel, P.P., et al., Essential histidine and tryptophan residues in CcsA, a system II polytopic cytochrome c biogenesis protein. J Biol Chem, 2003. 278(4): p. 2593-603. 132. 
Gabilly, S.T., et al., CCS5, a Thioredoxin-like Protein Involved in the Assembly of Plastid c- Type Cytochromes. Journal of Biological Chemistry, 2010. 285(39): p. 29738-29749.

120

133. Motohashi, K. and T. Hisabori, CcdA Is a Thylakoid Membrane Protein Required for the Transfer of Reducing Equivalents from Stroma to Thylakoid Lumen in the Higher Plant Chloroplast. Antioxidants & Redox Signaling, 2010. 13(8): p. 1169-1176. 134. Motohashi, K., M. Yoshida, and T. Hisabori, HCF164 receives reducing equivalents from stromal thioredoxin across the thylakoid membrane and mediates reduction of target proteins in the thylakoid lumen. Plant and Cell Physiology, 2007. 48: p. S236-S236. 135. Gabilly, S.T., et al., A Novel Component of the Disulfide-Reducing Pathway Required for Cytochrome c Assembly in Plastids. Genetics, 2011. 187(3): p. 793-802. 136. Lennartz, K., et al., HCF153, a novel nuclear-encoded factor necessary during a post- translational step in biogenesis of the cytochrome b(6)f complex. Plant Journal, 2006. 45(1): p. 101-112. 137. Deckert, G., et al., The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature, 1998. 392(6674): p. 353-8. 138. Eppinger, M., et al., Genome sequences of the biotechnologically important Bacillus megaterium strains QM B1551 and DSM319. J Bacteriol, 2011. 193(16): p. 4199-213. 139. Le Brun, N.E., J. Bengtsson, and L. Hederstedt, Genes required for cytochrome c synthesis inBacillus subtilis. Molecular Microbiology, 2000. 36(3): p. 638-650. 140. O'Toole, G.A., Classic Spotlight: Bacteroides thetaiotaomicron, Starch Utilization, and the Birth of the Microbiome Era. J Bacteriol, 2016. 198(20): p. 2763. 141. Xu, J., et al., A genomic view of the human-Bacteroides thetaiotaomicron symbiosis. Science, 2003. 299(5615): p. 2074-6. 142. Wang, J., et al., Characterization of a Bacteroides mobilizable transposon, NBU2, which carries a functional lincomycin resistance gene. Journal of Bacteriology, 2000. 182(12): p. 3559-3571. 143. Bond, D.R. and D.R. Lovley, Electricity production by Geobacter sulfurreducens attached to electrodes. Appl Environ Microbiol, 2003. 69(3): p. 1548-55. 144. Lin, W.C., M.V. 
Coppi, and D.R. Lovley, Geobacter sulfurreducens can grow with oxygen as a terminal electron acceptor. Applied and Environmental Microbiology, 2004. 70(4): p. 2525-2528. 145. Methe, B.A., et al., Genome of Geobacter sulfurreducens: metal reduction in subsurface environments. Science, 2003. 302(5652): p. 1967-9. 146. Falsafi, T. and M. Mahboubi, Helicobacter hepaticus, a new pathogenic species of the Helicobacter genus: Similarities and differences with H. pylori. Iran J Microbiol, 2013. 5(3): p. 185-94. 147. Suerbaum, S., et al., The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus. Proceedings of the National Academy of Sciences of the United States of America, 2003. 100(13): p. 7901-7906. 148. Kawasumi, T., et al., Hydrogenobacter-Thermophilus Gen-Nov, Sp-Nov, an Extremely Thermophilic, Aerobic, Hydrogen-Oxidizing Bacterium. International Journal of Systematic Bacteriology, 1984. 34(1): p. 5-10. 149. Zeytun, A., et al., Complete genome sequence of Hydrogenobacter thermophilus type strain (TK-6). Stand Genomic Sci, 2011. 4(2): p. 131-43. 150. Young, M., et al., Genome sequence of the Fleming strain of Micrococcus luteus, a simple free-living actinobacterium. J Bacteriol, 2010. 192(3): p. 841-60. 151. Ohno, M., et al., Symbiobacterium thermophilum gen. nov., sp. nov., a symbiotic thermophile that depends on co-culture with a Bacillus strain for growth. Int J Syst Evol Microbiol, 2000. 50 Pt 5: p. 1829-32. 152. Ueda, K., et al., Genome sequence of Symbiobacterium thermophilum, an uncultivable bacterium that depends on microbial commensalism. Nucleic Acids Research, 2004. 32(16): p. 4937-4944. 153. Baert, B., et al., Multiple phenotypic alterations caused by a c-type cytochrome maturation ccmC gene mutation in Pseudomonas aeruginosa. Microbiology-Sgm, 2008. 154: p. 127-138. 154. Wilson, A., Bacterial c-type cytochromes and pathogenicity European Journal Of BioMedical Research, 2015. 1(1): p. 17-21.

121

155. Prakash, S.K., et al., Loss of holocytochrome c-type synthetase causes the male lethality of X- linked dominant micro-phthalmia with linear skin defects (MLS) syndrome. Human Molecular Genetics, 2002. 11(25): p. 3237-3248. 156. Kiryu-Seo, S., et al., Unique anti-apoptotic activity of EAAC1 in injured motor neurons. EMBO J, 2006. 25(14): p. 3411-21. 157. Sassetti, C.M., D.H. Boyd, and E.J. Rubin, Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol, 2003. 48(1): p. 77-84. 158. Wilson, A.C., J.A. Hoch, and M. Perego, Two small c-type cytochromes affect virulence gene expression in Bacillus anthracis. Molecular Microbiology, 2009. 72(1): p. 109-123. 159. Han, H., T. Sullivan, and A.C. Wilson, Cytochrome c551 and the cytochrome c maturation pathway affect virulence gene expression in Bacillus cereus ATCC 14579. J Bacteriol, 2015. 197(3): p. 626-35. 160. Viswanathan, V.K., et al., The cytochrome c maturation locus of Legionella pneumophila promotes iron assimilation and intracellular infection and contains a strain-specific insertion sequence element. Infection and Immunity, 2002. 70(4): p. 1842-1852. 161. Phillips, N.J., et al., Proteomic Analysis of Neisseria gonorrhoeae Biofilms Shows Shift to Anaerobic Respiration and Changes in Nutrient Transport and Outermembrane Proteins. Plos One, 2012. 7(6). 162. Hoiby, N., et al., Pseudomonas aeruginosa Biofilms in the Lungs of Cystic Fibrosis Patients. Biofilm Infections, 2011: p. 167-184. 163. Southey-Pillig, C.J., D.G. Davies, and K. Sauer, Characterization of temporal protein production in Pseudomonas aeruginosa biofilms. Journal of Bacteriology, 2005. 187(23): p. 8114-8126. 164. Goodwin, R.G., et al., Photosensitivity and acute liver injury in myeloproliferative disorder secondary to late-onset protoporphyria caused by deletion of a ferrochelatase gene in hematopoietic cells. , 2006. 107(1): p. 60-62. 165. Ajioka, R.S., J.D. Phillips, and J.P. 
Kushner, Biosynthesis of heme in mammals. Biochimica Et Biophysica Acta-Molecular Cell Research, 2006. 1763(7): p. 723-736. 166. Lang, B.F., et al., An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature, 1997. 387(6632): p. 493-497. 167. Janouskovec, J., et al., A New Lineage of Eukaryotes Illuminates Early Mitochondrial Genome Reduction. Curr Biol, 2017. 27(23): p. 3717-3724 e5. 168. Grove, J., S. Busby, and J. Cole, The role of the genes nrf EFG and ccmFH in cytochrome c biosynthesis in Escherichia coli. Molecular & General Genetics, 1996. 252(3): p. 332-341. 169. Eaves, D.J., et al., Involvement of products of the nrfEFG genes in the covalent attachment of haem c to a novel cysteine-lysine motif in the cytochrome c(552) nitrite reductase from Escherichia coli. Molecular Microbiology, 1998. 28(1): p. 205-216. 170. Bernal, J.D. and D. Crowfoot, X-Ray Photographs of Crystalline Pepsin. Nature, 1934. 133: p. 794. 171. Kendrew, J.C., et al., 3-Dimensional Model of the Myoglobin Molecule Obtained by X-Ray Analysis. Nature, 1958. 181(4610): p. 662-666. 172. Steensma, D.P., M.A. Shampo, and R.A. Kyle, Max Perutz and the Structure of Hemoglobin. Mayo Clinic Proceedings, 2015. 90(8): p. e89. 173. Deisenhofer, J., et al., Structure of the protein subunits in the photosynthetic reaction centre of Rhodopseudomonas viridis at 3A resolution. Nature, 1985. 318(6047): p. 618-24. 174. Nanev, C.N., Recent Insights into Protein Crystal Nucleation. Crystals, 2018. 8(5). 175. McPherson, A. and J.A. Gavira, Introduction to protein crystallization. Acta Crystallogr F Struct Biol Commun, 2014. 70(Pt 1): p. 2-20. 176. Matthews, B.W., Solvent content of protein crystals. J Mol Biol, 1968. 33(2): p. 491-7. 177. Rhodes, G., Crystallography made crystal clear : a guide for users of macromolecular models. 3rd ed. Complementary science series. 2006, Amsterdam ; Boston: Elsevier/Academic Press. xxv, 306 p. 178. Brunger, A.T., J. Kuriyan, and M. 
Karplus, Crystallographic R-Factor Refinement by Molecular-Dynamics. Science, 1987. 235(4787): p. 458-460.

122

179. Taylor, G.L., Introduction to phasing. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 4): p. 325-38. 180. Throne-Holst, M., L. Thony-Meyer, and L. Hederstedt, Escherichia coli ccm in-frame deletion mutants can produce periplasmic cytochrome b but not cytochrome c. FEBS Lett, 1997. 410(2-3): p. 351-5. 181. Kern, M. and J. Simon, Three transcription regulators of the Nss family mediate the adaptive response induced by nitrate, nitric oxide or nitrous oxide in Wolinella succinogenes. Environ Microbiol, 2015. 182. Gibson, D.G., Enzymatic assembly of overlapping DNA fragments. Methods in enzymology, 2011. 498: p. 349-61. 183. Datsenko, K.A. and B.L. Wanner, One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A, 2000. 97(12): p. 6640- 5. 184. Slotboom, D.J., et al., Static light scattering to characterize membrane proteins in detergent solution. Methods, 2008. 46(2): p. 73-82. 185. Smith, B.J., SDS polyacrylamide gel electrophoresis of proteins. Methods Mol Biol, 1994. 32: p. 23-34. 186. Laemmli, U.K., Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 1970. 227(5259): p. 680-5. 187. Goodhew, C., K.R. Brown, and G.W. Pettigrew, Haem staining in gels, a useful tool in the study of bacterial c-type cytochromes. Vol. 852. 1986. 288-294. 188. Seidel, J., et al., MacA is a second cytochrome c peroxidase of Geobacter sulfurreducens. Biochemistry, 2012. 51(13): p. 2747-56. 189. Dvir, H. and S. Choe, Bacterial expression of a eukaryotic membrane protein in fusion to various Mistic orthologs. Protein Expr Purif, 2009. 68(1): p. 28-33. 190. Ovchinnikov, S., et al., Protein structure determination using metagenome sequence data. Science, 2017. 355(6322): p. 294-297. 191. Ahuja, U. and L. Thony-Meyer, Dynamic features of a heme delivery system for cytochrome C maturation. J Biol Chem, 2003. 278(52): p. 52061-70. 192. Flardh, K. and M.J. 
Buttner, Streptomyces morphogenetics: dissecting differentiation in a filamentous bacterium. Nat Rev Microbiol, 2009. 7(1): p. 36-49. 193. Chater, K.F., Recent advances in understanding Streptomyces. F1000Res, 2016. 5: p. 2795. 194. Chater, K.F., Streptomyces inside-out: a new perspective on the bacteria that provide us with antibiotics. Philosophical Transactions of the Royal Society B-Biological Sciences, 2006. 361(1469): p. 761-768. 195. Procopio, R.E.D., et al., Antibiotics produced by Streptomyces. Brazilian Journal of Infectious Diseases, 2012. 16(5): p. 466-471. 196. Shah, A.A., et al., Biodegradation of natural and synthetic rubbers: A review. International Biodeterioration & Biodegradation, 2013. 83: p. 145-157. 197. Rose, K. and A. Steinbuchel, Biodegradation of natural rubber and related compounds: recent insights into a hardly understood catabolic capability of microorganisms. Appl Environ Microbiol, 2005. 71(6): p. 2803-12. 198. Bode, H.B., K. Kerkhoff, and D. Jendrossek, Bacterial degradation of natural and synthetic rubber. Biomacromolecules, 2001. 2(1): p. 295-303. 199. Sato, S., et al., Degradation of vulcanized and nonvulcanized polyisoprene rubbers by lipid peroxidation catalyzed by oxidative enzymes and transition metals. Biomacromolecules, 2003. 4(2): p. 321-9. 200. Birke, J., W. Rother, and D. Jendrossek, Latex Clearing Protein (Lcp) of Streptomyces sp. Strain K30 Is a b-Type Cytochrome and Differs from Rubber Oxygenase A (RoxA) in Its Biophysical Properties. Appl Environ Microbiol, 2015. 81(11): p. 3793-9. 201. Waltersperger, S., et al., PRIGo: a new multi-axis goniometer for macromolecular crystallography. J Synchrotron Radiat, 2015. 22(4): p. 895-900. 202. Heisey, R.M. and S. Papadatos, Isolation of microorganisms able to metabolize purified natural rubber. Appl Environ Microbiol, 1995. 61(8): p. 3092-7.

123

203. Imai, S., et al., Isolation and characterization of Streptomyces, Actinoplanes, and Methylibium strains that are involved in degradation of natural rubber and synthetic poly(cis- 1,4-isoprene). Enzyme Microb Technol, 2011. 49(6-7): p. 526-31. 204. Yikmis, M. and A. Steinbuchel, Historical and recent achievements in the field of microbial degradation of natural and synthetic rubber. Appl Environ Microbiol, 2012. 78(13): p. 4543- 51. 205. Imai, S., et al., Rhizobacter gummiphilus sp. nov., a rubber-degrading bacterium isolated from the soil of a botanical garden in Japan. J Gen Appl Microbiol, 2013. 59(3): p. 199-205. 206. Chia, K.H., et al., Identification of new rubber-degrading bacterial strains from aged latex. Polymer Degradation and Stability, 2014. 109: p. 354-361. 207. Jendrossek, D., G. Tomasi, and R.M. Kroppenstedt, Bacterial degradation of natural rubber: a privilege of actinomycetes? FEMS Microbiol Lett, 1997. 150(2): p. 179-88. 208. Tsuchii, A. and K. Takeda, Rubber-degrading enzyme from a bacterial culture. Appl Environ Microbiol, 1990. 56(1): p. 269-74. 209. Tsuchii, A., T. Suzuki, and K. Takeda, Microbial degradation of natural rubber vulcanizates. Appl Environ Microbiol, 1985. 50(4): p. 965-70. 210. Watcharakul, S., et al., Biochemical and spectroscopic characterization of purified Latex Clearing Protein (Lcp) from newly isolated rubber degrading Rhodococcus rhodochrous strain RPK1 reveals novel properties of Lcp. BMC Microbiol, 2016. 16: p. 92. 211. Linos, A., et al., A gram-negative bacterium, identified as Pseudomonas aeruginosa AL98, is a potent degrader of natural rubber and synthetic cis-1, 4-polyisoprene. FEMS Microbiol Lett, 2000. 182(1): p. 155-61. 212. Yikmis, M., et al., Secretion and transcriptional regulation of the latex-clearing protein, Lcp, by the rubber-degrading bacterium Streptomyces sp. strain K30. Appl Environ Microbiol, 2008. 74(17): p. 5373-82. 213. 
Linos, A., et al., Biodegradation of cis-1,4-polyisoprene rubbers by distinct actinomycetes: microbial strategies and detailed surface analysis. Appl Environ Microbiol, 2000. 66(4): p. 1639-45. 214. Bode, H.B., et al., Physiological and chemical investigations into microbial degradation of synthetic Poly(cis-1,4-isoprene). Appl Environ Microbiol, 2000. 66(9): p. 3680-5. 215. Birke, J. and D. Jendrossek, Rubber oxygenase and latex clearing protein cleave rubber to different products and use different cleavage mechanisms. Appl Environ Microbiol, 2014. 80(16): p. 5012-20. 216. Braaz, R., P. Fischer, and D. Jendrossek, Novel type of heme-dependent oxygenase catalyzes oxidative cleavage of rubber (poly-cis-1,4-isoprene). Appl Environ Microbiol, 2004. 70(12): p. 7388-95. 217. Birke, J., et al., Functional identification of rubber oxygenase (RoxA) in soil and marine myxobacteria. Appl Environ Microbiol, 2013. 79(20): p. 6391-9. 218. Rose, K., K.B. Tenberge, and A. Steinbuchel, Identification and characterization of genes from Streptomyces sp. strain K30 responsible for clear zone formation on natural rubber latex and poly(cis-1,4-isoprene) rubber degradation. Biomacromolecules, 2005. 6(1): p. 180- 8. 219. Hiessl, S., et al., Involvement of two latex-clearing proteins during rubber degradation and insights into the subsequent degradation pathway revealed by the genome sequence of Gordonia polyisoprenivorans strain VH2. Appl Environ Microbiol, 2012. 78(8): p. 2874-87. 220. Seidel, J., et al., Structure of the processive rubber oxygenase RoxA from Xanthomonas sp. Proc Natl Acad Sci U S A, 2013. 110(34): p. 13833-8. 221. Birke, J., et al., Phe317 is essential for rubber oxygenase RoxA activity. Appl Environ Microbiol, 2012. 78(22): p. 7876-83. 222. Shin, S. and V.L. Davidson, MauG, a diheme enzyme that catalyzes tryptophan tryptophylquinone biosynthesis by remote catalysis. Archives of Biochemistry and Biophysics, 2014. 544: p. 112-118. 223. Birke, J., W. Rother, and D. 
Jendrossek, RoxB Is a Novel Type of Rubber Oxygenase That Combines Properties of Rubber Oxygenase RoxA and Latex Clearing Protein (Lcp). Applied and Environmental Microbiology, 2017. 83(14).

124

224. Birke, J., W. Rother, and D. Jendrossek, Rhizobacter gummiphilus NS21 has two rubber oxygenases (RoxA and RoxB) acting synergistically in rubber utilisation. Appl Microbiol Biotechnol, 2018. 225. Hiessl, S., et al., Latex clearing protein-an oxygenase cleaving poly(cis-1,4-isoprene) rubber at the cis double bonds. Appl Environ Microbiol, 2014. 80(17): p. 5231-40. 226. Yikmis, M. and A. Steinbuchel, Importance of the latex-clearing protein (Lcp) for poly(cis- 1,4-isoprene) rubber cleavage in Streptomyces sp. K30. Microbiologyopen, 2012. 1(1): p. 13- 24. 227. Kabsch, W., Xds. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 2): p. 125-32. 228. Evans, P.R. and G.N. Murshudov, How good are my data and what is the resolution? Acta Crystallogr D Biol Crystallogr, 2013. 69(Pt 7): p. 1204-14. 229. Winn, M.D., et al., Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr, 2011. 67(Pt 4): p. 235-42. 230. Adams, P.D., et al., PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 2): p. 213-21. 231. Emsley, P., et al., Features and development of Coot. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 4): p. 486-501. 232. Murshudov, G.N., et al., REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr, 2011. 67(Pt 4): p. 355-67. 233. McCoy, A.J., et al., Phaser crystallographic software. J Appl Crystallogr, 2007. 40(Pt 4): p. 658-674. 234. Chen, V.B., et al., MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 1): p. 12-21. 235. Schrodinger, LLC, The PyMOL Molecular Graphics System, Version 1.8. 2015. 236. Wajcman, H., L. Kiger, and M.C. Marden, Structure and function evolution in the superfamily of globins. Comptes Rendus Biologies, 2009. 332(2-3): p. 273-282. 237. 
Rice, S.L., et al., Structure of Chlamydomonas reinhardtii THB1, a group 1 truncated hemoglobin with a rare histidine-lysine heme ligation. Acta Crystallogr F Struct Biol Commun, 2015. 71(Pt 6): p. 718-25. 238. Teh, A.H., et al., Open and Lys-His Hexacoordinated Closed Structures of a Globin with Swapped Proximal and Distal Sites. Sci Rep, 2015. 5: p. 11407. 239. Ilcu, L., et al., Structural and Functional Analysis of Latex Clearing Protein (Lcp) Provides Insight into the Enzymatic Cleavage of Rubber. Sci Rep, 2017. 7(1): p. 6179. 240. Porter, T.D. and M.J. Coon, Cytochrome P-450. Multiplicity of isoforms, substrates, and catalytic and regulatory mechanisms. J Biol Chem, 1991. 266(21): p. 13469-72. 241. de Sanctis, D., et al., Structure-function relationships in the growing hexa-coordinate hemoglobin sub-family. IUBMB Life, 2004. 56(11-12): p. 643-51. 242. Ujwal, R. and J.U. Bowie, Crystallizing membrane proteins using lipidic bicelles. Methods, 2011. 55(4): p. 337-341. 243. Bayburt, T.H. and S.G. Sligar, Membrane protein assembly into Nanodiscs. Febs Letters, 2010. 584(9): p. 1721-1727.

125

126

12. Appendix

12.1 Primers

Table 12.1: Primers for Gibson assembly for Aquifex aeolicus ccsA and ccsB and their vectors

| Name | Sequence | Direction | Purpose |
|---|---|---|---|
| Aa_for_ccs_gene | GAGAACCTGTACTTCCAATCAATGCGAGAAGTAAGTTTAAATGAAAAAGAGTTTCAC | forward | Amplification of Aa_ccsA gene in pASG-IBA_2S_gfp |
| Aa_rev_ccs_gene | CTCCCCTGAAAATATAAATTTTCCTCGGTAGCGTAAGAGTGAAGTCCCGG | reverse | Amplification of Aa_ccsA gene in pASG-IBA_2S_gfp |
| Aa_for_pASG | CCGGGACTTCACTCTTACGCTACCGAGGAAAATTTATATTTTCAGGGGAGTAAAGG | forward | Amplification of pASG-IBA_2S_gfp |
| Aa_rev_pASG | CTTTTTCATTTAAACTTACTTCTCGCATTGATTGGAAGTACAGGTTCTCGG | reverse | Amplification of pASG-IBA_2S_gfp |
| LI_Aa_ccsA_f_pET | GTTTAACTTTAAGAAGGAGATATAATGCGAGAAGTAAGTTTAAATGAAAAAG | forward | Amplification of Aa_ccsA gene in pETTSC |
| LI_Aa_ccsA_r_pET | CCAGCTCTGGAAGTACAAGTTCTCCTCGGTAGCGTAAGAGTGAAG | reverse | Amplification of Aa_ccsA gene in pETTSC |
| LI_pETTSC_f_Aa_A | GACTTCACTCTTACGCTACCGAGGAGAACTTGTACTTCCAGAGCTG | forward | Amplification of pETTSC |
| LI_pETTSC_r_Aa_A | CATTTAAACTTACTTCTCGCATTATATCTCCTTCTTAAAGTTAAACAAAA | reverse | Amplification of pETTSC |
| LI_Aa_for_ccsB | GAGAACCTGTACTTCCAATCAATGAAGAGAGTGCTTGAATTTCTCG | forward | Amplification of Aa_ccsB gene in pASG-IBA_2S_gfp::Aa_ccsA |
| LI_Aa_rev_ccsB | CATTTAAACTTACTTCTCGCATAGCTCTCTTTAACTCCTCCAGTTC | reverse | Amplification of Aa_ccsB gene in pASG-IBA_2S_gfp::Aa_ccsA |
| LI_Aa_for_ccsA | CTGGAGGAGTTAAAGAGAGCTATGCGAGAAGTAAGTTTAAATGAAAAAGAGTTTCA | forward | Amplification of Aa_ccsA gene in pASG-IBA_2S_gfp |
| LI_Aa_rev_pASG | GAAATTCAAGCACTCTCTTCATTGATTGGAAGTACAGGTTCTCGG | reverse | Amplification of pASG-IBA_2S_gfp::Aa_ccsA |
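
The pairing logic behind the Gibson assembly primers in Table 12.1 can be checked computationally. The reverse primer for the insert and the forward primer for the vector anneal to opposite strands of the same junction, so the reverse complement of one should share a stretch of sequence with the other. The following is a minimal sketch in Python and not part of the original workflow; the helper names `reverse_complement` and `shared_region` are illustrative assumptions, and the two sequences are copied from the table.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def shared_region(a: str, b: str) -> str:
    """Return the longest substring common to a and b (brute force, fine for primers)."""
    best = ""
    for i in range(len(a)):
        # Only try substrings longer than the best one found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break  # a longer substring starting at i cannot match either
    return best


# Sequences copied from Table 12.1.
AA_REV_CCS_GENE = "CTCCCCTGAAAATATAAATTTTCCTCGGTAGCGTAAGAGTGAAGTCCCGG"
AA_FOR_PASG = "CCGGGACTTCACTCTTACGCTACCGAGGAAAATTTATATTTTCAGGGGAGTAAAGG"

# Bring both primers onto the same strand, then look for the shared junction.
overlap = shared_region(reverse_complement(AA_REV_CCS_GENE), AA_FOR_PASG)
print(len(overlap), overlap)
```

For this pair the shared region covers the entire length of Aa_rev_ccs_gene, which is consistent with the two PCR products carrying matching ends at this junction.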

Table 12.2: Primers for Gibson assembly for Bacillus megaterium ccsA gene and pETTSC

| Name | Sequence | Direction | Purpose |
|---|---|---|---|
| LI_Bm_ccsA_for | GAAGGAGATATAGGATCCATGGATTACGCGAGTATAAGTAGT | forward | Amplification of Bm_ccsA gene in pETTSC |
| LI_Bm_ccsA_rev | GTACAAGTTCTCGAATTCCGCATATGAATGTAAACCTGCAATAAC | reverse | Amplification of Bm_ccsA gene in pETTSC |
| LI_Bm_pETTSC_for | CAGGTTTACATTCATATGCGGAATTCGAGAACTTGTACTTCC | forward | Amplification of pETTSC |
| LI_Bm_pETTSC_for | CTTATACTCGCGTAATCCATGGATCCTATATCTCCTTCTTAAAG | reverse | Amplification of pETTSC |

Table 12.3: Primers for Gibson assembly for Bacteroides thetaiotaomicron ccsBA and ccsA and their vectors

| Name | Sequence | Direction | Purpose |
|---|---|---|---|
| LI_Bt_ccsA1_for | AAGAAGGAGATATAGGATCCATGAAACTGATTCGATTGATCG | forward | Amplification of Bt_ccsBA gene in pETTSC |
| LI_Bt_ccsA1_rev | AAGTACAAGTTCTCGAATTCTGAATACTTTTTCAAACGCCTG | reverse | Amplification of Bt_ccsBA gene in pETTSC |
| LI_Bt_pET_for_A1 | GGCGTTTGAAAAAGTATTCAGAATTCGAGAACTTGTACTTCC | forward | Amplification of pETTSC |
| LI_Bt_pET_rev_A1 | ATCAATCGAATCAGTTTCATGGATCCTATATCTCCTTCTTAAAG | reverse | Amplification of pETTSC |
| LI_Bth_ccsBA1_for_p | GAGAACCTGTACTTCCAATCAATGAAACTGATTCGATTGATC | forward | Amplification of Bt_ccsBA gene in pASG-IBA_2S_gfp |
| LI_Bth_ccsBA1_rev_p | CCCCTGAAAATATAAATTTTCTGAATACTTTTTCAAACGCC | reverse | Amplification of Bt_ccsBA gene in pASG-IBA_2S_gfp |
| LI_Bth_pASGGFP_for_ | AGGCGTTTGAAAAAGTATTCAGAAAATTTATATTTTCAGGGG | forward | Amplification of pASG-IBA_2S_gfp |
| LI_Bth_pASGGFP_rev_ | GATCAATCGAATCAGTTTCATTGATTGGAAGTACAGGTTCTC | reverse | Amplification of pASG-IBA_2S_gfp |
| Bth_ccsA2_for | GAAGGAGATATAGGATCCATGATAAACTGGGATAATTTTTATG | forward | Amplification of Bt_ccsA gene in pETTSC |
| Bth_ccsA2_rev | GTACAAGTTCTCGAATTCGTTATTACGATTATATAAATGTACGC | reverse | Amplification of Bt_ccsA gene in pETTSC |
| Bth_pETccsA2_for | ATTTATATAATCGTAATAACGAATTCGAGAACTTGTACTTCC | forward | Amplification of pETTSC |
| Bth_pETccsA2_rev | AAATTATCCCAGTTTATCATGGATCCTATATCTCCTTCTTAAAG | reverse | Amplification of pETTSC |

Table 12.4: Primers for Gibson assembly for Geobacter sulfurreducens ccsA and pETTSC

| Name | Sequence | Direction | Purpose |
|---|---|---|---|
| Gs_ccsA1_for | GAAGGAGATATAGGATCCATGAAATGGCTCCTTGTTGCC | forward | Amplification of Gs_ccsA gene in pETTSC |
| Gs_ccsA1_rev | GTACAAGTTCTCGAATTCAAAGCTGTGCGAACTTTTCATG | reverse | Amplification of Gs_ccsA gene in pETTSC |
| Gs_pETccsA1_for | TGAAAAGTTCGCACAGCTTTGAATTCGAGAACTTGTACTTCC | forward | Amplification of pETTSC |
| Gs_pETccsA1_rev | GCAACAAGGAGCCATTTCATGGATCCTATATCTCCTTCTTAAAG | reverse | Amplification of pETTSC |

Table 12.5: Primers for restriction enzyme cloning and Gibson assembly for Helicobacter hepaticus ccsBA and various vectors

| Name | Sequence | Direction | Purpose |
|---|---|---|---|
| LI_f_ccsBA*_pASG | GCCGAGAACCTGTACTTCCAATCAATGATGAATATAATTAAAACACTTTTTTG | forward | Amplification of the Hh_ccsBA gene in pASG-IBA_2S_gfp |
| LI_r_ccsBA*_pASG | CTTTACTCCCCTGAAAATATAAATTTTCTAAATGGGGCATATCAAGCACTC | reverse | Amplification of the Hh_ccsBA gene in pASG-IBA_2S_gfp |
| LI_for_pASGIBA2S | GAGTGCTTGATATGCCCCATTTAGAAAATTTATATTTTCAGGGGAGTAAAG | forward | Amplification of pASG-IBA_2S_gfp |
| LI_rev_pASGIBA2S | CAAAAAAGTGTTTTAATTATATTCATCATTGATTGGAAGTACAGGTTCTCGGC | reverse | Amplification of pASG-IBA_2S_gfp |
| LI_f_geneHh_TSC | CTTTAAGAAGGAGATATACAAATGATGAATATAATTAAAACACTTTTTTGTTCAATG | forward | Amplification of the Hh_ccsBA gene in pASG-IBA_TSC |
| LI_r_pASG_TSC | GTTTTAATTATATTCATCATTTGTATATCTCCTTCTTAAAGTTAAACAAAATTATTTC | reverse | Amplification of pASG-IBA_TSC |
| LI_f_thioredo_Ec | TTTATATTTTCAGGGGGGGTCTGGGGTCTGGGTCTATGAGCGATAAAATTATTCACCTGACTG | forward | Insertion of the fusion protein thioredoxin from E. coli at C-terminus in pASG-IBA_TSC |
| LI_r_thioredo_Ec | CTCGAACTGCGGGTGGCTCCACGCCAGGTTAGCGTCGAGGAACTC | reverse | Insertion of the fusion protein thioredoxin from E. coli at C-terminus in pASG-IBA_TSC |
| LI_f_pASG_thio | CCTCGACGCTAACCTGGCGTGGAGCCACCCGCAGTTC | forward | Insertion of the fusion protein thioredoxin from E. coli at C-terminus (pASG-IBA_TSC amplification) |
| LI_r_pASG_thio | GAATAATTTTATCGCTCATAGACCCAGACCCAGACCCCCCCTGAAAATATAAATTTTCTAAATGG | reverse | Insertion of the fusion protein thioredoxin from E. coli at C-terminus (pASG-IBA_TSC amplification) |
| LI_Mictic_C_for | GAAAATTTATATTTTCAGGGGATGTTTTGTACATTTTTTGAAAAAC | forward | Insertion of the fusion protein Mistic from B. subtilis at C-terminus in pASG-IBA_TSC |
| LI_Mistic_C_rev | TTTCTCGAACTGCGGGTGGCTCCATTCTTTTTCTCCTTCTTCAGATACT | reverse | Insertion of the fusion protein Mistic from B. subtilis at C-terminus in pASG-IBA_TSC |
| LI_pASG_Mist_c_f | GAAGAAGGAGAAAAAGAATGGAGCCACCCGCAGTTC | forward | Insertion of the fusion protein Mistic from B. subtilis at C-terminus (pASG-IBA_TSC amplification) |
| LI_pASG_Mist_C_r | CAAAAAATGTACAAAACATCCCCTGAAAATATAAATTTTCTAAATGG | reverse | Insertion of the fusion protein Mistic from B. subtilis at C-terminus (pASG-IBA_TSC amplification) |
| LI_ccsBA*_NdeI_f | GCGCGCATATGATGATGAATATAATTAAAACACTTTTTTGTTCAATG | forward | Amplification of the Hh_ccsBA gene with NdeI restriction site (CATATG) |
| LI_ccsBA*_NdeI_r | GCGCGCTCGAGTTATAAATGGGGCATATCAAGCACTCTTTTTC | reverse | Amplification of the Hh_ccsBA gene with XhoI restriction site (CTCGAG) |
| LI_ccsBA*_BamH_f | GCGCGGGATCCATGATGAATATAATTAAAACACTTTTTTGTTCAATG | forward | Amplification of the Hh_ccsBA gene with BamHI restriction site (GGATCC) |
| LI_ccsBA*_EcoR_r | GCGCGGAATTCTTATAAATGGGGCATATCAAGCACTCTTTTTC | reverse | Amplification of the Hh_ccsBA gene with EcoRI restriction site (GAATTC) |
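
The restriction sites named in the Purpose column of Table 12.5 can be located in the four cloning primers with a plain string search. The sketch below is illustrative only: the recognition sequences are those given in the table (GAATTC being the EcoRI site), while the dictionary layout and the function name `site_position` are assumptions made for this example.

```python
# Recognition sequences as listed in Table 12.5.
SITES = {
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
}

# Primer name -> (sequence, enzyme named in the Purpose column).
PRIMERS = {
    "LI_ccsBA*_NdeI_f": ("GCGCGCATATGATGATGAATATAATTAAAACACTTTTTTGTTCAATG", "NdeI"),
    "LI_ccsBA*_NdeI_r": ("GCGCGCTCGAGTTATAAATGGGGCATATCAAGCACTCTTTTTC", "XhoI"),
    "LI_ccsBA*_BamH_f": ("GCGCGGGATCCATGATGAATATAATTAAAACACTTTTTTGTTCAATG", "BamHI"),
    "LI_ccsBA*_EcoR_r": ("GCGCGGAATTCTTATAAATGGGGCATATCAAGCACTCTTTTTC", "EcoRI"),
}


def site_position(seq: str, enzyme: str) -> int:
    """Return the 0-based index of the first occurrence of the site, or -1 if absent."""
    return seq.upper().find(SITES[enzyme])


positions = {
    name: site_position(seq, enzyme) for name, (seq, enzyme) in PRIMERS.items()
}
for name, pos in positions.items():
    print(f"{name}: site starts at index {pos}")
```

In all four primers the site starts at index 5, directly after a five-nucleotide 5' extension (GCGCG).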

Table 12.6: Primers for site-directed mutagenesis in constructs with Helicobacter hepaticus ccsBA

Name Sequence Direction Purpose
LI_2S_pASG_for GAAAATTTATATTTTCAGGGGTGGAGCCACCCGCAGTTCGAG forward Removing the GFP tag from pASG-IBA_2S_gfp::Hh_ccsBA
LI_2S_pASG_rev CTCGAACTGCGGGTGGCTCCACCCCTGAAAATATAAATTTTCTAAATGGGGC reverse Removing the GFP tag from pASG-IBA_2S_gfp::Hh_ccsBA
LI_for_pASG_TAA CTTGATATGCCCCATTTATAAGAAAATTTATATTTTCAG forward Inserting a STOP codon in front of the gfp gene in pASG-IBA_2S_gfp::Hh_ccsBA
LI_rev_pASG_TAA CTGAAAATATAAATTTTCTTATAAATGGGGCATATCAAG reverse Inserting a STOP codon in front of the gfp gene in pASG-IBA_2S_gfp::Hh_ccsBA
LI_PPS_GFP_for GAGTGCTTGATATGCCCCATTTACTGGAAGTTCTGTTCCAGGGGCCG forward Replacing the TEV cleavage site with a PreScission cleavage site in pASG-IBA_2S_gfp::Hh_ccsBA
LI_PPS_GFP_rev CCAGTGAAAAGTTCTTCTCCTTTACTCGGCCCCTGGAACAGAACTTCCAG reverse Replacing the TEV cleavage site with a PreScission cleavage site in pASG-IBA_2S_gfp::Hh_ccsBA
LI_pHGST_Stop_f CATTTACTCGAGCTGGTTCCGCGTGGATC forward Removing the STOP codon before the Strep Tag in pHGST::Hh_ccsBA
LI_pHGST_Stop_r GATCCACGCGGAACCAGCTCGAGTAAATG reverse Removing the STOP codon before the Strep Tag in pHGST::Hh_ccsBA

Table 12.7: Primers for restriction enzyme cloning and Gibson assembly for Hydrogenobacter thermophilus ccsA gene and various vectors

Name Sequence Direction Purpose
Ht_ccsA_for GAAGGAGATATAGGATCCATGAGACATGTAATACATCAAAGAAATG forward Amplification of Ht_ccsA gene in pETTSC
Ht_ccsA_rev GAAGTACAAGTTCTCGAATTCATCCGTTGCGTAGCTGTGAAG reverse Amplification of Ht_ccsA gene in pETTSC
Ht_pET_for TTCACAGCTACGCAACGGATGAATTCGAGAACTTGTACTTCC forward Amplification of pETTSC
Ht_pET_rev GATGTATTACATGTCTCATGGATCCTATATCTCCTTCTTAAAG reverse Amplification of pETTSC
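Gibson assembly relies on the insert and the linearized vector sharing identical end sequences, which here are built into the primers. The sketch below illustrates this for one junction of the Hydrogenobacter thermophilus construct: it takes the reverse complement of the vector primer Ht_pET_rev and looks for the longest stretch at its end that is also the start of the insert primer Ht_ccsA_for. The function names `reverse_complement` and `shared_end` are hypothetical helpers written for this illustration, and the check is only a sketch of the primer-design logic, not a description of how the primers were actually designed.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shared_end(upstream_end: str, downstream_start: str) -> str:
    """Longest suffix of `upstream_end` that is also a prefix of `downstream_start`."""
    longest_possible = min(len(upstream_end), len(downstream_start))
    for size in range(longest_possible, 0, -1):
        if upstream_end.endswith(downstream_start[:size]):
            return downstream_start[:size]
    return ""


# Primer sequences copied from Table 12.7.
HT_PET_REV = "GATGTATTACATGTCTCATGGATCCTATATCTCCTTCTTAAAG"
HT_CCSA_FOR = "GAAGGAGATATAGGATCCATGAGACATGTAATACATCAAAGAAATG"

# Top-strand sequence at the end of the vector PCR product.
vector_end = reverse_complement(HT_PET_REV)
overlap = shared_end(vector_end, HT_CCSA_FOR)

if __name__ == "__main__":
    print(f"overlap length: {len(overlap)} nt")
    print(f"overlap: {overlap}")
```

The same check can be applied to the other junction (Ht_ccsA_rev with Ht_pET_for) and to the Micrococcus luteus and Symbiobacterium thermophilum primer sets by swapping in the corresponding sequences.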


Table 12.8: Primers for Gibson assembly for Micrococcus luteus ccsA gene and pETTSC

Name Sequence Direction Purpose
LI_ccsA_Ml_pET_f TAAGAAGGAGATATAGGATCCATGTTCAACCCCACGGCCCC forward Amplification of Ml_ccsA gene in pETTSC
LI_ccsA_Ml_pET_r GAAGTACAAGTTCTCGAATTCGCCGGGCAGACCCGCGTAC reverse Amplification of Ml_ccsA gene in pETTSC
LI_pETTSC_Ml_f TCGTACGCGGGTCTGCCCGGCGAATTCGAGAACTTGTACTTCC forward Amplification of pETTSC
LI_pETTSC_Ml_r GGGGGCCGTGGGGTTGAACATGGATCCTATATCTCCTTCTTAAAG reverse Amplification of pETTSC

Table 12.9: Primers for Gibson assembly for Symbiobacterium thermophilum ccsA gene and pETTSC

Name Sequence Direction Purpose
St_ccsA1_for GAAGGAGATATAGGATCCATGACACCGGCTAACCAGTTTCTGC forward Amplification of St_ccsA gene in pETTSC
St_ccsA1_rev GTACAAGTTCTCGAATTCCAGATCGCCGCCGGCGTAG reverse Amplification of St_ccsA gene in pETTSC
St_pETccsA1_for CCTACGCCGGCGGCGATCTGGAATTCGAGAACTTGTACTTCC forward Amplification of pETTSC
St_pETccsA1_rev AACTGGTTAGCCGGTGTCATGGATCCTATATCTCCTTCTTAAAG reverse Amplification of pETTSC

Table 12.10: Primers for sequencing (Aquifex aeolicus, Helicobacter hepaticus)

Name Sequence Direction
LI_f653_Aa_ccsBA CTACGTGTTCGCAGAAAAGGG forward
LI_r127_Aa_ccsBA GCATGCAGATACGCTCCG reverse
LI_for_ccsBA* CACTTATTATGCGCGATG forward
LI_rev_ccsBA* CCCAATCCGCAAGTATG reverse
LI_rev_GFP_GATC GCCCATTAACATCACCATCTAATTC reverse


Table 12.11: Primers for replacing the ccm operon in E. coli with a kanamycin resistance cassette and for constructing pEC86::BtccsBA and pASGIBA::macA (reporter cytochrome c) via Gibson assembly

Name Sequence Direction Purpose
LI_kan_E.c_f_ccm GTTTTTAACAGGGTTATTGCGTGGGTATGATTGAACAAGATGGATTGCACG forward Replacement of the ccm operon in E. coli with a kan resistance cassette
LI_kan_E.c_r_ccm CGATGGCATCACCATCGGGCCTCTTTTCAGAAGAACTCGTCAAGAAGGCG reverse Replacement of the ccm operon in E. coli with a kan resistance cassette
LI_E.c_+kan_for TTTAACAGGGTTATTGCGTGGGT forward Sequencing of the kan resistance gene
LI_E.c_+kan_rev CATCACCATCGGGCCTCTTT reverse Sequencing of the kan resistance gene
LI_BtccsBA_f_PEC AACAGGGTTATTGCGTGGGTATGAAACTGATTCGATTGATC forward Amplification of BtccsBA in pEC86 vector
LI_BtccsBA_r_PEC CCTGATGCGCTACGCTTATCTCATGAATACTTTTTCAAACG reverse Amplification of BtccsBA in pEC86 vector
LI_for_pEC86_new GTTTGAAAAAGTATTCATGAGATAAGCGTAGCGCATCAGG forward Amplification of pEC86 vector
LI_rev_pEC86_new GATCAATCGAATCAGTTTCATACCCACGCAATAACCCTGTT reverse Amplification of pEC86 vector
LI_MacA_for_new CTTTAAGAAGGAGATATACAAATGAAAAAGACAGCTATCGCGATTGCAG forward Amplification of GsmacA in pASGIBA_TSC vector
LI_Gs_MacA_r_pAS CCCCTGAAAATATAAATTTTCGTTGCTGACCGGCCTGGGG reverse Amplification of GsmacA in pASGIBA_TSC vector
LI_pASG_for_MacA CCCCCAGGCCGGTCAGCAACGAAAATTTATATTTTCAGGGG forward Amplification of pASGIBA_TSC vector
LI_pASG_r_MacA_n GCGATAGCTGTCTTTTTCATTTGTATATCTCCTTCTTAAAGTTAAAC reverse Amplification of pASGIBA_TSC vector


12.2 Amino acid sequences

Aquifex aeolicus CcsA (UniProt ID: O67831) MREVSLNEKEFHLNKDLFFLLGGVLLSVLAGFFLPFENLFWYKSAFLGYILSSVVYVAYLLVRERIVGVVATLTL YLSLLLNLTGMIRRAYESYKLGVFHPPWSNLFEALTFWSFIAGFVYLVIERKYGFKILGAFVIPVIAAISGFAIY KANPEITPLMPALRSYWLYLHVVTAFTGYAGFTVAFGGAVAYLLKEHFPENKFVKNLLPRREILDEITYKSIAIV FPIWTASIILGAAWANEAWGGYWSWDPKEVWSLIVWLFFGAYLHARQLMGWKGKRVAWLVVFGFITVLICFFAVN LYFPGLHSYATE

Aquifex aeolicus CcsB (UniProt ID: O67856) MKRVLEFLGGFGGLVFGFVFFVGMSILGLFHLEEHPPLYWAGFFASVFLFVLSFLLNLINWVKALIKDYKKHGSV LAFVYDFLASLKLAIFIMLVLGILSMLGSTYIKQNQSFEWYLDQFGYDVGIWIWKLWLNDVFHSWYYILFIVLLA VNLIFCSIKRLPRVWKQAFSKERILKLDEHAEKHLKPITVKIPDKDKVLKFLLKKGFKVFVEEEGNKLYVFAEKG RFSRLGVYITHIALLVIMAGALIDAIVGVRGSLIVAEGDTNDVMLVGAEQKPYKLPFAVHLIDFRIKTYAEENPN VDKRFAQAVSSYESDIEIINGGKVEAKGTVKVNEPFDFGRYRLFQATYGILDGTSGMGVIVVDRKKAHEDPEKAV IGTFEIKTGQVVEFKDMLISIDRVVLNVHDPNNRNELAPAVVLKVMLNRELYSVPVIYDPRLTALVFSQIPELKD FPYMFFMNGFEPLFFSGFEVSYHPGSVIIWIGSAILVLGMVVAFYTVHRKVWMRIEGDTAKVAFYSHKFKEEFKR SFLRELEELKRA

Bacillus megaterium CcsA (UniProt ID: A0A0H4RZR3) MDYASISSNLLYVAFIAYLVATFFFGGAIRDKRAAQKKTKWATFGIAVTIIGFIAQIGYFITRWIASGHAPVSNL FEFTTFFGMMLVGAFIVIYFIYRLSMLGLFALPVALLIIAYASMFPTEITPLIPALQTNWLHIHVTTAATGEAVL SISFVAGLIYLIKTIDQSKRNKKAFWLEFILYTLISTLGFVAVTLAFNAAHYESSFNWVNKSGEQEQVVYHMVPL VGPHQSEWTDGNDIQPLVDMPAIINAKKLNTVIWSLAGGFVLYWLIRLILRKRIGAALQPLVKQVNSDLLDEVSY RAVAIGFPIFTLGALIFAMIWAQMAWTRFWGWDPKEVWALITFLFYAAFLHLRLSKGWHGEKSAWLAVGGFAIIM FNLVAVNLVIAGLHSYA

Bacteroides thetaiotaomicron CcsBA (UniProt ID: C6IL68) MKLIRLIASPILMYVLAGVYALVLAIATFVENSSGPAVAREYFYYAPWFILLQLLQAVNLLAMFLQGGYFKRISK GSLIFHGALVFIWLGAAVTHYAGVTGIMHIREGETVDRMMRDEGAGMGNASLPFSVTLDDFRLKRYPGSHSPMSY ESDLVIKKENEAPLQATVRMNKVIEVDGYRLFQSSFDPDEQGTVLSVSYDRPGMQITYIGYFLLFAGFVLTLFSK KSRFGRLRRELGEMKKNAPFCLLLFLGLSGALGTQASYAQETLSSSQLPCIPAPHARKFGSLVLLNPNGRLEPVN SYTSAILRKLYGADKLNSINSDQFFLNLLAFPDEWGGYPFIKVDNKDILQRFGRDGKYIAWQDVFDADGNYVLTD EVNAIYAKSASERKRMDSDLLKLDESVNIVYRIMQHQLLPLFPDENDVQGKWYSAGDEQTVFHDKDSLFVSKIMD WYIYELGNGVRTNNWKEADKIVDMMHIFQQAKSKTPAIDNQRVKAELLYNQLNLFFWCRLAYLILGGILLFIACG EIIADFKWGSRLSSILIVLLIAAFLAHTTGVLLRWYISERAPWANAYESMICTSWLLVGGGLLFARRFRILPALA GLLGGIMLFVAGLNHLNPEITPLVPVLQSYWLMSHVAIIMIGYVFFALCALTGLFNLILMNLLSATNRVKLLFRI REFTLLNEMAMILGLFFMTAGTFLGAIWANVSWGRYWGWDPKETWALISIVVYALVLHIRFIPLLKGKTTWCYNL LSVVSILSIIMTWFGVNYYLSGLHSYGKTEGGDLLLWIWGAGLCVVLALALFARRRLKKYS


Bacteroides thetaiotaomicron CcsA (UniProt ID: A0A0P0FN15) MINWDNFYVFAAVSICLWLAGAVFALRSSSKSKVAIGFTSGGIIILGVFIAGLWIFLQRPPLRTMGETRLWYSFF MGIAGLLTYIRWKYRWILSFSTLLSTVFVIINLMKPEIHDQSLMPALQSVWFIPHVTVYMFSYSVLGCAFIIALT GLFRHKEEYLHTADNLVYAGVAFLSIGMLLGSLWAKEAWGNYWSWDPKETWAAITWMGYLLYIHLRLFRRTGRKT LYVLLIVSFLALQMCWYGVNYLPAAQQSVHLYNRNN

Escherichia coli Thioredoxin (UniProt ID: S1CI97) MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPK YGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLA

Geobacter sulfurreducens CcsA (UniProt ID: Q747F1) MKWLLVASAALYLFGSFRRPLFALGLGAGLAYLALRGISLGRLPLVGPHDTIAFFSASIGLMTLPFLFSPSLRNS SAFPWATGGTAAVFALFSLAFPALAMPLPPILNTLWFELHVALAFFAYALFTIGAIMGVLFLAGGERRLLDLQYR AALVGYTFFSGSMVAGGIWGYYAWGTYWLWTPKELWTSILWIFYTFWLHLRLRGAGGDRLLAWTGILGFGVMLFT YLGVSMLMKSSHSF

Helicobacter hepaticus CcsBA (UniProt ID: Q7VHG9) MMNIIKTLFCSMKMVLLLIGIYATACGIATFIEKYEGTLAARLWVYDAFWFEILHIWLVACLIGCFITSKAWQRK KYASLLLHASFIVIIIGAGITRYYGFEGLMNLREGQSVNFISTNTHYIFIQIKNPQGDVESVRIPTYIDEKVNHK INQHLTFFGKPLTLHTEEFTAKQVNMSELFILNASIDFLGKNEKTLIMRDGNNAPTKENITMLEIEGYKIFLAWG IDNIALPFSIKLKKFELERYPGSNSPASYTSEVEVLDGQNPPLPFRIFMNNVLDYGGYRFFQSSYHPDEKGSILS VNNDPGKTPTYIGYAMLILGVIWLLFDKNGRFATLGRFLKTQKFFSLMLCSALCYALSSPQIAYASTQSQTDFQP LSENEIPPLQDIPSMIKALADTSSLTNDFDRILVQDFGGRIKPMHTLANEYIHKLTQQRTFKGLNPSQVFLGMLF YPQEWQSIQMIATKSPKLRQILGLDENQKHIAYIDVFTPQGQYILQNYVEAANLKSPSLRDTFEKDVISVDERIN YAFLIYTGQVLRIFPDNKSPNNQWLYPLQAISSAVAQDDTKKAKELMQIYKKFAQGMQQGINTHNWQEAAQATRD IRTFQQNNGGSLLISPAKVDSEIWLNLYNPFYQLTYPYIFISIVLFIIVLVGILKNTPTRPLIHKVFYILLFALF ILHTCGLGLRWYVSEHAPWSNAYESMLYIAWAAILSGVVFFRRSNLALCASSFLAGMTLFVANLGDMDPQIGNLM PVLKSYWLNIHVSVITASYGFLGLCFMLGLITLIMFLLRNEKRSQVDCSILSLSALNEMSMILGLFLLSVGNFLG GIWANESWGRYWGWDSKETWALISIGVYAIILHLRFVVPKNFPFIFASASVIGFFSVLMTYFGVNYYLTGMHSYA AGEAEPVPLWVELMVAGIILLIIIASRKRVLDMPHL

Hydrogenobacter thermophilus CcsA (UniProt ID: D3DG85) MRHVIHQRNVWMLSDYLIAFIGIFIAMLLSLLQVDNVFWYKSATLLYALSSVMYLSYPLFRNSLFGKVSTLTLFL GLLLNLAGMIRRGVQSYQLGVFHPPWSNLFEALTFWSFIAGSIYLLIERKYALRILGTFVVPLIFGLSAFAMLKA SKDITPLMPALRSYWLYLHVVTAFVGYAGFTVAFAGAVLYLVKGRFPQTKLLPSQDTLEEITYKAIIVVFPVWTA SIILGSAWANEAWGGYWSWDPKEVWSLIVWLFFGAYLHARQMLGWRGRRVAWMVVLGFITVLICFFAINLYFPGL HSYATD


Micrococcus luteus CcsA (UniProt ID: A0A132HDT9) MFNPTAPVNEQLALYSDLFMLIAALVYAAAFILFTIDMATASATIRRLEADLAAQRGQVRTAPARETVGVAVGGS TAVGASDSDTARPAAAEPGVDADDDLVDEDMDYTGGGRRPVANVAVAVLAVGWALHAFAVVARGLAASRVPWGNL YEFMTTGALVITTVYLLFLLRKDLRFVGTFVTGIVVAMMCAATMGFPTPVGHLQPPLQTPWLVIHVSIAVLASSL FALTAAMSVLQLLQDRAEKRAAAGERSWAFLRLVPAAQGLENWSYRLNAVGFVMWTFTLIAGAIWAEAAWGRYWN WDTKEVWTFVIWVVYAAYLHARATRGWTGARAAWLSIAGFLCIVFNYTIVNTYFPGLHSYAGLPG

Symbiobacterium thermophilum CcsA (UniProt ID: Q67NM8) MTPANQFLLDAAQWLLLGAFASYLVAFLLYVSGTLGRRISGGTRFTRTPGHGTVANVIGVVLHGLAILARWQGSG FWPTSNMYEFIGFMAFSSMVAFLVLHGMYRLYVLGALVTPVTIALLAYSYVFPPEVTPLIPALQSYWLPLHVSLA ALGEGFFAVAFGAALLYLLRVRGMEIAQARAVSEAAAGMETAAPMAGAAPMAGAAPGVAAESRWERIWGVRLLEI VFYLILVLLGFTVLALFFRYAGFQWIFNNGMTTYHLPPIIGPYGAEVGEKGTILGIPLPTVVVPFGWRGKHLNTL LYSVVIGAFLYGFIRRFVTRGRIGDALALRVTADPELLDEISYRAVAIGYPIFTLGGLIFAMMWAKEAWGRYWMW DPKETWAFIAWLVYSAYLHFRITHGWEGRRSAWLAVLGFGVILFTLVGVNLLIVGLHSYAGGDL

12.3 DNA sequences

Aequorea victoria gfp AGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAA TTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCAACATACGGAAAACTTACCCTTAAATTTATTTGCACTACTGGA AAACTACCTGTTCCATGGCCAACACTTGTCACTACTCTGACGTATGGTGTTCAATGCTTTTCCCGTTATCCGGAT CATATGAAACGGTATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAA GATGACGGGAACTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAA GGTATTGATTTTAAAGAAGATGGAAACATTCTCGGACACAAACTTGAGTACAACTATAACTCACACAATGTATAC ATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACATTGAAGATGGATCCGTT CAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTG TCGACACAATCTGCCCTTTTGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCT GCTGGGATTACACATGGCATGGATGAACTATACAAA

Aquifex aeolicus ccsA ATGCGAGAAGTAAGTTTAAATGAAAAAGAGTTTCACTTAAATAAGGACTTATTTTTCCTGCTCGGTGGAGTCCTC CTCTCCGTTTTAGCTGGATTCTTTCTCCCCTTTGAAAACCTCTTCTGGTACAAATCCGCCTTTCTGGGATACATA CTCTCTTCTGTGGTTTACGTAGCCTACCTTCTGGTACGTGAAAGGATAGTTGGCGTAGTAGCAACGCTGACGCTT TACCTTTCCCTGCTCCTCAACTTAACGGGAATGATAAGGAGGGCTTACGAGAGTTATAAGCTCGGCGTTTTCCAC CCGCCTTGGAGTAATTTGTTTGAAGCACTAACCTTCTGGAGTTTTATAGCCGGATTTGTTTACCTTGTTATTGAA AGGAAGTACGGATTTAAGATACTCGGAGCTTTTGTAATTCCCGTAATTGCAGCAATTTCTGGTTTCGCCATATAC AAGGCAAATCCTGAAATAACACCCTTGATGCCCGCCCTTAGGAGTTACTGGCTTTACCTTCACGTGGTAACAGCC TTTACGGGATACGCGGGTTTTACAGTTGCATTCGGTGGAGCGGTGGCTTACTTGCTCAAAGAGCACTTCCCGGAG AATAAATTCGTAAAAAACCTCCTTCCGAGAAGGGAAATCCTCGACGAGATAACTTACAAATCGATCGCCATCGTT


TTTCCGATATGGACCGCCTCAATAATCCTGGGTGCTGCGTGGGCAAATGAGGCGTGGGGCGGTTACTGGAGCTGG GACCCTAAGGAAGTATGGTCCTTAATAGTCTGGCTATTCTTCGGAGCGTATCTGCATGCAAGACAGCTTATGGGC TGGAAAGGGAAGAGAGTTGCTTGGCTTGTGGTTTTCGGGTTTATTACAGTTCTTATATGCTTCTTCGCGGTAAAC CTCTACTTCCCGGGACTTCACTCTTACGCTACCGAGTGA

Aquifex aeolicus ccsB ATGAAGAGAGTGCTTGAATTTCTCGGAGGCTTCGGCGGTTTAGTTTTCGGATTTGTTTTCTTCGTTGGAATGTCC ATTTTAGGACTTTTTCACCTCGAAGAACACCCTCCCCTTTACTGGGCGGGCTTTTTTGCTTCAGTTTTTCTCTTC GTACTTTCTTTCCTACTTAACCTGATAAACTGGGTAAAGGCTTTAATAAAGGATTATAAAAAGCACGGAAGTGTT CTTGCCTTCGTGTACGACTTCCTCGCTTCACTTAAGCTCGCAATATTCATAATGCTCGTTCTCGGAATTCTCTCC ATGCTCGGTTCTACCTACATAAAACAGAACCAGTCCTTTGAGTGGTATCTGGATCAATTCGGGTACGACGTGGGT ATATGGATATGGAAGCTCTGGCTCAACGACGTATTTCACTCCTGGTATTACATCCTCTTTATAGTTTTGCTCGCG GTAAACCTGATTTTCTGTTCCATAAAGAGACTTCCGAGAGTATGGAAGCAGGCTTTCAGTAAAGAAAGGATACTT AAACTGGACGAGCACGCGGAAAAGCACTTAAAGCCCATAACCGTAAAAATTCCCGACAAGGATAAGGTTTTGAAG TTTTTACTGAAAAAGGGATTTAAGGTTTTCGTTGAAGAGGAAGGAAATAAACTCTACGTGTTCGCAGAAAAGGGT AGGTTTTCAAGACTCGGCGTTTACATCACACACATAGCCCTCCTCGTTATAATGGCGGGAGCACTGATTGACGCG ATAGTTGGTGTGAGGGGAAGCCTTATAGTTGCGGAAGGGGATACAAACGACGTAATGCTCGTGGGTGCGGAACAA AAGCCCTACAAACTGCCTTTTGCGGTTCACCTGATAGACTTCAGGATAAAGACTTATGCGGAAGAAAACCCGAAC GTGGATAAACGTTTTGCTCAGGCGGTGAGCTCTTATGAGAGTGATATTGAAATCATAAACGGCGGGAAAGTTGAA GCAAAGGGAACCGTTAAGGTTAACGAGCCCTTTGACTTTGGAAGGTACAGACTATTTCAGGCGACTTACGGAATT CTTGACGGAACGAGCGGTATGGGCGTTATAGTCGTTGACAGAAAGAAGGCACACGAAGATCCCGAAAAGGCGGTA ATAGGAACCTTTGAGATAAAAACGGGACAGGTTGTTGAGTTCAAAGACATGCTAATTTCCATAGACAGAGTTGTT CTCAACGTGCACGATCCCAACAACAGGAACGAACTCGCACCGGCGGTAGTTTTAAAGGTTATGCTAAACAGAGAA CTTTACAGCGTTCCCGTTATATACGACCCTAGATTAACCGCTCTTGTATTCTCTCAAATACCAGAACTAAAAGAC TTTCCCTACATGTTCTTCATGAACGGGTTTGAACCTCTATTCTTCTCAGGCTTTGAGGTTTCCTACCACCCTGGA AGTGTAATTATATGGATAGGCTCGGCTATACTGGTTCTCGGAATGGTGGTGGCCTTCTACACGGTTCACAGAAAG GTGTGGATGAGGATAGAGGGAGATACAGCAAAGGTAGCCTTTTACTCCCACAAGTTTAAAGAGGAATTTAAGAGG AGCTTTTTAAGGGAACTGGAGGAGTTAAAGAGAGCTTGA

Bacillus subtilis mistic ATGTTTTGTACATTTTTTGAAAAACATCACCGGAAGTGGGACATACTGTTAGAAAAAAGCACGGGTGTGATGGAA GCTATGAAAGTGACGAGTGAGGAAAAGGAACAGCTGAGCACAGCAATCGACCGAATGAATGAAGGACTGGACGCG TTTATCCAGCTGTATAATGAATCGGAAATTGATGAACCGCTTATTCAGCTTGATGATGATACAGCCGAGTTAATG AAGCAGGCCCGAGATATGTACGGCCAGGAAAAGCTAAATGAGAAATTAAATACAATTATTAAACAGATTTTATCC ATCTCAGTATCTGAAGAAGGAGAAAAAGAA

Bacillus megaterium ccsA TTACGCATATGAATGTAAACCTGCAATAACTAGGTTCACGGCCACTAAGTTGAACATGATGATGGCAAACCCGCC AACTGCCAGCCAAGCTGATTTTTCACCGTGCCATCCTTTAGACAGCCGAAGATGAAGAAAAGCGGCATAAAATAA AAAGGTGATCAGCGCCCACACCTCTTTGGGGTCCCATCCCCAAAAGCGCGTCCAGGCCATCTGAGCCCAAATCAT


AGCGAAAATAAGTGCTCCTAATGTAAAAATAGGAAATCCAATCGCCACTGCTCTGTAGCTTACTTCATCAAGCAA GTCGCTGTTTACCTGCTTTACAAGCGGCTGCAGTGCGGCTCCAATTCGCTTCCGTAAAATAAGACGAATCAGCCA ATATAAGACAAACCCACCGGCTAACGACCAAATAACCGTATTTAATTTTTTAGCATTGATAATAGCCGGCATATC AACAAGCGGCTGTATGTCGTTCCCGTCTGTCCACTCACTTTGATGAGGCCCCACTAGCGGAACCATATGATATAC GACTTGCTCTTGTTCGCCACTTTTATTAACCCAATTAAATGATGACTCGTAGTGAGCCGCATTAAAAGCCAGCGT CACCGCTACAAACCCAAGCGTACTAATAAGCGTATACAGGATGAATTCCAGCCAAAATGCTTTTTTATTTCGCTT GGATTGATCAATTGTTTTGATTAAATAAATCAATCCTGCAACAAAGCTAATGGAAAGCACGGCTTCACCCGTAGC TGCCGTTGTAACGTGAATATGCAGCCAATTCGTCTGGAGAGCTGGTATCAGCGGCGTAATTTCAGTTGGAAACAT GCTTGCATAAGCAATGATTAAAAGAGCCACCGGCAAGGCAAAAAGTCCAAGCATGCTAAGACGATAAATAAAGTA AATAACAATAAATGCGCCAACAAGCATCATGCCAAAAAACGTCGTAAATTCAAATAAATTACTAACAGGAGCGTG GCCTGATGCAATCCACCTTGTAATGAAATAGCCAATCTGTGCAATAAAACCGATGATGGTAACCGCAATTCCAAA CGTTGCCCACTTTGTTTTCTTTTGAGCTGCTCGCTTATCTCGAATAGCTCCTCCGAAAAAAAAGGTTGCAACTAA ATAAGCAATAAACGCCACATACAGTAAATTACTACTTATACTCGCGTAATCCAT
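Some of the DNA sequences in this section (Bacillus megaterium, Hydrogenobacter thermophilus and Symbiobacterium thermophilum ccsA) begin with TTA or TCA and end with CAT, which suggests that they are listed as the reverse complement of the coding strand. The following Python sketch checks this for the last 30 nucleotides of the Bacillus megaterium ccsA listing by taking the reverse complement and translating it with the standard genetic code. The helper names and the compact codon-table construction are assumptions made for this illustration.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons in reading frame 1; '*' marks a stop codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Last 30 nucleotides of the Bacillus megaterium ccsA listing above.
LISTED_TAIL = "TAAATTACTACTTATACTCGCGTAATCCAT"

if __name__ == "__main__":
    coding_start = reverse_complement(LISTED_TAIL)
    print(coding_start)             # ATGGATTACGCGAGTATAAGTAGTAATTTA
    print(translate(coding_start))  # MDYASISSNL
```

The translated peptide matches the first ten residues of the Bacillus megaterium CcsA amino acid sequence in section 12.2, which supports reading this listing in the reverse-complement orientation.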

Bacteroides thetaiotaomicron ccsBA ATGAAACTGATTCGATTGATCGCTTCTCCGATTTTGATGTACGTATTAGCGGGAGTTTATGCGTTGGTACTGGCA ATCGCCACTTTTGTGGAAAATTCGTCCGGTCCGGCCGTTGCGCGCGAATACTTTTATTATGCTCCGTGGTTTATT TTGTTGCAACTCTTGCAGGCTGTCAATCTGTTGGCCATGTTTCTTCAGGGAGGTTATTTCAAAAGAATCAGTAAA GGAAGTCTGATCTTTCATGGGGCTTTGGTTTTTATTTGGTTGGGGGCGGCTGTTACTCACTATGCAGGAGTAACA GGAATCATGCACATCCGTGAGGGGGAAACGGTGGACCGCATGATGCGCGATGAAGGGGCAGGGATGGGAAATGCC TCCCTGCCTTTTTCTGTCACTCTCGATGATTTCCGGTTAAAGCGTTATCCGGGCTCTCACAGTCCGATGTCTTAT GAAAGCGATCTGGTGATAAAGAAGGAGAATGAAGCTCCTTTGCAGGCTACGGTCCGGATGAATAAGGTGATTGAA GTGGATGGTTATCGCTTGTTTCAGTCTTCGTTCGATCCGGATGAGCAGGGAACGGTGTTGTCCGTAAGTTATGAC CGTCCCGGTATGCAGATTACCTATATCGGCTATTTCCTGCTGTTTGCCGGCTTTGTGCTGACTCTGTTCAGTAAG AAGTCCCGTTTCGGGCGTTTGCGCAGGGAACTGGGGGAGATGAAAAAGAATGCTCCTTTCTGTCTGCTTCTCTTT TTAGGTTTATCCGGTGCTTTGGGTACACAGGCGTCATATGCGCAGGAAACGCTTTCTTCTTCGCAGCTACCTTGC ATTCCTGCCCCGCACGCCCGGAAATTCGGCAGTCTGGTACTGCTCAATCCTAATGGTCGCCTTGAACCTGTCAAC AGTTACACTTCCGCCATTCTGCGCAAACTTTACGGAGCGGACAAGCTGAACAGCATCAATTCAGACCAGTTCTTT CTGAATCTTCTCGCTTTTCCGGATGAGTGGGGTGGCTATCCTTTTATCAAGGTCGATAATAAAGATATTCTTCAA CGGTTCGGAAGAGACGGCAAATACATCGCATGGCAGGATGTCTTTGATGCCGACGGCAATTATGTGCTGACCGAT GAAGTGAATGCCATCTATGCAAAATCCGCTTCAGAACGGAAACGAATGGATTCGGATTTGTTAAAACTGGATGAA TCGGTGAATATTGTCTACCGTATTATGCAACATCAACTTCTGCCTCTCTTCCCGGACGAAAACGACGTTCAGGGA AAATGGTACTCGGCAGGTGACGAACAAACCGTCTTTCATGATAAGGATTCCCTGTTTGTTTCTAAAATCATGGAT TGGTATATCTACGAATTGGGTAATGGTGTCCGCACTAATAACTGGAAAGAAGCGGACAAGATTGTCGATATGATG CATATCTTCCAGCAAGCTAAAAGTAAAACTCCCGCTATTGACAATCAGAGAGTAAAAGCGGAGCTTCTCTATAAT CAGTTAAACCTGTTCTTCTGGTGTCGTCTGGCTTACCTGATTTTGGGCGGAATCTTACTTTTCATAGCCTGCGGA GAGATTATCGCGGATTTCAAATGGGGGAGCAGACTGAGCAGCATACTGATTGTTCTTCTGATCGCTGCTTTCCTT GCACACACGACCGGTGTCTTACTGCGCTGGTATATCTCCGAACGTGCACCGTGGGCGAATGCTTACGAATCAATG ATCTGCACTTCCTGGCTGCTCGTTGGCGGAGGCCTCTTGTTTGCCCGGCGTTTCCGTATTCTCCCTGCGCTTGCC GGACTGTTAGGCGGAATCATGCTTTTCGTGGCAGGACTGAACCATCTGAATCCGGAAATAACTCCTTTGGTTCCG


GTGCTTCAGTCGTATTGGCTGATGTCTCATGTCGCCATTATCATGATCGGCTATGTATTCTTCGCACTTTGTGCG CTGACGGGACTGTTTAATCTTATCTTGATGAATCTGCTTTCTGCCACCAATCGGGTGAAGCTTCTCTTCCGTATC CGTGAGTTTACCTTACTGAATGAAATGGCAATGATTCTCGGCTTATTCTTTATGACGGCAGGAACCTTCCTCGGT GCCATCTGGGCAAATGTCTCATGGGGCAGATACTGGGGATGGGACCCGAAAGAAACATGGGCTTTAATCTCTATT GTTGTCTATGCACTGGTACTTCATATCCGTTTTATCCCGCTGCTGAAAGGTAAAACAACTTGGTGTTACAATTTG CTGTCCGTAGTCTCCATTCTCTCCATTATCATGACTTGGTTTGGAGTAAACTACTACCTGAGCGGCTTGCACTCT TATGGAAAGACCGAAGGGGGAGATTTACTGTTATGGATATGGGGAGCGGGTTTGTGTGTTGTGCTTGCTTTGGCC CTCTTTGCACGCAGGCGTTTGAAAAAGTATTCATGA

Bacteroides thetaiotaomicron ccsA ATGATAAACTGGGATAATTTTTATGTATTTGCAGCAGTATCCATCTGCCTATGGCTGGCAGGAGCTGTATTTGCC CTTCGTTCGTCATCGAAGAGTAAGGTGGCAATCGGATTTACTTCCGGCGGCATTATTATATTGGGAGTATTTATT GCCGGATTATGGATATTCCTCCAGCGCCCTCCATTACGCACTATGGGAGAAACCCGCTTGTGGTATTCATTCTTC ATGGGAATAGCAGGATTACTAACCTATATCAGATGGAAGTATCGCTGGATTCTCTCTTTCTCGACGTTACTTTCC ACAGTTTTTGTGATCATTAATTTGATGAAGCCCGAGATACACGATCAGTCATTGATGCCTGCGCTGCAAAGTGTA TGGTTTATCCCGCATGTGACCGTATATATGTTCTCTTACTCCGTTCTGGGATGTGCTTTCATCATCGCCCTGACC GGACTGTTCCGTCATAAAGAAGAATACCTGCATACAGCAGACAATCTGGTCTACGCAGGTGTCGCCTTTCTTTCC ATCGGTATGCTACTCGGTTCTCTCTGGGCAAAGGAGGCTTGGGGAAACTACTGGAGCTGGGACCCTAAAGAAACT TGGGCAGCCATCACCTGGATGGGATATTTGCTCTACATTCACCTCCGCCTGTTCAGAAGAACCGGACGGAAAACA CTCTACGTCCTGCTCATAGTCTCCTTCCTTGCTTTACAAATGTGCTGGTATGGAGTCAACTACCTGCCGGCGGCC CAACAAAGCGTACATTTATATAATCGTAATAACTAA

Escherichia coli thioredoxin ATGAGCGATAAAATTATTCACCTGACTGACGACAGTTTTGACACGGATGTACTCAAAGCGGACGGGGCGATCCTC GTCGATTTCTGGGCAGAGTGGTGCGGTCCGTGCAAAATGATCGCCCCGATTCTGGATGAAATCGCTGACGAATAT CAGGGCAAACTGACCGTTGCAAAACTGAACATCGATCAAAACCCTGGCACTGCGCCGAAATATGGCATCCGTGGT ATCCCGACTCTGCTGCTGTTCAAAAACGGTGAAGTGGCGGCAACCAAAGTGGGTGCACTGTCTAAAGGTCAGTTG AAAGAGTTCCTCGACGCTAACCTGGCGTAA

Geobacter sulfurreducens ccsA ATGAAATGGCTCCTTGTTGCCTCGGCGGCACTCTATCTCTTCGGTTCGTTCCGACGCCCCCTCTTTGCACTGGGT CTCGGCGCGGGGCTCGCCTATCTGGCCCTCCGCGGCATCTCCCTGGGCCGGTTGCCCCTGGTGGGCCCCCACGAC ACCATTGCCTTCTTTTCCGCCTCCATCGGGCTCATGACGCTGCCGTTCCTCTTTTCGCCCTCGCTGCGGAACTCT TCCGCCTTCCCTTGGGCAACGGGGGGAACCGCCGCGGTTTTCGCGCTGTTCTCCCTGGCATTTCCCGCCCTCGCC ATGCCGCTGCCGCCGATACTCAACACCCTCTGGTTCGAGCTGCACGTGGCGCTTGCCTTCTTCGCCTATGCGCTC TTCACCATCGGGGCCATCATGGGCGTCCTCTTCCTGGCCGGCGGAGAGCGGCGGCTCCTGGATCTCCAGTACCGG GCGGCCCTCGTGGGATACACCTTTTTTTCCGGCTCCATGGTTGCCGGCGGCATCTGGGGATACTATGCCTGGGGG ACCTACTGGCTCTGGACGCCCAAGGAGCTCTGGACCTCCATCCTCTGGATATTCTATACCTTCTGGCTCCATCTG CGCCTGAGAGGAGCCGGCGGAGACCGGCTCCTGGCCTGGACCGGCATCCTCGGCTTCGGGGTCATGCTCTTCACC TACCTGGGGGTGAGCATGCTCATGAAAAGTTCGCACAGCTTTTAA


Helicobacter hepaticus ccsBA ATGATGAATATAATTAAAACACTTTTTTGTTCAATGAAGATGGTTCTTTTGCTCATAGGCATATATGCCACAGCT TGTGGCATAGCAACATTTATTGAAAAATATGAGGGCACACTTGCAGCTCGTTTATGGGTATATGACGCATTTTGG TTTGAGATATTACATATTTGGCTTGTAGCTTGTCTCATTGGCTGTTTTATCACTTCAAAAGCGTGGCAGAGAAAA AAATATGCTTCATTGCTCTTGCACGCCTCTTTTATCGTTATCATCATTGGTGCGGGTATTACTCGTTATTATGGG TTTGAGGGCTTAATGAATTTGCGTGAGGGACAAAGCGTGAATTTTATTTCTACAAATACCCATTATATTTTTATC CAAATCAAAAATCCTCAAGGCGATGTAGAATCTGTAAGGATTCCCACTTATATTGATGAGAAAGTCAATCATAAA ATCAATCAACATTTAACATTTTTTGGCAAACCTTTGACATTGCATACAGAAGAATTTACAGCAAAACAAGTAAAT ATGTCTGAACTTTTTATACTCAATGCGAGTATTGATTTTTTAGGAAAAAATGAGAAAACACTTATTATGCGCGAT GGGAATAATGCACCTACAAAAGAAAACATTACTATGCTTGAAATTGAAGGCTATAAGATATTTCTTGCGTGGGGC ATAGATAATATTGCCCTCCCCTTTAGTATAAAACTTAAAAAATTTGAACTTGAGCGATACCCGGGTTCAAACTCT CCTGCCTCATATACTTCGGAAGTAGAAGTGCTAGATGGGCAAAATCCTCCCTTGCCTTTTAGAATCTTTATGAAT AATGTATTAGACTATGGTGGTTATCGCTTTTTTCAATCCTCTTATCACCCTGATGAAAAAGGCTCTATTCTCTCT GTTAATAATGACCCGGGCAAAACACCTACATACATAGGCTATGCTATGCTTATACTTGGCGTGATTTGGCTACTT TTTGATAAAAACGGACGATTTGCCACACTTGGTAGATTCTTAAAAACACAAAAGTTTTTCTCACTTATGCTTTGT AGTGCACTTTGTTATGCTTTGAGTAGCCCACAAATAGCTTATGCCTCAACACAATCACAAACTGACTTCCAGCCT TTATCTGAAAATGAAATACCACCTTTACAAGATATCCCCTCTATGATAAAGGCATTGGCGGATACTTCAAGCCTC ACAAATGATTTTGATAGAATCCTCGTTCAAGATTTTGGTGGGCGTATAAAGCCTATGCACACCCTTGCCAATGAG TATATTCATAAATTGACTCAACAAAGAACATTTAAAGGGCTCAATCCTTCACAAGTTTTCTTAGGTATGCTCTTT TATCCACAAGAATGGCAAAGTATCCAAATGATAGCCACCAAAAGTCCAAAGCTCCGCCAGATTCTAGGATTAGAC GAAAATCAAAAACATATTGCTTATATTGATGTTTTTACGCCTCAAGGACAATATATTCTCCAAAACTATGTTGAA GCCGCAAACCTCAAAAGCCCATCTTTGCGCGATACTTTTGAAAAAGATGTAATAAGCGTAGATGAGCGCATTAAT TATGCTTTCCTTATTTACACAGGGCAGGTGCTTAGAATTTTCCCCGATAACAAATCTCCTAATAACCAATGGCTC TACCCACTTCAAGCCATAAGCAGTGCTGTGGCACAAGATGATACAAAAAAAGCAAAAGAGTTAATGCAAATTTAT AAAAAATTCGCTCAAGGTATGCAACAAGGAATAAATACCCATAATTGGCAAGAAGCCGCTCAAGCAACACGAGAC ATTCGCACATTTCAACAAAACAATGGTGGCTCTTTACTTATTTCGCCTGCAAAGGTAGATTCTGAAATTTGGCTT 
AATCTTTATAATCCATTTTATCAGCTCACCTATCCTTATATCTTCATAAGCATTGTGCTTTTTATCATTGTGCTT GTAGGAATTCTCAAAAATACTCCTACACGTCCGCTTATTCACAAAGTATTTTATATTCTTTTGTTTGCGCTTTTT ATTTTGCATACTTGCGGATTGGGATTGCGATGGTATGTAAGCGAACACGCCCCTTGGAGTAATGCCTATGAATCT ATGCTTTATATTGCGTGGGCAGCAATTCTCTCTGGTGTAGTATTTTTTAGGCGCTCAAATCTTGCATTATGCGCC TCAAGCTTTTTGGCAGGAATGACTCTCTTTGTGGCTAATCTTGGTGATATGGACCCACAAATTGGGAATCTTATG CCCGTGCTTAAATCTTATTGGCTCAACATTCACGTCTCCGTCATCACTGCAAGTTATGGATTCTTAGGTTTATGC TTTATGCTTGGACTTATCACACTCATTATGTTTTTGTTACGAAATGAAAAAAGATCTCAAGTGGATTGCTCAATC CTCTCTTTAAGCGCATTGAATGAAATGAGTATGATTTTGGGCTTATTTTTGCTAAGTGTAGGGAACTTCCTTGGT GGCATTTGGGCTAATGAATCTTGGGGCAGATATTGGGGCTGGGATTCTAAAGAAACTTGGGCACTTATTAGTATT GGTGTATATGCAATTATCTTACACTTGCGCTTTGTTGTGCCTAAAAACTTTCCTTTTATCTTTGCAAGCGCGTCT GTAATAGGCTTTTTCTCAGTGCTAATGACCTATTTTGGAGTGAATTATTATCTCACAGGTATGCACAGCTATGCC GCAGGAGAAGCGGAGCCTGTGCCTTTATGGGTAGAACTTATGGTAGCAGGGATTATACTGCTCATAATTATCGCA AGTAGAAAAAGAGTGCTTGATATGCCCCATTTATAA


Hydrogenobacter thermophilus ccsA TTAATCCGTTGCGTAGCTGTGAAGCCCTGGGAAGTATAGATTTATAGCAAAAAAGCAAATAAGCACGGTAATAAA GCCTAAGACCACCATCCAGGCAACGCGCCTTCCTCTCCATCCCAGCATCTGTCTTGCATGCAGGTACGCTCCAAA GAAGAGCCAGACTATCAAGGACCATACCTCTTTTGGATCCCAGCTCCAGTATCCACCCCAAGCTTCGTTTGCCCA TGCAGACCCCAATATTATGGAAGCCGTCCATACTGGAAAGACAACTATGATAGCTTTGTAAGTTATTTCTTCCAG AGTATCTTGGGATGGGAGCAATTTGGTTTGAGGAAACCTCCCTTTAACCAAGTAAAGAACTGCTCCAGCAAAAGC AACGGTAAAACCAGCATATCCTACGAAGGCTGTTACCACATGAAGATAAAGCCAATAACTTCTCAATGCAGGCAT AAGGGGAGTTATATCCTTTGATGCCTTAAGCATTGCAAAAGCGCTCAGACCAAAGATAAGAGGCACGACAAAAGT GCCAAGTATTCTAAGAGCGTACTTTCTTTCTATAAGCAGGTATATGCTGCCAGCTATAAAGCTCCAGAAGGTGAG GGCTTCAAAAAGATTGCTCCACGGTGGATGAAAGACACCAAGCTGGTAGCTCTGCACACCCCTTCTTATCATACC CGCCAGGTTGAGAAGCAATCCCAAAAAGAGGGTAAGGGTTGACACTTTACCGAAAAGGGAGTTTCTAAAGAGAGG ATAAGAGAGGTACATGACGGAAGATAAAGCATACAGCAAGGTAGCGGATTTGTACCAGAAGACATTATCCACCTG CAAAAGGGAAAGGAGCATGGCTATAAAAATCCCTATGAAAGCAATAAGGTAATCGCTGAGCATCCATACATTTCT TTGATGTATTACATGTCTCAT

Micrococcus luteus ccsA GTGTTCAACCCCACGGCCCCCGTGAACGAGCAGCTGGCGCTGTACAGCGACCTCTTCATGCTCATCGCCGCGCTC GTGTACGCGGCGGCGTTCATCCTGTTCACCATCGACATGGCCACCGCCTCGGCCACCATCCGCCGCCTCGAGGCC GACCTCGCGGCCCAGCGCGGGCAGGTGCGCACCGCTCCGGCCCGCGAGACCGTGGGGGTCGCCGTCGGCGGGTCC ACGGCGGTCGGCGCGAGCGACTCCGACACGGCCCGGCCGGCCGCCGCCGAGCCCGGCGTCGACGCGGACGACGAC CTCGTGGACGAGGACATGGACTACACGGGCGGCGGCCGCCGCCCGGTCGCCAACGTGGCCGTGGCCGTGCTGGCC GTCGGCTGGGCGCTGCACGCGTTCGCCGTCGTCGCCCGCGGCCTCGCCGCCTCCCGCGTGCCGTGGGGCAACCTG TACGAGTTCATGACCACGGGCGCGCTCGTGATCACGACCGTCTACCTGCTGTTCCTCTTGCGCAAGGATCTGCGC TTCGTGGGCACGTTCGTCACCGGCATCGTCGTGGCGATGATGTGCGCGGCCACCATGGGCTTCCCGACGCCGGTG GGCCACCTCCAGCCGCCGCTGCAGACGCCGTGGCTCGTCATCCACGTGTCCATCGCCGTGCTGGCCAGCTCGCTG TTCGCGCTGACCGCCGCGATGTCCGTGCTGCAGCTGCTGCAGGACCGCGCGGAGAAGCGCGCTGCGGCGGGCGAG CGCTCGTGGGCGTTCCTGCGCCTCGTGCCGGCCGCGCAGGGGCTGGAGAACTGGTCCTACCGCCTCAACGCCGTC GGCTTCGTCATGTGGACCTTCACGCTGATCGCCGGTGCGATCTGGGCCGAGGCCGCGTGGGGCCGCTACTGGAAC TGGGACACCAAGGAGGTCTGGACCTTCGTGATCTGGGTGGTGTACGCGGCGTACCTGCACGCCCGCGCCACCCGC GGCTGGACCGGCGCCCGCGCGGCCTGGCTGAGCATCGCGGGCTTCCTGTGCATCGTGTTCAACTACACGATCGTC AACACGTACTTCCCGGGCCTGCACTCGTACGCGGGTCTGCCCGGCTGA

Symbiobacterium thermophilum ccsA TCACAGATCGCCGCCGGCGTAGGAGTGCAGGCCGACGATCAGCAGGTTGACGCCCACCAGGGTGAACAGGATGAC GCCGAAGCCCAGCACGGCCAGCCAGGCGGAACGGCGCCCCTCCCACCCGTGGGTGATCCGGAAGTGCAGGTAGGC GGAGTACACGAGCCAGGCGATGAACGCCCAGGTCTCCTTGGGGTCCCACATCCAGTAGCGGCCCCAGGCCTCCTT GGCCCACATCATGGCGAAGATCAGGCCGCCGAGGGTGAAGATCGGATAACCAATCGCCACCGCCCGGTAGCTGAT CTCGTCCAGCAACTCCGGGTCTGCGGTCACCCGCAGGGCCAGGGCGTCGCCGATGCGGCCCCTGGTAACGAACCG CCGGATGAACCCGTACAGGAAAGCGCCGATCACCACCGAGTAGAGCAGCGTGTTGAGGTGCTTGCCCCGCCAGCC


GAAGGGCACGACCACGGTGGGCAGCGGAATCCCGAGGATCGTCCCCTTCTCCCCCACCTCGGCGCCGTACGGTCC GATGATGGGCGGCAGGTGGTAGGTGGTCATGCCGTTGTTGAAGATCCACTGGAAGCCGGCGTACCGGAAGAACAG GGCCAACACGGTGAAGCCGAGCAGGACCAGGATGAGATAGAAGACGATTTCCAACAGCCGGACGCCCCAGATGCG CTCCCAGCGGCTCTCCGCGGCGACGCCCGGCGCGGCCCCAGCCATCGGAGCGGCCCCCGCCATCGGCGCGGCGGT CTCCATTCCGGCCGCCGCCTCGCTCACGGCCCGGGCCTGTGCGATCTCCATCCCGCGCACCCGGAGCAGGTAGAG CAGCGCCGCGCCGAAGGCCACGGCGAAGAAGCCCTCGCCCAGGGCCGCCAGGCTGACGTGCAGCGGCAGCCAGTA AGATTGCAAGGCCGGGATCAGGGGCGTGACCTCGGGCGGGAAGACGTACGAGTAGGCCAGCAGCGCAATGGTCAC CGGGGTCACCAGCGCGCCGAGCACATACAGCCGGTACATCCCGTGCAGGACCAGGAAGGCCACCATGCTCGAGAA CGCCATGAACCCGATGAACTCATACATGTTGGACGTCGGCCAGAACCCGGATCCCTGCCACCGGGCCAGGATCGC GAGGCCGTGCAGGACCACGCCGATGACGTTGGCTACCGTGCCGTGGCCGGGCGTGCGGGTGAACCGGGTGCCCCC GGAAATCCGCCTGCCCAGGGTGCCTGAGACGTACAGCAGGAACGCGACAAGGTAGGACGCAAAGGCGCCGAGCAA CAGCCACTGGGCGGCATCCAGCAGAAACTGGTTAGCCGGTGTCAT

12.4 Overview of CcsBA/CcsA homologs

Table 12.12: Overview of the taxonomic groups from the Bacteria domain in which the Ccs system is present and the Ccs homologs from representative organisms characterized in the first part of this thesis

Groups (Phyla) with System II    Model organism                   Ccs proteins
Aquificae                        Aquifex aeolicus                 CcsA
Aquificae                        Hydrogenobacter thermophilus     CcsA
Actinobacteria                   Micrococcus luteus               CcsA
Bacteroidetes                    Bacteroides thetaiotaomicron     CcsA, CcsBA
Cyanobacteria                    -                                -
Chloroflexi                      -                                -
Firmicutes                       Bacillus megaterium              CcsA
Firmicutes                       Symbiobacterium thermophilum     CcsA
β-Proteobacteria                 -                                -
δ-Proteobacteria                 Geobacter sulfurreducens         CcsA
ε-Proteobacteria                 Helicobacter hepaticus           CcsBA


12.5 Secondary structure predictions (made by Protter)

[Figures: secondary structure predictions for Aquifex aeolicus CcsA, Bacillus megaterium CcsA, Bacteroides thetaiotaomicron CcsA, Bacteroides thetaiotaomicron CcsBA, Geobacter sulfurreducens CcsA, Hydrogenobacter thermophilus CcsA, Helicobacter hepaticus CcsBA, Micrococcus luteus CcsA and Symbiobacterium thermophilum CcsA]


12.6 Abbreviations

(v/v) volume per volume
(w/v) weight per volume
ALA Alpha-lipoic acid
Ampr ampicillin resistance
APS ammonium persulfate
atm atmosphere (unit of pressure)
BCA bicinchoninic acid
BCIP 5-bromo-4-chloro-3-indolyl phosphate
Camr chloramphenicol resistance
CCHL cytochrome c heme lyase or System III
Ccm cytochrome c maturation or System I
Ccs cytochrome c synthesis or System II
CMC critical micelle concentration
CV column volume
ddH2O double deionized water
DDM n-dodecyl β-D-maltopyranoside
DNA deoxyribonucleic acid
dNTP deoxynucleotide triphosphate
EDTA ethylenediaminetetraacetic acid
GFP green fluorescent protein
GST glutathione S-transferase
HABA 2-[4'-hydroxy-benzeneazo]benzoic acid
HCCS holocytochrome c synthase or System III
IPTG isopropyl β-D-1-thiogalactopyranoside
IMS intermembrane space
LB lysogeny broth media
LDAO lauryldimethylamine oxide
NBT nitro blue tetrazolium
NEB New England Biolabs
OD600 optical density at 600 nm
OGP octyl β-D-glucopyranoside
ORF open reading frame
PAGE polyacrylamide gel electrophoresis
PCR polymerase chain reaction
PEG polyethylene glycol
SDS sodium dodecyl sulphate
SEC size exclusion chromatography
Strepr streptomycin resistance
TB terrific broth media
TCEP Tris(2-carboxyethyl)phosphine
Tet/Tetr tetracycline/tetracycline resistance
TEV tobacco etch virus
TEMED tetramethylethylenediamine
TMBZ 3,3',5,5'-tetramethylbenzidine
TMs transmembrane segments
Tris Tris(hydroxymethyl)aminomethane
WWD tryptophan-rich domain (WXWD)

12.7 Units

% percent
°C degree Celsius
Å Angstrom
Au absorption unit
bp base pair
Da Dalton
g gram
h hour
l litre
m meter
M molarity
min minute
rpm revolutions per minute
sec second


12.8 Prefixes

M mega (10⁶)
k kilo (10³)
c centi (10⁻²)
m milli (10⁻³)
μ micro (10⁻⁶)
n nano (10⁻⁹)

12.9 Nucleobases

A Adenine
C Cytosine
G Guanine
T Thymine
U Uracil

12.10 Amino acids

A Ala Alanine
C Cys Cysteine
D Asp Aspartate or aspartic acid
E Glu Glutamate or glutamic acid
F Phe Phenylalanine
G Gly Glycine
H His Histidine
I Ile Isoleucine
K Lys Lysine
L Leu Leucine
M Met Methionine
N Asn Asparagine
P Pro Proline
Q Gln Glutamine
R Arg Arginine
S Ser Serine
T Thr Threonine
V Val Valine
W Trp Tryptophan
Y Tyr Tyrosine


Acknowledgements

First, I would like to express my gratitude to Prof. Oliver Einsle for the opportunity to work on these projects in his laboratory, for all the support and patience invested, for proofreading my thesis and DAAD proposal, and for all the ideas and feedback received at progress reports and institute seminars over the last five years. A respectful thank you to the thesis defense committee members, Prof. Susana Andrade and Prof. Hans-Georg Koch. To Prof. Thorsten Friederich and Prof. Susana Andrade, thank you for all the discussions and tips during my seminar talks. To Stefan and Wholi, thank you for all the organizational matters and the safety instruction talks. I would like to thank our lab staff, Toni, Isa, Frau Weiser and Manuel, for all the great help provided over the years, and a special thank you to Elke for helping me at the beginning of my Erasmus training period. To the secretariat ladies, Christiane, Veronique and Sabine, a big thank you for solving all my problems and for the help provided! To my colleagues from the Einsle, Andrade and Friederich groups, thank you all for the great work environment, for the help and for all the fun moments over the years! Special thanks to Anton, who from day one was a great support for me, encouraging, guiding and helping me in the lab with fruitful discussions and great ideas, with crystallography and diverse teachings, but mostly thank you for being my friend. To Paula, thank you for making my coming to the lab possible and for being such a good friend. To Anja and Laure, thank you for being my super girlfriends. To Sarah, it was a pleasure to be your supervisor; thank you for your interest in my project and all the hard work you invested, but mostly I am grateful that you became my friend.
To Lin, thank you for all the help in the lab (tons of ideas, answers to my questions and so many scientific discussions, even on the ‘beer’ evenings), for making the LCP crystallization plates and for risking yourself to try my super delicious cocktails. To Ivana, thank you for always caring about the lab environment (AKA lab mom) and for the funny talks around the corners. To Lukas, the other ‘heme’ guy left in the group, thank you for translating my thesis abstract into German and for being cool about me always disturbing you with questions, complaints or other things on my mind. Thank you, Chris, for the funny jokes and talks that always lifted my mood. To Beni, thank you for always helping me with ideas for presents. Thank you, Christian, Jakob and Martin, for our interesting office discussions. To Katharina, Michael and Florian, thank you for participating in our group's extracurricular social activities. To the old generation of colleagues, a big thank you to: Sippel, for creating a jolly work atmosphere with your laugh and for so many fun moments outside the lab; Bianca, for your cool presence and all your dedication to our group's well-being; Flo, for so many nice discussions about almost everything; Simon, for all the columns that you helped me pack and for all the funny discussions; Julia, for the EPR measurements until late in the night; Nienke, for being my supervisor in the Erasmus period and for introducing me to lab life and to the group; Nikola, for bringing sweets (and the chemistry behind them) to our group and for teaching us your super cool game; Lisa, for the nice talks; Tobi P. and Tobi W., for the funny and crazy social events; Julian, for all your help at the start of my PhD; and Flore, for the help provided on my project. A big thank you to the Andrade group: Agostina, for always participating in our social events and smiling from ear to ear all the time; Phil, for all the funny discussions; Fabian, for the cool attitude towards everything and for always presenting literature (and saving the day); Pauline, for the interesting discussions; Prachi, for your laugh that is always contagious; and Mathias, for the interesting conversations. To my fellow institute colleagues from the Friederich group: thank you for always helping me. To Joe, thank you for the protein stability fluorescence measurements and their interpretation. Last but not least, I would like to thank my parents and family for everything, for supporting and encouraging me in all my decisions.
Thank you to my Freiburg friends for making the free time so much fun. To Narcis, thank you for proofreading my thesis and giving me ideas. To Ilie, thank you for helping with corrections on my thesis, for all the long discussions and for always being there supporting me.
