<<

Kristīne Grāve Structural basis for

Characterization of Mycobacterium tuberculosis phosphatidylinositol Structural basis for metalloprotein catalysis synthase PgsA1 and Bacillus anthracis R2

Kristīne Grāve

ISBN 978-91-7911-044-4

Department of Biochemistry and Biophysics

Doctoral Thesis in Biochemistry at Stockholm University, Sweden 2020 Structural basis for metalloprotein catalysis Characterization of Mycobacterium tuberculosis phosphatidylinositol phosphate synthase PgsA1 and Bacillus anthracis ribonucleotide reductase R2 Kristīne Grāve Academic dissertation for the Degree of Doctor of Philosophy in Biochemistry at Stockholm University to be publicly defended on Friday 27 March 2020 at 10.00 in Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B.

Abstract About a third of all need to associate with a particular ion or metallo-inorganic to function. This interplay expands the catalytic repertoire of and reflects the adaption of these catalytic macromolecules to the environments they have evolved in. A large portion of this work focuses on the membrane metalloprotein PgsA1 from the pathogen Mycobacterium tuberculosis and a -harboring R2 from the pathogen Bacillus anthracis, offering a glimpse into the metalloprotein universe and the catalysis they perform. This thesis is divided into two parts; the first part describes a method for high-throughput M. tuberculosis membrane protein expression screening in Escherichia coli and Mycobacterium smegmatis. This method employs target membrane protein fusions with the folding reporter Green Fluorescent Protein, allowing for fast selection of well-expressing membrane protein targets for further structural and functional characterization. This technique allowed overexpression of M. tuberculosis phosphatidylinositol phosphate synthase PgsA1, leading to its crystallization and the characterization of its high-resolution three-dimensional structure. PgsA1 is a MgII- dependent , catalyzing a vital step in the biosynthesis of phosphatidylinositol – one of the major phospholipids comprising the complex mycobacterial envelope. Therefore, PgsA1 presents an attractive target for the development of new antibiotics against tuberculosis. The second part of this thesis concerns the structural characterization of the B. anthracis class Ib ribonucleotide reductase radical-generating subunit R2 (R2b). R2b contains a dinuclear metallocofactor, which is able to be activated by dioxygen and generates a stable tyrosyl radical; the radical is further used for initiation of nucleotide reduction in the catalytic subunit of ribonucleotide reductase. R2b proteins utilize a di- cofactor in vivo, but can also generate the radical using a di- cofactor in vitro, albeit less efficiently. How does R2b achieve correct metallation for efficient catalysis? We show that the B. anthracis R2b protein scaffold is able to select manganese over iron, and furthermore, describe the structural features that govern this metal-specificity. In addition, we describe -dependent structural changes in di-iron B. anthracis R2b after reaction with O2, and propose their role in gating solvent access to the metallocofactor and the radical site.

Stockholm 2020 http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-178861

ISBN 978-91-7911-044-4 ISBN 978-91-7911-045-1

Department of Biochemistry and Biophysics

Stockholm University, 106 91 Stockholm

STRUCTURAL BASIS FOR METALLOPROTEIN CATALYSIS

Kristīne Grāve

Structural basis for metalloprotein catalysis

Characterization of Mycobacterium tuberculosis phosphatidylinositol phosphate synthase PgsA1 and Bacillus anthracis ribonucleotide reductase R2

Kristīne Grāve ©Kristīne Grāve, Stockholm University 2020

ISBN print 978-91-7911-044-4 ISBN PDF 978-91-7911-045-1

Cover image created using QuteMol software (http://qutemol.sourceforge.net) and inspired by artwork of ElizaLiv

Printed in Sweden by Universitetsservice US-AB, Stockholm 2020 List of publications

This thesis is based on the following papers, which are referred in the main text by their roman numerals.

Paper I Kristīne Grāve, Matthew D. Bennett, and Martin Högbom (2020) Expression of Mycobacterium tuberculosis membrane proteins using folding reporter GFP. Manuscript

Paper II Kristīne Grāve, Matthew D. Bennett, and Martin Högbom (2019) Structure of Mycobacterium tuberculosis phosphatidylinositol phosphate synthase reveals mechanism of substrate binding and metal catalysis. Communications Biology 2:175. https://doi.org/10.1038/s42003-019-0427-1

Paper III Kristīne Grāve, Julia J. Griese, Gustav Berggren, Matthew D. Bennett, and Martin Högbom (2020) The Bacillus anthracis class Ib ribonucleotide reduc- tase subunit NrdF intrinsically selects manganese over iron. Submitted to Journal of Biological Inorganic Chemistry

Paper IV Kristīne Grāve, Wietske Lambert, Gustav Berggren, Julia J. Griese, Matthew D. Bennett, Derek T. Logan, and Martin Högbom (2019) Redox-induced structural changes in the di-iron and di-manganese forms of Bacillus anthracis ribonucleotide reductase subunit NrdF suggest a mechanism for gating of rad- ical access. Journal of Biological Inorganic Chemistry 24:849–861. https://doi.org/10.1007/s00775-019-01703-z

List of abbreviations

Å Ångström, 1 Å = 0.1 nanometers AG Arabinogalactan C-terminus Carboxy terminus CDP Cytidine diphosphate CDP-AP CDP-alcohol phosphotransferase CDP-DAG CDP diacylglycerol CL Cardiolipin FMN frGFP Folding reporter green fluorescent protein FSEC Fluorescence size exclusion chromatography GFP Green fluorescent protein IW Irving-Williams LAM Lipoarabinomannan LCP Lipidic cubic phase, in meso LM Lipomannan N-terminus Amino terminus PDB Protein data bank PE Phosphatidylethanolamine PI Phosphatidylinositol PIM Phosphatidylinositol mannosides PIP Phosphatidylinositol phosphate PS Phosphatidylserine R1 Catalytic subunit of ribonucleotide reductase R2 Radical-generating subunit of ribonucleotide reductase R2a Class Ia R2 protein R2b Class Ib R2 protein R2c Class Ic R2 protein R2d Class Id R2 protein R2e Class Ie R2 protein R2lox R2-like -binding oxidase RNR Ribonucleotide reductase TB Tuberculosis XAD X-ray anomalous dispersion

1 abbreviations

Alanine Ala, A Leucine Leu, L Arginine Arg, R Lysine Lys, K Asparagine Asn, N Methionine Met, M Asp, D Phenylalanine Phe, F Cys, C Proline Pro, P Glu, E Serine Ser, S Glutamine Gln, Q Threonine Thr, T Gly, G Tryptophan Trp, W His, H Tyr, Y Isoleucine Ile, I Valine Val, V

2 Abstract

About a third of all proteins need to associate with a particular metal ion or metallo-inorganic cofactor to function. This interplay expands the catalytic repertoire of enzymes and reflects the adaption of these catalytic macromole- cules to the environments they have evolved in. A large portion of this work focuses on the membrane metalloprotein PgsA1 from the pathogen Mycobac- terium tuberculosis and a radical-harboring protein R2 from the pathogen Ba- cillus anthracis, offering a glimpse into the metalloprotein universe and the catalysis they perform. This thesis is divided into two parts; the first part describes a method for high-throughput M. tuberculosis membrane protein expression screening in Escherichia coli and Mycobacterium smegmatis. This method employs target membrane protein fusions with the folding reporter Green Fluorescent Pro- tein, allowing for fast selection of well-expressing membrane protein targets for further structural and functional characterization. This technique allowed overexpression of M. tuberculosis phosphatidylinositol phosphate synthase PgsA1, leading to its crystallization and the characterization of its high-reso- lution three-dimensional structure. PgsA1 is a MgII- dependent enzyme, cata- lyzing a vital step in the biosynthesis of phosphatidylinositol – one of the ma- jor phospholipids comprising the complex mycobacterial cell envelope. Therefore, PgsA1 presents an attractive target for the development of new an- tibiotics against tuberculosis. The second part of this thesis concerns the structural characterization of the B. anthracis class Ib ribonucleotide reductase radical-generating subunit R2 (R2b). R2b contains a dinuclear metallocofactor, which is able to be activated by dioxygen and generates a stable tyrosyl radical; the radical is further used for initiation of nucleotide reduction in the catalytic subunit of ribonucleotide reductase. R2b proteins utilize a di-manganese cofactor in vivo, but can also generate the radical using a di-iron cofactor in vitro, albeit less efficiently. How does R2b achieve correct metallation for efficient catalysis? We show that the B. anthracis R2b protein scaffold is able to select manganese over iron, and furthermore, describe the structural features that govern this metal- specificity. In addition, we describe redox-dependent structural changes in di- iron B. anthracis R2b after reaction with O2, and propose their role in gating solvent access to the metallocofactor and the radical site.

3

4 Table of contents

1. Preface ...... 7 1.1 Enzymes ...... 7 1.2 What can the structure of an enzyme tell us? ...... 7 2. Structural and functional studies of Mycobacterium tuberculosis membrane proteins ...... 8 2.1 Mycobacterium tuberculosis ...... 8 2.2 Membrane proteins: a brief overview ...... 9 2.2.1 Plasma membrane ...... 9 2.2.2 Membrane proteins at a glance ...... 10 2.3 Mycobacterial membrane proteins as drug targets ...... 11 2.4 A few instruments from the toolbox ...... 12 2.4.1 Recombinant membrane protein expression in Escherichia coli 12 2.4.2 Recombinant membrane protein expression in Mycobacterium smegmatis ...... 13 2.4.3 The folding reporter GFP as an expression marker ...... 13 2.4.4 Membrane protein extraction using detergents ...... 14 2.4.5 Membrane protein crystallization in the lipidic cubic phase, LCP ...... 15 2.5 The structure of the M. tuberculosis cell envelope ...... 16 2.5.1 The capsule ...... 18 2.5.2 The cell wall ...... 19 2.5.3 The inner membrane ...... 20 2.6 M. tuberculosis PgsA1: a critical enzyme in PI biosynthesis ...... 20 2.6.1 Main PI precursors: CDP-diacylglycerol and myo-inositol ...... 20 2.6.2 Biosynthesis of PI in M. tuberculosis ...... 21 2.6.3 The CDP-alcohol phosphotransferase protein family ...... 22 2.6.4 CDP-alcohol phosphotransferases: X-ray crystal structures elucidate the catalytic mechanism ...... 24 3. Radical-generating subunit R2 from ribonucleotide reductase: structure- governed metal specificity ...... 26 3.1 Ribonucleotide reductase, RNR ...... 26 3.2 The radical-generating subunits of class I RNR: metal requirement and radical types ...... 27 3.2.1 Class Ia R2: Fe/Fe enzymes and a protein radical ...... 28

5 3.2.2 Class Ib R2: Mn/Mn enzymes and a protein radical ...... 28 3.2.3 Class Ic R2: Mn/Fe enzymes and an inorganic radical ...... 29 3.2.4 Class Id R2: Mn/Mn enzymes and an inorganic radical ...... 31 3.2.5 Class Ie R2: metal-free enzymes and a protein radical ...... 31 3.2.6 R2-like ligand-binding oxidase: Mn/Fe enzymes and a Y-V crosslink ...... 32 3.3 Metal specificity in heterodinuclear Mn/Fe R2 and R2lox proteins .. 32 3.4 Bacillus anthracis R2b as a model system ...... 33 4. Summary of main findings ...... 35 4.1 Paper I ...... 35 4.2 Paper II ...... 36 4.3 Paper III ...... 38 4.4 Paper IV ...... 41 5. Brief description of the crystallographic methods used in this thesis ...... 45 5.1 X-ray crystallography ...... 45 5.2 X-ray anomalous dispersion, XAD ...... 46 5.2.1 Radiation damage and photo-reduction in metalloprotein X-ray crystallography ...... 47 6. Popular summary ...... 48 7. Populärvetenskaplig sammanfattning ...... 51 8. Acknowledgements ...... 54 Bibliography ...... 57

6 1. Preface

1.1 Enzymes Enzymes are highly specialized proteins, which are fundamental to sustain- ing all forms of life on Earth. These macromolecules are biological catalysts that greatly accelerate the rate of chemical reactions, which otherwise would not occur on a useful timescale to sustain . Enzymes are able to perform their task by arranging specific amino acid side chains and solvent into a particular three-dimensional structure. A specific location in the enzyme where a dedicated chemical reaction takes place is referred to as the . Some enzymes require additional chemical components for their activity, termed cofactors and coenzymes. These include inorganic ions such as MgII, FeII or MnII, or complex (metallo-)organic molecules, such as flavin mononu- cleotide (FMN) and groups. Enzymes, that require metal ions for their catalytic function are termed metalloenzymes. All enzymes are grouped into six major classes: Oxidoreductases, Trans- ferases, Hydrolases, Lyases, Isomerases and Ligases. Representatives from the first two enzyme classes will be discussed in this thesis.

1.2 What can the structure of an enzyme tell us? The three-dimensional structure of an enzyme defines its properties. Sig- nificant effort is often needed to elucidate the details of this spatial atomic arrangement in a macromolecule of interest. This structural information pro- vides clues about the reaction and regulation mechanisms of the enzyme, and can reveal how protein complexes are assembled and cooperate to perform their dedicated function. In addition to the fundamental knowledge generated from such studies, this information can be used for structure-guided protein engineering, for establishing effective industrial biological catalysts 1,2 or used in the development of new pharmaceuticals to advance our quality of life 3,4. Currently, there are several major techniques available for the determina- tion of detailed three-dimensional structures of proteins: X-ray crystallog- raphy, Nuclear magnetic resonance (NMR) and Cryo- microscopy (cryo-EM). X-ray crystallography was the main method employed for struc- tural characterization of the enzymes presented in this thesis and is briefly described in Chapter 5.

7 2. Structural and functional studies of Mycobacterium tuberculosis membrane proteins

2.1 Mycobacterium tuberculosis Mycobacterium tuberculosis was first isolated and described by Robert Koch in 1882. The organism is a bacterial pathogen that causes Tuberculosis (TB) in humans; a highly infectious airborne disease mostly affecting the , which is transmitted by people with active respiratory TB 5. TB is recognized as the leading cause of death from a single infectious agent; in 2018 alone about 1.4 million people died of TB and a further 10 million people fell ill with the disease 5. Moreover, up to one quarter of world’s population is estimated to be infected with the latent form of the dis- ease 6,7. Since the discovery of the first- and second-line antibiotics rifampicin, isoniazid, and ethambutol in mid-20th century, the recommended TB treatment regimen has remained unchanged for nearly 60 years. Treatment is long, com- plex, and may result in drug intolerance and 8. Together with the adap- tive of the pathogen, these factors have contributed to the emergence of multidrug-resistant (MDR) and extensively drug-resistant (XDR) M. tuber- culosis strains 5,7. Inhaled during tuberculosis infection are phagocytosed by macro- phages in the lungs, which migrate into tissue and form granulomas – compact aggregates of host immune cells, which normally create an environment, where the infection can be controlled 9. However, granulomas also provide a suitable environment for long-term M. tuberculosis survival, essentially turn- ing the infected cells into a “Trojan horse” within the host immune system. This co-evolution and cross-talk with its host has enabled M. tuberculosis to be capable of surviving while waiting dormant 10; as the immune system weak- ens, the dormant bacteria may reactivate and cause active TB. In addition to macrophages, the pathogen can also infect other immune cells: monocytes, neutrophils and pneumocytes-II 11. The complete genome of M. tuberculosis H37Rv strain was published in 1998 12 and re-annotated in 2002 13; it consists of 4.4 mega-base pairs and about 4000 genes 12, of which about 600 are essential 14,15. Surprisingly, out of all M. tuberculosis genes, about 250 are involved in ; for comparison, in Escherichia coli there are only 50 such genes 12. Also, many

8 lipid species, found in M. tuberculosis, are specific to Mycobacterium genus (reviewed in ref. 16).

2.2 Membrane proteins: a brief overview

2.2.1 Plasma membrane

The emergence of biological membranes played a key role in the develop- ment of cellular life, by segregating chemical processes and providing a se- lective barrier for the flow of various solutes across them 17. Biological mem- branes are composed of amphipathic molecules – lipids which are able to self- assemble into continuous bilayers. The plasma membrane is one type of bio- logical membrane; it is a 3 – 5 nm thick protein-lipid bilayer enclosing the cell cytoplasm (or vacuole in plants). The fluid mosaic model, proposed in 1972, explains how proteins and lipids assemble in biological membranes 18. The core of the bilayer is formed by the hydrophobic hydrocarbon chains of the lipids. The polar headgroups of lipids, on the other hand, are solvent-exposed and form the two interfaces of the bilayer – extracellular and intracellular. The core and the interfaces are chemically distinct; the core is very hydrophobic while the interface is hydrophilic and highly solvated, forming polar leaflets approximately 1.5 nm thick with the potential to form electrostatic and weak van der Waals-type interactions. Glycerophospholipids, such as phosphatidylinositol [PI], phosphatidyleth- anolamine [PE], cardiolipin [CL] and phosphatidylserine [PS]), constitute the majority of lipids in the plasma membrane. The distribution of these glycer- ophospholipids between the two bilayer interfaces is not symmetric, which has important biological consequences. For instance, the presence of PS on the outer leaflet of the plasma membrane marks a cell for destruction. The lipid composition also influences the charge distribution, membrane curva- ture, and fluidity, which in turn affects membrane protein function and mem- brane fusion-related processes 19. The length of the lipid acyl chains dictates membrane thickness and permeability, while density and fluidity are affected by the arrangement and structure of the acyl tails. Saturated acyl chains ar- range more tightly and render the membrane less fluid, whereas unsaturated lipids introduce more disorder and render the membrane more fluid and less dense. At least one of the two chains in most phospholipids is unsaturated, meaning that the plasma membrane is always fluid at a physiological temper- ature 20.

9 2.2.2 Membrane proteins at a glance Approximately half of all plasma membrane components are proteins, which differ in the ways they associate with the membrane (Figure 1). There are four types of membrane proteins defined as: integral, lipid-anchored, pe- ripheral or amphitropic. (i) Integral or ‘membrane-spanning’ proteins can be monotopic or multi-pass; these proteins associate with the membrane so tightly that they can only be extracted using detergents. (ii) Lipid-anchored proteins are located on membrane surfaces and are covalently attached to li- pids in the membrane. (iii) Peripheral membrane proteins associate with the membrane through electrostatic interactions with lipid molecule headgroups and can be separated by changes to the pH or salt concentration. (iv) Am- phitropic proteins, are found both in cytosol and in association with the mem- brane. Their affinity to the membrane is regulated by conformational changes associated with ligand binding. The focus of Papers I and II, presented in this thesis, is on integral a-helical membrane proteins. Multi-pass non-cytosolic proteins are often described in terms of their to- pology; the way in which transmembrane domains and the amino- and car- boxy-termini (N- and C- termini, respectively) span the membrane, and how they are oriented in relation to each other as well as the extracellular/intracel- lular sides of the membrane. The topology of helical transmembrane proteins can be predicted based on the distribution of different amino acid types; smaller hydrophobic residues (glycine, alanine, valine, leucine, isoleucine) are enriched in transmembrane domains. Tryptophan, tyrosine and histidine, on the other hand, are prevalent at the membrane interface 21,22. Statistical studies have established that most membrane proteins follow the so called positive- inside rule and orient themselves in a membrane so that positively charged residues are predominantly exposed to the intracellular side of the membrane leaflet 22–24.

Figure 1. Types of membrane proteins, classified by their mode of interaction with the lipid membrane. Membrane proteins should never be viewed as separate entities, as the lipid bilayer environment in which they reside must also be considered. A single

10 layer of lipid molecules enclosing the membrane protein is called an annulus or annular layer of lipids. There are also lipid molecules that bind between the transmembrane helices, which are referred to as non-annular. Non-annular li- pids are critical for modulating protein function and in maintaining the integ- rity of large protein complexes 25–29.

2.3 Mycobacterial membrane proteins as drug targets As membrane proteins are involved in several vital cellular processes 30, many are targeted by a large proportion of drugs currently on the market 31,32. Understanding the structure-function relationships of protein targets is of par- amount importance for drug discovery, in order to avoid unwanted side effects and to develop better drugs 33. Unfortunately, the number of available three-dimensional structures of these proteins does not fully reflect their importance – only 4% of all deposited structures in Protein Data Bank (PDB) are membrane proteins, of which only 988* are unique. The reason for this lies in the difficulties associated with membrane protein production, isolation, crystallization, and structural analy- sis. Membrane protein research projects often take many years to complete and are therefore very expensive to pursue. The mycobacterial is of high scientific interest from a drug dis- covery, diagnosis and treatment perspective. However, working with myco- bacterial proteins is significantly more difficult compared to E. coli proteins (discussed further below). For example, approximately 1000 unique M. tuber- culosis protein structures have been determined to date using various tech- niques, whereas over 4300 unique E. coli protein structures are available. To facilitate structural characterization of the M. tuberculosis proteome, TB Structural Genomics Consortium (TBSGC) was launched almost 20 years ago 34. However, of the 335† protein structures determined by TBSGC none are M. tuberculosis membrane proteins. We have sought to improve the process of M. tuberculosis membrane pro- tein target selection for structural and functional studies, by using a high- throughput protein expression screening approach. This approach utilizes a folding reporter green fluorescent protein (GFP) as a membrane protein fusion partner (discussed further below). The GFP fluorescence can be measured in live growing cells; and the fluorescence levels can be correlated to the amount of overexpressed membrane protein of interest. Various culturing conditions and expression hosts can be utilized, allowing for the rapid selection of well- expressing protein targets. We have applied this strategy to 42 multi-pass a-

* https://blanco.biomol.uci.edu/mpstruc/, accessed on December 29, 2019 † https://www.rcsb.org/stats/distribution_structural-genomics-centers, accessed on December 29, 2019

11 helical transmembrane proteins, all of which are essential for M. tuberculosis H37Rv survival and virulence. In Paper I we show that this approach is suit- able for M. tuberculosis membrane protein expression in E. coli and Myco- bacterium smegmatis systems. Prior to this project being initiated six years ago, only a single full-length M. tuberculosis membrane protein structure had been published; the 3.5 Å resolution mechanosensitive channel MscL 35,36. Following our study, we were able to over-express, purify, crystallize and solve the structure of M. tuberculosis phosphatidylinositol phosphate synthase PgsA1 to 1.8 Å resolution (Paper II). Notably, this was the first high-resolu- tion structure of a full-length M. tuberculosis membrane protein in PDB.

2.4 A few instruments from the toolbox

2.4.1 Recombinant membrane protein expression in Escherichia coli As reflected in PDB statistics (Section 2.3), transmembrane proteins are substantially more challenging for structural analysis compared to cytosolic proteins. This is due to the innate flexibility of transmembrane proteins, their hydrophobicity, and dependence on native lipids in their membrane home en- vironment. As most structural projects require large amounts of pure and mon- odisperse protein material, it is usually necessary to express the target as re- combinant protein in an expression host. The Gram-negative bacterium E. coli is the most widely used host for pro- karyotic (and to lesser extent eukaryotic) protein production. The E. coli ge- nome can be easily manipulated for target protein expression. In addition, the organism is fast-growing (growth doubling time of approximately 20 minutes), inexpensive, and nonpathogenic 37. However, while E. coli remains a common protein production system, produced membrane proteins often ag- gregate and from inclusion bodies. While b-barrel type proteins can be re- folded from inclusion bodies, membrane proteins with a predominantly a-hel- ical structure are significantly more challenging to refold 37,38. Mycobacterial proteins present another layer of difficulty in their overex- pression due the differences in guanine-cytosine (GC) content between myco- bacterial (65.6% GC) and E. coli (50.8% GC) genomes 12,39. It is therefore likely that E. coli may simply lack some of the necessary machinery for my- cobacterial protein production. This issue can often be alleviated by supple- menting E. coli with rare codons, expressing mycobacterial membrane pro- teins in alternative expression systems, generating fusion constructs that aid protein folding and stability, and exploring different bacterial growth condi- tions (Paper I) 40–42.

12 2.4.2 Recombinant membrane protein expression in Mycobacterium smegmatis M. smegmatis is a saprophytic bacterium, which is used as a model organ- ism for pathogenic and slow-growing mycobacterial species. Also, genetically it is highly similar to M. tuberculosis 43,44. Two M. smegmatis laboratory strains are widely employed for M. tuberculosis protein expression: mc2155 and mc24517. The mc2155 strain has a high propensity for transformation, and is therefore the strain of choice for most laboratories working with mycobac- teria 45,46. The mc24517, strain, on the other hand, has been engineered to be compatible with the T7 promoter-based expression systems that were origi- nally designed for E. coli 47,48. M. smegmatis is a fast-growing Mycobacterium species; depending on the growth media M. smegmatis has a dividing time of about 3 hours, compared to 24 hours needed for M. tuberculosis 44. This ex- pression system is versatile and is compatible with other promoter types which provide more stringent gene expression control 49. M. smegmatis mc24517 with the T7 promoter-based expression system was used for M. tuberculosis membrane protein expression in the study reported in Paper I. Since M. smegmatis and M. tuberculosis are related species, this expression system provides several advantages for M. tuberculosis membrane protein ex- pression. In particular, M. smegmatis has a similar plasma membrane structure and composition, which could provide the specific lipids necessary for mem- brane protein folding, stability and activity. Furthermore, the organism also provides specific mycobacterial chaperones, which may assist with correct protein folding 50,51. M. smegmatis also possesses certain metabolites/, which are not present in E. coli. The protein Rv0132c, for example, requires the F420 coenzyme (a 5-deazaflavin derivative) for activity, and so needs to be co-purified from M. smegmatis, as this coenzyme is absent in the E. coli metabolome 52.

2.4.3 The folding reporter GFP as an expression marker Identification of individual membrane protein candidates, suitable for fur- ther structural and functional characterization, is a very time- and labor-inten- sive task. Fortunately, the introduction of the folding reporter green fluores- cent protein (frGFP) greatly simplifies this endeavor. The GFP fluorophore can be easily detected in live cells expressing frGFP fusion proteins, by ex- posing them to long-wavelength UV light. Compared to wild-type GFP, frGFP is sensitive to misfolding, has stronger fluorescence, and expresses well in E.coli as an individual protein or as a fu- sion partner 53. When frGFP is fused to the C-terminus of a target membrane protein, it fluoresces only in cell cytoplasm, and if its partner membrane pro- tein is folded and inserted into the membrane; if the membrane protein is ag- gregated or located in inclusion bodies, then its fusion partner frGFP does not

13 fluoresce 54–56. It was estimated that C-termini of approximately 80% of all E.coli membrane proteins are located in the cell cytoplasm 57. frGFP has been used for soluble protein folding studies 53, global topology mapping of almost the entire E. coli inner membrane proteome 55,57, as a tool for assessing E. coli inner membrane protein expression 56, and to verify sta- bility during purification prior to structural and functional studies 58,59. While E. coli remains the most common recombinant protein expression host, similar GFP-based membrane protein expression studies have also been reported for the bacteria Lactococcus lactis and M. smegmatis 60–62, and for yeast Saccha- romyces cerevisiae 63.

2.4.4 Membrane protein extraction using detergents Transmembrane proteins are tightly associated with the lipid bilayer and typically need to be extracted (solubilized) using detergents in order to pursue their structural and functional studies. Detergents are amphipathic molecules that are comprised of a polar headgroup and a hydrophobic hydrocarbon tail, and are therefore similar to membrane lipids in terms of these properties. There are many types of detergents available; they differ in their headgroup type and hydrophobic tail length and structure. Detergents are commonly cat- egorized as harsh and mild; harsh detergents have shorter hydrophobic tails and more charged headgroups compared to mild detergents, and thus have a higher propensity to denature membrane proteins. When a critical micelle con- centration (CMC) of detergent in an aqueous solution is reached, micelles spontaneously form; where the hydrophobic tails of the detergent molecules are secluded, while their polar headgroups are exposed to the solvent. When the detergent concentration exceeds CMC, it is capable of extracting proteins from the membrane by covering their hydrophobic and normally membrane- embedded surfaces. Detergent-protein complexes can then be treated similarly to -soluble proteins, simply by keeping the concentration of detergent in solution above its CMC 27,64. The selection of an appropriate detergent is an empirical process for every membrane protein of interest. The detergent of choice should stabilize the pro- tein, keep it monodisperse, and catalytically active (if applicable). In addition, the concentration of free detergent in solution should be minimized for down- stream applications; this is particularly important for activity assays (espe- cially involving water-soluble proteins) and for crystallization 27,65,66. A hand- ful of high-throughput techniques are available for rapid detergent screening of different types of membrane proteins. One of the most widely used ap- proaches is based on C-terminal membrane protein fusion with GFP, which enables the monodispersity of the protein-detergent complex to be monitored using fluorescence-detection size exclusion chromatography (FSEC) 58,59. Al- ternatively, protein-detergent complex stability can be monitored using mod- ified differential scanning fluorimetry (nanoDSF) 67.

14

2.4.5 Membrane protein crystallization in the lipidic cubic phase, LCP Membrane protein crystallization is a highly challenging and iterative pro- cess, with many parameters that require optimization. Crystallization as a pro- tein-detergent complex (in surfo) using vapor diffusion or batch crystallization remains a method of choice for many crystallographers 65,68. The detergent collar, however, often hampers crystal contact formation and allows unwanted protein flexibility; this results in weakly diffracting protein crystals or no crys- tal growth at all. In addition, if free detergent molecules are not carefully re- moved, this may result in phase separation in the crystallization mixture, again hindering formation of diffraction quality crystals 66. Crystallization of membrane proteins in the lipidic cubic phase (LCP, in meso or in cubo) has attracted a lot of attention as it has been successfully employed to crystallize G-protein coupled receptors – an important class of enzymes with high medical relevance 69–72. Fortunately, the concept of mem- brane protein crystallization in LCP is not limited to G-protein coupled recep- tors. Many other proteins have been crystallized using this technique 73, in- cluding M. tuberculosis phosphatidylinositol phosphate synthase PgsA1 (Pa- per II). Lipidic cubic phase is a bicontinuous curved bilayer, that forms non-con- tacting, but interpenetrating channels, which allow the diffusion of aqueous crystallization mixture. LCP has a gel-like consistency and is spontaneously formed by mixing a lipid (typically monoolein), with an aqueous solution (containing pure, detergent-solubilized membrane protein) at certain propor- tions and temperatures 74,75. Transmembrane proteins are able to reconstitute in the monoolein bilayer and diffuse along the interpenetrating channels. Once the mesophase bolus is exposed to a precipitant solution (typically in glass sandwich plates), the precipitant is able to diffuse inward and through the in- terconnecting aqueous channels of the mesophase. A precipitant concentration gradient is then formed along the diffusion path, which favors crystal nuclea- tion and growth (Figure 2) 73.

15

Figure 2. Cartoon representation of the events proposed to take place during the crys- tallization of an integral membrane protein in LCP. First, the protein (green) is recon- stituted into the bicontinuous bilayers of the LCP (left bottom corner). The addition of precipitant then shifts the equilibrium away from protein stability, leading to phase separation, with protein molecules diffusing along the continuous bilayer (cross sec- tion shown in the middle of the figure). Protein molecules then lock into the lattice of the advancing crystal face, adding to crystal growth (upper right corner). Aqueous channels (blue-purple) allow the diffusion of water-soluble components from the crys- tallization mixture. The lipid bilayer (yellow-orange) is approximately 4 nm thick and drawn to scale with the outer membrane B12 transporter BtuB (PDB: 1NQE, ref. 76) shown in green. Reproduced from ref. 75 by permission of The Royal Society of Chemistry. LCP results in tightly packed type I crystals, and as detergent is stripped off the membrane protein upon reconstitution into the lipid mesophase, these crystals typically diffract better than type II crystals grown in presence of de- tergent collars 73,77. Since the detergent does not obstruct crystallogenesis in LCP any detergent in principal can be used. However, it is important to keep the concentration of the detergent as low as possible, as it may disrupt the mesophase. Other advantages of the LCP crystallization method is its toler- ance to impurities in the protein mixture and the possibility to supply specific lipids, necessary for protein stability and function 69,78. The lipid substrate CDP-diacylglycerol (CDP-DAG) was added to monoolein prior to M. tuber- culosis PgsA1 crystallization in LCP (Paper II).

2.5 The structure of the M. tuberculosis cell envelope The cell envelope is an important barrier; it protects mycobacterial cells from mechanical and chemical stresses of the outer environment, presents var- ious virulence factors, and facilitates adhesion to host cells during infection.

16 The mycobacterial cell envelope is very hydrophobic due to the abundance of various lipids, which contributes significantly to the low permeability of the envelope to broad-spectrum antibiotics 79. Despite its clear importance, certain aspects of the structure and composi- tion of the mycobacterial cell envelope remained unknown. However, a direct observation of the arrangement of these different layers in mycobacteria was recently visualized using cryo-electron microscopy 80. As reported in ref. 80,81 and reviewed in ref. 82,83, the M. tuberculosis envelope is about 70 nanometers thick and consists of three main components: (i) the outermost layer, also called the capsule in pathogenic species 84, (ii) the cell wall, which includes a peptidoglycan-arabinogalactan mycomembrane layer and the periplasmic space, and (iii) the inner or plasma membrane (Figure 3). While M. tubercu- losis is structurally closer to Gram-positive bacteria, it also shares similarities with Gram-negative bacteria, in that it does not have a true outer membrane and does not retain Gram stain 85. Much of what we know about the general composition and structure of the mycobacterial cell envelope is mostly based on collective studies of M. tuber- culosis and closely-related pathogenic and non-pathogenic representatives of the Corynebacterineae suborder. The composition of each cell envelope com- partment has been shown to be dynamic; it changes depending on bacterial growth phase, nutrient availability, and environmental stresses (reviewed in ref. 86). The structure of the M. tuberculosis cell envelope and the chemical composition of each of its main compartments are described briefly below.

17

Figure 3. Schematic structure of the M. tuberculosis cell envelope (not to scale). Abbreviations: GLP, glycolipoprotein; TMM and TDM, trehalose mono- and dimy- colates; AG, arabinogalactan; PG, peptidoglycan; PI, phosphatidylinositol; PIM, phosphatidylinositol mannoside; LAM, lipoarabinomannan; PL, phospholipid (gen- eral); IM, inner membrane. Figure based on refs. 81,87,88 and also inspired by ref. 89.

2.5.1 The capsule The capsule is approximately 35 nm thick and is rich in protein and neutral polysaccharides (a-D-glucan, D-arabino-D-mannan and D-mannan), while

18 being relatively poor in lipids, which comprise only 6% of the M. tuberculosis capsule 87,90. A research study by Ortalo-Magné et al. demonstrated that phos- phatidylinositol mannosides (PIMs), phosphatidylethanolamine (PE), diacyl trehaloses and phthiocerol dimycocerosates are the lipid species that are most exposed to the extracellular space 91. Interestingly, the capsule appears to be weakly bound to the cell wall and can be shed, as observed when M. tubercu- losis is grown in detergent-containing medium, or when it is subjected to agi- tation with glass beads 90.

2.5.2 The cell wall The mycobacterial cell wall is a core structure of the cell envelope; it is comprised of three sub-layers: (i) the asymmetric mycomembrane bilayer (8 nm), which is linked to (ii) highly branched arabinogalactan (AG) polysac- charides, which is further covalently attached to a crosslinked layer of pepti- doglycan and (iii) the periplasmic space (20 nm). The core structure, exclud- ing the periplasm, is also referred to as a mAGP complex. The mycomembrane is asymmetrical, meaning that the outer layer is struc- turally distinct from the inner layer of the membrane. The outer leaflet is pri- marily composed of free or trehalose-attached mycolic acids, which form tre- halose monomycolates and dimycolates. Mycolic acids and their derivatives are essential to mycobacteria; the inhibition of their synthesis is one of the primary effects of the first-line antibiotic isoniazid 92. The presence of other lipid types in the outer leaflet of the mycomembrane is actively debated. How- ever, there is experimental evidence indicating that phospholipids also con- tribute to the outer leaflet of the mycomembrane 81 – although this feature may be specific to Mycobacteriaceae 93. The phospholipid moieties of the my- comembrane are decorated with mannose-derivatives, forming PIMs and lipoarabinomannan (LAM), which are important antigen factors recognized by the host immune system 94. The inner leaflet is likely formed by a parallel arrangement of mycolic acids 81. A Study by Chiaradia et al. provided insights into the protein composition of mycomembrane. Specifically, Mass spectrom- etry (MS) analysis resulted in the identification of known porins and protein complexes involved in the modification of mycolate residues. In addition, a number of new proteins, such a putative MCE (Mammalian Cell Entry) family proteins, were also discovered 81. The mycolic acids of the inner leaflet of the mycomembrane are attached through phosphodiester linkages to the AG layer, which in turn is covalently linked with peptidoglycan, forming a honeycomb-like structural arrangement. The peptidoglycan layer provides rigidity to the cell, defining its shape and helping it to withstand osmotic stress. The space between the mAGP complex and the plasma membrane is called the periplasm. While the periplasm is a feature of true Gram-negative bacteria,

19 it is sometimes referred to as periplasm-like space or pseudoperiplasm in My- cobacteria.

2.5.3 The inner membrane The inner membrane of the cell envelope has a conventional bilayer struc- ture; it is approximately 5 – 7 nm thick and is comprised of conventional glyc- erophospholipids such as PI, PE, and CL, which constitute 0.37%, 0.52%, and 1.15% of the dry cell mass, respectively 88,95. The inner membrane also con- tains substantial amounts of PIMs and its heavily glycosylated derivatives, li- pomannans (LMs) and lipoarabinomannans (LAMs). These complex glycoli- pids are also acylated; diacetylated PI mannoside (Ac2PIM) is a major inner membrane component, constituting up to 42% of the dry cell mass 81,88,95. Ad- ditionally, small amounts of trehalose mono- and dimycolates together with currently unidentified apolar lipids (which constitute about 1.1% of the dry cell mass) are found in the inner mycobacterial membrane 81,88. Interestingly, the prevalence of certain lipids can vary between different pathogenic and non-pathogenic Mycobacterium species; also it is largely influenced by the growth stage and presence of available nutrients 81,86,95. As is the case in other bacteria, the plasma membrane in M. tuberculosis is the major permeability barrier; it is rich in biosynthetic protein complexes, many of which are in- volved in nutrient transport, energy generation, drug efflux, and the biosyn- thesis of various cell envelope components 81.

2.6 M. tuberculosis PgsA1: a critical enzyme in PI biosynthesis

PIM, LM and LAM are the most abundant glycoconjugates in the myco- bacterial cell envelope, which play a significant structural role and are also important for virulence and drug-resistance 96–98. They are synthesized by the sequential addition of mannose and arabinose residues to PI, one of the major phospholipids in the mycobacterial plasma membrane 88,99. While the structure and biogenesis of PIM, LM and LAM are reviewed in ref. 97,100–102, the follow- ing section will focus on the biosynthesis of PI (Figure 3).

2.6.1 Main PI precursors: CDP-diacylglycerol and myo-inositol PI is comprised of a phosphatidylglycerol linked to an inositol headgroup, and is synthesized from two precursor molecules: cytidine diphosphate diacyl glycerol (CDP-DAG) and myo-inositol. While abundant in eukaryotes (in- cluding pathogenic fungi and protozoa) and , PI is seldom present in

20 bacteria; it is only found in a few Actinobacteria, including M. tuberculosis, and a few hyperthermophilic Eubacteria 81,88,99,103,104. Phospholipid biosynthesis begins with the acylation of sn-glycerol-3-phos- phate to form phosphatidic acid 105, which is further activated by cytidine tri- phosphate (CTP) to form CDP-DAG. In M. tuberculosis the latter reaction is catalyzed by the transmembrane enzyme CdsA (Rv2881c). The ortholog of this enzyme has previously been biochemically characterized in M. smegmatis 106. Inositol is a six-carbon cyclic sugar; existing in nine stereoisomers derived from the epimerization of its six hydroxyl groups, which are often represented as a mnemonic “Agranoff turtle” cartoon (designed to avoid confusions in the nomenclature) 107,108. Amongst these epimers, myo-inositol is the most abun- dant. For simplicity, myo-inositol will be referred to as inositol in this thesis. In eukaryotes, phosphorylated membrane-bound and soluble derivatives of in- ositol perform various functions; they are used as common mol- ecules or anchor plasma membrane proteins (GPI-anchors) among other func- tions 103,109,110. In Actinobacteria inositol is a component of mycothiol; an abundant antioxidant, which protects cells from oxidative damage and is used as a carbon source reservoir for energy production during the stationary growth phase 111. In M. tuberculosis a single ino1 gene (Rv0046c) controls the de novo synthesis of inositol from D-glucose-6-phosphate. M. tuberculosis Dino1 deletion mutants are only able to grow when their growth media is sup- plemented with high concentrations of inositol 112, indicating the existence of inositol import machinery. A study by Newton et al. confirmed the presence of an inositol transporter in M. smegmatis and have suggested SugI (Rv3331) could function as its ortholog in M. tuberculosis 113.

2.6.2 Biosynthesis of PI in M. tuberculosis PI biosynthesis in M. tuberculosis proceeds through the formation of a PI phosphate (PIP) intermediate, which is catalyzed by an integral membrane protein called phosphatidylinositol phosphate synthase PgsA1 (Rv2612c), also known as PIP synthase. PIP is further dephosphorylated by a putative to yield PI (Figure 4). The proposed PI biosynthesis pathway dif- fers among eukaryotes and M. tuberculosis (including other Actinomyces, and Archaea); in eukaryotes it is a single-step reaction, where PI is synthesized directly from CDP-DAG and inositol 114. Also, deletion of the pgsA1 gene has been shown to be lethal in a M. smegmatis conditional mutant 99 and as sug- gested by transposon mutagenesis data in M. tuberculosis 14,15. Taken together, these findings open an avenue for drug discovery projects targeting M. tuber- culosis PgsA1 115. In Paper II we reported several high-resolution crystal structures of M. tuberculosis PgsA1, which we hope will facilitate structure- based drug design studies of this promising membrane protein target.

21

Figure 4. Illustration of the last two steps in M. tuberculosis PI biosynthesis, which proceeds through the formation of a PIP intermediate. Modified from ref. 114.

2.6.3 The CDP-alcohol phosphotransferase protein family M. tuberculosis PgsA1 belongs to a family of CDP-alcohol phosphotrans- ferases (CDP-AP, Pfam‡ entry: PF01066). These enzymes are transmembrane proteins, which catalyze the cleavage of the phosphoride anhydride bond in CDP-alcohol, followed by the displacement of CMP through a second alcohol (Figure 4). CDP-APs play key roles in phospholipid metabolism (Figure 5) and can utilize substrates with different properties; for example, mitochondrial CL synthase can use two amphipathic substrates (CDP-DAG and phosphati- dylglycerol) 116, whereas M. tuberculosis PgsA1 uses one polar and one am- phipathic substrate (inositol-phosphate and CDP-DAG, respectively) 114.

‡ http://pfam.xfam.org/family/CDP-OH_P_transf, accessed on January 4, 2020

22

Figure 5. A simplified schematic representation of the key metabolic steps in M. tu- berculosis H37Rv phospholipid biosynthesis. CDP-alcohol phosphotransferase genes are shown in bold. Abbreviations: PA, phosphatidic acid; CDP-DAG, cytidine diphos- phate diacylglycerol; DAG, diacylglycerol; CL, cardiolipin; PGP, phosphatidylglyc- erol phosphate; PG, phosphatidylglycerol; PIP, phosphatidylinositol phosphate; PI, phosphatidylinositol; PC, phosphatidylcholine; PS, phosphatidylserine; PE, phospha- tidylethanolamine. Figure based on the relevant KEGG pathway§ and on refs. 99,105,106,114. Despite the diversity of phospholipids synthesized by CDP-APs (Figure 5), all of these enzymes possess a common sequence signature motif and are de- pendent on divalent cation (such as MgII, MnII or CoII) for catalysis 117–120. M. tuberculosis PgsA1 has been shown to require MgII for catalytic activity but is also active in the presence of MnII, albeit to a much lesser extent 115. To date, four representative structures from the CDP-AP enzyme family have been published: (i) bifunctional IPCT/DIPPS (AfDIPP synthase) 121 and (ii) AF2299 PIP synthase 122 from the hyperthermophile Archaeoglobus fulgi- dus, (iii) RsPIP synthase from the fish pathogen Renibacterium salmoninarum 123 and (iv) M. tuberculosis PIP synthase PgsA1 (Paper II). All of these struc- tures (including the transmembrane DIPPS domain of AfDIPP synthase) are homodimers which share a conserved fold comprised of six transmembrane helices and an N-terminal amphipathic helix, which is located at the mem- brane-cytoplasm interface of each monomer (aA in Figure 6A). The large ac- tive site cavity accommodates a dinuclear metal site, which is accessible from both the cytosol and the membrane. Metal ions are primarily coordinated by aspartate residues from the signature motif and are involved in both substrate orientation and catalysis (Figure 6B – C). The active site cavity also contains a conserved positively charged pocket in close proximity to the metal site, which is proposed to be involved in accepting the water-soluble substrate (in- ositol phosphate in PgsA1 and RsPIP).

§ https://www.genome.jp/kegg-bin/show_pathway?mtu00564, accessed on January 4, 2020

23

Figure 6. Overall fold and signature motif of CDP-alcohol phosphotransferases. (A) Cartoon representation of the M. tuberculosis PgsA1 homodimer. One subunit is dis- played with rainbow coloring (blue-green-yellow-red) from N- to C- terminus. The positions of the N-terminal amphipathic helix and the transmembrane helix bundle in the cytoplasmic membrane are highlighted (B) Hydrogen bonds and metal coordina- tion network (solid lines) for the CDP-DAG substrate involving key residues from the signature motif (C) HMM logo of the signature motif**. Figure prepared in PyMOL124 using coordinates of the M. tuberculosis PgsA1 structure (PDB: 6H59 ref. 125) pre- sented in Paper II.

2.6.4 CDP-alcohol phosphotransferases: X-ray crystal structures elucidate the catalytic mechanism Significant progress has been made regarding the structural biology of CDP-alcohol phosphotransferases in the last five years, which has contributed significantly to our understanding of the structure-function relationships in this enzyme family (refs. 121–123 and Paper II). The signature motif resi- dues (Figure 6B – C) are primarily involved in forming the binding site for the CDP-linked substrate and divalent cations; mutations to these residues renders the enzyme completely inactive 121,126. The binding mode of the CDP-DAG is best understood, based on crystal structures obtained with this substrate (Pa- per II and ref. 122,123). In contrast, there are no available crystal structures with

** http://pfam.xfam.org/family/CDP-OH_P_transf, accessed on January 5, 2020

24 the second, water-soluble substrate of CDP-APs (inositol phosphate in PgsA1 and RsPIP). A study by Clarke et al. showed that PIP formation in RsPIP synthase is much more efficient when CDP-DAG binds to the metal site first, implying that the binding of this amphipathic substrate is required to prime the active site for catalysis 123. It would appear that the metal site is not always pre-as- sembled, as in most of the substrate-free structures the site is metal-free (Pa- per II) or the metal ion coordination is uncertain 121–123. It is tempting to spec- ulate that CDP-DAG aids in assembling the metal site, likely acting as a “cat- ion magnet” due to its highly charged pyrophosphate group. In the study, reported in Paper II, we have shown that fully assembled (dinuclear) metal sites in M. tuberculosis PgsA1 homodimer are structurally heterogenous; superposition of two subunits of the homodimer revealed dif- ferences in the coordination and position of the catalytic base D89 and the metal ion in site 2, respectively. We propose that this flexibility has a role in catalysis. Specifically, the MgII ion in site 2 helps in orienting D89 for depro- tonation of inositol phosphate for consequent nucleophilic attack upon the phosphoride anhydride bond in the CDP-DAG substrate. CDP-APs utilize a variety of substrates (e.g. inositol, inositol phosphate, serine or glycerol phosphate), which are precursors for the headgroups of a variety of phospholipids (Figure 5 and references therein). Only CDP-APs that utilize inositol phosphate have been structurally characterized to date (Paper II and ref. 121,123). In the case of AF2299 the substrate remains unknown, but likewise is proposed to be phosphorylated 122. To date, no research groups have succeeded in crystallizing the enzyme with inositol phosphate, however the binding site for this substrate has been characterized indirectly, by func- tional analysis of mutant proteins and molecular docking. These studies sug- gest that the phosphorylated substrate (inositol phosphate in PgsA1 and RsPIP) would bind to the positively charged pocked containing the conserved RxxR motif (R152xxR155, PgsA1 numbering), in close proximity to the metal site; this motif is also conserved in other CDP-AP enzymes that use inositol phosphate as a substrate 122. Amino acid substitutions, designed to alter the charge of the pocket or to introduce steric hindrance, render the protein com- pletely inactive, or less active depending on the mutation (Paper II and ref. 123). In addition, this positively charged pocket is occupied by a counter- charged molecule such as sulphate, tartrate 122 or citrate (Paper II), in all of the available structures, indicating the binding position for the phosphate group of the substrate. Taking all of these factors into account, we have proposed a refined cata- lytic mechanism for CDP-alcohol transferases (Paper II).

25 3. Radical-generating subunit R2 from ribonucleotide reductase: structure-governed metal specificity

3.1 Ribonucleotide reductase, RNR Both ribonucleic acid (RNA) and deoxyribonucleic acid (DNA) are long polymers consisting of four different nucleosides (, guanosine, thy- midine, cytidine in DNA and uridine in RNA), linked through a phosphate backbone. These molecules carry genetic information – the code of life. A dif- ference in atom count in the sugar moiety (ribo- in RNA vs deoxyribo- in DNA) has made a huge difference in selection of DNA to be the universal genetic material in all cellular organisms and many viruses. DNA is consider- ably more stable and much less prone to mutations, compared to RNA 127. Ribonucleotide reductase (RNR) is the key enzyme in the only known de novo pathway, performing ribonucleotide di- and triphosphate reduction to their corresponding form of deoxyribonucleotides; essentially connecting the RNA and DNA worlds 128–130 (Figure 7). RNRs are fascinating enzymes, as they utilize intricate radical chemistry for the reduction of their substrates. The radical is commonly generated with the help of a metallocofactor and then transferred to the active site for subsequent cysteine thiyl radical generation, which is then used for ribonucleotide reduction 131–133. The substrate reduction process is highly conserved and is summarized in Figure 7. These enzymes are also quite complex, in various ways reflecting the adaption of organisms to the ecological niche they populate. All RNRs are divided into three classes (I – III) based on the ways these enzymes generate the radical, their oxygen dependence, substrate preference, and quaternary structure.

Figure 7. Reaction catalyzed by ribonucleotide reductase. The transient cysteine thiyl radical in the active site activates the substrate through ribose 3’ hydrogen abstraction. Consequently, the 2’ OH group leaves as water. Then, the substrate undergoes 2-elec- tron reduction and the 3’ hydrogen is returned to the substrate. Adapted from ref. 129.

26 Of the three classes, class I RNRs have been the most well-studied. They consist of two subunits: R1 – essential for ribonucleotide reduction through the formation of a transient thiyl radical, and R2 – a “pilot light” subunit. The R2 itself generates a stable radical, which is then shuttled 35 Å away to R1 subunit by proton-coupled (PCET) for the initiation of ribo- nucleotide reduction 133,134. All to date studied R2 proteins share a common -like fold and house a di-metal Fe, Mn or mixed Mn/Fe cofactor (with one exception, discussed further below). Radical generation in R2 proteins of this class is oxygen dependent. Different metallocofactors and the ways that they generate a radical will be discussed in more detail below. Class II RNRs are oxygen indifferent and are found in organisms preferring anaerobic or facultatively anaerobic environments. Interestingly, in addition to bacteria, archaea and viruses, class II RNRs are also found in a few unicel- lular eukaryotes 130,135,136. These enzymes are monomeric or homodimeric and generate a cysteine thiyl radical via cleavage of deoxyadenosylcobalamin (vit- amin B12 coenzyme). Notably, in contrast to the class I enzymes that use di- phosphate substrates, class II utilizes either ribonucleotide di- or triphosphates 137,138. Class III enzymes are homodimeric and generate a stable, but extremely oxygen-sensitive glycyl radical in the vicinity of the conserved cysteine pair of the active site 139. In order to generate the glycyl radical, class III proteins require a homodimeric activase protein, which carries a 4Fe-4S cluster and S- adenosylmethionine (SAM) cofactor. The glycyl radical is then generated via SAM reduction by the iron- cluster. In addition, these enzymes prefer ribonucleotide triphosphates as their substrates 140–142. Due to the oxygen sen- sitivity, this class of enzymes is confined to organisms able to thrive in anaer- obic environments. Arguably, ribonucleotide reductase is one of the most important enzymes to the origin of life and represents a very interesting subject from an evolu- tionary point of view. While the three different classes briefly described above share only 10 – 20% sequence identity, there is a striking conservation trend in the three-dimensional structure of the catalytic subunit, catalytic mecha- nism itself and in allosteric regulation, suggesting that all of these enzymes likely have a common origin 129,143. Notably, many organisms encode more than one class of RNRs, thus adapting to dynamic changes in the environments they inhabit 130,135,144.

3.2 The radical-generating subunits of class I RNR: metal requirement and radical types The radical-generating subunits (R2 proteins) of class I RNRs conserve a ferritin-like fold – a four helix bundle typically coordinating two metal ions145.

27 Nevertheless, they diverge in the nature of the radical species and/or metal requirement for radical generation. Based on these differences, class I R2 pro- teins are further grouped into five subclasses: Ia – Ie. Metal sites from repre- sentative structures of the different subclasses are shown in Figures 8 and 9.

3.2.1 Class Ia R2: Fe/Fe enzymes and a protein radical The E. coli class Ia RNR and its R2 subunit (R2a) are often referred to as canonical and are the best characterized amongst all R2 classes 146. By using metal chelators, it has been established that the protein activity is metal-de- pendent, as they render R2a inactive. R2a can then be reactivated after the II 147 metal-free protein is incubated with Fe and O2 ; the active protein generates a stable organic radical 132,147 . In 1977 Sjöberg and colleagues have confirmed that the radical is located on tyrosine-122 (E. coli numbering) in proximity to the metal site 148. Notably, this was the first time an organic radical was ob- served in proteins 131. The FeIII /FeIII-tyrosyl radical (Y•) is generated by oxygen activation of the dinuclear FeII metallocofactor through a series of intermediates (reviewed in 149). Four are needed for the reduction of oxygen to water. Since FeII (ferrous) ions are readily oxidized by molecular oxygen, two electrons are transferred in this process and the ferrous ions are oxidized to a FeIII /FeIII (ferric) peroxide form. The diferric peroxide is then reduced by one electron by neighboring W48 side chain (E. coli numbering), forming a tryptophan cat- ion radical (W•+). Consequently, a short-lived FeIII /FeIV species is formed, denoted as intermediate X. Intermediate X, in turn, oxidizes Y122 to Y• (E. coli numbering). The W•+ species does not accumulate and is likely taken care of by the R2b protein itself or by the ferredoxin YfaE 149,150.

3.2.2 Class Ib R2: Mn/Mn enzymes and a protein radical Enzymes of this class are found in aerobic and facultative anaerobic, and often pathogenic organisms such as Bacillus anthracis, M. tuberculosis, E. coli and Salmonella enterica 130,151. The characterization of class Ib R2 proteins (R2b) is linked to a curious story of a relevant metallocofactor identification 152,153. Early in vivo experi- ments indicated that R2b is Mn-dependent. The enzyme co-purified with Mn was active, however the activity could not be restored after metal removal and subsequent re-introduction of Mn 154,155. Less active protein, however, was ob- tained after reconstituting apoprotein with FeII 154. In addition, R2b heterolo- gously expressed in Mn-enriched media did not result in acquisition of an ac- tive protein either 156. In part because of these observations R2b enzymes were thought to be di-iron enzymes, similar to R2a.

28 Essentially, R2b metallo-factors assemble the MnII /MnII metal site, but un- II II II II like Fe , Mn is unreactive towards O2 directly. Instead, Mn /Mn is oxidized •- III IV by the species (O2 ). The superoxide generates a Mn /Mn in- termediate, proposed to resemble intermediate X in R2a, which decays to MnIII /MnIII and Y•. In this case the number of required oxidizing equivalents for Y• is provided and, in contrast to R2a, the neighboring W side chain is not oxidized 157. Turned out that the missing link for the di-MnII metallocofactor activation in the earlier experiments was NrdI - a small flavodoxin-like protein harboring the redox active flavin mononucleotide (FMN) coenzyme and expressed by all organisms containing class Ib RNR 151,158–162. The FMN coenzyme of NrdI •- generates the O2 during the reaction with oxygen, which is then channeled directly to the metal site through a hydrophilic passage conserved in R2b pro- teins 157,163. Importantly, the first metal-coordination shell residues are completely con- served in R2a and R2b (Figure 8A – D), which rises an important question about the mechanisms of correct enzyme metalation.

3.2.3 Class Ic R2: Mn/Fe enzymes and an inorganic radical R2a and R2b enzymes, discussed above, rely on tyrosyl radical generation for the catalytic activity of the RNR. Another group of R2 proteins was dis- covered in Chlamydia pathogens, in which the residue equivalent to the radi- cal carrying tyrosine is substituted for a phenylalanine, with the significant homology between R2a and R2b being retained (Figure 8E – F). This group of R2 proteins was termed Ic (R2c). However, despite the absence of the sig- nature tyrosine, many lines of evidence suggested that this enzyme is active and uses radical chemistry as well 164,165. At first, the di-iron form of the en- zyme was characterized 164. Further studies have shown that R2c uses a mixed Mn/Fe cofactor and store the radical equivalent on the metallocofactor itself. Specifically, oxygen activation of the MnII/FeII results in formation of a MnIV /FeIV intermediate, decaying to an active MnIV /FeIII cofactor 165,166, in agree- ment with theoretical calculations 167,168. The immediate metal coordination shell in R2c also differs from R2a and R2b proteins. The N-terminal carboxylate side chain in R2c proteins is con- served as glutamate, while in R2a and R2b proteins conserved as an aspartate residue in the equivalent position (Figure 8E – F).

29

Figure 8. Representative metallocofactor structures from R2a, R2b and R2c classes of RNR, and R2lox. (A) E. coli R2a in oxygen-activated form [PDB: 1MXR 169] (B) E. coli R2a in reduced, di-ferrous form [PDB: 1XIK 170] (C) Corynebacterium ammo- niagenes R2b in oxygen-activated form [PDB: 3MJO 160] (D) Bacillus subtilis R2b in reduced, di-MnII form [PDB: 4DR0 171] (E) Chlamydia trachomatis R2c in di-ferric, oxygen activated form (oxidized Mn/Fe structure of R2c is not available) [PDB: 1SYY 164] (F) C. trachomatis R2c in reduced Mn/Fe form [PDB: 4M1I 172] (G) Geo- bacillus kaustophilus Mn/Fe R2lox in oxygen-activated form [PDB: 4HR0 173] (H) G. kaustophilus R2lox in reduced Mn/Fe form [PDB: 4HR4 173]. Fe and Mn ions are shown as orange and purple spheres, respectively, solvent and oxo/hydroxo bridging ligands – as red spheres. Solid lines indicate metal coordinations. Figure prepared using PyMOL 124.

30 3.2.4 Class Id R2: Mn/Mn enzymes and an inorganic radical One of the very recent additions to the R2 protein family classification is class Id (R2d). This class of R2 proteins was separated based on peculiar se- quence features. R2d enzymes seem to retain the tyrosine residue, which in classes R2a and R2b harbors the radical, and contain the N-terminal metal- coordinating glutamate, similar to R2c (Figure 9A). In addition, enzymes of this class appear to have evolved distinct allosteric regulation mechanisms, compared to canonical RNR 174,175. Surprisingly however, R2d enzymes do not generate Y• and appear to assemble a MnII /MnII metal site, as in R2b, but •- IV III rely on an external O2 source in generating an active Mn /Mn cofactor without NrdI 174–177. Thus, R2d enzymes may represent a simpler evolutionary precursor to R2b enzymes. Further studies are needed to show if a dedicated superoxide source exists for this class of R2 proteins.

3.2.5 Class Ie R2: metal-free enzymes and a protein radical The growing number of sequenced genomes have raised the chances for new and unexpected discoveries. A completely new class of metal-free R2 proteins was discovered, denoted Ie (R2e). These RNRs are encoded by gene operons similar to R2b and likewise require the NrdI activase. Surprisingly, these enzymes lack three of the conserved metal-coordinating carboxylate res- idues of the other R2 classes, preventing metal binding. Ribonucleotide reduc- tion in R2e is powered by 3,4-dihydroxyphenylalanine (DOPA) radical, how- ever, the mechanism that generates it is unclear (Figure 9B). Because R2e RNR are commonly found in human pathogens, metal-free adaption might be advantageous in metal-starvation conditions triggered by the host immune system 178,179.

Figure 9. Representative cofactor structures from R2d and R2e classes of RNR. (A) Flavobacterium johnsoniae R2d in reduced, di-MnII form [PDB: 6CWO 175] (B) Metal-free Mesoplasma florum R2e in the active form, containing a modified Y126 residue – 3,4-dihydroxyphenylalanine (DOPA) [PDB: 6GP2 179]. Mn ions are shown as purple spheres, solvent – as red spheres. Solid lines indicate hydrogen bonds and metal coordinations. Figure prepared using PyMOL 124.

31 3.2.6 R2-like ligand-binding oxidase: Mn/Fe enzymes and a Y-V crosslink A subgroup of proteins, denoted R2-like ligand-binding oxidases (R2lox), was identified based on sequence homology to the R2c group discussed ear- lier. They conserve the phenylalanine residue (equivalent to the radical-carry- ing tyrosine in R2a and R2b) and the di-metal cofactor-coordinating residues similar to R2c (Figure 8G – H) 180,181. The crystal structure of M. tuberculosis R2lox (Rv0233) revealed a remodeled R2-like scaffold with a large hydro- phobic cavity, housing a long-chain fatty acid coordinated to a mixed Mn/Fe cofactor. In contrast to R2 proteins, R2lox performs two-electron redox chem- istry, forming a tyrosine-valine ether crosslink in the vicinity of its metal bind- ing site, as a result of the metallocofactor reaction with oxygen. Similarly to IV IV R2c, R2lox is able to reduce O2 via Mn /Fe intermediate without any aux- iliary factors 173,180,182,183. The physiological role of R2lox proteins, however, is yet to be described.

3.3 Metal specificity in heterodinuclear Mn/Fe R2 and R2lox proteins As noted above, all R2 proteins share a common fold and, with the excep- tion of class Ie – similar metal-coordinating residues (Figures 8 and 9A). At the same time, they utilize a variety of different metallocofactor types to per- form ribonucleotide reduction through an evolutionary conserved mechanism. How do these enzymes ensure the correct assembly of their metallocofactors and what are the factors involved in this process? Clearly, a wrong metallation would result in reduced or no protein activity at all, which in vivo could have deleterious consequences 153,184. Typically, transition metal-ligand complex stability is described by the Ir- ving-Williams (IW) series (MnII < FeII < CoII < NiII < CuII > ZnII) 185. Stability of the complex indirectly reflects metal-ligand affinity. Some , like CuII, bind to protein ligands tightly and therefore are commonly chaperoned or seg- regated in the cell environment, and generally found in low abundance 186. II II However, Mn and Fe represent a special case, because both are abundant, bind weakly to their protein ligands, and prefer similar coordination environ- ments. Recent studies show that heterodinuclear Mn/Fe ferritin-like family pro- teins R2c and R2lox do not follow the expected IW order. It is worth mention- ing, that IW series theory is empirical and describes simple metal-ligand com- plexes in solution 185. Proteins are more complex than that and have evolved to create specific environments for dedicated catalytic activities. Crystallo- graphic and solution experiments with in vitro metal-reconstituted protein show that both R2lox and R2c are able to select Mn in their N-terminal metal

32 site (site 1) and Fe in their C-terminal metal site (site 2) (Figure 8F – H), to assemble a functional cofactor 172,173,187,188. The routes of cofactor assembly however, are different in these proteins 189. Thus, the data infers that the pro- tein scaffold itself is able to tune metal site architecture and electronic struc- ture to facilitate the binding of the correct metal for the assembly of catalyti- cally functional cofactor. Unfortunately, we are currently unable to explain this very important process fully 188. Despite similar ligand preference, redox chemistry catalyzed by Mn and Fe is far from interchangeable. The differentiation lies in their redox potentials and, consequently reactivity towards their substrate (O2, as in the case of R2 proteins). A study on superoxide dismutases MnSOD and FeSOD elegantly shows this difference 190. Both enzymes conserve the primary metal coordina- tion shell and yet MnSOD is inactive with Fe and, in turn, FeSOD is inactive with Mn. The study concludes, that the second metal ligand coordination sphere is involved in tuning the electronic environment of the metal site in such a way, that the enzyme can catalyze the reaction 190. Similarly, the rear- rangement of metal-chelating carboxylate ligands in R2 proteins (so called carboxylate shift) and metallocofactor oxidation are inseparable events, fun- damental for the reversible process of radical generation and transfer. There- fore, a protein is likely to create a favorable environment for the binding of the correct metal, even if the process includes outcompeting a less favorable binder as shown for R2c 189. The impact of second coordination shell residues on metallocofactor assembly in R2lox was recently reported 191. In this study, a glycine residue, directly preceding N-terminal metal-chelating carboxylate ligand (E69 in Figure 8G – H) and conserved in R2lox group, was substituted to leucine, a hydrophobic sidechain often found in R2b proteins. Surprisingly, this single mutation was able to convert R2lox to an R2c-like one-electron oxidant cofactor precluding the tyrosine-valine crosslink formation. This study, therefore, highlights the importance of the second metal coordination shell residues in tuning the environment for the physiologically relevant met- alation 191.

3.4 Bacillus anthracis R2b as a model system The assembly of the mixed metallocofactor in R2c and R2lox groups is relatively well-studied 172,173,187,188, but no such studies are reported for R2 pro- teins, utilizing a single type of metal. Therefore, we have set a target to inves- tigate whether the R2b protein scaffold also determines the selectivity towards

33 Mn, the physiologically relevant metal ion, or follows the IW order. We have used Bacillus anthracis R2b protein as a model system in these studies. Bacillus anthracis is a Gram-positive spore-forming bacterium, commonly found in soil and infecting grazing animals. It can also occasionally infect hu- mans through the consumption of contaminated meat or by the inhalation of bacterial spores. For the latter reason B. anthracis is considered as a biological weapon 192. The genome of B. anthracis encodes two RNRs: class Ib and class III. R2b is well-characterized biochemically and resembles a “classic” R2b II •- enzyme, generating a Y• from di-Mn site and O2 with the help of NrdI or II 144,161,193 directly from di-Fe site and O2 , but its three-dimensional structure had not been published. Here we have performed metallocofactor reconstitution experiments in metal-free B. anthracis R2b protein crystals by providing MnII and FeII in large excess and equal concentrations. To account for the possible role of O2 acti- vation, these experiments were performed in aerobic and oxygen-depleted conditions. Since R2lox contains a conserved Gly residue in the second coordination shell 181,191, we have created an R2lox-like mutation in B. anthracis R2b by substituting Leu61 to Gly and performed similar metal-reconstitution experi- ments as for the wild-type R2b protein. Further, we have determined the site- specific metal content and relative amounts of bound Mn and Fe in metal- loaded protein crystals using the quantitative anomalous dispersion method187. The results of this study are reported in Paper III. Furthermore, to investigate if MnII or FeII binding induce any differences in metal-coordination geometry in B. anthracis R2b, we have loaded apopro- tein crystals with a single type of metal, either MnII or FeII, aerobically and in oxygen-depleted conditions. In Paper IV we report several high-resolution crystal structures of B. anthracis R2b in metal-free form, as well as in complex with Mn or Fe. We were also able to trap partially oxidized di-iron metalloco- factor state in one of the crystal structures. We discuss the structural changes associated with metallocofactor oxidation event and suggest their physiologi- cal relevance.

34 4. Summary of main findings

4.1 Paper I The membrane proteome of M. tuberculosis remains a terra incognita for structural biologists, largely because of the challenges in mycobacterial mem- brane protein expression and the selection of well-expressing protein targets suitable for further investigations. To facilitate the selection of such targets, we have streamlined the M. tu- berculosis H37Rv a-helical transmembrane protein expression screening by adapting the folding reported GFP (frGFP) fusion approach, previously used for E. coli inner membrane proteins 194. In this procedure frGFP florescence can be correlated to the expression levels of its fusion membrane protein part- ner, which can be measured in growing cells. We have constructed two protein expression vectors, well suited for target protein production in E. coli and My- cobacterium smegmatis. Both vectors are also compatible with ligation inde- pendent cloning (LIC), a universal and high-throughput approach. In this study we have applied this frGFP-based technique to 42 a-helical transmembrane proteins, all of which are potential drug targets due to their essentiality for M. tuberculosis virulence and survival. We show that protein expression media and choice of expression host have a profound influence on target protein production both qualitatively and quantitatively, suggesting the basic conditions for routine protein expression tests. Interestingly, several pro- tein targets from our library express better without the addition of IPTG (iso- propyl β-d-1-thiogalactopyranoside) inducer or in autoinduction media. The increased protein yield in these cases is likely associated with slower produc- tion rates of target membrane proteins and, consequently, reduced strain on the machinery for their insertion into the membrane. In addition to these gen- eral observations, we describe conditions allowing high-level expression of more than 25 essential proteins from our library. Importantly, this approach allowed the overexpression of PgsA1 protein, leading to its successful crys- tallization and structure determination, reported in Paper II of this thesis. As this is a study in progress, the expression and integrity of several target proteins still have to be verified. Nevertheless, we hope that our findings will encourage further M. tuberculosis membrane protein research.

35 4.2 Paper II The structure of the cell envelope is a hallmark of the Mycobacterium genus (Figure 3). M. tuberculosis heavily relies on the integrity of its cell envelope for protection from the host immune response, environmental factors, as well as for virulence. Because lipids constitute a very large proportion of the enve- lope, targeting enzymes involved in lipid biosynthesis pathways is advanta- geous for antitubercular drug development. M. tuberculosis phosphatidylinositol phosphate synthase (PgsA1, Rv2612c) is one such potential target; it is an integral plasma membrane en- zyme, belonging to the CDP-alcohol phosphotransferase (CDP-AP) protein family. PgsA1 is an essential enzyme, and deletion of the pgsA1 gene renders the pathogen nonviable 99. This enzyme is involved in the biosynthesis of phosphatidylinositol phosphate (PIP), a precursor for phosphatidylinositol (PI), which is an abundant phospholipid in the mycobacterial cell envelope. PI functions as the first building block and lipid anchor for the synthesis of the structurally complex glycoconjugates phosphatidylinositol mannosides, lipo- mannan and lipoarabinomannan, which comprise the cell envelope of the pathogen (Figure 3) 97. Unlike what is observed in eukaryotes, PI biosynthesis in M. tuberculosis is a two-step process 114. In the first step, PgsA1 catalyzes the conjugation of the 1’ hydroxyl group of D-myo-inositol-3-phosphate (ino- sitol phosphate) with a lipid tail of the acceptor substrate cytidine diphosphate diacylglycerol (CDP-DAG), resulting in the formation of PIP. In the second step, a putative phosphatase removes the 3’ phosphate from the PIP, forming PI. In this study we report three crystal structures of M. tuberculosis PgsA1: a substrate-free form (to 2.9 Å resolution), a metal-substituted (Mn-) form bound with citrate (to 1.9 Å resolution) and a Mg-containing form that was co-crystallized with the substrate CDP-DAG (to 1.8 Å resolution). The struc- tures reveal a homodimer with the characteristic CDP-AP fold; an amphi- pathic juxtamembrane N-terminal helix and a six transmembrane helix bun- dle, arranged to house a large solvent-accessible pocket and a dinuclear metal site in each subunit of the homodimer (Figure 6A – B). PgsA1 requires MgII for catalysis 114,115, however we could not initially lo- cate the metal ions in the apo structure. Therefore, we co-crystallized PgsA1 with MnII; a cation that also supports PgsA1 catalysis, and which additionally allows for anomalous dispersion experiments. We were able to unambigu- ously identify and locate two metal ions per protomer of the PgsA1 dimer. The MnII ions were coordinated by four strictly conserved aspartate side chains from the CDP-AP signature motif. Unexpectedly, some of the MnII associated with citrate ions, which originated from the crystallization solution. These Mn-citrate complexes were clearly recognizable in the electron density and were located in the solvent-accessible positively charged pocket, comprised of a number of conserved arginine and lysine side chains. It was previously

36 hypothesized that this positively charged pocket (and specifically the R152xxR155 motif; PgsA1 numbering), may be involved in binding of the phosphate moiety of the substrate inositol-phosphate 122. To test this hypothe- sis, we introduced several point mutations that were expected to alter the local charge of the pocket, or which would sterically obstruct the binding of inosi- tol-phosphate. Protein activity was then assayed in proteoliposomes. The mu- tated sidechains in close proximity to the R152xxR155 motif (i.e. the apical side of the pocket), did indeed render the protein significantly less active com- pared to the wild-type. We also attempted to co-crystallize PgsA1 with inosi- tol-phosphate, but this was not successful. This is likely due to the inositol- phosphate being outcompeted by the sulphate ions in the crystallization solu- tion, which were essential for the formation of well-diffracting crystals. Fur- thermore, the addition of 10 mM inositol phosphate to apoprotein crystals grown in lipidic cubic phase (LCP) led to lipid phase change and crystal dete- rioration (unpublished data). Nevertheless, the results of the inositol-phos- phate binding site mutational studies were further confirmed by molecular docking experiments, additionally suggesting a plausible orientation for the substrate relative to the metal site and the catalytic aspartate residue. A previously reported 3.6 Å crystal structure of another member of the CDP-AP enzyme family, RsPIPS protein from a fish pathogen Renibacterium salmoninarum 123, revealed the binding site of CDP-DAG, which is also a sub- strate of M. tuberculosis PgsA1. However, the limited resolution of this struc- ture hindered the analysis of structural details in the active site. We were able to solve a co-crystal structure of M. tuberculosis PgsA1 in complex with CDP- DAG to 1.8 Å resolution, by supplementing the LCP with this substrate. The structure revealed clear electron density for CDP-DAG, which coordinated the binuclear MgII site in both protein chains of the homodimer. However, the long acyl chains of the substrate were disordered, indicative of their flexibility along the protein surface. Notably, a metal ion in site 2 (coordinated by the catalytic residue D93) in both in MnII and MgII-containing structures exhibited a certain degree of mo- bility, when the two protein chains that comprise the protein dimer were com- pared. Specifically, in one of the protein chains both metal ions were closer together (we denoted this state as “tight”) and were further apart in the second protein chain (we denoted this state as “relaxed”). We propose that this flexi- bility has a functional implication in catalysis; it is required for orienting the correct hydroxyl group of the inositol-phosphate towards the catalytic base D93 and CDP-DAG. Finally, we proposed a refined 121–123 catalytic mecha- nism for CDP-AP enzymes (Figure 10). The high-resolution crystal structures presented in this study provide an excellent basis for structure-guided drug design against M. tuberculosis.

37

Figure 10. Proposed model for substrate-induced catalysis initiation. (A) Resting “re- laxed” state with CDP-DAG bound. (B) Binding of D-myo-inositol-3- phosphate (ino- P) to a carboxylate shift in D93 accompanied by a shift in the metal position of Mg2 resulting in the “tight” state. In step I, the catalytic base D93 deprotonates the 1′ hydroxyl of D-myo-inositol-3-phosphate. In step II, D-myo-inositol-3-phosphate per- forms a nucleophilic attack on the β-phosphorus of the CDP-DAG substrate. (C) Pre- sumed penta-coordinated transition state. (D) Release of the reaction product phos- phatidylinositol phosphate (PIP) and the return of the active site to the resting “re- laxed” state. Aspartate residues undergoing the carboxylate shift are marked with blue filled circles. Figure adapted from Paper II 125.

4.3 Paper III

The heterodinuclear ferritin-like family of proteins R2c from Chlamydia trachomatis and R2lox from Geobacillus kaustophilus defy the Irving-Wil- liams series (IW) and selectively assemble their Mn/Fe metallocofactor, in- stead of the expected Fe/Fe cofactor 172,173,187,188. Furthermore, both proteins have been shown to assemble their cofactors differently 189, suggesting that the protein scaffold itself determines metal selectivity through mechanisms we currently do not fully understand. Following these observations, we de- cided to investigate whether the R2b enzyme (di-manganese in vivo) from Ba- cillus anthracis can also select the physiologically relevant metal ion, or if it follows the IW series instead.

38 We employed the quantitative X-ray anomalous dispersion (Q-XAD 187) technique to estimate the relative amounts of Mn and Fe bound in the metal sites, using two different strategies of metallocofactor reconstitution: (1) Incubating metal-free B. anthracis R2b protein crystals with excess and equal concentrations of MnII and FeII in oxygen-depleted conditions. In this setup no oxygen activation is expected, and metals remain in their reduced state. (2) Incubating metal-free B. anthracis R2b protein crystals with excess and equal concentrations of MnII and FeII in aerobic conditions. In contrast to the previous setup, oxygen activation is possible. Q-XAD data showed that irrespective of the presence or absence of oxygen, B. anthracis R2b assembled a di-manganese cofactor, thus defying the IW series. Considering that metal-coordinating ligands are completely conserved in R2a and R2b groups, selective binding of Mn in R2b was not expected. Many broad questions remain to be answered in future experiments: (a) What are the steps in metallocofactor assembly in R2b? (b) Does R2b assemble tran- sient Mn/Fe, Fe/Mn or Fe/Fe cofactors? (c) If it does, why do these cofactors not activate oxygen as is the case with R2c, R2lox and R2a proteins? (d) R2a enzymes are active with a di-iron cofactor in vivo and in vitro, and conserve identical metal-coordinating residues as R2b proteins. However, would an R2a enzyme select for iron from a mixed pool of labile MnII and FeII? We introduced an R2lox-like point mutation in the second metal coordina- tion sphere of B. anthracis R2b. This residue is a conserved glycine in R2lox proteins, which directly precedes the N-terminal carboxylate ligand (E69 in Figure 8G – H), coordinating the metal ion in site 1 181,191. In contrast, there is no obvious sequence conservation of this residue in the R2 group, although it is generally hydrophobic; in the R2b group a leucine is often found in this position (L61 in B. anthracis) 191. We therefore crystallized the apo L61G var- iant of B. anthracis R2b and performed metallocofactor reconstitution as de- scribed in procedures (1) and (2) for the wild-type protein. Similarly, we ana- lyzed the site-specific metal content in the L61G mutant by Q-XAD and com- pared this data with the metal quantification results obtained for the wild-type R2b protein. Q-XAD data showed that in the absence of oxygen during me- tallocofactor reconstitution (procedure 1), the L61G variant still assembled a di-manganese cofactor as was observed for the wild-type enzyme, however it did become slightly more permissive to iron binding in both metal sites (Fig- ure 11). In contrast, when the L61G variant was reconstituted with divalent Mn and Fe in aerobic conditions, the protein changed its metal preference and assembled the Mn/Fe cofactor, similar to what is observed for R2lox and R2c enzymes (Figures 8F – H and 11).

39

Figure 11. Relative amounts of Mn and Fe in the two metal binding sites of (A) wild- type and (B) B. anthracis R2b L61G variant. The corresponding representative anom- alous difference electron density maps derived from Q-XAD data are shown. Differ- ence density maps for Mn and Fe are shown as purple and orange mesh, respectively. For the wild-type and anaerobically prepared L61G mutant metallocofactors maps are contoured at 5 s. Maps for aerobically reconstituted L61G crystal are contoured at 4 s. Figure adapted from Paper III. In addition to quantitative studies, we also obtained several high-resolution crystal structures of the wild-type protein and the B. anthracis R2b L61G var- iant in a metal-free state as well as after metallocofactor reconstitution, fol- lowing the procedures (1) and (2), described above. Compared to wild-type R2b, the L61G variant displayed significant rearrangement in the metal site coordination sphere. Briefly, the L61G mutation created a vacant space for side chain rearrangement in close proximity to the radical-generating residue Y100. Consequently, metal-chelating carboxylates were also rearranged, likely due to the absence of steric obstruction normally created by the second- shell of bulky hydrophobic sidechains. As a consequence of the carboxylate rearrangement both MnII ions were 6- coordinated and solvated in the anaerobically prepared L61G protein. For comparison, in the FeII and MnII-containing wild-type protein, site 1 is 4- co- ordinated and site 2 is 5-coordinated with no metal-coordinating solvent mol- ecules visible in the electron density maps (Ref. 195 and Paper III). Rearrange- ments in the second metallocofactor coordination shell were present in L61G crystal reconstituted in presence of O2 as well. However, the metal site was significantly more disordered, as reflected by the refined occupancies of the metal ions in each binding site, as well as the flexibility of the metal-chelating

40 carboxylates. Notably, the mixed Mn/Fe cofactor in L61G mutant appears to be able to activate O2. Taking into account the extreme sensitivity of R2 pro- teins to photo-reduction 196, observation of even partially oxidized (aqua/oxo/hydroxo-bridged metallocofactors) is surprising – this particular crystal survived three rounds of X-ray data collection (Mn-, Fe- and near-Se edges). Further investigation is required to show if a tyrosyl radical (Y•) is formed during the oxygen activation process, or if the metallocofactor is irre- versibly trapped in its oxidized state. In summary, we have shown that the protein scaffold of the R2b enzyme selects Mn over Fe. Similar to R2lox enzymes, the second coordination shell environment of R2b proteins is able to constrain metal-chelating residues, for physiologically relevant metal ion selection. In addition, we provide experi- mental evidence that the second coordination shell environment is involved in tuning the metallocofactor redox potential for O2 activation in R2 proteins.

4.4 Paper IV In this study we focused on the structural characterization of B. anthracis R2b, complexed with Mn or Fe. While a number of orthologs from closely- related Bacillus species have already been structurally characterized (e.g. B. cereus R2b 197 and B. subtilis R2b 171, which share 99% and 59% protein se- quence similarity, respectively), we investigated whether the type of metal- locofactor (i.e. di-Mn or di-Fe) would have an immediate impact on the metal site geometry in B. anthracis R2b. We have solved five high-resolution B. anthracis R2b crystal structures: one in a metal-free state, and four structures which were obtained by metal- locofactor reconstitution in protein crystals, using two different procedures: (1) Incubating metal-free crystals with excess of MnII or FeII in oxygen- depleted conditions. In this setup no oxygen activation is expected, and metals remain in their reduced state. (2) Incubating metal-free crystals with an excess of MnII or FeII in aerobic conditions. In this setup oxygen activation of the FeII /FeII cofactor is expected, leading to the formation of an oxo-bridged FeIII /FeIII state and Y•, as in R2a 144,149. Since one of the aerobically Fe-loaded protein structures (denoted Fe2-aer in this study) clearly represented a reduced metallocofactor state and metal coordination geometry, we concluded that its metal sites must have been photo-reduced during data collection. Photoreduction is not uncommon in R2 proteins 196, and in in general 198. The metallocofactor struc- ture in Fe2-aer is identical to cofactors in anaerobically prepared Fe and Mn- containing B. anthracis R2b structures reported in this study. We therefore concluded that the structure of the metallocofactor does not change depending

41 on the metal type bound, at least when the metal ions remain in their reduced (II) state. To decrease the possibility of photoreduction during data collection, we collected another X-ray dataset from a B. anthracis R2b crystal. This crystal was aerobically prepared with FeII, and was expected to form an oxo-bridged FeIII /FeIII cofactor and the Y•, as in R2a. This time this crystal was exposed to highly attenuated X-ray beam (unfortunately, total radiation dose for any of these datasets was not recorded). Following this procedure, we were indeed able to obtain a B. anthracis R2b structure with a partially oxidized di-iron metallocofactor. The photo-reduction could not be avoided completely in both chains of the R2 homodimer, and so the metal coordination geometry repre- sents a mixture of oxidized and reduced states. This structure is denoted Fe2- semiox. Intriguingly, positive difference density maps of the Fe2-semiox structure indicated significant rearrangements in helix E. This helix possesses a stretch of amino acids engaging in p-type/310-type hydrogen bonding (Figure 12). The distorted topology of helix E is retained across members of the ferritin-like protein family, suggesting it has a functional role 145,199. For example, this hel- ical segment undergoes reorganization upon regulatory subunit binding in bacterial multicomponent monooxygenase 200,201. In B. anthracis R2b, helix E donates one of the metal-chelating carboxylates, E161, which coordinates to metal in site 2. The reason why R2 proteins conserve this energetically expen- sive structural feature remains enigmatic. However, since the rearrangements in helix E are linked to oxidation of the metallocofactor, we have proposed a functional role associated with this event.

42

Figure 12. Distorted topology of helix E and model of structural rearrangements oc- curring upon metallocofactor oxidation in B. anthracis R2b. (A) Residues participat- ing in i + 5 interactions are shown in green and the respective hydrogen bonds are shown as dashed black lines. The residues participating in 310-type (i + 3) interactions are shown in yellow and the respective hydrogen bonds are shown as solid black lines. Metal ions are shown as spheres (B) Helix E in Fe2-semiox is modeled as in reduced structures; however, the positive difference electron density (shown as blue mesh) suggests structural rearrangements upon oxidation (C) Model of helix E rearrange- ments upon metallocofactor oxidation, as suggested by positive Fo–Fc electron den- sity. The 2Fo–Fc electron density map is contoured at 1.2 σ and is shown as a grey mesh within a 1.2 Å radius from the residues. The Fo–Fc electron density map is contoured at 2.5 σ and shown as a blue mesh within 2.0 Å radius of the residues. Figure adapted from Paper IV 195.

By comparing the structure of the putative superoxide passage in the Fe2- semiox structure to a reduced di-iron R2b-NrdI complex from B. cereus 197, we have noted that oxidation-related rearrangements in the first and second coordination shells of the metal sites in Fe2-semiox render the superoxide pas- sage closed. Specifically, the Y166 side chain, located on helix E, likely forms a hydrogen bond with a residue S196 on the neighboring helix F (Figure 13). While we could not model this interaction in our structure due to the poor quality of the positive difference density for Y166, this interaction is unam- biguously observed in the oxidized di-manganese structure of C. ammoniag- enes R2b 160. We propose that these conformational changes in the

43 metallocofactor first and second coordination shells are required to seal the superoxide passage in order to avoid the loss of formed Y• to solvent.

Figure 13. Proposed gating of the NrdI–R2b superoxide passage. (A) Docked model of the FMN-bound NrdI and Fe2-semiox R2b (based on the B. cereus complex, PDB id: 4BMO 197). A surface-slice view of the open superoxide passage (blue arrows), which connects the FMN coenzyme and the metal cluster (modeled as reduced) in B. anthracis R2b. (B) Upon oxidation of the metal cluster, the distorted helix E seg- ment rearranges; Y166 seals the superoxide passage (red arrow) by making a polar interaction with S196 on helix F. Metal ions are shown as large spheres. Residues, gating access to the superoxide channel, are highlighted in bold. The protein surface is shown in translucent grey. Figure adapted from Paper IV 195.

44 5. Brief description of the crystallographic methods used in this thesis

5.1 X-ray crystallography X-ray crystallography is by far the most widely used technique†† for mac- romolecular (e.g. protein) three-dimensional (3D) structure determination, able to reach atomic resolution. As the name of technique implies – formation of protein crystals is an essential pre-requisite and often remains a major bot- tleneck in structural biology, as many proteins resist crystallization. Protein crystals are made of orderly 3D arrays of repeating units of protein molecules, interacting with each other non-covalently in a very regular way. In a typical X-ray data collection experiment a protein crystal is exposed to a focused monochromatic X-ray beam, and its diffraction pattern is recorded by rotating the crystal as the X-rays pass through it. X-rays are scattered from the molecules in the crystal lattice and, if in phase, result in constructive scattering and a diffraction pattern. Each spot (or reflection) in the diffraction pattern contains contributions from all atoms in the unit cell – a minimal repetitive unit of the crystal. Spot distribution in the diffraction pattern may provide in- formation about crystal space group (molecule organization in 3D in the unit cell), distance between them – about the unit cell size. The intensity of each reflection describes the atomic contents of the unit cell. X-rays can be described as a wave function. With the X-ray wavelength known, amplitudes are calculated from the intensity of each reflection. How- ever, the phase information, essential for electron density map reconstruction, cannot be measured and has to be estimated indirectly. This represents the phase problem in X-ray crystallography. Commonly, the phase problem solu- tion is aided by the molecular replacement method, or using experimental phasing techniques, such as isomorphous replacement or anomalous disper- sion methods. Atomic resolution (1.2 Å and better) protein structures can be solved using direct phasing – a common technique for solving the phase prob- lem in small molecule crystallography 202,203. Once phases are available, electron density map can be calculated using complex mathematical operations (Fourier transform). Electron density repre- sents distribution of electron clouds in 3D space, indicating the location of

†† https://www.rcsb.org/stats/summary, accessed on January 12, 2020

45 protein atoms, and therefore structural features of the crystallized protein. The crystallographer then reconstructs a structural model of the protein by inter- preting the electron density maps via iterative cycles of model building, re- finement and validation. Many excellent books about X-ray crystallography have been written; for detailed theory and practice, recent books from Bernhard Rupp 204 and Gale Rhodes 205 are recommended.

5.2 X-ray anomalous dispersion, XAD XAD, is a technique in X-ray crystallography widely used for phasing and identification of chemical elements (atoms) in macromolecules of interest. The method employs a property of atoms to absorb some X-ray , if particular conditions are met. Specifically, if the X-ray wavelength energy is near the absorption energy of an atom, the atom will strongly absorb some radiation, ionize and emit a photoelectron with an altered energy and phase (i.e. compared to elastic scattering, when energy of X-rays is reflected by the atom, this type of scattering is anomalous). This absorption event has a measurable effect on diffraction, because it affects scattering magnitude and a phase of scattered rays. The scattering differences between X-rays scattered in an elastic way and anomalously provide the basis for anomalous difference electron density map calculation 202,203. Since synchrotron X-ray wavelength energy can be tuned for specific element absorption; generated anomalous dif- ference density maps can provide information about the location of specific atoms of interest in the protein structure. Additionally, XAD is a powerful technique in metal ion identification (Figure 14). Furthermore, XAD can be used as a quantitative technique – for estimation of relative amounts of metal ions bound in the same site of a protein (i.e. dif- ferent subsets of protein molecules in a crystal bind different metals) 187.

46

Figure 14. X-ray anomalous scattering. The absorption edges of transition metals, i.e. Fe (orange sphere) and Mn (purple sphere), are accessible by synchrotron radiation. The graph shows energy distributions of theoretical X-ray scattering factor compo- nents f´ (real) and f´´ (imaginary) for Mn, Fe, Cu and Zn (the scattering factors for each element can be obtained from ref. 206) 202. The insets show anomalous difference density of G. kaustophilus R2lox metallocofactor (contoured at 4 electrons/Å3), de- rived from data collected at the Mn (purple) and Fe (orange) absorption edges 173. At Mn edge, only Mn displays an anomalous signal, but at the Fe edge – both Mn and Fe display an anomalous signal. Figure reproduced from ref. 188.

5.2.1 Radiation damage and photo-reduction in metalloprotein X- ray crystallography

X-ray radiation is ionizing. Crystallographers and those interested in using structures obtained by X-ray crystallography must be aware of the effects and defects this irradiation may introduce. Metalloenzymes are especially suscep- tible, as X-ray induced photoreduction may alter the structure of the metal- locofactor and its environment 198. X-ray photons interact with electrons of matter by (i) a non-absorbing event (i.e. elastic or coherent scattering of a photon), by (ii) inelastic scattering of a photon (i.e. incoherent), when part of its energy is absorbed and scattered at lower energy together with the ejection of a free recoil electron (Compton effect), and by (iii) photoelectric absorption – all photon energy is absorbed and a photoelectron is ejected from an inner atomic shell. At synchrotron en- ergies, photoelectric absorption accounts for over 90% of energy absorbed by the protein crystal 207. The ejected photoelectron can give rise to the ionization of up to 500 neighboring atoms, resulting in the formation of radical species able to diffuse and introduce further ionization and excitation events 207,208. Clearly, this is the limitation of X-ray crystallography used in redox active protein cofactor studies, because transition metals act as electron sinks result- ing in photo-reduction during X-ray data collection. Metallocofactors in R2 proteins are no exception (Papers II and III) and are photo-reduced extremely fast 196. Still, X-ray-induced radiation damage can be used in a positive way – for phasing 209.

47 6. Popular summary

Enzymes are special kind of proteins; accelerating chemical transfor- mations in all living organisms and sustaining all forms of life on Earth. In living cells, enzymes are assembled from just 20 different building blocks called amino acids, in various combinations. Enzymes may also incorporate metal atoms and metallo-organic molecules, which help these proteins to per- form a dedicated chemical reaction; such proteins are called metalloenzymes. My work, presented in this thesis, largely focused on these types of enzymes. In particular, we aimed to understand what these protein molecules look like in three-dimensional atomic detail, in order to relate their structures to their functions. Needless to say, such knowledge is an important piece of the puzzle in regard to understanding the workings of nature. Structural characterization of enzymes also greatly advances our quality of life, for example, through the development of novel pharmaceuticals against deadly pathogens or other dis- eases.

In the first half of my thesis I have focused on membrane proteins from the pathogen Mycobacterium tuberculosis, the causative agent of tuberculosis (TB) – a deadly infectious disease affecting millions of people worldwide. M. tuberculosis is a highly adaptive pathogen that has evolved to escape the defense mechanisms of our immune system; it may even wait dormant in the body until the immune system weakens, following which it causes active TB. In addition, there are strains of M. tuberculosis which have evolved resistance to the antibiotics which are currently used to treat the disease. Therefore, the development of new effective pharmaceuticals to fight this pathogen is of the utmost importance. Membrane proteins, as the name would suggest, are found in the membrane – the outermost shell of the cell. If a bacterial cell was com- pared to a town, the membrane would be the town wall, and the membrane proteins would be the gate keepers; allowing communication with the outside world, importing food, exporting waste, and ensuring that the town is safe from invaders. Membrane proteins are therefore perfect candidates to be tar- geted by new antibiotics; if we can break through the wall, then we can con- quer the city. Protein structural analysis requires large amounts of pure protein material. Unfortunately, membrane proteins are not very abundant in cells and are dif- ficult to express (produce) in large amounts. In Paper I we describe an

48 approach that helps to identify well expressing M. tuberculosis membrane pro- teins. In this approach, 42 membrane proteins were tagged by green fluores- cent protein (GFP), which can easily be detected in growing cells by measur- ing its fluorescence. In this experiment the fluorescence intensity is assumed to be proportional to the produced amount of the membrane protein of interest. In this study we used two protein production “factories” Escherichia coli and Mycobacterium smegmatis, to explore the effects of different bacterial growth conditions on the yields of 42 GFP-tagged membrane proteins. We established production conditions for more than 25 M. tuberculosis membrane proteins, all of which are essential for growth and virulence of M. tuberculosis and are therefore attractive drug targets. The structure of the M. tuberculosis cell wall is unique, compared to bac- teria from other genera; it consists of numerous proteins, fats, complex sugars and their conjugates, forming a thick waxy amour for the pathogen. The GFP- based screening approach, described in Paper I, helped us to produce the M. tuberculosis membrane metalloenzyme PgsA1, and we were able to determine its three-dimensional structure and catalytic mechanism. PgsA1 is involved in biosynthesis of phosphatidylinositol (PI), one of the major phospholipids (fat molecules), that comprises the membrane “town wall”. This enzyme is abso- lutely essential for the growth of M. tuberculosis, and the pathogen is unable to survive when the pgsA1 gene is removed. Mycobacteria synthesize PI in a two-step chemical transformation, in contrast to humans where the reaction is performed in a single step. The three-dimensional structures described in Pa- per II provide an excellent basis for the design of antibiotics, that inhibit PgsA1, and which hopefully kill M. tuberculosis.

The second half of this thesis concerns another metalloprotein – the R2 subunit of ribonucleotide reductase (RNR). The RNR enzyme makes the building blocks for DNA; the molecule that carries the genetic code. Different organisms contain different types of RNR proteins, depending on the ecolog- ical niche they populate (e.g. oxygen requirement, lifestyle, availability of mi- cro-elements). RNR enzymes are rather unique as they utilize radical chemis- try to perform their dedicated chemical transformation; besides RNR only a handful of other radical enzymes have been described. Radicals are typically very reactive and cause harm to organic molecules if they are not contained or cleared by special systems in our bodies. All organisms that require oxygen for survival harbor an RNR that consists of at least 2 proteins working to- gether, called R1 and R2. R2, the subject of my thesis, is a small protein that commonly requires two metal atoms in order to function: two iron atoms, two manganese atoms, or one of each. In some instances, the metal atom pair reacts with oxygen in the air and this reaction forms a radical inside the R2 protein. The radical is then relayed to its partner protein R1, which then uses it to make the DNA building block product. This contrasts with the di-manganese R2 proteins, which do not react with oxygen directly and which also require a

49 third protein for their RNR complex, called NrdI; this protein partners with R2 in order to modify oxygen to a superoxide radical (oxygen with an addi- tional electron). The superoxide is then relayed to the manganese pair in the R2 protein, allowing the generation of the radical, which is then transferred to R1. RNRs were discovered about 60 years ago, but nevertheless remain a sub- ject of active research and a source of unexpected discoveries. All metal-dependent R2 proteins can only generate the radical efficiently in the presence of a specific type and combination of metal atoms. If the pro- tein is “miss-metallated”, the enzyme is not functional and the organism may be unable to survive. How the protein selects the right metal from its environ- ment and furthermore, how it is able to distinguish between manganese and iron (two very similar metals) is an intriguing area of research. In Paper III we studied the metal selection process in R2 protein from the pathogenic bac- terium Bacillus anthracis by exposing the metal-free protein crystals to a mix- ture of manganese and iron in equal concentrations. The protein exclusively chose manganese, the relevant metal type for this metalloenzyme from which we concluded that B. anthracis R2 is able to distinguish between the two met- als. We then aimed to elucidate the features in the three-dimensional structure of R2 which govern the metal ion selection. Specifically, when we substituted one of the conserved amino acids in close proximity to where two manganese atoms bind, the R2 protein suddenly changed its metal ion preference. In the presence of air, the B. anthracis R2 mutant protein selected manganese in one of the metal binding sites and iron in the other. Taken together, our findings show that B. anthracis R2 has control over the selection of relevant metal at- oms, and that amino acids in the vicinity of the metal binding site play direct roles in the differentiation between metal ions. In addition, we also investi- gated how the binding of manganese and iron affects the three-dimensional structure of the B. anthracis R2 protein, before and after the metal reaction in the presence of oxygen. Interestingly, we found that reaction with oxygen in the iron-containing protein triggers local structural changes in the core of the protein. Based on available data reported in Paper IV, we hypothesize that these rearrangements are important for containing the radical produced by the R2 protein.

50 7. Populärvetenskaplig sammanfattning

Enzymer är en grupp proteiner med mycket speciella egenskaper; de har förmågan att öka hastigheten för de kemiska omvandlingsreaktioner som ligger till grund för allt liv på jorden. Liksom andra proteiner är enzymer uppbyggda av långa kedjor bestående av 20 olika byggstenar, så kallade aminosyror, som sitter ihop i olika kombinationer. Enzymer kan även innehålla metaller som hjälper proteinet att utföra den kemiska reaktionen. Sådana enzymer kallas metalloenzymer, och i det arbete som presenteras i denna avhandling har jag framför allt fokuserat på just denna typ av enzymer. Mitt arbete har syftat till att ta reda på hur dessa proteiners tredimensionella struktur ser ut, det vill säga hur aminosyrorna och de individuella atomerna sitter i förhållande till varandra, för att försöka förstå hur enzymerna utför sitt arbete. Denna kunskap är i sin tur mycket viktig för våra möjligheter att förstå hur naturen runt om kring oss fungerar. Dessutom kan kunskap om enzymernas tredimensionella struktur förbättra vår livskvalité, bland annat genom att möjliggöra rationell design av nya läkemedel mot sjukdomar.

Den första halvan av min avhandling är inriktad på membranproteiner från Mycobacterium tuberculosis; bakterien som orsakar tuberkulos. Tuberkulos är en mycket farlig sjukdom som mer än en miljon männsikor dör av varje år, och något tiotal miljoner insjuknar i. Tuberkulosbakterien är en mycket anpassningsbar bakterie, som genom evolutionen har lärt sig att undvika vårt immunförsvars olika mekanismer. Den kan till och med ”övervintra” slumrande i kroppen till den dag immunförsvaret av någon anledning försvagas, varvid bakterien ”vaknar” igen och patienten insjuknar i tuberkulos. Dessutom finns stammar av tuberkulosbakterien som utvecklat resistens mot många av de antibiotika som i dagsläget används för att behandla tuberkulos, och därför är det extremt viktigt att hitta nya läkemedel mot denna bakterie. Membranproteiner sitter som namnet antyder i membranet – cellernas yttersta skal. Om vi för enkelhetens skull liknar en cell vid en stad, skulle membranet vara stadsmuren och membranproteinerna vakterna vid stadens portar. Dessa portvakter tillåter kommunikation med omvärlden, transporterat in mat, transporterar ut sopor och ser till att skydda staden från inkräktare. Portvakterna är jätteviktiga för alla celler; även för Tuberkulosbakterien. Dess membranproteiner är därför perfekta kandidater för att rikta in sig på vid utvecklingen av nya läkemedel. Om vi slår ut portvakterna kan vi erövra staden! Analys av ett proteins tredimensionella struktur kräver som regel väldigt mycket protein. Tyvärr finns det inte så mycket av varje membranprotein i cellen, och membranproteiner är som regel extra svåra att producera i större

51 mängder. Artikel I beskriver ett nytt tillvägagångssätt, som hjälper oss att identifiera de membranproteiner från Tuberkulosbakterien som är lätta att producera. Genom att sätta ihop 42 olika membranproteiner med ett grönt fluorescerande protein – GFP – kan vi se proteinerna lysa i cellen. Mängden grön fluorescens är proportionell mot mängden membranprotein (eftersom varje membranprotein sitter ihop med ett GFP-protein). Vi har använt oss av två ”proteinfabriker”; bakterierna Escherichia coli och Mycobacterium smegmatis, för att titta på hur tillväxtbetingelserna påverkar hur mycket som produceras av de 42 membranproteinerna. Vi har hittat bra betingelser för att producera 25 sådana proteiner, som alla är mycket viktiga för Tuberkulosbakteriens överlevnad (och därmed potentiella måltavlor för utveckling av nya läkemedel). Ett av dessa proteiner är PgsA1, som är inblandat i tillverkningen av fosfatidylinositol, en av de viktigaste byggstenarna i cellmembranet (”stadsmuren”). Detta enzym är absolut nödvändigt för tuberkulosbakteriens överlevnad; den klarar sig helt enkelt inte utan det. Tuberkulosbakterien tillverkar fosfatidylinositol i två steg, medan vi människor gör det i ett steg, vilket gör denna process extra intressant för läkemedelsutveckling, eftersom man kanske kan påverka bakterien utan att samtidigt riskera biverkningar. Den tredimensionella strukturen för PgsA1 beskrivs i artikel II, och utgör en perfekt grund för att designa nya läkemedel som kan hämma enzymets funktion.

Den andra halvan av min avhandling handlar om ett annat metalloprotein – ribonukleotidreduktas, ett enzym som tillverkar byggstenarna till DNA; molekylen som lagrar våra gener. Olika organismer har olika varianter av ribonukleotidreduktas, beroende på hur de lever (exempelvis tillgång på syre och olika näringsämnen). Ribonukleotidreduktaserna är lite speciella såtillvida att de använder sig av kemiska radikaler i omvandlingsprocessen, vilket endast ett fåtal enzymer gör; radikaler är nämligen skadliga för de andra molekylerna i cellen. Alla organismer som använder syre har ett ribonukleotidreduktas som består av åtminstone två proteiner som arbetar ihop; R1 och R2. R2-proteinet, som är det protein min avhandling handlar om, behöver två metallatomer för att fungera: Antingen två järn, två mangan, eller en atom av vardera sort. För de flesta av dessa R2-proteiner fungerar det på så sätt att de två metallatomerna reagerar med syret i luften, varvid en radikal bildas inuti proteinet. Denna skickas sedan vidare till R1-proteinet, som använder radikalen vid tillverkningen av DNA-byggstenar. De R2-proteiner som har dubbla manganatomer fungerar dock lite annorlunda; de reagerar inte direkt med syre, utan behöver ett tredje protein, NrdI. NrdI hjälper R2-proteinet att göra om syre till radikalen superoxid (syre med en extra elektron), som skickas vidare till de två manganatomerna i R2. Dessa använder superoxidmolekylen för att göra den ”vanliga” radikalen som sedan vidarebefordras till R1. Ribonukleotidreduktaserna upptäcktes redan för omkring 60 år sedan, men de utgör fortfarande ett stort och viktigt forskningområde och är källa till många nya överraskande upptäckter. De metallinnehållande R2-proteinerna kan bara bilda radikalen om de har en särskild kombination av metallatomer. Om proteinet blir ”fel-metalliserat” fungerar inte enzymet alls, vilket kan bli förödande för organismen. Hur R2-

52 proteinerna väljer ut metallerna från omgivingen och hur de lyckas skilja på mangan och järn, två kemiskt snarlika metaller, är ett mycket aktivt och spännande forskningsfält. I artikel III studerar vi hur R2-proteinet från Bacillus anthracis (antraxbakterien, som orsakar mjältbrand) väljer sina metallatomer. Vi gör det genom att blanda kristaller av helt metallfria R2-proteiner med lika delar mangan och järn. Resultaten visar att detta R2-protein alltid väljer två mangatomer, vilket också är det naturliga tillståndet för just detta R2-protein. I nästa steg försökte vi ta reda på vilka av proteinets komponenter som styr valet av metall, genom att undersöka den tredimensionella strukturen. När vi bytte ut en av de aminosyror som sitter nära platsen där metallerna binder ändrade plötsligt R2-proteinet sina metallpreferenser; det valde mangan i det ena metallbindande positionen, men järn i den andra. Sammantaget visar våra resultat att R2-proteinet noga kontrollerar vilka metallatomer som inkorporeras, och att aminosyror som sitter intill metallpositionerna påverkar valet av metall. Vi undersökte även hur bindningen av mangan och järn påverkar den tredimensionella strukturen för antraxbakteriens R2-protein, och fann att den kemiska reaktionen med syre orsakar förändringar i strukturen, djupt i proteins inre. Baserat på de resultat vi beskriver i artikel IV drar vi slutsatsen att dessa strukturella förändringar sannolikt är viktiga för att den farliga radikal som skapas av R2-proteinet ska kunna hållas ”inlåst”, så att den inte kan orsaka skador på andra molekyler i cellen.

53 8. Acknowledgements

When I arrived to the DBB as a young and green PhD student, I knew very little about X-ray crystallography and protein biochemistry. Looking back, it feels like a long way has been walked to where I am now. Importantly, I was not alone on this path; there were many people who continuously helped me during this exciting journey (even when they were not aware of it!).

I would like to thank Martin for raising a scientist in me; for all the scien- tific advices and directions, for giving me freedom to make mistakes and try to do things my own way. This was a valuable experience, and I am very for- tunate to have you as my advisor. I could not wish for better! Pål, your excite- ment about research is contagious! Thank you for the fun and also for the valuable scientific advises throughout my PhD. Stefan and Pia, thank you for your guidance and for keeping me on the time-track through the PhD program. Also, I would like to acknowledge the administrative staff at DBB, especially Alex and Maria, for continuous help literally with everything. Håkan, thank you for all the technical support and for making sure we never ran out of cof- fee!

This thesis would have not been written without you, Matt and Julia. Matt, thank you so much for being patient with me and teaching me labwork essentials, crystallography and so much more. I lost the count how many times you have saved my experiments! I am very grateful for all your support at and outside the workplace. Julia, thank you for introducing me to R2(lox) and the metal world. But most importantly, thank you for all the knowledge you have shared, all the help, advices and attention to every detail.

The RNR crew: Britt-Marie, Margareta, Gustav, Daniel, Derek and Christoph – our brainstorming sessions about and around RNR have been amazing! Who knows how many more exciting discoveries are awaiting ahead? Inna, Ornella and Ghada – ladies, you are fantastic, and I am so glad to know you! Thank you for all the laughs, RNR discussions and all the great time together.

I would like to extend my sincere gratitude to everyone at Sten- mark/Högbom Lab. Vivek, I can’t believe that six years have passed so

54 quickly! With you around the labwork was much more fun (hej, neighbor!). Also, thank you for the discussions about everything and nothing, memorable cycling trips and unlimited number of jokes about languages and accents. Thanks a lot for the cheer during the thesis writing. Hugo, my PhD time would have not been the same without you around. Your catching scientific drive, kindness and support made me a bit of a better person. Merci! Ben, I am grate- ful for all the help with the membrane protein project and for eliminating the Slavic problems from my thesis J. Thank you and Cata for allowing me to celebrate your special day with you and for making the Columbia trip one of the most memorable! Dan, thank you very much for translating the popular summary to Swedish and for your creative vibe during various lab events! Juliane, thank you for the continuous cheer, ideas and discussions about sci- ence and beyond. I value your tremendous help in improving this thesis, it really made a big difference! Emma, without your help and suggestions this thesis would be one really boring piece of text to read. I value your help a lot! Thank you, Oskar, for your enthusiasm and for patience in answering all my silly questions about crystallography and data processing. Also, I really miss the everyday portion of dark jokes. Camilla, you are the personification of joy and it was fun to share the lab and discuss projects with you! Geoffrey and Sarah, you are an absolute delight to be around. I value our friendship a lot. Parul and Mohit, you have organized the best ever Diwali and Holi par- ties, I will cherish these memories forever. Michael, Riccardo and Markel thank you for all the fun times at and outside the workplace. Jana, Sara, Ha- gen, Jonathan, Daniel, Esra, Terezia, Soni and Rohit – you are making the StenHög Lab the best place to work at J! I would like to extend my gratitude to all past members of the StenHög lab and especially Henryk, Karin, Daniel, Robert, Linda, Magnus, Ronnie, Catrine and Megan. Thank you for mak- ing the beginning of my PhD so easy; I learned a lot from you!

It is amazing how much in life often depends on just a small, lucky coinci- dence. I was so fortunate to be a part of the Protein Engineering course at Uppsala University back in 2011; it has changed the direction of my life. Dear Torsten and Nina, thank you for the inspiration. Without you I would miss all of this. I am forever grateful.

I would like to say a huge thank you to Ana and Sandesh, for the chase of Northern lights, for the raclette and hikes in the fjords. I am looking forward for more trips together. Silvana, thank you for the best Italian treats. Masoud, thank you so much for the help with my thesis, I really appreciate it!

55 I am very fortunate to have so many great friends, without whom the PhD time and the thesis writing (and life!) would have been unimaginable. Dear Sony and Srinu, Geeta and Chandu, Anusha and Sethu, Pratyusha and Varun, Sandya and Gopi, thank you so much for being there for me and for all the unconditional support you are giving. This means a world to me. Hari, Manoj and Prakash, thank you for your friendship and for the super fun bad- minton games!

To my friends in Latvia – my greatest support team! Rita, paldies Tev, ka Tu toreiz pieņēmi darbā mani, to nemākulīgo studentu J Re, kas sanāca! Zane, paldies, ka vienmēr esi gatava parunāt par zinātni un ne tikai. Tavs atbalsts man vienmēr bija svarīgs! Юля и Вова, Варя, Даша, Дениза, Настя, Алина, Галя, Инна и Lāsma, без вашей поддержки я бы не спра- вилась. Спасибо, что всегда были готовы выслушать и помочь советом. Я очень дорожу нашей многолетней (!) дружбой.

I would like to extend my sincere gratitude to my extended family in India, the U.S. and Sweden. Dear Varalakshmi aunty, Subba and Moukhtika, thank you for accepting me to your family and supporting me in everything I do; this means a lot to me. Looking forward to meet you, Anirudh, Akshar and Indira more often. Thank you, Santhi, Sisir and Gamana for being so kind and so positive; you have always inspired me.

Также, хочу выразить свою искреннюю благодарность моим самым близким: маме, Марго, бабушке Нате и Андрису. Родные, спасибо вам, что разрешили мне мечтать, всегда поддерживали меня в выборе и нико- гда во мне не сомневались. Вы – моя самая надежная опора!

Without you, Sudarsan, this whole journey would have been one real bumpy road through the dark woods. Thank you for being my safe haven and adding the brightest colors to my life!

56 Bibliography

1. Kan, S. B. J., Lewis, R. D., Chen, K. & Arnold, F. H. Directed evolution of c for carbon-silicon bond formation: Bringing silicon to life. Science 354, 1048–1051 (2016). 2. Leveson-Gower, R. B., Mayer, C. & Roelfes, G. The importance of catalytic promiscuity for enzyme design and evolution. Nature Reviews Chemistry 3, 687–705 (2019). 3. Pandurangan, A. P., Ascher, D. B., Thomas, S. E. & Blundell, T. L. Genomes, structural biology and drug discovery: Combating the impacts of mutations in genetic disease and antibiotic resistance. Biochemical Society Transactions vol. 45 303–311 (2017). 4. Wacker, D., Stevens, R. C. & Roth, B. L. How ligands illuminate GPCR molecular pharmacology. Cell 170, 414–427 (2017). 5. WHO. Global tuberculosis report 2019. (Geneva, World Health Organization, 2019. Licence: CC BY-NC-SA 3.0 IGO., 2019). 6. WHO. Latent TB Infection: Updated and consolidated guidelines for programmatic management. (Geneva: World Health Organization; 2018. Licence: CC BY-NC-SA 3.0 IGO.). 7. Knight, G. M., McQuaid, C. F., Dodd, P. J. & Houben, R. M. G. J. Global burden of latent multidrug-resistant tuberculosis: trends and estimates based on mathematical modelling. The Lancet Infectious Diseases 19, 903–912 (2019). 8. Zumla, A., Nahid, P. & Cole, S. T. Advances in the development of new tuberculosis drugs and treatment regimens. Nature Reviews Drug Discovery vol. 12 388–404 (2013). 9. Silva Miranda, M., Breiman, A., Allain, S., Deknuydt, F. & Altare, F. The tuberculous granuloma: An unsuccessful host defence mechanism providing a safety shelter for the bacteria? Clinical and Developmental Immunology (2012) doi:10.1155/2012/139127. 10. Jayachandran, R., BoseDasgupta, S. & Pieters, J. Surviving the macrophage: Tools and tricks employed by Mycobacterium tuberculosis. Current Topics in Microbiology and Immunology 189–209 (2012). 11. Ganbat, D. et al. Mycobacteria infect different cell types in the human and cause species dependent cellular changes in infected cells. BMC Pulmonary Medicine 16, (2016). 12. Cole, S. T. et al. Deciphering the biology of mycobacterium tuberculosis from the complete genome sequence. Nature vol. 393 537–544 (1998). 13. Camus, J. C., Pryor, M. J., Médigue, C. & Cole, S. T. Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv. Microbiology

57 148, 2967–2973 (2002). 14. Sassetti, C. M., Boyd, D. H. & Rubin, E. J. Genes required for mycobacterial growth defined by high density mutagenesis. Molecular Microbiology 48, 77–84 (2003). 15. Dejesus, M. A. et al. Comprehensive essentiality analysis of the Mycobacterium tuberculosis genome via saturating transposon mutagenesis. mBio 8, (2017). 16. Jackson, M. The mycobacterial cell envelope-lipids. Cold Spring Harbor perspectives in medicine 4, (2014). 17. Deamer, D. The role of lipid membranes in life’s origin. Life 7, (2017). 18. Singer, S. J. & Nicolson, G. L. The fluid mosaic model of the structure of cell membranes. Science 175, 720–731 (1972). 19. Van Meer, G., Voelker, D. R. & Feigenson, G. W. Membrane lipids: Where they are and how they behave. Nature Reviews Molecular Cell Biology 9, 112–124 (2008). 20. Lee, A. G. Biological membranes: The importance of molecular detail. Trends in Biochemical Sciences vol. 36 493–500 (2011). 21. Ulmschneider, M. B. & Sansom, M. S. P. Amino acid distributions in integral membrane protein structures. Biochimica et Biophysica Acta - Biomembranes 1512, 1–14 (2001). 22. Wallin, E. & von Heijne, G. Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein science : a publication of the Protein Society 7, 1029–1038 (1998). 23. Heijne, G. The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology. The EMBO journal 5, 3021–3027 (1986). 24. Von Heijne, G. Membrane-protein topology. Nature Reviews Molecular Cell Biology vol. 7 909–918 (2006). 25. Koshy, C. & Ziegler, C. Structural insights into functional lipid-protein interactions in secondary transporters. Biochimica et Biophysica Acta - General Subjects vol. 1850 476–487 (2015). 26. Arnarez, C., Marrink, S. J. & Periole, X. Molecular mechanism of cardiolipin-mediated assembly of respiratory chain supercomplexes. Chemical Science 7, 4435–4443 (2016). 27. Seddon, A. M., Curnow, P. & Booth, P. J. Membrane proteins, lipids and detergents: Not just a soap opera. Biochimica et Biophysica Acta - Biomembranes 1666, 105–117 (2004). 28. Lee, A. G. How lipids and proteins interact in a membrane: A molecular approach. Molecular BioSystems 1, 203–212 (2005). 29. Nilsson, T. et al. Lipid-mediated protein-protein interactions modulate respiration-driven ATP synthesis. Scientific Reports 6, (2016). 30. von Heijne, G. The membrane protein universe: what’s out there and why bother? Journal of Internal Medicine 261, 543–557 (2007). 31. Santos, R. et al. A comprehensive map of molecular drug targets. Nature Reviews Drug Discovery 16, 19–34 (2017). 32. Yıldırım, M. A., Goh, K.-I., Cusick, M. E., Barabási, A.-L. & Vidal, M. Drug—target network. Nature Biotechnology 25, 1119–1126 (2007).

58 33. Reis, R. & Moraes, I. Structural biology and structure–function relationships of membrane proteins. Biochemical Society Transactions vol. 47 47–61 (2018). 34. Chim, N. et al. The TB Structural Genomics Consortium: A decade of progress. Tuberculosis vol. 91 155–172 (2011). 35. Steinbacher, S., Bass, R., Strop, P. & Rees, D. C. Structures of the prokaryotic mechanosensitive channels MscL and MscS. Current Topics in Membranes 58, 1–24 (2007). 36. Chang, G., Spencer, R. H., Lee, A. T., Barclay, M. T. & Rees, D. C. Structure of the MscL homolog from Mycobacterium tuberculosis: A gated mechanosensitive ion channel. Science 282, 2220–2226 (1998). 37. Schlegel, S., Hjelm, A., Baumgarten, T., Vikström, D. & De Gier, J. W. Bacterial-based membrane protein production. Biochimica et Biophysica Acta - Molecular Cell Research vol. 1843 1739–1749 (2014). 38. Wagner, S., Bader, M. L., Drew, D. & de Gier, J. W. Rationalizing membrane protein overexpression. Trends in Biotechnology 24, 364–371 (2006). 39. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science vol. 277 1453–1462 (1997). 40. Lakey, D. L. et al. Enhanced production of recombinant Mycobacterium tuberculosis antigens in Escherichia coli by replacement of low-usage codons. Infection and Immunity 68, 233–238 (2000). 41. Paraskevopoulou, V. & Falcone, F. Polyionic tags as enhancers of protein solubility in recombinant protein expression. 6, 47 (2018). 42. Korepanova, A. et al. Cloning and expression of multiple integral membrane proteins from Mycobacterium tuberculosis in Escherichia coli. Protein Science 14, 148–158 (2009). 43. Mohan, A., Padiadpu, J., Baloni, P. & Chandra, N. Complete genome sequences of a Mycobacterium smegmatis laboratory strain (MC2 155) and isoniazid-resistant (4XR1/R2) mutant strains. Genome Announcements 3, (2015). 44. Singh, A. K. & Reyrat, J. Laboratory maintenance of Mycobacterium smegmatis. in Current Protocols in Microbiology vol. 14 (2009). 45. Snapper, S. B., Melton, R. E., Mustafa, S., Kieser, T. & Jr, W. R. J. Isolation and characterization of efficient plasmid transformation mutants of Mycobacterium smegmatis. Molecular Microbiology 4, 1911–1919 (1990). 46. Etienne, G. et al. The cell envelope structure and properties of Mycobacterium smegmatis mc2155: Is there a clue for the unique transformability of the strain? Microbiology 151, 2075–2086 (2005). 47. Wang, F. et al. Mycobacterium tuberculosis dihydrofolate reductase is not a target relevant to the antitubercular activity of isoniazid. Antimicrobial Agents and Chemotherapy 54, 3776–3782 (2010). 48. Studier, F. W. & Moffatt, B. A. Use of bacteriophage T7 RNA polymerase to direct selective high-level expression of cloned genes. Journal of Molecular Biology 189, 113–130 (1986).

59 49. Bashiri, G. & Baker, E. N. Production of recombinant proteins in Mycobacterium smegmatis for structural and functional studies. Protein Science 24, 1–10 (2015). 50. Xu, Z., Meshcheryakov, V. A., Poce, G. & Chng, S. S. MmpL3 is the flippase for mycolic acids in mycobacteria. Proceedings of the National Academy of Sciences of the United States of America 114, 7993–7998 (2017). 51. Batt, S. M. et al. Structural basis of inhibition of Mycobacterium tuberculosis DprE1 by benzothiazinone inhibitors. Proceedings of the National Academy of Sciences of the United States of America 109, 11354–11359 (2012). 52. Bashiri, G. et al. Tat-dependent translocation of an F420-binding protein of Mycobacterium tuberculosis. PLoS ONE 7, (2012). 53. Waldo, G. S., Standish, B. M., Berendzen, J. & Terwilliger, T. C. Rapid protein-folding assay using green fluorescent protein. Nature Biotechnology 17, 691–695 (1999). 54. Feilmeier, B. J., Iseminger, G., Schroeder, D., Webber, H. & Phillips, G. J. Green fluorescent protein functions as a reporter for protein localization in Escherichia coli. Journal of Bacteriology 182, 4068–4076 (2000). 55. Drew, D. et al. Rapid topology mapping of Escherichia coli inner- membrane proteins by prediction and PhoA/GFP fusion analysis. Proceedings of the National Academy of Sciences of the United States of America 99, 2690–2695 (2002). 56. Drew, D. E., Von Heijne, G., Nordlund, P. & De Gier, J. W. L. Green fluorescent protein as an indicator to monitor membrane protein overexpression in Escherichia coli. FEBS Letters 507, 220–224 (2001). 57. Daley, D. O. et al. Biochemistry: Global topology analysis of the Escherichia coli inner membrane proteome. Science 308, 1321–1323 (2005). 58. Kawate, T. & Gouaux, E. Fluorescence-detection size-exclusion chromatography for precrystallization screening of integral membrane proteins. Structure 14, 673–681 (2006). 59. Sjöstrand, D., Diamanti, R., Lundgren, C. A. K., Wiseman, B. & Högbom, M. A rapid expression and purification condition screening protocol for membrane protein structural biology. Protein Science 26, 1653–1666 (2017). 60. Belardinelli, J. M. & Jackson, M. Green fluorescent protein as a protein localization and topological reporter in mycobacteria. Tuberculosis 105, 13–17 (2017). 61. Niu, Y., Kong, J. & Xu, Y. A novel GFP-fused eukaryotic membrane protein expression system in Lactococcus lactis and its application to overexpression of an elongase. Current Microbiology 57, 423–428 (2008). 62. Radhakrishnan, A., Furze, C. M., Ahangar, M. S. & Fullam, E. A GFP- strategy for efficient recombinant protein overexpression and purification in Mycobacterium smegmatis. RSC Advances 8, 33087–33095 (2018). 63. Newstead, S., Kim, H., Von Heijne, G., Iwata, S. & Drew, D. High-

60 throughput fluorescent-based optimization of eukaryotic membrane protein overexpression and purification in Saccharomyces cerevisiae. Proceedings of the National Academy of Sciences of the United States of America 104, 13936–13941 (2007). 64. Garavito, R. M. & Ferguson-Miller, S. Detergents as tools in membrane biochemistry. Journal of Biological Chemistry 276, 32403–32406 (2001). 65. Loll, P. J. Membrane proteins, detergents and crystals: What is the state of the art? Acta Crystallographica Section F:Structural Biology Communications 70, 1576–1583 (2014). 66. Newby, Z. E. R. et al. A general protocol for the crystallization of membrane proteins for X-ray structural investigation. Nature Protocols 4, 619–637 (2009). 67. Kotov, V. et al. High-throughput stability screening for detergent- solubilized membrane proteins. Scientific Reports 9, (2019). 68. Delmar, J. A., Bolla, J. R., Su, C. C. & Yu, E. W. Crystallization of membrane proteins by vapor diffusion. in Methods in Enzymology vol. 557 363–392 (Academic Press Inc., 2015). 69. Cherezov, V. et al. High-resolution crystal structure of an engineered human β2-adrenergic G protein-coupled receptor. Science 318, 1258– 1265 (2007). 70. Jaakola, V. P. et al. The 2.6 angstrom crystal structure of a human A2A adenosine receptor bound to an antagonist. Science 322, 1211–1217 (2008). 71. Shimamura, T. et al. Structure of the human histamine H 1 receptor complex with doxepin. Nature 475, 65–72 (2011). 72. Landau, E. M. & Rosenbusch, J. P. Lipidic cubic phases: A novel concept for the crystallization of membrane proteins. Proceedings of the National Academy of Sciences of the United States of America 93, 14532–14535 (1996). 73. Caffrey, M. A comprehensive review of the lipid cubic phase or in meso method for crystallizing membrane and soluble proteins and complexes. Acta crystallographica. Section F, Structural biology communications 71, 3–18 (2015). 74. Qiu, H. & Caffrey, M. The phase diagram of the monoolein/water system: Metastability and equilibrium aspects. Biomaterials 21, 223–234 (2000). 75. Cherezov, V. & Caffrey, M. Membrane protein crystallization in lipidic mesophases. A mechanism study using X-ray microdiffraction. Faraday Discussions 136, 195–212 (2007). 76. Chimento, D. P., Mohanty, A. K., Kadner, R. J. & Wiener, M. C. Substrate-induced transmembrane signaling in the cobalamin transporter BtuB. Nature structural biology 10, 394–401 (2003). 77. Michel, H. Crystallization of membrane proteins. Trends in Biochemical Sciences 8, 56–59 (1983). 78. Kors, C. A. et al. Effects of impurities on membrane-protein crystallization in different systems. Acta Crystallographica Section D: Biological Crystallography 65, 1062–1073 (2009). 79. Jarlier, V. & Nikaido, H. Permeability barrier to hydrophilic solutes in

61 Mycobacterium chelonei. Journal of Bacteriology 172, 1418–1423 (1990). 80. Zuber, B. et al. Direct visualization of the outer membrane of mycobacteria and corynebacteria in their native state. Journal of Bacteriology 190, 5672–5680 (2008). 81. Chiaradia, L. et al. Dissecting the mycobacterial cell envelope and defining the composition of the native mycomembrane. Scientific Reports 7, 12807 (2017). 82. Minnikin, D. E. et al. Pathophysiological implications of cell envelope structure in Mycobacterium tuberculosis and related taxa. in Tuberculosis - Expanding Knowledge (2015). doi:10.5772/59585. 83. Jankute, M., Cox, J. A. G., Harrison, J. & Besra, G. S. Assembly of the mycobacterial cell wall. Annual Review of Microbiology 69, 405–423 (2015). 84. Daffé, M. & Draper, P. The envelope layers of mycobacteria with reference to their pathogenicity. Advances in Microbial Physiology vol. 39 131–203 (1997). 85. Fu, L. & Fu-Liu, C. Is Mycobacterium tuberculosis a closer relative to Gram-positive or Gram–negative bacterial pathogens? Tuberculosis 82, 85–90 (2002). 86. Dulberger, C. L., Rubin, E. J. & Boutte, C. C. The mycobacterial cell envelope — a moving target. Nature Reviews Microbiology 18, 47–59 (2020). 87. Kalscheuer, R. et al. The Mycobacterium tuberculosis capsule: A cell structure with key implications in pathogenesis. Biochemical Journal vol. 476 1995–2016 (2019). 88. Bansal-Mutalik, R. & Nikaido, H. Mycobacterial outer membrane is a lipid bilayer and the inner membrane is unusually rich in diacyl phosphatidylinositol dimannosides. Proceedings of the National Academy of Sciences of the United States of America 111, 4958–4963 (2014). 89. Lundgren, C. A. K. Structural and functional studies of membrane proteins: From characterisation of a fatty acyl-CoA synthetase to the discovery of duperoxide oxidase. (Stockholm University, 2019). 90. Ortalo-Magne, A. et al. Molecular composition of the outermost capsular material of the tubercle bacillus. Microbiology 141, 1609–1620 (1995). 91. Ortalo-Magné, A. et al. Identification of the surface-exposed lipids on the cell envelopes of Mycobacterium tuberculosis and other mycobacterial species. Journal of Bacteriology 178, 456–461 (1996). 92. Takayama, K., Wang, L. & David, H. L. Effect of isoniazid on the in vivo mycolic acid synthesis, cell growth, and viability of Mycobacterium tuberculosis. Antimicrobial Agents and Chemotherapy 2, 29–35 (1972). 93. Marchand, C. H. et al. Biochemical disclosure of the mycolate outer membrane of Corynebacterium glutamicum. Journal of Bacteriology 194, 587–597 (2012). 94. Torrelles, J. B., Azad, A. K. & Schlesinger, L. S. Fine discrimination in the recognition of individual species of phosphatidyl-myo-inositol mannosides from Mycobacterium tuberculosis by C-type lectin pattern

62 recognition receptors. The Journal of Immunology 177, 1805–1816 (2006). 95. Sartain, M. J., Dick, D. L., Rithner, C. D., Crick, D. C. & Belisle, J. T. Lipidomic analyses of Mycobacterium tuberculosis based on accurate mass measurements and the novel ‘Mtb LipidDB’. Journal of Lipid Research 52, 861–872 (2011). 96. Fukuda, T. et al. Critical roles for lipomannan and lipoarabinomannan in cell wall integrity of mycobacteria and pathogenesis of tuberculosis. mBio 4, (2013). 97. Mishra, A. K., Driessen, N. N., Appelmelk, B. J. & Besra, G. S. Lipoarabinomannan and related glycoconjugates: Structure, biogenesis and role in Mycobacterium tuberculosis physiology and host-pathogen interaction. FEMS Microbiology Reviews 35, 1126–1157 (2011). 98. Neyrolles, O. & Guilhot, C. Recent advances in deciphering the contribution of Mycobacterium tuberculosis lipids to pathogenesis. Tuberculosis vol. 91 187–195 (2011). 99. Jackson, M., Crick, D. C. & Brennan, P. J. Phosphatidylinositol is an essential phospholipid of mycobacteria. Journal of Biological Chemistry 275, 30092–30099 (2000). 100. Sancho-Vaello, E., Albesa-Jové, D., Rodrigo-Unzueta, A. & Guerin, M. E. Structural basis of phosphatidyl-myo-inositol mannosides biosynthesis in mycobacteria. Biochimica et Biophysica Acta - Molecular and Cell Biology of Lipids 1862, 1355–1367 (2017). 101. Angala, S. K., Belardinelli, J. M., Huc-Claustre, E., Wheat, W. H. & Jackson, M. The cell envelope glycoconjugates of Mycobacterium tuberculosis. Critical reviews in biochemistry and molecular biology 49, 361–99 (2014). 102. Guerin, M. E., Korduláková, J., Alzari, P. M., Brennan, P. J. & Jackson, M. Molecular basis of phosphatidyl-myo-inositol mannoside biosynthesis and regulation in mycobacteria. Journal of Biological Chemistry 285, 33577–33583 (2010). 103. Michell, R. H. Inositol derivatives: Evolution and functions. Nature Reviews Molecular Cell Biology vol. 9 151–161 (2008). 104. Michell, R. H. Inositol and its derivatives: their evolution and functions. Advances in enzyme regulation 51, 84–90 (2011). 105. Owens, R. M. et al. M. tuberculosis Rv2252 encodes a diacylglycerol kinase involved in the biosynthesis of phosphatidylinositol mannosides (PIMs). Molecular Microbiology 60, 1152–1163 (2006). 106. Nigou, J. & Besra, G. S. Cytidine diphosphate-diacylglycerol synthesis in Mycobacterium smegmatis. Biochemical Journal 367, 157–162 (2002). 107. Fisher, S. K., Novak, J. E. & Agranoff, B. W. Inositol and higher inositol in neural tissues: homeostasis, metabolism and functional significance. Journal of Neurochemistry 82, 736–754 (2002). 108. Agranoff, B. W. Turtles all the way: Reflections on myo-inositol. Journal of Biological Chemistry 284, 21121–21126 (2009). 109. Strahl, T. & Thorner, J. Synthesis and function of membrane phosphoinositides in budding yeast, Saccharomyces cerevisiae.

63 Biochimica et Biophysica Acta - Molecular and Cell Biology of Lipids vol. 1771 353–404 (2007). 110. Agranoff, B. W. & Fisher, S. K. Phosphoinositides and their stimulated breakdown. in ACS Symposium Series 20–32 (American Chemical Society, 1991). doi:10.1021/bk-1991-0463.ch002. 111. Newton, G. L. et al. Distribution of in microorganisms: Mycothiol is a major in most actinomycetes. Journal of Bacteriology 178, 1990– 1995 (1996). 112. Movahedzadeh, F. et al. The Mycobacterium tuberculosis ino1 gene is essential for growth and virulence. Molecular Microbiology 51, 1003– 1014 (2004). 113. Newton, G. L., Ta, P., Bzymek, K. P. & Fahey, R. C. Biochemistry of the initial steps of mycothiol biosynthesis. Journal of Biological Chemistry 281, 33910–33920 (2006). 114. Morii, H., Ogawa, M., Fukuda, K., Taniguchi, H. & Koga, Y. A revised biosynthetic pathway for phosphatidylinositol in mycobacteria. Journal of Biochemistry 148, 593–602 (2010). 115. Morii, H. et al. Studies of inositol 1-phosphate analogues as inhibitors of the phosphatidylinositol phosphate synthase in mycobacteria. Journal of Biochemistry 153, 257–266 (2013). 116. Houtkooper, R. H. et al. Identification and characterization of human cardiolipin synthase. FEBS Letters 580, 3059–3064 (2006). 117. Woodard, D. S., Lee, T. C. & Snyder, F. The final step in the de novo biosynthesis of platelet-activating factor. Properties of a unique CDP- choline:1-alkyl-2-acetyl-sn-glycerol choline-phosphotransferase in microsomes from the renal inner medulla of rats. Journal of Biological Chemistry 262, 2520–2527 (1987). 118. De Rudder, K. E. E., Sohlenkamp, C. & Geiger, O. Plant-exuded choline is used for rhizobial membrane lipid biosynthesis by phosphatidylcholine synthase. Journal of Biological Chemistry 274, 20011–20016 (1999). 119. Sandoval-Calderón, M., Geiger, O., Guan, Z., Barona-Gómez, F. & Sohlenkamp, C. A eukaryote-like cardiolipin synthase is present in Streptomyces coelicolor and in most actinobacteria. Journal of Biological Chemistry 284, 17383–17390 (2009). 120. Hostetler, K. Y., Galesloot, J. M., Boer, P. & Van Den Bosch, H. Further studies on the formation of cardiolipin and phosphatidylglycerol in rat liver mitochondria. Effect of divalent cations and the fatty acid composition of CDP-diglyceride. Biochimica et Biophysica Acta (BBA)/Lipids and Lipid Metabolism 380, 382–389 (1975). 121. Nogly, P. et al. X-ray structure of a CDP-alcohol phosphatidyltransferase membrane enzyme and insights into its catalytic mechanism. Nature Communications 5, (2014). 122. Sciara, G. et al. Structural basis for catalysis in a CDP-alcohol phosphotransferase. Nature Communications 5, (2014). 123. Clarke, O. B. et al. Structural basis for phosphatidylinositol-phosphate biosynthesis. Nature Communications 6, (2015). 124. The PyMOL Molecular Graphics System, Version 2.0.0 Schrödinger,

64 LLC. https://www.pymol.org/. 125. Grāve, K., Bennett, M. D. & Högbom, M. Structure of Mycobacterium tuberculosis phosphatidylinositol phosphate synthase reveals mechanism of substrate binding and metal catalysis. Communications Biology 2, (2019). 126. Williams, J. G. & McMaster, C. R. Scanning alanine mutagenesis of the CDP-alcohol phosphotransferase motif of Saccharomyces cerevisiae cholinephosphotransferase. Journal of Biological Chemistry 273, 13482– 13487 (1998). 127. Leu, K., Obermayer, B., Rajamani, S., Gerland, U. & Chen, I. A. The prebiotic evolutionary advantage of transferring genetic information from RNA to DNA. Nucleic Acids Research 39, 8135–8147 (2011). 128. Reichard, P. The evolution of ribonucleotide reduction. Trends in Biochemical Sciences vol. 22 81–85 (1997). 129. Lundin, D., Berggren, G., Logan, D. T. & Sjöberg, B. M. The origin and evolution of ribonucleotide reduction. Life vol. 5 604–636 (2015). 130. Torrents, E. Ribonucleotide reductases: Essential enzymes for bacterial life. Frontiers in Cellular and Infection Microbiology 4, (2014). 131. Stubbe, J. A. Protein radical involvement in biological catalysis? Annual Review of Biochemistry 58, 257–285 (1989). 132. Ehrenberg, A. & Reichard, P. Electron spin resonance of the iron- containing protein B2 from ribonucleotide reductase. Journal of Biological Chemistry 247, 3485–3488 (1972). 133. Nordlund, P. & Reichard, P. Ribonucleotide Reductases. Annual Review of Biochemistry 75, 681–706 (2006). 134. Tomter, A. B. et al. Ribonucleotide reductase class I with different radical generating clusters. Coordination Chemistry Reviews 257, 3–26 (2013). 135. Lundin, D., Gribaldo, S., Torrents, E., Sjöberg, B. M. & Poole, A. M. Ribonucleotide reduction - Horizontal transfer of a required function spans all three domains. BMC Evolutionary Biology 10, (2010). 136. Torrents, E., Poplawski, A. & Sjöberg, B. M. Two proteins mediate class II ribonucleotide reductase activity in Pseudomonas aeruginosa: Expression and transcriptional analysis of the aerobic enzymes. Journal of Biological Chemistry 280, 16571–16578 (2005). 137. Sintchak, M. D., Arjara, G., Kellogg, B. A., Stubbe, J. A. & Drennan, C. L. The crystal structure of class II ribonucleotide reductase reveals how an allosterically regulated monomer mimics a dimer. Nature Structural Biology 9, 293–300 (2002). 138. Larsson, K. M. et al. Structural mechanism of allosteric substrate specificity regulation in a ribonucleotide reductase. Nature Structural and Molecular Biology 11, 1142–1149 (2004). 139. Logan, D. T., Andersson, J., Sjöberg, B. M. & Nordlund, P. A glycyl radical site in the crystal structure of a class III ribonucleotide reductase. Science 283, 1499–1504 (1999). 140. Ollagnier, S. et al. Assembly of 2Fe-2S and 4Fe-4S clusters in the anaerobic ribonucleotide reductase from Escherichia coli. Journal of the American Chemical Society 121, 6344–6350 (1999).

65 141. Tamarit, J., Mulliez, E., Meier, C., Trautwein, A. & Fontecave, M. The anaerobic ribonucleotide reductase from Escherichia coli. The small protein is an activating enzyme containing a [4Fe-4S]2+ center. Journal of Biological Chemistry 274, 31291–31296 (1999). 142. Gambarelli, S., Luttringer, F., Padovani, D., Mulliez, E. & Fontecave, M. Activation of the anaerobic ribonucleotide reductase by S- adenosylmethionine. ChemBioChem 6, 1960–1962 (2005). 143. Reichard, P. From RNA to DNA, why so many ribonucleotide reductases? Science 260, 1773–1777 (1993). 144. Torrents, E., Sahlin, M., Biglino, D., Gräslund, A. & Sjöberg, B. M. Efficient growth inhibition of Bacillus anthracis by knocking out the ribonucleotide reductase tyrosyl radical. Proceedings of the National Academy of Sciences of the United States of America 102, 17946–17951 (2005). 145. Lundin, D., Poole, A. M., Sjöberg, B. M. & Högbom, M. Use of structural phylogenetic networks for classification of the ferritin-like superfamily. Journal of Biological Chemistry 287, 20565–20575 (2012). 146. Nordlund, P., Sjöberg, B.-M. & Eklund, H. Three-dimensional structure of the free radical protein of ribonucleotide reductase. Nature 345, 593– 598 (1990). 147. Atkin, C. L., Thelander, L., Reichard, P. & Lang, G. Iron and free radical in ribonucleotide reductase. Exchange of iron and Mossbauer spectroscopy of the protein B2 subunit of the Escherichia coli enzyme. Journal of Biological Chemistry 248, 7464–7472 (1973). 148. Sjöberg, B. M. & Reichard, P. Nature of the free radical in ribonucleotide reductase from Escherichia coli. The Journal of biological chemistry 252, 536–41 (1977). 149. Cotruvo, J. A. & Stubbe, J. Class I ribonucleotide reductases: metallocofactor assembly and repair in vitro and in vivo. Annual Review of Biochemistry 80, 733–767 (2011). 150. Wu, C. H., Jiang, W., Krebs, C. & Stubbe, J. YfaE, a ferredoxin involved in diferric-tyrosyl radical maintenance in Escherichia coli ribonucleotide reductase. Biochemistry 46, 11577–11588 (2007). 151. Lundin, D., Torrents, E., Poole, A. M. & Sjöberg, B. M. RNRdb, a curated database of the universal enzyme family ribonucleotide reductase, reveals a high level of misannotation in sequences deposited to Genbank. BMC Genomics 10, (2009). 152. Högbom, M. Metal use in ribonucleotide reductase R2, di-iron, di- manganese and heterodinuclear - An intricate bioinorganic workaround to use different metals for the same reaction. Metallomics 3, 110–120 (2011). 153. Huang, M., Parker, M. J. & Stubbe, J. Choosing the right metal: case studies of class I ribonucleotide reductases. Journal of Biological Chemistry 289, 28104–28111 (2014). 154. Schimpff-Weiland, G. & Follmann, H. A new manganese-activated ribonucleotide reductase found in gram-positive bacteria. Biochemical and Biophysical Research Communications 102, 1276–1282 (1981). 155. Willing, A., Folmann, H. & Auling, G. Ribonucleotide reductase of

66 Brevibacterium ammoniagenes is a manganese enzyme. European Journal of Biochemistry 170, 603–611 (1988). 156. Huque, Y. et al. The active form of the R2F protein of class Ib ribonucleotide reductase from Corynebacterium ammoniagenes is a diferric protein. Journal of Biological Chemistry 275, 25365–25371 (2000). 157. Cotruvo, J. A., Stich, T. A., Britt, R. D. & Stubbe, J. Mechanism of assembly of the dimanganese-tyrosyl radical cofactor of class Ib ribonucleotide reductase: Enzymatic generation of superoxide is required for tyrosine oxidation via a Mn(III)Mn(IV) intermediate. Journal of the American Chemical Society 135, 4027–4039 (2013). 158. Cotruvo, J. A. & Stubbe, J. An active dimanganese(III)-tyrosyl radical cofactor in Escherichia coli class Ib ribonucleotide reductase. Biochemistry 49, 1297–1309 (2010). 159. Zhang, Y. & Stubbe, J. Bacillus subtilis class Ib ribonucleotide reductase is a dimanganese(III)-tyrosyl radical enzyme. Biochemistry 50, 5615–23 (2011). 160. Cox, N. et al. A tyrosyl−dimanganese coupled spin system is the native metalloradical cofactor of the R2F subunit of the ribonucleotide reductase of Corynebacterium ammoniagenes. Journal of the American Chemical Society 132, 11197–11213 (2010). 161. Berggren, G., Duraffourg, N., Sahlin, M. & Sjöberg, B. M. Semiquinone- induced maturation of bacillus anthracis ribonucleotide reductase by a superoxide intermediate. Journal of Biological Chemistry 289, 31940– 31949 (2014). 162. Cotruvo, J. A. & Stubbe, J. NrdI, a flavodoxin involved in maintenance of the diferric-tyrosyl radical cofactor in Escherichia coli class Ib ribonucleotide reductase. Proceedings of the National Academy of Sciences of the United States of America 105, 14383–14388 (2008). 163. Boal, A. K., Cotruvo, J. A., Stubbe, J. & Rosenzweig, A. C. Structural basis for activation of class Ib ribonucleotide reductase. Science 329, 1526–1530 (2010). 164. Högbom, M. et al. The radical site in chlamydial ribonucleotide reductase defines a new R2 subclass. Science 305, 245–248 (2004). 165. Jiang, W. et al. A manganese(IV)/iron(III) cofactor in Chlamydia trachomatis ribonucleotide reductase. Science 316, 1188–1191 (2007). 166. Jiang, W., Hoffart, L. M., Krebs, C. & Bollinger, J. M. A manganese(IV)/iron(IV) intermediate in assembly of the manganese(IV)/iron(III) cofactor of Chlamydia trachomatis ribonucleotide reductase. Biochemistry 46, 8709–8716 (2007). 167. Roos, K. & Siegbahn, P. E. M. Oxygen cleavage with manganese and iron in ribonucleotide reductase from Chlamydia trachomatis. Journal of Biological Inorganic Chemistry 16, 553–565 (2011). 168. Kwak, Y. et al. Geometric and electronic structure of the Mn(IV)Fe(III) cofactor in class Ic ribonucleotide reductase: correlation to the class Ia binuclear non-heme iron enzyme. Journal of the American Chemical Society 135, 17573–17584 (2013).

67 169. Högbom, M. et al. Displacement of the tyrosyl radical cofactor in ribonucleotide reductase obtained by single-crystal high-field EPR and 1.4-Å x-ray data. Proceedings of the National Academy of Sciences of the United States of America 100, 3209–3214 (2003). 170. Logan, D. T. et al. Crystal structure of reduced protein R2 of ribonucleotide reductase: the structural basis for oxygen activation at a dinuclear iron site. Structure 4, 1053–1064 (1996). 171. Boal, A. K., Cotruvo, J. A., Stubbe, J. & Rosenzweig, A. C. The dimanganese(II) site of Bacillus subtilis class Ib ribonucleotide reductase. Biochemistry 51, 3861–3871 (2012). 172. Dassama, L. M. K., Krebs, C., Bollinger, J. M., Rosenzweig, A. C. & Boal, A. K. Structural basis for assembly of the MnIV/FeIII cofactor in the class Ic ribonucleotide reductase from chlamydia trachomatis. Biochemistry 52, 6424–6436 (2013). 173. Griese, J. J. et al. Direct observation of structurally encoded metal discrimination and ether bond formation in a heterodinuclear metalloprotein. Proceedings of the National Academy of Sciences of the United States of America 110, 17189–17194 (2013). 174. Rozman Grinberg, I. et al. Novel ATP-cone-driven allosteric regulation of ribonucleotide reductase via the radical-generating subunit. eLife 7, e31529 (2018). 175. Rose, H. R. et al. Structural basis for superoxide activation of Flavobacterium johnsoniae class I ribonucleotide reductase and for radical initiation by its dimanganese cofactor. Biochemistry 57, 2679–2693 (2018). 176. Rozman Grinberg, I. et al. A glutaredoxin domain fused to the radical- generating subunit of ribonucleotide reductase (RNR) functions as an efficient RNR reductant. The Journal of biological chemistry 293, 15889– 15900 (2018). 177. Rozman Grinberg, I. et al. Class Id ribonucleotide reductase utilizes a Mn2(IV,III) cofactor and undergoes large conformational changes on metal loading. JBIC Journal of Biological Inorganic Chemistry 24, 863– 877 (2019). 178. Blaesi, E. J. et al. Metal-free class Ie ribonucleotide reductase from pathogens initiates catalysis with a tyrosine-derived dihydroxyphenylalanine radical. Proceedings of the National Academy of Sciences of the United States of America 115, 10022–10027 (2018). 179. Srinivas, V. et al. Metal-free ribonucleotide reduction powered by a DOPA radical in Mycoplasma pathogens. Nature 563, 416–420 (2018). 180. Andersson, C. S. & Högbom, M. A Mycobacterium tuberculosis ligand- binding Mn/Fe protein reveals a new cofactor in a remodeled R2-protein scaffold. Proceedings of the National Academy of Sciences of the United States of America 106, 5633–5638 (2009). 181. Högbom, M. The manganese/iron-carboxylate proteins: What is what, where are they, and what can the sequences tell us? Journal of Biological Inorganic Chemistry 15, 339–349 (2010). 182. Griese, J. J. et al. Structural basis for oxygen activation at a

68 heterodinuclear manganese/iron cofactor. Journal of Biological Chemistry 290, 25254–25272 (2015). 183. Miller, E. K., Trivelas, N. E., Maugeri, P. T., Blaesi, E. J. & Shafaat, H. S. Time-resolved investigations of heterobimetallic cofactor assembly in R2lox reveal distinct Mn/Fe intermediates. Biochemistry 56, 3369–3379 (2017). 184. Cotruvo, J. A. & Stubbe, J. Metallation and mismetallation of iron and manganese proteins in vitro and in vivo: The class I ribonucleotide reductases as a case study. Metallomics 4, 1020–1036 (2012). 185. Irving, H. & Williams, R. J. P. The stability of transition-metal complexes. Journal of the Chemical Society (Resumed) 3192–3210 (1953). 186. Argüello, J. M., Raimunda, D. & Padilla-Benavides, T. Mechanisms of homeostasis in bacteria. Frontiers in Cellular and Infection Microbiology 4, (2013). 187. Griese, J. J., Högbom, M. & IUCr. Location-specific quantification of protein-bound metal ions by X-ray anomalous dispersion: Q-XAD. Acta Crystallographica Section D Structural Biology 75, 764–771 (2019). 188. Griese, J. J., Srinivas, V. & Högbom, M. Assembly of nonheme Mn/Fe active sites in heterodinuclear metalloproteins. JBIC Journal of Biological Inorganic Chemistry 19, 759–774 (2014). 189. Kutin, Y. et al. Divergent assembly mechanisms of the manganese/iron cofactors in R2lox and R2c proteins. Journal of Inorganic Biochemistry 162, 164–177 (2016). 190. Vance, C. K. & Miller, A. F. Novel insights into the basis for Escherichia coli ’s metal ion specificity from mn-substituted FeSOD and its very high Em. Biochemistry 40, 13079–13087 (2001). 191. Kutin, Y. et al. Chemical flexibility of heterobimetallic Mn/Fe cofactors: R2lox and R2c proteins. Journal of Biological Chemistry 294, 18372– 18386 (2019). 192. Spencer, R. C. Bacillus anthracis. Journal of Clinical Pathology 56, 182– 187 (2003). 193. Crona, M. et al. NrdH-redoxin protein mediates high enzyme activity in manganese- reconstituted ribonucleotide reductase from Bacillus anthracis. Journal of Biological Chemistry 286, 33053–33060 (2011). 194. Drew, D. et al. A scalable, GFP-based pipeline for membrane protein overexpression screening and purification. Protein science : a publication of the Protein Society 14, 2011–2017 (2005). 195. Grāve, K. et al. Redox-induced structural changes in the di-iron and di- manganese forms of Bacillus anthracis ribonucleotide reductase subunit NrdF suggest a mechanism for gating of radical access. JBIC Journal of Biological Inorganic Chemistry 24, 849–861 (2019). 196. Sigfridsson, K. G. V. et al. Rapid X-ray photoreduction of dimetal-oxygen cofactors in ribonucleotide reductase. Journal of Biological Chemistry 288, 9648–9661 (2013). 197. Hammerstad, M., Hersleth, H.-P., Tomter, A. B., Røhr, Å. K. & Andersson, K. K. Crystal structure of Bacillus cereus class Ib ribonucleotide reductase di-iron NrdF in complex with NrdI. ACS

69 Chemical Biology 9, 526–537 (2014). 198. Bowman, S. E. J., Bridwell-Rabb, J. & Drennan, C. L. Metalloprotein crystallography: more than a structure. Accounts of Chemical Research 49, 695–702 (2016). 199. Cooley, R. B., Arp, D. J. & Karplus, P. A. Evolutionary origin of a secondary structure: α-helices as cryptic but widespread insertional variations of α-helices that enhance protein functionality. Journal of Molecular Biology 404, 232–246 (2010). 200. Bailey, L. J., McCoy, J. G., Phillips, G. N. & Fox, B. G. Structural consequences of effector protein complex formation in a diiron hydroxylase. Proceedings of the National Academy of Sciences of the United States of America 105, 19194–19198 (2008). 201. Sazinsky, M. H., Dunten, P. W., McCormick, M. S., DiDonato, A. & Lippard, S. J. X-ray structure of a hydroxylase−regulatory protein complex from a hydrocarbon-oxidizing multicomponent monooxygenase, Pseudomonas sp. OX1 phenol hydroxylase. Biochemistry 45, 15392– 15404 (2006). 202. Hendrickson, W. A. Anomalous diffraction in crystallographic phase evaluation. Quarterly Reviews of Biophysics vol. 47 49–93 (2014). 203. Taylor, G. L. Introduction to phasing. Acta Crystallographica Section D: Biological Crystallography 66, 325–338 (2010). 204. Rupp, B. Biomolecular crystallography: principles, practice, and application to structural biology. (Garland Science, 2009). 205. Rhodes, G. Crystallography made crystal clear. (Academic Press, 2006). 206. Merritt, E. A. Anomalous scattering coefficients. http://skuld.bmsc.washington.edu/scatter/AS_form.html. 207. Garman, E. F. Radiation damage in macromolecular crystallography: What is it and why should we care? Acta Crystallographica Section D: Biological Crystallography 66, 339–351 (2010). 208. Griese, J. J. et al. X-ray reduction correlates with soaking accessibility as judged from four non-crystallographically related diiron sites. Metallomics 4, 894 (2012). 209. Foos, N. et al. X-ray and UV radiation-damage-induced phasing using synchrotron serial crystallography. Acta Crystallographica Section D: Structural Biology 74, 366–378 (2018).

The cover image was created using QuteMol: Tarini, M., Cignoni, P. & Montani, C. Ambient occlusion and edge cueing to enhance real time molecular visualization. IEEE Transactions on Visualization and Computer Graphics 12, 1237–1244 (2006).

70