Computational modeling of glycyl radical enzymes: Novel insights to the mechanism of Pyruvate Formate-Lyase

Submitted to the Faculty of Sciences of the Friedrich-Alexander-Universität Erlangen-Nürnberg

for the degree of Dr. rer. nat.

by Marko Hanževački from Zagreb, Croatia

Approved as a dissertation by the Faculty of Sciences of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Date of the oral examination: 30.10.2020

Chair of the doctoral committee: Prof. Dr. Wolfgang Achtziger
Reviewers: Prof. Dr. Ana-Sunčana Smith, Prof. Dr. Timothy Clark

Acknowledgments

I would like to express my sincere appreciation to my supervisors David and Ana Smith for always being friendly, patient, and understanding. David, as my main supervisor you have been a constant source of support and influence. You have demonstrated dedication and belief in the research topic – and me. Thank you for sharing your indispensable expertise and knowledge, as well as for all constructive guidance and motivation. Thank you, Ana, for taking the role of co-supervisor and strongly assisting in steering me through to the end with enthusiastic encouragement, inspiring and dynamic discussions, and useful critiques of my research work. I admire you both for your excellence in science, supervisory qualities, and authority. I will always appreciate you as my mentors and positive role models and do my best to apply the knowledge and skills I have obtained in my future career.

I want to kindly thank all those who helped me in the scientific work and in learning new techniques and tools. Accordingly, I would like to pay my special regards to Radha Dilip Banhatti for providing a great deal of professional support and assistance in keeping my progress on schedule and for her valuable and constructive suggestions during the development of this research work. My deepest gratitude is also extended to Karmen Čondic-Jurkić for her willingness to dedicate her time so generously and for transferring essential knowledge and experience about molecular simulations and catalysis.

During the last couple of years, I have met many wonderful people, but three of them are very special to me and I want to acknowledge them here. Dear Katarina, Nataša, and Zlatko, you have always believed in me and I consider you as my second family. I am truly grateful for everything you have done for me and I could not be happier to have such friends.

Thus, once again I would like to thank my best friend Katarina for a decade of friendship and all the wonderful moments we spent together from college to this day. Kate, thank you for all the love, understanding, care, and support you have given me along this path. Thanks for being such an incredible person and that you showed me how precious true friendship can be, and how much incredible value it can add to my life.

Special thanks to my dear friend Nataša for all the laughter and cheerful moments in our small but at the same time always positive and joyful office in the basement of the sixth wing. I enjoyed sharing the office space with you as my colleague and a true friend. These couple of years with you in my life were exciting and amazing. There is nothing better than a friend unless it is a friend with a large Milka chocolate.

I want to thank my best buddy Zlatko for turning every ordinary situation into something extraordinary. Thank you for all the fruitful discussions and useful life coaching. Thanks for accepting me for who I am from the very beginning and for teaching me some of the techniques and programming which have helped to improve my skills and this dissertation.

I want to express my deepest gratitude to my family for the endless support along the way, especially my parents who have always been there for me and whose love and guidance are with me in whatever I pursue. Without you, all of this would be impossible.

I would like to take this opportunity to thank Jakov, Josip, Mislav, Luka, Robert, Joe, Zoran, Boris, Danijela, and all my friends and colleagues from the group in Zagreb for the productive working atmosphere, helpful discussions, support, and motivation. These people have continuously encouraged me and were always willing and enthusiastic to assist in any way possible throughout the research projects. I am also grateful to my colleagues from the group in Erlangen, especially Christian Wick, for excellent cooperation and hospitality. I would like to kindly thank Christof Jäger, Anna Croft, and the British Scholarship Trust for the opportunity to conduct part of my research abroad at the University of Nottingham. I want to thank my dear friends Vera, Zé, Carolina, and Ania from Nottingham for constant stimulation and support. I would like to acknowledge Željka, Nikolina, Ira, and the entire administrative staff of RBI and FAU.

Last but not least, I would like to gratefully acknowledge the Croatian Science Foundation (IP-11-2013-8238) and the ERC Starting Grant (337283) for financial support. I would like to acknowledge the Cluster of Excellence: Engineering of Advanced Materials (EAM), the computing facilities of the Regionales Rechenzentrum Erlangen (RRZE), and the University Computing Centre (SRCE) for providing computational and storage resources.

“I have been bent and broken, but – I hope – into a better shape.” — Charles Dickens, Great Expectations

Abstract

Enzymes whose mechanisms involve the formation of species with unpaired electrons (free radicals) are collectively known as radical enzymes. In the last couple of years, there has been an increasing interest in radical enzymes due to the ability of these species to perform a large variety of biochemical transformations, which opens an avenue for potential industrial applications in biotechnology and enzyme engineering. Enzymes utilizing unpaired electrons on glycine residues (glycyl radical enzymes or GREs) have proven to be interesting targets for pharmaceutical applications e.g. in the synthesis of novel antibiotics. In a natural setting, GREs assume a key role in the metabolic pathways of strict and facultative anaerobes such as E. coli and other microorganisms.

In this thesis, a prototypical member of the GRE family, pyruvate formate-lyase (PFL), has been systematically studied using multiscale computational modeling. This enzyme is central to the microbial anaerobic glucose metabolism, where it catalyzes the reversible conversion of pyruvate and coenzyme A (CoA) to formate and acetyl-CoA in two half-reactions by employing complex radical chemistry. Although PFL is one of the most extensively characterized GREs, several important aspects of its catalytic mechanism are still not understood. For example, the details of neither the first nor the second half-reaction have been fully characterized. The connection between the two half-reactions, in which the CoA molecule must approach the enzyme’s active site, is even more elusive. Namely, all available crystal structures in complex with CoA indicate that CoA is bound to the surface of the enzyme, some 20-30 Å away from the active site. For more than a decade, the actual entry mechanism of CoA has remained hidden due to the lack of experimental data.

In the first part of this thesis, the effects of chemical modification of the enzyme (acetylation), during the first half-reaction, were investigated using molecular dynamics. This led to the identification of an entry channel for CoA, which leads from the surface of the enzyme to the active site. The channel was found to exhibit accentuated fluctuations and a higher probability of being in an open state in the acetylated (post first half-reaction) systems. This finding suggests that the structural modification of the enzyme has an important functional role, whereby the formation of the acyl-enzyme intermediate serves to initiate a subtle molecular signaling cascade that influences the protein dynamics and facilitates the entry (and the timing thereof) of CoA into the active site of PFL.

Using the understanding of the channel position, the reaction coordinates that connect CoA with the active site of PFL were examined using steered MD simulations and umbrella sampling. Additionally, the unrestrained dissociation dynamics of CoA from the active site were investigated. These simulations were performed on acetylated and non-acetylated model systems of PFL, aiming to investigate the possible binding and unbinding pathways of CoA through the previously identified channel. The energetics associated with the process of CoA approaching the active site of PFL before and after the first half-reaction have been established, thus uncovering potential bound states of the coenzyme in the near vicinity of the active site. The key findings of this study reveal the presence of reactive bound states of CoA close to the active site, significant for triggering the second half-reaction and determining the overall outcomes for both acetylated and non-acetylated PFL systems.

The detailed chemical aspects of PFL catalysis have been studied in the third part of this thesis by employing QM/MM calculations. In this respect, both half-reactions were investigated using the previously determined binding poses of CoA in the active site of non-acetylated and acetylated model systems. The results suggest that the progression of the first half-reaction occurs in two distinct steps. The initiation of the second half-reaction, by the abstraction of an H-atom, was found to be of key importance. It was demonstrated that the flexibility of the active site allows for the adjustment to the newly formed species in the acetylated systems, which is crucial to lowering the barrier for the initial H-abstraction reaction. Furthermore, the H-abstraction from CoA was found to be energetically more favorable when carried out by a formate radical anion rather than by the cysteinyl radical in the active site. Additionally, it was found that the abstraction of an H-atom from CoA in the active site before the first half-reaction is possible. However, such a reaction would cause a premature quenching of the radical and could lead to the inactivation of PFL.

Overall, the presented work enhances the understanding of the catalytic mechanism of PFL. Possible chemical pathways that appear once CoA reaches the active site of PFL were characterized, starting from the detailed investigation of the factors and the underlying energetics that influence the binding mechanism and positioning of CoA at the active site. These findings will undoubtedly serve as a template for future endeavors in the field of GREs towards understanding the structure-function relationships and catalytic mechanisms in other classes of radical enzymes.

Summary (Zusammenfassung)

Enzymes whose mechanisms involve the formation of intermediates with unpaired electrons (free radicals) are referred to as radical enzymes. In recent years, interest in radical enzymes has grown strongly, since they are able to carry out a wide variety of biochemical transformations. This paves the way for the industrial use of these enzymes in biotechnology and enzyme engineering. In particular, enzymes that use glycine residues with unpaired electrons (glycyl radical enzymes or GREs) have emerged as especially interesting targets for pharmaceutical applications, e.g. in the synthesis of new antibiotics. In nature, GREs play a key role in the metabolic pathways of strict and facultative anaerobes such as E. coli and other microorganisms.

In this work, a prototypical member of the GRE family, pyruvate formate-lyase (PFL), was systematically investigated using computer-based multiscale modeling. This enzyme plays a central role in microbial anaerobic glucose metabolism and catalyzes the reversible conversion of pyruvate and coenzyme A (CoA) into formate and acetyl-CoA in two half-reactions by means of complex radical mechanisms. Although PFL is one of the most comprehensively characterized GREs, several important aspects of the catalytic mechanism have not yet been elucidated. For example, the details of neither the first nor the second half-reaction have been fully characterized. The connection between the two half-reactions, in which the CoA molecule must approach the active site of the enzyme, is even more difficult to characterize. All available crystal structures with CoA indicate that CoA is bound to the surface of the enzyme (about 20-30 Å from the active site). For more than a decade, the actual entry mechanism of CoA has remained hidden owing to the lack of experimental data.

In the first part of this work, the effects of the chemical modification of the enzyme (acetylation) during the first half-reaction were investigated using molecular dynamics simulations. This led to the identification of an entry channel for CoA that leads from the surface of the enzyme to the active site. The channel was found to exhibit accentuated fluctuations and a higher probability of being in an open state in the acetylated systems (after the first half-reaction). This finding suggests that the structural modification of the enzyme plays an important functional role, with the formation of the acyl-enzyme intermediate serving to initiate a subtle molecular signaling cascade that influences the protein dynamics and facilitates the entry (as well as its timing) of CoA into the active site of PFL.

Using the knowledge of the exact position of the channel, the reaction coordinates that connect CoA with the active site of PFL were investigated by means of steered MD simulations and umbrella sampling. In addition, the unrestrained dissociation dynamics of CoA from the active site were explored. These simulations were carried out on acetylated and non-acetylated model systems of PFL in order to investigate the possible binding and dissociation pathways of CoA through the previously identified channel. The energetics of the approach of CoA to the active site of PFL before and after the first half-reaction were determined, which revealed possible bound states of the coenzyme near the active site. The most important results of this study show the presence of reactive bound states of CoA near the active site that are significant for carrying out the second half-reaction and for the overall reaction in both acetylated and non-acetylated PFL systems.

The detailed chemical aspects of the catalytic mechanism of PFL were investigated in the third part of this work using QM/MM calculations. In this context, both half-reactions were explored using the previously determined binding poses of CoA in the active site of non-acetylated and acetylated model systems. The results suggest that the first half-reaction proceeds in two steps. The initiation of the second half-reaction by abstraction of an H-atom was found to be of central importance. It was shown that the flexibility of the active site allows adjustment to the newly formed species in the acetylated systems. This is crucial for lowering the barrier for the first H-abstraction reaction. Furthermore, the H-abstraction from CoA proved to be energetically more favorable when it is carried out by a formate radical anion rather than by the cysteinyl radical in the active site. In addition, it was found that the abstraction of an H-atom from CoA in the active site before the first half-reaction is possible. Such a reaction would, however, cause premature quenching of the radical and could ultimately lead to the inactivation of PFL.

The presented work substantially extends the understanding of the catalytic mechanism of PFL. Starting from the detailed investigation of the factors that influence the binding mechanism and the positioning of CoA at the active site, an energetic characterization of the possible chemical pathways that arise once CoA has reached the active site of PFL could be carried out. These results will undoubtedly serve as a template for future studies in the field of GREs aimed at understanding the structure-function relationships and catalytic mechanisms in other classes of radical enzymes.

Table of Contents

Chapter 1: Radical Enzymes
  1.1. Introduction
  1.2. Radical Enzymes That Produce Radicals Employing Cofactors – Adenosylcobalamin and S-adenosylmethionine Dependent Enzymes
  1.3. Radical Enzymes Involving Protein Radicals – Glycyl Radical Enzymes
  1.4. Pyruvate Formate-Lyase
  1.5. References
Chapter 2: Computational Methods for Modeling
  2.1. Introduction
  2.2. Molecular Dynamics Simulations
  2.3. QM/MM Calculations
  2.3. References
Chapter 3: The Influence of Acetylation in the Active Site on Enzyme Dynamics in Pyruvate Formate-Lyase
  3.1. Introduction
  3.2. Methods
    3.2.1. The Building of Model Systems
    3.2.2. Force Field Parameters
    3.2.3. Conditions of Molecular Dynamics Simulations
    3.2.4. Steered Molecular Dynamics Simulations
    3.2.5. Analysis
  3.3. Results
    3.3.1. Geometries of the Active Site in Non-acetylated and Acetylated PFL Model Systems
    3.3.2. Global PFL Conformational Dynamics
    3.3.3. Coenzyme’s Dynamics
    3.3.4. Identification of Potential CoA Entry Channel
    3.3.5. Topology Effects
  3.4. Conclusion
  3.5. References
Chapter 4: Exploring Reactive Conformations of Coenzyme A during Binding and Unbinding to Pyruvate Formate-Lyase
  4.1. Introduction
  4.2. Methods
    4.2.1. Model Systems
    4.2.2. Reaction Coordinate
    4.2.3. Forward Pulls with Steered Molecular Dynamics
    4.2.4. Reverse Pulls
    4.2.5. Unrestrained Dissociation Dynamics from the Active Site to the Protein Surface
    4.2.6. Analysis
  4.3. Results
    4.3.1. Insights from Forward and Reverse Pulls
    4.3.2. Potentials of Mean Force from Umbrella Sampling
    4.3.3. Detailed Characterization of Bound States
    4.3.3. Characterization of Reactive Bound States
    4.3.4. Implications Emerging for the Catalytic Process
  4.4. Conclusion
  4.5. References
Chapter 5: A QM/MM Study of Ping-Pong Mechanism of Both Half-Reactions and Inactivation by Coenzyme A in Pyruvate Formate-Lyase
  5.1. Introduction
  5.2. Methods
    5.2.1. QM/MM Model Systems
    5.2.2. QM Region Setup
    5.2.3. QM/MM Calculations Setup
    5.2.4. Analysis
    5.2.5. QM/MM Benchmark
  5.3. Results
    5.3.1. Density Functionals Quality Assessment
    5.3.2. The 1st Half-Reaction with Pyruvate and Hydrogen Transfer inside the Active Site without CoA
    5.3.3. The Attack of the C418 Thiyl Radical on Pyruvate
    5.3.4. The Collapse of the Radical Tetrahedral Intermediate and Acetylation of C418
    5.3.5. The Hydrogen Transfer between C419 and CO2•−
    5.3.6. The 2nd Half-Reaction and Inactivation with CoA
    5.3.7. The Hydrogen Abstraction from CoA
    5.3.8. The Acetylation of CoA
    5.3.9. The Inactivation
  5.4. Conclusion
  5.5. References
Summary and Outlook
Appendix
Bibliography
List of Publications
Curriculum Vitae
Statutory Declaration

Chapter 1: Radical Enzymes

1.1. Introduction

Among the most prominent molecules that shape life, proteins are truly the building blocks,1-3 considering that they are the most common molecules found in all cells.4,5 Proteins have numerous remarkable properties and are responsible for a large variety of functions.6 These biological macromolecules are extremely dynamic,7 which allows them to interact with each other,8 as well as with other cellular components, e.g. membranes or genetic material, creating networks9 and signaling cascades crucial for the proper functioning of cells.10-12 They also serve as targets for binding small molecules13 and metabolites vital for life, such as drugs,14 which is of key importance in the development of novel therapeutic agents.15 Moreover, it is well-established that the absence, mutation, or even inactivation of certain proteins may cause numerous serious diseases.16,17

Proteins consist of polypeptide chains built from twenty different amino acids, whose combination allows the formation of extremely complex three-dimensional architectures and therefore makes them the most structurally challenging molecules known to biology.18 The sequence of amino acids in the polypeptide chain determines the protein structure, and this structure in turn determines the function of the protein.19 Proteins that possess the fascinating property of carrying out chemical transformations are called enzymes.20 Enzymes are among nature’s most important protagonists and life’s key coordinators, with the fundamental task of accelerating biochemical reactions by converting substrates to products and producing all essential metabolites needed to keep the balance in cells.21-23 This phenomenon is known as enzyme catalysis.24-26

In chemistry, radicals are highly reactive species (atoms, molecules, or ions) that contain a single unpaired electron in an outer orbital.27-30 The reactivity of this electron, and its affinity to pair with another electron, enables unusual chemical reactions to take place that are otherwise inaccessible by standard routes.31-35 It has been discovered that certain enzymes use this high reactivity of radicals to catalyze chemically demanding reactions in nature, placing radical-based catalysis among the most important topics in modern biochemistry and enzymology.36 Enzymes that contain, or can produce and stabilize, radical intermediates are named radical enzymes (REs).37 REs are likely to have had a significant role in directing chemistry in the pre-oxygen environment,38,39 and continue to catalyze crucial reactions in life processes that we know today, such as the synthesis of deoxyribonucleic acid (DNA) precursors,40 body detoxification,41 biomaterial degradation,42 oxidation and epoxidation reactions43,44 in the production of essential metabolites, and decarboxylation,45,46 as well as in other biosynthetic metabolic pathways.47-49

REs can be broadly divided into two main classes: aerobic and anaerobic. Aerobic REs are well-established catalysts in biotechnology and typically include oxidases,50-55 which utilize molecular oxygen or oxygen-derived reactive species to facilitate their catalysis, with applications ranging from bioremediation, detoxification of wastewater streams, food preservation, fuel cells, and bleaching technology to biosensors and diagnostics. In this thesis, however, the focus is set solely on their anaerobic counterparts. Anaerobes have developed an alternative mechanism to catalyze important biochemical transformations by employing anaerobic REs, ancient proteins that evolved to carry out fundamental chemistries in the primary and secondary metabolism of bacteria in anoxic environments using radicals as reactive species.48 Their substrates are usually shielded from the bulk solvent, as well as from other potentially reactive molecules, and buried deep within the interior of the enzyme under several well-structured protein layers, which serves to protect the highly reactive radical intermediates from unwanted side-reactions, fatal quenching, and inactivation.56-58

Recently, there has been increasing interest in anaerobic REs for industrial applications, especially in biotechnology and enzyme engineering.59-61 These enzymes possess an enriched scope of chemistries, effecting reactions in hydrocarbons and aromatic compounds,62-65 providing routes to methane activation and insertion of heteroatoms,66 and participating in the metabolism of amino acids to generate branched and linear hydrocarbons.48,49 In this respect, numerous chemically challenging reactions, including the formation and cleavage of C-C, C-O, and C-N bonds of mostly non-reactive compounds, proceed with the assistance of catalytic radicals. Anaerobic REs carry out these reactions by utilizing the high reactivity of radicals, which allows direct routes or easier access to molecules that are otherwise inaccessible through standard existing chemical procedures.

Due to their versatile reactivity, anaerobic REs have already demonstrated a broad scope of applications in industrial challenges, including the production of bulk and fine chemicals as well as much-needed antibiotics.67 Current technologies, as well as new modern ones, are likely to play a strong role in developing these enzymes for further applications. Even though considerable efforts have recently been made in discovering and studying REs, many fundamental aspects of radical-based catalysis are still poorly understood due to the limited availability of experimental data. Namely, the reactions in which radicals are involved are usually unpredictable and difficult to monitor or control, while at the same time, the reactions catalyzed by anaerobic REs involve unstable, short-lived intermediates that are challenging to study experimentally due to their increased oxygen-sensitivity.48,68-70

At the forefront of the field of REs, computer-modeling-based design offers much promise, especially when coupled with experimental feedback or information from complementary experimental approaches such as directed evolution. On the one hand, radical-based catalysis enables novel reaction routes and allows enzymes to perform low-energy processes; on the other, it requires a highly controlled environment to avoid undesirable side-reactions that could potentially damage cells. In general, radical catalysis is far more complex than classical acid-base catalysis because it demands the controlled formation, storage, and decomposition of radicals, which is critical for these reactive species.31,48,71

Anaerobic REs are further classified into two major groups based on how their radicals are formed: those whose radicals are derived from cofactors, such as coenzyme B12 (adenosylcobalamin) or S-adenosylmethionine (SAM),72,73 and those whose radicals are derived directly from amino acid residues in proteins, such as glycyl radical enzymes (GREs).74 This thesis focuses mainly on GREs and aims, through the application of various molecular modeling and computational methods, to shed light on the catalytic mechanism of one member of the GRE family, pyruvate formate-lyase (PFL). The general features of GREs, as well as of other relevant classes of cofactor-dependent REs, namely coenzyme B12 and S-adenosylmethionine dependent REs, are therefore presented in the following sections.

1.2. Radical Enzymes That Produce Radicals Employing Cofactors – Adenosylcobalamin and S-adenosylmethionine Dependent Enzymes

Adenosylcobalamin (AdoCbl, coenzyme B12 or vitamin B12) is a molecule (see Figure 1.2.1a) that has been widely employed in numerous catalytic processes and successfully applied in laboratory synthesis dealing with extremely challenging reactions including halide and alkene coupling, ring-opening, addition to double bonds, dehalogenation, hydrogenation, transmethylation, cyclopropanation, and rearrangements.75 However, coenzyme B12 has been associated with radical biochemistry since the discovery of B12-dependent enzymes that prevail predominantly in anaerobic microorganisms, governing some of the critical steps in their metabolic pathways.76,77 Coenzyme B12, being one of the most-studied radical generating compounds in nature, is a commonly occurring organometallic compound with, from the perspective of cofactors, a highly complex structure (see Figure 1.2.1a).78-81

The coenzyme molecule is built of a peripherally substituted macrocyclic corrin in which a central cobalt(III) cation is embedded, ligated to four in-plane nitrogen atoms from the pyrroles of the corrinoid ring. The remaining two axial positions of the octahedral metal complex are occupied by a 5’-deoxyadenosyl (5’-dAdo) residue, bound to Co3+ via a relatively long and weak σ-bond, and a Lewis base, which is either 5,6-dimethylbenzimidazole (DMB) covalently linked to the heterocyclic corrin (“base on”) or the imidazole moiety of a conserved histidine (His) from the enzyme (“base off, His on”).82 The schematic representation of “base on” AdoCbl is shown in Figure 1.2.1. All reactions catalyzed by B12-dependent enzymes include the initial homolysis of the Co-C bond,83 upon which Co3+ is reduced to Co2+, generating the 5’-deoxyadenosyl radical (5’-dAdo•), which subsequently abstracts a hydrogen atom from various substrates (see Figure 1.2.1b for details).81

Figure 1.2.1. (a) The structure of the organometallic coenzyme B12 complex features a corrin and the 5’-deoxyadenosyl residue (magenta) bound to cobalt(III). (b) Schematic representation of the homolytic Co-C bond cleavage in vitamin B12, which reversibly generates cobalt(II) and the reactive 5’-deoxyadenosyl radical (shown in red) that is then used in further catalysis. The cleavage of the weak Co-C σ-bond of vitamin B12 is the main catalytic event from which 5’-dAdo• is generated, and is thought to be triggered upon substrate binding to the cofactor-containing enzyme, induced by conformational changes of the enzyme.

Figure 1.2.2. Active site structures of the two representative AdoCbl-dependent REs. The crystal structures of (a) the anaerobically obtained B12-dependent propane-1,2-diol dehydratase (B12-dPDH, EC 4.2.1.28) showing the “base on” architecture of coenzyme B12 (PDB: 5YRV), and (b) the methylmalonyl-CoA mutase (MCM, EC 5.4.99.2) with the “base off, His on” mode of cobalamin (PDB: 4REQ). The 5’-dAdo moiety is not shown.

To date, several anaerobic radical enzymes that use the B12 cofactor have been reported. They typically catalyze rearrangements of the carbon skeleton carried out by mutases,84-88 eliminations of heteroatoms by eliminases,89-94 or intramolecular amino group migrations by aminomutases,95,96 utilizing a highly sensitive and reactive organic carbon-centered 5’-dAdo• intermediate. These enzymes can also be sorted into two representative groups of vitamin B12-dependent enzymes, the “base on” eliminases and the “base off, His on” mutases (see Figure 1.2.2a-b for details).97,98 The ability of mutases to perform carbon skeleton transformations is extremely useful for the preparation of new chemicals,99 which has already been highlighted through the development of readily accessible precursors and intermediates for obtaining carbon-based compounds.84-88 The irreversible eliminases include ethanolamine ammonia-lyase (EAL, EC 4.3.1.7),92,100 which readily cleaves the C-N bond in the ethanolamine substrate and participates in bacterial anaerobic glycerophospholipid metabolism, and diol dehydratases (DDs),89 which act on ethane-1,2-diol, propane-1,2-diol, and glycerol substrates.90,91

The reactions catalyzed by B12-dependent eliminases are well established and have important applications in biotechnology and enzyme engineering. In these transformations, the diols are dehydrated to the useful aldehydes acetaldehyde and propanal, glycerol is dehydrated to the valuable 3-hydroxypropanal (3-HPA), and ethanolamine is deaminated to acetaldehyde by several metabolically engineered microorganisms. Furthermore, the ability to convert such short-chain 1,2-diols enables the employment of DDs in other reactions for renewable bulk derivatives, such as the production of butanone and 2-butanol from butane-2,3-diol, propanal from propane-1,2-diol, and other cases where 3-HPA is used as a central intermediate.77
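As a compact summary of the eliminase chemistry described above, the net transformations can be written as balanced equations. This is only a stoichiometric sketch: the cofactor and all radical intermediates are omitted, and the subsequent reduction of butanone to 2-butanol is not shown because it is not an elimination step.

```latex
\begin{align*}
\text{ethane-1,2-diol} &\longrightarrow \text{acetaldehyde} + \mathrm{H_2O} \\
\text{propane-1,2-diol} &\longrightarrow \text{propanal} + \mathrm{H_2O} \\
\text{glycerol} &\longrightarrow \text{3-hydroxypropanal (3-HPA)} + \mathrm{H_2O} \\
\text{butane-2,3-diol} &\longrightarrow \text{butanone} + \mathrm{H_2O} \\
\text{ethanolamine} &\longrightarrow \text{acetaldehyde} + \mathrm{NH_3}
\end{align*}
```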

The majority of AdoCbl-dependent enzymes share similar structural motifs including a full (β/α)8 or partial (β/α)6 triosephosphate isomerase (TIM)-barrel domain, a compact, so-called Rossmann-type fold consisting of eight or six parallel β-strands arranged in a cylinder and surrounded by α-helices, which plays a key functional role, especially in stabilizing radical intermediates (see Figure 1.2.3).81,101-103 There are two possible modes of cobalamin binding in these classes: enzymes catalyzing heteroatom eliminations bind cobalamins in the DMB-on mode, whereas those catalyzing carbon-skeleton rearrangements and intramolecular amino group migrations bind the cofactor in the DMB-off/His-on mode, as shown in Figure 1.2.2a-b.98

Figure 1.2.3. (a) Top and (b) side view of the crystal structure of the catalytic subunit of coenzyme B12-dependent ethanolamine ammonia-lyase (EAL), which reveals a cobalamin sitting on top of a full (β/α)8 TIM-barrel fold in the radical catalytic site (PDB: 5YSR).

Moreover, the cofactor molecule is bound so as to cover the active sites stacked between the TIM-barrel domains (see Figure 1.2.3). Since highly reactive and unstable radical intermediates are involved in the reactions catalyzed by AdoCbl-dependent enzymes, these reactive species must be protected from undesirable side reactions. Such compact common architectures may therefore be necessary to protect radical species by spatial isolation from the environment within the core of the enzyme. This shielding of the intermediates formed in the catalytic cycle is also practical for a range of reactions in industrial applications. As demonstrated above, B12-dependent radical enzymes contribute significantly to the production of chemicals relevant to biotechnology. Besides their ability to fine-tune radical reactions, they are typically more resistant to atmospheric oxygen than the rest of the anaerobic REs (especially GREs and iron-sulfur-containing enzymes, as will be demonstrated in the following sections), which makes them more robust and applicable to various process conditions, while further advances and studies of these enzymes may contribute to improving existing technologies.48

S-adenosylmethionine (SAM or AdoMet)73 is a sulfonium molecule (see Figure 1.2.4), well known as a key cofactor for methyl-group donation in numerous methylation reactions in nature,104-109 but also a protagonist in the generation of substrate radicals in the radical SAM-dependent superfamily of anaerobic enzymes.73,110-112 Although often underestimated because its structure is much simpler than that of AdoCbl, the AdoMet cofactor,107,113 when incorporated in the enzymatic environment, is capable of performing a greater diversity of biological radical transformations, some of which have not yet been fully characterized.66,111,114-116

Radical SAM-dependent enzymes are currently the largest known superfamily of metalloenzymes, mostly involved in general metabolic pathways in strict and facultative anaerobic bacteria and other microorganisms.117 These enzymes catalyze a wide range of biochemical transformations and radical chemistries including aminomutase reactions, sulfur insertion into non-reactive C-H bonds in biotin and lipoate biosyntheses, DNA repair of thymine dimers, anaerobic oxidative decarboxylation, vitamin/cofactor biosynthesis, natural product biosynthesis, methylthiolation of tRNA, methylation of challenging compounds (including phosphorus and non-reactive carbon atoms), antibiotic synthesis, as well as generation of the glycyl radical in the activation of glycyl radical enzymes and peptide modification.61,73,118,119 Members of this superfamily include glycyl radical activating enzymes (GRE-AEs or GRE activases) such as pyruvate formate-lyase activating enzyme (pflA or PFL-AE, EC 1.97.1.4)74 and glycerol dehydratase activating enzyme (GDH-AE, EC 4.2.1.30),120-122 lysine 2,3-aminomutase (LAM, EC 5.4.3.2),123 biotin synthase (BioB, EC 2.8.1.6),124 methylornithine synthase (PylB, EC 5.4.99.58), peptide-modifying enzyme involved in streptide biosynthesis (SuiB), queuine lyase (EC 2.4.2.29), lipoyl synthase (LipA, EC 2.8.1.8),125 7-carboxy-7-deazaguanine (CDG) synthase (QueE, EC 4.3.99.3),126,127 molybdopterin cofactor biosynthesis enzyme (MoaA, EC 4.1.99.22),128 oxygen-independent coproporphyrinogen oxidase (HemN, EC 1.3.99.22),39 spore photoproduct-lyase (SPL, EC 4.1.99.14),129 tRNA methylthiolation enzyme (MiaB, EC 2.8.4.3), enzymes involved in thiamine biosynthesis (ThiC, EC 4.1.99.17 and ThiH, EC 2.5.1.77),130,131 tRNA modification enzyme (TYW1, EC 2.1.1.225),132 formylglycine-generating enzyme (AtsB, EC 1.8.3.7) and others. Although thousands of potential SAM-dependent radical enzymes have been detected in genomes of both aerobes and anaerobes to date, the mechanisms of only a few of them have been investigated in detail.114

Most of the radical AdoMet enzymes share similar structural features, having a common barrel-like TIM architecture which can vary with the size of the substrate, from a complete (β/α)8 fold (found for example in biotin synthase (BioB) of Escherichia coli (E. coli) as shown in Figure 1.2.5) to more common partial (β/α)6 barrel motifs (e.g. QueE of E. coli133 and pyruvate formate-lyase activating enzyme (PFL-AE)74 as shown in Figure 1.2.6), similar to the structural motifs found in B12-dependent REs, again probably to protect unstable and reactive radical intermediates generated during the catalysis.134 Moreover, the key sequence motif,110,135 which is used to identify these enzymes, is revealed to be a CX3CX2C region. The three highly conserved cysteine residues are necessary for coordinating three Fe atoms of the iron-sulfur cluster,136 with the fourth unique iron site occupied by the sulfur atom, as well as amino and carboxylate groups of SAM as shown in Figures 1.2.4 and 1.2.7.124,137,138

Figure 1.2.4. Two SAM-cleavage pathways. The iron-sulfur Fe4S4 cluster provides the single electron to reductively cleave the SAM molecule, which results in formation of (a) L-methionine (green) and the reactive 5’-deoxyadenosyl radical (red) or (b) 5’-deoxy-5’-(methylthio)adenosine (orange) and the less common 3-amino-3-carboxypropyl radical (blue).

Figure 1.2.5. (a) Top and (b) side view of the structure of a full (β/α)8 TIM-barrel found in E. coli biotin synthase (BioB) with the iron-sulfur cluster and S-adenosylmethionine placed in the active site (PDB: 1R30). Note that BioB is a dimeric enzyme; only one subunit is shown here for clarity.
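To illustrate how the CX3CX2C signature described above can be used to flag candidate radical SAM enzymes, the following minimal Python sketch scans a one-letter amino acid sequence for the pattern Cys, any three residues, Cys, any two residues, Cys. The function name `find_cx3cx2c` and the toy sequence are hypothetical and introduced only for illustration; a match on this motif alone does not establish that a protein is a radical SAM enzyme.

```python
import re

# CX3CX2C: Cys, any three residues, Cys, any two residues, Cys.
# The lookahead lets overlapping occurrences be reported as well.
_CX3CX2C = re.compile(r"(?=(C.{3}C.{2}C))")


def find_cx3cx2c(sequence):
    """Return (1-based start position, matched segment) for each motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in _CX3CX2C.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any real protein.
    toy_sequence = "MKTLACGHRCLYCHNPDT"
    for position, segment in find_cx3cx2c(toy_sequence):
        print(position, segment)
```

In practice such a pattern search would only be a first filter, to be followed by structural or phylogenetic analysis of the hits.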

All SAM-dependent REs139 are redox-active and require a low-potential one-electron donor, ferredoxin or flavodoxin, to reduce SAM. Upon electron transfer from the 4Fe-4S cluster, S-C bond homolysis in SAM usually generates L-methionine (Met) and the 5′-dAdo•, as shown in Figure 1.2.4a,140-142 which initiates catalysis, generally by hydrogen atom abstraction from the substrate, from where the further mechanisms of radical SAM enzymes diverge substantially, as demonstrated earlier.143 The 5’-dAdo• is either irreversibly released as product or recycled to regenerate SAM to participate in the next enzymatic cycle.144 The majority of the SAM-dependent radical enzymes belong to the irreversible type of REs. Interestingly, different SAM cleavage pathways are possible.145 The common paradigm is that the formation of 5’-dAdo• and Met is crucial in all SAM enzymes. However, an alternative SAM cleavage pathway was recently suggested for the enzyme involved in diphthamide biosynthesis (Dph2, EC 2.5.1.108)146 and glycerol dehydratase activating enzyme (GDH-AE), where it was demonstrated that cleavage results in the formation of 5′-deoxy-5′-(methylthio)adenosine and the 3-amino-3-carboxypropyl radical and not 5′-dAdo•, in contrast to the previously proposed generalized catalysis pattern (see Figure 1.2.4b for details).120,121,147,148
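The two cleavage pathways of Figure 1.2.4, and the hydrogen abstraction that follows pathway (a), can be summarized schematically as below. This is a bookkeeping sketch only: MTA (5′-deoxy-5′-(methylthio)adenosine), ACP• (3-amino-3-carboxypropyl radical), and R-H (a generic substrate) are shorthand introduced here rather than notation from the text, and the positive charge of the SAM sulfonium is left implicit.

```latex
\begin{align*}
\text{(a)}\quad [\mathrm{4Fe\text{-}4S}]^{+} + \mathrm{SAM}
  &\longrightarrow [\mathrm{4Fe\text{-}4S}]^{2+} + \mathrm{Met} + 5'\text{-dAdo}^{\bullet} \\
\text{(b)}\quad [\mathrm{4Fe\text{-}4S}]^{+} + \mathrm{SAM}
  &\longrightarrow [\mathrm{4Fe\text{-}4S}]^{2+} + \mathrm{MTA} + \mathrm{ACP}^{\bullet} \\
5'\text{-dAdo}^{\bullet} + \mathrm{R{-}H}
  &\longrightarrow 5'\text{-dAdo} + \mathrm{R}^{\bullet}
\end{align*}
```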

Although the number of structurally characterized rSAM-dependent enzymes is rather small, owing to the sensitivity of their central iron-sulfur cluster towards oxygen and difficulties related to the crystallization of such enzymes, here we focus on a GRE-AE, namely pyruvate formate-lyase activating enzyme (PFL-AE), whose high-resolution crystal structure has been solved (see Figure 1.2.6). Being a prototypical AdoMet-dependent anaerobic RE, PFL-AE occurs as a monomeric protein (28 kDa), with an overall (β/α)6 partial TIM-barrel core domain, which contains one Fe4S4 cluster and the SAM cofactor in the active site and requires reduced flavodoxin as a co-substrate. Interestingly, PFL-AE utilizes only one Fe4S4 cluster, in contrast to several other radical SAM enzymes that possess multiple auxiliary iron-sulfur clusters. In the basic reaction catalyzed by PFL-AE, a hydrogen atom is abstracted from a glycine residue in the PFL protein substrate.

Figure 1.2.6. (a) Top and (b) side view of the crystal structure of the monomeric radical SAM enzyme pyruvate formate-lyase activating enzyme (PFL-AE) isolated from E. coli (PDB: 3CB8). The structure shows the TIM-barrel architecture of a typical partial (β/α)6 barrel. The usual positions of the Fe4S4 cluster and S-adenosylmethionine are also shown.

PFL-AE (PFL activase) is one of the first discovered and structurally well-characterized members of the radical SAM superfamily of metalloenzymes; it introduces a stable glycyl radical into the backbone of PFL, its corresponding GRE. Like all GRE activases, PFL-AE requires the AdoMet cofactor and the iron-sulfur cluster in the active site responsible for catalysis, as depicted in Figure 1.2.7. In this catalytic reaction, the oxidation of [4Fe-4S]+ to [4Fe-4S]2+ is followed by a reductive electron transfer to SAM, initiating the irreversible homolytic cleavage of the S-C bond in SAM, which usually generates Met and 5’-dAdo•. The resulting radical species in PFL-AE directly abstracts a stereospecific Cα-hydrogen atom of glycine from the bound PFL substrate, yielding a highly reactive glycyl radical which further activates cysteine for performing a C-C bond cleavage in pyruvate, as will be demonstrated in detail in the following sections.72

Figure 1.2.7. The usual coordination pattern of the Fe4S4 cluster in the active site, complexed by three cysteines and SAM, as found in radical SAM enzymes, for example in PFL-AE from E. coli (PDB: 3CB8).

Recently it has been demonstrated that the presence of an essential sodium ion in the active site of PFL-AE, as well as binding of the PFL loop, have critical roles in the stability of PFL-AE and a tremendous impact on the catalysis outcome.149 Interestingly, it has recently been reported that a newly appearing EPR signal could potentially be assigned to the formation of an organometallic intermediate (named Ω) in PFL-AE upon generation of an unusual metal-carbon (Fe-5’C) bond, which shows a close analogy to the central intermediate found in coenzyme B12-containing enzymes, as shown in Figure 1.2.1.119,150

However, the previously described coenzyme B12-dependent enzymes and the radical SAM superfamily, while both oxygen-sensitive and involved in generating the equivalent 5’-dAdo• species, apparently do not share a common evolutionary origin.151

1.3. Radical Enzymes Involving Protein Radicals – Glycyl Radical Enzymes

Organic carbon-centered radicals are extremely unstable and highly reactive in solution,152-154 but when introduced into the protein environment, their stability drastically increases.155,156 This is mostly due to the shielding effect of the structured protein layers surrounding the radical, which keep the reactive species deeply buried in the interior of the enzyme and protect it from potential quenchers, similar to the structural features found in the previously mentioned REs containing AdoCbl and AdoMet cofactors. Interestingly, C-centered radicals are found in nature not only in 5’-dAdo• but also with the unpaired electron residing on a glycine of the polypeptide chain.72,74 This class of anaerobic REs, which possess a radical positioned on the backbone Cα atom of a glycine residue,72 is known as glycyl radical enzymes. The stability of such radicals is often attributed to the captodative effect157,158 (a so-called push-pull mechanism in which the radical-bearing C atom is stabilized by electron-donating and electron-withdrawing groups) provided by the neighboring nitrogen and carbonyl of the peptide bonds, which also makes them less reactive than typical organic radicals.

GREs are a classical example of stable radical-containing proteins evolved to catalyze essential biochemical reactions in the central metabolism of strict and facultative anaerobes, such as bacteria and other microorganisms, by employing demanding radical chemistries in an anoxic environment.159-162 Recently, GREs have been reported to play key roles both in the human microbiota163-170 and in the environment,171,172 which makes them extremely attractive for biochemical and enzyme engineering, as well as for biotechnology and potential applications in medicine. These alternative microbial metabolic pathways take place only under strictly anaerobic conditions and are usually difficult to assess by standard experimental approaches because of the high sensitivity of these enzymes towards atmospheric oxygen.64,173-175

Figure 1.3.1. Crystal structures of dimeric GREs (a) CutC (PDB: 5FAU), (b) HPAD (PDB: 2YAJ), (c) B12-iGDH (PDB: 1R9D) and (d) BSS (PDB: 5BWE), with the C-terminal glycyl radical domain in yellow and glycine shown as yellow spheres. All protein structures show a common structural motif including a central 10-stranded β-barrel architecture surrounded by α-helices.

Figure 1.3.2. General overview of GRE activation and catalysis. (a) The common two C-terminal α-helices (pink) above ten β-strands surrounding the active site and two catalytic loops, the glycine loop (yellow) containing G734 and the cysteine loop (magenta) containing two catalytic cysteines C418 and C419, representing the active site architecture of PFL from E. coli (PDB: 1H16). Interestingly, PFL is the only known GRE that employs two cysteines for its catalysis, while other GREs contain only one catalytic cysteine. The surrounding α-helices and connecting loops are not shown. (b) The 5’-dAdo•, generated upon SAM cleavage, abstracts the hydrogen atom from the glycine of the GRE, yielding a glycyl radical (Gly•) and 5’-deoxyadenosine (5’-dAdo). (c) GRE-AE (blue) installs a Gly• on the usually homodimeric GRE in a reaction that requires a significant conformational change of the glycyl radical domain (yellow), as proposed by Backman et al. in ref. 187.

There are at least five major classes of GREs characterized to date, based on the analysis of the phylogenetic tree illustrating phylogenetically distinct GREs.176,177 These five well-established families are classified based on the type of biochemical transformation they perform and include GRE formate-lyases (pyruvate formate-lyase (PFL or pflB, EC 2.3.1.54)159 and ketobutyrate formate-lyase (TdcE, EC 2.3.1.54)),173 GRE 1,2-eliminases (choline trimethylamine-lyase (CutC, EC 4.3.99.4),162 B12-independent glycerol dehydratase (B12-iGDH, EC 4.2.1.30)147 and propane-1,2-diol dehydratase (PDH, EC 4.2.1.28)),178 GRE ribonucleotide reductases (class III RNRs, EC 1.1.98.6),160 GRE decarboxylases (4-hydroxyphenylacetate decarboxylase (HPAD, EC 4.1.1.83))164 and X-succinate synthases (XSSs, where X = aromatic or aliphatic hydrocarbon compound).161 Recently, a new distinct type of GRE that enables hydrogen sulfide (H2S) production by the human intestinal bacterium Bilophila wadsworthia has been discovered to operate in the metabolism of taurine. This critical enzyme is isethionate sulfite-lyase (IslA),179 which catalyzes the conversion of isethionate (2-hydroxyethanesulfonate) to sulfite and acetaldehyde. This discovery could improve the understanding of H2S production in humans and offers a new perspective for controlling intestinal H2S production and bacterial infections.180

Most GREs typically appear as homodimers, comprised of two identical subunits, with a size of about 80 kDa per subunit. Although their primary sequences differ significantly, most GREs are structurally homologous, according to the available crystal structures (see Figure 1.3.1a-d for structures of four characteristic GREs, namely the eliminases CutC and B12-iGDH, the decarboxylase HPAD and the synthase BSS (EC 4.1.99.11)). Like AdoCbl and AdoMet radical enzymes, GREs possess a barrel-like architecture of the active site to support radical catalysis. However, the structure of GREs differs slightly from the two previously described types of Rossmann-fold TIM-barrel arrangement. A common motif in GREs is a 10-stranded β-barrel core (comprised of two five-stranded half barrels arranged antiparallel to each other) surrounded on the outside by α-helices in each monomer, with two loops extending into the center of the barrel. One of the loops contains a conserved glycine in a predominantly hydrophobic sequence (for example, the consensus sequence for PFL appears to be RVSGYAV, with minor variations among other GREs) close to the C-terminus, while the other carries the conserved cysteine (one or two) that directly participates in the catalysis (see Figure 1.3.2a for details on the typical active site architecture in PFL).74,181

All GREs require the presence of the corresponding activating enzyme (GRE-AE) for their activation. Namely, in GREs, a conserved glycine near the C-terminus of one subunit of a large dimeric apoenzyme is irreversibly converted to a radical through stereospecific hydrogen atom abstraction108,110,182,183 by the 5′-dAdo• derived from SAM, a reaction performed by the radical AdoMet-dependent activases, as described earlier.72,74 Therefore, all GREs share a functional commonality: a post-translationally installed, stable glycyl radical on the protein backbone that is crucial for radical-based catalysis (see Figure 1.3.2b-c). For the GRE-AE to perform the initial hydrogen abstraction, drastic conformational changes are necessary to expose the glycine residue for activation (illustrated in Figure 1.3.2c). Namely, the GRE and its activase have to achieve favorable protein-protein interactions (PPIs) upon binding of the glycine-containing loop near the active site of the activase. It has been demonstrated that the presence of the activase triggers these crucial conformational changes; however, the exact mechanism of the transition between the closed and open forms of the GRE remains elusive.

Figure 1.3.3. Classification of characterized GRE reactions. GRE formate-lyases (pyruvate formate-lyase (PFL) and ketobutyrate formate-lyase (TdcE)), GRE 1,2-eliminases (choline trimethylamine-lyase (CutC), isethionate sulfite-lyase (IslA), glycerol dehydratase (GDH) and propane-1,2-diol dehydratase (PDH)), GRE ribonucleotide reductase (class III RNRs), GRE decarboxylase (4-hydroxyphenylacetate decarboxylase (HPAD)) and X-succinate synthases (XSSs, where X = toluene in the case of benzylsuccinate synthase (BSS)).

Although the number of characterized GREs is small, they cover a large variety of chemistries and are involved in numerous metabolic pathways, both in host-associated microbiota and in the environment. In particular, the transformations within the activated GREs cover a large scope of useful fundamental chemical steps, including C-C bond cleavage in PFL and its homolog TdcE,173,184,185 C-N bond cleavage catalyzed by CutC,170 the decarboxylation (C-C bond-breaking) activity of HPAD, the reductase activity of anaerobic class III RNRs,160 the dehydratase activity of B12-iGDH and of trans-4-hydroxy-L-proline (Hyp) dehydratase (PflD or HypD, EC 4.2.1.172) by C-O bond cleavage, a novel radical-based S-C bond cleavage reaction by IslA, and C-C bond-forming reactions in anaerobic hydrocarbon metabolism and degradation of toluene and its derivatives, including methylnaphthyl derivatives, by BSS and (1-methylpentyl)succinate synthase (MSS, EC 4.1.99).62,64,161 The overall reactions catalyzed by representative GRE members are shown in Figure 1.3.3.

The exposure of the glycyl radical to the solvent or dioxygen can be fatal, since the enzyme is inactivated by quenching of the radical and breaking of the protein backbone. However, some facultative anaerobes have reactivases, small proteins which structurally resemble the C-terminal part of a GRE; upon switching from aerobic to anaerobic metabolism, these proteins can reactivate the central protagonist in the metabolism and recover its function. The first member of the GRE class to be characterized as a radical enzyme was PFL, which is produced in E. coli only under anoxic conditions. As discussed earlier, the glycyl radical in PFL is highly sensitive to molecular oxygen, which permanently inactivates the enzyme by cleavage of the polypeptide chain. Hence, these enzymes can occur only in anaerobically thriving microorganisms. For example, transferring E. coli cells containing active PFL from anoxic to oxic conditions irreversibly inactivates the enzyme by cleavage at the glycyl radical unless the radical is reduced. Recently, a small reactivating protein (YfiD) was discovered that could rescue PFL cleaved by exposure to oxygen.186 Interestingly, the C-terminal half of this protein contains a sequence that has a high similarity to the cleaved C-terminal region of PFL bearing the glycyl radical. Therefore, these proteins can replace the damaged portion of PFL, which in turn enables the enzyme to be activated again once favorable anoxic conditions are restored.187

Recently, it has been confirmed that GREs are one of the most prominent enzyme families in the human gut microbiome,164 indicating that their activity in the human gut could be directly connected to several host diseases such as colorectal cancer and inflammatory bowel disease. Interestingly, the analysis of the distribution of prominent enzymes in the human gut microbiome revealed that PFL is not only involved in anaerobic sugar metabolism of E. coli, but is also present in high abundance in the vast majority (more than 96 %) of examined human gut metagenomes.188 Recent progress in discovering new GREs which operate in the human gut microbiota could be critical for understanding their structure-function relationship at a molecular level and could help in modulating their activity, which could, in turn, support the search for suitable treatments and pharmaceuticals for those diseases.189,190

1.4. Pyruvate Formate-Lyase

Since the catalytic mechanism operative in pyruvate formate-lyase (PFL) is the central topic of this thesis, this introductory part aims to provide some basic insights into the already known aspects of this enzyme and to present the open questions regarding PFL catalysis. PFL belongs to the formate-lyase class of GREs and to the wider type of anaerobic formate C-acetyltransferases, and it is one of the best-characterized GREs to date. Although PFL is a prototypical GRE,173 many fundamental aspects of its catalysis are still not well understood. PFL catalyzes the key step in anaerobic primary glucose metabolism of E. coli and other microbes,191 where it performs the reversible conversion of pyruvate and coenzyme A (CoA) into formate and acetyl-CoA.192 The overall reaction catalyzed by PFL, initially shown in Figure 1.3.3 of the previous section, where an overview of reactions catalyzed by GREs was given, can be generally described by two half-reactions with two different substrates, namely pyruvate and CoA, as will be demonstrated in detail in the following sections.193
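The two half-reactions and their sum can be written schematically as follows. In this sketch E denotes the activated enzyme and acetyl-E an acetylated enzyme intermediate; both symbols are shorthand introduced here, and the radical character of the enzyme is not shown.

```latex
\begin{align*}
\text{E} + \text{pyruvate} &\rightleftharpoons \text{acetyl-E} + \text{formate}
  && \text{(first half-reaction)} \\
\text{acetyl-E} + \text{CoA} &\rightleftharpoons \text{E} + \text{acetyl-CoA}
  && \text{(second half-reaction)} \\
\text{pyruvate} + \text{CoA} &\rightleftharpoons \text{acetyl-CoA} + \text{formate}
  && \text{(overall)}
\end{align*}
```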

Figure 1.4.1. Crystal structure of (a) E. coli PFL homodimer with glycine shown as spheres and glycyl radical domain depicted as yellow ribbons (PDB: 1H16). (b) Architecture of PFL active site set up for the first half-reaction with G734 close to the cysteine loop containing two catalytic cysteines C418 and C419 near pyruvate. CoA is found at the protein surface in rare syn-conformation, 20-30 Å away from the active site.

The determined X-ray structure of PFL from E. coli (shown in Figure 1.4.1)194-196 reveals that PFL is a homodimeric enzyme comprised of two identical subunits (85 kDa, 759 residues per monomeric subunit). The regions of particular interest, namely the active site, contain the two catalytic cysteine residues at positions 418 and 419 (Cys loop) and a conserved glycine 734 in the Gly radical domain, buried within the 10-stranded β/α barrel motif in the predominantly hydrophobic protein interior and surrounded by α-helices near the C-terminal region of the protein (see Figures 1.3.2a and 1.4.1).58,197 Besides the three essential amino acids that perform catalysis (G734, C418, and C419), the active site holds two positively charged residues (R435 and R176) that bind the negatively charged substrate pyruvate close by, and two hydrophobic residues (W333 and F432).198,199

Although the glycyl radical in PFL exhibits extraordinarily high stability when exposed to the aerobic environment for more than 24 hours at room temperature, it was previously confirmed that the enzyme becomes ultimately inactivated in the presence of oxygen by readily interconverting between active and inactive forms.175 The signals from the EPR experiments indicate that PFL stores the radical on the Cα atom of a glycine in the protein backbone, where only one subunit becomes activated per dimer (half-site reactivity).72,177,200 The activation process occurs under strictly anaerobic conditions and is catalyzed by the rSAM-dependent PFL activase (PFL-AE),201,202 which post-translationally installs a glycyl radical on the PFL backbone, as described in the previous section where the general GRE activation pattern was discussed in detail.203 Pioneering studies on the activation of PFL have revealed that a complex activation cocktail is required, including PFL, PFL-AE, pyruvate, SAM, and flavodoxin. This activation results in the cleavage of SAM to 5’-dAdo• and Met and a subsequent hydrogen abstraction from the inactive PFL to produce the glycyl radical.

The proposed mechanism for the PFL-mediated catalytic reaction indicates the formation of an acyl-enzyme intermediate.74,204 This is consistent with the observation of ping-pong kinetics (reaction of the enzyme with the first reactant changes the state of the enzyme, which then binds the second reactant) and the CoA-independent pyruvate-formate carboxylate exchange. The putative mechanism for the PFL-catalyzed reaction has been proposed to operate by the generation of a thiyl radical at an active site cysteine and is thought to proceed in several steps (see Figure 1.4.2).205 The first step in the mechanism is hydrogen abstraction from the S-H group of the nearest cysteine, C419 of the Cys loop, by the glycyl radical G734• (see Figure 1.4.2a). Since G734 is adjacent to two conserved cysteines in the active site, namely C418 and C419, the following step could be the H-transfer from C418 to C419, which generates the C418• thiyl radical (see Figure 1.4.2b). The next step is the addition of this thiyl radical to the carbonyl C-atom of pyruvate, initially bound in the active site via hydrogen bonding with two conserved arginine residues 176 and 435, which creates a tetrahedral oxyradical intermediate (see Figure 1.4.2c-d). This step is proposed to be feasible based on the available crystal structure, which indicates favorable positioning of the sulfur atom of C418 above the carbonyl group in pyruvate (see Figure 1.4.1b). The collapse of the tetrahedral radical intermediate (cleavage of the C-C bond) yields the enzyme acetylated at C418 and the formate radical anion (carbon dioxide radical anion), which is then quenched to formate by C419, reforming a thiyl radical on C419 (see Figure 1.4.2d-f).

Figure 1.4.2. Proposed PFL mechanism. This mechanism was initially proposed by Becker and Kabsch in 2002 (ref. 195) and revised by Backman et al. in ref. 187.
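For reference, the ping-pong kinetics mentioned above correspond to the textbook steady-state rate law for a ping-pong bi-bi mechanism, given here in its initial-rate form with no products present. A and B stand for pyruvate and CoA, and the symbols V_max, K_m^A and K_m^B are generic kinetic parameters introduced for illustration; no values for PFL are implied.

```latex
\begin{align*}
v &= \frac{V_{\max}\,[A][B]}{K_m^{B}[A] + K_m^{A}[B] + [A][B]} \\[4pt]
\frac{1}{v} &= \frac{K_m^{A}}{V_{\max}}\,\frac{1}{[A]}
  + \frac{1}{V_{\max}}\left(1 + \frac{K_m^{B}}{[B]}\right)
\end{align*}
```

Because the slope of 1/v against 1/[A] does not depend on [B], double-reciprocal plots recorded at different fixed CoA concentrations give parallel lines, which is the usual diagnostic for this kind of mechanism.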

The steps described above relate only to the first half-reaction of PFL with the first substrate, pyruvate, and are proposed based on the available X-ray structure and biochemical experimental data. Even though experiments imply the plausibility of this proposed reaction mechanism, neither the formate binding poses nor the covalent intermediates have been structurally confirmed to date.

The final step involves acetyl transfer from C418 to CoA in the second half-reaction and product release (see Figure 1.4.2g-i). Although a structure of PFL with CoA and pyruvate has been solved, in this structure CoA is not in its catalytic position in the active site. Instead, it is found on the protein surface, in a disengaged syn-conformation, with the thiol group extending away from the active site (see Figure 1.4.1b for details on the CoA binding mode). It has been proposed that this binding pose of CoA is consistent with the experimental findings showing that CoA is not mandatory for pyruvate cleavage in the first half-reaction; this crystal structure is therefore believed to mimic the positions of both substrates before the first half-reaction. On this basis, it has been proposed that certain structural changes of the protein, as well as of CoA, are crucial for the thiol S-H group of CoA to reach the active site and allow the transfer of the acetyl group attached to C418 to form acetyl-CoA. Namely, for the second half-reaction to take place, the coenzyme molecule must change its conformation from syn to anti, and the thiol group of CoA must enter the active site of the enzyme to retrieve the protein-attached acyl group. However, CoA is initially bound at the protein surface, while the active site is buried in the protein’s interior, 30 Å away (see Figure 1.4.1b). Even though this mechanistic proposal seems reasonable, the evidence for the proposed conformational changes remains elusive (see Figure 1.4.2f-g).

In this thesis we make an effort to resolve these open questions regarding the mechanism of PFL utilizing a comprehensive computational study, encompassing both standard and enhanced molecular dynamics (MD) simulation approaches, as well as multilevel quantum mechanics/molecular mechanics (QM/MM) calculations. We investigate the elusive role of CoA in the second half-reaction to deliver key insight into the understanding of fundamental structural and functional aspects responsible for the activity and the underlying catalytic mechanism in PFL.

Chapter 2: Computational Methods for Modeling Enzyme Catalysis

2.1. Introduction

As detailed in the previous chapter, enzymes are structurally complex macromolecular catalysts involved in all biological processes vital for life.1-7 In these reactions, enzymes efficiently convert reactants to products, which are then used in the metabolic pathways of living organisms.8,9 The chemical reactions catalyzed by enzymes are typically localized in a small conserved region known as the active site.10-12 Active sites have distinctive shapes and electronic properties that enable enzymes to recognize and position specific substrates.13 This supports the required transformation either by directly stabilizing the transition states formed within the active site or by completely changing the reaction mechanism.14,15 In cases where the transformation proceeds via transition states, enzymes provide a strictly organized environment that stabilizes them.16,17 In essence, enzymes operate by lowering the activation energy, that is, the amount of energy required for the reaction to occur, without affecting the outcome of the overall reaction.18 In biochemistry and molecular biology, understanding the fundamentals of enzyme catalysis is extremely important for elucidating the structure-function relationships related to enzyme evolution.19-26

At the same time, enzymes are dynamic entities that can adopt a large variety of conformations, and in many cases their catalytic activity is intrinsically linked to conformational changes.27 The number of different conformational states is defined by the free energy landscape of the enzyme. The free energy landscape also defines the rates of conformational changes and might therefore indirectly determine the rates of the various steps of the catalytic reaction.28-30 A variety of factors, such as the evolutionary origin of the enzyme, mutations, allosteric effects, external forces, and the microenvironment, can alter the energy landscape, thereby directly influencing enzymatic activity.31-33 Indeed, enzymes are often found to undergo structural changes that allow for the binding of the substrate and the release of the product, thus enabling the essential chemical transformations to take place in the active site.34,35 In particular, transitions between the various states and available conformational ensembles are characterized by both the free energy landscape of the enzyme and the details of its catalytic cycle.36-41 However, the identification and characterization of specific conformational changes that are coupled with enzyme catalysis, or protein function in general, can be rather challenging and usually requires various experimental and theoretical tools.25,26,42-48
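As a small illustration of how a free energy landscape determines which conformational states are populated, the sketch below converts relative free energies into equilibrium Boltzmann populations. The three states and their free energies are hypothetical values chosen only for illustration; they are not results for PFL or any other enzyme.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(free_energies_kcal, temperature_k=300.0):
    """Equilibrium populations of states with the given relative free energies."""
    beta = 1.0 / (R_KCAL * temperature_k)
    g_min = min(free_energies_kcal)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (g - g_min)) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical conformational states (relative free energies in kcal/mol).
states = {"closed": 0.0, "partially open": 1.0, "open": 3.0}
populations = boltzmann_populations(list(states.values()))
for name, population in zip(states, populations):
    print(f"{name}: {population:.3f}")
```

Under these assumptions, a state lying only a few kcal/mol above the minimum is already sparsely populated at room temperature, which is one reason rare but functionally relevant conformations can be difficult to observe.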

As explained in the previous chapter, most of the enzymes we know today catalyze simple acid-base reactions,49,50 while others, for example radical enzymes (REs),51-54 use highly reactive radicals to activate mostly non-reactive compounds so that they can participate in more complex processes. To date, many enzymes that catalyze a wide scope of chemical reactions have been identified to operate via radical-mediated mechanisms, and they are found in the metabolic pathways of numerous biological systems as well as in the environment.55-57 The ability of REs to catalyze chemically demanding and versatile reactions is of interest for biochemical and enzyme engineering as well as for industrial applications and biotechnology, as demonstrated in the previous chapter.58 There we showed that, while the types of radical-catalyzed reactions are diverse, the principle of their action is simple. Before we sketch the general features of their energetics, it should be noted that these reactions are usually thermodynamically feasible yet kinetically extremely slow due to the high activation barrier, as presented in the simplified hypothetical potential energy profile shown in Figure 2.1.1a.54

In those reactions, the substrate-derived radicals S• are usually unstable and high in energy. However, when a catalytic radical R• is introduced into the reaction, it is stabilized by abstracting a hydrogen atom from the substrate SH, generating RH and S• (see Figure 2.1.1b). In this way, SH is activated through hydrogen abstraction, which significantly reduces the barrier height for the transition state S‡. After the rearrangement of the substrate-derived radical to a product-derived radical (P•), the latter re-abstracts a hydrogen atom from RH to generate the final product PH and regenerate the catalytic radical R•. It has therefore been proposed that REs accelerate reactions by dividing a single transition state with a high barrier into several transition states with lower barriers. In other words, the mechanism of radical-based catalysis is comparable to covalent catalysis and is effective even for reactions with extremely high activation energies (see Figure 2.1.1b).
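The kinetic consequence of splitting one high barrier into several lower ones can be illustrated with a minimal numerical sketch based on the Eyring equation of transition state theory, k = (kB·T/h)·exp(−ΔG‡/RT). The barrier heights below are hypothetical (as in Figure 2.1.1, they are arbitrary), and approximating the multistep rate by its slowest step is a simplifying assumption that ignores the relative energies of the intermediates.

```python
import math

KB = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R_KCAL = 1.987204e-3 # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal, temperature_k=298.15):
    """Transition state theory rate constant (1/s) for a free energy barrier."""
    prefactor = KB * temperature_k / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature_k))


# Hypothetical barriers in kcal/mol (illustrative values only).
single_step_barrier = 35.0             # direct conversion of SH to PH
radical_steps = [15.0, 12.0, 14.0]     # H abstraction, rearrangement, re-abstraction

k_single = eyring_rate(single_step_barrier)
# Crude assumption: the slowest elementary step limits the overall rate.
k_multistep = min(eyring_rate(b) for b in radical_steps)
speedup = k_multistep / k_single

print(f"single step: {k_single:.3e} 1/s")
print(f"slowest radical step: {k_multistep:.3e} 1/s")
print(f"estimated acceleration: {speedup:.3e}")
```

Because the rate depends exponentially on the barrier, even this crude estimate shows that lowering the highest barrier by some 20 kcal/mol corresponds to an acceleration of many orders of magnitude at room temperature.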

However, many fundamental aspects of radical-based catalysis remain elusive despite efforts to understand the function of REs at the molecular level, as the reactions catalyzed by REs involve unstable intermediates that are difficult to study experimentally.59-62 Modern biochemical experimental techniques for studying REs include, for example, X-ray crystallography,63-65 electron paramagnetic resonance (EPR) spectroscopy,66 and site-directed mutagenesis. Although radical enzymes can be studied experimentally, such experiments are often challenging and require strictly controlled conditions because the majority of REs are highly sensitive to atmospheric oxygen, as stated earlier.67,68 Recently, tremendous efforts have been made to discover new REs, whereby novel sequencing technologies, homology69-71 and structural-chemical modeling, and bioinformatics analyses are used to understand their chemistry and biology as well as their impact on the environment and human health.72

In this regard, over the past couple of decades, computer-based modeling73-76 has proven to be of critical importance for exploring the fundamental mechanisms operative in various biochemical processes, and it has rapidly developed into an invaluable tool that complements experimental techniques.77 From the basic investigation of small and rather simple molecular systems up to the study of large and complex biomolecular machinery such as proteins, computational approaches have been widely used in many areas of investigation, especially in fundamental scientific research.78 These methodologies have proven to be extremely useful in pharmaceutical studies, enabling the discovery of novel potential therapeutics for numerous diseases. Furthermore, developments in materials science have been strongly influenced by computational methods, as the properties of new materials and interfaces can be both better characterized and improved.79 Nowadays, various molecular features can be accurately predicted by computational modeling, and its further development can provide a wide scope of applicability and support to basic research in chemical biology, biophysics, and the life sciences.80

Figure 2.1.1. Hypothetical potential energy diagram of a radical enzyme-catalyzed reaction (a) before and (b) after the addition of a catalytic radical. SH is the substrate, PH is the product, S‡ is the transition state, R• is the catalytic radical, and ΔEa is the activation energy barrier for the conversion of substrate into product. Barrier heights are arbitrary. This scenario was originally proposed by Shibata and Toraya in ref. 54.

Returning to the problem of interest, namely enzyme catalysis, it is usually exceedingly demanding, and sometimes even impossible, to investigate all steps of a mechanism experimentally or to characterize the energetics and structures of the transition states and intermediates that occur during the catalytic cycle. Over the past decades, modern computational methods, particularly those based on molecular dynamics (MD) simulations, have matured into an indispensable tool for studying protein structure-function relationships at the molecular level, especially for systems that are difficult to access experimentally, such as REs.81 MD simulations provide detailed insight into enzymatic catalysis and can serve to predict and explain the behavior of biologically relevant systems at the atomistic level.82 Furthermore, by combining quantum and classical descriptions in a multilevel approach, as in hybrid quantum mechanics/molecular mechanics (QM/MM) methods, a more accurate description of various aspects of enzymatic catalysis can be obtained.83,84 In this respect, MD and QM/MM are complementary: while MD simulations provide valuable insights into the conformational motions and flexibility of the enzyme, QM/MM calculations provide detailed information about the energetics and the optimal path of the reaction mechanism.85-87

Hence, the first results obtained in this work (Chapter 3) are based on extensive unrestrained MD simulations, together with advanced statistical analysis tools, which were employed to study the effects of chemical change at the active site on protein dynamics, using both dimeric and monomeric PFL models as a case study.88 This study aimed to find and characterize in detail a potential entry channel for the coenzyme, which could provide insights into the connection between the catalytic cycle and structural changes in PFL. The second major result of this thesis (Chapter 4) is based on various enhanced sampling techniques, which we used to simulate the coenzyme’s entry and its approach into the active site of non-acetylated and acetylated monomeric PFL models, in order to obtain reliable energetics for this process and to establish chemical interpretations of the important contacts between the coenzyme and the active site. In the last part of this thesis (Chapter 5), a set of comprehensive QM/MM calculations was carried out to systematically investigate the reaction mechanisms of both half-reactions in PFL-mediated catalysis, with an emphasis on the second half-reaction. These calculations start from the previously characterized, potentially reactive bound states in which the coenzyme occupies the active site of the monomeric PFL model systems.

2.2. Molecular Dynamics Simulations

Enzymes adopt a series of complex three-dimensional structures by constantly interconverting between numerous conformations in solution, which usually markedly affects their affinity for substrates as well as their activity in general.89-92 Understanding how this structural flexibility and dynamics influence enzymatic catalysis is crucial for enabling enzymes to perform novel functions and for opening alternative pathways through targeted mutations or external mechanical manipulation.21,93,94 Because enzymes are dynamic and exist in a plethora of conformations in their physiological environment, isolating the conformations that are important for catalytic activity can be rather challenging.95-97 Computer-based tools allow complex biological systems, such as enzymes, to be investigated at an atomistic scale, providing deep insights into their conformational space and the underlying free energy landscape.98 It is well established that the native conformations are predominant in performing basic catalytic transformations and in binding natural substrates and supporting their conversion into products. However, certain conformational changes are sometimes critical before substrates can bind, or are necessary for the activation of the enzyme. Understanding these structural features at a molecular level can therefore help in understanding the general functional roles of enzymes in central metabolic processes, as well as in the environment.99,100

Computational modeling based on molecular mechanics (MM) is widely used to study enzymatic systems, even though classical MM methods cannot model chemical reactions because they lack an explicit description of electrons. Nevertheless, MM typically serves as the basis both for MD simulations and, in combination with quantum mechanical (QM) methods, for hybrid QM/MM calculations that model chemical reactions.101,102 The main advantage of MM methods is their applicability to relatively large protein systems (hundreds of thousands to millions of atoms),103 thus providing invaluable insight into the dynamics of proteins on longer time-scales (usually hundreds of nanoseconds to a couple of microseconds).104-106 This is possible because of the simple energy formulations used in MM energy functions (force fields).107 The bonded part includes harmonic terms representing the energy of bond stretching and angle bending, while simple periodic terms describe proper torsional and improper (out-of-plane bending) dihedral angles. Non-bonded van der Waals interactions are described by a simple Lennard-Jones potential, while electrostatic interactions are usually treated with fixed partial atomic point charges interacting via the Coulomb force. On the one hand, the atomic point charges usually do not adjust in response to changes in the molecular environment or conformation, which means that electronic polarization effects are not included explicitly in the standard force field formalism. On the other hand, atomic point charges are critical for the quality of MM results. They are obtained either directly from experiment or by fitting to electrostatic potential (ESP) values computed with a QM method (ab initio or density functional theory, DFT); the restrained electrostatic potential (RESP)108,109 procedure is used in the fit to avoid over-polarization of the bonds, which may lead to artifacts in the simulation. Typical MM force fields for biological macromolecules (e.g. AMBER)110,111 represent all atoms in a protein explicitly (all-atom or atomistic force fields). The simple MM potential energy function (Hamiltonian) is given in Equation 1.

NN−1 22 Aij B ij q i q j Etotal= k b( r − r00) + k ( − ) + V n 1 + cos( n  −  ) + − + (1)      12 6  R bonds angles dihedrals i=11 j = i + RRij ij ij

The first term in Eq. 1 describes the energy between covalently bonded atoms; this harmonic potential is a good approximation near the equilibrium bond length. The second term represents the energy of angle bending, while the third term represents the torsional energy for twisting about a bond, which depends on the bond order (e.g., single or double bonds). The fourth term represents the non-bonded energy between atom pairs, which can be separated into van der Waals and electrostatic contributions. As already mentioned, classical MM potentials can model neither bond formation nor bond dissociation, which are elementary processes in all chemical reactions: for example, the stiff harmonic spring used to describe chemical bonds does not allow them to break. Also, MM force field parameters are developed based on the properties of stable molecules and usually do not apply to unstable intermediates and transition states. Nevertheless, it is possible to develop MM functions and parameters specifically to study reactions, namely by using the empirical valence bond (EVB) approach.112-114 However, the generated parameters apply only to a specific reaction type, meaning that reparametrization is necessary for each individual problem studied. As mentioned above, standard protein MM force fields include electronic polarization only implicitly, through an average field of atomic point charges. Improved, state-of-the-art polarizable force fields for biological molecules are currently under extensive development, and one may expect that including explicit polarization terms in simulations of biological macromolecules will become common practice in the future.115
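As a rough illustration of the individual terms in Eq. 1, the following Python sketch evaluates each contribution for a single bond, angle, dihedral, and atom pair. It is a minimal sketch rather than the implementation of any particular MD package: the function names and all parameter values are hypothetical, and the Coulomb prefactor is assumed to be absorbed into the charges, so the units are arbitrary.

```python
import math


def bond_energy(k_b, r, r0):
    """Harmonic bond stretching, k_b * (r - r0)**2."""
    return k_b * (r - r0) ** 2


def angle_energy(k_theta, theta, theta0):
    """Harmonic angle bending, k_theta * (theta - theta0)**2 (angles in radians)."""
    return k_theta * (theta - theta0) ** 2


def dihedral_energy(v_n, n, phi, gamma):
    """Periodic torsion term, v_n * (1 + cos(n * phi - gamma))."""
    return v_n * (1.0 + math.cos(n * phi - gamma))


def nonbonded_energy(a_ij, b_ij, q_i, q_j, r_ij, eps=1.0):
    """Lennard-Jones (12-6) plus Coulomb energy of one atom pair."""
    lennard_jones = a_ij / r_ij ** 12 - b_ij / r_ij ** 6
    coulomb = q_i * q_j / (eps * r_ij)
    return lennard_jones + coulomb


# Hypothetical parameters for one bond (arbitrary units).
K_B, R0 = 300.0, 1.5
stretched = [bond_energy(K_B, r, R0) for r in (1.5, 2.0, 3.0, 5.0)]
```

The values in `stretched` keep growing with the bond length, which is the point made above: a harmonic bond term never levels off at a dissociation limit, so the bond cannot break.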

Molecular dynamics simulations are a key theoretical tool for investigating enzyme dynamics, characterizing time-dependent internal motions and conformational changes, and generating structures for modeling chemical reactions, including QM/MM calculations of energy profiles for enzyme-catalyzed reactions.116,117 MD simulations are typically performed with MM methods as described earlier.118 It is increasingly common to use MD snapshots as initial structures for QM/MM calculations, rather than using the artificially restrained X-ray structure directly.119 This approach was employed in this thesis (Chapter 5), where the overall reaction mechanism, including both half-reactions of PFL catalysis, was studied in detail.

In Chapter 3 of this thesis, a standard MD setup procedure is employed. The whole enzymatic system is first neutralized by adding counter ions and embedded in a box of explicit water molecules. It is then relaxed and simulated under periodic boundary conditions (PBC), which approximate an infinite system in order to emulate experimental conditions, together with the Particle-Mesh Ewald (PME) method, which is routinely employed for computing long-range interactions (e.g. electrostatic interactions) in periodic systems. In MD simulations, Newton’s equations of motion are used to propagate the system of interest in time, which is usually followed by extensive analysis of the obtained statistics.120
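To make the propagation step concrete, the sketch below integrates Newton’s equations of motion for a single particle in one dimension with the velocity Verlet scheme. Velocity Verlet is used here only as a common textbook integrator; it is an assumption of this example, not a statement about the integrator used in the simulations of this thesis, and the harmonic force, mass, and time step are hypothetical values in arbitrary units.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate position x and velocity v for n_steps of length dt."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Restoring force of a harmonic spring centred at x = 0."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v ** 2 + 0.5 * k * x ** 2


# Hypothetical test system: unit mass, unit spring constant, x(0) = 1, v(0) = 0.
trajectory = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
energies = [total_energy(x, v) for x, v in trajectory]
```

For a sufficiently small time step the values in `energies` stay close to the initial total energy, which is a common sanity check on an MD integrator before any statistics are collected from the trajectory.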

Even though standard MD simulations may seem to be a “perfect” tool for studying conformational changes in proteins and the behavior of biological macromolecules in their natural aqueous environment on time-scales of a few hundred nanoseconds, the technique has several limitations beyond the underlying issues already mentioned (the inability to explicitly model chemical reactions and the rearrangement of electrons). First, in large molecules such as enzymes, certain processes and specific conformational changes occur on longer time-scales than those accessible through common MD routines. Such slow processes, including large domain motions, protein association, and substrate binding to and unbinding from a protein, typically occur in the μs to ms range.121 In particular, slow time-scales are associated with processes that require the system to overcome a high energy barrier, corresponding to sparsely populated states in the conformational energy landscape; the system can then become trapped in a local minimum, leaving standard MD unable to adequately explore the conformational space. Secondly, typical enzymes, with sizes of tens of thousands of atoms, are extremely complex and have numerous conformations that are almost impossible to sample properly within a single classical MD run. As one plausible solution, it has recently been proposed that multiple MD runs starting from different initial conditions should be performed rather than a single long run. Thirdly, common processes such as substrate binding or protein association and assembly are practically impossible to study with standard MD techniques, since binding events are observed only very rarely during a classical simulation, which again raises the time-scale problem. Numerous important processes, like chemical reactions, substrate binding and unbinding, or protein folding, are characterized by different metastable states that interconvert extremely slowly. This all stems from the presence of high energy barriers on the free energy landscape. Several alternative techniques have been developed in recent years to overcome the time limitation imposed by classical simulations while still retaining the information of all-atom simulations. MD simulations of such processes, being limited in time-scale, require enhanced sampling techniques to explore them efficiently.46,122
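The link between barrier height and time-scale can be made semi-quantitative with the Eyring equation of transition state theory, k = (k_B T / h) exp(-ΔG‡ / RT). The following sketch is an order-of-magnitude estimate only, assuming a transmission coefficient of one; the function names are illustrative and the barrier heights are hypothetical values that are not taken from this work.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
PLANCK = 6.62607015e-34       # J s
GAS_CONSTANT = 1.987204e-3    # kcal/(mol K)


def eyring_rate(barrier_kcal_mol, temperature=300.0):
    """Rate constant in 1/s for crossing a free energy barrier (transmission coefficient 1)."""
    prefactor = K_BOLTZMANN * temperature / PLANCK
    return prefactor * math.exp(-barrier_kcal_mol / (GAS_CONSTANT * temperature))


def mean_waiting_time(barrier_kcal_mol, temperature=300.0):
    """Mean time in seconds between barrier crossings, 1 / k."""
    return 1.0 / eyring_rate(barrier_kcal_mol, temperature)


# Hypothetical barrier heights in kcal/mol at 300 K.
waiting_times = {barrier: mean_waiting_time(barrier) for barrier in (5.0, 10.0, 15.0)}
```

Under these assumptions, barriers of 5, 10, and 15 kcal/mol correspond to mean waiting times of the order of a nanosecond, microseconds, and milliseconds, respectively, so a process with a barrier in the upper part of this range is expected to occur rarely, if at all, within a few hundred nanoseconds of unbiased simulation.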

Enhanced sampling techniques apply bias potentials or restraints to the original potential of the system during MD simulations to extend the accessible time-scale and improve the sampling of difficult-to-reach microstates, so that slow processes, such as substrate binding or unbinding and large-domain conformational changes, occur significantly more frequently in the simulation. These methods add a bias force or potential to the system to increase the rate of escape from local minima and to help in overcoming energy barriers, which accelerates conformational sampling. They were developed primarily to investigate either protein folding or the binding and unbinding of ligands. Enhanced sampling techniques can be divided into methods that require collective variables (CVs) to introduce the bias and methods that do not. The CV-free enhanced sampling approaches include methods such as replica exchange molecular dynamics (REMD)123,124 and accelerated molecular dynamics (aMD) simulations.125,126 In contrast, the use of CVs is based on the idea that a complex process can be described by one reaction coordinate (RC) or a combination of several. These RCs are called collective variables since it is assumed that they can summarize the behavior of the entire system. After a careful choice of CVs, the bias potential is applied to these coordinates during the simulation, thus enhancing the sampling along the predefined CVs. The conformational space is projected onto the selected CVs, which reduces the dimensionality of the free energy surface. Several CV-dependent methods were used in Chapter 4 of this thesis, namely steered molecular dynamics (SMD)127-135 and umbrella sampling (US),41,136-139 to investigate the entry of the coenzyme into the active site of PFL. Examples of alternative methods based on the same underlying principle are random accelerated molecular dynamics (RAMD)140,141 and metadynamics (mD).142-144

2.3. QM/MM Calculations

Modeling a chemical reaction requires a formalism capable of describing the formation and cleavage of covalent bonds between atoms. Quantum mechanical (QM) methods explicitly include the distribution of electrons in molecules. They are widely used to study numerous molecular properties, such as geometries, electronic structures, and reactions of small molecules in the gas phase or in implicit solvent (polarizable continuum).145-151 Many different QM methods exist, and those based on the wave function involve finding a solution to the Schrödinger equation, which for systems containing more than one electron can only be solved approximately. Available QM methods differ in the level of approximation made, ranging from semi-empirical (low-level theory, “rough and fast”)152 to ab initio (high-level theory, “accurate and slow”) methods. In this thesis, we employ two different QM levels of theory: ab initio and density functional theory (DFT) methods.

The simplest ab initio QM method is Hartree-Fock (HF) theory, in which the spatial distribution of each electron is assumed not to depend on the instantaneous motion of the other electrons.153 The main disadvantage of this approximation is that it neglects electron correlation, the natural tendency of electrons to avoid each other; results (especially energies) obtained with the HF method therefore often have large errors. However, as mentioned earlier, in some cases the geometries obtained with HF calculations can be appropriate for deriving a molecular ESP, which is then used in certain force field parametrizations (e.g. AMBER) to generate partial atomic charges. Many “correlated” ab initio methods, including those based on Møller-Plesset perturbation theory (e.g. MP2 or MP4),154-157 configuration interaction (CI),158 or coupled-cluster theory (CC),159 use the HF wave function as their starting point. These methods offer a significant improvement in accuracy but are also significantly more computationally expensive, which currently makes them applicable only to molecules containing several heavy atoms.

On the other hand, DFT methods can offer accuracy comparable to that of the correlated ab initio methods, but at a substantially lower computational cost.160-162 The basic premise of DFT is that the ground-state energy of a system can be calculated exclusively from its electron density distribution. The density is a function of only three spatial variables and is much simpler than the ab initio wave function, which is a function of 3N variables, where N is the number of electrons. However, the exact form of the functional relating the density to the energy remains unknown. Numerous density functionals have therefore been developed based on known limiting characteristics of the exact functional, but there is no systematic way to improve them.163 One of the most popular and widely used functionals is the hybrid functional B3LYP,164 which has also been employed in this thesis, in combination with Pople’s basis sets, as a QM method that performs well for obtaining the geometries of organic radicals.

The size of an enzymatic system typically exceeds thousands of atoms, hence modeling an entire enzyme with QM methods is computationally prohibitive in most cases.165 Such systems can instead be treated with hybrid quantum mechanics/molecular mechanics (QM/MM) approaches.166,167 QM/MM methods simultaneously exploit the benefits of both the QM and MM approaches described above.168,169 The small protein region of greatest interest (i.e. the active site, along with surrounding catalytically essential residues) is treated with a QM method.116,170-172 DFT is the most common method of choice, and it usually requires a set of benchmark calculations against a higher level of theory, since the performance of a specific functional depends directly on the type of system or chemical reaction studied. In this thesis, the reference method consists of a series of calculations following the multilevel energy-decomposition G3(MP2)-RAD173 procedure, which is designed particularly for predicting reliable energetics of radical species. Such methods have proven to perform particularly well for the thermochemistry and kinetics of reactions in which radicals are involved.

The rest of the enzyme environment, as well as the solvent molecules surrounding the active site, is treated with a classical MM force field. Certain restraints are normally applied (but not to atoms directly connected to the QM region) to avoid possible artifacts produced by the mobility of unphysically embedded surface residues or by evaporation of the solvent into the vacuum.174 In standard QM/MM methodologies the two regions interact with each other, as described below and presented in Equation 2. In this thesis, the subtractive scheme (e.g. the ONIOM (our own N-layered integrated molecular orbital + molecular mechanics) method in the Gaussian software package)175,176 for QM/MM calculations is employed.177-179 In this approach the total energy $E^{\mathrm{real}}_{\mathrm{sub}}$ is calculated using the following simplified expression:

$$
E^{\mathrm{real}}_{\mathrm{sub}} = E^{\mathrm{MM}}_{\mathrm{real}} + E^{\mathrm{QM}}_{\mathrm{model}} - E^{\mathrm{MM}}_{\mathrm{model}} \tag{2}
$$

where $E^{\mathrm{MM}}_{\mathrm{real}}$ is the total force field energy of the entire (real) enzymatic system, while $E^{\mathrm{QM}}_{\mathrm{model}}$ and $E^{\mathrm{MM}}_{\mathrm{model}}$ are the QM energy and the MM energy, respectively, of the QM region (the small model subsystem, often including several active site catalytic residues).180 The key electrostatic interactions between the QM and MM zones are treated with the electrostatic embedding (EE)181 approach, which explicitly includes the partial charges of the MM atoms in the QM Hamiltonian. The QM wave function is thereby polarized in response to its environment (the point charges of the MM region), which produces more realistic results.182-185 The critical issue in any QM/MM method is the description of the coupling between the QM and MM zones. If the two subsystems are connected through covalent bonds, the valences of these bonds must remain saturated. The most common approach is to introduce a monovalent link atom, usually hydrogen, at an appropriate position along the bond vector between the QM and MM atoms; the link atoms cap the QM zone at the boundary, ensuring proper evaluation of the QM wave function.186,187
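The subtractive expression in Eq. 2 and the placement of a link atom can be illustrated with a few lines of Python. This is a schematic sketch, not the ONIOM implementation in Gaussian: the function names, the energies, the coordinates, and the scale factor of 0.72 are hypothetical values chosen for illustration.

```python
def subtractive_qmmm_energy(e_mm_real, e_qm_model, e_mm_model):
    """Total energy of the subtractive scheme: E(MM, real) + E(QM, model) - E(MM, model)."""
    return e_mm_real + e_qm_model - e_mm_model


def link_atom_position(r_qm, r_mm, scale):
    """Place a link atom on the line from the QM atom towards the MM atom.

    scale is the fraction of the QM-MM bond vector at which the link atom sits.
    """
    return tuple(q + scale * (m - q) for q, m in zip(r_qm, r_mm))


# Hypothetical energies, all in one consistent unit.
e_total = subtractive_qmmm_energy(e_mm_real=-100.0, e_qm_model=-50.0, e_mm_model=-10.0)

# Hypothetical boundary bond of length 1.5 along the x axis.
link = link_atom_position((0.0, 0.0, 0.0), (1.5, 0.0, 0.0), scale=0.72)
```

Because the MM energy of the model region is subtracted, the region described at the QM level is not counted twice in the total energy.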

This chapter presents a general survey of the computational methods employed in this work to explore enzyme dynamics and catalysis. Technical details specific to each study are presented in the respective chapters of the thesis as the work unfolds.

2.4. References

1. Knowles, J. R. Nature 1991, 350, 121-124. 2. Agarwal, P. K. Microb. Cell Fact. 2006, 5, 2. 3. Schramm, V. L. Chem. Rev. 2006, 106, 3029-3030. 4. Warshel, A.; Bora, R. P. J. Chem. Phys. 2016, 144, 180901. 5. Noda-Garcia, L.; Liebermeister, W.; Tawfik, D. S. Annu. Rev. Biochem. 2018, 87, 187-216. 6. Wells, S. A.; Van der Kamp, M. W.; McGeagh, J. D.; Mulholland, A. J. PloS ONE 2015, 10, e0133372. 7. Robinson, P. K. Essays Biochem. 2015, 59, 1-41. 8. Martínez Cuesta, S.; Asad Rahman, S.; Furnham, N.; Thornton, J. M. Biophys. J. 2015, 109, 1082- 1086. 9. Adrio, J. L.; Demain, A. L. Biomolecules 2014, 4, 117-139. 10. Berntsson, R. P.; Smits, S. H.; Schmitt, L.; Slotboom, D. J.; Poolman, B. FEBS Lett. 2010, 584, 2606-2617. 11. Kohen, A. Acc. Chem. Res. 2015, 48, 466-473. 12. Daniel, E. Angew. Chem. Int. Ed. 1995, 33, 2375-2378. 13. Weng, Y-Z.; Chang, D. T.; Huang, Y-F.; Lin, C-W. BMC Bioinformatics 2011, 12, S1-S32. 14. Himo, F. Theor. Chem. Acc. 2006, 116, 232-240. 15. Fried, S. D.; Boxer, S. G. Annu. Rev. Biochem. 2017, 86, 387-415. 16. Neet, K. E. J. Biol. Chem. 1998, 273, 25527-25528. 17. Radzicka, A.; Wolfenden, R. Science 1995, 267, 90-93. 18. Kraut, J. Science 1988, 242, 533-540. 19. Gao, J.; Ma, S.; Major, D. T.; Nam, K.; Pu, J.; Truhlar, D. G. Chem. Rev. 2006, 106, 3188-3209. 20. Ulusu, N. N. J. Mol. Evol. 2015, 80, 251-257. 21. Zeymer, C.; Hilvert, D. Annu. Rev. Biochem. 2018, 87, 131-157. 22. Hong, N. S.; Petrović, D.; Lee, R.; Gryn’ova, G.; Purg, M.; Saunders, J.; Bauer, P.; Carr, P. D.; Lin, C. Y.; Mabbitt, P. D.; Zhang, W.; Altamore, T.; Easton, C.; Coote M. L.; Kamerlin, S. C. L.; Jackson C. J. Nat. Commun. 2018, 9, 3900. 23. Petrović, D.; Risso, V. A.; Kamerlin, S. C. L.; Sanchez-Ruiz J. M. J. R. Soc. Interface 2018, 15, 20180330. 24. Maria-Solano, M. A.; Serrano-Hervás, E.; Romero-Rivera, A.; Iglesias-Fernández, J.; Osuna, S. Chem. Commun. 2018, 54, 6622-6634. 25. Kingsley, L. J.; Lill, M. A. Proteins 2015, 83, 599-611. 26. Richard, J. P. J. Am. Chem. 
Soc. 2019, 141, 3320-3331. 27. Sweetlove, L. J.; Fernie, A. R. Nat. Commun. 2018, 9, 2136. 28. Agarwal, P. K. J. Am. Chem. Soc. 2005, 127, 15248-15256. 29. Agarwal, P. K.; Geist, A.; Gorin, A. Biochemistry 2004, 43, 10605-10618. 30. Agarwal, P. K. Biochemistry 2019, 58, 438-449. 31. Lancaster, L.; Abdallah, W.; Banta, S.; Wheeldon, I. Chem. Soc. Rev. 2018, 47, 5177-5186. 32. Blank, K.; De Cremer, G.; Hofkens, J. Biotechnol. J. 2009, 4, 465-479. 33. Engelkamp, H.; Hatzakis, N. S.; Hofkens, J.; De Schryver, F. C.; Nolte, R. J. M.; Rowan, A. E. Chem. Commun. 2006, 9, 935-940. 34. Gutteridge, A.; Thornton, J. FEBS Lett. 2004, 567, 67-73. 35. Secundo, F. Chem. Soc. Rev. 2013, 42, 6250-6261. 36. Powers, R. Proteins: Structure, Function, and Bioinformatics 2006, 65, 124-135. 37. Züercher, M.; Diederich, F. J. Org. Chem. 2008, 73, 4345-4361. 38. Sullivan, S. M.; Holyoak, T. Proc. Natl. Acad. Sci. USA 2008, 105, 13829-13834. 39. Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson S.; Dahlgren, M .K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.; Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.; Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.; Berne, B. J.; Friesner, R. A.; Abel, R. J. Am. Chem. Soc. 2015, 137, 2695- 2703. 40. Souaille, M.; Roux, B. Comput. Phys. Commun. 2001, 135, 40-57.

41. Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. J. Comput. Chem. 1992, 13, 1011-1021. 42. Kovermann, M.; Grundström, C.; Sauer-Eriksson, A. E.; Sauer, U. H.; Wolf-Watz, M. Proc. Natl. Acad. Sci. USA 2017, 114, 6298-6303. 43. Hammes, G. G. Biochemistry 2002, 41, 8221-8228. 44. Csermely, P.; Palotai, R.; Nussinov, R. Trends Biochem. Sci. 2010, 35, 539-546. 45. Cuzzolin, A.; Sturlese, M.; Deganutti, G.; Salmaso, V.; Sabbadin, D.; Ciancetta, A.; Moro, S. J. Chem. Inf. Model. 2016, 56, 687-705. 46. De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. J. Med. Chem. 2016, 59, 4035-4061. 47. Doerr, S.; De Fabritiis, G. J. Chem. Theory Comput. 2014, 10, 2064-2069. 48. Durrant, J. D.; McCammon, J. A. BMC Biol. 2011, 9, 71. 49. Pollack, R. M. J. Am. Chem. Soc. 2010, 132, 3637-3637. 50. Holliday, G. L.; Mitchell, J. B.; Thornton, J. M. J. Mol. Biol. 2009, 390, 560-577. 51. Adams, G. E.; Aldrich, J. E.; Bisby, R. H.; Cundall, R. B.; Redpath, J. L.; Willson, R. L. Radiation Research 1972, 49, 278-289. 52. Pedersen, J. Z.; Finazzi-Agrò, A. FEBS Lett. 1993, 325, 53-58. 53. Buckel, W.; Golding, B. T. Annu. Rev. Microbiol. 2006, 60, 27-49. 54. Shibata, N.; Toraya, T. J. Biochem. 2015, 158, 271-292. 55. Yamazaki, I. Free Radic. Biol. Med. 1987, 3, 397-404. 56. Frey, P. A. Annu. Rev. Biochem. 2001, 70, 121-148. 57. Frey, P. A.; Hegeman, A. D.; Reed, G. H. Chem. Rev. 2006, 106, 3302-3316. 58. Jäger, C. M.; Croft, A. K. Chem. Bio. Eng. Rev. 2018, 5, 143-162. 59. Liao, L. Y.; Chung, W. S.; Chen, K. M. J. Adv. Nurs. 2017, 73, 108-111. 60. Fontecave, M. Cell. Mol. Life Sci. 1998, 54, 684-695. 61. Sawers, G. FEMS Microbiol. Rev. 1999, 22, 543-551. 62. Andersson, J.; Westman, M.; Sahlin, M.; Sjöberg, B.-M. J. Biol. Chem. 2000, 275, 19449-19455. 63. Srivastava, A.; Nagai, T.; Srivastava, A.; Miyashita, O.; Tama, F. Int. J. Mol. Sci. 2018, 19, 3401. 64. Holcomb, J.; Spellmon, N.; Zhang, Y.; Doughan, M.; Li, C.; Yang, Z. AIMS Biophys. 2017, 4, 557- 575. 65. 
Helliwell, J. R. Biosci. Rep. 2017, 37, BSR20170204. 66. Jeschke, G. Biochimica et Biophysica Acta (BBA) - Bioenergetics 2005, 1707, 91-102. 67. Marsh, E. N.; Patterson, D. P.; Li, L. Chembiochem. 2010, 11, 604-621. 68. Imlay, J. A. Mol. Microbiol. 2006, 59, 1073-1082. 69. Wiltgen, M.; Tilz, G. P. Wien Med. Wochenschr. 2009, 159, 112-125. 70. Vyas, V. K.; Ukawala, R. D.; Ghate, M.; Chintha, C. Indian J. Pharm. Sci. 2012, 74, 1-17. 71. Cavasotto, C. N.; Phatak, S. S. Drug Discov. Today 2009, 14, 676-683. 72. Sim, A. Y. L.; Minary, P.; Levitt, M. Curr. Opin. Struct. Biol. 2012, 22, 273-278. 73. Linder, M. Comput. Struct. Biotechnol. J. 2012, 2, e201209009. 74. García-Guevara, F.; Avelar, M.; Ayala, M.; Segovia, L. Biocatalysis 2015, 1, 109-117. 75. Hediger, M. R.; De Vico, L.; Svendsen, A.; Besenmatter, W.; Jensen, J. H. PLoS ONE 2012, 7, e49849. 76. Damborsky, J.; Brezovsky, J. Curr. Opin. Chem. Biol. 2014, 19, 8-16. 77. Lonsdale, R.; Ranaghan, K. E.; Mulholland, A. J. Chem. Commun. 2010, 46, 2354-2372. 78. Claeyssens, F.; Harvey, J. N.; Manby, F. R.; Mata, R. A.; Mulholland, A. J.; Ranaghan, K. E.; Schütz, M.; Thiel, S.; Thiel, W.; Werner, H.-J. Angew. Chem. Int. Ed. 2006, 45, 6856-6859. 79. Steinhauser, M. O.; Hiermaier, S. Int. J. Mol. Sci. 2009, 10, 5135-5216. 80. Markowetz, F. PLoS Biol. 2017, 15, e2002050. 81. Senn, H. M.; Thiel, W. Top. Curr. Chem. 2007, 268, 173-290. 82. Van der Kamp, M. W.; Shaw, K. E.; Woods, C. J.; Mulholland, A. J. J. R. Soc. Interface 2008, 5, S173-S190. 83. Lin, H.; Truhlar, D. G. Theor. Chem. Acc. 2007, 117, 185-199. 84. Hofer,T. S.; de Visser, S. P. Front. Chem. 2018, 6, 2296-2646. 85. Reuter, N.; Dejaegere, A.; Maigret, B.; Karplus, M. J. Phys. Chem. A 2000, 104, 1720-1735. 86. Cui, Q.; Karplus, M. J. Phys. Chem. B 2000, 104, 3721-3743. 87. Cao, L.; Ryde, U. Front. Chem. 2018, 6, 89.

88. Backman, L. R. F.; Funk, M. A.; Dawson, C. D.; Drennan, C. L. Crit. Rev. Biochem. Mol. Biol. 2017, 52, 674-695. 89. Saleh, T.; Kalodimos, C. G. Science 2017, 355, 247-248. 90. Motlagh, H. N.; Wrabl, J. O.; Li, J.; Hilser, V. J. Nature 2014, 508, 331-339. 91. Wei, G.; Xi, W.; Nussinov, R.; Ma, B. Chem. Rev. 2016, 116, 6516-6551. 92. Campbell, E.; Kaltenbach, M.; Correy, G. J.; Carr, P. D.; Porebski, B. T.; Livingstone, E. K.; Afriat- Jurnou, L.; Buckle, A. M.; Weik, M.; Hollfelder, F.; Tokuriki, N.; Jackson, C. J. Nat. Chem. Biol. 2016, 12, 944-950. 93. Khersonsky, O.; Roodveldt, C.; Tawfik, D. S. Curr. Opin. Chem. Biol. 2006, 10, 498-508. 94. Khersonsky, O.; Tawfik, D. S. Annu. Rev. Biochem. 2010, 79, 471-505. 95. James, L. C.; Tawfik, D. S. Trends Biochem. Sci. 2003, 28, 361-368. 96. Henzler-Wildman, K. A.; Lei, M.; Thai, V.; Kerns, S. J.; Karplus, M.; Kern, D. Nature 2007, 450, 913-916. 97. Klinman, J. P.; Kohen, A. J. Biol. Chem. 2014, 289, 30205-30212. 98. Ramos, M. J.; Fernandes, P. A. Acc. Chem. Res. 2008, 41, 689-698. 99. Eisenmesser, E. Z.; Bosco, D. A.; Akke, M.; Kern, D. Science 2002, 295, 1520-1523. 100. Olsson, M. H. M.; Parson, W. W.; Warshel, A. Chem. Rev. 2006, 106, 1737-1756. 101. Lonsdale, R.; Harvey, J. N.; Mulholland, A. J. Chem. Soc. Rev. 2012, 41, 3025-3038. 102. Ryde, U. Methods Enzymol. 2016, 577, 119-158. 103. Perilla, J. R.; Goh, B. C.; Cassidy, C. K.; Liu, B.; Bernardi, R. C.; Rudack, T.; Yu, H.; Wu, Z.; Schulten, K. Curr. Opin. Struct. Biol. 2015, 31, 64-74. 104. Noé, F. Biophys. J. 2015, 108, 228-229. 105. Klepeis, J. L.; Lindorff-Larsen, K.; Dror, R. O.; Shaw, D. E. Curr. Opin. Struct. Biol. 2009, 19, 120-127. 106. Xue, Y.; Ward, J. M.; Yuwen, T.; Podkorytov, I. S.; Skrynnikov, N. R. J. Am. Chem. Soc. 2012, 134, 2555-2562. 107. Monticelli, L.; Tieleman, D. P. Methods Mol. Biol. 2013, 924, 197-213. 108. Dupradeau, F.-Y.; Cézard, C.; Lelong, R.; Stanislawiak, E.; Pêcher, J.; Delepine, J. C.; Cieplak, P. Nucleic Acids Res. 
2008, 36, D360-D367. 109. Vanquelef, E.; Simon, S.; Marquant, G.; Garcia, E.; Klimerak, G.; Delepine, J. C.; Cieplak, P.; Dupradeau, F.-Y. Nucleic Acids Res. 2011, 39, W511-W517. 110. Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179-5197. 111. Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. J. Chem. Theory Comput. 2015, 11, 3696-3713. 112. Warshel, A.; Weiss, R. M. J. Am. Chem. Soc. 1980, 102, 6218-6226. 113. Warshel, A. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 425-443. 114. Kamerlin, S. C.; Warshel, A. Faraday Discuss. 2010, 145, 71-106. 115. Jing, Z.; Liu, C.; Cheng, S. Y.; Qi, R.; Walker, B. D.; Piquemal, J.-P.; Ren, P. Annu. Rev. Biophys. 2019, 48, 371-394. 116. Friesner, R. A.; Guallar, V. Annu. Rev. Phys. Chem. 2005, 56, 389-427. 117. Ranaghan, K. E.; Mulholland, A. J. Int. Rev. Phys. Chem. 2010, 29, 65-133. 118. Adcock, S. A.; McCammon, J. A. Chem. Rev. 2006, 106, 1589-1615. 119. Siegbahn, P. E. M.; Himo, F. J. Biol. Inorg. Chem. 2009, 14, 643-651. 120. Cheatham, T. E.; Miller, J. H.; Fox, T.; Darden, P. A.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 4193. 121. Henzler-Wildman, K.; Kern, D. Nature 2007, 450, 964-972. 122. Salmaso, V.; Moro, S. Front. Pharmacol. 2018, 9, 923. 123. Sugita, Y.; Okamoto, Y. Chem. Phys. Lett. 1999, 314, 141-151. 124. Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000, 113, 6042-6051. 125. Hamelberg, D.; Mongan, J.; McCammon, J. A. J. Chem. Phys. 2004, 120, 11919-11929. 126. Kappel, K.; Miao, Y.; McCammon, J. A. Rev. Biophys. 2015, 48, 479-487. 127. Isralewitz, B.; Gao, M.; Schulten, K. Curr. Opin. Struct. Biol. 2001, 11, 224-230. 128. Isralewitz, B.; Izrailev, S.; Schulten, K. Biophys. J. 1997, 73, 2972-2979. 129. Izrailev, S.; Crofts, A. R.; Berry, E. A.; Schulten, K. Biophys. J. 1999, 77, 1753-1768.


Chapter 3: The Influence of Acetylation in the Active Site on Enzyme Dynamics in Pyruvate Formate-Lyase

Declaration for Chapter 3

The following chapter includes the text, clearly noted by the quotation marks (“…”), and the figures which are a literal or partially paraphrased copy of the material published in the following article:

[MH1] Marko Hanževački,† Karmen Čondić-Jurkić,† Radha Dilip Banhatti, Ana-Sunčana Smith, David Matthew Smith “The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase” Chem. Eur. J. 2019, 25, 8741-8753. DOI: 10.1002/chem.201900663. †These authors contributed equally.

Reprinted (adapted) with permission from Chem. Eur. J. 2019, 25, 8741-8753. Publication Date: 22 March 2019. https://doi.org/10.1002/chem.201900663. Copyright 2019 Wiley-VCH.

In the publication [MH1], the research was conceived by David Matthew Smith and Ana-Sunčana Smith, who supervised the research and all stages of manuscript preparation. The work builds on the study conducted by Karmen Čondić-Jurkić, described in Chapter 3 of the following doctoral thesis:

Karmen Čondić-Jurkić “Strategies in the Computational Modelling of Biological Systems: Case Studies with Radical Enzymes” 2013. URN: urn:nbn:de:bvb:29-opus-48333.

Marko Hanževački extended this original work by performing MD and steered MD simulations with revised force field parameters, up to microsecond time scales, and performing systematic data analysis including PCA, CAVER, and DCCM. These new results are the foundation for the present chapter and the paper [MH1], which was initially drafted by Marko Hanževački and Karmen Čondić-Jurkić. Radha Dilip Banhatti contributed with a critical assessment of the developed arguments and participated in the further development of the manuscript.

The authors declare no conflict of interest and confirm that the above statements are true.

Date and signature:

Erlangen, June 10, 2020

Marko Hanževački Karmen Čondić-Jurkić Radha Dilip Banhatti

Ana-Sunčana Smith David Matthew Smith

3.1. Introduction

Certain enzymes undergo chemical modifications at the active site, which can impact their conformational dynamics during the catalysis.1-5 A classic example of chemical modification emerges in the context of bi-substrate ping-pong catalysis.6-9 This scenario involves an enzymatic reaction with two substrates and a mechanism that includes a temporary intermediate state in which the enzyme is chemically modified during the reaction with the first substrate. The original state of the enzyme is then regenerated upon the reaction with the second substrate. A schematic representation of the enzyme-catalyzed ping-pong mechanism is shown in Scheme 3.1.1.

Scheme 3.1.1. Enzymatic ping-pong reaction.

The ping-pong kinetics occurs in a prototypical member of the glycyl radical enzyme (GRE) family, pyruvate formate-lyase (PFL), which is the case study of this thesis.10 As discussed earlier in Chapter 1, PFL performs a key function in the anaerobic glycolysis of E. coli and other microorganisms when limited amounts of oxygen are available. It catalyzes the cleavage of the C-C bond in the first substrate, pyruvate, and the consecutive acetylation of the second substrate, coenzyme A (CoA), through an acetylated enzyme intermediate in two half-reactions that use highly reactive radicals, as shown in Scheme 3.1.2 and Figure 1.4.2.11 Recently, the vital importance of PFL and other GREs has been confirmed in the central anaerobic metabolism of bacteria that dominate the human gut microbiome and that could potentially be related to several diseases of the host.

The homodimeric structure of the enzyme shows that, in its biological assembly, PFL comprises two identical subunits, of which only one is active per dimer. Such enzymes are said to possess half-site reactivity.12 As already briefly described in Chapter 1 (see Figure 1.4.2) and reviewed in the publication [MH1], “the active site of the activated subunit contains a glycyl radical (G734), as part of the Gly radical domain, which serves for radical storage, and two neighboring cysteines C418 and C419, as a part of the conserved Cys loop, actively involved in catalysis.13 PFL is one of the first discovered GRE, a family of enzymes all of which temporarily store radical reactivity on the Cα atom of a glycine residue in the polypeptide chain.14-16 To become fully active, PFL undergoes post-translational modification in which the glycyl radical is introduced into the enzyme backbone during the activation process by the PFL activating enzyme (PFL-AE).17-20”

Scheme 3.1.2. Half-reactions of PFL. Scheme taken from [MH1].

Inspection of the available crystal structure of PFL in the presence of both substrates, pyruvate and CoA, shows that pyruvate binds in the generally hydrophobic active site buried in the enzyme's interior, as shown in Figure 1.4.1 in Chapter 1.21,22 This burial most likely serves to protect the glycyl radical intermediate from undesirable side reactions during catalysis. Regarding CoA binding, we stated the following concisely in [MH1]: “it is found that CoA binds close to the interface between two subunits in the dimer, with one CoA bound to the surface of each subunit and approximately 30 Å from the active site.23 The bound coenzyme adopts the unusual syn-conformation concerning the N-glycosidic bond, although the anti-conformer is the preferred for the free CoA in solution.24-26 In the syn-conformation, the thiol (S-H) group of the pantothenate chain is located in the predominantly hydrophobic pocket stacked by the side chains of residues F200 and H227 of the opposing monomer.” The binding poses of both substrates described above are depicted in more detail in Figure 3.1.1.

The currently accepted PFL catalytic reaction mechanism is shown in Figure 3.1.2.27-31 The various steps of this mechanism, as detailed in the publication [MH1], are as follows: “in the first half-reaction, pyruvate is fragmented into formate and the acetyl group upon the addition of the thiyl radical located at C418. The radical initially stored at G734 (see structure A in Figure 3.1.2) is shuttled to C418 via C419 (see structure B in Figure 3.1.2). The addition of C418 radical to the central carbonyl C-atom of pyruvate leads to C-C bond dissociation, resulting in the formation of carbon dioxide radical anion (formate radical anion or CO2•-) and acetyl-C418. The latter species has a role of a temporary acetyl carrier and a reactant in the second half-reaction, with the co-substrate CoA, to produce acetyl-CoA. The formation of acetylated CoA, the final product, closes the catalytic cycle of PFL. This mechanism suggests that CO2•- is quenched by C419 (see C → E in Figure 3.1.2), which then activates CoA for a radical acetyl transfer, although it has been proposed that this CO2•- radical species could alternatively adopt the latter role (see D → E in Figure 3.1.2)30. The sequence of events taking place after the first half-reaction is still somewhat speculative, as most of the available experimental data offer insights into the first half-reaction and the initial generation of the glycyl radical.32,33” In this chapter, we accept the proposed mechanism and examine the acetylation of PFL and its influence on the enzyme dynamics. In Chapter 5 of this thesis, we investigate both half-reactions and present results from a comprehensive QM/MM study of the catalytic mechanisms in PFL.

In general, when it comes to factors triggering conformational dynamics in PFL, we stated in publication [MH1] that, “whereas certain radical enzymes, such as those dependent on vitamin B12, are known to undergo large conformational changes upon substrate binding, activation or product release,34-41 the situation regarding PFL is less clear. In particular, the pathway for CoA from its crystallographic binding position on the enzyme’s surface to the buried active site is not obvious from the available structure.23 Indeed, apart from the consensus that a conformational change appears to be required for the thiol group of CoA to reach the active site, very little is known about the structural basis for such a change. It has been suggested by Drennan and co-workers,11 that the ribose and pantothenate moieties of CoA might rotate around the N-glycosidic bond (see Figure 3.1.1) and change from the syn- to the anti-conformation. This transition is expected to be energetically favorable and it could potentially lead to a more favorable positioning of the S-H group for the entry in the active site.23 Nevertheless, the structural evidence surrounding the conformational change of CoA, in the context of the PFL mechanism, has been even recently described as elusive.11”

Figure 3.1.1. The crystal structure of the homodimeric form of PFL in complex with CoA and pyruvate, with a closer view of pyruvate in the active site (a) and the binding site of CoA (b). The moieties that build CoA are shown as an inset in panel (b). Figure taken from [MH1].

Figure 3.1.2. The currently accepted PFL-catalytic mechanism. The structures represent the active site of PFL before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction (see text for a detailed description). The location of the radical at each step is denoted by the symbol ‘•’. Figure taken from [MH1].

Although the experimental data indicate that CoA is not essential for the first half-reaction, it can remain bound to the enzyme as a spectator without interfering with pyruvate cleavage. In the publication [MH1], we highlighted another interesting fact about the inhibition of PFL, namely “small thiols, such as mercaptoethanol and dithiothreitol, are efficient deactivators of PFL, whereas larger thiols, such as cysteine or glutathione, show no detectable deactivation.42 These observations indicate that a conformational change is not only required for the second half-reaction to take place but that its timing may also need to be regulated.” Importantly, this would imply that “if the S-H group of CoA or any other thiol molecules were to enter the active site before the completion of the first half-reaction, it could result in the premature quenching of the radical activity. This could explain why the active site needs to be occluded, as observed in the crystal structure, and, potentially, why a conformational change might be a requisite aspect of the overall reaction mechanism” [MH1].

Over the past few years, computational modeling methods have matured into an indispensable tool for studying proteins and the connection between their structure and function, especially in the case of radical enzymes, where several crucial details are too challenging for experiments to access.43,44 We therefore chose classical molecular dynamics (MD) and some of its advanced variants to address the key open questions surrounding the catalysis of PFL and, in particular, the puzzling role of the coenzyme substrate in the second half-reaction. Furthermore, we examined whether the chemical modification (acetylation) of the enzyme could trigger conformational changes. Our central hypothesis that “a pathway for CoA could indeed appear as a result of the necessary conformational changes” was established in the publication [MH1], where we first analyzed the overall enzyme structure and then explored the binding properties and the dynamics of CoA to substantiate the hypothesis. Using robust computational tools, we then sought to identify a potential CoA entry channel into the active site and examined the nature of the conformational changes that would allow the coenzyme molecule to access the active site from its experimentally observed, surface-bound pose. Before presenting these results in the following sections, we define the model systems, the parameters, the computational methodologies, and the analysis tools employed in our MD study of the PFL enzyme.

3.2. Methods

3.2.1. The Building of Model Systems

For this study, three different models of PFL were created to represent the state of the enzyme before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction. As stated in [MH1], “it must be noted from Figure 3.1.2 that each of these three models contains a different radical carrier. Also, all models contain CoA initially bound at the protein surface according to the crystal structure data.” As mentioned earlier, PFL is a homodimeric enzyme in which only a single active site is activated, consistent with the experimentally resolved half-site radical residence.45-48 However, as analyzed in [MH1], “the binding site of CoA in the available crystal structure spans both subunits so that the nucleotide moiety interacts with one monomer, whereas the cysteamine group is placed between a histidine and phenylalanine residue of the opposing monomer.” To examine the role of the inactive subunit in catalysis, we additionally introduced two different sets of models: monomeric and dimeric. The monomeric models can also be considered preliminary systems from which important insights may be obtained at significantly lower computational cost. Following the notation used in [MH1], when presenting the results in the following sections, the prefix “d” is used for dimeric models, whereas “m” is used for monomeric models.

Each of the models was obtained from a crystal structure of PFL in a complex with pyruvate and CoA (RCSB Protein Data Bank (PDB) entry: 1H16).23 Namely, as stated in [MH1], “the PDB file was modified by removing duplicate entries and assigning the protonation states of titratable residues using the H++ server.49 Assignments made by the server were additionally verified and confirmed by visual inspection of the local environments of the residues. Sodium ions and water molecules available in the crystal structure were retained, while other co-crystallized species were removed from the PDB file, including the Mg2+ ion, 1-treitol, and tetraethylene glycol.”

3.2.2. Force Field Parameters

As already explained in the theoretical methodology section of the publication [MH1], “the force field parameters assigned to standard amino acid residues in PFL were taken from the ff14SB50 available within the AMBER 16 software package.51 For non-standard residues, the missing parameters were derived using the R.E.D. server52 and AMBERTools 17 suite by creating the custom-built libraries using the leap module.51 The non-standard residues include radical cysteine (cysteinyl radical), acetylated form of cysteine, and the substrates and intermediates: coenzyme A (CoA), pyruvate, carbon dioxide radical anion (CO2•-), and formate (HCO2-). The parameters for CoA were obtained by combining molecular fragments of usual cofactors in biochemistry from the R.E.D. database under project F-91 which was developed by Dupradeau.53 All phosphate groups of CoA were fully negatively charged. For the rest of the non-standard residues, bonding and non-bonding parameters were taken from the general AMBER force field (GAFF).54,55 The missing partial charges for these residues were obtained by following the standard restrained electrostatic potential (RESP) procedure.56 Namely, the charges for substrates and modified cysteines were derived from quantum mechanical (QM) calculations at B3LYP/cc-pVTZ//HF/6-31G(d,p) level of theory combined with an implicit IEFPCM (ε = 4.335)57 continuum dielectric model mimicking the polarization of the protein interior. The anionic forms of pyruvate, carbon dioxide radical, and the formate were used in parametrization. All QM calculations were performed using the Gaussian 09 software package.58 Furthermore, each system was solvated with TIP3P59 waters in a truncated octahedron box. The edge length of the resulting box of solvent was about 140 Å for dimeric systems and 100 Å for systems containing monomers (see Figure 3.2.1). All crystal water molecules and seven sodium ions per monomer present in the initial PDB file were retained. The Mg2+ was replaced by a Na+ ion.” Although the phosphate group in biological systems has an overall high affinity towards binding Mg2+, this replacement seems reasonable because the authors of ref. 23 report that no magnesium salts were used in the crystallization buffer. Moreover, the electron density peak near CoA does not reveal whether the cation bound to the phosphate is magnesium or sodium, since Mg2+ and Na+ are isoelectronic. Interestingly, the first coordination sphere formed by crystal water molecules around the metal is also consistent with a monovalent rather than a divalent cation bound to the phosphate of CoA. With a sodium ion used in place of the magnesium ion, it was found that “besides sodium cation found near CoA, eight additional sodium ions per monomer were required to neutralize the system. The number of water molecules added to monomeric systems was about 24000, whereas dimers were surrounded by about 55600 solvent molecules. To compare bound CoA structures in protein models to the reference case of free CoA in water, an additional system where the CoA molecule was placed in a truncated octahedron (60 Å) of approximately 5500 TIP3P water molecules was built and neutralized by adding four sodium cations (see Figure 3.2.1)” [MH1].

Figure 3.2.1. The truncated octahedron periodic system containing PFL dimer (a), PFL monomer (b) and coenzyme A (c) in the center, surrounded by a total of 55579, 23567 and 5564 water molecules, respectively. The total number of atoms was 190754 and 82709 for dimeric and monomeric model systems, respectively. Figure adapted from the supporting information of [MH1].

3.2.3. Conditions of Molecular Dynamics Simulations

Well-established best practices for classical MD simulations of enzymatic systems were followed in our work. Namely, periodic boundary conditions (PBC) were applied in all directions for all generated systems. Long-range electrostatic interactions were calculated using the Particle-Mesh Ewald (PME) method with an 8 Å non-bonded cut-off to limit the direct space sum. The target temperature was set to 300 K in all simulations and was controlled by coupling the entire system to a Langevin thermostat with a collision frequency of 2 ps⁻¹.60 An integration time step of 2 fs was used and the SHAKE algorithm was employed to keep bonds involving hydrogen atoms fixed during the MD simulations.61
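As an illustration only, the settings listed above can be collected into an AMBER-style input. The Python sketch below merely assembles a `&cntrl` namelist as a string. The mapping of the stated settings onto the flags `ntb`, `cut`, `ntt`, `gamma_ln`, `temp0`, `dt`, `ntc`, `ntf` and `nstlim` is our assumption about how such a run is commonly specified, and the helper name `build_mdin` and the title line are hypothetical. It does not reproduce the input files actually used in this work.

```python
def build_mdin(nsteps: int) -> str:
    """Assemble an AMBER-style &cntrl namelist from the settings stated in the text.

    Assumption: the flag names follow the usual sander/pmemd conventions; the
    actual input files of this study are not reproduced here.
    """
    settings = {
        "imin": 0,         # run molecular dynamics, not minimization
        "ntb": 1,          # periodic boundaries at constant volume
        "cut": 8.0,        # direct-space non-bonded cut-off in Å (PME covers the rest)
        "ntt": 3,          # Langevin thermostat
        "gamma_ln": 2.0,   # collision frequency in ps^-1
        "temp0": 300.0,    # target temperature in K
        "dt": 0.002,       # integration time step in ps (2 fs)
        "ntc": 2,          # SHAKE on bonds involving hydrogen
        "ntf": 2,          # omit force evaluation for the constrained bonds
        "nstlim": nsteps,  # number of MD steps
    }
    body = ",\n".join(f"  {key} = {value}" for key, value in settings.items())
    return f"Illustrative MD input\n&cntrl\n{body},\n/\n"


# 50 million steps of 2 fs correspond to 100 ns.
mdin_text = build_mdin(nsteps=50_000_000)
```

Keeping the settings in a dictionary makes it straightforward to vary a single parameter, for example the run length, while leaving the remaining conditions identical across systems.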

Relaxation of the prepared systems was carried out as follows: “steepest descent minimization was first applied to the aqueous solution of protein-substrate-cofactors complexes (solute) with harmonic positional restraints on solute molecules (2 kcal/mol/Å²). Heating dynamics was performed with continued solute restraints at constant volume (NVT). Thereby, the temperature was increased from 0 K to 300 K over 60 ps and kept at that value for another 40 ps. Subsequently, 400 ps of constant pressure (isothermal-isobaric NPT ensemble) dynamics at 300 K were performed, with isotropic position scaling at a target pressure of 1 bar and a pressure relaxation time of 0.2 ps by using the Berendsen barostat.62 Finally, an unrestrained NPT simulation at 300 K and 1 bar was performed for a duration of 500 ps” [MH1].
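The relaxation protocol can be summarized as a simple timeline. The following Python sketch tabulates the stage durations quoted above and evaluates the thermostat target during heating. The linear shape of the temperature ramp and the names `STAGES` and `target_temperature` are assumptions made for illustration, since the text specifies only the start and end temperatures and the ramp duration.

```python
# Dynamics stages after minimization: (label, duration in ps), as quoted above.
STAGES = [
    ("NVT heating from 0 K to 300 K, solute restrained", 60.0),
    ("NVT hold at 300 K, solute restrained", 40.0),
    ("NPT at 300 K and 1 bar", 400.0),
    ("NPT at 300 K and 1 bar, unrestrained", 500.0),
]


def target_temperature(t_ps: float, t_ramp: float = 60.0, t_final: float = 300.0) -> float:
    """Thermostat target (K) at time t_ps, assuming a linear ramp starting at 0 K."""
    if t_ps <= 0.0:
        return 0.0
    return min(t_final, t_final * t_ps / t_ramp)


# Total length of the equilibration dynamics in ps.
total_ps = sum(duration for _, duration in STAGES)
```

Under these assumptions the equilibration dynamics add up to 1 ns per system before the production runs start.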

After the equilibration phase, all systems were subjected to ten unrestrained MD production runs, using different initial conditions, for at least 100 ns each. Two of these ten simulations were extended to a total of 500 ns each, giving an overall simulation time of 1.8 µs per investigated model system. All production MD simulations were propagated at constant volume and temperature (canonical NVT ensemble, 300 K), with frames saved every 10 ps. Simulations were carried out using the GPU-accelerated pmemd module,63-65 and the temperature was controlled by the Langevin thermostat in all production simulations. In addition, an unrestrained MD simulation of free CoA in aqueous solution was propagated for 1 µs using an analogous procedure.
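The overall sampling per model system follows directly from these numbers. As a quick check, the short Python sketch below adds up the run lengths and estimates the number of stored frames. The variable names are ours, and because the shorter runs are stated only as lower bounds, the totals should be read as minimum values.

```python
N_RUNS = 10            # independent production runs per model system
N_EXTENDED = 2         # runs extended to the longer length
SHORT_NS = 100         # minimum length of a standard run in ns
LONG_NS = 500          # length of an extended run in ns
SAVE_INTERVAL_PS = 10  # interval between stored frames in ps

# Eight runs of 100 ns plus two runs of 500 ns.
total_ns = (N_RUNS - N_EXTENDED) * SHORT_NS + N_EXTENDED * LONG_NS

# Number of stored frames per model system at one frame every 10 ps.
total_frames = total_ns * 1000 // SAVE_INTERVAL_PS
```

This gives 1800 ns, that is 1.8 µs, and 180000 stored frames per model system.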

3.2.4. Steered Molecular Dynamics Simulations

Although conventional MD simulations provide detailed insights into the intrinsic conformational dynamics and the free energy landscape of the enzyme in the presence of CoA, they did not show the spontaneous entry of CoA into the active site of PFL, which is crucial for the second half-reaction. To overcome the poor sampling of such rare events and infrequently populated states, we used steered molecular dynamics (SMD), an enhanced sampling method that allows such a process to be investigated.66-69 As described in [MH1], “trajectories that correspond to the process of CoA approaching the active site in the monomeric set of PFL enzymatic models are generated with SMD using NVT ensemble at 300 K. The pulling of CoA molecule towards the active site of PFL was performed in one direction. A suitable reaction coordinate was defined as the distance between the centers of mass of the cysteamine group of CoA and the backbone atoms of C418 and C419. This initial distance between the two groups was about 30 Å, while the final distance was set to 5 Å in all performed pulls. The force constant of the harmonic potential used to drive the system along the points of the chosen coordinate was set to 5 kcal/mol/Å². The duration of each pull was 10 ns.” The initial structures were generated following the procedure reported in the publication [MH1], namely “the starting ensemble of conformations for the pulling was generated by running restrained MD during which the system was restrained to the initial value of the reaction coordinate by a harmonic potential (1 kcal/mol/Å²). These restrained simulations were started from an adequate snapshot taken from free dynamics. The ten independent distance restrained simulations were performed for each monomeric system. The duration of the restrained MD simulations was 20 ns and starting snapshots for pulls were collected every nanosecond, which gave rise to a total of 200 SMD trajectories per investigated model.” Note that, since all 200 steering simulations start from different initial structures, this procedure reduces the bias that would arise from the choice of a single starting point. The final ensembles of CoA structures obtained with SMD simulations were collected and depicted as a simple geometrical overlap of the CoA tail (cysteamine and pantothenate moieties) occupying a putative entry channel.
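To make the steering protocol concrete, the following Python sketch evaluates a moving harmonic restraint on a center-of-mass distance. It is a minimal sketch under stated assumptions: the restraint center is assumed to move linearly from 30 Å to 5 Å over the 10 ns pull, the bias is written as k(d - d0)² without a factor of one half (the prefactor convention differs between programs), and all function names are hypothetical.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total = sum(masses)
    return tuple(sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3))


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Reaction coordinate: distance between the centers of mass of two groups (Å)."""
    return math.dist(center_of_mass(coords_a, masses_a), center_of_mass(coords_b, masses_b))


def restraint_center(t_ns, d_start=30.0, d_end=5.0, duration_ns=10.0):
    """Restraint center (Å) at time t_ns, assuming a linear change during the pull."""
    frac = min(max(t_ns / duration_ns, 0.0), 1.0)
    return d_start + frac * (d_end - d_start)


def steering_energy(distance, t_ns, k=5.0):
    """Harmonic bias in kcal/mol, written as k * (d - d0(t))**2 (assumed convention)."""
    return k * (distance - restraint_center(t_ns)) ** 2


# Average pulling speed implied by the stated distances and duration (Å per ns).
pull_speed = (30.0 - 5.0) / 10.0

# Ten restrained runs of 20 ns, with one starting snapshot taken every nanosecond.
n_pulls = 10 * 20
```

With these values the restraint center moves at an average of 2.5 Å/ns, and the snapshot count reproduces the 200 pulls per model quoted from [MH1].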

3.2.5. Analysis

A thorough analysis has been carried out as illustrated in [MH1], more specifically “all the data obtained from the MD simulations were subsequently processed and analyzed using the cpptraj module of the AMBER 16 software package. Principal component analysis (PCA) was performed to track the conformational changes and enzyme motions. Also, CAVER 3.070 was employed for identification and analysis of CoA entry pathways in MD trajectories of PFL systems processing multiple snapshots from 500 ns of unrestrained MD simulation of monomeric and dimeric PFL systems. The molecules in the active site (pyruvate, CO2•- or formate) were chosen as starting points for dynamic pathway calculation. All standard and non-standard amino acids were included in the channel calculation. The second substrate, CoA, was excluded from the channel calculation, along with all water molecules and sodium ions present in MD trajectories. In each snapshot, all pathways with the bottleneck radius equal to or larger than 0.5 Å were identified. The pathways were clustered by the average-link algorithm based on the pairwise distances of the pathways using the clustering threshold of 3.5 Å. The dynamical cross-correlation (DCC) matrices were constructed to represent the correlated and anti-correlated motion of backbone Cα across MD trajectories in all non-acetylated and acetylated systems. DCC between the residue-based fluctuations during 500 ns of MD simulation was calculated using the Bio3D package.71 All structures were visualized in VMD 1.9.3.72”
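For orientation, a dynamical cross-correlation matrix is commonly defined as C(i,j) = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩), where Δr_i is the displacement of atom i from its average position. The plain-Python sketch below implements this textbook definition for a small set of superimposed frames. It is not the Bio3D implementation used in this work, the function name `dcc_matrix` is hypothetical, and the frames are assumed to have been fitted to a common reference beforehand.

```python
import math


def dcc_matrix(traj):
    """Cross-correlation matrix of atomic displacements.

    traj[f][i] is the (x, y, z) position of Cα atom i in frame f. All frames are
    assumed to be superimposed already, and every atom is assumed to move at
    least slightly (a static atom would give a zero denominator).
    """
    n_frames, n_atoms = len(traj), len(traj[0])
    # Average position of each atom over the trajectory.
    mean = [[sum(frame[i][k] for frame in traj) / n_frames for k in range(3)]
            for i in range(n_atoms)]
    # Displacement of each atom from its average position in every frame.
    disp = [[[frame[i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
            for frame in traj]

    def cov(i, j):
        # Time average of the dot product of the two displacement vectors.
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    var = [cov(i, i) for i in range(n_atoms)]
    return [[cov(i, j) / math.sqrt(var[i] * var[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]
```

Values close to +1 indicate residues moving in the same direction, and values close to -1 indicate anti-correlated motion. The nested loops are written for clarity only and would be far too slow for a full-length trajectory.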

3.3. Results

3.3.1. Geometries of the Active Site in Non-acetylated and Acetylated PFL Model Systems

Before discussing the impact of acetylation in the active site on the overall enzyme structure and dynamics, it is crucial to understand the differences in the topology of the active site obtained from MD simulations. Accordingly, the active site of PFL in both the non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) model systems is shown in Figure 3.3.1b-d. For comparison, the active site of the inactive (non-radical) subunit in a dimer is also presented in Figure 3.3.1a. Note that the geometries in Figure 3.3.1b-d correspond to structures B, C, and D in Figure 3.1.2, respectively. From the inspection of Figure 3.3.1 it is evident that, in the case of PFL•, the active (orange) and inactive (gray) subunits share the same atomic arrangement in the active site. In both Figure 3.3.1a and 3.3.1b, pyruvate stays tightly bound in the active site and positioned under the Cys loop, similar to the binding pose observed in the X-ray structure, despite the relaxation induced by the relatively long simulation time. As highlighted in [MH1], “in contrast, a distinct modification is introduced in response to changing the connectivity of the atoms and relaxing the system, in the acetylated forms (green and blue). Thus, note that in Figure 3.3.1c-d, pyruvate is divided into two parts with the acetyl group bound to the C418, whereas the other fragment (either CO2•- or HCO2-) remains non-covalently bound in the active site.” This visualization is important to refer to as further conformational changes upon acetylation of PFL unfold in our model systems. In the following stage, we carefully explored different conformational changes that occur in our model systems which could help in understanding structural features that might assist the entry of CoA.

Figure 3.3.1. Models representing the snapshots of the PFL active site in the inactive subunit (a) and before (b) and after (c, d) the first half-reaction. All models also contain CoA occupying its crystallographic binding site at the protein surface and the location of the radical is denoted by the symbol ‘•’. Figure taken from [MH1].

Figure 3.3.2. Average RMSD (a) and RMSF (b) values of the protein backbone from the crystal structure in monomeric and dimeric (active subunit only) systems during 500 ns of MD simulation. The MD snapshots from both non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) models were considered. The RMSD values are shown in a boxplot representation, where the rectangle indicates the interquartile range. The median is shown by the line in the box while the whiskers go from the minimum to the maximum of all data. For RMSD time series of monomers and dimers see Figure 1a and 2a (Appendix), respectively. The residues at the interface of two subunits in dimeric models are highlighted in emerald in the panel (b). Figure adapted from [MH1].

3.3.2. Global PFL Conformational Dynamics

We began our analysis by investigating the overall enzyme conformation and behavior of monomeric and dimeric systems in the aqueous environment. To this end, the root-mean-square deviation (RMSD) of the backbone atoms (N, Cα, and C) from the reference crystal structure was calculated for the monomeric and dimeric systems. This is a standard measure of protein structural stability. The resulting boxplots for the two longest trajectories (500 ns) for each of the three model systems shown in Figure 3.3.2a (monomer and dimer) are constructed using the time traces of the RMSD shown in Figure 1a and 2a in the Appendix for monomeric and dimeric systems, respectively. Additionally, as mentioned in the publication [MH1], “an inspection of similar RMSD time traces for all the remaining 100 ns trajectories (see Figure 3a and 4a in the Appendix) confirms that the boxplots shown in Figure 3.3.2a are indeed representative of the results of the shorter runs.” Different timescales were primarily chosen to explore the conformational dynamics of the enzyme and to obtain a sufficient amount of structural data for the analysis. Although different events in biological systems occur at various timescales, our analysis indicates that similar information is obtained from both shorter and longer simulations.

Furthermore, the RMS deviations of the backbone atoms in the inactive subunit of the dimer are comparable to those of the monomer and of the active subunit of the dimer (see Figure 2a and 4a in the Appendix). Turning the focus back to Figure 3.3.2a, in the publication [MH1] we showed that “the average RMSD was within 1.2-1.6 Å, whereas the maximum values did not exceed 2 Å in any of the simulated systems. All these results indicate that both monomeric and dimeric sets of PFL models are stable concerning the reference crystal structure, and do not display major structural changes during the performed MD simulations.” This result is to be expected and confirms that the chosen MD simulations and protocols reproduce the inherent stability of the protein backbone.
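The RMSD analysis and its boxplot summary can be illustrated with a minimal sketch. It assumes that every frame has already been superimposed on the reference structure (no fitting is performed here), and the coordinates, function names, and toy trajectory are hypothetical and not taken from the simulations.

```python
import math
import statistics


def rmsd(frame, reference):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumption: the frame is already superimposed on the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def boxplot_summary(values):
    """Quartiles plus min/max, matching whiskers drawn from minimum to maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Hypothetical toy data: two atoms, three frames (values in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
rmsd_series = [rmsd(frame, reference) for frame in trajectory]
summary = boxplot_summary(rmsd_series)
```

In practice, the same summary would be computed once per model system from the full RMSD time trace.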

We also calculated the root-mean-square fluctuation (RMSF) of the backbone atoms, which measures the average atomic mobility of the enzyme through the fluctuations of each backbone atom during the MD simulations. The RMS fluctuations shown in Figure 3.3.2b were estimated by averaging the 500 ns data shown in Figure 1b and 2b (Appendix) for both the monomeric and dimeric sets of PFL models. As in the RMSD analysis, the RMSF values from the ten shorter simulations (see Figure 3b and 4b in the Appendix) show similar features to those of the longer simulations (Figure 1b and 2b, Appendix), confirming the structural stability of PFL compared to the initial crystal structure irrespective of the simulated timescales. A closer inspection of Figure 3.3.2b shows that similar fluctuations could be assigned to the solvent-exposed regions and protein termini in both dimeric and monomeric systems. Interestingly, the helices found at the interface between two subunits in dimeric models show significantly decreased fluctuations compared to their counterparts in the monomer simulations (region highlighted emerald in Figure 3.3.2b).
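A per-atom RMSF can be sketched in the same spirit. The example below is only an illustration under the assumption that the frames are already aligned; the two-atom trajectory is hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF around the time-averaged position of each atom.

    trajectory: list of frames, each a list of (x, y, z) tuples.
    Assumption: frames are already superimposed on a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_sq_fluct = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        result.append(math.sqrt(mean_sq_fluct))
    return result


# Hypothetical toy data: atom 0 is rigid, atom 1 oscillates by ±1 Å along x.
trajectory = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]
fluctuations = rmsf(trajectory)
```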

3.3.3. Coenzyme’s Dynamics

In the publication [MH1] we noted that “CoA is a linear, flexible molecule with many possible conformations. To better understand the effect of the binding of CoA to PFL on the conformational space, two additional 500 ns simulation of CoA in water were also performed to obtain the reference state for the unbound molecule.” Here we began our analysis by using the previously outlined procedure of constructing boxplots from time traces of RMS deviations, taking into account all non-hydrogen atoms of CoA during 500 ns (see Figure 5a and 6, as well as Figure 7 of the Appendix, in which the short 100 ns time traces are also representative of the longer time traces). Figure 3.3.3 shows RMSD values of up to about 5 Å, which are significantly larger than those observed for the enzyme backbone. Note that larger errors are associated with the dimeric model system, which will soon become clear as we look closely at the conformational isomerism. The overall more dynamic behavior of CoA is, in some sense, to be expected given its intrinsic flexibility. As outlined in the introductory section, it is known from the X-ray structure that CoA binds between two subunits in PFL and, thus, prefers the syn-conformation, whereas the anti-conformer is dominant in its unbound state (Figure 3.3.4a).

Figure 3.3.3. RMSD boxplot of non-hydrogen atoms of CoA from the crystal structure position in water, monomeric and dimeric systems during 500 ns of MD simulation. For RMSD time series of CoA in water and bound to the protein see Figure 5a and 6 (Appendix), respectively. Figure taken from [MH1].

It is thus important in this context to examine whether evidence for this expected conformational isomerism in CoA can be obtained from our MD simulations, and to what extent this could indicate the ease with which the coenzyme reaches the buried active site of PFL. Interestingly, as stated in [MH1], “we have indeed found such evidence by monitoring two geometrical parameters, namely, the N-glycosidic dihedral angle O4’-C1’-N9-C8, and the head-to-tail distance between the cysteamine and the adenine moiety. These two parameters when plotted against each other, for example by using average values from time traces shown in pairs of Figure 5b-c (Appendix) for CoA in solution, yield a heatmap as shown in Figure 3.3.4b. Figure 3.3.4c is made for CoA in monomeric PFL systems (by using sets from Figure 8a and 10a in the Appendix, as well as by using results of shorter runs on more independent production runs shown in Figure 9a and 11a, Appendix). Figure 3.3.4d-e corresponds to heatmaps of CoA in the active and inactive subunit of the dimer, respectively (using sets from Figure 8b and 10b, as well as Figure 9b and 11b in the Appendix). From Figure 3.3.4b it is evident that free CoA in solution prefers folded states, whereas the syn- and anti-conformation are similarly populated. Surprisingly, in the monomeric system (Figure 3.3.4c) folded anti-conformation is preferred. In the two dimeric systems, despite the starting conformation bias (Figure 8b, 9b, 10b, and 11b, Appendix) that makes the extended syn-conformation appear dominant, transitions towards the more compact states and anti-rotamer were consistently observed during simulations (Figure 3.3.4d-e).”

Figure 3.3.4. (a) The equilibrium between syn- and anti-conformation of CoA. Heatmaps for CoA conformations in (b) water, (c) monomer, (d) active and (e) inactive subunit of a dimer. The snapshots were taken from both non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems. Two geometrical parameters (descriptors) were monitored, namely glycosidic dihedral O4’-C1’-N9-C8 (cyan) was plotted against distance between adenine and cysteamine (magenta). The heatmap surface is given by W/kBT = -ln(N/Ntot) where Ntot represents the total number of configurations in each individual case (Ntot is 100 000 and 540 000 for CoA in water and bound to the enzyme, respectively). Figure taken from [MH1].
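The heatmap surface of Figure 3.3.4, W/kBT = -ln(N/Ntot), can be illustrated by binning the two descriptors and converting bin counts into a dimensionless free energy. The sketch below is a simplified illustration: the bin widths, the function name, and the four sample points are hypothetical and do not reproduce the settings used for the figure.

```python
import math
from collections import Counter


def free_energy_surface(dihedrals, distances, dihedral_bin=30.0, distance_bin=2.0):
    """Map each occupied (dihedral, distance) bin to W/kBT = -ln(N/Ntot).

    dihedrals in degrees, distances in Å; bin widths are illustrative assumptions.
    Empty bins are simply absent from the returned dictionary.
    """
    if len(dihedrals) != len(distances):
        raise ValueError("both descriptors need one value per snapshot")
    counts = Counter(
        (math.floor(phi / dihedral_bin), math.floor(d / distance_bin))
        for phi, d in zip(dihedrals, distances)
    )
    n_tot = len(dihedrals)
    return {cell: -math.log(n / n_tot) for cell, n in counts.items()}


# Hypothetical snapshots: three compact conformers and one extended conformer.
dihedrals = [100.0, 110.0, 105.0, -60.0]
distances = [5.0, 5.5, 4.5, 12.0]
surface = free_energy_surface(dihedrals, distances)
```

The most populated bin has the lowest W/kBT, so minima of the surface correspond to the preferred conformations.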

To understand this in detail, we examine the features revealed by the two dimer panels (Figure 3.3.4d-e) in comparison with Figure 3.3.4c. Essentially, the main difference between the monomeric- and dimeric-based simulations originates from the stacking interactions of the CoA thiol group from the cysteamine moiety with the aromatic side chains of F220 and H227 of the opposing subunit, as observed in the experimental X-ray structure. Interestingly, as discussed in the publication [MH1], “this interaction appears persistent under the simulation conditions. However, the thiol group of CoA eventually disengages in the majority of the simulations, demonstrated both in the rise of the RMSD (see Figure 6b and 7b, Appendix) and in the increasing distance between the cysteamine moiety of CoA and two interacting residues F220 and H227 from the neighboring subunit (Figure 12 and 13, Appendix). This release of the S-H group then allows sampling of the more folded conformations observed in the heatmaps presented in Figure 3.3.4d-e. The complete absence of this interaction in the monomeric systems allows for the increased mobility of the cysteamine and pantothenate groups exposed to the solvent from the very beginning of the simulation, resulting in a different conformational distribution (compare Figure 3.3.4c with Figure 3.3.4d-e). The RMSD patterns further illustrate the difference between monomers and dimers (see Figure 3.3.3 and Figure 6a and 7a for monomers and 6b and 7b for dimers, Appendix).”

Additionally, to examine the differences observed in the distribution of structures of CoA bound to the monomeric and dimeric enzymatic systems, it is important to mention that the binding mode of the adenosine part of CoA, present in both monomeric and dimeric models, is achieved through the interactions of the nucleotide moiety with the (activated) subunit of PFL. This binding site comprises a relatively short α-helix that includes the conserved N145, Q146, and F149. In the crystal structure (see Figure 3.1.1), the latter residue engages in stacking interactions with the smaller imidazole ring of the adenine moiety, while N145 and Q146 further strengthen binding through the formation of hydrogen bonds with the amino group of adenine. Further interaction of CoA with PFL is provided by a salt bridge between the 3’- and 5’-phosphates and K161. The distance between the adenine base and F149, its stacking partner in the crystal structure, showed that adenine remains tightly or partially bound in all examined cases (see Figure 14 and 15, Appendix). As mentioned in the publication [MH1], “from the inspection of the MD snapshots we observe that the negatively charged phosphate groups of CoA seem to form more of an exclusive interaction with the initially shared K161, but also with several other neighboring positively charged lysine or arginine residues (K118, K159, R160, K615, and K617). This could potentially affect the adenine-F149 stacking and hydrogen bonding with N145 and Q146. However, it is also confirmed that the stacking and hydrogen bonding were able to reform during the investigated simulation times, keeping the adenine of CoA bound to the enzyme surface. These changes were also reflected in the RMSD plots of CoA (see Figure 3.3.3 and Figure 6 and 7 in the Appendix).”

From a functional point of view, the most important findings regarding CoA dynamics are the following. As discussed in [MH1], “whereas the binding mode of the adenosine fragment of CoA remains as observed in the crystal structure, the binding of the S-H group to a neighboring subunit is significantly less stable. Related to this, we observe, both in monomeric and dimeric model systems and within the simulation time of 500 ns, a systematic change of the CoA conformation concerning the N-glycosidic bond, from syn to anti, changing the adenosine dihedral angle from around 110 to 80 degrees (Figure 3.3.4 and Figure 16 in the Appendix). This is in agreement with the previously mentioned hypothesis, namely, that ribose and pantothenate moiety might rotate around N-glycosidic bond, which in turn, potentially enables CoA to reach the buried active site more easily.”

Figure 3.3.5. (a) The location of the potential CoA entry channel gateway helices in the active subunit of PFL for all three systems. (b) The top ranked collective pathways identified throughout the MD simulations by CAVER. Pathways are shown in one frame as pathway centerlines where only channel with the lowest cost is shown. (c) Ensemble of CoA (pantothenate and cysteamine blocks shown as green lines) geometries derived from SMD simulations in AcPFL I, where CoA was pulled into the active site independently two hundred times. (d) The representative CoA binding mode derived from SMD simulations, forming the potential gateway in the channel, with residues in direct contact with bound CoA in AcPFL I. The representative CoA binding mode is shown in a licorice representation. Figure taken from [MH1].
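The N-glycosidic dihedral angle O4’-C1’-N9-C8 used above to follow the syn/anti transition is an ordinary four-atom torsion. A minimal sketch of a signed dihedral calculation is given below; the function name and the test coordinates are hypothetical, and the arguments would be the positions of O4’, C1’, N9, and C8 in a given snapshot.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, range (-180, 180], about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates giving a right-angle torsion and a trans arrangement.
angle_90 = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
angle_trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
```

Applying such a function to every snapshot yields the dihedral time traces from which the conformational change can be followed.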

3.3.4. Identification of Potential CoA Entry Channel

Detailed visualization of the trajectories from the unbiased MD simulations of both monomeric and dimeric PFL systems did not reveal any obvious large-scale conformational changes that would allow CoA to enter the active site from its binding location at the surface of the enzyme. Thus, although we previously showed that CoA can easily adopt a range of conformations while still bound to PFL, a clear route that can accommodate CoA on its way toward the active site remains elusive. Instead, we identified dominant channels pointing from the buried interior towards the surface of the enzyme by employing CAVER, a software tool that is widely used in drug design and structural enzymology for predicting cavities in proteins. This tool allowed us to analyze and visualize dominant tunnels or channels in all three monomeric PFL model systems by generating spherical particles from the active site to the surface of the enzyme (see section 3.2.5. for details). As discussed in [MH1], “irrespective of whether we considered acetylated or non-acetylated systems, these channels were found pointing from the active site to CoA and its binding site. What is more interesting is that in the acetylated systems, the highly ranked channels identified by CAVER when clustered converged to a single but broad channel starting above two neighboring helices positioned above the active site (Figure 3.3.5a-b). Other pathways that were mostly pointing away from the CoA site were also identified and these could potentially be relevant in the PFL activation process or product release after the completion of the catalytic cycle (Figure 3.3.6a). Figure 3.3.6b also contains results obtained on the dimeric form of the enzyme, indicating that the results from the monomer are similar.”

Importantly, the probing spheres used by CAVER are typically smaller than a CoA substrate molecule. To ascertain that the single dominant channel shown in Figure 3.3.5b is relevant and could indeed serve as a suitable pathway for the entry of CoA, a series of steered molecular dynamics (SMD) simulations was performed on all three monomeric model systems. In these simulations, the cysteamine group of CoA was pulled from the protein surface into the active site along the predefined reaction coordinate (see section 3.2.4. for details). As stated in the publication [MH1], “the distance between the cysteamine moiety and the backbone atoms of catalytic cysteines C418 and C419 was used as a reaction coordinate. In these steering simulations, CoA was able to penetrate the protein dynamically and successfully reach the active site, see Figure 3.3.5c-d. Furthermore, when tracing the atoms comprising the main chain of the deeply bound CoA significant overlap with the previously identified channel obtained by CAVER was revealed (see Figure 3.3.5b-c). This surprisingly good agreement between the CAVER modeling and tracing of the SMD trajectories supports the plausibility of the suggested entry channel.”

Figure 3.3.6. The three top ranked collective pathways identified throughout the MD simulation by CAVER in (a) monomeric and (b) dimeric model systems. Pathways are shown in one frame as pathway centerlines where channels with the lowest cost are shown. The channels shown in green are pointing towards CoA and its binding site while yellow and red pathways are oriented in different directions. Figure taken from the supporting information of [MH1].

Figure 3.3.7. Displacement vectors of all protein non-hydrogen atoms for the first five modes shown as porcupine plots and drawn in both directions. Figure adopted from the supporting information of [MH1].

A thorough inspection of the trajectories from the SMD simulations allows us to identify the enzyme residues that most frequently interact with CoA upon its entry into the active site. These amino acids also form the putative entry channel and include G167, Y172, R176, Y323, L326, F327, F432, R435, and V611 (Figure 3.3.5d). Interestingly, three of these residues (R176, F432, and R435) also participate in pyruvate binding and exclusively interact with CoA when it is deeply bound in the active site (see Figure 3.3.5d). Moreover, three hydrophobic aromatic residues (Y323, F327, and Y172) are stacked together, forming a spiral lid that further covers the active site. Furthermore, the aliphatic L326 and V611 are positioned at the top of two opposing helices, which normally keep the channel closed. These helices are a part of two longer sequences (321-327 and 607-615, respectively) that enclose the channel (shown as purple ribbons in Figure 3.3.5a). Importantly, we set the focus on the resulting gateway positioned 15 Å above the catalytic cysteines (effectively between L326 and V611), which corresponds to the point where the multiple CAVER channels converged to a single one.

Encouraged by the identification of the putative entry channel and the gateway by SMD simulations, as well as by the agreement with the results obtained from CAVER, which narrowed down the enzyme region that might be relevant for the approach of the coenzyme, we considered additional directions for the analysis. A potentially useful and powerful approach in this respect is principal component analysis (PCA), which was performed on data from the unrestrained dynamics of all six model systems, both monomeric and dimeric, as well as acetylated and non-acetylated, taking into account only protein backbone atoms, as explained in section 3.2.5. This technique transforms a multidimensional data series into a set of orthogonal vectors, or principal components (PCs), ordered so that the leading components explain the largest variance in the data. Namely, as illustrated in [MH1], “we conducted a close inspection of the first ten principal components (PC1-10) thus produced and present the displacement vectors of the first five (PC1-5) in Figure 3.3.7. The highest variance was typically associated with the movement of the termini and loops, as expected. However, important fluctuations comprising PC5 are found to be localized in the gateway region. Specifically, these fluctuations correspond to the “breathing” motion of the two opposing helices that ordinarily keep the channel closed (Figure 3.3.8).” From Figure 3.3.8 it is evident that this separation of the two interacting helices increases the width of the entire channel, which could facilitate the entry of the coenzyme into the active site. In other words, by reducing the dimensionality, PCA captures the essential movements that might be relevant for the function of the macromolecule. “Furthermore, when we consider the histograms constructed by projecting all MD snapshots onto PC5 (Figure 3.3.9), it is evident that in non-acetylated model systems, the distribution is shifted towards the closed configuration (fewer positive values) in comparison to acetylated model systems, for which the probability of the channel in an open state (more positive values) is larger. This result indicates that the topology could have a potentially important relationship with the identified channel, although more conclusive information in this direction is difficult to obtain from PCA, due to the inherent noise” [MH1].

Figure 3.3.8. A snapshot from fifth principal component showing a view down the possible gateway: (a) closed and (b) open state. Figure taken from [MH1].

Figure 3.3.9. Histograms constructed by projecting all MD snapshots of non-acetylated and acetylated monomeric and dimeric systems on fifth principal component. Figure taken from the supporting information of [MH1].
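The idea behind PCA and the projection of snapshots onto a principal component can be sketched with a deliberately small example. This is an illustration only: it extracts just the leading component by power iteration rather than the full set of components analyzed in this work, and the two-dimensional toy data and all function names are hypothetical. For a trajectory, each row would hold the flattened backbone coordinates of one aligned frame.

```python
import math


def center(rows):
    """Subtract the column-wise mean from every row (frame)."""
    n = len(rows)
    mean = [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]
    return [[x - m for x, m in zip(row, mean)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, dim = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def leading_component(cov, iterations=200):
    """Leading eigenvector of cov by power iteration.

    Assumption: the uniform start vector is not orthogonal to that eigenvector.
    """
    dim = len(cov)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


def project(centered, component):
    """Score of every frame along one principal component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Hypothetical data with almost all variance along the first coordinate.
frames = [[-2.0, 0.1], [-1.0, -0.1], [0.0, 0.0], [1.0, 0.1], [2.0, -0.1]]
centered = center(frames)
pc1 = leading_component(covariance(centered))
scores = project(centered, pc1)
```

A histogram of such scores, built separately for each model system, corresponds to the kind of projection shown in Figure 3.3.9.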

3.3.5. Topology Effects

The results from the PC analysis support the hypothesis that conformational changes in PFL might be introduced after the acetylation of cysteine in the active site. This, in turn, suggests that the chemical modification at the active site might be involved in transferring a signal, which causes structural changes in the distal regions of the enzyme and an associated change in the equilibrium between a closed state and an open state of the identified channel. To explore this possibility further, Figure 3.3.10 presents the cross-correlation maps of the α-carbon positions, which allow the dynamic cross-correlation matrices of the non-acetylated and acetylated PFL systems to be compared.

Figure 3.3.10. Residue dynamical cross-correlation map for MD simulations of (a) non-acetylated and (b) acetylated PFL systems. Cyan regions (0 to 1) represent positive correlation, while magenta regions (0 to -1) represent negative or anti-correlation. Negative values typically indicate regions that move synchronously in opposite directions. Figure taken from [MH1].

As discussed in the publication [MH1], “the plots in Figure 3.3.10 are represented by different colors with highly positive regions ranging from white to cyan and highly negative regions ranging from white to magenta. These are associated with strongly correlated and anti-correlated movements of specific residues, respectively. The correlation maps for non-acetylated and acetylated states exhibit similar overall structures, with certain additional correlations present in the acetylated systems. The additional features are indicative of higher flexibility overall and consist of some enhancement of positive correlations along the diagonal regions, coupled with the appearance of several anti-correlated regions in the off-diagonal regions. The negative signals between C-terminal residues in the range 75-125 and N-terminal residues in the range 550-650 (Figure 3.3.10b) correspond to relatively large-scale motions whose appearance, while interesting, is difficult to associate with a specific functional aspect. On the other hand, the anti-correlations of residues 320-330 with the same N-terminal residues and, particularly those between 600 and 620, appear to be highly relevant in the present context.”

Figure 3.3.11. Histograms constructed from L326-V611 distances considering all MD simulations of (a) non-acetylated and (b) acetylated monomeric and dimeric systems. The open state of the channel, whereby the distance between L326 and V611 residues is 8.8 Å, is significantly more pronounced in acetylated forms. A snapshot from free MD simulations showing a view down the possible entrance channel: (a) closed state in mPFL• model; (b) open state in mAcPFL I model. Figure taken from [MH1].
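A dynamic cross-correlation matrix of the kind shown in Figure 3.3.10 is commonly defined as C_ij = <Δr_i·Δr_j> / sqrt(<Δr_i²><Δr_j²>), where Δr_i is the displacement of atom i from its mean position. The sketch below assumes this standard normalized form and aligned frames; the three-atom toy trajectory and the function name are hypothetical.

```python
import math


def cross_correlation_matrix(trajectory):
    """Normalized dynamic cross-correlation matrix of atomic displacements.

    trajectory: list of aligned frames, each a list of (x, y, z) tuples
    (for example the α-carbon positions). Atoms that never move would give
    a zero denominator and are not handled in this sketch.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    mean = [
        [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        for i in range(n_atoms)
    ]
    disp = [
        [[frame[i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
        for frame in trajectory
    ]

    def avg_dot(i, j):
        return sum(sum(a * b for a, b in zip(d[i], d[j])) for d in disp) / n_frames

    return [
        [avg_dot(i, j) / math.sqrt(avg_dot(i, i) * avg_dot(j, j)) for j in range(n_atoms)]
        for i in range(n_atoms)
    ]


# Hypothetical toy data: atoms 0 and 1 move together, atom 2 moves oppositely.
trajectory = [
    [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(-1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
]
dccm = cross_correlation_matrix(trajectory)
```

Values close to +1 and -1 correspond to the correlated and anti-correlated regions of the map, respectively.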

Namely, we previously characterized residues 320-330 and 600-620 (highlighted in Figure 3.3.10b) as a gateway region of the identified channel (Figure 3.3.5 and 3.3.8), which is potentially important for the entry of CoA into the active site. The fact that these residues experience negligible cross-correlation before acetylation but become strongly negatively correlated after the acetylation is quite intriguing. Overall, Figure 3.3.10 shows that the acetylation of C418 in the active site has a subtle effect on the global dynamics of PFL. Among the numerous manifestations of this event, it is particularly revealing that a conformational change corresponding to the opening of the identified channel gateway (Figure 3.3.5) is significantly enhanced. Considering these results, it is useful to examine the RMSDs of all the relevant residues listed in Figure 3.3.5d from their initial position in the crystal structure for all simulated model systems (Figure 17 and 18, Appendix). The calculated RMSD values show that the channel remains close to its reference conformation for the two non-acetylated PFL• systems, while all four acetylated systems (two AcPFL I and two AcPFL II) display periods associated with significant deviations. Interestingly, we observed very similar trends in the distance calculated between L326 and V611 (Figure 3.3.11, see also Figure 19 and 20 in the Appendix), two residues that face each other across the two opposing helices (Figure 3.3.5d).

Thus, as summarized in [MH1], “by analyzing the influence of acetylation in the active site on enzyme dynamics, starting from the general idea of finding dominant channels using CAVER, followed by SMD simulations, PCA, cross-correlation matrices, and RMSD values, we have been led systematically to deeper insights. A careful analysis of their implications has led to the discovery of the L326-V611 (or L-V) distance as a simple, but telling descriptor for the channel mobility. The measurement of the L-V distance (Figure 3.3.11) in the acetylated monomeric systems, mAcPFL I and mAcPFL II, revealed the presence of two distinct states, which were not found in the non-acetylated mPFL• system (see Figure 19a and 20a, Appendix). The same observations were made for the dimeric system (Figure 19b and 20b in the Appendix). The two states observed in the L-V coordinate can be directly linked to open and closed states of the channel described above. Namely, the channel remained closed (L-V ≈ 6.8 Å, Figure 3.3.11a) in the non-acetylated systems, corresponding to the moderate RMSDs of the channel-forming residues (Figure 17 and 18, Appendix), while the open state (L-V ≈ 8.8 Å, Figure 3.3.11b) of the channel was found predominantly in the acetylated systems.”

Figure 3.3.12. (a) A snapshot of the changes observed between non-acetylated PFL• and the acetylated AcPFL I and AcPFL II forms depicting the potential cascade of changes (following the C-C cleavage in pyruvate), involving R435 (i) and the orientation of the helix containing F327, L326 and Y323 (ii). This is more clearly indicated in panels (b) and (c) for AcPFL I. Note the change in distances between R435 and F327, which appears to be related to the change in the orientation of F327, which is, in turn, related to the rotation of the whole helix and the nearly doubling of the distance of the L-V gate. See text for details. Figure taken from the supporting information of [MH1].

Figure 3.3.13. Closed state of the channel in the inactive subunit, where the distance between L326 and V611 residues is 6.8 Å, dominates in all examined forms (non-acetylated and acetylated). Histograms constructed from L326-V611 distances considering all MD simulations of dimeric systems. Figure taken from the supporting information of [MH1].
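The use of the L-V distance as a two-state descriptor can be illustrated with a short sketch. The threshold of 7.8 Å is purely an assumption made here for illustration (the midpoint between the reported closed and open values of about 6.8 Å and 8.8 Å) and is not a criterion used in this work; the distance series and function names are likewise hypothetical.

```python
import math


def lv_distance(leu_position, val_position):
    """Euclidean distance in Å between two (x, y, z) reference points.

    Which atoms of L326 and V611 define the points is left open in this sketch.
    """
    return math.dist(leu_position, val_position)


def state_populations(distances, threshold=7.8):
    """Fraction of frames classified as closed or open by a distance cutoff.

    threshold: hypothetical cutoff in Å separating the two states.
    """
    n = len(distances)
    n_open = sum(1 for d in distances if d > threshold)
    return {"closed": (n - n_open) / n, "open": n_open / n}


# Hypothetical L-V distance time series (Å) with a jump to the open state.
series = [6.7, 6.9, 6.8, 8.7, 8.9]
populations = state_populations(series)
example_distance = lv_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```

Comparing such populations between the acetylated and non-acetylated systems gives a compact numerical counterpart to the histograms in Figure 3.3.11.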

The opening events have also been investigated on a molecular level by carefully monitoring the mobility of the residues comprising the channel as well as the active site residues, focusing on the two longest MD trajectories (500 ns) for each of the two acetylated monomeric model systems. We compared these trajectories with the corresponding long trajectories of the non-acetylated systems, even though no opening events occur in the latter. Firstly, we recall that the geometries of the active sites after the acetylation (Figure 3.3.1c-d) share identical features while being distinct from that of the activated subunit in the non-acetylated system (Figure 3.3.1b). In particular, in the publication [MH1] we note that “although in the non-acetylated form (Figure 3.3.1b), pyruvate can be more tightly bound to the arginine residues (not shown in Figure 3.3.1), in the acetylated forms, either the CO2•- in Figure 3.3.1c or formate in Figure 3.3.1d, tend to display increased flexibility in their binding to the arginine residues owing to their smaller size and altered charge distributions.” Indeed, we obtained this insight by visualizing the MD trajectories. Figure 3.3.12 illustrates snapshots depicting the changes introduced on moving from the non-acetylated to the acetylated forms of PFL following the C-C bond cleavage in pyruvate. As discussed in the publication [MH1], we observed the following changes: “the R435, which appears to be more tightly bound to the pyruvate carboxylate group in PFL•, is shifted away from the acetyl group in AcPFL I and AcPFL II. There is a small change in the distance between R435 and F327, which appears to be related to a more marked change in the orientation of F327 and a corresponding rotation of the whole helix containing F327, L326, and Y323, significantly increasing the width of the L-V gate.” Although the changes described above are indeed present in the multiple opening events that were examined, it is more apt to consider them the result of a subtle dynamic interplay rather than a deterministic sequence of specific events.

Given that the CO2 radical anion is smaller than formate, we observed a larger change in the case of AcPFL I, in agreement with the results depicted in Figure 3.3.11.

Notably, while the mean values are shown in Figure 3.3.11, the indicated distances in Figure 3.3.12 are representative of a snapshot. It must also be mentioned that the opening and closing events (Figure 19 and 20, Appendix) are always associated with rather sudden jumps in the instantaneous value of the L-V distance, which strongly indicates the existence of distinct states, even though the difference in the mean descriptor values is in the same range as the thermal fluctuations. Referring to Figure 3.3.1a-b, we found that the architecture of the PFL• active site is very similar irrespective of whether we view the active or the inactive subunit. The observations from Figure 3.3.12 support our conjecture that pyruvate remains more tightly bound to the arginine preventing the kind of structural cascade observed in acetylated systems.

Furthermore, when considering the individual contributions to the histograms constructed from the L-V distances (presented in Figure 3.3.11) for the acetylated and non-acetylated dimeric and monomeric systems, it becomes evident that “although the contributions are similar for monomeric and dimeric systems regarding the closed state, the open state is somewhat more populated in the monomeric systems” [MH1]. We were not able to determine whether this difference in population originates from a slower sampling of the states in the dimeric systems or a potential mechanistic role of the inactive subunit, possibly to reduce the number of undesired molecules reaching the active site (see Figure 19b and 20b, Appendix). Interestingly, the putative channel of the inactive subunit in dimers was found to strongly favor the closed state in all performed MD simulations, for both acetylated and non-acetylated cases (Figure 3.3.13).

3.4. Conclusion

By performing this broad computational study, in which both conventional and enhanced sampling MD simulations were employed, we obtained detailed insights into the effects of temporary acetylation of PFL at C418 in the active site on the conformational dynamics of the enzyme. The results presented in this chapter and in the publication [MH1] clearly indicate that this chemical change has an important influence in favoring the open state of the channel which would potentially allow CoA to approach the active site from its initial position on the protein surface.

As summarized in [MH1], “a series of unrestrained MD simulations of 100-500 ns was carried out to investigate conformational space of PFL using topologies that correspond to the protein states before (PFL•) and after the first half-reaction (AcPFL I and AcPFL II) by using monomeric and dimeric representations of the protein. Analysis of the simulated systems revealed that the global protein conformation was stable during the simulated time and the calculated backbone deviations from the crystal structure were minor in both monomeric and dimeric model systems. On the other hand, the observed conformations of CoA ranged from the extended (syn) to the more compact (anti) states. The thiol group of CoA was found to frequently unbind from the inactive subunit of the dimer, thus allowing for a counterclockwise rotation around the N-glycosidic bond, which was also observed in the monomeric and isolated systems in solution. In all simulations, the adenosine part of CoA remained close to the binding site as characterized in the crystal structure, despite certain rearrangements in binding interactions with surrounding residues.”

An additional inspection of the trajectories with the CAVER software revealed three dominant channels connecting the protein interior with the surface regions in both the monomeric and dimeric models. Interestingly, as stated in [MH1], “the channel with the highest rank was found to point towards the CoA binding site. The other two channels pointed in different directions and could be relevant for the activation of the enzyme or product dissociation once the catalysis is terminated.” Namely, one of the detected channels leads towards the C-terminal region of the enzyme, including a portion of the Gly radical domain, while the other passes through the canonical ten-stranded α/β barrel fold on the opposite side. Moreover, SMD simulations in which the S-H group of CoA was pulled towards cysteines C418 and C419 in the active site of the monomeric systems demonstrated that CoA can fit in a channel reaching the buried active site. Furthermore, “the resulting positions of the CoA tail atoms exhibited a significant overlap with the channel identified by CAVER. This putative channel also represents the shortest and the least crowded pathway when the system is driven along the chosen reaction coordinate. Careful inspection of the SMD trajectories revealed residues capable of interacting with a channel-bound CoA. Employing the PCA in analyzing the trajectories of unrestrained dynamics in both monomeric and dimeric model systems, the focus was set on the extent of fluctuations for the first five principal components. It was demonstrated that significant fluctuations associated with PC5 are localized in a gateway region, which corresponded to a set of residues identified in the context of the CAVER and SMD analyses to encompass the putative channel” [MH1].

A comparison of the residues comprising the potential entry channel during unbiased dynamics of the monomeric systems revealed that the channel was mostly found in its closed state in the non-acetylated system (mPFL•). However, in the acetylated systems (mAcPFL I and mAcPFL II), this putative entry channel was frequently found to be in a more open state. Importantly, “it was demonstrated that the distance between L326 and V611 (L-V) can be used as an excellent descriptor of the channel opening fluctuations, which also correlates well with the fluctuations associated with PC5. The analysis of the channel residues and the L-V descriptor in the dimeric models (dAcPFL I and dAcPFL II) also revealed a more frequent visitation of the open state, only after the acetylation event (not for dPFL•)” [MH1].

On closer inspection, it was found that the acetylation of C418 loosens the binding of R435 to either the CO2•− or formate, which is closely related to the change in the orientation of the helix containing L326 and the widening of the L-V distance, providing an open state of the gateway. This molecular-level view provides insight into how the chemical change at the active site transmits a signal to a distal region. Thus, a novel insight into the details of the potential sequence of the PFL mechanism was obtained. Namely, the acetylation of the enzyme at the end of the first half-reaction triggers more fluctuations, which cause the channel for CoA entry to frequently appear in an open state. It is important to recap that none of the trajectories captured the spontaneous entry of CoA into the active site, although SMD shows this to be feasible. This event most likely occurs on even longer timescales. As concluded in the publication [MH1], “using classical MD simulations, which are hampered by the usual issues, such as insufficient sampling time and an inability to deal with the effects of electronic rearrangements on the fly, it has only been possible to consider distinct states. Nevertheless, a fascinating and remarkable result by which a minor chemical change on a length scale of a couple of Angstroms causes a change in the fluctuation spectrum of the entire enzymatic system was uncovered. The effect of this change is to shift the dynamic equilibrium of the residues of the gateway to favor an open state. The resulting channel happens to correspond to the most probable path of the S-H group of CoA from its surface-bound, crystal structure position to the buried active site it needs to reach to complete the second half-reaction.”

Importantly, this entry channel becomes available only after the completion of the first half-reaction. This is particularly intriguing in the context of the inactivation effects of exclusively small thiol compounds that can access the active site in the absence of a specific channel. Thus, the apparent prevention of the access of CoA to the active site before the completion of the first half-reaction seems to assume a functional significance, which is related to protecting the radical intermediates from the potentially harmful, premature interference of the S-H group of CoA.

Although the potential CoA entry channel has been characterized, our unbiased simulations did not reveal any evidence of the spontaneous binding of CoA into the active site of PFL through this channel. Therefore, in the next chapter, we present the effort made towards finding the entire entry pathway of CoA into the buried active site of PFL before and after the first half-reaction by employing enhanced sampling methods such as steered MD simulations and umbrella sampling. Apart from providing a reliable estimate of the associated free energy, we show that these methods not only lead to the discovery of stable and reactive conformations of the coenzyme molecule bound in the active site of the enzyme but can also provide a full explanation for the implications of conformational changes upon acetylation of PFL.

Chapter 4: Exploring Reactive Conformations of Coenzyme A during Binding and Unbinding to Pyruvate Formate-Lyase

Declaration for Chapter 4

The following chapter includes the text, clearly noted by the quotation marks (“…”), and the figures, which are a literal or partially paraphrased copy of the material published in the following article:

[MH2] Marko Hanževački, Radha Dilip Banhatti, Karmen Čondić-Jurkić, Ana-Sunčana Smith, David Matthew Smith “Exploring Reactive Conformations of Coenzyme A during Binding and Unbinding to Pyruvate Formate-Lyase” J. Phys. Chem. A 2019, 123, 9345-9356. DOI: 10.1021/acs.jpca.9b06913.

Reprinted (adapted) with permission from J. Phys. Chem. A 2019, 123, 43, 9345-9356. Publication Date: October 3, 2019. https://doi.org/10.1021/acs.jpca.9b06913. Copyright 2019 American Chemical Society.

In the publication [MH2], the research was conceived by David Matthew Smith and Ana-Sunčana Smith. David Matthew Smith supervised the research and manuscript preparation. The present chapter, which is published as part of the manuscript [MH2], follows elements of a similar protocol that Karmen Čondić-Jurkić developed in her doctoral dissertation:

Karmen Čondić-Jurkić “Strategies in the Computational Modelling of Biological Systems: Case Studies with Radical Enzymes” 2013. URN: urn:nbn:de:bvb:29-opus-48333.

All presented MD and SMD simulations, as well as the umbrella sampling and the entire analysis in the present chapter and in the manuscript [MH2], were extended and carried out by Marko Hanževački. Radha Dilip Banhatti participated in critically assessing and suggesting those analysis procedures that proved relevant for the detailed chemical insights. The manuscript was written by Marko Hanževački and Radha Dilip Banhatti, with comments and contributions from all authors.

The authors declare no conflict of interest and confirm that the above statements are true.

Date and signature:

Erlangen, June 10, 2020

Marko Hanževački Radha Dilip Banhatti Karmen Čondić-Jurkić

Ana-Sunčana Smith David Matthew Smith

4.1. Introduction

While Chapter 3 focused on uncovering how acetylation influences the dynamics of PFL, and established the presence of a putative CoA entry channel that appears after the chemical modification of the active site, the energetic factors were not yet explored. These may be extremely important for assessing the key conformational changes of the enzyme and of the substrate, and for indicating how CoA might bind near the active site.

In the introductory section of Chapter 2, we noted that one of the most prominent tasks of enzymes is to catalyze biochemical transformations efficiently: by binding substrates at the active site, they increase the rate at which the chemical reactions within cells convert substrates into products. In many enzymes, it is well known that both enzyme and substrate undergo important structural changes that are coupled to the underlying catalytic cycle, as illustrated in Scheme 4.1.1.

Scheme 4.1.1. Enzymatic catalysis model.

Namely, enzymes are dynamic entities that can adopt multiple conformations upon substrate binding,1-4 and important structural changes are often required before the substrate binds in the active site of an enzyme, either to facilitate or to prevent this event.5-10 As highlighted in the publication [MH2], “these aspects underline the key nature of the conformational free energy landscape of the enzyme, in the presence of the substrate, for a complete understanding of the structure-function relationship in biologically relevant systems.11-14” Moreover, accurate assessment of the free energy of substrate binding to the enzyme is key to understanding enzyme catalysis and the fundamental molecular and physical mechanisms behind it. When the substrate binds to an enzyme, or even during its entry into the enzyme’s active site, both experience conformational changes which are prerequisites for the chemical reaction in the active site. Because most biological systems are intrinsically dynamic and flexible, the reliable assessment of the energetics of such processes becomes even more complex and requires many complementary techniques.

Turning our discussion back to the importance of conformational dynamics and binding free energy estimation in radical enzymes, all the above-mentioned elements become especially relevant to the catalytic mechanism of pyruvate formate-lyase (PFL).15-17 Namely, this enzyme is subjected to subtle conformational changes to accommodate substrates and efficiently perform its catalytic function. Naturally, PFL places two very diverse substrates in its active site, one being a small and rigid organic molecule (pyruvate) and the other a significantly larger and flexible coenzyme A (CoA). As explained earlier in Chapter 3, this ability to adjust to the size and the type of the substrate allows this enzyme to operate via a bi-substrate ping-pong mechanism, catalyzing the reversible transformation of pyruvate and CoA into formate and acetyl-CoA in two half-reactions (see Scheme 4.1.2a). As already indicated, PFL occurs as a homodimeric enzyme with one active site per subunit; however, only one monomeric unit per dimer is fully active (half-site reactivity).18 PFL initially stores the radical on G734, which is part of a larger Gly radical domain, deeply buried in the core of the active subunit.19

Scheme 4.1.2. (a) The overall reaction catalyzed by PFL is described by two half-reactions, which is consistent with ping-pong kinetics and the formation of the acetyl-enzyme intermediate. (b) An alternate representation of the two half-reactions broken into a sequence of steps is shown here. The notation introduced here represents the active site state of PFL before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction. See text for details. Scheme taken from [MH2].
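
For orientation, the bi-substrate ping-pong mechanism mentioned above is commonly associated with the following textbook initial-rate law. This is a general expression for ping-pong kinetics in the absence of products, not a result of this thesis. The symbols are introduced here only for illustration: A and B denote the two substrates (here pyruvate and CoA), V_max the limiting rate, and K_m^A and K_m^B the respective Michaelis constants.

```latex
v = \frac{V_{\max}\,[\mathrm{A}][\mathrm{B}]}
         {K_{\mathrm{m}}^{\mathrm{B}}[\mathrm{A}] + K_{\mathrm{m}}^{\mathrm{A}}[\mathrm{B}] + [\mathrm{A}][\mathrm{B}]}
```

The denominator contains no constant term, which is why double-reciprocal plots of 1/v against 1/[A] at different fixed [B] give parallel lines, the usual kinetic signature of a mechanism that proceeds through a covalently modified enzyme intermediate.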

As reviewed in [MH2], “the two half-reactions can be separated into several key steps and visualized as follows (see Scheme 4.1.2b). The first step in the mechanism is hydrogen abstraction from the nearest cysteine C419 by the glycyl radical G734•.” Since G734 is adjacent to two conserved catalytic cysteines in the active site, namely C418 and C419, the following step is the H-transfer from C418 to C419, which generates the C418• thiyl radical (PFL•). “The third step is still a part of the first half-reaction and includes the addition of this thiyl radical to the carbonyl C-atom of pyruvate (S-C bond formation), which creates a tetrahedral oxy-radical intermediate in the fourth step. The cleavage of the C-C bond in pyruvate causes the collapse of the tetrahedral radical intermediate yielding the acetylated enzyme at C418 and the carbon dioxide radical anion CO2•− (AcPFL I, see step 5 of Scheme 4.1.2b). Two possible scenarios for quenching the radical from CO2•− to yield formate are direct quenching of the radical by CoA which activates the coenzyme (step 6a, Scheme 4.1.2b), or the radical is quenched by C419 reforming a thiyl radical on C419 (AcPFL II) which then activates CoA (step 6b, Scheme 4.1.2b). The final step involves acetyl transfer from C418 to CoA in the second half-reaction and product release (steps 7 and 8, Scheme 4.1.2b)” [MH2].

In the publication [MH2] we also note that during the first half-reaction with pyruvate, “CoA is bound on the protein surface, 30 Å away from the active site of PFL, taking the part of the spectator of the reaction.18 After the first half-reaction CoA has to enter the active site, become activated, and pick up the acetyl group from PFL, which constitutes the second half-reaction.17-19 As mentioned earlier this demands significant conformational changes of the substrate and also that of the enzyme.20”

“Although enzymes are dynamic molecules that constantly fluctuate and change their conformation and while some of these intrinsic conformational motions are crucial for enzyme’s activity and could impact on the general aspects of underlying catalytic mechanisms, it is not in general expected to affect the rate of the catalyzed reaction. No evidence of dynamical influence on the turnover rate of the reaction has ever been established by experiments or theories21-23” [MH2].

The following facts are known about catalysis in PFL: experimental kinetic studies at 30 °C show that the forward reaction (770 s⁻¹) is about three times faster than the reverse reaction (260 s⁻¹).15,24-26 Furthermore, as reported in [MH2], “by employing a relatively simple partial cluster model, the reaction mechanism proposed in Scheme 4.1.2b was confirmed using quantum mechanics (QM) calculations.27-31 From this calculation it was shown that the addition of the thiyl radical to pyruvate, yielding the oxy-radical tetrahedral intermediate, has the highest energy barrier (12.3 kcal/mol), indicating that this could be the rate-limiting step of the reaction.”
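
To put the turnover numbers quoted above on an energy scale, they can be converted into apparent activation free energies with the Eyring equation. The following is a minimal sketch and not part of the original analysis. It assumes that each turnover number can be treated as a single first-order rate constant and that the transmission coefficient equals one; the function name `eyring_barrier` is introduced here for illustration only.

```python
import math

# Physical constants: Boltzmann and Planck constants in SI units,
# gas constant in kcal/(mol K).
K_B = 1.380649e-23
H = 6.62607015e-34
R_KCAL = 1.987204e-3


def eyring_barrier(rate_per_s: float, temperature_k: float) -> float:
    """Apparent activation free energy in kcal/mol for a first-order rate
    constant, from k = (k_B*T/h) * exp(-dG/(R*T)).
    Assumption: transmission coefficient of 1."""
    prefactor = K_B * temperature_k / H  # k_B*T/h in 1/s
    return R_KCAL * temperature_k * math.log(prefactor / rate_per_s)


T = 303.15  # 30 degrees Celsius in kelvin
forward = eyring_barrier(770.0, T)  # forward turnover quoted in the text
reverse = eyring_barrier(260.0, T)  # reverse turnover quoted in the text
print(f"forward: {forward:.1f} kcal/mol, reverse: {reverse:.1f} kcal/mol")
```

Under these assumptions the two turnover numbers correspond to apparent barriers of about 13.8 and 14.4 kcal/mol, which is the same order of magnitude as the QM barrier quoted above. This comparison is only indicative, since the two quantities are obtained in very different ways.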

In the previous chapter and the publication [MH1], we used extensive atomistic molecular dynamics (MD) simulations to investigate the impact of a chemical change in the active site on PFL conformational dynamics. Several complementary methods demonstrated that the acetylation of the protein plays a critical role in governing the behavior of a potential channel of entry for CoA.32 Namely, as reviewed in [MH2], “utilizing a CAVER33 tool, the three major channels connecting the buried active site with the protein surface were found in all examined model systems (both monomeric and dimeric non-acetylated PFL•, and acetylated AcPFL I and AcPFL II). The channel with the highest rank invariably led towards the surface binding site of CoA. The geometries of the active site in each of the three model systems are again shown in Figure 4.1.1a-c. In the acetylated systems, when the highly ranked channels were clustered, they were found to converge to a single but broad channel passing between two neighboring helices positioned above the active site. Surprisingly, the same dominant channel was also identified when using steered molecular dynamics (SMD) simulations, on the monomeric model systems, in which the S-H group of CoA was pulled towards cysteines C418 and C419, confirming that it was feasible for CoA to be accommodated in a channel reaching to the active site.”

Further, as summarized in the publication [MH2], “by employing principal component analysis (PCA), it was demonstrated that significant fluctuations associated with the fifth principal component (PC5) corresponded to a set of residues located in a distal region, about 15 Å from the active site. The fluctuations of these residues, which corresponded to those identified in the context of the CAVER and SMD analyses, reflected the opening and closing of the identified channel. In particular, the distance between L326 and V611 (L-V) was found to be an excellent descriptor whether the state of the channel is open or closed (see Figure 4.1.1d-e), providing a gateway for the putative channel for CoA entry, identified by CAVER and SMD.32”
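
The use of the L-V distance as a two-state descriptor can be illustrated with a short sketch. The example below is hypothetical: it assumes one representative point per residue and per frame (for instance a Cα position), uses made-up coordinates rather than simulation data, and introduces a cutoff value (`OPEN_CUTOFF`) that is not taken from this section.

```python
import math

# Hypothetical cutoff (in angstrom) separating closed from open gateway
# states; this section of the text does not give a numerical threshold.
OPEN_CUTOFF = 10.0


def open_fraction(l326_positions, v611_positions, cutoff=OPEN_CUTOFF):
    """Return the per-frame L-V distances and the fraction of frames in
    which the distance exceeds the cutoff (counted as 'open')."""
    distances = [math.dist(a, b) for a, b in zip(l326_positions, v611_positions)]
    n_open = sum(1 for d in distances if d > cutoff)
    return distances, n_open / len(distances)


# Made-up coordinates for four frames (not simulation data).
l326 = [(0.0, 0.0, 0.0)] * 4
v611 = [(7.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0), (0.0, 11.0, 0.0)]
lv, frac = open_fraction(l326, v611)
print(lv, frac)
```

In practice the per-frame positions would be read from the MD trajectories, and the cutoff would be chosen from the observed distribution of L-V distances.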

Additional insight into how a chemical change at the active site might transfer a signal to a distal gateway region of the protein was obtained by “demonstrating that active site acetylation loosens the binding of R435 to either the CO2•− or formate, which, in turn, is closely related to the change in the orientation of the helix containing L326 and hence the widening of the L-V distance to transition in an open state” [MH2].

Figure 4.1.1. The structure of PFL monomeric system. Models representing the active site of PFL active subunit before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction are shown in panels (a), (b) and (c) respectively. The fluctuations of the opposing helices (residues 321-327 and 607-615) containing L326 and V611 at the gateway of the identified CoA entry channel between the closed and open state are shown in panels (d) and (e) respectively. The location of the radical in each model system is denoted by the symbol ‘•’ and is shown in panels (a), (b) and (c). Figure taken from [MH2].

Despite the importance of the newly discovered connection between protein acetylation and the possibility for CoA to enter the active site, many aspects of CoA entry and binding close to the active site of PFL and the ensuing interactions remain elusive. Namely, as mentioned in [MH2], “utilizing classical MD, no evidence for a spontaneous entry of CoA could be gleaned in the timescale of performed extensive unrestrained MD simulations. One could infer that the timescale of entry of CoA through the identified channel is either long or that entry is an infrequent event. The latter view seems physically reasonable given that the catalytic cycle in PFL is a slow one with the turnover for the forward reaction being 770 s⁻¹ and, possibly, the entry of CoA could be the rate-limiting step.15,24 By this we do not imply that conformational changes of CoA or PFL alter the activation barrier, but that there is an energy cost for this entry.”

With the entry pathway for CoA established in Chapter 3, this chapter aims to address the energetics of the entry process and whether the progression of CoA through the identified channel to the active site is a simple one or whether there are significant barriers along the path. This is essential to ascertain, as the coenzyme needs at least one reactive bound state close to the active site to complete the catalytic cycle. As stated in [MH2], “to sample these bound states and to characterize the free energy (FE) landscape of this entry process in all three model systems, the two lines of investigation have been chosen. First, using SMD, a series of forward and reverse pulls, navigating CoA through the identified channel was performed. This involves the application of time-dependent forces to the system along a predefined 1D reaction coordinate (RC).34,35 Using starting structures generated from these pulls, the corresponding PMFs have been further evaluated using umbrella sampling (US).”
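
The idea of steering a system along a 1D reaction coordinate with a time-dependent force can be sketched on a toy model. The example below is not the protocol used in this work. It assumes a constant-velocity harmonic restraint, overdamped Langevin dynamics, and a one-dimensional double-well potential standing in for the real free energy landscape; all parameter values and names (`steered_pull`, `potential_force`) are hypothetical.

```python
import math
import random

KT = 0.6  # thermal energy in kcal/mol, approximately 300 K


def potential_force(x):
    """Force from the toy double-well potential U(x) = 5 * (x**2 - 1)**2."""
    return -20.0 * x * (x * x - 1.0)


def steered_pull(k_spring=50.0, step=0.002, x_start=-1.0, x_end=1.0,
                 dt=0.001, gamma=1.0, seed=0):
    """Drag x from one well to the other with a moving harmonic restraint
    and accumulate the external work done on the system."""
    rng = random.Random(seed)
    x = x_start
    center = x_start
    work = 0.0
    noise = math.sqrt(2.0 * KT * dt / gamma)
    while center < x_end:
        # Work increment: dW = (dU_bias/d center) * d center,
        # with U_bias = 0.5 * k * (x - center)**2.
        work += -k_spring * (x - center) * step
        center += step
        # Overdamped Langevin step under the physical and restraint forces.
        force = potential_force(x) - k_spring * (x - center)
        x += force * dt / gamma + noise * rng.gauss(0.0, 1.0)
    return x, center, work


x_final, center_final, work = steered_pull()
print(x_final, center_final, work)
```

Repeating such pulls in both directions gives a distribution of work values; because a pull at finite speed is a non-equilibrium process, single pulls are better suited to generating starting structures than to providing a free energy directly, which is in line with the subsequent use of US described above.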

Further, “in the second line of investigation, the final structures from the forward pulls of CoA through the channel were utilized to investigate its dissociation from the active site without applying any restraints. Thus, via the unrestrained dissociation dynamics, CoA and the active site residues can enjoy their full flexibility. Here too, the US was carried out using starting structures from selected trajectories, as for the forward and reverse pulls. Besides, a direct analysis has been performed, using the time traces of the RC to construct histograms describing the frequency of sites populated by the CoA tail in the channel. This yields the spatial distribution of CoA as a function of the reaction coordinate, which can be transformed into an alternative approximation to the FE” [MH2].
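
The transformation of a spatial distribution along the RC into an approximate free energy can be sketched as a Boltzmann inversion of the histogram, F = -kT ln P. The example below is a minimal illustration under stated assumptions: the time trace is made up, the bin width and the value of kT (0.6 kcal/mol, approximately 300 K) are chosen for illustration, and the function name `free_energy_profile` is hypothetical.

```python
import math


def free_energy_profile(rc_values, bin_width, kt=0.6):
    """Histogram a reaction-coordinate time trace and return a dictionary
    {bin_center: F} with F = -kT * ln(count / max_count), so that the most
    populated bin defines the zero of free energy."""
    counts = {}
    for value in rc_values:
        index = math.floor(value / bin_width)
        counts[index] = counts.get(index, 0) + 1
    max_count = max(counts.values())
    profile = {}
    for index in sorted(counts):
        center = (index + 0.5) * bin_width
        profile[center] = -kt * math.log(counts[index] / max_count)
    return profile


# Made-up RC values in angstrom (not simulation data).
trace = [5.1, 5.3, 5.2, 5.4, 6.1, 5.6, 7.2, 5.8]
profile = free_energy_profile(trace, bin_width=1.0)
for center, energy in profile.items():
    print(f"{center:.1f} A: {energy:.2f} kcal/mol")
```

Bins that are never visited have no defined value in this scheme, and the estimate is only meaningful where many independent trajectories contribute to the histogram.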

In taking up these two routes of investigation, the aim was to perform adequate sampling to build reliable statistics using US simulations, which implies extensive sampling of small increments of the RC. However, as described in [MH2], “in the case of unrestrained dynamics, this requires accruing many independent dissociation trajectories to obtain a reliable distribution. Indeed, by employing extensive sampling in both approaches, valuable information could be obtained on the energetically favorable positions of CoA close to the active site, which would enable the triggering of the second half-reaction.”
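
For reference, the sampling of small increments of the RC in US is usually formulated with a harmonic bias in each window. The expressions below are the standard textbook form and are given only as a sketch; the symbols (window index i, window center \xi_i, force constant k_i, biased distribution P_i^b, offset C_i) are introduced here and are not taken from this chapter.

```latex
w_i(\xi) = \tfrac{1}{2}\, k_i \left(\xi - \xi_i\right)^2 ,
\qquad
F_i(\xi) = -k_{\mathrm{B}} T \,\ln P_i^{\mathrm{b}}(\xi) \;-\; w_i(\xi) \;+\; C_i
```

Each window yields the free energy only up to the constant C_i, so the windows must overlap sufficiently for the pieces to be combined into a single profile, for example with a reweighting scheme such as WHAM. Which scheme was used here is not stated in this section.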

As will be presented in detail in the following sections, the approach that includes dissociation dynamics provides valuable information about the structure-function relationship of the binding process of CoA near the active site of PFL. It also confirms results obtained with the US of selected snapshots of the same process but yields a more detailed free energy profile. Furthermore, the free energy profile calculated from the spatial distribution along the RC enables the identification of stable states of CoA near the active site of PFL. Such unique structures might potentially represent the intermediary reactive conformations in the PFL catalytic cycle and will be carefully analyzed and discussed in this chapter.

4.2. Methods

4.2.1. Model Systems

For the present study, monomeric models of PFL representing the system before (non-acetylated, PFL•) and after (acetylated, AcPFL I and AcPFL II) the first half-reaction were built (see Figure 4.1.1a-c). All models and their parametrization are described in detail in the previous chapter and in the publication [MH1], wherein the thorough unrestrained MD simulation of PFL was carried out [32]. Note that each of these three models contains a different radical species, as described earlier and depicted in Figure 4.1.1a-c. Note also that all models contain CoA bound at the surface of the enzyme according to the available X-ray structure.

4.2.2. Reaction Coordinate

Since SMD simulations were used in both lines of investigation in the present study, we describe in detail the procedure employed for the forward and backward pulling simulations of CoA concerning the active site of PFL. As illustrated in [MH2], “the reaction coordinate (RC) was defined as the distance between the centers of mass of two groups of atoms. One group is formed by eight backbone atoms connecting the active site catalytic cysteines C418 and C419 (N, Cα, C, O), while the other group is made of four heavy atoms at the thiol end of CoA, which all belong to the cysteamine moiety. The latter group is referred to as “CoA tail” in this thesis. These two groups are depicted, along with the RC length scale, in Figure 4.2.1.” An alternative definition of RC was not considered relevant for this study since the present one is well-defined based on the analysis and the quantification that emerged in Chapter 3. Note that the previously characterized L-V gateway of the entry channel is located 15 Å away from the Cys loop in the active site of PFL. See Figure 4.2.1 for details.

Figure 4.2.1. The reaction coordinate (RC) used in the SMD simulations and FE calculations. Details of the center of mass of the two groups are depicted within magenta circles. (a) Representative structure of CoA tail in the near vicinity of the channel gateway depicted by two opposing helices shown as purple ribbons in AcPFL I model system. (b) Final structure taken from SMD of CoA approaching the active site through the putative entry channel along the chosen RC in AcPFL I system. Figure adapted from [MH2].
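The reaction coordinate defined in this section is a distance between two centers of mass, which can be illustrated with a minimal Python sketch. The coordinates below are hypothetical placeholders, not values from the simulations, and the element assignment of the four CoA tail atoms (C, C, N, S) is an assumption made only for the illustration.

```python
import numpy as np

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms (coords: N x 3, in Å)."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()

def reaction_coordinate(cys_coords, cys_masses, tail_coords, tail_masses):
    """Distance (Å) between the centers of mass of the two atom groups."""
    com_cys = center_of_mass(cys_coords, cys_masses)
    com_tail = center_of_mass(tail_coords, tail_masses)
    return float(np.linalg.norm(com_cys - com_tail))

# Hypothetical input for illustration only (not taken from the thesis):
# eight backbone atoms (N, CA, C, O of C418 and C419) placed at the origin
# and four heavy atoms of the CoA tail placed 7 Å away along z.
cys_masses = [14.007, 12.011, 12.011, 15.999] * 2
cys_coords = np.zeros((8, 3))
tail_masses = [12.011, 12.011, 14.007, 32.06]  # assumed C, C, N, S
tail_coords = np.tile([0.0, 0.0, 7.0], (4, 1))

rc = reaction_coordinate(cys_coords, cys_masses, tail_coords, tail_masses)
```

In practice the coordinates would be read frame by frame from a trajectory, giving one RC value per saved frame.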

4.2.3. Forward Pulls with Steered Molecular Dynamics

The initial distance between the two groups of atoms describing the RC in Figure 4.2.1 was 20-25 Å, while the final target distance was set to 5 Å in all pulling simulations. A simple harmonic distance restraint with a force constant of 5 kcal/mol/Å² was used to steer the system along the defined coordinate. Because of the relatively low force constant employed and, mostly, because of steric clashes, the final distances attained were closer to 6 Å. Following the logic reported in the publication [MH2], “starting ensembles of conformations for the pulling were generated from ten suitable snapshots, which themselves were chosen from several independent extensive unrestrained MD runs on each of the three model systems. Each of these snapshots had the most promising conformations, namely, the L-V gateway in an open state as well as the proximity of CoA to this gate.” The descriptors used for this selection were previously defined and examined geometrical parameters, for example the CoA dihedral, the distance between L326 and V611, or the distance between the cysteamine moiety of CoA and the potential entry channel gateway. Further details of this choice are described in the methods section of the previous chapter [32].

“Each of these ten snapshots was prepared for the forward pulls by performing distance-restrained simulations for each system where a weak harmonic restraint (1 kcal/mol/Å²) was applied so that the CoA tail could explore various states for the initial value of RC. The duration of such restrained MD simulations was 20 ns and starting snapshots for the forward pulls were collected every nanosecond. Thus, 200 forward pulls could be obtained. The duration of each pull was 10 ns, which yielded a total of 2 µs of pulling simulation time. Note, however, that the two hundred SMD trajectories investigated per each system had the pulling speeds that varied between about 2 and 3 Å/ns, due to different starting values of RC” [MH2]. The final geometries of the coenzyme molecule, obtained after the CoA tail has been pulled along the predefined reaction coordinate towards the cysteines in the active site of PFL, were taken for further analysis.

4.2.4. Reverse Pulls

Reverse pulls were carried out by steering the CoA tail away from the active site, starting from the final SMD structures obtained after the forward pulls in non-acetylated and acetylated model systems. The starting and target values of RC were set to 5 and 40 Å, respectively. We employed a similar protocol as in the case of the forward pulls, namely we specified the force constant (5 kcal/mol/Å²), the number of pulls (200), and the duration of each reverse pulling simulation (10 ns) which gave rise to a total simulation time of 2 µs. In the methodology section of the publication [MH2] we explained that “in the case of the forward pulls, the initial RC was chosen close to the gateway, and using reverse pulls we wished to additionally examine the region in the bulk solvent which the CoA tail could potentially explore while the adenine moiety of CoA remains bound to the surface of the protein. This implies that in this case, a uniform speed for each pull, namely, 3.5 Å/ns, was used.”
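The uniform pulling speed quoted for the reverse pulls follows directly from the start and target values of the RC and the duration of each pull. A short Python sketch of this arithmetic is given below; the linear schedule for the restraint center is an assumption made for illustration, and the function names are hypothetical.

```python
def pulling_speed(rc_start, rc_target, duration_ns):
    """Average steering speed in Å/ns for a pull from rc_start to rc_target."""
    return abs(rc_target - rc_start) / duration_ns

def restraint_center(t_ns, rc_start, rc_target, duration_ns):
    """Restraint center (Å) at time t_ns, assuming a linear steering schedule."""
    fraction = min(max(t_ns / duration_ns, 0.0), 1.0)
    return rc_start + fraction * (rc_target - rc_start)

# Reverse pulls as described in the text: 5 Å -> 40 Å in 10 ns, 200 pulls.
speed = pulling_speed(5.0, 40.0, 10.0)              # Å/ns
total_time_us = 200 * 10.0 / 1000.0                 # total pulling time in µs
center_mid = restraint_center(5.0, 5.0, 40.0, 10.0) # center halfway through a pull
```

With these inputs the speed evaluates to 3.5 Å/ns and the accumulated pulling time to 2 µs, in agreement with the values stated above.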

4.2.5. Unrestrained Dissociation Dynamics from the Active Site to the Protein Surface

For each of the three model systems, we selected 15 out of the 200 forward pulls based on the criterion that CoA entered the active site through the L-V gate. After equilibration of all chosen snapshots, unrestrained MD simulations were carried out ten times for each snapshot using different initial conditions. In each of the resulting 150 production MD simulations, the CoA tail was allowed to freely explore the interior of the channel, from the regions near the catalytic cysteines in the active site of PFL to those close to the L-V gate and beyond. Since the aim is to detect any potentially reactive and stable bound states of CoA in each of these model systems, the results of the trajectory analysis are presented as time traces of the RC written every 10 ps and tracked over 10 ns, which amounts to a total simulation time of 1.5 µs for each of the three model systems.

4.2.6. Analysis

We performed a similar analysis procedure as reported in the publication [MH2], namely “an appropriate estimation of the free energy was obtained by performing umbrella sampling simulations on selected snapshots from forward and reverse pulls, as well as those obtained from the unrestrained dissociation dynamics. The following restraints were applied to the predefined RC: the force constant for the region between 6 and 10 Å was set to 2.5 kcal/mol/Å² while a weaker force constant of 1.25 kcal/mol/Å² was applied for the region beyond 10 Å. The starting structures were minimized and relaxed for 100 ps. The equilibrated structures were then subjected to constrained NVT production simulation for a duration of 10 ns. The sampling regions of the RC were divided optimally to be about 0.5 Å to allow for an overlap of histograms and the data was accumulated every 10 steps (0.02 ps) yielding 500 000 data points for each 0.5 Å. The WHAM code was utilized to construct individual PMFs [36].


Master PMFs were calculated for each model system by combining all the corresponding data using the WHAM code. Further, due to the small bin size used here, the master PMFs constructed as arithmetic means of the individual PMFs were identical to those obtained with WHAM.”
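The umbrella sampling layout described above can be sketched in a few lines of Python. The sketch lists window centers at 0.5 Å spacing, assigns the two force constants, and estimates the width of the RC distribution that a harmonic restraint alone would produce, which gives a rough feeling for the histogram overlap. Several points are assumptions and not taken from the thesis: the restraint form E = k (x − x0)² (some MD packages include a factor 1/2), the temperature of 300 K, the upper window limit of 18 Å, and the assignment of the window at exactly 10 Å to the stiffer restraint.

```python
import numpy as np

KB = 0.0019872041     # Boltzmann constant in kcal/(mol K)
TEMPERATURE = 300.0   # K, assumed for this sketch

def window_centers(rc_min, rc_max, spacing=0.5):
    """Equally spaced umbrella window centers (Å)."""
    n_windows = int(round((rc_max - rc_min) / spacing)) + 1
    return rc_min + spacing * np.arange(n_windows)

def force_constant(center):
    """2.5 kcal/mol/Å^2 up to 10 Å and 1.25 beyond (boundary choice assumed)."""
    return 2.5 if center <= 10.0 else 1.25

def expected_width(k, temperature=TEMPERATURE):
    """Standard deviation (Å) of the RC for E = k (x - x0)^2 on a flat profile."""
    return float(np.sqrt(KB * temperature / (2.0 * k)))

centers = window_centers(6.0, 18.0)  # upper limit is a hypothetical choice
windows = [(float(c), force_constant(c), expected_width(force_constant(c)))
           for c in centers]

# 10 ns per window, RC written every 0.02 ps
points_per_window = int(round(10_000.0 / 0.02))
```

Under these assumptions the restraint-only widths are roughly 0.35 Å and 0.49 Å, comparable to the 0.5 Å spacing, which is consistent with the stated goal of overlapping histograms, and each window contributes 500 000 data points.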

Tentatively, from the extensive statistics of 150 000 values for the time traces of the RC in each model system obtained from the unrestrained dissociation dynamics, histograms showing the frequency of regions populated by the CoA tail in the channel have been constructed. The free energy profiles were then estimated from these distributions as a function of RC using the standard cpptraj analysis tool. All MD simulations were carried out using the pmemd module of the AMBER 16 package [37-40], while the structures were visualized using VMD 1.9.3 [41].
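The conversion of an RC histogram into a free energy profile can be illustrated with the following NumPy sketch. It is not the cpptraj workflow used in the thesis, only an equivalent direct-counting calculation; the bin range, the bin number and the temperature of 300 K are assumptions, and the input data are synthetic.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_profile(rc_values, rc_min, rc_max, n_bins, temperature=300.0):
    """Return bin centers (Å) and G = -kB*T*ln(P), shifted so that min(G) = 0.

    Empty bins are omitted because their free energy is undefined.
    """
    counts, edges = np.histogram(rc_values, bins=n_bins, range=(rc_min, rc_max))
    bin_centers = 0.5 * (edges[:-1] + edges[1:])
    populated = counts > 0
    prob = counts[populated] / counts.sum()
    g = -KB * temperature * np.log(prob)
    return bin_centers[populated], g - g.min()

# Synthetic example: a state near 7 Å visited nine times as often as one near 11 Å.
rc_trace = np.array([7.0] * 900 + [11.0] * 100)
bin_centers, g = free_energy_profile(rc_trace, rc_min=5.0, rc_max=15.0, n_bins=20)
```

For this synthetic trace the less populated state lies kB·T·ln 9 (about 1.3 kcal/mol at 300 K) above the more populated one.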


4.3. Results

4.3.1. Insights from Forward and Reverse Pulls

The work profiles determined from SMD simulations in the non-acetylated (PFL•) and the acetylated (AcPFL I and AcPFL II) systems are presented in the upper and lower panels of Figure 4.3.1 for the forward and the reverse pulls, respectively. In the forward pulls, the work required in the majority of the two hundred steering trajectories is negligibly low down to an RC of about 18 Å, below which it increases steadily and steeply. In particular, we discussed in the publication [MH2], that “the obtained values of work necessary to deliver S-H group of CoA from the protein surface into the active site were found to be relatively high, ranging from 50 to 200 kcal/mol for the acetylated models, while being even higher, up to 300 kcal/mol, for the non-acetylated system. Interestingly, there is no evidence of a decrease of work as CoA nears the active site in neither of the three model systems.”
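For orientation, the work accumulated in a steered simulation can be obtained by integrating the derivative of the restraint energy with respect to the moving restraint center. The Python sketch below shows this for a harmonic restraint. It assumes the energy form E = k (r − r0)², which may differ by a factor 1/2 in some MD packages, and it uses a synthetic trajectory in which the RC lags the restraint center by a constant 0.5 Å; it is not the analysis script used for Figure 4.3.1.

```python
import numpy as np

def smd_work(rc, rc0, k):
    """Cumulative work (kcal/mol) along a steered pull.

    rc  : actual values of the reaction coordinate (Å)
    rc0 : corresponding positions of the moving restraint center (Å)
    k   : force constant for the assumed form E = k * (rc - rc0)**2
    """
    rc = np.asarray(rc, dtype=float)
    rc0 = np.asarray(rc0, dtype=float)
    de_drc0 = -2.0 * k * (rc - rc0)  # dE/d(rc0) at each frame
    # trapezoidal integration of dE/d(rc0) over the path of the restraint center
    increments = 0.5 * (de_drc0[1:] + de_drc0[:-1]) * np.diff(rc0)
    return np.concatenate(([0.0], np.cumsum(increments)))

# Synthetic forward pull: center moves from 20 Å to 5 Å, RC lags by 0.5 Å.
rc0 = np.linspace(20.0, 5.0, 1001)
rc = rc0 + 0.5
work = smd_work(rc, rc0, k=5.0)
```

In this idealized case the total work is 2 · 5 · 0.5 · 15 = 75 kcal/mol. Real work profiles fluctuate because the lag between the RC and the restraint center varies along the pull.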

For comparison, in the case of reverse pulls, the calculated work values are found to be lower overall (50 to 100 kcal/mol). Furthermore, a clear minimum was detected around 10 Å, which even extends to about 15 Å in all three examined systems. These features display favorable sites where the CoA tail could bind, potentially poising itself for the second half-reaction and the extraction of the acetyl group in the active site. Also, we highlighted in [MH2], that “the trajectories corresponding to the minimum are those that pass through the L-V gateway. Furthermore, a closer examination of the CoA tail in trajectories corresponding to both forward and reverse pulls through the L-V gate, see Figure 4.3.2 shows that they have geometrically similar pathways. Based on this scrutiny, several snapshots from amongst these trajectories were selected for performing extensive sampling of the RC using umbrella sampling simulations.”

Figure 4.3.1. Trajectories obtained from the SMD simulations in which CoA is pulled towards (forward) and away (backward) from the active site before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction. The smooth lines represent the amount of work as a function of the values of the RC for each of the two hundred runs during the (a) forward and (b) reverse pulling. Figure adapted from the supporting information of [MH2].

Figure 4.3.2. Ensemble of CoA (S-atom of CoA tail depicted as green spheres) geometries derived from SMD simulations of both non-acetylated and acetylated systems, where CoA was pulled (a) into the active site from the protein surface (forward pulls) and (b) from the active site to the protein surface (reverse pulls), independently showing that the dominant CoA entry and exit pathways are through the L-V gateway. Note that the CoA conformation shown here corresponds to that in AcPFL I model system. Using the scale displayed at the right, the different regions of the CoA entry channel can be discerned. Figure taken from [MH2].

4.3.2. Potentials of Mean Force from Umbrella Sampling

Besides the trajectories selected from forward and reverse pulls, to perform the US simulations we also included the snapshots from unrestrained dissociation dynamics, based on the criterion that the starting structures resembled those that passed through the L-V gate in the initial protocol. Thus, in Figure 4.3.3, the potentials of mean force (PMFs) obtained with the US method (see methods section 4.2.6 for details) are presented for all three cases and each of the three model systems. Namely, as outlined in the publication [MH2], “each row corresponds to the parent process and the fourth panel shows the master PMFs for all three systems for that parent process. The nine panels showing the actual PMFs display the following three trends. The first clue is similar to the information obtained from work profiles (see Figure 4.3.1), the PMFs effectively show no cost of work for RC values greater than 18 Å. Secondly, the energy required for delivering the CoA tail from the protein surface near the active site shows a steady decrease when going down the column for each system. This corresponds to an average of 40 to 60 kcal/mol for the forward pulls, between 20 and 40 kcal/mol for reverse pulls, and between 15 and 30 kcal/mol for unrestrained dissociation dynamics, respectively. One can attempt to understand this trend by the following line of argument. The forward pulls, over 10 ns, may not allow for enough relaxation and flexibility of the moieties and residues of the protein as the CoA tail is steered through the channel, and may also not allow for complete internal torsional flexibility of CoA itself. However, for the reverse pulls, the simulations were started by equilibrating the structures with CoA already close to the active site. Thus, certain degrees of freedom of the coupled system is already realized. In the case of unrestrained dynamics, CoA and the protein are entirely free to mutually adjust to one another, without the interference of external restraints. However, note that even in this case the history of effects of the restraints in the generation of the US starting structures is visible in the final PMFs.”

Figure 4.3.3. Potentials of mean force (PMFs) derived from umbrella sampling simulations with 10 ns windows simulation time and calculated using WHAM. A total of 8, 9 and 12 trajectories of forward and 3, 5 and 5 of reverse pulls were used to construct the PMFs depicted in the upper and middle panels for the non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems, respectively. For the unrestrained dynamics, the PMFs (lowest panel) have been constructed using 3, 4 and 4 unrestrained dissociation trajectories. Master PMFs for all three model systems were constructed using all PMFs for each system. Note that the relevant barrier in all cases corresponds to that at RC ≈ 8 Å. The higher energies observed at RC ≈ 6 Å are already indicative of steric clashes of CoA with the active site and are not relevant for our discussion. See text for a detailed discussion of the various PMFs shown in all the twelve panels. Figure taken from [MH2].

Furthermore, as highlighted in [MH2], “the third trend that needs to be discussed pertains to the structures that seem to be indicated based on these PMFs. First, by considering those resulting from the forward pulls across the three systems, it is observed that in both acetylated systems, a hint of a barrier around 9 Å and a minimum around 7 Å is visible in several of the profiles. This trend is somewhat attenuated in the master PMF corresponding to the forward pulls, where the three master PMF curves, orange, green and blue, corresponding respectively to PFL• and AcPFL I and AcPFL II, are obtained from combining the PMF data displayed in the panels at the left. A similar observation, namely the minimum at 7 Å and the barrier at 9 Å, can be made for the profiles resulting from the reverse pulls. Especially in the case of reverse pulls, we note that the attenuation in the master curves is less, implying that there is a considerable loss of detailed information apparent from the combination of the individual profiles.”

We now focus on discussing the individual PMFs resulting from unrestrained dissociation dynamics, shown in the bottom panel of Figure 4.3.3. Remarkably, a clear indication of minima near 7 Å is now obtained in all the PMF profiles. Indeed, the minima are persistent enough to remain visible in the combined master PMFs, depicted in the fourth bottom panel of Figure 4.3.3. This result we consider to be an indication of an intrinsic feature of the examined systems.

A similar conclusion has been drawn by considering the geometrical aspects of the unrestrained dynamics underlying the corresponding energetics from PMFs. In this context, Figure 4.3.4 depicts the trail of the CoA tail in the channel during unrestrained dissociation for all three systems. In addition to obvious regions of occupancy, between the active site and the L-V gateway, in the publication [MH2] we stated that “it is apparent that CoA is more likely to dissociate from the channel in the non-acetylated system than in the acetylated counterparts. The time traces in Figure 4.3.5 confirm this aspect, showing the greater propensity for CoA in the PFL• system to diffuse towards the bulk water than in either of the AcPFL I or AcPFL II systems.”

Figure 4.3.4. Ensemble of S-atom of CoA cysteamine moiety (shown as transparent spheres) derived from unrestrained dynamics starting from 150 final snapshots taken after forward pulls in both non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems. CoA was left to explore the channel from the active site to the protein surface through the L-V gate. Panels (a), (b) and (c) show the CoA tail S-atom occupation of certain regions in the channel in the PFL• (orange), AcPFL I (green) and AcPFL II (blue) model systems, respectively. Figure taken from the supporting information of [MH2].

Generally, a few additional statements can also be made regarding the resulting PMFs from unrestrained dynamics shown in Figure 4.3.3. For example, as reported in [MH2], “the energy of the bound states appears lower for acetylated compared to the non-acetylated systems. Further, it can be seen that two of the four trajectories shown for the US in the AcPFL I system have PMFs corresponding to energies as low as of about 10 kcal/mol and that there are minima at about 7 Å (close to the active site) and about 10 Å. These could very well be indicative of bound states. Surprisingly, while low energy states in the PFL• and AcPFL II systems are not visible, indications of these states are found as inflection points. These detailed aspects are, however, lost to a large extent in the master PMF, as pointed out earlier.” This encouraged us to proceed with the second analysis step, namely investigating the information contained in all generated trajectories for the unrestrained dissociation dynamics.


4.3.3. Detailed Characterization of Bound States

To gain a deeper insight into the nature of these bound states, the time series (see Figure 4.3.5) were analyzed and processed to directly obtain the resulting distributions of the chosen reaction coordinate (see analysis in the methods section 4.2.2 for details), which are plotted as histograms in Figure 4.3.6a. Namely, as discussed in the publication [MH2], “several maxima in RC distribution were observed both in acetylated and non-acetylated models. There is one positioned around 7 Å, where the frequency of the population of this RC increases from PFL• to AcPFL II and nearly doubles in the case of AcPFL I. Further, the three systems show similar propensities to populate the region of the RC between 10 to 12 Å. Both these peaks can be said to correspond to stable bound states. However, further analysis would help us understand whether these sites also correspond to reactive sites, where there are interactions possible with the key moieties of the active site. Interestingly, there is an additional small peak visible in the AcPFL I system around 9 Å and unlike PFL• and AcPFL II, this system shows no further peaks inside the channel. From umbrella sampling, see Figure 4.3.3, bottom panel, it is visible that the US-PMFs show no energy cost beyond 14 Å. The same trend is observed also in the histograms, see Figure 4.3.6a, although the unrestrained dissociation trajectories should be focused primarily on RC distances less than 14 Å, where the statistics are sufficiently significant.”

Hence, considering that the histograms in Figure 4.3.6a reflect probability distributions, we constructed free energy profiles (see Figure 4.3.6b) to reveal the nature of the sites within the L-V gate and channel. The zero-energy reference point is chosen automatically at the lowest minimum. It is clear from these profiles that the entire energy cost for unrestrained dissociation is of the order of 2 kcal/mol for all three systems, which is significantly lower than the 10 kcal/mol obtained for the two trajectories in the AcPFL I system from the US (bottom panel of Figure 4.3.3). In particular, we summarized in the publication [MH2], that “part of this discrepancy may arise because the direct counting approach does not offer as systematic connection with larger values of the RC as does the US methodology. On the other hand, direct counting is completely free from the influence of imposed restraints and therefore allows both the enzyme and the coenzyme to explore their mutual phase spaces freely. Indeed, the restrictive effect of constraints arising from the initial structures is present in the PMFs for forward pulls, reverse pulls, and unrestrained dynamics (Figure 4.3.3), with a systematic lowering due to what could be considered as less drastic restraining protocols as one goes from top to bottom panel of Figure 4.3.3. Thus, while the US is a well-established equilibrium procedure that essentially describes the larger energy landscape, in a restricted channel, where flexibility and inner degrees of freedom of the moieties could play an important role, analyzing the spatial distribution of unrestrained dynamics can offer important complementary information.”

Figure 4.3.5. Time traces of distance between groups of atoms defining a reaction coordinate over the 10 ns of unrestrained MD simulations of the initial bound states starting from 150 final snapshots obtained with forward pulls in (a) PFL•, (b) AcPFL I and (c) AcPFL II systems. Figure adapted from the supporting information of [MH2].

Figure 4.3.6. (a) Histograms constructed from time traces (Figure 4.3.5) of distance between groups of atoms defining the RC over the 10 ns of unrestrained MD simulations starting from multiple final snapshots obtained from forward pulls showing the distribution of the RC in non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems. (b) Free energy profiles constructed from histograms shown in panel (a) in non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems. Free energy along collective variables is calculated as a difference in free energy going from one position along the coordinate R to another position using the expression ΔG(R) = -kBT ln(P(R1)/P(R2)), which allows the conversion of a histogram of probabilities along a particular coordinate to the free energy change associated with moving along that coordinate. Representative conformations of stable bound states of CoA in the active site in (c) PFL•, (d) AcPFL I and (e) AcPFL II obtained from unrestrained dynamics. The location of the radical in each model system is denoted by the symbol ‘•’ and is shown in panels (c), (d) and (e). Figure taken from [MH2].

Furthermore, given that we explored several favorable (minima) and non-favorable (barriers) states within the channel using the unrestrained dynamics, we can also view this as an appropriate landscape available for CoA on its way into the channel from the gateway. In connection to this, we found that “there is a rich source of information that can be extracted from the various features from the free energy profiles displayed in Figure 4.3.6b. When focusing on the domain close to the active site and in the channel, between about 6 and 15 Å, the two well-defined minima are observed, corresponding to the maxima observed in Figure 4.3.6a, both similar energies. One of these occurs at about 7 Å and the other lies between 11 and 12 Å. The first minimum at about 7 Å is similar for all three systems, with that for PFL• being higher and slightly shifted to the left. The nature of the other minimum shows different features for the three different systems. Observing this more distant minimum from the point of CoA entry into the channel, a few comments can be made. While it is a clear minimum for AcPFL II, it is, following a small maximum at about 13 Å, part of a continuing energetically favorable process of CoA entry for PFL•. On the other hand, for AcPFL I, it appears to be a favorable yet shallow minimum” [MH2].

The major energy barrier that separates the two minima can be observed in Figure 4.3.6b for all examined model systems. Interestingly, in both acetylated systems, the barrier heights are found to be lower than the one obtained in the non-acetylated system. We concluded in the publication [MH2], that “in the case of PFL• system, CoA resides more frequently in the region between 10 to 14 Å due to the larger barrier at 8.5 Å. As will be discussed later in the text, those infrequent events of CoA reaching the active site, although potentially feasible from an energetics perspective, would potentially increase the chance of inactivation. Moreover, the approach of CoA in AcPFL I is found to be energetically favorable, with relatively low barriers, right until the deep minima at about 7 Å, which are referred to henceforth as a reactive bound state. In the case of AcPFL II, a broad shallow minimum was observed between 9 and 13 Å and a sharp barrier at around 8.5 Å, decreasing the probability of its transition to the reactive bound state.”


4.3.4. Characterization of Reactive Bound States

Using representative snapshots of the coenzyme when residing near the active site and stacked between the L-V gates shown in the upper panel of Figure 4.3.6c-e, a closer view of CoA interactions with the residues comprising the active site is presented. After careful analysis of these MD snapshots, we stated in [MH2], that “in their stable bound conformation, all three systems share several common features. Firstly, CoA is mostly in the anti-conformation, with its main chain extended and positioned between two opposing helices comprising the gateway, shown as purple ribbons (between L326 and V611). In these conformations, the adenine moiety of CoA is occasionally binding and unbinding from its stacking partner F149 whereas the phosphate groups of CoA are strongly interacting with positively charged residues (lysine and arginine) on the protein surface near the putative channel entrance. The pantothenate methyl groups of CoA are also found to be in close hydrophobic interactions with the gateway residues, mostly interacting with the aliphatic sidechain of V611. Further down the channel, the peptide motifs as well as the hydroxyl group, of bound CoA, are found to occasionally form hydrogen bonds with the backbone atoms of E325 and L326.”

Due to the formation of stabilizing hydrogen bonds close to the active site of PFL, we characterized the crucial interactions of CoA with the sidechain of R435 that provide favorable positioning of CoA for the hydrogen abstraction that is required to initiate the second half-reaction. Namely, as highlighted in the publication [MH2], “while, in all three systems, a hydrogen of the guanidine group in R435 interacts with one of the peptide bond oxygens of CoA, it is different hydrogen in non-acetylated PFL• than in the acetylated AcPFL I and AcPFL II systems. Note from Figure 4.3.6c-e that the orientation of R435 in the non-acetylated systems is different from that observed in the acetylated systems. The reason for this was discussed in detail in the previous Chapter 3, where we found two aspects that contributed to it. One is the fact that the inherent flexibility of the geometry of the active site (and regions close to it) is larger in the acetylated systems. This is due to the looser binding of the CO2•− and formate compared to pyruvate. Superimposed on this are larger-scale flexibilities, which initially served to signal the gateway to be in a more open state. Due to the tighter binding of pyruvate with R435, the geometry of the interaction between R435 and the peptide oxygen of CoA is different. This is consistent with both the shift in the peak of the maximum distribution towards lower RC values that are observed in Figure 4.3.6a and the increase in energy of the minimum as noted in Figure 4.3.6b for the PFL• system.” In particular, we also explained in [MH2], that “the order of decrease in energy of the minimum around 7 Å in the three systems, shows a direct correlation to the R435 binding to the substrates in the active site (see Figure 3.3.11 in the previous chapter). The lower parts of Figure 4.3.6c-e depict zoomed regions close to the CoA tail and the active site. Here the details of the differences between the three systems become apparent, particularly in terms of the interaction of the S and H atoms of the CoA tail interacting with different atoms in the active site.”

Figure 4.3.7. Heatmaps in panels (a), (b) and (c) were constructed by taking snapshots from both non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems, respectively. The geometrical parameters were monitored, namely H-abstraction coordinate (x-axis) distances were plotted against CoA-acetylation coordinate (y-axis) distances. The heatmap surface is given by Gi = -kBT ln(Ni/Ntot), where kB is Boltzmann’s constant, T is the temperature (300 K), Ni is the population of bin i while Ntot represents the total number of configurations of each system (Ntot is 150 000 for each system). The rightmost scale in the panels (a), (b) and (c) corresponds to relative free energy in kcal/mol obtained by counting the structures in each bin for each of our three model systems, respectively. Panels (d), (e) and (f) show 2D representative structures of CoA interacting with the residues in the active site of the non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) systems, respectively. The location of the radical in each model system is denoted by the symbol ‘•’. These 2D structures of CoA bound in the active site of PFL show the important contacts of CoA with the active site compartments in each model system and are also depicted in the zoomed part in the lower panels of Figure 4.3.6c-e, where more structural details are provided. Figure taken from the supporting information of [MH2].


To obtain more detailed insights, we monitored the two distances that are relevant for the ensuing second half-reaction in each of the three model systems. As defined in [MH2], “the first distance is identified as the H-abstraction coordinate: in the case of PFL•, it is the distance between the thiol hydrogen of CoA tail and the radical-carrying sulfur atom of C418 thiyl radical; in the case of the AcPFL I and AcPFL II model systems, it is the distance between the same CoA thiol H-atom and the carbon of CO2•− or the sulfur of C419•, respectively. The other distance of interest we denote as the CoA acetylation coordinate: for PFL•, it is the distance between S-atom of the CoA tail and carbonyl C-atom of pyruvate; for AcPFL I and AcPFL II, it is the distance between the sulfur of CoA tail and the carbonyl C-atom in C418-attached acetyl, see Figure 4.3.6c-e.”

We obtained the heatmaps, presented in Figure 4.3.7, by plotting the population of the CoA acetylation coordinate against the population of the H-abstraction coordinate. Notably, as discussed in the publication [MH2], “only in the case of AcPFL I a strong correlation was observed where the most populated regions of the value of H-abstraction coordinate are short enough to allow a productive transfer (as low as 1.8 Å) while coinciding with low values of the CoA-acetylation coordinate (about 3.5 to 4 Å). Additionally, a 2D representation of these contacts is presented in Figure 4.3.7d-f.”
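The construction of such a heatmap can be sketched in a few lines. The example below is a minimal illustration of the expression G_i = −k_B T ln(N_i/N_tot) given in the caption of Figure 4.3.7, not the analysis script used for the thesis: the function name `free_energy_surface`, the bin count, the shift of the minimum to zero, and the synthetic distance arrays are all assumptions made for illustration.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_surface(x, y, bins=50, temperature=300.0):
    """Relative free energy per 2D bin, G_i = -kB*T*ln(N_i/N_tot).

    x, y: per-frame distances (e.g. H-abstraction and CoA-acetylation
    coordinates, in Angstrom). Empty bins are left as NaN. The surface is
    shifted so that its lowest bin is zero (an illustrative choice).
    """
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins)
    prob = counts / counts.sum()
    g = np.full(prob.shape, np.nan)
    populated = prob > 0
    g[populated] = -KB_KCAL * temperature * np.log(prob[populated])
    g -= np.nanmin(g)
    return g, xedges, yedges

# Synthetic stand-in for 150 000 trajectory frames (NOT simulation data).
rng = np.random.default_rng(0)
h_abstraction = rng.normal(2.0, 0.3, 150_000)
coa_acetylation = rng.normal(4.0, 0.4, 150_000)
g, xedges, yedges = free_energy_surface(h_abstraction, coa_acetylation)
```

The resulting array `g` can be passed to any plotting routine; the same counting applied to a single coordinate yields one-dimensional profiles.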

The resulting free energies corresponding to the distributions of each of the two key distances for all model systems were calculated and plotted in Figure 4.3.8a-b. Although both plots show various features as a function of these two distances, to learn more about the chemical implications of the reactive bound state (RC ≈ 7 Å, see Figure 4.3.6b), we set the focus on the first minima in Figure 4.3.8a-b. As mentioned in the context of the heatmaps, we concluded in [MH2], that “only in the case of AcPFL I, a sharp minimum was found at distances as small as about 1.8 Å and about 4 Å for the H-abstraction, and CoA-acetylation coordinates respectively. In the case of PFL• and AcPFL II, the position of this first minimum or inflection, respectively, for the H-abstraction coordinate is at around 3.5 Å and higher in energy compared to the minimum for AcPFL I. While the shift in the position could be partially attributed to the size of sulfur in the case of PFL• and AcPFL II, the raised cost in the energetics could be associated with the lack of favorable charge interaction with the S-H bond in the absence of CO2•− (as in PFL• and AcPFL II). In Figure 4.3.8b, the CoA-acetylation coordinate has an overall similar free energy profile for all systems. However, in the acetylated systems, a clear minimum is apparent at about 4 Å, which is due to the interaction with the carbon from the acetyl group, indicating the effect of the similar topologies of the acetyl carbon and its surroundings. In the non-acetylated system, this carbon is part of pyruvate and has a significantly different position than when attached to C418.”

Figure 4.3.8. Free energy profiles of (a) hydrogen abstraction and (b) acetylation coordinates calculated by counting the contacts between CoA atoms (S or H) and atoms comprising the active site of PFL (S or C) during 1.5 µs of unbiased dissociation MD simulations in each model system. Free energy differences were calculated using the analogous expression from Figure 4.3.6. Figure taken from [MH2].

4.3.4. Implications Emerging for the Catalytic Process

The possible mechanistic outcomes, starting from each of the three model systems, are illustrated in Figure 4.3.9. Namely, as summarized in the publication [MH2], “for the non-acetylated PFL• system, a hydrogen transfer from CoA to C418• would infer the participation of CoA before the first half-reaction. This is inconsistent with experimental observations and is, therefore, not considered further in the context of a productive mechanism. On the other hand, H-abstraction from CoA in either of the acetylated model systems results in the initiation of the second half-reaction. The H-atom could be received by CO2•−, in the case of AcPFL I, or by C419•, in the case of AcPFL II. The subsequent acetyl transfer from acetyl-C418 to the activated CoA is set to occur in the same way in both cases.”

Interestingly, from the ample evidence presented earlier in the text, it emerges that “the AcPFL I system has both structural and energetics advantages necessary for triggering the second half-reaction. Although, based on the study shown in the previous chapter, it was found that the open state of the gateway for the CoA entry channel favored acetylated systems in general, and AcPFL I in particular, it was still far from clear which factors determine the energetics as CoA approaches the active site. While the orientation of the R435, which is crucial for the hydrogen bonding of CoA with guanidine group, is similar both in AcPFL I and AcPFL II, it is the nature of the radical, namely, CO2•− rather than C419•, which appears to be responsible for ensuring a stronger bonding with the hydrogen of the CoA thiol moiety. The result is that the AcPFL I topology is the only variant that presents distances conducive to abstracting the thiol H-atom from CoA while, simultaneously, presenting values of the acetylation coordinate consistent with the subsequent acetyl transfer.

Despite the large barrier to reaching the active site in the non-acetylated system, PFL•, a small but finite frequency of the population of CoA close to the substrate was found. In these configurations, the abstraction of the thiol H-atom from CoA could only be carried out by C418• (Figure 4.3.6c). However, this eventuality would prevent C418• from adding to the pyruvate substrate, thus precluding the first half-reaction from taking place. This implies that however infrequent is the event of CoA achieving the stable binding pose in the active site, such an event would lead to the premature quenching of the active site radical and the inactivation of the enzyme. This line of thinking provides additional support for the functional significance of a CoA entry channel whose opening occurs after the first half-reaction, namely acetylation, as opposed to before the first half-reaction with pyruvate32” [MH2].

Figure 4.3.9. Impact of possible chemical outcomes of the second half-reaction on the overall PFL catalysis starting from non-acetylated and acetylated model systems with CoA present in the active site of PFL. Single-electron movements are shown by red single-headed arrows. (a) In the case of the PFL• system, the H-transfer from the CoA tail to C418• is found to occur in a single step that is very likely to cause premature quenching of the radical at C418•, which leads to inactivation of PFL. (b) In the AcPFL I system, the first step is the H-transfer from the S-H group of CoA to CO2•−, which yields formate (HCO2−) and the CoA radical. The second step is the acetylation of CoA and regeneration of the radical on C418. (c) For the AcPFL II system, the first step is the H-transfer between the S-H group of CoA and C419•, which produces the CoA radical; the second step is then the same reaction as in the case of the AcPFL I system. Figure adapted from [MH2].


4.4. Conclusion

In this chapter, we presented the results of a comprehensive computational study carried out to uncover the routes by which CoA enters and exits the buried active site of PFL through the previously characterized channel (results presented in Chapter 3 and the publication [MH1]). The main outcome of this investigation is the identification of the different conformations that CoA adopts close to the active site in the three model systems, which has helped in gaining a more definitive and complete understanding of the catalytic mechanism of PFL. As highlighted in [MH2], “in the absence of detailed experimental data, the entry and exit pathways have been examined through extensive SMD and unrestrained dissociation simulations. These simulations have also been supplemented with a significant measure of additional sampling, both in the presence and absence of external restraints.”

The energetics derived from the SMD profiles (Figure 4.3.1) cannot be regarded as reliable, since the work values observed initially are too high. This is evident from the comparison with the PMFs obtained from umbrella sampling that used structures from the SMD trajectories as starting geometries (see upper panels of Figure 4.3.3). However, despite extensive sampling, the calculated PMFs remain dependent on the protocol used to generate their initial conditions (including unrestrained dissociation, as shown in the lower panels of Figure 4.3.3). Indeed, more realistic energetics of the CoA approach towards the active site are obtained when the protein and CoA are allowed to adjust to one another in the restraint-free simulations. Interestingly, the information obtained from the master (averaged) PMFs follows a similar trend. These results led us to explore the least invasive protocol, that of unbiased unbinding, in more detail.

“By repeating the spontaneous dissociation simulations on multiple occasions, many trajectories relevant for the binding and the unbinding events was generated (Figure 4.3.5). On one hand, this protocol was the most useful means of generating appropriate initial structures for umbrella sampling simulations, as shown in Figure 4.3.3. On the other hand, it allowed the identification of several significant basins of attraction for the CoA tail in the near vicinity of the active site (Figure 4.3.4). This analysis was found to be consistent with the umbrella sampling results while, nevertheless, providing significantly more detailed structural insights close to the active site (Figure 4.3.6)” [MH2].


Two important conclusions arise after combining this information with the possible catalytic scenarios (Figures 4.3.6, 4.3.7, and 4.3.8). We summarized in the publication [MH2], “firstly, based on the obtained structure, the AcPFL I arrangement is significantly more likely to initiate the second half-reaction than AcPFL II. In other words, the activation of CoA by H-abstraction is more likely to be performed by CO2•− than it is by C419•. This is because the reactive bound state for AcPFL I is in a significantly superior position for CoA activation than is its AcPFL II analog (Figure 4.3.8). Secondly, in the PFL• arrangement, a possible reactive bound state where H-transfer from CoA to C418• might occur was identified. Such a transfer would prevent the first half-reaction and thus be counterproductive in the context of the overall mechanism.” Functionally, therefore, it would be preferable for CoA not to approach the active site before the first half-reaction is complete. As such, this finding offers a strong rationale for the observed correlation between acetylation of the enzyme at the active site and the open state of the entry channel, as described in detail in Chapter 3 and the publication [MH1].

As concluded in [MH2], “a more complete characterization of the PFL catalytic mechanism requires an investigation of the chemical steps, including intermediates and transition states, using more sophisticated QM methods. Such an approach would require a variety of initial structures differing significantly from the available crystal structure, particularly in terms of the second half-reaction with CoA. In this respect, in addition to their structurally based mechanistic importance explained earlier, the reactive bound states identified herein constitute an excellent basis for a complete QM/MM characterization of the mechanism of PFL.” The results of the QM/MM study of both half-reactions performed by employing the newly discovered reactive bound states are presented in the following Chapter 5.


4.5. References

1. Kovermann, M.; Grundström, C.; Sauer-Eriksson, A. E.; Sauer, U. H.; Wolf-Watz, M. Proc. Natl. Acad. Sci. USA 2017, 114, 6298-6303.
2. Hammes, G. G. Biochemistry 2002, 41, 8221-8228.
3. Gutteridge, A.; Thornton, J. FEBS Lett. 2004, 567, 67-73.
4. Secundo, F. Chem. Soc. Rev. 2013, 42, 6250-6261.
5. Csermely, P.; Palotai, R.; Nussinov, R. Trends Biochem. Sci. 2010, 35, 539-546.
6. Cuzzolin, A.; Sturlese, M.; Deganutti, G.; Salmaso, V.; Sabbadin, D.; Ciancetta, A.; Moro, S. J. Chem. Inf. Model. 2016, 56, 687-705.
7. De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. J. Med. Chem. 2016, 59, 4035-4061.
8. DesJarlais, R. L.; Sheridan, R. P.; Dixon, J. S.; Kuntz, I. D.; Venkataraghavan, R. J. Med. Chem. 1986, 29, 2149-2153.
9. Doerr, S.; De Fabritiis, G. J. Chem. Theory Comput. 2014, 10, 2064-2069.
10. Durrant, J. D.; McCammon, J. A. BMC Biol. 2011, 9, 71.
11. Berntsson, R. P.; Smits, S. H.; Schmitt, L.; Slotboom, D. J.; Poolman, B. FEBS Lett. 2010, 584, 2606-2617.
12. Kingsley, L. J.; Lill, M. A. Proteins 2015, 83, 599-611.
13. Richard, J. P. J. Am. Chem. Soc. 2019, 141, 3320-3331.
14. Robinson, P. K. Essays Biochem. 2015, 59, 1-41.
15. Knappe, J.; Blaschkowski, H. P.; Gröbner, P.; Schmitt, T. Eur. J. Biochem. 1974, 50, 253-263.
16. Conradt, H.; Hohmann-Berger, M.; Hohmann, H. P.; Blaschkowski, H. P.; Knappe, J. Arch. Biochem. Biophys. 1984, 228, 133-142.
17. Unkrig, V.; Neugebauer, F. A.; Knappe, J. Eur. J. Biochem. 1989, 154, 723-728.
18. Becker, A.; Fritz-Wolf, K.; Kabsch, W.; Knappe, J.; Schultz, S.; Wagner, A. F. V. Nat. Struct. Biol. 1999, 6, 969-975.
19. Becker, A.; Kabsch, W. J. Biol. Chem. 2002, 277, 40036-40042.
20. Backman, L. R. F.; Funk, M. A.; Dawson, C. D.; Drennan, C. L. Crit. Rev. Biochem. Mol. Biol. 2017, 52, 674-695.
21. Olsson, M. H. M.; Mavri, J.; Warshel, A. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 2006, 361, 1417-1432.
22. Olsson, M. H. M.; Parson, W. W.; Warshel, A. Chem. Rev. 2006, 106, 1737-1756.
23. Warshel, A.; Bora, R. P. J. Chem. Phys. 2016, 144, 180901.
24. Knappe, J.; Neugebauer, F. A.; Blaschkowski, H. P.; Gänzler, M. Proc. Natl. Acad. Sci. USA 1984, 81, 1332-1335.
25. Wagner, A. F.; Frey, M.; Neugebauer, F. A.; Schäfer, W.; Knappe, J. Proc. Natl. Acad. Sci. USA 1992, 89, 996-1000.
26. Buis, J. M.; Broderick, J. B. Arch. Biochem. Biophys. 2005, 433, 288-296.
27. Himo, F.; Eriksson, L. A. J. Am. Chem. Soc. 1998, 120, 11449-11455.
28. Leppanen, V. M.; Merckel, M. C.; Ollis, D. L.; Wong, K. K.; Kozarich, J. W.; Goldman, A. Structure 1999, 7, 733-744.
29. Zhang, W.; Wong, K. K.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 2001, 40, 4123-4130.
30. Parast, C. V.; Wong, K. K.; Lewisch, S. A.; Kozarich, J. W.; Peisach, J.; Magliozzo, R. S. Biochemistry 1995, 34, 2393-2399.
31. Reddy, S. G.; Wong, K. K.; Parast, C. V.; Peisach, J.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 1998, 37, 558-563.
32. Hanževački, M.; Čondić-Jurkić, K.; Banhatti, R. D.; Smith, A.-S.; Smith, D. M. Chem. Eur. J. 2019, 25, 8741-8753.
33. Chovancova, E.; Pavelka, A.; Benes, P.; Strnad, O.; Brezovsky, J.; Kozlikova, B.; Gora, A.; Sustr, V.; Klvana, M.; Medek, P.; Biedermannova, L.; Sochor, J.; Damborsky, J. PLoS Comput. Biol. 2012, 8, e1002708.
34. Izrailev, S.; Stepaniants, S.; Balsera, M.; Oono, Y.; Schulten, K. Biophys. J. 1997, 72, 1568-1581.
35. Narayanan, C.; Bernard, D. N.; Doucet, N. Catalysts 2016, 6, 81.
36. Grossfield, A. “WHAM: the weighted histogram analysis method”, version 2.0.9.1, http://membrane.urmc.rochester.edu/wordpress/?page_id=126.
37. Case, D. A.; Cerutti, D. S.; Cheatham, T. E., III; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke, H.; Goetz, A. W.; Greene, D.; Homeyer, N.; Izadi, S.; Kovalenko, A.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Mermelstein, D.; Merz, K. M.; Monard, G.; Nguyen, H.; Omelyan, I.; Onufriev, A.; Pan, F.; Qi, R.; Roe, D. R.; Roitberg, A.; Sagui, C.; Simmerling, C. L.; Botello-Smith, W.-M.; Swails, J.; Walker, R. C.; Wang, J.; Wolf, R. M.; Wu, X.; Xiao, L.; York, D. M.; Kollman, P. A. AMBER 2017, University of California: San Francisco, CA, 2017.
38. Götz, A. W.; Williamson, M. J.; Xu, D.; Poole, D.; Le Grand, S.; Walker, R. C. J. Chem. Theory Comput. 2012, 8, 1542-1555.
39. Salomon-Ferrer, R.; Götz, A. W.; Poole, D.; Le Grand, S.; Walker, R. C. J. Chem. Theory Comput. 2013, 9, 3878-3888.
40. Le Grand, S.; Götz, A. W.; Walker, R. C. Comput. Phys. Commun. 2013, 184, 374-380.
41. Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graph. 1996, 14, 33-38.


Chapter 5: A QM/MM Study of Ping-Pong Mechanism of Both Half-Reactions and Inactivation by Coenzyme A in Pyruvate Formate-Lyase


Declaration for Chapter 5

The following chapter includes the text and the figures, which are currently part of a manuscript in preparation for the publication in a significantly shorter form:

[MH3] Marko Hanževački, Ana-Sunčana Smith, David Matthew Smith “A QM/MM Study of Ping-Pong Mechanism of Both Half-Reactions and Inactivation by Coenzyme A in Pyruvate Formate-Lyase”, in preparation.

The research was conceived by David Matthew Smith and Ana-Sunčana Smith. Marko Hanževački performed all QM/MM calculations and analyses. Marko Hanževački is the author of the current chapter.

Date and signature:

Erlangen, June 10, 2020

Marko Hanževački Ana-Sunčana Smith David Matthew Smith


5.1. Introduction

The glycyl radical enzyme pyruvate formate-lyase (PFL)1-5 catalyzes a reversible transformation of pyruvate and coenzyme A (CoA) to formate and acetyl-CoA via a ping-pong mechanism.6-11 This catalytic activity of PFL is found to be carried out in two half-reactions which include rearrangements in the active site, conformational changes of the enzyme, and substrate entry in the active site. The mechanism of the 1st half-reaction is proposed based on the available crystal structures of PFL with its first substrate pyruvate already positioned in the active site. For the 2nd half-reaction, the conformational changes of the enzyme and the entry of the second substrate CoA in the active site are crucial.12

In Chapter 3 we reported the results of extensive unrestrained all-atom MD simulations conducted on monomeric PFL model systems before (PFL•) and after (AcPFL I and AcPFL II) the 1st half-reaction (see Figure 5.1.1a for the non-acetylated and Figure 5.1.1b-c for the acetylated model systems, respectively) to quantify the influence of the chemical change in the active site of PFL on its dynamics. We sought evidence of any conformational changes that would allow CoA to approach the active site of PFL more closely, which is a prerequisite for triggering the 2nd half-reaction. We found that a subtle chemical rearrangement in the active site, namely acetylation at C418, causes changes in the fluctuation spectra of distal regions localized at the protein surface. By analyzing the obtained trajectories, we identified a potential CoA entry channel, which enabled us to characterize the important gateway residues comprising this channel and to obtain molecular insights into the cascade of signaling events that lead to the opening of this gate after the 1st half-reaction.13

Since we could not find evidence for a spontaneous entry of CoA into the active site of PFL in any of our model systems, despite hundreds of nanoseconds of unbiased dynamics, we went on to employ steered molecular dynamics simulations to simulate these entry/exit pathways, and umbrella sampling to estimate the free energy profiles of such events. In parallel, we allowed unrestrained dissociation dynamics of CoA from the active site of PFL following forward pulls, analyzed these trajectories, built meaningful statistics, and extracted free energy profiles. A significant result emerged from this study, namely, that CoA experiences relatively low energy barriers (in the range of a few kcal/mol) when approaching the active site of PFL, both in the acetylated (AcPFL I and AcPFL II) and non-acetylated (PFL•) model systems. We could also closely examine the resulting dynamic conformations of the moieties of PFL and CoA, and show that CoA binds to the guanidine group of R435 via different hydrogen bonding patterns in all three model systems, with an especially strong preference for interactions with the active site compartments in AcPFL I, mostly due to favorable H-C interactions between the S-H group of CoA and CO2•−. Both these results were presented in detail in Chapter 4.14

Figure 5.1.1. Models representing the snapshots of the PFL active site before (a) and after (b-c) the first half-reaction. The location of the radical is denoted by the symbol ‘•’.

So far, we have focused on finding stable binding poses of CoA in the active site of PFL and the conformational requisites for such a process using MD simulations. Despite these interesting findings, a few questions related to the catalytic mechanism of PFL remain unanswered, e.g. the influence of the enzyme on the reaction mechanism and the identity of the key residues in the mechanism itself. What do the radical intermediates and transition states look like, and how many steps are involved in the overall catalytic cycle? What enables the reversibility of the catalysis, and what are the potential rate-determining steps for both half-reactions?

The currently accepted catalytic mechanism of PFL, proposed by Knappe et al.,15 was derived from an earlier mechanism proposed by Kozarich and co-workers16 and is shown in Figure 5.1.2. The catalysis proceeds in two half-reactions following ping-pong kinetics with an acyl-enzyme intermediate, as presented in Chapter 3. Moreover, the reaction is fully reversible with relatively high turnover in both directions (kcat being 770 and 260 s⁻¹ for the forward and backward directions, respectively).1
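To relate the quoted turnover numbers to the energy scale of computed barriers, one can convert a rate constant into an apparent activation free energy with the Eyring equation of transition state theory. The sketch below is illustrative only: it assumes that kcat reflects a single rate-limiting step, a transmission coefficient of one, and a temperature of 298.15 K (the measurement temperature is not stated here); the function name `eyring_barrier` is hypothetical.

```python
import math

KB = 1.380649e-23        # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J s
R_KCAL = 1.987204259e-3  # gas constant, kcal/(mol K)

def eyring_barrier(rate, temperature=298.15):
    """Apparent activation free energy (kcal/mol) for a rate constant in s^-1.

    Uses k = (kB*T/h) * exp(-dG/(R*T)) with a transmission coefficient of 1
    (an assumption), solved for dG.
    """
    return R_KCAL * temperature * math.log(KB * temperature / (H * rate))

# Turnover numbers quoted in the text for PFL.
for direction, kcat in (("forward", 770.0), ("backward", 260.0)):
    barrier = eyring_barrier(kcat)
    print(f"{direction}: kcat = {kcat:.0f} s^-1 -> {barrier:.1f} kcal/mol")
```

Because the barrier depends only logarithmically on the rate, a factor of three between the forward and backward turnover corresponds to a difference of well under 1 kcal/mol.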

As already mentioned in Chapter 1, to become fully active PFL requires activation by the corresponding activating enzyme (PFL-AE).17 During the activation step, PFL undergoes dramatic conformational changes that allow its binding to PFL-AE, where the iron-sulfur [4Fe-4S]+ cluster reductively cleaves SAM to generate Met and a 5′-deoxyadenosyl radical (5′-dAdo•).18 The 5′-dAdo• abstracts the H-atom from the PFL glycine G734, leaving the radical on the Cα-atom of the glycine backbone (G734•).19 Following activation, the radical is shuttled from G734, via C419, to C418, to reversibly form a thiyl radical C418•. The above-described steps (shown in gray in Figure 5.1.2) are not part of this research.

Figure 5.1.2. The proposed reaction mechanism of PFL catalysis indicating the presence of seven transition states and ten intermediates corresponding to the 1st (TS1-2 and A-C shown in orange) and the 2nd (TS4-6 and E-H in turquoise) half-reaction as well as the inactivation (TS7 and I-J given in red). The formation of intermediate D from intermediate C via transition state TS3 without the presence of CoA is a separate step and does not belong to either of the half-reactions (yellow). Note that the radical center is always shown as ‘•’, while single-headed red curved arrows show the movement of single electrons. See text for details.

Once the radical resides at C418, the attack on the first substrate, pyruvate, can readily take place, triggering the 1st half-reaction. Importantly, there is no unique definition of the 1st and 2nd half-reactions in the PFL catalytic cycle, and the two cannot be simply separated. Nevertheless, the current view is that the 2nd half-reaction is the reaction between PFL and its second substrate, CoA. The steps that belong to the 1st half-reaction, with the first substrate pyruvate, occur in the active site without CoA being present. Once pyruvate is cleaved by the enzyme, any reactions that take place before the entry of CoA into the active site cannot be assigned to either of the two half-reactions. We will now explain in detail the possible series of transformations taking place in the absence of CoA in the active site, as well as the steps that can occur after the termination of the reaction with pyruvate and before the entry of CoA, followed by the reactivity of the active site residues with CoA in its presence.14

With all this in mind, the 1st half-reaction is thought to proceed in the following two steps. In the first step, the sulfur atom of the ideally positioned thiyl radical C418• attacks the carbonyl C-atom of pyruvate (see A in Figure 5.1.2), which leads to the formation (see A → B via TS1 in Figure 5.1.2) of a radical tetrahedral intermediate (see B in Figure 5.1.2). In the second step, the intermediate collapses (see B → C via TS2 in Figure 5.1.2) into the carbon dioxide radical anion (CO2•−) and acetylated C418 (see C in Figure 5.1.2).

After the cleavage of pyruvate, with the active site still free of CoA, the newly formed radical species, CO2•−, could abstract (see C → D via TS3 in Figure 5.1.2) the nearest hydrogen from the thiol group of C419, which gives formate. Based on our definition of the two half-reactions comprising the PFL catalytic mechanism, we propose that this H-transfer cannot be unambiguously assigned to either of them, and we will treat it as a separate step in the mechanism. However, as will become clear from the following text, this step is crucial for one of the possible branches of the 2nd half-reaction and for the overall mechanism.

The 2nd half-reaction, having only two steps, can be triggered via two different routes, starting from either AcPFL I or AcPFL II (see E and F in Figure 5.1.2, respectively). Note that the structures E and F correspond to the same active site architecture as the structures C and D in Figure 5.1.2, but with CoA present inside. In both cases, the entry of CoA into the active site of PFL is needed; however, the first route has one step fewer in the overall mechanism. The correct positioning of the CoA S-H group in the active site enables the initiation of the 2nd half-reaction, with its first step being the hydrogen abstraction from the thiol group of CoA (see E → G via TS4 and F → G via TS5 in Figure 5.1.2), which produces the CoA-S• species (see G in Figure 5.1.2). This radical then attacks the acetyl group (see G → H via TS6 in Figure 5.1.2) attached to C418, which in turn yields acetyl-CoA (see H in Figure 5.1.2) and the regenerated active state of PFL with the radical stored at C418. The dissociation of the final products, namely formate and acetyl-CoA, terminates the catalytic cycle of PFL. An alternative route, in which CoA enters the active site of PFL before the pyruvate breakdown, causes premature radical quenching and the inactivation of PFL (see I → J via TS7 in Figure 5.1.2).
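The branching described above can be made explicit by encoding the steps of Figure 5.1.2 as a small directed graph. This is only a bookkeeping sketch: the transition-state labels and intermediates follow the text, whereas the two "CoA entry" edges (C to E and D to F) are an assumption introduced here to connect the CoA-free and CoA-bound structures, and the names `EDGES` and `routes` are hypothetical.

```python
# Elementary steps named in the text: label -> (reactant, product).
EDGES = {
    "TS1": ("A", "B"),  # thiyl radical addition to pyruvate
    "TS2": ("B", "C"),  # collapse of the tetrahedral intermediate
    "TS3": ("C", "D"),  # H-transfer from C419 to the CO2 radical anion
    "TS4": ("E", "G"),  # H-abstraction from CoA in AcPFL I
    "TS5": ("F", "G"),  # H-abstraction from CoA in AcPFL II
    "TS6": ("G", "H"),  # acetyl transfer to the CoA radical
    "TS7": ("I", "J"),  # premature quenching (inactivation)
    # Assumed bookkeeping edges, not transition states: CoA entering the
    # active site converts C into E and D into F.
    "CoA entry (C->E)": ("C", "E"),
    "CoA entry (D->F)": ("D", "F"),
}

def routes(start, goal, edges=EDGES):
    """Enumerate all simple paths from start to goal as lists of edge labels."""
    found = []

    def walk(node, labels, seen):
        if node == goal:
            found.append(labels)
            return
        for label, (src, dst) in edges.items():
            if src == node and dst not in seen:
                walk(dst, labels + [label], seen | {dst})

    walk(start, [], {start})
    return found

# Compare the two branches leading from acetylated PFL (C) to acetyl-CoA (H).
for path in routes("C", "H"):
    n_ts = sum(label.startswith("TS") for label in path)
    print(f"{n_ts} transition states: {' -> '.join(path)}")
```

Counting the labels that start with "TS" along each path reproduces the statement that the AcPFL I branch needs one step fewer than the AcPFL II branch.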


Several theoretical studies of both half-reactions of the PFL catalytic mechanism have been reported. However, most of them were based on a small and simple representation of the active site in the gas phase, employing a polarizable continuum model to account for the electrostatics of the environment. First, back in 1998, Himo and Eriksson20,21 calculated the mechanism using a neutral form of pyruvate, when the X-ray structure of PFL was still not available. Their study confirmed the reaction mechanism proposed by Kozarich and co-workers.16,22,23 They proposed that the addition of the thiyl radical to pyruvate, yielding the radical tetrahedral intermediate, has the highest barrier (≈ 51.5 kJ/mol), indicating that this is the rate-limiting step of the reaction. They found that the tetrahedral intermediate collapses to CO2•− with reasonable energetics and activation energy barrier, and that the quenching of the CO2•− radical occurs via cysteine (≈ 4.6 kJ/mol) and not glycine (≈ 20.5 kJ/mol), as thought earlier. They also calculated the entire reaction to be exothermic by about 33.9 kJ/mol.

Later, in 2003, Ramos and co-workers24 recalculated the mechanism using a charged model of pyruvate and found that the addition of the C418 thiyl radical to pyruvate (≈ 48.5 kJ/mol) and the subsequent release of CO2•− are coupled, namely that the previously postulated tetrahedral intermediate did not exist according to their model. They characterized the H-transfer between C419 and G734 as the rate-limiting step (≈ 58.2 kJ/mol) and calculated the overall activation energy needed to drive the entire reaction to be about 67.4 kJ/mol, with the overall reaction being exothermic by ≈ 24.3 kJ/mol.

In their second, revisited mechanistic study of both half-reactions of PFL catalysis in 2004, Himo and co-workers25 employed a larger cluster model to represent the interactions in the active site of PFL more accurately, based on the crystal structure of PFL published by Becker et al. in 2002.26 They found that the homolytic C-C bond cleavage of pyruvate that occurs after the attack of the C418 radical on the carbonyl carbon of pyruvate is the rate-limiting step (ca. 67 kJ/mol). However, their calculations gave a higher energy barrier for the backward reaction (ca. 100 kJ/mol), which makes the reaction in the backward direction slower than in the forward direction. They also confirmed the existence of the stable radical tetrahedral intermediate in the 1st half-reaction and examined the quenching of the CO2•− radical. In addition to the role of C419 in this step, they considered quenching by CoA, albeit using flexible methanethiol to mimic CoA in the active site.
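The literature barriers above are quoted in kJ/mol, whereas the free energy profiles of Chapter 4 are given in kcal/mol. For a like-for-like comparison, the values can be converted as sketched below. The numbers are those quoted in the three preceding paragraphs; the dictionary keys and the function name `kj_to_kcal` are illustrative labels, not terminology from the cited studies.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie, exact by definition

def kj_to_kcal(value_kj_per_mol):
    """Convert an energy from kJ/mol to kcal/mol."""
    return value_kj_per_mol / KJ_PER_KCAL

# Barriers quoted in the text (kJ/mol); labels are shorthand for this sketch.
LITERATURE_BARRIERS_KJ = {
    "Himo and Eriksson (1998), thiyl addition to pyruvate": 51.5,
    "Ramos and co-workers (2003), thiyl addition to pyruvate": 48.5,
    "Ramos and co-workers (2003), C419/G734 H-transfer": 58.2,
    "Himo and co-workers (2004), C-C cleavage, forward": 67.0,
    "Himo and co-workers (2004), backward reaction": 100.0,
}

for label, barrier_kj in LITERATURE_BARRIERS_KJ.items():
    print(f"{label}: {barrier_kj:.1f} kJ/mol = {kj_to_kcal(barrier_kj):.1f} kcal/mol")
```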


The ONIOM method was previously used by Smith and co-workers27,28 to study certain aspects of the PFL-catalyzed reaction, employing sophisticated high-level theory but still using only simple gas-phase models of the active site. In another study, the 1st half-reaction was investigated using a full-enzyme QM/MM approach starting from the force-field-relaxed crystal structure of PFL.29 However, the 2nd half-reaction has never been studied using full-enzyme multiscale modeling, mostly because the binding pose of CoA in the active site of PFL was unknown. Taking advantage of having found a stable binding pose of CoA in the active site of PFL, we also attempt a QM/MM study of the 2nd half-reaction.

The aim of the present study is thus to investigate the enzymatic transformation that occurs within the 1st half-reaction of PFL catalysis, as well as the mechanistic aspects of the 2nd half-reaction with CoA placed in the active site, by employing QM/MM calculations on a monomeric PFL enzyme model. In this respect, we took advantage of the previously analyzed MD trajectories of three monomeric PFL model systems (PFL•, AcPFL I, and AcPFL II), which enabled us to determine starting snapshots for our QM/MM studies. We start by presenting the procedure for building our QM/MM model systems and the theoretical methodologies used.

5.2. Methods

The hybrid QM/MM method is a multiscale scheme in which the system is divided into regions described at different levels of theory. This approach is based on research pioneered by Warshel and co-workers³⁰⁻³³ and is used here through the ONIOM implementation, originally developed by Morokuma and co-workers,³⁴ in which the system is divided into several layers: the active site residues and atoms that directly participate in the chemical reaction are treated using QM methods (e.g. ab initio or DFT), while the rest of the protein residues and the surrounding solvent molecules are treated classically using molecular mechanics (MM) based on a force field. The boundary between the two regions is usually treated simply by adding hydrogen link-atoms to satisfy the valence of the atoms at the border, as already explained in Chapter 2.³⁵ Information about the underlying energetics and the effect of the enzyme on the reaction mechanism can be extracted from the obtained geometries and energy profiles.
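
For orientation, the energy of a two-layer subtractive ONIOM scheme is commonly written as below. This is the standard textbook expression rather than one quoted from this chapter; the labels "real" (the full enzyme model) and "model" (the QM region capped with link H-atoms) follow the usual ONIOM convention.

```latex
E^{\mathrm{ONIOM}}
  = E^{\mathrm{MM}}_{\mathrm{real}}
  + E^{\mathrm{QM}}_{\mathrm{model}}
  - E^{\mathrm{MM}}_{\mathrm{model}}
```

The MM energy of the model system is subtracted so that the QM region is not counted twice.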

5.2.1. QM/MM Model Systems

The enzyme model was constructed from the crystal structure of PFL with its substrates, namely pyruvate and CoA (PDB: 1H16).²⁶ Suitable initial structures for QM/MM calculations were extracted from previously obtained unrestrained MD trajectories of non-acetylated (PFL•) and two types of acetylated (AcPFL I and AcPFL II) monomeric PFL model systems.¹³,¹⁴ To enable modeling of the 1st and the 2nd half-reaction, two different initial structures were extracted for each of the three PFL model systems, both in the absence and in the presence of CoA in the active site (twelve MD snapshots in total).

5.2.2. QM Region Setup

Since both half-reactions of PFL-mediated catalysis occur in the active site, the same portion of the enzyme was described using QM methods when modeling both half-reactions. More precisely, the QM region includes, in both cases, the side chains of R176, C418, C419, and R435 and the substrate present in the active site (be it pyruvate, CO₂•⁻ or HCO₂⁻, depending on the examined model system). Naturally, when modeling the 2nd half-reaction, part of the cysteamine moiety of CoA was treated quantum-mechanically in addition to the QM region already mentioned (see Figure 5.2.1A and 5.2.1B for details on the QM regions of both half-reactions). The charge of the QM region is qQM = +1 and the multiplicity is m = 2 in all QM/MM calculations. The external parts of the models were kept fixed during the geometry optimizations, with position restraints applied to MM residues found at a distance ≥ 10 Å from any QM atom (see Figure 5.2.1C).

Figure 5.2.1. Geometries of the QM part of the QM/MM model used in the calculations of the 1st half-reaction (PFL•) and the 2nd half-reaction (AcPFL I) are shown in panels A and B, respectively. The atoms treated quantum-mechanically are shown in a ball-and-stick representation, while the rest of the protein is omitted for clarity. The QM region encompasses pyruvate, the side chains of the catalytic cysteines C418 and C419 and those of R176 and R435 for the 1st half-reaction, as well as the additional part of the cysteamine moiety of CoA for the 2nd half-reaction (46 and 52 QM atoms in total, including link H-atoms, for the 1st and the 2nd half-reaction, respectively). The label L indicates atoms replaced by hydrogen link-atoms. C. The representative model system used in QM/MM calculations (two-layer ONIOM). The central QM region as well as all MM residues found within 10 Å of the QM region can move during the optimization. The rest of the protein, solvent molecules and ions are kept fixed during the optimization and treated with MM.
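
The 10 Å criterion described above can be illustrated with a short Python sketch. It is not the script used in this work: the function name `movable_residues`, the input layout, and the rule that a residue counts as movable as soon as any one of its atoms lies within the cutoff are all assumptions made for illustration.

```python
import math

def movable_residues(qm_xyz, mm_atoms, cutoff=10.0):
    """Return sorted ids of MM residues allowed to move during optimization.

    qm_xyz   : list of (x, y, z) coordinates of the QM atoms, in Angstrom
    mm_atoms : iterable of (residue_id, (x, y, z)) pairs for the MM atoms
    cutoff   : distance threshold in Angstrom (10 A in the text)

    Assumption: a residue is movable if at least one of its atoms is
    closer than `cutoff` to any QM atom; all other residues stay fixed.
    """
    movable = set()
    for resid, xyz in mm_atoms:
        if resid in movable:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, q) < cutoff for q in qm_xyz):
            movable.add(resid)
    return sorted(movable)
```

Residues not returned by the function would then be the ones subject to position restraints.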

5.2.3. QM/MM Calculations Setup

Snapshots of the systems extracted from their respective MD simulations (see Section 5.2.1, QM/MM Model Systems) were minimized using the pmemd module of the AMBER 16 software.³⁶ To prepare models suitable for QM/MM calculations, the 1000 water molecules closest to the protein and the substrate in the relaxed snapshots, as well as 8 Na+ ions, were preserved. The complete MM model of each monomeric PFL system consisted of 759 protein residues (constituting the entire enzyme), 1000 water molecules, one CoA molecule, one pyruvate molecule in the case of the PFL• model, and one carbon dioxide radical anion (CO₂•⁻) or formate (HCO₂⁻) molecule in the case of the AcPFL I and AcPFL II models, respectively (15 008 atoms in total). See Figure 5.2.1C for details.

Further structure preparations were performed with the TAO (A Toolkit to Assist ONIOM Calculations) package.³⁷ The QM/MM calculations were handled by the Gaussian 16 (Revision A.03) program employing a two-layer ONIOM procedure.³⁸ For geometry optimizations, the hybrid B3LYP density functional and the AMBER ff14SB³⁶ force field were employed as the QM (high layer) and MM (low layer) potentials, respectively. Electrostatic embedding (EE) was used for the QM/MM geometry optimizations of the starting structures and for the single-point calculations on the optimized structures. To saturate the valence at the QM/MM boundary, link H-atoms were introduced (see Figure 5.2.1A-B). The medium-sized Pople 6-31G(d,p) basis set was used for the geometry optimizations of the investigated systems. Final energies and atomic properties (Mulliken spin densities) were evaluated by performing single-point energy calculations with the 6-311+G(3df,2p) basis set on the previously optimized geometries. Final energies were corrected by unscaled zero-point vibrational energies.

5.2.4. Analysis

Starting from the substrate geometry, the reaction path was explored by performing potential energy surface (PES) scans followed by geometry optimizations. The PES scans were performed by either decreasing or increasing selected distances between two atoms. For example, to investigate the acetylation of the enzyme within the 1st half-reaction, the distance between the S-atom of C418 and the carbonyl C-atom of pyruvate was decreased, starting from an MD snapshot taken from the PFL• model system. At each point of the scan, a constrained geometry optimization was performed, so that the scan generated structures approximating the path of the reaction step. The structure of highest energy along the selected reaction coordinate was treated as an initial guess for the transition state (TS) geometry and was refined via TS optimization using the Berny algorithm. The last structure from the scan was optimized to find the geometry of a new intermediate, which was subsequently used to initiate another scan towards the next intermediate on the reaction path. The TAO package was used to extract the desired geometries from the PES scans. To validate all optimized geometries, we performed frequency analyses at the same level of theory as used for the geometry optimizations. In the last step, single-point energies and atomic properties (Mulliken spin densities) were calculated on these optimized geometries using the level of theory already described. The VMD 1.9.3³⁹ software was used for the visualization of structures.
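
The selection of a TS guess from a relaxed scan, as described above, can be sketched in a few lines of Python. The helper name `ts_guess_index` is hypothetical, and rejecting a maximum that falls on the first or last scan point is an added safeguard of this sketch, not a step stated in the text.

```python
def ts_guess_index(scan_energies):
    """Index of the scan point to use as the initial TS guess.

    scan_energies : energies of the constrained optimizations, ordered
                    along the scanned distance (any consistent unit).

    Returns the index of the highest-energy point, or None when the
    maximum sits at either end of the scan (assumption: such a profile
    has not bracketed a barrier and the scan should be extended).
    """
    n = len(scan_energies)
    if n < 3:
        return None
    i_max = max(range(n), key=lambda i: scan_energies[i])
    if i_max == 0 or i_max == n - 1:
        return None
    return i_max
```

The geometry at the returned index would be passed on to a TS optimization, and the last scan point to a regular minimization of the next intermediate.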

5.2.5. QM/MM Benchmark

In the present work, we also performed a benchmark of density functionals (DFs), taking as reference single-point QM/MM calculations at the ONIOM[G3(MP2)-RAD:AMBER]⁴⁰,⁴¹ level on previously optimized ONIOM[B3LYP/6-31G(d):AMBER] geometries. Selected functionals (F) were then tested by evaluating ONIOM[F/6-311+G(3df,2p):AMBER] energies (F = BMK+D3BJ, M06-2X, B3LYP+D3BJ, B2PLYP+D3BJ, ωB97XD and PBE0+D3BJ, where “+D3BJ” indicates Grimme’s D3 dispersion correction with the Becke-Johnson (BJ) damping function⁴²) and comparing them with the benchmark values. For this assessment, we obtained the initial structure by extracting an MD snapshot from the PFL• system (see Figure 5.2.1A), with the QM zone reduced to a minimal number of atoms including only the side chains of C418 and R435 and pyruvate (27 atoms in total, including two link H-atoms). The total charge of this reduced QM region is qQM = 0, while the multiplicity is again m = 2. The outer layer of the system was kept fixed during the geometry optimizations, with position restraints applied to the MM region as described earlier. All presented QM/MM energies were calculated at 0 K and include a scaled (by 0.9806) zero-point vibrational energy correction.
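
As an illustration of how a ZPE-corrected relative energy at 0 K is assembled, consider the following Python sketch. The function name and argument layout are assumptions for illustration; only the scaling factor 0.9806 is taken from the text, and the hartree to kJ/mol conversion is the standard value.

```python
HARTREE_TO_KJ_MOL = 2625.4996  # 1 hartree expressed in kJ/mol

def relative_energy_kj_mol(e_elec, zpe, e_elec_ref, zpe_ref, zpe_scale=0.9806):
    """Relative 0 K energy of a stationary point with respect to a reference.

    e_elec, e_elec_ref : single-point electronic energies (hartree)
    zpe, zpe_ref       : zero-point vibrational energies from the
                         frequency calculations (hartree)
    zpe_scale          : ZPE scaling factor; 0.9806 is the value quoted
                         for the benchmark, 1.0 gives an unscaled correction
    """
    e = e_elec + zpe_scale * zpe
    e_ref = e_elec_ref + zpe_scale * zpe_ref
    return (e - e_ref) * HARTREE_TO_KJ_MOL
```

Setting `zpe_scale=1.0` corresponds to the unscaled correction mentioned in Section 5.2.3.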

5.3. Results

5.3.1. Density Functionals Quality Assessment

To ascertain which density functional (DF) describes the energetics of the PFL-catalyzed thiyl radical addition to the carbonyl group of pyruvate most satisfactorily, we characterized the transformation of A to C, via B, through TS1 and TS2 (see Figure 5.1.2). The individual steps will be discussed in more detail below in the context of the 1st half-reaction calculations with the larger QM region (Figure 5.2.1A).

The reference ONIOM[G3(MP2)-RAD:AMBER] energy profile, obtained using ONIOM[B3LYP/6-31G(d):AMBER] geometries, is depicted in Figure 5.3.1 as a black dashed line and shows the energies of the stationary points relative to intermediate A. As seen from Figure 5.3.1, almost all tested DFs (with energies calculated on the same geometries) suggest that the addition of the C418 thiyl radical to pyruvate and the subsequent release of CO₂•⁻ are concerted, in contrast to G3(MP2)-RAD, which indicates that this process is stepwise. While the energetics of TS1 was found to be less sensitive to the choice of method, the majority of DFs over-stabilize the radicals, especially transition structure TS2 and intermediate C, which are over-stabilized by all tested functionals. The largest discrepancies occur for the C-C bond cleavage transition structure (TS2), which almost all functionals place slightly lower in energy than the preceding intermediate B, whereas the G3(MP2)-RAD method suggests that this barrier is significantly higher in energy. Particularly notable in this respect are the PBE0 and B3LYP functionals, irrespective of the dispersion corrections added.

Figure 5.3.1. Calculated QM/MM benchmark energy profiles using the ONIOM[G3(MP2)-RAD:AMBER] and DFT (ONIOM[F/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d):AMBER], where F = BMK+D3BJ, M06-2X, B3LYP+D3BJ, B2PLYP+D3BJ, ωB97XD and PBE0+D3BJ) levels of theory. The position of the radical in each intermediate is shown as ‘•’.

The overall performance of the tested functionals can be best evaluated by inspecting their mean absolute deviations (MADs, in the upper-rightmost inset of Figure 5.3.1) from the benchmark calculations (numerical values for all stationary points are provided in Table 1 of the Appendix). The global hybrid GGA functionals PBE0 and B3LYP with included dispersion corrections are associated with the poorest overall performance (MAD = 15.9 and 11.0 kJ/mol, respectively). Better performance is observed for the meta-hybrid BMK (9.4 kJ/mol) and the double-hybrid B2PLYP (6.2 kJ/mol) functionals with included dispersion, with both being just outside chemical accuracy (≈ 4 kJ/mol). The range-separated functional with inbuilt dispersion correction, ωB97XD, with a MAD of 3.0 kJ/mol, is certainly preferable to any of the aforementioned functionals; however, it still underestimates the energies of TS2 and intermediate C by approximately 7 and 6 kJ/mol, respectively. Inspection of Figure 5.3.1 shows that the representative of the Minnesota family of hybrid meta functionals, M06-2X, which, due to its parametrization, already implicitly covers medium-range dispersion effects, is associated with a slightly higher MAD (3.6 kJ/mol) than ωB97XD. On this basis, we believe that ONIOM[ωB97XD/6-311+G(3df,2p):AMBER] should prove to be the most reliable model chemistry for investigating the energetics of chemical transformations within the PFL-catalyzed reaction.
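
The mean absolute deviation used to rank the functionals can be written as a minimal Python sketch. The function name is hypothetical, and which stationary points enter the average (for example TS1, B, TS2 and C relative to A) is an assumption, since the text does not list them explicitly.

```python
def mean_absolute_deviation(reference, candidate):
    """MAD between two energy profiles given in the same order and unit.

    reference : relative energies from the benchmark method
                (here ONIOM[G3(MP2)-RAD:AMBER]), e.g. in kJ/mol
    candidate : relative energies from the tested functional at the
                same stationary points
    """
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    deviations = [abs(c - r) for r, c in zip(reference, candidate)]
    return sum(deviations) / len(deviations)
```

Applied to the values in Table 1 of the Appendix, a function of this kind should give numbers comparable to the MADs quoted above, provided the same set of stationary points is used.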

5.3.2. The 1st Half-Reaction with Pyruvate and Hydrogen Transfer inside the Active Site without CoA

We start by analyzing the obtained energy profiles and atomic properties (intra- and intermolecular distances and atomic Mulliken spin densities) for the 1st half-reaction (see intermediates A-C and transition states TS1-2 in Figure 5.1.2) and for the independent H-transfer step (see C → D via TS3 in Figure 5.1.2), calculated within the QM/MM framework as described in the Methods section. In Tables 2 and 3 of the Appendix, we present the calculated energies associated with the steps comprising the 1st half-reaction and the H-transfer without CoA in the active site, respectively. Using these values, we plot, in Figures 21 and 22 of the Appendix, the initial energy profiles calculated starting from independent MD snapshots taken from each model system (non-acetylated and acetylated) without CoA present in the active site. For the sake of clarity, Figures 21 and 22 show only the final single-point energies obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER] level of theory and corrected by the zero-point vibrational energies. Moreover, for each topology, we constructed an average profile by Boltzmann averaging the initial profiles. These are shown in Figure 5.3.2a-b for the 1st half-reaction and the H-transfer step, respectively.
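
One plausible reading of the Boltzmann averaging mentioned above is sketched below in Python. The exact formula used in this work is not stated in the text, so the weighting scheme (weights proportional to exp(-E/RT), applied separately to each stationary point across the snapshots of one topology) is an assumption, as is the function name.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

def boltzmann_average(energies, temperature=298.15):
    """Boltzmann-weighted mean of energies (kJ/mol) of one stationary point.

    energies    : relative energies of the same stationary point obtained
                  from different starting snapshots
    temperature : temperature in K (298.15 K, as in Figure 5.3.2)

    Assumption: weights are w_i = exp(-E_i / RT) / sum_j exp(-E_j / RT).
    """
    rt = R_KJ_PER_MOL_K * temperature
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / total
```

With this weighting the lowest-energy snapshot dominates the average; an exponential average of the barriers, -RT ln of the mean Boltzmann factor, would be an alternative choice with a similar effect.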

In summary, the obtained results suggest that the 1st half-reaction proceeds without the interference of CoA as follows: the formation of the S-C bond between pyruvate and C418• (A → B) upon the addition of the thiyl radical C418• to the carbonyl group of pyruvate (TS1), followed by the cleavage of the C-C bond in the tetrahedral radical intermediate (TS2) with the formation of the carbon dioxide radical anion (CO₂•⁻) and the acetyl-enzyme intermediate (B → C). See Figure 5.3.2a for details. The later stage includes the formation of the final product formate (HCO₂⁻) and the regeneration of the radical center on C419 (C → D), which occurs upon hydrogen transfer between the S-H group of C419 and CO₂•⁻ (TS3), as shown in Figure 5.3.2b.

Careful inspection of the averaged energy profiles (see Figure 5.3.2), and their comparison with the profiles obtained starting from MD snapshots of different topologies (see Figures 21 and 22), reveals a couple of interesting features. We note that, irrespective of the initial snapshot, the general conclusion on the catalytic mechanism of the 1st half-reaction (see A-C in Figure 5.1.2) remains the same. Namely, the independent profiles (see Figure 21) are only slightly influenced by the choice of starting snapshot within the topology of their origin. This is also the case when comparing the profiles that originate from different model systems, especially for the first part of the 1st half-reaction, where only a small increase in the barrier height for TS1 is observed when starting the calculations from the snapshot taken from the AcPFL I system. Interestingly, Figure 5.3.2b shows a significant system dependence, since the barrier for H-transfer between C419 and CO₂•⁻ (TS3) was found to be significantly lower in the acetylated systems, especially in AcPFL I, than in the non-acetylated system (see also Figure 22 of the Appendix for the initial energy profiles). This is probably due to the increased flexibility of the active site upon acetylation, which was described in detail in Chapter 3.¹³ Namely, the flexibility of the active site increases as the molecular size of the species present in the active site decreases in each topology. For example, the smaller size of CO₂•⁻ compared to HCO₂⁻ and, to an extent, pyruvate enables the formation of different hydrogen bonding networks with the arginine residues (R176 and R435) in the active site of PFL, which mainly influences the second part of the 1st half-reaction (B → C via TS2) and the H-transfer (C → D via TS3), favoring the positioning of the CO₂ radical anion for hydrogen abstraction, as evident from the energy profiles in Figure 5.3.2b.

We now proceed to a detailed analysis of each reaction step involved in the mechanism of the 1st half-reaction of the PFL-mediated catalytic process, as well as of the H-transfer step between C419 and CO₂•⁻.

5.3.3. The Attack of the C418 Thiyl Radical on Pyruvate

Even though the obtained energy profiles in Figure 5.3.2 (and Figure 21 of the Appendix) suggest that the addition of C418• to pyruvate and the collapse of the radical tetrahedral intermediate to acetyl-C418 and CO₂•⁻ (see Figure 5.1.2 for the steps including intermediates A, B, and C) are concerted, here we discuss them as independent steps.

The energy barrier for the attack of the C418• sulfur on the carbonyl C-atom of pyruvate, which results in the formation of an S-C bond between the radical and pyruvate, was calculated to be about 46.4 to 57.0 kJ/mol depending on the examined model system (see Figure 5.3.2a and Table 2 of the Appendix). These values are in agreement with the literature values previously calculated for this barrier (≈ 50-54 kJ/mol) using different cluster models of the PFL active site.²⁰,²¹,²⁴,²⁵

Since the changes in the distances between the atoms that participate in each step of the reaction, together with the atomic spin occupancies, are the best indicators of the progress of a radical reaction, we list all the important intra- and intermolecular distances in Tables 4-6 of the Appendix and the absolute values of the atomic Mulliken spin densities in Tables 7-9 of the Appendix, for the mechanism of the 1st half-reaction and the independent H-abstraction in each model system (PFL•, AcPFL I and AcPFL II) without CoA present in the active site. In addition, we plotted each distance versus the reaction coordinate to show how the distances change as the reaction progresses (see Figures 26-28 of the Appendix), and created similar plots for the Mulliken spin densities (see Figures 29-31 of the Appendix). Note that these combined plots include all possible steps that can occur in the active site without the participation of CoA, and that we comment only on the distances with the largest variations, namely those that contribute the most to the overall reaction coordinate, as well as on the absolute Mulliken spin densities that are ≥ 10 % in each step of the reaction mechanism.

Figure 5.3.2. Boltzmann-weighted energy profiles (T = 298.15 K) for the 1st half-reaction of PFL catalysis (a) and the independent H-abstraction reaction (b). These profiles are obtained by performing QM/MM calculations starting from different initial snapshots taken from MD trajectories of the PFL•, AcPFL I and AcPFL II systems in the absence of CoA in the active site (see Tables 2-3 and Figures 21-22 of the Appendix for the initial values and energy profiles). The profiles are obtained by performing single-point calculations at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER] level of theory on top of the ONIOM[B3LYP/6-31G(d,p):AMBER] optimized geometries. The profiles in panel (a) are set relative to the energy of the first intermediate A, while the profiles in panel (b) are set relative to intermediate C. Representative QM/MM-optimized geometries (PFL• (orange), AcPFL I (green) and AcPFL II (blue)) of the intermediates are shown under the profiles, while those of the transition structures are shown above the profiles. The position of the radical in each intermediate is shown by the symbol ‘•’. See Figure 5.1.2 for details, as well as Figures 23-25 for all QM/MM-optimized geometries. For details on the intra- and intermolecular distances and the absolute values of the atomic Mulliken spin densities, see Tables 4-9 and Figures 26-31 of the Appendix.

The distance between the sulfur atom of the ideally positioned cysteine C418 and the carbonyl C-atom of pyruvate in the first intermediate A was found to be about 3.2 Å, with about 95 % of the spin localized on the S-atom of C418•. In the transition structure TS1, where C418 approaches pyruvate, the distance between the reactive sulfur and carbon decreases to 2.2 Å, accompanied by an increase in the C-C distance in pyruvate of about 0.1 Å. The spin density in TS1 drops drastically from being almost exclusively localized on the sulfur of C418 to about 70 % on this S-atom, while the rest of the spin density is distributed to the carbonyl oxygen of pyruvate (≈ 20 %). As a product of this addition, a tetrahedral radical intermediate B is formed between the enzyme and pyruvate. The structural characteristics of this complex species include the shortening of the distance between the attacking S-atom of C418 and the carbonyl C-atom of pyruvate to about 1.9 Å, as well as the increase in the C-C bond distance in pyruvate to about 1.8-1.9 Å depending on the model system. The spin density of intermediate B is mostly spread between three central atoms, with about 30 % of the spin on the carbonyl O-atom and an equal amount of spin (approximately 20 %) on the sulfur of C418 and on the carboxylate C-atom of pyruvate. See Tables 4-9 and Figures 26-31 for details on distances and spin localization.
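
The ≥ 10 % reporting threshold for Mulliken spin densities can be illustrated with a small Python sketch. The spin populations below are the approximate values quoted in the text for A, TS1 and B, written as fractions; the atom labels, the dictionary layout and the function name are choices made for illustration only.

```python
# Approximate spin populations quoted in the text (fractions of one electron)
SPIN = {
    "A": {"S(C418)": 0.95},
    "TS1": {"S(C418)": 0.70, "O(carbonyl)": 0.20},
    "B": {"O(carbonyl)": 0.30, "S(C418)": 0.20, "C(carboxylate)": 0.20},
}

def dominant_spin_sites(spin, threshold=0.10):
    """Keep, per stationary point, the atoms with |spin density| >= threshold.

    Returns a dict mapping each state to a list of (atom, value) pairs
    sorted by decreasing absolute spin density.
    """
    result = {}
    for state, atoms in spin.items():
        kept = [(a, rho) for a, rho in atoms.items() if abs(rho) >= threshold]
        kept.sort(key=lambda pair: -abs(pair[1]))
        result[state] = kept
    return result
```

Tracking how the dominant site moves from sulfur in A towards the carbonyl oxygen and carboxylate carbon in B gives a compact view of the radical migration described above.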

Another feature of this intermediate is evident from the energetics shown in Figure 5.3.2a and Figure 21, namely its stability. In this respect, the physical occurrence of the tetrahedral radical intermediate B is questionable, both due to its relatively high energy (being 36.3-41.9 kJ/mol less stable than A), and because its energy was found to be similar to the following transition state TS2 (32.5-40.6 kJ/mol). Moreover, intermediate B is characterized as a shallow energy minimum on the potential energy surface at a lower level of theory (see Table 2 of the Appendix). However, given the short distance of the C-C bond in B and the delocalized spin distribution, the full C-C bond dissociation has not yet taken place.

5.3.4. The Collapse of the Radical Tetrahedral Intermediate and Acetylation of C418

Although the transition structure TS2, corresponding to the C-C bond dissociation in the tetrahedral intermediate, was located and characterized at the lower level of theory in all examined model systems, the single-point calculations with the larger basis set indicate a barrierless dissociation of this intermediate to intermediate C (with the energy of TS2 being slightly lower than that of the preceding intermediate B). Namely, the tetrahedral radical intermediate B, being high in energy, spontaneously collapses through the dissociation of the C-C bond, which separates the pyruvate molecule into two parts. The first part stays attached to C418 as an enzyme-bound acetyl intermediate, while the second moves towards the arginines R176 and R435 and remains tightly held in the active site as the non-covalently bound CO₂•⁻ (see intermediate C in Figure 5.1.2). This species can be characterized as an overall negatively charged molecule with a high spin density (approximately 75 % of the spin localized on carbon and 10 % distributed on each of the oxygen atoms). Interestingly, the distance between the sulfur of the fully acetylated C418 and the carbonyl C-atom is found to be as short as 1.8 Å, while the distance between the carbonyl and carboxylate carbons of the dissociated pyruvate was found to be about 2.8-3.0 Å, depending on the model system. In this state, the acetyl-C418 moiety is found to be free of the radical (see Tables 4-9 and Figures 26-31 of the Appendix). Interestingly, intermediate C is found to be only 10.0-19.4 kJ/mol less stable than the reference intermediate A.

5.3.5. The Hydrogen Transfer between C419 and CO₂•⁻

In the later stage, the newly formed CO₂•⁻, produced upon completion of the 1st half-reaction and pyruvate cleavage, can abstract a hydrogen atom and regenerate the radical activity on PFL. The hydrogen nearest to CO₂•⁻ is located on another cysteine of PFL, C419. The H-abstraction from the S-H group of C419 (TS3) occurs with a relatively high energy barrier and shows the highest sensitivity to the starting topology, as can be observed from Figure 5.3.2b (the calculated energies relative to intermediate C range from 23.4 to 48.7 kJ/mol for acetylated and non-acetylated systems, respectively). This H-transfer yields the final product formate, the acetyl-C418 intermediate, and radical activity regenerated on C419. Intermediate D was found to be about 9.7-11.7 kJ/mol more stable than intermediate C, which is in line with the relative thermodynamic stability of the S-H vs. C-H bond based on their dissociation energies from the literature.⁴³,⁴⁴ The significantly higher energy barrier (TS3) for the H-transfer between CO₂•⁻ and C419, compared to the energy calculated for TS1 and, to an extent, for TS2, could arise because the reactive radical-carrying C-atom of CO₂•⁻ is initially positioned about 4 Å away from the hydrogen of C419 (in intermediate C) and has to travel a larger distance to abstract this H-atom and regenerate the radical on C419. Interestingly, the observed trend in the energy barriers for TS3 could be due to both R176 and R435 having to change their positions significantly while maintaining the network of hydrogen bonds with the CO₂•⁻ radical, stabilizing this species as it approaches C419 for the hydrogen abstraction. Namely, the lower H-abstraction barriers found in the acetylated systems could be due to the ability of the active site to adjust, during the MD simulations, to the species it contains. After the completed H-transfer, the calculated Mulliken spin density on the S-atom of C419 was found to be about 96 %. This step terminates with the acetylated C418, the C419 thiyl radical, and formate in the active site (see Tables 4-9 and Figures 26-31 of the Appendix for details).

In summary, our QM/MM calculations indicate that the rate-determining step of the PFL-catalyzed reaction occurring without the participation of CoA in the active site is the acetylation of the enzyme upon the addition of the C418 thiyl radical to pyruvate. We also show that the choice of the starting MD snapshot has a minor influence on the energetics of the 1st half-reaction with pyruvate (see steps A-C in Figure 5.3.2a), whereas a significant decrease of the H-abstraction barrier (see step C-D in Figure 5.3.2b) was found when the QM/MM calculations were started from snapshots that originated from acetylated topologies, especially AcPFL I. However, a general conclusion on the mechanism of the 1st half-reaction, with the possibility of H-transfer without CoA inside the active site, is that the potential energy profile remains the same irrespective of the initial geometry. Note that all the energy profiles shown in Figure 5.3.2 are calculated for the steps comprising the 1st half-reaction and the H-transfer between CO₂•⁻ and C419 and do not account for the explicit presence of the second substrate, CoA, in the active site of PFL. To give a complete insight into the overall catalytic cycle of PFL, we now proceed to the QM/MM results for the 2nd half-reaction with CoA present, where we explain three possible mechanistic pathways together with their outcomes in detail using a similar analysis.

5.3.6. The 2nd Half-Reaction and Inactivation with CoA

To trigger the 2nd half-reaction, the activation of CoA by hydrogen abstraction from its S-H group is crucial. To achieve this, the coenzyme must bind with its thiol group in a favorable position in the active site of PFL. The first step of the 2nd half-reaction is the H-abstraction from CoA, which activates the coenzyme for the acyl transfer to the resulting CoA radical, yielding the final product acetyl-CoA. In Chapters 3 and 4¹³,¹⁴ we found a putative CoA entry channel and characterized stable and potentially reactive binding modes of CoA in the active site of the non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) model systems using MD simulations. To investigate the reaction mechanism of PFL with CoA in the 2nd half-reaction, snapshots in which CoA was present in the active site of PFL were extracted and set up for the QM/MM calculations. We now explain and compare the results obtained for the two branches of progressive reactions in AcPFL I and AcPFL II, followed by the results for the inactivation reaction in PFL•.

First, we construct in Figure 5.3.3a-b the energy profiles for both steps comprising the 2nd half-reaction by Boltzmann averaging two initial profiles (see Table 10 and Figure 32 of the Appendix) obtained by performing QM/MM calculations on MD snapshots extracted from trajectories in which CoA stably resided in the active site of each of the two acetylated model systems (AcPFL I and AcPFL II). The energy profiles in panels (a) and (b) of Figure 5.3.3 show the underlying energetics for the two reactions in the AcPFL I and AcPFL II model systems, respectively. These comprise a hydrogen abstraction from CoA (intermediates E → G via transition state TS4 in AcPFL I, and intermediates F → G via transition state TS5 in AcPFL II) and an acyl transfer from C418 to CoA (intermediates G → H via transition state TS6) in both acetylated model systems. Interestingly, we found that both steps are energetically more favorable when carried out in the AcPFL I rather than in the AcPFL II system (see Figure 5.3.3, Table 10 and Figure 32 of the Appendix). This result leads us to discuss in detail the energetics and geometrical properties obtained for all the steps of the 2nd half-reaction, including the enzyme inactivation reaction.

5.3.7. The Hydrogen Abstraction from CoA

The calculated energy barrier for CoA activation via H-abstraction by CO2•- in AcPFL I was found to be as low as 4.7 kJ/mol (see Figure 5.3.3a), in contrast to a high barrier of 40.5 kJ/mol calculated for the analogous reaction carried out by C419• in AcPFL II (see Figure 5.3.3b). To gain deeper insight into the geometries of the intermediates and transition structures, we analyzed the key inter- and intramolecular distances and the absolute atomic Mulliken spin densities, as in the case of the 1st half-reaction. Tables 11 and 12 of the Appendix list the numerical values of the distances, and Figures 35 and 36 of the Appendix plot these distances versus the reaction coordinate, depicting the progress of the 2nd half-reaction starting from the AcPFL I and AcPFL II model systems, respectively. The absolute atomic Mulliken spin densities are presented in Tables 13-14 and plotted against the reaction coordinate in Figures 37-38 of the Appendix for AcPFL I and AcPFL II, respectively.
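To give a feeling for what the two barriers imply kinetically, the sketch below converts them into transition state theory (Eyring) rate constants. It is an order-of-magnitude illustration only: it assumes that the QM/MM potential energy barriers can stand in for activation free energies and that the transmission coefficient is one, neither of which is claimed in the text.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)
K_B = 1.380649e-23     # Boltzmann constant in J/K
H_PLANCK = 6.62607015e-34  # Planck constant in J s


def eyring_rate(barrier_kj, temperature=298.15):
    """Eyring rate constant (1/s) for a barrier in kJ/mol.

    Assumption: the barrier is used as if it were an activation free energy.
    """
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kj / (R_KJ * temperature))


k_co2 = eyring_rate(4.7)    # H-abstraction by CO2•- in AcPFL I
k_c419 = eyring_rate(40.5)  # H-abstraction by C419• in AcPFL II
ratio = k_co2 / k_c419
print(f"k(AcPFL I) / k(AcPFL II) = {ratio:.2e}")
```

Under these assumptions the difference of 35.8 kJ/mol corresponds to a rate ratio of roughly six orders of magnitude at 298.15 K.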

From these analyses, we can extract several interesting features of the activation of CoA by H-abstraction. A fundamental difference between the two topologies is the initial location of the radical center. In the AcPFL I system, the radical is initially stored on the CO2•- molecule, while it is localized on the sulfur of C419• in the AcPFL II system. These carbon and sulfur atoms are responsible for triggering the activation of CoA in the first step of the 2nd half-reaction in the AcPFL I and AcPFL II model systems, respectively. A shorter distance was found between the hydrogen of the S-H group of CoA and the carbon atom of CO2•- (slightly above 2 Å) than between the thiol hydrogen of CoA and the S-atom of C419• (≈ 3 Å). This indicates an easier H-transfer to CO2•-, since carbon is smaller than sulfur, which causes steric clashes in the active site of AcPFL II. Interestingly, the calculated spin density in intermediate E (see Table 13 and Figure 37 of the Appendix) is distributed over the whole CO2•-, with about 75 % of the spin on carbon and an equal share on each of the two oxygen atoms (≈ 12 %). In contrast, 80-90 % of the spin is localized on the sulfur of C419 in intermediate F (see Table 14 and Figure 38 of the Appendix). This highly concentrated spin density could be one of the reasons for the higher barrier of H-transfer from CoA to the C419 radical, although it is known from the literature that the C-H bond is stronger than the S-H bond (having systematically higher bond dissociation energies). In the transition state TS4, the transferring hydrogen atom is shared between the S-atom of CoA and the carbon of CO2•-, with a slightly shorter distance (by about 0.1 Å) to CoA’s sulfur, indicating that TS4 is an early transition state. In TS4, the spin occupancy redistributes between the C-atom of CO2•- (ca. 55 %) and the S-atom of CoA (ca. 30 %), with a spin density on each oxygen of CO2•- similar to that in the preceding intermediate E. In the transition structure TS5, however, the hydrogen is shared between the two S-atoms of CoA and C419•, whereby both sulfurs carry a similar amount of spin density (ca. 50 %), with the distances between each sulfur and the transferring H-atom again indicating an early transition state. The spin density then shifts from being exclusively localized on CO2•- or C419• towards the S-atom of CoA, which bears about 90 % of the spin in intermediate G in both the AcPFL I and AcPFL II model systems. This activates the coenzyme for the next step, the acyl transfer from acetyl-C418 (see intermediates G-H and TS6 in Figure 5.1.2).

Figure 5.3.3. Boltzmann weighted energy profiles (T = 298.15 K) shown in panels (a) and (b) correspond to the 2nd half-reaction of PFL catalysis obtained starting from the AcPFL I and AcPFL II model systems, respectively, with CoA placed inside the active site after the 1st half-reaction. The profiles are calculated by performing single-point calculations with the 6-311+G(3df,2p) basis set on top of the optimized geometries (see Table 10 and Figure 32 of the Appendix for the initial values and energy profiles). The profiles are set relative to the energy of the first intermediates E and F for the AcPFL I and AcPFL II model systems, respectively. Representative QM/MM-optimized geometries of intermediates (E-H) and transition states (TS4-6) are shown for the H-abstraction from CoA (E → G via TS4 for AcPFL I (green) and F → G via TS5 for AcPFL II (blue)), as well as for the acetylation of CoA (G → H via TS6) in each system. The position of the radical in each intermediate is shown by the symbol ‘•’. See Figure 5.1.2 for details. See Figure 33 and 34 for all QM/MM-optimized geometries. For details on distances and spin densities see Table 11-14 and Figure 35-38 of the Appendix.

5.3.8. The Acetylation of CoA

The calculated energy barrier for the CoA acetylation step (see G → H via TS6 in Figure 5.1.2) was found to be 43.6 and 116.1 kJ/mol for the AcPFL I and AcPFL II systems, respectively (see Figure 5.3.3a-b, as well as Table 10 and Figure 32 of the Appendix for details). The greater thermodynamic stability of intermediate H compared to the reference intermediate E in AcPFL I also suggests that this forward reaction route is more feasible than the G → H transformation in AcPFL II. The distance between the sulfur of activated CoA and the carbonyl C-atom of the acetyl group decreases, while the distance between the S-atom of C418 and the same carbon steadily increases, indicating the progress of CoA approaching the C418-attached acetyl group (see Table 11-12 and Figure 35-36 of the Appendix). The spin localization indicates a late transition state TS6 in AcPFL II, while a symmetric distribution of the spin over both sulfurs of CoA and C418 (ca. 55 %) was found for TS6 in the AcPFL I system. In intermediate H, 90-95 % of the spin is transferred to the S-atom of C418, which indicates the termination of the catalytic cycle with the regeneration of the radical on PFL and with both products, formate and acetyl-CoA, bound in the active site (see Table 13-14 and Figure 37-38 of the Appendix for details on spin density changes with the progress of the reaction).

Interestingly, in both acetylated models, AcPFL I and AcPFL II, a similar network of hydrogen bonds between CoA’s cysteamine peptide motif and the guanidine group of R435, which was previously shown to be crucial for the correct positioning of the S-H group in the active site, was retained during the QM/MM geometry optimizations (see Figure 5.3.3). However, the QM/MM-optimized intermediates have less favorable contacts with R176 in the case of AcPFL II, since CoA entered an active site which previously had to adjust to the presence of a bigger formate molecule. This makes the active site more crowded, and some of the interactions are lost during the optimization, which in turn tentatively explains the less favorable energetics observed in the AcPFL II system (see Figure 5.3.3).


5.3.9. The Inactivation

In Figure 5.3.4, as well as in Table 15 and Figure 39 of the Appendix, we present the energetics of the possible hydrogen transfer between CoA and the C418 thiyl radical before the 1st half-reaction and pyruvate cleavage in the active site of the PFL• model system. The energy barrier was found to be 43.1 kJ/mol, with almost equal thermodynamic stability of intermediate J (1.5 kJ/mol less stable) compared to intermediate I. In Table 16 and Figure 41 of the Appendix we show the distances between the sulfur atom of CoA or C418• and the transferring hydrogen from the S-H group of CoA. By careful analysis of the distances, we found that the intramolecular S-H bond length in each of the intermediates I and J is about 1.4 Å, while in the transition structure (TS7) this hydrogen is shared between the two sulfur atoms, with a slightly shorter distance to the S-atom of C418 (about 1.57 Å) than to the sulfur of CoA (about 1.65 Å). The Mulliken spin densities are shown in Table 17 and plotted versus the reaction coordinate in Figure 42 of the Appendix. As expected, they change from about 95 % of the spin on the S-atom of C418 in intermediate I, through almost equal spin occupancy on both sulfur atoms in TS7 (≈ 53 %), to about 87 % of the spin density on CoA’s sulfur in intermediate J. Another interesting feature visible from the QM/MM-optimized geometries shown in Figure 5.3.4 and Figure 40 of the Appendix is a unique H-bonding network with the guanidine of R435, which differs significantly from CoA’s binding pattern in the acetylated systems analyzed earlier. Namely, we found that in the PFL• system, pyruvate and CoA compete for the same hydrogen bond with the guanidine of R435 in the active site.

Figure 5.3.4. Boltzmann weighted energy profile (T = 298.15 K) for the inactivation of PFL by CoA obtained starting from the PFL• model system with CoA placed inside the active site, before the 1st half-reaction with pyruvate. The profile is calculated by performing single-point calculations with the 6-311+G(3df,2p) basis set on top of the optimized geometries (see Table 15 and Figure 39 of the Appendix for the initial values and energy profiles). The profile is set relative to the energy of the intermediate I. Representative QM/MM-optimized geometries of intermediates (I and J) and transition state (TS7) are shown for the H-abstraction from CoA (orange). The position of the radical in each intermediate is shown by the symbol ‘•’. See Figure 2 for details. See Figure 40 for all QM/MM-optimized geometries. For details on distances and spin densities see Table 16-17 and Figure 41-42 of the Appendix.
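The TS7 distances quoted above can be turned into a rough geometric measure of how far the hydrogen transfer has progressed. The sketch below uses a Pauling-type bond order, n = exp((r0 - r)/a), with the equilibrium S-H length r0 = 1.4 Å taken from the text. The constant a = 0.26 Å and the progress measure itself are assumptions introduced here for illustration; they are not part of the analysis reported in this chapter.

```python
import math


def pauling_bond_order(r, r0, a=0.26):
    """Pauling-type bond order n = exp((r0 - r) / a).

    r and r0 are in Å; a = 0.26 Å is an assumed empirical constant.
    """
    return math.exp((r0 - r) / a)


R0_SH = 1.40       # equilibrium S-H bond length in intermediates I and J (Å)
D_FORMING = 1.57   # S(C418)-H distance in TS7 (Å), bond being formed
D_BREAKING = 1.65  # S(CoA)-H distance in TS7 (Å), bond being broken

n_forming = pauling_bond_order(D_FORMING, R0_SH)
n_breaking = pauling_bond_order(D_BREAKING, R0_SH)

# Fraction of the total S-H bond order that already resides in the new bond.
progress = n_forming / (n_forming + n_breaking)
print(f"n(forming) = {n_forming:.2f}, n(breaking) = {n_breaking:.2f}, "
      f"progress = {progress:.2f}")
```

In this simple measure the hydrogen in TS7 sits slightly past the midpoint between the two sulfur atoms, in line with the nearly symmetric spin distribution (≈ 53 % on each sulfur).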

In this respect, the underlying energetics suggest that this hydrogen transfer step is indeed competitive with the initial step of the 1st half-reaction, namely the addition of C418• to pyruvate (≈ 47 kJ/mol). Nevertheless, this particular H-transfer is only possible if CoA reaches the active site before the 1st half-reaction with pyruvate. In that case, the reaction would cause premature radical quenching and would ultimately lead to the inactivation of PFL, since the resulting CoA radical is not well positioned for abstracting the acetyl group from pyruvate. However, we demonstrated earlier (Chapter 3) that the opening of the CoA entry channel is highly correlated with the completion of the 1st half-reaction in the active site. We showed that the channel gateway stays mostly closed before the reaction with pyruvate, protecting the active site from side reactions that could interfere with the 1st half-reaction. This dynamical component implies that the enzyme has a control mechanism that protects the active site from futile radical quenchers, such as CoA, through direct communication by which the active site signals the progress of the reaction to distal regions of the enzyme.
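How competitive the two channels are can be illustrated with a simple branching estimate. The sketch below assumes, purely for illustration, that both reactions start from the same state with CoA already in the active site, share the same pre-exponential factor, and follow an Arrhenius-type dependence on the quoted barriers (43.1 and ≈ 47 kJ/mol). Since the gateway is reported to stay mostly closed before the 1st half-reaction, this describes a hypothetical scenario rather than the expected behavior of the enzyme.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def branching_fraction(barrier_a, barrier_b, temperature=298.15):
    """Fraction of flux through channel A for two competing barriers (kJ/mol).

    Equal prefactors are assumed, so k_a / (k_a + k_b) reduces to a
    logistic function of the barrier difference.
    """
    delta = (barrier_a - barrier_b) / (R_KJ * temperature)
    return 1.0 / (1.0 + math.exp(delta))


# Channel A: H-transfer from CoA to C418• (inactivation), 43.1 kJ/mol.
# Channel B: addition of C418• to pyruvate, about 47 kJ/mol.
f_inactivation = branching_fraction(43.1, 47.0)
print(f"fraction through inactivation channel: {f_inactivation:.2f}")
```

A barrier difference of only a few kJ/mol already shifts most of the flux to the lower-barrier channel, which is why keeping CoA out of the active site before the 1st half-reaction matters.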


5.4. Conclusion

Through a systematic computational study, in which a series of extensive QM/MM calculations was performed on monomeric PFL model systems, we investigated in detail the mechanism of both half-reactions in PFL catalysis, as well as the possibility of enzyme inactivation by its natural coenzyme substrate. For this purpose, the initial geometries used to set up the QM/MM models were extracted from previously analyzed MD trajectories of the non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) protein models, with an emphasis on the recently discovered stable and potentially reactive binding poses of CoA in the vicinity of the active site in each model system.

In this respect, we constructed, in Figure 5.4.1, the master QM/MM energy profile for an overall PFL catalytic mechanism showing the relative energetics for the first and the second half-reaction, as well as the possible enzyme inactivation. We propose that the 1st half-reaction with pyruvate progresses in the absence of CoA in the active site, whereby the presence of CoA before and after the 1st half-reaction is crucial for the inactivation and the 2nd half-reaction, respectively.

Our QM/MM calculations suggest a concerted process for the addition of the C418 thiyl radical to pyruvate (see Figure 5.4.1, A → B via TS1) and homolytic C-C bond cleavage with a subsequent collapse of the unstable tetrahedral radical intermediate (see Figure 5.4.1, B → C via TS2), having an overall barrier of about 45-55 kJ/mol (see also Figure 5.3.2a). This process, belonging to the 1st half-reaction, yields a carbon dioxide radical anion CO2•- and the acyl-enzyme intermediate. Upon completion of the 1st half-reaction with pyruvate, the newly formed CO2•- can abstract the nearest hydrogen from C419 (see Figure 5.4.1, C → D via TS3); we calculated the barrier for this reaction to range from about 25 kJ/mol in acetylated to about 50 kJ/mol in non-acetylated systems (see also Figure 5.3.2b). This sensitivity to the starting snapshot was attributed to the fact that, during our MD simulations, the active site residues relax and adjust to a smaller and loosely bound substrate, namely CO2•- and formate, in contrast to pyruvate, which is tightly bound in the active site via hydrogen bonding with R176 and R435. Thus, this flexibility favors the H-transfer in active sites with the smaller CO2•- or HCO2- species present inside.

The calculated QM/MM profiles for the 2nd half-reaction strongly indicate that the H-abstraction and the subsequent activation of CoA are carried out by CO2•- (5 kJ/mol) rather than C419• (40.5 kJ/mol) in the active site (see E → G via TS4 and F → G via TS5 in Figure 5.3.3a and 5.3.3b for the first step of the 2nd half-reaction in the AcPFL I and AcPFL II model systems, respectively). Remarkably, this result agrees with the results obtained from the previously investigated binding process of CoA approaching the active site of PFL through a predominantly open gateway of the entry channel, which suggested more stable binding poses of CoA in the active site of the acetylated model systems, especially in AcPFL I. Moreover, the computed QM/MM profile for acetyl group transfer from the enzyme-modified C418 to activated CoA• was again found to be energetically more favorable in the AcPFL I model system (≈ 64 kJ/mol in AcPFL I compared to 126 kJ/mol in AcPFL II), yielding a thermodynamically more stable acetyl-CoA intermediate and a regenerated radical on C418 (compare green and blue profiles in Figure 6 and 8 for G → H via TS6 in AcPFL I and AcPFL II, respectively).

Figure 5.4.1. The overall PFL catalytic mechanism. The 1st half-reaction (orange) includes the formation of the S-C bond between the enzyme and pyruvate upon the addition of the C418 thiyl radical (A → B via TS1), followed by homolysis of the C-C bond in pyruvate (B → C via TS2), yielding CO2•- and acetylated C418. In the absence of CoA, an energetically expensive H-transfer (yellow) between C419 and CO2•- can occur (C → D via TS3) in the active site. The 2nd half-reaction can be divided into two pathways. The entry of CoA before (green) this particular H-abstraction triggers the energetically favorable transfer of a hydrogen from CoA to CO2•- (E → G via TS4), which activates CoA for picking up the acetyl group initially attached to C418 (G → H via TS6). However, upon the entry of CoA after (blue) the H-abstraction from C419, the activation of CoA occurs via H-transfer between the C419 thiyl radical and CoA with a high energy barrier (F → G via TS5). This activates CoA for the energetically less favorable acetylation (G → H via TS6). The entry of CoA before the 1st half-reaction with pyruvate (magenta) leads to the quenching of the radical on C418, which in turn inactivates the enzyme (I → J via TS7). See Table 18 in Appendix for details.

The results of our QM/MM calculations also suggest that the premature presence of the coenzyme near the active site in the PFL• model system, followed by the H-abstraction from CoA carried out by C418• before the completion of the 1st half-reaction with pyruvate, competes strongly (≈ 43.1 kJ/mol) with the 1st half-reaction and could ultimately lead to the inactivation of the enzyme by quenching the radical on PFL (see Figure 5.3.4 and 5.4.1, I → J via TS7). However, our previous findings suggest that the gateway residues of the entry channel remain in a closed state, which keeps any potential quenchers, including the natural coenzyme substrate, away from the active site before the 1st half-reaction. This result indicates that PFL is a finely controlled machinery in which a signaling cascade and the coupling between small chemical changes and distal conformational flexibility play a key role, which makes this radical enzyme a textbook case of tightly connected protein structure and function.


5.5. References

1. Knappe, J.; Blaschkowski, H. P.; Gröbner, P.; Schmitt, T. Eur. J. Biochem. 1974, 50, 253-263.
2. Unkrig, V.; Neugebauer, F. A.; Knappe, J. Eur. J. Biochem. 1989, 184, 723-728.
3. Wagner, A. F.; Frey, M.; Neugebauer, F. A.; Schafer, W.; Knappe, J. Proc. Natl. Acad. Sci. USA 1992, 89, 996-1000.
4. Lehtiö, L.; Goldman, A. Protein Eng. Des. Sel. 2004, 17, 545-552.
5. Vey, J. L.; Yang, J.; Li, M.; Broderick, W. E.; Broderick, J. B.; Drennan, C. L. Proc. Natl. Acad. Sci. USA 2008, 105, 16137-16141.
6. Plaga, W.; Wielhaber, G.; Wallach, J.; Knappe, J. FEBS Lett. 2000, 466, 45-48.
7. Zhang, W.; Wong, K. K.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 2001, 40, 4123-4130.
8. Nnyepi, M. R.; Peng, Y.; Broderick, J. B. Arch. Biochem. Biophys. 2007, 459, 1-9.
9. Becker, A.; Fritz-Wolf, K.; Kabsch, W.; Knappe, J.; Schultz, S.; Volker Wagner, A. F. Nat. Struc. Biol. 1999, 6, 969-975.
10. Selmer, T.; Pierik, A. J.; Heider, J. Biol. Chem. 2005, 386, 981-988.
11. Shisler, K. A.; Broderick, J. B. Arch. Biochem. Biophys. 2014, 546, 64-71.
12. Backman, L. R. F.; Funk, M. A.; Dawson, C. D.; Drennan, C. L. Crit. Rev. Biochem. Mol. Biol. 2017, 52, 674-695.
13. Hanževački, M.; Čondić-Jurkić, K.; Banhatti, R. D.; Smith, A.-S.; Smith, D. M. Chem. Eur. J. 2019, 25, 8741-8753.
14. Hanževački, M.; Banhatti, R. D.; Čondić-Jurkić, K.; Smith, A.-S.; Smith, D. M. J. Phys. Chem. A 2019, 123, 9345-9356.
15. Knappe, J.; Neugebauer, F. A.; Blaschkowski, H. P.; Gänzler, M. Proc. Natl. Acad. Sci. USA 1984, 81, 1332-1335.
16. Parast, C. V.; Wong, K. K.; Lewisch, S. A.; Kozarich, J. W.; Peisach, J.; Magliozzo, R. S. Biochemistry 1995, 34, 2393-2399.
17. Crain, A. V.; Broderick, J. B. J. Biol. Chem. 2014, 289, 5723-5729.
18. Yang, H.; McDaniel, E. C.; Impano, S.; Byer, A. S.; Jodts, R. J.; Yokoyama, K.; Broderick, W. E.; Broderick, J. B.; Hoffman, B. M. J. Am. Chem. Soc. 2019, 141, 12139-12146.
19. Buis, J. M.; Broderick, J. B. Arch. Biochem. Biophys. 2005, 433, 288-296.
20. Himo, F.; Eriksson, L. A. J. Am. Chem. Soc. 1998, 120, 11449-11455.
21. Gauld, J.; Eriksson, L. J. Am. Chem. Soc. 2000, 122, 2035-2040.
22. Leppanen, V. M.; Merckel, M. C.; Ollis, D. L.; Wong, K. K.; Kozarich, J. W.; Goldman, A. Structure 1999, 7, 733-744.
23. Zhang, W.; Wong, K. K.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 2001, 40, 4123-4130.
24. Lucas, M. de F.; Fernandes, P. A.; Eriksson, L. A.; Ramos, M. J. J. Phys. Chem. B 2003, 107, 5751-5757.
25. Guo, J.-D.; Himo, F. J. Phys. Chem. B 2004, 108, 15347-15354.
26. Becker, A.; Kabsch, W. J. Biol. Chem. 2002, 277, 40036-40042.
27. Čondić-Jurkić, K.; Perchyonok, V. T.; Zipse, H.; Smith, D. M. J. Comput. Chem. 2008, 29, 2425-2433.
28. Čondić-Jurkić, K.; Zipse, H.; Smith, D. M. J. Comput. Chem. 2010, 31, 1024-1035.
29. Feliks, M. "Computational Modeling of Catalytic Mechanisms of Glycyl Radical Enzymes" Ph.D. diss., Universität Bayreuth, 2014.
30. Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227-249.
31. Bellott, E. M. J. Med. Chem. 1998, 41, 4675.
32. Åqvist, J.; Warshel, A. Chem. Rev. 1993, 93, 2523-2544.
33. Warshel, A.; Sharma, K. P.; Kato, M.; Xiang, Y.; Liu, H.; Olsson, M. H. Chem. Rev. 2006, 106, 3210-3235.
34. Dapprich, S.; Komáromi, I.; Byun, K. S.; Morokuma, K.; Frisch, M. J. J. Mol. Struct.: THEOCHEM. 1999, 462, 1-21.
35. Wa Chung, L.; Sameera, W. M. C.; Ramozzi, R.; Page, A. J.; Hatanaka, M.; Petrova, G. P.; Harris, T. V.; Li, X.; Ke, Z.; Liu, F.; Li, H.-B.; Ding, L.; Morokuma, K. Chem. Rev. 2015, 115, 5678-5796.


36. Case, D. A.; Cerutti, D. S.; Cheatham, T. E., III; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke, H.; Goetz, A. W.; Greene, D.; Homeyer, N.; Izadi, S.; Kovalenko, A.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Mermelstein, D.; Merz, K. M.; Monard, G.; Nguyen, H.; Omelyan, I.; Onufriev, A.; Pan, F.; Qi, R.; Roe, D. R.; Roitberg, A.; Sagui, C.; Simmerling, C. L.; Botello-Smith, W.-M.; Swails, J.; Walker, R. C.; Wang, J.; Wolf, R. M.; Wu, X.; Xiao, L.; York, D. M.; Kollman, P. A. AMBER 2017, University of California: San Francisco, CA, 2017.
37. Tao, P.; Schlegel, H. B. J. Comput. Chem. 2010, 31, 2363-2369.
38. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; Li, X.; Caricato, M.; Marenich, A. V.; Bloino, J.; Janesko, B. G.; Gomperts, R.; Mennucci, B.; Hratchian, H. P.; Ortiz, J. V.; Izmaylov, A. F.; Sonnenberg, J. L.; Williams-Young, D.; Ding, F.; Lipparini, F.; Egidi, F.; Goings, J.; Peng, B.; Petrone, A.; Henderson, T.; Ranasinghe, D.; Zakrzewski, V. G.; Gao, J.; Rega, N.; Zheng, G.; Liang, W.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Throssell, K.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M. J.; Heyd, J. J.; Brothers, E. N.; Kudin, K. N.; Staroverov, V. N.; Keith, T. A.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Millam, J. M.; Klene, M.; Adamo, C.; Cammi, R.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Farkas, O.; Foresman, J. B.; Fox, D. J. Gaussian 16, Revision A.03; Gaussian, Inc., Wallingford CT, 2016.
39. Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graph. 1996, 14, 33-38.
40. Henry, D. J.; Parkinson, C. J.; Radom, L. J. Phys. Chem. A 2002, 106, 7927-7936.
41. Henry, D. J.; Sullivan, M. B.; Radom, L. J. Chem. Phys. 2003, 118, 4849-4860.
42. Grimme, S.; Ehrlich, S.; Goerigk, L. J. Comput. Chem. 2011, 32, 1456-1465.
43. Henry, D. J.; Parkinson, C. J.; Mayer, P. M.; Radom, L. J. Phys. Chem. A 2001, 105, 6750-6756.
44. Hioe, J.; Zipse, H. Org. Biomol. Chem. 2010, 8, 3609-3617.


Summary and Outlook


In this thesis, the catalytic mechanism of a prototypical member of the glycyl radical enzymes (GREs) family, pyruvate formate-lyase (PFL), has been systematically studied by utilizing a multiscale computational approach including atomistic classical molecular dynamics (MD) simulations, enhanced sampling techniques and QM/MM modeling.

PFL is known to have prominent functions in critical metabolic pathways of microbes as well as in the environment. It is a central enzyme in the anaerobic sugar metabolism of E. coli, where it catalyzes the reversible conversion of pyruvate and coenzyme A (CoA) to formate and acetyl-CoA in two steps via a complex radical ping-pong mechanism. In particular, it has long been known that the overall catalytic cycle of PFL involves two half-reactions and that, for the second half-reaction to take place, the S-H group of CoA must enter the active site of PFL to retrieve a protein-attached acetyl group. However, as evident from the available crystal structure of PFL with its substrates, the coenzyme molecule is located at the protein surface, whereas the active site is buried in the protein interior, some 20-30 Å away. Despite several attempts in the past, a possible route or mechanism for the start of the second half-reaction was never found.

This problem has been addressed anew, and successfully, in this thesis using state-of-the-art computational methods. It has been demonstrated through a series of extensive unrestrained MD simulations (in the µs range), performed on monomeric and dimeric PFL model systems before (PFL•) and after (AcPFL I and AcPFL II) the first half-reaction, that CoA can approach the active site through a putative channel, which is found in an open state that provides a gateway for CoA entry after the completion of the first half-reaction. Further, we showed that this channel is associated with accentuated fluctuations and a higher probability of being in an open state in the acetylated systems. This result implies that the acetylation of the enzyme assumes a prominent functional role, whereby the formation of the acyl intermediate serves to initiate a subtle signaling cascade that influences the enzyme dynamics and facilitates the entry of the second substrate.

Since the spontaneous entry of CoA into the active site of PFL was not observed in any of the simulated model systems, despite hundreds of nanoseconds of unbiased dynamics, a set of steered molecular dynamics simulations was carried out to simulate the coenzyme’s binding/unbinding pathways, and the umbrella sampling technique was employed to estimate reliable free energy profiles of these processes. In addition, instead of following the forward pulls only with reverse pulls, a set of unrestrained dissociation simulations of CoA disengaging from the active site of PFL was performed and analyzed by building meaningful statistics and extracting free energy profiles. Remarkably, it was found that on its approach towards the active site of PFL, the coenzyme crosses relatively low energy barriers (in the range of a few kcal/mol) in both the acetylated (AcPFL I and AcPFL II) and non-acetylated (PFL•) model systems. Further, a closer look at the conformation of CoA and the residues in the active site of PFL revealed that CoA binds to the guanidine group of R435 via different hydrogen bonding patterns in all three model systems, with an especially strong preference for interactions with the active site compartments in AcPFL I, mostly due to favorable H-C interactions between the S-H group of CoA and CO2•-.
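The idea behind umbrella sampling can be illustrated for a single window. A harmonic bias w(ξ) = ½ k (ξ - ξ0)² restrains the system near ξ0, and the free energy along ξ is recovered, up to an additive constant, as A(ξ) = -kT ln P_biased(ξ) - w(ξ). The sketch below is a minimal illustration and not the workflow used in this thesis: the force constant, window center, temperature of 300 K and the synthetic histogram are all hypothetical, and combining several windows (for example with WHAM) is not shown.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def harmonic_bias(xi, xi0, k):
    """Umbrella bias potential in kcal/mol (k in kcal/(mol Å^2))."""
    return 0.5 * k * (xi - xi0) ** 2


def unbias_window(centers, counts, xi0, k, temperature=300.0):
    """Free energy along xi for one window, up to an additive constant.

    Bins with zero counts carry no information and are returned as None.
    """
    kt = K_B_KCAL * temperature
    total = sum(counts)
    profile = []
    for xi, n in zip(centers, counts):
        if n == 0:
            profile.append(None)
            continue
        biased_probability = n / total
        profile.append(-kt * math.log(biased_probability)
                       - harmonic_bias(xi, xi0, k))
    return profile


# Synthetic check: if the underlying free energy is flat, the biased
# histogram follows exp(-w/kT) and the unbiased profile must be constant.
XI0, K_FORCE, TEMP = 10.0, 5.0, 300.0  # hypothetical window parameters
bin_centers = [9.0 + 0.25 * i for i in range(9)]
synthetic_counts = [
    math.exp(-harmonic_bias(x, XI0, K_FORCE) / (K_B_KCAL * TEMP))
    for x in bin_centers
]
flat_profile = unbias_window(bin_centers, synthetic_counts, XI0, K_FORCE, TEMP)
print([round(value, 6) for value in flat_profile])
```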

Using the full enzyme as a model for QM/MM calculations, and employing the ONIOM approach, the mechanistic aspects of both half-reactions in the PFL catalytic cycle have been explored, taking into account the electrostatic effects of the protein environment and the structural restraints imposed on the active site by the enzyme. It has thus been confirmed that the first half-reaction with pyruvate proceeds in two steps, by characterizing a total of three intermediates and two transition states starting from non-acetylated and acetylated monomeric PFL model systems without CoA occupying the active site. The QM/MM calculations suggested a concerted process for the addition of the C418 thiyl radical to pyruvate and homolytic C-C bond cleavage with a subsequent collapse of the unstable tetrahedral radical intermediate, having an overall barrier of about 45-55 kJ/mol. This process, belonging to the first half-reaction with pyruvate, yielded a carbon dioxide radical anion CO2•- and the acyl-enzyme intermediate. The newly formed CO2•- can abstract the nearest hydrogen from C419, where the calculated barrier for this reaction ranges from about 25 kJ/mol in acetylated to about 50 kJ/mol in non-acetylated systems. Additionally, a two-step mechanism has been proposed for the progressive pathways of the second half-reaction with CoA, by identifying three intermediates and two transition states for each acetylated model system in the presence of CoA in the active site of PFL. The calculated QM/MM profiles for the second half-reaction strongly indicate that the H-abstraction and the subsequent activation of CoA are carried out by CO2•- (5 kJ/mol) rather than C419• (40.5 kJ/mol) in the active site. The final step in the second half-reaction was found to be the acetyl group transfer from C418 to the activated CoA•, which yields acetyl-CoA and formate (≈ 64 kJ/mol in AcPFL I compared to ≈ 126 kJ/mol in AcPFL II). Also, while a definite possibility of H-abstraction from CoA in the active site before the first half-reaction exists, it is demonstrated that this would cause a premature quenching (43.1 kJ/mol) of the radical and could ultimately lead to the inactivation of PFL.
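For readers unfamiliar with the ONIOM approach mentioned above, its standard two-layer subtractive energy expression is E(ONIOM) = E(high, model) + E(low, real) - E(low, model), where "model" denotes the QM region and "real" the full system. The sketch below only illustrates this bookkeeping. The numerical inputs are hypothetical placeholders, and the specific layer definitions and embedding scheme used in this thesis are not restated here.

```python
HARTREE_TO_KJ = 2625.4996  # conversion factor from hartree to kJ/mol


def oniom2_energy(e_high_model, e_low_real, e_low_model):
    """Two-layer subtractive ONIOM energy (all inputs in the same unit)."""
    return e_high_model + e_low_real - e_low_model


def relative_energy_kj(state, reference):
    """Energy of `state` relative to `reference` in kJ/mol.

    Each argument is a tuple (E_high_model, E_low_real, E_low_model)
    in hartree.
    """
    return (oniom2_energy(*state) - oniom2_energy(*reference)) * HARTREE_TO_KJ


# Hypothetical component energies (hartree) for a reactant and a
# transition state; these are placeholders, not values from this work.
reactant = (-1500.000, -25.000, -0.500)
transition_state = (-1499.985, -25.002, -0.501)
barrier = relative_energy_kj(transition_state, reactant)
print(f"illustrative barrier: {barrier:.1f} kJ/mol")
```

The subtraction of the low-level model energy removes the double counting of the QM region, which is described once at the high level and once as part of the real system at the low level.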

Although PFL is in general a highly sensitive enzyme that becomes irreversibly deactivated upon exposure to oxygen and other bulk solvent molecules, certain significant conformational changes are crucial for governing critical steps in its usual catalytic cycle. This body of work, based on a multiscale computational approach, has shown that it is indeed important that these subtle conformational changes occur at a specific time, namely after the completion of the first half-reaction with pyruvate and the chemical modification (acetylation) of the enzyme’s active site, so that the flexible coenzyme molecule can reach the buried active site of the enzyme. This favorable accommodation of CoA and its interactions with the active site residues ensure the success of the second half-reaction and of the overall catalytic cycle. Moreover, we found that conformational changes of the channel residues and of those on the protein surface are highly coupled to the catalytic cycle of PFL, and thus play prominent functional roles in protecting the interior of the enzyme from undesirable side reactions that could cause the inactivation of the enzyme by premature quenching of the radical.

Through the extensive application of various computational techniques, it has been demonstrated that PFL-mediated catalysis is an extremely complex, yet tightly controlled process, as it involves rearrangements in the active site, conformational changes of the enzyme, and entry of the substrate into the active site. The results of this thesis provide novel insights into the catalytic mechanism of PFL and present substantial evidence highlighting the importance of structure-function relationships in this class of radical enzymes.

Even though this thesis addresses key steps in the catalytic mechanism of PFL, the findings presented here could in turn help in understanding catalysis in other classes of structurally similar anaerobic radical enzymes. They could also open new questions regarding the large-scale conformational changes that should occur during the initial radical installation on glycine and the activation process, as well as the binding of GREs to the corresponding activase via protein-protein interactions, which precede the catalysis itself. Furthermore, it would be interesting to explore the possible interactions of GREs with formate-translocating channels and other membrane proteins to reveal the mechanisms responsible for product release. To conclude, GREs are known to carry out vital and extremely challenging biochemical reactions in the metabolic pathways of anaerobic bacteria. Although the application of modern experimental techniques and bioinformatics tools has increased our knowledge of the nature and eminence of the GRE family, additional comprehensive and systematic biochemical, structural, and computational studies of key members remain essential. Since GREs have been found to catalyze an impressive scope of reactions of great significance for human gut health, the environment, and industry, the discovery and characterization of new GREs require further involvement and effort.

Appendix

Figure 1. (a) RMSD and (b) RMSF of the protein backbone from the initial crystal structure in monomeric systems during 500 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 2. (a) RMSD and (b) RMSF of the protein backbone from the crystal structure in dimeric systems during 500 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 3. (a) RMSD and (b) RMSF of the protein backbone from the initial crystal structure in monomeric systems during 100 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 4. (a) RMSD and (b) RMSF of the protein backbone from the crystal structure in dimeric systems during 100 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 5. (a) RMSD of CoA from the crystal structure during 500 ns of MD simulation in water. (b) Time series of CoA glycosidic dihedral angle (O4’-C1’-N9-C8) during 500 ns of MD simulation in water. (c) Distance between adenine and cysteamine moiety of CoA during 500 ns of performed MD simulation in water. The figure is taken from the supporting information of [MH1].

Figure 6. RMSD of CoA from the crystal structure in monomeric (a) and dimeric (b) systems during 500 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 7. RMSD of CoA from the crystal structure in monomeric (a) and dimeric (b) systems during 100 ns of MD simulation. The figure is taken from the supporting information of [MH1].

Figure 8. Time series of CoA adenosine moiety dihedral angle (O4’-C1’-N9-C8) in monomeric (a) and dimeric (b) systems during 500 ns of performed MD simulations. The figure is taken from the supporting information of [MH1].

Figure 9. Time series of CoA adenosine moiety dihedral angle (O4’-C1’-N9-C8) in monomeric (a) and dimeric (b) systems during 100 ns of performed MD simulations. The figure is taken from the supporting information of [MH1].

Figure 10. Time series of the distance between adenine and cysteamine moiety of CoA in monomeric (a) and dimeric (b) systems during 500 ns of performed MD simulation. The figure is taken from the supporting information of [MH1].

Figure 11. Time series of the distance between adenine and cysteamine moiety of CoA in monomeric (a) and dimeric (b) systems during 100 ns of performed MD simulation. The figure is taken from the supporting information of [MH1].

Figure 12. Data collected during independent simulations (500 ns) carried out with the dimeric systems for the distances between the cysteamine moiety of CoA and residues F220 and H227 in the active and inactive subunits. The figure is taken from the supporting information of [MH1].

Figure 13. Data collected during independent simulations (100 ns) carried out with the dimeric systems for the distances between the cysteamine moiety of CoA and residues F220 and H227 in the active and inactive subunits. The figure is taken from the supporting information of [MH1].

Figure 14. Data collected during independent simulations (500 ns) carried out with the monomeric (a) and dimeric (b) systems for the distance between the adenine moiety of CoA and F149. The figure is taken from the supporting information of [MH1].

Figure 15. Data collected during independent simulations (100 ns) carried out with the monomeric (a) and dimeric (b) systems for the distance between the adenine moiety of CoA and F149. The figure is taken from the supporting information of [MH1].

Figure 16. The probability distribution of the CoA adenosine moiety glycosidic dihedral angle (O4’-C1’-N9-C8) in water, in the monomeric system, and in both the active and inactive subunits of the dimeric system. The figure is taken from the supporting information of [MH1].

Figure 17. RMSD of channel residues (G167, Y172, R176, Y323, L326, F327, F432, R435, and V611) in monomeric (a) and dimeric (b) systems during 500 ns of performed MD simulation. These residues were mostly found to be responsible for the channel fluctuations. The figure is taken from the supporting information of [MH1].

Figure 18. RMSD of channel residues (G167, Y172, R176, Y323, L326, F327, F432, R435, and V611) in monomeric (a) and dimeric (b) systems during 100 ns of performed MD simulation. The figure is taken from the supporting information of [MH1].

Figure 19. Time series of the L326-V611 inter-residue distance in monomeric (a) and dimeric (b) systems during 500 ns of performed MD simulation. The figure is taken from the supporting information of [MH1].

Figure 20. Time series of the L326-V611 inter-residue distance in monomeric (a) and dimeric (b) systems during 100 ns of performed MD simulation. The figure is taken from the supporting information of [MH1].

Table 1. Energies obtained at the ONIOM[F/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d):AMBER] and ONIOM[G3(MP2)-RAD:AMBER] levels of theory and mean absolute deviation (MAD) from the G3(MP2)-RAD values. All values are at 0 K and given in kJ/mol.
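
The mean absolute deviation quoted in Table 1 can be illustrated with a minimal sketch. It assumes the usual definition, MAD = (1/N) Σ |E_i − E_i(ref)|, with the G3(MP2)-RAD energies taken as the reference. The function name and the numbers in the usage line are hypothetical placeholders and are not data from this thesis.

```python
def mean_absolute_deviation(values, reference):
    """Return the mean of |value - reference| over paired energies (kJ/mol).

    Assumption: both sequences list the same stationary points in the same
    order, e.g. DFT energies and the corresponding G3(MP2)-RAD energies.
    """
    if len(values) != len(reference):
        raise ValueError("values and reference must have the same length")
    if not values:
        raise ValueError("at least one energy pair is required")
    # Sum the absolute differences pair by pair and divide by the pair count.
    total = sum(abs(v - r) for v, r in zip(values, reference))
    return total / len(values)


# Hypothetical usage with placeholder energies (kJ/mol):
mad = mean_absolute_deviation([1.0, 2.0, 3.0], [1.0, 4.0, 2.0])
```

A smaller MAD indicates that a given functional reproduces the reference energies more closely on average, which is how such a table is typically read.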

Table 2. The QM/MM energies for the 1st half-reaction (intermediates A-C and transition structures TS1-2). The energies are obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] level of theory and given in kJ/mol. The initial structures 1 and 2 are taken from MD simulations of non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) model systems without CoA in the active site.

Table 3. The QM/MM energies for the hydrogen transfer (intermediates C-D and transition structure TS3). The energies are obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] level of theory and given in kJ/mol. The initial structures 1 and 2 are taken from MD simulations of non-acetylated (PFL•) and acetylated (AcPFL I and AcPFL II) model systems without CoA in the active site.

Figure 21. The QM/MM energy profiles (ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER]) calculated for the 1st half-reaction (A-C via TS1 and TS2) starting from snapshots taken from MD simulations of (a) PFL•, (b) AcPFL I and (c) AcPFL II model systems without CoA occupying the active site. The arrows indicate the direction and the origin of the calculated QM/MM energy profiles.

Figure 22. The QM/MM energy profiles (ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER]) calculated for the hydrogen transfer (C-D via TS3) starting from snapshots taken from MD simulations of (a) PFL•, (b) AcPFL I and (c) AcPFL II model systems without CoA occupying the active site. The arrows indicate the direction and the origin of the calculated QM/MM energy profiles.

Figure 23. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structures for the 1st half-reaction (A-C and TS1-2) and the H-transfer (C-D and TS3) starting from structures 1 (a) and 2 (b) obtained from MD simulations of PFL• model system without CoA in the active site.

Figure 24. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structures for the 1st half-reaction (A-C and TS1-2) and the H-transfer (C-D and TS3) starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL I model system without CoA in the active site.

Figure 25. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structures for the 1st half-reaction (A-C and TS1-2) and the H-transfer (C-D and TS3) starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL II model system without CoA in the active site.

Table 4. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction and the H-transfer in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of PFL• model system. The units of bond distances are given in Angstrom (Å).

Figure 26. The important inter- and intramolecular distances plotted versus reaction coordinate extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction and the H-transfer in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of PFL• model system.

Table 5. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction and the H-transfer in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of AcPFL I model system. The units of bond distances are given in Angstrom (Å).

Figure 27. The important inter- and intramolecular distances plotted versus reaction coordinate extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction and the H-transfer in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL I model system.

Table 6. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction and the H-transfer in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of AcPFL II model system. The units of bond distances are given in Angstrom (Å).

Figure 28. The important inter- and intramolecular distances plotted versus reaction coordinate extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 1st half-reaction in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL II model system.

Table 7. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from PFL• model system.

Figure 29. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from PFL• model system.

Table 8. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from AcPFL I model system.

Figure 30. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from AcPFL I model system.

Table 9. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (upper) and 2 (lower) obtained from AcPFL II model system.

Figure 31. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site without CoA starting from structures 1 (a) and 2 (b) obtained from AcPFL II model system.

Table 10. The QM/MM energies for the 2nd half-reaction (intermediates E-H and transition structures TS4-6). The energies are obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] level of theory and given in kJ/mol. The initial structures 1 and 2 are taken from MD simulations of acetylated (AcPFL I and AcPFL II) model systems with CoA in the active site.

Figure 32. The QM/MM energy profiles (ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER]) calculated for the 2nd half-reaction (E-H via TS4-6) starting from snapshots taken from MD simulations of (a) AcPFL I and (b) AcPFL II model systems with CoA occupying the active site. The arrows indicate the direction and the origin of the calculated QM/MM energy profiles.

Figure 33. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structures for the 2nd half-reaction (E-H and TS4-6) starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL I model system with CoA in the active site.

Figure 34. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structures for the 2nd half-reaction (F-H and TS5-6) starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL II model system with CoA in the active site.

Table 11. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 2nd half-reaction in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of AcPFL I model system. The units of bond distances are given in Angstrom (Å).

Figure 35. The important inter- and intramolecular distances plotted versus reaction coordinate extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 2nd half-reaction in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL I model system.

Table 12. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 2nd half-reaction in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of AcPFL II model system. The units of bond distances are given in Angstrom (Å).

Figure 36. The important inter- and intramolecular distances plotted versus reaction coordinate extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the 2nd half-reaction in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of AcPFL II model system.

Table 13. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from AcPFL I model system.

Figure 37. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from AcPFL I model system.

Table 14. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from AcPFL II model system.

Figure 38. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from AcPFL II model system.

Table 15. The QM/MM energies for the inactivation (intermediates I-J and transition structure TS7). The energies are obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] level of theory and given in kJ/mol. The initial structures 1 and 2 are taken from MD simulations of non-acetylated (PFL•) model systems with CoA in the active site.

Figure 39. The QM/MM energy profiles (ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER]) calculated for the inactivation (I-J via TS7) starting from snapshots taken from MD simulations of PFL• model system with CoA occupying the active site. The arrow indicates the direction and the origin of the calculated QM/MM energy profiles.

Figure 40. Optimized QM/MM geometries (ONIOM[B3LYP/6-31G(d,p):AMBER]) of intermediates and transition structure for the inactivation (I-J and TS7) starting from structures 1 (a) and 2 (b) obtained from MD simulations of PFL• model system with CoA in the active site.

Table 16. Values of important inter- and intramolecular distances extracted from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the inactivation in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from MD simulations of PFL• model system. The units of bond distances are given in Angstrom (Å).

Figure 41. The important inter- and intramolecular distances plotted versus reaction coordinate from ONIOM[B3LYP/6-31G(d,p):AMBER] geometries for the inactivation in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from MD simulations of PFL• model system.

Table 17. The absolute values of atomic Mulliken spin densities calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (upper) and 2 (lower) obtained from PFL• model system.

Figure 42. The absolute values of atomic Mulliken spin densities plotted versus reaction coordinate calculated by performing QM/MM calculations using ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] method for the obtained geometries in the active site with CoA starting from structures 1 (a) and 2 (b) obtained from PFL• model system.

Table 18. The final QM/MM energies for the overall PFL catalytic mechanism containing the 1st half-reaction (intermediates A-C and transition structures TS1-2), the H-transfer (C-D and TS3), the 2nd half-reaction (E-G and TS4-6), and the inactivation (I-J and TS7). The energies are obtained at the ONIOM[ωB97XD/6-311+G(3df,2p):AMBER]//ONIOM[B3LYP/6-31G(d,p):AMBER] level of theory and given in kJ/mol. All three Boltzmann-averaged energy profiles (see A-D and TS1-3 in Figure 5.3.2a-b) for each topology are reweighted using a Boltzmann average and set relative to the energy of intermediate A. The Boltzmann-weighted energies of the intermediates E, F, and I, together with the corresponding transition structures TS4, TS5, TS6, and TS7 (see Figure 5.3.3a-b and Figure 5.3.4), are set relative to the energies of the intermediates C, D, and A, respectively.
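
The Boltzmann averaging mentioned in the caption of Table 18 can be illustrated with a minimal sketch that computes a Boltzmann-weighted mean of energies obtained from several starting structures. It assumes one common definition, with weights proportional to exp(−E_i/RT), and a temperature of 298.15 K. Both choices are assumptions made here for illustration, as are the function name and the example values, and they may differ from the exact scheme used to produce the table.

```python
import math

# Molar gas constant in kJ/(mol K), matching energies given in kJ/mol.
R_KJ_PER_MOL_K = 8.314462618e-3


def boltzmann_average(energies, temperature=298.15):
    """Return the Boltzmann-weighted mean of energies given in kJ/mol.

    Assumption: each energy E_i receives the weight exp(-E_i / RT), so
    lower-energy profiles contribute more to the average.
    """
    if not energies:
        raise ValueError("at least one energy is required")
    rt = R_KJ_PER_MOL_K * temperature
    # Shift by the minimum so the exponentials stay numerically well behaved;
    # the shift cancels in the ratio and does not change the result.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    partition_sum = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / partition_sum


# Hypothetical usage with placeholder barrier heights (kJ/mol):
averaged_barrier = boltzmann_average([50.0, 55.0, 62.0])
```

Because the weights decay exponentially with energy, the average is dominated by the lowest-energy profile whenever the individual values differ by more than a few multiples of RT (about 2.5 kJ/mol at room temperature).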

Bibliography

List of all references sorted alphabetically:

1. Acuner-Ozbabacan, S. E.; Gursoy, A.; Keskin, O.; Nussinov, R. Curr. Opin. Drug Discov. Dev. 2010, 13, 527-537.
2. Adams, G. E.; Aldrich, J. E.; Bisby, R. H.; Cundall, R. B.; Redpath, J. L.; Willson, R. L. Radiation Research 1972, 49, 278-289.
3. Adams, G. E.; Posener, M. L.; Bisby, R. H.; Cundall, R. B.; Key, J. R. Int. J. Radiat. Biol. Relat. Stud. Phys. Chem. Med. 1979, 35, 497-507.
4. Adcock, S. A.; McCammon, J. A. Chem. Rev. 2006, 106, 1589-1615.
5. Adrio, J. L.; Demain, A. L. Biomolecules 2014, 4, 117-139.
6. Agarwal, P. K. Biochemistry 2019, 58, 438-449.
7. Agarwal, P. K.; Geist, A.; Gorin, A. Biochemistry 2004, 43, 10605-10618.
8. Agarwal, P. K. J. Am. Chem. Soc. 2005, 127, 15248-15256.
9. Agarwal, P. K. Microb. Cell Fact. 2006, 5, 2.
10. Ahmadi, S.; Barrios Herrera, L.; Chehelamirani, M.; Hostaš, J.; Jalife, S.; Salahub, D. R. Int. J. Quantum Chem. 2018, 118, e25558.
11. Alanis-Lobato, G.; Andrade-Navarro, M. A.; Schaefer, M. H. Nucleic Acids Res. 2017, 45, 408-414.
12. Aldeghi, M.; Heifetz, A.; Bodkin, M. J.; Knapp, S.; Biggin, P. C. J. Am. Chem. Soc. 2017, 139, 946-957.
13. Alkadi, H. Infect. Disord. Drug Targets 2020, 20, 16-26.
14. Anandakrishnan, R.; Aguilar, B.; Onufriev, A. V. Nucleic Acids Res. 2012, 40, 537-541.
15. Andersson, J.; Westman, M.; Sahlin, M.; Sjöberg, B.-M. J. Biol. Chem. 2000, 275, 19449-19455.
16. Anzenbacher, P.; Anzenbacherova, E. Cell. Mol. Life Sci. 2001, 58, 737-747.
17. Åqvist, J.; Warshel, A. Chem. Rev. 1993, 93, 2523-2544.
18. Babior, B.; Gould, D. C. Biochem. Biophys. Res. Commun. 1969, 34, 441-447.
19. Backman, L. R. F.; Funk, M. A.; Dawson, C. D.; Drennan, C. L. Crit. Rev. Biochem. Mol. Biol. 2017, 52, 674-695.
20. Bakowies, D.; Thiel, W. J. Phys. Chem. 1996, 100, 10580-10594.
21. Bammens, B.; Evenepoel, P.; Keuleers, H.; Verbeke, K.; Vanrenterghem, Y. Kidney Int. 2006, 69, 1081-1087.
22. Banerjee, R. Chemistry and Biochemistry of B12. Wiley. New York, NY, USA: 1999.
23. Banerjee, R. Chem. Rev. 2003, 103, 2083-2094.
24. Banerjee, R.; Ragsdale, S. W. Annu. Rev. Biochem. 2003, 72, 209-247.
25. Bauerle, M. R.; Schwalm, E. L.; Booker, S. J. J. Biol. Chem. 2015, 290, 3995-4002.
26. Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. J. Phys. Chem. 1993, 97, 10269-10280.
27. Becker, A.; Fritz-Wolf, K.; Kabsch, W.; Knappe, J.; Schultz, S.; Volker Wagner, A. F. Nat. Struct. Biol. 1999, 6, 969-975.
28. Becker, A.; Kabsch, W. J. Biol. Chem. 2002, 277, 40036-40042.
29. Beilen, J. B. v.; Funhoff, E. G. Curr. Opin. Biotechnol. 2005, 16, 308-314.
30. Bellott, E. M. J. Med. Chem. 1998, 41, 4675.
31. Benjdia, A.; Heil, K.; Barends, T. R.; Carell, T.; Schlichting, I. Nucleic Acids Res. 2012, 40, 9308-9318.
32. Berendsen, H. J. C.; Postma, J. P. M.; Van Gunsteren, W. F.; DiNola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684-3690.
33. Berg, J. M.; Tymoczko, J. L.; Stryer, L. Biochemistry, International Edition. 7th ed. WH Freeman & Co.; New York, NY, USA: 2010.
34. Berkovitch, F.; Behshad, E.; Tang, K. H.; Enns, E. A.; Frey, P. A.; Drennan, C. L. Proc. Natl. Acad. Sci. USA 2004, 101, 15870-15875.
35. Berkovitch, F.; Nicolet, Y.; Wan, J. T.; Jarrett, J. T.; Drennan, C. L. Science 2004, 303, 76-79.
36. Berntsson, R. P.; Smits, S. H.; Schmitt, L.; Slotboom, D. J.; Poolman, B. FEBS Lett. 2010, 584, 2606-2617.
37. Betz, R. M.; Walker, R. C. J. Comput. Chem. 2015, 36, 79-87.
38. Bharadwaj, V. S.; Dean, A. M.; Maupin, C. M. J. Am. Chem. Soc. 2013, 135, 12279-12288.
39. Biancardi, A.; Barnes, J.; Caricato, M. J. Chem. Phys. 2016, 145, 224109.
40. Blank, K.; De Cremer, G.; Hofkens, J. Biotechnol. J. 2009, 4, 465-479.
41. Boehr, D. D.; D’Amico, R. N.; O’Rourke, K. F. Protein Sci. 2018, 27, 825-838.
42. Boyd, E. S.; Thomas, K. M.; Dai, Y.; Boyd, J. M.; Outten, F. M. Biochemistry 2014, 53, 5834-5847.
43. Bridwell-Rabb, J.; Drennan, C. L. Curr. Opin. Chem. Biol. 2017, 37, 63-70.
44. Broderick, J. B.; Duffus, B. R.; Duschene, K. S.; Shepard, E. M. Chem. Rev. 2014, 114, 4229-4317.
45. Broderick, W. E.; Broderick, J. B. J. Biol. Inorg. Chem. 2019, 24, 769-776.
46. Broderick, W. E.; Hoffman, B. M.; Broderick, J. B. Acc. Chem. Res. 2018, 51, 2611-2619.

47. Brown, K. L. Chem. Rev. 2005, 105, 2075-2150.
48. Bruender, N. A.; Grell, T. A.; Dowling, D. P.; McCarty, R. M.; Drennan, C. L.; Bandarian, V. J. Am. Chem. Soc. 2017, 139, 1912-1920.
49. Bruice, T. C.; Benkovic, S. J. Biochemistry 2000, 39, 6267-6274.
50. Buckel, W. Angew. Chem. Int. Ed. 2009, 48, 6779.
51. Buckel, W.; Golding, B. T. Annu. Rev. Microbiol. 2006, 60, 27-49.
52. Buckel, W.; Golding, B. T. Encyclopedia of Radicals in Chemistry, Biology and Materials. John Wiley & Sons, Hoboken, New York, NY, USA: 2012.
53. Buckel, W.; Golding, B. T. FEMS Microbiol. Rev. 1999, 22, 523-241.
54. Buckel, W.; Kratky, C.; Golding, B. T. Chem. Eur. J. 2005, 12, 352-362.
55. Buis, J. M.; Broderick, J. B. Arch. Biochem. Biophys. 2005, 433, 288-296.
56. Callaghan, A. V. Front. Microbiol. 2013, 4, 89.
57. Callender, R.; Dyer, R. B. Acc. Chem. Res. 2015, 48, 407-413.
58. Campbell, E.; Kaltenbach, M.; Correy, G. J.; Carr, P. D.; Porebski, B. T.; Livingstone, E. K.; Afriat-Jurnou, L.; Buckle, A. M.; Weik, M.; Hollfelder, F.; Tokuriki, N.; Jackson, C. J. Nat. Chem. Biol. 2016, 12, 944-950.
59. Cao, L.; Ryde, U. Front. Chem. 2018, 6, 89.
60. Case, D. A.; Cerutti, D. S.; Cheatham, T. E., III; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke, H.; Goetz, A. W.; Greene, D.; Homeyer, N.; Izadi, S.; Kovalenko, A.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Mermelstein, D.; Merz, K. M.; Monard, G.; Nguyen, H.; Omelyan, I.; Onufriev, A.; Pan, F.; Qi, R.; Roe, D. R.; Roitberg, A.; Sagui, C.; Simmerling, C. L.; Botello-Smith, W.-M.; Swails, J.; Walker, R. C.; Wang, J.; Wolf, R. M.; Wu, X.; Xiao, L.; York, D. M.; Kollman, P. A. AMBER 2017, University of California: San Francisco, CA, 2017.
61. Cavasotto, C. N.; Phatak, S. S. Drug Discov. Today 2009, 14, 676-683.
62. Cerqueira, N. M. F. S. A.; Fernandes, P. A.; Ramos, M. J. Chem. Phys. Chem. 2018, 19, 669-689.
63. Cheatham, T. E.; Miller, J. H.; Fox, T.; Darden, P. A.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 4193.
64. Cheek, J.; Broderick, J. B. J. Biol. Inorg. Chem. 2001, 6, 209-226.
65. Chen, Y.-H.; Maity, A. N.; Frey, P. A.; Ke, S.-C. J. Am. Chem. Soc. 2013, 135, 788-794.
66. Childers, M. C.; Daggett, V. Mol. Syst. Des. Eng. 2017, 2, 9-33.
67. Chirpich, T. P.; Zappia, V.; Costilow, R. N.; Barker, H. A. J. Biol. Chem. 1970, 245, 1778-1789.
68. Chothia, C.; Hubbard, T.; Brenner, S.; Barns, H.; Murzin, A. Annu. Rev. Biophys. Biomol. Struct. 1997, 26, 597-627.
69. Chovancova, E.; Pavelka, A.; Benes, P.; Strnad, O.; Brezovsky, J.; Kozlikova, B.; Gora, A.; Sustr, V.; Klvana, M.; Medek, P.; Biedermannova, L.; Sochor, J.; Damborsky, J. PLoS Comput. Biol. 2012, 8, e1002708.
70. Christensen, A. S.; Kubař, T.; Cui, Q.; Elstner, M. Chem. Rev. 2016, 116, 5301-5337.
71. Christodoulou, J. J. Paediatr. Child Health 2012, 48, 153-155.
72. Claeyssens, F.; Harvey, J. N.; Manby, F. R.; Mata, R. A.; Mulholland, A. J.; Ranaghan, K. E.; Schütz, M.; Thiel, S.; Thiel, W.; Werner, H.-J. Angew. Chem. Int. Ed. 2006, 45, 6856-6859.
73. Cleland, W. W. Biochim. Biophys. Acta - Spec. Sect. Enzymol. Subj. 1963, 67, 104-137.
74. Cohen, A. J.; Mori-Sánchez, P.; Yang, W. Chem. Rev. 2012, 112, 289-320.
75. Collins, F. S.; Lander, E. S.; Rogers, J.; Waterston, R. H. Nature 2004, 431, 931-945.
76. Čondić-Jurkić, K.; Perchyonok, V. T.; Zipse, H.; Smith, D. M. J. Comput. Chem. 2008, 29, 2425-2433.
77. Čondić-Jurkić, K.; Zipse, H.; Smith, D. M. J. Comput. Chem. 2010, 31, 1024-1035.
78. Conradt, H.; Hohmann-Berger, M.; Hohmann, H. P.; Blaschkowski, H. P.; Knappe, J. Arch. Biochem. Biophys. 1984, 228, 133-142.
79. Cooper, G. M.; Hausman, R. E. The Cell: A Molecular Approach. 6th ed. Sinauer Associates, Inc.; Sunderland, MA, USA: 2013.
80. Coquille, S.; Roux, C.; Mehta, A.; Begley, T. P.; Fitzpatrick, T. B.; Thore, S. J. Struct. Biol. 2013, 184, 438-444.
81. Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179-5197.
82. Cosper, N. J.; Booker, S. J.; Ruzicka, F.; Frey, P. A.; Scott, R. A. Biochemistry 2000, 29, 15668.
83. Craciun, S.; Balskus, E. P. Proc. Natl. Acad. Sci. USA 2012, 109, 21307-21312.
84. Crain, A. V.; Broderick, J. B. J. Biol. Chem. 2014, 289, 5723-5729.
85. Crespo, A.; Marti, M. A.; Estrin, D. A.; Roitberg, A. E. J. Am. Chem. Soc. 2005, 127, 6940-6941.
86. Csermely, P.; Palotai, R.; Nussinov, R. Trends Biochem. Sci. 2010, 35, 539-546.
87. Cui, Q. J. Chem. Phys. 2016, 145, 140901.
88. Cui, Q.; Karplus, M. J. Phys. Chem. B 2000, 104, 3721-3743.
89. Cuzzolin, A.; Sturlese, M.; Deganutti, G.; Salmaso, V.; Sabbadin, D.; Ciancetta, A.; Moro, S. J. Chem. Inf. Model. 2016, 56, 687-705.

90. Damager, I.; Buchini, S.; Amaya, M. F.; Buschiazzo, A.; Alzari, P.; Frasch, A. C.; Watts, A.; Withers, S. G. Biochemistry 2008, 47, 3507-3512.
91. Damborsky, J.; Brezovsky, J. Curr. Opin. Chem. Biol. 2014, 19, 8-16.
92. Daniel, E. Angew. Chem. Int. Ed. 1995, 33, 2375-2378.
93. Dapprich, S.; Komáromi, I.; Byun, K. S.; Morokuma, K.; Frisch, M. J. J. Mol. Struct.: THEOCHEM. 1999, 462, 1-21.
94. De Gonzalo, G.; Colpa, D. I.; Habib, M. H. M.; Fraaije, M. W. J. Biotechnol. 2016, 236, 110-119.
95. Demchenko, A. P. J. Mol. Recognit. 2001, 14, 42-61.
96. Demick, J. M.; Lanzilotta, W. N. Biochemistry 2011, 50, 440-442.
97. DesJarlais, R. L.; Sheridan, R. P.; Dixon, J. S.; Kuntz, I. D.; Venkataraghavan, R. J. Med. Chem. 1986, 29, 2149-2153.
98. De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. J. Med. Chem. 2016, 59, 4035-4061.
99. Doerr, S.; De Fabritiis, G. J. Chem. Theory Comput. 2014, 10, 2064-2069.
100. Dong, M.; Kathiresan, V.; Fenwick, M. K.; Torelli, A. T.; Zhang, Y.; Caranto, J. D.; Dzikovski, B.; Sharma, A.; Lancaster, K. M.; Freed, J. H.; Ealick, S. E.; Hoffman, B. M.; Lin, H. Science 2018, 359, 1247-1250.
101. Dordine, R. L.; Paneth, P.; Anderson, V. E. Bioorg. Chem. 1995, 23, 169-181.
102. Doshi, U.; McGowan, L. C.; Ladani, S. T.; Hamelberg, D. Proc. Natl. Acad. Sci. USA 2012, 109, 5699-5704.
103. Dowling, D. P.; Bruender, N. A.; Young, A. P.; McCarty, R. M.; Bandarian, V.; Drennan, C. L. Nat. Chem. Biol. 2014, 10, 106-112.
104. Dowling, D. P.; Croft, A. K.; Drennan, C. L. Annu. Rev. Biophys. 2012, 41, 403-427.
105. Dowling, D. P.; Vey, J. L.; Croft, A. K.; Drennan, C. L. Biochim. Biophys. Acta 2012, 1824, 1178-1195.
106. Dumas, M. E.; Barton, R. H.; Toye, A.; Cloarec, O.; Blancher, C.; Rothwell, A.; Fearnside, J.; Tatoud, R.; Blanc, V.; Lindon, J. C.; Mitchell, S. C.; Holmes, E.; McCarthy, M. I.; Scott, J.; Gauguier, D.; Nicholson, J. K. Proc. Natl. Acad. Sci. USA 2006, 103, 12511-1256.
107. Dupradeau, F.-Y.; Cézard, C.; Lelong, R.; Stanislawiak, E.; Pêcher, J.; Delepine, J. C.; Cieplak, P. Nucleic Acids Res. 2008, 36, D360-D367.
108. Durrant, J. D.; McCammon, J. A. BMC Biol. 2011, 9, 71.
109. Duschene, K. S.; Veneziano, S. E.; Silver, S. C.; Broderick, J. B. Curr. Opin. Chem. Biol. 2009, 13, 74-83.
110. Du, X.; Li, Y.; Xia, Y-L.; Ai, S-M.; Liang, J.; Sang, P.; Ji, X-L.; Liu, S-Q. Int. J. Mol. Sci. 2016, 17, 144.
111. Eisenmesser, E. Z.; Bosco, D. A.; Akke, M.; Kern, D. Science 2002, 295, 1520-1523.
112. Elstner, M.; Frauenheim, T.; Suhai, S. J. Mol. Struct.: THEOCHEM. 2003, 632, 29-41.
113. Engelkamp, H.; Hatzakis, N. S.; Hofkens, J.; De Schryver, F. C.; Nolte, R. J. M.; Rowan, A. E. Chem. Commun. 2006, 9, 935-940.
114. Feliks, M. “Computational Modeling of Catalytic Mechanisms of Glycyl Radical Enzymes” Ph.D. diss., Universität Bayreuth, 2014.
115. Fenwick, M. K.; Mehta, A. P.; Zhang, Y.; Abdelwahed, S. H.; Begley, T. P.; Ealick, S. E. Nat. Commun. 2015, 6, 6480.
116. Ferlez, B.; Sutter, M.; Kerfeld, C. A. MBio. 2019, 10, e02327-18.
117. Field, M. J.; Bash, P. A.; Karplus, M. J. Comp. Chem. 1990, 11, 700-733.
118. Fontecave, M. Cell. Mol. Life Sci. 1998, 54, 684-695.
119. Fontecave, M.; Ollagnier-de-Choudens, S.; Mulliez, E. Chem. Rev. 2003, 103, 2149-2166.
120. Frey, M.; Rothe, M.; Wagner, A. F.; Knappe, J. J. Biol. Chem. 1994, 269, 12432-12437.
121. Frey, P. A. Annu. Rev. Biochem. 2001, 70, 121-148.
122. Frey, P. A.; Ballinger, M. D.; Reed, G. H. Biochem. Soc. Trans. 1998, 26, 304-310.
123. Frey, P. A.; Hegeman, A. D.; Reed, G. H. Chem. Rev. 2006, 106, 3302-3316.
124. Frey, P. A.; Hegeman, A. D.; Ruzicka, F. J. Crit. Rev. Biochem. Mol. Biol. 2008, 43, 63-88.
125. Frey, P. A.; Magnusson, O. T. Chem. Rev. 2003, 103, 2129-2148.
126. Fried, S. D.; Boxer, S. G. Annu. Rev. Biochem. 2017, 86, 387-415.
127. Friesner, R. A.; Guallar, V. Annu. Rev. Phys. Chem. 2005, 56, 389-427.
128. Friesner, R. A. Proc. Natl. Acad. Sci. USA 2005, 102, 6648-6653.
129. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B. M.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A. Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, revision D.01; Gaussian, Inc.: Wallingford, CT, 2009.

130. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; Li, X.; Caricato, M.; Marenich, A. V.; Bloino, J.; Janesko, B. G.; Gomperts, R.; Mennucci, B.; Hratchian, H. P.; Ortiz, J. V.; Izmaylov, A. F.; Sonnenberg, J. L.; Williams-Young, D.; Ding, F.; Lipparini, F.; Egidi, F.; Goings, J.; Peng, B.; Petrone, A.; Henderson, T.; Ranasinghe, D.; Zakrzewski, V. G.; Gao, J.; Rega, N.; Zheng, G.; Liang, W.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Throssell, K.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M. J.; Heyd, J. J.; Brothers, E. N.; Kudin, K. N.; Staroverov, V. N.; Keith, T. A.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Millam, J. M.; Klene, M.; Adamo, C.; Cammi, R.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Farkas, O.; Foresman, J. B.; Fox, D. J. Gaussian 16, Revision A.03; Gaussian, Inc., Wallingford CT, 2016. 131. Froese, D. S.; Kochan, G.; Muniz, J. R.; Wu, X.; Gileadi, C.; Ugochukwu, E.; Krysztofinska, E.; Gravel, R. A.; Oppermann, U.; Yue, W. W. J. Biol. Chem. 2010, 285, 38204-38213. 132. Gao, J. Acc. Chem. Res. 1996, 29, 298-305. 133. Gao, J.; Ma, S.; Major, D. T.; Nam, K.; Pu, J.; Truhlar, D. G. Chem. Rev. 2006, 106, 3188-3209. 134. Gao, J.; Truhlar, D. Annu. Rev. Phys. Chem. 2002, 53, 467-505. 135. Gao, M.; Zhou, H.; Skolnick, J. Structure 2015, 23, 1362-1369. 136. García-Guevara, F.; Avelar, M.; Ayala, M.; Segovia, L. Biocatalysis 2015, 1, 109-117. 137. Gauld, J.; Eriksson, L. J. Am. Chem. Soc. 2000, 122, 2035-2040. 138. Gervasio, F. L.; Laio, A.; Parrinello, M. J. Am. Chem. Soc. 2005, 127, 2600-2607. 139. Gibson, M. I.; Chen, P. Y.-T.; Johnson, A. C.; Pierce, E.; Can, M.; Ragsdale, S. W.; Drennan, C. L. Proc. Natl. Acad. Sci. USA 2016, 113, 320-325. 140. 
Giedyk, M.; Goliszewska, K.; Gryko, D. Chem. Soc. Rev. 2015, 44, 3391-3404. 141. Gomberg, M. Chem. Rev. 1924, 1, 91-141. 142. Götz, A. W.; Williamson, M. J.; Xu, D.; Poole, D.; Le Grand, S.; Walker, R. C. J. Chem. Theory Comput. 2012, 8, 1542-1555. 143. Granovsky, A. A. J. Chem. Phys. 2011, 134, 214113. 144. Grant, B. J.; Rodrigues, A. P. C.; ElSawy, K. M.; McCammon, J. A.; Caves, L. S. D. Bioinformatics 2006, 22, 2695-2696. 145. Grell, T. A. J.; Bell, B. N.; Nguyen, C.; Dowling, D. P.; Bruender, N. A.; Bandarian, V.; Drennan, C. L. Protein Sci. 2019, 28, 202-215. 146. Griebler, C.; Lueders, T. Freshwater Biol. 2009, 54, 649-677. 147. Griller, D.; Ingold, K. U. Acc. Chem. Res. 1976, 9, 13-19. 148. Grimme, S.; Ehrlich, S.; Goerigk, L. J. Comp. Chem. 2011, 32, 1456-1465. 149. Grossfield, A. “WHAM: the weighted histogram analysis method”, version 2.0.9.1, http://membrane.urmc.rochester.edu/wordpress/?page_id=126. 150. Guo, J.-D.; Himo, F. J. Phys. Chem. B 2004, 108, 15347-15354. 151. Gutteridge, A.; Thornton, J. FEBS Lett. 2004, 567, 67-73. 152. Hamelberg, D.; Mongan, J.; McCammon, J. A. J. Chem. Phys. 2004, 120, 11919-11929. 153. Hammes, G. G. Biochemistry 2002, 41, 8221-8228. 154. Hansen, N.; Van Gunsteren, W. F. J. Chem. Theory Comput. 2014, 10, 2632-2647. 155. Hanzelmann, P.; Schindelin, H. Proc. Natl. Acad. Sci. USA 2004, 101, 12870-12875. 156. Hanževački, M.; Banhatti, R. D.; Čondić-Jurkić, K.; Smith, A.-S.; Smith, D. M. J. Phys. Chem. A 2019, 123, 9345-9356. 157. Hanževački, M.; Čondić-Jurkić, K.; Banhatti, R. D.; Smith, A.-S.; Smith, D. M. Chem. Eur. J. 2019, 25, 8741-8753. 158. Harmer, J. E.; Hiscox, M. J.; Dinis, P. C.; Fox, S. J.; Iliopoulos, A.; Hussey, J. E.; Sandy, J.; Van Beek, F. T.; Essex, J. W.; Roach, P. L. Biochem. J. 2014, 464, 123-133. 159. Head-Gordon, M.; Pople, J. A.; Frisch, M. J. Chem. Phys. Lett. 1988, 153, 503-506. 160. Hediger, M. R.; De Vico, L.; Svendsen, A.; Besenmatter, W.; Jensen, J. H. PLoS ONE 2012, 7, e49849. 161. 
Heider, J.; Spormann, A. M.; Beller, H. R.; Widdel, F. FEMS Microbiol. Rev. 1998, 22, 459-473. 162. Helliwell, J. R. Biosci. Rep. 2017, 37, BSR20170204. 163. Henry, D. J.; Parkinson, C. J.; Mayer, P. M.; Radom, L. J. Phys. Chem. A 2001, 105, 6750-6756. 164. Henry, D. J.; Parkinson, C. J.; Radom, L. J. Phys. Chem. A 2002, 106, 7927-7936. 165. Henry, D. J.; Sullivan, M. B.; Radom, L. J. Chem. Phys. 2003, 118, 4849-4860. 166. Henshaw, T. F.; Cheek, J.; Broderick, J. B. J. Am. Chem. Soc. 2000, 122, 8331-8332. 167. Henzler-Wildman, K. A.; Lei, M.; Thai, V.; Kerns, S. J.; Karplus, M.; Kern, D. Nature 2007, 450, 913-916. 168. Henzler-Wildman, K.; Kern, D. Nature 2007, 450, 964-972. 169. Hertig, S.; Latorraca, N. R.; Dror, R. O. PLoS Comput. Biol. 2016, 12, e1004746. 170. Hesslinger, C.; Fairhurst, S. A.; Sawers, G. Mol. Microbiol. 1998, 27, 477-492. 171. Himo, F.; Eriksson, L. A. J. Am. Chem. Soc. 1998, 120, 11449-11455.

172. Himo, F. Theor. Chem. Acc. 2006, 116, 232-240. 173. Hioe, J.; Savasci, G.; Brand, H.; Zipse, H. Chem. Eur. J. 2011, 17, 3781-3789. 174. Hioe, J.; Zipse, H. Org. Biomol. Chem. 2010, 8, 3609-3617. 175. Hiscox, M. J.; Driesner, R. C.; Roach, P. L. Biochim. Biophys. Acta 2012, 1824, 1165-1177. 176. Hofer, T. S.; de Visser, S. P. Front. Chem. 2018, 6, 2296-2646. 177. Ho, J.; Shao, Y.; Kato, J. Molecules 2018, 23, 2466. 178. Holcomb, J.; Spellmon, N.; Zhang, Y.; Doughan, M.; Li, C.; Yang, Z. AIMS Biophys. 2017, 4, 557-575. 179. Holliday, G. L.; Akiva, E.; Meng, E. C.; Brown, S. D.; Calhoun, S.; Pieper, U.; Sali, A.; Booker, S. J.; Babbitt, P. C. Methods Enzymol. 2018, 606, 1-71. 180. Holliday, G. L.; Mitchell, J. B.; Thornton, J. M. J. Mol. Biol. 2009, 390, 560-577. 181. Hong, N. S.; Petrović, D.; Lee, R.; Gryn’ova, G.; Purg, M.; Saunders, J.; Bauer, P.; Carr, P. D.; Lin, C. Y.; Mabbitt, P. D.; Zhang, W.; Altamore, T.; Easton, C.; Coote, M. L.; Kamerlin, S. C. L.; Jackson, C. J. Nat. Commun. 2018, 9, 3900. 182. Horitani, M.; Shisler, K.; Broderick, W. E.; Hutcheson, R. U.; Duschene, K. S.; Marts, A. R.; Hoffman, B. M.; Broderick, J. B. Science 2016, 352, 822-825. 183. Hospital, A.; Goñi, J. R.; Orozco, M.; Gelpí, J. L. Adv. Appl. Bioinform. Chem. 2015, 8, 37-47. 184. Hu, H.; Yang, W. J. Mol. Struct.: THEOCHEM. 2009, 898, 17-30. 185. Hummer, G.; Szabo, A. Biophys. J. 2003, 85, 5-15. 186. Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graph. 1996, 14, 33-38. 187. Imlay, J. A. Mol. Microbiol. 2006, 59, 1073-1082. 188. Isralewitz, B.; Gao, M.; Schulten, K. Curr. Opin. Struct. Biol. 2001, 11, 224-230. 189. Isralewitz, B.; Izrailev, S.; Schulten, K. Biophys. J. 1997, 73, 2972-2979. 190. Itoh, T.; Hanefeld, U. Green Chem. 2017, 19, 331-332. 191. Izrailev, S.; Crofts, A. R.; Berry, E. A.; Schulten, K. Biophys. J. 1999, 77, 1753-1768. 192. Izrailev, S.; Stepaniants, S.; Balsera, M.; Oono, Y.; Schulten, K. Biophys. J. 1997, 72, 1568-1581. 193. Jäger, C. M.; Croft, A. K. Chem. Bio. 
Eng. Rev. 2018, 5, 143-162. 194. James, L. C.; Tawfik, D. S. Trends Biochem. Sci. 2003, 28, 361-368. 195. Janin, J. Prog. Biophys. Mol. Biol. 1995, 64, 145-166. 196. Jaouen, G. Bioorganometallics: Biomolecules, Labeling, Medicine. Wiley-VCH, Weinheim: 2006. 197. Jasperse, C. P.; Curran, D. P.; Fevig, T. L. Chem. Rev. 1991, 91, 1237-1286. 198. Jensen, F. Introduction to computational chemistry. Wiley, New York, NY, USA: 2001. 199. Jensen, M. Ø.; Park, S.; Tajkhorshid, E.; Schulten, K. Proc. Natl. Acad. Sci. USA 2002, 99, 6731-6736. 200. Jeschke, G. Biochimica et Biophysica Acta (BBA) - Bioenergetics 2005, 1707, 91-102. 201. Jiang, W.; Wang, S.; Wang, Y.; Fang, B. Biotechnol. Biofuels. 2016, 9, 57. 202. Jing, Z.; Liu, C.; Cheng, S. Y.; Qi, R.; Walker, B. D.; Piquemal, J.-P.; Ren, P. Annu. Rev. Biophys. 2019, 48, 371-394. 203. Jones, R. O. Rev. Mod. Phys. 2015, 87, 897. 204. Jones, S.; Thornton, J. M. Proc. Natl. Acad. Sci. USA 1996, 93, 13-20. 205. Jover, J.; Maseras, F. Struct. Bond. 2016, 167, 59-79. 206. Kamerlin, S. C.; Warshel, A. Faraday Discuss. 2010, 145, 71-106. 207. Kampmeier, J. A. Biochemistry 2010, 49, 10770-10772. 208. Kappel, K.; Miao, Y.; McCammon, J. A. Rev. Biophys. 2015, 48, 479-487. 209. Kästner, J.; Thiel, W. J. Chem. Phys. 2005, 123, 144104. 210. Kästner, J. WIREs Comput. Mol. Sci. 2011, 1, 932-942. 211. Kauzmann, W. Biophys. J. 1964, 4, 43-54. 212. Khersonsky, O.; Roodveldt, C.; Tawfik, D. S. Curr. Opin. Chem. Biol. 2006, 10, 498-508. 213. Khersonsky, O.; Tawfik, D. S. Annu. Rev. Biochem. 2010, 79, 471-505. 214. Kingsley, L. J.; Lill, M. A. Proteins 2015, 83, 599-611. 215. Klahn, M.; Braun-Sand, S.; Rosta, E.; Warshel, A. J. Phys. Chem. B 2005, 109, 15645-15650. 216. Klepeis, J. L.; Lindorff-Larsen, K.; Dror, R. O.; Shaw, D. E. Curr. Opin. Struct. Biol. 2009, 19, 120-127. 217. Klinman, J. P.; Kohen, A. J. Biol. Chem. 2014, 289, 30205-30212. 218. Knappe, J.; Blaschkowski, H. P.; Gröbner, P.; Schmitt, T. Eur. J. Biochem. 1974, 50, 253-263. 219. 
Knappe, J.; Elbert, S.; Frey, M.; Wagner, A. F. V. Biochem. Soc. Trans. 1993, 21, 731-734. 220. Knappe, J.; Neugebauer, F. A.; Blaschkowski, H. P.; Gänzler, M. Proc. Natl. Acad. Sci. USA 1984, 81, 1332-1335. 221. Knappe, J.; Sawers, G. FEMS Microbiol. Rev. 1990, 6, 383-398. 222. Knappe, J.; Wagner, A. F. Methods Enzymol. 1995, 258, 343-362. 223. Knowles, J. R. Nature 1991, 350, 121-124. 224. Kohen, A. Acc. Chem. Res. 2015, 48, 466-473. 225. Kong, L.; Bischoff, F. A.; Valeev, E. F. Chem. Rev. 2012, 112, 75-107.

226. Kovačević, B.; Barić, D.; Babić, D.; Bilić, L.; Hanževački, M.; Sandala, G. M.; Radom, L.; Smith, D. M. J. Am. Chem. Soc. 2018, 140, 8487-8496. 227. Kovermann, M.; Grundström, C.; Sauer-Eriksson, A. E.; Sauer, U. H.; Wolf-Watz, M. Proc. Natl. Acad. Sci. USA 2017, 114, 6298-6303. 228. Krainer, F. W.; Glieder, A. Appl. Microbiol. Biotechnol. 2015, 99, 1611-1625. 229. Kraut, J. Science 1988, 242, 533-540. 230. Krebs, C.; Broderick, W. E.; Henshaw, T. F.; Broderick, J. B.; Huynh, B. H. J. Am. Chem. Soc. 2002, 124, 912-913. 231. Külzer, R.; Pils, T.; Kappl, R.; Hüttermann, J.; Knappe, J. J. Biol. Chem. 1998, 273, 4897-4903. 232. Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. J. Comput. Chem. 1992, 13, 1011-1021. 233. Kurteva-Yaneva, N.; Zahn, M.; Weichler, M. T.; Starke, R.; Harms, H.; Müller, R. H.; Sträter, N.; Rohwerder, T. J. Biol. Chem. 2015, 290, 9727-9737. 234. Laio, A.; Gervasio, F. L. Rep. Prog. Phys. 2008, 71, 126601. 235. Laio, A.; Parrinello, M. Proc. Natl. Acad. Sci. USA 2002, 99, 12562-12566. 236. LaMattina, J. W.; Keul, N. D.; Reitzer, P.; Kapoor, S.; Galzerani, F.; Koch, D. J.; Gouvea, I. E.; Lanzilotta, W. N. J. Biol. Chem. 2016, 291, 15515-15526. 237. Lancaster, L.; Abdallah, W.; Banta, S.; Wheeldon, I. Chem. Soc. Rev. 2018, 47, 5177-5186. 238. Landgraf, B. J.; McCarthy, E. L.; Booker, S. J. Annu. Rev. Biochem. 2016, 85, 485-514. 239. LaPointe, S. M.; Weaver, D. F. Curr. Comput. Aided Drug Des. 2007, 3, 290-296. 240. Larsson, K. M.; Logan, D. T.; Nordlund, P. ACS Chem. Biol. 2010, 5, 933-942. 241. Layer, G.; Heinz, D. W.; Jahn, D.; Schubert, W.-D. Curr. Opin. Chem. Biol. 2004, 8, 468-476. 242. Layer, G.; Moser, J.; Heinz, D. W.; Jahn, D.; Schubert, W. D. EMBO J. 2003, 22, 6214-6224. 243. Lee, C.-H.; Sarma, R. H. J. Am. Chem. Soc. 1975, 97, 1225-1236. 244. Lee, D.; Redfern, O.; Orengo, C. Nat. Rev. Mol. Cell Biol. 2007, 8, 995-1005. 245. Le Grand, S.; Götz, A. W.; Walker, R. C. Comput. Phys. Commun. 2013, 184, 374-380. 
246. Lehtiö, L.; Goldman, A. Protein Eng. Des. Sel. 2004, 17, 545-552. 247. Lepore, B. W.; Ruzicka, F. J.; Frey, P. A.; Ringe, D. Proc. Natl. Acad. Sci. USA 2005, 102, 13819-13824. 248. Leppanen, V. M.; Merckel, M. C.; Ollis, D. L.; Wong, K. K.; Kozarich, J. W.; Goldman, A. Structure 1999, 7, 733-744. 249. Leuthner, B.; Leutwein, C.; Schulz, H.; Hörth, P.; Haehnel, W.; Schiltz, E.; Schägger, H.; Heider, J. Mol. Microbiol. 1998, 28, 615-628. 250. Levin, B. J.; Balskus, E. P. Curr. Opin. Chem. Biol. 2018, 47, 86-93. 251. Levin, B. J.; Huang, Y. Y.; Peck, S. C.; Wei, Y.; Martínez-Del Campo, A.; Marks, J. A.; Franzosa, E. A.; Huttenhower, C.; Balskus, E. P. Science 2017, 355, eaai8386. 252. Leys, D.; Scrutton, N. S. Curr. Opin. Struct. Biol. 2016, 41, 19-26. 253. Liao, D. I.; Dotson, G.; Turner, I. Jr.; Reiss, L.; Emptage, M. J. Inorg. Biochem. 2003, 93, 84-91. 254. Liao, L. Y.; Chung, W. S.; Chen, K. M. J. Adv. Nurs. 2017, 73, 108-111. 255. Liao, Q.; Kulkarni, Y.; Sengupta, U.; Petrović, D.; Mulholland, A. J.; Van der Kamp, M. W.; Strodel, B.; Kamerlin, S. C. L. J. Am. Chem. Soc. 2018, 140, 15889-15903. 256. Linder, M. Comput. Struct. Biotechnol. J. 2012, 2, e201209009. 257. Lin, H.; Truhlar, D. G. Theor. Chem. Acc. 2007, 117, 185-199. 258. Lipparini, F.; Mennucci, B. J. Chem. Phys. 2016, 144, 160901. 259. Liu, W. Q.; Amara, P.; Mouesca, J. M.; Ji, X.; Renoux, O.; Martin, L.; Zhang, C.; Zhang, Q.; Nicolet, Y. J. Am. Chem. Soc. 2018, 140, 1365-1371. 260. Lobo, V.; Patil, A.; Phatak, A.; Chandra, N. Pharmacogn. Rev. 2010, 4, 118-126. 261. Lo, H.-H.; Lin, H.-H.; Maity, A. N.; Ke, S.-C. Chem. Commun. 2016, 52, 6399-6402. 262. Lonsdale, R.; Harvey, J. N.; Mulholland, A. J. Chem. Soc. Rev. 2012, 41, 3025-3038. 263. Lonsdale, R.; Ranaghan, K. E.; Mulholland, A. J. Chem. Commun. 2010, 46, 2354-2372. 264. Lucas, M. de F.; Fernandes, P. A.; Eriksson, L. A.; Ramos, M. J. J. Phys. Chem. B 2003, 107, 5751-5757. 265. Lüdemann, S. K.; Lounnas, V.; Wade, R. C. J. Mol. Biol. 
2000, 303, 797-811. 266. Mahoney, M. W.; Jorgensen, W. L. J. Chem. Phys. 2000, 112, 8910-8922. 267. Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. J. Chem. Theory Comput. 2015, 11, 3696-3713. 268. Mancia, F.; Evans, P. R. Structure 1998, 6, 711-720. 269. Mancia, F.; Keep, N. H.; Nakagawa, A.; Leadlay, P. F.; McSweeney, S.; Rasmussen, B.; Bosecke, P.; Diat, O.; Evans, P. R. Structure 1996, 4, 339-350. 270. Mannige, R. V. Proteomes 2014, 2, 128-153. 271. Mansoorabadi, S. O.; Seravalli, J.; Furdui, C.; Krymov, V.; Gerfen, G. J.; Begley, T. P.; Melnick, J.; Ragsdale, S. W.; Reed, G. H. Biochemistry 2006, 45, 7122-7131.

272. Maria-Solano, M. A.; Serrano-Hervás, E.; Romero-Rivera, A.; Iglesias-Fernández, J.; Osuna, S. Chem. Commun. 2018, 54, 6622-6634. 273. Markham, G. D. S-Adenosylmethionine. Encyclopedia of Life Sciences, John Wiley & Sons: 2010. 274. Markowetz, F. PLoS Biol. 2017, 15, e2002050. 275. Marsh, E. N.; Patterson, D. P.; Li, L. Chembiochem. 2010, 11, 604-621. 276. Martens, J. H.; Barg, H.; Warren, M. J.; Jahn, D. Appl. Microbiol. Biotechnol. 2002, 58, 275-285. 277. Martínez Cuesta, S.; Asad Rahman, S.; Furnham, N.; Thornton, J. M. Biophys. J. 2015, 109, 1082-1086. 278. Maseras, F.; Morokuma, K. J. Comput. Chem. 1995, 16, 1170-1179. 279. Mate, D. M.; Alcalde, M. Microb. Biotechnol. 2017, 10, 1457-1467. 280. Melchiorsen, C. R.; Jokumsen, K. V.; Villadsen, J.; Johnsen, M. G.; Israelsen, H.; Arnau, J. J. Bacteriol. 2000, 182, 4783-4788. 281. Menger, F. M.; Nome, F. ACS Chemical Biology 2019, 14, 1386-1392. 282. Mennucci, B.; Tomasi, J. J. Chem. Phys. 1998, 106, 5151-5158. 283. Mennucci, B. WIREs Comput. Mol. Sci. 2012, 2, 386-404. 284. Menon, B. R. K.; Fisher, K.; Rigby, S. E. J.; Scrutton, N. S.; Leys, D. J. Biol. Chem. 2014, 289, 34161-34174. 285. Miura, K. Protein Pept. Lett. 2018, 25, 728-733. 286. Møller, C.; Plesset, M. S. Phys. Rev. 1934, 46, 618-622. 287. Monard, G.; Merz, K. Acc. Chem. Res. 1999, 32, 904-911. 288. Monticelli, L.; Tieleman, D. P. Methods Mol. Biol. 2013, 924, 197-213. 289. Moss, M.; Frey, P. A. J. Biol. Chem. 1987, 262, 14859-14862. 290. Motlagh, H. N.; Wrabl, J. O.; Li, J.; Hilser, V. J. Nature 2014, 508, 331-339. 291. Murray, C.; Davidson, E. R. Chem. Phys. Lett. 1991, 187, 451-454. 292. Nagano, N.; Orengo, C. A.; Thornton, J. M. J. Mol. Biol. 2002, 321, 741-765. 293. Narayanan, C.; Bernard, D. N.; Doucet, N. Catalysts 2016, 6, 81. 294. Neet, K. E. J. Biol. Chem. 1998, 273, 25527-25528. 295. Nnyepi, M. R.; Peng, Y.; Broderick, J. B. Arch. Biochem. Biophys. 2007, 459, 1-9. 296. Noda-Garcia, L.; Liebermeister, W.; Tawfik, D. S. Annu. Rev. Biochem. 
2018, 87, 187-216. 297. Noé, F. Biophys. J. 2015, 108, 228-229. 298. Norman, R. O. C. Nature 1969, 222, 1008-1009. 299. O’Brien, J. R.; Raynaud, C.; Croux, C.; Girbal, L.; Soucaille, P.; Lanzilotta, W. N. Biochemistry 2004, 43, 4635-4645. 300. Oliveira, I. A.; Gonçalves, A. S.; Neves, J. L.; Von Itzstein, M.; Todeschini, A. R. J. Biol. Chem. 2014, 289, 423-436. 301. Olsson, M. H. M.; Mavri, J.; Warshel, A. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 2006, 361, 1417-1432. 302. Olsson, M. H. M.; Parson, W. W.; Warshel, A. Chem. Rev. 2006, 106, 1737-1756. 303. Ouzounis, C. A.; Coulson, R. M. R.; Enright, A. J.; Kunin, V.; Pereira-Leal, J. B. Nat. Rev. Genet. 2003, 4, 508-519. 304. Overington, J. P. Curr. Opin. Struct. Biol. 1992, 2, 394-401. 305. Pang, J.; Li, X.; Morokuma, K.; Scrutton, N. S.; Sutcliffe, M. J. J. Am. Chem. Soc. 2012, 134, 2367-2377. 306. Parast, C. V.; Wong, K. K.; Kozarich, J. W.; Peisach, J.; Magliozzo, R. S. Biochemistry 1995, 34, 5712-5717. 307. Parast, C. V.; Wong, K. K.; Lewisch, S. A.; Kozarich, J. W.; Peisach, J.; Magliozzo, R. S. Biochemistry 1995, 34, 2393-2399. 308. Park, S.; Khalili-Araghi, F.; Tajkhorshid, E.; Schulten, K. J. Chem. Phys. 2003, 119, 3559-3566. 309. Park, S.; Schulten, K. J. Chem. Phys. 2004, 120, 5946-5961. 310. Peck, S. C.; Denger, K.; Burrichter, A.; Irwin, A. M.; Balskus, E. P.; Schleheck, D. Proc. Natl. Acad. Sci. USA 2019, 116, 3171-3176. 311. Pedersen, J. Z.; Finazzi-Agrò, A. FEBS Lett. 1993, 325, 53-58. 312. Peng, Y.; Veneziano, S. E.; Gillispie, G. D.; Broderick, J. B. J. Biol. Chem. 2010, 285, 27224-27231. 313. Perilla, J. R.; Goh, B. C.; Cassidy, C. K.; Liu, B.; Bernardi, R. C.; Rudack, T.; Yu, H.; Wu, Z.; Schulten, K. Curr. Opin. Struct. Biol. 2015, 31, 64-74. 314. Peterson, J. P.; Winter, A. H. J. Am. Chem. Soc. 2019, 141, 12901-12906. 315. Petrović, D.; Risso, V. A.; Kamerlin, S. C. L.; Sanchez-Ruiz, J. M. J. R. Soc. Interface 2018, 15, 20180330. 316. Phaniendra, A.; Jestadi, D. B.; Periyasamy, L. Indian J. 
Clin. Biochem. 2015, 30, 11-26. 317. Plaga, W.; Frank, R.; Knappe, J. Eur. J. Biochem. 1988, 178, 445-450. 318. Plaga, W.; Vielhaber, G.; Wallach, J.; Knappe, J. FEBS Lett. 2000, 466, 45-48. 319. Pollack, R. M. J. Am. Chem. Soc. 2010, 132, 3637-3637. 320. Poveda, J.; Sanchez-Niño, M. D.; Glorieux, G.; Sanz, A. B.; Egido, J.; Vanholder, R.; Ortiz, A. Nephrol. Dial. Transplant. 2014, 29, 56-64. 321. Powers, R. Proteins: Structure, Function, and Bioinformatics 2006, 65, 124-135.

322. Rabus, R.; Boll, M.; Heider, J.; Meckenstock, R. U.; Buckel, W.; Einsle, O.; Ermler, U.; Golding, B. T.; Gunsalus, R. P.; Kroneck, P. M.; Kruger, M.; Lueders, T.; Martins, B. M.; Musat, F.; Richnow, H. H.; Schink, B.; Seifert, J.; Szaleniec, M.; Treude, T.; Ullmann, G. M.; Vogt, C.; Von Bergen, M.; Wilkes, H. J. Mol. Microbiol. Biotechnol. 2016, 26, 5-28. 323. Rabus, R.; Wilkes, H.; Behrends, A.; Armstroff, A.; Fischer, T.; Pierik, A. J.; Widdel, F. J. Bacteriol. 2001, 183, 1707-1715. 324. Radzicka, A.; Wolfenden, R. Science 1995, 267, 90-93. 325. Ramos, M. J.; Fernandes, P. A. Acc. Chem. Res. 2008, 41, 689-698. 326. Ranaghan, K. E.; Mulholland, A. J. Int. Rev. Phys. Chem. 2010, 29, 65-133. 327. Ranaghan, K. E.; Ridder, L.; Szefczyk, B.; Sokalski, W. A.; Hermann, J. C.; Mulholland, A. J. Org. Biomol. Chem. 2004, 2, 968-980. 328. Raynaud, C.; Sarçabal, P.; Meynial-Salles, I.; Croux, C.; Soucaille, P. Proc. Natl. Acad. Sci. USA 2003, 100, 5010-5015. 329. Reddy, S. G.; Wong, K. K.; Parast, C. V.; Peisach, J.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 1998, 37, 558-563. 330. RED server http://upjv.q4md-forcefieldtools.org/REDDB/projects/F-91/, accessed Feb 15, 2017. 331. Reitzer, R.; Gruber, K.; Jogl, G.; Wagner, U. G.; Bothe, H.; Buckel, W.; Kratky, C. Structure 1999, 7, 891-902. 332. Renaud, P.; Sibi, M. P. Radicals in Organic Synthesis, 1st ed. Wiley-VCH, Weinheim: 2001. 333. Reuter, N.; Dejaegere, A.; Maigret, B.; Karplus, M. J. Phys. Chem. A 2000, 104, 1720-1735. 334. Richard, J. P. J. Am. Chem. Soc. 2019, 141, 3320-3331. 335. Rittle, J. ACS Cent. Sci. 2019, 5, 1741-1743. 336. Robins, A. B.; Butler, J. A. V. Radiation Research 1962, 16, 7-17. 337. Robinson, P. K. Essays Biochem. 2015, 59, 1-41. 338. Rodríguez Couto, S.; Toca Herrera, J. L. Biotechnol. Adv. 2006, 24, 500-513. 339. Romero, K. J.; Galliher, M. S.; Pratt, D. A.; Stephenson, C. R. J. Chem. Soc. Rev. 2018, 47, 7851-7866. 340. Rüchardt, C. Free Radic. Res. Commun. 1987, 2, 197-216. 341. 
Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. J. Comput. Phys. 1977, 23, 327-341. 342. Ryde, U. Methods Enzymol. 2016, 577, 119-158. 343. Sakaki, T. Biol. Pharm. Bull. 2012, 35, 844-849. 344. Saleh, T.; Kalodimos, C. G. Science 2017, 355, 247-248. 345. Salmaso, V.; Moro, S. Front. Pharmacol. 2018, 9, 923. 346. Salomon-Ferrer, R.; Götz, A. W.; Poole, D.; Le Grand, S.; Walker, R. C. J. Chem. Theory Comput. 2013, 9, 3878-3888. 347. Sandala, G. M.; Smith, D. M.; Radom, L. Acc. Chem. Res. 2010, 43, 642-651. 348. Sawers, G. FEMS Microbiol. Rev. 1998, 22, 543-551. 349. Sawers, G.; Hesslinger, C.; Muller, N.; Kaiser, M. J. Bacteriol. 1998, 180, 3509-3516. 350. Sawers, G.; Suppmann, B. J. Bacteriol. 1992, 174, 3474-3478. 351. Sawers, G.; Watson, G. Mol. Microbiol. 1998, 29, 945-954. 352. Sayler, R. I.; Stich, T. A.; Joshi, S.; Cooper, N.; Shaw, J. T.; Begley, T. P.; Tantillo, D. J.; Britt, R. D. ACS Cent. Sci. 2019, 5, 1777-1785. 353. Schramm, V. L. Chem. Rev. 2006, 106, 3029-3030. 354. Scott, K. P.; Martin, J. C.; Campbell, G.; Mayer, C. D.; Flint, H. J. J. Bacteriol. 2006, 188, 4340-4349. 355. Secundo, F. Chem. Soc. Rev. 2013, 42, 6250-6261. 356. Selmer, T.; Pierik, A. J.; Heider, J. Biol. Chem. 2005, 386, 981-988. 357. Senn, H. M.; Thiel, W. Top. Curr. Chem. 2007, 268, 173-290. 358. Senn, H. M.; Thiel, W. Angew. Chem. Int. Ed. 2009, 48, 1198-1229. 359. Sharma, G. N.; Gupta, G.; Sharma, P. Crit. Rev. Eukaryot. Gene Expr. 2018, 28, 139-154. 360. Shibata, N.; Masuda, J.; Tobimatsu, T.; Toraya, T.; Suto, K.; Morimoto, Y.; Yasuoka, N. Structure 1999, 7, 997-1008. 361. Shibata, N.; Tamagaki, H.; Hieda, N.; Akita, K.; Komori, H.; Shomura, Y.; Terawaki, S.; Mori, K.; Yasuoka, N.; Higuchi, Y.; Toraya, T. J. Biol. Chem. 2010, 285, 26484-26493. 362. Shibata, N.; Toraya, T. J. Biochem. 2015, 158, 271-292. 363. Shisler, K. A.; Broderick, J. B. Arch. Biochem. Biophys. 2014, 546, 64-71. 364. Shisler, K. A.; Hutcheson, R. U.; Horitani, M.; Duschene, K. S.; Crain, A. V.; Byer, A. 
S.; Broderick, J. B. J. Am. Chem. Soc. 2017, 139, 11803-11813. 365. Shoaie, S.; Karlsson, F.; Mardinoglu, A.; Nookaew, I.; Bordel, S.; Nielsen, J. Sci. Rep. 2013, 3, 2532. 366. Siegbahn, P. E. M.; Himo, F. J. Biol. Inorg. Chem. 2009, 14, 643-651. 367. Sim, A. Y. L.; Minary, P.; Levitt, M. Curr. Opin. Struct. Biol. 2012, 22, 273-278. 368. Singh, U. C.; Kollman, P. A. J. Comput. Chem. 1984, 5, 129-145.

369. Singh, U. C.; Kollman, P. A. J. Comput. Chem. 1986, 7, 718-730. 370. Sintchak, M. D.; Arjara, G.; Kellogg, B. A.; Stubbe, J.; Drennan, C. L. Nat. Struct. Biol. 2002, 9, 293-300. 371. Sofia, H. J.; Chen, G.; Hetzler, B. G.; Reyes-Spindola, J. F.; Miller, N. E. Nucleic Acids Res. 2001, 29, 1097-1106. 372. Sono, M.; Roach, M. P.; Coulter, E. D.; Dawson, J. H. Chem. Rev. 1996, 96, 2841-2888. 373. Souaille, M.; Roux, B. Comput. Phys. Commun. 2001, 135, 40-57. 374. Sousa, S. F.; Ribeiro, A. J. M.; Neves, R. P. P.; Brás, N. F.; Cerqueira, N. M. F. S. A.; Fernandes, P. A.; Ramos, M. J. WIREs Comput. Mol. Sci. 2017, 7, e1281. 375. Srihari, S.; Yong, C. H.; Patil, A.; Wong, L. FEBS Lett. 2015, 589, 2590-2602. 376. Srivastava, A.; Nagai, T.; Srivastava, A.; Miyashita, O.; Tama, F. Int. J. Mol. Sci. 2018, 19, 3401. 377. Steinhauser, M. O.; Hiermaier, S. Int. J. Mol. Sci. 2009, 10, 5135-5216. 378. Strogatz, S. H. Nature 2001, 410, 268-276. 379. Stubbe, J.; Ge, J.; Yee, C. S. Trends Biochem. Sci. 2001, 26, 93-99. 380. Stubbe, J.; Van der Donk, W. A. Chem. Rev. 1998, 98, 705-762. 381. Studer, A.; Curran, D. P. Angew. Chem. Int. Ed. 2016, 55, 58-102. 382. Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000, 113, 6042-6051. 383. Sugita, Y.; Okamoto, Y. Chem. Phys. Lett. 1999, 314, 141-151. 384. Sullivan, S. M.; Holyoak, T. Proc. Natl. Acad. Sci. USA 2008, 105, 13829-13834. 385. Su, N. Q.; Xu, X. Annu. Rev. Phys. Chem. 2017, 68, 155-182. 386. Sun, X.; Ollagnier, S.; Schmidt, P. P.; Atta, M.; Mulliez, E.; Lepape, L.; Eliasson, R.; Gräslund, A.; Fontecave, M.; Reichard, P.; Sjöberg, B. M. J. Biol. Chem. 1996, 271, 6827-6831. 387. Suzuki, Y.; Noma, A.; Suzuki, T.; Senda, M.; Senda, T.; Ishitani, R.; Nureki, O. J. Mol. Biol. 2007, 372, 1204-1214. 388. Svensson, M.; Humbel, S.; Froese, R. D. J.; Matsubara, T.; Sieber, S.; Morokuma, K. J. Phys. Chem. 1996, 100, 19357-19363. 389. Sweetlove, L. J.; Fernie, A. R. Nat. Commun. 2018, 9, 2136. 390. Szabo, A.; Ostlund, N. S. 
Modern quantum chemistry. Dover Publications, New York, NY, USA: 1989. 391. Szalay, P. G.; Müller, T.; Gidofalvi, G.; Lischka, H.; Shepard, R. Chem. Rev. 2012, 112, 108-181. 392. Tang, W. H.; Wang, Z.; Levison, B. S.; Koeth, R. A.; Britt, E. B.; Fu, X.; Wu, Y.; Hazen, S. L. N. Engl. J. Med. 2013, 368, 1575-1584. 393. Tao, P.; Schlegel, H. B. J. Comput. Chem. 2010, 31, 2363-2369. 394. Tirado-Rives, J.; Jorgensen, W. L. J. Chem. Theory Comput. 2008, 4, 297-306. 395. Tollinger, M.; Konrat, R.; Hilbert, B. H.; Marsh, E. N.; Krautler, B. Structure 1998, 6, 1021-1033. 396. Toraya, T. Chem. Rev. 2003, 103, 2095-2127. 397. Torrie, G. M.; Valleau, J. P. J. Comput. Phys. 1977, 23, 187-199. 398. Uberuaga, B. P.; Anghel, M.; Voter, A. F. J. Chem. Phys. 2004, 120, 6363-6374. 399. Ulusu, N. N. J. Mol. Evol. 2015, 80, 251-257. 400. Unkrig, V.; Neugebauer, F. A.; Knappe, J. Eur. J. Biochem. 1989, 154, 723-728. 401. Urlacher, V. B.; Girhard, M. Trends Biotechnol. 2012, 30, 26-36. 402. Van der Kamp, M. W.; Shaw, K. E.; Woods, C. J.; Mulholland, A. J. J. R. Soc. Interface 2008, 5, S173-S190. 403. Vanquelef, E.; Simon, S.; Marquant, G.; Garcia, E.; Klimerak, G.; Delepine, J. C.; Cieplak, P.; Dupradeau, F.-Y. Nucleic Acids Res. 2011, 39, W511-W517. 404. Vey, J. L.; Drennan, C. L. Chem. Rev. 2011, 111, 2487-2506. 405. Vey, J. L.; Yang, J.; Li, M.; Broderick, W. E.; Broderick, J. B.; Drennan, C. L. Proc. Natl. Acad. Sci. USA 2008, 105, 16137-16141. 406. Vijayabaskar, M. S.; Vishveshwara, S. PLoS Comput. Biol. 2012, 8, e1002505. 407. Vreven, T.; Byun, K. S.; Komáromi, I.; Dapprich, S.; Montgomery, J. A.; Morokuma, K.; Frisch, M. J. J. Chem. Theory Comput. 2006, 2, 815-826. 408. Vyas, V. K.; Ukawala, R. D.; Ghate, M.; Chintha, C. Indian J. Pharm. Sci. 2012, 74, 1-17. 409. Wa Chung, L.; Hirao, H.; Li, X.; Morokuma, K. WIREs Comp. Mol. Sci. 2012, 2, 327-350. 410. Wa Chung, L.; Sameera, W. M. C; Ramozzi, R.; Page, A. J.; Hatanaka, M.; Petrova, G. P.; Harris, T. 
V.; Li, X.; Ke, Z.; Liu, F.; Li, H-B.; Ding, L.; Morokuma, K. Chem. Rev. 2015, 115, 5678-5796. 411. Wagner, A. F.; Frey, M.; Neugebauer, F. A.; Schafer, W.; Knappe, J. Proc. Natl. Acad. Sci. USA 1992, 89, 996-1000. 412. Wagner, A. F.; Schultz, S.; Bomke, J.; Pils, T.; Lehmann, W. D.; Knappe, J. Biochem. Biophys. Res. Commun. 2001, 285, 456-462. 413. Walsby, C. J.; Ortillo, D.; Yang, J.; Nnyepi, M. R.; Broderick, W. E.; Hoffman, B. M.; Broderick, J. B. Inorg. Chem. 2005, 44, 727-741. 414. Wang, J.; Woldring, R. P.; Roman-Melendez, G. D.; McClain, A. M.; Alzua, B. R.; Marsh, E. N. ACS Chem. Biol. 2014, 9, 1929-1938.

415. Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. J. Comput. Chem. 2004, 25, 1157-1174. 416. Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M. K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.; Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.; Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.; Berne, B. J.; Friesner, R. A.; Abel, R. J. Am. Chem. Soc. 2015, 137, 2695-2703. 417. Wang, S. C.; Frey, P. A. Trends Biochem. Sci. 2007, 32, 101-110. 418. Wang, T.; Duan, Y. J. Mol. Biol. 2009, 392, 1102-1115. 419. Wang, Z.; Klipfell, E.; Bennett, B. J.; Koeth, R.; Levison, B. S.; Dugar, B.; Feldstein, A. E.; Britt, E. B.; Fu, X.; Chung, Y. M.; Wu, Y.; Schauer, P.; Smith, J. D.; Allayee, H.; Tang, W. H.; DiDonato, J. A.; Lusis, A. J.; Hazen, S. L. Nature 2011, 472, 57-63. 420. Warshel, A. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 425-443. 421. Warshel, A.; Bora, R. P. J. Chem. Phys. 2016, 144, 180901. 422. Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227-249. 423. Warshel, A.; Sharma, K. P.; Kato, M.; Xiang, Y.; Liu, H.; Olsson, M. H. Chem. Rev. 2006, 106, 3210-3235. 424. Warshel, A.; Weiss, R. M. J. Am. Chem. Soc. 1980, 102, 6218-6226. 425. Wei, G.; Xi, W.; Nussinov, R.; Ma, B. Chem. Rev. 2016, 116, 6516-6551. 426. Wei, Y.; Funk, M. A.; Rosado, L. A.; Baek, J.; Drennan, C. L.; Stubbe, J. Proc. Natl. Acad. Sci. USA 2014, 111, 3756-3765. 427. Wells, S. A.; Van der Kamp, M. W.; McGeagh, J. D.; Mulholland, A. J. PLoS ONE 2015, 10, e0133372. 428. Weng, Y-Z.; Chang, D. T.; Huang, Y-F.; Lin, C-W. BMC Bioinformatics 2011, 12, S1-S32. 429. Wierenga, R. K. FEBS Lett. 2001, 492, 193-198. 430. Wiltgen, M.; Tilz, G. P. Wien Med. Wochenschr. 2009, 159, 112-125. 431. Wolthers, K. R.; Levy, C.; Scrutton, N. S.; Leys, D. J. Biol. Chem. 2010, 285, 13942-13950. 432. Wolthers, K. R.; Rigby, S. E. J.; Scrutton, N. S. J. Biol. Chem. 
2008, 283, 34615-34625. 433. Wong, K. K.; Murray, B. W.; Lewisch, S. A.; Baxter, M. K.; Ridky, T. W.; Ulissi-DeMario, L.; Kozarich, J. W. Biochemistry 1993, 32, 14102-14110. 434. Wu, W.-J.; Tonge, P. J.; Raleigh, D. P. J. Am. Chem. Soc. 1998, 120, 9988-9994. 435. Xiong, H.; Crespo, A.; Marti, M.; Estrin, D.; Roitberg, A. E. Theor. Chem. Acc. 2006, 116, 338-346. 436. Xue, Y.; Ward, J. M.; Yuwen, T.; Podkorytov, I. S.; Skrynnikov, N. R. J. Am. Chem. Soc. 2012, 134, 2555-2562. 437. Yamanishi, M.; Yunoki, M.; Tobimatsu, T.; Sato, H.; Matsui, J.; Dokiya, A.; Iuchi, Y.; Oe, K.; Suto, K.; Shibata, N.; Morimoto, Y.; Yasuoka, N.; Toraya, T. Eur. J. Biochem. 2002, 269, 4484-4494. 438. Yamazaki, I. Free Radic. Biol. Med. 1987, 3, 397-404. 439. Yang, H.; Impano, S.; Shepard, E. M.; James, C. D.; Broderick, W. E.; Broderick, J. B.; Hoffman, B. M. J. Am. Chem. Soc. 2019, 141, 16117-16124. 440. Yang, H.; McDaniel, E. C.; Impano, S.; Byer, A. S.; Jodts, R. J.; Yokoyama, K.; Broderick, W. E.; Broderick, J. B.; Hoffman, B. M. J. Am. Chem. Soc. 2019, 141, 12139-12146. 441. Yang, J.; Naik, S. G.; Ortillo, D. O.; García-Serres, R.; Li, M.; Broderick, W. E.; Huynh, B. H.; Broderick, J. B. Biochemistry 2009, 48, 9234-9241. 442. Yang, L.-Q.; Sang, P.; Tao, Y.; Fu, Y.-X.; Zhang, K.-Q.; Xie, Y.-H.; Liu, S.-Q. J. Biomol. Struct. Dyn. 2014, 32, 372-393. 443. Yokoyama, K.; Lilla, E. A. Nat. Prod. Rep. 2018, 35, 660-694. 444. Zelcbuch, L.; Lindner, S. N.; Zegman, Y.; Vainberg Slutskin, I.; Antonovsky, N.; Gleizer, S.; Milo, R.; Bar-Even, A. Biochemistry 2016, 55, 2423-2426. 445. Zeymer, C.; Hilvert, D. Annu. Rev. Biochem. 2018, 87, 131-157. 446. Zhang, Q.; Van der Donk, W. A.; Liu, W. Acc. Chem. Res. 2012, 45, 555-564. 447. Zhang, W.; Wong, K. K.; Magliozzo, R. S.; Kozarich, J. W. Biochemistry 2001, 40, 4123-4130. 448. Zhang, Y.; Zhu, X.; Torelli, A. T.; Lee, M.; Dzikovski, B.; Koralewski, R. M.; Wang, E.; Freed, J.; Krebs, C.; Ealick, S. E.; Lin, H. Nature 2010, 465, 891-896. 449. Zipse, H. 
Topics Curr. Chem. 2006, 263, 163-190. 450. Züercher, M.; Diederich, F. J. Org. Chem. 2008, 73, 4345-4361.

List of Publications

Publications used in this thesis:

1. Hanževački, M.; Čondić-Jurkić, K.; Banhatti, R. D.; Smith, A.-S.; Smith, D. M. The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase. Chem. Eur. J. 2019, 25, 8741-8753.
2. Hanževački, M.; Banhatti, R. D.; Čondić-Jurkić, K.; Smith, A.-S.; Smith, D. M. Exploring Reactive Conformations of Coenzyme A during Binding and Unbinding to Pyruvate Formate-Lyase. J. Phys. Chem. A 2019, 123, 9345-9356.

Publications not used in this thesis:

3. Šakić, D.; Hanževački, M.; Smith, D. M.; Vrček, V. A Computational Study of the Chlorination and Hydroxylation of Amines by Hypochlorous Acid. Org. Biomol. Chem. 2015, 13, 11740-11752.
4. Kovačević, B.; Barić, D.; Babić, D.; Bilić, L.; Hanževački, M.; Sandala, G. M.; Radom, L.; Smith, D. M. A Computational Tale of Two Enzymes: Glycerol Dehydration With or Without B12. J. Am. Chem. Soc. 2018, 140, 8487-8496.
5. Štimac, A.; Tokić, M.; Ljubetič, A.; Vuletić, T.; Šekutor, M.; Požar, J.; Leko, K.; Hanževački, M.; Frkanec, L.; Frkanec, R. Functional Self-assembled Nanovesicles Based on β-cyclodextrin, Liposomes and Adamantyl Guanidines as Potential Nonviral Gene Delivery Vectors. Org. Biomol. Chem. 2019, 17, 4640-4651.
6. Wang, X.; Shyshov, O.; Hanževački, M.; Jäger, C. M.; von Delius, M. Ammonium Complexes of Orthoester Cryptands are Inherently Dynamic and Adaptive. J. Am. Chem. Soc. 2019, 141, 8868-8876.
7. Leko, K.; Hanževački, M.; Brkljača, Z.; Pičuljan, K.; Ribić, R.; Požar, J. Solvophobically Driven Complexation of Adamantyl Mannoside with β-cyclodextrin in Water and Structured Organic Solvents. Chem. Eur. J. 2020, 26, 5208-5219.

Curriculum Vitae

Personal Information

Name: Marko
Surname: Hanževački
Date of birth: 27/10/1990
Place of birth: Bjelovar, Croatia

Education

2015 - present: Ph.D. Theoretical Chemistry/Physics
PULS Group, Institute for Theoretical Physics, Friedrich-Alexander-University, Erlangen, Germany

2012 - 2014: M.Sc. Chemistry
Division of Chemistry, Faculty of Science, University of Zagreb, Zagreb, Croatia

2009 - 2012: B.Sc. Chemistry
Division of Chemistry, Faculty of Science, University of Zagreb, Zagreb, Croatia

Work Experience

2015 - 2019: Research Assistant in Group for Computational Life Sciences
Division of Physical Chemistry, Ruđer Bošković Institute, Zagreb, Croatia

2019 - present: Research Associate in Sustainable Process Technologies group
Faculty of Chemical and Environmental Engineering, University of Nottingham, Nottingham, United Kingdom

Lectures and Poster Presentations

2015

AMBER Molecular Dynamics Workshop, Haifa, Israel
Poster title: The solvation of transition state structures for HOCl-induced chlorination and hydroxylation of ammonia in water

7th EAM (Engineering of Advanced Materials) Symposium, Bad Staffelstein, Germany
Poster title: Activin Receptor Type IIA Protein Kinase Inhibitors: Free Energy Calculations and Ligand Binding

2016

30th Molecular Modelling Workshop, Erlangen, Germany
Contributed lecture title: Activin Receptor Type IIA Protein Kinase Inhibitors: Free Energy Calculations and Ligand Binding
Poster title: Investigation of the Effect of β-Cyclodextrin on Peptide Deamidation: A Molecular Dynamics Study

Workshop on Innovative Surfaces and Materials (MIPoMat), Primošten, Croatia
Poster title: Activin Receptor Type IIA Protein Kinase Inhibitors: Free Energy Calculations and Ligand Binding

8th EAM (Engineering of Advanced Materials) Symposium, Bad Staffelstein, Germany
Poster title: Investigation of the Effect of β-Cyclodextrin on Peptide Deamidation: A Molecular Dynamics Study

2017

31st Molecular Modelling Workshop, Erlangen, Germany
Poster title: The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase

11th Triennial Congress of the World Association of Theoretical and Computational Chemists (WATOC), Munich, Germany
Poster title: The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase

International Congress Engineering of Advanced Materials (ICEAM), Erlangen, Germany
Poster title: The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase

2018

32nd Molecular Modelling Workshop, Erlangen, Germany
Contributed lecture title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

1st Computational Chemistry Day, Zagreb, Croatia
Contributed lecture title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

11th Congress on Electronic Structure Principles and Applications (ESPA), Toledo, Spain
Poster title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

Gordon Research Conference and Gordon Research Seminar: Computational Chemistry, West Dover, Vermont, USA
Poster title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

2019

2nd Computational Chemistry Day, Zagreb, Croatia
Poster title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

4th Grandmaster Early-Career Workshop in Physics, Split, Croatia
Poster title: The Entry of CoA into Enzymes: A Case Study with Pyruvate Formate-Lyase

2020

34th Molecular Modelling Workshop, Erlangen, Germany
Contributed lecture title: From the catalytic mechanism of the glycyl radical enzyme pyruvate formate-lyase to the dynamics of its activation

Awards

2013: Rector's Award for a great team-based scientific and artistic work, Zagreb, Croatia
2014: Rector's Award for individual scientific and artistic work, Zagreb, Croatia
2015: Poster prize, 7th EAM (Engineering of Advanced Materials) Symposium, Bad Staffelstein, Germany
2016: Poster prize, Workshop on Innovative Surfaces and Materials (MIPoMat), Primošten, Croatia
2017: Poster prize, 31st Molecular Modelling Workshop, Erlangen, Germany
2018: British Scholarship Trust, Nottingham, UK
2019: Front Cover: The Influence of Chemical Change on Protein Dynamics: A Case Study with Pyruvate Formate-Lyase (Chem. Eur. J. 37/2019), DOI: 10.1002/chem.201901772
2020: Front Cover: Solvophobically Driven Complexation of Adamantyl Mannoside with β-Cyclodextrin in Water and Structured Organic Solvents (Chem. Eur. J. 23/2020), DOI: 10.1002/chem.202000849

Statutory Declaration

Full name of the candidate:

Marko Hanževački

Faculty:

Faculty of Sciences, Friedrich-Alexander-Universität Erlangen-Nürnberg

Thesis title:

Computational modeling of glycyl radical enzymes: Novel insights to the mechanism of Pyruvate Formate-Lyase

Statutory declaration:

I hereby solemnly declare that I am the author of the enclosed thesis and that I have not used materials and sources without corresponding declaration or citation in the text. All thoughts or quotations which were inferred from the sources are marked as such. I formally declare that this thesis was not submitted to any other authority to achieve an academic degree.

Date and signature:

Erlangen, June 10, 2020

Marko Hanževački
