A Biophysical Study of Fluoroacetate Dehalogenase

By

Pedram Mehrabi

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Medical Biophysics
University of Toronto

© Copyright by Pedram Mehrabi 2017


Abstract

Fluoroacetate dehalogenase (FAcD) is a homodimeric enzyme and belongs to the α/β hydrolase protein superfamily. It possesses the ability to cleave C-F bonds and so is capable of using the simplest fluorocarbon, fluoroacetate, as its main source of carbon. The underlying mechanistic details that allow this enzyme to perform C-F cleavage are still unknown. However, the details needed to build a mechanistic model of the asymmetric cooperative behavior observed in FAcD are slowly being revealed. In the following studies, I use FAcD as a model system to study its catalytic reaction mechanism, focusing on the reaction coordinate pathway and the dynamics of various states along the pathway. I also use FAcD as a suitable system for studying irreversible enzymatic reactions via time-resolved crystallography (TRX) experiments.

Using a series of biophysical techniques I analyze structure, function, and dynamics of both the ground state and transient higher energy states. It was demonstrated using X-ray crystallography that there is subtle structural asymmetry between the individual subunits in the apo state of the enzyme. This asymmetry becomes more pronounced during catalytic activity whereby entropic loss from binding in one protomer is compensated by an increase of entropy and a loss of structural waters in the other.


Kinetic measurements of FAcD revealed that during titrations involving excess substrate the enzyme undergoes substrate inhibition. Biphasic chemical shifts identified by NMR indicated a secondary binding site, which was confirmed via crystal structures of various complexes of a catalytically slow mutant (Y219F). This secondary binding site decreases transmission along the allosteric pathway, which could be mapped using rigidity-based transmission allostery (RTA) analysis.

Using FAcD as a model system for method development, I applied fixed-target approaches for time-resolved crystallography (TRX). Using light activation of a caged substrate, I could verify that the structural asymmetry first observed using a series of mutants is still conserved. In addition, structural features of the active site and cap domain change from one time point to the next and corroborate previous results.

Taken together, these data provide nuanced mechanistic insights into the structural enzymology of a homodimeric enzyme.


Acknowledgements

The frontiers of science are now only rarely pushed forth as the work of individuals, but rather involve collaborative efforts in the hopes of being able to understand more about the character of Nature. Several people contributed to my research, which made the work of this thesis possible. I would like to start off by expressing my sincere appreciation to my supervisor, Dr. Emil Pai, for offering me an opportunity to work on a project that satisfied many of my scientific curiosities and opened many new ones. I thank him not only for his guidance, but also for consistently allowing me the opportunity to bounce my ideas off him. I would also like to express my gratitude to my committee members Drs. Scott Prosser and Gil Privé. Additional thanks go to Dr. Scott Prosser for his guidance and keen insights into understanding protein dynamics.

Further thanks go to the members of the Pai lab for creating an enjoyable working environment. In particular, I would like to thank Ondrej Halgas and Natasha Krugylak for interesting conversation and for their comradery.

I would also like to thank my many collaborators, who made much of this research possible: Dr. Zhong Ren, for introducing me to both dynamic crystallography and SVD analysis, all while unwearyingly answering all my questions; Dr. Adnan Sljoka, for showing me a new world of protein analysis; Dr. Vukica Srajer, for introducing me to Laue time-resolved crystallography; and Drs. Henrike Müller-Werkmeister and Eike Schulz, for tirelessly working through very long beamtimes, trying to optimize and improve our time-resolved measurements. I also thank the many members of the DESY beamtimes, especially Jessica Besaw, Antoine Sarracini, and Kamil Krawczyk. I would especially like to thank Dr. Tae Hun Kim, not only for his comradery, but also for spending countless hours with me in the lab, patiently teaching me NMR and diligently going over data.

Last but not least, I would like to thank my parents Fariba and Saeid for their constant love, support and encouragement.


Table of Contents

Abstract
Acknowledgements
List of Tables
List of Figures
List of Appendices
List of Abbreviations

Chapter 1
1.1 Halogenated hydrocarbon waste
1.1.1 Bioremediation
1.1.2 Microbial Dehalogenation
1.1.2.1 Intramolecular substitution
1.1.2.2 Thiolytic dehalogenation
1.1.2.3 Oxygenolytic dehalogenation
1.1.2.4 Dehydrohalogenation
1.1.2.5 Reductive dehalogenation
1.1.3 Hydrolytic dehalogenation
1.1.3.1 Haloacid dehalogenases
1.1.3.2 Haloalkane dehalogenases
1.1.3.3 Fluoroacetate dehalogenases
1.2 Structural approaches to study fluoroacetate dehalogenase
1.2.1 Classical X-ray crystallography
1.2.2 Time-resolved crystallography
1.2.2.1 Ideal systems
1.2.2.2 Mutational Trapping
1.2.2.3 Physical Trapping
1.2.2.4 Analytical Trapping
1.2.3 Structure determination by serial crystallography: an overview
1.2.4 Computational approaches to electron density interpretation
1.2.4.1 Singular value decomposition
1.2.4.2 Absolute value electron density maps
1.3 Allostery
1.3.1 Models of allostery
1.3.1.1 Concerted Model
1.3.1.2 Sequential Model
1.3.1.3 Ensemble Model
1.3.2 Methods to study allostery
1.3.2.1 X-ray crystallography
1.3.2.2 Nuclear Magnetic Resonance
1.3.2.3 Computational approaches
1.4 Thesis Rationale

Chapter 2
2.1 Main Text
2.1.1 The ensemble approach to
2.1.2 Dynamic allostery in dimers
2.1.3 Dimer asymmetry
2.1.4 Dynamic averaging of structural asymmetry
2.1.5 Probing dynamic asymmetry and protomer exchange during catalysis
2.1.6 Asymmetry in local dynamics and numbers of crystallographic water molecules
2.1.7 Inter-protomer allosteric pathways
2.1.8 Conformational selection and the reaction coordinate pathway
2.1.9 The role of dynamics and quaternary structure in catalysis
2.1.10 The role of water networks in stabilizing functional states
2.2 Conclusions
2.3 Materials and Methods
2.4 Supplementary Text
2.4.1 FAcD as a model to study irreversible
2.4.2 Rigidity-based allosteric transmission

Chapter 3
3.1 Introduction
3.2 Methods and Materials
3.3 Results
3.3.1 High substrate concentrations lead to substrate inhibition
3.3.2 In crystal structures of the Y219F mutant of FAcD one subunit binds two substrate molecules
3.3.3 After 24 hours soaking time, crystal structures of the Y219F mutant of FAcD show at the active site
3.3.4 Secondary binding is absent in structures derived after 24 hours soaking times
3.3.5 19F NMR shows changes in protein conformation and dynamics when two substrates are bound
3.3.6 Linear response in titrations with substrate analogue BrAc and product glycolate
3.3.7 Aberrant biphasic changes in chemical shifts observed in NH HSQC spectra
3.3.8 Computational rigidity analysis for elucidating the pathway for allosteric information transfer
3.4 Discussion
3.4.1 The role of substrate inhibition in catalysis
3.5 Supplementary Materials

Chapter 4
4.1 Introduction
4.2 Methods and Materials
4.3 Results
4.4 Discussion
4.5 Supplementary Materials

Chapter 5
5.1 Protein Engineering for bioremediation
5.2 Further investigations of function of protomer interactions
5.3 Time-resolved crystallography
5.3.1 Diffusion experiments
5.3.2 Time-resolved crystallography of FAcD
5.4 Allostery pathway in FAcD
5.5 Summary and Conclusion
5.6 Final Remarks

Appendix
A.1 Introduction
A.2 Methods and Materials
A.3 Results
A.3.1 Unit cell and B-factor differences
A.3.2 Comparison of structures calculated from data sets obtained by Laue, serial, and rotating anode home source techniques
A.4 Discussion
A.5 Supplementary figures

Bibliography


List of Tables

Table 2.S1 – Crystallography statistics

Table 2.S2 - Crystallography statistics for FTX structures

Table 3.S1 – Crystallography statistics

Table 4.S1 – Crystallographic statistics

Table A.S1 – Crystallography statistics


List of Figures

Figure 1.1 – Dehalogenation by intramolecular substitution chemistry and the crystal structure of haloalcohol dehalogenase HheC (PDB ID: 1PWZ) from Agrobacterium radiobacter AD1

Figure 1.2 – Thiolytic dehalogenation chemistry and the crystal structure of human glutathione-S-transferase P-1 (PDB ID: 10GS)

Figure 1.3 - Oxygenolytic dehalogenation chemistry

Figure 1.4 - Dehydrohalogenation chemistry and the crystal structure of hexachlorocyclohexane dehydrochlorinase (LinA) Type2 (PDB ID: 3A76) from Agrobacterium radiobacter AD1

Figure 1.5 – Reductive dehalogenation chemistry and the crystal structure of tetrachloroethene reductive dehalogenase (PDB ID: 4UQU) from Sulfurospirillum multivorans

Figure 1.6 – Hydrolytic dehalogenation chemistry and the crystal structure of fluoroacetate dehalogenase (RPA1163) from Rhodopseudomonas palustris

Figure 1.7 – FAcD crystals and diffraction pattern

Figure 1.8 - Simplified scheme of singular value decomposition (SVD)

Figure 1.9 – Concerted and sequential models of allostery

Figure 2.1 - Structural asymmetry and correlated structural changes in FAcD

Figure 2.2 - Conformational and dynamic changes of FAcD upon titration with substrate analog BrAc observed by 15N,1H HSQC and 19F NMR

Figure 2.3 - Cross correlation of residues and collective motions from MD simulations

Figure 2.4 - Heterogeneous dynamics in the dimer at key catalytic steps

Figure 2.5 - Distribution of water molecules bound to crystalline FAcD

Figure 2.6 - Computational analyses using rigidity-based transmission models

Figure 2.7 - Crystallographic and spectroscopic evidence of conformational selection

Figure 2.8 - Proposed mechanistic scheme for FAcD catalysis

Figure 2.S1 - Reaction coordinate pathway of FAc hydrolysis with or without FAcD

Figure 2.S2 - Crystal structure of 5F-Trp labeled FAcD and the comparison of its catalytic rate and ligand binding to that of wild-type FAcD

Figure 2.S3 - 15N CPMG relaxation dispersions of BrAc-bound FAcD acquired at 800 MHz

Figure 2.S4 - 2Fo-Fc electron density maps of FAcD reaction steps

Figure 2.S5 - Fo-Fc difference Fourier maps after SVD

Figure 2.S6 - SVD and SPARTA+ analyses of X-ray crystallographic time series experiments

Figure 2.S7 - Singular values from SVD analysis of all difference Fourier maps

Figure 2.S8 - NMR experiments to detect protein dynamics and spectral comparisons between BrAc and ClAc

Figure 2.S9 - Amide peak assignments of 15N,1H HSQC spectra and chemical shift perturbations upon titration with the substrate analog BrAc

Figure 2.S10 - Cross correlation of residues from MD simulations

Figure 2.S11 - Crystal lattice of FAc-bound FAcD

Figure 2.S12 - 19F NMR spectra monitoring minor peaks

Figure 2.S13 - 19F NMR spectra to confirm the W156 assignment

Figure 2.S14 - Crystal structures of states along the FAcD reaction coordinate

Figure 3.1 - of FAcD indicative of substrate inhibition

Figure 3.2 - Multiple binding sites of substrate molecules in the slow Y219F mutant reveal structural basis of substrate inhibition in FAcD

Figure 3.3 - Product glycolate bound to active site in Y219F mutant after 24 hr soak

Figure 3.4 - Chemical shift perturbations of W156 with different ligands

Figure 3.5 - HSQC titration of FAcD with ClAc also results in non-linear chemical shifts

Figure 3.6 - SPARTA+ analysis of differences between soaked Y219F structures and its comparison to the SPARTA+ analysis of soaked WT structures

Figure 3.7 - Computational prediction of the allosteric pathway using a rigidity-based allostery transmission algorithm

Figure 3.S1 - Degrees of freedom transmission calculations for different transmission regions

Figure 3.S2 - SPARTA+ analysis using the soaked structures of Y219F and WT

Figure 3.S3 - 2Fo-Fc map of the active site of Y219F at various time points

Figure 3.S4 - Comparison of Y219F apo-FAcD and Y219F FAcD after 150 min of soaking in 100 mM FAc solution

Figure 3.S5 - 2Fo-Fc map of the Y219F FAcD mutant active sites with ClAc

Figure 3.S6 - 19F NMR spectra of 5F-Trp enriched FAcD titrated with different ligands

Figure 3.S7 - 19F NMR titrations of W156 with various ligands

Figure 4.1 – Experimental setup

Figure 4.2 – Electron density changes in the cap domain

Figure 4.3 – Electron density changes observed in the active site

Figure 4.4 – B-factor at 0 ms and 1000 ms time points after photolysis

Figure 4.S1 – Absorption spectra of caged fluoroacetate in liquid and crystalline environment

Figure 4.S2 – Electron densities found in the active site of D110N mutant FAcD in a crystal soaked with caged fluoroacetate

Figure 4.S3 – Diffraction image from serial TRX

Figure 4.S4 – Electron density maps of the cap domain at various time-points

Figure 4.S5 – Electron density changes in the active site

Figure 4.S6 – Interlacing procedure for long time points

Figure A.1 - Water distribution of cryogenic (5K3D) and room temperature structures of apo-FAcD

Figure A.2 – B-factor differences between cryogenic (5K3D) and room temperature structures of apo-FAcD

Figure A.3 - Structural overlay and isomorphous difference Fourier maps (Fo-Fo) of room temperature structures of apo-FAcD

Figure A.4 - Isomorphous difference Fourier maps (Fo-Fo) of room temperature structures and the cryogenic structure PDB-ID 5K3D of apo-FAcD

Figure A.S1 - Structural superpositions of room temperature structures against the cryogenic structure PDB-ID: 5K3D

Figure A.S2 – Overlay of catalytically important active site residues

Figure A.S3 - Luzzati plots

Figure A.S4 - Difference distance matrices (DDM) of room temperature structures against the cryogenic structure (5K3D)


List of Appendices

Appendix


List of Abbreviations

AVM Absolute Value Map
BrAc Bromoacetate
CEST Chemical Exchange Saturation Transfer
ClAc Chloroacetate
CPMG Carr-Purcell-Meiboom-Gill sequence
CT Cryogenic Temperature
DDM Difference Distance Matrix
DNA Deoxyribonucleic acid
DOF Degrees Of Freedom
FAc Fluoroacetate
FAcD Fluoroacetate dehalogenase
FIRST Floppy Inclusion and Rigid Substructure Topography
FTX Freeze-Trapping X-ray crystallography
GPCR G-Protein-Coupled Receptor
GST Glutathione S-transferase tag
HAD Haloacid dehalogenase
HIV Human Immunodeficiency Virus
IOTA Integration Optimization Triage and Analysis
IPTG Isopropyl β-D-1-thiogalactopyranoside
ITC Isothermal Titration Calorimetry
KNF Koshland-Némethy-Filmer model
LCP Lipidic Cubic Phase
LSV Left Singular Vector
MD Molecular Dynamics
MWC Monod-Wyman-Changeux model
NADH Nicotinamide adenine dinucleotide
NADPH Nicotinamide adenine dinucleotide phosphate
NMR Nuclear Magnetic Resonance
PFOA Perfluorooctanoic acid
PYP Photoactive Yellow Protein
RA Rotating Anode
RMSD Root-Mean-Square Deviation
RSV Right Singular Vector
RTA Rigidity-based Transmission Allostery
RTRA Room-Temperature Rotating Anode
SPARTA+ Shifts Prediction from Analogy in Residue type and Torsion Angle
STRX Serial Time-Resolved Crystallography
SVD Singular Value Decomposition
SX Serial Crystallography
TEV Tobacco Etch Virus protease
TRX Time-resolved Crystallography
UV Ultra-violet
WT Wild-Type
XFEL X-ray Free Electron Laser


Chapter 1

Introduction

1.1 Halogenated hydrocarbon waste

Worldwide, millions of tons of halogenated hydrocarbons are produced every year (1). While widely applied and highly useful in agriculture, industry, and medicine (2–4), their bioaccumulation has come into sharper focus because of the health and environmental impact of these compounds (5). Halogenated hydrocarbons pose a particularly difficult task for waste disposal, because halogenation causes a molecule to become more stable and hydrophobic (1). Increased stability and reduced solubility allow these contaminants to accumulate in the environment over time. Fluorinated hydrocarbons represent a rapidly growing part of this group of chemicals and are increasingly used in a number of industrial applications such as cooling liquids and pharmaceuticals (6, 7).

1.1.1 Bioremediation

The negative impact of organohalogens on health and the environment has fostered the development of an increasing number of remediation strategies. For instance, strategies for the removal and incineration of contaminated waste from industrial run off focus on removing contaminants from processed water – such as with electrochemical treatment (8). A number of drawbacks associated with these abiotic strategies are mainly due to their secondary environmental effects. These include the destruction of ecosystems via the removal of contaminated soils as well as cross-contamination of other environments (9). The accumulation and persistence of synthetic organic halogen compounds in the environment and subsequently in the food chain are primarily due to the lack of biodegradation pathways in most microorganisms (10). Fluorinated compounds are especially recalcitrant due to the stability of the C-F bond. Only one fluorine-containing compound is known to be synthesized in Nature in larger amounts, fluoroacetate; it is produced by some plants as protection from grazing animals. All other known fluorinated compounds have been introduced during the last 130 years from anthropogenic sources. However, a small subset of bacteria has recently evolved strategies to deal with the increase of these synthetic compounds in the environment during the last century (11, 12).

Initial efforts for bioremediation strategies used indigenous microbial populations. An advantage of this approach is that there is no need for adaptation of newly introduced microorganisms to the environment, leaving the balance of the local ecosystem largely undisturbed (13). Since its introduction, bioremediation has gained more prominence, which can be attributed to the lower price-tag relative to many abiotic strategies, as well as the specificity the microorganisms possess for certain contaminants (14, 15). Lastly, there is a reduction in the risks to human health and of environmental run off from accidental release of contaminants; in terms of safety, there is no need for the transport of waste off site since many bioremediation strategies occur at the source locations (16).

A number of disadvantages, however, remain with many of the various bioremediation strategies. These include the length of time required for treatment, which can take longer than abiotic strategies - mainly due to the limited speed at which the biochemical pathways can operate. Although it can be advantageous for microorganisms to possess high specificity, this can become a disadvantage when there is a complex mixture of contaminants in the environment that all need to be ameliorated. Lastly, bioremediation strategies only work for compounds for which there are enzymes available to catalyze their breakdown (16). Therefore, microorganisms designed to contain novel enzymes that accept a broader selection of substrates or have the ability to perform the chemical decontamination reactions more efficiently would greatly increase the potential of bioremediation. This provides the basis for our efforts to understand such enzymes in greater detail.


1.1.2 Microbial Dehalogenation

There are a number of different microorganisms that possess biochemical pathways capable of breaking down carbon-halogen compounds. Generally, catalytic cleavage of carbon-halogen bonds is an enzymatically slow process (17, 18). Therefore, any large-scale bioremediation is hampered by this lack of efficiency. The enzymes involved in these biodegradative pathways, mediating organohalogen catalysis, are classified as dehalogenases. There are several variants of dehalogenation chemistry available in Nature; the most important ones are discussed below.

1.1.2.1 Intramolecular substitution

Dehalogenation via intramolecular substitution is performed by a class of enzymes known as haloalcohol dehalogenases and NAD(P)(H)-dependent dehydrogenases (19) (Figure 1.1). The catalytic mechanism, elucidated by X-ray crystallography, involves the activation of the hydroxyl group of the haloalcohol substrate via a tyrosine. This process occurs after the formation of the Michaelis-Menten complex, which is stabilized by a serine residue that binds to the hydroxyl group of the substrate. The activated hydroxyl group subsequently attacks the vicinal carbon bearing the halide atom, releasing a proton and a chloride ion, forming an epoxide in the process (19).


Figure 1.1 – Dehalogenation by intramolecular substitution chemistry and the crystal structure of haloalcohol dehalogenase HheC (PDB ID: 1PWZ) from Agrobacterium radiobacter AD1. A) General scheme for dehalogenation by intramolecular substitution. Adapted from (20). B) Cartoon representation of the crystal structure showing the homodimeric structure of haloalcohol dehalogenase HheC with subunits of roughly 28 kDa in size.

1.1.2.2 Thiolytic dehalogenation

A subclass of glutathione S-transferases dechlorinates dichloromethane, forming an unstable intermediate, S-chloromethylglutathione, and chloride (21) (Figure 1.2). The unstable intermediate is subsequently hydrolyzed into formaldehyde, chloride, and glutathione. This biochemical process is performed by facultative methylotrophic bacteria, which use formaldehyde as a central metabolite (22).


Figure 1.2 – Thiolytic dehalogenation chemistry and the crystal structure of human glutathione-S-transferase P-1 (PDB ID: 10GS). A) General scheme for thiolytic dehalogenation. Glutathione-S-transferases can catalyze the dehalogenation of a variety of alkyl halides. Adapted from (20). B) Cartoon representation of the crystal structure showing the roughly 46.5 kDa homodimer of human glutathione-S-transferase P-1.

1.1.2.3 Oxygenolytic dehalogenation

Oxygenolytic dehalogenation primarily involves classes of enzymes that contain a flavin group and catalysis is performed with either NADH or NADPH as cosubstrates (Figure 1.3). The enzymes mainly achieve the dechlorination of aromatic rings, substituting the chlorine group with a hydroxyl group by activating molecular oxygen at the flavin . This type of reaction is performed by oxygenases that can be divided into two groups, mono- and dioxygenases (20). Mono- and dioxygenases incorporate one or two hydroxyl groups, respectively, when removing the chlorine group (23). While crystal structures of many oxygenases have been solved, currently there is no known structure for an oxygenolytic dehalogenase.


Figure 1.3 - Oxygenolytic dehalogenation chemistry. General scheme for dioxygenase dehalogenation. Oxygenolytic dehalogenation involves the addition of hydroxyl groups to replace both the chlorine and carboxylate groups. The oxygen of the hydroxyl groups is derived from dioxygen. Adapted from (20).

1.1.2.4 Dehydrohalogenation

Dehydrohalogenases by and large involve alkyl halides as their substrates (Figure 1.4A). A halide ion and a proton are removed from the substrate forming HX and a double-bond, resulting in an alkene product. Hexachlorocyclohexane dehydrochlorinase, LinA, is one such enzyme for which the reaction mechanism has been investigated and an E2-like dehydrochlorination proposed (24). Among the dehalogenating enzymes identified so far, which have all been homodimeric, it is uniquely homotrimeric (Figure 1.4B).


Figure 1.4 - Dehydrohalogenation chemistry and the crystal structure of hexachlorocyclohexane dehydrochlorinase (LinA) Type2 (PDB ID: 3A76) from Agrobacterium radiobacter AD1. A) General scheme for dehydrohalogenation. Adapted from (20). B) Cartoon representation of the crystal structure showing the homotrimeric nature of hexachlorocyclohexane dehydrochlorinase (LinA) Type2. The trimer is roughly 51.9 kDa in size.

1.1.2.5 Reductive dehalogenation

Reductive dehalogenation is seen in a wide variety of both aerobic and anaerobic microorganisms. More complex, heavily halogenated compounds, however, can only be turned over in anaerobic microorganisms (20, 25). Catalysis involves a two-electron transfer reaction, reducing the halide containing substrate and causing the replacement of the halide group by a hydrogen atom (20) (Figure 1.5).


Figure 1.5 – Reductive dehalogenation chemistry and the crystal structure of tetrachloroethene reductive dehalogenase (PDB ID: 4UQU) from Sulfurospirillum multivorans. A) General scheme for reductive dehalogenation. Adapted from (20). B) Cartoon representation of the crystal structure showing a homodimer of tetrachloroethene reductive dehalogenase, which is roughly 104 kDa in size.

1.1.3 Hydrolytic dehalogenation

Hydrolytic dehalogenation involves the incorporation of a hydroxyl group, which displaces the halide atom (20) (Figure 1.6A). There are three main classes of hydrolytic dehalogenase enzymes: haloacid, haloalkane, and fluoroacetate dehalogenases.

1.1.3.1 Haloacid dehalogenases

The first class, haloacid dehalogenases, is itself subdivided into two main types of enzymes, D-2-haloacid dehalogenases and L-2-haloacid dehalogenases. Members of the two sub-classes do not share significant sequence similarity (26). The L-2-haloacid dehalogenases, in turn, are part of the haloacid dehalogenase superfamily (HAD), which includes phosphatases, phosphonatases, ATPases, and phosphomannomutases (27, 28). The dehalogenase members of this superfamily are the most investigated of all the dehalogenating enzymes. L-2-haloacid dehalogenases are the most common subtype; they catalyze reactions involving S-2-haloacids with varying chain lengths of up to 16 carbon atoms (29, 30).

Dehalogenation is a two-step process and involves a covalent intermediate. It has been shown that an active site Asp residue, in an SN2 attack on the halogen-carrying carbon atom, releases the halide ion, forming a covalent ester intermediate. This ester is subsequently hydrolyzed by a water molecule that is activated by another Asp residue, leaving behind the free enzyme and product (27, 28, 31).
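The two-step mechanism described above can be written as a minimal kinetic scheme. The symbols K_S, k_2 and k_3 are illustrative and do not appear in the source, and the expressions assume rapid-equilibrium substrate binding and effectively irreversible chemical steps. This is the textbook steady-state treatment of a covalent-intermediate mechanism, not a result reported for these particular enzymes.

```latex
\begin{gather}
\mathrm{E} + \mathrm{RX}
  \;\overset{K_S}{\rightleftharpoons}\; \mathrm{E{\cdot}RX}
  \;\xrightarrow{\;k_2\;}\; \mathrm{E{-}R} + \mathrm{X^-}
  \;\xrightarrow[\mathrm{H_2O}]{\;k_3\;}\; \mathrm{E} + \mathrm{ROH} \\
k_{\mathrm{cat}} = \frac{k_2\,k_3}{k_2 + k_3}, \qquad
K_M = K_S\,\frac{k_3}{k_2 + k_3}
\end{gather}
```

Under these assumptions the slower of the two chemical steps dominates the turnover number: if ester hydrolysis is much slower than halide release (k_3 much smaller than k_2), then k_cat approaches k_3, and vice versa.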

D-2-haloacid dehalogenases, the other haloacid dehalogenase subtype, catalyze reactions involving R-2-haloacids. They, too, are subdivided into two groups, which share ~25% sequence identity (32). Their distinguishing feature is their substrate specificity. One group can only process R-enantiomers while the other can act on both R- and S-enantiomers (32). Unlike L-2-haloacid dehalogenases, whose mechanism consists of a two-step process, D-2-haloacid dehalogenases perform catalysis in a single step. In Pseudomonas sp. strain 113 DL-2-haloacid dehalogenase, a water molecule is thought to be activated by Asp or Glu residues so it can displace the halide group in a single-step nucleophilic attack (33, 34).

1.1.3.2 Haloalkane dehalogenases

Haloalkane dehalogenases catalyze the conversion of aliphatic haloalkane substrates into a primary alcohol and a halide ion (17). This class of dehalogenases has already found a number of practical applications, for example in biosensing and industrial biocatalysis (35). They have also been the subject of directed evolution and have been modified by rational design to improve their catalytic properties (36).

Haloalkane dehalogenases belong to the α/β hydrolase protein superfamily and transform a wide range of substrates (37). The active sites of these enzymes, as in all enzymes of the α/β hydrolase fold superfamily, contain a catalytic triad of aspartate, histidine, and either another aspartate or glutamate, located at the interface between a core and a cap domain. The active site includes two more residues, either an asparagine-tryptophan pair or two tryptophan residues, that contribute to a halide-binding pocket. Together these five residues are referred to as a catalytic pentad (38, 39).

While the catalytic mechanisms of all haloalkane dehalogenases are essentially identical, their substrate specificity can vary dramatically. LinB and DhaA have large active site cavities and can turn over larger haloalkanes, unlike DhlA, which preferentially accepts as substrates short haloalkanes up to 8 atoms in length (40–44). X-ray crystallographic analyses of their active sites have demonstrated that LinB and DhaA have distinct substrate profiles (37) although their active site architecture is quite similar (45).

1.1.3.3 Fluoroacetate dehalogenases

Dehalogenation of fluorinated hydrocarbons is quite a formidable task because of the strength of the carbon-fluorine (C-F) bond. It is the strongest bond in organic chemistry, with a dissociation energy of 130 kcal/mol. This is not surprising since fluorine is the most electronegative element, which makes the C-F bond, typically in the range of 1.35 Å, the shortest of all the carbon-halogen bonds (46). The electronegativity of fluorine also strongly polarizes the bond, causing partial charges to form on the atoms (Cδ+-Fδ−); these ionic forces further contribute to the strength of the bond and thereby decrease its reactivity (46).

A number of microorganisms have been identified which possess enzymes with the ability to cleave C-F bonds. These organisms are capable of using the simplest fluoro-acetic acid, fluoroacetate, as their main source of carbon (34). One class of enzymes, named fluoroacetate dehalogenases (FAcD), catalyzes the simple hydrolysis of the C-F bond (5). In my thesis work, fluoroacetate dehalogenase RPA1163 from the soil bacterium Rhodopseudomonas palustris (CGA009) is used as a representative example of this class of enzymes (Figure 1.6B).

FAcD is a homodimeric enzyme and belongs to the α/β hydrolase protein superfamily. These enzymes are comprised of an α/β-sheet domain connected to an α-helical cap domain. The cap domain gates the entry and exit of ligands to and from the active site, which is located at the interface of these two domains and contains a halide binding pocket formed by the side chains of the amino acids H155, W156, and Y219; their interactions stabilize the leaving halide ion.

Like many dehalogenating enzymes, FAcD has a relatively slow reaction rate, with a kcat of 6.7 ± 0.6 min⁻¹ for its best known substrate, fluoroacetate, and 1.38 ± 0.07 min⁻¹ for chloroacetate (5). Interestingly, bromoacetate (BrAc) is not a true substrate, as it cannot be turned over by FAcD and only forms the Michaelis-Menten complex; this will be discussed in further detail in chapter 2 (47). All enzymes that possess the ability to cleave the C-F bond are promising candidates for the bioremediation of fluorinated hydrocarbons. Similar to other dehalogenases, which cleave C-Br and C-Cl bonds, fluoroacetate dehalogenase cleaves the C-F bond in fluoroacetate in a 2-step process involving an Asp-His-Asp catalytic triad (5, 47). It has been demonstrated that D110 displaces the fluoride ion via an SN2 attack, creating a covalent ester intermediate, which in turn is hydrolyzed by a water molecule activated by H280. This mechanism of dehalogenation is very similar to that of the other haloacid-converting enzymes described above. However, only a few of these enzymes can cleave the C-F bond, and they do so less proficiently than they cleave C-Cl or C-Br bonds. In contrast, FAcD breaks the C-F bond faster than the other carbon-halogen bonds (5, 47), making it a more specialized catalyst. Computational analysis of the reaction mechanism of FAcD proposed a much lower activation barrier of ~19 kcal/mol, since the SN2 mechanism seems to need much less energy to proceed (48).
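As a rough consistency check between the measured turnover numbers and the computed barrier, kcat can be converted into an apparent activation free energy with the Eyring equation of transition state theory. The sketch below assumes a transmission coefficient of 1 and a temperature of 298.15 K; neither value is stated in the text, and the function name is purely illustrative.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_barrier_kcal(kcat_per_min, temperature_k=298.15):
    """Apparent activation free energy (kcal/mol) from a turnover number.

    Uses the Eyring equation k = (kB*T/h) * exp(-dG/(R*T)) with a
    transmission coefficient of 1 (an assumption, not from the text).
    """
    k_per_s = kcat_per_min / 60.0
    prefactor = K_B * temperature_k / H
    return R_KCAL * temperature_k * math.log(prefactor / k_per_s)


barrier_fac = eyring_barrier_kcal(6.7)     # fluoroacetate, kcat from the text
barrier_clac = eyring_barrier_kcal(1.38)   # chloroacetate, kcat from the text
print(f"fluoroacetate: {barrier_fac:.1f} kcal/mol")
print(f"chloroacetate: {barrier_clac:.1f} kcal/mol")
```

Under these assumptions the fluoroacetate value comes out just under 19 kcal/mol, and the chloroacetate value slightly higher. Since kcat reports only on the rate-limiting step of the whole cycle, the agreement with the computed barrier (48) should be read as indicative only.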


Figure 1.6 – Hydrolytic dehalogenation chemistry and the crystal structure of fluoroacetate dehalogenase (RPA1163) from Rhodopseudomonas palustris. A) General scheme for hydrolytic dehalogenation. Adapted from (20). B) Cartoon representation of the crystal structure of RPA1163. In protomer A, the residues forming the halide binding pocket (H155, W156, Y219) are highlighted in red and the catalytic triad (D110, D134, H280) in green. The cap domain, which gates the passage of ligands into and out of the active sites, is highlighted in blue.

1.2 Structural approaches to study fluoroacetate dehalogenase

The challenges that arise with structural analysis of proteins by crystallographic means are highlighted by the fact that many proteins are difficult to crystallize, or fail to diffract adequately to attain a complete high-resolution structure. Fortunately, the methodology for the crystallization of FAcD had already been established, providing an initial structure (5). However, attaining nuanced, highly detailed structural data of multiple ground and high-energy states during the reaction cycle proved to be challenging. Biophysical approaches involving methods to analyze structural data will be highlighted below.


1.2.1 Classical X-ray crystallography

Classical X-ray crystallography aims to produce a 3-dimensional molecular model at as close to atomic resolution as possible. These models visualize primarily the ground state structures of the proteins (49). The following is a very condensed description of the process.

The initial steps for generating a 3-dimensional structural model are to obtain a reasonably stable protein sample of the highest possible purity and then to find appropriate crystallization conditions, the latter very often being the rate-limiting step (50). This is because the environmental conditions required to make a specific protein crystallize are not known in advance. Often hundreds if not thousands of conditions need to be screened until parameters are found that will produce crystals of a size and quality suitable for structure determination (51).

Once an appropriate protein crystal is obtained, it is irradiated with focused, high-intensity monochromatic X-rays. The resulting scattering events produce a diffraction pattern consisting of a multitude of reflection peaks (Figure 1.7). These reflections and the patterns they form result from constructive interference of scattered X-rays that satisfy the conditions in Bragg’s law.

$$2d\sin\theta = n\lambda \tag{1}$$

whereby d is the spacing between two scattering planes in the crystal (corresponding to the resolution), n is any positive integer, λ is the wavelength of the incident X-rays, and θ is the angle between the incident beam and the scattering planes, i.e. half the scattering angle (49). Interference peaks, the reflections, are found at positions that obey Bragg’s law. The number of diffraction patterns needed for a complete data set is dictated by the parameters of the protein crystal’s lattice (49, 50).
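To make Equation 1 concrete, the following sketch converts a Bragg angle into a plane spacing and estimates the best resolution recorded at the edge of a flat detector. The Cu Kα wavelength, the detector radius, and the crystal-to-detector distance are assumed illustrative values, not parameters reported in the text, and the function names are hypothetical.

```python
import math


def bragg_spacing(wavelength_a, theta_deg, order=1):
    """Plane spacing d (in Å) from Bragg's law: 2 d sin(theta) = n lambda."""
    return order * wavelength_a / (2.0 * math.sin(math.radians(theta_deg)))


def resolution_at_detector_edge(wavelength_a, detector_radius_mm, distance_mm):
    """Smallest d recorded on a flat detector centred on the direct beam.

    The scattering angle 2*theta follows from the detector radius and the
    crystal-to-detector distance; first-order reflections are assumed.
    """
    two_theta = math.atan2(detector_radius_mm, distance_mm)
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))


CU_K_ALPHA = 1.5418  # Å, assumed wavelength of a copper rotating anode

print(f"d at theta = 30 deg: {bragg_spacing(CU_K_ALPHA, 30.0):.4f} Å")
# Hypothetical geometry: 172.5 mm detector radius, 100 mm distance.
edge = resolution_at_detector_edge(CU_K_ALPHA, 172.5, 100.0)
print(f"resolution at detector edge: {edge:.2f} Å")
```

Moving the detector closer or using a shorter wavelength increases the largest recordable angle and therefore improves the attainable resolution, at the cost of more closely spaced reflections on the detector.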


Figure 1.7 – FAcD crystals and diffraction pattern. (A) FAcD crystals are grown in 100 mM Tris-SO4 pH 8.5, 16-20% PEG 3350, and 200 mM CaCl2. Crystals form in 3-5 days and are usually over 200 µm in length. (B) Home source diffraction pattern taken on a Rigaku 007 rotating anode, focusing optics, and a MAR345 Image Plate Detector. The maximum resolution attainable given the detector distance of the diffraction set-up is 1.54 Å, which FAcD routinely surpasses.

Only the intensities of the reflections can be measured and allow one to calculate the amplitudes of the scattered waves. Any phase information is lost, a fact which is known as the “phase problem”. The phases for any given protein diffraction pattern can be attained from numerous other techniques, such as using heavy atoms as isomorphous electron-dense additions or anomalous scatterers; one can also use an already known atomic model whereby one can back calculate the phases (49). Once the phases and amplitudes of each reflection are known, a Fourier synthesis will provide an electron density map into which the protein of interest can be modeled.
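The consequence of losing the phases can be illustrated with a one-dimensional toy model. The sketch below uses a plain discrete Fourier transform on an invented 12-point "unit cell"; the density values are hypothetical, and the code is a didactic stand-in rather than a description of any crystallographic software.

```python
import cmath
import math


def structure_factors(density):
    """Discrete Fourier transform of a 1D density: one complex F(h) per index h."""
    n = len(density)
    return [sum(rho * cmath.exp(2j * math.pi * h * x / n)
                for x, rho in enumerate(density))
            for h in range(n)]


def fourier_synthesis(factors):
    """Inverse transform: rebuild the density from complex structure factors."""
    n = len(factors)
    return [sum(f * cmath.exp(-2j * math.pi * h * x / n)
                for h, f in enumerate(factors)).real / n
            for x in range(n)]


# Toy 1D "unit cell" containing two atoms of different size (invented values).
true_density = [0.0, 0.0, 5.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.5, 0.0, 0.0, 0.0]

factors = structure_factors(true_density)
amplitudes = [abs(f) for f in factors]        # available from measured intensities
phases = [cmath.phase(f) for f in factors]    # not available from the experiment

with_phases = fourier_synthesis(
    [a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)])
without_phases = fourier_synthesis([complex(a, 0.0) for a in amplitudes])

error_with = max(abs(a - b) for a, b in zip(with_phases, true_density))
error_without = max(abs(a - b) for a, b in zip(without_phases, true_density))
print(f"max error with phases:    {error_with:.2e}")
print(f"max error without phases: {error_without:.2e}")
```

With both amplitudes and phases the density is recovered to numerical precision, whereas amplitudes alone place the strongest peak at the origin, which is why experimental phasing or a known model is required.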

1.2.2 Time-resolved crystallography

In a standard X-ray crystallography experiment the actual experimental result, the distribution of electron density in the unit cell (the smallest part of a crystal that will recreate the whole crystal using translation operators only), is the spatial and temporal average over all unit cells in the crystal during the collection of diffraction data, i.e. any time-dependent information is lost (52). A rather crude but still highly informative approach to circumvent this restriction involves the mutation of specific residues that have been identified as being crucial for the various steps in a catalytic mechanism. In suitable systems, this can produce structures of several reaction intermediates (5, 47), although one must always keep in mind that the mutations change the chemical environment of the active site. To address this issue more broadly and generate a more detailed picture of the three-dimensional and temporal changes, e.g. during an enzyme-catalyzed reaction, time-resolved crystallography (TRX) techniques have been employed (53, 54). These techniques are used on wild-type crystals at room temperature to allow the “natural flow” of the reaction.

Some TRX experiments attempted to stop and immobilize the structures of interest at various time points along the reaction pathway with the help of fast changes in temperature or pH to “trap” the protein in selected minima of the energy landscape (52, 55). The most successful TRX investigations to date, however, made use of light as the trigger of the reaction to be studied. Proteins with built-in light-responsive elements, like myoglobin and photoactive yellow protein, are ideally suited for such an approach (53, 54, 56). Efforts to widen the pool of proteins that could be investigated by TRX resulted in the synthesis of photolabile chemical groups that could be linked to either amino acids in active sites or to substrate molecules. Exposure to UV light generates the free enzyme or the proper substrate, and catalysis can ensue (57, 58). An early example of this approach is TRX of the small oncogene product p21 (59).

In these TRX experiments, time is incorporated by exposing the crystals to x-rays at varying times after a reaction is triggered; for each time point a complete set of diffraction data has to be collected. To reduce the time necessary for data collection, use of a polychromatic x-ray beam (Laue diffraction) was rediscovered (60). The resulting diffraction patterns contain significantly more information than what can be obtained by using monochromatic radiation.

Additionally, advancements in 3rd or 4th generation synchrotron sources provide very short X-ray pulses of <1 ns duration. Ultrabright X-ray Free Electron Lasers (XFEL) have pushed the limits to femtosecond pulses, which allow the characterization of short-lived structural intermediates that normally cannot be seen by means of trapping (61–64).


1.2.2.1 Ideal systems

Whether the method is trapping or TRX, one major road block is attaining a quasi-immediate reaction initiation globally in the crystal. As stated above, successful TRX experiments in the past used light-induced reversible systems, such as breaking the Fe-CO bond in heme proteins and the isomerization of a chromophore bond in photoactive yellow protein (53, 56, 61–63). The advantage of reversible systems is that after reaction initiation the system can revert back to its ground state allowing for multiple repetitions. Using 3rd generation synchrotrons, multiple diffraction patterns can be attained for the same time point on the same crystal to drastically increase signal to noise – depending on the crystal system this can also be done without too much radiation damage before the crystal has to be moved to a new position or a fresh crystal is needed.

This cannot be done for irreversible systems; once a reaction initiates, the crystal cannot revert back to its ground state and must be discarded. This leads to a major problem: collecting a complete data set requires a tremendous number of crystals. Unfortunately, the majority of proteins fall into the irreversible category; they are not light-inducible and cannot revert back to their ground state once the reaction has been started. Therefore, new strategies are needed for these systems.

Systems are particularly well suited for TRX when the lifetimes of their high-energy intermediate states are much longer than the time required to measure a diffraction pattern, and when these intermediates are stable under conditions where the reaction states of interest become populated in the crystal to a much higher degree than the non-desirable states (53, 65).

1.2.2.2 Mutational Trapping

For chemical trapping, a series of mutants can be made that slow down or halt catalysis at key points along the reaction coordinate pathway. This can allow for high-resolution structures, taken under cryogenic conditions or at room temperature, of highly populated intermediate states (5, 52, 55). Disadvantages of chemical trapping are the potential for proteins to contain structural artifacts or to display substrate-binding modes that only exist in mutants (52, 55). For example, any mutation that modifies the size of the active site might allow a complex to form that does not reflect the true binding mode. Follow-up experiments are then needed to confirm the validity of the structures determined. Also, only the intermediate states that can be chemically trapped can be observed; information on any short-lived states that exist between chemically trapped states cannot be acquired.

1.2.2.3 Physical Trapping

Physical trapping is the method of choice for experiments that allow for a rather crude determination of the exact time point, e.g. enzymes with a relatively slow kcat. It involves very quickly shifting the crystal environment to cryogenic temperatures, thereby halting any chemical processes taking place (55, 66). One major advantage of this method is the very low radiation damage experienced and the high resolution attainable. The reaction is initiated in the crystal via diffusion of the substrate. Once a desirable time point is reached, the reaction is halted via flash-freezing with liquid nitrogen. However, structural artifacts may arise that only exist under cryo-conditions (67). Moreover, if the lifetimes of the intermediates of interest are shorter than the time required to lower the temperature to a point where any movements are frozen out, any structures derived from this method will be a mixture of different species in the crystal, and no temporal information can be acquired.

1.2.2.4 Analytical Trapping

Out of all the trapping methods, analytical trapping offers the greatest advantages in terms of attaining authentic structures of reaction intermediates. This is because the experiments are performed at room temperature with wild-type proteins. Usually, the reaction is initiated by phototriggering via a very short (fs-ps) laser pulse (53, 62, 64). This avoids potential artifacts generated by cryogenic conditions, mutations, or drastic changes in other chemical parameters like pH. Moreover, because reactions can be triggered rapidly and, owing to advances in detector technology, data can be collected rapidly, very short-lived intermediate states can be investigated. For example, with myoglobin and photoactive yellow protein (PYP), ps time resolution has been achieved (52, 55, 56, 62–64, 66, 67). Analytical trapping, however, is not without its caveats. Due to the high speed at which diffraction patterns have to be collected to achieve high time resolution, the signal-to-noise ratios are far lower than those seen with data from chemical or physical trapping experiments (68). Also, the need for extremely short laser pulses and for advanced detectors requires not only a great deal of planning and forethought but also custom-designed equipment. Only a few specialized beamlines are capable of performing such experiments. Lastly, with most analytical trapping performed with phototriggers, over-exciting the samples when pushing for the highest photo-yields possible is a realistic danger. Consequently, much care must be taken to avoid too much energy being pumped into the system, which would lead to structures full of artifacts (69, 70).

While over-exciting is a valid concern for all photo-triggered systems, reversible and irreversible, it is less relevant for the latter, which usually rely on ‘caged’ compounds or proteins to achieve a fast reaction initiation. These photosensitive compounds have the substrate attached to a protecting group, which can be released via a laser pulse, thus initiating the reaction (71). Caged compounds can either be diffused into the crystal or co-crystallized; the caging renders the substrate inert, and no reaction will occur until photolysis is triggered (57–59, 71). Depending on the chemical nature of the cage group and therefore the rate of photolysis, a quasi-synchronous start time with sub-ns time resolution is attainable.

The substrate or protein is only uncaged when it absorbs the correct wavelength of light, usually in the UV range (57, 58, 71). Irradiating at the absorption maximum, at which the crystal would appear ‘black’ to the laser light, would lead to destruction of the crystal. Irradiating instead at a wavelength corresponding to one of the tails of the absorption curve allows for global illumination of the entire crystal and a homogeneous reaction initiation.

Two parameters that have to be aligned in experiments employing caged triggers are kcat of the enzyme and the rate of photolysis of the caged compound. Enzymes that have very high turnover rates, e.g. carbonic anhydrase (72), could ‘outrun’ the photolysis of the caged compounds (57, 58, 71). Due to irreversibility of cage-trigger systems, single large crystals are not the best choice although they have worked splendidly in the past for systems with intrinsic triggers like PYP and myoglobin (61–64). Fortunately, the advent of serial crystallography provides a solution for these problems.

1.2.3 Structure determination by serial crystallography: an overview

Classical crystallography and early time-resolved crystallography used large single crystals, historically with edges of up to and beyond 1 mm and more recently still in the 100 micron range, which have the advantage of potentially yielding high-resolution structures. Studying irreversible systems by time-resolved crystallography, even with laser photolysis of caged compounds, consumes very large numbers of crystals. This leads to a major problem, whereby collecting a complete data set requires a tremendous amount of time and effort to first produce and then manually collect individual crystals. A recently developed technique, dubbed serial crystallography, has begun to solve this problem. Historically, small crystals were seen as a way to progress to larger ones. Advancements in the generation of X-rays have led to the advent of ultra-bright X-ray pulses with high photon energy on very short timescales, from fs (XFEL) to ps (synchrotron), allowing generation of diffraction data with reasonable statistics from ever smaller crystals; the crystal structure of PYP, for example, could be determined from crystals that measured < 5 x 5 x 5 µm³. Therefore, it is now feasible to probe sub-ps timescales (53). New detector technology has increased readout speeds (>100 Hz) sufficiently to collect these data from large numbers of crystals in a realistic period of time (73). Because of the brilliance of the beam, any protein crystal exposed to such powerful radiation is destroyed. However, because the physical process of diffraction is orders of magnitude faster than the chemical effects of radiation damage, data can be collected before the crystal is transformed into plasma, giving rise to the catch-phrase “diffraction before destruction” (74). This combination also results in data sets that have zero (74) to very little radiation damage (75).

Due to the speed with which crystals can now be probed, new sample delivery methods had to be developed. There are several prototypes of sample delivery systems, such as liquid injectors, lipidic cubic phase (LCP) injectors, and fixed-target systems. The most widely used is the liquid injector: a slurry of crystals at a fixed concentration is kept in a reservoir and then shot out of a very narrow nozzle, and the resulting thin stream of crystals in their storage liquor crosses the path of the X-ray beam. Crystals are in random orientations during data collection (76). Another injector system, which is used for membrane proteins grown in LCP materials, extrudes the LCP out of the injector using a plunger system, again into the path of the X-ray beam. This modification is of great advantage when investigating membrane proteins, and the crystal structure of the 5-HT2B G-protein-coupled receptor was determined recently to 2.2 Å resolution using this technique on an XFEL beamline (77). Some disadvantages of these injection systems include the requirement for very large sample volumes, which is sometimes not feasible depending on the crystal system being examined (78), and high diffraction background, especially from the LCP (79).


Fixed-target sample delivery systems have several advantages over liquid injector systems, including low sample consumption. They use either a crystal chip or a wafer, which can hold thousands of crystals kept in place by designed features on the grid. The fixed target is then guided by microcontrollers so that each feature on the grid is exposed once to the X-ray beam (80, 81). Disadvantages of the fixed-target systems include problems with the mechanics of chip-loading, the need to seal the chips to prevent dehydration of the crystals (80), and slower data collection rates than the injection systems (81).

Whichever sample delivery method is used, crystals that are either streamed in front of the X-ray beam or held static on a chip do not lend themselves to oscillation geometry data collection, and therefore full reflections cannot be obtained; all diffraction patterns consist only of partials. This condition has led to the emergence of software which can index, integrate, and scale not only partial reflections but also very large data sets (82, 83). Although several software packages are now available, they should all be considered to still be in the development phase, with recurring changes and improvements (84).

For probing dynamic features of proteins, the introduction of synchronized laser pulses for reaction initiation in the crystal allows for high time resolution TRX experiments. Therefore, experimental designs for both reversible and irreversible systems are largely identical - the varying parameter being the wavelength of the photo-trigger.

1.2.4 Computational approaches to electron density interpretation

X-ray crystal structures determined during TRX experiments are composed of a mixture of separate structural states of varying populations. Even in a system composed of a co-crystallized 1:1 complex of protein and caged substrate, with a reaction that can be started by light with ps photochemistry, the exponential decay nature of a unimolecular reaction cannot be avoided: the stochastic start of the reaction in individual molecules generates randomly distributed starting points in the crystal (85). More systematic time delays are caused by differences in substrate concentration throughout the crystal lattice when the reaction is started by diffusing the substrate into the crystal. This increases the number of structures present in the crystal even further. Teasing apart meaningful structures that make chemical and physical sense when attempting to model dynamic activity will only prove fruitful when the population of the state of interest is higher than the background threshold, i.e. when the energy landscape of the reaction contains wells deep enough to allow the accumulation of a sufficiently high percentage of molecules in these states (85). High-redundancy data sets that increase the signal-to-noise ratio are as important for this analysis as computational approaches that apply more complex mathematical techniques to the investigation of the resulting electron density maps (65).
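How strongly an intermediate accumulates depends on the ratio of the rate constants that form and deplete it. A minimal sketch for an irreversible two-step scheme A → I → P with first-order rate constants illustrates this; the rate constants below are hypothetical and the scheme is a deliberate simplification of a real enzymatic cycle.

```python
import math


def intermediate_population(t, k1, k2):
    """Fraction of molecules in intermediate I at time t for A -> I -> P.

    k1 and k2 are first-order rate constants (k1 != k2); all molecules
    start in state A at t = 0.
    """
    return k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))


def peak_time(k1, k2):
    """Time at which the population of the intermediate is maximal."""
    return math.log(k2 / k1) / (k2 - k1)


# Hypothetical rate constants in s^-1 (not taken from the text).
for k1, k2 in [(10.0, 1.0), (1.0, 10.0)]:
    t_max = peak_time(k1, k2)
    p_max = intermediate_population(t_max, k1, k2)
    print(f"k1={k1}, k2={k2}: peak at {t_max:.3f} s, population {p_max:.2f}")
```

When the intermediate forms faster than it decays, most molecules pass through it at the same time and it should be visible in the density; in the opposite case its peak population stays below ten percent and it is likely to be lost in the noise.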

1.2.4.1 Singular value decomposition

A technique known as Singular Value Decomposition (SVD) analysis is one method to interpret the maps generated by TRX. It provides time-independent densities of the difference maps of the intermediate states, so the structures of the mixed components can be fit and tested for their physical and chemical relevance. Since these maps are a mixture of time-dependent structures, SVD deconvolutes the data, therefore giving structural information that one cannot readily see in a normal density map (85, 86).

Initially, a matrix is created from the crystal data. Each column of this matrix A consists of an entire difference electron density map for a given time point. This matrix is then subjected to SVD factorization, producing three matrices (A = USV^T) (Figure 1.8). The U and V matrices contain the left and right singular vectors, respectively (in V^T the right singular vectors are the rows of the matrix). S is a diagonal matrix that contains the singular values. Truncating the diagonal matrix to its significant singular values acts as a noise filter, and the number of significant singular values is the upper limit of possible distinct components in the data. The map representing an intermediate is a linear combination of several components (85). A singular value is defined as significant if its value, when plotted against all the ones generated through SVD (Figure 2.S7), is dramatically higher than the baseline created by the remaining values. The data matrices are then reconstructed to include only the significant singular values and the left and right singular vectors that correspond to them. The singular values weight the left and right singular vectors (LSVs and RSVs), which represent the time-independent and time-dependent data, respectively. A difference electron density map between an intermediate and a ground-state structure is a linear combination of several significant LSVs (85–87).
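The procedure can be sketched on synthetic data. In the example below, each column of the data matrix is a toy "difference map" on six grid points at one time point, built from two invented time-independent maps mixed by the populations of a simple decay plus a small amount of noise. To keep the sketch dependency-free, the singular triplets are obtained by power iteration with deflation, which stands in for a library SVD routine; none of the names or values come from the analysis software used in (85).

```python
import math
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matvec(a, x):
    return [sum(r * v for r, v in zip(row, x)) for row in a]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def top_singular_triplet(a, iterations=500):
    """Leading (u, s, v) of matrix a by power iteration on a^T a."""
    at = transpose(a)
    v = [1.0 / (i + 1) for i in range(len(a[0]))]  # deterministic start vector
    for _ in range(iterations):
        w = matvec(at, matvec(a, v))
        length = norm(w)
        if length == 0.0:
            break
        v = [x / length for x in w]
    av = matvec(a, v)
    s = norm(av)
    u = [x / s for x in av] if s > 0.0 else av
    return u, s, v


def truncated_svd(a, rank):
    """First `rank` singular triplets, obtained by repeated deflation."""
    residual = [row[:] for row in a]
    triplets = []
    for _ in range(rank):
        u, s, v = top_singular_triplet(residual)
        triplets.append((u, s, v))
        residual = [[residual[i][j] - s * u[i] * v[j] for j in range(len(v))]
                    for i in range(len(u))]
    return triplets


# Invented data: two time-independent maps mixed by time-dependent populations.
map_1 = [1.0, 0.5, 0.0, -0.5, -1.0, 0.0]
map_2 = [0.0, 1.0, 1.0, 0.0, -1.0, -1.0]
times = [0.1, 0.3, 1.0, 3.0, 10.0]
rng = random.Random(0)
matrix_a = [[m1 * math.exp(-t) + m2 * (1.0 - math.exp(-t))
             + rng.uniform(-0.01, 0.01) for t in times]
            for m1, m2 in zip(map_1, map_2)]

triplets = truncated_svd(matrix_a, 3)
singular_values = [s for _, s, _ in triplets]
print("singular values:", [round(s, 3) for s in singular_values])

# Rebuild the data from the two significant components only (noise filter).
rank_2 = [[sum(s * u[i] * v[j] for u, s, v in triplets[:2])
           for j in range(len(times))] for i in range(len(map_1))]
max_dev = max(abs(rank_2[i][j] - matrix_a[i][j])
              for i in range(len(map_1)) for j in range(len(times)))
print(f"largest deviation of rank-2 reconstruction: {max_dev:.3f}")
```

Two singular values stand well above the third, matching the two underlying maps, and the rank-2 reconstruction differs from the noisy input only at the level of the added noise. The individual left singular vectors are mixtures of the two maps, which is why an intermediate has to be described as a linear combination of the significant LSVs.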


Figure 1.8 - Simplified scheme of singular value decomposition (SVD). Each column of Matrix A consists of a difference electron density map for an individual time point. This matrix is subjected to SVD factorization, which produces three matrices. U and V^T contain the left and right singular vectors, respectively. Matrix S contains the singular values, which serve as a weighting function. Reproduced with permission of the International Union of Crystallography. Figure 1.8 adapted from (85).

1.2.4.2 Absolute value electron density maps

All electron density maps calculated from X-ray crystallographic data are on a relative scale. This is because the direct beam is always blocked by the beamstop for evident practical reasons. Therefore, the F000 structure factor, which theoretically equals the total number of electrons in a given unit cell and thereby determines the absolute scale, cannot be determined with certainty. An absolute value map (AVM) is one approach to estimate F000. It is based on calculating the total number of electrons in the model as well as in the bulk solvent, which is then used to change the relative scale to an absolute scale by modifying the map coefficients based on the newly estimated F000 structure factor (88).
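The role of F000 can be seen directly from the Fourier synthesis of the electron density. The following is a sketch using the standard textbook expression, with V the unit cell volume and φ the phase of each reflection; it is not a reproduction of the specific scaling procedure of (88).

```latex
\rho(\mathbf{r}) = \frac{F_{000}}{V}
  + \frac{1}{V} \sum_{\mathbf{h} \neq \mathbf{0}} |F_{\mathbf{h}}|\,
    e^{i\varphi_{\mathbf{h}}}\, e^{-2\pi i\, \mathbf{h} \cdot \mathbf{r}},
\qquad
\langle \rho \rangle = \frac{F_{000}}{V}
```

Because every term with h ≠ 0 averages to zero over the unit cell, a map computed without F000 has a mean of zero. Adding an estimated F000/V shifts the map to its mean electron density, so that values can be read in electrons per Å³, provided the measured amplitudes themselves have been placed on an absolute scale.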

Without an absolute scale, comparison of electron density maps from separate structures that have different symmetries or crystallization conditions is difficult. AVMs revealed molecular motions more clearly (88) and helped to assign important functional roles to weak electron density (88). As an example, AVMs alongside relative noise maps successfully identified electron density for low-occupancy inhibitors of HIV capsid protein (88).

1.3 Allostery

The ultimate goal of the structural analysis of proteins is to discern how structure leads to function. Proteins consist of a multitude of interatomic networks that dynamically interact with one another; these interactions govern the protein’s behavior. While conglomerate interactive networks can make the protein behave as a complex system (89), emergent properties can arise out of these networks, thereby allowing general rules to be made about these systems. “Allostery”, a term coined by Monod and Jacob in 1965 (90), describes long-range behavior in proteins, whereby binding of ligands can cause both structural and dynamic changes in distant areas of the protein molecule that nevertheless affect its function. Several models of allostery have been developed, all attempting to ascribe general rules to complex protein behavior (91–95).

1.3.1 Models of allostery

1.3.1.1 Concerted Model

Originally devised to account for the cooperative behavior seen in the oligomeric protein hemoglobin, the Monod-Wyman-Changeux (MWC) model is often used to describe allosteric transitions in symmetric oligomeric proteins in general (91). The model postulates that the protomers in an oligomeric protein can exist in either a tense (T) or relaxed (R) state (Figure 1.9A) and each protomer is in equilibrium between those two states (91). Both states can bind ligand, although with different affinities, but conformational changes can only happen for all protomers in a concerted fashion. In hemoglobin, in the absence of oxygen, the T state is the predominant conformation (91); binding of oxygen to each protomer shifts the equilibrium of this protomer to the higher oxygen affinity R-state and the overall affinity of hemoglobin for oxygen increases. The reverse process takes place when oxygen is released.
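The concerted model leads to a closed-form binding curve. The sketch below implements the standard MWC saturation function for n equivalent sites, where alpha is the ligand concentration divided by the R-state dissociation constant, L is the T/R equilibrium constant in the absence of ligand, and c is the ratio of the R-state to T-state dissociation constants. The parameter values are illustrative only and are not fitted to hemoglobin or FAcD data.

```python
import math


def mwc_saturation(alpha, l_const, c, n=4):
    """Fractional saturation in the MWC model.

    alpha = [ligand] / K_R, l_const = [T0] / [R0], c = K_R / K_T,
    n = number of equivalent binding sites.
    """
    r_term = alpha * (1.0 + alpha) ** (n - 1)
    t_term = l_const * c * alpha * (1.0 + c * alpha) ** (n - 1)
    partition = (1.0 + alpha) ** n + l_const * (1.0 + c * alpha) ** n
    return (r_term + t_term) / partition


def hill_slope(alpha, l_const, c, n=4, step=1.01):
    """Numerical Hill coefficient: slope of log(Y / (1 - Y)) versus log(alpha)."""
    def logit(a):
        y = mwc_saturation(a, l_const, c, n)
        return math.log(y / (1.0 - y))
    return (logit(alpha * step) - logit(alpha / step)) / (2.0 * math.log(step))


# Illustrative parameters: the T state is strongly favoured without ligand.
for alpha in (1.0, 5.5, 100.0):
    y = mwc_saturation(alpha, 1000.0, 0.01)
    print(f"alpha={alpha:6.1f}  Y={y:.3f}  "
          f"Hill slope={hill_slope(alpha, 1000.0, 0.01):.2f}")
```

Setting L to zero removes the T state and reduces the curve to simple hyperbolic binding with a Hill slope of one, whereas a large L combined with a small c gives the sigmoidal curve characteristic of positive cooperativity.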

1.3.1.2 Sequential Model

Shortly after the publication of the MWC model, another model used to describe the allosteric transition in oligomeric proteins was published by Koshland, Némethy, and Filmer (96). The KNF model was formulated to address some of the problems with the MWC model, e.g. the lack of a mechanism leading to the transformation between the T- and R-states. The authors of the KNF model considered it unrealistic that all the subunits in hemoglobin would change structure together (96). In the MWC model, substrate binding does not induce any specific structural state in the protomer, but rather only shifts the equilibrium between T and R states. In contrast, the KNF model proposed that not all protomers are necessarily in the same structural state at the same time (Figure 1.9B). The model assumes that substrate binding induces a structural change in the protomer, leading to gains in affinity for the substrate (97). These conformational changes can be transmitted to neighbouring protomers. For example, in the KNF model for hemoglobin, when oxygen binds to the first protomer, it not only induces a structural change, moving the protomer from the T-state to the R-state, but also induces small structural changes in the neighbouring protomers, causing them to gain a slightly higher affinity for oxygen. As more oxygen molecules are bound to hemoglobin, the affinity of the remaining empty protomers for oxygen gradually increases (98). Again, oxygen release in tissue follows the reverse path.
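For a two-site protein such as a homodimer, sequential binding can be written as a stepwise (Adair-type) binding polynomial. The sketch below uses macroscopic association constants k1 and k2 for the first and second ligand rather than the explicit subunit interaction constants of the KNF formalism (96), so it should be read as a simplified stand-in; all numerical values are hypothetical.

```python
def dimer_saturation(ligand, k1, k2):
    """Fractional saturation of a two-site protein.

    k1 and k2 are macroscopic stepwise association constants for binding
    the first and the second ligand, respectively.
    """
    first = k1 * ligand                 # weight of singly bound species
    second = k1 * k2 * ligand ** 2      # weight of doubly bound species
    return (first + 2.0 * second) / (2.0 * (1.0 + first + second))


# All three cases share k1 * k2 = 1, so each is half saturated at ligand = 1.
cases = {
    "independent sites (k2 = k1/4)": (2.0, 0.5),
    "positive cooperativity": (0.2, 5.0),
    "negative cooperativity": (20.0, 0.05),
}
for label, (k1, k2) in cases.items():
    values = [dimer_saturation(x, k1, k2) for x in (0.5, 1.0, 2.0)]
    print(label, [round(v, 3) for v in values])
```

For identical, independent sites the statistical factors give k2 = k1/4. A larger k2 steepens the curve around the midpoint (positive cooperativity), while a smaller k2 flattens it (negative cooperativity), a behavior the sequential model can describe but the concerted model cannot.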


Figure 1.9 – Concerted and sequential models of allostery. Both the concerted and sequential models were originally developed to explain the measurements of oxygen binding to hemoglobin. A) All the protomers together in the MWC model can only adopt one of two states (T or R) that are in equilibrium. Oxygen binding (denoted in blue) increases the pressure for the tetrameric molecule to have the four protomers assume the R-state, which has a higher affinity for oxygen. B) In the sequential model the protomers can individually change from the T- to the R-state upon oxygen binding. A conformational change to the R-state causes adjacent protomers to adopt a T0 state, which is a primed state that will more readily bind oxygen than the T state. Both models share the paradigm that oxygen binding further increases hemoglobin’s affinity for oxygen.

1.3.1.3 Ensemble Model

Cooperativity has evolved to serve specific enzymatic functions. Despite their undeniable successes, the KNF and MWC models cannot, however, explain all the types of cooperativity observed with enzymes. Since their introduction, a number of modifications have been suggested to account for new structural and spectroscopic data (99). However, none of them provides any structural basis for how cooperativity takes place. In addition, the view of how proteins behave as molecules has drastically changed. They are no longer seen as polypeptide chains that exist in a few well-determined and static states. Rather, proteins are regarded as dynamic ensembles of states that sample an energy landscape as they interconvert between states of varying energies and populations (92, 99–101). The distribution of states that a protein can assume at any given time is not static and changes depending on environmental conditions, e.g. whether or not it is interacting with ligands (94, 99, 102). Proteins in a given environment exist in an equilibrium in which they sample several states with varying populations. Binding of a ligand to the protein can stabilize one of the sampled states, which then becomes more highly populated. Ligand binding therefore changes the free energy landscape, whereby the ligand-bound state becomes the predominant one and the protein can then begin to sample a different array of states (47, 93, 102, 103). The ensemble model of allostery treats proteins statistically rather than as simple static structures that proceed sequentially through catalysis. This new model of allostery has caused a shift in the way experiments are designed, with a search for high-energy, lowly populated states as a means to better explain allosteric mechanisms (103, 104).
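The population shift at the heart of the ensemble view follows from Boltzmann statistics. The sketch below computes the populations of a hypothetical three-state ensemble and shows how stabilizing one sparsely populated state redistributes the ensemble; the free energies, the 3 kcal/mol stabilization, and the temperature are assumed values chosen for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def populations(free_energies_kcal, temperature_k=298.15):
    """Boltzmann populations of conformational states from relative free energies."""
    rt = R_KCAL * temperature_k
    weights = [math.exp(-g / rt) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical ensemble (kcal/mol): a ground state and two excited states.
apo = populations([0.0, 1.5, 3.0])
# Ligand binding is modelled as stabilising the second state by 3 kcal/mol.
bound = populations([0.0, 1.5 - 3.0, 3.0])

print("apo:  ", [round(p, 3) for p in apo])
print("bound:", [round(p, 3) for p in bound])
```

In this toy case a state populated to only a few percent in the apo ensemble becomes the dominant species once it is stabilized, without any new conformation having to be created by the ligand.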

1.3.2 Methods to study allostery

Since its first description, allostery has been a hot topic in the biophysical and biochemical communities, and has been subject to a multitude of different experimental studies, both in vitro and more recently also in silico.

1.3.2.1 X-ray crystallography

X-ray crystallography is the most frequently used experimental technique for discerning a protein’s structure. As described above, it has predominantly been used to attain ground state structures. While some chemical and physical trapping methods have had success in garnering “before and after” snapshots of protein behavior, high-energy states and dynamic information were ultimately lost. Due to the advances in time-resolved studies, it is now possible to tease out dynamic information from crystallographic data. New TRX techniques, such as “5-dimensional” crystallography which introduces both time and temperature changes, make it possible to calculate kinetic rates (105). Time-resolved data has also been used with SVD analysis to gain new insights into hemoglobin cooperativity (106).

1.3.2.2 Nuclear Magnetic Resonance

The vast majority of crystallographic experiments have many limiting factors that need to be overcome to study dynamic behavior. Nuclear magnetic resonance (NMR) is a spectroscopic technique which takes advantage of the magnetic resonance phenomenon of isotopes of atoms which have a nonzero spin (107). NMR spectroscopy allows dynamic changes in the protein to be studied by monitoring how residues behave in their electronic environment via chemical shifts and relaxation times. With a suitable nucleus, such as 19F, this technique allows dynamic changes taking place in the protein to be followed by monitoring key residues. 19F NMR chemical shifts are particularly sensitive to their local chemical environment, mainly because the fluorine nucleus is acutely exposed to local electrostatic and van der Waals forces (108). This high degree of sensitivity allows for the detection and analysis of conformational changes, substrate binding, and residue dynamics at the labelled sites (109). Fluorinated amino acids, usually substituted tryptophans, tyrosines and phenylalanines, are incorporated into the protein by growing the bacterial strains under synthetically auxotrophic conditions, for example by using glyphosate to inhibit aromatic amino acid production so that labelled residues can take their place (110). Assignment of peaks to labelled residues is usually achieved via site-directed mutagenesis. Spectra of mutant proteins are compared against the spectrum of the wild-type protein; a missing peak identifies the mutated residue, e.g. W -> H/F (111). Subsequent 19F NMR analysis of proteins with labelled residues must ensure that any perturbations arising from fluorinated residues do not disrupt normal protein function or hinder good data collection. Since the atomic size of fluorine is only slightly larger than that of the hydrogen atom it replaces in the amino acid, such changes usually do not significantly affect protein structure or function (110).

Allosteric changes in proteins manifest themselves as population shifts or dynamic changes in a conformational ensemble. NMR offers numerous techniques to study these phenomena, such as relaxation dispersion experiments, which can give thermodynamic, kinetic, and structural information by measuring line-width changes due to µs-ms dynamics (107). NMR is particularly well suited for the detection of very sparsely populated, almost invisible high-energy states of the conformational landscape. It is possible to detect and measure the exchange rates of these excited states using a technique called chemical exchange saturation transfer (CEST) (104), assuming exchange occurs on a timescale between ms and s. Alternatively, faster exchange (µs - ms) may be studied by T1rho or CPMG-type (T2) experiments.

1.3.2.3 Computational approaches

Computational approaches have begun to prove that they are powerful tools for the study of allosteric mechanisms, especially when combined with crystallographic and spectroscopic methods (47). Through long-timescale simulations, coarse-grained methods, statistical analysis, and molecular dynamics simulations, such approaches have shown predictive power in discerning allosteric pathways, for example in allosteric drug discovery (112, 113). One especially successful example is simulations that detect transiently formed binding pockets for ligands in potential allosteric binding sites (114). An interesting approach looks at how water molecules behave and how they influence allosteric networks by interacting with ligands during a binding event (115). Graph theory applications have proven effective for mapping out potential allosteric networks as well as for identifying the residues through which network signals are carried across the protein (47, 116).

1.4 Thesis Rationale

This thesis attempts to contribute to our understanding of, and to gain biophysical insight into, enzyme mechanics. It sets out to understand how small, nuanced structural and dynamic shifts that occur both during and outside catalytic activity can profoundly affect enzyme function. Fluoroacetate dehalogenase (FAcD) has proven to be a most suitable model system to study these properties. By addressing the biophysical and chemical details of enzymatic defluorination, I set out to obtain information on how FAcD functions by establishing a mechanistic atomic picture of catalysis, and found that the experiments also provided an ideal opportunity to contribute to the understanding of allostery. It became clear that these findings can be extrapolated to allosteric mechanisms in other enzyme systems as well. The insights gained in these studies can contribute to new ways to engineer novel enzymes. The over-arching tasks to be addressed in this thesis are:

1) Functionally characterize the structural and dynamic interplay that occurs within and between protomers during catalysis.
2) Understand the dynamic and biochemical mechanisms for allostery and attempt to structurally characterize the catalytic effects of enzyme-substrate interactions.
3) Develop an understanding of the structural mechanics of FAcD by gaining insight into structural intermediates without the means of chemical trapping.

Numerous biophysical techniques were used to address these questions. Classical X-ray crystallographic techniques using chemical trapping were applied, as well as NMR, to gain insight into the dynamic interplay between sidechain motions and large-scale dynamic changes in protein domains. Other experiments, aimed at providing insight into allosteric behaviour, investigated the effects of substrate concentration on kinetic rates and demonstrated the enzyme’s potential to catalyze the hydrolysis of halogenated compounds. Participation in the methodological development of Serial Time-Resolved X-ray Crystallography opened the door to the structural characterization of potential reaction intermediates. This required the design and synthesis of several possible caged substrates (in collaboration), the characterization of several caged compounds, and optimization of reaction initiation and data collection using crystal chip technology at both synchrotron and XFEL sources. Collectively, the data and their interpretation presented in this thesis provide greater insight into allostery and the structural mechanics of enzymes, both by using established techniques and by contributing to the improvement of new ones.

Chapter 2

The content of this chapter has been adapted from: Science 355(6322), eaag2355 doi:10.1126/science.aag2355 (http://science.sciencemag.org/content/355/6322/eaag2355)

The Role of Dimer Asymmetry and Protomer Dynamics in Enzyme Catalysis

Authors: Tae Hun Kim1†, Pedram Mehrabi2,3†, Zhong Ren4,5, Adnan Sljoka6, Christopher Ing7,8, Alexandr Bezginov2, Libin Ye1, Régis Pomès7,8, R. Scott Prosser1,7,‡*, Emil F. Pai2,3,7,9‡*

Affiliations:
1Department of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, Ontario L5L 1C6, Canada.
2Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada.
3Ontario Cancer Institute/Princess Margaret Cancer Centre, Campbell Family Institute for Cancer Research, Toronto, ON, M5G 1L7, Canada.
4Department of Chemistry, University of Illinois at Chicago, Chicago, IL 60607, USA.
5Renz Research, Inc., Westmont, IL 60559, USA.
6CREST, Japan Science and Technology Agency (JST), Department of Informatics, School of Science and Technology, Kwansei Gakuin University, Japan.
7Department of Biochemistry, University of Toronto, 1 King’s College Circle, Toronto, Ontario M5S 1A8, Canada.
8Program in Molecular Structure and Function, Research Institute, The Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada.
9Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada.

†These authors contributed equally to this work.
‡The contributions by these authors are of equal importance.
*Correspondence to: [email protected], [email protected]

Statement of contribution: P.M. and T.H.K. designed and performed crystallographic and NMR experiments, respectively. Together they designed and performed the ITC experiments and analyzed all the data. Z.R. performed SVD analysis. A.S. performed computational rigidity analysis. A.B. assisted with the computational algorithm. C.I. and R.P. performed and analyzed MD simulations. L.Y. performed site-directed mutagenesis. R.S.P. supervised NMR data collection and analysis. E.F.P. supervised crystallographic experiments and analysis as well as editing of the manuscript. The manuscript was written by T.H.K., P.M., and R.S.P. with input by all other authors. Aiping Dong (Structural Genomics Consortium, Toronto) and Jean-Philippe Julien (Hospital for Sick Children Research Institute, Toronto) helped with collecting X-ray diffraction and ITC data, respectively. Lewis E. Kay, Ranjith Muhandiram and Robert M. Culik (University of Toronto) helped with the collection of 15N CPMG relaxation dispersion data, their analysis and discussion.

Abstract

Freeze-trapping X-ray crystallography, NMR, and computational techniques reveal the distribution of states and their interconversion rates along the reaction pathway of a bacterial homodimeric enzyme, fluoroacetate dehalogenase (FAcD). The crystal structure of apo-FAcD exhibits asymmetry around the dimer interface and cap domain, priming one protomer for substrate binding. This asymmetry is dynamically averaged through millisecond timescale conformational exchange. During catalysis, the protomer conformational exchange rate becomes enhanced, the empty protomer exhibits increased local disorder and water egresses. Computational studies identify allosteric pathways between protomers. Water release and enhanced dynamics associated with catalysis compensate for entropic losses from substrate binding while facilitating sampling of the transition state. The studies provide insights into how substrate-coupled allosteric modulation of structure and dynamics facilitates catalysis in a homodimeric enzyme.

One Sentence Summary

Enzyme homodimer engages both subunits – one binds substrate in its active site, the other allosterically enhances catalysis.

2.1 Main Text

Enzymes greatly accelerate biochemical reactions by providing a scaffold to bind and recognize substrates (117), position catalytic units (118, 119), and facilitate formation of stabilized transition states (120). Representative states associated with this process are typically sampled by the substrate-free (apo) enzyme, where cooperative motions often play a role in connecting states and propelling the enzyme along the reaction coordinate pathway (121–124). This dynamic cooperativity is also key to establishing allosteric pathways between the substrate binding region(s) and the reaction center, resulting in catalysis and release of product(s) (125, 126). In analogy with protein folding, cooperative motions are intrinsically important to catalysis in terms of sampling conformations associated with substrate capture, assembling substrates and surmounting enthalpic and entropic barriers, providing for multiple reaction pathways, and regulating overall catalysis - all while avoiding motions that are either unproductive or give rise to misfolding (127). Much of this seminal work on dynamic energy landscapes has focused on monomeric enzymes (119, 121, 122, 124). Here, we consider the role of dynamics in a homodimeric enzyme, fluoroacetate dehalogenase (FAcD) from Rhodopseudomonas palustris, which cleaves carbon-halogen bonds and, in particular, one of nature’s strongest covalent single bonds, the carbon-fluorine bond, generating glycolate, as shown in Figure 2.1A (5, 128). The cleavage of a carbon-halogen bond is initiated through an SN2 attack by an aspartate residue, D110, which displaces the halogen resulting in a covalent ester intermediate, while H155, W156 and Y219 stabilize the leaving halide. Then H280 activates a water molecule for ester hydrolysis to generate the product glycolate (Figure 2.S1) (5).

Figure 2.1 - Structural asymmetry and correlated structural changes in FAcD. (A) Reaction catalyzed by FAcD. (B) Projection of the difference distance matrix (Figure 2.S4A) between the protomers A and B plotted onto the apo FAcD crystal structure. (C) Michaelis complex of mutant D110N co-crystallized with FAc. (D) SPARTA+-predicted chemical shift differences (Δδ) between protomers from 38 substrate-soaked crystal structures (129). (E) Plot of right singular vectors (RSV) v1 and v2, weighted by their corresponding singular values (w1, w2), of 39 protomer pairs from the FTX series including an apo-structure. Coordinates of the two protomers of each dimer in the series are connected by a grey line and are defined by a red and blue dot, representing protomers A and B, as discussed in Figure 2.S6. The diameter of each dot corresponds to the soaking time. Structural changes occurring in protomer A are highly coupled with those in protomer B but due to stochastic fluctuations arising during soaking, no correlation with soaking time could be established.

2.1.1 The ensemble approach to catalysis

Given the requisite roles of enzymes to recognize and position substrate(s) and facilitate catalysis and product release, the notion of a dynamic conformational ensemble is well-accepted (117–119, 130). Substrate binding can contribute energy toward activation through cooperative processes which allow enhanced sampling of catalytically relevant functional states within this ensemble (97, 99, 131). To advance one’s understanding of dynamic allostery in the context of catalysis, one would need to first identify the dominant conformations associated with substrate recognition, catalysis, and product release, as illustrated on a hypothetical free energy profile in Figure 2.S1. In the current study, high-resolution crystal structures of the apo state, the D110N substrate-bound Michaelis complex, an H280N glycolyl-enzyme covalent intermediate, and a product-bound state constitute a minimum data set with which to identify the appropriate distribution of states (i.e. the ensemble) via NMR (132). When employed on the native enzyme and the functional mutants listed above, it is possible to characterize this ensemble as the reaction ensues.

A key facet of the ensemble involves the connectivity of functional states which arise through cooperative dynamic processes and which we associate with allostery (133, 134). Enzymes generally exhibit a wide range of dynamics, ranging from sub-nanosecond local fluctuations, which relate to the conformational entropy of a given state, to concerted processes spanning tens of microseconds or many milliseconds, which serve to transmit binding or catalytic events across the protein (135, 136). Even under conditions where enzymes possess low catalytic rates, there are invariably picosecond to many millisecond timescale motions that may sample the reaction coordinate pathway but are often futile with regard to the generation of product (137). Solution state NMR can play a major role in providing insights into the above fast processes and the slower cooperative processes, often with atomic resolution (132, 138–140).

Within the arsenal of NMR relaxation experiments, 19F NMR is most useful both in delineating states and in quantitating collective motions that are perhaps too fast for traditional 13C or 15N relaxation experiments, which home in on millisecond timescale motions (141). Biosynthetic labeling, which commonly involves the substitution of aromatic residues such as tryptophan with their mono-fluorinated equivalents, is generally relatively non-perturbing, as evidenced by the X-ray crystal structure of 5F-Trp enriched FAcD in Figure 2.S2A, which overlaps within an RMSD of 0.14 Å with wild type FAcD. Although there are no detectable structural perturbations, subtle effects on binding and catalysis are detected by 19F NMR functional assays and ITC (Figure 2.S2, C - F). 19F NMR chemical shifts often resolve multiple states, due to their extreme sensitivity to subtle differences in van der Waals and electrostatic environments (142). These chemical shift differences also greatly help in the measurement of millisecond or sub-millisecond dynamics, where traditional 15N or 13C relaxation experiments simply do not register the exchange, as evidenced by Figure 2.S3, where virtually no 15N CPMG dispersions could be detected for a process deemed by 19F CPMGs to occur at 4300 s-1 (vide infra). Finally, millisecond timescale dynamics, which characterize allosteric pathways, are typically difficult to validate by all-atom molecular dynamics (MD) simulations alone. For this reason, we utilize both MD simulations, to ascertain which residues exhibit correlated motions over 40 µs trajectories, and the rigidity-based transmission allostery (RTA) analysis, which accesses millisecond timescale motions. In particular, RTA identifies physical pathways between protomers or distinct domains in the protein, which can communicate binding or local conformational changes, thereby connecting states identified by crystallography and NMR.

2.1.2 Dynamic allostery in dimers.

Quaternary structure is generally understood to facilitate allosteric regulation (90, 143, 144) and in some cases regulate catalysis through distinct oligomerization states (145). Only 19% of bacterial proteins are monomers whereas homodimers are the most prevalent form with 38% (146). Curiously, the dimeric enzyme, FAcD, only catalyzes the transformation of a single substrate molecule at a time, calling into question the role of the dimer. To gain insight, we used freeze-trapping X-ray crystallography (FTX), NMR, and computational techniques to investigate the distribution and interconversion rates of states along the reaction pathway. As FAcD catalysis is slow, FTX provides a sense of structural changes occurring during catalysis. High-resolution X-ray crystal structures of the catalysis-effecting mutants listed above provide additional indications of dynamic regions through B-factor analysis and changes in the numbers of bound waters for each of these reaction steps, while NMR provides a sense of the extent to which distinct functional states are sampled along the reaction coordinate pathway.

2.1.3 Dimer asymmetry.

The structure of apo FAcD shows a subtle asymmetry in the dimer interface and the region around the active site (Figure 2.1B). While one protomer is in an open state, which is poised for substrate binding, the other protomer exists in a closed state (Figure 2.S1). This asymmetry persists upon binding as evidenced by linear trajectories of NMR chemical shifts during substrate titration (Figure 2.2B), and X-ray crystal structures of the D110N Michaelis complex, the H280N covalent intermediate, and the wild-type glycolate complex, when obtained by co-crystallizing with substrate or product, respectively (Figure 2.1C and Figure 2.S4, A to D). All these structures reveal a single bound substrate molecule per dimer. The only exception is a H280N covalent intermediate structure, obtained from a crystal grown from a solution with an excessive concentration of substrate (Figure 2.S4, E) (5), indicating that after nucleophilic substitution in one active site, the second site becomes more accessible. Taken together, this suggests catalysis proceeds through half-of-the-sites reactivity (147, 148), where binding to the second site is prevented at least until nucleophilic substitution has taken place.
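A difference distance matrix of the kind projected in Figure 2.1B can be sketched in a few lines. The example below is only an illustration: it assumes one representative atom (e.g. Cα) per residue with identical residue numbering in both protomers, and it uses random toy coordinates rather than the FAcD structures. Because only internal distances are compared, no superposition of the two protomers is needed.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances for an (N, 3) array of atom positions."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def difference_distance_matrix(coords_a, coords_b):
    """Element-wise difference of the internal distance matrices of A and B."""
    return distance_matrix(coords_a) - distance_matrix(coords_b)

# Toy data (hypothetical): protomer B is a rotated and translated copy of
# protomer A in which only residue index 4 has been displaced.
rng = np.random.default_rng(0)
protomer_a = rng.normal(size=(5, 3))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
protomer_b = protomer_a @ rot.T + np.array([10.0, -3.0, 2.0])
protomer_b[4] += np.array([1.5, 0.0, 0.0])

ddm = difference_distance_matrix(protomer_a, protomer_b)

# One simple per-residue asymmetry score: mean absolute row entry.
per_residue = np.abs(ddm).mean(axis=1)
print(per_residue.round(3))
```

The rigid-body rotation and translation leave the matrix unchanged, so only the displaced residue stands out in the per-residue score. The score used here is one possible choice for projecting the matrix onto a structure, not necessarily the one used for the figure.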

As FAcD exhibits a slow enzymatic rate (kcat = 1.84 min-1 for wild type FAcD), we could capture structural changes in the dimer by X-ray crystallography after freeze trapping crystals soaked with substrate for times ranging from 2 to 180 seconds (149, 150). Structural differences between the protomers in the reaction series were analyzed by difference Fourier maps, which were calculated for each protomer of the 38 soaked structures with the ligand-free structure as reference. As shown in Figure 2.1B and D, key asymmetries in the dimer appear to arise in the interface region, and the cap domain which controls ligand gating. The 78 difference maps of individual protomers were analyzed by singular value decomposition (SVD) analysis to rank decomposed difference components (Figure 2.S5) in order of significance (Figure 2.1E, and Figures 2.S6, B and C, 2.S7) (85). While SVD analysis did not give a time correlation, due to stochastic fluctuations that arise during soaking of the ligand, it did reveal highly correlated yet asymmetric structural changes between the protomers during catalysis (Figure 2.1E). The two protomers in the same dimer follow very different trajectories marked by arrows for the first few dimensions decomposed by SVD, which indicates that the difference maps of the two protomers contain unique compositions (wiVi, i = 1 - 8).
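The principle of ranking difference components by SVD can be sketched as follows. This is a generic illustration under stated assumptions and not the software used for the analysis: each difference map is assumed to be flattened into one column of a data matrix, and the maps here are synthetic, built from two hidden components plus noise. All sizes and amplitudes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, n_maps = 200, 12  # hypothetical sizes, far smaller than real maps

# Synthetic "difference maps": mixtures of two hidden density components.
comp1 = rng.normal(size=n_voxels)
comp2 = rng.normal(size=n_voxels)
amp1 = rng.normal(size=n_maps)          # contribution of comp1 to each map
amp2 = 0.3 * rng.normal(size=n_maps)    # weaker contribution of comp2
maps = (np.outer(comp1, amp1) + np.outer(comp2, amp2)
        + 0.01 * rng.normal(size=(n_voxels, n_maps)))

# Columns of U: decomposed components; w: singular values (descending);
# rows of Vt: right singular vectors, one coefficient per map.
U, w, Vt = np.linalg.svd(maps, full_matrices=False)

# Coordinates of every map in the plane of the two leading components,
# i.e. right singular vectors weighted by their singular values.
coords = w[:2, None] * Vt[:2]

# Fraction of the total signal captured by each component.
explained = w ** 2 / np.sum(w ** 2)
print(explained[:4].round(4))
```

Plotting the two rows of `coords` against each other gives one point per map, which is the type of representation described for Figure 2.1E.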

Figure 2.2 - Conformational and dynamic changes of FAcD upon titration with substrate analog BrAc observed by 15N,1H HSQC and 19F NMR. (A) An expansion of 15N,1H HSQC spectra from a titration series with BrAc:protein ratios ranging from 0 to 10 at 50 °C (full spectrum, see Figure 2.S9B). (B) Chemical shift perturbations resulting from titration of BrAc (Figure 2.2A), overlaid onto the Michaelis complex structure. Dark black regions represent missing peak assignments. (C) 19F NMR spectrum of 5F-Trp incorporated FAcD at 50 °C. The dashed box showing the major and minor peaks of W156 is enlarged in Figure 2.7B. (D) CPMG relaxation dispersion profiles of W156 recorded at 600 and 500 MHz. The substrate-bound state (orange) shows increased dynamics by a factor of ~5 compared to the apo (black) and product-bound states (green). For an expanded view of CPMG profiles of apo and product-bound states, see Figure 2.S8B. (E) W156 shows large chemical shift changes upon addition of substrate analog BrAc (orange) and product (glycolate, in green). Note that the chemical shift of one of the minor peaks (S1) of apo FAcD matches the “substrate-bound” chemical shift, though without line broadening (dynamics).

2.1.4 Dynamic averaging of structural asymmetry

In an effort to correlate the above structural asymmetries with the NMR results, SPARTA+ (129) was used to predict the differences between backbone chemical shifts of protomers A and B, averaged over the 38 crystal structures (Figure 2.1D). Several regions are predicted to exhibit larger chemical shift differences associated with catalysis - namely the dimer interface region, the cap domain which controls ligand gating, helices α1, α9, α10, and the β-strand β8 (Figure 2.1D). Nevertheless, 15N, 1H and 19F NMR spectra of the apo-enzyme and substrate-saturated enzyme exhibit a single set of resonances (Figure 2.2A and B), implying that these structural asymmetries are dynamically averaged through conformational exchange. Based on the predicted differences in 15N and 1H chemical shifts, in the regions associated with catalysis, we estimate that the dimer must undergo a conformational exchange between two states (AB ⇄ BA) with kex ~ 400 s-1 as a lower limit in order to dynamically average the expected shift differences (Figure 2.S6D). Given the sensitivity of 19F NMR to subtle differences in van der Waals and electrostatic environments, 19F Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion experiments were performed to characterize potential protomer exchange dynamics in the apo-state (109, 151). While several tryptophan resonances exhibit dispersions (Figure 2.S8A), Trp156, which is located on the dimer interface helix, exhibits a pronounced CPMG dispersion, indicative of an exchange rate of kex = 750 s-1 (Figure 2.2D and Figure 2.S8B). Thus, the structural asymmetry identified by X-ray crystallography is dynamically averaged in solution.
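The logic behind a lower limit on kex can be illustrated with the textbook coalescence condition for two equally populated sites, under which two resonances separated by Δν (in Hz) merge into one once kex = k_AB + k_BA reaches √2·π·Δν. The sketch below assumes this simple two-site model; it is not necessarily the procedure used to arrive at the estimate quoted above, and the shift difference and Larmor frequency are hypothetical example values.

```python
import math

def shift_difference_hz(delta_ppm, larmor_mhz):
    """Convert a chemical shift difference from ppm to Hz."""
    return delta_ppm * larmor_mhz

def kex_at_coalescence(delta_nu_hz):
    """kex = k_AB + k_BA at coalescence for two equally populated sites.

    Equivalent to the common form k = pi * delta_nu / sqrt(2) for the
    one-way rate constant k, with kex = 2k.
    """
    return math.sqrt(2.0) * math.pi * delta_nu_hz

# Hypothetical example: a 1.0 ppm 15N shift difference between protomers,
# observed at a 15N Larmor frequency of about 60.8 MHz (600 MHz 1H).
delta_nu = shift_difference_hz(1.0, 60.8)
kex_min = kex_at_coalescence(delta_nu)
print(f"delta_nu = {delta_nu:.1f} Hz, kex at coalescence = {kex_min:.0f} s^-1")
```

Exchange slower than this value would give two resolved peaks for the two protomers, so the observation of a single set of resonances places kex above it for the largest predicted shift difference.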

2.1.5 Probing dynamic asymmetry and protomer exchange during catalysis.

To investigate the role of asymmetric conformational exchange between protomers under catalysis, we employed 15N, 1H and 19F NMR over a wide concentration range of bromoacetate (BrAc), which conveniently binds and establishes a Michaelis-like intermediate, though without further catalysis as in the case of other haloacetates (Figure 2.2, A and B, and Figure 2.S8C to E). Despite the structural asymmetries in the dimer identified by crystallography, the 15N, 1H NMR spectra are indicative of fast protomer conformational exchange over the entire range of BrAc concentrations (Figure 2.2A). 19F NMR provides a similar result (Figure 2.2E). Of the nine tryptophans in FAcD, all exhibit a single resonance indicating fast protomer conformational exchange dynamics (Figure 2.S8E). The four resonances associated with W156, W185, W264, and W267 undergo shifts upon titration with BrAc (Figure 2.S8E). 19F NMR CPMG relaxation dispersion measurements show that addition of the substrate analogue BrAc results in a protomer conformational exchange rate on the order of 4300 s-1, substantially faster than that in the apo-state (750 s-1) or in glycolate-bound FAcD, which was estimated as 930 s-1 (Figure 2.2D).
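How an exchange rate shapes a CPMG dispersion profile can be sketched with the Luz-Meiboom expression for two-site exchange in the fast-exchange limit, R2,eff = R2,0 + (Φex/kex)·[1 − (4ν/kex)·tanh(kex/(4ν))], where ν is the CPMG frequency and Φex = pA·pB·Δω². This is an assumption made for illustration; the model actually used to fit the data is not specified in this section. The values of Φex and R2,0 are hypothetical and only the two kex values are taken from the text.

```python
import math

def r2_eff_fast_exchange(nu_cpmg, kex, phi_ex, r2_0):
    """Luz-Meiboom effective transverse relaxation rate (s^-1).

    nu_cpmg : CPMG frequency in Hz
    kex     : exchange rate k_AB + k_BA in s^-1
    phi_ex  : pA * pB * delta_omega^2 in rad^2 s^-2
    r2_0    : exchange-free transverse relaxation rate in s^-1
    """
    x = kex / (4.0 * nu_cpmg)
    return r2_0 + (phi_ex / kex) * (1.0 - math.tanh(x) / x)

nu_values = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
phi_ex = 2.0e4  # hypothetical, kept identical for both curves
r2_0 = 10.0     # hypothetical

profile_apo = [r2_eff_fast_exchange(nu, 750.0, phi_ex, r2_0) for nu in nu_values]
profile_bound = [r2_eff_fast_exchange(nu, 4300.0, phi_ex, r2_0) for nu in nu_values]

for nu, a, b in zip(nu_values, profile_apo, profile_bound):
    print(f"{nu:7.0f} Hz  kex=750: {a:6.2f}  kex=4300: {b:6.2f}")
```

In a fit, kex, Φex and R2,0 would be adjusted to reproduce measured profiles, ideally at two static fields as in Figure 2.2D, since Φex scales with the square of the field while kex does not.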

Resonances which exhibit chemical shift perturbations resulting from the addition of BrAc map to the same region of the protein identified by SPARTA+ analysis of the FTX series (Figures 2.1D and 2.2B, and Figures 2.S9, B to D). In particular, the catalytic site and the interface region showed the largest chemical shift changes with increasing substrate concentrations, suggesting that conformational changes in the catalytic site of one protomer transmit allosterically to the other protomer via the interface domain (100). This is further corroborated by 40 µs of all-atom MD simulations. Analysis of atomistic trajectories reveals that the dimer interface region and the cap domain have correlated and collective motions extending across the dimer interface (Figures 2.3A, 2.S10). Both symmetric (principal component 1) and asymmetric (principal component 2) collective motions were observed in MD simulations (Figure 2.3B). Crystal structures are spread along the asymmetric collective mode, corroborating the motions observed in FTX (Figure 2.3C).
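The extraction of collective modes from a trajectory can be sketched as a principal component analysis of the atomic covariance matrix. The example is a minimal illustration on a synthetic trajectory, assuming that all frames have already been superposed onto a common reference; the atom count, amplitudes and the single built-in collective mode are hypothetical and unrelated to the FAcD simulations.

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_atoms = 500, 6  # hypothetical toy system

# Synthetic trajectory: one dominant collective mode plus small local noise.
mean_structure = rng.normal(size=(n_atoms, 3))
mode = rng.normal(size=(n_atoms, 3))
mode /= np.linalg.norm(mode)
amplitude = 2.0 * rng.normal(size=n_frames)
traj = (mean_structure
        + amplitude[:, None, None] * mode
        + 0.1 * rng.normal(size=(n_frames, n_atoms, 3)))

# Flatten each frame to a 3N vector and remove the average structure.
X = traj.reshape(n_frames, -1)
X = X - X.mean(axis=0)

# Covariance matrix of atomic fluctuations and its eigen-decomposition.
cov = X.T @ X / (n_frames - 1)
eigvals, eigvecs = np.linalg.eigh(cov)       # ascending order
order = np.argsort(eigvals)[::-1]            # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project every frame onto the first two principal components.
projections = X @ eigvecs[:, :2]
print((eigvals[:3] / eigvals.sum()).round(3))
```

Static structures can be placed in the same plane by subtracting the same mean structure and applying the same projection, which is the idea behind overlaying crystal structures on the simulation histogram in Figure 2.3C.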

Figure 2.3 - Cross correlation of residues and collective motions from MD simulations. (A) Side and top views of FAcD depicting mutual information networks of dynamically correlated residues in the apo state. Black lines indicate intra-subunit correlated residue pairs (threshold of 0.25) and red lines indicate inter-subunit correlated residue pairs (threshold of 0.2). (B) Interpolation between extremes (colored red and blue) of the first and second principal components of the backbone atom covariance matrix onto the average structure for apo FAcD simulations. Black arrows indicate overall directions of collective motions in the cap and interface regions of FAcD. (C) Two-dimensional projection of the apo and FAc-bound simulation datasets onto their respective first and second principal components. The colors of the two-dimensional histogram represent the likelihood of observing a simulation snapshot along a collective degree of freedom. X-ray crystal structures (stars) and structures obtained from FTX experiments (circles) are projected onto the principal components of the apo and FAc-bound datasets, respectively.

2.1.6 Asymmetry in local dynamics and numbers of crystallographic water molecules

While dimer asymmetry and fast conformational exchange appear to be a hallmark of FAcD catalysis, there is further evidence from crystallographic B-factors that the empty protomer of substrate-bound FAcD undergoes rapid internal fluctuations upon initiation of catalysis. As shown in Figures 2.4 and 2.5, the entropic cost associated with substrate binding to the ‘active’ protomer is compensated by an increase in B-factors and a decrease in the number of bound water molecules in the empty protomer. Although the B-factors and the number of bound waters of both protomers of apo-FAcD are similar, B-factors in the ligand-free protomer of the Michaelis complex are markedly increased, with a corresponding loss of water molecules, whereas those of the ligand-bound protomer are mostly unaltered (Figure 2.4, A–C and Figure 2.5). This effect is also observed albeit to a lesser extent for the ClAc-bound structure (Figure 2.4B and Figure 2.5, B and C), whose C-Cl dissociation energy is in fact lower by 30 kcal/mol (48). Note that FAcD has evolved to be optimized for FAc over ClAc, where the catalytic rate is conspicuously lower, in keeping with the presumed correlation between entropic contribution (assessed through B-factors and crystallographic water molecules) and catalysis (5). B-factors and bound waters of the glycolate-bound protomer were comparable to those of the apo structure (Figure 2.4, A and F, and Figure 2.5, B and C). We emphasize that all crystals are highly isomorphous (Figure 2.S11, Table 2.S1), B-factor values generally only average from 9 to 13 Å2 for main-chain atoms (with changes reaching 17.5 Å2), and all comparisons were done at the same resolution, suggesting that the observed changes in B-factors are real and significant.

Thus, asymmetric binding of substrate to one protomer triggers an increase of local dynamics (100) and an egress of protein-bound waters in the empty protomer. This dynamic asymmetry is also observed in covalent intermediate states, in cases where only one of the catalytic sites undergoes a reaction (Figure 2.4D and Figure 2.5B and C). In contrast, when both catalytic sites contain the ester intermediate, the protomers exhibit B-factors and numbers of tightly bound waters comparable to those of the apo-state (Figure 2.4E and Figure 2.5B and C). The empty protomer becomes dynamic only in substrate-bound and covalent intermediate structures in which a catalytic site is engaged and primed for breaking or forming covalent bonds (Figure 2.4B to D). Thus, substrate binding enables an increase in empty protomer dynamics, which compensates for entropy losses from substrate binding, facilitates sampling of conformations poised to adopt the transition state, and lowers the free energy of the activation barrier through the allosteric network across the dimer.
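To give a feeling for the magnitudes involved, an isotropic B-factor can be converted into a root-mean-square atomic displacement through the standard relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The sketch below applies it to the main-chain range quoted above. It assumes purely isotropic displacements and ignores the fact that refined B-factors also absorb static disorder and model error.

```python
import math

def rms_displacement(b_factor):
    """Isotropic rms displacement (Å) along one direction from B (Å^2).

    Uses B = 8 * pi^2 * <u^2>.
    """
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# Main-chain B-factor range quoted in the text.
for b in (9.0, 13.0):
    print(f"B = {b:4.1f} Å^2 -> rms displacement = {rms_displacement(b):.2f} Å")
```

Because the displacement scales with the square root of B, the quoted range corresponds to displacements of roughly 0.3 to 0.4 Å, so that comparatively large relative changes in B reflect small absolute changes in atomic spread.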

Figure 2.4 - Heterogeneous dynamics in the dimer at key catalytic steps. Crystallographic B-factors are indicated by color, from blue to red, and via a putty tube representation. (A) apo structure of FAcD (D110N). (B, C) Michaelis complexes of FAcD (D110N) with substrates, ClAc (B) or FAc (C). (D, E) Covalent ester intermediates with only one catalytic site occupied (D) or both catalytic sites reacted (E). In the case of the singly bound FAc or ester intermediate structures, B-factors in the empty protomers are significantly higher. (F) B-factors in the product glycolate-bound state are notably lower than the FAc-bound state (C).

Figure 2.5 - Distribution of water molecules bound to crystalline FAcD. (A) 2Fo-Fc electron densities interpreted as water molecules located within 2.8 Å of the next FAcD atom are shown as grey mesh at 1.0σ cut-off. In the FAc-bound structure, the distribution becomes quite asymmetric due to a significant decrease of water molecules in protomer B, especially around helix α9 and strand β8 of the unbound protomer. Note that these water-depleted regions also exhibit chemical shift perturbations and increased B-factors in the presence of substrate (Figure 2.2A and 2.4). (B) Bar graph of number of bound waters within 2.8 Å of protomers in the apo- and FAc-bound structures. The apo-structure has similar numbers of water molecules in each protomer. The FAc-bound structure reveals a loss of ~30 water molecules surrounding the ligand-free protomer. (C) Bar graph of the same data shown in (B) but as percentages of the total number of tightly bound water molecules.

2.1.7 Inter-protomer allosteric pathways

Our X-ray and NMR studies establish that binding of a single substrate molecule to the FAcD dimer initiates a pronounced conformational exchange between protomers on a sub-millisecond timescale (i.e. 4300 s-1), which is accompanied by increased disorder in the empty protomer. To identify an allosteric transmission pathway between the substrate binding site of one protomer and the dynamic regions of the empty protomer, we used an RTA algorithm (based on rigidity theory and an extension of the method FIRST) (152–154). The RTA algorithm predicts the extent to which a local mechanical perturbation of rigidity (mimicking ligand binding) at one region can propagate and be allosterically transmitted to another (remote) region. Computationally, the RTA algorithm calculates the number of conformational degrees of freedom at the interface region before and after perturbation of the rigidity of the substrate-binding region and any consequent transmission in degrees of freedom is obtained. Mechanistically, if two sites are in rigidity-based allosteric communication (i.e. transmit a change in degrees of freedom) then a change in conformation (e.g. as caused by binding) at one site of the network will induce a change in conformation at the second site (155).

The FIRST method makes use of the X-ray crystal structure to generate a constraint network (graph), where the protein is viewed in terms of vertices (atoms) and edges (i.e. bonds, including covalent, electrostatic, hydrogen bonds, and hydrophobic contacts). Hydrogen bonds are then ranked in terms of overall strength, based upon local donor-hydrogen-acceptor geometry, and hydrogen bonds weaker than a selected cut-off are ignored. Using a pebble game algorithm (153), the network can be decomposed into rigid clusters and flexible regions, enabling an evaluation of non-trivial degrees of freedom throughout the protein. Substrate binding would typically establish additional constraints at the binding site, which should lead to changes in flexibility elsewhere in the enzyme, thereby contributing to allosteric transmission. FIRST rigid cluster decomposition analysis shows that apo-FAcD consists predominantly of a single large rigid region (indicated in blue in Figure 2.6A) spanning the dimer interface. The addition of substrate results in several rigid clusters (designated by color in Figure 2.6A) and flexible connections (shown in gray), demonstrating an increase in conformational entropy in the empty protomer.

The RTA algorithm analysis indicates that the transmission of conformational degrees of freedom (DOF) from the catalytic site in one protomer to the interface region of the empty protomer, shown in Figure 2.6B, occurs by allosteric transmission through networks. Thus, computational analysis of allosteric transmission in all three states, i.e. the apo-, Michaelis, and glycolate-bound structures, suggests that there is indeed an underlying allosteric pathway, which extends from the substrate-binding site to the second protomer. Red and green lines represent the hydrogen bond energy cut-offs corresponding to the maximum possible transmission of degrees of freedom and the onset of allosteric transmission, respectively, for the apo- and ligand-bound dimer structures. In addition, this analysis indicates that less energy is required for allosteric communication between protomers in the Michaelis complex structure compared to the apo- and the glycolate-bound structures (Figure 2.6B) (152, 156).

Figure 2.6 - Computational analyses using rigidity-based transmission models. (A) Computational rigidity predictions with FIRST. Rigid cluster decompositions at a -2 kcal/mol energy cut-off (purple dashed line in B) show that the Michaelis complex is composed of several rigid clusters (designated by color) and flexible connections (gray), while the apo and the glycolate-bound structures consist predominantly of a single large rigid region (blue). (B) The transmission of conformational degrees of freedom (DOF) from the catalytic site in one protomer to the interface region of the empty protomer (i.e. a change in rigidity at site one propagating to modify rigidity at site two) indicates the presence of rigidity-based allostery between the two protomers. Red and green lines show the hydrogen bond energy cut-offs corresponding to the maximum possible transmission of degrees of freedom and the onset of allosteric transmission, respectively.

2.1.8 Conformational selection and the reaction coordinate pathway

A key tenet often associated with protein action involves the notion of conformational selection, which in the context of enzymes posits that the ground state apo-conformer (T state) samples the substrate-bound conformer (R state), even in the absence of substrate (157). Substrate binds to the R conformer, and thus initiates catalysis. In principle, subsequent steps along the reaction coordinate pathway may also be sampled in a similar manner (122). Evidence for such coexistence of states in FAcD is found by both crystallography and 19F NMR. In particular, Y141 adopts distinct major and minor conformations in each protomer of the apo FAcD crystal structure, representing two distinct states of the dimer (Figure 2.7A). In one protomer (A, whose minor conformer is poised for substrate binding), the tyrosine ring is either positioned away from W156 in the major conformation or π-stacks with W156 in the minor conformation. In the other protomer (B, whose catalytic site is empty), the side chain torsion angle, χ1, differs by 19° between the major and minor conformations of Y141. Moreover, based on similarities in Y141 and W156 side-chain torsion angles in substrate-bound FAcD (Figure 2.7C and D), we conclude that the minor conformations of protomer A and protomer B are coincident in the apo-protein (i.e. there are a total of two states of the apo-protein rather than four).

Consistent with the crystallography results, 19F NMR of apo-FAcD reveals two distinct minor resonances for W156 with equal areas (i.e. 0.25% that of the major peak), downfield from the corresponding major peak (Figure 2.7B). The chemical exchange saturation transfer (CEST) profile shows that the two minor peaks are in slow exchange with the W156 major peak (Figure 2.7B) (158). Moreover, the corresponding minor state is recapitulated in a number of obvious resonances with equivalent intensities as shown in Figure 2.S12, suggesting the minor peaks associated with W156 signify a global conformational change. Analysis of the Cl⁻-bound crystal structure (5K3C) and 19F NMR spectra of the W156H mutant and wt FAcD in the presence of NaCl shows that the two minor peaks arise from the distinct minor conformations of Y141 in each protomer associated with the substrate-bound-like state in the apo-form; ring current effects from Y141 to W156 make it possible to observe this minor state by 19F NMR (Figures 2.S13 and 2.S14). The 19F NMR resonance was definitively assigned to W156 through control 19F NMR spectra of the W156H mutant and additional experiments involving the H280N mutant and different ligands as described in Figure 2.S13. The minor state observed by 19F NMR is corroborated by the residual electron density around Y141 in the crystal structure (Figure 2.7A). Fast protomer conformational exchange of the ground state of apo-FAcD, or indeed of the subsequent Michaelis complex, covalent intermediate, and product-bound states, prevents the detection of gross asymmetries under solution conditions. However, the minor state of apo-FAcD, representing the conformation poised for substrate binding, is clearly asymmetric and long lived, based on the absence of exchange broadening and the absence of magnetization transfer between minor peaks (Figure 2.7B and Figure 2.S8F).

The Michaelis complex and glycolate-bound crystal structures show that the minor conformation associated with the apo-state becomes the dominant conformation in the ligand-bound states, with no electron densities associated with minor conformations detected (Figure 2.7C and D, Figure 2.S12). 19F NMR spectra similarly reveal that the minor and major states of the apo-enzyme are reversed in the Michaelis complex and glycolate-bound structures (Figure 2.2E). The pre-sampling of the bound state in the apo-form suggests a conformational selection mechanism in substrate binding (119). In keeping with X-ray crystallography, which showed that the minor state of Y141 as found in the apo-form is stabilized in the Michaelis complex, 19F NMR reveals a clear shift of the major resonance of W156 to one of the minor peaks upon addition of BrAc (Figure 2.2E). Thus, the minor state of apo-FAcD represents the substrate-bound-like state, is pre-sampled with a 0.5% population, and exhibits a lifetime on the order of seconds or longer in the absence of substrate (122). 19F NMR also shows evidence for sampling of the apo-like ground state (~8% population, represented by a resonance at -120.8 ppm) upon addition of saturating amounts of BrAc (substrate analogue) and glycolate (product), in coexistence with the major peak (-118.0 ppm). In the BrAc case, the major peak exhibits significant exchange broadening (Figure 2.2), whereas the major peak resulting from the addition of glycolate did not show line broadening (Figure 2.2E).

19F NMR spectra in the presence of BrAc (Figure 2.2E) and crystallographic data also provide evidence for sampling of sequential catalytic states via a conformational selection mechanism, albeit on a faster timescale. Like the apo-form, the Michaelis complex also exhibits structural asymmetry with respect to Y141, adopting π-stacked and skewed conformations in the bound and the empty protomers, respectively (Figure 2.7C). The π-stacked Y141 conformation of the bound protomer (protomer A) is positioned to accommodate the substrate and is also primed for the initial SN2 nucleophilic substitution. Interestingly, the empty protomer (protomer B) adopts a conformation that is remarkably similar to the covalent intermediate conformation of protomer A, as evidenced by the overlaid crystal structures of the empty protomer (B) in the Michaelis complex and the H280N covalent intermediate (Figure 2.7E and F). This suggests that the intermediate conformation of the reacted protomer (A) is pre-sampled in the empty protomer of the Michaelis complex. Furthermore, the crystal structures of the occupied protomer (A) of the H280N covalent intermediate and the minor conformation of protomer B in the apo-structure overlay well with each other, suggesting the intermediate conformation is also pre-sampled in the absence of substrate (Figure 2.7F, right panel). Thus, the functional states associated with catalysis (namely, substrate recognition, complex formation, facilitation of SN2 substitution and hydrolysis) are elegantly sampled in a sequential fashion via the asymmetric dimer (122). In this case, sampling of the bound-like state is very slow for apo-FAcD, while sampling of the intermediate state via the Michaelis intermediate occurs on a sub-millisecond timescale, as evidenced by 19F NMR CEST and CPMG experiments, respectively, as discussed below.


Figure 2.7 - Crystallographic and spectroscopic evidence of conformational selection. (A) 2Fo-Fc electron density maps representing the major (cyan) and minor (magenta) conformations of Y141 in crystals of apo FAcD (5K3D). The cut-off levels for the electron density displayed are 0.3σ and 1.1σ for protomers A and B, respectively. (B) Downfield region of the 19F NMR spectrum of apo FAcD, showing the minor peaks of W156, marked by asterisks. Line widths of deconvolved minor peaks are 73 Hz (left peak) and 92 Hz (right peak). The CEST profile showing the peak intensity of W156 is displayed as a function of low power saturation frequency (red trace). Chemical shifts of the troughs match those of the minor peaks. (C, E) The 2Fo-Fc maps of the FAcD active site in the Michaelis complex with FAc (5SWN, C) and the covalent intermediate (5K3F, E), showing residues Tyr141, Trp156, glycolyl and bound ligand as sticks. (D) Superposition of the apo form (5K3D) and the Michaelis complex with FAc (5SWN). (F) Crystal structure of the apo form (5K3D) overlaid with the Michaelis complex with FAc (5SWN, left display) and the covalent intermediate (5K3F, right display), showing comparable side chain conformations.

2.1.9 The role of dynamics and quaternary structure in catalysis

While some degree of protein dynamics is generally needed for enzymes to bind substrate and release product, there is a significant debate concerning the relative role of protein dynamics in the chemical step (121, 159–165). It has been suggested that dynamics and flexibility are key to enzyme catalysis and facilitate the chemical step by sampling weakly populated catalytically competent states (160, 163). However, some computational studies argue that the contribution of protein dynamics to the chemical step is minimal and electrostatic preorganization is the major driving force of catalysis (161, 162). We revisit this issue in the context of the dimeric enzyme. One hallmark of FAcD is that a subtle structural asymmetry persists in crystal structures of the apo-enzyme, Michaelis complex, covalent intermediate, and product-bound dimer. Moreover, this asymmetry becomes more pronounced as catalysis proceeds, before retreating to apo-form levels in the product complex. NMR reveals a clear exchange of protomer conformations (AB ⇄ BA) on a millisecond timescale (750 s⁻¹) associated with the ground state apo-form, whereas the sparsely populated substrate-bound-like state undergoes no such exchange in the absence of substrate (Figure 2.8, I and II). Binding of a single substrate shifts the equilibrium to the bound state and, remarkably, brings about a significantly faster and more pronounced conformational exchange on a sub-millisecond timescale (Figure 2.8, III and IV). This would have the advantage of compensating for entropy losses associated with binding the substrate, providing a route for sampling the covalent intermediate state and enabling catalysis. Upon establishing the covalent intermediate (Figure 2.8, V), millisecond conformational exchange (930 s⁻¹) and fast internal empty protomer dynamics may also facilitate sampling of conformations that allow attack by an activated water molecule to complete hydrolysis (Figure 2.8, VI and VII) and product release. In other words, fast conformational exchange between protomers and strong allosteric coupling facilitate sampling of multiple states, along a pathway which allows the transition state to be sampled with a higher probability. B-factors and protein-bound water molecules also contribute to our understanding of the role of dynamics, where a clear enhancement in local motions and changes in the numbers of bound water molecules are seen during catalysis, albeit in domains associated with the empty protomer.

The asymmetry in local dynamics observed during catalysis suggests an advantage for oligomers over that of a monomeric enzyme. In particular, the substrate-bound protomer is able to adopt an optimal pose for efficient catalysis while the empty protomer facilitates the reaction through increased dynamics, for purposes of entropic compensation and sampling of the transition state via tight allosteric coupling between protomers. In addition, the empty protomer appears to serve an additional role with regard to conformational selection and sampling of successive functional states along the reaction coordinate pathway. In particular, the covalent intermediate conformation is sampled by the empty protomer of the Michaelis complex on a rapid timescale upon binding of substrate (4300 s⁻¹). In contrast, the bound-like state of apo-FAcD, which comprises substrate-bound and intermediate conformations, is sampled on a very slow timescale (seconds). Thus, conformational selection in the dimer essentially knits together key functional states at every point along the reaction coordinate pathway and facilitates overall catalysis by sampling key intermediate conformations.

Despite the increased dynamics upon binding of substrate, there is a large discrepancy between the enzymatic catalytic rate (1.84 min⁻¹) and the timescale of protein dynamics (nanosecond to sub-millisecond). The catalytic rate is of course dependent on a sequence of events involving alignment of substrate, formation of the covalent intermediate, and specific chemical attack via an activated water molecule in a unique excited state pose, all of which may involve many cooperative and non-cooperative processes on the part of the enzyme and substrate, prior to attaining the transition state, whereupon the final chemical step may take place on a timescale of femtoseconds or faster. Protein dynamics are thus essential to facilitating overall catalysis, though the experiments described herein do not allow us to interrogate the chemical step.
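The size of this discrepancy can be made concrete with a short calculation. The following is a minimal sketch that uses only two rates quoted in the text (1.84 per minute for turnover and 4300 per second for protomer exchange in the Michaelis complex); the function name is illustrative and the comparison is purely arithmetic, not a kinetic model.

```python
def exchanges_per_turnover(k_ex_per_s, k_cat_per_min):
    """Average number of conformational exchange events per catalytic turnover."""
    k_cat_per_s = k_cat_per_min / 60.0  # convert turnover rate to per-second units
    return k_ex_per_s / k_cat_per_s


# Rates taken from the text: protomer exchange vs. catalytic turnover.
ratio = exchanges_per_turnover(k_ex_per_s=4300.0, k_cat_per_min=1.84)
print(f"exchange events per turnover: {ratio:.0f}")
```

Under these numbers the protomers exchange on the order of 10^5 times for every substrate molecule turned over, which illustrates why the dynamics measured here cannot by themselves be rate limiting.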


Figure 2.8 - Proposed mechanistic scheme for FAcD catalysis. In the apo-state, FAcD undergoes fast exchange between two heterogeneous ground states (I) while it rarely pre-samples a long-lived excited state (II). Upon substrate binding, the excited state becomes dominant (IV) with a weakly populated ground state (III). The protomer conformational exchange rate of the excited state (IV) increases dramatically while the local dynamics of the empty protomer (denoted by red polygons) also increase. At the same time, the number of bound waters in the empty protomer decreases (IV, V). After formation of the ester intermediate, the dynamic empty protomer becomes accessible to substrate and can potentially initiate catalysis (V). Upon formation of product, the conformational exchange rate and the asymmetry in protomer dynamics diminish to a level comparable to that of the apo-state (VI, VII).

2.1.10 The role of water networks in stabilizing functional states

We emphasize that the structural asymmetry observed by crystallography and indeed the differences in structure between distinct functional states of the enzyme (apo-, Michaelis, covalent intermediate, and product) are quite subtle. RMSD differences between the corresponding backbone atoms of apo-form, Michaelis complex, covalent intermediate and product-bound forms of FAcD are less than 0.3 Å. On the other hand, the bound water networks differ significantly between these functional states, emphasizing the key role of water networks in establishing stable states along the reaction coordinate pathway. This theme of water networks has been repeated in several other recent studies including the adenosine A2A GPCR (166), where water networks represent a key facet of the activation switch, haemoglobin, where 60 water molecules are involved in stabilizing the “R” state upon binding by oxygen (167), and in a common hub protein, calmodulin, where the protein bound-like state is sampled on a millisecond timescale and is distinguished by the loss of a network of bound waters (168). Water networks present obvious advantages from the perspective of stabilizing states via enthalpic or entropic means, and establishing long-range dynamic allosteric networks in the case of enzymes, GPCRs, and proteins such as calmodulin which participate in signalling pathways through protein-protein interactions.

2.2 Conclusions

In this study, we report how half-of-the-sites reactivity in a homodimeric enzyme is regulated by a dynamic allosteric pathway and conformational selection mechanism. Crystallographic and spectroscopic signatures of the apo enzyme, Michaelis complex, covalent intermediate, and product-bound dimer reveal a subtle structural asymmetry. At any instant, one of the two protomers in the apo enzyme is primed for ligand binding (Figure 2.8, I and II). An asymmetric bound-like state, which is pre-sampled by the apo enzyme, becomes the dominant state upon binding of substrate, whereupon protomer exchange dynamics are reinforced through allosteric pathways emerging from the substrate-binding site of the ligand-containing protomer and crossing the dimer interface (Figure 2.8) (95). Subsequent functional states are also sampled via the dimeric enzyme, in a manner consistent with conformational selection. In particular, the empty protomer of the Michaelis intermediate adopts a conformation that closely resembles the bound protomer of the covalent intermediate. The catalytic intermediates display significant changes in local dynamics and bound water networks, largely in the ligand-free protomer, leading to pronounced asymmetry between the protomers. This increase in configurational entropy, together with the corresponding increase in protomer conformational exchange and domain-specific dynamics, likely lowers the activation free energy barrier and drives sampling of the transition state. This detailed mechanism provides insights into how substrate-coupled allosteric modulation of structure and dynamics may facilitate catalysis in a homodimeric enzyme.


2.3 Materials and Methods

Protein expression and purification

FAcD was expressed in E. coli BL21 (DE3) and purified as previously described (169). Cell cultures were induced with 1 mM IPTG at an O.D. of 0.8 at 37 °C. Luria broth was used for the preparation of crystallography and ITC samples. For 19F NMR sample preparations, 35 mg of 5-fluoro-tryptophan (5F-Trp) and 1 g of glyphosate were added to 1 L of M9 minimal medium culture prior to induction. No perturbations were observed from 5F-Trp enrichment, as evidenced by the comparison of crystal structures of fluorinated and unlabeled FAcD (Figure 2.S2A and B). Partially deuterated M9 minimal medium (70% 2H2O, 1 g of 15NH4Cl, 3 g of [U-2H,13C]-glucose and 1 g of [U-1H,12C]-glucose) was used for assignment of 15N,1H NMR spectra. After overnight induction, cells were pelleted and lysed by sonication in lysis buffer (5 mM , 50 mM HEPES, 500 mM NaCl, 5% glycerol at pH 7.5). The cell lysate was subsequently purified on a Ni-NTA column with wash (30 mM imidazole) and elution (250 mM imidazole) buffers. After dialysis against cleavage buffer (50 mM Tris-HCl, pH 8.0, and 1 mM DTT), hexa-histidine tags were removed by TEV protease before the solution was applied to a Ni-NTA column once again to separate cleaved FAcD from TEV protease and the tag-peptide. This was followed by size exclusion chromatography (HiLoad 16/60 Superdex 200) with size exclusion buffer (50 mM Tris-H2SO4, 150 mM NaCl at pH 8.5). The purified sample was dialyzed against storage buffer (50 or 500 mM Tris-H2SO4 at pH 8.5).

Production of Trp mutants

The mutants coded for by p151TV_W264F and p151TV_W267F were generated using a QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies) with primers 5'-CGACGCCGCTCGATGTCTTCAGAAAGTGGGCG-3' and 5'-CGCTCGATGTCTGGAGAAAGTTCGCGTCCGACGT-3', respectively. Both constructs were sequenced by a local DNA sequencing facility (The Centre for Applied Genomics, Sick Kids Hospital, Toronto, Canada) prior to expression.


X-ray crystallography

FAcD crystals were grown at room temperature in hanging drops with a protein concentration of 0.5 mM in 100 mM Tris-HCl pH 8.5 and 200 mM CaCl2, with 19-22% PEG3350 as a precipitant. For wild type apo FAcD, we used 100 mM Tris-H2SO4 pH 8.5, 200 mM calcium acetate and 19-22% PEG3350 to exclude chloride ions. Enzyme complexes were generally prepared by co-crystallizing the protein in the presence of 100 mM ligand. The H280N mutant was also crystallized from a solution with 1 mM of ClAc; in crystals from this set-up, only one protomer active site displayed the covalent glycolyl ester intermediate. Diffraction data were collected on a Rigaku FR-C rotating copper anode using a Mar345 detector, on a Rigaku FR-E rotating anode using a Saturn A200 detector, and on a Bruker MICROSTAR using a Bruker SMART 6000 detector. All data were acquired under cryo-conditions (100 K) using Paratone-N (Hampton Research, Aliso Viejo, CA, USA) as a cryo-protectant. Diffraction data were reduced using XDS (170) and phases were calculated with the molecular replacement software Phaser-MR (171) using wild-type FAcD (PDB code: 3R3U) as a model. Atomic models were built in WinCoot (172) and refinement was done using both Refmac5 (173) as implemented in the CCP4 software suite (174) and phenix.refine from the PHENIX software package (Table S1) (175). Water molecules were accepted if they were within 2.2-3.5 Å of a potential hydrogen bond partner and the corresponding map r.m.s.d. levels were greater than 1.5 electrons/Å³. Taking advantage of the slow nature of the reaction catalyzed, a series of simple freeze-trapping crystallographic (FTX) experiments was attempted. Crystals of wild-type FAcD were soaked at room temperature in artificial mother liquor containing 100 mM FAc for periods of 2 seconds to 3 minutes. They were then flash-frozen in liquid nitrogen for data collection. All crystals for the FTX experiments ranged from 150-200 microns in size.

Distance matrices, SVD and SPARTA+ analysis

Distance matrix analysis was performed as described in (106). For each protomer, the pairwise distances between all main-chain atoms were calculated. A difference distance matrix represents the difference between two distance matrices of the same size, which in our case are obtained from protomers A and B of the same dimer. Because it depends only on internal distances, the difference distance matrix represents structural differences between these protomers without requiring their prior superposition.
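As an illustration of the calculation described above, the following minimal sketch computes a difference distance matrix from two equally sized coordinate lists. It is not the code used in (106); the function names and the toy coordinates are hypothetical, and reading main-chain atoms from a PDB file is left out.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between atoms given as (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_distance_matrix(coords_a, coords_b):
    """Element-wise difference of the distance matrices of two protomers."""
    if len(coords_a) != len(coords_b):
        raise ValueError("protomers must contain the same number of atoms")
    d_a = distance_matrix(coords_a)
    d_b = distance_matrix(coords_b)
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(d_a, d_b)]


# Toy example: protomer B is protomer A rotated by 90 degrees about z and
# translated, so every internal distance is preserved.
protomer_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
protomer_b = [(5.0, -3.0, 1.0), (5.0, -2.0, 1.0), (3.0, -3.0, 1.0)]
ddm_rigid = difference_distance_matrix(protomer_a, protomer_b)

# Moving a single atom produces non-zero entries only in its row and column.
protomer_c = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
ddm_moved = difference_distance_matrix(protomer_a, protomer_c)
```

The rigid-body case yields a matrix of zeros, which is the property that makes the difference distance matrix independent of how the two protomers are oriented in the crystal.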


SVD analysis was performed on the FTX series as described previously (85, 106). Difference Fourier maps were calculated for each of the 38 data sets using the structure before soaking as the 0 second time point reference. The electron density maps corresponding to protomer B were superimposed onto those of the corresponding protomer A using the translation vector and rotation matrix determined in a least-squares superposition of protomers A and B. SVD analysis was then performed on these superimposed difference maps as described in (85). Each difference map of a protomer A or B is arranged in one column of the data matrix $A$; the entire matrix $A$ therefore has 76 columns. The SVD procedure finds $A = U W V^{T}$, where matrix $U$ contains the decomposed components. Eight singular values $w_i$, $i = 1, \dots, 8$, in matrix $W$ stand out, indicating that the corresponding eight components in $U$ are significant. Matrix $V$ contains the coefficient sets for linear combination of the components. The coefficient sets corresponding to the first 8 decomposed components are $w_i V_i$, $i = 1, \dots, 8$, which are plotted in an 8-dimensional scatter plot. For visualization, two-dimensional representations of this 8-dimensional space were chosen.
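The decomposition described above can be sketched with NumPy as follows. This is an illustrative outline rather than the procedure of (85): the function name is hypothetical, the data matrix is filled with random numbers in place of real difference maps, and the number of grid points per map (500) is an arbitrary assumption.

```python
import numpy as np


def decompose_difference_maps(data_matrix, n_keep):
    """SVD of a (grid points x maps) matrix, keeping the leading components.

    Returns the components U[:, :n_keep], the singular values w[:n_keep], and
    the coefficient sets w_i * V_i (one row per component, one column per map).
    """
    u, w, vt = np.linalg.svd(data_matrix, full_matrices=False)
    coefficients = w[:n_keep, None] * vt[:n_keep, :]
    return u[:, :n_keep], w[:n_keep], coefficients


# Placeholder data: 76 columns (38 data sets x 2 protomers), 500 grid points each.
rng = np.random.default_rng(0)
maps = rng.normal(size=(500, 76))

components, singular_values, coefficients = decompose_difference_maps(maps, n_keep=8)
# Each column of `coefficients` locates one difference map in the 8-dimensional
# space; any two rows give a two-dimensional projection for plotting.
```

In practice the number of components to keep would be chosen by inspecting the singular value spectrum for a gap, as the text describes for the first eight values.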

SPARTA+ (http://spin.niddk.nih.gov/bax/software/SPARTA+) (129) was used to translate the structural differences between the two protomers in the 38 soaked FAcD crystal structures into NMR spectral parameters. Backbone chemical shifts of the two protomers in each dimeric structure were predicted individually, and the predicted backbone chemical shifts of one protomer were subtracted from those of the other protomer. The differences between the predicted backbone chemical shifts were averaged after multiplying by weighting factors of 0.25 and 0.10 for carbon and nitrogen nuclei, respectively. Averages of the predicted difference backbone chemical shifts are mapped onto the ClAc Michaelis complex structure (Figure 2.1D). The lower limit of fast exchange (~400 s⁻¹) was estimated with the assumption that the fast exchange rate should be at least 10 times faster than the average Δω (36 Hz) of key regions in catalysis, such that no protomer-specific resonances are detected. The key regions used in this analysis include helices α1 (residues 49-57) and α10 (residues 288-298), the interface region (residues 135-190), and the cap domain (residues 250-260). The predicted differences in 15N and 1H chemical shifts in the key regions for catalysis were converted to Hz, appropriate to a 1H Larmor frequency of 600 MHz.
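The unit conversion and the tenfold criterion used for this estimate can be sketched as below. The function names are hypothetical, the 15N/1H frequency ratio is the approximate physical value, and the criterion is applied to Δω in Hz exactly as quoted in the text, which is an assumption about how the estimate was made.

```python
# Approximate Larmor frequency of each nucleus relative to 1H.
FREQUENCY_RATIO = {"1H": 1.0, "15N": 0.1014}


def shift_difference_hz(delta_ppm, nucleus, larmor_1h_mhz=600.0):
    """Convert a chemical shift difference from ppm to Hz at a given field."""
    return abs(delta_ppm) * larmor_1h_mhz * FREQUENCY_RATIO[nucleus]


def fast_exchange_lower_limit(mean_delta_omega_hz, factor=10.0):
    """Smallest exchange rate compatible with a single averaged resonance."""
    return factor * mean_delta_omega_hz


# A 0.1 ppm 1H difference and a 1 ppm 15N difference are both about 60 Hz at 600 MHz.
dw_h = shift_difference_hz(0.1, "1H")
dw_n = shift_difference_hz(1.0, "15N")

# Average shift difference of the key regions, as quoted in the text.
k_min = fast_exchange_lower_limit(36.0)
```

With the quoted average of 36 Hz this gives 360 per second, in line with the approximate lower limit of 400 s⁻¹ stated above.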


NMR spectroscopy

NMR backbone assignment of FAcD was accomplished using standard 3D NMR experiments: HNCO, HN(CA)CO, HNCA, HN(CO)CA, HNCACB, and HN(CO)CACB (Figure 2.S9A) (176). All NMR experiments were performed at 50 °C with protein concentrations between 1 and 3 mM. Partially unlabelled samples were prepared by adding 1 g of either 14N-labelled lysine, histidine, arginine or glutamine to 15N M9 minimal medium cultures to resolve assignment ambiguities for some amide peaks. To verify the backbone assignments, a 3D 15N,1H NOESY HSQC was recorded with a mixing time of 150 ms. Most of the NMR experiments were performed on a 600 MHz Varian Inova spectrometer equipped with a cryogenic probe capable of either the standard regimen of HCN or 19F NMR spectroscopy. Some of the CPMG relaxation experiments were performed on a 500 MHz Varian Unity spectrometer with a 19F NMR room temperature probe. 15N,1H HSQC spectra were acquired using 90 increments, 24 transients, and a spectral width of 2700 Hz in the indirect dimension. Free induction decays of 19F NMR spectra were acquired with an acquisition time of 0.25 s, a repetition time of 2 s, a spectral width of 15,000 Hz and between 2,000 and 66,560 transients. The typical 1H, 19F, 13C, and 15N π/2 pulse widths were 9.4 µs, 19 µs, 23 µs, and 42 µs, respectively. The 15N CPMG relaxation dispersion of FAcD with BrAc was recorded by varying CPMG refocusing frequencies from 33.3 to 1000 Hz with a constant time of 30 ms. 19F CPMG relaxation dispersion experiments were performed by applying various CPMG frequencies (125 – 6000 Hz) at 600 and 500 MHz during a constant time of 2 ms or 8 ms. CPMG relaxation dispersion profiles were fitted to a two-state model with ChemEx as described elsewhere (177). The 19F CEST profile of apo FAcD was obtained by applying a B1 field of 30 Hz for 500 ms. In total, 30 1D 19F spectra were acquired with the B1 field varied from -119.4 to -117.8 ppm. 19F NMR spectra were processed with MestReNova 10.0.2. 1H,13C,15N NMR spectra were processed and analyzed with NMRPipe (178), NMRView (179) and CcpNmr (180) software. The chemical shift perturbations were calculated using the formula $\Delta\delta = \sqrt{(\Delta\delta_{\mathrm{H}})^{2} + (\Delta\delta_{\mathrm{N}}/5)^{2}}$, where $\Delta\delta_{\mathrm{H}}$ and $\Delta\delta_{\mathrm{N}}$ are the changes in 1H and 15N chemical shift, respectively. In most NMR experiments, BrAc, a substrate analog, was used because it induces the same effects on the protein structure and dynamics as the substrates FAc and ClAc, without undergoing catalysis. FAcD binds BrAc at its active site but cannot catalyze its hydrolysis. BrAc exerted the same chemical shift perturbations and line broadening in 19F and 15N,1H HSQC spectra as FAc (Figure 2.S8C and D). The refined crystal structure of the BrAc complex is almost identical to the FAc or ClAc Michaelis complexes with respective RMSD values of 0.3 Å.
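The combined chemical shift perturbation defined in this section can be evaluated per residue as in the sketch below. The function name and the example shift values are illustrative only; the scaling of the 15N shift by a factor of 5 follows the formula given in the text.

```python
import math


def chemical_shift_perturbation(delta_h_ppm, delta_n_ppm, n_scale=5.0):
    """Combined 1H/15N perturbation: sqrt(dH^2 + (dN / n_scale)^2), in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (delta_n_ppm / n_scale) ** 2)


# Hypothetical residue whose amide peak moves by 0.03 ppm (1H) and 0.2 ppm (15N).
csp = chemical_shift_perturbation(0.03, 0.2)
```

Dividing the 15N change by 5 puts the two dimensions on a comparable scale, so that a residue is not flagged simply because nitrogen shifts span a wider ppm range than proton shifts.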

Isothermal calorimetry

Isothermal titration calorimetry binding experiments were performed using an Auto-iTC 200 instrument (Malvern). Purified wild-type and 5F-Trp enriched FAcD in 50 mM Tris-H2SO4 pH 8.5 buffer were concentrated to approximately 1 mM; their exact concentration was determined by absorbance at 280 nm. Glycolate and bromo-acetate ligands were diluted in matching buffer, and their exact concentrations, ranging between 15-30 mM, were determined by NMR. One experiment consisted of 26 injections of 1.5 µL each, with injection duration of 3 s after an initial injection of 0.5 µL. All experiments were performed at 35 °C. To derive dissociation constants (Kd) and molar reaction enthalpy (ΔH), we fitted the integrated titration peaks to a single-site binding model using the Origin 7.0 software. The entropic change ΔS was calculated from ΔH and the change in Gibbs free energy, ΔG. Each set of experiments was repeated in triplicate and corresponding thermodynamic parameters of binding with their standard errors are reported in Figure 2.S2D.
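The relationship between the fitted parameters and the derived entropy can be written out explicitly. The sketch below assumes the standard relations ΔG = RT ln(Kd) (with Kd expressed relative to a 1 M standard state) and ΔG = ΔH - TΔS; the function name and the example Kd and ΔH values are hypothetical and are not taken from Figure 2.S2D.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h_j_per_mol, temperature_c=35.0):
    """Return (delta_G, delta_S) for binding from Kd and the measured enthalpy."""
    temperature_k = temperature_c + 273.15
    delta_g = GAS_CONSTANT * temperature_k * math.log(kd_molar)  # J/mol
    delta_s = (delta_h_j_per_mol - delta_g) / temperature_k      # J/(mol K)
    return delta_g, delta_s


# Hypothetical fit result: Kd = 1 mM and delta_H = -40 kJ/mol at 35 degrees C.
delta_g, delta_s = binding_thermodynamics(1e-3, -40000.0)
```

A negative ΔS from this calculation corresponds to an entropic penalty on binding, which is the quantity discussed in the text in connection with entropic compensation by the empty protomer.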

Computational analyses - Rigidity-based transmission allostery (RTA) and FIRST

We used the computational method rigidity transmission allostery (RTA) to predict allosteric communication in proteins. This approach is based on seminal work in the area of rigidity theory developed in (156). The underlying mathematical concepts of rigidity-based allosteric communication were also discussed in (155). The rigidity-based allostery model extends the pebble game algorithm to predict whether local perturbation of rigidity at one site of the protein transmits across a protein structure to change the rigidity at a second distant site. The pebble game algorithm is the main component of the low computational complexity method FIRST (Floppy Inclusion and Rigid Substructure Topography) (152) for protein rigidity and flexibility analysis. Starting with the coordinates of a protein structure, FIRST creates a constraint network (graph) composed of nodes (atoms) and edges (covalent bonds, hydrogen bonds and hydrophobic interactions). FIRST then applies the rigorous mathematical theory of rigid and flexible molecular structures (181) and the pebble game algorithm to calculate the degrees of freedom of motions to rapidly decompose a protein into rigid clusters and flexible connections.

With RTA analysis, we investigated allosteric regulation of the protein that originates at one site and is transmitted to a distant functional site. The RTA analysis determines whether a perturbation in rigidity and conformational degrees of freedom at a specified site of the protein (mimicking ligand binding) can propagate through the protein structure and induce a quantifiable change in rigidity and in the number of available degrees of freedom at a second, distant site. We calculated the number of degrees of freedom that can be transmitted from one site to the other. Perturbation of rigidity refers to the insertion of additional constraints (edges) at the first site until it is rigidified. Transmission of degrees of freedom refers to the subsequent change in the number of available degrees of freedom at the second site. If the two sites are in rigidity-based allosteric communication (i.e. they transmit degrees of freedom), then a change in conformation (i.e. a mechanically induced change of shape, for instance by ligand binding) at one site will percolate and produce a change in conformation (shape) at the second site (155).

Rigid cluster decomposition of the crystal structures was performed with FIRST as previously described (152). Missing hydrogens were added to the crystal structures using the WHAT IF web server (http://swift.cmbi.ru.nl/servers/html/htopo.html). In this study, the number of conformational degrees of freedom that can be transmitted from the catalytic site in protomer A (residues 110, 111, 114, 141, 155, 156, 185 and 219; site A) to the interface region of the empty protomer B (residues 135-190; site B) was calculated as a function of hydrogen bond energy strength. The energy of each potential hydrogen bond was calculated within FIRST using the Mayo potential (152). For every hydrogen bond energy cutoff, starting at 0 kcal/mol and progressively lowered in steps of 0.01 kcal/mol to -6 kcal/mol, we ran FIRST and the pebble game algorithm to calculate the available degrees of freedom at site A, at site B, and at the union of sites A and B. Degrees of freedom counts were obtained as the maximum number of pebbles that can be gathered on the atoms belonging to site A, then site B, and finally the union of sites A and B. The three counts are denoted Amax, Bmax, and ABmax, respectively. The maximum number of degrees of freedom that can be transmitted from A to B (denoted DOF_AB) is then calculated as DOF_AB = Amax + Bmax − ABmax − 6. Six is subtracted to account for the six trivial degrees of freedom corresponding to the ever-present rigid body motions. When DOF_AB is positive, sites A and B are involved in rigidity-based allosteric transmission, and the number of degrees of freedom that can be transmitted from A to B is DOF_AB. This provides a quantifiable measure of allosteric communication between A and B. To monitor allosteric transmission, DOF_AB was plotted as a function of the energy cutoff between 0 and -6 kcal/mol.
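The bookkeeping behind the transmission curve can be sketched in a few lines. This is a minimal illustration that assumes the pebble counts (Amax, Bmax, ABmax) have already been produced by FIRST at each cutoff; it does not implement the pebble game itself, and the function names and the example counts are hypothetical.

```python
def energy_cutoffs(start=0.0, stop=-6.0, step=0.01):
    """Hydrogen bond energy cutoffs (kcal/mol) from `start` down to `stop`."""
    n_steps = int(round((start - stop) / step))
    return [round(start - i * step, 2) for i in range(n_steps + 1)]


def transmitted_dof(a_max, b_max, ab_max, trivial=6):
    """DOF_AB = Amax + Bmax - ABmax - 6 (six trivial rigid body motions)."""
    return a_max + b_max - ab_max - trivial


def transmission_curve(pebble_counts):
    """Map each cutoff to DOF_AB.

    `pebble_counts` is a dict {cutoff: (Amax, Bmax, ABmax)}; in practice
    these counts would come from a FIRST run at each cutoff.
    """
    return {cut: transmitted_dof(*counts) for cut, counts in pebble_counts.items()}


cutoffs = energy_cutoffs()
print(len(cutoffs), cutoffs[0], cutoffs[-1])

# Hypothetical counts: sites inside one rigid cluster, then a flexible case.
example = {-1.0: (6, 6, 6), -3.0: (20, 30, 40)}
print(transmission_curve(example))
```

The first example entry corresponds to both sites lying in a single rigid cluster, where each count reduces to the six rigid body motions and DOF_AB is zero; the second gives a positive DOF_AB, which is the case interpreted above as allosteric transmission.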

Molecular dynamics simulations

Molecular models for simulations were constructed using chains C/D from the full-length wild-type apo structure of FAcD (PDB: 3R3U). Six residues missing in the model were added at the N-terminus of chain C to obtain a dimer of the full-length sequence (numbered residues -1 to 300). All titratable residues and termini were in the standard protonation state for simulations at pH 7. An apo model of the enzyme was prepared in addition to a substrate-bound model with FAc placed in the active site of protomer A, using the orientation of FAc from the D110N-FAc structure (PDB: 3R3V). All crystallographic water molecules were retained, such that the final rhombic dodecahedron simulation cell consisted of the protein dimer, 27,234 water molecules, 135 Na+ ions, and 121 Cl- ions (as well as FAc where it was included). The protein was modeled with the amber99sb-ildn force field (182) with TIP3P (183) water. AMBER-compatible parameters for FAc were generated using the acpype/Antechamber (184, 185) parameterization protocol, resulting in the following partial atomic charges: CT 0.116700, C 0.877602, O2 -0.835501, F -0.304700, H1 -0.009300.
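A quick consistency check on the listed FAc charges is to confirm that they sum to the formal charge of the anion. The sketch below assumes the atom counts of fluoroacetate (FCH2COO⁻: one CT, one C, two O2, one F, two H1); these counts and the variable names are assumptions made for illustration and are not stated in the text.

```python
# Partial charges per atom type, as listed in the text.
FAC_CHARGES = {
    "CT": 0.116700,
    "C": 0.877602,
    "O2": -0.835501,
    "F": -0.304700,
    "H1": -0.009300,
}

# Assumed number of atoms of each type in fluoroacetate (FCH2COO-).
FAC_ATOM_COUNTS = {"CT": 1, "C": 1, "O2": 2, "F": 1, "H1": 2}


def net_charge(charges, counts):
    """Sum of partial charges weighted by the number of atoms of each type."""
    return sum(charges[atom_type] * counts[atom_type] for atom_type in charges)


total = net_charge(FAC_CHARGES, FAC_ATOM_COUNTS)
print(f"net charge of FAc: {total:.6f} e")
```

Under these assumed atom counts the charges add up to −1 e, as expected for the carboxylate form of the substrate.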

All energy minimization, equilibration, and production stages of the simulations were conducted using GROMACS 4.6.3 (186), with all MD runs performed with a time step of 2 fs. Each system was subjected to 2,000 steps of steepest descent energy minimization to remove unfavorable energy contacts. Equilibration was performed in two successive blocks of simulation length 5 ns in the NVT and NPT statistical ensembles, where protein and FAc heavy atoms were restrained with a force constant of 1,000 and 500 kJ mol⁻¹ nm⁻¹, respectively. For production simulations, twenty separate simulation repeats were then constructed for the apo and FAc-bound systems with randomized initial velocities of all atoms at a temperature of 300 K. These simulations were performed in the NPT ensemble at a temperature of 300 K with the Nosé-Hoover thermostat (187, 188) (collision frequency of 0.5 ps⁻¹) and at a pressure of 1 atm using the Parrinello-Rahman barostat (189) (coupling constant of 2 ps). Lennard-Jones interactions were evaluated using a group-based cutoff for separation distances of 1.2 nm. Coulomb interactions were calculated using the smooth particle-mesh Ewald method (190, 191) with a real-space cutoff of 1.2 nm and a Fourier grid spacing of 0.16 nm. The non-bonded pair-list was updated every 10 fs. All covalent bonds were constrained with SETTLE (192) and P-LINCS (193) for water and other molecules, respectively.

Each of these twenty simulation repeats was run for 1,000 ns, for a total of 20 microseconds per system. However, only eighteen of the twenty FAc-bound simulations were utilized for analysis as FAc escaped the binding pocket in two simulations. All production data before 300 ns were discarded from each repeat to account for equilibration.

Normalized mutual information was calculated between all protein torsion angles (backbone dihedral φ and ψ angles, and side chain χ angles where they were available) from all production simulation frames sampled at a 1 ns interval. Dihedral angles and mutual information were computed using the MDTraj (194) and MDEntropy (195) libraries. The average of the mutual information computed from 30 iterations of scrambled data was subtracted from the original mutual information matrix to improve statistical significance. Principal components were computed using the g_covar and g_anaeig tools included with GROMACS (186), using the Cα, C', N, and O atoms of all residues after Leu4 of each protomer. Molecular renderings of principal components were generated using VMD (196).
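The scrambled-data correction can be illustrated with a small NumPy sketch for a single pair of dihedral time series. This is not the MDEntropy implementation: it assumes a simple histogram estimator, normalizes by the smaller of the two marginal entropies (the normalization used in the thesis is not specified here), and uses hypothetical function names, bin counts, and synthetic angles.

```python
import numpy as np


def _entropy(p):
    """Shannon entropy (nats) of a discrete probability array."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))


def normalized_mi(x, y, bins=24):
    """Histogram estimate of mutual information between two angle series
    (radians, in [-pi, pi]), divided by the smaller marginal entropy."""
    edges = np.linspace(-np.pi, np.pi, bins + 1)
    joint, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    pxy = joint / joint.sum()
    hx = _entropy(pxy.sum(axis=1))
    hy = _entropy(pxy.sum(axis=0))
    mi = hx + hy - _entropy(pxy.ravel())
    denom = min(hx, hy)
    return mi / denom if denom > 0 else 0.0


def baseline_corrected_mi(x, y, n_shuffles=30, bins=24, seed=0):
    """Subtract the mean MI of scrambled data from the raw MI."""
    rng = np.random.default_rng(seed)
    raw = normalized_mi(x, y, bins)
    null = np.mean(
        [normalized_mi(x, rng.permutation(y), bins) for _ in range(n_shuffles)]
    )
    return raw - null


# Synthetic example: one coupled and one independent pair of angles.
rng = np.random.default_rng(1)
phi = rng.uniform(-np.pi, np.pi, 5000)
coupled = (phi + rng.normal(0.0, 0.2, 5000) + np.pi) % (2 * np.pi) - np.pi
independent = rng.uniform(-np.pi, np.pi, 5000)
print(baseline_corrected_mi(phi, coupled, bins=12))
print(baseline_corrected_mi(phi, independent, bins=12))
```

Shuffling one series destroys any coupling while preserving its marginal distribution, so the mean over the shuffles estimates the finite-sampling bias of the estimator; subtracting it leaves values near zero for uncorrelated pairs.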


For the projection of all crystal structure dimers onto principal components 1 and 2 in Figure 2.3C, all Cα, C', N, and O atoms needed to be resolved. The software MODELLER (197) was used to construct 30 homology models for every structure (38 soak structures, WT-apo, D110N-FAc, D110N-ClAc, D110N-GOA, D110N-apo, H280N 1 ligand, and H280N 2 ligand) without modifying the parts of the protein that were resolved; the best-scoring model for each crystal structure was selected for Figure 2.3C.

2.4 Supplementary Text

2.4.1 FAcD as a model to study irreversible enzyme catalysis

Most NMR studies on enzymes have been conducted on enzymes with reversible catalytic reactions, because product formation during the course of the experiments is then not a concern. Here, we were able to study irreversible enzyme catalysis owing to the slow kcat (1.84 min⁻¹ for FAc) of wild-type FAcD. The slow kcat provides enough time to perform NMR experiments without forming a significant amount of product. Isothermal titration calorimetry (ITC) shows that fluoroacetate and glycolate have comparable binding constants to FAcD and 5F-Trp enriched FAcD (Figure 2.S2C and D), although the enthalpy and entropy terms associated with binding are different. Moreover, the catalytic rate of 5F-Trp enriched FAcD is 55 % that of FAcD (Figure 2.S2E and F), likely due to the altered electrostatics of W156, which plays a key role in catalysis by forming a hydrogen bond with the halogen and is postulated to stabilize the leaving halide ion (Figure 2.S13D).

2.4.2 Rigidity-based allosteric transmission

All three structures (apo, Michaelis complex with FAc, and glycolate-bound) exhibit rigidity-based allosteric communication, as shown in Figure 2.6. Initially, in the presence of many weak hydrogen bonds (i.e. at a less negative energy cutoff), the protein is very rigid, the two sites are part of a single rigid cluster, and no rigidity-based communication is observed. As the energy cutoff is gradually lowered (i.e. weak hydrogen bonding constraints are broken), transmission of degrees of freedom begins to appear. In comparison to the apo and glycolate-bound structures, rigidity-based allostery in the Michaelis complex structure with FAc begins at a substantially higher (i.e. less negative) energy cutoff, around -1.6 kcal/mol, and peaks around -3 kcal/mol. In contrast, transmission of degrees of freedom in the apo and glycolate-bound structures only starts around -2.1 kcal/mol and peaks at about -3.4 kcal/mol and -3.6 kcal/mol, respectively. In the Michaelis complex structure with FAc, the entire transmission curve is thus shifted to the right, towards less negative energy cutoffs, suggesting a lower activation energy barrier compared to the apo and the glycolate-bound structures.


Figure 2.S1 - Reaction coordinate pathway of FAc hydrolysis with or without FAcD. The C-F bond energy in FAc is 96.6 kcal/mol (48). The activation energy barrier of the enzyme-catalyzed reaction is 2.7 kcal/mol (48). Upon formation of the Michaelis complex, W156, H155, and Y219 form hydrogen bonds with the leaving halogen and facilitate the initial SN2 nucleophilic substitution. After the formation of the covalent ester intermediate, hydrolysis occurs via H280 activation of a tightly bound water molecule to generate product. Note that the catalytic reaction occurs only in one protomer at a time while the other protomer changes its conformation and dynamics as the reaction proceeds.


Figure 2.S2 - Crystal structure of 5F-Trp labeled FAcD and the comparison of its catalytic rate and ligand binding to that of wild-type FAcD. (A) Superimposed structures of 5F-Trp labeled (cyan) and wild-type apo FAcD (yellow). With an RMSD of 0.14 Å for all backbone atoms, the structures are essentially identical. W156 is colored red. (B) Detail of the 2Fo-Fc electron density map at 1.54 Å resolution contoured at 1σ and the atomic model of 19F-W156 in (A) and its immediate environment. (C) ITC titrations, showing both the raw data (upper panel) and the isotherm (lower panel), for BrAc and glycolate with wild-type FAcD and 5F-Trp enriched FAcD at 35 °C. (D) ITC binding parameters for BrAc and glycolate to wild-type and 5F-Trp enriched FAcD. Note that the introduction of 5-fluoro-tryptophan causes no detectable perturbations to the FAcD X-ray crystal structure. However, ITC reveals that the entropy terms appear to drive binding in the case of 5F-Trp enriched FAcD. Due to the low c values (c < 2) of these ITC measurements, stoichiometry parameters (N), which varied between 0.5 and 1, are not reported in this table. Asymmetric binding of ligands to one protomer at a time is shown by linear NMR chemical shift perturbations (Figure 2.2B and Figure 2.S8E) and crystallographic data (Figure 2.S4). (E) Concentrations of FAc and F⁻ ion were measured by 19F NMR over 1044 seconds for protein (50 μM) in the presence of a saturating concentration of FAc (15 mM). This graph shows the effect of 5F-Trp incorporation on the enzymatic reaction. (F) Catalytic turnover rates of wild-type and 5F-Trp enriched FAcD. The catalytic turnover rate of 5F-Trp enriched FAcD is 55 % of that of wild-type FAcD. This is likely due to changes in the electronic configuration of W156, given that this residue forms a hydrogen bond with the halide of the substrate and is postulated to stabilize the leaving halide ion. Errors are calculated based on triplicates.


Figure 2.S3 - 15N CPMG relaxation dispersions of BrAc-bound FAcD acquired at 800 MHz. (A) Constant R2,eff values with varying νCPMG suggest the absence of 15N CPMG dispersions in key residues involved in catalysis and allosteric communication. This is likely due to their relatively small ΔωNH and fast kex between different states. (B) While most residues do not show 15N CPMG relaxation dispersions (A), four of 233 peaks (A198, D173 and unassigned peaks X1 and X2) display 15N CPMG relaxation dispersions. A global fit of these 15N CPMG dispersion profiles to a two-state model as described in (132) results in a kex of 1080 ± 140 s⁻¹.


Figure 2.S4 - 2Fo-Fc electron density maps of FAcD reaction steps. All electron densities are contoured at a 1σ cut-off. (A) D110N mutant co-crystallized with 100 mM FAc (5SWN). (B) D110N mutant co-crystallized with 100 mM ClAc (5K3B). In both cases a Michaelis complex is produced and displays ligands only in the protomer A active sites. Protomer B active sites are each occupied by a water molecule and a chloride ion. (C) Mutant H280N, when co-crystallized with 1 mM of ClAc substrate, generates the covalent ester intermediate (H280N, 5K3F), but again only in one protomer. (D) Mutant D110N co-crystallized with glycolate shows the product bound to protomer A only (5K3E). (E) At high concentrations of substrate, however, the covalent ester intermediate can be seen in the active sites of both protomers.


Figure 2.S5 - Fo-Fc difference Fourier maps after SVD. Positive and negative electron densities are contoured at 5σ in green and red, respectively. Electron density changes indicate atomic motions from the negative density to the positive. (A) A map of protomer A recomposed from the eight most significant components. A linear combination of the first eight decomposed components closely reproduces the original experimental map. Some of the coefficient sets used for this linear combination are plotted in Figures 2.1E, 2.S6B, and 2.S6C. (B, C) The first component produced by SVD. Two orthogonal views are shown around H210. (D) The second component produced by SVD. Significant density changes are present in the active site. A meaningful interpretation of these changes, however, is not readily achieved. (E) The third component of SVD. (F) The fourth component of SVD. Difference electron densities are widespread. This view shows the signals associated with helix 8 that stand out from the noise level.


Figure 2.S6 - SVD and SPARTA+ analyses of X-ray crystallographic time series experiments. (A) Difference distance matrix between protomers A and B in the apo FAcD crystal structure. The lower triangular portion contains the difference in the intra-protomer distances. (B, C) Each panel displays right singular vectors (RSV) (vk) weighted by their singular values (wk); both panels represent 2-dimensional projections of an 8-dimensional space. Each dot represents a difference Fourier map in one protomer. The size of the dots corresponds to soaking time. Each gray line connects the two protomers of one dimer. The correlated structural changes between the two protomers of a dimer are evidenced by the concerted displacements of red and blue dots in these dimensions. (D) Structural differences between the protomers during catalysis. The backbone chemical shifts corresponding to the 38 soaked crystal structures were predicted by SPARTA+. The chemical shift differences between protomers were averaged over the 38 structures to show the structural changes during catalysis.


Figure 2.S7 - Singular values from SVD analysis of all difference Fourier maps. When the singular values are ranked by size and plotted, a break in the graph becomes obvious. Only the eight highest-ranked singular values (red dots) make significant contributions, and a linear combination of the corresponding decomposed components closely reproduces the original experimental map shown in Figure 2.S5A.
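The truncation to eight components can be illustrated with a generic NumPy sketch. It assumes that each difference map has been flattened into one column of a matrix (voxels × maps); the matrix size, the synthetic data, and the function names are hypothetical and do not reproduce the software used for the analysis in this chapter.

```python
import numpy as np


def svd_decompose(maps):
    """SVD of a (n_voxels, n_maps) matrix whose columns are difference maps."""
    u, s, vt = np.linalg.svd(maps, full_matrices=False)
    return u, s, vt


def recompose(u, s, vt, k=8):
    """Rank-k reconstruction from the k largest singular values."""
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def weighted_coefficients(s, vt, k=8):
    """Right singular vectors weighted by their singular values;
    column j holds the k coefficients of map j."""
    return s[:k, None] * vt[:k, :]


# Synthetic data: 76 maps built from 8 underlying components plus weak noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 76))
maps = signal + 0.01 * rng.normal(size=(500, 76))

u, s, vt = svd_decompose(maps)
approx = recompose(u, s, vt, k=8)
rel_err = np.linalg.norm(maps - approx) / np.linalg.norm(maps)
print(s[:10].round(2), rel_err)
```

In this synthetic case the ranked singular values show a clear break after the eighth value, and the rank-8 reconstruction differs from the input only at the level of the added noise, which mirrors the reasoning used to select the significant components.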


Figure 2.S8 - NMR experiments to detect protein dynamics and spectral comparisons between BrAc and ClAc. (A) 19F CPMG relaxation dispersions of additional tryptophans in the apo state performed at 500 MHz with a constant time of 8 ms. Peaks A and B are denoted in Figure 2.S8E. (B) Magnified view of the 19F CPMG relaxation dispersions of W156 in the apo state and in the presence of product, glycolate, shown in Figure 2.2D. (C) Overlay of the 19F NMR spectrum of apo FAcD with those obtained in the presence of saturating concentrations of BrAc and ClAc indicates that the spectral changes upon addition of substrate analog and substrate are very similar. (D) Overlaid 15N,1H HSQC spectra of FAcD in the presence of saturating concentrations of BrAc and ClAc, a weaker but true substrate, confirm that only very minor differences exist between these two species. (E) 19F NMR spectra of a BrAc titration series recorded at 50 °C. BrAc:protein ratios range from 0 to 50. (F) Saturation transfer experiments show that the exchange between the substrate-bound minor states is nonexistent or slower than T1 (1.3 s). In two separate experiments, one of the minor states was saturated with selective continuous wave irradiation and the intensities of the other minor state were measured with varying saturation times, and vice versa. The saturation transfer profiles of the two experiments were averaged. All NMR spectra were acquired at 50 °C.


Figure 2.S9 - Amide peak assignments of 15N,1H HSQC spectra and chemical shift perturbations upon titration with the substrate analog BrAc. (A) 15N,1H HSQC spectrum of partially deuterated 15N-labelled FAcD at 50 °C. Each amide signal is labeled with the corresponding residue name and number. Assignments have been obtained for about 80% of peaks. (B) A full spectral view of the titration series of 15N,1H HSQC spectra with varying concentrations of BrAc at 50 °C. (C) Chemical shift perturbations upon titration with BrAc. The dimer interface region, substrate binding site, and cap domain exhibit large changes in their chemical shifts. The dimer interface region and cap domain also show significant heterogeneities between protomers in SPARTA+ predictions (see Figure 2.S6D) and difference distance matrix analysis. (D) Correlation between SPARTA+-predicted chemical shift differences between protomers A and B and 15N,1H HSQC chemical shift perturbations resulting from the substrate analog titration. Residues near the catalytic pocket are excluded from the correlation plot because SPARTA+ does not account for the chemical shift perturbations caused by the close presence of BrAc. Only residues with SPARTA+-predicted chemical shift differences larger than 0.075 ppm were used to generate the correlation plot.

Figure 2.S10 - Cross correlation of residues from MD simulations. Cross correlation of residue pairs using mutual information for FAcD in the apo state and FAc-bound state. High mutual information (black) indicates correlated conformational changes of an intra or inter-protomer residue pair. Below the matrix diagonal, yellow shading is added to indicate regions predicted to exhibit conformational changes from NMR and FTX crystallography.


Figure 2.S11 - Crystal lattice of FAc-bound FAcD. Crystal lattice view of the FAc-complex of FAcD in space group P21 with protomer A (substrate bound) in red, and protomer B in blue. All crystals investigated for this report are isomorphous to this one with conserved packing arrangements and crystal contacts. The central dimer is shown with B-factor representation to allow better correlation to other figures.


Figure 2.S12 - 19F NMR spectra monitoring minor peaks. High signal-to-noise 19F NMR spectra of wild-type FAcD (A) in the absence of ligand (S/N of 3300) and (C) in the presence of BrAc (S/N of 1900) or (E) glycolate (S/N of 3300). (B, D, F) Magnified views of the corresponding dashed boxes in (A, C, and E) show that minor peaks are located over the whole spectrum, suggesting that the presence of the minor state causes small changes, but on a global scale.


Figure 2.S13 - 19F NMR spectra to confirm the W156 assignment. (A) 19F NMR spectra of wild-type (black) and the W156H mutant (magenta). (B) Magnified view of the dashed box in (A) shows the loss of both minor peaks and the major peak of W156 in the W156H mutant, confirming the residue assignment. (C) Overlay of 19F NMR spectra of wild-type FAcD without ligand (maroon) and in the presence of a saturating concentration of glycolate (cyan). W156 and W185 show large chemical shift perturbations, in keeping with the fact that glycolate binds close to these residues (see inset). (D) Overlay of 19F NMR spectra of the W156H mutant without ligand (maroon) and in the presence of a saturating concentration of glycolate (cyan). The W185 peak exhibits a significant chemical shift perturbation upon binding of glycolate, whereas the remaining resonances are essentially unchanged, confirming the assignment of the downfield resonance as W156. (E) A titration series of 19F NMR spectra acquired for ClAc:H280N mutant ratios ranging from 0 to 30 at 50 °C. The spectra exhibit slow exchange behavior as ClAc reacts with D110 in the catalytic site. W156, W185 and the peak at around -127.0 ppm show large chemical shift perturbations. Note that the H280N mutation shifts W267 and W264 in the absence of ligand, as did W156H, shown in panels (A) and (C). (F) Overlay of 19F NMR spectra of wild-type FAcD without NaCl present (black) and in the presence of 500 mM NaCl. (G) Close-up of (C). The downfield minor peak (*) disappears upon addition of NaCl. This is consistent with the crystal structure determined in the presence of NaCl, where the minor conformation of Y141 is not present in protomer A (Figure 2.S14F). Based on the combined crystallographic and 19F NMR findings, the downfield and upfield minor peaks are assigned as originating from protomer A (whose minor conformer is poised for substrate binding) and protomer B (whose catalytic site is empty), respectively.


Figure 2.S14 - Crystal structures of states along the FAcD reaction coordinate. Panels on the left display close-ups of the FAcD active site. The electron densities corresponding to residues Tyr141, Trp156, and Asp110, the glycolyl ester, and bound ligands (as sticks) are shown against the background of a ribbon model. Panels on the right-hand side show superpositions of the apo form and the respective states. The same three residues and ligands are shown as sticks. Crystal structures of (A) apo (yellow), (B) Michaelis complex with ClAc (orange), structures of covalent intermediates with (C) one (purple) and (D) two protomers (gray) displaying the covalent glycolyl ester, (E) glycolate complex (cyan), and (F) Cl⁻ (green spheres) complex (green). 2Fo-Fc electron densities are contoured at varying σ values to best display structural states with very low occupancy.


Table 2.S1 - Crystallography statistics.

Structure | D110N-apo | H280N 1 ligand | D110N-FAc | D110N-ClAc | D110N-GOA | H280N 2 ligands | WT-F-Trp | WT-apo
PDB codes as listed in the header: (5K3D) (5T4T) (5SWN) (5K3B) (5K3E) (5K3C) (5K3F) (5K3A)
Data collection
Wavelength | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21 | P21
Cell dimensions a, b, c (Å) | 41.9, 78.9, 85.3 | 41.8, 79.6, 85.1 | 41.9, 78.9, 85.2 | 41.8, 78.9, 84.9 | 41.9, 79.0, 85.0 | 41.8, 79.1, 85.0 | 41.9, 79.5, 85.2 | 41.8, 79.7, 85.2
Cell dimensions α, β, γ (°) | 90.0, 103.7, 90.0 | 90.0, 103.1, 90.0 | 90.0, 103.2, 90.0 | 90.0, 103.3, 90.0 | 90.0, 103.2, 90.0 | 90.0, 103.2, 90.0 | 90.0, 103.02, 90.0 | 90.0, 102.5, 90.0
Resolution (Å) | 82.7 (1.51) | 36.8 (1.54) | 82.7 (1.54) | 18.8 (1.58) | 19.8 (1.54) | 19.8 (1.51) | 19.2 (1.54) | 33.9 (1.45)
Rsym or Rmerge (%) | 3.9 (28.2) | 2.5 (21.4) | 3.4 (28.5) | 3.1 (25.2) | 2.4 (17.0) | 3.3 (23.4) | 2.1 (21.4) | 5.5 (29.3)
I/σI | 24.7 (3.5) | 19.9 (3.5) | 30.7 (6.1) | 18.9 (3.5) | 24.0 (5.5) | 17.8 (2.7) | 25.2 (4.6) | 49.7 (3.7)
Completeness (%) | 91.3 (73.5) | 92.7 (89.9) | 99.1 (93.2) | 98.7 (96.0) | 99.5 (95.5) | 89.5 (70.6) | 96.9 (86.4) | 99.9 (99.7)
Redundancy | 3.6 (2.0) | 1.8 (1.7) | 4.6 (3.7) | 1.9 (1.8) | 2.0 (2.0) | 1.9 (1.6) | 1.9 (1.8) | 3.2 (1.6)
CC1/2 | 99.9 (88.1) | 99.9 (88.1) | 99.9 (94.0) | 99.9 (85.6) | 99.8 (99.5) | 99.9 (85.7) | 99.9 (93.1) | 99.9 (92.9)
Refinement
Resolution (Å) | 82.7-1.51 | 36.8-1.54 | 40.8-1.54 | 18.8-1.58 | 19.8-1.54 | 19.8-1.51 | 19.2-1.54 | 33.9-1.45
No. reflections | 76,737 | 74,512 | 78,560 | 72,586 | 78,608 | 75,208 | 77,192 | 96,439
Rwork / Rfree (%) | 16.7/20.4 | 17.9/22.2 | 16.9/19.4 | 16.4/19.6 | 16.6/20.2 | 16.8/21.1 | 17.4/21.1 | 18.3/21.5
No. atoms, protein | 4,723 | 4,703 | 4,689 | 4,718 | 4,695 | 4,706 | 4,664 | 4,736
No. atoms, ligand/ion | 16 | 1 | 7 | 7 | 7 | 1 | 3 | 0
No. atoms, water | 659 | 715 | 530 | 569 | 648 | 712 | 573 | 604
Average B-factor, protein | 13.0 | 14.4 | 15.3 | 15.8 | 14.8 | 13.1 | 15.4 | 14.2
Average B-factor, ligand/ion | 15.9 | 23.2 | 21.4 | 18.5 | 16.1 | 18.7 | 16.8 |
Average B-factor, water | 23.5 | 24.1 | 24.6 | 24.5 | 24.9 | 23.2 | 24.2 | 25.1
R.m.s. deviations, bond lengths (Å) | 0.011 | 0.007 | 0.007 | 0.010 | 0.018 | 0.010 | 0.013 | 0.010
R.m.s. deviations, bond angles (°) | 1.15 | 1.15 | 0.961 | 1.31 | 1.79 | 1.39 | 1.59 | 1.38
Ramachandran favored (%) | 97.5 | 97.3 | 97.45 | 97.45 | 97.79 | 96.79 | 97.75 | 97.61
Ramachandran allowed (%) | 2.5 | 2.7 | 2.55 | 2.38 | 2.21 | 3.21 | 2.08 | 2.39
Ramachandran outlier (%) | 0.0 | 0.0 | 0.0 | 0.17 | 0.0 | 0.0 | 0.17 | 0.0
Rotamer outlier (%) | 0.42 | 0.64 | 0.21 | 0.63 | 1.06 | 0.43 | 0.43 | 0.63

Table 2.S2 - Crystallography statistics for FTX structures.

Structure | 0 sec | 2 sec | 5 sec | 5 sec (2) | 5 sec (3) | 7 sec | 9 sec
Data collection
Wavelength | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21
Unit cell | 41.8, 78.7, 85.1; 90, 103, 90 | 41.8, 79.1, 85.1; 90, 103.1, 90 | 41.9, 79.3, 85.2; 90, 103.1, 90 | 41.9, 79.4, 85.1; 90, 103.2, 90 | 41.8, 79.3, 85.1; 90, 103.1, 90 | 41.8, 78.2, 85.2; 90, 103.1, 90 | 41.9, 79.4, 85.1; 90, 103.1, 90
No. reflections | 71952 | 30975 | 33675 | 38797 | 33723 | 66719 | 38298
Multiplicity | 3.8 (3.3) | 3.8 (3.9) | 3.9 (3.9) | 4.0 (4.0) | 3.2 (2.5) | 3.5 (3.4) | 3.9 (4.0)
Completeness (%) | 90.51 (76.47) | 85.94 (98.75) | 84.68 (96.68) | 98.37 (96.39) | 85.52 (95.69) | 97.77 (97.13) | 97.07 (95.65)
Mean I/sigma(I) | 35.8 (10.6) | 21.7 (5.80) | 23.4 (6.53) | 35.0 (17.3) | 12.3 (6.09) | 33.3 (16.5) | 37.7 (16.5)
R-merge | 2.52 (11.5) | 4.43 (23.7) | 3.95 (17.1) | 2.82 (7.12) | 7.43 (15.9) | 2.74 (6.15) | 2.62 (7.77)
CC1/2 | 100 (98.3) | 99.8 (96.1) | 99.9 (97.3) | 99.9 (99.5) | 99.1 (96.7) | 99.9 (99.4) | 99.9 (99.4)
Refinement
Resolution range | 18.5 - 1.54 | 18.5 - 1.54 | 18.5 - 1.54 | 18.5 - 1.54 | 18.5 - 1.54 | 18.5 - 1.54 | 19.8 - 1.95
R-work (%) | 15.3 | 19.7 | 20.4 | 16.2 | 21.8 | 17.6 | 17.2
R-free | 17.7 | 23.7 | 22.6 | 20.9 | 25.4 | 20.0 | 20.8
Protein atoms | 4706 | 4552 | 4581 | 4640 | 4643 | 4609 | 4601
Water atoms | 899 | 353 | 546 | 605 | 500 | 669 | 624
RMS(bonds) | 0.006 | 0.006 | 0.004 | 0.005 | 0.003 | 0.005 | 0.004
RMS(angles) | 0.89 | 0.92 | 0.94 | 1 | 0.74 | 1.02 | 0.9
Ramachandran favored (%) | 97.44 | 96.52 | 96.85 | 97.05 | 96.86 | 96.88 | 96.7
Ramachandran allowed (%) | 2.22 | 2.79 | 2.8 | 2.43 | 2.61 | 2.43 | 2.61
Ramachandran outliers (%) | 0.34 | 0.7 | 0.35 | 0.52 | 0.52 | 0.69 | 0.7
Average B-factor, macromolecules | 12.4 | 28.3 | 20.3 | 12.4 | 13.6 | 18.6 | 13.7
Average B-factor, solvent | 26.1 | 33.4 | 27.6 | 22.9 | 20.3 | 30.9 | 24.9

Structure | 9 sec (2) | 10 sec | 10 sec (2) | 12 sec | 15 sec | 15 sec (2) | 20 sec
Data collection
Wavelength | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21
Unit cell | 41.9, 79.3, 85.1; 90, 103.1, 90 | 41.8, 78.3, 85.1; 90, 103.0, 90 | 41.9, 79.3, 85.1; 90, 103.1, 90 | 41.9, 79.3, 85.1; 90, 103.1, 90 | 41.8, 78.5, 85.1; 90, 103.2, 90 | 41.9, 79.1, 85.1; 90, 103.2, 90 | 41.8, 78.5, 85.1; 90, 103.2, 90
No. reflections | 38254 | 69204 | 25494 | 33366 | 38054 | 38334 | 74727
Multiplicity | 4.0 (3.8) | 3.5 (3.3) | 4.2 (4.2) | 3.9 (3.8) | 3.9 (3.9) | 3.7 (3.7) | 3.6 (3.1)
Completeness (%) | 96.78 (90.57) | 98.66 (99.49) | 93.44 (92.83) | 82.20 (90.68) | 97.61 (96.71) | 96.60 (89.76) | 94.58 (85.04)
Mean I/sigma(I) | 32.3 (14.2) | 32.6 (15.7) | 38.3 (16.4) | 21.1 (6.55) | 35.7 (13.6) | 37.0 (19.2) | 32.7 (6.07)
R-merge | 3.13 (8.36) | 2.76 (6.48) | 2.69 (8.10) | 4.55 (18.5) | 2.92 (9.13) | 2.56 (6.03) | 2.60 (21.7)
CC1/2 | 99.9 (99.2) | 99.9 (99.4) | 100 (99.4) | 99.9 (96.1) | 99.9 (99.1) | 99.9 (99.6) | 100 (96.4)
Refinement
Resolution range | 19.8 - 1.95 | 19.0 - 1.60 | 19.8 - 2.10 | 41.5 - 1.95 | 19.8 - 1.95 | 19.7 - 1.95 | 19.0 - 1.54
R-work (%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
R-free | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Protein atoms | 4602 | 4622 | 4558 | 4558 | 4622 | 4602 | 4677
Water atoms | 593 | 675 | 474 | 610 | 522 | 642 | 837
RMS(bonds) | 0.005 | 0.005 | 0.014 | 0.006 | 0.004 | 0.008 | 0.004
RMS(angles) | 0.99 | 1.06 | 1.7 | 0.94 | 0.94 | 1.16 | 1
Ramachandran favored (%) | 96.68 | 96.68 | 95.95 | 96.48 | 97.06 | 97.21 | 97.43
Ramachandran allowed (%) | 2.97 | 2.8 | 3.52 | 2.64 | 2.42 | 2.44 | 2.06
Ramachandran outliers (%) | 0.35 | 0.52 | 0.53 | 0.88 | 0.52 | 0.35 | 0.51
Average B-factor, macromolecules | 14.4 | 17.2 | 12.8 | 14.7 | 14.0 | 12.4 | 13.6
Average B-factor, solvent | 25.2 | 29.4 | 24.2 | 23.3 | 23.1 | 23.6 | 26.1


Structure | 20 sec (2) | 25 sec | 25 sec (2) | 30 sec | 30 sec (2) | 30 sec (3) | 30 sec (4)
Data collection
Wavelength | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21
Unit cell | 41.8, 78.4, 85.2; 90, 103.2, 90 | 41.8, 78.9, 85.0; 90, 103.2, 90 | 41.9, 79.3, 85.1; 90, 103.2, 90 | 41.8, 78.8, 85.1; 90, 103.2, 90 | 41.8, 78.3, 85.0; 90, 103.3, 90 | 41.9, 79.3, 85.1; 90, 103.2, 90 | 41.9, 79.2, 85.2; 90, 103.0, 90
Unique reflections | 77261 | 78328 | 80037 | 77221 | 72989 | 38527 | 38127
Multiplicity | 3.5 (2.8) | 3.5 (3.0) | 3.5 (2.8) | 3.4 (2.7) | 3.7 (3.0) | 4.8 (4.9) | 5.0 (5.0)
Completeness (%) | 98.30 (92.43) | 98.71 (92.91) | 96.05 (67.65) | 97.29 (87.35) | 92.68 (81.07) | 95.92 (87.18) | 95.68 (89.29)
Mean I/sigma(I) | 23.6 (4.75) | 32.4 (6.75) | 24.7 (5.93) | 25.6 (4.99) | 25.4 (5.92) | 37.3 (15.9) | 38.3 (13.6)
R-merge | 3.58 (21.9) | 2.54 (16.9) | 3.18 (24.8) | 3.23 (21.8) | 3.36 (19.3) | 2.92 (8.81) | 3.10 (11.4)
CC1/2 | 99.9 (94.9) | 100 (96.8) | 99.9 (96.2) | 99.9 (94.8) | 99.9 (95.9) | 99.9 (99.4) | 100 (99.1)
Refinement
Resolution range | 19.1 - 1.54 | 18.4 - 1.54 | 25.8 - 1.52 | 19.2 - 1.54 | 18.5 - 1.54 | 19.8 - 1.95 | 19.8 - 1.95
R-work (%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
R-free | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Protein atoms | 4637 | 4628 | 4619 | 4645 | 4595 | 4623 | 4666
Water atoms | 868 | 816 | 861 | 838 | 793 | 606 | 582
RMS(bonds) | 0.014 | 0.008 | 0.022 | 0.005 | 0.01 | 0.003 | 0.004
RMS(angles) | 1.58 | 1.28 | 2.09 | 1.13 | 1.37 | 0.89 | 0.89
Ramachandran favored (%) | 97.6 | 97.4 | 97.57 | 97.42 | 97.09 | 96.85 | 97.23
Ramachandran allowed (%) | 2.05 | 2.25 | 2.08 | 2.24 | 2.56 | 2.45 | 2.08
Ramachandran outliers (%) | 0.34 | 0.35 | 0.35 | 0.34 | 0.34 | 0.7 | 0.69
Average B-factor, macromolecules | 12.3 | 13.0 | 16.2 | 13.0 | 13.2 | 14.1 | 13.7
Average B-factor, solvent | 26.0 | 26.4 | 30.6 | 26.7 | 25.6 | 25.9 | 25.1


Structure | 30 sec (5) | 30 sec (6) | 30 sec (7) | 30 sec (8) | 45 sec | 45 sec (2) | 90 sec
Data collection
Wavelength | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21
Unit cell | 41.9, 79.1, 85.1; 90, 103.2, 90 | 41.8, 78.5, 85.1; 90, 103.2, 90 | 41.8, 78.5, 85.1; 90, 103.3, 90 | 41.8, 78.1, 85.0; 90, 103.3, 90 | 41.7, 78.8, 85.1; 90, 103.1, 90 | 41.8, 78.8, 85.1; 90, 103.2, 90 | 41.8, 78.8, 85.1; 90, 103.1, 90
Unique reflections | 37202 | 38936 | 37094 | 38665 | 74186 | 77445 | 78241
Multiplicity | 5.0 (5.1) | 4.9 (4.8) | 4.9 (4.9) | 4.8 (4.8) | 3.3 (2.4) | 3.5 (3.0) | 3.5 (3.0)
Completeness (%) | 94.49 (90.55) | 99.60 (98.79) | 95.20 (91.34) | 99.40 (97.55) | 93.99 (79.14) | 97.52 (90.43) | 98.71 (92.22)
Mean I/sigma(I) | 48.7 (25.3) | 43.6 (20.3) | 43.6 (24.0) | 31.3 (15.6) | 27.1 (3.2) | 30.3 (11.2) | 31.6 (5.51)
R-merge | 2.26 (5.38) | 2.53 (6.99) | 2.51 (5.67) | 3.56 (8.91) | 2.85 (29.7) | 2.85 (9.49) | 2.62 (20.2)
CC1/2 | 100 (99.8) | 100 (99.7) | 100 (99.7) | 99.9 (99.4) | 100 (89.1) | 99.9 (98.8) | 100 (95.6)
Refinement
Resolution range | 19.8 - 1.95 | 19.7 - 1.95 | 19.6 - 1.95 | 20.0 - 1.95 | 19.0 - 1.54 | 19.8 - 1.54 | 19.0 - 1.54
R-work (%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
R-free | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Protein atoms | 4642 | 4674 | 4645 | 4699 | 4659 | 4628 | 4664
Water atoms | 592 | 645 | 600 | 565 | 762 | 912 | 802
RMS(bonds) | 0.006 | 0.004 | 0.004 | 0.004 | 0.004 | 0.006 | 0.008
RMS(angles) | 1.06 | 0.96 | 0.91 | 0.99 | 0.96 | 1.15 | 1.26
Ramachandran favored (%) | 97.21 | 97.26 | 97.25 | 97.1 | 97.23 | 97.05 | 97.4
Ramachandran allowed (%) | 2.44 | 2.22 | 2.23 | 2.39 | 2.42 | 2.6 | 2.08
Ramachandran outliers (%) | 0.35 | 0.51 | 0.52 | 0.51 | 0.35 | 0.35 | 0.52
Average B-factor, macromolecules | 12.6 | 11.6 | 9.6 | 11.3 | 15.8 | 11.8 | 13.8
Average B-factor, solvent | 24.7 | 24.4 | 22.4 | 22.8 | 28.9 | 25.7 | 27.4

Parameter | 90 sec (2) | 90 sec (3) | 90 sec (4) | 120 sec | 120 sec (2) | 120 sec (3) | 120 sec (4)
Data collection
Wavelength (Å) | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21 | P21 | P21 | P21
Unit cell a, b, c (Å) | 41.8, 79.4, 85.1 | 41.8, 79.4, 85.1 | 41.8, 79.1, 85.1 | 41.9, 79.3, 85.1 | 41.9, 79.2, 85.0 | 41.9, 79.2, 85.1 | 41.8, 78.6, 85.1
Unit cell α, β, γ (°) | 90, 103.1, 90 | 90, 103.2, 90 | 90, 103.2, 90 | 90, 103.2, 90 | 90, 103.2, 90 | 90, 103.2, 90 | 90, 103.2, 90
Unique reflections | 38584 | 38782 | 38295 | 78213 | 37534 | 39098 | 37459
Multiplicity | 4.9 (4.9) | 4.9 (4.8) | 5.0 (5.0) | 3.5 (2.9) | 5.0 (5.0) | 4.8 (4.8) | 4.9 (4.9)
Completeness (%) | 97.16 (94.05) | 98.14 (96.69) | 96.80 (92.61) | 92.58 (73.32) | 95.18 (91.52) | 99.25 (98.13) | 95.77 (93.19)
Mean I/sigma(I) | 40.5 (16.1) | 56.2 (26.0) | 36.5 (13.2) | 26.2 (5.14) | 47.2 (21.3) | 41.1 (19.5) | 48.0 (23.0)
R-merge | 2.75 (8.96) | 1.91 (5.21) | 3.12 (11.4) | 3.04 (20.7) | 2.32 (6.55) | 2.58 (6.77) | 2.33 (6.05)
CC1/2 | 100 (99.4) | 100 (99.8) | 100 (99.1) | 99.9 (94.8) | 100 (99.7) | 100 (99.7) | 100 (99.7)
Refinement
Resolution range (Å) | 19.9 - 1.95 | 19.7 - 1.95 | 19.8 - 1.95 | 19.3 - 1.54 | 19.8 - 1.95 | 19.8 - 1.95 | 19.8 - 1.95
R-work (%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
R-free | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Protein atoms | 4602 | 4596 | 4636 | 4574 | 4656 | 4635 | 4634
Water atoms | 581 | 601 | 542 | 742 | 527 | 555 | 574
RMS(bonds) | 0.008 | 0.003 | 0.006 | 0.005 | 0.004 | 0.008 | 0.003
RMS(angles) | 1.16 | 0.84 | 1.04 | 1.09 | 0.96 | 1.19 | 0.84
Ramachandran favored (%) | 97.02 | 96.68 | 96.88 | 97.04 | 96.91 | 96.88 | 97.24
Ramachandran allowed (%) | 2.63 | 2.62 | 2.78 | 2.26 | 2.41 | 2.78 | 2.07
Ramachandran outliers (%) | 0.35 | 0.7 | 0.35 | 0.7 | 0.69 | 0.35 | 0.69
Average B-factor, macromolecules | 13.4 | 13.3 | 14.5 | 13.9 | 12.1 | 12.4 | 10.1
Average B-factor, solvent | 25.3 | 25.5 | 26.2 | 26.7 | 23.2 | 24.1 | 22.1

Parameter | 180 sec | 180 sec (2) | 180 sec (3) | 180 sec (4)
Data collection
Wavelength (Å) | 1.5418 | 1.5418 | 1.5418 | 1.5418
Scaling
Space group | P21 | P21 | P21 | P21
Unit cell a, b, c (Å) | 41.8, 78.8, 85.0 | 41.9, 78.7, 85.1 | 41.9, 78.6, 85.1 | 41.8, 78.1, 85.0
Unit cell α, β, γ (°) | 90, 103.2, 90 | 90, 103.2, 90 | 90, 103.3, 90 | 90, 103.1, 90
Unique reflections | 72865 | 37234 | 38376 | 37448
Multiplicity | 3.6 (2.9) | 5.0 (4.9) | 4.9 (4.8) | 4.8 (4.8)
Completeness (%) | 92.18 (80.70) | 95.16 (91.97) | 98.10 (95.71) | 96.33 (92.91)
Mean I/sigma(I) | 26.4 (3.83) | 61.4 (31.45) | 51.0 (27.3) | 49.8 (33.3)
R-merge | 3.18 (32.0) | 1.79 (4.38) | 2.13 (4.83) | 2.23 (3.73)
CC1/2 | 100 (91.5) | 100 (99.8) | 100 (99.8) | 100 (99.9)
Refinement
Resolution range (Å) | 18.4 - 1.54 | 19.8 - 1.95 | 19.7 - 1.95 | 19.7 - 1.95
R-work (%) | 0.0 | 0.0 | 0.0 | 0.0
R-free | 0.0 | 0.0 | 0.0 | 0.0
Protein atoms | 4638 | 4650 | 4676 | 4629
Water atoms | 761 | 622 | 626 | 602
RMS(bonds) | 0.005 | 0.006 | 0.006 | 0.004
RMS(angles) | 1.08 | 1.05 | 1.06 | 0.97
Ramachandran favored (%) | 97.08 | 96.9 | 96.57 | 97.07
Ramachandran allowed (%) | 2.41 | 2.75 | 3.09 | 2.58
Ramachandran outliers (%) | 0.52 | 0.34 | 0.34 | 0.34
Average B-factor, macromolecules | 15.2 | 10.7 | 11.3 | 10.0
Average B-factor, solvent | 28.3 | 23.1 | 23.6 | 22.7

Chapter 3

Concomitance of Substrate Inhibition and Allostery in Fluoroacetate Dehalogenase

Authors: Pedram Mehrabi1,2†, Tae Hun Kim3†, Adnan Sljoka5, Natasha Kruglyak6, Emil F. Pai1,2,6,7, R. Scott Prosser4,6

Affiliations: 1Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada. 2Ontario Cancer Institute/Princess Margaret Cancer Centre, Campbell Family Institute for Cancer Research, Toronto, ON, M5G 1L7, Canada. 3Program in Molecular Structure and Function, Research Institute, The Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada. 4Department of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, Ontario L5L 1C6, Canada. 5CREST, Japan Science and Technology Agency (JST), Department of Informatics, School of Science and Technology, Kwansei Gakuin University, Japan. 6Department of Biochemistry, University of Toronto, 1 King’s College Circle, Toronto, Ontario M5S 1A8, Canada. 7Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada. †These authors contributed equally to this work.

Statement of contribution: P.M. and T.H.K. designed and performed crystallographic and NMR experiments, respectively. Together they designed and performed the kinetic experiments and analyzed all the data. A.S. performed computational rigidity analysis. R.S.P. supervised NMR data collection and analysis. N.K. assisted with the refinement of the Y219F structures. E.F.P. supervised crystallographic experiments and analysis as well as editing of the manuscript. The manuscript was written by T.H.K., P.M., and E.F.P. with input by the other authors.

Abstract

Substrate inhibition of the bacterial homodimeric enzyme fluoroacetate dehalogenase (FAcD) is described by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. Enzyme kinetic measurements indicate that catalytic rates as a function of substrate concentration initially increase but start to decline upon addition of excess substrate. This phenomenon is reported for 20% of all enzymes (198). For FAcD, NMR measurements recapitulate these findings, displaying biphasic behavior of chemical shift and linewidth in key residues located in the catalytic site and in residues along its allosteric pathway between protomers. Using a catalytically slow mutant (Y219F), the decreased overall reaction velocity enabled us to trap the substrate in a second binding site, besides the active site, in the presence of high substrate concentrations. I tested a time series of soaking experiments extending from 3 min to 48 hours. After soaking times of longer than 24 hours, crystal structures showed the sole presence of product in the catalytic site with no secondary substrate molecule bound. Based on the crystal structure series, rigidity-based transmission allostery (RTA) analysis mapped an allosteric pathway using a window analysis technique, indicating reduced cross talk between the subunits. This study provides insight into how substrate inhibition can play a significant role in allostery and the dynamics of catalysis in a homodimeric enzyme.

3.1 Introduction

Enzymatic catalysis of chemical reactions is accomplished by lowering activation energy barriers (199). The space of all possible enzyme configurations generates a complex energy landscape that is governed by the enzyme’s chemical nature and dynamics (200). While structure provides the scaffolding for the catalytic reaction, the interplay of dynamics dictates how the enzyme navigates through the reaction coordinate pathway. Physically or chemically induced transformations can cause changes in the enzyme’s activity. Enzyme activities are tightly regulated because they are key in many cellular processes. This regulation is often accomplished through the binding of a ligand molecule at either allosteric or catalytic sites (201). In the standard Michaelis-Menten representation of enzyme kinetics, the curve describing the reaction rate as a function of substrate concentration asymptotically approaches Vmax (202). For some enzymes, however, elevated concentrations of substrate cause a decrease in the catalytic rate. Such enzymes are said to be susceptible to substrate inhibition. In the prevailing model for substrate inhibition, a second substrate molecule binds adjacent to the active site and disrupts the shuttling of substrate and product (203, 204). For simplicity of argument, most models of substrate inhibition use monomeric enzymes as examples.

There are, however, several examples of oligomeric enzymes that also show substrate inhibition (205–207). Their kinetic performance can be additionally complicated by allosteric behaviour, a regulatory mechanism found in many proteins that display quaternary structure. Recently, our lab engaged in combined crystallographic, NMR and computational studies designed to gain insight into how allosteric processes between protomers play a key role in catalysis in a homodimeric enzyme. The results were somewhat of a surprise not only when crystallography revealed half-of-the-sites reactivity but also when kinetic analysis by 19F NMR indicated substrate inhibition (47).

To study this substrate inhibition further and to refine our understanding of how allostery may play a role in homodimeric enzymes in general, I used fluoroacetate dehalogenase (FAcD) from Rhodopseudomonas palustris as the study object. FAcD catalyzes the hydrolysis of carbon-halogen bonds, specifically the C-F bond, generating glycolate, a halide ion and a proton as products. In previous studies, we determined the crystal structures of key catalytic steps, some of them stabilized by mutations of residues involved in catalysis. The Michaelis-Menten complex structures, determined by either co-crystallizing (47) mutant proteins or soaking them with the substrate, showed substrate bound at only one of the dimer’s active sites. Unlike in the previous study, where wild type FAcD was used in freeze-trapping X-ray crystallography (FTX), for practical reasons the Y219F mutant, with kcat = 0.035 min⁻¹, was used to explore substrate inhibition in the current experiments (47). The mutation decreased the overall reaction velocity sufficiently to allow extending the soaking time from seconds to a timescale of minutes to hours and made it possible to capture substrate binding and product generation as a function of time.

3.2 Methods and Materials

Expression and purification of FAcD

FAcD was prepared as described in Chapter 2.

Crystallization of FAcD and preparation of enzyme-ligand complexes

Single crystals of apo-FAcD were grown as described in Chapter 2. To collect diffraction data for the FAc and the ClAc complexes of Y219F FAcD, crystals were soaked in the presence of 100 mM fluoroacetate for 3 min to 48 hours or 100 mM chloroacetate for 30 min. Crystals of the glycolate complex of the Y219F mutant were obtained by soaking the crystals in mother liquor containing 100 mM FAc or ClAc and leaving them to incubate for a minimum of 24 hours prior to data collection.

Diffraction data collection

Data collection was performed at 100 K either on a Rigaku FR-E rotating anode with mirror optics and Saturn A200 detector or on a Rigaku 007 rotating copper anode with mirror optics and a Mar345 detector. Crystals were flash-frozen in liquid nitrogen using paratone N (Hampton Research, Aliso Viejo, CA, USA) as cryo-protectant.

X-ray data processing and refinement

Diffraction data were indexed, integrated and scaled using the software package XDS (170). Phases were calculated using the molecular replacement module of the Phaser package (171) with wild-type FAcD (PDB ID: 3R3U) as the search model. Model building and the first rounds of refinement were performed using WinCoot (172). Automatically picked water molecules were kept for further refinement when they were within 2.3-3.5 Å of a hydrogen acceptor or donor and the corresponding electron density was at least 1.5 e/Å³. Subsequent rounds of refinement used phenix.refine from the Phenix software suite (175).
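The water-selection criterion stated above can be illustrated with a minimal sketch. This is not the procedure implemented in WinCoot; the class `WaterCandidate`, the function `keep_water` and the example values are hypothetical and only restate the two thresholds given in the text (distance window of 2.3-3.5 Å and a density of at least 1.5 e/Å³).

```python
from dataclasses import dataclass


@dataclass
class WaterCandidate:
    name: str                      # hypothetical label for the picked water
    nearest_polar_distance: float  # Å, distance to the closest H-bond donor/acceptor
    peak_density: float            # e/Å^3 at the water position


def keep_water(w, d_min=2.3, d_max=3.5, min_density=1.5):
    """Return True if the water satisfies both thresholds quoted in the text."""
    in_distance_window = d_min <= w.nearest_polar_distance <= d_max
    return in_distance_window and w.peak_density >= min_density


# Invented example values, for illustration only.
candidates = [
    WaterCandidate("W1", 2.8, 2.1),  # kept: both criteria met
    WaterCandidate("W2", 4.0, 2.5),  # rejected: too far from any donor/acceptor
    WaterCandidate("W3", 3.0, 1.0),  # rejected: density too weak
]
kept = [w.name for w in candidates if keep_water(w)]
```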

NMR spectroscopy

Enzyme kinetic data were collected by acquiring either 19F or 1H NMR spectra to quantify substrate (FAc or ClAc) and product (fluoride ion or glycolate) concentrations as a function of time at 25 °C. The collected NMR spectra were analyzed with MestReNova (Mestrelab Research S.L., 11.0.4) and the enzyme kinetic data were fitted with a substrate inhibition model using Prism 6 (GraphPad Software). 19F NMR spectra of FAcD enriched with 5F-Trp were acquired with 1600 transients and a spectral width of 22000 Hz. The 19F NMR spectra were processed and analyzed with MestReNova. 15N,1H HSQC spectra were acquired using 80 increments, 25 transients, and a spectral width of 2340 Hz in the indirect dimension. The HSQC spectra were processed with NMRPipe and NMRViewJ software. All NMR spectra were obtained on a 600 MHz Varian Inova spectrometer equipped with a cryogenic probe capable of tuning to HCN and 19F. 1H, 15N, and 19F π/2 pulse widths were 10.8 μs, 44 μs, and 15 μs, respectively.

Computational Prediction of allosteric pathways

Rigidity-based allosteric communication:

To quantify the allosteric effect and map out the allosteric pathway between the substrate-binding region in one protomer and the other protomer, we applied a rigidity-transmission allostery (RTA) algorithm. This novel computational method is founded on extensions of the program FIRST (152–154) and is based on work in rigidity theory initially introduced in (156) and further discussed in (155). The RTA algorithm measures the effect of local mechanical propagation of rigidity across (distant) parts of the protein structure. In this work, we analyze how the perturbation of rigidity of the catalytic site gets transmitted to the second protomer.

Starting with an X-ray crystal structure, FIRST generates a constraint network, where the protein is modelled in terms of nodes (atoms) and edges (i.e. constraints representing covalent bonds, hydrogen bonds, electrostatic interactions, and hydrophobic contacts). Hydrogen bonds are ranked in terms of overall strength using the modified Mayo potential (152), whereupon a hydrogen bond cutoff energy value is selected such that all bonds weaker than this cutoff are ignored. FIRST applies the pebble game algorithm (153) which rapidly decomposes a resulting network into rigid clusters and flexible regions, enabling an evaluation of non-trivial degrees of freedom (DOF) throughout the protein. After generating outputs of the pebble game algorithm (FIRST) at a wide range of energy cutoffs, we apply the RTA algorithm to predict if local perturbation of rigidity at the catalytic site (mimicking ligand/substrate binding) propagates across the protein network and leads to a change in rigidity and conformational degrees of freedom in regions of the other protomer, hence resulting in allosteric transmission.

Equivalently, the presence of rigidity-based allostery means that a change in shape (conformation) at the catalytic site (i.e. mechanically changing the shape as binding might) would lead to rearrangement and change of shape in parts of the other protomer. Utilizing the RTA algorithm, prediction of the allosteric response for each residue in the empty protomer is calculated following rigidity perturbation of the catalytic site, enabling us to obtain a detailed mapping of the allosteric network communication pathway between the two protomers.

We used FIRST and the pebble game algorithm to compute conformational degrees of freedom for all small windows consisting of three consecutive residues in the empty protomer, before and after a perturbation of rigidity of the catalytic site. The residues in the empty monomer that undergo a change in internal degrees of freedom (i.e. propagation of rigidity - transmission of degrees of freedom) upon rigidity perturbation of the catalytic site form the allosteric pathway connecting the two protomers. Residues are colour-coded based on the amount of transmission of degrees of freedom.

Computation of rigidity-transmission allostery (RTA):

Missing hydrogen atoms were added to crystal structures with the WHAT IF web server (http://swift.cmbi.ru.nl/servers/html/htopo.html). For every hydrogen bond energy cutoff, starting at 0 kcal/mol (we progressively lowered the cutoff value in increments of 0.01 kcal/mol), the FIRST method (as previously described) was performed to generate the output of rigidity prediction and the pebble game algorithm.
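The cutoff sweep described above can be sketched as follows. This is only an illustration and not the FIRST implementation: the function names, the lower bound of the sweep and the example bond energies are hypothetical, and the sketch assumes the usual convention that hydrogen bond energies are negative, so that a bond "weaker than the cutoff" has an energy above it.

```python
def cutoff_series(lowest, step=0.01):
    """Energy cutoffs from 0 kcal/mol down to `lowest` (assumed bound) in `step` decrements."""
    n_steps = int(round(-lowest / step))
    return [round(0.0 - i * step, 2) for i in range(n_steps + 1)]


def retained_bonds(bond_energies, cutoff):
    """Keep hydrogen bonds at least as strong (as negative) as the cutoff."""
    return [e for e in bond_energies if e <= cutoff]


# Invented hydrogen bond energies in kcal/mol, for illustration only.
energies = [-0.5, -1.2, -3.0]
cutoffs = cutoff_series(-0.03)
strong_only = retained_bonds(energies, -1.0)
```

Lowering the cutoff removes progressively more of the weak hydrogen bonds from the constraint network, which is why the rigidity decomposition has to be recomputed at each step.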

We computed the number of degrees of freedom that can be transmitted (removed) from a window of 3 consecutive residues (r, r+2) in the empty monomer as a result of perturbation of the rigidity of the substrate-binding region (residues 110, 111, 114, 141, 155, 156, 185 and 219). Starting at the N-terminal end of the empty protomer, beginning with window (1, 3), then (2, 4) … etc, continuing to slide the window of 3 consecutive residues, we computed the transmission of degrees of freedom for every window. Initial perturbation of rigidity refers to insertion of additional constraints (edges) (removal of degrees of freedom) to the substrate-binding region up to its rigidification. When transmission of degrees of freedom occurs, it refers to any subsequent change in the number of degrees of freedom at the (distant) window in the empty protomer.
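The sliding-window enumeration described above can be sketched in a few lines. The function name `sliding_windows` and the residue range in the example are hypothetical; each window is written as the pair (first residue, last residue), matching the (r, r+2) notation of the text.

```python
def sliding_windows(first_residue, last_residue, size=3):
    """All windows of `size` consecutive residues, as (start, end) pairs."""
    last_start = last_residue - size + 1
    return [(r, r + size - 1) for r in range(first_residue, last_start + 1)]


# A short, invented five-residue stretch: windows (1, 3), (2, 4), (3, 5).
windows = sliding_windows(1, 5)
```

For a protomer of N residues this yields N − 2 windows, and the transmission of degrees of freedom is evaluated once per window and per energy cutoff.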

Transmission of degrees of freedom between the two sites was computed with the aid of the pebble game. Let the substrate-binding region be denoted as site A and the currently tested window as site B. For every energy cutoff, we calculate the available conformational degrees of freedom at site A, at site B and at the union of sites A and B. Using the output of the pebble game (FIRST), degrees of freedom counts are computed by calculating the maximum number of pebbles that can be gathered on nodes (atoms) belonging to site A, site B and finally the union of sites A and B. Given some fixed energy cutoff, we denote these counts as $A_{\mathrm{DOF}}$, $B_{\mathrm{DOF}}$, and $AB_{\mathrm{DOF}}$, respectively. In the pebble game algorithm, each pebble corresponds to a degree of freedom. The maximum number of degrees of freedom that can be transmitted from A to B (denoted as $\mathrm{DOF}_{AB}$) is finally calculated as

$$\mathrm{DOF}_{AB} = A_{\mathrm{DOF}} + B_{\mathrm{DOF}} - AB_{\mathrm{DOF}} - 6$$

Six is subtracted to neglect the trivial six degrees of freedom corresponding to rigid body motions. When $\mathrm{DOF}_{AB}$ is positive, sites A and B are involved in rigidity-based allosteric transmission and the maximum number of degrees of freedom that can be transmitted from A to B is $\mathrm{DOF}_{AB}$. This provides a quantifiable measure of allosteric communication between A and B. To observe allosteric transmission for some residue r, the transmission of degrees of freedom curve was generated by plotting the averaged $\mathrm{DOF}_{AB}$ for the three consecutive windows containing r (i.e. (r-2, r), (r-1, r+1) and (r, r+2)) as a function of the energy cutoff (Figure 3.S1). We calculated the intensity of allosteric transmission for each window of 3 consecutive residues by computing the area under the transmission of degrees of freedom curve. The intensity of allosteric transmission for residue r was obtained by calculating the average intensity of the three consecutive windows that contain the residue r (Figure 3.7A, B). The intensity of allosteric transmission takes into account not only the number of degrees of freedom that can be transmitted but also the persistence of the transmission as a function of energy strength.
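The bookkeeping described above can be sketched as follows. This is an illustrative sketch, not the RTA code itself: the function names are hypothetical, the pebble counts are assumed to be supplied by FIRST, the area under the curve is approximated with the trapezoidal rule (the text does not state the integration scheme), and residues near the chain termini are averaged over whichever of their three windows exist (also an assumption).

```python
def transmitted_dof(a_dof, b_dof, ab_dof):
    """DOF_AB = A_DOF + B_DOF - AB_DOF - 6 (six rigid-body DOF removed)."""
    return a_dof + b_dof - ab_dof - 6


def window_intensity(cutoffs, dof_ab_values):
    """Trapezoidal area under the DOF_AB versus energy-cutoff curve."""
    area = 0.0
    for i in range(1, len(cutoffs)):
        width = abs(cutoffs[i] - cutoffs[i - 1])
        area += 0.5 * (dof_ab_values[i] + dof_ab_values[i - 1]) * width
    return area


def residue_intensity(r, intensity_by_window):
    """Average intensity over the windows (r-2, r), (r-1, r+1), (r, r+2)."""
    windows = [(r - 2, r), (r - 1, r + 1), (r, r + 2)]
    present = [intensity_by_window[w] for w in windows if w in intensity_by_window]
    return sum(present) / len(present) if present else 0.0


# Invented numbers, for illustration only.
example_dof = transmitted_dof(10, 9, 11)
example_area = window_intensity([0.0, -1.0, -2.0], [0, 2, 4])
example_residue = residue_intensity(3, {(1, 3): 3.0, (2, 4): 6.0, (3, 5): 9.0})
```

Note that if sites A and B are mechanically independent, the union count equals the sum of the two site counts minus the six shared rigid-body degrees of freedom, so the formula returns zero, as expected when no transmission occurs.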

3.3 Results

3.3.1 High substrate concentrations lead to substrate inhibition

Most enzyme catalysis data are analyzed with Michaelis-Menten kinetics, where a hyperbolic curve is fitted to the experimental data of catalytic rate as a function of substrate concentration (208). It is reported that about 20% of all enzymes show some degree of substrate inhibition (198). Using 19F or 1H NMR techniques to measure concentrations of substrate and product in the FAcD-catalyzed reaction, one can show that rates start to decrease when the substrate concentration reaches ~170 mM for FAc or ~85 mM for ClAc (Figure 3.1). While FAcD hydrolyzes both substrates, the catalytic rate is ~5 times slower with ClAc in wt-FAcD (5). Unlike for other enzymes, such as phosphofructokinase and acetylcholinesterase (209, 210), which finally become completely inhibited, the rate curve for FAcD levels off towards very high substrate concentrations.

Figure 3.1 - Enzyme kinetics of FAcD indicative of substrate inhibition. Plots of catalytic rates vs. substrate concentrations, instead of asymptotically approaching maximal velocity, start to decline once substrate concentrations of ca. 200 mM FAc (A) or ca. 100 mM ClAc (B) are reached. The data were fitted using the substrate inhibition model described in the Methods and Materials section.
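For orientation, the shape of such a rate curve can be sketched with the standard substrate inhibition equation, v = Vmax·[S] / (Km + [S]·(1 + [S]/Ki)). That this exact form was the one used for the fits is an assumption, since the text only names "a substrate inhibition model"; the function names and parameter values below are hypothetical and are not the fitted FAcD constants.

```python
import math


def rate(s, vmax, km, ki):
    """Standard substrate inhibition rate law (assumed form)."""
    return vmax * s / (km + s * (1.0 + s / ki))


def peak_substrate(km, ki):
    """Substrate concentration at which this rate law is maximal: sqrt(Km * Ki)."""
    return math.sqrt(km * ki)


# Invented parameters (arbitrary units), for illustration only.
vmax, km, ki = 1.0, 10.0, 1000.0
s_peak = peak_substrate(km, ki)
v_peak = rate(s_peak, vmax, km, ki)
```

In this simple form the rate declines towards zero at very high substrate concentrations, so it describes the rise and the onset of inhibition but would not by itself reproduce a curve that levels off at a non-zero rate.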

3.3.2 In crystal structures of the Y219F mutant of FAcD one subunit binds two substrate molecules.

In a previous freeze-trapping X-ray crystallography study, wild type FAcD crystals were soaked with substrate (FAc) for 2 to 180 seconds in an attempt to capture enzyme intermediates along the reaction coordinate pathway (47). However, it was not possible to observe electron density changes indicative of clean substrate binding or generation of product in the catalytic site. Using the Y219F mutant with its significantly slower catalytic rate for FAc hydrolysis of kcat = 0.035 min⁻¹, it is possible to capture the substrate in the active site, e.g. by soaking crystals in buffer containing 100 mM FAc for 3 min. Varying the soaking time did not affect the result until we reached 24 hr, when we observed generation of product. Interestingly, two substrate molecules are found in one of the two protomers, while the other protomer remains empty, consistent with half-of-the-sites reactivity described previously (47) (Figure 3.2C-D and Figure 3.S5A-B). One of the substrate molecules is located in the catalytic pocket, assuming a position and conformation highly similar to that previously found for Michaelis-Menten complex-like structures (Figure 3.2B). The second FAc molecule is bound ~5.8 Å away from the first one, adjacent to the dimer interface and the cap domain and towards the entry of the channel leading to the active site (binding site 2; Figure 3.2A, D).

Also in contrast to previously identified structures, the electron density representing the cap domain is much better defined in the mutant structures (Figure 3.S4). One possible reason for this difference could be the FAc molecule at binding site 2 acting as an anchor to reduce the inherent mobility of this part of the protein chain. Its fluorine atom coordinates to the side chain of K152 (F-N: 2.9 Å) and the backbone of the cap domain residue A252 (F-O: 3.2 Å). The carboxylic oxygens of this second FAc molecule are bound to K152 and Y141, which are part of the interface helix that, with its counterpart in the other subunit, is critical for the allosteric communication between protomers (Figure 3.2B).

Figure 3.2 - Multiple binding sites of substrate molecules in the slow Y219F mutant reveal structural basis of substrate inhibition in FAcD. (A) A cross-section through FAcD protomer B displaying the access channel from the protein surface to the active site. In addition to FAc, smaller amounts of glycolate product are bound at the active site. An adjacent secondary binding site for FAc is also revealed, assuming a position that was previously occupied by the side chain of I253 of the cap domain. (B) LigPlots of the two substrate molecules showing the contacts between ligands and protein for each of the binding sites. FAc bound to the secondary site has only two hydrogen bonds linking its carboxylic oxygens to the side chains of Y141 and K152. The substrate molecule in the active site, however, undergoes five binding interactions, which leads to much stronger binding. (C and D) 2Fo-Fc electron density map contoured at 1σ around the end of the substrate access channel of FAcD. (C) In the empty protomer, only water molecules and a chloride ion can be seen, while the other protomer (D) holds the two substrate molecules at adjacent locations. Comparing panel C with panel D also shows how FAc displaces the side chain of I253.

3.3.3 After 24 hours soaking time, crystal structures of the Y219F mutant of FAcD show product at the active site.

Y219F crystals were soaked in substrate-containing solutions for varying times to characterize structural changes during catalysis. Due to the slow catalytic rate (kcat = 0.035 min⁻¹), the soaking times could be varied from 3 min to 48 hours to capture various intermediates along the reaction coordinate pathway. Crystal structures with soaking times of less than 24 hours showed two substrate molecules in the access channel as described above, but in addition electron density indicative of the product glycolate started to appear in the active site. Interestingly, crystal structures with soaking times beyond 24 hours show glycolate in a new orientation, rotated by 68 degrees and flipped along the yz-plane. The carboxylic oxygens are still coordinated to the two arginine residues R111 and R114, but the hydroxyl group is now interacting with the halide-binding pocket instead of R114 (Figure 3.3B and Figure 3.S3). This seemingly weaker binding position of glycolate in the Y219F mutant, not seen in the native protein, might well reflect the loss of the hydroxyl group (Tyr to Phe), leading to an altered shape and modified electrostatics of the active site.

Figure 3.3 - Product glycolate bound to active site in Y219F mutant after 24 hr soak. (A) 2Fo-Fc electron density map contoured at 1σ of the product glycolate after soaking Y219F crystals in FAc solution for 24 hours. (B) The orientation of glycolate in Y219F FAcD is shifted and rotated by roughly 68 degrees relative to the position glycolate assumes in the D110N mutant which stabilizes a Michaelis-Menten like state and in the shorter-time soaks where FAc is still bound in multiple sites in the Y219F mutant. (C) LigPlot showing the hydrogen bond network of glycolate in panel A.

3.3.4 Secondary binding is absent in structures derived after 24 hours soaking times

Crystal structures determined from crystals soaked for longer than 24 hours showed only glycolate in the catalytic site with no secondary FAc bound (Figure 3.3A). The sole presence of glycolate indicates that the majority of the substrate was turned over in the crystals (Figure 3.3A). The lack of secondary binding in the vestibule suggests that the affinity of glycolate for this binding site is insufficient to lead to stable complex formation. 19F NMR spectroscopy completely supports this interpretation of the absence of a second binding site.

3.3.5 19F NMR shows changes in protein conformation and dynamics when two substrates are bound

To investigate the protein conformation and dynamics during substrate inhibition, 5F-Trp-incorporating FAcD was studied using 19F NMR. The two substrates ClAc and IAc, the substrate analogue BrAc, and the product glycolate were titrated into a solution that contained the wild-type enzyme, and the chemical shift perturbations of the nine 19F-labeled tryptophan residues were monitored. W156, W185, W264 and W267 gave chemical shift perturbations upon addition of ligands (Figure 3.S6). Both the ClAc and IAc titrations exhibited biphasic chemical shift perturbations once the substrate concentration became ~50x higher than that of the enzyme (Figure 3.4 and Figure 3.S7). This pattern suggests that the second binding site is only occupied once the substrate concentration is high enough to almost saturate the catalytic pocket. Furthermore, line widths of peaks generated by W156 and W185 increased during the initial phase of titration but decreased once their chemical shifts reversed direction. As previously shown, changes in line width are associated with protein dynamics, with a decrease indicating reduced exchange of conformational dynamics information between protomers. In conclusion, 19F NMR substrate titrations show that a unique conformation with two substrate molecules bound to one protomer is stabilized in the presence of excess amounts of substrate and is accompanied by decreased inter-protomer dynamics.

Figure 3.4 – Shift in the main peak position of W156 with different ligands. 19F NMR titration curves of W156 with various ligands. The chemical shift is plotted as a function of ligand concentration (mM). A, B) Glycolate and bromoacetate are non-reactive ligands and exhibit typical hyperbolic curves. C, D) Chloroacetate and iodoacetate are both substrates and exhibit biphasic curves.

3.3.6 Linear response in titrations with substrate analogue BrAc and product glycolate

BrAc is not a substrate for FAcD (47). This is surprising given its position between ClAc and IAc in the haloacetate series, both of which are turned over by the enzyme. Because the carbon-bromine bond of BrAc is not attacked by FAcD, the compound is a most useful substrate analogue. Unlike what had been seen in the ClAc and IAc titrations, 19F NMR peaks obtained when titrating FAcD with BrAc exhibited the chemical shift perturbations and line widths typical for a 1:1 complex of FAcD and haloacetate (Figure 3.S7). The same result was obtained when the product glycolate was added instead of substrate; the linear chemical shift pattern indicative of a 1:1 complex was observed.

3.3.7 Aberrant biphasic changes in chemical shifts observed in NH HSQC spectra

To further characterize conformational changes connected to the binding of two substrate molecules and to test whether these changes extend over large parts of the protein, we titrated FAcD with ClAc to a concentration of 270 mM and acquired a series of NH HSQC spectra (Figure 3.5). As had been seen in the 19F NMR titrations, some peaks showed a biphasic behaviour indicative of a three-state model with conformational changes realized not only in the immediate environment of the bound ligands but on a more global scale (Figure 3.5B-D and Figure 3.S7). The most prominent changes were located in the interface region between the two protomers (Figure 3.5A), in close proximity to the secondary binding site. Moreover, the helices comprising this region are key components in allosteric communication between the two protomers (47). However, several residues residing outside of the interface region also exhibited biphasic behaviour, indicating that the binding of a second ClAc molecule induces a global change of the dynamic behaviour of FAcD.

Backbone chemical shifts of each protomer were predicted using SPARTA+ based on the coordinates of 11 crystal structures of the Y219F mutant obtained as part of the soaking time series. These chemical shifts were averaged over all 11 structures and the differences between the two protomers were mapped onto the Y219F crystal structure (Figure 3.6A). These predicted shifts have an ostensible similarity to the SPARTA+ analysis of the 38 FAc-soaked WT proteins (47). However, if the two SPARTA+ analyses mentioned above were truly equivalent, the mapped differences in Figure 3.6B, which represent the differences between the averaged differences of the wild-type and mutant structures, would be featureless. Because the areas where chemical shift differences appear are very similar, it is the magnitudes of the shifts that differ. The regions of the protein whose residues were noted to have biphasic chemical shift behaviour in NH-HSQC spectra (Figure 3.5A) are the same regions that have chemical shift differences predicted in the SPARTA+ analysis (Figure 3.6B, Figure 3.S2). This makes it highly probable that the differences seen between Y219F and wild-type SPARTA+ analyses, especially in the region around W185, are indicative of conformational changes induced by the binding of the second substrate molecule.

Figure 3.5 - HSQC titration of FAcD with ClAc also results in non-linear chemical shifts. (A) Residues that showed non-linear chemical shifts projected onto the FAcD fold and coloured magenta. Examples of non-linear shifts: (B) E167, (C) M145, and (D) S278. Pink traces represent measurements at 130 mM, red 182 mM, and green 268 mM ClAc. Arrows indicate the non-linear movement of the centres of the peaks.

Figure 3.6 - SPARTA+ analysis of differences between soaked Y219F structures and its comparison to the SPARTA+ analysis of soaked WT structures. A) Averaged backbone chemical shift differences predicted by SPARTA+ for protomers from eleven Y219F mutant crystal structures from various soaking times ranging from 30 s-24 hr. B) Predicted chemical shift differences between the average shift of 38 soaked WT structures and the average shift of 11 soaked Y219F structures.

3.3.8 Computational rigidity analysis for elucidating the pathway for allosteric information transfer

The X-ray and NMR studies above provide a detailed explanation for half-of-the-sites reactivity of FAcD. To probe allosteric contributions and map out an allosteric pathway, rigidity-based allosteric transmission analysis is employed. Using the computational analysis detailed in our previous study, degrees of freedom transmission (regions of flexibility or rigidity) is measured from the active site (region highlighted in green in Figure 3.7A) in a stepwise fashion across the entirety of the empty protomer, applying a small window of three residues (Figure 3.7A and Figure 3.S1). This yields an allosteric transmission intensity plot, which shows the amplitude of allosteric crosstalk with respect to residue number. The intensity of each residue is mapped onto the protein structure, which clearly reveals the dimer interface helix α6 (residues 146-160) as a crucial pathway for allosteric communication from one protomer to the other. We found that the region between residues 135-165 has a particularly high allosteric transmission when perturbing the catalytic site, with the highest allosteric transmission occurring in the region around residue I153 (Figure 3.7B). This region is the most crucial region to allosterically transmit conformational and rigidity changes to the empty protomer. It also contains residues involved in catalysis, halide binding (H155, W156), or conformational selection (Y141). Other residues involved are either adjacent to those or at the interface region. The second highest transmission peak occurs around residues 220-221; incidentally, Y219 is also involved in halide binding (Figure 3.7B). A number of other residues in the empty protomer show rigidity-based allosteric changes but of smaller magnitude.

Regions highlighted in red in Figure 3.7A indicate areas of highest allosteric transmission. The intensity of rigidity-based allosteric transmission per residue is highly correlated (R² = 0.64) with the 15N,1H HSQC chemical shift perturbations observed during BrAc titration (Figure 3.7C, D). As a negative control, we also find residues in the empty protomer with very little transmission, for instance in the regions around residues 195-206 and 292-300 (coloured in blue), again in agreement with the chemical shift perturbations observed in the BrAc titration. When there are two substrate molecules bound to one protomer, one in the active site and one in the ‘vestibule’, we find that general transmission occurs later in the energy cut-off, more reminiscent of the apo- or product-bound states. The two bound substrate molecules inhibit the enzyme’s ability to transmit allosteric change because more energy is required to transmit across the allosteric pathway. A reasonable explanation for this finding is that the underlying hydrogen bond network is stronger, with the additional ligand tightening the structure.
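The correlation quoted above can be illustrated with a short calculation. The following is a minimal sketch, assuming that R² denotes the squared Pearson correlation coefficient between per-residue transmission intensity and chemical shift perturbation; the text does not state how the value was computed, and the function name and the example numbers are hypothetical, not thesis data.

```python
def pearson_r_squared(x, y):
    """Return the squared Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-residue values, for illustration only.
transmission_intensity = [0.10, 0.40, 0.90, 0.30, 0.05]
shift_perturbation_ppm = [0.02, 0.05, 0.12, 0.06, 0.01]
r_squared = pearson_r_squared(transmission_intensity, shift_perturbation_ppm)
```

Residues missing from either data set (for example unassigned HSQC peaks) would have to be dropped from both lists before such a comparison.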


Figure 3.7 - Computational prediction of the allosteric pathway using a rigidity-based allostery transmission algorithm. (A) The amount of allosteric transmission from the catalytic site of one protomer to the second empty protomer is coloured based on the intensity of rigidity-based allosteric transmission shown in (B). (C) Chemical shift perturbations from titration of BrAc projected onto a single protomer of a FAcD molecule. (D) Correlation between the intensity of rigidity-based allosteric transmission per residue with the corresponding 15N,1H HSQC chemical shift perturbations.

3.4 Discussion

Each protomer of FAcD only has one access point into the active site, a narrow channel-like opening that stretches from the surface to the chemistry centre of the active site (Figure 3.2A). While the binding of substrate molecules next to the catalytic residues has been well characterized before, I identified a secondary binding site for substrate molecules, located at a position that is occupied by the side chain of I253 in the apo-form of the enzyme. Interestingly, again only one protomer binds ligands while the second one stays empty. A prolonged existence of substrate in the secondary binding site obviously inhibits the entry and exit of substrate/product. This steric clash alone, however, while a major contributor, is not the sole reason for substrate inhibition. The unusual chemical shifts and decreased line widths observed

when the enzyme is confronted by an excess amount of substrate parallel the reduced catalytic rates (Figures 3.4, 3.5, and 3.S7). This supports the interpretation that decreased µs-ms timescale enzyme dynamics result in decreased catalytic activity, stressing again the importance of dynamics in enzyme function.

High concentrations of substrate transform the secondary binding site from a transient step towards the active site to a permanently occupied site. The molecule at the secondary site acts like a “wrench in a cogwheel”, interfering with the communication between protomers. It not only sterically inhibits entry and exit of substrate/product to/from the active site but also increases the number of hydrogen bonds within the transmission site, thereby raising energy requirements for transmission and compounding the inhibitory effect. Computational analysis of the Y219F FAcD mutant indicates an allosteric pathway, shown in Figure 3.7A, which runs through residues 124-165. According to rigidity analysis, an increase in the number of hydrogen bonds between rigid clusters increases the energy required to convey changes in the degrees of freedom from one site to another. Since FAcD is a very rigid protein, even small changes to the number of hydrogen bonds can have large effects on the energy requirements for transmission. As described above, the secondary substrate binding adds two hydrogen bonds in the allosteric pathway and thereby increases the energy required for transmission. As published recently, the second, empty protomer is a crucial part of the enzyme’s catalytic machinery as it compensates for the entropic loss experienced by the first protomer due to substrate binding and, via subunit-subunit interactions, aids in the sampling of downstream transition states (47). Therefore, perturbed allostery will hinder the enzyme’s ability to communicate with the other protomer and diminish its ability to compensate for entropy loss and to sample transition states, leading to slower catalytic rates.

3.4.1 The role of substrate inhibition in catalysis

Substrate inhibition is a phenomenon observed in a multitude of different enzymes (198, 209–214). Many of these proteins are monomeric, so their reaction diagrams tend to be somewhat simplistic and do not take proper account of dynamic changes that may happen in the enzyme upon binding of a second substrate molecule (203). Since all protomers in oligomeric proteins can play a crucial role in catalysis, a more nuanced view of enzyme function using structural, spectroscopic, and computational insights needs to be considered. Here I show that FAcD is not

only sterically inhibited by the molecule binding at the second site but that the second bound species also affects transmission of information from one protomer to the next. With its ability to decrease the catalytic rate or even halt catalysis altogether, substrate inhibition serves an important biological function. Given the persisting uncertainties about the natural substrate of FAcD, it is difficult to speculate about the role of substrate inhibition in this enzyme system. Since almost all fluorinated compounds found in Nature are anthropogenic and FAcD’s defluorinating properties have probably emerged within the last century, one must always consider the possibility that substrate inhibition occurs because this relatively new enzyme is still evolving (215). Another, more biochemical explanation for why FAcD exhibits this kinetic behaviour is that this enzyme performs optimally at slightly basic conditions. In contaminated sites, from which the R. palustris bacterium, an organism that prefers neutral to basic environments, was isolated (169, 216, 217), a high concentration of substrate is to be expected. Catalytic turnover will create a substantial number of protons, causing the environmental pH to drop precipitously. Substrate inhibition could be a way to prevent too rapid a decrease in pH, a drop which might well affect the structural integrity of the enzyme and harm the bacteria.
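For orientation, the kinetic signature of substrate inhibition can be sketched with the classical textbook rate law v = Vmax[S] / (Km + [S] + [S]²/Ki). This is an illustrative assumption only: the text does not state which kinetic model describes FAcD, this simple scheme ignores the allosteric effects discussed above, and the function names and parameter values below are hypothetical.

```python
from math import sqrt


def substrate_inhibition_rate(s, vmax, km, ki):
    """Rate for the classical substrate-inhibition model.

    s, km and ki share one concentration unit; vmax sets the rate unit.
    """
    return vmax * s / (km + s + s * s / ki)


def optimal_substrate(km, ki):
    """Substrate concentration at which the modelled rate is maximal.

    Setting dv/d[S] = 0 in the rate law above gives [S] = sqrt(Km * Ki).
    """
    return sqrt(km * ki)


# Hypothetical parameters, chosen only to show the bell-shaped curve.
s_opt = optimal_substrate(km=1.0, ki=100.0)
rates = [substrate_inhibition_rate(s, 1.0, 1.0, 100.0) for s in (5.0, s_opt, 20.0)]
```

In this model the rate rises with substrate concentration up to sqrt(Km·Ki) and falls beyond it, which is the qualitative behaviour described for titrations with excess substrate.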

In this study, I present new insights into the structural changes and dynamic mechanisms caused by substrate inhibition in a homodimeric enzyme. Crystallographic structures reveal the location of a secondary binding site while also displaying a distinct enzyme-substrate complex. Spectroscopic analysis shows a decrease in the catalytic activity of the enzyme in the presence of high concentrations of substrate as well as dynamic changes in key residues identified by biphasic chemical shift perturbations. Rigidity-based allosteric transmission analysis points to a localized allosteric pathway in the enzyme structure, which aligns well with chemical shifts from NH-HSQC titrations. Combining the results of these three techniques applied to FAcD, I propose a detailed mechanism for how binding of a second substrate molecule to a secondary site adjacent to the active site structurally and dynamically inhibits catalysis in this homodimeric enzyme.


3.5 Supplementary Materials

Figure 3.S1 - Degrees of freedom transmission calculations for different transmission regions. Transmission of degrees of freedom is shown for a few regions of the protein chain, indicating high transmission (A), low transmission (B), and no transmission (C). Areas under the transmission curve are used to compute the allosteric transmission intensities (see Figure 3.7B).


Figure 3.S2 - SPARTA+ analysis using the soaked structures of Y219F and WT. (A) Predicted chemical shift differences between the Y219F protomers were averaged for the 11 time points. (B) The averaged chemical shifts of Y219F structures were subtracted from the previous SPARTA+ analysis done with the soaked WT structures displaying differences in magnitude.

Figure 3.S3 - 2Fo-Fc map of the active site of Y219F at various time points. (A) Active site of apo-Y219F. Two water molecules and one chloride ion are bound. (B) After 2 hr soaking in 100 mM FAc solution, an FAc molecule and a putative glycolate molecule are seen superimposed in the active site. In addition, a second FAc molecule is bound at the adjacent site. (C) After 48 hr soaking in FAc solution, the electron density map only shows a glycolate molecule in the active site.


Figure 3.S4 - Comparison of Y219F apo-FAcD and Y219F FAcD after 150 min of soaking in 100 mM FAc solution. In the apo-form (cyan), the cap domain loop is closed and closing off the active site. After FAc binds to the protein the loop opens up and moves by 7.9 Å.


Figure 3.S5 - 2Fo-Fc map of the Y219F FAcD mutant active sites with ClAc. 2Fo-Fc electron density map contoured at 1σ around the end of the substrate access channel of FAcD. (A) In the empty protomer, only water molecules and a chloride ion can be seen, while the other protomer (B) holds the two substrate molecules at adjacent locations. Comparing panel A with panel B shows how ClAc displaces the side chain of I253.


Figure 3.S6 - 19F NMR spectra of 5F-Trp enriched FAcD titrated with different ligands. A titration series of 19F NMR spectra acquired for (A) glycolate, (B) bromoacetate, (C) chloroacetate, and (D) iodoacetate. The arrows display the general direction of the chemical shift perturbations for W156, W185, W264 and W267 upon addition of the various ligands. Only ligands which can be turned over, chloroacetate and iodoacetate, cause non-linear chemical shift behaviour of the protein resonances (see Figure 3.4 and Figure 3.S7).


Figure 3.S7 - 19F NMR titrations of W156 with various ligands. (A, B) Glycolate and BrAc titration series of W156 exhibit linear chemical shift behaviour as only one molecule can bind to the protein at a time. (C, D) ClAc and IAc titrations exhibit biphasic chemical shift behaviour as either ligand binds at more than one site.


Table 3.S1 – Crystallography statistics

| | Y219F - Apo | Y219F - FAc | Y219F - ClAc | Y219F - GOA |
|---|---|---|---|---|
| Data collection | | | | |
| Wavelength (Å) | 1.5418 | 1.5418 | 1.5418 | 1.5418 |
| Scaling | | | | |
| Space group | P21 | P21 | P21 | P21 |
| Unit cell a, b, c (Å) | 41.8; 78.0; 84.9 | 41.8; 79.4; 85.0 | 41.8; 79.6; 85.0 | 41.6; 79.5; 84.8 |
| Unit cell α, β, γ (°) | 90.0; 103.4; 90.0 | 90.0; 102.9; 90.0 | 90.0; 103.3; 90.0 | 90.0; 102.8; 90.0 |
| I/Sigma | 22.9 (2.50) | 13.4 (2.1) | 25.3 (3.3) | 26.2 (3.3) |
| CC1/2 | 99.9 (78.4) | 99.8 (66.4) | 100 (89.6) | 100 (88.9) |
| Completeness (%) | 95.2 (80.0) | 99.2 (99.5) | 90.8 (66.1) | 98.2 (80.2) |
| Structure refinement | | | | |
| Resolution range (Å) | 82.6-1.54 | 82.9-1.58 | 82.7-1.51 | 82.7-1.51 |
| Unique reflections | 74807 | 73551 | 76918 | 83097 |
| Protein atoms | 4773 | 4727 | 4721 | 4727 |
| Water atoms | 592 | 641 | 609 | 623 |
| Ion atoms | 3 | 1 | 2 | 0 |
| Ligand atoms | 0 | 15 | 15 | 5 |
| Rwork (%) | 17.2 | 18.68 | 17.47 | 17.46 |
| Rfree (%) | 19.9 | 22.6 | 20.1 | 20.6 |
| Bond r.m.s.d. (Å) | 0.006 | 0.006 | 0.007 | 0.013 |
| Angle r.m.s.d. (°) | 0.916 | 0.933 | 0.950 | 1.250 |
| Ramachandran statistics | | | | |
| Favored (%) | 97.0 | 96.1 | 96.8 | 96.8 |
| Allowed (%) | 2.5 | 3.22 | 2.86 | 2.86 |
| Outliers (%) | 0.5 | 0.68 | 0.34 | 0.34 |
| Average B-factor | | | | |
| Macromolecules | 14 | 14.2 | 14.8 | 14.2 |
| Water | 23.5 | 23 | 23.9 | 22.7 |
| Ions | 24.6 | 33.1 | 16.8 | - |
| Ligands | - | 15 | 13 | 11.2 |


Chapter 4

Serial Time-Resolved Crystallographic Analysis of an Irreversible Enzyme, Fluoroacetate Dehalogenase

Authors: Mehrabi, P. (a, b), Schulz, E.C. (c), Persch, E. (d), De Gasparo, R. (d), Müller-Werkmeister, H. (c), Tellkamp, F. (c), Diederich, F. (d), Miller, R.J.D. (c, e), Pai, E.F. (a, b, f).

Affiliations: (a) Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada. (b) Ontario Cancer Institute/Princess Margaret Cancer Centre, Campbell Family Institute for Cancer Research, Toronto, ON, M5G 1L7, Canada. (c) Atomically Resolved Dynamics, Max-Planck-Institute for Structure and Dynamics of Matter, Luruper Chaussee 149, 22761 Hamburg, Germany. (d) Laboratorium für Organische Chemie, ETH Zürich, Vladimir-Prelog-Weg 3, 8093, Zürich, Switzerland. (e) Departments of Chemistry and Physics, University of Toronto, 80 St. George Street, Toronto ON M5S 3H6, Canada. (f) Departments of Biochemistry and Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, Ontario M5S 1A8, Canada.

Statement of contribution: P.M. designed the experiments, purified and crystallized FAcD, performed all the crystallography experiments, and refined the structures. E.C.S. integrated and scaled all the time-resolved data and was involved in data collection. E.P., R.D.G. and F.D. synthesized the caged compound. H.M.W. and F.T. designed and set up the laser array for photolysis and were both involved in data collection. F.T. designed and implemented the translation stages.


Abstract

Time-resolved crystallography of irreversible systems is often hampered by difficulties in attaining homogeneous reaction initiation and in growing crystals that provide a sufficiently large diffraction volume. I used fluoroacetate dehalogenase as a model system to examine chemically triggered reactions using a chip, a fixed target, to present the crystals to the X-ray beam. Homogeneous reaction initiation was achieved by photolytic cleavage of the caged substrate 2-(4-hydroxyphenyl)-2-oxoethyl fluoroacetate, the p-hydroxyphenacyl ester of fluoroacetate, by a femtosecond laser pulse. Structure determination was performed via pump-probe serial crystallography at a synchrotron beamline on sub-30-micron crystals. Several time points were collected, ranging from milliseconds to seconds, producing well-resolved electron density maps at resolutions of 1.7-2 Å. They showed significant structural changes in the active site as well as movement of the cap domain, corroborating previous results obtained by a mutant trapping approach.

4.1 Introduction

First steps towards time-resolved crystallography (TRX) of proteins were taken in 1976 when Alber et al. (218) showed that one could use cryo-crystallography to stabilize reaction intermediates in crystals so as to allow the collection of full sets of diffraction data. Somewhat later, experiments were designed to observe catalysis in protein crystals as it happened. Study objects were glycogen phosphorylase b (219) and H-Ras p21 (59, 220). While the former used clever manipulation of catalytically important parameters like pH and temperature, the latter employed chemically triggered reaction initiation using flashes from a mercury lamp (59). Unfortunately, due to technological constraints, TRX remained rather esoteric, with an almost exclusive focus on a small subset of reversible, light-inducible systems (54, 56, 61, 87, 221). Over the past decade, there has been a resurgence in TRX due to advances in X-ray sources both at synchrotrons and X-ray free electron lasers (XFELs) (53, 78, 222, 223). While classical crystallography can provide high-resolution atomic models of proteins in their ground states, it cannot accurately convey the dynamic information that unfolds stepwise through the catalytic cycle (224). TRX is able to capture spatial and temporal information, which ultimately underlies all dynamic processes, something that is unattainable with traditional practices. Incidentally, many of the recent advances in TRX methodology continue to use light-inducible reversible systems with intrinsic chromophores.

Traditionally, TRX approaches were performed at room temperature and used Laue-diffraction techniques; this had the advantage of greatly reducing the time and number of exposures needed to collect a complete data set (60, 61, 64). Large single crystals of proteins, such as photoactive yellow protein (PYP) or the myoglobin-CO complex, were repeatedly activated and then allowed to relax back to their ground states. In doing so, individual crystals were not exhausted after each X-ray exposure (54, 225). Nowadays, TRX predominantly uses pump-probe methodology, traditionally by means of light-sensitive proteins or photo-activatable substrates to initiate the reaction (T0). Even with the use of polychromatic beams, these experiments can become quite laborious; this is especially true for irreversible systems, where each crystal is exhausted upon reaction initiation. Serial crystallography resolves these issues by taking advantage of the brilliant X-ray sources of 3rd and 4th generation synchrotrons and XFELs, which make it possible to obtain diffraction data from microcrystals at room temperature with near-zero radiation damage (74, 75). The high intensity of the beams allows for exposure times of <50 fs at XFELs (10¹² photons per pulse) and μs-ps timescales contingent on the synchrotron source (10¹² photons per pulse) (54, 63, 77, 78, 81); the limiting factor for data collection is usually the speed of detector readout. Typically, for serial time-resolved experiments (STRX), crystals are streamed in front of the beam using a microjet, photo-induced, and then exposed to X-rays in random orientations (53, 77); this becomes impractical for time-resolved experiments that rely on very long timescales. For the majority of enzymes, the entire catalytic process takes between 10⁻¹ and 10¹ seconds (226). Using fixed target systems, such as silicon chips which can house a matrix of crystals, allows for very low background, high diffraction quality (80, 227), and, with appropriate mechanical and software support, the sampling of very long time points.

To study irreversible systems and gain an understanding of nuanced catalytic properties governing enzyme mechanics, I used fluoroacetate dehalogenase (FAcD) as a model system. FAcD is a homodimeric enzyme from Rhodopseudomonas palustris, belonging to the α/β hydrolase protein superfamily. Like all members of this family, it performs catalysis with an Asp-His-Asp catalytic triad. Rather rare, however, is its ability to cleave C-F bonds, allowing it to use fluoroacetate as its sole carbon source (5, 47). FAcD also has a slow turnover rate, kcat = 1.84 min⁻¹ (47), can readily be crystallized, and its crystals are of high diffraction quality (5, 47). Since H-Ras p21, photosystem II is the only other irreversible system that has been analyzed by TRX (228). Here, I report on a serial TRX (STRX) approach using fixed targets with caged compounds to study the catalytic properties of the irreversible enzymatic system FAcD.

4.2 Methods and Materials

Expression and purification of fluoroacetate dehalogenase (FAcD)

FAcD was expressed as described in Chapter 2.

Synthesis of caged fluoroacetate

Sodium fluoroacetate (3.15 mmol) was dissolved in a 1:1 mixture of CH3CN/toluene together with 18-crown-6 ether (3.15 mmol), which binds and sequesters the sodium cation and thereby primes fluoroacetate to displace bromide from 2-bromo-4′-hydroxyacetophenone (2.86 mmol; CDS013566, Aldrich). The mixture was stirred at 40 °C for 40 h. Subsequent purification was performed by flash chromatography and preparative recycling gel permeation chromatography to yield 2-(4-hydroxyphenyl)-2-oxoethyl fluoroacetate as white crystals.

Spectroscopic measurements of caged fluoroacetate

Measurements to determine the absorption spectrum of caged fluoroacetate were performed using the setup described in (80). In brief, a deuterium-halogen lamp was used as the light source, and a series of parabolic mirrors collimated and focused the light. The measurements were made using a UV-visible spectrometer (Ocean Optics USB2000+), with crystals in 100 mM Tris-HCl pH 8.5, 200 mM CaCl2, and 24-28% PEG3350, and with 50 mM of the caged fluoroacetate soaked into the crystals.

Laser setup

The fundamental laser light at 1030 nm generated in a fiber amplifier (Pharos) at a repetition rate of 1 kHz was guided along a home-built setup. The polarization was set to horizontal for linearly polarized light and matched to a BBO crystal with angles cut for third harmonic generation to get optimal output of 340-350 nm at 35-38 µW. Pulses were set to 100-120 fs duration and the laser was focused to a spot size of 70 x 70 microns.
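The laser parameters above allow a rough estimate of the energy delivered per pulse. The sketch below assumes that the 35-38 µW value is the average power of the 1 kHz pulse train at the sample and that the 70 x 70 micron focus can be treated as a uniformly illuminated square; neither assumption is stated in the text, and the function names are hypothetical.

```python
def pulse_energy_joules(average_power_w, repetition_rate_hz):
    """Energy per pulse = average power / repetition rate."""
    return average_power_w / repetition_rate_hz


def fluence_j_per_cm2(pulse_energy_j, spot_x_um, spot_y_um):
    """Fluence for a uniformly illuminated rectangular spot (an assumption)."""
    area_cm2 = (spot_x_um * 1e-4) * (spot_y_um * 1e-4)  # 1 micron = 1e-4 cm
    return pulse_energy_j / area_cm2


# Lower and upper bounds of the stated average power, in watts.
energies_j = [pulse_energy_joules(p, 1e3) for p in (35e-6, 38e-6)]
fluences = [fluence_j_per_cm2(e, 70.0, 70.0) for e in energies_j]
```

Under these assumptions the stated power corresponds to 35-38 nJ per pulse, or roughly 0.7-0.8 mJ/cm² at the focus.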

Crystallization of FAcD

FAcD crystals were initially grown in 18-20% PEG3350, 200 mM CaCl2, and 100 mM Tris-HCl pH 8.5 using the hanging drop method. It took about 2-5 days until large crystals formed (~200-500 microns). Using a microseed kit from Hampton Research (HR2-320A), a microseed stock was generated from a few large crystals and the seeds were stabilized by increasing the precipitant concentration by 4%.

To generate microcrystals, a batch crystallization technique was applied, mixing equal volumes (100-200 µl) of 10⁻³-fold diluted seed stock solution generated with the kit and 0.5 mM FAcD solution in 12 x 35 mm glass vials. Crystals ranging from 10-50 µm grew within 24-48 h.

Data collection

Crystals were diluted in batch crystallization solution and 100-200 µl volumes were spread evenly on top of the silicon chips and loaded in place using a gentle vacuum, which randomly oriented the crystals (227). The loaded chips were placed onto specially designed holders and attached to translation stages. Each chip housed a maximum of 11,664 crystals. Mylar sheets (3 µm) covered the chip to prevent crystal dehydration. All data collection was performed at the P14 beamline at the Petra III synchrotron in Hamburg, Germany. Data collection was done at a wavelength of 0.9762 Å with 20 µs exposures. The X-ray beam was focused to a spot size of 10 x 5 microns. Diffraction data were collected on a Pilatus 6M detector at room temperature. Long time points (>100 ms) were collected following an interlacing procedure whereby a set number of crystals were activated by the laser pulse before the first crystal was exposed to the X-ray beam, followed by the other activated crystals, so that the time differential between laser and X-ray exposure was the same for each of the crystals, equalling the delay for the whole data set (Figure 4.S6).
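The interlacing procedure for long time points can be sketched as a simple schedule. This is a minimal illustration, assuming that the stage spends the same hypothetical time (step_time_s) at every crystal for both laser and X-ray events; the actual stage control is shown in Figure 4.S6 and is not reproduced here, and all names below are illustrative.

```python
def interlaced_schedule(n_crystals, block_size, step_time_s):
    """Return (time_s, action, crystal) events for block-wise pump-then-probe.

    A block of crystals is pumped one after another, then probed in the same
    order, so every crystal sees the delay block_size * step_time_s.
    """
    if n_crystals < 1 or block_size < 1 or step_time_s <= 0:
        raise ValueError("counts must be >= 1 and step_time_s must be positive")
    events = []
    tick = 0  # integer stage steps; converted to seconds when stored
    for start in range(0, n_crystals, block_size):
        block = list(range(start, min(start + block_size, n_crystals)))
        for offset, crystal in enumerate(block):
            events.append(((tick + offset) * step_time_s, "pump", crystal))
        tick += block_size  # a short final block idles to keep the delay fixed
        for offset, crystal in enumerate(block):
            events.append(((tick + offset) * step_time_s, "probe", crystal))
        tick += len(block)
    return events


def pump_probe_delays(events):
    """Map each crystal to the time between its pump and its probe event."""
    pumped, delays = {}, {}
    for time_s, action, crystal in events:
        if action == "pump":
            pumped[crystal] = time_s
        else:
            delays[crystal] = time_s - pumped[crystal]
    return delays


# Hypothetical example: 7 crystals, blocks of 3, 10 ms per stage step.
schedule = interlaced_schedule(7, 3, 0.010)
delays = pump_probe_delays(schedule)
```

In this sketch the delay is set by the block size, so longer time points are reached by pumping more crystals before returning to probe the first one.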

Data processing and refinement

Diffraction data were indexed and integrated using the data reduction software package IOTA-cctbx (83). The unit cell parameters from a previously determined FAcD structure from a single crystal at room temperature were used to generate the starting lattice model for the initial set of reflection spots. After integration, the software package PRIME (229) was used for scaling and merging of the datasets. Phases were calculated by molecular replacement using Phaser in the Phenix package (171), with wild-type FAcD as the search model (PDB code: 3R3U). WinCoot (172) was used to build the atomic model and refinement was done using Phenix (175). Water molecules in the electron density maps were accepted if they were within 2.2-3.5 Å of potential protein hydrogen bond donors or acceptors and the corresponding electron density was at least 1.5 electrons/Å³.
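The water acceptance criteria above can be expressed as a short filter. The sketch below is one possible reading, assuming that the distance test applies to the nearest potential hydrogen bond partner (so that a partner closer than 2.2 Å rejects the water); the function name and the coordinate-tuple interface are hypothetical and do not correspond to any refinement program.

```python
from math import dist


def accept_water(water_xyz, partner_xyz_list, density_e_per_a3,
                 min_dist_a=2.2, max_dist_a=3.5, min_density=1.5):
    """Return True if a candidate water passes the distance and density cut-offs.

    water_xyz and each partner are (x, y, z) tuples in Å; partners are
    potential protein hydrogen bond donors or acceptors.
    """
    if density_e_per_a3 < min_density or not partner_xyz_list:
        return False
    nearest = min(dist(water_xyz, partner) for partner in partner_xyz_list)
    # Assumption: only the nearest partner decides, which also rejects clashes.
    return min_dist_a <= nearest <= max_dist_a


# Hypothetical coordinates, for illustration only.
kept = accept_water((0.0, 0.0, 0.0), [(2.9, 0.0, 0.0), (6.0, 1.0, 0.0)], 1.8)
```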

4.3 Results

FAcD’s catalytic cycle starts with the binding of fluoroacetate followed by an SN2 attack by an aspartate residue (D110 in the R. palustris enzyme) on carbon-2 of fluoroacetate, displacing the fluoride ion and creating a covalent ester intermediate. This ester is subsequently hydrolyzed by a water molecule activated by a histidine residue (H280). To observe the structural changes accompanying catalysis, a near-homogeneous reaction initiation, which reduces the number of intermediate states that occupy the crystal at any given time, was accomplished using a caged compound, 2-(4-hydroxyphenyl)-2-oxoethyl fluoroacetate. To obtain this modified substrate, fluoroacetate is reacted with p-hydroxyphenacyl bromide (pHP-Br), which produces pHP-FAc, a compound that has a number of desired properties, such as ps-ns photolytic cleavage (230), a quantum yield of about 0.1-0.37, hydrophilicity, and a peak absorption of ~310-400 nm depending on the caged molecule (230, 231). The caged substrate is catalytically inert until photolytic cleavage. The peak absorption of caged fluoroacetate was measured as 320 nm in solution and ~340 nm when soaked into the crystal (Figure 4.S1), sufficiently removed from the strong absorption bands of proteins. Soaking FAcD crystals with pHP-FAc yielded electron density maps identical to those of the apo-protein, indicating that the caged substrate is restricted to the water channels of the crystal.

In a control experiment, crystals of the catalytically nearly inert D110N mutant of FAcD were soaked in 50 mM of pHP-FAc and exposed overnight to standard office fluorescent lights (Figure 4.S2). In the resulting electron density map, fluoroacetate was easily identified in the active site. This indicated that the pump-probe approach should be applicable to the investigation of FAcD by STRX.

Liquid jet or other injection systems are the predominant methods of crystal delivery for STRX experiments, due to their ease of use and quick preparation time (53, 77). However, these methods have some undesirable qualities. Some are of a general nature, such as low hit rates (1-18%), which require large volumes of crystal suspensions (53). A disadvantage peculiar to slower enzymes like FAcD is the requirement for a short path length from injection nozzle to X-ray beam, making it impossible to probe long time points. Time-resolved experiments require high redundancy to observe small structural changes with confidence (65); a system with high hit rates is therefore desirable to avoid the consumption of large numbers of crystals required in most injection systems. To this end, I used fixed-target crystal chips, which can house >20,000 crystals. The chips are positioned in the path of the X-ray and laser beams via translation stages (80, 227). Crystal chips are loaded by spreading out a suspension of crystals of appropriate size and making use of a gentle vacuum to deposit them in the features of the chip. As a consequence, crystals are fixed, which reduces the total number of crystals needed for a structure determination relative to the injection systems. Likewise, because the crystals remain fixed and do not have to travel any distance after ejection from a jet between the pump and probe events, there are no limitations on the time that can pass between photolysis and the collection of diffraction data. Obviously, not every chip hole will contain a crystal and some will harbour more than one, potentially creating problems with indexing of the X-ray reflections. Nevertheless, during data collection, I observed high hit rates, for some chips upwards of 50%, consequently reducing the total crystal volume and the length of time needed to perform the experiments.

The experimental setup shown in Figure 4.1 consists of a laser illumination scheme using a 1030 nm laser (Pharos) with third harmonic generation to attain a ~340-350 nm beam with pulse lengths of 100-120 fs and a 70 x 70 micron focus to excite the caged compound. As no information was available on the kinetic constants of FAcD, I collected data at several time points ranging from 30 ms to 2 s to cover a full catalytic cycle. The diffraction patterns reached resolutions up to 1.7 Å (Figure 4.S3).


Figure 4.1 – Experimental setup. FAcD microcrystals (<40 µm in size) were loaded onto the chip using a gentle vacuum and mounted onto the chip holder (1) before being placed on the translation stages (2). Crystals were protected from dehydration via a mylar coating. The crystals were pumped with a 120 fs laser pulse at 343 nm and single 20 µs exposure diffraction images were collected on a Pilatus 6M detector at various time points.

Our previous study (47) showed that the cap domain and active site of FAcD display structural asymmetry between protomers. This was evident from a loss of electron density corresponding to residues 252-259 as well as from chemical shifts and B-factor variances indicating an increase in dynamics that differs between protomers. This structural asymmetry was recapitulated in the TRX experiments. The observed electron densities in the cap domains in both protomers at time 0 are largely equivalent, with slight differences in B-factors (Figure 4.4). Upon photolysis, the cap domain in protomer A displays a distinct increase in mobility at 30 ms, which is not observed in protomer B (Figure 4.2). Only at the 100 ms time point is there a transient increase in density encompassing the loop (Figure 4.2). The longer the reaction progresses, the more significant the disorder along this helix/loop region becomes (Figure 4.S4).

Figure 4.2 – Electron density changes in the cap domain. 2Fo-Fc electron density map contoured at 1σ. A) At the 0 ms time point (No UV exposure), the electron density for the cap domain is well defined. B) At 2000 ms after photolysis of the pHP-cage, the corresponding electron density is much weaker and less defined, indicating an increase in mobility.

The trends of increased mobility in the cap domain are correlated with changes in electron density seen in the active site of the same protomer (Figure 4.3). The greater the mobility observed in the cap domain, the larger the changes in the active site. The density changes drastically at 1 s, indicating that entry of substrate to the active site of the crystalline protein occurs sometime between 100 ms and 1 s after photolysis (Figure 4.3). Density changes continue at 2 s; these changes are likely caused by the onset of mobility and the mixing of substrate, intermediates, and solvent molecules (Figure 4.S5). Although only very limited time-related information can be drawn from these maps directly, the data do provide information on the initial time course for substrate entry and the formation of the Michaelis-Menten complex.


Figure 4.3 – Electron density changes observed in the active site. Both panels show 2Fo-Fc electron density maps (blue) contoured at 1σ together with Fo-Fc maps contoured at 3σ (green and red for positive and negative values, respectively) of the active site in Protomer A. A) At the 100 ms time point, the only significant difference seen is in the occupancy of water molecules. B) At 1000 ms, large changes in the difference map density indicate the presence of a mixture of most probably substrate/ions/water as well as reaction intermediates.

Using chemical trapping methods, I previously showed that the crystallographic B-factors for significant parts of the protein chain change as the reaction progresses (47). This is also evident in the time-resolved structures. The global B-factor changes between time points become clearly evident once the active site of subunit A becomes populated (Figure 4.4). The crystals investigated are highly isomorphous, and the differences in resolution are scaled to the lowest resolution structure, allowing consistent comparisons between the structures.


Figure 4.4 – B-factor at 0 ms and 1000 ms time points after photolysis. Crystallographic B-factors are represented by putty representation and by color from blue (5) to red (25). A) At 0 ms. B) At 1000 ms. B-factors of the structure determined at the 1000 ms time point overall are significantly higher than those of the structure at 0 ms.

4.4 Discussion

In this chapter, I describe a proof-of-principle approach to study an irreversible enzyme using fixed targets by STRX. Using a TRX approach we recapitulate some of the data shown in Kim et al. 2017. Structural changes were centered on the cap domain and the active site. Any changes were highly coupled between the two protomers, always conserving dimer asymmetry (47).

Little is known experimentally about the lifetime of intermediate states in FAcD’s reaction cycle and at what precise time points they occur. It is also important to note that the reaction kinetics of FAcD crystals are not necessarily like those in solution, owing to potential differences between enzymes in solution and in crystals, e.g. limitations on internal diffusion and restrictions on chain movements (232, 233). Therefore, high-resolution kinetic data are needed to identify when key catalytic events occur to allow for a complete structure-reaction correlation. Considering this, the time-resolved data presented here demonstrate on a broad time-scale when the active site becomes populated. It is possible that the increase in density in the cap domain in protomer A is physically relevant to catalysis, reducing mobility and thereby stabilizing the entryway to the active site. More kinetic measurements are needed to gain a better and more detailed understanding of the progression of intermediate states. In addition, the resolution of the structures needs to stay consistently high to make accurate structural comparisons between time-points possible. Many more time points are required to be able to accurately describe small changes and to pin down longer-lived intermediate states. There, the emphasis must be on the time intervals between time points that display clearly defined changes.

One of the challenges of any TRX experiment is to maximize the excited fraction of the caged substrate (234). This can, however, prove difficult, as increasing the energy of the laser pulse or the repetition rate can cause a ‘protein quake’ due to heat generation (69). This may be circumvented by temporally stretching the pulse so as to reduce peak power (235). To avoid complications from overheating, I used low laser power in my experiments. Therefore, structural changes observed in the protein are likely not due to heat generation from photolysis but rather caused by interactions with the substrate.

Interpretation of TRX results is always limited by the ability to deconvolute the varying populations of different states in the crystal. These superpositions are caused by the stochastic nature of the reaction start in each single molecule, which follows an exponential (first-order) time law and leads to the co-existence of different molecular structures in the crystal, even if the actual triggering event is quasi-instantaneous. Therefore, to capture sparsely populated structures, high-redundancy data are required to extract the weak signal from the diffraction data.
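The stochastic reaction start described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that every molecule is triggered at the same instant and then reacts independently with a single first-order rate constant; the function names and the rate constant are hypothetical and are not taken from the experiments in this chapter.

```python
import math
import random


def fraction_unreacted(rate_constant, t):
    """Analytic fraction of molecules that have not yet reacted at time t.

    Assumes a single first-order step with the given rate constant (1/s).
    """
    return math.exp(-rate_constant * t)


def simulated_fraction_unreacted(rate_constant, t, n_molecules=100_000, seed=0):
    """Monte Carlo estimate of the same fraction.

    Every molecule is triggered at t = 0 but reacts only after its own
    random, exponentially distributed waiting time.
    """
    rng = random.Random(seed)
    still_waiting = sum(
        rng.expovariate(rate_constant) > t for _ in range(n_molecules)
    )
    return still_waiting / n_molecules


if __name__ == "__main__":
    k = 1.0  # hypothetical rate constant in 1/s, for illustration only
    for delay in (0.03, 0.1, 1.0, 2.0):  # pump-probe delays in seconds
        print(delay, fraction_unreacted(k, delay),
              simulated_fraction_unreacted(k, delay))
```

Under these assumptions, reacted and unreacted molecules coexist in the crystal at every delay, which is why weakly populated states only emerge from highly redundant data.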

TRX has historically predominantly dealt with reversible systems. FAcD is perfectly suited as a model system for progressing TRX methodology towards inclusion of irreversible enzymatic systems. My results show that both temporal and high resolution serial TRX with such a system is possible using a caged compound-based photolytic trigger. With kcat-values of a large majority of enzymes falling between 0.1 and 1 s, the method described can be adapted to cover a wide range of biological catalysts.

4.5 Supplementary Materials

Figure 4.S1 – Absorption spectra of caged fluoroacetate in liquid and crystalline environment. A) In a 50 mM Tris-HCl pH 8.5 solution, caged fluoroacetate’s absorption maximum is at 310 nm. Due to limitations of the microspectrometer, values below 300 nm are not reliable. B) When soaked into a crystal, the absorption spectrum of caged fluoroacetate changes, with a broader peak from ~330-370 nm. This is a surprisingly large red-shift of 40 nm.

Figure 4.S2 – Electron densities found in the active site of D110N mutant FAcD in a crystal soaked with caged fluoroacetate. Both panels show a 2Fo-Fc electron density map contoured at 1σ of the active site and the atomic model of the corresponding amino acids. Residues involved in halide binding are labeled. A) FAcD crystal soaked with 50 mM of pHP-FAc and kept in the dark for data collection. The resulting structure is that of the apo-enzyme, no additional electron density can be found. B) FAcD crystal soaked with 50 mM of pHP-FAc and then exposed to fluorescent light for 12 h. Well-defined electron density corresponding to the free substrate fluoroacetate is clearly recognized in the active site. The structure should be a close representation of the Michaelis-Menten complex of the catalyzed reaction.

Figure 4.S3 – Diffraction image from serial TRX. One representative diffraction image from the Pilatus 6M detector obtained at the 0 s time point with reflection spots extending to high resolution. The labeled rings indicate the resolution.

Figure 4.S4 – Electron density maps of the cap domain at various time-points. All panels show 2Fo-Fc electron density maps contoured at 1σ with Ile253 labeled as a point of reference. Panels on the left show the electron density and corresponding atomic model of the cap domain of Protomer A, while panels on the right display the same information for protomer B.

Figure 4.S5 – Electron density changes in the active site. All panels (A-E) show 2Fo-Fc electron density maps (blue) contoured at 1σ and Fo-Fc maps (green for positive and red for negative values) contoured at 3σ of the active site in Protomer A. Both the 0 s and 30 ms time points show similar electron density indicating only waters bound in the active site. The 2000 ms time point is similar to the 1000 ms time point, which shows difference electron density representing a mixture of ligands, water, ions, substrate, product and possibly intermediates.

Figure 4.S6 – Interlacing procedure for long time points. A) Time intervals are set by activating a set number of crystals with the UV laser (yellow). The total number of features scanned by the laser determines the desired time interval. B) Once the set number of crystals has been activated by the laser, an X-ray (red) scan is initiated, whereby the time point of each crystal is equal to the total time of the laser scan. The scanning rates of the laser and the X-rays are equivalent.
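The interlacing logic in Figure 4.S6 can be sketched in a few lines of Python. This is a simplified model rather than the control software used for the experiments: it assumes a constant time per chip feature that is identical for the laser and X-ray scans, and the function names and the 0.04 s per-feature time are hypothetical.

```python
def features_for_delay(delay_s, time_per_feature_s):
    """Number of chip features the laser must scan before the X-ray scan starts."""
    if delay_s <= 0 or time_per_feature_s <= 0:
        raise ValueError("delay and time per feature must be positive")
    return max(1, round(delay_s / time_per_feature_s))


def pump_probe_delays(n_features, time_per_feature_s):
    """Laser-to-X-ray delay experienced by each crystal in one interlaced block."""
    delays = []
    for i in range(n_features):
        laser_time = i * time_per_feature_s                 # laser reaches feature i
        xray_time = (n_features + i) * time_per_feature_s   # X-rays follow after the laser scan
        delays.append(xray_time - laser_time)
    return delays


if __name__ == "__main__":
    dt = 0.04  # hypothetical time per feature in seconds
    n = features_for_delay(2.0, dt)
    print(n, pump_probe_delays(n, dt)[:3])
```

Because both scans move at the same rate, every crystal in a block sees the same delay, equal to the duration of the laser scan.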

Table 4.S1 – Crystallographic statistics

                            0 ms                 30 ms                100 ms               1000 ms              2000 ms
Data collection
Exposure time (s)           0.037                0.037                0.037                0.037                0.037
Wavelength (Å)              0.97626              0.97626              0.97626              0.97626              0.97626
Scaling
Space group                 P2₁                  P2₁                  P2₁                  P2₁                  P2₁
Unit cell a; b; c (Å)       42.5; 80.8; 85.8     42.3; 80.4; 85.6     42.4; 80.6; 85.8     42.4; 80.7; 85.8     42.42; 80.70; 85.75
Unit cell α; β; γ (°)       90.0; 102.7; 90.0    90; 102.9; 90        90; 102.9; 90        90; 102.9; 90        90.0; 102.96; 90.0
Multiplicity                231.8 (151.2)        248.2 (84.2)         85.7 (7.0)           54.1 (3.0)           56.8 (37.9)
CC1/2                       98.7 (88.9)          93.2 (69.2)          84.6 (44.1)          94.2 (36.7)          92.5 (41.2)
Completeness (%)            100 (100)            100 (99.9)           100 (99.9)           100 (80.5)           100 (100)
Structure refinement
Resolution range (Å)        83.58-2.05           83.43-2.00           83.58-2.0            83.6-2.00            83.58-2.05
Unique reflections          35552                37874                38117                37681                35229
Protein atoms               4721                 4734                 4732                 4687                 4720
Water atoms                 273                  226                  293                  265                  169
Rwork (%)                   15.75                20.35                21.72                20.12                22.15
Rfree (%)                   20.56                24.58                25.87                24.01                26.98
Bond r.m.s.d. (Å)           0.006                0.002                0.002                0.002                0.003
Angle r.m.s.d. (°)          0.8                  0.54                 0.51                 0.59                 0.64
Ramachandran statistics
Favored (%)                 96.27                95.93                95.93                96.22                95.78
Allowed (%)                 2.05                 3.39                 3.22                 3.09                 3.91
Outliers (%)                0.68                 0.68                 0.85                 0.69                 0.34
Average B-factor (Å²)
Macromolecules              13                   19.8                 19.3                 21.2                 14.6
Solvent                     19.6                 24.8                 25.8                 29.6                 16.6
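One simple way to compare the unit cells listed in Table 4.S1 is through the cell volume, which for a monoclinic cell is V = a·b·c·sin β. The Python sketch below applies this standard formula to the tabulated parameters; the helper names are hypothetical, and the comparison is offered only as a rough check of isomorphism, not as the procedure used in this work.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in Å^3 of a monoclinic unit cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def percent_difference(reference, value):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


if __name__ == "__main__":
    # a, b, c in Å and beta in degrees, as listed in Table 4.S1.
    cells = {
        "0 ms": (42.5, 80.8, 85.8, 102.7),
        "30 ms": (42.3, 80.4, 85.6, 102.9),
        "100 ms": (42.4, 80.6, 85.8, 102.9),
        "1000 ms": (42.4, 80.7, 85.8, 102.9),
        "2000 ms": (42.42, 80.70, 85.75, 102.96),
    }
    reference = monoclinic_cell_volume(*cells["0 ms"])
    for label, cell in cells.items():
        volume = monoclinic_cell_volume(*cell)
        print(label, round(volume), round(percent_difference(reference, volume), 2))
```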

Chapter 5

Future Directions and Conclusion

5.1 Protein Engineering for bioremediation

This thesis presents progress in our attempts to understand the roles of the various residues involved in catalysis by FAcD. However, the details in mechanism that could explain the widely varying rates at which dehalogenating enzymes catalyze the removal of halogen ions, even in the subgroup investigated here, remain unclear. Further analysis of the structural and chemical differences in catalytic mechanisms should help to find new ways of engineering enzymes that will have broader substrate specificity and/or faster rates of catalysis. Such progress will be beneficial not only for bioremediation exercises but for farming practices as well. In regions where plants produce sodium fluoroacetate, ruminating animals can fall victim to high levels of exposure through consumption of these plants. Engineering the native bacteria in the rumen of these animals to efficiently catalyze hydrolysis of fluoroacetate should result in significant agricultural benefits (236). Generating novel enzymes that accept a broader selection of fluorocarbons as well as enzymes that defluorinate complex fluorinated hydrocarbons, such as perfluorooctanoic acid (PFOA), will impact the environment long-term (237).

Two novel approaches can be used for designing more efficient enzymes: directed evolution and rational design. To broaden substrate specificity and/or improve the catalytic rate of the enzymes, directed evolution techniques can be used. A number of different approaches promise to achieve these goals: 1) random mutagenesis of the whole sequence, whereby no information about the enzyme’s structure or mechanism is needed; 2) random mutagenesis of amino acid stretches identified as important in the binding or catalysis of the substrate (238, 239), e.g. residues that are part of the catalytic triad or located in the halide ion binding pocket. ‘Rational’ design approaches (239, 240) based on the rather small differences in the properties of halide ions are judged less promising for improving substrate selectivity; e.g. the respective atomic radii of fluoride and chloride ions differ by only 0.28 Å. They might, however, be successfully applied to broaden the substrate specificity with respect to the length or chemical nature of the main carbon chain of the substrate. The two approaches can also be applied in repetitive fashion; once improvements in substrate acceptance have been achieved, catalytic rates can be accelerated, and vice versa.

Computational approaches to protein design are also possible using an array of algorithms, such as mutation prediction algorithms. As an example, Chen et al. 2009 were able to modify the phenylalanine adenylation domain of gramicidin S synthetase A using mutation algorithms to produce compounds other than its natural product gramicidin S (241). The major advantage of computational approaches to protein design is their efficiency: large areas of ‘sequence space’ can be sampled quickly and cheaply, with multiple mutations introduced in various locations in the protein all at once (242). The downside is that accurate modeling of protein function in silico is still a challenge, and therefore the predictions of these algorithms are not yet fully reliable (242).

5.2 Further investigations of function of protomer interactions

The work done in this thesis has begun to shed light on the function of the dimer for FAcD. The dimer can be further probed by creating an active-inactive hybrid dimer. This can be accomplished by using a number of already existing mutants, whereby one subunit carries a given mutation that stops catalysis at a specific point in the reaction, and observing what that does to dimer function. Such heterodimers can be generated by using different tags for mutant and wild-type protein, such as a 6 x His-tag on the wild-type and a GST or FLAG tag on the mutant subunit. Growing both proteins within the same cell and, for instance, expressing them from a pDUET vector will allow the cell’s own machinery to make the heterodimer. If incorporating both proteins’ genes into the same cell creates problems and therefore does not yield sufficient heterodimers, each protein can be synthesized separately, unfolded e.g. in high concentrations of urea or guanidinium hydrochloride, mixed together and refolded by dialysis. Both the intracellular and ex vivo approaches should produce a mixture of 50% heterodimers and 25% each of the mutant and wild-type homodimers. Use of the appropriate affinity columns will allow the separation of the heterodimeric proteins, followed by size-exclusion chromatography for further purification.
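The expected 50%/25%/25% mixture follows from simple binomial statistics. The Python sketch below assumes that subunits pair randomly and without bias, which is an idealization; unequal expression levels or a preference for homodimer formation would shift the distribution. The function name is hypothetical.

```python
def dimer_fractions(wild_type_fraction):
    """Expected dimer populations for random, unbiased pairing of subunits.

    wild_type_fraction is the fraction of all subunits that are wild-type.
    """
    if not 0.0 <= wild_type_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    p = wild_type_fraction
    q = 1.0 - p
    return {
        "WT/WT": p * p,
        "WT/mutant": 2.0 * p * q,  # two orderings give the same heterodimer
        "mutant/mutant": q * q,
    }


if __name__ == "__main__":
    print(dimer_fractions(0.5))  # equal amounts of both subunits
    print(dimer_fractions(0.7))  # excess of wild-type subunits
```

With equal amounts of both subunits the heterodimer yield is at its maximum of one half; any imbalance reduces it, which is one reason the subsequent affinity purification steps matter.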

5.3 Time-resolved crystallography

5.3.1 Diffusion experiments

For many irreversible systems that are not amenable to light-induced reactions with caged compounds, diffusion-based time-resolved methods are currently in development (243, 244). In the past, most time-resolved experiments were carried out in a manner similar to the FTX experiments presented in Chapter 2. Because substrate penetrates crystals slowly, flash-freezing approaches usually result in electron density maps that are a mixture of time-dependent structures. This is because diffusion of the substrate into the crystal is usually slower than the enzyme’s turnover rate, which causes the outer regions of the crystal to begin catalysis before the inner regions. This leads to a highly heterogeneous crystal, as this disorder is added to the heterogeneity caused by stochastic events after laser induction (85). With the advent of serial crystallographic approaches, there has been a push for ever smaller crystals (78, 222, 223). With crystals <10 μm in size, the diffusion time is now shorter than the turnover time, making it possible to use diffusion methods for TRX (244).
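The dependence of diffusion time on crystal size can be made concrete with an order-of-magnitude estimate. The Python sketch below uses the one-dimensional relation t ≈ L²/(2D) as a deliberate simplification; the diffusion coefficient is a hypothetical placeholder, since diffusion inside a protein crystal depends on the solvent channels and the ligand, and it is not a value measured in this thesis.

```python
def diffusion_time(length_m, diffusion_coefficient_m2_s):
    """Rough time in seconds for a ligand to diffuse over a distance length_m.

    Uses t = L^2 / (2 D), the one-dimensional mean-square-displacement relation.
    """
    if length_m <= 0 or diffusion_coefficient_m2_s <= 0:
        raise ValueError("length and diffusion coefficient must be positive")
    return length_m ** 2 / (2.0 * diffusion_coefficient_m2_s)


if __name__ == "__main__":
    d_assumed = 1e-10  # hypothetical diffusion coefficient in m^2/s
    for size_um in (100.0, 10.0, 1.0):
        half_thickness_m = 0.5 * size_um * 1e-6  # ligand enters from both faces
        print(size_um, diffusion_time(half_thickness_m, d_assumed))
```

Whatever the true diffusion coefficient, the quadratic scaling means that a tenfold reduction in crystal size shortens the estimated diffusion time a hundredfold, which is the reason microcrystals make diffusion-triggered experiments feasible.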

There have also been attempts to adapt the liquid jet system for diffusion experiments in serial crystallography. The addition of a mixing chamber prior to ejection of the crystals allows diffusion to proceed for a pre-determined time before data are collected (243). One systemic drawback of this method is that late time-points for slow enzyme systems can be measured only with great difficulty because of the length of time the experiments take. To examine a wide range of time-scales, mixing/diffusion methodology can be combined with the crystal chip. One drawback of the chip is that loading rates rarely exceed 30-50% and much of the crystal solution is lost even while using a gentle vacuum. A piezo microfluidic injector can accurately and consistently dispense picolitre to microlitre volumes. A platform can therefore be designed using the existing translation stages for the chip to inject a single crystal at a time onto the features of the chip, attaining near 100% loading. Subsequently, piezo micro-injection systems can be used to propel substrate onto a loaded chip. Varying volumes of ligand can be injected from a large reservoir. Using the interlacing procedure described in Chapter 4, late time-points can be measured for slow enzyme systems. This method has the advantage of combining the crystal chip system, with its accurate time resolution, and a diffusion-based ligand delivery system, which removes the need for caged substrates. If successful, this method can open the door to investigating more than 2/3 of all enzymes, since they have a kcat slower than 10 s⁻¹ (226). Obviously, crystallizability and catalytic activity in the crystal without major effects on the crystal lattice are still limiting requirements.

5.3.2 Time-resolved crystallography of FAcD

The time-resolved data presented in Chapter 4 illustrated late time points and the corresponding large electron density changes taking place. The additional electron density observed in the active site, although significant and well defined, could not be interpreted as substrate, intermediate, or product alone. This is most probably because the total time-scale over which intermediate steps begin to populate in the FAcD reaction has not yet been accurately measured.

There is inherent heterogeneity in a crystal under TRX conditions; even if one is able to generate a single start time, the unpredictability of the exact point in time at which a given substrate complex will react (overall, the sum of all the enzyme molecules will follow an exponential time law) leads to the superposition of at least three reaction states (85) at any point in time. All states populated during the enzymatic reaction could be superimposed in one electron density map. The exact number depends on the sum total of identifiable steps in the catalytic mechanism, and their various contributions to the maps will be determined by the stability of potential intermediates. To circumvent issues of heterogeneity and to obtain well-defined electron density that accurately depicts transitional structural changes during catalysis, highly redundant data of generally high quality have to be collected at time-points at which the occupancy of the intermediates of interest is at its highest (65). Once these time points have been determined, fine-slicing of time-points, i.e. collecting multiple data points between two time-points of interest, limits the changes occurring and makes it easier to interpret them in terms of an atomic model. Such interpretation includes side chain motions and dynamic information based on B-factor differences. With both fine-slicing of time points and high-redundancy data collection, the conversion of intermediate states can be observed and a better structural characterization of the protein in question obtained.
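The idea of collecting data where an intermediate is most populated can be illustrated with the simplest possible kinetic model. The Python sketch below assumes a two-step irreversible scheme S → I → P with first-order rate constants k1 and k2; this is a textbook simplification, the FAcD mechanism has more steps, and the rate constants used are hypothetical.

```python
import math


def populations(k1, k2, t):
    """Fractions of S, I and P at time t for S -> I -> P (requires k1 != k2)."""
    s = math.exp(-k1 * t)
    i = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    p = 1.0 - s - i
    return s, i, p


def time_of_max_intermediate(k1, k2):
    """Time at which the intermediate population is highest (requires k1 != k2)."""
    return math.log(k1 / k2) / (k1 - k2)


if __name__ == "__main__":
    k1, k2 = 2.0, 1.0  # hypothetical rate constants in 1/s
    t_max = time_of_max_intermediate(k1, k2)
    print(t_max, populations(k1, k2, t_max))
```

Even at the optimal delay, substrate and product states remain populated alongside the intermediate in this model, so a mixture of states in the map is expected rather than exceptional.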

5.4 Allostery pathway in FAcD

Analysis of the different conformational states characterized by crystallography and NMR identified an allosteric pathway between the two protomers using RTA. Transmission of information is greatly affected by subtle changes in the active site; e.g. a single-atom difference, such as the replacement of a hydrogen by a fluorine atom in the substrate, modifies the degree to which transmission occurs (see Chapter 2).

To gain a better understanding of the allosteric pathway established by RTA, more experimental detail is needed to map out the transmission of information residue by residue. An analysis combining crystallography and NMR methods can be used to identify changes in B-factors, kinetics, and chemical shifts.

Using the ‘window’ method, a residue-by-residue pathway was mapped out for FAcD in Chapter 3. This pathway can be used to guide a series of mutations to examine changes in mobility that can be observed crystallographically. TRX approaches can also be adopted using mutants that slow the overall catalytic rate, if certain structural intermediates are outside the time resolution achievable with the WT enzyme. Fine time-slicing can be used to observe the degree of B-factor changes in the protein and to see whether there are any time-dependent changes, as well as to track any perturbations in the electron density corresponding to the allosteric pathway. Any observed changes can be compared to changes identified in the wild-type protein. The advantage of using time-resolved data is that this method has the potential to follow allosteric transitions by observing different structural events as catalysis proceeds. Interesting events can be examined in greater detail by using finer time slices. A similar analysis of allosteric transition was done using dimeric hemoglobin (Hb1) on a ns to μs time regime (221, 225).

5.5 Summary and Conclusion

This thesis aimed to improve our understanding of the subtleties found in one of the mechanisms of enzymatic defluorination, using fluoroacetate dehalogenase as its study object. Applying crystallographic as well as spectroscopic and computational techniques, the enzyme’s structure and dynamics could be analyzed in great detail. As presented in Chapter 2, over 39 crystal structures of wild-type FAcD were obtained as part of a series of FTX experiments attempting to trap reaction intermediates. While no definitive intermediate structures could be identified, SVD analysis of these structures unexpectedly showed completely conserved asymmetry between the two enzyme subunits. Although these changes are small and highly coupled, they are distinctly different in the two subunits; their recognition is a bonus of the very high resolution of the diffraction data obtained in these experiments. When catalysis takes place in one subunit, it mechanically drives the other subunit towards the conformation corresponding to the next step along the reaction coordinate pathway. Cocrystallized structures of the D110N mutant with substrate or product have shown that these compounds occupy only one active site in the dimer. Structures of the covalent ester intermediate of the H280N mutant, which were obtained by cocrystallizing the protein with substrate at twice the protein concentration, also have only one active site occupied, while under high substrate concentrations (>50 mM) both sites are occupied. Entropic loss from substrate binding was compensated by an increase in dynamics and water egress, mostly in the ligand-free subunit.

Mapping residue dynamics of FAcD using fluorine NMR initially showed minor high energy states of W156. Further analysis of the enzyme revealed minor states throughout the 5F-Trp-labelled spectrum, indicating that the protein is sampling these high energy states globally and not just locally in the active site. These data presented in Chapter 2 are in agreement with the ensemble model of allostery.

Substrate titrations showed biphasic chemical shifts of the W156 resonance. Biphasic behaviour was also seen in plots of initial rates of FAc and ClAc titration with wild-type protein. In the presence of high substrate concentrations, initial rates begin to decline, pointing to substrate inhibition. This was supported by the results of soaking crystals of a very slow mutant (Y219F) in a solution containing a very high concentration of substrate. Corresponding electron density maps showed the substrate trapped both in the active site and at a transient secondary binding site located in the channel that leads to the active site. RTA analysis using these data pointed to an allosteric pathway. I hypothesize that there may be a potential evolutionary role for substrate inhibition in preventing pH imbalances from arising when the FAcD-bearing bacteria are in very substrate-rich environments, since both the enzyme and host bacteria prefer neutral to slightly basic environments (216).

The methods-oriented part of this thesis used FAcD as a model system to push forward serial TRX methods for irreversible enzymatic reactions. The TRX experiments performed involved flash photolysis of a caged compound and diffraction data collection with a silicon-based chip as the means of crystal presentation. Using “pump and probe” methods, X-ray diffraction data were collected at several time points along the reaction pathway. Electron density changes were noted at multiple time points. As seen with the FTX experiments and with data collected from crystals of mutant proteins, the cap domain electron density can change quite drastically during this process. The TRX data clearly show that as the lag between laser and X-ray pulse increases, there is a growing loss of electron density, indicating an increase in dynamics. This finding is accompanied by an increase of electron density in the active site. While the latter cannot yet be clearly assigned to substrate, intermediate, or product, nor the probable mixture reliably deconvoluted, I cannot exclude the possibility of several intermittent binding sites for fluoroacetate. This interpretation gains some support if one considers that, based on the discrepancy in size between the rather large overall ligand binding site and the small fluoroacetate, this compound is probably not the natural substrate for this enzyme. It should also be noted that FAcD is the first irreversible enzymatic system examined by STRX methods.

5.6 Final Remarks

Only very few biological systems so far have yielded to analysis in such great detail as FAcD; photoactive yellow protein, hemoglobin, and myoglobin are other examples. This has made it very difficult to completely understand all the structural and dynamic processes going on during the underlying chemical reactions. Consequently, the models with which we attempt to describe protein behavior are usually constructed based on limited information and despite an often serious lack of detail on the experimental level.

Over the past half-decade there has been an increasing push for new methods in protein crystallography more suitable for watching the catalysts in motion (53, 77, 78, 81, 222, 223, 243). In parallel, our ideas about allostery have evolved, especially those based on the ensemble description (47, 93, 94, 99, 102, 103). This makes the present and the near future an excellent time to engage in the analysis of systems that until very recently were recalcitrant to time-resolved studies. The work I have presented in my thesis with regard to FAcD structure and function has pushed our understanding of enzyme function just a bit further; however, the conclusions we reached are hardly exhaustive. What it does clearly show is that it is not always large structural changes that are needed for catalytic activity. Rather, minimal structural and dynamic changes, when working cooperatively, can overcome nearly insurmountable obstacles, such as breaking one of Nature’s strongest bonds.

Appendix

Comparative Analysis of Diffraction Data Collection from Fluoroacetate Dehalogenase Crystals by Serial, Laue, Room Temperature, and Cryo-Crystallography

Authors: Mehrabi, P.a,b, Schulz, E.C.c, Srajer, V.d, Tellkamp, F.c, Moffat, K.d,e, Miller, R.J.D.c,f, Pai, E.F.a,b,g

Affiliations:
a Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada.
b Ontario Cancer Institute/Princess Margaret Cancer Centre, Campbell Family Institute for Cancer Research, Toronto, ON, M5G 1L7, Canada.
c Atomically Resolved Dynamics, Max-Planck-Institute for Structure and Dynamics of Matter, Luruper Chaussee 149, 22761 Hamburg, Germany.
d Center for Advanced Radiation Sources, The University of Chicago, 5620 South Ellis Avenue, Chicago, Illinois 60637, USA.
e Department of Biochemistry and Molecular Biology, and Institute for Biophysical Dynamics, The University of Chicago, 929 East 57th Street, Chicago, Illinois 60637, USA.
f Departments of Chemistry and Physics, University of Toronto, 80 St. George Street, Toronto ON M5S 3H6, Canada.
g Departments of Biochemistry and Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, Ontario M5S 1A8, Canada.

Statement of contributions: P.M. designed the experiments, purified and crystallized FAcD as well as performed all the crystallography experiments and refined the structures. E.C.S. integrated and scaled all the time-resolved data and was involved in serial data collection. V.S. integrated and scaled all the Laue data and was involved in Laue data collection. F.T. designed and implemented the translation stages and was involved in serial data collection.

Abstract

Fluoroacetate dehalogenase (FAcD), a homodimeric enzyme from the soil bacterium Rhodopseudomonas palustris, is capable of cleaving carbon-fluorine bonds. Its wild-type apo-structure crystallizes with high reproducibility, and its crystals can diffract to better than 1 Å resolution and display remarkably low radiation sensitivity (5, 47). This system is well suited to compare the various techniques of data collection and to investigate whether the technique selected, together with the specific software packages, influences data quality. Room temperature serial crystallography (SX) using fixed targets, Laue multi-wavelength diffraction, and classical monochromatic crystallography techniques are compared. The structure derived under monochromatic cryogenic conditions serves as the point of comparison. The unit cell volumes of the crystals probed at room temperature are 4% larger, and the room-temperature structures have fewer bound water molecules than the cryogenic, monochromatic structure (PDB-ID: 5K3D), for which twice as many have been identified. B-factor differences between these structures correlate with the radiation damage accumulated during data collection. Not surprisingly, the SX structure shows the least radiation damage. Difference density between the various structures is mainly related to varying side chain conformations caused by thermal motion. We conclude that cryogenic and room-temperature structures of FAcD are in good agreement, independent of the methods used to collect them.

A.1 Introduction

Recently, a number of microorganisms have been identified that have the ability to utilize simple fluorocarbons as their main carbon source (5, 20, 31). They all possess enzymes capable of cleaving the C-F bond, the strongest single bond in organic chemistry (5). The enzymes most efficient at this type of catalysis are fluoroacetate dehalogenases. They are members of the α/β hydrolase protein superfamily and therefore share a similar architecture (5, 47). One feature all these enzymes possess is a catalytic triad of one histidine and two carboxylate-containing residues. In our investigation, we use the homodimeric enzyme fluoroacetate dehalogenase RPA1163 from the soil bacterium Rhodopseudomonas palustris (CGA009). This enzyme contains an Asp-His-Asp catalytic triad. Upon entry of fluoroacetate into the active site and formation of the Michaelis-Menten complex, Asp110 performs an SN2 attack on the substrate, releasing a fluoride ion while creating a covalent ester intermediate. The ester bond is subsequently hydrolyzed by a water molecule, which in turn is activated by an interaction with His280. The catalytic reaction generates a proton and a fluoride ion as well as glycolate.

The underlying structural mechanisms that allow these enzymes to break the C-F bond are slowly being unravelled. In a previous study, we identified a high degree of crosstalk between the enzyme’s two subunits, e.g. while highly coupled, they never adopt exactly the same conformation. The substrate fluoroacetate only binds to one subunit at any given time, while the empty subunit becomes more mobile balancing out the entropic price paid during substrate binding in the other subunit. This shift in entropy, combined with fast exchange between subunits, lets the enzyme sample higher energy states allowing it to transition through catalysis faster (47). Artifacts that could arise from differences in crystal packing could be ruled out due to the very high isomorphism of the crystal samples. One question, however, that is still unanswered is to which degree these structural changes can be detected in structures determined from data collected at cryogenic temperatures.

To delineate any possible structural variations observed under cryogenic conditions, room temperature experiments on FAcD were performed. While nowadays most crystallography experiments are performed under cryogenic conditions, a subset of methods exists that allows data collection at room temperature.

The traditional approach of mounting single crystals in a capillary or a similar protective environment was the only technique predating the advent of flash-freezing (245). We used the technique to collect diffraction data on our rotating anode home source.

Another method, the Laue diffraction technique, uses single crystals and a high-flux polychromatic X-ray beam for data collection (54, 59, 60, 64, 219). One of its advantages is that it can drastically reduce the total number of exposures needed to complete a data set. Depending on space group symmetry, amassing a complete data set can require only a few diffraction patterns or even, as in the case of cubic space groups, one (60).

Another method is monochromatic serial crystallography. This method uses thousands of microcrystals, either introduced in a fine stream of liquid or fixed on a silicon chip. Individual crystals receive only a single X-ray pulse, thereby avoiding almost all radiation damage (74, 78, 228).

With this technique, however, for a complete data set one needs to index, integrate and merge thousands of diffraction patterns each one them containing partial reflections only (246–248).

Taking advantage of the excellent reproducibility of FAcD crystals, we investigated whether results obtained at different temperatures and with different crystallographic techniques can be compared in a meaningful way without falling prey to underlying methodological artefacts. As we are interested in questions about conformation and mobility, we added comparisons of B-factors to the analysis of electron density changes observed between FAcD crystal structures obtained by a variety of crystallographic techniques: SX using a crystal chip as a fixed target, room temperature (RT) Laue diffraction, RT rotating anode (RTRA) data collection on a home source, and cryogenic temperature data collection on a home source rotating anode (PDB-ID: 5K3D). Investigating data equivalence via comparisons of structures obtained by multiple crystallographic techniques at different temperatures is important for future TRX studies.

A.2 Methods and Materials

Expression and purification of FAcD

FAcD was prepared as described in Chapter 2.

Crystallization of FAcD

Single crystals of apo-FAcD were grown as described in Chapter 2. For serial crystallography experiments, single crystals were grown as described in Chapter 4.

Data collection

Crystals for room temperature data collection were harvested and mounted in MicroRT™ X-ray capillaries. RT diffraction data were collected on a Rigaku FR-C rotating copper anode equipped with multilayer optics and a Mar345 detector. Laue data were collected at the BioCARS 14-ID-B beamline at the Advanced Photon Source (APS). Two data sets of 180° were collected, each from one crystal, using the 3 x 11 bunch mode. Exposures were taken at rotation increments of 6 degrees until a complete data set was captured.

For SX experiments, data collection is described in Chapter 4.

Data processing and refinement

RTRA data sets were processed using the data reduction software XDS (170). Laue data were processed using PRECOGNITION (60), a software package developed especially for Laue data analysis. SX data were indexed and integrated using IOTA-cctbx and scaled and merged using PRIME (83, 229). Phases were calculated for all structures by molecular replacement in PHASER (171) with wild-type FAcD (PDB code: 5K3D) as the search model (47). Model building and refinement were done using WinCoot (172) and phenix.refine (175), respectively. Crystallographic water molecules were accepted only when located within 2.2-3.5 Å of a protein hydrogen-bond donor or acceptor atom and if the map showed corresponding electron density of at least 1.5 e/Å³.
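The water acceptance criteria can be written as a simple filter. The following is only a minimal Python sketch of the two stated thresholds, not the procedure as it was carried out during model building; the function name, the input layout and the example values are hypothetical.

```python
def accept_water(distance, density, d_min=2.2, d_max=3.5, density_min=1.5):
    """Return True if a candidate water passes both criteria.

    distance: distance in Å to the nearest protein hydrogen-bond donor/acceptor
    density:  map value at the water position in e/Å^3
    The default thresholds are the ones quoted in the text.
    """
    return d_min <= distance <= d_max and density >= density_min


# Hypothetical candidates as (distance, density) pairs
candidates = [(2.8, 2.1), (3.9, 2.5), (2.5, 1.0), (2.2, 1.5)]
accepted = [c for c in candidates if accept_water(*c)]
```

Here the second candidate is rejected on distance and the third on density; values lying exactly on a threshold are kept, which is an assumption of this sketch.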

Data analysis

Luzzati plots were generated for all structures using SFCHECK, part of the CCP4 software suite (249). Fo-Fo isomorphous difference maps were calculated using the “Isomorphous difference map” tool in the Phenix software suite, and figures were created using PyMOL (250). Distance matrices were generated by calculating pairwise distances for all main-chain atoms using the routine “Muscle”, part of the “bio3d software package” for structural and sequence alignment (251). Difference distance matrices were calculated by subtracting one distance matrix from the other using the same software.
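The difference distance matrix calculation can be illustrated with a short NumPy sketch. This is a stand-in for the idea only and is not the bio3d routine used here; the function names and the toy coordinates are assumptions made for the example.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances for an (N, 3) array of atom positions."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def difference_distance_matrix(coords_a, coords_b):
    """Distance matrix of model A minus that of model B.

    Both arrays must list the same atoms in the same order.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("models must contain the same atoms in the same order")
    return distance_matrix(a) - distance_matrix(b)


# Toy example: three atoms on a line; the last atom moves by 0.5 Å in model B
model_a = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
model_b = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.1, 0.0, 0.0]])
ddm = difference_distance_matrix(model_a, model_b)
```

Because only internal distances enter the calculation, the result does not depend on how the two models are superimposed or placed in their unit cells.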

A.3 Results

All the FAcD crystal structures used in the analysis were obtained from crystals of the same space group with nearly identical unit cells. All the room temperature structures were processed to the same resolution (1.8 Å). Only the SX and Laue data sets required specialized data reduction software to integrate and scale the diffraction data. All subsequent steps used molecular replacement with the same search model (PDB ID: 3R3U) and identical refinement parameters, thereby allowing for accurate comparison of B-factors and of electron densities when calculating isomorphous difference maps between the various structures.

A.3.1 Unit cell and B-factor differences

All the RT structures have equivalent unit cells; differences in any axis are smaller than 0.4 Å. This is also true when comparing all the existing cryogenic structures of FAcD presently found in the PDB. Not surprisingly, however, there are differences in unit cell volumes between the 5K3D and RT structures, with the room temperature structures being roughly 4% larger in volume. This is consistent with what has been found for other room temperature and cryogenically cooled crystals (252–254). The protein molecules packed in the larger unit cells also bind roughly 50% fewer structural water molecules on their surface (Figure A.1).
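For a monoclinic cell the volume follows from V = a·b·c·sin β, so a volume comparison of this kind can be sketched in a few lines of Python. The cell dimensions below are the RT values listed in Table A.S1; the cryogenic cell is not tabulated in this appendix, so the comparison function simply takes any reference volume as input. The function names are assumptions made for the example.

```python
import math


def monoclinic_volume(a, b, c, beta_deg):
    """Unit cell volume in Å^3 for a monoclinic cell (alpha = gamma = 90°)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def percent_larger(volume, reference_volume):
    """Relative volume difference in percent with respect to a reference cell."""
    return 100.0 * (volume - reference_volume) / reference_volume


# RT unit cells from Table A.S1 (a, b, c in Å; beta in degrees)
v_sx = monoclinic_volume(42.5, 80.8, 85.8, 102.9)
v_laue = monoclinic_volume(42.4, 80.8, 85.4, 102.9)

# Difference between the two RT cells, in percent
rt_spread = percent_larger(v_sx, v_laue)
```

Calling `percent_larger` with an RT volume and the volume of the cryogenic cell would give the kind of figure quoted above.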

Figure A.1 - Water distribution of cryogenic (5K3D) and room temperature structures of apo-FAcD. Blue spheres correspond to electron density interpreted as water molecules up to 2.8 Å outside the protein volume. Only water molecules located within 2.3-3.5 Å of a hydrogen bond partner, displaying B-factors <80 Å², and with corresponding electron density of at least 1.5 e/Å³ were accepted. A) 5K3D model with the greatest number of tightly bound water molecules (604 waters). B) RTRA data-based model with 249 waters. C) Laue data-based model with 222 waters. D) SX data-based model with 319 waters. The room temperature crystal structures have roughly half the number of bound water molecules found in the 5K3D model.

Global B-factor traces of 5K3D show a very rigid protein with an overall B-factor of 14.2 Å² (Figure A.2A). B-factors do change during catalysis, as was seen with a series of chemically trapped mutants, whereby the empty subunit took on the entropic load from substrate binding in the other subunit (47). Traces of the Laue and RTRA structures reveal that their global B-factors are 17.2 Å² and 21.1 Å², ca. 20% and 50% higher, respectively, than that of the 5K3D structure (Figure A.2B-C). The average B-factor of the RTRA structure is ~22.5% higher than that of the Laue structure, which may reflect the longer exposure times (45 s) on the rotating anode vs. 3 x 11 bunches on the Laue beamline at room temperature. Although there is a 50% difference in the number of structural waters between the SX and 5K3D structures, the B-factors of the SX structure are ca. 30% lower than those of the 5K3D structure, indicating that the dynamic mobility of the atoms in 5K3D is not completely dampened by the cryogenic environment. Cryogenic conditions decrease the total amount of radiation damage the crystal accumulates during data collection (255, 256). The increased B-factors of the Laue and RTRA structures are probably due to intrinsically higher mobility and potentially to radiation damage, which leads to a general decline in data quality, with B-factors acting as the final catch-all for the corresponding errors (257). Since crystals in SX data collection are exposed to only a single pulse of radiation, diffraction outruns damage and data deterioration is kept to a minimum. It is therefore not too surprising that the average B-factor of the SX structure, 12.9 Å², hardly differs (~10%) from that of 5K3D (Figure A.2D).
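The relative differences between the average B-factors quoted in this section (14.2, 17.2, 21.1 and 12.9 Å²) can be reproduced with a few lines of Python. This is a simple arithmetic sketch; the dictionary layout and the function name are assumptions made for the example.

```python
# Average B-factors in Å^2 as quoted in the text and in Figure A.2
mean_b = {"5K3D": 14.2, "Laue": 17.2, "RTRA": 21.1, "SX": 12.9}


def percent_change(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


# Each RT structure relative to the cryogenic structure
relative_to_cryo = {
    name: percent_change(b, mean_b["5K3D"])
    for name, b in mean_b.items()
    if name != "5K3D"
}

# RTRA relative to Laue
rtra_vs_laue = percent_change(mean_b["RTRA"], mean_b["Laue"])
```

With these averages the Laue and RTRA values come out about 21% and 49% above the cryogenic value, the SX value about 9% below it, and the RTRA value about 23% above the Laue value.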

Figure A.2 - B-factor differences between cryogenic (5K3D) and room temperature structures of apo-FAcD. Crystallographic B-factors are shown in putty representation and color-coded from blue (low B-factor) to red (high B-factor). The B-factors range from 5 Å² to a maximum of 25 Å². Regions with missing residues were removed due to lack of electron density. A) Backbone representation of the apo-FAcD model based on data collected under cryogenic conditions. The average B-factor for both subunits is 14.2 Å². B) Backbone representation of the apo-FAcD model based on data collected at RT (RTRA). The average B-factor for both subunits is 21.1 Å². C) Backbone representation of the apo-FAcD model based on data collected at RT using the Laue technique. The average B-factor for both subunits is 17.2 Å². D) Backbone representation of the apo-FAcD model based on data collected at RT using the SX technique. The average B-factor for both subunits is 12.9 Å², the lowest among the four models but close to that of the 5K3D structure. The average values for the RTRA and Laue models are noticeably higher than that of the 5K3D structure.

A.3.2 Comparison of structures calculated from data sets obtained by Laue, serial, and rotating anode home source techniques

Superposition of all backbone atoms of the structures determined from data sets collected by Laue, serial (SX), and RTRA techniques results in an average RMSD of 0.15 Å (Figure A.3A). The RT structures were also compared independently to a cryogenic structure of apo-FAcD from the PDB (5K3D). The backbone RMSDs between 5K3D and the structures based on Laue, SX, and RTRA data are 0.31 Å, 0.33 Å, and 0.36 Å, respectively (Figure A.S1). The somewhat larger RMSD differences between these structures and 5K3D can be attributed to the larger unit cell, which allows for a greater degree of mobility. Nevertheless, the RMSD differences are relatively small in absolute terms and only minimally larger than the error indicated by the Luzzati plots (Figure A.S3), making all these structures identical for practical purposes. In particular, residue conformations in the active site are superimposable between all four structures (Figure A.S2).
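For readers unfamiliar with how a backbone RMSD after superposition is obtained, the following NumPy sketch shows one common way to compute it, the Kabsch algorithm. It is an illustration only and is not necessarily the procedure used by the programs that produced the values above; the function name and the toy coordinates are hypothetical.

```python
import numpy as np


def kabsch_rmsd(coords_p, coords_q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Both arrays must list equivalent atoms in the same order.
    """
    p = np.asarray(coords_p, dtype=float)
    q = np.asarray(coords_q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("coordinate sets must have the same shape")
    # Remove the translational component by centring both sets
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Covariance matrix and its singular value decomposition
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p_c @ rotation.T
    return float(np.sqrt(((p_rot - q_c) ** 2).sum(axis=1).mean()))


# Toy example: the same four points, shifted as a rigid body
p = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
q = p + np.array([10.0, -4.0, 2.5])
rmsd = kabsch_rmsd(p, q)
```

A rigid-body shift or rotation of one model leaves the RMSD at zero, so only genuine differences in atomic positions contribute to the reported value.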

Figure A.3 - Structural overlay and isomorphous difference Fourier maps (Fo-Fo) of room temperature structures of apo-FAcD. A) Structural overlay of the RTRA (green), Laue (red), and SX (blue) structures. The average RMSD between the models is 0.15 Å, very close to the expected error. B) Fo(RTRA)-Fo(Laue) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the Laue data-based model. C) Fo(SX)-Fo(Laue) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the Laue data-based model. D) Fo(RTRA)-Fo(SX) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the SX data-based model.

The Laue, SX, and RTRA structures were compared with 5K3D to compute Fo-Fo maps (Figure A.4). At 3.0 σ the differences look more pronounced than in the Fo-Fo maps comparing the room temperature structures to one another. This is due to differences in thermal motion arising from the large temperature differential between these structures and to slight non-isomorphism. In light of this, there are minimal difference peaks in the backbone; larger, more pronounced changes are located in the regions around the flexible cap domain and on the outside of the protein molecule.
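The core of an isomorphous difference map is a set of coefficients |F_A| - k|F_B| for the reflections common to both data sets, which are then combined with phases from a reference model. The NumPy sketch below illustrates only that amplitude step and the expression of map values in σ units; it uses a single least-squares scale factor, which is a simplifying assumption and not the scaling performed by the Phenix tool. All names and values are hypothetical.

```python
import numpy as np


def scale_and_subtract(f_obs_a, f_obs_b):
    """Return (delta_f, k) with delta_f = |F_A| - k|F_B|.

    Both inputs must hold amplitudes for the same reflections in the same order.
    k is one overall least-squares scale factor (a simplification).
    """
    f_a = np.asarray(f_obs_a, dtype=float)
    f_b = np.asarray(f_obs_b, dtype=float)
    if f_a.shape != f_b.shape:
        raise ValueError("amplitude arrays must match reflection for reflection")
    k = float((f_a * f_b).sum() / (f_b ** 2).sum())
    return f_a - k * f_b, k


def in_sigma_units(density):
    """Express map values as deviations from the mean in units of the map r.m.s.d."""
    density = np.asarray(density, dtype=float)
    return (density - density.mean()) / density.std()


# Hypothetical amplitudes for four common reflections
f_a = np.array([120.0, 80.0, 45.0, 10.0])
f_b = np.array([60.0, 41.0, 22.0, 5.5])
delta_f, k = scale_and_subtract(f_a, f_b)
```

A map contoured at 3.0 σ then shows only those grid points whose value in these units is at or above +3 (positive density) or at or below -3 (negative density).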

Data collection using the Laue method is rather sensitive to increased crystal mosaicity (258). In addition, lattice distortion drastically changes how well reflections are resolved. Their robustness and low mosaicity made FAcD crystals very suitable for this technique; single crystals could be used to collect 36 images, corresponding to a complete data set, without observable lattice distortion or loss of Bragg peaks due to radiation damage in a beam of 7 × 10¹³ photons/sec/mm² focused to 15 × 15 µm².

Figure A.4 - Isomorphous difference Fourier maps (Fo-Fo) of room temperature structures and the cryogenic structure PDB-ID 5K3D of apo-FAcD. A) Fo(Laue)-Fo(5K3D) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the 5K3D model. B) Fo(SX)-Fo(5K3D) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the 5K3D model. C) Fo(RTRA)-Fo(5K3D) electron density map contoured at 3.0 σ (green - positive density; red - negative density) overlaid onto the cartoon representation of the 5K3D model.

Collecting diffraction data at room temperature on a rotating anode machine with an X-ray flux of 10⁹ photons/sec/mm² focused to 100 × 100 µm² results in a structure with parameters similar to those of the one obtained by Laue methods. As shown in Figure A.3B, the resulting Fo-Fo map displays minimal difference density peaks in the interior and backbone of the protein, indicating that the two structures agree very well with one another.

For the SX structure, X-ray data were collected at room temperature using a fixed target consisting of a silicon chip that houses a matrix of ~10,000 microcrystals (<30 µm), as described in (80, 227). Crystals are positioned in the X-ray beam and each crystal is exposed to a single X-ray pulse, resulting in the lowest photon load of the three methods tested. Since the data were collected at a synchrotron, exposure times are roughly 20 µs, which will result in slightly higher radiation damage than if the data had been collected at an XFEL. SX data sets are the result of merging diffraction patterns from thousands of crystals. The difference electron density map between the SX and Laue data is similar to the Laue - RTRA map: at 3.0 σ there are minimal difference peaks in the interior of the protein, and the remaining differences are mainly situated on the outside of the protein molecule. The SX - RTRA Fo-Fo map has a greater number of speckled difference peaks in the active site and adjacent regions (Figure A.3B-D). The difference peaks localized to the outer region of the protein are likely due to side chain dynamics and thermal motions via interaction with bulk water (259). Moreover, the SX structure presented is largely equivalent to the other two structures; the Fo-Fo maps and RMSD differences do not indicate any inconsistencies that would result from artifacts generated by serial crystallography.

Difference distance matrices provide a better, quantitative representation of structural variances of the macromolecules themselves when the unit cells are non-isomorphous. Figure A.S4 shows the difference distance matrices between backbone atoms of the RT and 5K3D structures, with values ranging from 0.5 Å to as small as 0.1 Å. It can therefore be safely concluded that there is good structural agreement between all these structures.

A.4 Discussion

Our results show that the crystal structures of apo-FAcD, determined at RT and at cryogenic temperature (CT; 5K3D) as well as by a variety of data collection techniques, are in close agreement with one another overall. All the structures with isomorphous unit cells exhibit RMSDs within the experimental errors expected according to Luzzati (Figure A.S3). Whether determined at RT or CT, the fold and even the side chain orientations of crystalline FAcD molecules were identical or almost identical. The only clear difference detected was the number of structural water molecules that could be identified attached to the protein surface. It has been postulated that larger water networks can act as a “straitjacket” limiting protein motions (252). It is unclear, however, whether an increase in the size of such networks also seriously affects a protein’s energy landscape or just dampens intrinsic movements. Our results are consistent with earlier findings that saw such a dampening effect under cryogenic conditions (252). B-factors have also been interpreted as a metric of radiation damage in crystal structures (260). It is therefore not surprising that the structures with the highest B-factors are the RT structures, with the exception of the SX structure, which has accumulated the least radiation damage and also displays the lowest B-factors. It has been proposed that cryogenic cooling has the potential to distort the overall energy landscape of a protein by shifting populations and overall conformational states, thereby making true equilibrium states invisible in a flash-frozen structure (252). High-energy minor states have been identified and confirmed via spectroscopic measurements in FAcD (47). These conformational states may remain “invisible” for many proteins, but minor states of W156 have been shown to emanate from distinct structural conformations of Y141 in each protomer (47). As thermal motions are increased in structures collected at ambient temperatures, “invisible” states may become more pronounced and more readily identifiable. Therefore, future experiments to further decipher these phenomena should be carried out using room temperature approaches.

With the aim of gaining a quantitative understanding of how much of a dampening effect cryogenic cooling has on FAcD, future studies will compare the B-factors of structures of TRX intermediate states with those of the 5K3D crystal structure to see if there are large differences in the relative uncertainty of atomic positions.

STRX experiments will proffer the best outcomes, since they generate structures which are based on data collected from crystals with the lowest possible radiation damage (70, 261). Two possible approaches with STRX should be considered for studying these effects. One method, discussed in Chapter 4, involves the use of caged substrates and flash photolysis via a laser pulse. This method can give timed crystal structures with high fidelity using fine-time slicing (as described in Chapter 4). The second approach uses microcrystals (<10 µm) and diffusion of unmodified substrate into those crystals. Once the crystal size approaches 1 µm, the time needed for a substrate molecule to reach the crystal centre is short in comparison to the catalytic rate (243, 244). Although the methods and equipment for the latter technique are still in development, it potentially offers a more accurate way of testing these phenomena, because it can make use of the natural substrate and does not require any chemical modifications like caging groups.

A.5 Supplementary figures

Figure A.S1 - Structural superpositions of room temperature structures against the cryogenic structure PDB-ID: 5K3D. Overlays are calculated using backbone atoms. A) Overlay of Laue data-based model (red) with cryogenic 5K3D model (yellow) results in an overall RMSD of 0.31 Å. B) Overlay of SX data-based model (green) with cryogenic 5K3D model (yellow) results in an overall RMSD of 0.33 Å. C) Overlay of RTRA data-based model (blue) with cryogenic 5K3D model (yellow) results in an overall RMSD of 0.36 Å.

Figure A.S2 - Overlay of catalytically important active site residues. The active sites of subunit A of the four structures (Laue - red, RTRA - green, SX - blue, 5K3D - yellow) are superimposed based on all backbone atoms of the active site residues. The figure shows the very high structural conservation of the placement of the catalytically important residues. The average RMSD measured against 5K3D using backbone atoms is 0.17 Å.

Figure A.S3 - Luzzati plots. A) Luzzati plot of the cryogenic structure (5K3D) indicating an error in coordinates of 0.17 Å. B) RTRA model with coordinate error of 0.16 Å. C) Laue model with coordinate error of 0.18 Å. D) SX model with coordinate error of 0.29 Å.

Figure A.S4 - Difference distance matrices (DDM) of room temperature structures against the cryogenic structure (5K3D). A) DDM between the RTRA and 5K3D models. The upper triangular portion contains the difference in the intra-subunit distances for all the backbone atoms. B) DDM of Laue model vs. 5K3D model. All the DDMs show small differences between the structures with the largest deviations ~1 Å. C) DDM of SX model vs. 5K3D model displaying the largest differences among the structures.

Table A.S1 – Crystallography statistics

                              SX                  Laue                RTWT
Data collection
  Wavelength (Å)              0.97626             1.02-1.12           1.5418
Scaling
  Space group                 P21                 P21                 P21
  Unit cell a; b; c (Å)       42.5; 80.8; 85.8    42.4; 80.8; 85.4    42.4; 80.8; 85.4
  Unit cell α; β; γ (°)       90.0; 102.9; 90.0   90; 102.9; 90       90; 102.9; 90
  I/Sigma                     -                   51.01 (3.0)         16.02 (3.28)
  CC1/2                       94.7 (42.6)         -                   99.8 (87.6)
  Completeness (%)            99.9 (99.5)         73.9 (35.3)         99.0 (99.1)
Structure refinement
  Resolution range (Å)        83.58-1.80          83.26-1.80          83.26-1.80
  Unique reflections          52360               38051               51545
  Protein atoms               4696                4661                4700
  Water atoms                 274                 222                 249
  Rwork (%)                   20.4                17.37               14.2
  Rfree (%)                   22.6                21.07               17.8
  Bond r.m.s.d.               0.003               0.008               0.015
  Angle r.m.s.d.              0.636               0.984               1.301
Ramachandran statistics
  Favored (%)                 97.45               97.61               97.47
  Allowed (%)                 1.87                2.39                2.53
  Outliers (%)                0.68                0                   0.00
Average B-factor
  macromolecules              10.8                17.2                20.1
  solvent                     17.5                23.5                28.4

Bibliography

1. Goldberg ED (1991) Halogenated hydrocarbons: Past, present and near-future problems. Sci Total Environ 100(C):17–28.

2. Mohan R, Raja S, Saraswathy G, Mathivanan S, Das BN (2013) Improvement of Slip Resistance of Rubber Sole by Halogenation. Int J Eng Res 2(5):353–356.

3. Sevenard D V., et al. (2009) Halogenation of fluorinated cyclic 1,3-dicarbonyl compounds: new aspects of synthetic application. Tetrahedron 65(36):7538–7552.

4. Gerebtzoff G, Li-Blatter X, Fischer H, Frentzel A, Seelig A (2004) Halogenation of drugs enhances membrane binding and permeation. ChemBioChem 5(5):676–684.

5. Chan PWY, Yakunin AF, Edwards EA, Pai EF (2011) Mapping the reaction coordinates of enzymatic defluorination. J Am Chem Soc 133(19):7461–7468.

6. Lemal DM (2004) Perspective on Fluorocarbon Chemistry. J Org Chem 69(1):1–11.

7. Müller K, Faeh C, Diederich F (2007) Fluorine in pharmaceuticals: looking beyond intuition. Science 317(5846):1881–6.

8. Chen G (2004) Electrochemical technologies in wastewater treatment. Sep Purif Technol 38(1):11–41.

9. Whicker FW, Hinton TG, MacDonell MM, Pinder JE, Habegger LJ (2004) Environment - Avoiding destructive remediation at DOE sites. Science 303(5664):1615–1616.

10. Chaudhry GR, Chapalamadugu S (1991) Biodegradation of halogenated organic compounds. Microbiol Rev 55(1):59–79.

11. Pries F, Van Den Wijngaard AJ, Bos R, Pentenga M, Janssen DB (1994) The role of spontaneous cap domain mutations in haloalkane dehalogenase specificity and evolution. J Biol Chem 269(26):17490–17494.

12. Pries F, van der Ploeg JR, Dolfing J, Janssen DB (1994) Degradation of halogenated aliphatic compounds: the role of adaptation. FEMS Microbiol Rev 15(2–3):279–95.

13. Adams GO, Fufeyin PT, Okoro SE, Ehinomen I (2015) Bioremediation, Biostimulation and Bioaugmention: A Review. Int J Environ Bioremediation Biodegrad 3(1):28–39.

14. Magan N, Fragoeiro S, Bastos C (2010) Environmental factors and bioremediation of xenobiotics using white rot fungi. Mycobiology 38(4):238–48.

15. Dzionek A, Wojcieszyńska D, Guzik U (2016) Natural carriers in bioremediation: A review. Electron J Biotechnol 23:28–36.

16. Mary Kensa V (2011) Bioremediation - An overview. J Ind Pollut Control 27(2):161–168.

17. Janssen DB (2004) Evolving haloalkane dehalogenases. Curr Opin Chem Biol 8(2):150–159.

18. Tiedje JM, Quensen JF, Chee-Sanford J, Schimel JP, Boyd SA (1994) Microbial reductive dechlorination of PCBs. Biodegradation 4(4):231–240.

19. Jong RM De, et al. (2003) Structure and mechanism of a bacterial haloalcohol dehalogenase: a new variation of the short-chain dehydrogenase/reductase fold without an NAD(P)H binding site. EMBO J 22(19):4933–4944.

20. Fetzner S, Lingens F (1994) Bacterial dehalogenases: biochemistry, genetics, and biotechnological applications. Microbiol Rev 58(4):641–685.

21. Stourman N V., Rose JH, Vuilleumier S, Armstrong RN (2003) Catalytic mechanism of dichloromethane dehalogenase from Methylophilus sp. strain DM11. Biochemistry 42(37):11048–11056.

22. Bader R, Leisinger T (1994) Isolation and Characterization of the Methylophilus DM11 Gene Encoding Dichloromethane Dehalogenase/Glutathione S-Transferase. J Bacteriol 176(12):3466–3473.

23. Arora PK, Bae H (2014) Role of dehalogenases in aerobic bacterial degradation of chlorinated aromatic compounds. J Chem 2014:1–10.

24. Trantírek L, et al. (2001) Reaction mechanism and stereochemistry of gamma-hexachlorocyclohexane dehydrochlorinase LinA. J Biol Chem 276(11):7734–40.

25. Mohn WW, Tiedje JM (1992) Microbial reductive dehalogenation. Microbiol Rev 56(3):482–507.

26. Hill KE, Marchesi JR, Weightman AJ (1999) Investigation of two evolutionarily unrelated halocarboxylic acid dehalogenase gene families. J Bacteriol 181(8):2535–2547.

27. Schmidberger JW, Wilce JA, Tsang JSH, Wilce MCJ (2007) Crystal Structures of the Substrate Free-enzyme, and Reaction Intermediate of the HAD Superfamily Member, Haloacid Dehalogenase DehIVa from Burkholderia cepacia MBA4. J Mol Biol 368(3):706–717.

28. Ridder IS, Rozeboom HJ, Kalk KH, Dijkstra BW (1999) Crystal structures of intermediates in the dehalogenation of haloalkanoates by L-2-haloacid dehalogenase. J Biol Chem 274(43):30672–30678.

29. Esaki N, Sodat K, Preparation E (1991) Catalytic Action of L-2-Halo Acid Dehalogenase on Long-chain L-2-Haloalkanoic Acids in Organic . Biotechnol Bioeng 38:1114–1117.

30. Liu JQ, et al. (1994) Purification and characterization of thermostable and nonthermostable 2-haloacid dehalogenases with different stereospecificities from Pseudomonas sp. strain YL. Appl Environ Microbiol 60(7):2389–93.

31. Kurihara T, Esaki N, Soda K (2000) Bacterial 2-haloacid dehalogenases: Structures and reaction mechanisms. J Mol Catal - B Enzym 10(1–3):57–65.

32. Soda K, et al. (1996) Bacterial 2-haloacid dehalogenases: structures and catalytic properties. Pure Appl Chem 68(11):2097–2103.

33. Nardi-Dei V, Kurihara T, Park C, Esaki N, Soda K (1997) Bacterial DL-2-haloacid dehalogenase from Pseudomonas sp. strain 113: Gene cloning and structural comparison with D- and L-2-haloacid dehalogenases. J Bacteriol 179(13):4232–4238.

34. Nardi-Dei V, et al. (1999) DL-2-haloacid dehalogenase from Pseudomonas sp. 113 is a new class of dehalogenase catalyzing hydrolytic dehalogenation not involving enzyme-substrate ester intermediate. J Biol Chem 274(30):20977–20981.

35. Koudelakova T, et al. (2013) Haloalkane dehalogenases: Biotechnological applications. Biotechnol J 8(1):32–45.

36. van Leeuwen JGE, Wijma HJ, Floor RJ, Van der Laan JM, Janssen DB (2012) Directed Evolution Strategies for Enantiocomplementary Haloalkane Dehalogenases: From Chemical Waste to Enantiopure Building Blocks. ChemBioChem 13(1):137–148.

37. Holmquist M (2000) Alpha Beta-Hydrolase Fold Enzymes Structures, Functions and Mechanisms. Curr Protein Pept Sci 1(2):209–235.

38. Pavlová M, et al. (2007) The identification of catalytic pentad in the haloalkane dehalogenase DhmA from Mycobacterium avium N85: Reaction mechanism and molecular evolution. J Struct Biol 157(2):384–392.

39. Prudnikova T, et al. (2009) Crystallization and preliminary X-ray analysis of a novel haloalkane dehalogenase DbeA from Bradyrhizobium elkani USDA94. Acta Crystallogr Sect F Struct Biol Cryst Commun 65(4):353–356.

40. Keuning S, Janssen DB, Witholt B (1985) Purification and characterization of hydrolytic haloalkane dehalogenase from Xanthobacter autotrophicus GJ10. J Bacteriol 163(2):635–639.

41. Schindler JF, et al. (1999) Haloalkane dehalogenases: Steady-state kinetics and halide inhibition. Biochemistry 38(18):5772–5778.

42. Marek J, et al. (2000) Crystal structure of the haloalkane dehalogenase from Sphingomonas paucimobilis UT26. Biochemistry 39(46):14082–14086.

43. Kmunícek J, et al. (2005) Quantitative analysis of substrate specificity of haloalkane dehalogenase LinB from Sphingomonas paucimobilis UT26. Biochemistry 44(9):3390–3401.

44. Prokop Z, et al. (2003) Catalytic Mechanism of the Haloalkane Dehalogenase LinB from Sphingomonas paucimobilis UT26. J Biol Chem 278(46):45094–45100.

45. Nagata Y, et al. (1997) Purification and characterization of a haloalkane dehalogenase of a new substrate class from a γ-hexachlorocyclohexane-degrading bacterium, Sphingomonas paucimobilis UT26. Appl Environ Microbiol 63(9):3707–3710.

46. O’Hagan D (2008) Understanding organofluorine chemistry. An introduction to the C-F bond. Chem Soc Rev 37(2):308–19.

47. Kim TH, et al. (2017) The role of dimer asymmetry and protomer dynamics in enzyme catalysis. Science 355(6322):eaag2355.

48. Kamachi T, et al. (2009) The catalytic mechanism of fluoroacetate dehalogenase: a computational exploration of biological dehalogenation. Chem - A Eur J 15(30):7394–7403.

49. Wlodawer A, Minor W, Dauter Z, Jaskolski M (2013) Protein crystallography for aspiring crystallographers or how to avoid pitfalls and traps in macromolecular structure determination. FEBS J 280(22):5705–5736.

50. Smyth MS, Martin JH (2000) X Ray Crystallography. J Clin Pathol Mol Pathol 53(1):8–14.

51. Garman EF (2014) Developments in X-ray Crystallographic Biological Macromolecules. Science 343(1978):1102–1108.

52. Moffat K, et al. (1992) Time-Resolved Crystallography: Principles, Problems and Practice [and Discussion]. Philos Trans R Soc A Math Phys Eng Sci 340(1657):175–190.

53. Tenboer J, et al. (2014) Time-resolved serial crystallography captures high-resolution intermediates of photoactive yellow protein. Science 346(6214):1242–6.

54. Genick UK, et al. (1997) Structure of a protein photocycle intermediate by millisecond time-resolved crystallography. Science 275(5305):1471–5.

55. Moffat K (2001) Time-resolved biochemical crystallography: A mechanistic perspective. Chem Rev 101(6):1569–1581.

56. Srajer V, Royer WE (2008) Time-resolved x-ray crystallography of heme proteins. Methods Enzymol 437:379–95.

57. Ellis-Davies GCR (2007) Caged compounds: photorelease technology for control of cellular chemistry and physiology. Nat Methods 4(8):619–28.

58. Giovannardi S, Landò L, Peres A (1998) Flash Photolysis of Caged Compounds: Casting Light on Physiological Processes. News Physiol Sci 13(October):251–255.

59. Schlichting I, et al. (1990) Time-resolved X-ray crystallographic study of the conformational change in Ha-Ras p21 protein on GTP hydrolysis. Nature 345(6273):309–315.

60. Ren Z, et al. (1999) Laue crystallography: coming of age. J Synchrotron Rad 6:891–917.

61. Srajer V, et al. (1996) Photolysis of the Carbon Monoxide Complex of Myoglobin: Nanosecond Time-Resolved Crystallography. Science 274(5293):1726–1729.

62. Schotte F, Soman J, Olson JS, Wulff M, Anfinrud PA (2004) Picosecond time-resolved X-ray crystallography: Probing protein function in real time. J Struct Biol 147:235–246.

63. Schotte F, et al. (2003) Watching a Protein as it Functions with 150-ps Time-Resolved X-ray Crystallography. Science 300(5627):1944–1947.

64. Schotte F, et al. (2012) Watching a signaling protein function in real time via 100-ps time-resolved Laue crystallography. Proc Natl Acad Sci U S A 109(47):19256–19261.

65. Moffat K (2014) Time-resolved crystallography and protein design: signalling photoreceptors and optogenetics. Philos Trans R Soc B 369(June):20130568.

66. Moffat K, Henderson R (1995) Freeze trapping of reaction intermediates. Curr Opin Struct Biol 5(5):656–663.

67. Mitchell EP, Garman EF (1994) Flash freezing of protein crystals: investigation of mosaic spread and diffraction limit with variation of cryoprotectant concentration. J Appl Crystallogr 27(6):1069–1074.

68. Thor JJ van, et al. (2014) Signal to noise considerations for single crystal femtosecond time resolved crystallography of the Photoactive Yellow Protein. Faraday Discuss 171(0):439–455.

69. Arnlund D, et al. (2014) Visualizing a protein quake with time-resolved X-ray scattering at a free-electron laser. Nat Methods 11(9):923–6.

70. Levantino M, Yorke BA, Monteiro DCF, Cammarata M, Pearson AR (2015) Using synchrotrons and XFELs for time-resolved X-ray crystallography and solution scattering experiments on biomolecules. Curr Opin Struct Biol 35:41–48.

71. Ursby T, et al. (2002) Cryophotolysis of caged compounds: A technique for trapping intermediate states in protein crystals. Acta Crystallogr Sect D Biol Crystallogr 58(4):607–614.

72. Eriksson AE, Jones TA, Liljas A (1988) Refined structure of human carbonic anhydrase II at 2.0 A resolution. Proteins 4(2):274–282.

73. Hatsui T, Graafsma H (2015) X-ray imaging detectors for synchrotron and XFEL sources. IUCrJ 2:371–383.

74. Chapman HN, Caleman C, Timneanu N (2015) Diffraction before destruction. Phil Trans R Soc B 369(June):20130313.

75. Wang J (2016) Destruction-and-diffraction by X-ray free-electron laser. Protein Sci 25(9):1585–1592.

76. Weierstall U (2014) Liquid sample delivery techniques for serial femtosecond crystallography. Phil Trans R Soc B 369(1647):20130337.

77. Liu W, et al. (2013) Serial femtosecond crystallography of G protein-coupled receptors. Science 342(2013):1521–4.

78. Schlichting I (2015) Serial femtosecond crystallography: The first five years. IUCrJ 2:246–255.

79. Nogly P, et al. (2015) Lipidic cubic phase serial millisecond crystallography using synchrotron radiation. IUCrJ 2:168–176.

80. Oghbaey S, et al. (2016) Fixed target combined with spectral mapping: approaching 100% hit rates for serial crystallography. Acta Crystallogr Sect D Struct Biol 72(8):944–955.

81. Hunter MS, et al. (2014) Fixed-target protein serial microcrystallography with an x-ray free electron laser. Sci Rep 4:6026.

82. White TA, et al. (2012) CrystFEL: A software suite for snapshot serial crystallography. J Appl Crystallogr 45(2):335–341.

83. Lyubimov AY, et al. (2016) IOTA: Integration optimization, triage and analysis tool for the processing of XFEL diffraction images. J Appl Crystallogr 49:1057–1064.

84. White TA, et al. (2016) Recent developments in CrystFEL. J Appl Crystallogr 49:680–689.

85. Ren Z, et al. (2013) Resolution of structural heterogeneity in dynamic crystallography. Acta Crystallogr Sect D Biol Crystallogr 69(6):946–959.

86. Schmidt M, Rajagopal S, Ren Z, Moffat K (2003) Application of Singular Value Decomposition to the Analysis of Time-Resolved Macromolecular X-Ray Data. Biophys J 84(3):2112–2129.

87. Rajagopal S, Schmidt M, Anderson S, Ihee H, Moffat K (2004) Analysis of experimental time-resolved crystallographic data by singular value decomposition. Acta Crystallogr D Biol Crystallogr 60:860–871.

88. Lang PT, Holton JM, Fraser JS, Alber T (2014) Protein structural ensembles are revealed by redefining X-ray electron density noise. Proc Natl Acad Sci USA 111(1):237–242.

89. Di Paola L, Giuliani A (2014) Protein structures as complex systems: a simplification conundrum. Adv Syst Biol 3(1):7–9.

90. Monod J, Wyman J, Changeux JP (1965) On the Nature of Allosteric Transitions: A Plausible Model. J Mol Biol 12(1):88–118.

91. Changeux J-P (2012) Allostery and the Monod-Wyman-Changeux model after 50 years. Annu Rev Biophys 41:103–133.

92. Hilser VJ, Wrabl JO, Motlagh HN (2012) Structural and energetic basis of allostery. Annu Rev Biophys 41:585–609.

93. Nussinov R (2016) Introduction to Protein Ensembles and Allostery. Chem Rev 116(11):6263–6266.

94. Ribeiro AAST, Ortiz V (2016) A Chemical Perspective on Allostery. Chem Rev 116(11):6488–6502.

95. Gunasekaran K, Ma B, Nussinov R (2004) Is allostery an intrinsic property of all dynamic proteins? Proteins Struct Funct Genet 57(3):433–443.

96. Koshland DE, Némethy G, Filmer D (1966) Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry 5(1):365–385.

97. Cui Q, Karplus M (2008) Allostery and cooperativity revisited. Protein Sci 17(8):1295–1307.

98. Eaton WA, Henry ER, Hofrichter J, Mozzarelli A (1999) Is cooperative oxygen binding by hemoglobin really understood? Nat Struct Mol Biol 6(4):351–358.

99. Motlagh HN, Wrabl JO, Li J, Hilser VJ (2014) The ensemble nature of allostery. Nature 508(7496):331–9.

100. Popovych N, Sun S, Ebright RH, Kalodimos CG (2006) Dynamically driven protein allostery. Nat Struct Mol Biol 13(9):831–838.

101. Liu J, Nussinov R (2016) Allostery: An Overview of Its History, Concepts, Methods, and Applications. PLoS Comput Biol 12(6):e1004966.

102. Guo J, Zhou HX (2016) Protein Allostery and Conformational Dynamics. Chem Rev 116(11):6503–6515.

103. Tzeng SR, Kalodimos CG (2011) Protein dynamics and allostery: An NMR view. Curr Opin Struct Biol 21(1):62–67.

104. Vallurupalli P, Bouvignies G, Kay LE (2012) Studying “invisible” excited protein states in slow exchange with a major state conformation. J Am Chem Soc 134(19):8148–8161.

105. Schmidt M, Graber T, Henning R, Srajer V (2010) Five-dimensional crystallography. Acta Crystallogr Sect A 66(2):198–206.

106. Ren Z (2013) Reaction trajectory revealed by a joint analysis of protein data bank. PLoS One 8(11):1–17.

107. Kleckner IR, Foster MP (2011) An introduction to NMR-based approaches for measuring protein dynamics. Biochim Biophys Acta - Proteins Proteomics 1814(8):942–968.

108. Gerig JT (1994) Fluorine NMR of proteins. Prog Nucl Magn Reson Spectrosc 26(PART 4):293–370.

109. Kitevski-LeBlanc JL, Prosser RS (2012) Current applications of 19F NMR to studies of protein structure and dynamics. Prog Nucl Magn Reson Spectrosc 62:1–33.

110. Kitevski-LeBlanc JL, Evanics F, Prosser RS (2010) Optimizing 19F NMR protein spectroscopy by fractional biosynthetic labeling. J Biomol NMR 48(2):113–121.

111. Kitevski-LeBlanc JL, Evanics F, Prosser RS (2010) Approaches to the assignment of 19F resonances from 3-fluorophenylalanine labeled calmodulin using solution state NMR. J Biomol NMR 47(2):113–123.

112. Durrant JD, McCammon JA (2011) Molecular dynamics simulations and drug discovery. BMC Biol 9:71.

113. Nair PC, Miners JO (2014) Molecular dynamics simulations: from structure function relationships to drug discovery. In Silico Pharmacol 2(4):1–4.

114. Hertig S, Latorraca NR, Dror RO (2016) Revealing Atomic-Level Mechanisms of Protein Allostery with Molecular Dynamics Simulations. PLoS Comput Biol 12(6):e1004746.

115. Olkhova E, Hutter MC, Lill MA, Helms V, Michel H (2004) Dynamic water networks in cytochrome C oxidase from Paracoccus denitrificans investigated by molecular dynamics simulations. Biophys J 86(4):1873–1889.

116. De Ruvo M, Giuliani A, Paci P, Santoni D, Di Paola L (2012) Shedding light on protein-ligand binding by graph theory: The topological nature of allostery. Biophys Chem 165–166:21–29.

117. Boehr DD, Nussinov R, Wright PE (2009) The role of dynamic conformational ensembles in biomolecular recognition. Nat Chem Biol 5(11):789–796.

118. Koshland DE (1958) Application of a Theory of Enzyme Specificity to Protein Synthesis. Proc Natl Acad Sci 44(2):98–104.

119. Agafonov RV, Wilson C, Otten R, Buosi V, Kern D (2014) Energetic dissection of Gleevec’s selectivity toward human tyrosine kinases. Nat Struct Mol Biol 21(10):848–53.

120. Pauling L (1948) Nature of forces between large molecules of biological interest. Nature 161(4097):707–709.

121. Henzler-Wildman KA, et al. (2007) Intrinsic motions along an enzymatic reaction trajectory. Nature 450(7171):838–844.

122. Boehr DD, McElheny D, Dyson HJ, Wright PE (2006) The Dynamic Energy Landscape of Dihydrofolate Reductase Catalysis. Science 313(5793):1638–1642.

123. Lange OF, et al. (2008) Recognition Dynamics Up to Microseconds Revealed from an RDC-Derived Ubiquitin Ensemble in Solution. Science 320(5882):1471–1475.

124. Fraser JS, et al. (2009) Hidden alternative structures of proline essential for catalysis. Nature 462(7273):669–673.

125. Foda ZH, Shan Y, Kim ET, Shaw DE, Seeliger MA (2015) A dynamically coupled allosteric network underlies binding cooperativity in Src kinase. Nat Commun 6:5939.

126. Oyen D, Fenwick RB, Stanfield RL, Dyson HJ, Wright PE (2015) Cofactor-Mediated Conformational Dynamics Promote Product Release From Escherichia coli Dihydrofolate Reductase via an Allosteric Pathway. J Am Chem Soc 137(29):9459–9468.

127. Rose GD, Fleming PJ, Banavar JR, Maritan A (2006) A backbone-based theory of protein folding. Proc Natl Acad Sci U S A 103(45):16623–16633.

128. Goldman P (1969) The Carbon-Fluorine Bond in Compounds of Biological Interest. Science 164(3884):1123–1130.

129. Shen Y, Bax A (2010) SPARTA+: A modest improvement in empirical NMR chemical shift prediction by means of an artificial neural network. J Biomol NMR 48(1):13–22.

130. De Simone A, Aprile FA, Dhulesia A, Dobson CM, Vendruscolo M (2015) Structure of a low-population intermediate state in the release of an enzyme product. Elife 2015(4). doi:10.7554/eLife.02777.

131. Cooper A, Dryden DTF (1984) Allostery without conformational change. Eur Biophys J 11(2):103–109.

132. Sekhar A, et al. (2015) Thermal fluctuations of immature SOD1 lead to separate folding and misfolding pathways. Elife 4:e07296.

133. Tzeng S-R, Kalodimos CG (2009) Dynamic activation of an allosteric regulatory protein. Nature 462(7271):368–372.

134. Tzeng S-R, Kalodimos CG (2012) Protein activity regulation by conformational entropy. Nature 488(7410):236–240.

135. Wand AJ (2013) The dark energy of proteins comes to light: Conformational entropy and its role in protein function revealed by NMR relaxation. Curr Opin Struct Biol 23(1):75–81.

136. Neu A, Neu U, Fuchs A-L, Schlager B, Sprangers R (2015) An excess of catalytically required motions inhibits the scavenger decapping enzyme. Nat Chem Biol 11(9):697–704.

137. Bar-Even A, Milo R, Noor E, Tawfik DS (2015) The Moderately Efficient Enzyme: Futile Encounters and Enzyme Floppiness. Biochemistry 54(32):4969–4977.

138. Bouvignies G, et al. (2011) Solution structure of a minor and transiently formed state of a T4 lysozyme mutant. Nature 477(7362):111–4.

139. Kasinath V, Sharp KA, Wand AJ (2013) Microscopic insights into the NMR relaxation-based protein conformational entropy meter. J Am Chem Soc 135(40):15092–15100.

140. Grutsch S, Brüschweiler S, Tollinger M (2016) NMR Methods to Study Dynamic Allostery. PLoS Comput Biol 12(3):e1004620.

141. Manglik A, et al. (2015) Structural Insights into the Dynamic Process of β2-Adrenergic Receptor Signaling. Cell 161(5):1101–1111.

142. Ye L, Van Eps N, Zimmer M, Ernst OP, Prosser RS (2016) Activation of the A2A adenosine G-protein-coupled receptor by conformational selection. Nature 533(7602):265–268.

143. Freiburger L, et al. (2014) Substrate-dependent switching of the allosteric binding mechanism of a dimeric enzyme. Nat Chem Biol 10(11):937–942.

144. Koshland DE, Neet KE (1968) The catalytic and regulatory properties of enzymes. Annu Rev Biochem 37:359–410.

145. Traut T (1994) Dissociation of enzyme oligomers: a mechanism for allosteric regulation. Crit Rev Biochem Mol Biol 29(2):125–163.

146. Goodsell DS, Olson AJ (2000) Structural symmetry and protein function. Annu Rev Biophys Biomol Struct 29:105–53.

147. Castellani M, et al. (2010) Direct demonstration of half-of-the-sites reactivity in the dimeric cytochrome bc1 complex: enzyme with one inactive monomer is fully active but unable to activate the second ubiquinol oxidation site in response to ligand binding at the ubiquinone reducti. J Biol Chem 285(1):502–510.

148. Koshland DE (1996) The structural basis of negative cooperativity: Receptors and enzymes. Curr Opin Struct Biol 6(6):757–761.

149. Fiedler E, et al. (2002) Snapshot of a key intermediate in enzymatic thiamin catalysis: Crystal structure of the alpha-carbanion of (alpha,beta-dihydroxyethyl)-thiamin diphosphate in the active site of transketolase from Saccharomyces cerevisiae. Proc Natl Acad Sci USA 99(2):591–595.

150. Ding X, Rasmussen BF, Petsko GA, Ringe D (1994) Direct structural observation of an acyl-enzyme intermediate in the hydrolysis of an ester substrate by elastase. Biochemistry 33(31):9285–9293.

151. Hoeltzli SD, Frieden C (1995) Stopped-flow NMR spectroscopy: real-time unfolding studies of 6-19F-tryptophan-labeled Escherichia coli dihydrofolate reductase. Proc Natl Acad Sci U S A 92(20):9318–22.

152. Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF (2001) Protein flexibility predictions using graph theory. Proteins 44(2):150–65.

153. Jacobs DJ, Hendrickson B (1997) An Algorithm for Two-Dimensional Rigidity Percolation: The Pebble Game. J Comput Phys 137(2):346–365.

154. Chubynsky MV, Thorpe MF (2007) Algorithms for three-dimensional rigidity analysis and a first-order percolation transition. Phys Rev E - Stat Nonlinear, Soft Matter Phys 76(4). doi:10.1103/PhysRevE.76.041135.

155. Finbow-Singh W, Whiteley W (2013) Isostatic Block and Hole Frameworks. SIAM J Discret Math 27(2):991–1020.

156. Sljoka A (2012) Algorithms in rigidity theory with applications to protein flexibility and mechanical linkages. Dissertation (York University).

157. Monod J, Changeux J-P, Jacob F (1963) Allosteric proteins and cellular control systems. J Mol Biol 6(4):306–329.

158. Forsén S, et al. (2004) Study of moderately rapid chemical exchange reactions by means of nuclear magnetic double resonance. J Chem Phys 39(11):2892–2901.

159. Eisenmesser EZ, et al. (2005) Intrinsic dynamics of an enzyme underlies catalysis. Nature 438(7064):117–21.

160. Bhabha G, et al. (2011) A dynamic knockout reveals that conformational fluctuations influence the chemical step of enzyme catalysis. Science 332(6026):234–238.

161. Pisliakov AV, Cao J, Kamerlin SCL, Warshel A (2009) Enzyme millisecond conformational dynamics do not catalyze the chemical step. Proc Natl Acad Sci U S A 106(41):17359–17364.

162. Adamczyk AJ, Cao J, Kamerlin SCL, Warshel A (2011) Catalysis by dihydrofolate reductase and other enzymes arises from electrostatic preorganization, not conformational motions. Proc Natl Acad Sci U S A 108(34):14115–14120.

163. Kohen A, Cannio R, Bartolucci S, Klinman JP (1999) Enzyme dynamics and hydrogen tunnelling in a thermophilic alcohol dehydrogenase. Nature 399(6735):496–499.

164. García-Meseguer R, Martí S, Ruiz-Pernía JJ, Moliner V, Tuñón I (2013) Studying the role of protein dynamics in an SN2 enzyme reaction using free-energy surfaces and solvent coordinates. Nat Chem 5(7):566–71.

165. Warshel A, Bora RP (2016) Perspective: Defining and quantifying the role of dynamics in enzyme catalysis. J Chem Phys 144(18):180901.

166. Liu W, et al. (2012) Structural Basis for Allosteric Regulation of GPCRs by Sodium Ions. Science 337(6091):232–236.

167. Colombo MF, Rau DC, Parsegian VA (1992) Protein solvation in allosteric regulation: a water effect on hemoglobin. Science 256:655–659.

168. Hoang J, Prosser RS (2014) Conformational Selection and Functional Dynamics of Calmodulin: A 19F Nuclear Magnetic Resonance Study. Biochemistry 53:5727–5736.

169. Chan WY, et al. (2010) Sequence- and activity-based screening of microbial genomes for novel dehalogenases. Microb Biotechnol 3(1):107–120.

170. Kabsch W (2010) XDS. Acta Crystallogr Sect D Biol Crystallogr 66(2):125–132.

171. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40(4):658–674.

172. Emsley P, Lohkamp B, Scott WG, Cowtan K (2010) Features and development of Coot. Acta Crystallogr Sect D Biol Crystallogr 66(4):486–501.

173. Vagin AA, et al. (2004) REFMAC5 dictionary: Organization of prior chemical knowledge and guidelines for its use. Acta Crystallogr Sect D Biol Crystallogr 60(12 I):2184–2195.

174. Winn MD, et al. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr Sect D Biol Crystallogr 67(4):235–242.

175. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr Sect D Biol Crystallogr 66(2):213–221.

176. Sattler M, Schleucher J, Griesinger C (1999) Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog Nucl Magn Reson Spectrosc 34(2):93–158.

177. Shi L, Kay LE (2014) Tracing an allosteric pathway regulating the activity of the HslV protease. Proc Natl Acad Sci 111(6):2140–2145.

178. Delaglio F, et al. (1995) NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 6(3):277–293.

179. Johnson BA, Blevins RA (1994) NMR View: A computer program for the visualization and analysis of NMR data. J Biomol NMR 4(5):603–614.

180. Vranken WF, et al. (2005) The CCPN data model for NMR spectroscopy: Development of a software pipeline. Proteins Struct Funct Genet 59(4):687–696.

181. Whiteley W (2005) Counting out to the flexibility of molecules. Phys Biol 2(4):S116–S126.

182. Lindorff-Larsen K, et al. (2010) Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct Funct Bioinforma 78(8):1950–1958.

183. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of simple potential functions for simulating liquid water. J Chem Phys 79(2):926.

184. Sousa da Silva AW, et al. (2012) ACPYPE - AnteChamber PYthon Parser interfacE. BMC Res Notes 5(1):367.

185. Wang J, Wang W, Kollman PA, Case DA (2006) Automatic atom type and bond type perception in molecular mechanical calculations. J Mol Graph Model 25(2):247–260.

186. Pronk S, et al. (2013) GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics 29(7):845–854.

187. Hoover WG (1985) Canonical dynamics: Equilibrium phase-space distributions. Phys Rev A 31(3):1695–1697.

188. Nosé S (1984) A unified formulation of the constant temperature molecular dynamics methods. J Chem Phys 81(1):511.

189. Nosé S, Klein ML (2006) Constant pressure molecular dynamics for molecular systems. Mol Phys 50(5):1055–1076.

190. Darden T, York D, Pedersen L (1993) Particle mesh Ewald: An N⋅log(N) method for Ewald sums in large systems. J Chem Phys 98(12):10089.

191. Essmann U, et al. (1995) A smooth particle mesh Ewald method. J Chem Phys 103(1995):8577–8593.

192. Miyamoto S, Kollman PA (1992) SETTLE: An analytical version of the SHAKE and RATTLE algorithm for rigid water models. J Comput Chem 13(8):952–962.

193. Hess B (2008) P-LINCS: A parallel linear constraint solver for molecular simulation. J Chem Theory Comput 4(1):116–122.

194. McGibbon RT, et al. (2015) MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys J 109(8):1528–1532.

195. Hernández C (2015) mdentropy: v0.2 [Data set]. Zenodo. doi:10.5281/ZENODO.18859.

196. Humphrey W, Dalke A, Schulten K (1996) VMD: Visual molecular dynamics. J Mol Graph 14(1):33–38.

197. Fiser A, Šali A (2003) MODELLER: Generation and Refinement of Homology-Based Protein Structure Models. Methods Enzymol 374:461–491.

198. Reed MC, Lieb A, Nijhout HF (2010) The biological significance of substrate inhibition: a mechanism with diverse functions. Bioessays 32(5):422–9.

199. Warshel A (1978) Energetics of enzyme catalysis. Proc Natl Acad Sci U S A 75(11):5250–5254.

200. Benkovic SJ, Hammes GG, Hammes-Schiffer S (2008) Free-energy landscape of enzyme catalysis. Biochemistry 47(11):3317–3321.

201. Changeux J-P, Edelstein SJ (2005) Allosteric mechanisms of signal transduction. Science 308(5727):1424–8.

202. Michaelis L, Menten ML, Goody RS, Johnson KA (1913) Die Kinetik der Invertinwirkung/ The kinetics of invertase action. Biochemistry 49(2):333–369.

203. Yoshino M, Murakami K (2015) Analysis of the substrate inhibition of complete and partial types. Springerplus 4(1):292.

204. Agarry SE, Solomon BO, Layokun SK (2008) Substrate inhibition kinetics of phenol degradation by binary mixed culture of Pseudomonas aeruginosa and Pseudomonas fluorescence from steady state and wash-out data. African J Biotechnol 7(21):3927–3933.

205. Dolenc I, Turk B, Pungercic G, Ritonja A, Turk V (1995) Oligomeric structure and substrate induced inhibition of human cathepsin C. J Biol Chem 270(37):21626–21631.

206. Sekulic N, Konrad M, Lavie A (2007) Structural mechanism for substrate inhibition of the adenosine 5′-phosphosulfate kinase domain of human 3′-phosphoadenosine 5′-phosphosulfate synthetase 1 and its ramifications for enzyme regulation. J Biol Chem 282(30):22112–22121.

207. Chen C, et al. (2014) Structure-based mutational studies of substrate inhibition of betaine aldehyde dehydrogenase BetB from Staphylococcus aureus. Appl Environ Microbiol 80(13):3992–4002.

208. Jukić D, Sabo K, Scitovski R (2007) Total least squares fitting Michaelis-Menten enzyme kinetic model function. J Comput Appl Math 201(1):230–246.

209. Shafferman A, et al. (1992) Substrate inhibition of acetylcholinesterase: residues affecting signal transduction from the surface to the catalytic center. EMBO J 11(10):3561–3568.

210. Fenton AW, Reinhart GD (2003) Mechanism of Substrate Inhibition in Escherichia coli Phosphofructokinase. Biochemistry 42(43):12676–12681.

211. Nakashima A, et al. (1999) Dopamine inhibition of human tyrosine hydroxylase type 1 is controlled by the specific portion in the N-terminus of the enzyme. J Neurochem 72(5):2145–2153.

212. Wu B (2011) Substrate inhibition kinetics in drug reactions. Drug Metab Rev 43(4):440–456.

213. Lin Y, et al. (2001) Substrate inhibition kinetics for cytochrome P450-catalyzed reactions. Drug Metab Dispos 29(4 I):368–374.

214. Quinsey NS, Luong AQ, Dickson PW (1998) Mutational analysis of substrate inhibition in tyrosine hydroxylase. J Neurochem 71(5):2132–8.

215. McLoughlin SY, Ollis DL (2004) The role of inhibition in enzyme evolution. Chem Biol 11(6):735–737.

216. Venkata Ramana V, et al. (2012) Descriptions of Rhodopseudomonas parapalustris sp. nov., Rhodopseudomonas harwoodiae sp. nov. and Rhodopseudomonas pseudopalustris sp. nov., and emended description of Rhodopseudomonas palustris. Int J Syst Evol Microbiol 62(8):1790–1798.

217. Oda Y, et al. (2008) Multiple genome sequences reveal adaptations of a phototrophic bacterium to sediment microenvironments. Proc Natl Acad Sci U S A 105(47):18543–18548.

218. Alber T, Petsko GA, Tsernoglou D (1976) Crystal structure of elastase–substrate complex at −55 °C. Nature 263:297–300.

219. Hajdu J, et al. (1987) Catalysis in the crystal: synchrotron radiation studies with glycogen phosphorylase b. EMBO J 6(2):539–46.

220. Scheidig AJ, et al. (1992) Time-Resolved Crystallography on H-ras p21 [and Discussion]. Philos Trans R Soc A Math Phys Eng Sci 340(1657):263–272.

221. Milani M, Nardini M, Pesce A, Mastrangelo E, Bolognesi M (2008) Hemoprotein time-resolved X-ray crystallography. IUBMB Life 60(3):154–158.

222. Darmanin C, et al. (2016) Protein crystal screening and characterization for serial femtosecond nanocrystallography. Sci Rep 6:25345.

223. Chapman HN, et al. (2011) Femtosecond X-ray protein nanocrystallography. Nature 470(7332):73–77.

224. Cozzini P, et al. (2008) Target flexibility: An emerging consideration in drug discovery and design. J Med Chem 51(20):6237–6255.

225. Knapp JE, Pahl R, Srajer V, Royer WE (2006) Allosteric action in real time: time-resolved crystallographic studies of a cooperative dimeric hemoglobin. Proc Natl Acad Sci U S A 103(20):7649–54.

226. Bar-Even A, et al. (2011) The moderately efficient enzyme: Evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50(21):4402–4410.

227. Zarrine-Afsar A, et al. (2012) Crystallography on a chip. Acta Crystallogr Sect D Biol Crystallogr 68(3):321–323.

228. Kupitz C, et al. (2014) Serial time-resolved crystallography of photosystem II using a femtosecond X-ray laser. Nature 513(7517):261–265.

229. Uervirojnangkoorn M, et al. (2015) Enabling X-ray free electron laser crystallography for challenging biological systems from a limited number of crystals. Elife 4. doi:10.7554/eLife.05421.

230. Givens RS, Rubina M, Wirz J (2012) Applications of p-hydroxyphenacyl (pHP) and coumarin-4-ylmethyl photoremovable protecting groups. Photochem Photobiol Sci 11(3):472–88.

231. Pelliccioli AP, Wirz J (2002) Photoremovable protecting groups: reaction mechanisms and applications. Photochem Photobiol Sci 1(7):441–458.

232. Geremia S, Campagnolo M, Demitri N, Johnson LN (2006) Simulation of diffusion time of small molecules in protein crystals. Structure 14(3):393–400.

233. Ma P, et al. (2015) Observing the overall rocking motion of a protein in a crystal. Nat Commun 6:1–24.

234. Coppens P, Vorontsov II, Graber T, Gembicky M, Kovalevsky AY (2005) The structure of short-lived excited states of molecular complexes by time-resolved X-ray diffraction. Acta Crystallogr Sect A Found Crystallogr 61(2):162–172.

235. Wang T, Kumavor PD, Zhu Q (2012) Application of laser pulse stretching scheme for efficiently delivering laser energy in photoacoustic imaging. J Biomed Opt 17(6):61218.

236. Gregg K (1995) Engineering gut flora of ruminant livestock to reduce forage toxicity: progress and problems. Trends Biotechnol 13(10):418–421.

237. Vierke L, Staude C, Biegel-Engler A, Drost W, Schulte C (2012) Perfluorooctanoic acid (PFOA) — main concerns and regulatory developments in Europe from an environmental point of view. Environ Sci Eur 24(1):16.

238. Packer MS, Liu DR (2015) Methods for the directed evolution of proteins. Nat Rev Genet 16(7):379–394.

239. Chen R (2001) Enzyme engineering: Rational redesign versus directed evolution. Trends Biotechnol 19(1):13–14.

240. Lane MD, Seelig B (2014) Advances in the directed evolution of proteins. Curr Opin Chem Biol 22:129–136.

241. Chen C-Y, Georgiev I, Anderson AC, Donald BR (2009) Computational structure-based redesign of enzyme activity. Proc Natl Acad Sci U S A 106(10):3764–9.

242. Tiwari MK, Singh R, Singh RK, Kim I-W, Lee J-K (2012) Computational approaches for rational design of proteins with novel functionalities. Comput Struct Biotechnol J 2(3):1–13.

243. Calvey GD, Katz AM, Schaffer CB, Pollack L (2016) Mixing injector enables time-resolved crystallography with high hit rate at X-ray free electron lasers. Struct Dyn 3(5):54301.

244. Schmidt M (2013) Mix and inject: Reaction initiation by diffusion for time-resolved macromolecular crystallography. Adv Condens Matter Phys 2013. doi:10.1155/2013/167276.

245. Garman EF, Schneider TR (1997) Macromolecular Cryocrystallography. J Appl Crystallogr 30(3):211–237.

246. Kroon-Batenburg LMJ, Schreurs AMM, Ravelli RBG, Gros P (2015) Accounting for partiality in serial crystallography using ray-tracing principles. Acta Crystallogr Sect D Biol Crystallogr 71:1799–1811.

247. White TA (2014) Post-refinement method for snapshot serial crystallography. Philos Trans R Soc B Biol Sci 369(1647):20130330.

248. Rossmann MG (2014) Serial crystallography using synchrotron radiation. IUCrJ 1:84–86.

249. Vaguine AA, Richelle J, Wodak SJ (1999) SFCHECK: a unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Crystallogr D Biol Crystallogr 55:191–205.

250. The PyMOL Molecular Graphics System, Version 1.3 (2002) Schrödinger LLC.

251. Grant BJ, Rodrigues APC, ElSawy KM, McCammon JA, Caves LSD (2006) Bio3d: An R package for the comparative analysis of protein structures. Bioinformatics 22(21):2695–2696.

252. Keedy DA, et al. (2014) Crystal cryocooling distorts conformational heterogeneity in a model Michaelis complex of DHFR. Structure 22(6):899–910.

253. Juers DH, Matthews BW (2001) Reversible lattice repacking illustrates the temperature dependence of macromolecular interactions. J Mol Biol 311:851–862.

254. Fraser JS, et al. (2011) Accessing protein conformational ensembles using room-temperature X-ray crystallography. Proc Natl Acad Sci U S A 108(39):16247–52.

255. Burmeister WP (2000) Structural changes in a cryo-cooled protein crystal owing to radiation damage. Acta Crystallogr Sect D Biol Crystallogr 56(3):328–341.

256. Garman EF, Owen RL (2006) Cryocooling and radiation damage in macromolecular crystallography. Acta Crystallogr Sect D Biol Crystallogr 62(1):32–47.

257. Shimizu N, Hirata K, Hasegawa K, Ueno G, Yamamoto M (2007) Dose dependence of radiation damage for protein crystals studied at various X-ray energies. Journal of Synchrotron Radiation, pp 4–10.

258. Hedman B, Hodgson KO, Helliwell JR, Liddington R, Papiz MZ (1985) Protein microcrystal diffraction and the effects of radiation damage with ultra-high-flux synchrotron radiation. Proc Natl Acad Sci U S A 82(22):7604–7607.

259. Qin Y, Wang L, Zhong D (2016) Dynamics and mechanism of ultrafast water–protein interactions. Proc Natl Acad Sci 113(30):201602916.

260. Kmetko J, Husseini NS, Naides M, Kalinin Y, Thorne RE (2006) Quantifying X-ray radiation damage in protein crystals at cryogenic temperatures. Acta Crystallogr Sect D Biol Crystallogr 62(9):1030–1038.

261. Panneels V, et al. (2015) Time-resolved structural studies with serial crystallography: A new light on retinal proteins. Struct Dyn 2(4):41718.
