Biological Charge Transfer in Redox Regulation and Signaling

by

Ruijie Darius Teo

Department of Chemistry Duke University

Date:______Approved:

______David Beratan, Advisor

______Patrick Charbonneau

______Agostino Migliore

______Kenichi Yokoyama

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemistry in the Graduate School of Duke University

2020


ABSTRACT

Biological Charge Transfer in Redox Regulation and Signaling

by

Ruijie Darius Teo

Department of Chemistry Duke University

Date:______Approved:

______David Beratan, Advisor

______Patrick Charbonneau

______Agostino Migliore

______Kenichi Yokoyama

An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemistry in the Graduate School of Duke University

2020


Copyright by Ruijie Darius Teo 2020

Abstract

Biological signaling via DNA-mediated charge transfer between high-potential [4Fe4S]2+/3+ clusters is widely discussed in the literature. Recently, it was proposed that for DNA replication on the lagging strand, primer handover from primase to polymerase α is facilitated by DNA-mediated charge transfer between the [4Fe4S] clusters housed in the respective C-terminal domains of the proteins. Using a theoretical-computational approach, I established that redox signaling between the clusters in primase and polymerase α cannot be accomplished solely by DNA-mediated charge transport, due to the unidirectionality of charge transfer between the [4Fe4S] cluster and the nucleic acid. I extended the study by developing an open-source electron hopping pathway search code to characterize hole hopping pathways in proteins and nucleic acids. I used this module to analyze protective hole escape routes in cytochrome p450, cytochrome c peroxidase, and benzylsuccinate synthase. Next, I used the module to analyze molecular dynamics snapshots of a mutant primase, where the Y345C mutation (found in gastric tumors) attenuates charge transfer between the [4Fe4S] cluster and nucleic acid, which in turn could disrupt the signaling process between primase and polymerase α. In another protein-nucleic acid system, I found that charge transfer in the p53-DNA complex plays an important role in allowing p53 to differentiate Gadd45 DNA from p21 DNA in metabolic pathway regulation. Using density functional theory calculations on molecular dynamics snapshots, I found that hole transfer (HT) from Gadd45 DNA to the proximal cysteine residue in the DNA-binding domain of p53 is preferred over HT from p21 DNA to cysteine. This preference ensures that the p21 DNA remains bound to the transcription factor p53, which induces the transcription of the gene under cellular oxidative stress. This dissertation concludes with a study that demonstrates similar electron conductivities between an artificial nucleic acid, 2'-deoxy-2'-fluoro-arabinonucleic acid (2’F-ANA), and DNA. Compared to DNA, 2’F-ANA offers the additional benefit of chemical stability with respect to hydrolysis and nuclease degradation, thereby promoting its use as a sensor in biological systems and cellular environments.


Contents

Abstract

List of Tables

List of Figures

Acknowledgements

1. Introduction

1.1 Charge Transfer in Proteins

1.2 Charge Transfer in DNA

1.3 Charge Transfer in Protein-DNA Systems

1.4 Aims and Objectives

2. Classical Marcus Theory

2.1 Charge Transfer Mechanisms

2.1.1 Tunneling

2.1.2 Superexchange

2.1.3 Flickering Resonance

2.1.4 Hopping

3. Electronic Couplings

3.1 Empirical/Semi-Empirical Models

3.1.1 Hopfield Model

3.1.2 Pathway Tunneling Model

3.1.3 Average Packing Density Model

3.2 Electronic Structure Models

3.2.1 Generalized Mulliken-Hush Method

3.2.2 Effective Hamiltonians

3.2.2.1 Localized Orbitals

3.2.3 Constrained Density Functional Theory (CDFT)

3.2.4 Orbital Splitting

4. Reorganization Energy

4.1 Marcus’ Two-Sphere Model

4.1.1 Inner-Sphere Reorganization Energy

4.1.2 Outer-Sphere Reorganization Energy

5. Molecular Dynamics

5.1 MD Force Fields

5.1.1 Seminario Method for Calculating Force Constants

5.1.2 Atomic Charges for Coulomb Interactions

5.2 Integration Algorithms for System Evolution

5.3 Solvent Representation

5.4 Periodic Boundary Conditions

5.5 Temperature/Pressure Coupling Algorithms

6. Kinetic Models and Master Equations for Charge Transfer Dynamics

7. Charge Transfer in the Human Primosome

7.1 Methodology

7.1.1 Docking and Modeling

7.1.2 Redox Potentials

7.1.3 Reorganization Energies

7.1.4 Electronic Couplings

7.1.5 Kinetic Model

7.2 Unidirectionality

7.3 Concluding Remarks

8. Mapping Hole Hopping Pathways in Proteins

8.1 Description of EHPath

8.2 Cytochrome p450

8.3 Cytochrome c Peroxidase

8.4 Benzylsuccinate Synthase

8.5 Concluding Remarks

9. Role of Y345C Mutation on Primase-RNA/DNA Hole Hopping Pathways

9.1 Introduction

9.2 Broken-Symmetry DFT

9.3 Methodology and Results

9.3.1 Generation of Truncated Models

9.3.2 DFT Calculations

9.3.3 Structural Comparison

9.3.4 MD Simulations

9.3.5 Binding Free Energies

9.3.6 Hole Hopping Pathways

10. Hole Transfer at the p53-DNA Interface

10.1 Methodology

10.2 Results and Analysis

10.3 Concluding Remarks

11. Hole Transfer in 2’-Deoxy-2’-Fluoro-Arabinonucleic Acid

11.1 Methodology

11.1.1 MD Simulation Setup

11.1.2 Generation of Truncated Models

11.1.3 Free Energy Change ∆G°

11.1.4 Reorganization Energies

11.2 Results and Analysis

11.2.1 Effective Electronic Couplings

11.2.2 Reorganization Energies

11.2.3 Charge Transport

11.3 Concluding Remarks

12. Conclusions

Appendix

References

Biography

List of Tables

Table 1: Reorganization energies (λX) associated with the redox group X in electron self-exchange.

Table 2: Values of VEA and E0red of Tyr, Trp and Met.

Table 3: DFT (M11/6-311g**) values of the electronic couplings (VIF,DFT); and DFT electronic couplings “dressed” with the effects of the protein medium.

Table 4: Semiempirical estimates of electronic couplings (VIF); DFT (M11/6-311g**) values of the electronic couplings; and “dressed” DFT electronic couplings.

Table 5: ET paths, steps, parameters, and rate constants for CT mechanisms potentially at play in the p58c-RNA/DNA complex.

Table 6: Fastest 5 hole hopping routes in P450BM3 (the heme is the hole donor, while Y305 or Y334 is the terminal hole acceptor).

Table 7: Mean residence time τM of the hole for the 5 fastest hole hopping pathways in Ccp1 with selected terminal HT sites identified in ref. 159.

Table 8: Mean residence times τM (eq. 76) and τM,approx (eq. 77) of the hole in the 5 fastest hole-hopping escape pathways of BSS.

Table 9: RMSD comparison of the six optimized [4Fe4S]3+ structures (optimized with the COSMO model) with the crystal structure.

Table 10: Top hole hopping pathways between the [4Fe4S] cluster in wild-type or Y345C primase, and the nucleic acid.

Table 11: Top hole hopping pathways between the [4Fe4S] cluster in wild-type or Y345C primase, and the nucleic acid.

Table 12: RDA of D-A pairs from the top hopping pathways listed in Table 10.

Table 13: RDA of D-A pairs from the top hopping pathways listed in Table 11.

Table 14: Values of the mean-square electronic coupling (in eV²) and the coherence parameter, C, for each redox pair in Figure 26.

Table 15: Ranges of distances (in Å) spanned by the indicated bp-Cys pairs, over the selected MD snapshots, at the DNA contacts with the top and bottom proteins.

Table 16: Mean-square electronic coupling (in meV²) and coherence parameter C for the indicated nucleobase pair dimers in DNA and 2'F-ANA.

Table 17: Reorganization energy (λDA) and hole-transfer rate (kDA) values for the indicated base-pair dimers in DNA, using the S1, S2 and S3 dielectric constant sets.

Table 18: Reorganization energy (λDA) and hole-transfer rate (kDA) values for the indicated base-pair dimers in 2’F-ANA, using the S1, S2 and S3 dielectric constant sets.

Table 19: Mean travel time (τ) spent by the hole to traverse the path from TA to a charge drain in contact with CG.

Table 20: Mean travel time (τ) spent by the hole to traverse the path from CG to a charge drain in contact with TA. The notation is the same as in Table 19.

Table A1: Interatomic distances Req (Å) for the six optimized geometries 1³⁺-6³⁺ of [4Fe4S]3+ and the corresponding average values.

Table A2: Bond stretching force constants Kr (kcal mol−1 Å−2) for the six optimized geometries 1³⁺-6³⁺ of [4Fe4S]3+ and the corresponding average values.

Table A3: Angles θeq (°) for the six optimized geometries 1³⁺-6³⁺ of [4Fe4S]3+ and the corresponding average values.

Table A4: Angle bending force constants Kθ (kcal mol−1 rad−2) for the six optimized geometries 1³⁺-6³⁺ of [4Fe4S]3+ and the corresponding average values.

Table A5: Interatomic distances Req (Å) for the three optimized geometries 1²⁺-3²⁺ of [4Fe4S]2+ and the corresponding average values.

Table A6: Bond stretching force constants Kr (kcal mol−1 Å−2) for the three optimized geometries 1²⁺-3²⁺ of [4Fe4S]2+ and the corresponding average values.

Table A7: Angles θeq (°) for the three optimized geometries 1²⁺-3²⁺ of [4Fe4S]2+ and the corresponding average values.

Table A8: Angle bending force constants Kθ (kcal mol−1 rad−2) for the three optimized geometries 1²⁺-3²⁺ of [4Fe4S]2+ and the corresponding average values.

Table A9: RESP charges for atoms corresponding to the six optimized geometries 1³⁺-6³⁺ of [4Fe4S]3+ and the corresponding average values.

Table A10: RESP charges for atoms corresponding to the three optimized geometries 1²⁺-3²⁺ of [4Fe4S]2+ and the corresponding average values.

List of Figures

Figure 1: Marcus parabolas for CT (in this case, excess electron transfer) from D− (electron on donor) to A.

Figure 2: (A) Superexchange, (B) flickering resonance, and (C) hopping CT mechanisms for a D-1-2-3-A chain comprising three bridge sites.

Figure 3: Bonded (solid lines) and non-bonded interactions (dashed line) represented by atoms 1-5.

Figure 4: (A) 3-site water model, (B) 4-site model, (C) 5-site model, and (D) 6-site model. Color code: pink (O), green (H), blue (lone pair), orange (dummy atom).

Figure 5: Kinetic model for a hopping network consisting of N+1 sites, where the charge starts at site 1 and eventually arrives at the trap site N+1 shown in orange.

Figure 6: Human primosome-nucleic acid complex.

Figure 7: Proposed mechanism of primer handoff driven by DNA charge transport chemistry.

Figure 8: Strongest tunneling pathways between the [4Fe4S] cluster in primase and purine nucleobases in anchored nucleic acid.

Figure 9: CT steps and rate constants in hole hopping between the [4Fe4S] cluster and the nucleobases in the p58c-RNA/DNA complex.

Figure 10: Fastest ET routes from the iron-sulfur cluster in primase to DNA/RNA (i.e., hole transfer in the opposite direction).

Figure 11: CT-mediated [4Fe4S] protein signaling and pertinent redox potentials.

Figure 12: Side-by-side comparison of p58C structures with correctly folded (PDB code 5F0Q) and misfolded (PDB code 3L9Q) -binding regions.

Figure 13: Structure of a directed graph representing hopping routes in a protein.

Figure 14: Part of the catalytic cycle of cytochrome P450.

Figure 15: Probable routes for hole hopping from the heme of P450BM3 to the protein surface.

Figure 16: Redox-active residues (C, M, W, Y) in Ccp1 (PDB 1ZBY, ref. 184).

Figure 17: Locations of the BSS redox-active residues involved in the five most rapid hole hopping routes from G829• to the terminal hole acceptor [4Fe4S]2+ (Table 8).

Figure 18: a-f) Optimized geometries 1³⁺-6³⁺ of the [4Fe4S]3+ cluster corresponding to the six spin layer assignments, and g) average geometry of the [4Fe4S]3+ cluster.

Figure 19: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 120 ns using the A³⁺ average force field parameters.

Figure 20: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 130 ns using the 3³⁺ force field parameters.

Figure 21: RMSD (Å) of the mutated Y345C primase-RNA/DNA complex (excluding water and counterions) across 125 ns using the A³⁺ force field parameters.

Figure 22: RMSD (Å) of the mutated Y345C primase-RNA/DNA complex (excluding water and counterions) across 110 ns using the 3³⁺ force field parameters.

Figure 23: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 120 ns using the A²⁺ average force field parameters.

Figure 24: Protein-DNA models used in the MD simulation of the p53 protein complexed with the Gadd45 and p21 half-site sequences.

Figure 25: RMSD (without hydrogens) along the MD production run for the protein-DNA complexes containing the Gadd45 (blue) and p21 (orange) DNA models.

Figure 26: HT models for the contacts of p53 with (a) Gadd45 and (b) p21 DNA sequences.

Figure 27: (a) VCG-GC, (b) VGC-A1T, (c) VA1T-A2T, and (d) VA2T-TA versus the MD simulation time for DNA and 2’F-ANA.

Figure 28: Instantaneous value of the Marcus expression for the outer-sphere reorganization energy λoDA(t) vs. the MD simulation time (ns) for the D-A dimers.

Acknowledgements

Firstly, I am indebted to my advisor, Professor David Beratan, for his tremendous support, encouragement, and mentorship throughout my graduate education. His kindness, humility, and generosity are some of the traits I will always remember and strive to emulate. Next, I am deeply grateful to my mentor, Professor Agostino Migliore, for his constant guidance, thoughtfulness, and camaraderie. His passion for research is certainly infectious, and I have become a much better researcher and thinker under his tutelage.

I would like to specially thank my other committee members, Professor Patrick Charbonneau and Professor Kenichi Yokoyama, for their generous time, advice, and support. I would also like to thank Meg Avery, Michael Conti, Professor Jiyong Hong, Professor Michael Fitzgerald, and Professor Katherine Franz for their help with administrative matters and fellowship applications. In addition, I am grateful for the interaction with several faculty members and researchers that I had the fortune of meeting as a graduate student, including Professor Stephen Craig, Professor Emily Derbyshire, Professor Steve Haase, Professor Richard MacPhail, Professor Henry Pfister, Professor Michael Therien, Dr. Ronald Venters, and Professor Weitao Yang.

I had the privilege of mentoring several students at Duke (Xiaochen Du, Daniel Koceja, Elizabeth Smithwick, Kiriko Terai, and Hector Torres), and I would like to thank them for their valuable research contributions and the opportunity to hone my mentoring skills. I wish them the best in their academic endeavors.

I would also like to thank the past and present members of the Beratan group for enriching my experience. Many thanks to Dr. Shuming Bai, Dr. Tomasz Janowski, Benjamin Rousseau, Professor Luis Rego, Dr. Xuyan Ru, Jesus Valdiviezo, Jonathon Yuly, Professor Peng Zhang, Zhendian Zhang, and Dr. Ellie Zheng for making this possible. As busy as graduate school may be, I would like to thank Brandon Bowser, Heather Folliard, Tony and Faye Hilger, Jiachen Li, Jeffrey Lin, Justin Ma, Melyssa Minto, Robert Tennant, Ruobing Wang, Yujia Zhai, and Hong Zhou for making the process enjoyable.

In addition, I am grateful to Blue Waters, the Duke Graduate School, and the National Institutes of Health for supporting my research. The Blue Waters staff (special shoutouts to Dr. Victor Anisimov, Dr. Greg Bauer, Dr. Maxim Belkin, Dr. Scott Lathrop, Noni Ledford, Kjellrun Olson, and Susan Vinson) and the Blue Waters Fellows are an incredible bunch of people to be around.

Above all, I would like to express my immense gratitude to my family and my incredible wife. This dissertation is only possible through her selflessness and sacrificial love. I am also extremely grateful for the joy my precious son brings to me after a long day of research. Daddy loves you so very much. I am truly indebted to both of you.

1. Introduction

The overarching aim of my research is to understand the intricate relationship among charge transfer (CT), redox regulation, and signaling in biological systems. A CT event occurs when a donor group releases an electron or a hole to an acceptor group. CT processes govern cellular redox signaling, which occurs, for example, in response to a change in the levels of reactive oxygen species (ROS) or reactive nitrogen species (RNS) in the cell. The mitochondria, for instance, initiate plasma membrane repair in injured cells through their production of ROS, facilitated by increased calcium uptake.1 Redox signaling processes are paramount in the regulation of major cellular pathways, while structural differences in proteins and DNA can alter CT pathways, dysregulate cellular signaling, and result in the overproliferation of cells and in human diseases like cancer.

Using theoretical chemistry, modeling, and simulation, I have investigated CT in a wide variety of protein/nucleic acid systems, including the human primase-RNA/DNA complex, the p53-DNA complex, azurin, 2’-fluoroarabinonucleic acid (2’F-ANA), cytochrome p450, cytochrome c peroxidase (Ccp1), and benzylsuccinate synthase (BSS). As CT in these biological systems helps facilitate the systems’ responses to elevated levels of ROS and RNS, detailed analyses and characterizations of the kinetics and thermodynamics of CT pathways in these systems are crucial in establishing the role of CT in cellular response. This chapter introduces and frames CT mechanisms in proteins, DNA, and protein-DNA systems, and describes the aims and objectives of this dissertation research.

1.1 Charge Transfer in Proteins

Biological CT is ubiquitous in nature. A classic example is the photosynthetic photosystem II that catalyzes the light-driven oxidation of water. At the core of photosystem II lies the oxygen evolving complex (OEC), an Mn4Ca cluster that is electronically coupled to the chlorophyll complex P680 through a redox-active tyrosine-histidine pair (TyrZ-D1His190).2 This pair acts as an electron relay between P680 and the OEC components by mediating four sequential oxidations of the OEC from the S0 state to the S4 state through proton-coupled electron transfer (PCET).2-4 When the S4 state is attained, water is oxidized, dioxygen is generated, and the S0 state is regenerated.

CT in proteins can occur either through single-step tunneling (superexchange, if virtual states of the bridge (B) are present between the charge donor (D) and acceptor (A)) or through multi-step hopping. In both cases, the electronic coupling between D and A is relatively weak (see section 2.1 for a more detailed discussion of these mechanisms). This weak coupling imposes a limit of ~20 Å on the D-A distance for a single-step tunneling event to be feasible on a biological millisecond timescale.6 Charge hopping, on the other hand, can be viewed as an incoherent multi-step tunneling process in which sequential tunneling reactions deliver charges to distant sites (> 25 Å). Many studies have revealed that electrons are transferred over large distances in proteins like Complex I (NADH dehydrogenase) of the mitochondrial respiratory chain5 and in synthetically modified proteins like Ru-cytochrome (cyt) c.6 It has also been proposed that redox-active amino acids like tyrosine and tryptophan can protect proteins from harmful reactive nitrogen and oxygen species by acting as electron or hole conduits that divert highly oxidizing holes from the active site to the protein surface for scavenging.7 The arrangement of these redox-active aromatic residues modulates their electronic coupling interaction energies and could be a key factor in the evolution of hole-hopping chains in proteins.8
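As a rough illustration of the distance limit discussed above, the following sketch evaluates a single-exponential distance dependence for activationless tunneling, k(R) = k0 exp[−β(R − R0)], and inverts it to estimate the largest D-A distance reachable within a given timescale. The prefactor k0 = 10¹³ s⁻¹, the contact distance R0 = 3 Å, and the decay constant β = 1.1 Å⁻¹ are assumed, order-of-magnitude values of the kind often quoted for protein ET; they are not parameters taken from this dissertation. Because the estimate ignores any activation barrier, it should be read as an upper bound on the single-step range.

```python
import math

# Assumed, order-of-magnitude parameters (not taken from this dissertation).
K0 = 1.0e13   # tunneling rate at van der Waals contact (s^-1)
BETA = 1.1    # exponential distance decay constant (1/Angstrom)
R0 = 3.0      # contact distance (Angstrom)


def tunneling_rate(distance, k0=K0, beta=BETA, r0=R0):
    """Activationless single-step tunneling rate (s^-1) at a D-A distance in Angstrom."""
    return k0 * math.exp(-beta * (distance - r0))


def max_tunneling_distance(timescale, k0=K0, beta=BETA, r0=R0):
    """Largest D-A distance (Angstrom) for which 1/k does not exceed the timescale (s)."""
    return r0 + math.log(k0 * timescale) / beta


if __name__ == "__main__":
    for r in (10.0, 15.0, 20.0, 25.0):
        print(f"R = {r:4.1f} A   k = {tunneling_rate(r):.2e} s^-1")
    print(f"Upper bound for a 1 ms timescale: {max_tunneling_distance(1.0e-3):.1f} A")
```

With a finite activation barrier the rates at each distance are smaller, so the feasible single-step distance is shorter than this upper bound, which is why multi-step hopping becomes necessary for more distant sites.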

Protein CT can be broadly divided into two categories: 1) inter-protein CT and 2) intra-protein CT. A classic example of inter-protein CT occurs between cytochrome c (Cc) and cytochrome c oxidase (CcO). The ET rate from Cc to CcO (10–10² s⁻¹; ref. 10) is much lower than the rate estimated from the electrochemically triggered redox reaction of a surface-modified Cc bound to self-assembled monolayers on a gold electrode (∼10⁷ s⁻¹).11,12 This difference has been attributed to a conformational change at the protein interface and a possible redox-dependent gating mechanism.9 A classic example of intra-protein CT is found in the purple bacterium Rhodobacter sphaeroides R26, where the primary electron donor comprises a pair of symmetrically arranged bacteriochlorophyll a cofactors (PL and PM) coordinated by different histidine residues (His-L168 for PL and His-M202 for PM). Other cofactors present include two accessory BChls, two BPhes a, two ubiquinones, and a nonheme iron, all arranged with near twofold symmetry.13 Interestingly, the electron does not travel along both pathways; instead, ET occurs solely along a single branch.

1.2 Charge Transfer in DNA

CT in DNA has been extensively studied. DNA has robust structural, biological, and electronic properties that allow it to function as a charge mediator,14 where the helical structure of DNA and the ordered π-stacking of nucleobases allow charge propagation over nm-scale distances.15 Long-range CT is attractive due to its applications in nanotechnology and biology.16 Optimizing charge transport dynamics through DNA has thus been an intense area of research, explored through different hole donors, lengths of base-pair sequences, and types of sequences.17-24 In biology, extensive research has focused on DNA-mediated CT in DNA repair and replication proteins containing redox-active groups like iron-sulfur clusters and cysteine residues (such as DNA primase and p53), where charges can be shuttled between DNA and these groups. It has been hypothesized that interprotein communication can be established using CT.25-28 In nanotechnology, DNA is explored, for example, in self-assembling 3D architectures. Previous studies have demonstrated the structural robustness of DNA in a torsionally-stressed ‘tensegrity triangle’ motif.29,30 Engineering such architectures is useful in programmable DNA nanoelectronics based on long-range CT.31

DNA CT often involves the migration of a positive charge (a hole) across the HOMOs of the GC and AT nucleobase pairs. These HOMOs tend to localize on the purines (G or A) due to their lower oxidation potentials (and higher HOMO energies) than the pyrimidines.32 An established mechanism of long-range DNA CT is the hopping mechanism,24,33 where the charge ‘hops’ among the purine bridging sites via multistep tunneling. In contrast to standard coherent tunneling (often used to describe short-range DNA CT), recent theoretical models based on coherent transport, such as ‘deep-hole transfer’34 and transient coherence or flickering resonance (FR),35 were introduced (see section 2.1.3) to describe CT in the hopping regime; the development of these models is motivated by the observed small tunneling decay exponent β (< 1.0 Å−1) at longer distances.36 These coherent mechanisms exist because small energy gaps (< 0.1 eV) between a D and A nucleobase pair can be transiently eliminated by thermal fluctuations of the energy levels (on the order of 0.1 eV at room temperature). This allows transient delocalization of the charge from D over multiple bridge sites.

1.3 Charge Transfer in Protein-DNA Systems

Barton’s group has devoted much effort to studying DNA-mediated CT between high-potential [4Fe4S]2+/3+-containing DNA replication and repair proteins.16 DNA-mediated CT between [4Fe4S]2+/3+ clusters is hypothesized to form the basis for redox signaling between [4Fe4S] cluster-containing proteins,37-39 and one study even suggests that these [4Fe4S] cofactors may have been a key to the origins of life.40 Although these iron-sulfur clusters were once hypothesized to play a structural role, recent experiments suggest that they are involved in the regulation of protein activity.41,42 Switching between the reduced and oxidized redox states of a high-potential [4Fe4S] cluster via DNA-mediated CT may synchronize the binding and unbinding events required for repair and replication proteins. These events are governed by electrostatics: a protein containing a [4Fe4S]3+ cluster binds more tightly to the negatively charged DNA than a [4Fe4S]2+-containing protein.43 This increased DNA binding shifts the [4Fe4S]2+/3+ couple to ca. +80 mV vs NHE.26,44,45 Based on the results of protein activity assays,46,47 [4Fe4S]2+/3+ cluster pairs are proposed to act as points of communication between proteins.16,41

In base excision repair enzymes like Endonuclease III and MutY, it is proposed that mismatched base pairs are recognized by detecting changes in base stacking through DNA CT.44,46,48 Once the [4Fe4S] cluster in one protein becomes oxidized to the 3+ state, it binds more tightly to the DNA, and a second protein will attempt to transfer an electron to the first protein. If the double helix is well formed (does not contain base pair mismatches) and the bases are properly stacked, the electron is thought to transfer from the [4Fe4S]2+ cluster in the second protein to the [4Fe4S]3+ cluster in the first protein through the intervening DNA and protein medium. However, if the base stacking is disrupted by a mismatched base pair or by an amino acid mutation, DNA-mediated CT between these two proteins would be interrupted. Using this method, DNA damage could be located more efficiently than by checking every base pair for defects. Likewise, DinG, an R-loop repairing helicase, is thought to cooperate with the base excision repair enzymes to detect R-loops in the DNA through a similar mechanism.47

1.4 Aims and Objectives

To examine the feasibility of CT in the biological systems presented in this dissertation, I use a standard methodology: the biological system of interest is first simulated using molecular dynamics (MD), and electronic structure methods like density functional theory (DFT) are then used to calculate the Marcus parameters for the CT rate constants (see Chapter 2). Chapters 2-6 introduce the pertinent theoretical and computational methods that enable these studies. The main objectives for each biological system studied (Chapters 7-11) are as follows: 1) examination of the directionality of CT between the [4Fe4S]2+/3+-containing primase and the bound RNA/DNA duplex, 2) development of a computer code that maps hole hopping pathways in proteins, 3) assessment of the impact of the Y345C mutation on CT between primase and the nucleic acid duplex, 4) analysis of the modulation of CT at the p53-nucleic acid binding interface with differing DNA substrates, and 5) comparison of the charge conductivity of an artificial nucleic acid (2’F-ANA) with that of DNA.

2. Classical Marcus Theory

CT (excess electron transfer, EET, or hole transfer, HT) reactions between a donor and an acceptor can be represented by two parabolic curves describing the (free) energies of the initial and final CT states as a function of a nuclear reaction coordinate (see Figure 1). The vertical energy gap $\Delta E$, which is the energy difference between the two adiabatic states (red curves, Figure 1) at a certain nuclear configuration, can represent the nuclear reaction coordinate, although other suitable choices of the reaction coordinate can also be made. Taking the probability distribution $p$ of $\Delta E$ to be Gaussian gives rise to two parabolic free energy curves via the Landau free energy equation,49

$$G(\Delta E) = -k_B T \ln p(\Delta E) + \text{constant}.$$
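The link between Gaussian gap statistics and parabolic free energy curves can be checked numerically. The sketch below applies the Landau relation to a Gaussian distribution of the energy gap and estimates the curvature of the resulting free energy by finite differences; the curvature is the same at every point and equals $k_B T/\sigma^2$, which is what makes the curve a parabola. The mean and standard deviation used in the example are hypothetical values chosen only for illustration.

```python
import math

KB = 8.617333262e-5   # Boltzmann constant (eV/K)
T = 298.15            # temperature (K)


def gaussian_pdf(x, mean, sigma):
    """Normalized Gaussian probability density of the energy gap x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def landau_free_energy(x, mean, sigma, kbt=KB * T):
    """Landau free energy G(x) = -k_B T ln p(x), defined up to an additive constant."""
    return -kbt * math.log(gaussian_pdf(x, mean, sigma))


def curvature(x, mean, sigma, h=1.0e-3):
    """Central finite-difference estimate of the second derivative G''(x)."""
    g = lambda y: landau_free_energy(y, mean, sigma)
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


if __name__ == "__main__":
    mean, sigma = 1.0, 0.2   # hypothetical gap statistics (eV)
    for x in (0.6, 1.0, 1.4):
        print(f"x = {x:.1f} eV  G'' = {curvature(x, mean, sigma):.4f}  "
              f"k_B T / sigma^2 = {KB * T / sigma**2:.4f}")
```

The constant curvature plays the role of the force constant of the parabolas; in practice, the distribution of the gap would be sampled from MD snapshots rather than assumed.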

Figure 1: Marcus parabolas for CT (in this case, excess electron transfer) from D− (electron on donor) to A. The reaction coordinate is the vertical energy gap $\Delta E$. The driving force $\Delta G^\circ$ is the free energy difference between the two minima at $\Delta E_a^{\min}$ and $\Delta E_b^{\min}$. The curves in the figure are labeled $G_0$, $G_1$, and $G_b$. This figure is adapted from ref. 49.

Now, the free energies $G_a(\Delta E)$ and $G_b(\Delta E)$ of the respective diabatic states a and b relate to the force constant $k$ via the parabolic equations

$$G_a(\Delta E) = \frac{1}{2} k \left(\Delta E - \Delta E_a^{\min}\right)^2 \qquad (1)$$

$$G_b(\Delta E) = \frac{1}{2} k \left(\Delta E - \Delta E_b^{\min}\right)^2 + \Delta G^\circ \qquad (2)$$

where $k = G''(0)$, and $\Delta E_a^{\min}$ and $\Delta E_b^{\min}$ represent the reaction coordinates corresponding to the minima of states a and b, respectively. Suppose CT occurs at $\Delta E^\ddagger$, where $G_a(\Delta E^\ddagger) = G_b(\Delta E^\ddagger)$. Hence,

$$\frac{1}{2} k \left(\Delta E^\ddagger - \Delta E_a^{\min}\right)^2 = \frac{1}{2} k \left(\Delta E^\ddagger - \Delta E_b^{\min}\right)^2 + \Delta G^\circ \qquad (3)$$

where $\Delta G^\circ$ is the free energy difference between the energies at $\Delta E_b^{\min}$ and $\Delta E_a^{\min}$. This equation can be rearranged to

$$\Delta E^\ddagger = \frac{\Delta G^\circ}{k \left(\Delta E_b^{\min} - \Delta E_a^{\min}\right)} + \frac{1}{2} \left(\Delta E_a^{\min} + \Delta E_b^{\min}\right) \qquad (4)$$

Inserting eq. 4 into the activation energy equation,

$$\Delta G^\ddagger = \frac{1}{2} k \left(\Delta E^\ddagger - \Delta E_a^{\min}\right)^2 \qquad (5)$$

and after some rearranging, one obtains for the nonadiabatic activation free energy at the crossing point of the two diabatic curves (see Figure 1),

$$\Delta G^\ddagger = \frac{\left(\Delta G^\circ + \lambda\right)^2}{4\lambda} \qquad (6)$$

where the reorganization energy $\lambda = \frac{1}{2} k \left(\Delta E_a^{\min} - \Delta E_b^{\min}\right)^2$. The reorganization energy (see Chapter 4) can be defined as the free energy required to distort the system from the equilibrium configuration of the initial state to that of the final state while remaining on the initial diabatic curve. Assuming that the nuclei are treated classically (i.e., nuclear quantum effects are negligible in the high-temperature limit), the expression for $\Delta G^\ddagger$ can be substituted into the Fermi’s Golden Rule-like expression for the nonadiabatic rate constant $k_{DA}$,

$$k_{DA} = \frac{2\pi}{\hbar} \left|V_{IF}\right|^2 \mathrm{FC} \qquad (7)$$

where FC is the Franck-Condon factor, equivalent to $(4\pi\lambda k_B T)^{-1/2} \exp\left(-\Delta G^\ddagger / k_B T\right)$, and $V_{IF}$ is the electronic coupling (see Chapter 3). In the nonadiabatic regime, the electronic coupling between the initial and final diabatic states is weak, and there is a high probability for the system to enter the adiabatic excited state $G_1$ (see Figure 1), as $\Delta E = 2\left|V_{IF}\right|$ at the transition state coordinate. Since the system can cross to the adiabatic excited state $G_1$, the word ‘nonadiabatic’ refers to the violation of the adiabatic approximation used in the Born-Oppenheimer separation of electronic and nuclear degrees of freedom. In biological systems, nonadiabatic CT usually dominates (due to the existence of long-range electron tunneling), so the formulation for the other CT regimes (adiabatic and intermediate) is not detailed here; the reader is referred to ref. 49 for a summary of these regimes.
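The algebra leading from eqs. 1-5 to eq. 6 can be verified numerically. The sketch below assumes, as a convention for this illustration, that the parabola of state b is offset vertically by the driving force ΔG° relative to state a, and that both parabolas share the same force constant. It locates the crossing point with eq. 4, evaluates the barrier on the initial-state parabola (eq. 5), and compares the result with the closed form of eq. 6. All numerical values are hypothetical and in arbitrary but consistent energy units.

```python
def g_a(x, k, x_a):
    """Free energy of diabatic state a (eq. 1): parabola with minimum at x_a."""
    return 0.5 * k * (x - x_a) ** 2


def g_b(x, k, x_b, dg0):
    """Free energy of diabatic state b: parabola with minimum at x_b, offset by dg0."""
    return 0.5 * k * (x - x_b) ** 2 + dg0


def crossing_point(k, x_a, x_b, dg0):
    """Reaction coordinate at which the two diabatic curves intersect (eq. 4)."""
    return dg0 / (k * (x_b - x_a)) + 0.5 * (x_a + x_b)


def reorganization_energy(k, x_a, x_b):
    """Energy of state a evaluated at the minimum of state b."""
    return 0.5 * k * (x_a - x_b) ** 2


def marcus_barrier(dg0, lam):
    """Closed-form activation free energy (eq. 6)."""
    return (dg0 + lam) ** 2 / (4.0 * lam)


if __name__ == "__main__":
    k, x_a, x_b, dg0 = 2.0, 0.0, 1.0, -0.3   # hypothetical parameters
    x_cross = crossing_point(k, x_a, x_b, dg0)
    lam = reorganization_energy(k, x_a, x_b)
    print("crossing point:", x_cross)
    print("barrier from parabola (eq. 5):", g_a(x_cross, k, x_a))
    print("barrier from eq. 6:", marcus_barrier(dg0, lam))
```

Both routes give the same barrier for any choice of the force constant and minima, which is why only ΔG° and λ appear in the final rate expression.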

Thus, the nonadiabatic CT rate constant $k_{DA}$ can be described by the classical Marcus equation in the high-temperature limit (high frequency nuclear modes coupled to CT are regarded as negligible),50-52 which is dependent on several key parameters, i.e., the electronic coupling $V_{IF}$ between the donor (D) and acceptor (A) groups, the free energy change between the initial and final charge (diabatic) states $\Delta G^{\circ}$, and the reorganization energy $\lambda$,

$$k_{DA} = \frac{2\pi}{\hbar}\,|V_{IF}|^{2}\,\frac{1}{\sqrt{4\pi\lambda k_B T}}\exp\left[-\frac{(\Delta G^{\circ} + \lambda)^{2}}{4\lambda k_B T}\right] \qquad (8)$$

where $k_B$ is Boltzmann’s constant. I and F represent the initial (charge localized on D) and final (charge localized on A) electronic states. For biological CT, the temperature $T$ is often set at 298.15 K (room temperature). $\Delta G^{\circ}$ can be approximated as the difference in redox potentials (reduction potentials for excess ET and oxidation potentials for HT) calculated via the Nernst equation. However, numerous (albeit computationally expensive) methods, including thermodynamic integration, can calculate $\Delta G^{\circ}$ with greater accuracy.
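As a minimal numerical sketch of eq. 8 (an illustration, not part of the original derivation), the function below evaluates the classical Marcus rate with all energies in eV. The function name, the constant names, and the example values of the coupling, driving force, and reorganization energy are assumptions chosen only for illustration.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s
KB_EV_K = 8.617333262e-5     # Boltzmann constant in eV/K


def marcus_rate(v_if, dg0, lam, temperature=298.15):
    """Nonadiabatic Marcus rate (s^-1) from eq. 8; v_if, dg0, lam in eV."""
    kt = KB_EV_K * temperature
    prefactor = (2.0 * math.pi / HBAR_EV_S) * v_if ** 2
    prefactor /= math.sqrt(4.0 * math.pi * lam * kt)
    return prefactor * math.exp(-((dg0 + lam) ** 2) / (4.0 * lam * kt))


# Hypothetical parameters: 1 meV coupling and 0.8 eV reorganization energy.
k_activationless = marcus_rate(0.001, -0.8, 0.8)  # dG0 = -lambda
k_normal = marcus_rate(0.001, -0.2, 0.8)          # normal region
k_inverted = marcus_rate(0.001, -1.4, 0.8)        # inverted region
```

For a fixed coupling and reorganization energy, the rate is largest in the activationless case and is symmetric about it, which is why the normal-region and inverted-region examples above give the same value.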

2.1 Charge Transfer Mechanisms

Consider a D-A pair that is separated by three bridge sites 1-3, i.e., D-1-2-3-A. Depending on the strength of the electronic coupling $V$ and the energies of the site levels (with respect to the D and A levels), the mechanism and effective rate of charge migration from D to A can vary (Figure 2).36 Mechanisms that support this charge migration include single-step bridge-mediated tunneling (superexchange), flickering resonance, and multi-step tunneling (hopping). In comparison with a single-step D-A tunneling without any intervening bridge units, the presence of these virtual bridge orbitals helps increase the electronic coupling between D and A, while the other two Marcus parameters, $\Delta G^{\circ}$ and $\lambda$, both remain unchanged. These mechanisms will be further elaborated in this section.


Figure 2: (A) Superexchange, (B) flickering resonance, and (C) hopping CT mechanisms for a D-1-2-3-A chain comprising three bridge sites. This figure is reproduced with permission from ref. 49.

2.1.1 Tunneling

Starting from a description of electron tunneling through a rectangular barrier that spans from $x_1$ to $x_2$ with height $U$, where the electron of mass $m$ has an energy $E$, one obtains the following expression for the transmission coefficient $\mathcal{T}$ in the WKB (Wentzel, Kramers, Brillouin) approximation,53

$$\mathcal{T} \sim \exp\left[-\frac{2}{\hbar}\int_{x_1}^{x_2} dx\,\sqrt{2m(U - E)}\right] \qquad (9)$$

Evaluating the integral for square barriers gives $R\sqrt{2mE_I}$, where $E_I$ is the ionization energy of the donor and $R$ is the D-A distance that represents the barrier width.53 The exponential, known as the tunneling factor, can be rewritten as $\exp(-\beta R)$ where $\beta = \frac{2}{\hbar}\sqrt{2mE_I}$. Hence, the electron tunneling rate is proportional to the tunneling factor, and decays exponentially with increasing $R$. Common estimates for $\beta$ in proteins range from 1.0 Å$^{-1}$ to 1.4 Å$^{-1}$ (depending on the secondary structure).54
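To give a feel for the magnitude of the decay constant, the following sketch evaluates the square-barrier expression $\beta = (2/\hbar)\sqrt{2mE}$ for a free-electron mass. The function names and the 1 eV example barrier are illustrative assumptions; effective barriers in proteins are not taken from this text.

```python
import math

HBAR = 1.054571817e-34            # J*s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV_TO_J = 1.602176634e-19         # J per eV


def decay_constant_per_angstrom(barrier_ev, mass=ELECTRON_MASS):
    """Square-barrier decay constant beta = (2/hbar)*sqrt(2*m*E), in 1/Angstrom."""
    beta_per_meter = 2.0 * math.sqrt(2.0 * mass * barrier_ev * EV_TO_J) / HBAR
    return beta_per_meter * 1.0e-10


def tunneling_factor(beta, distance):
    """exp(-beta * R) for a decay constant (1/Angstrom) and distance (Angstrom)."""
    return math.exp(-beta * distance)


beta_1ev = decay_constant_per_angstrom(1.0)
attenuation_per_angstrom = tunneling_factor(beta_1ev, 1.0)
```

A 1 eV barrier gives a decay constant close to 1.0 Å$^{-1}$, i.e., at the lower end of the range quoted for proteins.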

2.1.2 Superexchange

Superexchange is a special case of D-A tunneling that is mediated by the bridge virtual sites (Figure 2). Suppose a system contains N bridge energy levels (let $E_B$ represent the average energy level of a bridge) that are well-separated from the donor ($E_D$) and acceptor ($E_A$) levels, while the nearest-neighbor bridge electronic coupling $V_{j,j+1}$ (j runs from 1 to N) is much smaller than the separation,53

$$|V_{j,j+1}| \ll |E_B - E_{D/A}| \qquad (10)$$

Starting from a Green’s function-based expression of effective D-A coupling for superexchange and retaining only the lowest-order term in the Dyson expansion of the Green function connecting bridge sites 1 and N,53 one arrives at the following lowest order expression for the effective coupling $V_{DA}$ for a D-A pair separated by an N-level bridge,

$$V_{DA} = \frac{V_{D1}V_{NA}}{E_{D/A} - E_B}\left(\frac{V_{BB}}{E_{D/A} - E_B}\right)^{N-1} \qquad (11)$$

where $V_{BB}$ is the bridge-bridge coupling (assuming that all $V_{j,j+1}$ are identical), and $V_{D1}$ and $V_{NA}$ are the couplings of the terminal bridge sites to D and A. In the expression of the nonadiabatic CT rate (eq. 8), $V_{IF}$ is equal to $V_{DA}$ and one then obtains the superexchange CT rate $k_{SE}$. The superexchange decay factor $\beta$ for the CT rate $k_{SE}$ (i.e., $k_{SE} \sim e^{-\beta R}$, where R is the D-A distance), in this case is approximately equal to

$$\beta = \frac{2}{L}\ln\left|\frac{E_B - E_{D/A}}{V_{BB}}\right| \qquad (12)$$

where L is the distance between redox sites (assuming equal spacing). Some examples of superexchange in molecular systems include nucleic acids, for example, guanines separated by short (T−A) bridges.49 Depending on the choice of the arene bridge, the HOMO-LUMO gap can be increased (as well as the bridge reorganization energy) in order to favor superexchange over the hopping mechanism.
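The length dependence of the superexchange coupling can be illustrated with a short sketch of the lowest-order (McConnell-type) expression for N identical bridge sites and of the corresponding rate decay factor. The function names and the numerical values (0.1 eV couplings, 1 eV gap, 3.4 Å spacing) are hypothetical and serve only to show that adding one bridge site scales the rate by $e^{-\beta L}$.

```python
import math


def superexchange_coupling(v_d1, v_na, v_bb, gap, n_bridge):
    """Lowest-order effective D-A coupling through n_bridge identical sites.

    gap is the energy separation between the D/A level and the bridge level,
    in the same units as the couplings.
    """
    return (v_d1 * v_na / gap) * (v_bb / gap) ** (n_bridge - 1)


def superexchange_beta(v_bb, gap, spacing):
    """Distance decay factor of the rate (proportional to the coupling squared)."""
    return (2.0 / spacing) * math.log(abs(gap / v_bb))


# Hypothetical bridge: 0.1 eV couplings, 1 eV gap, 3.4 Angstrom site spacing.
v3 = superexchange_coupling(0.1, 0.1, 0.1, 1.0, 3)
v4 = superexchange_coupling(0.1, 0.1, 0.1, 1.0, 4)
beta = superexchange_beta(0.1, 1.0, 3.4)
rate_ratio = (v4 / v3) ** 2  # rate ratio on adding one bridge site
```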

2.1.3 Flickering Resonance

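For a D-A system, the flickering resonance (FR) rate $k_{FR}$ is equal to $\frac{1}{\tau}P_{match}(2)$,35 where $1/\tau$ is the CT rate from D$^-$ (electron on D) to A ($\tau$ is the CT time), and $P_{match}(2)$, the probability that the D and A energy levels are within the magnitude of the D-A electronic coupling, is defined as35

$$P_{match}(2) = \int dV\,\rho_V(V)\int dE_1\,\rho_D(E_1)\int_{E_1-|V|}^{E_1+|V|} dE_2\,\rho_A(E_2) \qquad (13)$$

assuming uncorrelated fluctuations of the D/A energy levels and D-A electronic coupling. The energy fluctuations (Figure 2) are accounted for by $\rho_D(E_1)$ and $\rho_A(E_2)$, the probability density functions of the energy levels of site 1 (D) and site 2 (A) respectively. By representing $\rho_D(E_1)$ and $\rho_A(E_2)$ using Gaussian spectral functions to describe the oxidation of D$^-$ and reduction of A ($\Delta E_D = E_1 - E_D$, $\Delta E_A = E_2 - E_A$),35,55

$$\rho_D(E_1) = \rho_D(\Delta E_D) = \frac{1}{\sqrt{2\pi}\,\sigma_D}\,e^{-(\Delta E_D)^2/2\sigma_D^2} \qquad (14)$$

$$\rho_A(E_2) = \rho_A(\Delta E_A) = \frac{1}{\sqrt{2\pi}\,\sigma_A}\,e^{-(\Delta E_A)^2/2\sigma_A^2} \qquad (15)$$

$\sigma_D^2$ ($\sigma_A^2$), the variance, is equal to $2\lambda_D k_B T$ ($2\lambda_A k_B T$), where $\lambda_D$ and $\lambda_A$ are the respective D and A contributions to the overall reorganization energy $\lambda$. $\rho_V(V)$, which accounts for the coupling fluctuations, is equivalent to $\delta(V - |V_{DA}|)$. The FR rate $k_{FR}$ for an N-state D-B$_{N-2}$-A system is equal to $\frac{1}{\tau}P_{match}(N)$,35 where

$$P_{match}(N) = \int dV\,\rho_V(V)\int dE_1\,\rho(E_1)\int_{E_1-|V|}^{E_1+|V|} dE_2\,\rho(E_2)\cdots\int_{\max\{E_1,E_2,\ldots\}-|V|}^{\min\{E_1,E_2,\ldots\}+|V|} dE_N\,\rho(E_N) \approx e^{-\Phi R} \qquad (16)$$

$\Phi = \frac{1}{\Delta R}\ln\left(\sqrt{\frac{\pi}{2}}\,\frac{\sigma_E}{V_{rms}}\right)$, where $\Delta R$ is the distance between adjacent sites, $V_{rms}$ is the root-mean-square fluctuation of the nearest-neighbor electronic coupling, and $\sigma_E$ is the standard deviation of the site energies.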

The FR mechanism was validated in refs. 56 and 57 by the approximately linear dependence of the resistance (and CT rate) on V, where coherence is promoted by the strong electronic coupling between odd-length bridging G nucleobases (where the HOMO levels of the bases can be in resonance with the electrode Fermi levels).

2.1.4 Hopping

In contrast to the coherent superexchange mechanism, in the incoherent hopping mechanism, the charge migrates from D to A via a series of independent tunneling events (multi-step tunneling). Each bridge site is populated, as the bridge energy levels are closer to the D/A levels and are therefore more accessible to the tunneling charge. The charge then ‘hops’ from D through each bridge site and eventually onto A (Figure 2), where each ‘hop’ represents a tunneling process. This process can be described by a master equation (see Chapter 6).
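A minimal sketch of such a master equation for a linear chain is given below, assuming nearest-neighbor hops only and a simple explicit Euler time step. It is not the treatment of Chapter 6; the function name, the rate constants (equal 10$^9$ s$^{-1}$ hops with no back transfer from A), and the time step are hypothetical.

```python
def propagate_hopping(k_forward, k_backward, dt, n_steps):
    """Euler-integrate the site populations of a linear hopping chain.

    k_forward[i]  : rate of the hop from site i to site i+1 (s^-1)
    k_backward[i] : rate of the hop from site i+1 to site i (s^-1)
    The charge starts on site 0 (the donor).
    """
    n_sites = len(k_forward) + 1
    populations = [1.0] + [0.0] * (n_sites - 1)
    for _ in range(n_steps):
        change = [0.0] * n_sites
        for i in range(n_sites - 1):
            # Net population flow from site i to site i+1.
            net_flux = k_forward[i] * populations[i] - k_backward[i] * populations[i + 1]
            change[i] -= net_flux
            change[i + 1] += net_flux
        populations = [p + dt * dp for p, dp in zip(populations, change)]
    return populations


# Hypothetical D-1-2-3-A chain: the acceptor acts as an irreversible trap.
k_fwd = [1.0e9, 1.0e9, 1.0e9, 1.0e9]
k_bwd = [1.0e9, 1.0e9, 1.0e9, 0.0]
final_populations = propagate_hopping(k_fwd, k_bwd, dt=1.0e-11, n_steps=20000)
```

After 200 ns essentially all of the population has reached A, while the total population is conserved at every step. A production code would more likely diagonalize the rate matrix or use an adaptive integrator rather than a fixed Euler step.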


3. Electronic Couplings

The donor-acceptor electronic coupling $V_{DA}$ can be computed in several ways. If the charge localization (diabatic) states are orthogonal, $V_{DA}$ is the off-diagonal electronic coupling matrix element $H_{DA}$ between a charge donor and acceptor,

$$V_{DA} \equiv H_{DA} = \langle\Psi_D|H|\Psi_A\rangle \qquad (17)$$

where $\Psi_D$ and $\Psi_A$ are the N-electron wavefunctions of the two orthogonal diabatic states when the charge is localized on the donor and acceptor respectively. $H$ is the electronic Hamiltonian for the two-state system including the surrounding environment. Since post Hartree-Fock methods are costly, empirical/semi-empirical (Section 3.1) and electronic structure (Section 3.2) methods have been developed to calculate $V_{DA}$.

3.1 Empirical/Semi-Empirical Models

The Hopfield, tunneling pathway, and average packing density models are used to calculate VDA between redox-active residues in proteins. The pathway model considers the atomistic structure of the protein in terms of covalent bonds, hydrogen bonds, and through space parameters, while the average packing density model is essentially equivalent in its treatment.58

3.1.1 Hopfield Model

For couplings involving π-systems such as aromatic residues, $V_{DA}$ for proteins was approximated by Hopfield55 with a square barrier model

$$V_{DA} = \frac{2.7}{\sqrt{N_D N_A}}\exp(-0.72R) \qquad (18)$$

with a 2 eV barrier height, where $R$ is the edge-to-edge distance (in Å) between D and A, and $N_D$ and $N_A$ are the number of atoms in D and A respectively.
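Eq. 18 is simple enough to evaluate directly. The sketch below assumes that the prefactor 2.7 is in eV (as in Hopfield's original estimate), so that the returned coupling is in eV; the function name and the example fragment sizes are illustrative assumptions.

```python
import math


def hopfield_coupling(r_edge, n_donor, n_acceptor):
    """Hopfield square-barrier estimate of V_DA (eV, assuming a 2.7 eV prefactor).

    r_edge is the edge-to-edge distance in Angstrom; n_donor and n_acceptor
    are the numbers of atoms in the donor and acceptor pi-systems.
    """
    return 2.7 / math.sqrt(n_donor * n_acceptor) * math.exp(-0.72 * r_edge)


# Hypothetical pair of nine-atom aromatic groups at 5 and 6 Angstrom.
v_5 = hopfield_coupling(5.0, 9, 9)
v_6 = hopfield_coupling(6.0, 9, 9)
```

Each additional ångström reduces the coupling by a factor of $e^{-0.72}$ and the rate, which scales with the coupling squared, by $e^{-1.44}$.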

3.1.2 Pathway Tunneling Model

The pathway tunneling model was introduced in the late 1980s.59,60 In this model, $V_{DA}$ is proportional to the product of tunneling decay factors $\varepsilon$,

� = � �(�) �(�) �(�) (19)

�, �, and � are the penalties for tunneling across covalent bonds, hydrogen bonds, and through-space respectively. � = 0.6 was obtained from saturated hydrocarbon-bridged

61 (.) donor-acceptor systems, � = � � was obtained by approximating a

59,60 (.) hydrogen bond as two stretched covalent bonds, and � = �� (� = 1.1 Å

58 was based on redox cofactor binding energies ). � and � are the lengths of the hydrogen bond (heteroatom-heteroatom) and through-space tunneling step, respectively. Because � is unitless, the pathway tunneling model does not predict the absolute VDA value but rather the relative values for each tunneling pathway identified.

The most dominant pathway would then correspond to the one with the largest value.
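Ranking pathways by the product of their decay factors can be sketched as follows. The covalent penalty of 0.6 is the value quoted above; the distance-dependent forms used for hydrogen bonds and through-space jumps (a 1.7 Å$^{-1}$ decay with 2.8 Å and 1.4 Å reference lengths) are the commonly quoted pathway-model parameterization and are assumptions here, exposed as keyword arguments so that other values can be substituted. The two example pathways are hypothetical.

```python
import math


def pathway_decay_product(n_covalent, hbond_lengths, space_lengths,
                          eps_c=0.6, decay=1.7, r_hbond_ref=2.8, r_space_ref=1.4):
    """Product of per-step decay factors along one tunneling pathway.

    n_covalent    : number of covalent bonds in the pathway
    hbond_lengths : heteroatom-heteroatom H-bond lengths (Angstrom)
    space_lengths : through-space jump lengths (Angstrom)
    The H-bond and through-space expressions are assumed parameterizations.
    """
    product = eps_c ** n_covalent
    for r in hbond_lengths:
        product *= eps_c ** 2 * math.exp(-decay * (r - r_hbond_ref))
    for r in space_lengths:
        product *= eps_c * math.exp(-decay * (r - r_space_ref))
    return product


# Two hypothetical routes between the same donor and acceptor.
path_a = pathway_decay_product(8, [2.9], [])   # longer, fully bonded route
path_b = pathway_decay_product(6, [], [3.5])   # shorter route with a vacuum jump
best = "A" if path_a > path_b else "B"
```

With these assumed parameters the longer bonded route wins, because a single 3.5 Å through-space jump costs more than two extra covalent bonds and a hydrogen bond.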

Blumberger49 notes that the packing density model is less accurate at smaller donor-acceptor distances (< 10 Å), where empty space regions can dominate (making it more difficult to estimate the appropriate value of $\rho$) and where $V_{DA}$ is sensitive to the donor and acceptor orientations (for instance, $V_{DA}$ for a heme dimer was demonstrated to change by orders of magnitude for different orientations).49,62 Similar limitations are noted by Jones et al.58

3.1.3 Average Packing Density Model

The average packing density model was introduced in 1999 and captures the same tunneling barrier effects as the pathway model.58 To account for variations in the coupling decay constant $\beta$ (which is proportional to the square root of the barrier height $U$, $\beta = 2\sqrt{2mU}/\hbar$, where $m$ is the effective mass) that depend on the average packing density $\rho$ of the protein containing D and A, the squared coupling is written as58

$$|V_{DA}|^{2} = A^{2}\exp\{-[0.9\rho + 2.8(1 - \rho)][R - 3.6]\} \qquad (20)$$

where $R$ is the edge-to-edge distance between the donor and acceptor, and $A$ is a preexponential factor equivalent to58

/ � = �√�� (21)

where � is the width of the potential well, and |�| = � when � = 3.6 Å. Rates obtained using the model in the activationless (∆�° = −�) regime were found to closely agree with experimental rates.63 If one uses a value of � = 0.7663 for the protein packing density, then

$$\left(|V_{DA}|^{2}\right)_{\rho = 0.76} = A^{2}\exp\{-1.4[R - 3.6]\} \qquad (22)$$

When $\rho = 0$ (in gas-phase),

$$\left(|V_{DA}|^{2}\right)_{\rho = 0} = A^{2}\exp\{-2.8[R - 3.6]\} \qquad (23)$$

Beyond the empirical approach, if DFT is used to calculate the coupling in eq. 23, then the gas-phase DFT coupling can be ‘dressed’ by the protein medium, with the following expression for the ‘dressed’ couplings $V_{DA}^{dressed}$ obtained by dividing eq. 22 by eq. 23 and taking the square root,27

$$V_{DA}^{dressed} = V_{DA}^{DFT}\exp\{0.7[R - 3.6]\} \qquad (24)$$
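The arithmetic connecting eqs. 20 and 22-24 can be checked with a few lines of code. The sketch below computes the packing-density-dependent decay constant and the factor that converts a gas-phase coupling into a ‘dressed’ one; the function names and the example coupling and distance are hypothetical.

```python
import math


def packing_beta(rho):
    """Decay constant (1/Angstrom) of the squared coupling for packing density rho."""
    return 0.9 * rho + 2.8 * (1.0 - rho)


def dress_gas_phase_coupling(v_gas, r_edge, rho=0.76):
    """Scale a gas-phase coupling by the square root of the ratio of eq. 20
    evaluated at packing density rho and at rho = 0."""
    exponent = 0.5 * (packing_beta(0.0) - packing_beta(rho)) * (r_edge - 3.6)
    return v_gas * math.exp(exponent)


beta_protein = packing_beta(0.76)   # about 1.36, rounded to 1.4 in eq. 22
beta_vacuum = packing_beta(0.0)     # 2.8, as in eq. 23
v_dressed = dress_gas_phase_coupling(1.0e-3, 10.0)  # hypothetical 1 meV at 10 Angstrom
```

Without rounding, the exponent in the dressing factor is about 0.72 per ångström, which eq. 24 quotes as 0.7.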

3.2 Electronic Structure Models

The electronic structure models described below differ mainly in the description of the D and A states (diabatic, adiabatic) and the level of approximation of these states.

There are many theoretical approaches to generate diabatic states, for example, using constrained DFT (CDFT),64,65 and to compute electronic couplings.66,67

3.2.1 Generalized Mulliken-Hush Method

The generalized Mulliken-Hush (GMH) approach introduced by Cave and Newton68 builds on studies of Mulliken and Hush on CT spectra. The Mulliken-Hush approach links the transition dipole moment to the diabatic coupling that arises in the two-state approximation,

$$V_{DA} = \frac{\mu_{12}\,\Delta E_{12}}{\Delta\mu_{DA}} = \frac{\mu_{12}\,\Delta E_{12}}{\sqrt{\Delta\mu_{12}^{2} + 4\mu_{12}^{2}}} \qquad (25)$$

where $\mu_{12}$ is the adiabatic transition dipole moment, $\Delta E_{12}$ is the energy difference between the two adiabatic states (vertical excitation energy), $\Delta\mu_{12}$ is the difference of the adiabatic state dipole moments, and $\Delta\mu_{DA}$ is the difference of the diabatic state dipole moments. The energies and dipole moment elements are calculated using electronic structure methods or taken from experimental data. A key feature of the GMH approach is that only adiabatic quantities are required to calculate $V_{DA}$, hence it is a common approach for including multiple adiabatic states which are admixtures of diabatic states. Ref. 69 derives a GMH expression that includes bridge states (in addition to the ground and charge-transfer states).
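As an illustration of the two-state GMH expression, the following sketch computes the coupling from adiabatic quantities only. The function name and the numerical inputs are hypothetical; any consistent set of units can be used, with the coupling returned in the units of the energy gap.

```python
import math


def gmh_coupling(mu_12, delta_e_12, delta_mu_12):
    """Two-state generalized Mulliken-Hush coupling from adiabatic quantities.

    mu_12       : adiabatic transition dipole moment
    delta_e_12  : vertical excitation energy between the two adiabatic states
    delta_mu_12 : difference of the adiabatic state dipole moments
    """
    return abs(mu_12) * delta_e_12 / math.sqrt(delta_mu_12 ** 2 + 4.0 * mu_12 ** 2)


# Fully delocalized limit: equal adiabatic dipoles, coupling is half the gap.
v_delocalized = gmh_coupling(1.0, 2.0, 0.0)
# Strongly localized example: large dipole difference, small transition dipole.
v_localized = gmh_coupling(0.5, 1.0, 10.0)
```

The delocalized limit recovers half of the adiabatic energy gap, which is the same relation used in orbital-splitting estimates of the coupling at the transition state.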

3.2.2 Effective Hamiltonians

For a D-B-A system, the reduced or effective Hamiltonian $H_{eff}$ can be obtained from the Löwdin partitioning scheme and represented as70,71

$$H_{eff}(E) = \begin{pmatrix} h_{DD}(E) & T_{DA}(E) \\ T_{AD}(E) & h_{AA}(E) \end{pmatrix} \qquad (26)$$

where

$$T_{DA}(E) = V_{DA} + \sum_{i,j} V_{Di}\,G_{ij}(E)\,V_{jA} \qquad (27)$$

$E$ is the tunneling energy, $V_{DA}$ is the electronic interaction between donor and acceptor, and $V_{Di}$ and $V_{jA}$ represent the electronic interactions between the donor and bridge site i, and bridge site j and the acceptor respectively. $G_{ij}$ is an element of the Green’s function matrix and represents the tunneling probabilities through bridge orbitals.70,71

3.2.2.1 Localized Orbitals

Methods for constructing localized orbitals to represent the D, B, or A localized states are implemented in popular DFT packages. Some choices include the Pipek-Mezey72 and Foster-Boys73 localization routines implemented in GAMESS and NWChem.

3.2.3 Constrained Density Functional Theory (CDFT)

Charge-localized diabatic states can also be constructed via CDFT,64 which involves minimizing the Kohn-Sham DFT energy under the constraint that the charge is localized either on D or A. In CDFT, a functional $W[\rho, V_c]$ is minimized using the Lagrange multiplier $V_c$ under the constraint that an amount of charge (or charge difference between two regions) $N_c$ is distributed among the atoms defined by the weight function $w_c(\mathbf{r})$.64 If $w_c(\mathbf{r}) = 1$ within domain C and 0 otherwise, the charge is constrained among the atoms in C.74

$$\int w_c(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r} - N_c = 0 \qquad (28)$$

$$W[\rho, V_c] = \min_{\rho} E[\rho] + V_c\left[\int w_c(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r} - N_c\right] \qquad (29)$$

where $E[\rho]$ is the Kohn-Sham energy functional of the electron density $\rho(\mathbf{r})$. This leads to the standard Kohn-Sham formulation,75 but with an added constraint potential $V_c w_c(\mathbf{r})$,64,65

$$\left[-\frac{1}{2}\nabla^{2} + v_{ext}(\mathbf{r}) + \int\frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}' + v_{xc}(\mathbf{r}) + V_c w_c(\mathbf{r})\right]\phi_i(\mathbf{r}) = \varepsilon_i\phi_i(\mathbf{r}) \qquad (30)$$

where the first four terms on the LHS correspond to the kinetic, external, Coulomb (Hartree), and exchange-correlation contributions, respectively.75 From an initial guess of $V_c$ used to solve the Kohn-Sham equation, $\rho(\mathbf{r}) = \sum_i |\phi_i(\mathbf{r})|^{2}$ is obtained. $\rho(\mathbf{r})$ is then substituted into the constraint equation to determine whether or not $\int w_c(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}$ is equal to $N_c$. If not, this procedure is iterated with new values of $V_c$ until $W[\rho, V_c]$ is minimized under the charge constraint and $\rho(\mathbf{r})$ ensures that $\int w_c(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}$ is equal to $N_c$. The wavefunction for the initial diabatic state (before CT) is then represented as the N-electron determinant, while the energy of the initial state is obtained when $N_c = 1$ (excess electron on donor). The aforementioned steps are repeated with $N_c = -1$ (excess electron on acceptor) to obtain the energy of the final diabatic state.

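As CDFT generates non-orthogonal diabatic states A and B where $\langle A|B\rangle \neq 0$, the electronic Hamiltonian $H$ in the non-orthogonal diabatic basis is represented as76

$$H = \frac{1}{1 - S_{AB}^{2}}\begin{pmatrix} H_{AA} - S_{AB}H_{BA} & H_{AB} - S_{AB}H_{BB} \\ H_{BA} - S_{AB}H_{AA} & H_{BB} - S_{AB}H_{AB} \end{pmatrix} \qquad (31)$$

where $S_{AB} = \langle A|B\rangle$, $S_{BA} = \langle B|A\rangle$, $H_{AA} = \langle A|H_A|A\rangle$ where $H_A$ is the Kohn-Sham Hamiltonian for state A, $H_{BB} = \langle B|H_B|B\rangle$ where $H_B$ is the Kohn-Sham Hamiltonian for state B, $H_{AB} = \langle A|H_B|B\rangle$, and $H_{BA} = \langle B|H_A|A\rangle$.77 Since $H$ is not Hermitian, the two off-diagonal elements are averaged and the result is used as the electronic coupling.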

Refs. 78 and 79 present an ab initio method for calculating $V_{IF}$ that relies on a full-electron description and thus includes electronic relaxation effects in the generation of the diabatic states via CDFT using hybrid exchange-correlation (XC) density functionals.65,80 The method also allows the usage of post-HF schemes for state generation. Importantly, the method provides exact values of $V_{IF}$ within the two-state model and treats both orthogonal and nonorthogonal diabatic states (without assuming complete localization of the charge on the donor or acceptor respectively).6 Within the model, where the electronic ground-state wavefunction $|\psi\rangle$ is a linear combination of the diabatic states $|\psi_I\rangle$ and $|\psi_F\rangle$, i.e., $|\psi\rangle = a|\psi_I\rangle + b|\psi_F\rangle$, $V_{IF}$ is calculated as follows:

$$V_{IF} = \begin{cases} \dfrac{ab}{a^{2} - b^{2}}\,\Delta E_{IF}\left(1 + \dfrac{a^{2} + b^{2}}{2ab}\,S_{IF}\right)\dfrac{1}{1 - S_{IF}^{2}} & (\Delta E_{IF} \neq 0) \\[2ex] \dfrac{\Delta E_v}{2} & (\Delta E_{IF} = 0) \end{cases} \qquad (32)$$

where a, b, and $S_{IF}$ are obtained from overlap integrals ($a + bS_{IF} = \langle\psi_I|\psi\rangle$, $b + aS_{IF} = \langle\psi_F|\psi\rangle$, $S_{IF} = \langle\psi_I|\psi_F\rangle$), and $\Delta E_{IF}$ ($\Delta E_v$) is the (vertical) energy difference between the diabatic (adiabatic) states. At the transition state coordinate where $\Delta E_{IF} = 0$ and a = b, the first expression in eq. 32 has a removable discontinuity and can be reduced to the second expression. However, the second expression is rarely used because the nuclear coordinates are almost never at the transition state coordinate79 (i.e., $\Delta E_{IF} \ll 2V_{IF}$), which is a zero-measure subspace of the entire conformational space.81 The extent of satisfying the two-state approximation is measured by the normalization parameter N, defined as $a^{2} + b^{2} + 2abS_{IF}$, which is equal to 1 when the approximation is exactly satisfied.

For the calculation of $V_{IF}$ presented in Chapters 7, 10, and 11, the CDFT module of the NWChem package82 (versions 6.5 and 6.6) was used to obtain the diabatic energies and wavefunctions with the 6-311G** basis set and the M11 functional, which reproduces the correct 1/r asymptotic behaviour due to the inclusion of 100% Hartree-Fock exchange in the long range.83 The overlap integrals in eq. 32 were obtained using the ET module in the NWChem code. The range-separated M11 functional was ascertained to be the best-performing XC functional81 among the set of hybrid XC functionals (B3LYP, PBE0, SOGGA11-X, CAM-B3LYP, M05-2X, M06-2X, M06-HF, M08-HX, M08-SO, M11, ωB97M-V, BHLYP, ωB97M-V, LRC-ωPBEh) investigated, since the $V_{IF}$ value (computed using M11) for the G-A pair is similar to the ones computed using higher-level post-HF methods like CASSCF and CAS-PT2.84

3.2.4 Orbital Splitting

In the non-adiabatic regime, the energy gap $\Delta E_{12}$ between the ground and excited state at the transition state coordinate is equivalent to $2V_{IF}$. Using Koopmans’ theorem and neglecting orbital relaxation, $V_{IF}$ for HT can be approximated as

$$V_{IF,HT} \approx \frac{E_{HOMO} - E_{HOMO-1}}{2} \qquad (33)$$

where $E_{HOMO}$ and $E_{HOMO-1}$ are respectively the HOMO (representing the excited state of the positively-charged system) and HOMO-1 (representing the ground state of the positively-charged system) energies of the neutral system that can be accurately obtained by DFT with a suitable XC potential.85 Likewise, for excess ET,

$$V_{IF,ET} \approx \frac{E_{LUMO+1} - E_{LUMO}}{2} \qquad (34)$$

where $E_{LUMO}$ and $E_{LUMO+1}$ are respectively the LUMO (representing the ground state of the negatively-charged system) and LUMO+1 (representing the excited state of the negatively-charged system) energies.


4. Reorganization Energy

The reorganization energy $\lambda$ for CT between D and A is made up of two parts: the inner-sphere ($\lambda_{in}$) and outer-sphere ($\lambda_{out}$) components. $\lambda_{in}$ accounts for the internal structural distortions of the solute upon CT. Since quantum-mechanical calculations are typically performed in the gas-phase, $\lambda_{in}$ can be obtained from these calculations. On the other hand, $\lambda_{out}$ arises from the rearrangement of solvent and solvent polarization coupled to CT. One could use elaborate free energy methods to obtain $\lambda_{out}$, or approximate it more quickly using reasonable choices of the dielectric constants, effective radii, and D-A distances (see below).

4.1 Marcus’ Two-Sphere Model

As the name suggests, Marcus used two spheres to describe D and A with effective radii $a_D$ and $a_A$ respectively, where the point charges are located at the centers of the spheres.86 These two spheres are separated by a center-to-center distance of $R$ and located in a dielectric continuum characterized by optical and static dielectric constants $\epsilon_{op}$ and $\epsilon_{st}$, respectively. The image charge effect arising from the charge polarization is ignored. The Marcus two-sphere expression for the reorganization energy $\lambda$ is52

$$\lambda = \lambda_{in} + \left(\frac{1}{\epsilon_{op}} - \frac{1}{\epsilon_{st}}\right)(\Delta q)^{2}\left(\frac{1}{2a_D} + \frac{1}{2a_A} - \frac{1}{R}\right) \qquad (35)$$

where $\lambda_{in}$ is the inner-sphere component of $\lambda$; from DFT calculations, the reorganization energies for a G-G nucleobase pair ($\lambda_{GG}$) and an A-A pair ($\lambda_{AA}$) are taken to be 0.746 eV and 0.423 eV, respectively.87 $\Delta q$ is the amount of charge transferred (in units of the elementary charge). The second term in eq. 35 represents the outer-sphere reorganization energy $\lambda_{out}$. The centers of the spheres can be represented as the center of charge (as opposed to the geometric center), which is obtained from DFT calculation of Löwdin spin densities using M11/6-311G** and applying the formula $\mathbf{r}_c = \sum_i q_i\mathbf{r}_i$, where $\mathbf{r}_i$ is the position vector of atom i and $q_i$ is the corresponding charge. $R$ can then be calculated as the distance between the D and A centers of charge. The effective radius can be calculated by averaging the distances between each atom in a system of N atoms and the center of charge, i.e., $a_X = \frac{1}{N}\sum_i |\mathbf{r}_i - \mathbf{r}_c|$, where X = D or A.27,28 The reorganization energy $\lambda$ (and likewise $\lambda_{in}$ or $\lambda_{out}$) can also be approximated as51,88

$$\lambda = \frac{\lambda_{DD} + \lambda_{AA}}{2} = \lambda_D + \lambda_A \qquad (36)$$

where $\lambda_{DD}$ ($\lambda_{AA}$) is the reorganization energy for the weak-overlap self-exchange reaction in the D-D (A-A) pair.
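The outer-sphere term of the two-sphere model is easy to evaluate once units are fixed. The sketch below works in eV and Å by using $e^{2}/4\pi\epsilon_0 \approx 14.40$ eV Å; the function name, the radii, the D-A distance, and the dielectric constants (2.0 optical and 78.4 static, water-like values) are illustrative assumptions rather than parameters from this work.

```python
COULOMB_EV_ANGSTROM = 14.3996  # e^2 / (4*pi*eps0) in eV*Angstrom


def outer_sphere_lambda(a_donor, a_acceptor, r_da, eps_optical, eps_static, dq=1.0):
    """Marcus two-sphere outer-sphere reorganization energy in eV.

    Radii and distance are in Angstrom; dq is in units of the elementary charge.
    """
    dielectric_term = 1.0 / eps_optical - 1.0 / eps_static
    geometry_term = 1.0 / (2.0 * a_donor) + 1.0 / (2.0 * a_acceptor) - 1.0 / r_da
    return COULOMB_EV_ANGSTROM * dq ** 2 * dielectric_term * geometry_term


# Hypothetical D-A pair: 3 Angstrom radii, 10 Angstrom apart, water-like medium.
lambda_out_water = outer_sphere_lambda(3.0, 3.0, 10.0, 2.0, 78.4)
# The same geometry in a low-dielectric, protein-like medium (assumed eps_st = 4).
lambda_out_protein = outer_sphere_lambda(3.0, 3.0, 10.0, 2.0, 4.0)
```

The comparison shows how strongly the estimate depends on the assumed static dielectric constant, which is one reason the choice of dielectric constants needs care.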

4.1.1 Inner-Sphere Reorganization Energy

The inner-sphere reorganization energy $\lambda_{in}$ is commonly computed using Nelsen’s four-point method,89

$$\lambda_{in} = \frac{1}{2}\left\{E(D^{+}A, DA^{+}) - E(D^{+}A, D^{+}A) + E(DA^{+}, D^{+}A) - E(DA^{+}, DA^{+})\right\} \qquad (37)$$

where $E(a, b)$ represents the energy of the solute in redox state a calculated using the geometrically optimized structure of state b. In the above example, both a and b can either represent the initial state $D^{+}A$ (hole resides on D) or final state $DA^{+}$ (hole resides on A).
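The bookkeeping of the four-point method can be sketched with a toy model in which both charge states are harmonic wells along one coordinate, so that the expected result is known analytically. The function name and the model parameters are hypothetical; in practice the four energies would come from electronic structure calculations.

```python
def four_point_lambda(e_init_at_final_geom, e_init_at_init_geom,
                      e_final_at_init_geom, e_final_at_final_geom):
    """Average of the two vertical relaxation energies (four-point scheme)."""
    relaxation_initial = e_init_at_final_geom - e_init_at_init_geom
    relaxation_final = e_final_at_init_geom - e_final_at_final_geom
    return 0.5 * (relaxation_initial + relaxation_final)


# Toy model: identical harmonic wells (force constant 2.0) displaced by 1.0,
# with the final state shifted by a driving force of -0.3.
def e_initial(q):
    return 0.5 * 2.0 * (q - 0.0) ** 2


def e_final(q):
    return 0.5 * 2.0 * (q - 1.0) ** 2 - 0.3


lambda_toy = four_point_lambda(e_initial(1.0), e_initial(0.0),
                               e_final(0.0), e_final(1.0))
```

For identical parabolas the result equals half the force constant times the squared displacement, independent of the driving force.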

Alternatively, $\lambda_{in}$ has been estimated starting from the expression based on the harmonic displacement of the coordinates of atom i and j, $\Delta q_i$ and $\Delta q_j$ respectively,

$$\lambda_{in} = \frac{1}{2}\sum_{i,j} k_{i,j}\,\Delta q_i\,\Delta q_j \qquad (38)$$

where the summation is over all atoms of the system and $k_{i,j}$ is the force constant between atom i and j from Hooke’s law. Substituting $k = \frac{2k^{(I)}k^{(F)}}{k^{(I)} + k^{(F)}}$ into the above equation and replacing $\Delta q$ with the normal-mode displacements $\Delta Q$, one obtains90

$$\lambda_{in} = \sum_j \frac{k_j^{(I)}k_j^{(F)}}{k_j^{(I)} + k_j^{(F)}}\,(\Delta Q_j)^{2} \qquad (39)$$

where $k^{(I)}$ and $k^{(F)}$ are the force constants for the initial and final states (before and after CT), and the summation is over all bonds involved. The bond force constant $k$ can be obtained using an empirical expression introduced by Gordy91

$$k = 1.67\,N\left(\frac{\chi_A\chi_B}{d^{2}}\right)^{3/4} + 0.30 \qquad (40)$$

where $k$ is in units of $10^{5}$ dynes/cm (equivalently, $k \times 10^{-5}$ is in dynes/cm) and $d$ is in Å. $N$ is the bond order, $d$ is the bond length, and $\chi_A$ and $\chi_B$ are the electronegativities of A and B respectively.
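A short sketch of how eqs. 39 and 40 might be combined is given below. It assumes Gordy's 3/4-power form of the empirical rule and Pauling electronegativities; the function names, the C-C example (electronegativity 2.55, bond length 1.54 Å), and the single-mode test values are illustrative assumptions, and the caller is responsible for using consistent units in the reorganization-energy sum.

```python
def gordy_force_constant(bond_order, chi_a, chi_b, bond_length):
    """Gordy's empirical bond force constant, in units of 1e5 dyn/cm.

    bond_length is in Angstrom; chi_a and chi_b are electronegativities.
    """
    return 1.67 * bond_order * (chi_a * chi_b / bond_length ** 2) ** 0.75 + 0.30


def harmonic_inner_lambda(k_initial, k_final, displacements):
    """Sum over modes of k_I*k_F/(k_I + k_F) * dQ^2 (consistent units assumed)."""
    return sum(ki * kf / (ki + kf) * dq ** 2
               for ki, kf, dq in zip(k_initial, k_final, displacements))


# Hypothetical C-C single bond.
k_cc = gordy_force_constant(1, 2.55, 2.55, 1.54)
# One mode with equal force constants before and after CT.
lambda_single_mode = harmonic_inner_lambda([2.0], [2.0], [1.0])
```

When the force constants of the two states are equal, each term reduces to the familiar $\frac{1}{2}k(\Delta Q)^{2}$.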


4.1.2 Outer-Sphere Reorganization Energy

The thermodynamic integration (TI) method is well-established to calculate Marcus parameters like $\lambda$ and $\Delta G^{\circ}$ for a D-A pair via a ‘coupling’ parameter Λ. If Λ = 0.8, this means that 80% of the charge is localized on A and 20% is localized on D. The total potential energy of the D-A pair, $U$, can thus be expressed as a linear relation92-94

$$U(\Lambda) = (1 - \Lambda)\,U_I + \Lambda\,U_F \qquad (41)$$

where $U_I$ and $U_F$ are the potential energies of the initial and final charge states. $\Delta G^{\circ}$ can be obtained by integrating the Boltzmann-weighted partial derivative of $U$ with respect to Λ. This can be approximated as the weighted sum of the partial derivatives at various values of Λ (1 through n). Angular brackets denote Boltzmann-weighted averages obtained by computing the partial derivatives for a range of MD snapshots.92

$$\Delta G^{\circ} = \int_0^1 \left\langle\frac{\partial U(\Lambda)}{\partial\Lambda}\right\rangle_{\Lambda} d\Lambda \approx \sum_{i=1}^{n} w_i\left\langle\frac{\partial U(\Lambda)}{\partial\Lambda}\right\rangle_{\Lambda_i} \qquad (42)$$

where $w_i$ denote the weights, which depend on the numerical integration scheme (e.g., Simpson’s rule, trapezoidal rule, etc.). The total reorganization energy is then calculated from a linear fit of the averaged derivative against Λ,92-94

$$\lambda = b - \Delta G^{\circ} = -m/2 \qquad (43)$$

where $b$ is the y-intercept and $m$ is the slope of the linear regression function.
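The post-processing implied by eqs. 42 and 43 can be sketched as follows, using synthetic averaged derivatives that obey linear response exactly. In practice each value would be a Boltzmann average over MD snapshots at one value of Λ; the helper names, the Λ grid, and the input values of the driving force and reorganization energy are hypothetical.

```python
def trapezoid(xs, ys):
    """Trapezoidal-rule integral of ys over the grid xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (eV): <dU/dLambda> = (dG0 + lam) - 2*lam*Lambda.
true_dg0, true_lam = -0.3, 0.9
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
mean_derivs = [(true_dg0 + true_lam) - 2.0 * true_lam * x for x in grid]

dg0 = trapezoid(grid, mean_derivs)               # eq. 42 with trapezoidal weights
slope, intercept = linear_fit(grid, mean_derivs)
lam_from_slope = -slope / 2.0                    # eq. 43, slope route
lam_from_intercept = intercept - dg0             # eq. 43, intercept route
```

With real simulation data the two routes to the reorganization energy generally differ slightly, and the size of that difference is a useful check on the linear-response assumption.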


5. Molecular Dynamics

Molecular dynamics (MD) describes the movement of atomic nuclei and electrons in a medium (condensed-phase or gas-phase). Although it is most precise to solve the full time-dependent Schrödinger equation for all electronic and nuclear degrees of freedom, in reality, this is computationally challenging and expensive. MD simulations invoke the Born-Oppenheimer approximation, where the nuclear and electronic degrees of freedom are separable and the nuclear movement is confined to the Born-Oppenheimer ground-state energy surface. The intermolecular forces can be updated (quantum mechanically) on the fly at every time step, as in ab initio MD, or derived from a fixed set of ‘force field’ parameters relating to the potential energy function, as in classical MD. Both approaches require the application of classical Newtonian mechanics (Newton’s second law) to update the atomic coordinates of the molecules at every time step. Classical MD simulations provide an inexpensive way to investigate the dynamics of macromolecules (such as proteins) in biological environments. Ranging from simulations of conformation state transitions for large membrane proteins like G-protein-coupled receptors to protein folding processes, classical MD has allowed us to uncover key biochemical mechanisms on a much longer timescale than what is possible using ab initio MD. For example, the Anton supercomputer has simulated protein folding of 12 structurally diverse proteins, with each simulation lasting between 100 µs and 1 ms.95

As the computational power of supercomputers continues to grow, along with the advent of quantum computers, MD simulations will continue to be important for simulating the intricate dynamics of biological processes which are otherwise difficult to capture using experimental approaches.

5.1 MD Force Fields

Classical MD simulations solve Newton’s equation of motion for each atom i in the system,

$$\mathbf{F}_i = m_i\mathbf{a}_i = m_i\frac{d^{2}\mathbf{r}_i}{dt^{2}} = -\frac{\partial U(\mathbf{R})}{\partial\mathbf{r}_i} \qquad (44)$$

where $\mathbf{F}_i$ is the net force acting on atom i with mass $m_i$, acceleration $\mathbf{a}_i$, and position $\mathbf{r}_i$. $U(\mathbf{R})$ is the potential energy function, which represents the potential energy of the entire system and is dependent on the system geometry represented by the displacement vector $\mathbf{R}$. The force field, which refers to the potential energy function $U(\mathbf{R})$ and its associated parameters, is typically expressed as

$$\begin{aligned} U_{total}(\mathbf{R}) &= U_{bonded} + U_{nonbonded} = U_{bond} + U_{angle} + U_{dihedral} + U_{Coulomb} + U_{LJ} + U_{pol} \\ &= \sum_{bonds}\frac{1}{2}k_{bond}(r - r_{eq})^{2} + \sum_{angles}\frac{1}{2}k_{\theta}(\theta - \theta_{eq})^{2} + \sum_{dihedrals}k_{dihe}\cos(n\phi + \delta) \\ &\quad + \sum_i\sum_{j>i}\frac{q_i q_j}{\epsilon\,r_{ij}} + \sum_i\sum_{j>i}\left(\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}\right) - \frac{1}{2}\sum_i\boldsymbol{\mu}_i\cdot\mathbf{E}_i \end{aligned} \qquad (45)$$

$U_{total}(\mathbf{R})$ is divided into bonded interaction terms ($U_{bond}$, $U_{angle}$, and $U_{dihedral}$) and the non-bonded interaction terms ($U_{Coulomb}$, $U_{LJ}$, and $U_{pol}$).96 $U_{bond}$ represents the harmonic interaction of two covalently-bonded atoms at a bond length $r$ with equilibrium bond length $r_{eq}$. $k_{bond}$ is the force constant for the particular bond. The second term $U_{angle}$ represents the harmonic potential describing angle bending among three atoms from its equilibrium angle $\theta_{eq}$, where $k_{\theta}$ is the angle bending force constant (units in kcal mol$^{-1}$ rad$^{-2}$ for AMBER). The dihedral term $U_{dihedral}$ describes the torsional rotation among four adjacent atoms where $k_{dihe}$ is the dihedral force constant (kcal mol$^{-1}$), $\phi$ is the dihedral angle, $n$ is the periodicity, and $\delta$ is the phase shift for each unique dihedral type.

$U_{Coulomb}$ represents the electrostatic interaction between atom i and atom j (associated with partial charges $q_i$ and $q_j$ respectively) in a medium of dielectric constant $\epsilon$ and separated by a distance $r_{ij}$. $U_{LJ}$ represents the short-range repulsive and long-range dispersion forces that are described by the pairwise Lennard-Jones (LJ) 12-6 potential. The $A_{ij}$ and $B_{ij}$ parameters are related to the potential well depth and the internuclear distance at zero LJ potential (for interested readers, see ref. 97 which contains widely-used $A_{ij}$ and $B_{ij}$ values for MD simulations).
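The individual terms of a fixed-charge force field can be written as small functions, as in the sketch below. It assumes kcal/mol, Å, and elementary-charge units (with a Coulomb conversion factor of about 332.06 kcal Å mol$^{-1}$ e$^{-2}$) and writes the dihedral term as a bare cosine; production force fields such as AMBER add a constant offset to that term. All function names and parameter values are illustrative and are not taken from any force-field file.

```python
import math

COULOMB_KCAL = 332.06  # approximate conversion factor, kcal*Angstrom/(mol*e^2)


def bond_energy(k_bond, r, r_eq):
    return 0.5 * k_bond * (r - r_eq) ** 2


def angle_energy(k_theta, theta, theta_eq):
    return 0.5 * k_theta * (theta - theta_eq) ** 2


def dihedral_energy(k_dihe, n, phi, delta):
    return k_dihe * math.cos(n * phi + delta)


def coulomb_energy(q_i, q_j, r_ij, dielectric=1.0):
    return COULOMB_KCAL * q_i * q_j / (dielectric * r_ij)


def lj_coefficients(well_depth, sigma):
    """Convert a well depth and zero-crossing distance to 12-6 coefficients."""
    return 4.0 * well_depth * sigma ** 12, 4.0 * well_depth * sigma ** 6


def lj_energy(a_ij, b_ij, r_ij):
    return a_ij / r_ij ** 12 - b_ij / r_ij ** 6


# Hypothetical atom pair: well depth 0.2 kcal/mol, zero crossing at 3.0 Angstrom.
a_ij, b_ij = lj_coefficients(0.2, 3.0)
r_min = 2.0 ** (1.0 / 6.0) * 3.0
e_at_sigma = lj_energy(a_ij, b_ij, 3.0)
e_at_minimum = lj_energy(a_ij, b_ij, r_min)
```

The two checks at the end recover the defining properties of the 12-6 potential: zero energy at the zero-crossing distance and an energy equal to minus the well depth at the minimum.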


Figure 3: Bonded (solid lines) and non-bonded interactions (dashed line) represented by atoms 1-5. The associated bonded and non-bonded parameters r, θ, ϕ, and rij, are explained above.

When polarizable force fields are used, $U(\mathbf{R})$ includes a final term $U_{pol}$ that represents the electronic polarization induced by the external electric field $\mathbf{E}$. $\boldsymbol{\mu}_i$ is the induced dipole moment for atom i. The electronic degrees of freedom are often modeled using the classical Drude oscillator model where each polarizable atom is connected to a charged particle via a harmonic spring.98

5.1.1 Seminario Method for Calculating Force Constants

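The Seminario method is used to derive accurate bond, angle, and dihedral force constants from Cartesian second derivatives (Hessians).99 The reaction force $\delta\mathbf{F}$ from a displacement $\delta\mathbf{x}$ of the N atoms in the system can be expanded to second order as follows,

$$[\delta\mathbf{F}] = -[k][\delta\mathbf{x}] \qquad (46)$$

or in matrix representation,

$$\begin{bmatrix} \delta F_1 \\ \delta F_2 \\ \vdots \\ \delta F_{3N} \end{bmatrix} = -\begin{bmatrix} \dfrac{\partial^{2}E}{\partial x_1^{2}} & \dfrac{\partial^{2}E}{\partial x_1\partial x_2} & \cdots & \dfrac{\partial^{2}E}{\partial x_1\partial x_{3N}} \\ \dfrac{\partial^{2}E}{\partial x_2\partial x_1} & \dfrac{\partial^{2}E}{\partial x_2^{2}} & \cdots & \dfrac{\partial^{2}E}{\partial x_2\partial x_{3N}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^{2}E}{\partial x_{3N}\partial x_1} & \dfrac{\partial^{2}E}{\partial x_{3N}\partial x_2} & \cdots & \dfrac{\partial^{2}E}{\partial x_{3N}^{2}} \end{bmatrix}\begin{bmatrix} \delta x_1 \\ \delta x_2 \\ \vdots \\ \delta x_{3N} \end{bmatrix} \qquad (47)$$

For two atoms A and B connected by a bond, the Hessian matrix is represented as

$$[k_{AB}] = \begin{bmatrix} \dfrac{\partial^{2}E}{\partial x_A\partial x_B} & \dfrac{\partial^{2}E}{\partial x_A\partial y_B} & \dfrac{\partial^{2}E}{\partial x_A\partial z_B} \\ \dfrac{\partial^{2}E}{\partial y_A\partial x_B} & \dfrac{\partial^{2}E}{\partial y_A\partial y_B} & \dfrac{\partial^{2}E}{\partial y_A\partial z_B} \\ \dfrac{\partial^{2}E}{\partial z_A\partial x_B} & \dfrac{\partial^{2}E}{\partial z_A\partial y_B} & \dfrac{\partial^{2}E}{\partial z_A\partial z_B} \end{bmatrix} \qquad (48)$$

From diagonalizing $[k_{AB}]$ to obtain the three eigenvalues $\lambda_i^{AB}$ (i = 1, 2, 3) and eigenvectors $\hat{v}_i^{AB}$, the bond force constant $k_{AB}$ is represented as the projection of the eigenvectors $\hat{v}_i^{AB}$ of $[k_{AB}]$ onto the bond vector $\hat{u}^{AB}$ to give99,100

$$k_{AB} = \sum_{i=1}^{3}\lambda_i^{AB}\left|\hat{u}^{AB}\cdot\hat{v}_i^{AB}\right| \qquad (49)$$

The angle force constant $k_{\theta}$ of an atomic A-B-C system is related to the force constants $k_{PA}$ and $k_{PC}$ via the equation for two springs in series99,100

$$\frac{1}{k_{\theta}} = \frac{1}{R_{AB}^{2}\,k_{PA}} + \frac{1}{R_{CB}^{2}\,k_{PC}} \qquad (50)$$

where $R_{AB}$ and $R_{CB}$ are the AB and CB bond lengths respectively, $k_{PA} = \sum_{i=1}^{3}\lambda_i^{AB}|\hat{u}^{PA}\cdot\hat{v}_i^{AB}|$, $k_{PC} = \sum_{i=1}^{3}\lambda_i^{CB}|\hat{u}^{PC}\cdot\hat{v}_i^{CB}|$, while $\hat{u}^{PA}$ and $\hat{u}^{PC}$ are two unit vectors perpendicular to the bonds AB and CB respectively.

For an A-B-C-D system with unit vectors $\hat{u}^{N_{ABC}}$ and $\hat{u}^{N_{BCD}}$ perpendicular to plane ABC and BCD respectively, the dihedral force constant $k_{\phi}$ is59

$$\frac{1}{k_{\phi}} = \frac{1}{R_{BA}^{2}\left|\hat{u}^{AB}\times\hat{u}^{BC}\right|^{2}k_{N_{ABC}}} + \frac{1}{R_{CD}^{2}\left|\hat{u}^{BC}\times\hat{u}^{CD}\right|^{2}k_{N_{BCD}}} \qquad (51)$$

where $k_{N_{ABC}} = \sum_{i=1}^{3}\lambda_i^{AB}|\hat{u}^{N_{ABC}}\cdot\hat{v}_i^{AB}|$ and $k_{N_{BCD}} = \sum_{i=1}^{3}\lambda_i^{DC}|\hat{u}^{N_{BCD}}\cdot\hat{v}_i^{DC}|$. For parameterizing a metal cofactor, the dihedral force constants are often omitted since metal-ligand torsion barriers are below $k_B T$.101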

5.1.2 Atomic Charges for Coulomb Interactions

Atomic charges are assigned based on the type of force field. For example, the AMBER force field (as well as the generalized AMBER force field (GAFF)) traditionally uses restrained electrostatic potential (RESP) charges, where the restraints (in the form of a penalty function, which can be harmonic or hyperbolic) reduce the conformational dependence of functional groups and ensure that the set of charges is easily transferable between common functional groups. Charges are typically derived from gas-phase HF/6-31G* QM calculations on the optimized geometry of the system in order to approximate solution-phase electrostatic interactions. In the case of metal sites that have high levels of multireference character, B3LYP/6-31G* is used instead. The QM electrostatic potential is first calculated at gridpoints located on several layers around the molecule, where points inside the van der Waals volume are neglected.102 For the Merz-Kollman (MK) scheme,102 the point charges are then fitted and located at the atom centers to best reproduce this electrostatic potential. Other force fields, such as CHARMM, use refined charges (from Mulliken charges as the initial guess) that best reproduce the interaction energy between water and the molecule of interest. Similar to the variational method, several iterations of partial charges are tested starting from the initial guess until the molecular mechanics (MM) interaction energy matches that obtained from QM.

5.2 Integration Algorithms for System Evolution

As there is no general analytical solution to the equations of motion, integration algorithms have been developed to integrate the equations of motion numerically for MD simulations. In all of these algorithms, the positions, velocities, and accelerations are approximated by Taylor expansions and thus the evolution of the system across time is discretized. The Verlet algorithm, for example, expresses a new position $\mathbf{r}(t + \Delta t)$ at time $t + \Delta t$ as a function of the positions $\mathbf{r}(t)$ and $\mathbf{r}(t - \Delta t)$, and the acceleration $\mathbf{a}(t)$,

$$\mathbf{r}(t + \Delta t) = 2\mathbf{r}(t) - \mathbf{r}(t - \Delta t) + \mathbf{a}(t)\Delta t^{2} \qquad (52)$$

The velocities can be derived as follows,

$$\mathbf{v}(t) = \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t - \Delta t)}{2\Delta t} \qquad (53)$$

Another algorithm is the Leap-frog algorithm, which first derives the velocity $\mathbf{v}\left(t + \frac{\Delta t}{2}\right)$ at time $t + \frac{\Delta t}{2}$ from $\mathbf{v}\left(t - \frac{\Delta t}{2}\right)$ and $\mathbf{a}(t)$,

$$\mathbf{v}\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}\left(t - \frac{\Delta t}{2}\right) + \mathbf{a}(t)\Delta t \qquad (54)$$

The velocity $\mathbf{v}\left(t + \frac{\Delta t}{2}\right)$ is then used to update the position $\mathbf{r}(t + \Delta t)$,

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}\left(t + \frac{\Delta t}{2}\right)\Delta t \qquad (55)$$

Hence, the velocities and positions leap over each other, since the calculations of the velocities and positions alternate every $\frac{\Delta t}{2}$.

The third algorithm, the Velocity Verlet algorithm, overcomes the non-self-starting issue of the Verlet algorithm and the velocity approximation of the Leap-frog algorithm. As the positions, velocities, and accelerations are synchronized and updated at the same time step, the algorithm is more precise than the former two. $\mathbf{r}(t + \Delta t)$ and $\mathbf{v}(t + \Delta t)$ are expressed as follows,

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t + \frac{1}{2}\mathbf{a}(t)\Delta t^{2} \qquad (56)$$

$$\mathbf{v}(t + \Delta t) = \mathbf{v}(t) + \frac{1}{2}\left[\mathbf{a}(t + \Delta t) + \mathbf{a}(t)\right]\Delta t \qquad (57)$$

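As an illustration of eqs. 56 and 57, the following minimal sketch propagates a single particle in one dimension. The test system (a harmonic oscillator with unit mass and unit force constant), the time step, and all function names are assumptions chosen for illustration only; they are not taken from any MD package.

```python
import math


def velocity_verlet(x, v, accel, dt, n_steps):
    """Propagate a 1D particle with eqs. 56 and 57; returns lists of positions and velocities."""
    xs, vs = [x], [v]
    a = accel(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt ** 2   # eq. 56
        a_new = accel(x)                      # a(t + dt), evaluated at the new position
        v = v + 0.5 * (a_new + a) * dt        # eq. 57
        a = a_new
        xs.append(x)
        vs.append(v)
    return xs, vs


def harmonic_accel(x):
    # Hypothetical test system: unit mass on a spring with unit force constant, a = -x
    return -x


def total_energy(x, v):
    # Kinetic plus potential energy of the hypothetical oscillator
    return 0.5 * v ** 2 + 0.5 * x ** 2


xs, vs = velocity_verlet(x=1.0, v=0.0, accel=harmonic_accel, dt=0.01, n_steps=1000)
energies = [total_energy(x, v) for x, v in zip(xs, vs)]
max_drift = max(abs(e - energies[0]) for e in energies)
print(f"x(t = 10) = {xs[-1]:.4f} (analytical: {math.cos(10.0):.4f})")
print(f"maximum energy deviation = {max_drift:.2e}")
```

Because the acceleration is re-evaluated only once per step and positions and velocities are available at the same time, the total energy of this test system stays close to its initial value over the whole run.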

5.3 Solvent Representation

The three-point transferable intermolecular potential (TIP3P)103 is one of the most widely used non-polarizable explicit water models for atomistic MD simulations. In TIP3P,104 the water molecule is modeled as a rigid monomer consisting of three interaction sites; this originates from the TIPS3 site model, which has positive charges on the two hydrogens and a negative charge on the oxygen atom. The dimerization energy for two water molecules a and b can be represented simply by the sum of the Coulombic potential energy between the various charges and the L-J potential between the two oxygens,

$$E_{ab} = \sum_{i \in a} \sum_{j \in b} \frac{q_i q_j}{r_{ij}} + \frac{A}{r_{OO}^{12}} - \frac{C}{r_{OO}^{6}} \tag{58}$$

where the atomic charges $q_i$ and $q_j$, the L-J parameters $A$ and $C$, and other structural parameters such as the length of the O-H bond are fitted to reproduce reasonable water thermodynamic properties such as heat capacity and enthalpy of vaporization.103,104
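The dimerization energy of eq. 58 can be evaluated directly for a pair of rigid three-site molecules, as in the sketch below. The numerical parameters are the commonly quoted TIP3P values and the Coulomb conversion factor; they are assumptions of this sketch and are not listed in the text above. The dimer geometry (one O-H bond of the donor pointing at the acceptor oxygen) and the function names are likewise hypothetical.

```python
import math

# Assumed TIP3P parameters (commonly quoted values); they are not listed in the text above.
Q_O, Q_H = -0.834, 0.417           # partial charges in units of e
A_LJ, C_LJ = 582.0e3, 595.0        # kcal Å^12/mol and kcal Å^6/mol
R_OH = 0.9572                      # O-H bond length in Å
THETA = math.radians(104.52)       # H-O-H angle
COULOMB = 332.0637                 # converts e^2/Å to kcal/mol


def donor(origin_x=0.0):
    """Water with one O-H bond pointing along +x; each site is (charge, x, y, z)."""
    return [(Q_O, origin_x, 0.0, 0.0),
            (Q_H, origin_x + R_OH, 0.0, 0.0),
            (Q_H, origin_x + R_OH * math.cos(THETA), -R_OH * math.sin(THETA), 0.0)]


def acceptor(origin_x):
    """Water whose H-O-H bisector points along +x, away from a donor at smaller x."""
    half = 0.5 * THETA
    return [(Q_O, origin_x, 0.0, 0.0),
            (Q_H, origin_x + R_OH * math.cos(half), R_OH * math.sin(half), 0.0),
            (Q_H, origin_x + R_OH * math.cos(half), -R_OH * math.sin(half), 0.0)]


def dimer_energy(mol_a, mol_b):
    """Eq. 58 in kcal/mol; the first site of each molecule is the oxygen."""
    coulomb = sum(qa * qb / math.dist(ra, rb)
                  for qa, *ra in mol_a for qb, *rb in mol_b)
    r_oo = math.dist(mol_a[0][1:], mol_b[0][1:])
    return COULOMB * coulomb + A_LJ / r_oo ** 12 - C_LJ / r_oo ** 6


e_hbond = dimer_energy(donor(), acceptor(2.75))   # O-O distance of 2.75 Å
e_far = dimer_energy(donor(), acceptor(200.0))    # well-separated molecules
print(f"hydrogen-bonded dimer: {e_hbond:.2f} kcal/mol; separated dimer: {e_far:.2e} kcal/mol")
```

Only nine Coulomb terms and one L-J term are needed per pair of molecules, which is the reason three-site models are inexpensive for large solvated systems.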

Figure 4: (A) 3-site water model, (B) 4-site model, (C) 5-site model, and (D) 6-site model. Color code: pink (O), green (H), blue (lone pair), orange (dummy atom).

4-site rigid models such as TIP4P, in which the negative charge is located on a dummy atom (see Figure 4) on the bisector of the HOH angle, yield isothermal compressibilities in the experimental range, whereas the 3-point models in ref. 63 underestimate them. 5-site rigid tetrahedral models like TIP5P were developed in order to accurately model the density behavior of water as a function of temperature and pressure. For TIP5P,105 the negatively charged interaction sites are located symmetrically along the lone pairs of oxygen, while each hydrogen atom has a charge of +0.241 e. 6-site models include all interaction sites of the 4-site and 5-site models.

Other water models include flexible and polarizable models. For flexible models like TIP4PF, the intramolecular flexibility (bond stretching and angle bending) is modeled using classical harmonic energy terms akin to the bond-stretching and angle-bending terms presented earlier. The corresponding force constants were empirically determined based on experimental properties of liquid water at 25 °C and 1 atm.106 Polarizable models like TIP4P-FQ include an explicit polarization term $\Delta E$, which in the TIP4P-FQ model is expressed as the difference between the internal energy $E$ evaluated with the liquid-state charges and with the gas-phase charges,107

$$\Delta E = \langle E(Q_i^{\mathrm{liq}}) \rangle - E(Q_i^{\mathrm{gas}}) \tag{59}$$

where each interaction site i of the water molecule corresponds to a charge of $Q_i$. TIP4P-FQ was found to reproduce the static dielectric constant εs, the optical dielectric constant εo, and the Debye relaxation time close to the experimental values.107

Since there is a trade-off between model accuracy and the size of the system that can be computed realistically (the computational time scales linearly with the number of interatomic distances between each pair of water molecules), TIP3P is often preferred for long biological simulations. Another attractive feature of TIP3P is its compatibility with nonpolarizable force fields including AMBER, CHARMM, and OPLS-AA.

5.4 Periodic Boundary Conditions

Periodic boundary conditions are utilized in MD simulations to model a large system using a unit cell and to eliminate the boundary effects that arise when simulating finite-sized systems. For example, if water molecules are confined within a finite box, then the pressure exerted by water molecules near the boundaries of the box on the inner water molecules causes a pressure difference (Laplace pressure) and results in ‘water droplets’. With periodic boundary conditions, such problems are eliminated, as water molecules are able to leave the simulation box and re-enter it on the side opposite to where they exited. The dynamics of the system in each unit cell is synchronized, and thus all unit cells are identical.

To address long-range Coulombic forces under periodic boundary conditions, the Particle Mesh Ewald (PME) method is often used,108-110 where the 3D Fast Fourier Transform (FFT) is used to evaluate the second term of the following sum of error functions,

$$\frac{1}{r} = \frac{\operatorname{erfc}(\beta r)}{r} + \frac{1 - \operatorname{erfc}(\beta r)}{r} \tag{60}$$

where $r$ is the distance and $\beta$ is the Ewald splitting parameter. The above sums are known as Ewald sums, and the computation of the electrostatic interactions scales on the order of $N \ln N$ with the system size, where $N$ is the number of point charges. The first term (direct sum) decays quickly with $r$ and thus accounts for the interactions at small $r$ values. The second term (reciprocal sum) is the solution to Poisson's equation in reciprocal space and is evaluated at the FFT grid points.110 PME is therefore more attractive than truncation methods that neglect interatomic electrostatic forces beyond a cutoff distance, since those methods scale on the order of $N^2$.
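The splitting in eq. 60 can be checked numerically with the standard library error functions. In the sketch below, the value of the splitting parameter and the function name are hypothetical and serve only to show how quickly the direct-space term decays.

```python
import math


def ewald_split(r, beta):
    """Return the short-range and long-range parts of 1/r from eq. 60."""
    short_range = math.erfc(beta * r) / r   # direct-space term, decays quickly
    long_range = math.erf(beta * r) / r     # 1 - erfc = erf, smooth reciprocal-space term
    return short_range, long_range


beta = 0.35  # hypothetical splitting parameter in 1/Å
for r in (1.0, 5.0, 10.0):
    short_range, long_range = ewald_split(r, beta)
    print(f"r = {r:5.1f} Å  short = {short_range:.3e}  long = {long_range:.3e}  "
          f"sum = {short_range + long_range:.6f}  1/r = {1.0 / r:.6f}")
```

For this choice of the splitting parameter, the direct-space term is already negligible at 10 Å, so it can be truncated at a real-space cutoff while the smooth remainder is handled on the FFT grid.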

5.5 Temperature/Pressure Coupling Algorithms

Depending on the type of ensemble (NPT, NVT, etc.) used for MD simulations, thermostats and barostats are utilized to control the simulation temperature and pressure, respectively. Since the temperature $T$ is related to the kinetic energy via the equipartition theorem,

$$\frac{N_f k_B T}{2} = \sum_i \frac{|\mathbf{p}_i|^2}{2 m_i} \tag{61}$$

where $N_f$ is the number of degrees of freedom, $k_B$ is the Boltzmann constant, and $\mathbf{p}_i$ and $m_i$ are the momentum and mass of atom i, respectively, a constant temperature implies a constant kinetic energy. Two common thermostats used to maintain the temperature (and kinetic energy) are the Berendsen thermostat and the Langevin thermostat. For the Berendsen thermostat, the system is coupled to an external heat bath with temperature Tbath. When the temperature at time t, T(t), deviates from the fixed temperature Tbath, the velocities of the particles are scaled at each time step such that

$$\frac{dT(t)}{dt} = \frac{T_{\mathrm{bath}} - T(t)}{\tau} \tag{62}$$

where $\tau$ is the system-bath coupling parameter, which determines the rate of velocity scaling as the system temperature decays exponentially towards Tbath. In the regime where $\tau \to \infty$, the velocity scaling becomes negligible (no heat bath) and the temperature fluctuations correspond to those of the microcanonical ensemble (NVE).
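A minimal sketch of the velocity rescaling implied by eq. 62 is given below. The scaling factor used here, $\sqrt{1 + (\Delta t/\tau)(T_{\mathrm{bath}}/T - 1)}$, is the commonly used discrete form of the Berendsen scheme and is an assumption of this sketch, since it is not written out in the text; reduced units ($k_B = 1$), one-dimensional particles, and all names are hypothetical.

```python
import math


def kinetic_temperature(velocities, masses, k_b=1.0):
    """Eq. 61 for 1D particles without constraints (one degree of freedom each)."""
    n_f = len(velocities)
    twice_kinetic = sum(m * v * v for m, v in zip(masses, velocities))
    return twice_kinetic / (n_f * k_b)


def berendsen_scale(velocities, masses, t_bath, dt, tau):
    """Rescale all velocities so that T changes by (dt/tau) * (t_bath - T), cf. eq. 62."""
    t_now = kinetic_temperature(velocities, masses)
    scale = math.sqrt(1.0 + (dt / tau) * (t_bath / t_now - 1.0))
    return [scale * v for v in velocities]


masses = [1.0, 1.0, 1.0, 1.0]
velocities = [1.0, -2.0, 0.5, 1.5]      # hypothetical initial velocities, T = 1.875
t_start = kinetic_temperature(velocities, masses)

one_step = berendsen_scale(velocities, masses, t_bath=1.0, dt=0.1, tau=1.0)
t_one_step = kinetic_temperature(one_step, masses)

relaxed = velocities
for _ in range(200):
    relaxed = berendsen_scale(relaxed, masses, t_bath=1.0, dt=0.1, tau=1.0)
t_relaxed = kinetic_temperature(relaxed, masses)
print(f"T: {t_start:.4f} -> {t_one_step:.4f} after one step -> {t_relaxed:.6f} after 200 steps")
```

In this sketch no forces act between rescaling steps, so the temperature follows the exponential relaxation of eq. 62 exactly; in an MD run the rescaling would be interleaved with the integration step.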

The Langevin thermostat maintains the temperature by modifying the Newtonian equations of motion,

$$\frac{d\mathbf{r}_i}{dt} = \frac{\mathbf{p}_i}{m_i} \tag{63}$$

$$\frac{d\mathbf{p}_i}{dt} = \mathbf{F}_i - \gamma \mathbf{p}_i + \mathbf{f}_i \tag{64}$$

where $\mathbf{p}_i$ is the momentum of particle i, $m_i$ is its mass, $\mathbf{F}_i$ is the force (the negative gradient of the potential energy), $\gamma$ is the friction coefficient that damps the momenta as the larger particle i moves through a sea of smaller particles, and $\mathbf{f}_i$ is the random force caused by thermal stochastic ‘kicks’ from the smaller particles. The Langevin thermostat is therefore computationally inexpensive, as the high-frequency motions of the small particles are replaced by stochastic terms.

In order to produce MD trajectories in the NPT ensemble, the Langevin piston method111 is implemented in MD codes like NAMD. The Langevin piston method allows intermediate piston damping via a Langevin description of the piston degree of freedom coupled to a heat bath, and allows one to vary the dynamics of this degree of freedom by choosing an appropriate value for the collision frequency.111

6. Kinetic Models and Master Equations for Charge Transfer Dynamics

Kinetic models and master equations describing the dynamics of these models have been used for decades to examine the timescale on which a stochastic event occurs. Knowing the timescale can help us understand whether the event is feasible on the biological timescale. For example, if the timescale for charge hopping from site 1 to site 3 is on the order of minutes, the CT event is unlikely to occur in biology, because key biological CT processes (such as those in photosynthesis) occur within milliseconds or less.112

The timescale can be defined by the mean first passage time (MFPT)113-116 $\langle\tau\rangle$, which is the average time to arrive at a final state from an initial state. It is represented as the first moment of the probability distribution,

$$\langle\tau\rangle = \int_0^\infty t\, f(t)\, dt \tag{65}$$

where $f(t)$ is the probability density defined as

$$f(t) = -\frac{d}{dt} \sum_{n=1}^{N} P_n(t) \tag{66}$$

for a linear kinetic network of N+1 sites in which the terminal site is a trap/sink (see Figure 5; only the forward rate constant from N to N+1 is considered, while forward and backward rate constants are considered between all other nearest-neighbor sites). $P_n(t)$ is the probability that the population is found on site n.114

[Figure 5 scheme: sites 1, 2, …, N and the trap site N+1 in a linear chain, with forward rate constants k1→2, …, kN→N+1 and backward rate constants such as k2→1.]

Figure 5: Kinetic model for a hopping network consisting of N+1 sites, where the charge starts at site 1 and eventually arrives at the trap site N+1 shown in orange. The arrows represent the forward and backward tunneling rate constants.

Substituting the expression for $f(t)$ into $\langle\tau\rangle$ followed by integration by parts, one obtains $\langle\tau\rangle$ as a sum of the residence times $\tau_n$ of each state (excluding the trap state due to the irreversibility condition),

$$\langle\tau\rangle = \sum_{n=1}^{N} \int_0^\infty P_n(t)\, dt = \sum_{n=1}^{N} \tau_n \tag{67}$$

The master equation for population evolution can be represented using linear, first-order differential equations of the form

$$\frac{d}{dt} P_n(t) = \sum_{l \neq n} \left[ k_{l \to n} P_l(t) - k_{n \to l} P_n(t) \right] \tag{68}$$

The set of differential equations can then be rewritten in matrix form as

$$\frac{d}{dt} \mathbf{P}(t) = \mathbf{K}\mathbf{P}(t) \tag{69}$$

where $\mathbf{P}(t)$ is the probability (column) matrix and $\mathbf{K}$ is the rate constant matrix. For a hopping network containing a trap (implying the existence of an irreversible step) in the form of Figure 5,114 $\langle\tau\rangle_1$ (the subscript refers to the initial state being state 1) is expressed as

$$\langle\tau\rangle_1 = \sum_{n=1}^{N} \left[ -\mathbf{K}^{-1} \mathbf{P}(0) \right]_n = -\sum_{n=1}^{N} (\mathbf{K}^{-1})_{n1} \tag{70}$$

where $\mathbf{K}^{-1}$ is the inverse of the rate constant matrix restricted to the N non-trap sites (it contains no back-transfer rate constant $k_{N+1 \to N}$ from the trap), and $\mathbf{P}(0)$ is the corresponding initial probability matrix of dimension N, $\mathbf{P}(0) = [1\ 0\ \dots\ 0]^{T}$ (assuming that site 1 is initially populated). The summation is over the N elements of the matrix product $-\mathbf{K}^{-1}\mathbf{P}(0)$. $(\mathbf{K}^{-1})_{nl}$ refers to the element in the nth row and lth column.116 The inverse of $\langle\tau\rangle_i$ gives the effective hopping rate constant from site i to the trap.
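The matrix expression for the MFPT can be evaluated with a few lines of linear algebra. The sketch below builds the rate constant matrix over the N non-trap sites of a linear chain, with the irreversible step entering only as a loss term on the last diagonal element (this construction, the small Gaussian-elimination solver, the rate constants, and all names are assumptions made for illustration), and then sums the elements of $-\mathbf{K}^{-1}\mathbf{P}(0)$.

```python
def rate_matrix(k_fwd, k_bwd):
    """K over the N non-trap sites: K[n][l] is the rate constant l -> n, K[n][n] minus the total loss from n.

    k_fwd[i] is the rate constant from site i to site i + 1 (the last entry leads into the trap);
    k_bwd[i] is the rate constant from site i + 1 back to site i.
    """
    n = len(k_fwd)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        k[i][i] -= k_fwd[i]
        if i + 1 < n:
            k[i + 1][i] += k_fwd[i]
            k[i][i + 1] += k_bwd[i]
            k[i + 1][i + 1] -= k_bwd[i]
    return k


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mfpt(k_fwd, k_bwd, start=0):
    """Mean first passage time to the trap, i.e. the sum over the elements of -K^-1 P(0)."""
    p0 = [0.0] * len(k_fwd)
    p0[start] = 1.0
    return -sum(solve_linear(rate_matrix(k_fwd, k_bwd), p0))


# Hypothetical rate constants (arbitrary units) for two non-trap sites and a trap
tau = mfpt(k_fwd=[2.0, 4.0], k_bwd=[1.0])
print(f"MFPT = {tau:.4f}, effective rate constant = {1.0 / tau:.4f}")
```

For this two-site example the result can be compared with the closed form $1/k_{1\to2} + 1/k_{2\to3} + k_{2\to1}/(k_{1\to2}k_{2\to3})$, which is a convenient check on the sign conventions of the matrix.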

When a trap does not exist in the network, the existence of a stationary solution implies the non-existence of $\mathbf{K}^{-1}$, i.e., $\mathbf{K}$ is not invertible. Thus, the above expression for $\langle\tau\rangle$ should not be used. Instead, one can find the eigenvector that spans the nullspace of $\mathbf{K}$ at steady state (i.e., $\mathbf{K}\mathbf{P}(t) = 0$) and relate the components of this eigenvector to the effective rate constant, whose inverse is $\langle\tau\rangle$ (for details, see section A.2 in the Appendix of ref. 117).

7. Charge Transfer in the Human Primosome

Portions adapted from ‘Teo, R. D.; Rousseau, B. J. G.; Smithwick, E. R.; Di Felice, R.;

Beratan, D. N.; Migliore, A., Charge transfer between [4Fe4S] proteins and DNA is unidirectional. Implications for biomolecular signaling. Chem 2019, 5, 122-137’. All authors discussed the results, commented on the manuscript, and contributed to its writing. In addition: R.D.T. carried out computations and analysis of results; B.J.G.R. contributed to computations; R.D.F. contributed to the pathway product analysis; A.M. conceived and designed the research, developed the kinetic model, contributed to computations, and headed the analysis of results and the manuscript writing.

Barton et al. proposed a mechanism for the truncation of RNA primer synthesis and the handover of the primer from p58c to p180core via DNA-mediated CT.26 The primosome, which contains both DNA primase and DNA polymerase α (Polα), is involved in the synthesis of the RNA primers necessary for initiating DNA replication (Figure 6). The process begins with an oxidized [4Fe4S]3+ cluster in the C-terminal domain of primase (p58c) that is electrostatically attracted and tightly bound to the negatively charged phosphate backbone of the duplex (Figure 7, Left). After primase elongates the primer, a signaling partner (Polα) comes into contact with the RNA/DNA duplex and reduces the [4Fe4S]3+ cluster in p58c via DNA-mediated CT (Figure 7, Center). As a result, the [4Fe4S]2+ cluster in the C-terminal domain of Polα (p180c) is oxidized to the 3+ state, while the [4Fe4S]3+ cluster in primase is reduced to the 2+ state (Figure 7, Right). These redox processes decrease the electrostatic attraction and binding affinity of p58c to the duplex, resulting in p58c dissociation. On the other hand, Polα, which now contains a [4Fe4S]3+ cluster, is more tightly bound to the duplex.

It was also suggested that the CT is mediated by three key tyrosine residues in p58c (Y309, Y345, and Y347),26 although a structurally different mutant p58c was used to delineate this pathway. In general, DNA-mediated CT mechanisms involving repair proteins and replicative proteins both require the high-potential [4Fe4S] cluster to function as an electron donor (in the +2 oxidation state) and then as an electron acceptor (in the +3 oxidation state); this is often depicted as a concerted process (Figure 7, Center).16,118,119

Figure 6: Human primosome-nucleic acid complex. a, Portion of the primosome crystal structure (PDB ID 5EXR120) highlighting the [4Fe4S] cluster docked to p180c (at zinc 1, ZBS1) relative to the p58c [4Fe4S] cluster and to the other zinc binding site (ZBS2) in p180c. Color code: zinc (red), S (yellow), Fe (magenta), p180core (cyan), p180c (orange), p58c (silver), p58N (pink), p180core-p180c linker (green), p58c-p58N linker (blue). b, Schematic view of the protein complex bound to RNA/DNA. p58c, with an oxidized [4Fe4S] cluster (yellow), is linked to the primase N-terminal domain (p58N). p58N connects primase to Polα. We show the catalytic core (p180core) linked to the C-terminal subunit p180c of Polα (see Appendix, Section S2 of ref. 27) and a [4Fe4S] cluster (orange) bound to p180c. The nucleic acid transiently associates primase and Polα. The distance between the p180c [4Fe4S] cluster and the duplex depends on the conformation of the p180core-p180c linker. The p58c [4Fe4S] cluster is at edge-to-edge distances of 50.9 Å and 38.6 Å from ZBS2 and from the [4Fe4S] cluster docked to p180c, respectively.

Figure 7: Proposed mechanism of primer handoff driven by DNA charge transport chemistry.26 Left: DNA primase elongates an RNA primer (green) to a length of 8 to 12 nt with both p48 and p58C contacting the nascent RNA/DNA duplex. Center: When the nascent primer is large enough, another [4Fe4S] (purple), which we hypothesize to be DNA Polα in vivo, participates in DNA-mediated signaling through the primer-template duplex. Right: This redox event promotes dissociation of p58C through reduction of the cluster from the [4Fe4S]3+ state to the [4Fe4S]2+ state; the next [4Fe4S] enzyme is then tightly bound and can continue elongation of the primer-template. Figure and caption are reproduced from ref. 26.

However, the proposed redox signaling mechanism26 is the subject of major debate.121-123 A key criticism of this mechanism arises from the fact that O’Brien et al.26 used a partially misfolded p58c in their electrochemical experiments.124 Nonetheless, there are numerous redox-active residues (Tyr, Trp, Cys, Met) in the primosome that support charge hopping (see Appendix, Sections S2 and S3 of ref. 27); this indicates a high possibility of a redox-based signaling mechanism. DNA-mediated redox communication between [4Fe4S] clusters offers an appealing and broadly relevant protein signaling paradigm. Using the theoretical framework outlined in section 7.1 and detailed in sections S4 and S5 of ref. 27, HT from a [4Fe4S]3+ cluster to a neutral RNA/DNA duplex and EET from a [4Fe4S]2+ cluster to the neutral duplex are shown to be energetically inaccessible, while HT from a hole-containing RNA/DNA duplex to the [4Fe4S]2+ cluster and EET from an excess-electron-containing duplex to the [4Fe4S]3+ cluster are feasible on a biological millisecond timescale. In addition, a comprehensive set of possible mechanisms for signaling between [4Fe4S]2+/3+ clusters is proposed.

7.1 Methodology

7.1.1 Docking and Modeling

Eukaryotic DNA polymerases have been shown to require a [4Fe4S] cluster,125 although there are still no crystal structures of DNA Polα containing iron-sulfur clusters to date. This is largely due to the lability of the [4Fe4S] cluster, coupled with the solvent exposure of the two zinc-binding sites (ZBS) in the C-terminal domain of p180 (p180c) (Figure 6),126 which has made the crystallization of a [4Fe4S]-containing Polα difficult.125 In order to gain insights into the CT mechanism between the two iron-sulfur clusters (one in p180c and one in p58c), a [4Fe4S] cluster was successfully docked at ZBS1 using the AutoDock Vina program.127

ZBS1 was chosen over ZBS2 because the cluster has been shown to bind p180c at ZBS1 (ref. 125) and because ZBS1 is located close to the catalytic core of human DNA polymerase α (p180core) and to the [4Fe4S] cluster of p58c (Figure 6). Based on the crystal structure of p180core and an RNA-primed DNA template (PDB 4QCL), ZBS1 is located near W1084 and tyrosine residues that directly interact with the RNA/DNA helix, priming these residues and the docked [4Fe4S] cluster as electron-hole conduits. Using Figure 6 as a starting point, theoretical and experimental values for redox potentials, reorganization energies (Table 1), and estimated electronic couplings (Tables 3 and 4) can be obtained.

Table 1: Reorganization energies (λX) associated with the redox group X in electron self-exchange at the indicated donor-acceptor center-to-center distances (RDA), experimental oxidation potentials (E0), vertical ionization energies (VIE), theoretical oxidation potentials derived from VIE using eq. S7 of ref. 27 (E0ox), vertical electron affinities (VEA), and theoretical reduction potentials derived from VEA using eq. S8 of ref. 27 (E0red). E0ox and E0red are redox potentials vs. normal hydrogen electrode (NHE). E0 refers to the standard hydrogen electrode (SHE) or to NHE, and the very small difference in redox potentials vs. NHE and vs. SHE is neglected. Potentials are in V units; energies are in eV.

| X | λX | RDA (Å) | E0 | VIE | E0ox | VEA | E0red |
|---|---|---|---|---|---|---|---|
| Tyr | 1.02 (ref. 128) | 12.80 | 0.93 (ref. 129) – 1.38 (refs. 130, 131) | | | -1.60 | -3.39 |
| Trp | 0.95 (ref. 128) | 12.80 | 1.02 (ref. 129) – 1.19 (ref. 132) | | | -1.38 | -3.21 |
| Met | 1.08 (ref. 128) | 12.80 | 1.66 (ref. 133) | | | -2.44 | -4.07 |
| G | 1.08 (refs. 134, 135) | 3.38 | 1.29 (refs. 136, 137) | 7.98 (ref. 138) | 1.38 | -1.25 (ref. 138) | -3.10 |
| C | | | ~1.6 (refs. 136, 139) | 8.69 (ref. 138) | 1.76 | -0.58 (ref. 138) | -2.56 |
| A | 0.82 (refs. 134, 135) | 3.38 | 1.42 (ref. 136) | 8.26 (ref. 138) | 1.53 | -0.76 (ref. 138) | -2.71 |
| T | | | ~1.7 (refs. 136, 139) | 9.01 (ref. 138) | 1.93 | -0.29 (ref. 138) | -2.32 |
| [4Fe4S]2+/3+ | 0.75 | 14.44 | 0.08 (refs. 16, 43) | | | | |

7.1.2 Redox Potentials

The theoretical oxidation potentials (Table 1) were obtained from the vertical ionization energies (VIEs) through the equation138

$$E^0_{\mathrm{ox}} = (-2.59 \pm 0.26) + (0.56 \pm 0.03) \times \mathrm{VIE} \tag{71}$$

where the VIE values for the nucleobases were computed in ref. 138 and the coefficient values within the error bars in eq. 71 were chosen to minimize deviations of the theoretical oxidation potentials ($E^0_{\mathrm{ox}}$) from the corresponding experimental values. The $E^0_{\mathrm{ox}}$ values are consistently higher than the respective experimental values by about 10 %. In our further analysis, we use the experimental estimates of the oxidation potentials. However, the differences between the experimental and theoretical oxidation potentials of the nucleobases are small compared to their differences from the oxidation potential of the [4Fe4S] cluster. The penalty for HT from the iron-sulfur cluster to the nucleobases is thus significant irrespective of the choice (theoretical or experimental) of oxidation potentials for the nucleobases.

The theoretical reduction potentials (Table 1) were obtained using vertical electron affinities (VEAs) inserted into the following equation138

$$E^0_{\mathrm{red}} = (-2.09 \pm 0.03) + (0.81 \pm 0.05) \times \mathrm{VEA} \tag{72}$$

The VEA values for Tyr, Trp, and Met in Table 1 were obtained from gas-phase DFT calculations (Table 2) using the PBE0 functional,140 the M06-2X141,142 and M06-HF141-143 meta-hybrid density functionals, and the cc-pVTZ basis set. After geometry optimization of a neutral amino acid, the VEA is obtained from the difference in energy between the neutral and the anionic amino acid at the nuclear coordinates of the former. For each amino acid, the VEA value reported in Table 1 is the mean of the PBE0, M06-2X, and M06-HF values in Table 2.

Table 2: Values of VEA and E0red of Tyr, Trp and Met, obtained from DFT gas-phase calculations with the indicated exchange-correlation functionals and the cc-pVTZ basis set. The VEA is the difference in energy between the neutral and anionic amino acid at the optimized nuclear coordinates of the neutral species.

| amino acid | method | VEA | E0red |
|---|---|---|---|
| Tyr | M06-HF | -1.52 | -3.32 |
| Tyr | M06-2X | -1.64 | -3.42 |
| Tyr | PBE0 | -1.64 | -3.42 |
| Trp | M06-HF | -1.36 | -3.19 |
| Trp | M06-2X | -1.41 | -3.23 |
| Trp | PBE0 | -1.38 | -3.21 |
| Met | M06-HF | -2.41 | -4.04 |
| Met | M06-2X | -2.47 | -4.09 |
| Met | PBE0 | -2.45 | -4.07 |
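The linear relations between the vertical energies and the redox potentials can be checked against the tabulated values with a short script. In the sketch below, the reduction potentials are computed with the central coefficient values of the VEA relation. For the oxidation potentials, the coefficients are set to the lower edge of the quoted error bars (-2.85 and 0.53); this choice is an inference from the numbers in Table 1 rather than something stated in the text, and the function names are hypothetical.

```python
def e_ox(vie, intercept=-2.59, slope=0.56):
    """Oxidation potential (V vs. NHE) from the VIE (eV); defaults are the central values."""
    return intercept + slope * vie


def e_red(vea, intercept=-2.09, slope=0.81):
    """Reduction potential (V vs. NHE) from the VEA (eV); defaults are the central values."""
    return intercept + slope * vea


# VEA and VIE values as listed in Table 1 (eV)
VEA = {"Tyr": -1.60, "Trp": -1.38, "Met": -2.44, "G": -1.25, "C": -0.58, "A": -0.76, "T": -0.29}
VIE = {"G": 7.98, "C": 8.69, "A": 8.26, "T": 9.01}

e_red_table = {x: round(e_red(v), 2) for x, v in VEA.items()}

# Assumption inferred from Table 1: coefficients at the lower edge of the error bars
e_ox_table = {x: round(e_ox(v, intercept=-2.85, slope=0.53), 2) for x, v in VIE.items()}

print("E0red:", e_red_table)
print("E0ox: ", e_ox_table)

# Mean VEA of Tyr over the three functionals in Table 2
tyr_mean_vea = round((-1.52 - 1.64 - 1.64) / 3, 2)
print("mean VEA (Tyr):", tyr_mean_vea)
```

With the central coefficients, the oxidation potentials would come out roughly 0.5 V higher than the Table 1 entries, which is why the lower-edge coefficients are used in this check.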

7.1.3 Reorganization Energies

In Table 1, $\lambda_{\mathrm{[4Fe4S]}}$ = 0.75 eV when the center-to-center distance $R_{DA}$ is 14.44 Å. When $R'_{DA}$ is 13.10 Å (the center-to-center distance between [4Fe4S] and DA10 in the primase-RNA/DNA complex), one can use the following equation to obtain $\lambda'_{\mathrm{[4Fe4S]}}$ based on the known $\lambda_{\mathrm{[4Fe4S]}}$ at the original distance $R_{DA}$,

$$\lambda'_X = \lambda_X + \left(\frac{1}{\varepsilon_o} - \frac{1}{\varepsilon_s}\right) \frac{R'_{DA} - R_{DA}}{R'_{DA} R_{DA}} (\Delta q)^2 \tag{73}$$

Using eq. 73, one obtains a $\lambda'_{\mathrm{[4Fe4S]}}$ value of 0.74 eV. The outer-sphere contribution of adenine, $\lambda_{\mathrm{A}}$, is equal to 0.605 eV and is estimated as half the reorganization energy of the A-A redox pair in solvated regular DNA,134 where the average inter-nucleobase distance is 3.38 Å. Using eq. 35 (with the water dielectric constants $\varepsilon_o$ = 1.8 and $\varepsilon_s$ = 80), the effective radius of adenine is 2.22 Å. Using the same approach, the effective radius of guanine is 2.10 Å. Rescaling the distance between the A-A nucleobases to that used for $\lambda'_{\mathrm{[4Fe4S]}}$ using eq. 35 (but with the protein dielectric constants $\varepsilon_o$ = 2.2 and $\varepsilon_s$ = 4.0), $\lambda_{\mathrm{A}}$ becomes 0.76 eV. Since for a cross reaction $\mathrm{Ox}_1 + \mathrm{Red}_2 \rightleftharpoons \mathrm{Red}_1 + \mathrm{Ox}_2$ the total reorganization energy can be represented as

$$\lambda = \frac{\lambda_{11} + \lambda_{22}}{2} = \lambda_1 + \lambda_2 \tag{74}$$

$\lambda_{\mathrm{[4Fe4S]\text{-}A}}$ is equivalent to the sum of the rescaled $\lambda_{\mathrm{[4Fe4S]}}$ and $\lambda_{\mathrm{A}}$ values; this provides a value of 1.50 eV. For calculating $\lambda_X$ at a distance $R'_{DA}$, where X is an amino acid, eq. 73 can be used directly without having to first solve for the effective radius of X.

7.1.4 Electronic Couplings

The DFT electronic couplings $V_{IF}^{\mathrm{DFT}}$ (obtained from eq. 32 using the M11 density functional144 and the 6-311g** split-valence basis set) refer to the redox pairs in the gas phase. In most cases, obtaining DFT electronic couplings would be computationally infeasible for protein-mediated tunneling. The range separation implemented in the M11 hybrid meta-GGA density functional produces the correct inverse-distance dependence of the long-range potential. Real electronic wave functions (and electronic couplings) are computed using the NWChem software. To account for the effect of the protein medium on the gas-phase couplings, we consider the average packing density model (eq. 24) to obtain the ‘dressed’ couplings $\tilde{V}_{IF}^{\mathrm{DFT}}$ (Table 3).

Table 3: DFT (M11/6-311g**) values of the electronic couplings obtained from eq. 32 ($V_{IF}^{\mathrm{DFT}}$); and DFT electronic couplings “dressed” with the effects of the protein medium using eq. 24 ($\tilde{V}_{IF}^{\mathrm{DFT}}$). All values are in meV.

| redox couple | $V_{IF}^{\mathrm{DFT}}$ | $\tilde{V}_{IF}^{\mathrm{DFT}}$ |
|---|---|---|
| M288/Y309 | 1.86 | 2.79 |
| Y309/W327 | 8.30 | 10.16 |
| W327/Y345 | 0.01 | 0.12 |
| W327/M307 | 0.04 | 1.17 |
| Y345/Y347 | $2.2 \times 10^{-4}$ | 0.018 |
| DA7/M307 | 46.13 a | 40.96 |
| DG4/Y347 | 2.18 | 9.28 |
| GTP/Y345 | 0.11 | 1.57 |

In order to estimate the electronic couplings between the [4Fe4S] cluster in p58c and other redox-active moieties that can accept the electron hole (A), the edge-to-edge distance $R_{ee}$ between the [4Fe4S] cluster and A is first obtained from the crystal structure. Secondly, the electronic coupling between two A groups is estimated at the edge-to-edge distance of the actual redox couple ($R_{ee}$) using the Hopfield equation (eq. 18). Thirdly, the center-to-center distance $R_{DA}$ between two iron-sulfur clusters at the same edge-to-edge distance as the actual redox pair is obtained by adding twice the effective radius of the [4Fe4S]2+/3+ cluster (~1.87 Å) to $R_{ee}$. For a pair of [4Fe4S] clusters, the electronic coupling is estimated using the expression in ref. 145,

$$\log V_{\mathrm{[4Fe4S][4Fe4S]}} = 1.73 - 0.42\, R_{DA} \tag{75}$$

The coupling between a [4Fe4S]2+/3+ cluster and A, $V_{\mathrm{[4Fe4S]A}}$, is obtained as the geometric average of the [4Fe4S]-[4Fe4S] and A-A electronic couplings, i.e., $V_{\mathrm{[4Fe4S]A}} = \sqrt{V_{\mathrm{[4Fe4S][4Fe4S]}}\, V_{\mathrm{AA}}}$.

Table 4: Semiempirical estimates of electronic couplings (VIF) between the listed redox-active partners at the indicated edge-to-edge distances (Ree) from eq. 18; DFT (M11/6-311g**) values of the electronic couplings obtained from eq. 32 (VIFDFT); and DFT electronic couplings “dressed” with the effects of the protein medium as in eq. 24 (ṼIFDFT). All values are in meV.

a This coupling value was calculated using the M06-HF exchange-correlation functional141,143 to avoid excessive delocalization of the hole over the two redox partners at this short edge-to-edge distance.

| redox partner of [4Fe4S] | $R_{ee}$ (Å) | $V_{\mathrm{AA}}$ | $V_{\mathrm{[4Fe4S][4Fe4S]}}$ | $V_{\mathrm{[4Fe4S]A}}$ |
|---|---|---|---|---|
| M288 | 5.14 | 22 | 9.9 | 15.0 |
| Y309 | 9.95 | 0.30 | $9.5 \times 10^{-2}$ | 0.17 |
| DA10 | 10.74 | 0.12 | $4.4 \times 10^{-2}$ | $7.2 \times 10^{-2}$ |
| M307 | 13.84 | $4.2 \times 10^{-2}$ | $2.2 \times 10^{-3}$ | $9.7 \times 10^{-3}$ |
| DA7 | 17.56 | $8.7 \times 10^{-4}$ | $6.0 \times 10^{-5}$ | $2.3 \times 10^{-4}$ |
| GTP | 20.16 | $1.2 \times 10^{-4}$ | $4.9 \times 10^{-6}$ | $2.4 \times 10^{-5}$ |
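The three-step estimate of the cluster-partner couplings can be reproduced from the tabulated distances, as sketched below. Two points are assumptions inferred from the numbers in Table 4 rather than stated in the text: the logarithm in eq. 75 is taken as base 10, and the resulting cluster-cluster coupling is taken to be in eV (converted here to meV). The A-A couplings are read from Table 4 instead of being recomputed from eq. 18, and the function names are hypothetical.

```python
import math

CLUSTER_RADIUS = 1.87  # effective radius of the [4Fe4S] cluster in Å, from the text


def v_cluster_pair_mev(r_ee):
    """Cluster-cluster coupling (meV) from eq. 75 at the given edge-to-edge distance (Å)."""
    r_da = r_ee + 2.0 * CLUSTER_RADIUS          # center-to-center distance
    return 1000.0 * 10.0 ** (1.73 - 0.42 * r_da)  # assumed: log10, coupling in eV


def v_cluster_partner_mev(r_ee, v_aa_mev):
    """Geometric average of the cluster-cluster and A-A couplings (meV)."""
    return math.sqrt(v_cluster_pair_mev(r_ee) * v_aa_mev)


# (edge-to-edge distance in Å, A-A coupling in meV) as listed in Table 4
partners = {"M288": (5.14, 22.0), "Y309": (9.95, 0.30), "DA10": (10.74, 0.12),
            "M307": (13.84, 4.2e-2), "DA7": (17.56, 8.7e-4), "GTP": (20.16, 1.2e-4)}

for name, (r_ee, v_aa) in partners.items():
    print(f"{name:5s} V_cluster-cluster = {v_cluster_pair_mev(r_ee):.2e} meV, "
          f"V_cluster-partner = {v_cluster_partner_mev(r_ee, v_aa):.2e} meV")
```

Under these assumptions the computed values agree with the last two columns of Table 4 to within rounding.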

7.1.5 Kinetic Model

The Marcus parameters (as described in sections 7.1.2 - 7.1.4) were then inserted into the Marcus rate equation (eq. 8) to obtain tunneling rate constants for the individual steps. Once the rate constants were obtained, they were inserted into the following expression for the mean residence time $\tau$ of the transferring charge in each pathway (where the inverse is the effective CT rate constant),

$$\tau = \sum_{n=0}^{N} \tau_n = \frac{1}{k_{N \to N+1}} + \sum_{n=0}^{N-1} \left( 1 + \sum_{m=n+1}^{N} \prod_{j=n+1}^{m} \frac{k_{j \to j-1}}{k_{j \to j+1}} \right) \frac{1}{k_{n \to n+1}} \tag{76}$$

where N is the total number of redox-active sites/residues in the pathway and $k_{n \to n \pm 1}$ is the rate constant for CT between nearest-neighbor redox sites. n = 0 denotes the initial charge donor, n = 1 to N indicates the bridging sites, and site n = N + 1 is the terminal charge acceptor in contact with a charge ‘drain’. Every forward and backward rate constant between each nearest-neighbor redox pair is inserted into eq. 76, except for the charge-transfer step between site n = N and the terminal acceptor, where only the forward rate constant is considered; this step is irreversible due to rapid scavenging of the charge by redox agents in the cell environment. When all backward rates are neglected, eq. 76 reduces to

$$\tau \cong \sum_{n=0}^{N} \frac{1}{k_{n \to n+1}} \tag{77}$$

where the approximate expression is the sum of the residence times for each forward tunneling step in the pathway.
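The mean residence time of a linear hopping pathway, including all backward steps except the one out of the terminal acceptor, can be evaluated with a short function. The sketch below uses the standard closed form for a linear chain with an irreversible final step; the list-based indexing, the function name, and the rate constants in the example (arbitrary units) are hypothetical and are not values from Table 5.

```python
def mean_residence_time(k_fwd, k_bwd):
    """Mean residence time of a charge in a linear pathway with an irreversible last step.

    k_fwd[n] is the forward rate constant n -> n + 1 for n = 0..N;
    k_bwd[j - 1] is the backward rate constant j -> j - 1 for j = 1..N.
    """
    n_bridge = len(k_fwd) - 1                      # N
    if len(k_bwd) != n_bridge:
        raise ValueError("expected one backward rate constant per bridging site")
    tau = 1.0 / k_fwd[n_bridge]                    # final, irreversible step
    for n in range(n_bridge):
        bracket = 1.0
        product = 1.0
        for j in range(n + 1, n_bridge + 1):
            product *= k_bwd[j - 1] / k_fwd[j]     # running product of backward/forward ratios
            bracket += product
        tau += bracket / k_fwd[n]
    return tau


# Hypothetical pathway: donor (0), two bridging sites (1, 2), terminal acceptor (3)
k_forward = [1.0, 2.0, 4.0]
k_backward = [3.0, 5.0]
tau_full = mean_residence_time(k_forward, k_backward)
tau_forward_only = mean_residence_time(k_forward, [0.0, 0.0])
print(f"tau with back transfer = {tau_full:.3f}; forward-only approximation = {tau_forward_only:.3f}")
```

Comparing the two printed values shows how strongly back transfer can lengthen the residence time relative to the forward-only approximation when the backward rate constants exceed the forward ones.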

7.2 Unidirectionality

We first considered the process of [4Fe4S]2+ in p58c donating an electron to a hole-containing nucleobase, with the possibility of mediation by three tyrosine residues

(Y309, Y345, and Y347).26 Pathway analysis (see below) indicates that the strongest CT pathways terminate on the DA10, DA7, and guanosine triphosphate (GTP) purine nucleobases. DA7 is the most attractive hole-donating purine due to its close proximity to the cluster; it is H-bonded to H303, and the DA7-H303 pair behaves as a pseudo pair that extends the start of the double-stranded region (Figure 8).

[Figure 8 image labels: C384, G383, H382, [4Fe4S], G304, H303, C367, H443, S366, DC9, DA8, GTP, DA7, DA10]

Figure 8: Strongest tunneling pathways between the [4Fe4S] cluster in primase and purine nucleobases in anchored nucleic acid. The pathway strength was defined using the value of the pathway product in eq. 19. We show the p58c secondary and tertiary structures (gray), the amino acid residues in the tunneling pathways, the backbone of DNA/RNA (orange), and the Fe (pink) and S (yellow) atoms in the iron-sulfur cluster. The color code for the atoms in the residues is: C (cyan), O (red), N (blue), H (white).

Because DA7 is the most attractive hole-donating purine, possible redox-active hopping sites (M288, Y309, W327, and M307) for electron-hole hopping pathways between the cluster and DA7 were identified. This was followed by an extensive analysis of the individual tunneling rates between adjacent redox-active sites (Figure 9 and Table 5) using the Marcus rate constant expression for nonadiabatic ET.146 Since the forward tunneling rate constant for the M288-Y309 step is very small (~$10^{-6}$ s$^{-1}$), potential pathways involving M288 are neglected. Rate constants involving Y345 and Y347 are noticeably small; this indicates that HT pathways involving all three tyrosines do not occur on a biologically feasible millisecond timescale.

Table 5: ET paths, steps, parameters, and rate constants for CT mechanisms potentially at play in the p58c-RNA/DNA complex (PDB ID: 5F0Q). Energies are in eV, distances in Å. The redox couple with the strongest connecting electron tunneling pathway, which also corresponds to the largest single-step Marcus-type rate constant, is highlighted in purple, and the corresponding rate constant value is enclosed in a purple rectangle. Similarly highlighted, in cyan, is the single-step ET reaction with inverse rate constant close to a biologically relevant millisecond timescale. The redox pairs in the two fastest routes for hole transfer from the nucleic acid to [4Fe4S]2+ are highlighted by using the same color code as in Figure 10. The upper/lower sign of the reaction free energy refers to the forward/backward CT. Where Tyr and Trp are involved, the reaction free energy and rate constant values in the second lines correspond to the higher oxidation potential values (for pure hole transfer) in Table 1.

| D/A | route | $\Delta G^0$ | $\lambda$ | $R_{ee}$ | $R_{DA}$ | $k_{ET}^{\rightarrow}$ (s$^{-1}$) | $k_{ET}^{\leftarrow}$ (s$^{-1}$) |
|---|---|---|---|---|---|---|---|
| single-step hole transfer to/from [4Fe4S]2+/3+ | | | | | | | |
| [4Fe4S]/DA10 | | ∓ 1.34 | 1.50 | 10.74 | 13.10 | $6.0 \times 10^{7}$ | $1.3 \times 10^{-15}$ |
| [4Fe4S]/DA7 | | ∓ 1.34 | 1.58 | 17.56 | 19.94 | $4.9 \times 10^{2}$ | $1.1 \times 10^{-20}$ |
| [4Fe4S]/GTP | | ∓ 1.21 | 1.80 | 20.16 | 22.54 | 1.1 | $4.0 \times 10^{-21}$ |
| hole hopping to/from [4Fe4S]2+/3+ | | | | | | | |
| [4Fe4S]/M307 | 1 | ∓ 1.58 | 1.87 | 13.84 | 16.74 | $7.3 \times 10^{5}$ | $1.4 \times 10^{-21}$ |
| M307/DA7 | 1,2,5 | ± 0.24 | 1.38 | 3.43 | 4.24 | $2.3 \times 10^{5}$ | $2.6 \times 10^{9}$ |
| [4Fe4S]/Y309 | 2-4 | ∓ 0.85 | 1.77 | 9.95 | 13.27 | $3.5 \times 10^{6}$ | $1.5 \times 10^{-8}$ |
| | | ∓ 1.30 | | | | $1.1 \times 10^{8}$ | $1.1 \times 10^{-14}$ |
| Y309/W327 | 2-7 | ∓ 0.09 | 1.75 | 3.89 | 6.60 | $2.8 \times 10^{5}$ | $8.4 \times 10^{3}$ |
| | | ∓ 0.19 | | | | $1.0 \times 10^{3}$ | $1.7 \times 10^{6}$ |
| W327/M307 | 2,5 | ∓ 0.64 | 1.96 | 8.35 | 9.98 | $2.8 \times 10^{6}$ | $4.2 \times 10^{-5}$ |
| | | ∓ 0.47 | | | | $2.6 \times 10^{5}$ | $2.9 \times 10^{-3}$ |
| W327/Y345 | 3,4,6,7 | ± 0.09 | 1.88 | 7.20 | 9.33 | 0.3 | 11 |
| | | ∓ 0.19 | | | | 66 | $4.1 \times 10^{-2}$ |
| Y345/GTP | 3,6 | ∓ 0.36 | 1.92 | 7.40 | 10.20 | $1.3 \times 10^{5}$ | 0.1 |
| | | ± 0.09 | | | | 38 | $1.3 \times 10^{3}$ |
| Y345/Y347 | 4,7 | 0 | 2.04 | 9.89 | 12.57 | $9.8 \times 10^{-3}$ | $9.8 \times 10^{-3}$ |
| Y347/DG4 | 4,7 | ∓ 0.36 | 1.80 | 5.67 | 7.28 | $1.4 \times 10^{7}$ | 11 |
| | | ± 0.09 | | | | $4.2 \times 10^{3}$ | $1.4 \times 10^{5}$ |
| [4Fe4S]/M288 | 5-7 | ∓ 1.58 | 1.65 | 5.14 | 7.34 | $2.8 \times 10^{12}$ | $5.5 \times 10^{-15}$ |
| M288/Y309 | 5-7 | ± 0.73 | 1.87 | 4.18 | 6.34 | $5.3 \times 10^{-5}$ | $1.1 \times 10^{8}$ |
| | | ± 0.28 | | | | 3.6 | $1.9 \times 10^{5}$ |

Figure 9: CT steps and rate constants in hole hopping between the [4Fe4S] cluster and the nucleobases in the p58c-RNA/DNA complex. (Crystal structure with PDB ID 5F0Q.) The CT steps with an inverse rate constant within a biologically relevant millisecond time scale are in blue; the other steps are in red. The fastest (second fastest) CT route is drawn as a green (orange) dashed line. Possible routes for charge transport between the [4Fe4S] cluster and the nucleic acid are 1: [4Fe4S]-M307-DA7; 2: [4Fe4S]-Y309-W327-M307-DA7; 3: [4Fe4S]-Y309-W327-Y345-GTP; 4: [4Fe4S]-Y309-W327-Y345-Y347-DG4; 5: [4Fe4S]-M288-Y309-W327-M307-DA7; 6: [4Fe4S]-M288-Y309-W327-Y345-GTP; 7: [4Fe4S]-M288-Y309-W327-Y345-Y347-DG4.

Using eq. 76 to obtain the mean residence times of possible ET pathways from [4Fe4S] to the RNA/DNA duplex on a biologically feasible timescale, the fastest ET pathway was identified to be [4Fe4S]-M307-DA7, with a residence time τ on a microsecond timescale (Figure 9). As expected, ET to DG4 through Y345 and Y347 is even slower, with τ in the range $10^{2}$-$10^{6}$ s.

For the second fastest ET pathway (see Figures 9 and 10), which contains the Y309 and W327 residues, we considered two different oxidation potentials for Tyr (0.93 V129 and 1.38 V,130,131 Table 1) and Trp (1.02 V129 and 1.19 V,132 Table 1). Using 0.93 V/1.02 V gives a microsecond timescale, while using 1.38 V/1.19 V gives a millisecond timescale. The oxidation potential for the Tyr (Trp) residue is 0.93 (1.02) V if the ET is coupled to proton transfer to a residue nearby,129 while the oxidation potential is 1.38 (1.19) V in the case of pure ET.130,131 If the transferring proton has a tight H bond with a residue nearby, the short-range proton transfer is not expected to affect appreciably the electronic wave functions of Tyr or Trp. In this case, the CT process leading to the Tyr or Trp oxidation can be described approximately as pure ET with an oxidation potential close to the value for PCET.7 This is essentially the case in the crystal structure with PDB ID 5F0Q (e.g., the N atom of the His292 side chain is in an H-bond configuration with the OH group of the Y309 side chain). Therefore, in our analysis, we described the electron-transfer steps through Tyr (Trp) residues using the Marcus rate equation with 0.93 V and 1.38 V (1.02 V and 1.19 V) as limiting values for the oxidation potential, in order to show the robustness of our kinetic conclusions with respect to parameter choices.
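To make the two limiting cases concrete, the short sketch below computes the driving force for a Tyr-to-Trp hole transfer step from the oxidation potentials quoted above. It assumes that ΔG° is approximated by the difference between the acceptor and donor oxidation potentials; the dictionary layout and the function name are illustrative and are not taken from any published code.

```python
# Oxidation potentials (V) quoted in the text for the two limiting cases.
POTENTIALS = {
    "PCET": {"Tyr": 0.93, "Trp": 1.02},     # ET coupled to proton transfer
    "pure ET": {"Tyr": 1.38, "Trp": 1.19},  # no proton transfer
}


def hole_transfer_dg(donor, acceptor, limit):
    """Return an approximate dG (eV) for moving a hole from donor to acceptor.

    Assumption: dG is the acceptor oxidation potential minus the donor
    oxidation potential (one electron, so volts map directly onto eV).
    """
    potentials = POTENTIALS[limit]
    return round(potentials[acceptor] - potentials[donor], 2)


if __name__ == "__main__":
    for limit in POTENTIALS:
        print(limit, hole_transfer_dg("Tyr", "Trp", limit))
```

Under these assumptions the Tyr-to-Trp step is slightly uphill (0.09 eV) with the PCET potentials and downhill (-0.19 eV) with the pure-ET potentials, which is why both limits are carried through the kinetic analysis.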

Figure 10: Fastest ET routes from the iron-sulfur cluster in primase to DNA/RNA (i.e., hole transfer in the opposite direction). The two most rapid hopping paths (according to eq. 76) are drawn using green ([4Fe4S]-M307-DA7) and orange ([4Fe4S]-Y309-W327-M307-DA7) arrows. The purple arrow shows the most rapid non-productive tunneling from the [4Fe4S] cluster to DA10 according to the tunneling pathway model. H303, which is H-bonded to DA7, and DC6, which is paired with the GTP, are in pink. Crystal structure from PDB, ID 5F0Q.

Although duplex-to-cluster HT is feasible, the energetically uphill HT from [4Fe4S]3+ to a neutral duplex cannot occur on biologically relevant timescales (Figure 11b, process b). This prevents the signaling between the p58c and p180c [4Fe4S] clusters from being exclusively mediated by HT through the duplex. However, concerted mediation by the duplex and ROS may enable the CT-based signaling (Figure 11a, Mechanism 1). Likewise, the signaling may be mediated by transfer of an excess electron (from cellular reductants) through the RNA/DNA duplex (Figure 11a, Mechanism 2) to the [4Fe4S]3+ cluster, which we found to be energetically feasible because of the strongly downhill energy landscape (Figure 11b, process c). On the other hand, EET from [4Fe4S]2+ to a neutral duplex is energetically unfavorable (Figure 11b, process d), with a maximum rate constant of ~10⁻⁷ s⁻¹. This unfavorability is dominated by the large tunneling barrier of more than 2 eV that the electron has to overcome to reach the neutral duplex, as indicated by the large uphill climb in process d.

Figure 11: CT-mediated [4Fe4S] protein signaling and pertinent redox potentials. a, Possible mechanisms for protein communication via CT, supported by the primer: 1) HT or 2) EET between the p58c and p180c [4Fe4S] clusters (drawn as small cubes), mediated by an RNA/DNA duplex transiently bound to both p58c and p180core. Part of the CT route (in red) is energetically unfavorable. Nearby oxidants (O) or reductants (R) should support these signaling mechanisms. 3) Change in relative DNA-binding affinities of primase and Polα caused by direct inter-cluster CT. 4) Competitive protein binding to the primer, modulated by sequential changes in the [4Fe4S] cluster oxidation states assisted by RNA/DNA and surrounding redox agents. b, Redox potential landscape for HT (blue) and EET (green), and related downhill (a, c) and uphill (b, d) ET processes. The y-axis represents the redox potential (V). The energetically unfavorable processes b and d correspond to HT from [4Fe4S]3+ to amino acid/nucleobase and EET from [4Fe4S]2+ to amino acid/nucleobase, respectively.

Mechanism 3 in Figure 11a envisages signaling by HT superexchange between [4Fe4S] clusters, while p58c holds the template/primer during its handoff to Polα.120 Although the crystallographic distance of 38.6 Å (Figure 6a) is too large for charge self-exchange to occur on a biologically relevant timescale, the flexibility of the p58c-p58N linker120,147 and substantial conformational changes during primer synthesis120 can bring the [4Fe4S] clusters sufficiently close, to a minimum inter-cluster distance of ~12.5 Å, for direct CT to occur on the millisecond timescale. In Mechanism 4 (Figure 11a), the duplex donates an electron to a nearby oxidant. This is followed by the reduction of the duplex by the Polα [4Fe4S]2+ cluster and the displacement of the primase by steric effects or other structural interactions (Mechanism 4, bottom panel).

7.3 Concluding Remarks

CT pathways from the redox-active iron-sulfur cluster in p58c to the DNA-RNA duplex have been delineated, and the HT process from the duplex to the iron-sulfur cluster was found to be possible on a microsecond time scale. This finding resolves a main point of debate in the recent literature,121-123 demonstrating that the CT-based “communication” between the [4Fe4S] cluster and the nucleic acid is feasible121 regardless of the specific CT pathway involved. Although the exact rates for the primer duplex handover process between the human primase and Polα have not been characterized experimentally, this process is expected to occur at least on a biologically relevant millisecond timescale.

From our calculations, CT between the cluster and the duplex via three tyrosines (Y309, Y345, and Y347) is unfeasible because of a residence time of ~10⁶ s. This is not surprising, as the analysis from ref. 26 is based on a primase structure that contains an I271S mutation and adopts an atypical β-hairpin conformation at the DNA:RNA-binding interface (Figure 12). Hence, changes in p58 local folding26 significantly alter the Tyr positions, and therefore the specific route that enables the fastest ET from the iron-sulfur cluster to the nucleic acid.


Figure 12. Side-by-side comparison of p58C structures with correctly folded [PDB code 5F0Q] and misfolded (PDB code 3L9Q) substrate-binding regions highlighted in different colors. These regions are overlapped in the left bottom quarter. The positions of Y309, Y345, and Y347 relative to [4Fe-4S]2+ in two structures are shown in the right bottom quarter. Figure and caption are reproduced with permission from ref. 123.

In addition, various mechanisms for DNA-mediated CT and signaling between the iron-sulfur clusters in primase and Polα are proposed, thereby further supporting the possibility of a CT-based primer handover process. More importantly, these mechanisms can describe the DNA-mediated signaling and coordination between iron-sulfur clusters in biological systems. Mechanism 4 (Figure 11a), which suggests the facilitation of primer handover via both CT and steric effects, is the most plausible. Indeed, steric effects have also been suggested by Tahirov et al.,126 where p58c could move towards p180core during RNA primer elongation, causing the displacement of p180core from the p49-p58N-p180c-p70 platform and conveniently positioning the p180core to catch the DNA template with a 9-mer RNA primer from p58c.126


8. Mapping Hole Hopping Pathways in Proteins

Portions adapted with permission from ‘Teo, R. D.; Wang, R.; Smithwick, E. R.; Migliore, A.; Therien, M. J.; Beratan, D. N., Mapping hole hopping escape routes in proteins. Proc. Natl. Acad. Sci. U.S.A. 2019, 116 (32), 15811-15816’. All authors contributed to the writing of the manuscript. In addition: R.D.T. and R.W. conceived and designed the research; R.D.T. and R.W. wrote the code; R.D.T. carried out the computations; R.D.T., A.M., M.J.T., and D.N.B. analyzed the results.

Proteins that incorporate oxygen into organic substrates from molecular oxygen and other potent oxidants (such as H2O2) are vital to biological function. In fact, these oxidant species are often short-lived products of biochemical reactions in the cell. Redox reactions involving these oxidants are susceptible to forming ROS, which can lead to oxidative damage of DNA and proteins.148 ROS damage in proteins can manifest as backbone oxidation,149 protein fragmentation,149 or amino acid side-chain oxidation; cysteine, methionine, tryptophan, and tyrosine are the most susceptible to oxidation.150,151 For example, tyrosine oxidation can promote dityrosine-linked aggregate formation in the eye lens protein γB-crystallin, tertiary structure destabilization, and cataractogenesis.152 Studies also link the effects of ROS damage to aging complications and to cancer.149 Protein oxidation can lead to unfolding and, ultimately, to the loss of function.153 Although proteasomes and lysosomes help to degrade damaged proteins,149 the accumulation of oxidized protein can contribute to Alzheimer’s disease and diabetes.154 Here, we describe a computational method to identify hole hopping chains and to estimate the timescale on which these pathways may convey otherwise damaging oxidizing equivalents to a protein surface, where they may be reduced by external reductants.

Recent studies by Gray and Winkler150,151,155 investigated a ROS protective mechanism that may act within monooxygenases, dioxygenases, and peroxidases. They proposed that chains of redox-active amino acids, including Tyr, Trp, Cys, and Met, can support hole hopping to direct strongly oxidizing electron holes to the protein surface in the absence of the enzyme substrate,150,156 thus preventing damage to the protein active site. Once the excess charge is diverted to the exterior of the protein, reductants within the cell may be able to fill the holes. This hypothesis regarding molecular approaches to “defusing redox bombs”157 was supported by a study of the Research Collaboratory for

Structural Bioinformatics Protein Data Bank (PDB),150 which revealed that one third of the surveyed proteins contained chains of 3-to-5 aromatic amino acids, with the highest occurrence in redox proteins such as and .150

In addition to protecting redox-active sites from oxidative damage, hole hopping pathways could also act to safeguard proteins from labilized hemes that may be produced in over-oxidized proteins like cytochrome c peroxidase (Ccp1). Labilized hemes may be produced when H2O2 levels in Ccp1 (e.g., from Saccharomyces cerevisiae) increase ~10-fold with yeast respiration; in this mechanism, Ccp1 is activated via irreversible oxidation of the axial H175 ligand to transfer its heme to apo-catalase A, after which apo-Ccp1 escapes the mitochondria.158,159 Irreversible oxidation of heme can therefore take place when H2O2 levels are high and hole escape routes are absent.159

Solvent-exposed high-potential [4Fe4S]2+/3+ clusters may serve as terminal hole acceptors for a through-protein oxidative damage protection chain.160 Although it was initially proposed that the Gly829• radical cofactor could be reduced by the [4Fe4S]2+ cluster in the glycyl radical enzyme benzylsuccinate synthase (BSS),161 the 31.2 Å edge-to-edge separation between Gly829• and the cluster (PDB 4PKF) was taken to indicate that this distance cannot support the HT-based protective mechanism.162 Despite the large transfer distance, experimental studies of other proteins151 suggest that this HT reaction may be accessible. Electron tunneling over 15-20 Å can occur on a nanosecond to microsecond timescale in iron and copper proteins,150,151 and hopping (multi-step tunneling) can enable CT over much longer distances on a sub-millisecond time scale when the free energy profile is favorable.163

The computational approach described here is used to identify the fastest hole hopping routes that may limit ROS formation in cytochrome P450 (from Bacillus megaterium, P450BM3), Ccp1, and BSS. This methodology can be used more broadly for rapid database screening in the context of oxidative protection pathways, and can also be used to explore hole hopping in signaling pathways that are proposed for protein-nucleic acid complexes.

We investigated hole hopping routes for P450BM3, Ccp1, and BSS on biologically relevant timescales using the search program described here (EHPath). The program identifies hole hopping routes and estimates the timescale for hole hopping transfer or escape. The Python-based program implements a kinetic model outlined in Section 7.1.5. The program allows the user to choose the ET kinetic parameters; it then identifies the fastest HT or escape routes and estimates the transit time. Estimates of these times allow us to resolve the outstanding question of whether such hopping paths could establish viable signaling or oxidative damage protection mechanisms. The viability of the [4Fe4S] clusters as hole traps in BSS is addressed, and the biological relevance of the differences in the escape times for P450BM3 and Ccp1 is discussed.

8.1 Description of EHPath

The Pathways method for estimating bridge-mediated tunneling interactions uses graph theory to identify tunneling routes that maximize the donor-acceptor electronic coupling.164-166 Other CT pathway finders rely on QM/MM methods to identify tunneling pathways.167 EHPath focuses, instead, on the case of hopping transport. This hopping pathway finder: (i) identifies bidirected graphs that correspond to specific protein structures, (ii) estimates ΔG° and λDA values for all hopping steps using the same approaches outlined in Chapter 7, and (iii) calculates the mean times required for the charge hopping routes from D to A. The structure of a typical graph is shown in Figure 13.

Figure 13. Structure of a directed graph representing hopping routes in a protein, where the nodes represent the initial charge donor site (red), the intermediate hopping sites (yellow), and the acceptor site (blue). Edges between donor and bridge sites and among bridge sites are labeled ‘Forward & Backward’; edges into the acceptor are labeled ‘Forward only’. As the acceptor is in contact with a charge ‘drain’ in the kinetic model, there is no directed edge from acceptor to bridge (namely, no backward charge transfer to the bridge).168

The rates for forward and reverse HT between nodes define the effective lengths of edges in the corresponding graphs. The mean residence time for a charge hopping pathway is computed using the exact (eq. 76) or approximate (eq. 77) expression for user-defined redox-active residues and cofactors. The pathway search uses NetworkX, a Python package for network analysis based on graph theory,169 to rank the hopping pathways based on mean residence time.
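The search-and-rank idea can be sketched in a few lines of standard-library Python. This is not the EHPath source: the depth-first enumeration stands in for the NetworkX path search, the rate dictionary holds made-up illustrative values, and `mean_residence_time` uses the standard mean first-passage time for a linear chain with an irreversible drain at the acceptor, which is assumed here to play the role of eq. 76.

```python
def mean_residence_time(forward, backward):
    """Mean first-passage time (s) along a linear hopping chain.

    forward[i]  : rate (1/s) for the hop from site i to site i+1
    backward[i] : rate (1/s) for the hop from site i+1 back to site i
    The last backward entry is never used because the acceptor is drained.
    """
    tau = 0.0
    for i in range(len(forward)):
        weight = 1.0  # product of backward/forward ratios walking back from i
        for j in range(i, -1, -1):
            tau += weight / forward[j]
            if j > 0:
                weight *= backward[j - 1] / forward[j]
    return tau


def simple_paths(rates, donor, acceptors, max_nodes):
    """Yield loop-free donor-to-acceptor paths with at most max_nodes nodes."""
    stack = [[donor]]
    while stack:
        path = stack.pop()
        for nxt in rates.get(path[-1], {}):
            if nxt in path:
                continue
            if nxt in acceptors:
                yield path + [nxt]
            elif len(path) + 1 < max_nodes:
                stack.append(path + [nxt])


def rank_pathways(rates, donor, acceptors, cutoff_num=4):
    """Rank pathways by mean residence time (cutoff_num + 1 = max nodes)."""
    ranked = []
    for path in simple_paths(rates, donor, set(acceptors), cutoff_num + 1):
        steps = list(zip(path, path[1:]))
        fwd = [rates[u][v] for u, v in steps]
        bwd = [rates.get(v, {}).get(u, 0.0) for u, v in steps]
        ranked.append((mean_residence_time(fwd, bwd), path))
    return sorted(ranked)


# Hypothetical rates (1/s), keyed as RATES[from_site][to_site].
RATES = {
    "D": {"B1": 1e6, "B2": 1e3},
    "B1": {"D": 1e7, "A": 1e5},
    "B2": {"D": 1e2, "A": 1e8},
}

if __name__ == "__main__":
    for tau, path in rank_pathways(RATES, "D", ["A"]):
        print(path, tau)
```

Treating the acceptor as a sink (no outgoing edges) mirrors the ‘drain’ in Figure 13, and the node cap keeps the enumeration from growing combinatorially in residue-rich proteins.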

The EHPath program uses Python libraries (pandas and networkx) for data processing and directed graph analysis. The program first requires the user to provide the input CSV (MS-DOS format) files donor.csv, bridge.csv, and acceptor.csv (sample files can be accessed at https://github.com/etransfer/EHPath). That is, one needs to define the charge (electron or hole) donor, bridge(s), and acceptor(s). The charge donor is easily identified as a protein active site or cofactor. The bridge.csv file is created by first using the grep command to extract the redox-active Cys, Met, Tyr, and Trp residues from the original PDB file:

$ grep -E 'CYS|MET|TRP|TYR' input_PDB.pdb > bridge.pdb

The output file contains all redox-active residues in the protein, but only those in the bridging protein medium (between D and A) can contribute to the relevant pathways. The bridge.pdb file is then converted to the CSV format using tools like Excel or Convertio. Note that the redox-active amino acid residues that are part of the protein active site (or bound to the cofactor) working as charge donor or acceptor are neither the start point of a charge hopping pathway nor part of the bridge. For example, if the donor is a [4Fe4S] cluster, Cys residues bonded to the Fe atoms of the cluster should be removed from both donor and bridge lists. For hole hopping pathways as oxidative damage escape routes in biological systems, the potential acceptor residues are solvent-exposed surface residues. For users with no prior information about the surface residues of their proteins of interest, tools such as PyMOL, which can calculate the solvent accessible surface area, help to identify the relevant terminal residues at the protein surface.
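As an alternative to the grep-plus-spreadsheet route, the extraction and CSV conversion can be sketched in one pass with the Python standard library. The sketch relies only on the fixed-column layout of PDB ATOM records. The column names other than atomtype and residuetype are hypothetical, and the exact CSV layout that EHPath.py expects should be checked against the sample files in the repository. Unlike the grep pattern, which matches any line containing one of the residue names, this filter keeps only ATOM records whose residue-name field matches.

```python
import csv
import io

REDOX_ACTIVE = {"CYS", "MET", "TRP", "TYR"}

# Hypothetical column names; only atomtype and residuetype appear in the text.
FIELDS = ["atomid", "atomtype", "residuetype", "chain",
          "residuenumber", "x", "y", "z"]


def redox_active_atoms(pdb_lines, residues=REDOX_ACTIVE):
    """Yield one dict per ATOM record that belongs to a redox-active residue."""
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[17:20] not in residues:
            continue
        yield {
            "atomid": int(line[6:11]),        # columns 7-11: atom serial
            "atomtype": line[12:16].strip(),  # columns 13-16: atom name
            "residuetype": line[17:20],       # columns 18-20: residue name
            "chain": line[21],                # column 22: chain identifier
            "residuenumber": int(line[22:26]),
            "x": float(line[30:38]),
            "y": float(line[38:46]),
            "z": float(line[46:54]),
        }


def write_bridge_csv(pdb_lines, handle):
    """Write the filtered atoms to an open text handle and return the count."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    rows = list(redox_active_atoms(pdb_lines))
    writer.writerows(rows)
    return len(rows)


# Made-up coordinates, formatted as fixed-column PDB records.
SAMPLE = [
    "REMARK 500 TYR A 309 is mentioned here but is not an atom record",
    "ATOM      1  N   TYR A 309      11.104  13.207   9.447",
    "ATOM      2  CA  GLY A 310      12.000  14.000  10.000",
    "ATOM      4  SD  MET A 307      10.500  12.250   8.125",
]

if __name__ == "__main__":
    buffer = io.StringIO()
    print(write_bridge_csv(SAMPLE, buffer), "atoms written")
    print(buffer.getvalue())
```

For a real structure one would pass an open PDB file as `pdb_lines` and an output file opened with `newline=""` as `handle`; residues bonded to the donor or acceptor cofactor would still need to be removed by hand, as noted above.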


Once the CSV files are generated, one can run the EHPath.py program. The user is asked to enter the names of the donor, bridge, and acceptor CSV files. EHPath.py and the CSV files need to be in the same directory. The program asks for the residue number associated with the donor, a cutoff number (where cutoff number + 1 is the maximum number of nodes in any pathway considered by the program), the number of pathways to be printed in the results section, the types of donor and acceptor (type just ‘electron’ for ET and ‘hole’ for HT), and the value of a parameter α (0 ≤ α ≤ 1). The cutoff number helps one limit the pathways under consideration to the most direct ones between the terminal charge donor and acceptor. We chose a cutoff number of 4 as a good compromise between computational cost and the probability that an efficient pathway contains more than 5 nodes. In fact, our results show that the top-ranked hopping pathway in each of the systems investigated contains fewer than 5 nodes. The α parameter allows the user to vary the reorganization energy λ for each charge-transfer step in a hopping route between the value λDA calculated using eqs. 35-36 and the value of 0.8 eV used in ref. 3. α is defined as follows,

\lambda = 0.8 + \alpha\,(\lambda_{DA} - 0.8) \quad \text{(in eV)} \qquad (78)

Therefore, α = 1 gives λ = λDA, while α = 0 gives λ = 0.8 eV. The program ranks the hopping pathways according to both the approximate and the exact mean residence times (based on our kinetic model27). An option in the program allows the user to define a hopping pathway, by specifying the residue numbers involved, and to calculate the exact mean residence time for such a pathway. Here is an example of a successful run that finds and analyzes the hole hopping pathways between a heme donor with residue number 999 and an amino acid residue acceptor (the maximum number of nodes in any computed pathway is 5, the number of pathways printed in the output is 5, and α = 1),

*****************************************

Welcome! EHPath is able to analyze and rank electron/hole hopping pathways between an electron/hole donor and an electron/hole acceptor according to the overall mean residence time.

*****************************************

Please enter the name of donor file (eg: donor.csv):

Donor_Node_2IJ2.csv

Please enter the name of bridge file, (eg: bridge.csv):

Bridging_Nodes_2IJ2.csv

Please enter the name of acceptor file, (eg: acceptor.csv):

Acceptor_Nodes_2IJ2.csv

Please enter residue number of donor: 999

Please enter cutoff_num (cutoff_num + 1 = maximum number of nodes in the pathway, Warning: A larger cutoff_num will cost more memory and leads to longer computation time): 4

Please specify the number of pathways to be printed: 5

Please enter the type of donor/acceptor (electron or hole): hole

Please specify the value for α: 1

Start calculation......

Top 5 pathways ranked according to the approximate mean residence time:

[999, 96, 90, 334] 0.011699829537750969

[999, 115, 305] 0.05850237678612136

[999, 156, 115, 305] 0.08403832975022704

[999, 96, 334] 0.13273387470162293

[999, 115, 156, 305] 14.437926968672322

Top 5 pathways ranked according to the exact mean residence time:

[999, 96, 90, 334] 0.0367530800137136

[999, 115, 305] 0.060844981316308985

[999, 156, 115, 305] 0.13910192575395705

[999, 96, 334] 0.42140415862323116

[999, 156, 305] 15.103972122225027

End of calculation

Would you like to find the exact mean residence time of a specific pathway? (Yes/No): Yes


Please specify the pathway with their residue numbers (for example, 999, 156, 115, 112, 305): 999, 156, 115, 112, 305

Exact mean residence time: 3956.3190955084924

*****************************************

If you have any questions or suggestions about EHPath, please contact Mr. Ruijie Teo ([email protected]) or Dr. Ruobing Wang

([email protected]).

Citation: Teo, R. D.; Wang, R.; Smithwick, E.; Migliore, A.;

Therien, M. J.; Beratan, D. N. Proc. Natl. Acad. Sci. U.S.A.,

2019, 116, 15811-15816.

*****************************************
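The value entered at the α prompt in the run above selects the reorganization energy through eq. 78. A minimal sketch of that interpolation follows; the function and argument names are illustrative and are not taken from EHPath.py.

```python
def interpolated_lambda(lambda_da, alpha):
    """Reorganization energy (eV) from eq. 78.

    lambda_da : value computed for the specific donor-acceptor pair (eV)
    alpha     : mixing parameter, 0 <= alpha <= 1
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return 0.8 + alpha * (lambda_da - 0.8)


if __name__ == "__main__":
    # 1.5 eV is an arbitrary illustrative value for lambda_da.
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, interpolated_lambda(1.5, alpha))
```

Intermediate α values give a convenient way to probe how sensitive a pathway ranking is to the choice between the computed reorganization energies and the fixed 0.8 eV estimate.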

Steps to define a new charge donor in EHPath (for example, FAD)

Step 1. In the donor.csv file, identify the atoms that are relevant to the charge hopping (that is, the atoms that would be involved in the excess charge distribution, which can be predetermined by means of first-principles computations or based on simple chemical grounds). We consider here the case of FAD, which consists of 53 heavy atoms (atom IDs 5811-5863 in the PDB file with code 5JFC170). From the literature (e.g., see ref. 171 and refs. therein), we know that the frontier molecular orbitals (which describe the excess charge distribution in a single-particle picture) are essentially localized on the flavin moiety, which thus defines the donor group. Therefore, the following line needs to be added under the ‘Section for truncating redox groups’ of EHPath.py:

donor = donor[~(donor['atomtype'].isin(['N1','C2','O2','N3','C4','O4','C4X','N5','C5X','C6','C7','C7M','C8','C8M','C9','C9A','N10','C10']) & (donor['residuetype'].isin(['FAD'])) )]

Step 2. Define the new residue type, starting with the line if self.Residuetype=='FAD': and inserting the pertinent reorganization energy and redox potential. The reorganization energy can be obtained from a variety of methods: from simple approaches that use Marcus’ expression for the reorganization energy146 and exploit available theoretical/experimental data in the literature27 to accurate methods that use classical MD172,173 or QM/MM.174

Step 3. Insert the line to calculate the electronic coupling under the sections ‘Donor-Bridge’ and ‘Donor-Acceptor’.

The reliability of EHPath was tested by showing that the analysis replicated residence times computed earlier in Table 1 of ref. 27. Slight differences in the two sets of mean residence times arise from the fact that DFT was used to calculate the electronic couplings in ref. 27, while semi-empirical estimates of the electronic couplings were used in this study. The match is exact when the same coupling values are used. The EHPath program ranks the hopping pathways, and identifies weaker hole hopping routes as well. These hopping pathways can also be accessed on biologically relevant time scales and were not captured in the earlier analysis. This finding highlights the capability of EHPath to carry out systematic searches of hole hopping routes in redox proteins.

8.2 Cytochrome P450

Using EHPath, we examined the hopping routes in the heme domain of P450BM3, a bacterial class II cytochrome P450 protein that catalyzes alkane hydroxylation. Cytochrome P450 is a known producer of ROS when the Fe-coordinated O2 remains in the oxidized state (in the absence of a substrate).175,176 In this protein, hole-hopping pathways could divert holes from compound I (species D in Figure 14) to the protein surface, thus preventing ROS formation. In the presence of a substrate, these hopping pathways should not enable hole escape on the timescale for converting compound I to compound II (D to E, Figure 14). Ab initio MD simulations have found that D is converted to E in ~0.1 ps.177 Thus, protective HT to the protein surface should not approach this time scale, because the normal enzyme function would otherwise be affected. Yet, hole escape should be faster than the timescale of autoxidation of the oxy-ferrous/superoxy-ferric complex A (which would cause A to produce ferric heme and superoxide). The rate of autoxidation is 0.14 s⁻¹ (0.11 s⁻¹) in the wild-type enzyme with (without) substrate (the R-H substrate shown in Figure 14 is not considered explicitly in our analysis, as it is not involved in hole hopping pathways).178 Two hole hopping pathways to the protein surface have been proposed in P450BM3 (PDB 2IJ2)151 (Figure 15). Distance-based arguments suggested that Pathway 1 would support faster hole hopping.151

Figure 14: Part of the catalytic cycle of cytochrome P450. The cycle begins with substrate entry to the active site,179 displacing an axial water, driving a high-to-low spin conversion at the metal center that causes the iron(III) center to dome the heme, and enabling the heme’s action as an electron sink for the reductase domain.179 When the iron center is reduced from Fe(III) to Fe(II), O2 can bind to the iron center to generate the superoxy form of the enzyme, A.179 This species may induce autoxidation, which could enable species A to decay to Fe(III) and a superoxide anion at a rate of about 0.1 s-1 (indicated in the parentheses).178,179 Species A can avoid this fate by being converted into species B by electron transfer from the reductase domain at a rate faster than 99 s-1 in wild-type cytochrome P450BM3.102,178,179 This step is followed by proton transfer to B, to form species C. A second proton transfer results in an iron-oxo radical cation D (also known as Compound I) and a water molecule. Then, D abstracts a hydrogen atom from the substrate to form a Fe(IV)-hydroxide intermediate E.180 The radical substrate eventually recombines to form the hydroxylated product.


Figure 15: Probable routes for hole hopping from the active site of P450BM3 to the protein surface.151 Pathway 1 (P1, green arrows) is heme→W96→W90→Y334, with edge-to-edge distances (Ree) between the redox-active groups of 7.32 Å, 8.35 Å, and 4.45 Å, respectively, based on PDB structure 2IJ2. Ree was computed as the minimum distance between the heavy atoms for each redox-active site (that is, porphyrin ring of heme, indole group of Trp, phenol group of Tyr, methanethiol moiety of Cys, and dimethyl sulfide in Met). Pathway 2 (P2, orange arrows) is heme→C156→Y115→M112→Y305, with Ree values of 10.03 Å, 3.99 Å, 3.70 Å, and 5.19 Å, respectively. ‘h+’ denotes the hole that transfers from heme to W96 at the start of Pathway 1, or to C156 in Pathway 2, while the arrows in the figure indicate the direction of the protective hole transport.

EHPath was used to find the fastest hole hopping routes with Y334 or Y305 as the terminal hole acceptor in P450BM3 (Figure 15). We then tested how the choice of rate expression and Marcus parameters affects the mean residence time. τM is defined as the exact mean residence time (eq. 76) obtained using the well-known Marcus rate expression (eq. 8), with the ∆G° values approximated as the oxidation potential differences and the reorganization energies obtained using eqs. 35-36. τSBT is defined as the exact mean residence time (eq. 76) obtained using the empirical square barrier tunneling (SBT) rate expression,7

k_{\mathrm{SBT}} = 10^{13}\, e^{-\beta (R_{ee} - R_0)}\, e^{-(\Delta G^\circ + \lambda_{DA})^2 / (4 \lambda_{DA} k_B T)} \qquad (79)

where the rate is in s⁻¹, R0 is a D-A contact distance of 3 Å, Ree is the D-A edge-to-edge distance, and β is the distance decay factor (1.1 Å⁻¹) for the D-A electronic coupling.181 The prefactor of 10¹³ s⁻¹ was obtained from kinetic measurements in Ru-modified azurins.181-183 As for τM, the Marcus free energy parameters for τSBT were obtained by approximating the ∆G° values as the oxidation potential differences and using the reorganization energies from eqs. 35-36. τ'SBT was obtained by inserting rate constants from eq. 79 into the exact expression for the mean residence time (eq. 76); in this case, the rate constants were calculated using approximate free energy parameters from ref. 7 for Tyr and Trp hopping in heme proteins: λ = 0.8 eV, ∆G° = 0.1 eV for HT from the heme to Trp or Tyr, ∆G° = 0 eV for the intermediate HT step(s), and ∆G° = −0.1 eV for the final HT to the terminal Trp or Tyr residue.
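The τ'SBT recipe can be checked numerically for Pathway 1. The sketch below evaluates the SBT rate expression with the parameters quoted above (prefactor 10¹³ s⁻¹, β = 1.1 Å⁻¹, a 3 Å contact distance, λ = 0.8 eV) and the Pathway 1 edge-to-edge distances from Figure 15. Two points are assumptions: the temperature is taken as 298 K, and the mean residence time is computed with the standard mean first-passage expression for a linear chain with a drained acceptor, which is assumed to correspond to eq. 76. Function names are illustrative.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def sbt_rate(r_ee, dg, lam=0.8, beta=1.1, r0=3.0, temp=298.0):
    """Empirical square-barrier tunneling rate (1/s).

    r_ee and r0 in angstrom, beta in 1/angstrom, dg and lam in eV.
    temp = 298 K is an assumption, not a value stated in the text.
    """
    kt = K_B * temp
    distance_factor = math.exp(-beta * (r_ee - r0))
    activation_factor = math.exp(-((dg + lam) ** 2) / (4.0 * lam * kt))
    return 1e13 * distance_factor * activation_factor


def mean_residence_time(forward, backward):
    """Mean first-passage time (s) for a linear chain with a drained end.

    backward[i] is the rate from site i+1 back to site i; the last entry
    is never used because the acceptor does not return the charge.
    """
    tau = 0.0
    for i in range(len(forward)):
        weight = 1.0
        for j in range(i, -1, -1):
            tau += weight / forward[j]
            if j > 0:
                weight *= backward[j - 1] / forward[j]
    return tau


# Pathway 1 (heme -> W96 -> W90 -> Y334): distances from Figure 15 and the
# approximate driving forces used for the primed SBT estimate.
DISTANCES = [7.32, 8.35, 4.45]  # angstrom
DRIVING_FORCES = [0.1, 0.0, -0.1]  # eV

forward = [sbt_rate(r, dg) for r, dg in zip(DISTANCES, DRIVING_FORCES)]
backward = [sbt_rate(r, -dg) for r, dg in zip(DISTANCES, DRIVING_FORCES)]
tau_p1 = mean_residence_time(forward, backward)

if __name__ == "__main__":
    print(f"tau'_SBT for Pathway 1: {tau_p1:.1e} s")
```

Under these assumptions the sketch gives a residence time of approximately 4.6 × 10⁻⁶ s, in line with the τ'SBT entry for Pathway 1 in Table 6; most of that time comes from the uphill first step and its fast back transfer to the heme.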

The fastest HT route (with τM = 3.7 × 10⁻² s) has the structure of Pathway 1 shown in Figure 15. This HT route does not compete with the much faster conversion of D to E on the sub-ps timescale (Figure 14),177 while offering protection against A autoxidation (Figure 14), which takes place on the 10 s timescale in the wild-type enzyme, and on the second or longer timescale in its mutants.178 τM for Pathway 2 (Figure 15) is 4.0 × 10³ s (Table 6), which is larger than the timescale for autoxidation of species A. Therefore, our results using EHPath exclude Pathway 2 as a plausible hole escape route.
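The timescale argument above amounts to a simple window test, sketched below for the two proposed pathways. The lower bound is the ~0.1 ps conversion of D to E and the upper bound is the inverse of the 0.14 s⁻¹ autoxidation rate quoted earlier; treating these as sharp cutoffs is a simplifying assumption made only for illustration, and the function name is hypothetical.

```python
T_CATALYSIS = 1e-13  # s, approximate D-to-E conversion time (~0.1 ps)
K_AUTOXIDATION = 0.14  # 1/s, wild-type enzyme with substrate

# tau_M values (s) for the two proposed pathways, taken from Table 6.
TAU_M = {
    "P1: HEM-W96-W90-Y334": 3.7e-2,
    "P2: HEM-C156-Y115-M112-Y305": 4.0e3,
}


def is_protective(tau, t_catalysis=T_CATALYSIS, k_autox=K_AUTOXIDATION):
    """True if hole escape is slower than catalysis but beats autoxidation."""
    return t_catalysis < tau < 1.0 / k_autox


if __name__ == "__main__":
    for name, tau in TAU_M.items():
        print(name, is_protective(tau))
```

With these bounds Pathway 1 falls inside the window and Pathway 2 falls outside it, matching the conclusion drawn in the text.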

Table 6 shows that the τ'SBT values differ from the corresponding τM values by four orders of magnitude, although both are in the range that would lead to protection on the relevant biological time scale. This is a consequence of the large differences between the individual forward HT rates (see Table S2 of ref. 168), resulting from the use of different free energy parameters in the two estimates of the timescale for hole escape. τM and τSBT, computed using rates derived from eqs. 8 and 79, respectively, use the same ∆G° and λDA values and are therefore in much closer agreement; this reflects the similar conclusions derived from using eqs. 8 and 79. Importantly, both τM and τSBT values are well within the timescale range between the functional conversion from species D to E and the harmful A autoxidation. That is, irrespective of the specific model used for the nearest-neighbor hopping rate, we reach the same conclusion that the fastest hole escape routes are sufficiently fast to prevent A autoxidation while not interfering with the catalytic function. Our theoretical finding that Pathway 1 is the most probable oxidative damage escape route is consistent with the experimental finding that W96•+ is a critical intermediate in the hole hopping of Ru-modified P450BM3 (this amino acid is conserved in ~75% of the cytochrome P450 species150).155

Table 6: Fastest 5 hole hopping routes in P450BM3 (the heme is the hole donor, while Y305 or Y334 is the terminal hole acceptor), ranked by the mean residence time τM (eq. 76, with rates derived from eq. 8 and λDA from eqs. 35-36). τSBT was similarly calculated using eq. 79. τ'SBT also uses eq. 79, but with the free energy parameters of ref. 7.

Hole Hopping Pathways            τM (s)        τSBT (s)      τ'SBT (s)
HEM-W96-W90-Y334 (P1)            3.7 × 10⁻²    9.3 × 10⁻³    4.6 × 10⁻⁶
HEM-Y115-Y305                    6.1 × 10⁻²    1.9 × 10⁻²    6.1 × 10⁻⁶
HEM-C156-Y115-Y305               1.4 × 10⁻¹    7.7 × 10⁻²    -
HEM-W96-Y334                     4.2 × 10⁻¹    5.7 × 10⁻²    9.6 × 10⁻⁶
HEM-C156-Y305                    1.5 × 10¹     6.7           -
HEM-C156-Y115-M112-Y305 (P2)     4.0 × 10³     2.2 × 10²     -

8.3 Cytochrome c Peroxidase

A recent experimental study159 offers an excellent framework for analyzing hole hopping routes that may prevent oxidative damage to the heme in Ccp1 of Saccharomyces cerevisiae (PDB 1ZBY184). Ref. 159 identifies protein zones with different densities of residues that can be oxidized by hole hopping from the heme. The three main zones are shown in Figure 16: Zone 1 contains the highest percentage of oxidizable amino acid residues, followed by zones 2a and 2b.

Figure 16: Redox-active residues (C, M, W, Y) in zone 1 (green), 2a (yellow), and 2b (blue) of Ccp1 (PDB 1ZBY184). The zones lowest in oxidizable residues (zones 3 and 4 of ref. 159) are not shown.


Table 7: Mean residence time τM of the hole for the 5 fastest hole hopping pathways in Ccp1 with selected terminal HT sites identified in ref. 159 (calculated using eq. 8 for the HT rates and eqs. 35-36 for the free energy parameters). Additional pathways are shown in Table S3 of ref. 168.

Hole Hopping Pathways      τM (s)
HEM-W191-Y229              2.5 × 10⁻³
HEM-W191-W211              2.5 × 10⁻³
HEM-Y36                    1.2 × 10⁻²
HEM-Y187-W191-Y229         6.1 × 10⁻²
HEM-Y187-W191-W211         6.1 × 10⁻²

We find that 17 of the 20 fastest hole hopping pathways terminate in zone 1 (Table S3 of ref. 168), providing theoretical support for the experimental data interpretation of ref. 159 (8 residues of the 24 oxidized residues in Ccp1 identified from LC-MS/MS analysis were mapped into zone 1, while the 16 other residues were distributed across zones 2a, 2b, 3, and 4). We also identify W191 as a key residue in four of the top five hole hopping routes. Even using the large reorganization energies when estimating τM, we predict that HT from the heme to Y229, through W191, can occur on a millisecond timescale (Table 7). This result is consistent with the experimental observation of a role for W191•+ in mediating CT between Ccp1 and cytochrome c,185 and the hypothesis that the charge would most likely migrate from W191•+ to a tyrosine residue (which is identified to be Y229 in our analysis).186 Interestingly, comparing the hole hopping pathways in Tables 6 and 7 shows that the fastest pathways in Ccp1 have smaller τM values than the fastest pathways in P450BM3. This difference may correlate with the different functions of the two proteins. The main roles of Ccp1 are the oxidation of two ferrocytochrome c molecules through the initial two-electron reduction of H2O2,187 and the eventual transfer of its heme to catalase during respiration when large mitochondrial levels of H2O2 are present.158 In the absence of ferrocytochrome c molecules, the Ccp1 pathways listed in Table 7 can help protect the active site from irreversible oxidative damage (and heme crosslinking to W51)188 produced by highly oxidizing H2O2; the protection of the active site integrity allows the subsequent transfer of undamaged heme to catalase A.159 The P450BM3 monooxygenase relies, instead, on the less strongly oxidizing O2 as the primary route to the hydroxylation of fatty acid substrates.189 Therefore, the reliance of P450BM3 on a weaker oxidant may require fewer and less rapid hole hopping routes for sufficient protection of the heme active site from oxidative damage than is the case for Ccp1.

8.4 Benzylsuccinate Synthase

τM (eq. 76) and τM,approx (eq. 77) values for the top five hole hopping pathways between Gly829• (hole donor) and [4Fe4S]2+ (hole acceptor) of BSS appear in Table 8. These pathways involve the key residues highlighted in Figure 17. Since the backward HT rates are negligible in these hopping pathways, eq. 77 provides an excellent approximation to the mean residence time (Table 8). The pathways from G829• to [4Fe4S]2+ via Y533, M713, and W58 are predicted to be traversed on a millisecond timescale (see Table 8) as a consequence of their downhill free energy landscapes. Since our analysis demonstrates that [4Fe4S]2+ may reduce G829•, the iron-sulfur cluster could act as a reducing agent in other systems (e.g., in the experimentally observed reversible reduction of the Gly radical in pyruvate formate-lyase, another glycyl radical enzyme).162,190
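The role of the backward rates can be illustrated with a generic textbook result, which is not necessarily identical to eqs. 76-77 of this work: for a linear chain of hopping sites with an irreversible final trap, the mean first-passage time follows a simple recursion, and it reduces to the sum of inverse forward rates when the backward rates vanish. The function names and the rate values in the usage note are hypothetical.

```python
def mean_first_passage_time(forward, backward):
    """Mean time to reach the terminal trap of a linear hopping chain.

    forward[i]  : rate for the hop from site i to site i+1
    backward[i] : rate for the hop from site i back to site i-1
                  (backward[0] must be 0: the first site cannot hop back)
    Uses the recursion T_i = 1/f_i + (b_i/f_i) * T_(i-1), where T_i is the
    mean time needed to advance from site i to site i+1 for the first time.
    """
    if len(forward) != len(backward):
        raise ValueError("forward and backward must have the same length")
    if backward[0] != 0:
        raise ValueError("backward[0] must be 0")
    total = 0.0
    step_time = 0.0
    for kf, kb in zip(forward, backward):
        step_time = 1.0 / kf + (kb / kf) * step_time
        total += step_time
    return total


def forward_only_estimate(forward):
    """Approximation valid when all backward rates are negligible."""
    return sum(1.0 / kf for kf in forward)
```

For example, with illustrative rates `forward = [2.0, 4.0]` and `backward = [0.0, 1.0]`, the recursion gives 0.875 while the forward-only estimate gives 0.75; the two agree when every backward rate is zero, which is the regime described above for BSS.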

Table 8: Mean residence times τM (eq. 76) and τM,approx (eq. 77) of the hole in the 5 fastest hole-hopping escape pathways of BSS, where Gly829• is the initial hole donor and [4Fe4S]2+ is the final hole acceptor. Pathways 1 and 2 are delineated in Figure 17.

Hole Hopping Pathways               τM (s)         τM,approx (s)
G829-M713-Y533-[4Fe4S] (1)          4.5 × 10^-3    4.5 × 10^-3
G829-M713-Y533-W58-[4Fe4S] (2)      4.5 × 10^-3    4.5 × 10^-3
G829-M713-W55-[4Fe4S]               7.8 × 10^-3    7.8 × 10^-3
G829-M713-W55-Y533-[4Fe4S]          8.0 × 10^-3    8.0 × 10^-3
G829-M713-W55-W531-[4Fe4S]          1.0 × 10^-2    9.1 × 10^-3


Figure 17: Locations of the BSS redox-active residues involved in the five most rapid hole hopping routes from G829• to the terminal hole acceptor [4Fe4S]2+ (Table 8). The two fastest paths (1) and (2) are highlighted with green and orange arrows, respectively.


8.5 Concluding Remarks

The HT kinetic modeling and analysis incorporated in the EHPath program allow us to identify, assess, and understand the role of hole hopping pathways as oxidative damage escape routes in three important biological systems, taking into account microscopic models for the HT rates among hopping stations. The quantitative results of this theoretical study support the general viability of the proposed hole-transfer protective mechanism in redox proteins.150,151 Our analysis identifies Pathway 1 in P450BM3 (Figure 15) as the dominant oxidation escape route, compared to the less efficient Pathway 2, irrespective of the specific details of the parameterization for the HT rates (Table 6). We find that the mean residence time computed for the charge on Pathway 1 is shorter than the timescale of the potentially harmful autoxidation of the oxy-ferrous/superoxy-ferric compound (Figure 14) and is longer than the timescale of the functional transition from the iron-oxo radical cation D to the Fe(IV)-hydroxide complex E. The identification of W191 as a key site for hole hopping away from the active site in Ccp1 and the termination of the most rapid hopping pathways in protein zone 1 (Figure 16) both agree with the EPR spectral signature of W191•+.159,185 A [4Fe4S] cluster can serve as the terminal hole acceptor in BSS, reducing the Gly radical through hopping pathways that involve the M713, Y533, and W58 residues. The software and modeling described here provide a robust approach to map and to characterize CT hopping pathways. This scheme should enable further studies of hopping pathways in proteins, DNA, and protein-DNA complexes, including the primase-polymerase complex.26 Future applications of this approach to oxygen-utilizing/evolving proteins, including ribonucleotide reductase, cytochrome c oxidase, and photosystem II,150 may help to reveal how nature has evolved charge hopping protection mechanisms.


9. Role of Y345C Mutation on Primase-RNA/DNA Hole Hopping Pathways

9.1 Introduction

Somatic tyrosine-to-cysteine mutations (Y345C) in the DNA primase enzyme are found in gastric tumors.191 The Barton group has performed electrochemical experiments to characterize the CT kinetics between [4Fe4S]2+ in wild-type/mutant p58c and DNA,26 where the Y345C mutation in primase was found to reduce [4Fe4S]2+-DNA CT by ~60% as compared to wild-type primase. However, just how the mutation affects CT is poorly understood. Several groups have commented that O’Brien et al.26 used a partially misfolded p58c in their electrochemical experiments, therefore questioning the veracity of their results.124 In order to investigate the impact of the Y345C mutation on biological CT in primase, MD simulations of the native primase-nucleic acid complex were carried out, and the hopping pathways in the extracted MD snapshots were analyzed using EHPath (see Chapter 8 for EHPath details). This analysis helps probe the structural effects that this mutation might have on the complex, which in turn affect charge conduction between the [4Fe4S] cluster embedded in the primase protein and its bound nucleic acid.

Binding affinity measurements by the Barton group have shown that high-potential [4Fe4S] cluster-containing proteins bind more tightly (up to 550 times more tightly) to the duplex when the [4Fe4S] cluster is in the 3+ state as compared to the 2+ state.43 However, force field parameters for the [4Fe4S]3+ cluster are unavailable. Hence, these parameters have been developed (by first optimizing the geometry of the cluster using broken-symmetry DFT) and used for the MD simulations.

An initial investigation of the primase-RNA/DNA crystal structure (PDB ID 5F0Q) shows that the Y345 residue of the primase enzyme interacts electrostatically with the terminal guanine nucleobase of the primase-bound RNA/DNA duplex. The mutation of the Y345 residue from tyrosine to cysteine is predicted to modulate the dominant hopping pathways through structural perturbations. This modulation will be further elaborated in section 9.3.6.

9.2 Broken-Symmetry DFT

A molecule with ‘broken symmetry’ has an electronic wave function of lower symmetry than its nuclear geometry.192 In other words, the resulting eigenstates of the electronic Hamiltonian H are not eigenfunctions of the symmetry operators for the relevant nuclear point group.193 Compared to the more expensive multi-reference methods, broken-symmetry DFT (BS-DFT) is an alternative and cheaper approach that can approximate electron correlation in polynuclear metal complexes. In BS-DFT, spin-up (α) and spin-down (β) electrons are placed in different spatial orbitals (different orbitals for different spins). The wavefunction of the BS state $|\Psi_{\mathrm{BS}}\rangle$ is represented as

$$|\Psi_{\mathrm{BS}}\rangle = \phi'_{a} \otimes \phi'_{b} \qquad (80)$$

where $\phi'_{a}$ and $\phi'_{b}$ are the magnetic orbitals, which are related to the spin-restricted MOs $\phi_{1}$ and $\phi_{2}$ via the following relations:

$$\phi'_{a} = (\cos\theta)\,\phi_{1} + (\sin\theta)\,\phi_{2} \qquad (81)$$

$$\phi'_{b} = (\cos\theta)\,\phi_{1} - (\sin\theta)\,\phi_{2} \qquad (82)$$

Since $\langle \phi'_{a} | \phi'_{b} \rangle = \cos^{2}\theta - \sin^{2}\theta$, the overlap becomes 1 when $\theta = 0$; strong overlap between the BS magnetic orbitals therefore implies that $\theta \approx 0$. In this limit, the magnetic orbitals both reduce to the closed-shell MO $\phi_{1}$ obtained from spin-restricted calculations. When $\theta \neq 0$, $\phi'_{a} \neq \phi'_{b}$, resulting in broken spin symmetry.
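The dependence of the magnetic-orbital overlap on the mixing angle can be checked numerically. The sketch below is illustrative only: it represents the two spin-restricted MOs as orthonormal two-component vectors (an assumption made purely for the demonstration), mixes them with cos θ and sin θ as in eqs. 81-82, and evaluates the overlap, which equals cos²θ − sin²θ = cos 2θ.

```python
import math


def magnetic_orbitals(theta):
    """Return the two magnetic orbitals for mixing angle theta (radians).

    The spin-restricted MOs are modeled as the orthonormal vectors
    (1, 0) and (0, 1); this is a toy representation, not a real basis.
    """
    mo_1 = (1.0, 0.0)
    mo_2 = (0.0, 1.0)
    c, s = math.cos(theta), math.sin(theta)
    orbital_a = tuple(c * x + s * y for x, y in zip(mo_1, mo_2))
    orbital_b = tuple(c * x - s * y for x, y in zip(mo_1, mo_2))
    return orbital_a, orbital_b


def overlap(theta):
    """Overlap of the two magnetic orbitals; analytically cos(2*theta)."""
    orbital_a, orbital_b = magnetic_orbitals(theta)
    return sum(x * y for x, y in zip(orbital_a, orbital_b))
```

At θ = 0 the two orbitals coincide (overlap 1, closed-shell limit), while at θ = π/4 they are orthogonal (overlap 0), the limit of fully localized, non-overlapping magnetic orbitals.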

[4Fe4S] clusters in the 3+ oxidation state (S = ½ ground state) are characterized by antiferromagnetic Heisenberg coupling between the S = 9/2 mixed-valence [2Fe2S] cluster and the S = 4 ferric [2Fe2S] cluster (this assignment has been derived from EPR measurements).194,195 For the mixed-valence [2Fe2S] cluster, the two high-spin Fe centers are coupled ferromagnetically and one minority spin electron is delocalized between the two Fe atoms.196 The antiferromagnetic coupling between the two [2Fe2S] layers is represented by a BS state,197 which can be constructed using the generalized ionic fragment approach.198 Each [2Fe2S] layer is represented as an ionic fragment (where the electronic structure of each fragment is first calculated through SCF iterations), and the BS state of the [4Fe4S] cluster arising from the two ionic fragments is then generated by a given magnetic coupling scheme.

9.3 Methodology and Results

9.3.1 Generation of Truncated Models

The crystal structure coordinates of the [4Fe4S] cluster were extracted from PDB 5F0Q,199 together with the coordinated Cys residues that were pruned to SCH3; the pruned system was generated using the Maestro Molecular Modeling Suite.39 The hydrogens were then geometrically optimized using the B3LYP functional200 and 6-31G* basis set while keeping the other atoms fixed. The resulting structure was then used as the starting input geometry for the geometry optimization of the [4Fe4S]2+ and [4Fe4S]3+ clusters.

9.3.2 DFT Calculations

The [4Fe4S]3+ cluster has a total spin S = ½ with the two [2Fe2S] layers antiferromagnetically coupled to each other; the first layer consists of a mixed-valence pair (each iron has a formal charge of +2.5) with total spin S1 = 9/2, while the second layer consists of a ferric pair (each iron has a formal charge of +3) with S2 = 4,195 where the second layer is the more oxidized layer. Although a previous 2D-NMR study of a [4Fe4S]3+-containing protein from Chromatium vinosum (CvHiPIP) was able to assign the first two Cys (Cys 43 and Cys 46) in the sequence to the more oxidized layer and the second two Cys (Cys 61 and Cys 75) to the less oxidized layer,201 this assignment may not be consistent with other [4Fe4S] proteins.202 Due to the lack of any experimental spin characterization of the [4Fe4S]3+ cluster in primase, geometry optimizations on all six possible redox layer assignments were carried out.202 Similarly, geometry optimizations on the three possible redox layer assignments were carried out for the [4Fe4S]2+ cluster with a total spin S = 0, where the two [2Fe2S] layers of S = 9/2 each are antiferromagnetically coupled.
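The counts of six and three layer assignments follow from simple combinatorics: four Fe atoms can be split into two pairs in three ways, and the split is counted twice when the two layers are distinguishable (ferric versus mixed-valence, as in the 3+ state) but once when both layers are equivalent mixed-valence pairs (as in the 2+ state). The following sketch enumerates them; the atom labels and function name are placeholders chosen for illustration.

```python
from itertools import combinations

FE_ATOMS = ("Fe1", "Fe2", "Fe3", "Fe4")  # placeholder labels


def layer_assignments(distinguishable_layers):
    """Enumerate ways to split four Fe atoms into two [2Fe2S] layers.

    distinguishable_layers=True  : the first pair is the ferric layer and the
                                   second the mixed-valence layer (3+ state).
    distinguishable_layers=False : both layers are equivalent (2+ state), so
                                   swapping the two pairs gives the same case.
    """
    assignments = []
    seen = set()
    for first_pair in combinations(FE_ATOMS, 2):
        second_pair = tuple(fe for fe in FE_ATOMS if fe not in first_pair)
        if not distinguishable_layers:
            key = frozenset((first_pair, second_pair))
            if key in seen:
                continue  # same split with the two layers swapped
            seen.add(key)
        assignments.append((first_pair, second_pair))
    return assignments
```

Calling `layer_assignments(True)` yields six assignments and `layer_assignments(False)` yields three, matching the number of geometry optimizations described above.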

For each layer assignment, the system was geometrically optimized by BS-DFT (B3LYP/6-31G**)196,202-204 using the NWChem package82 with an extra fine integration grid, Grimme DFT-D3 dispersion correction, and the COSMO model205 (ε = 4) in order to model the protein dielectric environment.206 Distorted cluster structures were obtained from the geometry optimizations, consistent with similar distorted [4Fe4S] structures observed from the DFT studies of Niu and Ichiye (see Figure 18 for geometries for the [4Fe4S]3+ cluster, 13+-63+).202,207 The average geometry for the [4Fe4S] cluster in the respective 2+ or 3+ redox state was also obtained by first constructing the z-matrices corresponding to each of the individual geometries (six for the 3+ state, three for the 2+ state) using the newzmat tool of Gaussian, followed by averaging the parameters of the matrices to generate the corresponding z-matrix, and in turn, the geometric coordinates, for the average geometry (see Figure 18g for the average geometry for the [4Fe4S]3+ cluster).

Figure 18: a-f) Optimized geometries 13+-63+ of the [4Fe4S]3+ cluster corresponding to the six spin layer assignments, and g) average geometry of the [4Fe4S]3+ cluster obtained by averaging the geometries a-f.

For each optimized geometry, the Hessian matrix was generated, and the force constants corresponding to bond stretching and angle bending were extracted using the VFFDT program.208 Since the torsion barriers for metal-ligand bonds are smaller than kBT,101 the dihedral parameters are often ignored.208 The six sets of force constants, bond distances, and bond angles for each optimized [4Fe4S]3+ geometry, and the averages, are presented in Tables A1-A4 (see Appendix). The three sets of force constants, bond distances, and bond angles for each optimized [4Fe4S]2+ geometry, and the averages, are presented in Tables A5-A8 (see Appendix).

RESP charges were calculated for all atoms in the truncated models [(4Fe4S)(SCH3)4]1- and [(4Fe4S)(SCH3)4]2- (B3LYP/6-31G*, for compatibility with the AMBER force field (ff) 99SB)209 in the gas phase using the ESP module in the NWChem package; for each SCH3 group, the charge on the third hydrogen atom was constrained to be -0.0987 (sum of partial charges of the CA, HA, N, HN, C, O atoms in the CYM residue of AMBER ff99)210, as the atom is eventually substituted with the backbone atoms (i.e., CA, HA, N, HN, C, O atoms) in the full cysteine residue. This ensures that the partial atomic charges for the backbone atoms are kept consistent with the charges listed for the CYM residue (cysteine bound to a metal) of AMBER ff99SB, since these atoms are located further away from the [4Fe4S]3+/2+ cluster than the SCH3 group. The six sets of partial charges, and the averages for [4Fe4S]3+, are presented in Table A9. The three sets of partial charges, and the averages for [4Fe4S]2+, are presented in Table A10.

The 12-6 Lennard-Jones (LJ) parameters Rmin,ij (distance between atom i and atom j that corresponds to a minimum in the LJ potential) and εij (well depth) for Fe2+ and Fe3+ for the TIP3P103 water model were obtained from refs. 211 and 212, respectively. As the [4Fe4S]3+ cluster consists of an Fe3+-Fe3+ layer and an Fe2.5+-Fe2.5+ layer, the average formal charge on each Fe atom is +2.75. We scaled the LJ parameters for Fe2.75+ linearly using the following formulae:

$$R_{\min}^{2.75+} = R_{\min}^{2+} + x\,\left(R_{\min}^{3+} - R_{\min}^{2+}\right) \qquad (83)$$

$$\varepsilon^{2.75+} = \varepsilon^{2+} + x\,\left(\varepsilon^{3+} - \varepsilon^{2+}\right) \qquad (84)$$

where $x = 0.75$, $R_{\min}^{2+}$ and $\varepsilon^{2+}$ are the distance and well depth for Fe2+, while $R_{\min}^{3+}$ and $\varepsilon^{3+}$ are those for Fe3+. We obtain $R_{\min}^{2.75+}/2 = 1.4465$ Å and $\varepsilon^{2.75+} = 0.0247$ kcal/mol for each Fe atom in [4Fe4S]3+, assuming an equal distribution of formal charges across all Fe atoms.

The ‘scaled’ LJ parameters for Fe2.5+ were reused for [4Fe4S]2+, since each Fe atom has a formal charge of +2.5.
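The linear scaling of eqs. 83-84 amounts to interpolating each LJ parameter between its Fe2+ and Fe3+ values, with the interpolation fraction set by the average formal charge of the Fe atoms. A minimal sketch follows; the function names are hypothetical, and no literature LJ values are hard-coded because they are taken from refs. 211 and 212 and are not reproduced in this section.

```python
def average_fe_charge(layer_charges):
    """Average formal charge per Fe, given the per-Fe charge of each layer.

    Each [2Fe2S] layer holds two Fe atoms, so the layers are weighted equally.
    """
    return sum(layer_charges) / len(layer_charges)


def interpolation_fraction(fe_charge):
    """Fraction x between Fe2+ (x = 0) and Fe3+ (x = 1)."""
    return fe_charge - 2.0


def interpolate_lj(x, value_fe2, value_fe3):
    """Linear interpolation used for both Rmin and the well depth."""
    return value_fe2 + x * (value_fe3 - value_fe2)
```

For the 3+ cluster, `average_fe_charge([3.0, 2.5])` gives 2.75 and hence x = 0.75; for the 2+ cluster, `average_fe_charge([2.5, 2.5])` gives 2.5 and x = 0.5. Passing the Fe2+ and Fe3+ parameters of the cited references to `interpolate_lj` would then yield the scaled values.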

9.3.3 Structural Comparison

We compared the six optimized geometries of [4Fe4S]3+ with the crystal structure by calculating the RMSDs of the 4 Fe and 4 S atoms of the cluster. The [4Fe4S] cluster of primase was crystallized in the oxidized state of 3+, as the (aerobic) sitting-drop vapor diffusion protocol was utilized and generated needle-like prisms over 2-4 days.199 33+ was found to have the smallest RMSD of 0.258 Å and therefore the closest structural similarity to the cluster extracted from the crystal structure (Table 9). However, since the RMSDs of all six structures are similar, an average set of force field parameters, obtained from averaging across the six sets of force field parameters for 13+-63+, will be used in the MD simulations. Likewise, an average set of force field parameters for the [4Fe4S]2+ cluster will be used.

Table 9: RMSD comparison of the six optimized [4Fe4S]3+ structures (optimized with the Cosmo model) with the crystal structure.

Structures    RMSD (Å)
13+           0.283
23+           0.278
33+           0.258
43+           0.311
53+           0.279
63+           0.307

9.3.4 MD Simulations

MD simulations were first performed using the set of average force field parameters (A3+) for the [4Fe4S]3+-containing primase-RNA/DNA complex. The average force field parameters consist of the average force constants, distances, angles, and average partial charges of the 4 Fe and 4 S atoms of the cluster. The partial charges of the Cys fragments were averaged, such that each atom (H, CB, and S) of Cys has the same charge across all Cys residues, while the force constants involving H (excluding the H atoms that were constrained in the RESP calculations), CB, S, and/or the atoms of the cluster were also averaged across the four fragments. Consistently, existing force constants from ff14SB (parm10) were used for bonds or angles that involve any atoms that are not in the truncated model. Force field parameters involving the other atoms of the Cys residue (such as CA, N, C, O) correspond to those for CYM. AMBER parameters for the N-terminal GTP of the DNA/RNA duplex were taken from ref. 213. The primase protein and RNA/DNA duplex were parameterized by the ff14SB214 and ff99-bsc0210,215 AMBER force fields, respectively. The primase-duplex system was neutralized with Na+ ions and solvated with TIP3P103 water extending 10 Å on each side. The unit cell vectors were (in Å): (66.000, 88.000, 74.000).

Structure 33+ provided the smallest RMSD of 0.258 Å with respect to the cluster atoms of the crystal structure (Table 9). MD simulations were also carried out using the force field parameters associated with 33+; force field parameters involving atoms within each [2Fe2S] redox layer assigned to 33+ were averaged. The RMSDs corresponding to the single-trajectory simulations associated with A3+ and 33+ are presented in Figures 19 and 20, respectively. Additional MD simulations were performed for the [4Fe4S]3+-containing Y345C mutated primase-RNA/DNA complex (see Figures 21-22 for RMSDs). MD snapshots were extracted every ns and used for binding free energy and hopping pathway analysis.

MD simulations using the set of average force field parameters (A2+) for the [4Fe4S]2+-containing primase-RNA/DNA complex were also performed (Figure 23).

[Plot: RMSD (Å) versus time (ns)]

Figure 19: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 120 ns using the A3+ average force field parameters. Subsequently, MD snapshots were extracted every ns from 21-120 ns for EHPath analysis.

[Plot: RMSD (Å) versus time (ns)]

Figure 20: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 130 ns using the 33+ force field parameters. Subsequently, MD snapshots were extracted every ns from 31-130 ns for EHPath analysis.

[Plot: RMSD (Å) versus time (ns)]

Figure 21: RMSD (Å) of the mutated Y345C primase-RNA/DNA complex (excluding water and counterions) across 125 ns using the A3+ force field parameters. Subsequently, MD snapshots were extracted every ns from 26-125 ns for EHPath analysis.

[Plot: RMSD (Å) versus time (ns)]

Figure 22: RMSD (Å) of the mutated Y345C primase-RNA/DNA complex (excluding water and counterions) across 110 ns using the 33+ force field parameters. Subsequently, MD snapshots were extracted every ns from 11-110 ns for EHPath analysis.

[Plot: RMSD (Å) versus time (ns)]

Figure 23: RMSD (Å) of the wild-type primase-RNA/DNA complex (excluding water and counterions) across 120 ns using the A2+ average force field parameters.

9.3.5 Binding Free Energies

Primase-RNA/DNA binding free energies were calculated with the MM/PBSA method (based on the single-trajectory protocol) using MMPBSA.py.216 The default AMBER atomic radii were used for the PB calculations.217 CPPTRAJ218 was also used to ensure that the input MD trajectories (sampled every 1 ns from 21-120 ns for wild-type primase versus 26-125 ns for Y345C mutated primase, obtained using the average force field parameters) are compatible with MMPBSA.py. The ΔΔG value obtained is 1.7 kcal/mol (~3 kBT), implying that the mutation does not appreciably perturb the primase-DNA binding free energy.
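The conversion of the binding free energy difference into thermal units can be verified with a few lines. The sketch assumes a temperature of 300 K, which is not stated in this section, and uses the molar gas constant in kcal/(mol K); the function name is hypothetical.

```python
GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def in_units_of_kbt(energy_kcal_per_mol, temperature_k=300.0):
    """Express a molar free energy in multiples of kBT.

    temperature_k defaults to 300 K, an assumed simulation temperature.
    """
    return energy_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k)


ddg_in_kbt = in_units_of_kbt(1.7)  # roughly 2.85, i.e. about 3 kBT
```

A difference of this size is only a few times the thermal energy, which is the basis for describing the perturbation as not appreciable.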


9.3.6 Hole Hopping Pathways

For examining hole hopping pathways, we first defined the [4Fe4S] cluster as the electron donor/hole acceptor while the RNA/DNA is the electron acceptor/hole donor. This is consistent with the setup that we described in ref. 27. The top hopping pathway, along with its corresponding residence time,27 for each MD snapshot extracted from the wild-type primase and mutated primase simulations was computed using the EHPath.py code that we developed.168 The α parameter for reorganization energies, as detailed in eq. 78, was set to 1. The pathways derived from the 100 snapshots in each of the four runs (wild-type and Y345C primase using force field sets A3+ and 33+) were classified according to their percentage of occurrence, while the residence time was averaged across the total number of pathways. The results are presented in Tables 10 (based on MD snapshots using A3+) and 11 (using 33+).
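The bookkeeping described above (one top pathway and one residence time per snapshot, then percentages of occurrence and an averaged residence time) can be sketched as follows. This is not the EHPath.py interface: the input format, the function name, and the use of the sample standard deviation are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, stdev


def summarize_top_pathways(per_snapshot_results):
    """Summarize the top pathway found in each MD snapshot.

    per_snapshot_results: list of (pathway_label, residence_time) tuples,
    one per snapshot (hypothetical input format).
    Returns (percentages, mean_time, std_time), where percentages maps each
    pathway label to its percentage of occurrence.
    """
    counts = Counter(label for label, _ in per_snapshot_results)
    n_snapshots = len(per_snapshot_results)
    percentages = {label: 100.0 * n / n_snapshots for label, n in counts.items()}
    times = [time for _, time in per_snapshot_results]
    # Sample standard deviation; the source does not state which one was used.
    return percentages, mean(times), stdev(times)
```

With 100 snapshots per run, each snapshot contributes exactly one percentage point to the pathway it selects, so the percentages within one run sum to 100.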

Table 10: Top hole hopping pathways between the [4Fe4S] cluster in wild-type or Y345C primase, and the nucleic acid. MD snapshots obtained using the force field set A3+ were analyzed. The average residence times and standard deviation are shown in parenthesis (first column).

System type                                      Top hopping pathways              % of pathways
Wild-type primase-RNA/DNA (4.84 ± 2.40 µs)       [4Fe4S]-M307-DA7                  37.0
                                                 [4Fe4S]-Y309-M307-DA7             8.0
                                                 [4Fe4S]-Y309-W327-DA7             28.0
                                                 [4Fe4S]-Y309-W327-M307-DA7        27.0
Y345C primase-RNA/DNA (7.11 ± 6.00 µs)           [4Fe4S]-M307-DA7                  53.0
                                                 [4Fe4S]-Y309-M307-DA7             10.0
                                                 [4Fe4S]-Y309-W327-M307-DA7        37.0

Table 11: Top hole hopping pathways between the [4Fe4S] cluster in wild-type or Y345C primase, and the nucleic acid. MD snapshots obtained using the force field set 33+ were analyzed. The average residence times and standard deviation are shown in parenthesis (first column).

System type                                      Top hopping pathways              % of pathways
Wild-type primase-RNA/DNA (5.05 ± 2.62 µs)       [4Fe4S]-M307-DA7                  52.0
                                                 [4Fe4S]-Y309-M307-DA7             5.0
                                                 [4Fe4S]-Y309-W327-M307-DA7        43.0
Y345C primase-RNA/DNA (8.93 ± 10.96 µs)          [4Fe4S]-M307-DA7                  59.0
                                                 [4Fe4S]-Y309-M307-DA7             10.0
                                                 [4Fe4S]-Y309-W327-M307-DA7        31.0

From Table 10, the dominant pathways for wild-type primase are [4Fe4S]-M307-DA7 (37%), [4Fe4S]-Y309-W327-DA7 (28%) and [4Fe4S]-Y309-W327-M307-DA7 (27%). The [4Fe4S]-M307-DA7 pathway for Y345C primase dominates at an even higher percentage (53%), while the second strongest pathway is [4Fe4S]-Y309-W327-M307-DA7 at 37%. Most notably, the average residence time for CT between the [4Fe4S] cluster and the nucleic acid in Y345C primase is larger than that in wild-type primase by ~47%. These observations are also found from simulations performed using the force field set 33+. From Table 11, the dominant pathway in both wild-type and Y345C primase is [4Fe4S]-M307-DA7. Similarly, the second most dominant pathway in Y345C primase is [4Fe4S]-Y309-W327-M307-DA7. The average residence time for CT between the [4Fe4S] cluster and the nucleic acid in wild-type primase is smaller than that in Y345C primase by about 43%. Our results agree with the reduction in CV peak current for the Y345C primase-DNA complex from the wild-type primase observed by Barton and coworkers.26 While the decrease in peak current observed by Barton et al. was ~67%, the primase that was used in their experiments contained the I271S mutation while the DNA substrate that was used lacked several biologically-relevant features.123 As we used a biologically-relevant DNA substrate and native primase structure in our simulations and analysis, the magnitude of the decrease in residence time between the wild-type and Y345C primase is expected (and indeed confirmed) to be different from that observed by Barton et al.
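When comparing the averaged residence times of Tables 10 and 11, the percentage depends on which system is taken as the reference. The short sketch below evaluates both conventions for the tabulated means; the function names are hypothetical.

```python
def percent_larger(reference, value):
    """How much larger `value` is than `reference`, in percent of `reference`."""
    return 100.0 * (value - reference) / reference


def percent_smaller(reference, value):
    """How much smaller `value` is than `reference`, in percent of `reference`."""
    return 100.0 * (reference - value) / reference


# Mean residence times in microseconds, taken from Tables 10 and 11.
table_10 = {"wild_type": 4.84, "y345c": 7.11}
table_11 = {"wild_type": 5.05, "y345c": 8.93}

# Y345C relative to wild type (wild type as reference).
t10_larger = percent_larger(table_10["wild_type"], table_10["y345c"])   # about 46.9
t11_larger = percent_larger(table_11["wild_type"], table_11["y345c"])   # about 76.8

# Wild type relative to Y345C (Y345C as reference).
t10_smaller = percent_smaller(table_10["y345c"], table_10["wild_type"])  # about 31.9
t11_smaller = percent_smaller(table_11["y345c"], table_11["wild_type"])  # about 43.4
```

Both conventions show a slower hole transfer in the Y345C complex; stating the reference system avoids ambiguity when the percentages are compared with the experimental change in peak current.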

Table 12: RDA of D-A pairs from the top hopping pathways listed in Table 10.

                     RDA (Å)
D-A                  Wild-type    Y345C
[4Fe4S]-M307         16.30        16.70
M307-DA7             4.22         4.15
[4Fe4S]-Y309         13.41        13.38
Y309-M307            12.52        12.36
Y309-W327            6.63         6.88
W327-DA7             11.77        13.33
W327-M307            10.82        10.21


From Table 12, the D-A pair corresponding to the largest difference in RDA between the wild-type and Y345C primase (1.56 Å) is the W327-DA7 pair. Since RW327-DA7 increases to 13.33 Å for Y345C primase, the second most dominant pathway observed in wild-type primase ([4Fe4S]-Y309-W327-DA7) is turned off in Y345C primase. R[4Fe4S]-M307 is 0.40 Å larger for the Y345C primase than for the wild-type primase, while R[4Fe4S]-Y309 is about the same for both primases (Table 12). The exponential dependence of the mean residence time on distance (τ ∝ e^(βR), β ~1.1 Å^-1) implies that even a 0.40 Å increase would result in a 55% increase in τ[4Fe4S]-M307.

From Table 13, RM307-DA7 is noticeably larger for the Y345C primase (4.29 Å) than that for the wild-type primase (3.91 Å). Since the wild-type and Y345C primase share the same hopping pathways (Table 11), and these pathways contain the M307-DA7 step, the RM307-DA7 difference would result in a 52% increase in τM307-DA7. Likewise, a 44% increase would result for τY309-W327, where the Y309-W327 step is found in the second most dominant pathway (Table 11) for both the wild-type and Y345C primase.
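The percentage increases quoted for the individual hopping steps follow from the exponential distance dependence, τ ∝ exp(βR) with β ≈ 1.1 Å^-1, applied to the distance differences in Tables 12 and 13. A minimal sketch (the function name is hypothetical):

```python
import math

BETA_PER_ANGSTROM = 1.1  # distance decay factor quoted in the text


def percent_increase_in_tau(delta_r_angstrom, beta=BETA_PER_ANGSTROM):
    """Percent increase in residence time for a distance increase delta_r.

    Assumes tau is proportional to exp(beta * R), so the ratio of residence
    times depends only on the change in donor-acceptor distance.
    """
    return 100.0 * (math.exp(beta * delta_r_angstrom) - 1.0)


fe4s4_m307 = percent_increase_in_tau(16.70 - 16.30)  # Table 12, about 55%
m307_da7 = percent_increase_in_tau(4.29 - 3.91)      # Table 13, about 52%
y309_w327 = percent_increase_in_tau(6.83 - 6.50)     # Table 13, about 44%
```

Because the dependence is exponential, sub-ångström structural shifts are enough to change single-step residence times by tens of percent.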

Table 13: RDA of D-A pairs from the top hopping pathways listed in Table 11.

                     RDA (Å)
D-A                  Wild-type    Y345C
[4Fe4S]-M307         16.98        16.67
M307-DA7             3.91         4.29
[4Fe4S]-Y309         13.34        13.38
Y309-M307            12.41        12.37
Y309-W327            6.50         6.83
W327-DA7             12.36        12.88
W327-M307            10.25        10.22

10. Hole Transfer at the p53-DNA Interface

Portions adapted from ‘Teo, R. D.; Smithwick, E. R.; Migliore, A.; Beratan, D. N., A single AT–GC exchange can modulate charge transfer-induced p53–DNA dissociation. Chem. Commun. 2019, 55 (2), 206-209’ with permission from the PCCP Owner Societies. R.D.T., A.M., and D.N.B. contributed to the writing of the manuscript. In addition: R.D.T. conceived the research; R.D.T. and A.M. designed the research; R.D.T. and E.R.S. carried out the computations; R.D.T. and A.M. analyzed the results.

The tumor suppressor protein p53 prevents tumor formation by sensing various cellular stresses and activating or repressing a myriad of genes. p21WAF1/Cip1 and Gadd45 are two target genes that are expressed when the Cys-rich core domain of p53 binds to their sequence-specific DNA response elements at the promoters. p21 mediates the p53-dependent G1 growth phase arrest by hindering the activities of cyclin-dependent kinases,219 inhibiting DNA replication, and interfering with DNA polymerase activity.220 The activated Gadd45 recruits nucleotide and/or base excision repair factors to gene-specific loci,221 promoting DNA repair222 and demethylation.221

Some of the cellular stresses sensed by p53 include ROS, which serve as signaling molecules to regulate biological and physiological processes.223 In many proteins, redox signaling can involve H2O2-mediated oxidation of Cys residues,223 as well as DNA-mediated signaling initiated by base oxidation.119 Therefore, it is important to understand how p53 differentiates target genes for the redox regulation of metabolic and signaling pathways under oxidative stress. Experimental studies224-226 have shown that, under conditions of DNA oxidation, the core DNA-binding domain of p53 dissociates more easily from the Gadd45 consensus sequence than from the p21 sequence. The dissociation is expected to occur because of HT from the oxidized DNA to a proximal Cys residue of p53. In fact, the HT oxidizes Cys to a cysteinyl radical cation that can form a disulfide bond with a neighboring Cys residue. This disulfide bond formation in turn attenuates p53-DNA binding, thus causing the DNA dissociation.227

Interestingly, the preferred dissociation of the Gadd45 DNA sequence from p53 is also consistent with the differential expression of p53-inducible genes in human ML-1 myeloblastic leukemia cells after DNA damage by ionizing radiation.228 As apoptosis can serve as a fail-safe mechanism to prevent uncontrolled cellular proliferation under persistent oxidative stress,229 p53 reacts by underexpressing Gadd45 (which hampers inefficacious DNA repair) and continuing to express p21 (which promotes G1 cell cycle arrest and eventual cellular apoptosis230). This differential response of p53 to the oxidized Gadd45 and p21 DNA sequences needs to be explained in terms of structure, free energy landscape,226 and dynamical aspects of the p53-DNA CT that are studied here.


Structurally, the key underlying feature of p53-DNA binding is the ability of p53 to bind to a consensus sequence located within the downstream gene promoter.231 This DNA sequence consists of two decameric palindromic sequences spaced 0 to 21 base pairs (bps) apart.232,233 Crystal structures show that four p53 proteins self-assemble on two DNA palindromic sequences (which thus represent the two half-sites for protein binding) to form a tetramer.234 One half-site bound to two p53 proteins is shown in Figure 24a, where a key C277 residue is nestled into the major groove and forms H-bonds with the proximal purines.235 Indeed, it was demonstrated that modifying C277 at the DNA-binding interface can regulate p53 cellular activity in response to environmental redox signals.236 Selective alkylation of C277 by 2-sulfonylpyrimidines helps to stabilize p53 mutants and is especially potent at killing cancer cells.237

Figure 24: Protein-DNA models used in the MD simulation of the p53 protein complexed with the Gadd45 and p21 half-site sequences. a) Two p53 proteins (labeled ‘top’ and ‘bottom’) bound to a half-site DNA sequence (PDB ID 2ADY). b) Original DNA sequence (in blue) in the 2ADY structure and bp replacements producing the p21 and Gadd45 model sequences (see Figure 3 of ref. 12). The DNA decamer usually consists of the sequence indicated in purple, where R denotes a purine nucleobase (G or A), W is either A or T, and Y is a pyrimidine nucleobase (C or T). The yellow and orange circles indicate the locations of the DNA binding to the key Cys residue (C277) in the top and bottom protein, respectively. Key differences between the Gadd45 and p21 bp sequences in the proximity of the C277 residues are at positions R3 and Y1.

Near each redox-active C277 residue, the Gadd45 and p21 consensus sequences (see Figure 24b) differ by only one bp - guanine-cytosine (GC) in

Gadd45 and adenine-thymine (AT) in p21. This difference plays a critical role in the interpretation of the differential response of p53 to the presence of an electron hole, produced by oxidative stress, in the p21 and Gadd45 sequences. In fact, since

(i) the oxidation potential landscape of DNA favors hole residence on guanines and (ii) the oxidation potentials of G and Cys are 1.29 V and 0.9 V, respectively

(thus enabling energetically downhill HT to Cys), the presence of G near C277 drives DNA-to-p53 HT, with consequent disulfide bond formation and DNA-protein dissociation.226,235 Yet, kinetic validation of this mechanism has not been provided. From a dynamic perspective, the probability of HT to Cys depends on two factors: how frequently the low-redox potential nucleobase aligns with the

DNA major groove-p53 interface,226 and the competition between the time scales of the HT through the bp at position R3 or Y1 and the HT to Cys. Here, we used

MD simulations to investigate the first factor. Then, we performed electronic structure and kinetic analysis on MD snapshots to study the second factor, which

touches upon key unresolved issues in the understanding of DNA charge transport, including the localization/delocalization of the transferring charge.238

The G proximal to C277 can speed up239 charge transit through the DNA in

Gadd45 compared to p21. This faster charge conduction competes with the possibility that the hole transiently localized on G transfers to Cys. Some models of thermally induced hole hopping through DNA239 predict that, in p21 DNA, the hole should jump between G bases at positions R1 and C (we focus the analysis on this part of the palindromic sequence) via superexchange. However, can molecular motion bring one of the two G bases sufficiently close to C277 to enable direct HT? May HT from one of the G bases to Cys compete with superexchange between the guanines separated by two bps (Figure 24b)? Depending on the

DNA-p53 conformation and on the polarization of the local environment, may an

A nucleobase be transiently occupied by the hole and transfer it to the Cys? In this case, A-to-Cys HT would be even more downhill, thermodynamically, than G-to-

Cys HT. Considering the strong hole localization in DNA that emerges from previous studies,240 may the HT through DNA be treated as sequential hopping between bps? These questions highlight the difficulty of establishing a relation between DNA sequence and p53 response to DNA oxidation in a dynamic perspective. Here, we explain this relation by comparative analysis of the HT steps to C277 and through the DNA, after calculating the parameters that describe

all relevant CT steps within the framework of Marcus ET theory.241 We find, in terms of CT dynamics, that a single bp difference in the p21 and Gadd45 sequences can modulate the HT from the oxidized DNA to C277 and thus can influence the functionally relevant dissociation of the p53-DNA complex, in good agreement with experimental results.225,226,235

10.1 Methodology

The systems studied comprise two p53 proteins (the ‘top’ and ‘bottom’ molecules in Figure 24) bound to the Gadd45 or the p21 sequence. To build these systems, we made substitutions in the original DNA sequence from the PDB structure

2ADY234 to obtain models of the p21 and Gadd45 sequences that match the sequences used in the experiments of ref. 225 in the Cys binding regions (Figure 24b). The conformational dynamics of the resulting protein-DNA complexes was simulated using classical MD (see details in SI of ref. 28), with MD production runs of 40 ns.

Both systems were well equilibrated after 10 ns, and the RMSDs (Figure 25) show similar structural fluctuations for the human p53-Gadd45 DNA and p53-p21 DNA complexes.

[Figure 25 plot: RMSD (Å) versus time (ns) over 0-40 ns for the Gadd45 and p21 complexes.]

Figure 25: RMSD (without hydrogens) along the MD production run for the protein-DNA complexes containing the Gadd45 (blue) and p21 (orange) DNA models.

Therefore, we extracted one MD snapshot every 2.5 ns in the 10-40 ns time window to calculate the CT parameters involved in the HT steps illustrated in Figure 26 (vide infra). The rates associated with these steps were described using eq. 8. Starting from the MD snapshot at 10 ns, we calculated the electronic coupling $V_{DA}$ for portions of the p53-DNA interface containing the methanethiol moiety and the relevant bp, using a DFT implementation (with the M11 functional83 and the 6-31G** basis set) of eq. 32. The ∆G° values were estimated as differences between the experimental oxidation potentials of the charge donor and acceptor. The reorganization energies were evaluated by combining Marcus’ expression241 for λ with DFT analysis (eqs. 40 and 41; see SI of ref. 28 for more details).
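For orientation, the rate expression cited as eq. 8 is assumed here to take the standard nonadiabatic, high-temperature Marcus form. The symbols below ($\langle V_{DA}^{2}\rangle$, $\lambda$, $\Delta G^{\circ}$, $k_{B}$, $T$) follow common usage and are an assumption about notation, not a copy of the equation as printed earlier in this work:

```latex
k_{DA} = \frac{2\pi}{\hbar}\,\langle V_{DA}^{2}\rangle\,
         \frac{1}{\sqrt{4\pi\lambda k_{B}T}}\,
         \exp\!\left[-\frac{\left(\Delta G^{\circ}+\lambda\right)^{2}}{4\lambda k_{B}T}\right]
```

In this form the rate is linear in the mean-square coupling and depends exponentially on the activation free energy $(\Delta G^{\circ}+\lambda)^{2}/4\lambda$, which is why both the coupling and the donor-acceptor oxidation potentials enter the comparison between the Gadd45 and p21 contacts.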

10.2 Results and Analysis

Figure 26: HT models for the contacts of p53 with (a) Gadd45 and (b) p21 DNA sequences. The arrows denote the possible HT steps. The HT rate constant values (in s–1) refer to the top (black) and bottom (purple) contacts. We assume that protein and DNA unbind after HT to Cys. Thus, the HT steps from Cys are not shown. The DNA sequence runs from the 5’ (left) to the 3’ (right) end in both top and bottom contact models.

The mean-square electronic coupling over the selected MD snapshots, $\langle V_{DA}^2 \rangle$, for each redox pair in Figure 26 is reported in Table 14. The Gadd45 DNA sequence has similar coupling to the top and bottom proteins, as is expected from the similarity of the two p53-DNA contacts.

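Table 14: Values of the mean-square electronic coupling, $\langle V_{DA}^2 \rangle$ (in eV$^2$), and of the coherence parameter, C, for each redox pair in Figure 26, at the top and bottom p53-Gadd45 DNA and p53-p21 DNA contacts.

| DNA | Protein | Redox pair | $\langle V_{DA}^2 \rangle$ (eV$^2$) | C |
|---|---|---|---|---|
| Gadd45 gene model | Top | AT-Cys | 2.32 × 10^-5 | 0.34 |
| Gadd45 gene model | Top | GC-Cys | 2.39 × 10^-3 | 0.19 |
| Gadd45 gene model | Top | CG-Cys | 4.56 × 10^-5 | 0.36 |
| Gadd45 gene model | Top | AT-GC | 2.92 × 10^-3 | 0.56 |
| Gadd45 gene model | Top | GC-CG | 9.16 × 10^-6 | 0.33 |
| Gadd45 gene model | Bottom | AT-Cys | 5.30 × 10^-5 | 0.32 |
| Gadd45 gene model | Bottom | GC-Cys | 7.75 × 10^-3 | 0.35 |
| Gadd45 gene model | Bottom | CG-Cys | 2.50 × 10^-4 | 0.50 |
| Gadd45 gene model | Bottom | AT-GC | 6.21 × 10^-3 | 0.54 |
| Gadd45 gene model | Bottom | GC-CG | 1.64 × 10^-5 | 0.26 |
| p21 gene model | Top | AT1-Cys | 1.08 × 10^-6 | 0.48 |
| p21 gene model | Top | AT2-Cys | 6.34 × 10^-9 | 0.33 |
| p21 gene model | Top | CG-Cys | 6.45 × 10^-7 | 0.52 |
| p21 gene model | Top | AT1-AT2 | 1.82 × 10^-3 | 0.66 |
| p21 gene model | Top | AT2-CG | 2.00 × 10^-3 | 0.38 |
| p21 gene model | Bottom | AT1-Cys | 4.29 × 10^-5 | 0.17 |
| p21 gene model | Bottom | AT2-Cys | 1.52 × 10^-5 | 0.32 |
| p21 gene model | Bottom | CG-Cys | 1.39 × 10^-4 | 0.39 |
| p21 gene model | Bottom | AT1-AT2 | 2.75 × 10^-3 | 0.66 |
| p21 gene model | Bottom | AT2-CG | 2.36 × 10^-4 | 0.76 |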

In contrast, the p21 model sequence is more strongly coupled to the Cys in the bottom protein than to the one in the top protein. This difference may also be a consequence of our modeling. The bp triplet near the upper Cys is one bp away from the edge of the DNA model sequences, while two extra bps were maintained on the side of the bp triplet in contact with the lower Cys (see Figures 24 and 26).

This choice was made to minimize the changes to the actual DNA sequence that is in complex with the p53 proteins in the 2ADY structure, and to explore the extension of the DNA sequence on the sides of the contact bp triplet needed to assure the structural stability of the p53-DNA contact. Our findings indicate that including only one extra bp in the DNA sequence complexed with the proteins is a borderline modeling choice that was sufficient to constrain the DNA-protein contact geometry in the case of Gadd45 but failed to do so in the case of p21.

Table 15: Ranges of distances (in Å) spanned by the indicated bp-Cys pairs, over the selected MD snapshots, at the DNA contacts with the top and bottom proteins.

| p21 DNA | AT1-Cys | AT2-Cys | CG-Cys |
|---|---|---|---|
| top | 7.37-9.34 | 9.29-11.66 | 8.50-10.28 |
| bottom | 5.70-8.43 | 7.55-9.82 | 5.73-9.61 |

Table 15 shows the ranges of bp-Cys distances (see Section S3 of ref. 28) for the three bps in Figure 26b, at the interfaces of the p21 DNA model with the top and bottom proteins. These distances correlate very well with the couplings in Table 14, thus also supporting the robustness of our method for calculating couplings. We note that, although the DNA structural fluctuations can be reduced by the presence of the proteins, the average values of the electronic couplings between adjacent bps are similar to the expectations242 for free DNA. For the contacts with the bottom p53, e.g., we obtain $\langle V_{DA} \rangle$ values of 43, 2, 58 and 13 meV for AT1-AT2, GC-CG, AT-GC and AT2-CG, respectively.

Table 14 also shows the values of the coherence parameter243 $C = \langle V_{DA} \rangle^2 / \langle V_{DA}^2 \rangle$, which is a descriptor of the coupling sensitivity to structural fluctuations: $C \cong 1$ for very rigid systems; $C \cong 0$ for very flexible systems with strong dependence of the coupling on the atomic conformation. The C values in Table 14 indicate that the electronic couplings are rather sensitive to the fluctuations of the protein-DNA contact regions. Moreover, a comparison of the bottom panel in Table 14 with Table 15 reveals that C does not correlate strictly with the p53-DNA distance in this complex system. The closer contact between the p21 sequence and the bottom protein, compared to the top protein, seems to enable a broader range of coupling values that depend on subtleties of the contact (including the bp-Cys distance, the molecular orientations, etc.), thus leading to smaller C values.
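The relation between the coherence parameter, the mean coupling, and the mean-square coupling can be sketched in a few lines. The snippet below assumes the definition $C = \langle V_{DA} \rangle^2 / \langle V_{DA}^2 \rangle$; the per-snapshot couplings in `rigid` and `floppy` are hypothetical values chosen only for illustration, while the `table14_bottom` entries are taken from Table 14. The function names are illustrative and not part of any published code.

```python
import math

def coherence_parameter(couplings):
    """Return (C, <V>, <V^2>) for a list of per-snapshot couplings."""
    n = len(couplings)
    mean_v = sum(couplings) / n
    mean_sq = sum(v * v for v in couplings) / n
    return mean_v ** 2 / mean_sq, mean_v, mean_sq

def mean_coupling_mev(mean_sq_ev2, c):
    """Recover |<V>| in meV from <V^2> (eV^2) and C, using C = <V>^2 / <V^2>."""
    return 1000.0 * math.sqrt(c * mean_sq_ev2)

# Hypothetical per-snapshot couplings (eV), for illustration only.
rigid = [0.050, 0.050, 0.050]    # no fluctuation: C close to 1
floppy = [0.090, 0.005, 0.040]   # strong fluctuation: C well below 1
c_rigid, _, _ = coherence_parameter(rigid)
c_floppy, _, _ = coherence_parameter(floppy)

# Table 14 entries for the bottom-protein contacts: (<V^2> in eV^2, C).
table14_bottom = {
    "AT1-AT2": (2.75e-3, 0.66),
    "GC-CG": (1.64e-5, 0.26),
    "AT-GC": (6.21e-3, 0.54),
    "AT2-CG": (2.36e-4, 0.76),
}
mean_v_mev = {pair: mean_coupling_mev(v2, c)
              for pair, (v2, c) in table14_bottom.items()}
```

Applied to the Table 14 entries, this back-calculation gives mean couplings that round to the 43, 2, 58 and 13 meV quoted in the text for AT1-AT2, GC-CG, AT-GC and AT2-CG.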

Irrespective of whether we consider the top or bottom DNA-protein

contact in the system of Figure 24a, Table 14 shows that the intermediate bp in the

contact bp triplet is much more strongly coupled to Cys, and more weakly

coupled to the next bp, in the protein complex with the Gadd45 sequence than in the complex with the p21 sequence. The relative differences in the electronic couplings are accentuated in the HT rates (Figure 26), favoring a slower transit of the charge across the bp triplet and its transition to C277 in the Gadd45 system compared to p21. In particular, in the complex of p53 with the Gadd45 sequence, the CT rates computed for HT to CG and C277 are approximately in the ranges

$10^3$-$10^4$ s$^{-1}$ and $10^7$-$10^8$ s$^{-1}$, respectively, while the ranges $10^5$-$10^7$ s$^{-1}$ and $10^3$-$10^7$ s$^{-1}$ were computed for the p21 system. Note that the predicted HT is faster between

AT2 and CG than between GC and CG because of both a larger electronic coupling and a negative reaction free energy (see Section S4 of ref. 28). Thus, considering the connections among Cys oxidation, disulfide bridge formation, and protein dissociation from DNA, the HT rates in Figure 26 demonstrate, in kinetic terms, the higher propensity of p53 to dissociate from the Gadd45 DNA sequence than from the p21 sequence in the presence of an electron hole generated by oxidative stress. This conclusion can be clearly stated without constructing a specific kinetic model. Constructing such a model accurately would require taking into account the probability for the hole to tunnel through both AT1 and AT2, rather than transiently residing in AT2, in the system of Figure 26b, while the transferring charge is expected to transiently localize on the guanine close to C277 (Figure 26a) in the Gadd45 sequence. Therefore, the use of a detailed kinetic model would

further strengthen the conclusions of this study. Importantly, our MD simulations find that the three bps in the HT models of Figure 26 remain close to the two C277 residues during the evolution of the protein-DNA complex. Therefore, HT to the protein from purine-containing bps outside of the contact bp triplet could not compete with HT by superexchange through AT1–AT2 and thus favor HT to the protein in the complex with p21 DNA.
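The kinetic competition described above can be illustrated with a deliberately crude two-channel estimate. The sketch below is not the detailed kinetic model mentioned in the text: it assumes that a hole on the contact bp either transfers to Cys or moves on to CG, ignores back-transfer and tunneling through AT1-AT2, and uses only the order-of-magnitude rate ranges quoted in this section. All function and variable names are illustrative.

```python
def branching_to_cys(k_to_cys, k_through_dna):
    """Probability that the hole reaches Cys before moving on along the DNA."""
    return k_to_cys / (k_to_cys + k_through_dna)

# Order-of-magnitude rate ranges (s^-1) quoted in the text: (low, high).
gadd45 = {"to_cys": (1e7, 1e8), "to_cg": (1e3, 1e4)}
p21 = {"to_cys": (1e3, 1e7), "to_cg": (1e5, 1e7)}

def probability_bounds(rates):
    """Lowest and highest branching probability compatible with the ranges."""
    lowest = branching_to_cys(rates["to_cys"][0], rates["to_cg"][1])
    highest = branching_to_cys(rates["to_cys"][1], rates["to_cg"][0])
    return lowest, highest

gadd45_bounds = probability_bounds(gadd45)
p21_bounds = probability_bounds(p21)
```

Under these assumptions the Gadd45 contact delivers the hole to Cys with a probability above 0.99 over the whole quoted range, whereas for p21 the bounds span several orders of magnitude and stay below the Gadd45 lower bound.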

10.3 Concluding Remarks

This study provides an explanation, in terms of protein-DNA CT dynamics, of the experimental finding224-226 that DNA-to-protein HT can cause the selective binding of tumor suppressor protein p53 to p21 DNA, in contrast to its dissociation from Gadd45 DNA, under oxidative stress. We show that the intimate relationship between p53 activation/function (contributing to genome stability) and the cell redox environment is a matter of kinetic competition between HT through DNA and HT from the DNA to residue C277, which enables the sequence selectivity of the p53 binding to different promoters.

The implications of our findings may be extended to related proteins that contain redox-active residues in their DNA-binding domains and should be considered in strategies to prevent p53 aggregation by the formation of disulfide bridges.244 In addition, the kinetic competition of HT processes at the p53-DNA interface described here suggests a potential role for genome engineering tools

such as CRISPR-Cas9245,246 to edit the bp sequence precisely and hence to modulate the relative efficiency of the competing CT pathways determining the p53 regulatory function.

11. Hole Transfer in 2’-Deoxy-2’-Fluoro-Arabinonucleic Acid

Portions adapted from ‘Teo, R. D.; Terai, K.; Migliore, A.; Beratan, D. N., Electron transfer characteristics of 2′-deoxy-2′-fluoro-arabinonucleic acid, a nucleic acid with enhanced chemical stability. Phys. Chem. Chem. Phys. 2018, 20 (41), 26063-26067’, and

‘Teo, R. D.; Smithwick, E.; Migliore, A., 2′-Deoxy-2′-fluoro-arabinonucleic acid: a valid alternative to DNA for biotechnological applications using charge transport. Phys. Chem.

Chem. Phys. 2019, ’ with permission from the PCCP Owner Societies. R.D.T., A.M., and

D.N.B. contributed to the writing of the manuscript. In addition: R.D.T. conceived the research; R.D.T. and A.M. designed the research; R.D.T and K.T. carried out the computations; R.D.T. and A.M. analyzed the results.

CT in DNA has been widely studied. This is largely due to DNA’s robust structural, biological, and electronic properties that have prompted its investigation as a potential ‘molecular wire’.14 The helical nature of DNA and the ordered π-stacking of nucleobases can allow efficient charge propagation through the bridging sites with DNA lengths of over 34 nm.15 Such long-range CT is attractive due to its applications in nanotechnology and biology.16 In biology, extensive research has been focused on DNA-mediated CT in DNA repair and replication proteins containing redox-active residues like iron-sulfur clusters and cysteine (such as DNA primase and p53) where charges can be shuttled between DNA and these residues as a form of interprotein communication.25-28 In nanotechnology, DNA has often been explored as a building block of biomimetic materials based on self-assembled 3D architectures. For example, previous studies have demonstrated the structural robustness of DNA in a torsionally-stressed ‘tensegrity triangle’ motif.29,30 Engineering such architectures is useful in programmable DNA nanoelectronics based on long-range CT.31

DNA CT often involves the migration of a positive charge (a hole) across the HOMOs of the GC and AT nucleobase pairs. These HOMOs tend to localize on purines (G or A) due to their lower oxidation potentials (and higher HOMO energies) than the pyrimidines.32 An established mechanism of HT is the hopping mechanism,24,33 by which the hole ‘hops’ across the purine bridging sites via a multistep tunneling process.

Although DNA is an efficient charge conductor, it is often prone to chemical degradation. Artificially synthesized nucleic acids, termed XNA (xeno-nucleic acids), are known to be more chemically stable than DNA/RNA, while retaining the hereditary and evolutionary properties of DNA/RNA.247 In this work,81,248 we compared the VIFs between nucleobase D-A pairs (CG-GC, GC-A1T, A1T-A2T, A2T-TA) found in an XNA, 2’-deoxy-2’-fluoro-arabinonucleic acid (2’F-ANA), with those obtained for DNA, using MD snapshots sampled every 0.5 ns from 10 to 50 ns (81 snapshots). 2’F-ANA offers several notable advantages over DNA/RNA, such as a wider functional pH range249 and high resistance to acid-catalyzed depurination,250,251 hydrolysis,249 and nuclease degradation.249

These advantages present 2’F-ANA as an especially attractive material for microarrays and biosensing, because the 2’F-ANA probes can be washed with acids or bases to remove the analyte while maintaining the structural integrity of the material. As 2’F-ANA has increased chemical stability and a potentially longer half-life in vivo, 2’F-ANA-based therapeutics252 can provide lower dosage requirements than existing DNA-based therapeutics.253 In comparison to neutral backbone species like peptide nucleic acid that could destabilize the hole states on the bridge and ‘cool off’ the holes,254 2’F-ANA not only retains the negatively-charged phosphate backbone but also contains a highly electronegative fluorine on the ribose moiety, thereby increasing the backbone charge and further stabilizing the holes.

11.1 Methodology

VIFs (eq. 33) and � between the nucleobase pairs (CG-GC, GC-A1T, A1T-A2T,

A2T-TA) in the DNA and 2’F-ANA Dickerson-Drew dodecamer sequences (PDB 4C64255 and 2LSC256 respectively) were calculated based on snapshots of the MD trajectories (obtained in ref. 81) taken every 0.5 ns from 10 to 50 ns (81 snapshots). The corresponding center-to-center distance $R_{DA}$ for each nucleobase pair in each snapshot was calculated and inserted into eq. 35 to obtain the reorganization energy λ. These parameters, along with the ∆G° of the D-A pair, were inserted into the Marcus equation (eq. 8) to obtain the corresponding rate constant $k_{DA}$. The $k_{DA}$ values obtained for the four nucleobase pairs were then combined into a master equation (eq. 76) to obtain an effective rate (equivalent to the inverse of the mean residence time) for hole migration from CG to TA with the intervening GC, A1T, and A2T bridging sites.

11.1.1 MD Simulation Setup

The original structure of the 2’F-ANA/ANA duplex (PDB 2LSC) contains U nucleobases in the ANA strand. These bases were modified to T bases (using Schrödinger’s Maestro molecular modeling software257) to match the analogous Dickerson-Drew DNA sequence (PDB 4C64). Finally, the sugar moieties in the ANA strand were replaced with the corresponding ones in 2’F-ANA to obtain a 2’F-ANA/2’F-ANA duplex.

The dodecamers were solvated in a box of TIP3P water that extended 10.0 Å on each side of the nucleic acid, which helps to maintain the volume of the unit cell during the NPT simulation. The negative charge of the system was neutralized by Na+ ions. AMBER parameters for FANA were derived from ref. 258, while the AMBER force field ff99 was used for DNA. The solvated DNA and FANA systems consisted of 13311 and 12069 atoms, respectively. We set the scaling factor for 1-4 electrostatic interactions to 0.833333. The cutoff distance for truncating van der Waals interactions was 12 Å. The maximum non-bonded interaction distance for the periodical calculation of the interaction energy was set to 14 Å. The electrostatic interactions were computed using the Particle Mesh Ewald method (grid spacing = 1 Å). The full electrostatic interaction energy was calculated every 2 time steps. The water molecules were kept rigid using the SHAKE algorithm. The unit cell vectors were (in Å): (46.008, 49.610, 65.744) for DNA and (48.000, 65.000, 46.000) for 2’F-ANA.

All MD simulations were carried out using the NAMD 2.11 software. Both nucleic acid structures were first optimized by means of $8 \times 10^4$ energy minimization steps. For DNA, the minimization was followed by solvent equilibration (using a Langevin thermostat, with a damping coefficient of 1.0 ps$^{-1}$ for the Langevin dynamics), by gradually increasing the temperature from 293 K (the crystallization temperature, at which the system was equilibrated for 225 ps) to 295.5 K (for 50 ps), and finally to 298 K (for a further 50 ps). For 2’F-ANA, we only carried out 225 ps of solvent equilibration at 209 K, namely, the temperature at which the 2’F-ANA solution structure was obtained. Next, both systems were equilibrated at constant temperature and pressure for 1.5 ns, using a Langevin thermostat and piston (temperature = 298 K, pressure = 1 atm; barostat period = 100 fs, characteristic damping time = 50 fs, damping coefficient = 2.0 ps$^{-1}$). The final MD production runs covered 50 ns, with a time step of 0.5 fs. The DNA and 2’F-ANA snapshots used for electronic coupling calculations were taken every 0.5 ns from 10 to 50 ns (see Figure S1 of ref. 81 for RMSD plot).

11.1.2 Generation of Truncated Models

The nucleobases of interest (i.e., CG, GC, A1T, A2T) were pruned from the sugar-phosphate backbone and the dangling bonds were saturated with H atoms, which were then optimized using B3LYP/6-31G* (with Grimme’s DFT-D3 correction)259 while keeping the other atoms fixed. Then, for each MD snapshot, VIF was calculated for every adjacent nucleobase pair (i.e., CG-GC, GC-A1T, A1T-A2T, A2T-TA) in the sequence set (see Figure 27 and Table 16). Table 16 also presents the coherence parameter C.

11.1.3 Free Energy Change ∆G°

Consistent with the estimation of free energy changes for HT in the systems presented in Chapters 7-10, ∆G° for HT from the donor D to the acceptor A (each of which can be a guanine or an adenine in 2’F-ANA/DNA) was computed as the difference between the oxidation potentials of the acceptor and the donor, where the oxidation potentials of guanine and adenine are 1.29 V136,137 and 1.42 V,137 respectively, as described in ref. 27. Hence, ∆G° for the forward HT steps of CG-GC, GC-A1T, A1T-A2T, and A2T-TA is approximated as 0 eV, 0.13 eV, 0 eV, and 0 eV, respectively. ∆G° for the backward HT steps is 0 eV for all pairs except GC-A1T, for which it is -0.13 eV.
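The bookkeeping behind these free energy values can be sketched as follows. The snippet assumes that the hole resides on the purine of each base pair (the `HOLE_SITE` mapping is an illustrative assumption, not taken from the source) and that, for a one-electron process, a potential difference in V equals the free energy change in eV. The Cys potential is the value quoted in Chapter 10.

```python
# Experimental oxidation potentials (V) quoted in the text.
E_OX = {"G": 1.29, "A": 1.42, "Cys": 0.9}

def delta_g_hole_transfer(donor, acceptor):
    """Free energy change (eV) for moving one hole from donor to acceptor."""
    return round(E_OX[acceptor] - E_OX[donor], 2)

# Assumed hole-hosting purine in each base pair of the CG-GC-A1T-A2T-TA path.
HOLE_SITE = {"CG": "G", "GC": "G", "A1T": "A", "A2T": "A", "TA": "A"}
path = ["CG", "GC", "A1T", "A2T", "TA"]

forward = {f"{d}-{a}": delta_g_hole_transfer(HOLE_SITE[d], HOLE_SITE[a])
           for d, a in zip(path, path[1:])}
backward = {f"{a}-{d}": delta_g_hole_transfer(HOLE_SITE[a], HOLE_SITE[d])
            for d, a in zip(path, path[1:])}

# Driving force for hole transfer from a nucleobase to cysteine.
dg_g_to_cys = delta_g_hole_transfer("G", "Cys")
dg_a_to_cys = delta_g_hole_transfer("A", "Cys")
```

Only the step that changes the purine identity (GC-A1T) carries a non-zero ∆G°, which is why it is the only pair with distinct forward and backward rates.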

11.1.4 Reorganization Energies

Three sets of dielectric constants implemented in ref. 28 were used for this study: 1) $\epsilon_o = 2.27$ (ref. 260), $\epsilon_s = 12.4$ (ref. 260); 2) $\epsilon_o = 1.8$, $\epsilon_s = 80$; and 3) $\epsilon_o = 2$ (ref. 261), $\epsilon_s = 8$ (ref. 262). Here $\epsilon_o$ and $\epsilon_s$ denote the high-frequency (optical) and static dielectric constants, respectively. Set 1 corresponds to the dielectric constants of the base stack in the duplex.260 If the dielectric constants in eq. 35 are dominated by the contribution of the solvent water molecules, then set 2 is a good approximation. Set 3 combines an experimental value of $\epsilon_s = 8$ (ref. 262) with the high-frequency dielectric constant $\epsilon_o = 2$ employed in ref. 261. $\Delta q$ is equal to 1 (in atomic units) because an elementary charge is transferred. The center-to-center distance $R_{DA}$ between a purine pair was calculated by first defining the D and A moieties, i.e., G and A were isolated and pruned from their sugar-phosphate backbones.
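How the three dielectric sets enter the outer-sphere reorganization energy can be sketched with the standard Marcus two-sphere expression, which eq. 35 is assumed to correspond to. The radii and distance used below are hypothetical round numbers for illustration only (the actual values are tabulated in the SI cited in this chapter), and the function name is illustrative.

```python
COULOMB_EV_ANGSTROM = 14.3996  # e^2 / (4*pi*eps0), in eV * Angstrom

# (optical, static) dielectric constants for the three sets in the text.
DIELECTRIC_SETS = {
    "S1": (2.27, 12.4),
    "S2": (1.8, 80.0),
    "S3": (2.0, 8.0),
}

def outer_sphere_lambda(r_da, a_d, a_a, eps_o, eps_s, dq=1.0):
    """Marcus two-sphere outer-sphere reorganization energy (eV).

    r_da: center-to-center D-A distance (Angstrom)
    a_d, a_a: effective donor and acceptor radii (Angstrom)
    dq: transferred charge in units of the elementary charge
    """
    pekar = 1.0 / eps_o - 1.0 / eps_s
    geometry = 1.0 / (2.0 * a_d) + 1.0 / (2.0 * a_a) - 1.0 / r_da
    return COULOMB_EV_ANGSTROM * dq ** 2 * pekar * geometry

# Hypothetical geometry, for illustration only.
R_DA, A_D, A_A = 3.8, 3.0, 3.0
lambdas = {name: outer_sphere_lambda(R_DA, A_D, A_A, eps_o, eps_s)
           for name, (eps_o, eps_s) in DIELECTRIC_SETS.items()}
```

For a fixed geometry the three sets differ only through the factor $1/\epsilon_o - 1/\epsilon_s$, so the ordering S1 < S3 < S2 and the S2/S1 ratio of about 1.5 do not depend on the hypothetical radii chosen here.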

11.2 Results and Analysis

11.2.1 Effective Electronic Couplings

[Figure 27 plots: panels (a)-(d), VIF (meV, 0-350) versus time (ns, 10-50) for DNA and 2'F-ANA.]

Figure 27: (a) VCG-GC, (b) VGC-A1T, (c) VA1T-A2T, and (d) VA2T-TA versus the MD simulation time for DNA and 2’F-ANA. The diagrammed values are listed in Table S1 of ref. 257.

The VDA values obtained in the present study are listed in Table S1 of ref.

257 and diagrammed in Figure 27. Table S2 of ref. 257 shows that the two-state

approximation is well satisfied. The parameters in Table 16 are obtained using the

VDA values in Table S1 of ref. 257 and in ref. 81. Due to the palindromic nature of

the two base-pair sequences, we expect similar electronic couplings for the

nucleobase pairs in the part of the system not considered, thus ruling out

substantial differences that could otherwise arise from the directional asymmetry

of HT through nucleic acids.263

Table 16: Mean-square electronic coupling $\langle V_{DA}^2 \rangle$ (in meV$^2$) and coherence parameter C for the indicated nucleobase pair dimers in DNA and 2'F-ANA. These parameters are computed using the VDA values in Table S1 of ref. 81.

| Base pair dimer | DNA $\langle V_{DA}^2 \rangle$ | DNA C | 2'F-ANA $\langle V_{DA}^2 \rangle$ | 2'F-ANA C |
|---|---|---|---|---|
| CG-GC | 241 | 0.47 | 267 | 0.48 |
| GC-A1T | 9353 | 0.57 | 1444 | 0.73 |
| A1T-A2T | 3599 | 0.51 | 1914 | 0.47 |
| A2T-TA | 2547 | 0.30 | 3055 | 0.24 |

From Table 16, the values of $\langle V_{\text{CG-GC}}^2 \rangle$ and $\langle V_{\text{A2T-TA}}^2 \rangle$ in 2′F-ANA and DNA are very similar (the ratios of these mean-square couplings are about 1.1 and 1.2, respectively), thus resulting in even smaller differences in the electronic couplings. In contrast, we obtain appreciable differences for the $\langle V_{\text{GC-A1T}}^2 \rangle$ and $\langle V_{\text{A1T-A2T}}^2 \rangle$ values, which are about 6 and 2 times larger in DNA than in 2′F-ANA (namely, their square roots differ by factors of about 2.5 and 1.4, respectively). These differences correlate with the mean distances $\langle R_{\text{GC-A1T}} \rangle$ and $\langle R_{\text{A1T-A2T}} \rangle$, which are 3.80 Å and 3.70 Å in DNA versus 3.94 Å and 3.80 Å, respectively, in 2′F-ANA.

From structural analysis using the CPPTRAJ software,218 we see that the two systems differ mainly in the slide parameter. In fact, for the GC-A1T and A1T-A2T dimers in the DNA system, the average (absolute) values of the slide are 0.40 Å and 0.68 Å, respectively, while the corresponding values in 2′F-ANA are 1.21 Å and 1.22 Å, thus slightly impairing the corresponding electronic couplings in 2′F-ANA compared to DNA. The C values averaged over the four nucleobase pair dimers are 0.46 for DNA and 0.48 for 2'F-ANA (Table 16), which indicates similar fluctuations of the base-pair electronic couplings and nucleobase π-stack.
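The ratios and averages quoted in this subsection follow directly from Table 16, as the short check below shows. The dictionary layout and helper names are illustrative; the numbers are the Table 16 entries.

```python
import math

# Table 16: (mean-square coupling in meV^2, coherence parameter C).
TABLE16 = {
    "CG-GC": {"DNA": (241, 0.47), "2'F-ANA": (267, 0.48)},
    "GC-A1T": {"DNA": (9353, 0.57), "2'F-ANA": (1444, 0.73)},
    "A1T-A2T": {"DNA": (3599, 0.51), "2'F-ANA": (1914, 0.47)},
    "A2T-TA": {"DNA": (2547, 0.30), "2'F-ANA": (3055, 0.24)},
}

def msq_ratio(pair, numerator, denominator):
    """Ratio of mean-square couplings between two systems for one dimer."""
    return TABLE16[pair][numerator][0] / TABLE16[pair][denominator][0]

def mean_c(system):
    """Coherence parameter averaged over the four dimers."""
    values = [row[system][1] for row in TABLE16.values()]
    return sum(values) / len(values)

# Dimers with similar couplings: 2'F-ANA relative to DNA.
similar = {p: msq_ratio(p, "2'F-ANA", "DNA") for p in ("CG-GC", "A2T-TA")}
# Dimers with larger couplings in DNA: DNA relative to 2'F-ANA.
larger_in_dna = {p: msq_ratio(p, "DNA", "2'F-ANA") for p in ("GC-A1T", "A1T-A2T")}
# Ratios of the root-mean-square couplings for the latter dimers.
rms_ratio = {p: math.sqrt(r) for p, r in larger_in_dna.items()}
```

The square root is what matters for comparing couplings themselves, since the rate depends on the mean-square coupling while the coupling is its square root.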

11.2.2 Reorganization Energies

[Figure 28 plots: panels (a)-(d), outer-sphere reorganization energy (eV, 0-2) versus time t (ns, 10-50) for DNA and 2'F-ANA.]

Figure 28: Instantaneous value of the Marcus expression for the outer-sphere reorganization energy $\lambda^{o}_{DA}(t)$ (calculated using eq. 35, in eV) vs. the MD simulation time (ns) for the D-A dimers (a) CG-GC, (b) GC-A1T, (c) A1T-A2T, and (d) A2T-TA in DNA and 2'F-ANA. The S1 set of dielectric constants is used for eq. 35.257

Using the effective center-to-center D-A distances in Tables S13-S14 of ref. 257 and the effective D and A radii in Tables S15-S16 of ref. 257, one obtains the outer-sphere $\lambda^{o}_{DA}(t)$ values plotted in Figure 28. The inclusion of the inner-sphere reorganization energies $\lambda^{i}_{DA}$ results in Tables 17 and 18 (vide infra), which contain the values of the total reorganization energy $\lambda_{DA}$ for each nucleobase pair in DNA and 2'F-ANA, respectively. Different values of $\lambda_{DA}$ are presented in the tables for the dielectric constant sets S1, S2, and S3. The outer-sphere reorganization energy $\lambda^{o}_{DA}$ (see Table S23 of ref. 257) for the intra-strand adenine dimer A1-A2 spans the range 0.84-1.27 eV depending on the dielectric constants used. This range includes the previous estimates of Steinbrecher et al.264 (1.08 eV) and Kubař et al.133 (1.21 eV). Our estimate of 1.26 eV using the dielectric constants of water (i.e., dielectric set S2) is close to that of Kubař and Elstner133 for DNA in water. Since Tables 17 and 18 show similar reorganization energies for the DNA and 2'F-ANA nucleobase pairs, according to our results, the DNA modification leading to 2'F-ANA does not significantly influence the reorganization free energies associated with HT through the nucleobases.

Table 17: Reorganization energy (λDA) and hole-transfer rate (kDA) values for the indicated base-pair dimers in DNA, using the S1, S2 and S3 dielectric constant sets. The A1T-GC and GC-A1T rows share the same λDA.

| DNA | S1 λDA (eV) | S1 kDA (s^-1) | S2 λDA (eV) | S2 kDA (s^-1) | S3 λDA (eV) | S3 kDA (s^-1) |
|---|---|---|---|---|---|---|
| TA-A2T | 1.81 | 7.08 × 10^5 | 2.52 | 5.97 × 10^2 | 1.87 | 3.88 × 10^5 |
| A2T-A1T | 1.25 | 2.81 × 10^8 | 1.68 | 3.68 × 10^6 | 1.29 | 1.87 × 10^8 |
| A1T-GC | 1.33 | 3.60 × 10^9 | 1.75 | 5.42 × 10^7 | 1.37 | 2.41 × 10^9 |
| GC-A1T | 1.33 | 2.28 × 10^7 | 1.75 | 3.43 × 10^5 | 1.37 | 1.53 × 10^7 |
| GC-CG | 1.85 | 4.49 × 10^4 | 2.48 | 8.41 × 10^1 | 1.90 | 2.72 × 10^4 |

Table 18: Reorganization energy (λDA) and hole-transfer rate (kDA) values for the indicated base-pair dimers in 2'F-ANA, using the S1, S2 and S3 dielectric constant sets. The A1T-GC and GC-A1T rows share the same λDA.

| 2'F-ANA | S1 λDA (eV) | S1 kDA (s^-1) | S2 λDA (eV) | S2 kDA (s^-1) | S3 λDA (eV) | S3 kDA (s^-1) |
|---|---|---|---|---|---|---|
| TA-A2T | 1.88 | 4.21 × 10^5 | 2.62 | 2.65 × 10^2 | 1.94 | 2.31 × 10^5 |
| A2T-A1T | 1.29 | 9.95 × 10^7 | 1.73 | 1.19 × 10^6 | 1.33 | 6.64 × 10^7 |
| A1T-GC | 1.38 | 3.37 × 10^8 | 1.82 | 4.17 × 10^6 | 1.42 | 2.26 × 10^8 |
| GC-A1T | 1.38 | 2.13 × 10^6 | 1.82 | 2.64 × 10^4 | 1.42 | 1.43 × 10^6 |
| GC-CG | 1.72 | 1.83 × 10^5 | 2.28 | 6.81 × 10^2 | 1.77 | 1.11 × 10^5 |

11.2.3 Charge Transport

Tables 17 and 18 also show the CT rate constants for the HT steps in DNA

and 2'F-ANA respectively. The forward and backward kDAs for the A1T-GC pair

are presented, due to the non-zero reaction free energy difference. For the other

nucleobase pairs (TA-A2T, A2T-A1T, and GC-CG), both forward and backward kDAs result in the same values due to the zero free energy difference.

The theoretical calculation of the rate constant for the A1T-to-GC hole-

transfer step is relevant to experiments on photoinduced charge separation and

transport in (diblock) DNA hairpins,265-267 in which the HT between A and G

nucleobases protracts the charge separation.267 In fact, a value of ΔG° = -0.45 eV

was predicted34,267,268 (note that part of the difference between this value and the

value of -0.13 eV used in our study can also be attributed to the different

experimental conditions in the investigations of DNA diblock hairpins34,267,268 and

in the studies that provided the A and G oxidation potentials used here136,137).

Using ΔG° = -0.45 eV and the reorganization energy values corresponding to the

S1 and S3 dielectric constant sets, a ps time scale for HT from A1T to GC is

obtained, which is in excellent agreement with the experimental results in ref. 267

(see Scheme 2B in ref. 267); this agreement supports our theoretical-computational

approach to the calculation of electronic couplings and reorganization energies

between adjacent base-pair dimers in either DNA or 2'F-ANA.

Table 19: Mean travel time (τ) spent by the hole to traverse the path from TA to a charge drain in contact with CG. The τ values corresponding to the S1, S2 and S3 dielectric constants sets are denoted τ1, τ2, and τ3, respectively, and are computed inserting the CT rates of Tables 17 and 18 into the exact expression of eq. 76. The value of the effective rate keff,1 = 1/τ1 is also shown.

| TA to CG | τ1 (s) | keff,1 (s^-1) | τ2 (s) | τ3 (s) |
|---|---|---|---|---|
| DNA | 2.41 × 10^-5 | 4.14 × 10^4 | 1.38 × 10^-2 | 4.00 × 10^-5 |
| 2'F-ANA | 7.98 × 10^-6 | 1.25 × 10^5 | 5.27 × 10^-3 | 1.36 × 10^-5 |

Table 20: Mean travel time (τ) spent by the hole to traverse the path from CG to a charge drain in contact with TA. The notation is the same as in Table 19.

| CG to TA | τ1 (s) | keff,1 (s^-1) | τ2 (s) | τ3 (s) |
|---|---|---|---|---|
| DNA | 4.73 × 10^-4 | 2.12 × 10^3 | 5.44 × 10^-1 | 8.57 × 10^-4 |
| 2'F-ANA | 7.64 × 10^-4 | 1.31 × 10^3 | 1.20 | 1.39 × 10^-3 |

Table 19 shows the mean residence time, τ, of the hole along the path from the TA pair to CG, namely, the average time spent by a hole injected in the middle of DNA or 2'F-ANA (for example, by contact with an analyte) to travel towards the edge of the double strand (where, for example, the hole is delivered to an electrode that functions as a charge drain). τ is calculated using the different sets of dielectric constants. The value of the effective rate constant keff = 1/τ obtained using the S1 set is also explicitly shown. Table 20 reports the analogous quantities for the path from CG to TA. S2 leads to τ values significantly larger than those resulting from S1 and S3, as a consequence of the greater localization of the transferring hole in the more polarizable medium, which entails a larger reorganization energy.269,270 However, environments characterized by dielectric constants closer to S1 and S3 are expected in most technological applications, where, generally, the nucleic acid is not completely surrounded by a polar solvent such as water.
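The mean travel times can be approximately recovered from the step rates with a short recursion. The sketch below assumes that, for a linear chain with nearest-neighbor hops and a perfectly absorbing drain, eq. 76 is equivalent to the standard mean first-passage time of a birth-death chain; the function name and list layout are illustrative. The rates are the S1 values from Tables 17 and 18.

```python
def mean_travel_time(forward, backward):
    """Mean first-passage time (s) through a nearest-neighbor hopping chain.

    forward[i]: rate from site i to site i+1 (the last entry feeds the drain).
    backward[i]: rate from site i back to site i-1 (backward[0] is unused).
    """
    total, step_time = 0.0, 0.0
    for i, k_fwd in enumerate(forward):
        k_back = backward[i] if i > 0 else 0.0
        # Mean time to reach site i+1 for the first time, starting from site i.
        step_time = (1.0 + k_back * step_time) / k_fwd
        total += step_time
    return total

# TA -> A2T -> A1T -> GC -> CG (drain), S1 rates in s^-1.
dna_fwd = [7.08e5, 2.81e8, 3.60e9, 4.49e4]
dna_back = [0.0, 7.08e5, 2.81e8, 2.28e7]
fana_fwd = [4.21e5, 9.95e7, 3.37e8, 1.83e5]
fana_back = [0.0, 4.21e5, 9.95e7, 2.13e6]

tau_dna = mean_travel_time(dna_fwd, dna_back)
tau_fana = mean_travel_time(fana_fwd, fana_back)
k_eff_dna = 1.0 / tau_dna

# Reverse path in DNA: CG -> GC -> A1T -> A2T -> TA (drain).
tau_dna_reverse = mean_travel_time([4.49e4, 2.28e7, 2.81e8, 7.08e5],
                                   [0.0, 4.49e4, 3.60e9, 2.81e8])
```

With the tabulated S1 rates, this recursion gives values within about 1% of the τ1 entries in Tables 19 and 20, and it makes explicit that the slow GC-to-CG step dominates the TA-to-CG travel time.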

In the path from TA to CG, the rate of the forward CT from A1T to GC is much larger than the backward CT rate. Although the forward and backward CT rates for the other self-exchange steps are the same, the backward CT rate from

CG to GC is not considered because the charge is immediately delivered to a drain when it arrives at CG. In both DNA and 2'F-ANA, keff is only slightly smaller than kDA for the rate-limiting CT step from GC to CG ($4.49 \times 10^4$ s$^{-1}$ in DNA and $1.83 \times 10^5$ s$^{-1}$ in 2'F-ANA for the S1 set; Tables 17 and 18). The results in Tables 19 and 20 indicate that DNA and 2'F-ANA have comparable conductivities. As a note, due to the approximations used in the theoretical models, one cannot establish with certainty which of the two systems supports faster hole conduction, although it is clear that DNA and 2'F-ANA conductivities have the same order of magnitude in different CT (dielectric) environments. Since the present analysis only considers the effects of the different backbones in DNA and 2'F-ANA on the structural fluctuations that influence both the base-pair π-stacking (which is the main factor determining the base-pair electronic couplings) and the reorganization energies associated with the hole transport, future studies should ascertain whether the backbone difference may produce any appreciable change in the electronic couplings and in the reorganization energies, as computed by including the backbone connecting the two stacked base pairs involved in each CT step.

11.3 Concluding Remarks

This study shows that electron-hole conduction occurs on the same timescale in DNA and 2'F-ANA. We provide a simple scheme for a rapid and yet sufficiently accurate description of charge transport through nucleic acid-like systems, which may be conveniently applied to systems larger than the ones studied here. According to our results, the greater chemical stability of 2'F-ANA compared to DNA does not entail significant changes in the conformational properties of the base-pair pattern that affect the electronic structure and free energy properties relevant to the hole transport. In fact, the results of this study, combined with those of ref. 81, show (i) similar fluctuations of the inter-nucleobase pair electronic couplings as a function of the nuclear motion, despite the chemical stability enhancement produced by the backbone mutation in 2'F-ANA, and (ii) similar free energy parameters (in particular, reorganization energies) for the pertinent CT steps.

Our results indicate a slightly faster hole transport through 2'F-ANA than through DNA in the predominant charge transport direction, but the theoretical uncertainty in the absolute values of the effective CT rates does not allow such a strict comparison of the two conductivities. Although the absolute value of the timescale depends on the base-pair sequence and the approximations used in the theoretical modeling of the actual systems, we expect that none of these approximations can affect the main physical-chemical conclusion of this study, namely, that 2'F-ANA can conduct electron-holes comparably to DNA. 2'F-ANA thereby provides a valid alternative to DNA for technological applications that use nucleic acid charge transport and, at the same time, can benefit from the enhanced chemical stability of 2'F-ANA.


12. Conclusions

In summary, CT-based redox signaling between [4Fe4S] proteins and nucleic acids was established to be unidirectional (see Chapter 7). Specifically, CT from an excess charge-containing (i.e., excess electron- or hole-containing) nucleobase to the [4Fe4S] cluster through the protein medium can occur on a biologically relevant millisecond timescale, while it is not energetically feasible for CT to occur from an excess charge-containing cluster to the neutral nucleobase. In addition, a set of signaling mechanisms (involving oxidants or reductants) is proposed. From the theoretical-computational approach established in Chapter 7, a Python module (EHPath.py) for characterizing hopping pathways in proteins and nucleic acids was developed (Chapter 8). This module was used to investigate hopping pathways as protection mechanisms in several proteins, including the oxygen-utilizing model enzyme cytochrome p450, and is freely accessible via GitHub at the following link: https://github.com/etransfer/EHPath.
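The idea of a hopping pathway search can be sketched generically as follows. This is not the EHPath.py interface: the function names, the site labels, and every rate below are hypothetical, and the ranking criterion (sum of forward-step waiting times, with back transfer neglected) is a deliberately crude stand-in for a proper mean residence time.

```python
from typing import Dict, List, Tuple

# rates[a][b] is the hopping rate (s^-1) for the step a -> b (hypothetical values).
Rates = Dict[str, Dict[str, float]]


def enumerate_paths(rates: Rates, donor: str, acceptor: str,
                    max_hops: int) -> List[List[str]]:
    """Depth-first enumeration of simple paths with at most max_hops steps."""
    paths: List[List[str]] = []

    def extend(path: List[str]) -> None:
        site = path[-1]
        if site == acceptor:
            paths.append(list(path))
            return
        if len(path) - 1 >= max_hops:
            return
        for nxt in rates.get(site, {}):
            if nxt not in path:  # never revisit a site
                path.append(nxt)
                extend(path)
                path.pop()

    extend([donor])
    return paths


def forward_only_time(rates: Rates, path: List[str]) -> float:
    """Crude travel-time estimate: sum of 1/k over the forward steps only."""
    return sum(1.0 / rates[a][b] for a, b in zip(path, path[1:]))


def rank_paths(rates: Rates, donor: str, acceptor: str,
               max_hops: int) -> List[Tuple[float, List[str]]]:
    """Return (estimated time, path) pairs, fastest first."""
    paths = enumerate_paths(rates, donor, acceptor, max_hops)
    return sorted((forward_only_time(rates, p), p) for p in paths)


# Hypothetical network: donor D, acceptor A, intermediates W1 and Y1.
rates: Rates = {
    "D": {"W1": 1.0e7, "Y1": 1.0e5},
    "W1": {"Y1": 1.0e6, "A": 1.0e4},
    "Y1": {"A": 1.0e6},
}

ranked = rank_paths(rates, "D", "A", max_hops=3)
for time_s, path in ranked:
    print(" -> ".join(path), f"{time_s:.2e} s")
```

With these illustrative rates the three-step route through both intermediates ranks first, which shows why a pathway search should not simply prefer the route with the fewest hops.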

Chapter 9 documents the development of force field parameters for the high-potential [4Fe4S] cluster. The use of EHPath to analyze MD snapshots of the high-potential [4Fe4S]-containing primase-RNA/DNA complex demonstrates that the Y345C mutation modifies hole hopping pathways and attenuates the CT rate between the [4Fe4S] cluster and the duplex. Chapter 10 shows that CT between DNA and the proximal Cys residue in the core DNA-binding domain of the tumor suppressor protein p53 is more favorable when p53 is bound to the Gadd45 DNA sequence than to the p21 sequence, even though both consensus sequences near the Cys residue differ by only one base pair. In Chapter 11, ET rates in a synthetic nucleic acid, 2’F-ANA, are found to be comparable to those of DNA, with the added advantage of greater chemical stability for 2’F-ANA. 2’F-ANA would therefore be a promising substitute for DNA in a wide range of applications, such as nucleic acid-based sensors.

The field of nucleic acid-mediated redox signaling and regulation is very rich, due to the prevalence of DNA-processing enzymes found in nature, as well as CT-deficient protein mutants found in human diseases like cancer. The results summarized above will help foster experimental studies of CT-mediated redox signaling among [4Fe4S] proteins, improve our understanding of CT-mediated regulation (as exemplified in the case study of p53), and identify drug targets in cancer signaling pathways. In addition, the results suggest a potential role for genome editing tools that precisely edit DNA sequences in modulating competing CT and signaling pathways for the regulatory function of p53 (and other DNA-processing proteins).


Appendix

Table A1: Interatomic distances Req (Å) for the six optimized geometries 1³⁺ to 6³⁺ of [4Fe4S]³⁺ and the corresponding average values.

Bond     1³⁺   2³⁺   3³⁺   4³⁺   5³⁺   6³⁺   Average
Fe1-S1  2.37  2.36  2.38  2.39  2.27  2.38  2.36
Fe1-S2  2.25  2.25  2.25  2.22  2.22  2.22  2.24
C4-H10  1.09  1.09  1.09  1.09  1.09  1.09  1.09
Fe1-S3  2.36  2.37  2.33  2.27  2.37  2.37  2.35
S8-C4   1.85  1.85  1.85  1.85  1.85  1.85  1.85
Fe4-S8  2.22  2.25  2.23  2.25  2.23  2.25  2.24
Fe1-S7  2.35  2.36  2.39  2.36  2.38  2.26  2.35
S1-Fe2  2.37  2.39  2.26  2.35  2.39  2.39  2.36
S1-Fe3  2.26  2.34  2.34  2.35  2.33  2.33  2.32
S2-C1   1.85  1.85  1.85  1.85  1.85  1.85  1.85
C1-H1   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C1-H2   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C1-H3   1.09  1.09  1.09  1.09  1.09  1.09  1.09
Fe2-S3  2.36  2.27  2.39  2.36  2.34  2.37  2.35
Fe2-S4  2.25  2.22  2.22  2.25  2.25  2.22  2.24
C4-H12  1.09  1.09  1.09  1.09  1.09  1.09  1.09
Fe2-S5  2.35  2.37  2.37  2.36  2.39  2.26  2.35
C4-H11  1.09  1.09  1.09  1.09  1.09  1.09  1.09
S3-Fe4  2.27  2.35  2.36  2.36  2.37  2.35  2.34
S4-C2   1.85  1.85  1.85  1.85  1.85  1.85  1.85
C2-H4   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C2-H5   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C2-H6   1.09  1.09  1.09  1.09  1.09  1.09  1.09
Fe3-S5  2.37  2.38  2.34  2.27  2.34  2.35  2.34
Fe3-S6  2.23  2.23  2.25  2.23  2.25  2.25  2.24
Fe3-S7  2.37  2.28  2.34  2.38  2.34  2.35  2.34
S5-Fe4  2.37  2.34  2.38  2.36  2.26  2.37  2.35
S6-C3   1.85  1.85  1.85  1.85  1.85  1.85  1.85
C3-H7   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C3-H8   1.09  1.09  1.09  1.09  1.09  1.09  1.09
C3-H9   1.09  1.09  1.09  1.09  1.09  1.09  1.09
Fe4-S7  2.37  2.36  2.26  2.35  2.38  2.37  2.35


Table A2: Bond stretching force constants Kr (kcal mol⁻¹ Å⁻²) for the six optimized geometries 1³⁺ to 6³⁺ of [4Fe4S]³⁺ and the corresponding average values.

Bond       1³⁺     2³⁺     3³⁺     4³⁺     5³⁺     6³⁺  Average
Fe1-S1   50.09   50.74   45.74   47.40   78.59   47.29   53.31
Fe1-S2  111.87  111.30  106.73  120.00  118.57  118.99  114.58
C4-H10  366.94  364.47  368.31  366.37  369.34  364.92  366.72
Fe1-S3   48.63   46.86   57.78   78.55   51.88   52.70   56.07
S8-C4   135.73  138.23  137.45  138.51  137.89  139.13  137.82
Fe4-S8  119.73  110.16  116.93  109.90  117.79  107.78  113.72
Fe1-S7   53.35   51.42   42.68   52.69   47.55   79.21   54.48
S1-Fe2   50.30   47.35   79.76   51.94   42.33   45.81   52.91
S1-Fe3   82.72   58.89   57.87   57.50   59.76   59.73   62.75
S2-C1   138.85  138.53  139.50  136.02  136.64  136.25  137.63
C1-H1   365.03  365.15  364.30  367.48  369.69  366.88  366.42
C1-H2   365.36  365.22  364.85  367.34  367.05  367.21  366.17
C1-H3   365.52  365.78  365.08  368.21  368.22  368.95  366.96
Fe2-S3   49.42   78.66   47.47   48.56   55.11   52.64   55.31
Fe2-S4  110.59  121.03  116.88  111.23  105.77  118.33  113.97
C4-H12  367.41  365.31  366.92  365.29  367.18  364.88  366.16
Fe2-S5   52.77   52.11   51.26   53.19   42.18   80.24   55.29
C4-H11  366.97  365.13  370.07  363.99  367.48  364.04  366.28
S3-Fe4   78.89   51.87   53.07   50.52   52.10   54.33   56.80
S4-C2   138.72  135.94  137.39  138.64  138.87  136.37  137.66
C2-H4   364.72  367.43  366.67  365.59  364.97  367.31  366.11
C2-H5   365.47  368.60  370.18  365.33  365.21  367.07  366.98
C2-H6   366.13  366.20  369.96  365.25  363.94  368.94  366.74
Fe3-S5   52.78   48.84   53.21   78.34   58.91   53.27   57.56
Fe3-S6  118.69  117.97  109.28  116.60  108.61  107.24  113.06
Fe3-S7   50.53   74.13   58.53   47.70   53.93   55.38   56.70
S5-Fe4   50.57   53.77   47.50   51.58   80.61   48.37   55.40
S6-C3   135.88  135.76  138.76  136.49  138.17  138.86  137.32
C3-H7   367.22  368.62  365.05  369.11  366.52  365.59  367.02
C3-H8   366.64  367.70  365.71  366.93  365.81  364.97  366.29
C3-H9   367.96  365.37  365.44  366.93  364.71  366.00  366.07
Fe4-S7   51.63   53.89   80.80   52.34   48.84   46.50   55.67
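As a quick consistency check on how the averaged bond parameters in Tables A1 and A2 are obtained and used, the sketch below recomputes the Fe1-S1 averages from the six tabulated values and evaluates a harmonic stretching energy. The energy expression assumes the AMBER-style convention E = Kr (r − Req)² with no factor of 1/2; the appendix does not state the convention, so this is an assumption suggested by the units and notation, and the function name is hypothetical.

```python
from statistics import mean

# Fe1-S1 entries for the six optimized geometries, copied from Tables A1 and A2.
req_values = [2.37, 2.36, 2.38, 2.39, 2.27, 2.38]        # Angstrom
kr_values = [50.09, 50.74, 45.74, 47.40, 78.59, 47.29]   # kcal mol^-1 A^-2

req_avg = mean(req_values)
kr_avg = mean(kr_values)
print(f"Req(avg) = {req_avg:.2f} A, Kr(avg) = {kr_avg:.2f} kcal mol^-1 A^-2")


def bond_energy(r: float, k_r: float, r_eq: float) -> float:
    """Harmonic stretch energy (kcal/mol); assumes E = Kr * (r - Req)**2."""
    return k_r * (r - r_eq) ** 2


# Energy cost of stretching the Fe1-S1 bond by 0.10 A from its average length.
print(f"E = {bond_energy(req_avg + 0.10, kr_avg, req_avg):.3f} kcal/mol")
```

The recomputed averages round to the tabulated Fe1-S1 entries (2.36 Å and 53.31 kcal mol⁻¹ Å⁻²), which suggests that the Average column is a plain arithmetic mean over the six geometries.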


Table A3: Angles θeq (°) for the six optimized geometries 1³⁺ to 6³⁺ of [4Fe4S]³⁺ and the corresponding average values.

Angle          1³⁺     2³⁺     3³⁺     4³⁺     5³⁺     6³⁺  Average
C4-S8-Fe4    98.27   97.38  101.78   98.70  101.19   98.79   99.35
Fe1-S1-Fe2   71.01   76.46   78.30   76.34   78.07   81.65   76.97
Fe1-S1-Fe3   78.48   77.11   70.44   82.03   78.74   76.59   77.23
Fe1-S2-C1    97.78   98.26   99.65   99.15  101.43   99.50   99.30
Fe1-S3-Fe2   71.22   78.50   76.89   78.41   77.00   82.37   77.40
Fe1-S3-Fe4   78.15   71.65   77.10   78.16   82.45   76.57   77.35
Fe1-S7-Fe3   76.73   78.31   70.34   81.81   76.33   78.48   77.00
Fe1-S7-Fe4   76.73   71.73   78.11   76.49   82.05   78.08   77.20
Fe2-S1-Fe3   78.28   81.88   78.55   76.85   70.38   76.52   77.08
Fe2-S3-Fe4   78.37   78.48   82.52   71.73   77.08   76.73   77.48
Fe2-S4-C2    98.47   99.41  102.74   97.92   99.06   99.71   99.55
Fe2-S5-Fe3   76.55   81.55   76.32   78.31   70.25   78.44   76.90
Fe2-S5-Fe4   76.58   76.70   82.48   71.75   78.20   78.35   77.34
Fe3-S5-Fe4   81.72   76.97   76.30   78.82   78.81   69.11   76.96
Fe3-S6-C3    97.99   99.86   98.09  100.15   98.58   99.07   98.96
Fe3-S7-Fe4   81.67   78.73   78.79   76.80   76.52   69.10   76.94
H1-C1-H2    109.39  109.32  109.33  109.60  109.85  109.41  109.48
H1-C1-H3    109.18  109.18  109.14  109.55  109.49  109.58  109.35
H11-C4-H10  109.41  109.13  109.50  109.15  109.45  108.91  109.26
H11-C4-H12  109.62  109.44  109.83  109.16  109.40  109.45  109.48
H11-C4-S8   110.05  110.30  110.04  110.60  110.25  110.16  110.23
H12-C4-H10  109.58  109.26  109.49  109.54  109.81  109.40  109.51
H2-C1-H3    109.30  109.41  109.42  109.54  109.51  109.76  109.49
H4-C2-H5    109.23  109.80  109.63  109.31  109.46  109.36  109.47
H4-C2-H6    109.19  109.28  109.65  109.39  109.33  109.83  109.44
H5-C2-H6    109.47  109.50  109.55  109.22  109.00  109.51  109.38
H7-C3-H8    109.56  109.85  109.33  109.82  109.60  109.58  109.62
H7-C3-H9    109.73  109.40  109.20  109.42  109.21  109.00  109.33
H8-C3-H9    109.34  109.15  109.49  109.34  109.28  109.42  109.34
S1-Fe1-S2   114.83  119.39  113.36  116.36  116.93  116.69  116.26
S1-Fe1-S3   108.17  100.03  100.01  102.06  102.52   97.04  101.64
S1-Fe1-S7    99.53   99.36  107.38   96.51  101.01  100.88  100.78
S1-Fe2-S3   108.34  102.23  102.03  100.27   99.68   96.96  101.59
S1-Fe2-S4   114.45  116.44  118.01  117.93  111.41  116.44  115.78
S1-Fe2-S5    99.75   96.61  101.54   99.64  107.26  101.07  100.98
S1-Fe3-S5   102.33   97.41  100.14  102.34  110.92  100.16  102.22
S1-Fe3-S6   121.14  120.53  113.87  119.95  114.29  120.52  118.38
S1-Fe3-S7   101.99  102.29  110.53   96.88  100.24  100.02  101.99
S2-C1-H1    110.39  110.45  110.29  110.16  109.92  110.16  110.23
S2-C1-H2    108.27  108.25  108.50  107.92  107.77  107.84  108.09
S2-C1-H3    110.30  110.22  110.15  109.95  110.27  110.07  110.16
S2-Fe1-S3   113.52  113.50  123.73  119.40  118.55  119.62  118.05
S2-Fe1-S7   118.84  115.23  110.92  116.61  117.84  116.79  116.04
S3-Fe1-S7   100.09  107.28   99.61  102.40   96.53  102.38  101.38
S3-Fe2-S4   113.89  120.14  117.02  114.17  125.11  120.21  118.42
S3-Fe2-S5    99.90  101.82   96.21  107.53   99.45  102.05  101.16
S3-Fe4-S5   101.93   99.92   96.38  107.56  102.49   99.47  101.29
S3-Fe4-S7   102.27  108.03  102.39  100.09   96.67   99.72  101.53
S4-C2-H4    110.51  108.03  107.72  108.20  108.51  107.85  108.47
S4-C2-H5    108.19  109.63  110.18  110.36  110.23  110.26  109.81
S4-C2-H6    110.23  110.58  110.08  110.33  110.29  110.01  110.26
S4-Fe2-S5   118.67  115.97  118.45  115.30  112.04  116.63  116.18
S5-Fe3-S6   114.99  115.41  117.44  115.88  112.75  110.27  114.46
S5-Fe3-S7    97.03  101.22  100.11  101.51   99.84  111.07  101.79
S5-Fe4-S7    97.00   99.89  101.23   99.82  101.06  109.67  101.45
S6-C3-H7    108.05  116.62  113.29  116.99  117.35  113.71  114.33
S6-C3-H8    110.19  108.33  108.22  107.89  108.19  108.36  108.53
S6-C3-H9    109.95  110.84  110.24  110.26  110.37  110.34  110.33
S6-Fe3-S7   115.77  109.26  110.34  110.09  110.17  110.13  110.96
S8-C4-H10   110.16  110.39  110.19  110.16  110.09  110.38  110.23
S8-C4-H12   108.01  108.30  107.77  108.21  107.82  108.53  108.11
S8-Fe4-S3   121.08  115.95  118.11  114.87  118.63  124.52  118.86
S8-Fe4-S5   115.14  116.56  118.25  113.07  116.26  111.39  115.11
S8-Fe4-S7   115.81  114.42  117.01  119.56  118.36  110.77  115.99


Table A4: Angle bending force constants Kθ (kcal mol⁻¹ rad⁻²) for the six optimized geometries 1³⁺ to 6³⁺ of [4Fe4S]³⁺ and the corresponding average values.

Angle         1³⁺    2³⁺    3³⁺    4³⁺    5³⁺    6³⁺  Average
C4-S8-Fe4   47.14  23.51  52.15  21.78  50.04  56.07  41.78
Fe1-S1-Fe2  28.57  48.09  33.14  52.97  35.31  37.59  39.28
Fe1-S1-Fe3  33.16  46.02  23.00  35.74  35.57  49.16  37.11
Fe1-S2-C1   26.86  25.37  54.27  48.26  52.38  34.21  40.23
Fe1-S3-Fe2  19.88  32.86  49.42  32.49  48.78  32.53  35.99
Fe1-S3-Fe4  33.35  20.41  49.43  34.98  34.69  46.43  36.55
Fe1-S7-Fe3  49.95  33.20  21.51  35.49  49.74  34.70  37.43
Fe1-S7-Fe4  54.88  31.11  34.93  53.32  41.67  35.37  41.88
Fe2-S1-Fe3  31.96  37.15  35.00  47.15  23.73  46.65  36.94
Fe2-S3-Fe4  32.44  31.69  31.92  22.58  46.46  43.17  34.71
Fe2-S4-C2   25.29  47.30  57.59  26.55  52.69  39.29  41.45
Fe2-S5-Fe3  49.90  34.67  53.28  34.28  23.50  35.41  38.51
Fe2-S5-Fe4  54.60  52.09  38.44  31.51  33.42  36.12  41.03
Fe3-S5-Fe4  34.46  46.82  47.55  30.17  33.66  23.71  36.06
Fe3-S6-C3   47.73  39.16  35.12  46.61  36.45  52.55  42.94
Fe3-S7-Fe4  34.54  31.34  32.78  49.22  49.13  25.15  37.03
H1-C1-H2    42.92  43.27  43.05  42.73  42.52  42.86  42.89
H1-C1-H3    42.19  42.42  42.51  42.11  41.28  41.72  42.04
H11-C4-H10  41.74  42.43  41.15  42.10  41.12  42.24  41.80
H11-C4-H12  42.30  42.39  43.08  42.98  42.96  42.40  42.68
H11-C4-S8   42.90  43.96  42.42  42.38  42.97  42.28  42.82
H12-C4-H10  42.65  42.83  43.19  42.39  42.32  43.42  42.80
H2-C1-H3    42.34  42.17  41.81  41.88  43.30  41.97  42.25
H4-C2-H5    43.11  41.78  43.28  42.68  42.38  43.33  42.76
H4-C2-H6    42.16  42.55  43.32  42.72  43.20  42.27  42.70
H5-C2-H6    42.44  42.01  40.94  42.07  42.29  41.22  41.83
H7-C3-H8    42.73  41.52  42.69  42.46  42.27  42.28  42.32
H7-C3-H9    42.24  42.45  42.25  41.71  41.76  42.46  42.15
H8-C3-H9    41.96  42.53  42.45  43.21  42.99  43.32  42.74
S1-Fe1-S2   27.00  24.57  29.01  27.25  23.59  23.58  25.83
S1-Fe1-S3   27.28  36.02  35.09  50.26  49.93  59.49  43.01
S1-Fe1-S7   43.18  41.10  37.73  62.71  54.60  52.02  48.56
S1-Fe2-S3   25.30  49.48  50.96  34.64  34.28  55.42  41.68
S1-Fe2-S4   22.83  27.46  29.91  22.54  33.09  25.04  26.81
S1-Fe2-S5   43.51  56.90  50.13  42.40  37.30  51.61  46.97
S1-Fe3-S5   48.38  63.16  38.12  50.65  22.79  36.96  43.34
S1-Fe3-S6   28.31  30.33  19.81  41.65  19.03  29.87  28.17
S1-Fe3-S7   47.83  49.52  18.81  62.18  39.43  39.69  42.91
S2-C1-H1    43.60  43.64  41.85  43.28  42.29  41.99  42.78
S2-C1-H2    41.89  40.94  41.39  42.43  42.81  41.52  41.83
S2-C1-H3    43.67  43.86  42.10  41.96  41.63  42.75  42.66
S2-Fe1-S3   19.20  17.05  33.39  28.71  45.57  39.75  30.61
S2-Fe1-S7   23.01  28.98  29.20  26.98  31.96  24.84  27.49
S3-Fe1-S7   36.49  35.49  32.78  46.86  61.34  48.62  43.59
S3-Fe2-S4   16.34  28.11  33.74  17.70  32.07  34.86  27.14
S3-Fe2-S5   36.34  47.36  60.71  30.82  34.55  50.65  43.41
S3-Fe4-S5   48.59  36.55  59.96  30.23  51.77  34.66  43.62
S3-Fe4-S7   48.98  25.06  47.66  39.70  62.09  34.75  43.04
S4-C2-H4    42.89  44.47  40.58  41.44  42.94  41.81  42.35
S4-C2-H5    41.23  37.10  41.86  43.66  42.45  42.53  41.47
S4-C2-H6    43.61  41.90  42.08  43.03  39.75  42.67  42.17
S4-Fe2-S5   20.49  26.60  36.20  26.88  38.91  21.40  28.41
S5-Fe3-S6   26.23  25.66  28.11  24.29  26.10  30.67  26.84
S5-Fe3-S7   62.12  52.22  43.38  51.66  45.11  23.97  46.41
S5-Fe4-S7   62.15  41.23  52.12  39.69  51.89  38.61  47.61
S6-C3-H7    41.97  26.03  26.78  28.94  29.08  31.57  30.73
S6-C3-H8    41.68  44.88  40.76  41.02  41.92  42.27  42.09
S6-C3-H9    42.46  41.24  43.58  42.36  43.21  42.26  42.52
S6-Fe3-S7   26.22  39.43  43.41  42.84  43.49  41.71  39.52
S8-C4-H10   42.89  43.63  41.38  43.30  41.91  43.95  42.84
S8-C4-H12   39.65  39.93  40.89  41.86  41.53  39.01  40.48
S8-Fe4-S3   28.31  14.74  41.92  13.65  43.53  27.75  28.32
S8-Fe4-S5   27.59  25.12  23.55  23.19  25.47  37.73  27.11
S8-Fe4-S7   27.17  26.52  27.04  22.37  29.86  34.42  27.90
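Because the bending force constants in Table A4 are given per rad² while the equilibrium angles in Table A3 are given in degrees, a unit conversion is needed before the two can be combined. The sketch below recomputes the C4-S8-Fe4 averages and evaluates a harmonic bending energy, assuming the AMBER-style form E = Kθ (θ − θeq)² with the angle difference in radians and no factor of 1/2; that convention is an assumption and the function name is hypothetical.

```python
import math
from statistics import mean

# C4-S8-Fe4 entries for the six optimized geometries, copied from Tables A3 and A4.
theta_eq_values = [98.27, 97.38, 101.78, 98.70, 101.19, 98.79]   # degrees
k_theta_values = [47.14, 23.51, 52.15, 21.78, 50.04, 56.07]      # kcal mol^-1 rad^-2

theta_eq_avg = mean(theta_eq_values)
k_theta_avg = mean(k_theta_values)
print(f"theta_eq(avg) = {theta_eq_avg:.2f} deg, "
      f"K_theta(avg) = {k_theta_avg:.2f} kcal mol^-1 rad^-2")


def angle_energy(theta_deg: float, k_theta: float, theta_eq_deg: float) -> float:
    """Harmonic bend energy (kcal/mol); assumes E = K * (theta - theta_eq)**2.

    Angles are passed in degrees and converted to radians, because the
    force constant is expressed per rad^2.
    """
    return k_theta * math.radians(theta_deg - theta_eq_deg) ** 2


# Energy cost of opening the C4-S8-Fe4 angle by 5 degrees from its average value.
print(f"E = {angle_energy(theta_eq_avg + 5.0, k_theta_avg, theta_eq_avg):.3f} kcal/mol")
```

Forgetting the degree-to-radian conversion would overestimate the energy by a factor of (180/π)², roughly 3283, which is the most common mistake when transferring such parameters between programs.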


Table A5: Interatomic distances Req (Å) for the three optimized geometries 1²⁺ to 3²⁺ of [4Fe4S]²⁺ and the corresponding average values.

Bond     1²⁺   2²⁺   3²⁺   Average
Fe1-S1  2.37  2.34  2.38  2.36
Fe1-S2  2.28  2.28  2.28  2.28
C4-H11  1.09  1.09  1.09  1.09
Fe1-S3  2.36  2.36  2.34  2.35
Fe4-S8  2.28  2.28  2.28  2.28
S8-C4   1.85  1.85  1.85  1.85
Fe1-S7  2.34  2.36  2.39  2.36
S1-Fe2  2.37  2.39  2.34  2.37
S1-Fe3  2.33  2.33  2.34  2.33
S2-C1   1.85  1.85  1.85  1.85
C1-H1   1.09  1.09  1.09  1.09
C1-H2   1.09  1.09  1.09  1.09
C1-H3   1.09  1.09  1.09  1.09
Fe2-S3  2.36  2.34  2.36  2.35
Fe2-S4  2.28  2.28  2.28  2.28
C4-H12  1.09  1.09  1.09  1.09
Fe2-S5  2.33  2.39  2.36  2.36
C4-H10  1.09  1.09  1.09  1.09
S3-Fe4  2.34  2.36  2.36  2.35
S4-C2   1.85  1.85  1.85  1.85
C2-H4   1.09  1.09  1.09  1.09
C2-H5   1.09  1.09  1.09  1.09
C2-H6   1.09  1.09  1.09  1.09
Fe3-S5  2.36  2.34  2.34  2.35
Fe3-S6  2.28  2.28  2.28  2.28
Fe4-S7  2.37  2.36  2.34  2.35
Fe3-S7  2.35  2.34  2.34  2.34
S5-Fe4  2.36  2.34  2.36  2.35
S6-C3   1.85  1.85  1.85  1.85
C3-H7   1.09  1.09  1.09  1.09
C3-H8   1.09  1.09  1.09  1.09
C3-H9   1.09  1.09  1.09  1.09


Table A6: Bond stretching force constants Kr (kcal mol⁻¹ Å⁻²) for the three optimized geometries 1²⁺ to 3²⁺ of [4Fe4S]²⁺ and the corresponding average values.

Bond       1²⁺     2²⁺     3²⁺  Average
Fe1-S1   52.29   54.79   47.84   51.64
Fe1-S2   99.73   97.86   94.34   97.31
C4-H11  361.50  363.03  362.34  362.29
Fe1-S3   51.81   50.81   56.88   53.17
Fe4-S8   95.03   97.51   97.77   96.77
S8-C4   141.36  140.98  141.03  141.12
Fe1-S7   57.70   52.43   44.84   51.66
S1-Fe2   50.93   44.95   54.58   50.15
S1-Fe3   59.61   60.36   59.25   59.74
S2-C1   140.97  141.04  140.21  140.74
C1-H1   363.05  362.95  362.01  362.67
C1-H2   361.89  361.98  361.96  361.94
C1-H3   363.52  364.06  362.68  363.42
Fe2-S3   51.07   56.30   51.48   52.95
Fe2-S4   97.66   92.70   98.24   96.20
C4-H12  361.69  361.76  361.79  361.75
Fe2-S5   56.76   43.87   52.44   51.03
C4-H10  363.51  362.78  364.54  363.61
S3-Fe4   55.65   51.87   51.93   53.15
S4-C2   141.08  140.10  141.19  140.79
C2-H4   362.30  361.94  362.35  362.20
C2-H5   363.94  363.98  363.23  363.71
C2-H6   362.80  360.80  363.26  362.29
Fe3-S5   52.33   59.73   55.90   55.99
Fe3-S6   93.69   97.01   96.36   95.68
Fe4-S7   51.03   54.04   56.75   53.94
Fe3-S7   55.88   56.08   59.84   57.27
S5-Fe4   52.37   57.42   53.62   54.47
S6-C3   141.06  140.62  141.12  140.93
C3-H7   363.99  363.73  362.97  363.56
C3-H8   361.59  362.35  362.38  362.11
C3-H9   363.62  362.93  363.55  363.37


Table A7: Angles θeq (°) for the three optimized geometries 1²⁺ to 3²⁺ of [4Fe4S]²⁺ and the corresponding average values.

Angle          1²⁺     2²⁺     3²⁺  Average
S1-Fe1-S2   114.02  117.96  112.13  114.70
S1-Fe1-S3   109.67  101.03  100.20  103.63
S1-Fe1-S7    99.86   99.60  108.98  102.81
S2-Fe1-S3   114.23  113.69  122.71  116.88
S2-Fe1-S7   116.80  113.56  111.43  113.93
S3-Fe1-S7   100.63  109.50  100.01  103.38
Fe1-S1-Fe3   76.07   76.24   69.30   73.87
Fe1-S1-Fe2   70.08   75.80   75.75   73.87
Fe2-S1-Fe3   76.09   69.64   76.18   73.97
Fe1-S2-C1    96.96   97.92   97.68   97.52
S2-C1-H1    110.30  110.44  110.34  110.36
S2-C1-H2    108.84  108.80  108.98  108.87
S2-C1-H3    110.28  110.17  110.20  110.22
H1-C1-H2    109.28  109.17  109.17  109.21
H1-C1-H3    108.87  108.84  108.78  108.83
H2-C1-H3    109.24  109.41  109.34  109.33
S1-Fe2-S3   109.56  100.28  100.77  103.54
S1-Fe2-S4   114.84  111.79  118.55  115.06
S1-Fe2-S5    99.89  108.41   99.51  102.60
S3-Fe2-S4   113.53  122.10  112.96  116.20
S3-Fe2-S5   100.70  100.03  110.36  103.69
S4-Fe2-S5   116.68  112.72  113.33  114.24
Fe1-S3-Fe4   75.98   70.18   76.35   74.17
Fe2-S3-Fe4   75.84   76.36   69.51   73.90
Fe1-S3-Fe2   70.38   76.35   76.29   74.34
Fe2-S4-C2    98.28   99.31   98.13   98.57
S4-C2-H4    108.70  109.05  108.76  108.84
S4-C2-H5    110.34  110.15  110.44  110.31
S4-C2-H6    110.46  110.70  110.21  110.46
H4-C2-H5    109.34  109.42  109.17  109.31
H4-C2-H6    109.12  108.81  109.32  109.08
H5-C2-H6    108.85  108.68  108.93  108.82
S1-Fe3-S7   100.39  100.77  112.31  104.49
S1-Fe3-S5   100.32  112.11  100.51  104.31
S1-Fe3-S6   118.49  113.40  113.04  114.97
S5-Fe3-S6   110.36  113.19  116.10  113.22
S5-Fe3-S7   110.97  100.06  100.03  103.69
S6-Fe3-S7   115.08  116.01  113.55  114.88
Fe2-S5-Fe4   75.89   75.85   69.43   73.72
Fe3-S5-Fe4   69.13   76.36   75.96   73.82
Fe2-S5-Fe3   76.18   69.48   75.76   73.81
Fe3-S6-C3    99.40   97.84   97.99   98.41
S6-C3-H7    110.17  110.28  110.32  110.26
S6-C3-H8    108.86  108.80  108.76  108.80
S6-C3-H9    110.42  110.24  110.28  110.32
H7-C3-H8    109.51  109.39  109.22  109.37
H7-C3-H9    108.61  108.90  108.89  108.80
H8-C3-H9    109.25  109.21  109.36  109.27
S8-Fe4-S3   120.49  114.27  112.97  115.91
S8-Fe4-S7   112.05  113.67  118.73  114.82
S3-Fe4-S7   100.24  109.87  100.84  103.65
S3-Fe4-S5   100.46  101.17  110.45  104.02
S8-Fe4-S5   112.04  116.74  113.08  113.96
S7-Fe4-S5   110.43   99.57   99.43  103.14
Fe1-S7-Fe4   75.92   70.10   75.82   73.94
Fe4-S7-Fe3   69.14   75.99   76.31   73.81
Fe1-S7-Fe3   76.21   75.73   69.14   73.69
Fe4-S8-C4    98.13   97.07   98.51   97.90
H12-C4-H10  109.32  109.22  109.50  109.35
H11-C4-H12  109.23  109.32  109.07  109.21
H11-C4-H10  108.59  108.81  108.82  108.74
H11-C4-S8   110.27  110.24  110.54  110.35
S8-C4-H12   109.01  108.88  108.74  108.87
S8-C4-H10   100.40  110.36  110.14  106.97


Table A8: Angle bending force constants Kθ (kcal mol⁻¹ rad⁻²) for the three optimized geometries 1²⁺ to 3²⁺ of [4Fe4S]²⁺ and the corresponding average values.

Angle         1²⁺    2²⁺    3²⁺  Average
S1-Fe1-S2   29.06  27.99  44.10  33.72
S1-Fe1-S3   28.76  30.58  31.11  30.15
S1-Fe1-S7   37.17  36.01  40.97  38.05
S2-Fe1-S3   25.30  23.23  32.97  27.16
S2-Fe1-S7   27.35  30.26  19.20  25.60
S3-Fe1-S7   30.94  32.91  28.76  30.87
Fe1-S1-Fe3  35.54  32.21  22.48  30.08
Fe1-S1-Fe2  30.12  36.57  35.34  34.01
Fe2-S1-Fe3  31.35  22.61  33.06  29.00
Fe1-S2-C1   37.39  36.32  61.26  44.99
S2-C1-H1    43.48  42.80  41.06  42.45
S2-C1-H2    41.75  41.12  39.03  40.63
S2-C1-H3    43.14  43.63  42.91  43.23
H1-C1-H2    43.03  43.40  43.02  43.15
H1-C1-H3    42.39  42.54  42.54  42.49
H2-C1-H3    42.57  42.63  42.53  42.58
S1-Fe2-S3   30.37  32.26  30.58  31.07
S1-Fe2-S4   26.38  41.05  26.97  31.47
S1-Fe2-S5   34.26  42.97  35.89  37.71
S3-Fe2-S4   23.05  34.00  21.24  26.09
S3-Fe2-S5   31.63  28.21  29.54  29.79
S4-Fe2-S5   24.20  41.68  28.93  31.60
Fe1-S3-Fe4  33.43  22.19  35.02  30.21
Fe2-S3-Fe4  34.17  33.14  20.92  29.41
Fe1-S3-Fe2  22.06  33.09  32.92  29.36
Fe2-S4-C2   38.86  60.30  38.28  45.81
S4-C2-H4    40.94  43.80  42.42  42.39
S4-C2-H5    42.72  37.77  42.03  40.84
S4-C2-H6    42.95  39.57  42.62  41.71
H4-C2-H5    42.83  41.93  42.82  42.53
H4-C2-H6    43.17  43.04  42.66  42.96
H5-C2-H6    42.33  42.11  42.74  42.39
S1-Fe3-S7   33.21  33.51  18.99  28.57
S1-Fe3-S5   30.55  16.80  33.08  26.81
S1-Fe3-S6   34.82  21.10  23.73  26.55
S5-Fe3-S6   34.26  26.59  31.33  30.72
S5-Fe3-S7   29.34  37.62  38.05  35.00
S6-Fe3-S7   31.59  32.37  26.34  30.10
Fe2-S5-Fe4  34.73  32.68  28.36  31.92
Fe3-S5-Fe4  27.95  34.43  33.77  32.05
Fe2-S5-Fe3  34.38  21.56  32.84  29.60
Fe3-S6-C3   55.74  40.37  41.95  46.02
S6-C3-H7    42.00  43.08  42.73  42.60
S6-C3-H8    42.09  40.05  41.59  41.25
S6-C3-H9    41.85  42.91  42.81  42.52
H7-C3-H8    42.43  42.67  42.68  42.60
H7-C3-H9    42.27  42.31  42.39  42.32
H8-C3-H9    43.32  42.90  42.95  43.06
S8-Fe4-S3   35.52  22.59  20.66  26.26
S8-Fe4-S7   31.19  30.15  28.10  29.81
S3-Fe4-S7   28.47  27.60  32.39  29.49
S3-Fe4-S5   32.11  30.70  27.52  30.11
S8-Fe4-S5   32.99  29.03  24.72  28.91
S7-Fe4-S5   37.30  34.61  36.42  36.11
Fe1-S7-Fe4  35.13  30.60  35.69  33.81
Fe4-S7-Fe3  32.78  31.29  34.34  32.80
Fe1-S7-Fe3  36.22  33.69  24.43  31.45
Fe4-S8-C4   57.79  37.58  34.55  43.31
H12-C4-H10  43.31  42.84  42.77  42.98
H11-C4-H12  42.62  42.79  43.35  42.92
H11-C4-H10  42.36  42.52  42.31  42.39
H11-C4-S8   42.38  42.52  42.23  42.38
S8-C4-H12   40.07  41.66  40.83  40.85
S8-C4-H10   43.21  43.15  42.80  43.06


Table A9: RESP charges for atoms corresponding to the six optimized geometries 1³⁺ to 6³⁺ of [4Fe4S]³⁺ and the corresponding average values.

Atom       1³⁺       2³⁺       3³⁺       4³⁺       5³⁺       6³⁺   Average
Fe1    0.61385   0.62176   0.60860   0.70819   0.75926   0.73172   0.66074
S1    -0.53024  -0.57662  -0.52533  -0.56930  -0.53402  -0.59970  -0.54633
S2    -0.56532  -0.56538  -0.58415  -0.51989  -0.52838  -0.52524  -0.58553
C1     0.25223   0.23961   0.21100   0.22570   0.24308   0.24029   0.44013
H1    -0.03569  -0.04103  -0.02547  -0.01983  -0.03315  -0.03008  -0.08592
H2    -0.01637  -0.01179  -0.00544  -0.00944  -0.00614  -0.00906  -0.05986
H3    -0.04349  -0.03529  -0.02557  -0.01686  -0.03552  -0.02516  -0.09870
Fe2    0.62048   0.68960   0.78575   0.58443   0.56574   0.72072   0.65457
S3    -0.47241  -0.48597  -0.59487  -0.47984  -0.56060  -0.54352  -0.49502
S4    -0.56257  -0.51586  -0.52446  -0.55287  -0.58004  -0.51915  -0.57994
C2     0.23739   0.25545   0.21709   0.22973   0.24099   0.22418   0.43144
H4    -0.04517  -0.01587   0.00267  -0.00909  -0.01652  -0.00210  -0.06290
H5    -0.01099  -0.02577  -0.02890  -0.03497  -0.02698  -0.02982  -0.07892
H6    -0.02810  -0.03376  -0.03327  -0.03765  -0.03862  -0.02361  -0.09870
Fe3    0.77498   0.80599   0.69568   0.80610   0.76219   0.78483   0.77215
S5    -0.57143  -0.59772  -0.64294  -0.52647  -0.54873  -0.57868  -0.56336
S6    -0.54715  -0.55287  -0.57067  -0.55689  -0.58570  -0.59989  -0.60833
C3     0.19404   0.20675   0.18768   0.20182   0.19252   0.17471   0.41732
H7     0.00166  -0.00860  -0.01148  -0.02517  -0.01997   0.00077  -0.06970
H8    -0.01761  -0.00602  -0.00072   0.00576   0.00163   0.00202  -0.05758
H9    -0.00990  -0.02357  -0.01692  -0.01765  -0.01527  -0.02399  -0.09870
Fe4    0.67546   0.65146   0.80558   0.64317   0.79021   0.63631   0.65246
S7    -0.57617  -0.56752  -0.54559  -0.61550  -0.65085  -0.59591  -0.56924
S8    -0.50902  -0.56901  -0.53508  -0.57018  -0.53559  -0.57881  -0.58423
C4     0.17739   0.20354   0.20292   0.21195   0.20200   0.16626   0.47627
H10   -0.00470  -0.01951  -0.02164  -0.02673  -0.02546  -0.00421  -0.08590
H11    0.00170  -0.01199  -0.02924  -0.01864  -0.02026   0.01675  -0.07753
H12   -0.00285  -0.01002   0.00478  -0.00986   0.00418  -0.00967  -0.09870


Table A10: RESP charges for atoms corresponding to the three optimized geometries 1²⁺ to 3²⁺ of [4Fe4S]²⁺ and the corresponding average values.

Atom       1²⁺       2²⁺       3²⁺   Average
Fe1    0.72175   0.68155   0.67058   0.69129
S1    -0.68520  -0.70051  -0.68618  -0.69063
S2    -0.71774  -0.70080  -0.72143  -0.71332
C1     0.45856   0.43468   0.42029   0.43784
H1    -0.09179  -0.09381  -0.07244  -0.08601
H2    -0.10185  -0.09179  -0.08954  -0.09439
H3    -0.09870  -0.09870  -0.09870  -0.09870
Fe2    0.69428   0.69795   0.68768   0.69330
S3    -0.61800  -0.61975  -0.60914  -0.61563
S4    -0.70358  -0.73167  -0.70208  -0.71244
C2     0.43005   0.44057   0.43150   0.43404
H4    -0.09040  -0.09653  -0.08947  -0.09213
H5    -0.08721  -0.08335  -0.09406  -0.08821
H6    -0.09870  -0.09870  -0.09870  -0.09870
Fe3    0.84634   0.89280   0.86123   0.86679
S5    -0.70390  -0.71684  -0.67613  -0.69896
S6    -0.74539  -0.73034  -0.74179  -0.73917
C3     0.38688   0.37549   0.44672   0.40303
H7    -0.05424  -0.06324  -0.10604  -0.07450
H8    -0.08181  -0.07101  -0.08829  -0.08037
H9    -0.09870  -0.09870  -0.09870  -0.09870
Fe4    0.69640   0.69437   0.67514   0.68864
S7    -0.72918  -0.68384  -0.68181  -0.69828
S8    -0.70569  -0.69996  -0.69709  -0.70091
C4     0.34869   0.39374   0.40187   0.38143
H10   -0.04829  -0.07045  -0.06431  -0.06102
H11   -0.02389  -0.06244  -0.08042  -0.05558
H12   -0.09870  -0.09870  -0.09870  -0.09870


References

(1) Horn, A.; Van der Meulen, J. H.; Defour, A.; Hogarth, M.; Sreetama, S. C.; Reed, A.; Scheffer, L.; Chandel, N. S.; Jaiswal, J. K. Mitochondrial redox signaling enables repair of injured skeletal muscle cells. Sci. Signal. 2017, 10, eaaj1978.

(2) Megiatto, J. D.; Antoniuk-Pablant, A.; Sherman, B. D.; Kodis, G.; Gervaldo, M.; Moore, T. A.; Moore, A. L.; Gust, D. Mimicking the electron transfer chain in photosystem II with a molecular triad thermodynamically capable of water oxidation. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 15578-15583.

(3) Limburg, J.; Vrettos, J. S.; Liable-Sands, L. M.; Rheingold, A. L.; Crabtree, R. H.; Brudvig, G. W. A functional model for O-O bond formation by the O2-evolving complex in photosystem II. Science 1999, 283, 1524-1527.

(4) Tommos, C.; Babcock, G. T. Oxygen production in nature: A light-driven metalloradical enzyme process. Acc. Chem. Res. 1998, 31, 18-25.

(5) Verkhovskaya, M. L.; Belevich, N.; Euro, L.; Wikström, M.; Verkhovsky, M. I. Real-time electron transfer in respiratory complex I. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 3763-3767.

(6) Winkler, J. R.; Nocera, D. G.; Yocom, K. M.; Bordignon, E.; Gray, H. B. Electron-transfer kinetics of pentaammineruthenium(III)(histidine-33)-ferricytochrome c. Measurement of the rate of intramolecular electron transfer between redox centers separated by 15 Å in a protein. J. Am. Chem. Soc. 1982, 104, 5798-5800.

(7) Gray, H. B.; Winkler, J. R. Hole hopping through tyrosine/tryptophan chains protects proteins from oxidative damage. Proc. Natl. Acad. Sci. U.S.A. 2015, 112, 10920-10925.

(8) Polizzi, N. F.; Migliore, A.; Therien, M. J.; Beratan, D. N. Defusing redox bombs? Proc. Natl. Acad. Sci. U.S.A. 2015, 112, 10821-10822.

(9) Winkler, J. R.; Malmström, B. G.; Gray, H. B. Rapid electron injection into multisite metalloproteins: Intramolecular electron transfer in cytochrome oxidase. Biophys. Chem. 1995, 54, 199-209.

(10) Michel, B.; Proudfoot, A. E. I.; Wallace, C. J. A.; Bosshard, H. R. The cytochrome c oxidase-cytochrome c complex: Spectroscopic analysis of conformational changes in the protein-protein interaction domain. Biochemistry 1989, 28, 456-462.


(11) Niki, K.; Hardy, W. R.; Hill, M. G.; Li, H.; Sprinkle, J. R.; Margoliash, E.; Fujita, K.; Tanimura, R.; Nakamura, N.; Ohno, H.; Richards, J. H.; Gray, H. B. Coupling to lysine-13 promotes electron tunneling through carboxylate-terminated alkanethiol self-assembled monolayers to cytochrome c. J. Phys. Chem. B 2003, 107, 9947-9949.

(12) Sakamoto, K.; Kamiya, M.; Imai, M.; Shinzawa-Itoh, K.; Uchida, T.; Kawano, K.; Yoshikawa, S.; Ishimori, K. NMR basis for interprotein electron transfer gating between cytochrome c and cytochrome c oxidase. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 12271-12276.

(13) Daviso, E.; Prakash, S.; Alia, A.; Gast, P.; Neugebauer, J.; Jeschke, G.; Matysik, J. The electronic structure of the primary electron donor of reaction centers of purple bacteria at atomic resolution as observed by photo-CIDNP 13C NMR. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 22281-22286.

(14) Wohlgamuth, C. H.; McWilliams, M. A.; Slinker, J. D. DNA as a molecular wire: Distance and sequence dependence. Anal. Chem. 2013, 85, 8634-8640.

(15) Slinker, J. D.; Muren, N. B.; Renfrew, S. E.; Barton, J. K. DNA charge transport over 34 nm. Nat. Chem. 2011, 3, 228.

(16) Arnold, A. R.; Grodick, M. A.; Barton, J. K. DNA Charge Transport: from Chemical Principles to the Cell. Cell Chem. Biol. 2016, 23, 183-197.

(17) Lewis, F. D.; Liu, X.; Liu, J.; Miller, S. E.; Hayes, R. T.; Wasielewski, M. R. Direct measurement of hole transport dynamics in DNA. Nature 2000, 406, 51.

(18) Lewis, F. D.; Daublain, P.; Cohen, B.; Vura-Weis, J.; Shafirovich, V.; Wasielewski, M. R. Dynamics and efficiency of DNA hole transport via alternating AT versus poly(A) sequences. J. Am. Chem. Soc. 2007, 129, 15130-15131.

(19) Lewis, F. D.; Letsinger, R. L.; Wasielewski, M. R. Dynamics of photoinduced charge transfer and hole transport in synthetic DNA hairpins. Acc. Chem. Res. 2001, 34, 159-170.

(20) Thazhathveetil, A. K.; Trifonov, A.; Wasielewski, M. R.; Lewis, F. D. Increasing the speed limit for hole transport in DNA. J. Am. Chem. Soc. 2011, 133, 11485-11487.

(21) Kalosakas, G.; Spanou, E. Distance dependence of hole transfer rates from G radical cations to GGG traps in DNA. Phys. Chem. Chem. Phys. 2013, 15, 15339-15346.

(22) Meggers, E.; Michel-Beyerle, M. E.; Giese, B. Sequence dependent long range hole transport in DNA. J. Am. Chem. Soc. 1998, 120, 12950-12955.

(23) Fujitsuka, M.; Majima, T. Charge transfer dynamics in DNA revealed by time-resolved spectroscopy. Chem. Sci. 2017, 8, 1752-1762.

(24) Fujitsuka, M.; Majima, T. Hole and excess electron transfer dynamics in DNA. Phys. Chem. Chem. Phys. 2012, 14, 11234-11244.

(25) Tse, E. C. M.; Zwang, T. J.; Bedoya, S.; Barton, J. K. Effective distance for DNA-mediated charge transport between repair proteins. ACS Cent. Sci. 2019, 5, 65-72.

(26) O’Brien, E.; Holt, M. E.; Thompson, M. K.; Salay, L. E.; Ehlinger, A. C.; Chazin, W. J.; Barton, J. K. The [4Fe4S] cluster of human DNA primase functions as a redox switch using DNA charge transport. Science 2017, 355, eaag1789.

(27) Teo, R. D.; Rousseau, B. J. G.; Smithwick, E. R.; Di Felice, R.; Beratan, D. N.; Migliore, A. Charge transfer between [4Fe4S] proteins and DNA is unidirectional. Implications for biomolecular signaling. Chem 2019, 5, 122-137.

(28) Teo, R. D.; Smithwick, E. R.; Migliore, A.; Beratan, D. N. A single AT–GC exchange can modulate charge transfer-induced p53–DNA dissociation. Chem. Commun. 2019, 55, 206-209.

(29) Liu, D.; Wang, M.; Deng, Z.; Walulu, R.; Mao, C. Tensegrity: Construction of rigid DNA triangles with flexible four-arm DNA junctions. J. Am. Chem. Soc. 2004, 126, 2324-2325.

(30) Hernandez, C.; Birktoft, J. J.; Ohayon, Y. P.; Chandrasekaran, A. R.; Abdallah, H.; Sha, R.; Stojanoff, V.; Mao, C.; Seeman, N. C. Self-assembly of 3D DNA crystals containing a torsionally stressed component. Cell Chem. Biol. 2017, 24, 1401-1406.e1402.

(31) Osakada, Y.; Kawai, K.; Fujitsuka, M.; Majima, T. Charge transfer through DNA nanoscaled assembly programmable with DNA building blocks. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 18072-18076.

(32) Kawai, K.; Majima, T. Hole transfer kinetics of DNA. Acc. Chem. Res. 2013, 46, 2616-2625.

(33) Giese, B.; Amaudrut, J.; Köhler, A.-K.; Spormann, M.; Wessely, S. Direct observation of hole transfer through DNA by hopping between adenine bases and by tunnelling. Nature 2001, 412, 318-320.

(34) Renaud, N.; Harris, M. A.; Singh, A. P. N.; Berlin, Y. A.; Ratner, M. A.; Wasielewski, M. R.; Lewis, F. D.; Grozema, F. C. Deep-hole transfer leads to ultrafast charge migration in DNA hairpins. Nat. Chem. 2016, 8, 1015.

(35) Zhang, Y.; Liu, C.; Balaeff, A.; Skourtis, S. S.; Beratan, D. N. Biological charge transfer via flickering resonance. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 10049-10054.

(36) Beratan, D. N. Why are DNA and protein electron transfer so different? Annu. Rev. Phys. Chem. 2019, 70, 71-97.

(37) Roche, B.; Aussel, L.; Ezraty, B.; Mandin, P.; Py, B.; Barras, F. Iron/sulfur proteins biogenesis in prokaryotes: Formation, regulation and diversity. BBA - Bioenerg. 2013, 1827, 455-469.

(38) Rouault, T. A. Mammalian iron-sulphur proteins: novel insights into biogenesis and function. Nat. Rev. Mol. Cell. Biol. 2015, 16, 45-55.

(39) Lill, R.; Mühlenhoff, U. Maturation of iron-sulfur proteins in eukaryotes: Mechanisms, connected processes, and diseases. Annu. Rev. Biochem. 2008, 77, 669-700.

(40) Bonfio, C.; Valer, L.; Scintilla, S.; Shah, S.; Evans, D. J.; Jin, L.; Szostak, J. W.; Sasselov, D. D.; Sutherland, J. D.; Mansy, S. S. UV-light-driven prebiotic synthesis of iron–sulfur clusters. Nat. Chem. 2017, 9, 1229-1234.

(41) Fuss, J. O.; Tsai, C.-L.; Ishida, J. P.; Tainer, J. A. Emerging critical roles of Fe–S clusters in DNA replication and repair. BBA - Mol. Cell Res. 2015, 1853, 1253-1271.

(42) Johnson, D. C.; Dean, D. R.; Smith, A. D.; Johnson, M. K. Structure, function, and formation of biological iron-sulfur clusters. Annu. Rev. Biochem. 2005, 74, 247-281.

(43) Tse, E. C. M.; Zwang, T. J.; Barton, J. K. The oxidation state of [4Fe4S] clusters modulates the DNA-binding affinity of DNA repair proteins. J. Am. Chem. Soc. 2017, 139, 12784-12792.

(44) Bartels, P. L.; Zhou, A.; Arnold, A. R.; Nuñez, N. N.; Crespilho, F. N.; David, S. S.; Barton, J. K. Electrochemistry of the [4Fe4S] Cluster in Base Excision Repair Proteins: Tuning the Redox Potential with DNA. Langmuir 2017, 33, 2523-2530.

(45) Gorodetsky, A. A.; Boal, A. K.; Barton, J. K. Direct electrochemistry of Endonuclease III in the presence and absence of DNA. J. Am. Chem. Soc. 2006, 128, 12082-12083.

(46) Boal, A. K.; Genereux, J. C.; Sontz, P. A.; Gralnick, J. A.; Newman, D. K.; Barton, J. K. Redox signaling between DNA repair proteins for efficient lesion detection. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 15237-15242.

(47) Grodick, M. A.; Segal, H. M.; Zwang, T. J.; Barton, J. K. DNA-mediated signaling by proteins with 4Fe–4S clusters is necessary for genomic integrity. J. Am. Chem. Soc. 2014, 136, 6470-6478.

(48) Boon, E. M.; Livingston, A. L.; Chmiel, N. H.; David, S. S.; Barton, J. K. DNA-mediated charge transport for DNA repair. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 12543-12547.

(49) Blumberger, J. Recent advances in the theory and molecular simulation of biological electron transfer reactions. Chem. Rev. 2015, 115, 11191-11238.

(50) Marcus, R. A. On the theory of electron transfer reactions. VI. Unified treatment for homogeneous and electrode reactions. J. Chem. Phys. 1965, 43, 679-701.

(51) Marcus, R. A. On the theory of oxidation-reduction reactions involving electron transfer. J. Phys. Chem. 1963, 67, 853-857.

(52) Marcus, R. A.; Sutin, N. Electron transfers in chemistry and biology. Biochim. Biophys. Acta, Rev. Bioenerg. 1985, 811, 265-322.

(53) Nitzan, A.: Chemical dynamics in condensed phases: Relaxation, transfer, and reactions in condensed molecular systems; OUP Oxford, 2013.

(54) Gray, H. B.; Winkler, J. R. Electron tunneling through proteins. Q. Rev. Biophys. 2003, 36, 341-372.

(55) Hopfield, J. J. Electron transfer between biological molecules by thermally activated tunneling. Proc. Natl. Acad. Sci. U.S.A. 1974, 71, 3640-3644.

(56) Liu, C.; Xiang, L.; Zhang, Y.; Zhang, P.; Beratan, D. N.; Li, Y.; Tao, N. Engineering nanometre-scale coherence in soft matter. Nat. Chem. 2016, 8, 941.

(57) Xiang, L.; Palma, J. L.; Bruot, C.; Mujica, V.; Ratner, M. A.; Tao, N. Intermediate tunnelling–hopping regime in DNA charge transport. Nat. Chem. 2015, 7, 221.

(58) Jones, M. L.; Kurnikov, I. V.; Beratan, D. N. The nature of tunneling pathway and average packing density models for protein-mediated electron transfer. J. Phys. Chem. A 2002, 106, 2002-2006.

(59) Onuchic, J. N.; Beratan, D. N. A predictive theoretical model for electron tunneling pathways in proteins. J. Chem. Phys. 1990, 92, 722-733.

(60) Beratan, D.; Betts, J.; Onuchic, J. Protein electron transfer rates set by the bridging secondary and tertiary structure. Science 1991, 252, 1285-1288.

(61) Closs, G. L.; Miller, J. R. Intramolecular long-distance electron transfer in organic molecules. Science 1988, 240, 440-447.

(62) Breuer, M.; Rosso, K. M.; Blumberger, J. Electron flow in multiheme bacterial cytochromes is a balancing act between heme electronic interaction and redox potentials. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 611-616.

(63) Page, C. C.; Moser, C. C.; Chen, X.; Dutton, P. L. Natural engineering principles of electron tunnelling in biological oxidation–reduction. Nature 1999, 402, 47-52.

(64) Wu, Q.; Van Voorhis, T. Direct optimization method to study constrained systems within density-functional theory. Phys. Rev. A 2005, 72, 024502.

(65) Wu, Q.; Van Voorhis, T. Constrained density functional theory and its application in long-range electron transfer. J. Chem. Theory Comput. 2006, 2, 765-774.

(66) Kaduk, B.; Kowalczyk, T.; Van Voorhis, T. Constrained density functional theory. Chem. Rev. 2012, 112, 321-370.

(67) Ramos, P.; Pavanello, M. Low-lying excited states by constrained DFT. J. Chem. Phys. 2018, 148, 144103.

(68) Cave, R. J.; Newton, M. D. Generalization of the Mulliken-Hush treatment for the calculation of electron transfer matrix elements. Chem. Phys. Lett. 1996, 249, 15-19.

(69) Voityuk, A. A. Estimation of electronic coupling in π-stacked donor-bridge-acceptor systems: Correction of the two-state model. J. Chem. Phys. 2006, 124, 064505.

(70) Gillet, N.; Berstis, L.; Wu, X.; Gajdos, F.; Heck, A.; de la Lande, A.; Blumberger, J.; Elstner, M. Electronic coupling calculations for bridge-mediated charge transfer using constrained density functional theory (CDFT) and effective hamiltonian approaches at the density functional theory (DFT) and fragment-orbital density functional tight binding (FODFTB) level. J. Chem. Theory Comput. 2016, 12, 4793-4805.

(71) Berstis, L.; Baldridge, K. K. DFT-based Green's function pathways model for prediction of bridge-mediated electronic coupling. Phys. Chem. Chem. Phys. 2015, 17, 30842-30853.

(72) Pipek, J.; Mezey, P. G. A fast intrinsic localization procedure applicable for ab initio and semiempirical linear combination of atomic orbital wave functions. J. Chem. Phys. 1989, 90, 4916-4926.

(73) Foster, J. M.; Boys, S. F. Canonical configurational interaction procedure. Rev. Mod. Phys. 1960, 32, 300-302.

(74) Blumberger, J.; McKenna, K. P. Constrained density functional theory applied to electron tunnelling between defects in MgO. Phys. Chem. Chem. Phys. 2013, 15, 2184-2196.

(75) Kohn, W.; Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 1965, 140, A1133-A1138.

(76) Newton, M. D. Quantum chemical probes of electron-transfer kinetics: the nature of donor-acceptor interactions. Chem. Rev. 1991, 91, 767-792.

(77) Oberhofer, H.; Blumberger, J. Electronic coupling matrix elements from charge constrained density functional theory calculations using a plane wave basis set. J. Chem. Phys. 2010, 133, 244105.

(78) Migliore, A. Full-electron calculation of effective electronic couplings and excitation energies of charge transfer states: Application to hole transfer in DNA π-stacks. J. Chem. Phys. 2009, 131, 114113.

(79) Migliore, A. Nonorthogonality problem and effective electronic coupling calculation: Application to charge transfer in π-stacks relevant to biochemistry and molecular electronics. J. Chem. Theory Comput. 2011, 7, 1712-1725.

(80) Wu, Q.; Van Voorhis, T. Extracting electron transfer coupling elements from constrained density functional theory. J. Chem. Phys. 2006, 125, 164105.

(81) Teo, R. D.; Terai, K.; Migliore, A.; Beratan, D. N. Electron transfer characteristics of 2′-deoxy-2′-fluoro-arabinonucleic acid, a nucleic acid with enhanced chemical stability. Phys. Chem. Chem. Phys. 2018, 20, 26063-26067.

(82) Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun. 2010, 181, 1477-1489.

(83) Peverati, R.; Truhlar, D. G. Improving the accuracy of hybrid meta-GGA density functionals by range separation. J. Phys. Chem. Lett. 2011, 2, 2810-2817.

(84) Blancafort, L.; Voityuk, A. A. CASSCF/CAS-PT2 study of hole transfer in stacked DNA nucleobases. J. Phys. Chem. A 2006, 110, 6426-6432.

(85) You, Z.-Q.; Hung, Y.-C.; Hsu, C.-P. Calculating electron-transfer coupling with density functional theory: The long-range-corrected density functionals. J. Phys. Chem. B 2015, 119, 7480-7490.

(86) Marcus, R. A. On the theory of oxidation-reduction reactions involving electron transfer. I. J. Chem. Phys. 1956, 24, 966-978.

(87) Khan, A. Reorganization, activation and ionization energies for hole transfer reactions through inosine–cytosine, 2-aminopurine–thymine, adenine–thymine, and guanine–cytosine base pairs: A computational study. Comput. Theor. Chem. 2013, 1013, 136-139.

(88) Marcus, R. A. Theoretical relations among rate constants, barriers, and Broensted slopes of chemical reactions. J. Phys. Chem. 1968, 72, 891-899.

(89) Nelsen, S. F.; Blackstock, S. C.; Kim, Y. Estimation of inner shell Marcus terms for amino nitrogen compounds by molecular orbital calculations. J. Am. Chem. Soc. 1987, 109, 677-682.

(90) Jakobsen, S.; Mikkelsen, K. V.; Pedersen, S. U. Calculations of intramolecular reorganization energies for electron-transfer reactions involving organic systems. J. Phys. Chem. 1996, 100, 7411-7417.

(91) Gordy, W. A relation between bond force constants, bond orders, bond lengths, and the electronegativities of the bonded atoms. J. Chem. Phys. 1946, 14, 305-320.

(92) Krapf, S.; Koslowski, T.; Steinbrecher, T. The thermodynamics of charge transfer in DNA: using thermodynamic integration calculations to analyse the kinetics of electron transfer reactions. Phys. Chem. Chem. Phys. 2010, 12, 9516-9525.

(93) Bauß, A.; Langenmaier, M.; Strittmatter, E.; Plattner, D. A.; Koslowski, T. Thermodynamic integration networks and their application to charge transfer reactions within the AauDyPI fungal peroxidase. J. Phys. Chem. B 2016, 120, 4937-4944.

(94) Na, S.; Bauß, A.; Langenmaier, M.; Koslowski, T. Thermodynamic integration network study of electron transfer: from proteins to aggregates. Phys. Chem. Chem. Phys. 2017, 19, 18938-18947.

(95) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How fast-folding proteins fold. Science 2011, 334, 517-520.

(96) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 1996, 118, 2309-2309.

(97) Åqvist, J. Ion-water interaction potentials derived from free energy perturbation simulations. J. Phys. Chem. 1990, 94, 8021-8024.

(98) Lemkul, J. A.; Huang, J.; Roux, B.; MacKerell, A. D. An empirical polarizable force field based on the classical Drude oscillator model: Development history and recent applications. Chem. Rev. 2016, 116, 4983-5013.

(99) Seminario, J. M. Calculation of intramolecular force fields from second-derivative tensors. Int. J. Quantum Chem. 1996, 60, 1271-1277.

(100) Allen, A. E. A.; Payne, M. C.; Cole, D. J. Harmonic force constants for molecular mechanics force fields via Hessian matrix projection. J. Chem. Theory Comput. 2018, 14, 274-281.

(101) Li, P.; Merz, K. M. MCPB.py: A python based metal center parameter builder. J. Chem. Inf. Model. 2016, 56, 599-604.

(102) Singh, U. C.; Kollman, P. A. An approach to computing electrostatic charges for molecules. J. Comput. Chem. 1984, 5, 129-145.

(103) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926-935.

(104) Jorgensen, W. L. Quantum and statistical mechanical studies of liquids. 10. Transferable intermolecular potential functions for water, alcohols, and ethers. Application to liquid water. J. Am. Chem. Soc. 1981, 103, 335-340.

(105) Mahoney, M. W.; Jorgensen, W. L. A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions. J. Chem. Phys. 2000, 112, 8910-8922.

(106) Mahoney, M. W.; Jorgensen, W. L. Quantum, intramolecular flexibility, and polarizability effects on the reproduction of the density anomaly of liquid water by simple potential functions. J. Chem. Phys. 2001, 115, 10758-10768.

(107) Rick, S. W.; Stuart, S. J.; Berne, B. J. Dynamical fluctuating charge force fields: Application to liquid water. J. Chem. Phys. 1994, 101, 6141-6156.

(108) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577-8593.

(109) Darden, T.; York, D.; Pedersen, L. Particle mesh Ewald: An N⋅log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089-10092.

(110) York, D.; Yang, W. The fast Fourier Poisson method for calculating Ewald sums. J. Chem. Phys. 1994, 101, 3298-3300.

(111) Feller, S. E.; Zhang, Y.; Pastor, R. W.; Brooks, B. R. Constant pressure molecular dynamics simulation: The Langevin piston method. J. Chem. Phys. 1995, 103, 4613-4621.

(112) Milo, R.; Phillips, R.: Cell biology by the numbers; Garland Science, 2015.

(113) First passage time problems in chemical physics. In Advances in Chemical Physics; pp 1-18.

(114) Polizzi, N. F.; Therien, M. J.; Beratan, D. N. Mean first-passage times in biology. Isr. J. Chem. 2016, 56, 816-824.

(115) Procaccia, I.; Mukamel, S.; Ross, J. On the theory of unimolecular reactions: Application of mean first passage time to reaction rates. J. Chem. Phys. 1978, 68, 3244-3253.

(116) Bar-Haim, A.; Klafter, J. On mean residence and first passage times in finite one-dimensional systems. J. Chem. Phys. 1998, 109, 5187-5193.

(117) Polizzi, N. F.; Skourtis, S. S.; Beratan, D. N. Physical constraints on charge transport through bacterial nanowires. Faraday Discuss. 2012, 155, 43-61.

(118) Lee, P. E.; Demple, B.; Barton, J. K. DNA-mediated redox signaling for transcriptional activation of SoxR. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 13164-13168.

(119) Genereux, J. C.; Boal, A. K.; Barton, J. K. DNA-mediated charge transport in redox sensing and signaling. J. Am. Chem. Soc. 2010, 132, 891-905.

(120) Baranovskiy, A. G.; Babayeva, N. D.; Zhang, Y.; Gu, J.; Suwa, Y.; Pavlov, Y. I.; Tahirov, T. H. Mechanism of concerted RNA-DNA primer synthesis by the human primosome. J. Biol. Chem. 2016, 291, 10006-10020.

(121) O'Brien, E.; Holt, M. E.; Thompson, M. K.; Salay, L. E.; Ehlinger, A. C.; Chazin, W. J.; Barton, J. K. Response to Comments on "The [4Fe4S] cluster of human DNA primase functions as a redox switch using DNA charge transport". Science 2017, 357, eaan2762.

(122) Pellegrini, L. Comment on "The [4Fe4S] cluster of human DNA primase functions as a redox switch using DNA charge transport". Science 2017, 357, eaan2954.

(123) Baranovskiy, A. G.; Babayeva, N. D.; Zhang, Y.; Blanco, L.; Pavlov, Y. I.; Tahirov, T. H. Comment on “The [4Fe4S] cluster of human DNA primase functions as a redox switch using DNA charge transport”. Science 2017, 357.

(124) Vaithiyalingam, S.; Warren, E. M.; Eichman, B. F.; Chazin, W. J. Insights into eukaryotic DNA priming from the structure and functional interactions of the 4Fe-4S cluster domain of human DNA primase. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 13684-13689.

(125) Netz, D. J. A.; Stith, C. M.; Stümpfig, M.; Köpf, G.; Vogel, D.; Genau, H. M.; Stodola, J. L.; Lill, R.; Burgers, P. M. J.; Pierik, A. J. Eukaryotic DNA polymerases require an iron-sulfur cluster for the formation of active complexes. Nat. Chem. Biol. 2012, 8, 125-132.

(126) Baranovskiy, A. G.; Babayeva, N. D.; Zhang, Y.; Gu, J.; Suwa, Y.; Pavlov, Y. I.; Tahirov, T. H. Mechanism of Concerted RNA-DNA Primer Synthesis by the Human Primosome. J. Biol. Chem. 2016, 291, 10006-10020.

(127) Trott, O.; Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. J. Comput. Chem. 2010, 31, 455-461.

(128) Heck, A.; Woiczikowski, P. B.; Kubař, T.; Giese, B.; Elstner, M.; Steinbrecher, T. B. Charge transfer in model peptides: obtaining Marcus parameters from molecular simulation. J. Phys. Chem. B 2012, 116, 2284-2293.

(129) DeFelippis, M. R.; Murthy, C. P.; Broitman, F.; Weinraub, D.; Faraggi, M.; Klapper, M. H. Electrochemical properties of tyrosine phenoxy and tryptophan indolyl radicals in peptides and amino acid analogs. J. Phys. Chem. 1991, 95, 3416-3419.

(130) Tommos, C.; Babcock, G. T. Proton and hydrogen currents in photosynthetic water oxidation. BBA - Bioenerg. 2000, 1458, 199-219.

(131) Schlodder, E.; Cetin, M.; Lendzian, F. Temperature dependence of the oxidation kinetics of Tyr(Z) and Tyr(D) in oxygen-evolving photosystem II complexes throughout the range from 320 K to 5 K. BBA - Bioenerg. 2015, 1847, 1283-1296.

(132) Cantrell, A.; Truscott, T. G.: Carotenoids and radicals; Interactions with other nutrients. In Carotenoids in Health and Disease; Krinsky, N., Mayne, S. T., Sies, H., Eds.; Taylor & Francis Group, LLC: New York, 2004; pp 31-52.

(133) Merényi, G.; Lind, J.; Engman, L. The dimethylhydroxysulfuranyl radical. J. Phys. Chem. 1996, 100, 8875-8881.

(134) Kubař, T.; Elstner, M. Solvent reorganization energy of hole transfer in DNA. J. Phys. Chem. B 2009, 113, 5653-5656.

(135) Khan, A. Reorganization, activation and ionization energies for hole transfer reactions through inosine–cytosine, 2-aminopurine–thymine, adenine–thymine, and guanine–cytosine base pairs: A computational study. Comput. Theor. Chem. 2013, 1013, 136-139.

(136) Steenken, S.; Jovanovic, S. V. How easily oxidizable is DNA? One-electron reduction potentials of adenosine and guanosine radicals in aqueous solution. J. Am. Chem. Soc. 1997, 119, 617-618.

(137) Steenken, S.; Jovanovic, S. V.; Bietti, M.; Bernhard, K. The trap depth (in DNA) of 8-oxo-7,8-dihydro-2′-deoxyguanosine as derived from electron-transfer equilibria in aqueous solution. J. Am. Chem. Soc. 2000, 122, 2373-2374.

(138) Crespo-Hernández, C. E.; Close, D. M.; Gorb, L.; Leszczynski, J. Determination of redox potentials for the Watson−Crick base pairs, DNA nucleosides, and relevant nucleoside analogues. J. Phys. Chem. B 2007, 111, 5386-5395.

(139) Thapa, B.; Schlegel, H. B. Calculations of pKa’s and redox potentials of nucleobases with explicit waters and polarizable continuum solvation. J. Phys. Chem. A 2015, 119, 5134-5144.

(140) Adamo, C.; Barone, V. Toward reliable density functional methods without adjustable parameters: The PBE0 model. J. Chem. Phys. 1999, 110, 6158-6170.

(141) Zhao, Y.; Truhlar, D. G. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc. 2008, 120, 215-241.

(142) Zhao, Y.; Truhlar, D. G. Density functionals with broad applicability in chemistry. Acc. Chem. Res. 2008, 41, 157-167.

(143) Zhao, Y.; Truhlar, D. G. Density functional for spectroscopy: No long-range self-interaction error, good performance for Rydberg and charge-transfer states, and better performance on average than B3LYP for ground states. J. Phys. Chem. A 2006, 110, 13126-13130.

(144) Peverati, R.; Truhlar, D. G. Improving the accuracy of hybrid meta-GGA density functionals by range separation. J. Phys. Chem. Lett. 2011, 2, 2810-2817.

(145) Voityuk, A. A. Electron transfer between [4Fe–4S] clusters. Chem. Phys. Lett. 2010, 495, 131-134.

(146) Marcus, R. A.; Sutin, N. Electron transfers in chemistry and biology. Biochim. Biophys. Acta 1985, 811, 265-322.

(147) Baranovskiy, A. G.; Zhang, Y.; Suwa, Y.; Babayeva, N. D.; Gu, J.; Pavlov, Y. I.; Tahirov, T. H. Crystal structure of the human primase. J. Biol. Chem. 2015, 290, 5635-5646.

(148) Davies, M. J. The oxidative environment and protein damage. BBA - Proteins Proteom. 2005, 1703, 93-109.

(149) Berlett, B. S.; Stadtman, E. R. Protein oxidation in aging, disease, and oxidative stress. J. Biol. Chem. 1997, 272, 20313-20316.

(150) Gray, H. B.; Winkler, J. R. Hole hopping through tyrosine/tryptophan chains protects proteins from oxidative damage. Proc. Natl. Acad. Sci. U.S.A. 2015, 112, 10920-10925.

(151) Winkler, J. R.; Gray, H. B. Electron flow through biological molecules: does hole hopping protect proteins from oxidative damage? Q. Rev. Biophys. 2015, 48, 411-420.

(152) Kanwar, R.; Balasubramanian, D. Structure and stability of the dityrosine-linked dimer of γB-crystallin. Exp. Eye Res. 1999, 68, 773-784.

(153) Hawkins, C. L.; Davies, M. J. The role of aromatic amino acid oxidation, protein unfolding, and aggregation in the hypobromous acid-induced inactivation of trypsin inhibitor and lysozyme. Chem. Res. Toxicol. 2005, 18, 1669-1677.

(154) Stadtman, E. R.; Levine, R. L. Free radical-mediated oxidation of free amino acids and amino acid residues in proteins. Amino Acids 2003, 25, 207-218.

(155) Ener, M. E.; Gray, H. B.; Winkler, J. R. Hole hopping through tryptophan in cytochrome p450. Biochemistry 2017, 56, 3531-3538.

(156) Choi, M.; Shin, S.; Davidson, V. L. Characterization of electron tunneling and hole hopping reactions between different forms of MauG and methylamine dehydrogenase within a natural protein complex. Biochemistry 2012, 51, 6942-6949.

(157) Polizzi, N. F.; Migliore, A.; Therien, M. J.; Beratan, D. N. Defusing redox bombs? Proc. Natl. Acad. Sci. U.S.A. 2015, 112, 10821-10822.

(158) Kathiresan, M.; Martins, D.; English, A. M. Respiration triggers heme transfer from cytochrome c peroxidase to catalase in yeast mitochondria. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 17468-17473.

(159) Kathiresan, M.; English, A. M. LC-MS/MS suggests that hole hopping in cytochrome c peroxidase protects its heme from oxidative modification by excess H2O2. Chem. Sci. 2017, 8, 1152-1162.

(160) Martins, B. M.; Blaser, M.; Feliks, M.; Ullmann, G. M.; Buckel, W.; Selmer, T. Structural basis for a Kolbe-type decarboxylation catalyzed by a glycyl radical enzyme. J. Am. Chem. Soc. 2011, 133, 14666-14674.

(161) Knappe, J.; Neugebauer, F. A.; Blaschkowski, H. P.; Gänzler, M. Post-translational activation introduces a free radical into pyruvate formate-lyase. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 1332-1335.

(162) Funk, M. A.; Judd, E. T.; Marsh, E. N. G.; Elliott, S. J.; Drennan, C. L. Structures of benzylsuccinate synthase elucidate roles of accessory subunits in glycyl radical enzyme activation and activity. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 10161-10166.

(163) Gray, H. B.; Winkler, J. R. Long-range electron transfer. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 3534-3539.

(164) Balabin, I. A.; Hu, X.; Beratan, D. N. Exploring biological electron transfer pathway dynamics with the pathways plugin for VMD. J. Comput. Chem. 2012, 33, 906-910.

(165) Beratan, D. N.; Betts, J. N.; Onuchic, J. N. Protein electron transfer rates set by the bridging secondary and tertiary structure. Science 1991, 252, 1285-1288.

(166) Betts, J. N.; Beratan, D. N.; Onuchic, J. N. Mapping electron-tunneling pathways - an algorithm that finds the minimum length maximum coupling pathway between electron-donors and acceptors in proteins. J. Am. Chem. Soc. 1992, 114, 4043-4046.

(167) Guallar, V.; Wallrapp, F. Mapping protein electron transfer pathways with QM/MM methods. J. Royal Soc. Interface 2008, 5, 233-239.

(168) Teo, R. D.; Wang, R.; Smithwick, E. R.; Migliore, A.; Therien, M.; Beratan, D. N. Mapping hole hopping escape routes in proteins. Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 15811-15816.

(169) Hagberg, A.; Swart, P.; Schult, D.: Exploring network structure, dynamics, and function using NetworkX; Los Alamos National Lab. (LANL), Los Alamos, NM (United States), 2008.

(170) Lubner, C. E.; Jennings, D. P.; Mulder, D. W.; Schut, G. J.; Zadvornyy, O. A.; Hoben, J. P.; Tokmina-Lukaszewska, M.; Berry, L.; Nguyen, D. M.; Lipscomb, G. L.; Bothner, B.; Jones, A. K.; Miller, A.-F.; King, P. W.; Adams, M. W. W.; Peters, J. W. Mechanistic insights into energy conservation by flavin-based electron bifurcation. Nat. Chem. Biol. 2017, 13, 655.

(171) Rousseau, B. J. G.; Shafei, S.; Migliore, A.; Stanley, R. J.; Beratan, D. N. Determinants of photolyase's DNA repair mechanism in mesophiles and extremophiles. J. Am. Chem. Soc. 2018, 140, 2853-2861.

(172) Blumberger, J.; Klein, M. L. Reorganization free energies for long-range electron transfer in a porphyrin-binding four-helix bundle protein. J. Am. Chem. Soc. 2006, 128, 13854-13867.

(173) Tipmanee, V.; Oberhofer, H.; Park, M.; Kim, K. S.; Blumberger, J. Prediction of reorganization free energies for biological electron transfer: a comparative study of Ru-modified cytochromes and a 4-helix bundle protein. J. Am. Chem. Soc. 2010, 132, 17032-17040.

(174) Hu, L. H.; Farrokhnia, M.; Heimdal, J.; Shleev, S.; Rulisek, L.; Ryde, U. Reorganization energy for internal electron transfer in multicopper oxidases. J. Phys. Chem. B 2011, 115, 13111-13126.

(175) Gottlieb, R. A. Cytochrome p450: Major player in reperfusion injury. Arch. Biochem. Biophys. 2003, 420, 262-267.

(176) Jezek, P.; Hlavata, L. Mitochondria in homeostasis of reactive oxygen species in cell, tissues, and organism. Int. J. Biochem. Cell Biol. 2005, 37, 2478-2503.

(177) Elenewski, J. E.; Hackett, J. C. Ab initio dynamics of the cytochrome p450 hydroxylation reaction. J. Chem. Phys. 2015, 142, 9.

(178) Ost, T. W. B.; Clark, J.; Mowat, C. G.; Miles, C. S.; Walkinshaw, M. D.; Reid, G. A.; Chapman, S. K.; Daff, S. Oxygen activation and electron transfer in flavocytochrome p450 bm3. J. Am. Chem. Soc. 2003, 125, 15010-15020.

(179) Meunier, B.; de Visser, S. P.; Shaik, S. Mechanism of oxidation reactions catalyzed by cytochrome p450 enzymes. Chem. Rev. 2004, 104, 3947-3980.

(180) Huang, X.; Groves, J. T. Oxygen activation and radical transformations in heme proteins and metalloporphyrins. Chem. Rev. 2018, 118, 2491-2553.

(181) Langen, R.; Chang, I.; Germanas, J.; Richards, J.; Winkler, J.; Gray, H. Electron tunneling in proteins: coupling through a beta strand. Science 1995, 268, 1733-1735.

(182) Winkler, J. R.; Gray, H. B. Electron flow through metalloproteins. Chem. Rev. 2014, 114, 3369-3380.

(183) Regan, J. J.; Di Bilio, A. J.; Langen, R.; Skov, L. K.; Winkler, J. R.; Gray, H. B.; Onuchic, J. N. Electron tunneling in azurin: the coupling across a β-sheet. Chem. Biol. 1995, 2, 489-496.

(184) Bonagura, C. A.; Bhaskar, B.; Shimizu, H.; Li, H. Y.; Sundaramoorthy, M.; McRee, D. E.; Goodin, D. B.; Poulos, T. L. High-resolution crystal structures and spectroscopy of native and compound I cytochrome c peroxidase. Biochemistry 2003, 42, 5600-5608.

(185) Payne, T. M.; Yee, E. F.; Dzikovski, B.; Crane, B. R. Constraints on the radical cation center of cytochrome c peroxidase for electron transfer from cytochrome c. Biochemistry 2016, 55, 4807-4822.

(186) Barrows, T. P.; Bhaskar, B.; Poulos, T. L. Electrostatic control of the tryptophan radical in cytochrome c peroxidase. Biochemistry 2004, 43, 8826-8834.

(187) Volkov, A. N.; Nicholls, P.; Worrall, J. A. R. The complex of cytochrome c and cytochrome c peroxidase: the end of the road? BBA - Bioenerg. 2011, 1807, 1482-1503.

(188) Murphy, E. J.; Metcalfe, C. L.; Basran, J.; Moody, P. C. E.; Raven, E. L. Engineering the substrate specificity and reactivity of a heme protein: creation of an ascorbate binding site in cytochrome c peroxidase. Biochemistry 2008, 47, 13933-13941.

(189) Lee, D.-S.; Yamada, A.; Sugimoto, H.; Matsunaga, I.; Ogura, H.; Ichihara, K.; Adachi, S.-i.; Park, S.-Y.; Shiro, Y. Substrate recognition and molecular mechanism of fatty acid hydroxylation by cytochrome p450 from Bacillus subtilis: Crystallographic, spectroscopic, and mutational studies. J. Biol. Chem. 2003, 278, 9761-9767.

(190) Nnyepi, M. R.; Peng, Y.; Broderick, J. B. Inactivation of E. coli pyruvate formate-lyase: Role of AdhE and small molecules. Arch. Biochem. Biophys. 2007, 459, 1-9.

(191) Forbes, S. A.; Bindal, N.; Bamford, S.; Cole, C.; Kok, C. Y.; Beare, D.; Jia, M.; Shepherd, R.; Leung, K.; Menzies, A.; Teague, J. W.; Campbell, P. J.; Stratton, M. R.; Futreal, P. A. COSMIC: Mining complete cancer genomes in the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2011, 39, D945-D950.

(192) Noodleman, L.; Peng, C. Y.; Case, D. A.; Mouesca, J. M. Orbital interactions, electron delocalization and spin coupling in iron-sulfur clusters. Coord. Chem. Rev. 1995, 144, 199-244.

(193) Spin contamination in inorganic chemistry calculations. In Encyclopedia of Inorganic and Bioinorganic Chemistry.

(194) Cutsail, G. E.; Telser, J.; Hoffman, B. M. Advanced paramagnetic resonance spectroscopies of iron–sulfur proteins: electron nuclear double resonance (ENDOR) and electron spin echo envelope modulation (ESEEM). BBA - Mol. Cell Res. 2015, 1853, 1370-1394.

(195) Hanson, G.; Berliner, L.: High resolution EPR: Applications to metalloenzymes and metals in medicine; Springer Science & Business Media, 2009; Vol. 28.

(196) Niu, S.; Wang, X.-B.; Yang, X.; Wang, L.-S.; Ichiye, T. Mechanistic insight into the symmetric fission of [4Fe−4S] analogue complexes and implications for cluster conversions in iron−sulfur proteins. J. Phys. Chem. A 2004, 108, 6750-6757.

(197) Broken symmetry states of iron–sulfur clusters. In Encyclopedia of Inorganic Chemistry.

(198) Szilagyi, R. K.; Winslow, M. A. On the accuracy of density functional theory for iron–sulfur clusters. J. Comput. Chem. 2006, 27, 1385-1397.

(199) Baranovskiy, A. G.; Babayeva, N. D.; Zhang, Y.; Gu, J.; Suwa, Y.; Pavlov, Y. I.; Tahirov, T. H. Mechanism of concerted RNA-DNA primer synthesis by the human primosome. J. Biol. Chem. 2016, 291, 10006-10020.

(200) Becke, A. D. A new mixing of Hartree–Fock and local density-functional theories. J. Chem. Phys. 1993, 98, 1372-1377.

(201) Bertini, I.; Capozzi, F.; Ciurli, S.; Luchinat, C.; Messori, L.; Piccioli, M. Identification of the iron ions of high potential iron protein from Chromatium vinosum within the protein frame through two-dimensional NMR experiments. J. Am. Chem. Soc. 1992, 114, 3332-3340.

(202) Perrin, B. S.; Niu, S.; Ichiye, T. Calculating standard reduction potentials of [4Fe–4S] proteins. J. Comput. Chem. 2013, 34, 576-582.

(203) Tan, M.-L.; Perrin, B. S.; Niu, S.; Huang, Q.; Ichiye, T. Protein dynamics and the all-ferrous [Fe4S4] cluster in the nitrogenase iron protein. Protein Sci. 2016, 25, 12-18.

(204) Wang, X.-B.; Niu, S.; Yang, X.; Ibrahim, S. K.; Pickett, C. J.; Ichiye, T.; Wang, L.-S. Probing the intrinsic electronic structure of the cubane [4Fe−4S] cluster: Nature's favorite cluster for electron transfer and storage. J. Am. Chem. Soc. 2003, 125, 14072-14081.

(205) Klamt, A. Conductor-like screening model for real solvents: A new approach to the quantitative calculation of solvation phenomena. J. Phys. Chem. 1995, 99, 2224-2235.

(206) Carvalho, A. T. P.; Swart, M. Electronic structure investigation and parametrization of biologically relevant iron–sulfur clusters. J. Chem. Inf. Model. 2014, 54, 613-620.

(207) Niu, S.; Ichiye, T. Cleavage of [4Fe–4S]-type clusters: Breaking the symmetry. J. Phys. Chem. A 2009, 113, 5710-5717.

(208) Zheng, S.; Tang, Q.; He, J.; Du, S.; Xu, S.; Wang, C.; Xu, Y.; Lin, F. VFFDT: A new software for preparing AMBER force field parameters for metal-containing molecular systems. J. Chem. Inf. Model. 2016, 56, 811-818.

(209) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712-725.

(210) Wang, J.; Cieplak, P.; Kollman, P. A. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J. Comput. Chem. 2000, 21, 1049-1074.

(211) Li, P.; Roberts, B. P.; Chakravorty, D. K.; Merz, K. M. Rational design of particle mesh Ewald compatible Lennard-Jones parameters for +2 metal cations in explicit solvent. J. Chem. Theory Comput. 2013, 9, 2733-2748.

(212) Li, P.; Song, L. F.; Merz, K. M. Parameterization of highly charged metal ions using the 12-6-4 LJ-type nonbonded model in explicit water. J. Phys. Chem. B 2015, 119, 883-895.

(213) Meagher, K. L.; Redman, L. T.; Carlson, H. A. Development of polyphosphate parameters for use with the AMBER force field. J. Comput. Chem. 2003, 24, 1016-1025.

(214) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theory Comput. 2015, 11, 3696-3713.

(215) Pérez, A.; Marchán, I.; Svozil, D.; Sponer, J.; Cheatham, T. E.; Laughton, C. A.; Orozco, M. Refinement of the AMBER force field for nucleic acids: Improving the description of α/γ conformers. Biophys. J. 2007, 92, 3817-3829.

(216) Miller, B. R.; McGee, T. D.; Swails, J. M.; Homeyer, N.; Gohlke, H.; Roitberg, A. E. MMPBSA.py: An efficient program for end-state free energy calculations. J. Chem. Theory Comput. 2012, 8, 3314-3321.

(217) Wang, C.; Nguyen, P. H.; Pham, K.; Huynh, D.; Le, T.-B. N.; Wang, H.; Ren, P.; Luo, R. Calculating protein–ligand binding affinities with MMPBSA: Method and error analysis. J. Comput. Chem. 2016, 37, 2436-2446.

(218) Roe, D. R.; Cheatham, T. E. PTRAJ and CPPTRAJ: Software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 2013, 9, 3084-3095.

(219) He, G.; Siddik, Z. H.; Huang, Z.; Wang, R.; Koomen, J.; Kobayashi, R.; Khokhar, A. R.; Kuang, J. Induction of p21 by p53 following DNA damage inhibits both Cdk4 and Cdk2 activities. Oncogene 2005, 24, 2929-2943.

(220) Abbas, T.; Dutta, A. p21 in cancer: Intricate networks and multiple activities. Nat. Rev. Cancer 2009, 9, 400-414.

(221) Niehrs, C.; Schäfer, A. Active DNA demethylation by Gadd45 and DNA repair. Trends Cell Biol. 2012, 22, 220-227.

(222) Barreto, G.; Schäfer, A.; Marhold, J.; Stach, D.; Swaminathan, S. K.; Handa, V.; Döderlein, G.; Maltry, N.; Wu, W.; Lyko, F.; Niehrs, C. Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature 2007, 445, 671-675.

(223) Schieber, M.; Chandel, N. S. ROS function in redox signaling and oxidative stress. Curr. Biol. 2014, 24, R453-R462.

(224) Buzek, J.; Latonen, L.; Kurki, S.; Peltonen, K.; Laiho, M. Redox state of tumor suppressor p53 regulates its sequence-specific DNA binding in DNA-damaged cells by cysteine 277. Nucleic Acids Res. 2002, 30, 2340-2348.

(225) Augustyn, K. E.; Merino, E. J.; Barton, J. K. A role for DNA-mediated charge transport in regulating p53: Oxidation of the DNA-bound protein from a distance. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 18907-18912.

(226) Schaefer, K. N.; Barton, J. K. DNA-mediated oxidation of p53. Biochemistry 2014, 53, 3467-3475.

(227) Takada, T.; Barton, J. K. DNA charge transport leading to disulfide bond formation. J. Am. Chem. Soc. 2005, 127, 12204-12205.

(228) Chin, P. L.; Momand, J.; Pfeifer, G. P. In vivo evidence for binding of p53 to consensus binding sites in the p21 and GADD45 genes in response to ionizing radiation. Oncogene 1997, 15, 87-99.

(229) Kannan, K.; Jain, S. K. Oxidative stress and apoptosis. Pathophysiology 2000, 7, 153-163.

(230) Barr, A. R.; Cooper, S.; Heldt, F. S.; Butera, F.; Stoy, H.; Mansfeld, J.; Novák, B.; Bakal, C. DNA damage during S-phase mediates the proliferation-quiescence decision in the subsequent G1 via p21 expression. Nat. Commun. 2017, 8, 14728.

(231) El-Deiry, W. S.; Kern, S. E.; Pietenpol, J. A.; Kinzler, K. W.; Vogelstein, B. Definition of a consensus binding site for p53. Nat. Genet. 1992, 1, 45-49.

(232) El-Deiry, W. S.; Tokino, T.; Velculescu, V. E.; Levy, D. B.; Parsons, R.; Trent, J. M.; Lin, D.; Mercer, W. E.; Kinzler, K. W.; Vogelstein, B. WAF1, a potential mediator of p53 tumor suppression. Cell 1993, 75, 817-825.

(233) Kearns, S.; Lurz, R.; Orlova, E. V.; Okorokov, A. L. Two p53 tetramers bind one consensus DNA response element. Nucleic Acids Res. 2016, 44, 6185-6199.

(234) Kitayner, M.; Rozenberg, H.; Kessler, N.; Rabinovich, D.; Shaulov, L.; Haran, T. E.; Shakked, Z. Structural basis of DNA recognition by p53 tetramers. Mol. Cell 2006, 22, 741-753.

(235) Schaefer, K. N.; Geil, W. M.; Sweredoski, M. J.; Moradian, A.; Hess, S.; Barton, J. K. Oxidation of p53 through DNA charge transport involves a network of disulfides within the DNA-binding domain. Biochemistry 2015, 54, 932-941.

(236) Kaar, J. L.; Basse, N.; Joerger, A. C.; Stephens, E.; Rutherford, T. J.; Fersht, A. R. Stabilization of mutant p53 via alkylation of cysteines and effects on DNA binding. Protein Sci. 2010, 19, 2267-2278.

(237) Bauer, M. R.; Joerger, A. C.; Fersht, A. R. 2-Sulfonylpyrimidines: Mild alkylating agents with anticancer activity toward p53-compromised cells. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, E5271-E5280.

(238) Renaud, N.; Berlin, Y. A.; Lewis, F. D.; Ratner, M. A. Between superexchange and hopping: An intermediate charge-transfer mechanism in poly(A)-poly(T) DNA hairpins. J. Am. Chem. Soc. 2013, 135, 3953-3963.

(239) Bixon, M.; Jortner, J. Charge transport in DNA via thermally induced hopping. J. Am. Chem. Soc. 2001, 123, 12556-12567.

(240) Uskov, D. B.; Burin, A. L. Strong localization of positive charge in DNA from a charge-balance theory. Phys. Rev. B 2008, 78, 073106.

(241) Kuznetsov, A. M.; Ulstrup, J.: Electron transfer in chemistry and biology: An introduction to the theory; John Wiley & Sons: New York, 1999.

(242) Voityuk, A. A. Electronic couplings and on-site energies for hole transfer in DNA: Systematic quantum mechanical/molecular dynamic study. J. Chem. Phys. 2008, 128, 115101.

(243) Balabin, I. A.; Onuchic, J. N. Dynamically controlled protein tunneling paths in photosynthetic reaction centers. Science 2000, 290, 114-117.

(244) Wiman, K. G. Pharmacological reactivation of mutant p53: from protein structure to the cancer patient. Oncogene 2010, 29, 4245.

(245) Cong, L.; Ran, F. A.; Cox, D.; Lin, S. L.; Barretto, R.; Habib, N.; Hsu, P. D.; Wu, X. B.; Jiang, W. Y.; Marraffini, L. A.; Zhang, F. Multiplex genome engineering using CRISPR/Cas systems. Science 2013, 339, 819-823.

(246) Mali, P.; Yang, L. H.; Esvelt, K. M.; Aach, J.; Guell, M.; DiCarlo, J. E.; Norville, J. E.; Church, G. M. RNA-guided human genome engineering via Cas9. Science 2013, 339, 823-826.

(247) Pinheiro, V. B.; Taylor, A. I.; Cozens, C.; Abramov, M.; Renders, M.; Zhang, S.; Chaput, J. C.; Wengel, J.; Peak-Chew, S.-Y.; McLaughlin, S. H.; Herdewijn, P.; Holliger, P. Synthetic genetic polymers capable of heredity and evolution. Science 2012, 336, 341-344.

(248) Teo, R. D.; Smithwick, E. R.; Migliore, A. 2′-Deoxy-2′-fluoro-arabinonucleic acid: a valid alternative to DNA for biotechnological applications using charge transport. Phys. Chem. Chem. Phys. 2019, 21, 22869-22878.

(249) Watts, J. K.; Katolik, A.; Viladoms, J.; Damha, M. J. Studies on the hydrolytic stability of 2′-fluoroarabinonucleic acid (2′F-ANA). Org. Biomol. Chem. 2009, 7, 1904-1910.

(250) Marquez, V. E.; Tseng, C. K. H.; Mitsuya, H.; Aoki, S.; Kelley, J. A.; Ford, H.; Roth, J. S.; Broder, S.; Johns, D. G.; Driscoll, J. S. Acid-stable 2'-fluoro purine dideoxynucleosides as active agents against HIV. J. Med. Chem. 1990, 33, 978-985.

(251) Carson, D. A.; Wasson, D. B.; Esparza, L. M.; Carrera, C. J.; Kipps, T. J.; Cottam, H. B. Oral antilymphocyte activity and induction of apoptosis by 2-chloro-2'-arabino-fluoro-2'-deoxyadenosine. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 2970-2974.

(252) Souleimanian, N.; Deleavey, G. F.; Soifer, H.; Wang, S.; Tiemann, K.; Damha, M. J.; Stein, C. A. Antisense 2’-deoxy, 2’-fluoroarabino nucleic acid (2’F-ANA) oligonucleotides: In vitro gymnotic silencers of gene expression whose potency is enhanced by fatty acids. Mol. Ther. Nucleic Acids 2012, 1.

(253) Watts, J. K.; Damha, M. J. 2′F-Arabinonucleic acids (2′F-ANA) — History, properties, and new frontiers. Can. J. Chem. 2008, 86, 641-656.

(254) Beratan, D. N.; Waldeck, D. H. Hot holes break the speed limit. Nat. Chem. 2016, 8, 992.

(255) Lercher, L.; McDonough, M. A.; El-Sagheer, A. H.; Thalhammer, A.; Kriaucionis, S.; Brown, T.; Schofield, C. J. Structural insights into how 5-hydroxymethylation influences transcription factor binding. Chem. Commun. 2014, 50, 1794-1796.

(256) Martín-Pintado, N.; Yahyaee-Anzahaee, M.; Campos-Olivas, R.; Noronha, A. M.; Wilds, C. J.; Damha, M. J.; González, C. The solution structure of double helical arabino nucleic acids (ANA and 2′F-ANA): effect of arabinoses in duplex-hairpin interconversion. Nucleic Acids Res. 2012, 40, 9329-9339.

(257) Schrödinger Release 2018-1: Maestro, Schrödinger, LLC, New York, NY, 2018.

(258) Noy, A.; Luque, F. J.; Orozco, M. Theoretical analysis of antisense duplexes: Determinants of the RNase H susceptibility. J. Am. Chem. Soc. 2008, 130, 3486-3496.

(259) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104.

(260) Tavernier, H. L.; Fayer, M. D. Distance dependence of electron transfer in DNA: The role of the reorganization energy and free energy. J. Phys. Chem. B 2000, 104, 11541-11550.

(261) Tong, G. S. M.; Kurnikov, I. V.; Beratan, D. N. Tunneling energy effects on GC oxidation in DNA. J. Phys. Chem. B 2002, 106, 2381-2392.

(262) Cuervo, A.; Dans, P. D.; Carrascosa, J. L.; Orozco, M.; Gomila, G.; Fumagalli, L. Direct measurement of the dielectric polarization properties of DNA. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, E3624-E3630.

(263) Rösch, N.; Voityuk, A. A. Quantum chemical calculation of donor-acceptor coupling for charge transfer in DNA. Top. Curr. Chem. 2004, 237, 37-72.

(264) Steinbrecher, T.; Koslowski, T.; Case, D. A. Direct simulation of electron transfer reactions in DNA radical cations. J. Phys. Chem. B 2008, 112, 16935-16944.

(265) Zeidan, T. A.; Carmieli, R.; Kelley, R. F.; Wilson, T. M.; Lewis, F. D.; Wasielewski, M. R. Charge-transfer and spin dynamics in DNA hairpin conjugates with perylenediimide as a base-pair surrogate. J. Am. Chem. Soc. 2008, 130, 13945-13955.

(266) Harris, M. A.; Mishra, A. K.; Young, R. M.; Brown, K. E.; Wasielewski, M. R.; Lewis, F. D. Direct observation of the hole carriers in DNA photoinduced charge transport. J. Am. Chem. Soc. 2016, 138, 5491-5494.

(267) Olshansky, J. H.; Young, R. M.; Wasielewski, M. R. Charge separation and recombination pathways in diblock DNA hairpins. J. Phys. Chem. B 2019, 123, 1545-1553.

(268) Lewis, F. D.; Kalgutkar, R. S.; Wu, Y. S.; Liu, X. Y.; Liu, J. Q.; Hayes, R. T.; Miller, S. E.; Wasielewski, M. R. Driving force dependence of electron transfer dynamics in synthetic DNA hairpins. J. Am. Chem. Soc. 2000, 122, 12346-12351.

(269) Petersen, R. A.; Evans, D. H. Heterogeneous electron-transfer kinetics for a variety of organic electrode-reactions at the mercury acetonitrile interface using either tetraethylammonium perchlorate or tetraheptylammonium perchlorate electrolyte. J. Electroanal. Chem. 1987, 222, 129-150.

(270) Migliore, A.; Nitzan, A. Nonlinear charge transport in redox molecular junctions: A Marcus perspective. ACS Nano 2011, 5, 6669-6685.

Biography

Ruijie D. Teo (Darius) completed his undergraduate studies at the California Institute of Technology in June 2015, and obtained a Ph.D. degree in chemistry at Duke University in May 2020, along with a Certificate in College Teaching and a Certificate in Structural Biology and . In graduate school, he received the Blue Waters Graduate Fellowship, James B. Duke Fellowship, William Krigbaum Fellowship, Bass Instructional Fellowship, and John Chambers Fellowship. He was awarded the Chemical Computing Group Excellence Award for Graduate Students, Best Poster Award (28th Canadian Symposium on Theoretical and Computational Chemistry), and the Pelham Wilder Award for teaching excellence.

Publication list (* denotes equal contribution):

10. Teo, R. D.; Smithwick, E. R.; Migliore, A. Phys. Chem. Chem. Phys. 2019, 21, 22869–22878.
9. Teo, R. D.*; Wang, R.*; Smithwick, E.; Migliore, A.; Therien, M. J.; Beratan, D. N. Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 15811–15816.
8. Teo, R. D.; Smithwick, E.; Migliore, A.; Beratan, D. N. Chem. Commun. 2019, 55, 206–209.
7. Teo, R. D.; Rousseau, B. J. G.; Smithwick, E.; Di Felice, R.; Beratan, D. N.; Migliore, A. Chem. 2019, 5, 122–137.
6. Soll, M.; Goswami, T.; Cheng, Q.-C.; Saltsman, I.; Teo, R. D.; Shahgoli, M.; Lim, P.; Di Bilio, A. J.; Cohen, S.; Termini, J.; Gray, H. B.; Gross, Z. Sci. Rep. 2019, 9, 2294.
5. Teo, R. D.; Terai, K.; Migliore, A.; Beratan, D. N. Phys. Chem. Chem. Phys. 2018, 20, 26063–26067.
4. Teo, R. D.; Hwang, J. Y.; Termini, J.; Gross, Z.; Gray, H. B. Chem. Rev. 2017, 117, 2711–2729.
3. Teo, R. D.; Termini, J.; Gray, H. B. J. Med. Chem. 2016, 59, 6012–6024.
2. Teo, R. D.*; Dong, S. S.*; Gross, Z.; Gray, H. B.; Goddard III, W. A. Mol. Biosyst. 2015, 11, 2907–2914.
1. Teo, R. D.; Gray, H. B.; Lim, P.; Termini, J.; Domeshek, E.; Gross, Z. Chem. Commun. 2014, 50, 13789–13792.
