Molecular Dynamics Simulations of Folded and Disordered Polypeptides in Comparison with Nuclear Magnetic Resonance Measurement

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in

the Graduate School of The Ohio State University

By

Lei Yu

Graduate Program in Chemistry

The Ohio State University

2018

Thesis Committee

Rafael Brüschweiler, Advisor

Sherwin Singer

Copyrighted by

Lei Yu

2018

Abstract

With the continuous increase in computer speed, molecular dynamics (MD) simulations have become a crucial biophysical method for understanding biochemical processes at atomic detail. However, current force fields optimized for globular proteins often produce overly collapsed structures for intrinsically disordered proteins (IDPs). To study the relevant biological functions of IDPs, it is of great importance to use appropriate force fields. Chemical shifts and J-coupling measurements from nuclear magnetic resonance (NMR) experiments serve as a common ground for assessment. Various combinations of force fields and water models were assessed by direct comparison between back-calculated NMR parameters and experimental data. TIP4P-D was shown to improve the accuracy of back-calculation by producing more realistic distributions of the dihedral angle ψ. The Amber99SBnmr1-ILDN and RSFF2+ force fields were shown to be optimal choices for tripeptide and IDP fragment simulations. The stability of the simulated two-helix bundle (THB) domain in the eukaryotic Na+/Ca2+ exchanger (NCX) suggested that the performance of the Amber99SBnmr1-ILDN force field in combination with the TIP4P-D water model is transferable to folded proteins. Based on the assessment results, a research plan for future improvement of force fields by rebalancing dihedral angle distributions was proposed.

Dedication

For my family.

Acknowledgments

I would like to express my sincere gratitude towards Prof. Rafael Brüschweiler for his unwavering support throughout my course of study. His multidisciplinary expertise constantly astonishes me while his genuine enthusiasm for science empowers me to pursue further.

Besides my advisor, I would like to thank my thesis committee member Prof. Sherwin Singer for his rigorous teaching, insightful comments, and accessible personality.

My deep appreciation also goes to Dr. Da-Wei Li for inspiring discussions and prompt guidance.

Last but not least, I thank the rest of my group: Dr. Alexandar Hansen, Dr. Lei Bruschweiler-Li, Dr. Bo Zhang, Dr. István Timári, Mouzhe Xie, Jiaqi Yuan, Cheng Wang, Gregory Jameson, Abigail Leggett, Daniel Cantu, and Xinyao Xiang for their assistance and support.

Vita

2016 ···················· B.S. Chemistry, University of Science and Technology of China

2016 to present ········ Graduate Associate, Department of Chemistry and Biochemistry, The Ohio State University

Fields of Study

Major Field: Chemistry

Table of Contents

Abstract

Dedication

Acknowledgments

Vita

List of Tables

List of Figures

Chapter 1: Introduction

1.1 Molecular Dynamics (MD) Simulations

1.2 Nuclear Magnetic Resonance (NMR) Spectroscopy

Chapter 2: Assessment of Molecular Mechanics Force Fields for IDPs by NMR Data

2.1 Introduction

2.2 Method

2.2.1 MD simulation

2.2.2 Back-calculation of J-coupling constants

2.2.3 Back-calculation of chemical shifts

2.3 Result and Discussion

2.3.1 RMSDs of back-calculated J-coupling constants

2.3.2 Correlation coefficients of back-calculated J-coupling constants

2.3.3 RMSDs of back-calculated chemical shifts

2.3.4 Comparison with random coil approach

2.4 Conclusion

Chapter 3: MD Simulation of the Two-helix Bundle Domain in the Na+/Ca2+ Exchanger

2.1 Introduction

2.2 Method

2.3 Result and Discussion

2.3.1 Stability of the NMR Structure

2.3.2 Inter-helical Angles Are Consistent with the Experimental Values

2.3.3 Dynamics of the THB Domain

2.4 Conclusion

Chapter 4: Conclusion and Outlook

Bibliography

List of Tables

Table 2.1 RMSD of φ dependent J-coupling constants using different force field and water model combinations. The use of TIP4P-D slightly increased RMSDs for all protein force fields. RSFF2+/TIP3P yielded the smallest RMSD.

Table 2.2 RMSD of ψ dependent J-coupling constants using different force field and water model combinations. Except for CHARMM36m, the RMSDs decreased when TIP4P-D was used, with the largest decrease of 12% for Amber99SBnmr1-ILDN. Amber14/TIP4P-D yielded the smallest RMSD.

Table 2.3 φ and ψ equally weighted RMSD of J-coupling constants using different force field and water model combinations. Amber99SBnmr1-ILDN/TIP4P-D yielded the smallest RMSD while RSFF2+/TIP4P-D ranked second among the TIP4P-D combinations. The difference between different force fields was, however, subtle. The best combination was less than 10% better than the worst.

Table 2.4 RMSD of chemical shifts of p53TAD (excluding CO data) and α-Synuclein using different force field and water model combinations. Amber99SBnmr1-ILDN/TIP4P-D yielded the smallest RMSD while RSFF2+/TIP4P-D yielded comparable results.

Table 3.1 Relative standard deviations of different measures of motion. The relative standard deviations of all 4 measures of motion, namely, the overall RMSD of the ordered region of the THB domain, the distance between residue 295 and 310, the distance between residue 295 and 314, and the inter-helical angle, were calculated by dividing the standard deviation by the mean. The relative standard deviation of the overall RMSD is about 2 times larger than the rest, which demonstrates the rigidity of the linker.

List of Figures

Figure 2.1 Computational and experimental J-coupling constants. 3J (HN,Hα), 3J (HN,C'), 3J (Hα,C'), 3J (HN,Hβ), 1J (N,Cα), and 2J (N,Cα) coupling constants from available experimental data are plotted against computed ones. Scarlet bars indicate measurement errors. Correlation coefficients, shown on the plots, are low except for 3J (HN,Hα) and 3J (HN,Hβ). The back-calculation overestimated 3J (HN,Hα), 3J (HN,C'), and 2J (N,Cα) coupling constants while underestimating 3J (Hα,C'), 3J (HN,Hβ), and 1J (N,Cα) coupling constants.

Figure 2.2 Computational and experimental chemical shifts in α-Synuclein. Chemical shifts predicted by PPM from a 500-ns MD trajectory using Amber99SBnmr1-ILDN/TIP4P-D show overall good correlation with experimental values. The amide proton chemical shifts show weaker correlation due to imperfect force fields, and a referencing error is observed in the plot for N chemical shifts.

Figure 2.3 Fraction of RMSD of back-calculated chemical shifts smaller than that of the experimental predictor. When the TIP3P water model was used (scarlet), the random coil approach performed better than simulation and PPM combined. When TIP4P-D was used (grey), 12 out of 22 predictions grouped by types of residues and chemical shifts using Amber99SBnmr1-ILDN/TIP4P-D or RSFF2+/TIP4P-D were better than those using random coil chemical shifts, which indicated that both approaches were comparable.

Figure 3.1 Schematic representation of eukaryotic NCX structure (from Jiaqi Yuan, et al., 2018). The updated NCX model consists of a transmembrane domain TM (residues 1-218 and 727-903) and an intracellular f-loop (residues 219-726). The newly characterized two-helix bundle (THB) domain (residues 284-322) is the only structured domain in the f-loop besides the Ca2+ binding domain CBD12 (residues 371-650). The exchanger inhibitory peptide XIP (residues 219-238), cytoplasmic segment 1 Cyto1 (residues 219-370), and cytoplasmic segment 2 Cyto2 (residues 651-726) are other domains and regions in the intracellular loop.

Figure 3.2 MD simulation of THB domain. A. Cα RMSDs calculated from snapshots of the MD trajectory compared to the experimental structure, reflecting good stability of the THB domain during the length of the trajectory (500 ns). B. Distances between residue 295 and residue 310 and between residue 295 and 314 calculated from snapshots of the MD trajectory reveal good stability of the THB domain.

Figure 3.3 MD simulation of THB domain. A. The angles between the two helices calculated from MD snapshots. The gray line corresponds to the average angle between helices (134.7°) calculated from 20 NMR structures of lowest energy. B. Histogram representing the distribution of the inter-helical angle during the MD simulation, which has a standard deviation of 6.0°. C. Superposition of THB domain structures corresponding to MD simulation structures taken at 100 ns, 200 ns, 300 ns, 400 ns, and 500 ns. Hydrophobic residues M291, I294, L295, L298, L310, I311, A314, and V318 involved in inter-helical contacts are indicated in gray. The tight bundle, consistent with the experiment, reflects the good stability of the domain during the MD simulation.

Chapter 1: Introduction

1.1 Molecular Dynamics (MD) Simulations

Molecular dynamics (MD) is a computational method for the study of molecular systems at atomic detail in a variety of areas, such as material [1] and biological sciences [2-4].

From a physical perspective, molecular dynamics seeks to achieve two goals. First, given a molecular structure, it needs to accurately calculate the energy based on a molecular mechanics force field. Second, it needs to sample conformational space as extensively as possible or as needed from a starting structure to uncover other states of interest.

In an MD simulation, a series of time-dependent numerical integrations of classical equations of motion is calculated. The leap-frog algorithm, for example, is one of the symplectic algorithms that are able to perform this task. The leap-frog algorithm for an N-body system can be expressed in a synchronized form.

$$
\begin{aligned}
\mathbf{r}(t + \Delta t) &= \mathbf{r}(t) + \mathbf{v}(t + \Delta t/2)\,\Delta t \\
\mathbf{v}(t + \Delta t/2) &= \mathbf{v}(t - \Delta t/2) + \mathbf{a}(t)\,\Delta t \\
\mathbf{a}(t) &= \begin{pmatrix} -\dfrac{1}{m_1}\dfrac{\partial U}{\partial \mathbf{r}_1} \\ -\dfrac{1}{m_2}\dfrac{\partial U}{\partial \mathbf{r}_2} \\ \vdots \\ -\dfrac{1}{m_N}\dfrac{\partial U}{\partial \mathbf{r}_N} \end{pmatrix}
\end{aligned}
$$

The atomic coordinate r = (r1, r2, ..., rN) is in 3N Cartesian space. The acceleration vector a(t) is derived from the force field U(r) with atomic mass mi (i = 1, 2, …, N). The velocity v(t) and position r(t) propagate from their initial values according to these equations. The integration step Δt is chosen such that the continuity of the simulation is ensured while the use of computer resources is efficient.
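The leap-frog update can be illustrated with a minimal Python sketch. The toy system (a one-dimensional harmonic oscillator with unit mass and unit force constant), the function names, and the time step are assumptions chosen for illustration only; they are not taken from the simulations in this thesis.

```python
import numpy as np

def leapfrog(r0, v_half, accel, dt, n_steps):
    """Propagate positions with the leap-frog scheme.

    r0     : positions at time t
    v_half : velocities at time t - dt/2
    accel  : function returning a(r) = -(1/m) dU/dr
    """
    r = np.array(r0, dtype=float)
    v = np.array(v_half, dtype=float)
    traj = [r.copy()]
    for _ in range(n_steps):
        v = v + accel(r) * dt   # v(t + dt/2) = v(t - dt/2) + a(t) dt
        r = r + v * dt          # r(t + dt)   = r(t) + v(t + dt/2) dt
        traj.append(r.copy())
    return np.array(traj), v

def harmonic_accel(r, k=1.0, m=1.0):
    """Toy force field (assumption): U = k r^2 / 2, so a = -k r / m."""
    return -k * r / m

dt = 0.01
r0 = np.array([1.0])
# Start from r = 1, v = 0 and shift the velocity back by half a step.
v_half = np.array([0.0]) - 0.5 * dt * harmonic_accel(r0)
traj, _ = leapfrog(r0, v_half, harmonic_accel, dt, n_steps=628)
```

After 628 steps (t = 6.28, close to one period of 2π) the oscillator returns close to its starting point and the amplitude stays bounded, which reflects the long-time stability expected of a symplectic integrator.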

$$
U(\mathbf{r}) = \sum_{\text{bonds}} \frac{k_b}{2}(b - b_0)^2 + \sum_{\text{angles}} \frac{k_\theta}{2}(\theta - \theta_0)^2 + \sum_{\text{dihedrals}} k_\phi \left[1 + \cos(n\phi - \delta)\right] + \sum_{i,j} 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \sum_{i,j} \frac{q_i q_j}{4\pi\varepsilon r_{ij}}
$$

Generally, a molecular mechanics force field is a sum of additive energy functions describing different types of interactions. The energies of bond lengths and bond angles are treated as harmonic oscillators with force constants kb and kθ, and energy minima at the optimal bond length b0 and the optimal bond angle θ0, respectively. The dihedral angle potential is described by a periodic function, where kϕ, n, and δ are constants. As for non-bonded interactions, the Lennard-Jones potential is usually used to represent the van der Waals interactions, where rij is the interatomic distance, and εij and σij are constants. Coulomb interactions between pairs of partial charges qi and qj with dielectric constant ε account for the electrostatic interactions. Finally, a water model is required to consider explicit solvent dynamics.
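The individual energy terms described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the harmonic terms use the common (k/2)(x − x0)² convention, the function names are hypothetical, and the Coulomb prefactor is the vacuum value of 1/(4πε0) expressed in kJ mol⁻¹ Å e⁻². Actual force fields differ in conventions, units, and in how non-bonded pairs are excluded or scaled.

```python
import math

def harmonic(x, x0, k):
    """Bond or angle term: (k/2) (x - x0)^2."""
    return 0.5 * k * (x - x0) ** 2

def dihedral(phi, k_phi, n, delta):
    """Periodic torsion term: k_phi * (1 + cos(n*phi - delta)), angles in radians."""
    return k_phi * (1.0 + math.cos(n * phi - delta))

def lennard_jones(r, eps, sigma):
    """van der Waals term: 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 ** 2 - sr6)

def coulomb(r, qi, qj, coulomb_const=1389.35):
    """Electrostatic term q_i q_j / (4 pi eps r).

    coulomb_const is 1/(4 pi eps0) in kJ mol^-1 Angstrom e^-2 (vacuum, assumed).
    """
    return coulomb_const * qi * qj / r
```

The Lennard-Jones term crosses zero at r = σ and reaches its minimum of −ε at r = 2^(1/6) σ, which gives a quick sanity check for any implementation.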

Based on statistical mechanics, MD simulations performed under different conditions generate various thermodynamic ensembles. The isothermal-isobaric (NPT) ensemble fixes the number of atoms, and the microscopic system is coupled to the macroscopic properties pressure and temperature on a step-to-step basis. NPT simulations sample the Gibbs free energy landscape, which governs the most common experimental conditions. Also, in such simulations, the liquid density is able to equilibrate automatically.

More efficient than quantum chemical calculations and more accurate than coarse-grained representations, MD simulations have become a crucial biophysical method for the understanding of biochemical processes at atomic detail. The continuous increase in computer speed enables MD simulations of full proteins to be run routinely into the microsecond timescale. Using special-purpose computers that are capable of performing millisecond simulations, proteins have been folded from scratch using appropriate force fields [5].

1.2 Nuclear Magnetic Resonance (NMR) Spectroscopy

Nuclear magnetic resonance (NMR) spectroscopy, arising from interactions between nuclear magnetic spins in an external magnetic field, is an important experimental method to characterize protein structure and probe site-specific dynamics. NMR measurements, such as chemical shifts and J-couplings, provide molecular information of proteins at a much slower timescale that is usually inaccessible to MD simulations. Chemical shifts, the variations of NMR frequency due to variations in the electron distribution, are diagnostic of chemical environments. J-couplings, arising from indirect dipole-dipole interactions, contain information about molecular conformation. To be more precise, according to Karplus equations [6], J-coupling constants can be calculated solely based on dihedral angles.

$$
J(\phi) = A\cos^2(\phi + \theta) + B\cos(\phi + \theta) + C
$$

In a typical Karplus equation, ϕ denotes dihedral angle φ or ψ, θ is an offset, and A, B, and C are Karplus coefficients parametrized by experimental data or quantum chemical calculations.
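A Karplus relation of this form is straightforward to evaluate. The sketch below assumes the usual form J = A cos²(ϕ + θ) + B cos(ϕ + θ) + C with angles in degrees; the coefficient values are placeholders chosen for easy arithmetic and are not taken from any of the parametrizations cited in this thesis.

```python
import math

def karplus(phi_deg, A, B, C, theta_deg=0.0):
    """Evaluate J = A cos^2(phi + theta) + B cos(phi + theta) + C (angles in degrees)."""
    x = math.radians(phi_deg + theta_deg)
    return A * math.cos(x) ** 2 + B * math.cos(x) + C

# Placeholder coefficients (hypothetical, in Hz) and offset (degrees).
A, B, C, THETA = 7.0, -1.0, 1.0, -60.0

j_at_60 = karplus(60.0, A, B, C, THETA)    # phi + theta = 0 degrees, so J = A + B + C
j_at_150 = karplus(150.0, A, B, C, THETA)  # phi + theta = 90 degrees, so J = C
```

The two evaluation points are chosen so that the cosine is exactly 1 or 0, which makes the role of each coefficient easy to verify by hand.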

Such experimental input can be utilized in MD simulations in various ways. Back-calculation is a common approach in which experimental data are used to validate MD trajectories. J-coupling constants, for example, can be back-calculated by averaging over all snapshots in an MD trajectory. With appropriate Karplus parametrization, the back-calculated J-coupling constant $J^{\mathrm{calc}}$ can be compared with the experimental coupling constant $J^{\mathrm{exp}}$.

$$
J^{\mathrm{calc}} = A' \left\langle \cos^2(\phi + \theta) \right\rangle + B' \left\langle \cos(\phi + \theta) \right\rangle + C'
$$

Here the angle brackets denote the average over all snapshots. The Karplus coefficients A', B', and C' used in back-calculation, strictly speaking, are not the same as those fitted experimentally. In the experimental parametrization, the experimental J-couplings are averages over different conformations while the dihedral angles are static values taken from structures. Therefore, experimental Karplus coefficients contain dynamic information that has already been averaged by motion. To avoid double averaging, static Karplus coefficients should be used in back-calculation instead. Such coefficients can be obtained by assuming a certain dihedral angle distribution [7] or by directly fitting from MD trajectories [8].
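The difference between averaging J over snapshots and evaluating J at an average angle can be made concrete with a small sketch. The two-state dihedral trajectory and the Karplus coefficients below are hypothetical values chosen only to illustrate the averaging step; they do not come from the simulations or parametrizations used in this work.

```python
import numpy as np

def karplus(phi_deg, A, B, C, theta_deg=0.0):
    """Vectorized Karplus relation, angles in degrees."""
    x = np.radians(np.asarray(phi_deg, dtype=float) + theta_deg)
    return A * np.cos(x) ** 2 + B * np.cos(x) + C

def back_calculate(phi_traj_deg, A, B, C, theta_deg=0.0):
    """Ensemble-averaged J: the mean of J over all snapshots."""
    return float(np.mean(karplus(phi_traj_deg, A, B, C, theta_deg)))

# Hypothetical two-state trajectory of a backbone dihedral angle (degrees).
phi_traj = np.array([-150.0, -150.0, -70.0, -70.0])
A, B, C, THETA = 7.0, -1.0, 1.0, -60.0  # placeholder coefficients

j_ensemble = back_calculate(phi_traj, A, B, C, THETA)
j_of_mean_angle = float(karplus(phi_traj.mean(), A, B, C, THETA))
```

Because the Karplus relation is nonlinear in the angle, `j_ensemble` and `j_of_mean_angle` differ substantially for this toy trajectory, which is why the average has to be taken over J values rather than over angles.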

Chemical shifts can also be predicted from an MD trajectory or a single snapshot. Because chemical shifts have a more complex relationship with backbone and side-chain dihedral angles as well as hydrogen bonds, linear regression and artificial neural networks are commonly used by empirical chemical shift predictors, such as SPARTA+ [9], PPM [10], and PROSECCO [11].

In this thesis, MD simulations and NMR measurements are used as complementary tools for the study of protein structure and dynamics. In Chapter 2, molecular mechanics force fields are assessed as combinations of protein force fields and water models by comparing back-calculated NMR parameters with the experimental values. In Chapter 3, MD simulations are used to confirm the stability of a newly characterized domain, the two-helix bundle (THB) domain, in the eukaryotic Na+/Ca2+ exchanger (NCX). In Chapter 4, major conclusions are highlighted and an outlook for further research is provided.

Chapter 2: Assessment of Molecular Mechanics Force Fields for IDPs by NMR Data

2.1 Introduction

Intrinsically disordered proteins (IDPs) are proteins that exist in natively unfolded states under physiological conditions. In a broader sense, such intrinsic disorder is also manifested in intrinsically disordered regions and loops that connect multiple structured domains. The existence of intrinsic disorder is ubiquitous. It has been reported that more than 30% of eukaryotic proteins have at least 40 consecutive disordered residues, significantly higher than the percentage for bacteria or archaea [12].

IDPs are widely encoded in human genes and linked to a variety of human diseases [13-15]. The human tumor protein p53, for instance, is an IDP that plays a crucial role in preventing cancer formation [16]. α-Synuclein, as another example, is partially responsible for Parkinson's disease due to its accumulation in the brain.

Although the detailed mechanisms of IDP functions are still largely unknown, binding-induced folding is a commonly accepted model for the interaction between an IDP and a globular partner or between two IDPs [12]. Therefore, knowledge of structural propensities and dynamics of IDPs is essential to understanding the pathogeneses of such human diseases and the mechanisms of relevant biological functions.

MD simulations have proven effective for the study of protein folding [5] and conformational change of IDPs [17, 18] at atomic detail. However, using simulations to study the dynamics of IDPs remains challenging due to insufficient sampling and imperfect molecular mechanics potentials.

On the one hand, the lack of 3D structures results in flatter energy landscapes for IDPs compared with those of folded proteins. A larger conformational space needs to be sampled in order to acquire an ensemble of IDP conformations, whereas a single native structure at the energy minimum usually suffices for a globular protein. This makes the study of IDPs extremely difficult by simulations and by experiments. Enhanced sampling techniques, such as replica-exchange MD simulations, have been shown to accelerate the conformational sampling of IDPs [19, 20].

On the other hand, traditional force fields optimized for globular proteins often result in overly collapsed structures for IDPs [21-24]. Approaches have been proposed to solve this issue [25-27]. In particular, balanced protein-water interactions play an important part in improving the agreement with experimental data [28-30].

Specifically, the TIP4P-D water model has recently received great attention [31]. Compared with the TIP3P water model [32], which was developed more than 30 years ago, TIP4P-D modifies the water dispersion coefficient and thus improves performance in IDP simulations. Various approaches have been taken to improve protein force fields. The Amber ff14SB force field (Amber14) modifies side-chain and backbone parameters to improve secondary structure content in small peptides and the agreement with NMR χ1 scalar coupling constants [33]. The CHARMM36m force field mainly adjusts the population of left-handed α-helix to generate more realistic ensembles for both IDPs and folded proteins [34]. An NMR-based protein force field optimizes the potential of backbone dihedral angles by reweighting MD trajectories to minimize the RMSD with respect to the experimental chemical shifts [35]. Together with a side-chain torsion angle correction [36], Amber99SBnmr1-ILDN was found to perform the best with several water models [37].

More recently, residue-specific force fields (RSFFs), based on a protein coil library, have been shown to improve peptide conformations [38, 39]. With the compensation on α-helix, the RSFF2+ force field is able to remove the destabilizing effect of TIP4P-D on proteins with small fractions of folding and thus produce correct folding [40].

Different force fields are able to generate different structure ensembles for IDPs [41]. In order to obtain realistic protein ensembles, it is of great importance to utilize appropriate protein force fields and water models for given systems. Since force fields are usually developed in and tested for different systems, it is necessary to assess them by the same standards.

Chemical shifts and J-coupling data from solution NMR experiments provide valuable ensemble descriptions for the dynamics of IDPs. J-coupling constants are empirically calculated using various Karplus equations that only depend on dihedral angles [6]. Chemical shifts can also be predicted using empirical predictors such as PPM [10] and PPM_One [42] from an ensemble of structures and a single structure, respectively. These accurately measured experimental parameters serve as a common ground for the assessment of the accuracies of protein force fields and water models.

The transactivation domain of the human tumor suppressor p53 (p53TAD) [43-45] and α-Synuclein [46], containing 73 and 140 amino-acid residues respectively, are well-studied IDPs with accurate chemical shift data available. At a smaller scale, structure propensities of amino-acid residues in tripeptides have been extensively studied by various methods with available J-coupling constants of various types [47, 48]. In this chapter, we select the fragments from p53TAD and α-Synuclein sequences as well as the tripeptides to avoid insufficient sampling caused by the high intrinsic flexibilities of IDPs. The combined accuracies of protein force fields and water models are assessed by the difference between the experimental NMR parameters, namely J-coupling constants and chemical shifts, and the back-calculated ones.

2.2 Method

2.2.1 MD simulation

Eight tripeptides with sequence GXG (X = A, E, F, K, L, M, S, and V) were selected to represent the 20 amino acids and minimize nearest neighbor effects. Their structures were prepared using the LEaP program in AmberTools14 with the Amber14 force field. The initial structures of overlapping heptapeptides from α-Synuclein and p53TAD (P72R mutant with exogenous tag SNA) sequences were prepared in the same manner with termini properly capped by acetyl and N-methyl amide groups.

The simulations were performed using the GROMACS 5.0 package [49]. The integration step was set to 2 fs, with all bond lengths involving hydrogen atoms constrained by the LINCS algorithm. Na+ ions were added to neutralize the total charge of the system. A cutoff of 10 Å was used for all van der Waals and electrostatic interactions. Particle-Mesh Ewald with a grid spacing of 1.2 Å was used to calculate long-range electrostatic interactions. A cubic simulation box that extended 8 Å from the protein surface was used, and periodic boundary conditions were applied in all three dimensions. Energy minimization was done using the steepest descent algorithm for 50,000 steps. Then, the system was simulated for 100 ps at a constant temperature of 298 K and constant volume with all protein heavy atoms fixed. Next, the pressure was coupled at 1 atm and the system was simulated for another 100 ps. The final production run was performed in the NPT ensemble at 298 K and 1 atm. A Berendsen thermostat with a damping constant of 0.1 ps was used to keep the temperature of the system at 298 K and a Parrinello-Rahman barostat with a relaxation time of 2.0 ps was used to control the pressure at the target value of 1 atm [50].

1-µs MD simulations of the tripeptides were performed using combinations of protein force fields, namely, Amber14, CHARMM36m, Amber99SBnmr1-ILDN, and RSFF2+, and water models, TIP3P and TIP4P-D. Similarly, 69 IDP fragments were simulated for 500 ns each using force fields, Amber99SBnmr1-ILDN and RSFF2+, and water models, TIP3P and TIP4P-D.

2.2.2 Back-calculation of J-coupling constants

J-coupling constants, 3J (HN,Hα), 3J (HN,C'), 3J (Hα,C'), 3J (HN,Hβ), 1J (N,Cα), and 2J (N,Cα) for the middle residues of the tripeptides, to avoid terminal effect, were back-calculated as the ensemble averages using the suitable Karplus coefficients [51-53]. In addition, cumulatively time-averaged J-coupling constants were calculated to observe convergence.
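The cumulative time average used to monitor convergence can be sketched as below. The function name is hypothetical, and the input is assumed to be a per-snapshot series of J values that has already been computed from the trajectory.

```python
import numpy as np

def cumulative_average(values):
    """Running mean: element i is the average of values[0..i]."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, len(values) + 1)

# Toy per-snapshot J values (Hz), for illustration only.
running = cumulative_average([1.0, 2.0, 3.0, 4.0])
```

Plotting such a running average against simulation time shows whether the back-calculated value has levelled off; the last element equals the ordinary ensemble average over the whole trajectory.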

Special attention was paid to long-time samplings of positive φ angles as well as the frequencies of conformational jumps between α and β region in Ramachandran space according to the conventional definition [54].
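The two quantities monitored here, the fraction of snapshots with positive φ and the number of jumps between the α and β regions, can be sketched as follows. The rectangular region boundaries in this example are an assumed, simplified definition used only for illustration; they are not necessarily the conventional definition of reference [54], and the function names are hypothetical.

```python
def region(phi, psi):
    """Classify a (phi, psi) pair in degrees; boundaries are illustrative assumptions."""
    if -160.0 <= phi < -20.0 and -120.0 <= psi < 50.0:
        return "alpha"
    if -180.0 <= phi < -20.0 and (psi >= 50.0 or psi < -120.0):
        return "beta"
    return None  # any other part of Ramachandran space

def count_jumps(phis, psis):
    """Count alpha <-> beta transitions, ignoring snapshots outside both regions."""
    jumps = 0
    last = None
    for phi, psi in zip(phis, psis):
        current = region(phi, psi)
        if current is None:
            continue
        if last is not None and current != last:
            jumps += 1
        last = current
    return jumps

def positive_phi_fraction(phis):
    """Fraction of snapshots with phi > 0."""
    return sum(1 for phi in phis if phi > 0.0) / len(phis)

# Toy trajectory: alpha, alpha, beta, beta, positive phi, alpha.
phis = [-60.0, -60.0, -120.0, -120.0, 60.0, -60.0]
psis = [-45.0, -45.0, 140.0, 140.0, 30.0, -45.0]
n_jumps = count_jumps(phis, psis)
frac_positive = positive_phi_fraction(phis)
```

With a real trajectory, the dihedral angles would be extracted per snapshot by an analysis tool, and the region boundaries would be replaced by the definition actually adopted.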

RMSD errors were calculated compared with experimental data [48] for a group of J-coupling constants that depend on φ and the other group that depends on ψ to evaluate the accuracies of force fields. Then, an overall RMSD error was calculated with the two groups equally weighted to evaluate the overall performance.
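A minimal sketch of the two-step evaluation is given below. The combination rule, the root mean square of the two group RMSDs, is an assumption; it is chosen because it reproduces the reported values (for example, 0.5416 Hz and 0.4098 Hz for Amber14/TIP3P combine to 0.4802 Hz, as listed in Tables 2.1-2.3). The function names are hypothetical.

```python
import math

def rmsd(calculated, experimental):
    """Root-mean-square deviation between two equally long sequences."""
    pairs = list(zip(calculated, experimental))
    return math.sqrt(sum((c - e) ** 2 for c, e in pairs) / len(pairs))

def equally_weighted(rmsd_phi, rmsd_psi):
    """Combine the phi- and psi-dependent group RMSDs with equal weight (assumed rule)."""
    return math.sqrt(0.5 * (rmsd_phi ** 2 + rmsd_psi ** 2))

overall_amber14_tip3p = equally_weighted(0.5416, 0.4098)
overall_nmr1_tip4pd = equally_weighted(0.4974, 0.3995)
```

Weighting the two groups equally prevents the group with more J-coupling types from dominating the overall score.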

2.2.3 Back-calculation of chemical shifts

Chemical shifts of Cα, Cβ, CO, HN, and N of the middle 3 residues in the IDP fragments of p53TAD and α-Synuclein were back-calculated from MD trajectories using chemical shift predictor PPM [10]. Overall RMSDs between all calculated chemical shifts and experimental values were calculated, except for the CO chemical shifts in p53TAD, in which a notable offset between the experimental values and calculated ones indicated referencing error in the experimental data. For overall evaluation of chemical shift prediction for different spins, proton chemical shifts were multiplied by 10 and nitrogen chemical shifts were divided by 2, to account for the different prediction accuracies of different types of nuclei.
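The nucleus-dependent scaling can be sketched as follows. The record layout and names are hypothetical, and applying the factors to the calculated-minus-experimental differences (which is equivalent to scaling both shifts) with carbon shifts left unscaled is an assumption made for this illustration.

```python
import math

# Scaling factors: proton x10, nitrogen /2, carbon unscaled (assumed).
SCALE = {"H": 10.0, "N": 0.5, "C": 1.0}

def scaled_rmsd(records):
    """records: iterable of (nucleus, calculated_ppm, experimental_ppm)."""
    squared = [(SCALE[nucleus] * (calc - expt)) ** 2 for nucleus, calc, expt in records]
    return math.sqrt(sum(squared) / len(squared))

# Toy data: each record contributes a scaled error of 1 ppm.
toy = [("H", 8.3, 8.2), ("N", 122.0, 120.0), ("C", 56.0, 55.0)]
toy_rmsd = scaled_rmsd(toy)
```

With these factors, an error of 0.1 ppm for a proton counts the same as 1 ppm for a carbon or 2 ppm for a nitrogen.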

To evaluate the effectiveness of MD simulations and computational chemical shift predictors, a hypothetical chemical shift predictor using random coil chemical shifts was formulated. For each residue, the predictor would give the random coil chemical shift for the residue type it belongs to [55]. The RMSDs for the prediction based on MD simulation and the one using random coil chemical shifts with respect to experimental data were then compared on a residue-specific basis. To avoid referencing error, only chemical shifts from α-Synuclein were adopted. Also, only residues that had at least 5 chemical shifts available were considered for statistical significance.
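The comparison can be sketched as a count of groups in which the MD-based prediction has the lower RMSD. The data layout (groups keyed by residue type and nucleus, each holding MD-predicted, random coil, and experimental shifts) and all names are assumptions made for illustration; the numbers are toy values, not data from this work.

```python
import math

def rmsd(pairs):
    """RMSD over (predicted, experimental) pairs."""
    return math.sqrt(sum((p - e) ** 2 for p, e in pairs) / len(pairs))

def fraction_md_better(groups, min_count=5):
    """groups: {label: [(md_predicted, random_coil, experimental), ...]} in ppm.

    Returns the fraction of sufficiently populated groups in which the
    MD-based prediction has a smaller RMSD than the random coil value.
    """
    wins = 0
    total = 0
    for rows in groups.values():
        if len(rows) < min_count:
            continue  # too few chemical shifts for a meaningful comparison
        md_rmsd = rmsd([(md, expt) for md, _, expt in rows])
        rc_rmsd = rmsd([(rc, expt) for _, rc, expt in rows])
        total += 1
        if md_rmsd < rc_rmsd:
            wins += 1
    return wins / total if total else float("nan")

toy_groups = {
    ("ALA", "CA"): [(52.4, 52.8, 52.5)] * 5,    # MD prediction closer
    ("GLY", "N"): [(110.5, 109.1, 109.0)] * 5,  # random coil closer
    ("VAL", "HN"): [(8.1, 8.2, 8.1)] * 2,       # skipped: fewer than 5 entries
}
toy_fraction = fraction_md_better(toy_groups)
```

A fraction near one half indicates that the simulation-based prediction and the random coil predictor perform comparably.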

2.3 Result and Discussion

2.3.1 RMSDs of back-calculated J-coupling constants

For J-coupling constants that depend on φ, the use of the TIP4P-D water model slightly increased the RMSDs for all protein force fields (Table 2.1). In the ψ dependent group, however, all RMSDs except for the one associated with CHARMM36m decreased when TIP4P-D was used. The most significant improvement was shown in the case of Amber99SBnmr1-ILDN, where the RMSD was decreased by about 12% (Table 2.2). According to the overall RMSD with J-coupling constants depending on φ and ψ equally weighted, all protein force fields, except for CHARMM36m, benefited from the use of TIP4P-D (Table 2.3).

Overall, Amber99SBnmr1-ILDN/TIP4P-D yielded the lowest RMSD, and RSFF2+/TIP4P-D ranked second among the TIP4P-D combinations (Table 2.3). RSFF2+/TIP3P and Amber14/TIP4P-D performed the best in back-calculating J-coupling constants that depend on φ and ψ, respectively. Nonetheless, the performance difference among different force field combinations was small. The best combination was less than 10% better than the worst. When broken into 2 different dihedral-angle dependent groups, the differences were slightly higher, 15% for the φ dependent group and 26% for the ψ dependent group.
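The quoted percentages follow from the tabulated RMSDs when the spread is measured relative to the worst combination, which can be checked with a short sketch. The numbers are copied from Tables 2.1-2.3; the function and variable names are hypothetical.

```python
# RMSD values in Hz from Tables 2.1-2.3; each pair is (TIP3P, TIP4P-D).
phi_group = {
    "Amber14": (0.5416, 0.5689),
    "CHARMM36m": (0.5267, 0.5620),
    "Amber99SBnmr1-ILDN": (0.4932, 0.4974),
    "RSFF2+": (0.4844, 0.5105),
}
psi_group = {
    "Amber14": (0.4098, 0.3706),
    "CHARMM36m": (0.4044, 0.4054),
    "Amber99SBnmr1-ILDN": (0.4532, 0.3995),
    "RSFF2+": (0.4991, 0.4423),
}
overall = {
    "Amber14": (0.4802, 0.4801),
    "CHARMM36m": (0.4695, 0.4900),
    "Amber99SBnmr1-ILDN": (0.4736, 0.4511),
    "RSFF2+": (0.4918, 0.4776),
}

def spread_percent(table):
    """Percentage by which the best combination beats the worst, relative to the worst."""
    values = [v for pair in table.values() for v in pair]
    best, worst = min(values), max(values)
    return 100.0 * (worst - best) / worst

spread_overall = spread_percent(overall)
spread_phi = spread_percent(phi_group)
spread_psi = spread_percent(psi_group)
```

Rounded to whole percentages, the three spreads correspond to the "less than 10%", 15%, and 26% figures given in the text.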

Little correlation between a force field and the water model the force field had been optimized with was observed. All protein force fields were developed using TIP3P water model, including Amber99SBnmr1-ILDN which benefited the most from TIP4P-D. It was then demonstrated that one can improve the performance of a force field by using a better water model, and vice versa.

J-coupling constant RMSD (Hz)    TIP3P     TIP4P-D
Amber14                          0.5416    0.5689
CHARMM36m                        0.5267    0.5620
Amber99SBnmr1-ILDN               0.4932    0.4974
RSFF2+                           0.4844    0.5105

Table 2.1 RMSD of φ dependent J-coupling constants using different force field and water model combinations. The use of TIP4P-D slightly increased RMSDs for all protein force fields. RSFF2+/TIP3P yielded the smallest RMSD.

J-coupling constant RMSD (Hz)    TIP3P     TIP4P-D
Amber14                          0.4098    0.3706
CHARMM36m                        0.4044    0.4054
Amber99SBnmr1-ILDN               0.4532    0.3995
RSFF2+                           0.4991    0.4423

Table 2.2 RMSD of ψ dependent J-coupling constants using different force field and water model combinations. Except for CHARMM36m, the RMSDs decreased when TIP4P-D was used, with the largest decrease of 12% for Amber99SBnmr1-ILDN. Amber14/TIP4P-D yielded the smallest RMSD.

J-coupling constant RMSD (Hz)    TIP3P     TIP4P-D
Amber14                          0.4802    0.4801
CHARMM36m                        0.4695    0.4900
Amber99SBnmr1-ILDN               0.4736    0.4511
RSFF2+                           0.4918    0.4776

Table 2.3 φ and ψ equally weighted RMSD of J-coupling constants using different force field and water model combinations. Amber99SBnmr1-ILDN/TIP4P-D yielded the smallest RMSD while RSFF2+/TIP4P-D ranked second among the TIP4P-D combinations. The difference between different force fields was, however, subtle. The best combination was less than 10% better than the worst.

2.3.2 Correlation coefficients of back-calculated J-coupling constants

Except for 3J (HN,Hα) and 3J (HN,Hβ) coupling constants, all the correlation coefficients (Rs) were low (Figure 2.1). In this small selection of tripeptides where J-coupling constants are within relatively small ranges, the low correlations between experimental and back-calculated J-coupling constants could result from the inability of force fields to discriminate the dihedral angle distributions in different amino acids. It was also noticed that the back-calculation overestimated 3J (HN,Hα), 3J (HN,C'), and 2J (N,Cα) coupling constants while underestimating 3J (Hα,C'), 3J (HN,Hβ), and 1J (N,Cα) coupling constants.
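The sensitivity of the correlation coefficient to the range of the data can be illustrated with a short sketch. The numbers are toy values, not data from this work: the same ±0.2 Hz errors are applied to a narrow and to a wide set of hypothetical experimental couplings.

```python
import numpy as np

def pearson_r(calculated, experimental):
    """Pearson correlation coefficient between two sequences."""
    return float(np.corrcoef(calculated, experimental)[0, 1])

errors = np.array([0.2, -0.2, 0.2, -0.2])  # identical errors in both cases (Hz)

narrow_expt = np.array([6.0, 6.1, 6.2, 6.3])  # values spanning only 0.3 Hz
wide_expt = np.array([2.0, 4.0, 6.0, 8.0])    # values spanning 6 Hz

r_narrow = pearson_r(narrow_expt + errors, narrow_expt)
r_wide = pearson_r(wide_expt + errors, wide_expt)
```

The errors are identical in both cases, yet the correlation is close to 1 for the wide range and poor for the narrow one. A low R for couplings that vary little between residues therefore reflects a failure to resolve small residue-to-residue differences rather than a large absolute error.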

Figure 2.1 Computational and experimental J-coupling constants. 3J (HN,Hα), 3J (HN,C'), 3J (Hα,C'), 3J (HN,Hβ), 1J (N,Cα), and 2J (N,Cα) coupling constants from available experimental data are plotted against computed ones. Scarlet bars indicate measurement errors. Correlation coefficients, shown on the plots, are low except for 3J (HN,Hα) and 3J (HN,Hβ). The back-calculation overestimated 3J (HN,Hα), 3J (HN,C'), and 2J (N,Cα) coupling constants while underestimating 3J (Hα,C'), 3J (HN,Hβ), and 1J (N,Cα) coupling constants.

2.3.3 RMSDs of back-calculated chemical shifts

The average RMSD of back-calculated chemical shifts compared with experimental values was 1-2 ppm (Table 2.4). When the TIP4P-D water model was used, the RMSDs decreased significantly. Amber99SBnmr1-ILDN/TIP4P-D yielded the smallest RMSD, 42% lower than that of Amber99SBnmr1-ILDN/TIP3P, and RSFF2+/TIP4P-D ranked second. This result was consistent with the evaluation based on J-coupling constants of tripeptides but more conclusive. Given that the Amber99SBnmr1-ILDN force field was optimized against experimental chemical shift data from globular proteins, the fact that the RSFF2+ force field reached comparable accuracy affirmed the effectiveness of the residue-specific approach to force field development.

RMSD of δ (ppm)        TIP3P     TIP4P-D
Amber99SBnmr1-ILDN     1.8122    1.0561
RSFF2+                 1.6971    1.1611

Table 2.4 RMSD of chemical shifts of p53TAD (excluding CO data) and α-Synuclein using different force field and water model combinations. Amber99SBnmr1-ILDN/TIP4P-D yielded the smallest RMSD while RSFF2+/TIP4P-D yielded comparable results.
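For reference, the RMSD in Table 2.4 is the root-mean-square deviation between back-calculated and experimental chemical shifts. A minimal sketch of that calculation is given below, together with the relative improvement implied by the tabulated values. The function names and the three example shifts are hypothetical and serve only as an illustration.

```python
import math


def rmsd(calculated, experimental):
    """Root-mean-square deviation between two equally long lists of shifts."""
    if len(calculated) != len(experimental):
        raise ValueError("both lists must have the same length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(reference_rmsd, new_rmsd):
    """Fractional reduction of new_rmsd with respect to reference_rmsd."""
    return (reference_rmsd - new_rmsd) / reference_rmsd


# Hypothetical C-alpha shifts in ppm, for illustration only (not thesis data).
calc_shifts = [56.2, 52.9, 61.0]
expt_shifts = [55.7, 52.5, 62.1]

if __name__ == "__main__":
    print(f"example RMSD = {rmsd(calc_shifts, expt_shifts):.3f} ppm")
    # Values from Table 2.4: Amber99SBnmr1-ILDN with TIP3P vs. TIP4P-D.
    print(f"improvement  = {100 * relative_improvement(1.8122, 1.0561):.0f} %")
```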

The overall correlation between the calculated and experimental chemical shifts was encouraging (Figure 2.2). All correlation coefficients were larger than 0.85, except for the amide proton chemical shifts, for which the coefficient was 0.57. Several factors contribute to the generally low correlation for HN chemical shifts. First, the ranges of carbon and nitrogen chemical shifts are much larger than that of proton chemical shifts, so the differences in carbon and nitrogen chemical shifts between amino-acid residues are more prominent. Second, the positions of hydrogen atoms cannot be detected by X-ray crystallography; force field parameters involving hydrogen atoms are therefore usually not well validated due to the lack of experimental data. Finally, a reference error was observed as an offset in the plot for N chemical shifts.

Figure 2.2 Computational and experimental chemical shifts in α-Synuclein. Chemical shifts predicted by PPM from a 500-ns MD trajectory generated with Amber99SBnmr1-ILDN/TIP4P-D show overall good correlation with experimental values. The amide proton chemical shifts show weaker correlation due to imperfect force fields, and a reference error is observed in the plot for N chemical shifts.

2.3.4 Comparison with random coil approach

The comparison between the RMSDs of the chemical shift back-calculation and those of the experimental chemical shift predictor indicated that the back-calculation had an accuracy comparable to that of the random coil approach (Figure 2.3). Only carbon chemical shifts were used because of the force fields' imperfection for protons and the reference error for N chemical shifts. When the TIP3P water model was used, the random coil approach predicted chemical shifts better than the combination of MD simulations and PPM. The use of the TIP4P-D water model increased the fraction of cases in which the computational chemical shift predictor had a smaller RMSD than the random coil predictor, to 54.6% using Amber99SBnmr1-ILDN/TIP4P-D or RSFF2+/TIP4P-D.

Figure 2.3 Fraction of back-calculated chemical shift RMSDs smaller than those of the experimental predictor. When the TIP3P water model was used (scarlet), the random coil approach performed better than simulation and PPM combined. When TIP4P-D was used (grey), 12 out of 22 predictions, grouped by types of residues and chemical shifts, using Amber99SBnmr1-ILDN/TIP4P-D or RSFF2+/TIP4P-D were better than those using random coil chemical shifts, which indicated that both approaches were comparable.
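The quantity plotted in Figure 2.3 is a simple win fraction over groups of predictions. One way it could be computed is sketched below, assuming that each group (a residue type paired with a chemical shift type) has one RMSD from the MD/PPM route and one from the random coil predictor. The group labels and RMSD values are hypothetical.

```python
def fraction_better(md_rmsd, reference_rmsd):
    """Fraction of shared groups where the MD-based RMSD is the smaller one."""
    groups = md_rmsd.keys() & reference_rmsd.keys()
    if not groups:
        raise ValueError("no groups in common")
    wins = sum(1 for g in groups if md_rmsd[g] < reference_rmsd[g])
    return wins / len(groups)


# Hypothetical per-group RMSDs in ppm, for illustration only (not thesis data).
md_ppm = {"ALA-CA": 0.8, "ALA-CB": 1.2, "GLY-CA": 0.5}
random_coil = {"ALA-CA": 1.0, "ALA-CB": 1.0, "GLY-CA": 0.9}

if __name__ == "__main__":
    print(f"fraction better = {fraction_better(md_ppm, random_coil):.3f}")
```

With 12 wins out of 22 groups, as reported in the caption, the same function returns a value of roughly 0.55.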

Despite the small database of experimental chemical shifts, it was revealing that the combined accuracy of the force fields and the chemical shift predictor was improved by the use of TIP4P-D water model. For further improvement of back-calculation, PPM and force fields need to be assessed separately.

2.4 Conclusion

Accurate protein force fields and water models are essential to the study of the dynamics of IDPs. By comparing with experimental NMR data, optimal choices were determined. Amber99SBnmr1-ILDN/TIP4P-D and RSFF2+/TIP4P-D yielded the best results in back-calculating J-coupling constants and chemical shifts. Amber99SBnmr1-ILDN, optimized against experimental chemical shifts of globular proteins, worked best in J-coupling constant back-calculation. The effectiveness of the residue-specific approach to force field development was demonstrated in the chemical shift prediction using the RSFF2+ force field.

It has also been shown that the use of TIP4P-D water model provided more accurate predictions of J-coupling constants and chemical shifts by producing an improved dihedral angle ψ distribution.

While the performance of force fields was not independent of that of water models, little correlation was observed between a protein force field and the water model it had been optimized with. Amber99SBnmr1-ILDN was developed for TIP3P water model but benefited the most from TIP4P-D. The development of protein force fields and water models can be made independently while the collective performance should be constantly assessed.

Different experimental data reveal different aspects of molecular mechanics force fields. Low correlation coefficients between back-calculated and experimental J-coupling constants indicate the inability of force fields to differentiate dihedral angle distributions in different amino acids. The comparison with the experimental chemical shift predictor raises questions about the effectiveness of MD simulations. However, the prediction of chemical shifts requires another package, whose accuracy complicates the assessment of force fields. To better reveal the deficiencies in force fields, separate evaluation of the chemical shift predictor and the MD force fields is required. For example, a comparison between the results of MD trajectory-based PPM and sequence-based PPM_One may indicate the effectiveness of MD simulations in the back-calculation of chemical shifts, since both packages supposedly adopt similar algorithms.

For future research, replica-exchange MD may be used to ensure sufficient sampling of peptide configurations and the system of interest may thereby be expanded to the entire sequences of IDPs to reach a higher level of complexity. Also, more recent force fields may be included for assessment. GROMOS 54A7 [56], for example, has been optimized for amino-acid hydration energetics and thus is expected to better describe the unfolded states.

Chapter 3: MD Simulation of the Two-helix Bundle Domain in the Na+/Ca2+ Exchanger

3.1 Introduction

The Na+/Ca2+ exchanger (NCX) is a membrane protein that exists ubiquitously among mammalian species and plays a key role in maintaining cellular Ca2+ homeostasis [57-60]. Alterations in NCX protein expression or regulation are associated with altered calcium homeostasis in pathophysiological disorders and diseases, such as stroke and heart failure [61, 62].

In eukaryotic cells, the regulation of the intracellular Ca2+ homeostasis involves a transmembrane domain as well as an intracellular loop, termed f-loop. The Ca2+-binding domain (CBD12) in the f-loop senses the cellular Ca2+ concentration and the transmembrane domain transports one Ca2+ ion to the opposite side of the membrane in exchange for three Na+ ions [63]. However, how the conformational changes of CBD12 in the center of f-loop communicate to the transmembrane domain that is tens of Å away is largely unknown [64, 65].

The two-helix bundle (THB) domain found within the cytosolic region of the canine NCX1 protein has recently been structurally determined by NMR experiments as the only structured domain in the f-loop besides CBD12. The experimentally characterized structure of the THB domain consists of 2 α-helices (residues 284-301 and residues 307-322) and a linker (residues 302-306). In addition, 2 loops (residues 271-283 and residues 323-327) flank the ordered region [66].

Since the conformational change of the THB domain upon Ca2+ binding to CBD12 is distinct from the rest of the disordered loop, the study of the structural stability of the THB domain is essential to understanding eukaryotic NCX proteins.

As a continuation of our previous force field assessment, it is crucial for MD force fields to generate realistic structural ensembles for both disordered and ordered proteins. Despite the success of the TIP4P-D water model with the former, there is evidence that it may destabilize folded proteins in reversible folding, possibly by disfavoring α-helices [31, 40]. The experimentally determined THB domain flanked by disordered loops, with its mixed ordered and disordered features, serves as a model system to test the performance of force fields in terms of the stability of folded proteins and the dynamics of their disordered regions.

In this chapter, we confirmed the stability of the THB domain and explored the potential ubiquitous regulatory mechanism via folding and unfolding using MD simulations. Also, the Amber99SBnmr1-ILDN/TIP4P-D force field and water model combination was shown to provide reasonable stability for the folded protein.

Figure 3.1 Schematic representation of eukaryotic NCX structure (from Jiaqi Yuan, et al., 2018). The updated NCX model consists of a transmembrane domain TM (residues 1-218 and 727-903) and an intracellular f-loop (residues 219-726). The newly characterized two-helix bundle (THB) domain (residues 284-322) is the only structured domain in the f-loop besides the Ca2+ binding domain CBD12 (residues 371-650). The exchanger inhibitory peptide XIP (residues 219-238), cytoplasmic segment 1 Cyto1 (residues 219-370), and cytoplasmic segment 2 Cyto 2 (residues 651-726) are other domains and regions in the intracellular loop.

3.2 Method

The simulation was performed for 500 ns using GROMACS 5.0 package [49] following the same protocol described in Chapter 2. Amber99SBnmr1-ILDN [35] was used as the force field, and water molecules were explicitly included using TIP4P-D water model [31].

The RMSD of the Cα atoms (residues 284-322) of each snapshot was calculated relative to the initial NMR structure with the lowest energy. The inter-helical distances were estimated from the Cα coordinates of two pairs of residues (residues 295 and 310, and residues 295 and 314). The angle between the two α-helices was calculated as the angle between the average N-H bond vectors of residues 288-301 and 311-322. Because hydrogen bonds form between residues i and i + 4, the N-H bonds represent the direction of the α-helices.
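The geometric measures described above can be illustrated with a short, dependency-free sketch. It assumes that the N-H bond vectors of each helix have already been extracted from a snapshot as 3D tuples and that the raw vectors are averaged before normalisation. The function names are hypothetical, and the scripts actually used for the thesis may differ.

```python
import math


def mean_unit_vector(vectors):
    """Average a list of 3D vectors and normalise the result to unit length."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    norm = math.sqrt(sx * sx + sy * sy + sz * sz)
    if norm == 0.0:
        raise ValueError("vectors cancel out; direction is undefined")
    return (sx / norm, sy / norm, sz / norm)


def interhelical_angle(nh_helix1, nh_helix2):
    """Angle in degrees between the mean N-H directions of two helices."""
    a = mean_unit_vector(nh_helix1)
    b = mean_unit_vector(nh_helix2)
    cos_theta = sum(p * q for p, q in zip(a, b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def ca_distance(ca_i, ca_j):
    """Euclidean distance between two C-alpha positions."""
    return math.dist(ca_i, ca_j)
```

Applied to every snapshot, `interhelical_angle` and `ca_distance` would give time series of the kind analysed in the following section.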

3.3 Results and Discussion

3.3.1 Stability of the NMR Structure

The average RMSD is 2.35 Å with a standard deviation of 0.32 Å throughout the simulation (Figure 3.2 A). This RMSD profile indicates good convergence of the trajectory. The small Cα RMSD, in turn, reflects good overall similarity between the experimental structure and the computational ensemble.

Typically, there are two types of motion in a two-helix bundle, one along the coordinate of the distance between the two helices and the other changing the inter-helical angle. The mean distance between residues 295 and 310 is 6.68 Å with a standard deviation of 0.35 Å, and the mean distance between residues 295 and 314 is 5.51 Å with a standard deviation of 0.33 Å (Figure 3.2 B). All three residues are in the middle of the helical regions, so they are not affected by end effects. Residue 295 is located on one α-helix, and residues 310 and 314 on the other. Along the shortest inter-helical distance, residue 295 sits between residues 310 and 314. Therefore, the combined pairwise distance profile reveals the appropriate inter-helical distance. The standard deviations of the distances are comparable to the standard deviation of the Cα RMSDs.

3.3.2 Inter-helical Angles Are Consistent with the Experimental Values

As for the alternative motion of the THB domain, the average inter-helical angle from the MD simulation is 125.7° with a standard deviation of 6.0° (Figure 3.3 A and B). Although the average angle from the MD trajectory is systematically lower than the average angle between the helices (134.7°) calculated from the 20 NMR structures of lowest energy, the stability of the THB domain is demonstrated. This result is also consistent with the fact that the two pairwise distances are not correlated in any discernible way, which also indicates the small fluctuation of the inter-helical angle.

The inter-helical contacts of the hydrophobic residues M291, I294, L295, L298, L310, I311, A314, and V318, which were determined experimentally, are preserved in the superposition of the MD simulation structures taken at 100 ns, 200 ns, 300 ns, 400 ns, and 500 ns (Figure 3.3 C). The tight bundle reflects the good overall stability of the domain during the MD simulation.

Figure 3.2 MD simulation of THB domain. A. Cα RMSDs calculated from snapshots of the MD trajectory relative to the experimental structure, reflecting good stability of the THB domain over the length of the trajectory (500 ns). B. Distances between residue 295 and residue 310 (scarlet) and between residue 295 and residue 314 (grey) calculated from snapshots of the MD trajectory, revealing good stability of the THB domain.

Figure 3.3 MD simulation of THB domain. A. Angles between the two helices calculated from MD snapshots. The gray line corresponds to the average angle between helices (134.7°) calculated from the 20 NMR structures of lowest energy. B. Histogram of the inter-helical angle distribution during the MD simulation, which has a standard deviation of 6.0°. C. Superposition of THB domain structures taken from the MD simulation at 100 ns, 200 ns, 300 ns, 400 ns, and 500 ns. Hydrophobic residues M291, I294, L295, L298, L310, I311, A314, and V318 involved in inter-helical contacts are indicated in gray. The tight bundle, consistent with the experiment, reflects the good stability of the domain during the MD simulation.

3.3.3 Dynamics of the THB Domain

The relative standard deviations of all 4 measures of motion, namely, the overall RMSD of the THB domain, the distance between residues 295 and 310, the distance between residues 295 and 314, and the inter-helical angle, reflect the relative amplitudes of the motions (Table 3.1). The overall RMSD of the ordered region, which comprises 2 helices and a linker, exhibits a fluctuation comparable to those of the pairwise distances and the inter-helical angle. Since the latter only involve the motion of the helices, it can be speculated that the linker that coordinates the motions of the helix bundle is relatively rigid, which is consistent with the NMR relaxation measurements.

The THB domain displays sufficient stability over the length of the MD simulation (500 ns), indicating that any unfolding process of the THB domain takes place on a much slower timescale.

                               inter-helical    distance               distance               overall
                               angle            residue 295 and 310    residue 295 and 314    RMSD
relative standard deviation    0.048            0.052                  0.060                  0.135

Table 3.1 Relative standard deviations of different measures of motion. The relative standard deviations of all 4 measures of motion, namely, the overall RMSD of the ordered region of the THB domain, the distance between residues 295 and 310, the distance between residues 295 and 314, and the inter-helical angle, were calculated by dividing the standard deviation by the mean. The relative standard deviation of the overall RMSD is about 2 times larger than the rest, which demonstrates the rigidity of the linker.
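As a check on Table 3.1, the relative standard deviations can be recomputed from the means and standard deviations quoted in the text. The sketch below uses those rounded values, so small differences in the last digit with respect to the table are expected. The variable and function names are illustrative.

```python
# (mean, standard deviation) pairs as quoted in the text of this chapter.
MEASURES = {
    "inter-helical angle (deg)": (125.7, 6.0),
    "distance 295-310 (A)": (6.68, 0.35),
    "distance 295-314 (A)": (5.51, 0.33),
    "overall Ca RMSD (A)": (2.35, 0.32),
}


def relative_sd(mean: float, sd: float) -> float:
    """Relative standard deviation: standard deviation divided by the mean."""
    return sd / mean


if __name__ == "__main__":
    for name, (mean, sd) in MEASURES.items():
        print(f"{name:28s} {relative_sd(mean, sd):.3f}")
```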

3.4 Conclusion

To assess the stability of the NMR structure of the THB domain, an MD simulation was performed for a total length of 500 ns with the Amber99SBnmr1-ILDN force field in TIP4P-D explicit water, starting from the NMR structure of the lowest energy. The backbone Cα RMSD of individual snapshots relative to the NMR structure shows stable behavior, fluctuating around 2 Å, with an average inter-helical angle of 125.7°, which is close to that of the NMR structures. These results confirm the stability of the NMR structure of the THB domain.

The stability of the simulated THB structure also suggests the transferable performance of the Amber99SBnmr1-ILDN force field in combination with the TIP4P-D water model. Further tests may be performed on intact proteins, such as ubiquitin and the third IgG-binding domain of Protein G.

Once the performance of force fields on ordered and disordered proteins is validated, MD simulations may be applied to larger parts of NCX to investigate the underlying mechanism of Ca2+ sensing.

Chapter 4: Conclusion and Outlook

Various combinations of protein force fields and water models have been assessed based on their performance in the back-calculation of NMR parameters for direct comparison with experiment. The TIP4P-D water model was shown to improve the accuracy of the back-calculation by producing more realistic dihedral angle ψ distributions. The Amber99SBnmr1-ILDN and RSFF2+ force fields have been shown to be optimal choices for tripeptide and IDP fragment simulations.

The combined accuracy of MD simulations using those force fields and chemical shift predictors is comparable to that of average chemical shifts from the random coil approach for a given residue type. Despite the great contributions to the study of structural propensities of IDPs, there are still many aspects where molecular mechanics force fields can be improved. For example, force fields are generally unable to differentiate dihedral angle distributions in a residue-specific manner.

For the future improvement of force fields, we would start with an optimal combination, Amber99SBnmr1-ILDN/TIP4P-D, and rebalance the populations of the αR conformation and the others in the Ramachandran space. First, the feasibility of such a modification will be assessed on well-sampled IDP fragments with accurately measured NMR parameters available. χ2 landscapes concerning the difference between back-calculated NMR parameters and the experimental values will be plotted against the αR population. The change in the simulated population of the αR conformation, either an increase or a decrease, that would lower the χ2 value will then be determined, for example in the case of 1J (N,Cα) and 2J (N,Cα), which depend only on the dihedral angle ψ distributions. This type of evaluation will be applied to the same residues across the sequence to explore the possibility of force field modification in a residue-specific approach, and to different residue types to seek greater uniformity.
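One possible way to build such a χ2 landscape is sketched below. It rests on assumptions that go beyond the text: each frame is classified as αR or not, a single back-calculated observable is available per frame, and the ensemble is reweighted so that the two groups keep their internal averages while the αR population p is varied. The function and parameter names are hypothetical.

```python
def chi2_vs_alpha_population(in_alpha, calc_per_frame, expt, sigma, populations):
    """Return (p, chi2) pairs for trial alphaR populations p.

    in_alpha       -- per-frame booleans, True if the frame is in alphaR
    calc_per_frame -- per-frame back-calculated observable (e.g. a J-coupling)
    expt, sigma    -- experimental value and its uncertainty
    populations    -- trial alphaR populations between 0 and 1
    """
    alpha = [c for flag, c in zip(in_alpha, calc_per_frame) if flag]
    other = [c for flag, c in zip(in_alpha, calc_per_frame) if not flag]
    if not alpha or not other:
        raise ValueError("both alphaR and non-alphaR frames are required")
    mean_alpha = sum(alpha) / len(alpha)
    mean_other = sum(other) / len(other)
    landscape = []
    for p in populations:
        # Reweighted ensemble average for an alphaR population of p.
        average = p * mean_alpha + (1.0 - p) * mean_other
        landscape.append((p, ((average - expt) / sigma) ** 2))
    return landscape


# Hypothetical four-frame example, for illustration only (not thesis data).
frames_in_alpha = [True, True, False, False]
j_per_frame = [10.0, 10.0, 12.0, 12.0]
scan = chi2_vs_alpha_population(frames_in_alpha, j_per_frame,
                                expt=11.5, sigma=0.5,
                                populations=[0.0, 0.25, 0.5, 1.0])
best_population, best_chi2 = min(scan, key=lambda item: item[1])
```

Comparing `best_population` with the simulated αR population then indicates whether an increase or a decrease would lower χ2.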

Second, should such a reweighting method be feasible, in the Ramachandran space may be changed using the average free energy change approximated by the change of certain population. Several IDP fragments from the test set may be selected to repeat the simulation with such implementation and the effectiveness of the modification may be indicated by the results in back-calculation.
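The link between a population change and a free energy change can be illustrated under a two-state Boltzmann assumption, in which the αR basin and all other conformations are treated as two states. This is a sketch of one common estimate, not necessarily the scheme intended here. The function name and the default temperature of 300 K are assumptions.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_shift(p_old, p_new, temperature=300.0):
    """Free energy shift (kJ/mol) of the alphaR state relative to the rest
    that moves its two-state population from p_old to p_new.

    A negative value stabilises alphaR (its population increases).
    """
    for p in (p_old, p_new):
        if not 0.0 < p < 1.0:
            raise ValueError("populations must lie strictly between 0 and 1")
    odds_old = p_old / (1.0 - p_old)
    odds_new = p_new / (1.0 - p_new)
    return -R_KJ_PER_MOL_K * temperature * math.log(odds_new / odds_old)


if __name__ == "__main__":
    # Hypothetical example: raise the alphaR population from 30% to 40%.
    print(f"{delta_g_shift(0.30, 0.40):+.2f} kJ/mol")
```

Averaging such shifts over residues would give one estimate of the average free energy change mentioned above.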

Third, stability tests will be performed on globular proteins and proteins with smaller folded fractions to ensure transferability. As reported in the literature, the TIP4P-D water model tends to destabilize mini proteins [40]. It is also important to monitor the effect on the accuracies of other back-calculated NMR parameters while modifying the force field. For example, the RMSDs of back-calculated 3J (HN,Hα) compared with experimental values should not increase remarkably while changes are being made.

With the structural propensities of each residue optimized, the new force field may be applied to study loop dynamics. GTPase K-Ras, for example, is a proto-oncogene with two major loops, namely switch I and switch II. Its oncogenic mutants share almost identical structures with the wildtype, yet they are able to remain in the active form, causing cell proliferation and eventually cancer. It has been speculated that switch I and switch II play an essential role in the enzymatic processes, but the regulatory mechanism has never been discovered. Using long MD simulations with balanced protein force fields and water models, we hope to decipher the conformational change of K-Ras based on increasingly realistic protein ensembles.

On the methodological side, the new force field can be used for the refinement of both experimental and in-silico generated protein structures and homology models. Current force fields are capable of generating globular structures that are very close to native structures. However, they are unable to capture precise details of protein loops and intrinsically disordered regions even with millisecond simulations. The carefully balanced new force field should allow us to explore such possibilities. Finally, the new force field can be used to generate physically realistic conformational ensembles that are generally underdetermined from experimental measurements alone.

Bibliography

1. Kremer, K., Computer simulations for macromolecular science. Macromolecular Chemistry and Physics, 2003. 204(2): p. 257-264.

2. Saiz, L., S. Bandyopadhyay, and M.L. Klein, Towards an understanding of complex biological membranes from atomistic molecular dynamics simulations. Bioscience Reports, 2002. 22(2): p. 151-173.

3. Norberg, J. and L. Nilsson, Molecular dynamics applied to nucleic acids. Accounts of Chemical Research, 2002. 35(6): p. 465-472.

4. Karplus, M. and J.A. McCammon, Molecular dynamics simulations of biomolecules. Nat Struct Biol, 2002. 9(9): p. 646-52.

5. Lindorff-Larsen, K., et al., How Fast-Folding Proteins Fold. Science, 2011. 334(6055): p. 517-520.

6. Karplus, M., Contact Electron-Spin Coupling of Nuclear Magnetic Moments. Journal of Chemical Physics, 1959. 30(1): p. 11-15.

7. Bruschweiler, R. and D.A. Case, Adding Harmonic Motion to the Karplus Relation for Spin-Spin Coupling. Journal of the American Chemical Society, 1994. 116(24): p. 11199-11200.

8. Vogeli, B., et al., Limits on variations in protein backbone dynamics from precise measurements of scalar couplings. J Am Chem Soc, 2007. 129(30): p. 9377-85.

9. Shen, Y. and A. Bax, SPARTA+: a modest improvement in empirical NMR chemical shift prediction by means of an artificial neural network. J Biomol NMR, 2010. 48(1): p. 13-22.

10. Li, D.W. and R. Bruschweiler, PPM: a side-chain and backbone chemical shift predictor for the assessment of protein conformational ensembles. Journal of Biomolecular Nmr, 2012. 54(3): p. 257-265.

11. Sanz-Hernandez, M. and A. De Simone, The PROSECCO server for chemical shift predictions in ordered and disordered proteins. J Biomol NMR, 2017. 69(3): p. 147-156.

12. Habchi, J., et al., Introducing Protein Intrinsic Disorder. Chemical Reviews, 2014. 114(13): p. 6561-6588.

13. Uversky, V.N., C.J. Oldfield, and A.K. Dunker, Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys, 2008. 37: p. 215-46.

14. van der Lee, R., et al., Classification of Intrinsically Disordered Regions and Proteins. Chemical Reviews, 2014. 114(13): p. 6589-6631.

15. Larion, M., et al., Kinetic Cooperativity in Human Pancreatic Glucokinase Originates from Millisecond Dynamics of the Small Domain. Angewandte Chemie-International Edition, 2015. 54(28): p. 8129-8132.

16. Surget, S., M.P. Khoury, and J.C. Bourdon, Uncovering the role of p53 splice variants in human malignancy: a clinical perspective. Onco Targets Ther, 2013. 7: p. 57-68.

17. Best, R.B., Computational and theoretical advances in studies of intrinsically disordered proteins. Curr Opin Struct Biol, 2017. 42: p. 147-154.

18. Lindorff-Larsen, K., et al., Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. J Am Chem Soc, 2012. 134(8): p. 3787-91.

19. Sugita, Y. and Y. Okamoto, Replica-exchange molecular dynamics method for protein folding. Chemical Physics Letters, 1999. 314(1-2): p. 141-151.

20. Lockhart, C. and D.K. Klimov, Alzheimer's Abeta10-40 peptide binds and penetrates DMPC bilayer: an isobaric-isothermal replica exchange molecular dynamics study. J Phys Chem B, 2014. 118(10): p. 2638-48.

21. Best, R.B. and J. Mittal, Free-energy landscape of the GB1 hairpin in all-atom explicit solvent simulations with different force fields: Similarities and differences. Proteins, 2011. 79(4): p. 1318-28.

22. Piana, S., J.L. Klepeis, and D.E. Shaw, Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Current Opinion in Structural Biology, 2014. 24: p. 98-105.

23. Fawzi, N.L., et al., Structure and Dynamics of the A beta(21-30) Peptide from the Interplay of NMR Experiments and Molecular Simulations (vol 130, pg 6145, 2008). Journal of the American Chemical Society, 2011. 133(30): p. 11816-11816.

24. Skinner, J.J., et al., Benchmarking all-atom simulations using hydrogen exchange. Proceedings of the National Academy of Sciences of the United States of America, 2014. 111(45): p. 15975-15980.

25. Mercadante, D., et al., Kirkwood-Buff Approach Rescues Overcollapse of a Disordered Protein in Canonical Protein Force Fields. Journal of Physical Chemistry B, 2015. 119(25): p. 7975-7984.

26. Miller, M.S., et al., Reparametrization of Protein Force Field Nonbonded Interactions Guided by Osmotic Coefficient Measurements from Molecular Dynamics Simulations. Journal of Chemical Theory and Computation, 2017. 13(4): p. 1812-1826.

27. Yoo, J. and A. Aksimentiev, Refined Parameterization of Nonbonded Interactions Improves Conformational Sampling and Kinetics of Protein Folding Simulations. Journal of Physical Chemistry Letters, 2016. 7(19): p. 3812-3818.

28. Best, R.B., W.W. Zheng, and J. Mittal, Balanced Protein Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association (vol 10, pg 5113, 2014). Journal of Chemical Theory and Computation, 2015. 11(4): p. 1978-1978.

29. Nerenberg, P.S., et al., Optimizing Solute-Water van der Waals Interactions To Reproduce Free Energies. Journal of Physical Chemistry B, 2012. 116(15): p. 4524-4534.

30. Henriques, J., C. Cragnell, and M. Skepo, Molecular Dynamics Simulations of Intrinsically Disordered Proteins: Force Field Evaluation and Comparison with Experiment. Journal of Chemical Theory and Computation, 2015. 11(7): p. 3420-3431.

31. Piana, S., et al., Water Dispersion Interactions Strongly Influence Simulated Structural Properties of Disordered Protein States. Journal of Physical Chemistry B, 2015. 119(16): p. 5113-5123.

32. Jorgensen, W.L., et al., Comparison of Simple Potential Functions for Simulating Liquid Water. Journal of Chemical Physics, 1983. 79(2): p. 926-935.

33. Maier, J.A., et al., ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J Chem Theory Comput, 2015. 11(8): p. 3696-713.

34. Huang, J., et al., CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nature Methods, 2017. 14(1): p. 71-73.

35. Li, D.W. and R. Bruschweiler, NMR-Based Protein Potentials. Angewandte Chemie-International Edition, 2010. 49(38): p. 6778-6780.

36. Lindorff-Larsen, K., et al., Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins-Structure Function and Bioinformatics, 2010. 78(8): p. 1950-1958.

37. Beauchamp, K.A., et al., Are Protein Force Fields Getting Better? A Systematic Benchmark on 524 Diverse NMR Measurements. J Chem Theory Comput, 2012. 8(4): p. 1409-1414.

38. Zhou, C.Y., F. Jiang, and Y.D. Wu, Residue-Specific Force Field Based on Protein Coil Library. RSFF2: Modification of AMBER ff99SB. Journal of Physical Chemistry B, 2015. 119(3): p. 1035-1047.

39. Li, S.X. and A.H. Elcock, Residue-Specific Force Field (RSFF2) Improves the Modeling of Conformational Behavior of Peptides and Proteins. Journal of Physical Chemistry Letters, 2015. 6(11): p. 2127-2133.

40. Wu, H.-N., F. Jiang, and Y.-D. Wu, Significantly Improved Protein Folding Thermodynamics Using a Dispersion-Corrected Water Model and a New Residue-Specific Force Field. The Journal of Physical Chemistry Letters, 2017: p. 3199-3205.

41. Rauscher, S., et al., Structural Ensembles of Intrinsically Disordered Proteins Depend Strongly on Force Field: A Comparison to Experiment. J Chem Theory Comput, 2015. 11(11): p. 5513-24.

42. Li, D.W. and R. Bruschweiler, PPM_One: a static protein structure based chemical shift predictor. Journal of Biomolecular Nmr, 2015. 62(3): p. 403-409.

43. Lee, H., et al., Local structural elements in the mostly unstructured transcriptional activation domain of human p53. Journal of Biological Chemistry, 2000. 275(38): p. 29426-29432.

44. Wells, M., et al., Structure of tumor suppressor p53 and its intrinsically disordered N-terminal transactivation domain. Proceedings of the National Academy of Sciences of the United States of America, 2008. 105(15): p. 5762-5767.

45. Shan, B., et al., Competitive Binding between Dynamic p53 Transactivation Subdomains to Human MDM2 Protein IMPLICATIONS FOR REGULATING THE p53.MDM2/MDMX INTERACTION. Journal of Biological Chemistry, 2012. 287(36): p. 30376-30384.

46. Cho, M.K., et al., Structural characterization of alpha-synuclein in an aggregation prone state. Protein Science, 2009. 18(9): p. 1840-1846.

47. Graf, J., et al., Structure and dynamics of the homologous series of alanine peptides: A joint molecular dynamics/NMR study. Journal of the American Chemical Society, 2007. 129(5): p. 1179-1189.

48. Hagarman, A., et al., Intrinsic Propensities of Residues in GxG Peptides Inferred from Amide I ' Band Profiles and NMR Scalar Coupling Constants. Journal of the American Chemical Society, 2010. 132(2): p. 540-551.

49. Abraham, M.J., et al., GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX, 2015. 1-2: p. 19-25.

50. Li, D.W. and R. Bruschweiler, Dynamic and Thermodynamic Signatures of Native and Non-Native Protein States with Application to the Improvement of Protein Structures. Journal of Chemical Theory and Computation, 2012. 8(7): p. 2531-2539.

51. Hu, J.S. and A. Bax, Determination of phi and chi(1) angles in proteins from C-13-C-13 three-bond J couplings measured by three-dimensional heteronuclear NMR. How planar is the peptide bond? Journal of the American Chemical Society, 1997. 119(27): p. 6360-6368.

52. Wirmer, J. and H. Schwalbe, Angular dependence of (1)J(N-i,C-alpha i) and (2)J(N-i,C alpha(i-1)) coupling constants measured in J-modulated HSQCs. Journal of Biomolecular Nmr, 2002. 23(1): p. 47-55.

53. Ding, K.Y. and A.M. Gronenborn, Protein backbone H-1(N)-C-13(alpha) and N-15-C-13(alpha) residual dipolar and J couplings: New constraints for NMR structure determination. Journal of the American Chemical Society, 2004. 126(20): p. 6232-6233.

54. Jensen, M.R., et al., Exploring Free-Energy Landscapes of Intrinsically Disordered Proteins at Atomic Resolution Using NMR Spectroscopy. Chemical Reviews, 2014. 114(13): p. 6632-6660.

55. Tamiola, K., B. Acar, and F.A.A. Mulder, Sequence-Specific Random Coil Chemical Shifts of Intrinsically Disordered Proteins. Journal of the American Chemical Society, 2010. 132(51): p. 18000-18003.

56. Schmid, N., et al., Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur Biophys J, 2011. 40(7): p. 843-56.

57. Philipson, K.D. and D.A. Nicoll, Sodium-calcium exchange: a molecular perspective. Annu Rev Physiol, 2000. 62: p. 111-33.

58. Baker, P.F., et al., The influence of calcium on sodium efflux in squid axons. J Physiol, 1969. 200(2): p. 431-58.

59. Nicoll, D.A., S. Longoni, and K.D. Philipson, Molecular cloning and functional expression of the cardiac sarcolemmal Na(+)-Ca2+ exchanger. Science, 1990. 250(4980): p. 562-5.

60. Dyck, C., et al., Ionic regulatory properties of brain and kidney splice variants of the NCX1 Na(+)-Ca(2+) exchanger. J Gen Physiol, 1999. 114(5): p. 701-11.

61. Lytton, J., Na+/Ca2+ exchangers: three mammalian gene families control Ca2+ transport. Biochem J, 2007. 406(3): p. 365-82.

62. Khananshvili, D., The SLC8 gene family of sodium-calcium exchangers (NCX) - structure, function, and regulation in health and disease. Mol Aspects Med, 2013. 34(2-3): p. 220-35.

63. Kang, T.M. and D.W. Hilgemann, Multiple transport modes of the cardiac Na+/Ca2+ exchanger. Nature, 2004. 427(6974): p. 544-8.

64. Hilge, M., et al., Ca2+ regulation in the Na+/Ca2+ exchanger features a dual electrostatic switch mechanism. Proc Natl Acad Sci U S A, 2009. 106(34): p. 14333-8.

65. Hilge, M., Ca2+ regulation of ion transport in the Na+/Ca2+ exchanger. J Biol Chem, 2012. 287(38): p. 31641-9.

66. Yuan, J., et al., The intracellular loop of the Na+/Ca2+ exchanger contains an "awareness ribbon" shaped two-helix bundle domain. 2018. Submitted.
