DIMERIZATION IN PHYSIOLOGICALLY RELEVANT ENVIRONMENTS

Alex J. Guseman

A dissertation submitted to the faculty at the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemistry

Chapel Hill 2018

Approved by:

Gary J Pielak

Eric M. Brustad

Matthew R.Redinbo

Sharon L. Campbell

Qi Zhang

©2018 Alex J. Guseman ALL RIGHTS RESERVED

ii

ABSTRACT

Alex J. Guseman: Protein Dimerization in Physiologically Relevant Conditions (under the direction of Gary J. Pielak)

A protein’s native environment, the cell, is far from dilute as the concentration of exceeds 300 g/L. Despite this the majority of biophysical studies occur in dilute buffered solutions. In these complex environments, protein structure and thermodynamics are influenced by weak transient interactions absent in dilute solution.

Here I look to define the role of weak intracellular interactions on protein-protein interactions.

Chapter 1 provides an overview of weak interactions in living cells and cell like system. Additionally, it discusses methodologies to study these interactions in living cells.

Chapter 2 defines the influence of crowded environments on a simple side-by- side dimer. Using a variant of the model system GB1 (the B1 domain of protein G) we investigated in vitro the effects of cosolutes that interact with through known mechanisms, thus determining the influences of chemical interactions and hard core repulsions on the simplest protein-protein interactions.

Chapter 3 defines the influence of chemical interactions by using the side-by-side dimer and systematically manipulates the electrostatics between the side-by-side dimer and protein cosolutes. The results of this study highlighted the importance of chemical interaction on protein dimerization.

iii Chapter 4 defines the influence of hardcore repulsions by using the side-by-side dimer and a 93% sequence identical domain-swapped dimer that has a different shape but identical surface. This system allowed us to manipulate the influence of hardcore repulsions without influencing chemical interactions, suggesting that the shape of a protein dimer defines how it will be influenced by hardcore repulsions.

My work shows the importance of studying protein-protein interactions in physiologically relevant conditions, as it defines the influence of chemical interactions and hardcore repulsions on a protein homodimers.

iv

If you put a gun to my head on Rosemary Street and ask me what were the most important lessons I learned in my PhD.

I would tell you three things

1. Never work with elemental fluorine 2. Never rob a doughnut shop 3. Never invade Russia in the winter

v

ACKNOWLEDGEMENTS

To start off some say it takes a village, so this might be long. Looking back it is rather ironic that my dissertation focuses on the biological physics of the cell. I fell in love with science and more specifically biology while taking AP Bio in high school. I was taking AP Bio to avoid taking GT Physics. My teacher, Mr. Scott showed me how laboratory science was an everyday adventure and encouraged me to do an internship in a local biotech lab company Igene. At Igene, Dr. Anu Sunderarajan taught me basic laboratory skills, pipetting, experimental design, etc setting me up to really hit the ground running during my undergraduate research. I cannot thank Mr. Scott and Anu enough.

From the University of Maryland, thank you to the whole College Park Scholars

Program and ΑΧΣ for the years support. Thank you to my academic advisors, Patty shields, Reid Compton, and Michael Montague Smith who fostered my love for science and encouraged me to become a biochemist despite my math placement exam telling me otherwise. To Dave Hawthorne, thank you for being an amazing mentor and friend throughout the years. My work in Dave’s lab studying ABC transporters in honey bees led to my interest in structural biology causing me to switch my major to and join David Fushman’s lab. In David’s lab I have to thank David, Emma, and especially Carlos for their patience with me as they taught me truly a disciplined approach to protein chemistry. From them I learned to purify proteins, do NMR, and

vi even write my own grants. Because of their encouragements I applied to graduate school, and I am forever grateful for my time with them.

Gary, I cannot thank you enough for the advice and encouragement throughout the years. From day one of my rotation, you listened to my ideas, told me which ones to run with and more importantly which ones to leave behind. Through our constant discussions and sometimes arguments, you pushed me to be a better scientist while giving me room to learn by myself. Some of the most important lessons from Gary have not been about science, but how to navigate interactions with colleagues both at UNC

(If you ever get an email ignored send it again and just CC Gary, you will get a response) and the greater science community.

Finally, I cannot thank the countless number of people who have served as counsel to me through the years. Mom and Dad, having you guys only three hours away has been invaluable. Paul and Andrew, thanks for always annoying me about the golgi apparatus, it is not mentioned again in my thesis. To my former labmates, Drs. Austin

Smith, Rachel Cohen, and Annelise Gorensek-Benitez thank you for showing me the ropes in graduate school and being a source for advice and counsel. To my current colleagues, Pixie, Sam, Thomas, Shannon, Candice, and Joey thank you for pushing me to be a better scientist and think outside the box. Thank you to Tom Christy, Mipso, and O.A.R who also get a shout out for getting me through learning to code and coding the scaled particle theory discussed in chapter 4.

Last of all, thank you to my great friends who have kept me sane throughout graduate school. People of the pod thanks for frequent help adjusting my attitude.

vii Brendon and Earl, thanks for always being there for a beer, counsel, and conversations to distract from everyday life. Fabio, there is too much to write in one sentence so thanks. Katie, just like Fabio there is too much to write in one sentence, but thank you for your love, support, and adventures that have been invaluable to me.

viii

TABLE OF CONTENTS

LIST OF TABLES ...... xii

LIST OF FIGURES ...... xiii

LIST OF ABBREVIATIONS AND SYMBOLS ...... xvi

CHAPTER 1: PROTEIN STABILITY AND WEAK INTRACELLULAR INTERACTIONS ...... 1

Quinary Interactions ...... 1

Protein Stability ...... 2

Roles of Weak Interactions ...... 3

Not All Crowded Solutions are Physiologically Relevant ...... 4

Early Observations of Weak Attractive Interactions under Crowded Conditions ...... 5

Other Qualitative Observations in Cells and Crowded In Vitro Conditions ...... 8

Using NMR to Study Weak Interactions in Cells ...... 12

Interactions with the Folded State ...... 16

Interactions with the Unfolded Ensemble...... 17

Interactions with Disordered Proteins ...... 18

Non-NMR approaches ...... 19

Next Steps ...... 19

Summary and closing thoughts ...... 20

Shifting focus to protein-protein interactions ...... 21

REFERENCES ...... 22

ix CHAPTER 2: COSOLUTE AND CROWDING EFFECTS ON A SIDE- BY-SIDE PROTEIN DIMER ...... 29

Introduction ...... 29

Experimental Procedures ...... 32

Vector...... 32

Expression and Purification...... 32

NMR...... 33

Cosolutes...... 34

Volumes and Surface Areas...... 34

Results ...... 34

Quantifying Dimerization...... 34

Small Molecule and Synthetic Polymer Cosolutes...... 36

Proteins and Freeze-Dried Cytosol...... 36

Discussion ...... 37

Small Molecule Cosolutes...... 38

Synthetic Polymers and Their Monomers...... 39

Proteins...... 39

Freeze-dried Lysate...... 39

Conclusions ...... 40

Supporting Information ...... 41

REFERENCES ...... 45

CHAPTER 3: SURFACE-CHARGE MODULATES PROTEIN-PROTEIN INTERACTIONS IN PHYSIOLOGICALLY-RELEVANT ENVIRONMENTS...... 50

Introduction ...... 50

Results and discussion ...... 55

Supporting Information ...... 59

x Materials and Methods...... 59

REFERENCES ...... 63

CHAPTER 4: PROTEIN SHAPE MODULATES CROWDING EFFECTS ...... 66

Introduction ...... 66

Results ...... 71

Scaled-particle theory predictions...... 71

Observations using 19F NMR...... 73

Discussion ...... 74

Scaled-Particle Theory...... 74

Control Small-Molecule Cosolutes...... 75

Synthetic Polymer Cosolutes...... 76

Proteins Cosolutes...... 77

Conclusions ...... 78

Materials and Methods ...... 79

Vector...... 79

Protein expression and purification...... 79

Cosolutes...... 81

19F NMR...... 81

Data analysis...... 81

Supporting Information ...... 82

REFERENCES ...... 86

CHAPTER 5: QUINARY INTERACTIONS AND EXPANDED PROTEIN NATIVE STATE ENSEMBLES, A MEANS TO NEW FUNCTIONS...... 91

Introduction ...... 91

REFERENCES ...... 99

xi LIST OF TABLES

Table 1.1: Strategies for in-cell NMR in E. coli...... 12

Table 2.1: Thermodynamic data for A34F GB1 in cosolutes at 298 K pH 7.5 ...... 37

Table 3.1: Equilibrium dissociation parameters at 298 K ...... 54

Table S3.1: Formal charge of A34F GB1, BSA, and lysozyme at pH 6.2, 6.8, and 7.5 ...... 60

Table 4.1: ΔΔG°`D→M, values at 298 K predicted by scaled particle theory (SPT) and NMR (pH 7.5) for the domain-swapped (L=0.5) and side-by-side dimer (L=0.7) ...... 73

Table S1: Parameters and increase in free energy of dissociation compared to dilute solution (ΔΔG°`) predicted by scaled-particle theory for the domain-swapped dimer [eccentricity (L) = 0.5] and side-by-side dimer (L = 0.7) dimers in various cosolutes...... 82

xii LIST OF FIGURES

Figure 1.1: NMR 1H-15N HSQC spectra of ubiquitin--α-synuclein fusion construct in cells (A) and cell lysates (B) ...... 7

Figure 1.2: In-cell 19F NMR spectra of GB1-GRn variants ...... 9

Figure 1.3: In-cell 1H-15N HSQC spectra of GB1 at pH 7.5 (A), 6.0 (B) and 5.0 (C) in 75 mM bis-tris propane/ HEPES/citrate buffer ...... 10

Figure 1.4: Interaction free energies (δΔΔG∘`op,int) with the cellular interior for charge-change variants ...... 13

Figure 1.5: In-cell NMR spectra and stability curves of SODBarrel in buffer, E. coli and A2780 cells (human ovarian carcinoma cells) ...... 15

Figure 1.6. Isoelectric-point histograms from open reading frames in E. coli and Homo sapiens proteomes ...... 16

` Figure 1.7: Graphical representation of in cell environment and 횫횫G∘ u at pH 7.4, pH 6.0 and pH 5.0...... 17

Figure 2.1: The A34F GB1 dimer and monomer and 19F NMR spectra acquired at two concentrations ...... 31

Figure 2.2: Binding isotherms of 3-fluorotyrosine labeled A34F in solutions of cytosol (75 g/L blue), buffer (black) and ethylene glycol (200 g/L, red) at pH 7.5, 298 K...... 36

Figure 2.3: A34F dissociation constants in buffer, osmolytes, synthetic polymers, their monomers and protein crowders ...... 38

Figure S2.1: ESI-FT-ICR analysis of fluorinated A34F GB1 ...... 41

Figure S2.2: 1H-15N HSQC of A34F GB1 dimer (3.3 mM, red) and 1H-15N HMQC of A34F GB1 monomer (10 μM, blue) ...... 42

Figure S2.3: (A)19F NMR spectrum of A34F GB1 ...... 43

Figure S2.4:1H-15N HSQC spectra of 19F 15N GB1 (red) and 15N A34F GB1 (blue)...... 44

Figure S2.5: Analytical ultracentrifugation curves for 15N A34F GB1 and 19F 15N A34F GB1 ...... 44

Figure 3.1. Dimer structure (PDBID: 2RMM) with the atoms of residue 40 shown as spheres. (A) 19F NMR spectra of A34F-4 (blue), A34F;D40N-3 (black), and A34F;D40K-2 (red) show nearly identical spectra.

xiii (B) 1H-15N HSQC spectra of the proteins using the same coloring as panel B. (C) ...... 53

Figure 3.2. Side chains at position 40 and thermometer representations of 휟휟푮푫 → 푴풐′ (pH 7.5, 298 K)...... 55

Figure 3.3. Graphical representations of A34F-4 dimerization in BSA and lysozyme at pH 7.5 (A), pH 6.8 (B) and pH 6.2 (C) ...... 57

Figure S3.1: Formal charge of A34F GB1, A34F;D40N GB1, and A34F; D40K GB1 between pH 1 and 14. The charge does not changes between pH 6.2 and pH 7.5, the range used here...... 61

Figure S3.2: Electrostatic potential map of A34F GB1. D40, shown in sticks, is in the midst of an acidic patch...... 61

Figure S3.3: Formal charge on (A) lysozyme (red), A34F GB1 (blue) and (B) BSA (black) and A34F GB1 (blue) between pH 1 and 14...... 62

o’ Figure 4.1: Scaled-particle-theory derived values of ΔΔG D→ M ...... 68

Figure 4.2: GB1 and its dimers (PDBIDs 1GB1, 2RMM, and 1Q10)...... 69

Figure 4.3: van der Waals volumes, solvent-accessible surface areas (SASAs), coarse-grained- (top) and geometric- representations (bottom) of the dimers (A) and their electrostatic surface-potentials from PYMOL (B)...... 70

Figure 4.4: 19F NMR spectra of 5-fluorotryptophan-43 labeled domain-swapped dimer at pH 7.5 and 298 K at the indicated protein concentrations...... 72

o’ Figure 4.5: ΔΔG D→ M at pH 7.5 and 298 K of the domain-swapped dimer and the side-by-side dimer ...... 74

Figure S4.1: GB1 backbone overlaid with the protein surface ...... 83

Figure S4.2: High-resolution ESI FT-ICR mass spectra of 5-fluorotryptophan-labeled, 15N-enriched domain-swapped dimer...... 84

Figure S4.3: 1H-15N HSQC of 19F-Trp labeled domain swapped dimer (black, 500 µM) and monomer (red, 50 µM) ...... 85

Figure S4.4: Overlay of 15N-1H HSQC spectra of 15N-enriched domain-swapped dimer (black, 500 µM) and 15N-enriched, 19F fluorotryptophan-labeled domain-swapped dimer (red, 700 µM)...... 85

Figure 5.1: Simulation of attractive interactions between cytosolic proteins (gray, yellow, and red) and Pyruvate Dehydrogenase (Green)...... 93

xiv Figure 5.2: Protein structures at different energy levels in the protein folding funnel...... 94

Figure 5.3: Population predictions from a 3 state molecular partition function with a native state N, a native state ensemble (N*), and the unfolded state (U) ...... 96

xv LIST OF ABBREVIATIONS AND SYMBOLS

Å Angstrom

σFd standard deviation of fraction dimer

횫CpU Change in the standard state heat capacity of unfolding

∆퐺푒푥 Excess free energy

ΔG°`D→M Standard state free energy of dimerization

ΔΔG°`D→M Change in the standard state free energy of dimerization

` 횫G∘ U Standard state free energy of unfolding

` 횫횫G∘ U Change in the standard state free energy of unfolding

δΔΔG∘`op,int interaction free energy with the cellular interior

푔(퐿, 푅푑, 훷) work required to form a cavity of L shape, Rd radius, and 횽 volume occupancy

` 횫H∘ U Change in the Standard state enthalpy of unfolding

횽 volume occupancy oC degrees Celsius

휇푒푥 excess chemical potential

BSA Bovine serum Albumin cm centimeter c* polymer overlap concentration

Da daltons

E. coli Escherichia Coli

ESI electrospray ionization

Fd fraction dimer

FPLC fast protein liquid chromatography

FT-ICR Fourier transform ion cyclotron resonance

xvi GB1 B1 domain of protein G g/L Grams per Liter

GJP Gary Joseph Pielak h hour

HAH1 Human metallochaperone protein H1

H/F proton or fluorine

HIV-Tat Human immunodeficiency virus protein Tat

HMQC Heteronuclear Multiple Quantum Coherence

HSQC Heteronuclear Single Quantum Coherence

Hz hertz

ILVA Isoleucine, leucine, valine, and alanine labeling

IPTG isopropyl β-D-1-thiogalactopyranoside

K temperature in kelvin kcal/mol Kilocalories per mole kDa Kilodaltons

Kd equilibrium constant of dissociation

K D→M equilibrium constant of dissociation

LB lennox broth mΩ milliohm

MHz megahertz mg milligram mM millimolar m/z Mass over charge nm nanometer

NMR nuclear magnetic resonance

xvii NmerA N-terminal metal binding domain of mercuric ion reductase

OD600 optical density at 600 nm

PDB Protein Data Bank,

PDBID Protein Data Bank Identifier

PEG polyethylene glycol pI Isoelectric point ppm parts per million

Pt total protein

QCI four channel inverse cryoprobe

Rw radius of water

Ra radius of cosolute

Rb radius of monomer rpm rotations per minute

RT room temperature times the gas constant s seconds

SASA Solvent Accessible Surface Area

SPT Scaled Particle theory

SDS PAGE sodium dodecyl sulfate poly acrylamide gel electrophoresis,

SOD1 Superoxide Disumutase 1

SOD1barrel Superoxide Disumutase 1 beta barrel structure

SOLEXY solvent exchange spectroscopy

TMAO trimethylamine oxide

TTHA Thermus thermophiles heavy metal binding protein

μM micromolar

xviii

CHAPTER 1: PROTEIN STABILITY AND WEAK INTRACELLULAR INTERACTIONS

Edited from Guseman, A.J.*; Pielak, G.J.*; 2018 in In-cell NMR Spectroscopy: From Molecular Sciences to Cell Biology Shirakawa, M. Döstch, V. and Ito, Yutaka. Royal Society of Chemistry (Submitted) *These authors contributed equally to this work.

Quinary Interactions

The cell is the fundamental unit of life because it sequesters the genetic code, the macromolecular machinery and the small molecules necessary for homeostasis and reproduction. Cells concentrate macromolecules1 to perform these existential functions, in some cells to over 300 g/L.2, 3 In these crowded and complex environments, proteins experience interactions with neighboring that are absent in the dilute buffered solutions where proteins are most often studied. These interactions are of two types: hard-core repulsions and soft interactions.4 Hard-core repulsions between macromolecules arise because two molecules cannot be in the same place at the same time. Soft contacts comprise weak attractive and repulsive chemical interactions that can nevertheless have large effects because of the enormous concentration of macromolecules in cells.

The totality of these interactions comprises protein quinary structure, but before delving into this topic it is important to review the first four levels as defined by

Linderstrøm–Lang in 1952.5 Primary structure is the amino-acid sequence. Secondary structure describes helices, sheets and turns. Tertiary structure describes how the units of secondary structure come together to form the three-dimensional conformation of a protein. Quaternary structure is how proteins come together to form dimers, trimers, etc.

1 In his 1972 Nobel Lecture, Anfinsen made the prescient statement: “a protein molecule only makes stable, structural sense when it exists under conditions similar to those under which it was selected – the so-called physiological state”.6 For most proteins, these conditions include the weak interactions found in the crowded cellular milieu that form the fifth level of protein structure.

The term quinary structure was coined three times. In 1973, Vaĭnshteĭn defined it apropos electron microscopy of assemblies comprising proteins, nucleic acids and nucleoproteins that form viruses, ribosomes, chromosomes, etc.7 In 1980, Edelstein defined quinary structure more narrowly as “the interactions within helical arrays, such as found for sickle cell hemoglobin fibers or tubulin units in ”.8 In 1982,

McConkey defined the term to emphasize the role of charge-charge interactions in cellular organization.9 Studies of quinary structure were recently reviewed and a general definition offered: “Quinary structure comprises the transient interactions between macromolecules that provides organization and compartmentalization inside cells”.10

Several excellent reviews10-13 on quinary structure have published, not all of which agree that the systems we describe here fit the definition.13 We discuss the weaknesses of some of these systems at the end of the chapter.

Protein Stability

Quinary interactions may organize the cellular interior by influencing protein structure, function and stability. Single-domain globular proteins often exist in a simple equilibrium between their folded (F; i.e., native, functional) state and their unfolded (U; i.e., denatured) ensemble, and their stability can be defined as the modified standard-

` state Gibbs free-energy of the unfolded ensemble minus that of folded state, 횫G∘ U

` (Equation 1.1). 횫G∘ U equals the available thermal energy, RT, where R is the gas

2 constant [1.987 cal/(mol K)] and T the absolute temperature, times the natural logarithm of the ratio of the concentrations of the two states (Equation 1.2). The stability of a

14 ` typical globular protein is 5-15 kcal/mol. To quantify the effect of environment, 횫G∘ U is

` measured in that environment and the stability in buffer is subtracted to give 횫횫G∘ U

(Equation 1.3).

∆퐺°`푈 = 퐺°`푈 − 퐺°`퐹 (1.1)

[푈] ∆퐺°` = −푅푇퐿푛 ( ) (1.2) 푈 [퐹]

∆∆퐺°`푈 = ∆퐺°`푼,풆풏풗풊풓풐풏풎풆풏풕 − ∆퐺°`푼,풃풖풇풇풆풓 (1.3)

Next, we discuss the importance of weak interactions and then how such interactions influence protein stability.

Roles of Weak Interactions

The free energy for breaking a typical quinary interaction is probably no more than a few kcal/mol. To put this value in perspective, it is a significant percentage of total protein stability.14 On the other hand, the available thermal energy at biologically relevant temperatures (300 K) is approximately 0.6 kcal/mol. A common misconception is that free-energy changes of this magnitude are biologically insignificant. Changes of this magnitude are irrelevant for most simple chemical reactions, but they are of signal importance in biology. Here is an incomplete list of small energy changes with critical biological consequences.

• Normal human body temperature is 37 oC. A fever of 42 oC, which represents an

increase in thermal energy of 0.01 kcal/mol, is often fatal.

3 • Escherichia coli have a maximum growth rate at 21 oC, but they stop growing and

die at 49 oC,15 which amounts to a change in thermal energy of 0.06 kcal/mol.

• Temperature-sensitive missense mutations in yeast, where growth occurs at 30

oC but not at 37 oC.16

• Increasing the incubation temperature of alligator eggs by 4 °C, changes the sex

of hatchlings from 100% female to 100% male.17, 18

These and other small energetic changes have large impacts because biology depends on cooperative interactions—a large number of small interactions acting together.19 For instance, proteins can go from 90% folded to 10% folded over a few degrees Celsius.20

Given the importance of cooperative interactions in biology, the question is not whether weak interactions are important, but instead, do we need to reassess some of the results from studies in buffer alone? For instance, are all the folding intermediates observed in buffer important in cells, when the interactions that stabilize such intermediates are of approximately the same free energy as interactions between the cellular milleau and the protein of interest?21

Not All Crowded Solutions are Physiologically Relevant

Many attempts to define the effects of the crowded cellular interior use high concentrations of synthetic polymers such as Ficoll (a crosslinked sucrose polymer), dextran (a glucose polymer) or polyethylene glycol. These cosolutes are used because they are macromolecular, highly soluble, inexpensive and commercially available in several sizes and in pure form. Their effects, however, may not mimic the crowded cellular interior22 because they are not inert, but interact, albeit weakly, with test proteins.23, 24 Nevertheless, understanding the effects of synthetic polymers is important

4 because they are often used to stabilize commercial proteins and protein-based drugs.

We end this section with a suggestion about nomenclature. Many investigators use the words “molecular crowding” but such usage is redundant. What else besides molecules

(or atoms) can crowd a solution? The term “macromolecule crowding”, however, is useful, especially when comparing the effects of polymers to their monomers.24

Early Observations of Weak Attractive Interactions under Crowded Conditions

Our first hint came from studies of globular protein stability comparing results from the thermal denaturation in sugar solutions to predictions from theory.25, 26 This comparison suggested that stabilizing hard-core repulsions might be offset by destabilizing cosolute-protein interactions. The second hint came from an NMR-based study by Peter Crowley’s group showing that polyethylene glycol, which was thought to be an inert synthetic polymer, interacts with the test protein cytochrome c.23

Our realization that these observations illuminate the intracellular behavior of proteins arose from a mistake made by GJP that resulted in the retraction27 of two papers from our laboratory.28, 29 However, the mistake provided the basis for our thoughts on soft, attractive interactions. The synthesis began with simple in-cell NMR, as pioneered by Serber and Dötsch.30 Escherichia coli cells are grown to near mid-log phase and the expression of the protein being studied is induced in the presence of 15N- enriched medium. After expression, the cells are gently pelleted by centrifugation, washed and resuspended in a minimal volume of culture media or buffer. The sample is then carefully transferred to an NMR tube.

After our successful in-cell NMR study of a disordered protein in E. coli using the

15N-1H heteronuclear single quantum coherence (HSQC) experiment,31 we turned to a

28, 29 partially-folded protein, 11.2 kDa apocytochrome b5. Shortly after our first paper on

5 this protein was published, it was brought to GJP’s attention by Lila Gierasch that overexpressed recombinant proteins can leak from these bacterial cells. We had checked for leakage by analyzing the supernatant from the in-cell sample using SDS

PAGE, which indicated that no more than 10% of the protein had escaped from the cells. We then gently centrifuged the sample, examined the supernatant by NMR and realized that the entire “in-cell” NMR signal arose from leaked 15N-enriched apocytochrome b5. Our “ah ha moment” was the realization that because the NMR spectrum arose from the 10% of the apocytochrome b5 that had leaked, the other 90% of the protein was inside the cells but invisible by NMR. As described next, we reasoned and then showed32 that the spectrum disappears in cells because attractive interactions between cytoplasmic proteins and apocytochrome b5 slow the tumbling of the apocytochrome.

Molecules tumble in solution. Like the atoms in a rubber ball, all the atoms in a stable, structured protein rotate as the protein tumbles. The tumbling rate is essential to obtaining high-quality, solution NMR spectra.33 The larger the structured protein, the slower the tumbling. Fast-tumbling structured proteins give narrow resonances. Slow tumbling structured proteins give broad resonances. In extreme instances, the resonances are so broad that the signal from the protein disappears into the baseline of the spectrum. Putting these ideas together, we realized that attractive interactions between protein surfaces and the components of the cell slow tumbling to such a degree that the in-cell spectrum of most globular proteins will not be visible. Most importantly, these are the weak interactions that bring about quinary structure.

6

Figure 1.1: NMR 1H-15N HSQC spectra of ubiquitin--α-synuclein fusion construct in cells (A) and cell lysates (B).34 Reproduced with permission from ChembioChem, copyright Wiley and Sons.

This explanation raised another key question. Why did we observe the spectrum of a disordered protein in cells?35 In a separate study we showed that disordered proteins retain enough internal motion to give a high-resolution spectra even in the presence of attractive interactions.32 In a related study we defined the control experiments involving supernatants from in-cell samples to check for leakage and showed that leakage often occurs when the recombinant protein represents 20% or more of the total cellular protein.36 We confirmed these ideas with a fusion protein comprising the disordered protein, -synuclein and the stably folded protein, ubiquitin

(Figure 1.1).34 In buffer alone, the spectrum from both 15N-enriched proteins are observed, but only the -synuclein spectrum is observed in E. coli cells. Importantly, the spectrum of both proteins was again observed after lysis and dilution.

7 Other Qualitative Observations in Cells and Crowded In Vitro Conditions

Multiple groups made similar observations. As expected, most globular proteins do not provide high-quality in-cell NMR spectra. Peter Crowley’s group observed that the highly-cationic protein cytochrome c (12 kDa, pI 10) fails to give 15N-1H HSQC spectra in E. coli cells or their lysates.37 They suggest that attractive interactions with the E. coli proteome caused polycationic cytochrome c to act like a much larger protein because most E. coli proteins are polyanions at physiological pH. They tested this idea by adding six negative charges by changing three lysine residues to glutamic acids and showed that the spectrum “reappears” in lysates.38 Complimenting this observation with analytical size-exclusion chromatography, they showed that these attractive chemical interactions gave cytochrome c an apparent molecular weight >150 kDa in cell-like environments. Thus, the attractive interactions between the highly cationic cytochrome c and the anionic proteins in E. coli caused the protein to tumble too slowly to be observed in the 15N-1H HSQC experiment.

To expand on this idea, Kyne and Crowley turned to a small anionic globular protein, the B1 domain of protein G (GB1, 6 kDa, pI 6.5).39 GB1 should not be ‘sticky’ in

E. coli because it and most other E. coli proteins are anions at physiological pH values.

As predicted, GB1 gives high quality 15N-1H HSQC spectra in cells. To lend further support to the idea that the charge of GB1 makes it visible in cells, they fused the nuclear-localization signal of HIV-Tat, which contains six arginine residues, to the N- terminus of GB1. The fusion construct is invisible in cells and lysates, suggesting that attractive interactions involving the Tat motif severely impede tumbling.40 Testing the hypothesis that the “stickiness” of E. coli cells arises from electrostatic interactions,

Kyne and Crowley quantified the number of arginine residues required to suppress the

8 spectrum in cells by adding a flexible linker to the C-terminus of GB1 followed by an arginine tail. Four arginines rendered the protein invisible, further supporting the weak interactions hypothesis (Figure 1.2).39

Figure 1.2: In-cell 19F NMR spectra of GB1-GRn variants reveal that adding arginine residues to the tail of GB1 increases the “stickiness” in cells rendering NMR spectra undetectable. Reprinted with permission from Biochemistry, 2017, 56, 5026-5032. Copyright 2017 American Chemical Society

The Gierasch group assessed the effect of protein surface composition on rotational diffusion using GB1, the N-terminal metal-binding domain of mercuric ion reductase (NmerA, 7 kDa), ubiquitin (9 kDa) and fusion constructs of these proteins.41

Only GB1 produces reasonable in-cell 15N-1H HSQC spectra. NmerA gives lower quality spectra and ubiquitin produces none at all, despite their similar size. They attributed this behavior to a combination of surface electrostatics and exposed hydrophobic moieties that can interact with the environment via transient and weak chemical interactions. To support the idea that hydrophobic interactions can slow tumbling in cells and cell

9 lysates, Gierasch and colleagues mutated the hydrophobic patch of ubiquitin, reducing the chemical interactions and making ubiquitin visible in cell lysates.

Figure 1.3: In-cell 1H-15N HSQC spectra of GB1 at pH 7.5 (A), 6.0 (B) and 5.0 (C) in 75 mM bis- tris propane/ HEPES/citrate buffer. Cells from the experiments shown in panel C were lysed, resulting in the HSQC spectrum shown in panel D. Reproduced with permission from Protein Science, copyright Wiley and Sons.

Our group demonstrated that the “stickiness” of the E. coli cytoplasm depends in an understandable way on pH (Figure 1.3). First, we showed that the cytoplasm acidifies during an in-cell experiment by following the chemical shifts from an engineered histidine variant of GB1 to determine the intercellular pH.42 Specifically, the pH drops from pH 7.2 to pH 5.5 in ~4 h. Coincidently, the quality of the in-cell 15N-1H

HSQC spectra of GB1 also decreases.43 However, upon lysis and dilution, the spectrum returns. These data show that the interactions have an electrostatic component whose strength increases as the pH of the cellular interior decreases. As discussed in section

12.7, the change is understandable in terms of the pI distribution of the E. coli

10 proteome; interactions between anionic GB1 and the increasing positively-charged cytosolic proteins. Such interactions do not need to involve charge. Huan-Xiang Zhou’s group showed that the carbohydrate-protein interactions broaden spectra, and, as discussed next, hydrophobic interactions are also important.44

Mikael Oliveberg and colleagues observed that weak attractive interactions slow protein tumbling in higher eukaryotic cells by delivering SOD1 into mammalian cells using cell-penetrating peptides.45 The 15N-1H HSQC spectra are significantly broader in cells than in dilute buffered solution. They attributed the broadening to weak interactions in cells. Oliveberg et al. also identified a ‘physicochemical code’ for quinary interactions, by examining three proteins in E. coli cells: bacterial TTHA (7 kDa), human HAH1 (7 kDa) and SOD1 (10 kDa). They defined the minimum amount of charged/hydrophobicity necessary for attractive interactions to make spectra disappear. To support their hypothesis, they used a Glu-to-Lys charge-change variant to make TTHA invisible and

Lys/Arg-to-Glu variants to make HAH1 and SOD1 visible.

Taken together, these results show that alternative strategies must be developed to observe globular proteins in living cells using NMR. The Li group assessed in-cell

NMR methods for E. coli using four proteins, GB1( 6 kDa), calmodulin (17 kDa), ubiquitin (8 kDa) and bcl-xl-cutloop (24 kDa) . They assessed the quality of one and two-dimensional spectra acquired using various labeling and enrichment techniques

(Table 1.1). One-dimensional methods involving 13C and 19F gave in-cell spectra for all four proteins, with 19F producing the highest-quality data.

11 protein 15N 15N, 2H Fractional Fractional 15N 19F ILVA ILVA 13C (2D) 13C L methyl methyl (1D 13C (2D) (1D 13C detected) detected) GB1 + + + + + + + + ⬛⬛ ⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛ ⬜ ⬛⬛ ⬛⬛ Ubq - - - + - + - + ⬛⬛ ⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛ ⬜ ⬛⬛ ⬛⬛ CAM - + - + - + + + ⬛⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛ ⬜ ⬛⬛ ⬛⬛ BCL- - - - + - + - + XL- ⬛⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛⬛⬛ ⬛ ⬜ ⬛⬛ ⬛⬛ cutloop Table 1.1: Strategies for in-cell NMR in E. coli. +, detectable in-cell spectra; -undetectable in- cell spectra, ⬜ very low background, ⬛ low background, ⬛⬛ medium background, ⬛⬛⬛ high background. 1D, one-dimensional. 2D, two dimensional. Adapted from Biochemistry, 2014, 53, 1971-1981. Copyright 2014 American Chemical Society.

All these studies point to the existence of weak interactions between protein surfaces in cells. Quantitative knowledge about these interactions is essential for understanding the structure and stability and function of proteins in their native environment.

Using NMR to Study Weak Interactions in Cells

Simple in-vitro NMR experiments can be difficult in cells because of the need to keep the cells alive and because weak attractive interactions broaden spectra. Cells are actively trying to maintain homeostasis, resulting in NMR-active metabolites and changes in intracellular factors such as pH. Methods have been developed to identify and quantify signals arising from metabolites and buffer inside cells.43, 46, 47 Novel detection strategies, involving 2H, 15N and 13C enrichment and 19F labeling, have been developed to overcome broadening and other complications.48-51

Amide proton-deuterium exchange is another method for assessing the folding and stability of both globular proteins and disordered proteins in cells.52, 53 We adapted

12 a mass-spectrometry-based quenched-lysate protocol developed by Terrence Oas and colleagues54 to measure the stability of globular proteins in cells using NMR-detected amide-proton exchange (Figure 1.4).55, 56 E. coli cells expressing the protein of interest are transferred from H2O to D2O. Exchange is allowed to proceed in cells. At defined times, an aliquot is removed, and exchange is quenched by lowering the pH at the same time the cells are lysed. Exchange is detected by integrating 15N-1H HSQC cross peaks as a function of time. Using GB1, we were able to quantify protein stability in living cells and showed that the cellular interior is slightly destabilizing.56 This result was particularly interesting because the crowded nature of the cytoplasm was thought to stabilize globular proteins via hardcore repulsions, which favors compact folded states over expanded unfolded ensembles. The destabilization is due to weak attractive interactions in cells overcoming the stabilizing repulsions.

Figure 1.4: Interaction free energies (δΔΔG∘`op,int) with the cellular interior for charge-change variants, I6L, D40A, D40N and D40K variants are shown in blue, green and red, respectively. Dashed lines represent the average interaction free energy for each variant. Reproduced from Monteith et al.57 with permission. Copyright (2015) National Academy of Sciences.

Although elegant, amide proton exchange experiments are expensive and time consuming for in-cell studies. 19F NMR has been used to expedite stability

` measurements of a metastable protein (횫G∘ U <2 kcal/mol) where exchange between

13 folded state and the unfolded ensemble is slow on the NMR timescale, such that the resonance from both forms are observed. 19F is an attractive nucleus because it is

100% abundant, it can be detected by NMR with high sensitivity (83% 1H sensitivity), is readily incorporated into proteins via aromatic amino acids58, 59 and is not used in E. coli metabolism.

Smith et al. used 19F NMR to measure the stability of the drkn-SH3 domain in E.

46, 60 ` 61 coli. This protein has a 횫G∘ U near zero. By labeling the sole tryptophan with 5- fluorotryptophan,59 the resulting in-cell NMR spectra showed one peak for the folded state and one for the unfolded state. These data confirmed that the crowded cellular interior can be destabilizing over a large temperature range (4 ∘C to 40 ∘C), and allowed

` ` the quantification of the key equilibrium thermodynamic parameters: 횫G∘ U, 횫H∘ U, and

횫CpU. Unfortunately, a metabolite resonance under the unfolded resonance in cells,

` complicated the analysis. To compensate, both a raw 횫G∘ U, and a conservatively

` corrected 횫G∘ U was reported. The true value likely is between these bounds and corresponds to a destabilization of the SH3 domain by the cellular interior. Stadmiller et al. then used the SH3-19F NMR system to study how osmotic shock affects protein stability in cells, showing that osmotic shock destabilizes proteins, but that stability can be restored by the osmolyte glycine beatine.47

Conggang Li’s group used relaxation-based 19F NMR experiments and GB1 to show that the viscosity inside E. coli cells is about twice that of dilute buffer49 while that in Xenopus laevis oocytes is only about 20% greater than buffer. They also showed that most of the resonance broadening in E. coli cells arises from chemical interactions in general (i.e., homogeneous broadening), but the broadening in oocytes arises from

14 different interactions in different places in the cell (i.e., inhomogeneous broadening).62

They then used covalently-connected GB1 constructs and 19F NMR to show that the interior of oocytes is less sticky than the interior of E. coli cells.63

Figure 1.5: In-cell NMR spectra and stability curves of SODBarrel in buffer, E. coli and A2780 cells (human ovarian carcinoma cells). Reproduced from Danielsson et al.51 with permission. Copyright (2015) National Academy of Sciences.

Oliveberg and colleagues also quantified the stability in cells of proteins whose folded state and unfolded ensemble are in slow exchange on the NMR time scale.

Using 1H-15N HSQC analyses, they determined that the Q153 cross peak of the

“trimmed” SOD1barrel reports on its folding equilibrium in both E. coli and in A2780 mammalian cells (Figure 1.5). SOD1barrel is destabilized in both cells types, with A2780 cells being more destabilizing. This increased destabilization can be attributed to an increase in electrostatic interactions between the more cationic Homo sapien proteome compared to E. coli (Figure 1.6).

15

Figure 1.6. Isoelectric-point histograms from open reading frames in E. coli and Homo sapiens proteomes. Data from the proteome-pI database.37

Interactions with the Folded State

To better understand protein destabilization in cells, we investigated electrostatic interactions by using site-directed variants and changes in pH to manipulate charge- charge interactions between GB1 and the cellular interior. Changing Asp 40, a surface residue without intramolecular hydrogen bonds, to either an Asn or Lys changes the

` overall charge by +1 or +2, respectively. The mutations do not affect 횫G∘ U in dilute buffered solutions, but in cells they destabilize the protein by 1.3 kcal/mol for D40N and by 1.5 kcal/mol for D40K compared to wild-type GB1.57 To support these findings, we investigated the stability of GB1 in cells at pH 7.4, 6.0 and 5.0. The destabilizing attractive interactions become stronger at lower pH (Figure 1.7).64 These results indicate that the interactions between GB1 and the other intracellular protein are largely electrostatic.

16

` Figure 1.7: Graphical representation of in cell environment and 횫횫G∘ u at pH 7.4, pH 6.0 and pH 5.0. Adapted with permission from R. D. Cohen and G. J. Pielak, J. Am. Chem. Soc., 2016, 138, 13139-13142. Copyright 2016 American Chemical Society.

Interactions with the Unfolded Ensemble.

Weak interactions in cells can affect not only the folded state but also the unfolded ensemble. Three studies from our lab highlight these interactions. The first evidence came from 19F NMR showing that the cellular environment interacts with the unfolded ensemble of SH3 as indicated by the greater line broadening in cells in the unfolded state than the folded state.46 Following this observation, we used physiologically-relevant cosolutes in vitro to determine that tumbling of the unfolded ensemble is slowed more in the unfolded ensemble than in the folded state.46 We confirmed the interactions with the unfolded ensemble by using exchange spectroscopy to show that folding was slowed in the presence of protein cosolutes.46, 65

17 Cohen and Pielak expanded on these ideas by studying changes in a structured region of GB1’s denatured ensemble.21 GB1 possesses localized structure in the unfolded ensemble.66 This so-called hydrophobic staple stabilizes the ensemble, which is predicted to decrease the stability compared to a variant without the staple. As expected, variants that remove the staple exhibit increased stability in vitro. In cells however, the hydrophobic staple is destabilized by quinary interactions (i.e., the staple is absent in cells) and therefore, the variants and the wild-type proteins all have approximately the same stability. These data suggest that attractive interactions in the cellular interior can also affect unfolded ensembles.

Interactions with Disordered Proteins

Although weak interactions with the folded state and the unfolded ensemble in cells influence globular protein stability, the high quality of in-cell NMR spectra of disordered proteins indicate that these proteins retain a great deal of internal motion in cells.32, 35, 42, 67 The cellular interior may induce the collapse of some disordered proteins35, 68 and the expansion of others.67 For instance the average radius of gyration of -synuclein decreases from 4.0 nm in buffer to 3.6 nm in mammalian cells, suggesting that compaction in cells obscures the amyloidogenic region of the protein.

However, amide proton exchange rates for both FlgM and α-synuclein measured using the SOLEXY experiment,69 are the same in E. coli cells as they are in buffer, suggesting that disorder persists in cells.42 These quantitative studies show that protein disorder persists in cells.

As reviewed here, many studies show that crowding in cells and under physiologically-relevant conditions can alter protein stability. Crowding also has the

18 potential to alter globular protein dynamics65, 70 and the conformation of loops.71

Nevertheless, with the possible exception of fold-switching proteins,72 quinary interactions are unlikely to alter overall tertiary structure because the interior of most globular proteins is nearly as efficiently packed as perfectly packed spheres.73

Non-NMR approaches

What follows is far from comprehensive but provides an introduction to the recent literature on several methods. Fluorescence spectroscopy, which has the advantage over NMR in that it provides spatial resolution in the cell, is the main non-NMR method for characterizing weak interactions.74-76 Combining fluorescence detection with fast temperature jumps has proven to be especially powerful for assessing weak interactions in cells.70 Attractive, weak interactions under crowded conditions have also been studied using calorimetry,77 simulation71, 78 and considered theoretically.79, 80

Next Steps

We end by returning to the idea that the interior of cells (and their organelles) are not only crowded, but also organized. In 1985, Srere coined the term “metabolon” to describe a “supramolecular complex of sequential metabolic and cellular structural elements … formed by quinary interactions of complementary surfaces”.81

These interactions facilitate substrate channeling by protecting intermediates, allowing enhanced rates of catalysis and metabolic flux.

Studies of authentic quinary interactions are rare,82-87 because quinary interactions are so weak that they rarely survive outside of cells. More specifically, studies of quinary interactions by NMR have three problems. First, much of the data are from one prokaryotic organism, Escherichia coli. Second, the proteins examined are not

19 native to E. coli. Third, the proteins are expressed at non-physiologically high-levels to allow detection.

It is time to test the idea that these transient complexes exist in cells under normal conditions. The obvious place to begin is with metabolons, for instance, the two most important metabolic pathways, glycolysis and the tricarboxylic acid cycle.

However, current NMR methods are not sensitive enough to be useful. The simplest plan involves three steps. First, insert a small affinity tag into the structural gene for a protein at its natural locus in the genome.88 Second, incorporate a photo crosslinker using, for example, a commercially-available lysine-based photo affinity tag that can be inserted at a normal lysine codons.89 Lysine’s positively-charged side chain makes it an ideal amino acid, because as discussed here, we know that charge-charge interactions play a large role in quinary interactions.43, 46, 64 Third, the affinity purified, crosslinked, target enzyme is treated with proteases and the proteins making quinary interactions with it are identified by using bottom-up proteomics.90

Summary and closing thoughts

Initially, weak interactions stymied in-cell HSQC-based NMR studies because globular proteins cause the spectra to disappear. Strategies such as 19F NMR and specific enrichment can overcome this problem. The idea is that a few broad resonances from specifically labeled or enriched residues can be observed, but the tens to hundreds of HSQC cross peaks from a typical globular protein blend into the background in cells. These strategies have enabled the use of in-cell NMR to assess the effects of weak interactions on protein thermodynamics in cells and under physiologically-relevant conditions in vitro. The most common observation is destabilization. Previously, the crowded nature of the cellular interior was thought to be

20 stabilizing because of the emphasis on hard-core steric repulsions, but NMR-based observations reveal a balancing act between stabilizing repulsions and destabilizing attractive chemical interactions. We have only scratched the surface. Efforts to define these weak interactions must continue because they are the key to understanding cellular organization, metabolism, homeostasis and the defects that disrupt these processes.

Shifting focus to protein-protein interactions

While the influence of cellular interior to protein folding is broadly understood, few studies have investigated the role of crowding on protein-protein interactions circa

2014.91, 92 Here I shift focus away from protein folding tol focus on how hardcore repulsions and chemical interactions influences protein-protein interactions in vitro under crowded conditions.

21 REFERENCES

(1) Spitzer, J., Pielak, G. J., and Poolman, B. (2015) Emergence of life: Physical chemistry changes the paradigm. Biol. Direct 10, 33-48.

(2) Zimmerman, S. B., and Trach, S. O. (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J. Mol. Biol. 222, 599-620.

(3) Cayley, D. S., Guttman, H. J., and Record, M. T. (2000) Biophysical characterization of changes in amounts and activity of Escherichia coli cell and compartment water and turgor pressure in response to osmotic stress. Biophys. J. 78, 1748- 1764.

(4) Sarkar, M., Li, C., and Pielak, G. J. (2013) Soft interactions and crowding. Biophys. Rev. 5, 187-194.

(5) Linderstrøm-Lang, K. U. (1952) Proteins and enzymes, In Lane medical lectures, pp 52-73, Stanford University Press, Stanford Califorinia

(6) Anfinsen, C. B. (1973) Principles that govern the folding of protein chains. Science 181, 223-230.

(7) Vaĭnshteĭn, B. K. (1973) Three-dimensional electron microscopy of biological macromolecules. Physics-Uspekhi 16, 185-206.

(8) Edelstein, S. J. (1980) Patterns in the quinary structures of proteins. Plasticity and inequivalence of individual molecules in helical arrays of sickle cell hemoglobin and tubulin. Biophys. J. 32, 347-360.

(9) McConkey, E. H. (1982) Molecular evolution, intracellular organization, and the quinary structure of proteins. Proc. Natl. Acad. Sci. USA 79, 3236-3240.

(10) Cohen, R. D., and Pielak, G. J. (2017) A cell is more than the sum of its (dilute) parts: A brief history of quinary structure. Protein Sci. 26, 403-413.

(11) Wirth, A. J., and Gruebele, M. (2013) Quinary protein structure and the consequences of crowding in living cells: Leaving the test-tube behind. BioEssays 35, 984-993.

(12) Chien, P., Gierasch, L. M., Kozminski, K. G., and Lippincott-Schwartz, J. (2014) Challenges and dreams: Physics of weak interactions essential to life. Mol. Biol. Cell 25, 3474-3477.

(13) Rivas, G., and Minton, A. P. (2018) Toward an understanding of biochemical equilibria within living cells. Biophys. Rev. 10, 241-253.

22 (14) Creighton, T. E. (2010) The biophysical chemistry of nucleic acids & proteins, In The biophysical chemistry of nucleic acids & proteins, pp 488-497, Helvetian Press.

(15) Ingraham, J. (1987) Chemical composition of Escherichia coli, In Escherichia coli and Salmonella typhimurium (F. C. Neidhardt, J. L. Ingraham, K. B. Low, B.Magasanik, M. Schaechter, and Umbarger., H. E., Eds.), pp 1543-1554, American Society for Microbiology, Washington DC.

(16) Wang, X., and Pielak, G. J. (1991) Temperature-sensitive variants of saccharomyces cerevisiae iso-1-cytochrome c produced by random mutagenesis of codons 43 to 54. J. Mol. Biol. 221, 97-105.

(17) Ferguson, M. W. J., and Joanen, T. (1982) Temperature of egg incubation determines sex in Alligator mississippiensis. Nature 296, 850-853.

(18) Georges, A., and Holleley, C. E. (2018) How does temperature determine sex? Science 360, 601-602.

(19) Creighton, T. E. (1983) An empirical approach to protein conformation stability and flexibility. Biopolymers 22, 49-58.

(20) Schellman, J. A. (1987) Selective binding and solvent denaturation. Biopolymers 26, 549-559.

(21) Cohen, R. D., and Pielak, G. J. (2017) Quinary interactions with an unfolded state ensemble. Protein Sci. 26, 1698-1703.

(22) Sarkar, M., Smith, A. E., and Pielak, G. J. (2013) Impact of reconstituted cytosol on protein stability. Proc. Natl. Acad. Sci. USA 110, 19342-19347.

(23) Crowley, P. B., Brett, K., and Muldoon, J. (2008) NMR spectroscopy reveals cytochrome c–poly(ethylene glycol) interactions. ChemBioChem 9, 685-688.

(24) Benton, L. A., Smith, A. E., Young, G. B., and Pielak, G. J. (2012) Unexpected effects of macromolecular crowding on protein stability. Biochemistry 51, 9773- 9775.

(25) Saunders, A. J., Davis-Searles, P. R., Allen, D. L., Pielak, G. J., and Erie, D. A. (2000) Osmolyte-induced changes in protein conformational equilibria. Biopolymers 53, 293-307.

(26) Davis-Searles, P. R., Saunders, A. J., Erie, D. A., Winzor, D. J., and Pielak, G. J. (2001) Interpreting the effects of small uncharged solutes on protein-folding equilibria. Annu. Rev. Biophys. Biomol. Struct. 30, 271-306.

(27) Bryant, J. E., Lecomte, J. T. J., Lee, A. L., Young, G. B., and Pielak, G. J. (2007) Protein dynamics in living cells. Biochemistry 46, 8206.

23 (28) Bryant, J. E., Lecomte, J. T. J., Lee, A. L., Young, G. B., and Pielak, G. J. (2005) Protein dynamics in living cells. Biochemistry 44, 9275-9279.

(29) Bryant, J. E., Lecomte, J. T. J., Lee, A. L., Young, G. B., and Pielak, G. J. (2006) Cytosol has a small effect on protein backbone dynamics. Biochemistry 45, 10085-10091.

(30) Serber, Z., and Dötsch, V. (2001) In-cell NMR spectroscopy. Biochemistry 40, 14317-14323.

(31) Dedmon, M. M., Patel, C. N., Young, G. B., and Pielak, G. J. (2002) Flgm gains structure in living cells. Proc. Natl. Acad. Sci. USA 99, 12681-12684.

(32) Li, C., Charlton, L. M., Lakkavaram, A., Seagle, C., Wang, G., Young, G. B., Macdonald, J. M., and Pielak, G. J. (2008) Differential dynamical effects of macromolecular crowding on an intrinsically disordered protein and a globular protein: Implications for in-cell NMR spectroscopy. J. Am. Chem. Soc. 130, 6310- 6311.

(33) Rule, G. S., and Hitchens, T. K. (2007) Fundamentals of protein NMR spectroscopy Springer.

(34) Barnes, C. O., Monteith, W. B., and Pielak, G. J. (2011) Internal and global protein motion assessed with a fusion construct and in-cell NMR. ChemBiochem 12, 390-391.

(35) Dedmon, M. M., Patel, C. N., Young, G. B., and Pielak, G. J. (2002) Flgm gains structure in living cells. Proc. Natl. Acad. Sci. USA 99, 12681-12684.

(36) Barnes, C. O., and Pielak, G. J. (2011) In-cell protein NMR and protein leakage. Proteins 79, 347-351.

(37) Kozlowski, L. P. (2017) Proteome-pi: Proteome isoelectric point database. Nucleic Acids Res. 45, 1112-1116.

(38) Crowley, P. B., Chow, E., and Papkovskaia, T. (2011) Protein interactions in the Escherichia coli cytosol: An impediment to in-cell NMR spectroscopy. ChemBioChem 12, 1043-1048.

(39) Kyne, C., and Crowley, P. B. (2017) Short arginine motifs drive protein stickiness in the Escherichia coli cytoplasm. Biochemistry 56, 5026-5032.

(40) Kyne, C., Ruhle, B., Gautier, V. W., and Crowley, P. B. (2015) Specific ion effects on macromolecular interactions in Escherichia coli extracts. Protein Sci. 24, 310- 318.

24 (41) Wang, Q., Zhuravleva, A., and Gierasch, L. M. (2011) Exploring weak, transient protein–protein interactions in crowded in vivo environments by in-cell nuclear magnetic resonance spectroscopy. Biochemistry 50, 9225-9236.

(42) Smith, A. E., Zhou, L. Z., and Pielak, G. J. (2015) Hydrogen exchange of disordered proteins in Escherichia coli. Protein Sci. 24, 706-713.

(43) Cohen, R. D., Guseman, A. J., and Pielak, G. J. (2015) Intracellular pH modulates quinary structure. Protein Sci. 24, 1748-1755.

(44) Miklos, A. C., Sumpter, M., and Zhou, H.-X. (2013) Competitive interactions of ligands and macromolecular crowders with maltose binding protein. PLOS ONE 8, e74969.

(45) Mu, X., Choi, S., Lang, L., Mowray, D., Dokholyan, N. V., Danielsson, J., and Oliveberg, M. (2017) Physicochemical code for quinary protein interactions in Escherichia coli. Proc. Natl. Acad. Sci. USA 114, 4556-4563.

(46) Smith, A. E., Zhou, L. Z., Gorensek, A. H., Senske, M., and Pielak, G. J. (2016) In- cell thermodynamics and a new role for protein surfaces. Proc. Natl. Acad. Sci. USA 113, 1725-1730.

(47) Stadmiller, S. S., Gorensek-Benitez, A. H., Guseman, A. J., and Pielak, G. J. (2017) Osmotic shock induced protein destabilization in living cells and its reversal by glycine betaine. J. Mol. Biol. 429, 1155-1161.

(48) Li, C., Wang, G.-F., Wang, Y., Creager-Allen, R., Lutz, E. A., Scronce, H., Slade, K. M., Ruf, R. A. S., Mehl, R. A., and Pielak, G. J. (2010) Protein 19F NMR in Escherichia coli. J. Am. Chem. Soc. 132, 321-327.

(49) Ye, Y., Liu, X., Zhang, Z., Wu, Q., Jiang, B., Jiang, L., Zhang, X., Liu, M., Pielak, G. J., and Li, C. (2013) 19F NMR spectroscopy as a probe of cytoplasmic viscosity and weak protein interactions in living cells. Chem. -Eur. J. 19, 12705-12710.

(50) Xu, G., Ye, Y., Liu, X., Cao, S., Wu, Q., Cheng, K., Liu, M., Pielak, G. J., and Li, C. (2014) Strategies for protein NMR in Escherichia coli. Biochemistry 53, 1971- 1981.

(51) Danielsson, J., Mu, X., Lang, L., Wang, H., Binolfi, A., Theillet, F.-X., Bekei, B., Logan, D. T., Selenko, P., Wennerström, H., and Oliveberg, M. (2015) Thermodynamics of protein destabilization in live cells. Proc. Natl. Acad. Sci. USA 112, 12402-12407.

(52) Linderstrøm-Lang, K. U. (1958) Deuterium exchange and protein structure, In Symposium on protein structure (A, N., Ed.), pp 23-34, Methuen, London.

(53) Englander, S. W., and Kallenbach, N. R. (2009) Hydrogen exchange and structural dynamics of proteins and nucleic acids. Q. Rev. Biophys. 16, 521-655.

25 (54) Ghaemmaghami, S., and Oas, T. G. (2001) Quantitative protein stability measurement in vivo. Nat. Struct. Biol. 8, 879-882.

(55) Monteith, W. B., and Pielak, G. J. (2014) Residue level quantification of protein stability in living cells. Proc. Natl. Acad. Sci. USA 111, 11335-11340.

(56) Monteith, W. B., and Pielak, G. J. (2015) Correction for Monteith and Pielak, residue level quantification of protein stability in living cells. Proc. Natl. Acad. Sci. USA 112, 7031.

(57) Monteith, W. B., Cohen, R. D., Smith, A. E., Guzman-Cisneros, E., and Pielak, G. J. (2015) Quinary structure modulates protein stability in cells. Proc. Natl. Acad. Sci. USA 112, 1739-1742.

(58) Jackson, J. C., Hammill, J. T., and Mehl, R. A. (2007) Site-specific incorporation of a 19F-amino acid into proteins as an NMR probe for characterizing protein structure and reactivity. J. Am. Chem. Soc. 129, 1160-1166.

(59) Crowley, P. B., Kyne, C., and Monteith, W. B. (2012) Simple and inexpensive incorporation of 19F-tryptophan for protein NMR spectroscopy. Chem. Commun. 48, 10681-10683.

(60) Sharp, K. A. (2016) Unpacking the origins of in-cell crowding. Proc. Natl. Acad. Sci. USA 113, 1684-1685.

(61) Bezsonova, I., Singer, A., Choy, W.-Y., Tollinger, M., and Forman-Kay, J. D. (2005) Structural comparison of the unstable drkn sh3 domain and a stable mutant. Biochemistry 44, 15550-15560.

(62) Ye, Y., Liu, X., Chen, Y., Xu, G., Wu, Q., Zhang, Z., Yao, C., Liu, M., and Li, C. (2015) Labeling strategy and signal broadening mechanism of protein NMR spectroscopy in Xenopus laevis oocytes. Chem. -Eur. J. 21, 8686-8690.

(63) Ye, Y., Wu, Q., Zheng, W., Jiang, B., Pielak, G. J., Liu, M., and Li, C. (2018) Quantification of size effect on protein rotational mobility in cells by 19F NMR spectroscopy. Anal. Bioanal. Chem. 410, 869-874.

(64) Cohen, R. D., and Pielak, G. J. (2016) Electrostatic contributions to protein quinary structure. J. Am. Chem. Soc. 138, 13139-13142.

(65) Gorensek-Benitez, A. H., Smith, A. E., Stadmiller, S. S., Perez Goncalves, G. M., and Pielak, G. J. (2017) Cosolutes, crowding, and protein folding kinetics. J. Phys. Chem. B 121, 6527-6537.

(66) Muñoz, V., Blanco, F. J., and Serrano, L. (1995) The hydrophobic-staple motif and a role for loop-residues in α-helix stability and protein folding. Nat. Struct. Biol. 2, 380-385.

26 (67) McNulty, B. C., Young, G. B., and Pielak, G. J. (2006) Macromolecular crowding in the Escherichia coli periplasm maintains α-synuclein disorder. J. Mol. Biol. 355, 893-897.

(68) Theillet, F.-X., Binolfi, A., Bekei, B., Martorana, A., Rose, H. M., Stuiver, M., Verzini, S., Lorenz, D., van Rossum, M., Goldfarb, D., and Selenko, P. (2016) Structural disorder of monomeric α-synuclein persists in mammalian cells. Nature 530, 45-50.

(69) Chevelkov, V., Xue, Y., Rao, K. D., Forman-Kay, J. D., and Skrynnikov, N. R. (2010) 15NH/D-solexsy experiment for accurate measurement of amide solvent exchange rates: Application to denatured drkn sh3. J. Biomol. NMR 46, 227-244.

(70) Ebbinghaus, S., Dhar, A., McDonald, J. D., and Gruebele, M. (2010) Protein folding stability and dynamics imaged in a living cell. Nat. Methods 7, 319-323.

(71) Yu, I., Mori, T., Ando, T., Harada, R., Jung, J., Sugita, Y., and Feig, M. (2016) Biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm. eLife 5, 19274-19296.

(72) Porter, L. L., and Looger, L. L. (2018) Extant fold-switching proteins are widespread. Proc. Natl. Acad. Sci. USA 115, 5968-5973.

(73) Richards, F. M. (1977) Areas, volumes, packing, and protein structure. Annu. Rev. Biophys. Bioeng. 6, 151-176.

(74) Sukenik, S., Ren, P., and Gruebele, M. (2017) Weak protein–protein interactions in live cells are quantified by cell-volume modulation. Proc. Natl. Acad. Sci. USA 114, 6776-6781.

(75) Davis, C. M., and Gruebele, M. (2018) Labeling for quantitative comparison of imaging measurements in vitro and in cells. Biochemistry 57, 1929-1938.

(76) Timothy, C., Kapil, D., and Martin, G. (2018) Pressure- and heat-induced protein unfolding in bacterial cells: Crowding vs. Sticking. FEBS Lett. 592, 1357-1365.

(77) Senske, M., Törk, L., Born, B., Havenith, M., Herrmann, C., and Ebbinghaus, S. (2014) Protein stabilization by macromolecular crowding through enthalpy rather than entropy. J. Am. Chem. Soc. 136, 9036-9041.

(78) Feig, M., Yu, I., Wang, P.-H., Nawrocki, G., and Sugita, Y. (2017) Crowding in cellular environments at an atomistic level from computer simulations. J. Phys. Chem. B 121, 8009-8025.

(79) Sapir, L., and Harries, D. (2014) Origin of enthalpic depletion forces. J. Phys. Chem. Lett. 5, 1061-1065.

27 (80) Sapir, L., and Harries, D. (2015) Is the depletion force entropic? Molecular crowding beyond steric interactions. Curr. Opin. Colloid Interface Sci. 20, 3-10.

(81) Srere, P. A. (1985) The metabolon. Trends Biochem. Sci. 10, 109-110.

(82) Caperelli, C. A., Benkovic, P. A., Chettur, G., and Benkovic, S. J. (1980) Purification of a complex catalyzing folate cofactor synthesis and transformylation in de novo purine biosynthesis. J. Biol. Chem. 255, 1885-1890.

(83) Haggie, P. M., and Brindle, K. M. (1999) Mitochondrial citrate synthase is immobilized in vivo. J. Biol. Chem. 274, 3941-3945.

(84) Fei, W., and Shelley, M. (2015) Krebs cycle metabolon: Structural evidence of substrate channeling revealed by cross‐linking and mass spectrometry. Angew. Chem., Int. Ed. 54, 1851-1854.

(85) Bhattacharyya, S., Bershtein, S., Yan, J., Argun, T., Gilson, A. I., Trauger, S. A., and Shakhnovich, E. I. (2016) Transient protein-protein interactions perturb e. Coli metabolome and cause gene dosage toxicity. eLife 5, e20309.

(86) Bulutoglu, B., Garcia, K. E., Wu, F., Minteer, S. D., and Banta, S. (2016) Direct evidence for metabolon formation and substrate channeling in recombinant TCA cycle enzymes. ACS Chem Biol. 11, 2847-2853.

(87) Schmitt, D. L., and An, S. (2017) Spatial organization of metabolic enzyme complexes in cells. Biochemistry 56, 3184-3196.

(88) Ghaemmaghami, S., Huh, W.-K., Bower, K., Howson, R. W., Belle, A., Dephoure, N., O'Shea, E. K., and Weissman, J. S. (2003) Global analysis of protein expression in yeast. Nature 425, 737.

(89) Yang, T., Li, X.-M., Bao, X., Fung, Y. M. E., and Li, X. D. (2015) Photo-lysine captures proteins that bind lysine post-translational modifications. Nat. Chem. Biol. 12, 70-72.

(90) Zhang, Y., Fonslow, B. R., Shan, B., Baek, M.-C., and Yates, J. R. (2013) Protein analysis by shotgun/bottom-up proteomics. Chem. Rev. 113, 2343-2394.

(91) Phillip, Y., Sherman, E., Haran, G., and Schreiber, G. (2009) Common crowding agents have only a small effect on protein-protein interactions. Biophys. J. 97, 875-885.

(92) Jiao, M., Li, H.-T., Chen, J., Minton, A. P., and Liang, Y. (2010) Attractive protein- polymer interactions markedly alter the effect of macromolecular crowding on protein association equilibria. Biophys. J. 99, 914-923.

28

CHAPTER 2: COSOLUTE AND CROWDING EFFECTS ON A SIDE-BY-SIDE PROTEIN DIMER

Edited from Alex J. Guseman,1 and Gary J. Pielak*,1,2,3 Biochemistry 56 (7):971-976

Introduction

The Escherichia coli cytoplasm, which has a macromolecule concentration of greater than 300 g/L,1 is a complex environment crowded with an assortment of proteins, nucleic acids and small molecules. A reductionist approach featuring simple buffered solutions with macromolecule concentrations of less than 10 g/L, however, is usually utilized to simplify investigations of protein and nucleic acid biophysics. Such approaches provide a wealth of knowledge about protein structure, function, and folding, but they neglect the transient interactions that take place in cells brought about by the crowded environment.2, 3

To understand how proteins behave in cells, high concentrations of cosolutes are often used to simulate the cellular interior. Cosolutes, from large and supposedly inert macromolecules like polyethylene glycol (PEG) and the crosslinked sucrose polymer

Ficoll-70TM to globular proteins have been used to mimic the cellular interior.4-6 In concentrated cosolute solutions, the test protein experiences two interactions that are essentially absent in dilute solution: hard-core repulsions and chemical interactions between the cosolutes and the test protein.7-15 Hard-core repulsions reduce the volume available to the test protein and favor states that take up the least space.16 Chemical interactions arise from the close proximity of the test protein and the cosolute and include charge-charge,12, 17 hydrophobic and hydrogen bonding interactions. Repulsive

29 chemical interactions between charges of the same sign on the test protein and the cosolute stabilize globular proteins, because they enhance the hard-core repulsions.

Attractive interactions destabilize globular proteins, because protein unfolding exposes additional groups that can form attractive interactions.15

Environments ranging from small cosolutes to synthetic polymers, globular proteins, cell lysates and even the cellular interior8-13, 17 have been used to explore how crowding affects protein stability and folding. However, proteins rarely act alone, and there are few studies on the effect of crowding on protein-protein interactions.18, 19

The B1 domain of protein G (GB1) is one of the most extensively studied globular proteins. This 56-residue molecule adopts a thermally stable 4β+α globular fold, and a large number of variants have been characterized using a variety of biophysical methods.20-22 One variant, A34F, forms a simple side-by-side dimer (Figure 1).23 Such dimers can be thought of as a pair of kissing spheres where the volume of the dimer is approximately twice that of the monomer,24 and there is a small decrease in solvent accessible surface area upon dimer formation. Dimerization of A34F GB1 occurs because the side chain of Phe 34 becomes part of the hydrophobic core, displacing the

Tyr 33 side chain at the surface.23 To compensate, the C-terminal end of the sole helix unfolds, forming a pocket for the Tyr 33 side chain of the other GB1 monomer. The dimer is also stabilized by hydrogen bonding along the now adjacent -sheets.

For A34F GB1, our calculations show that there is no volume reduction on dimerization; the dimer and monomer molecular volumes are 15500 Å3 and 7500 Å3, respectively. The change in solvent accessible surface is also small; the area of the dimer, 7023.90 Å2, is only 9% less than that of two monomers (2 x 3924.1 Å2). Scaled-

30 particle theory predict that hard core repulsions have a small effect on the stability of a side by side dimer,24 making this a good system for focusing on the effect of chemical interactions.

Figure 2.1: The A34F GB1 dimer and monomer and 19F NMR spectra acquired at two concentrations. In each spectrum, the area under each peak is proportional to the concentration of the corresponding state, allowing straightforward quantification of the dissociation constant, KD→M.

As we show, the dissociation constant, KD→M, where D represent the dimer and

M the monomer, can be quantified in buffer and buffered solutions containing high concentrations of cosolutes by using 19F nuclear magnetic resonance spectroscopy

(NMR). Fluorine-19 is an attractive nucleus because it is 100% abundant, rarely used in biology, has a high NMR sensitivity (83% of protons) and its chemical shift is sensitive to its environment.25-27 Furthermore, the simplicity of one-dimensional 19F spectra allows the acquisition of data in a matter of minutes. Importantly, Escherichia coli readily incorporates fluorinated aromatic amino acids into recombinant protein.28, 29 GB1

31 contains three tyrosines that are readily labeled with 3-fluorotyrosine. Their resonances have been assigned.30

Experimental Procedures

Vector. The gene for T2Q GB1 in pET11a was used as the wild-type vector. This mutation prevents N-terminal degradation.31 We refer to the T2Q variant as the wild- type protein. The A34F change was made using Agilent’s QuickChange mutagenesis .

Expression and Purification. GB1 was expressed in E. coli BL21(DE3) cells and purified using a modified protocol.32 Briefly, a one-L culture harboring the A34F construct was grown in antibiotic-containing, 15N-enriched, M9 media and incubated 37

⁰C with shaking (New Brunswick Scientific Innova I26, 225 rpm).32 When the cells reached an optical density at 600 nm of 0.4, N-phosphonomethylglycine (0.5 g, to inhibit aromatic amino acid synthesis), 3-fluorotyrosine (70 mg), phenylalanine (60 mg) and tryptophan (60 mg), were added.33 Protein expression was induced with Isopropyl β-D-

1-thiogalactopyranoside (IPTG), at a final concentration of 1 mM, when the culture reached an optical density at 600 nm of 0.6. After two hours, the cells were pelleted at

1000g for 25 min. The supernatant was discarded and the pellet was stored at -20 °C.

Cells were lysed by sonication (Fischer Scientific Sonic Dismembrator model 500, 15% amplitude, 10 min, 50% duty cycle) in 25 mL of 20 mM tris, pH 7.5, containing a cOmplete protease inhibitor TM tablet (Roche). The lysate was centrifuged for 45 min at

27000g to remove cell debris, and the supernatant was filtered (0.22 μm). The filtrate was loaded on a 16 mm x 200 mm Q Sepharose anion exchange column attached to a

GE AKTA FPLC. The column was eluted with 20 mM tris, pH 7.5 using a gradient from

0 M to 1 M NaCl over three column volumes. Fractions were subjected to sodium

32 dodecyl sulfate poly acrylamide gel electrophoresis (SDS PAGE) to assess protein content. GB1-containing fractions were concentrated in a 3000-molecular-weight-cut-off

Amicon spin concentrator and buffer exchanged into 10 mM potassium phosphate containing 150 mM NaCl (pH 6.0). The final volume was 3 mL. This solution was loaded on a 16 mm x 600 mm GE Superdex-75 gel filtration column and developed over two column volumes of the same buffer. Fractions were subjected SDS PAGE. Fractions containing only GB1 were concentrated and buffer exchanged into filtered, 17 mΩ cm-1

H2O, flash frozen in a CO2(s)/ethanol bath and lyophilized for at least 12 h (Labconco

FreeZone). Lyophilized protein was stored at -20 °C. The process was completed in <36 h to avoid aggregation. Protein purity was confirmed by electrospray ionization mass spectrometry (expected 6385 Da, observed 6391 Da Figure S1), SDS PAGE and NMR spectroscopy (Figure S2).

NMR. Fluorine-labeled protein was resuspended at a final concentration of 500

μM in either 20 mM sodium phosphate buffer (pH 7.5, 298 K), or this buffer plus cosolute. Experiments were conducted with a Bruker Avance III HD spectrometer operating at a 19F Larmor frequency of 470 MHz and equipped with a cryogenic QCI probe with an H/F channel. Spectra comprised 31047 points, 128 scans, a delay of 2 s, an acquisition time of 1.4 s, an offset of -100 ppm and a sweep width of 100 ppm.

Samples were internally referenced using a Wilmad coaxial insert containing 0.01% deuterated trifluoroacetic acid (-75.6 ppm) in D2O.

Data were processed using TopSpin 3.2. A line broadening function of 10 Hz was applied to each free-induction decay before Fourier transformation. The resonances from tyrosine 33 corresponding to the monomer and dimer were integrated to obtain the

33 relative populations. The fraction of protein in the dimer, Fd, was calculated by dividing the integral of the dimer peak by the sum of the integrals of the monomer and dimer peaks. KD→M values were obtained as described in Results using MATLAB version

R2016A.

Cosolutes. Small cosolutes were weighed, dissolved in 20 mM sodium phosphate buffer, adjusted to pH 7.5 and diluted to the desired concentrations.

Lyophilized lysozyme and bovine serum albumin were purchased from Sigma Aldrich.

To prepare these cosolute solutions, protein was weighed and dissolved in 20 mM sodium phosphate buffer (pH 7.5). Concentrations were determined spectrophotometrically with a Nanodrop 1000 spectrophotometer using extinction coefficients of 6700 (L mg-1 cm-1) for bovine serum albumin34 and 26400 (L mg-1 cm-1) for lysozyme.35 E. coli lysates were prepared as described36.

Volumes and Surface Areas. VADAR37 and POPS38 software were used to calculate the volumes and solvent accessible surface areas, respectively. Volumes were calculated using the standard Voronoi procedure and the PDB structure 2RMM23 for the dimer and 2RMMa and 2RMMb for the monomer. Monomer solvent accessible surface areas were calculated using the ‘per chain analysis’ of 2RMM and 1GB1 with a probe radius of 1.4

Results

Quantifying Dimerization. A34F GB1 has tyrosines at positions 3, 33 and 45.

Its 19F spectrum exhibits six peaks. Tyrosines 3 and 45 are buried in the core. Their rotomers are in slow exchange on the NMR timescale and, therefore, two resonances are observed for each residue. One resonance is observed for tyrosine 33, suggesting that its rotomers are in fast exchange or have similar chemical shifts. Nevertheless, the

34 shift of the tyrosine 33 peak is affected by dimerization; its resonance exhibits concentration dependence (Figure S3), suggesting that the dimer and the monomer are in slow exchange, in agreement with previous studies.23 We assigned the dimer to the resonance that increases with GB1 concentration. We integrated the dimer and monomer peaks to give their relative populations at five GB1 concentrations using serial dilution. Experiments were performed in triplicate. The data were fitted to eq 139 , where

Pt Is the total GB1 concentration and Fd is the fraction dimer, to yield a KD→M at 298 K and pH 7.5 of 59 ± 2 μM, where the uncertainty is the standard deviation of the mean.

Our measured dissociation constant is approximately twice that of previous report, 27 ±

4 μM determined at pH 5.5 in 50 mM sodium phosphate.23

2 [4푃푡+퐾푑−√퐾푑+8푃푡퐾푑] 퐹푑 = . (1) 4푃푇

The near identity of 15N-1H HSQC spectra show that fluorine labeling has a small effect on the structure of A34F (Figure S4) and KD→M, because equilibrium analytical ultracentrifugation of the unlabeled protein gives the same value (Figured S5). In

19 summary, F NMR permits facile and precise quantification of KD→M for the GB1 A34F side-by-side dimer from five A34F concentrations in less than 2 h.

35

Figure 2.2: Binding isotherms of 3-fluorotyrosine labeled A34F in solutions of cytosol (75 g/L blue), buffer (black) and ethylene glycol (200 g/L, red) at pH 7.5, 298 K.

Small Molecule and Synthetic Polymer Cosolutes. Two osmolytes, TMAO (75

Da) and urea (60 Da), were tested. TMAO (38 g/L) stabilized the dimer, reducing KD→M to 25 ± 2 μM. Urea destabilized the dimer, increasing KD→M to 90 ± 5 μM. The effects of polyethylene glycol (PEG, 8 kDa,) its monomer [ethylene glycol (200 g/L)], Ficoll-70 (70 kDa) and its monomer [sucrose (300 g/L)], were examined at the highest concentrations consistent with acquiring high quality NMR data. PEG and ethylene glycol destabilized

TM the dimer giving KD→M values of 85 ± 5 μM, and 95 ± 6 μM, respectively. Ficoll weakly stabilized the dimer yielding a KD→M of 45 ± 2 μM, but its monomer, sucrose, at the same g/L concentration, had a small effect compared to buffer alone (K D→M = 55 ± 2

μM).

Proteins and Freeze-Dried Cytosol. Two protein cosolutes, lysozyme (14 kDa, pI 11.4) and bovine serum albumin (BSA, 68 kDa, pI 4.5) and freeze-dried E. coli cytosol36 (75 g/L) were tested. BSA at 100 g/L stabilized the dimer (K D→M 26 ± 2 μM),

36 but 50 g/L lysozyme destabilized the dimer (K D→M 80 ± 5 μM). The cosolutes were used at the highest concentrations that give interpretable NMR spectra (higher concentrations caused excessive broadening). Cytosol had the largest stabilizing effect of any cosolute

(K D→M 17 ± 2 μM). The results from all cosolutes are summarized in Table 2.1.

K D→M ∆Go’ D→M ∆∆G o’ D→M Condition μM kcal/mol

buffer (20 mM NaPO4) 59 ± 2 5.75 ± 0.03 N/A ethylene glycol, 200 g/L 95 ± 6 5.46 ± 0.04 -0.29 ± 0.05 urea, 100 g/L 90 ± 5 5.50 ± 0.03 -0.25 ± 0.04 8 kDa PEG, 200 g/L 85 ± 5 5.53 ± 0.03 -0.22 ± 0.03 lysozyme, 50 g/L 80 ± 5 5.57 ± 0.04 -0.18 ± 0.05 sucrose, 300 g/L 55 ± 3 5.78 ± 0.03 0.03 ± 0.04 Ficoll-70, 300 g/L 45 ± 2 5.90 ± 0.03 0.15 ± 0.04 BSA, 100 g/L 26 ± 2 6.23 ± 0.05 0.48 ± 0.06 TMAO, 38 g/L 25 ± 2 6.25 ± 0.05 0.50 ± 0.06 cytosol, 75 g/L 17 ± 2 6.47 ± 0.07 0.72 ± 0.08 Values are averages. Uncertainties are standard deviations of the mean from three trials.

Table 2.1: Thermodynamic data for A34F GB1 in cosolutes at 298 K pH 7.5

Discussion

Dissociation constants (Figure 3) and modified standard state dissociation free energies with respect to buffer were compared at pH 7.5 and 298 K.

37

Figure 2.3: A34F dissociation constants in buffer, osmolytes, synthetic polymers, their monomers and protein crowders. The size of the dots reflects the uncertainty.

Small Molecule Cosolutes. Urea and ethylene glycol interact favorably with the protein surface, favoring states with the most surface because expansion maximizes interactions between the protein and the cosolute.40-42 Consistent with this idea, urea and ethylene glycol destabilize the dimer by 0.25 kcal/mol and 0.29 kcal/mol, respectively. TMAO and sucrose work in the opposite manner; interacting unfavorably with the protein, favoring compact- over expanded- states.43, 44 As expected, TMAO stabilizes the dimer by 0.50 kcal/mol. It is unclear why sucrose has an almost negligible effect, even though it stabilizes proteins. Perhaps due to a balancing of repulsive interactions from the backbone and attractive interactions from the exposed residues on the protein surface.44 Proving the importance of chemical interactions, however, must await determination of the enthalpy of dissociation.45-49

38 Synthetic Polymers and Their Monomers. Both PEG and ethylene glycol destabilize the dimer, but PEG is less destabilizing. Sucrose has almost no effect, but its polymer FicollTM is stabilizing. An explanation for both observations is that there is a small stabilizing macromolecular effect. However, the small net stabilizing effect could also arise from polymer induced shielding of destabilizing attractive interactions by the polymer, as suggested by Knowles et al.41

Proteins. To gain more biologically relevant information, we turned to globular proteins as cosolutes and examined the effects of lysozyme (14 kDa, pI 9.7, 50 g/L) and

BSA (68 kDa, pI 4.5, 100 g/L). At pH 7.5, both are polyanions. The GB1 monomer has a net charge of approximately -4.3, and BSA has a net charge of approximately -18. The resulting charge-charge repulsion between BSA and GB1 should favor dimerization.

This prediction is borne out; BSA stabilizes the dimer by 0.48 kcal/mol.

Lysozyme is expected to have the opposite effect, because it is a polycation with a charge of +8 at pH 7.5. The resulting attraction between lysozyme and GB1 should favor the monomer because it has more charged surface to interact with lysozyme. As predicted, lysozyme destabilizes the dimer by ~0.2 kcal/mol. The effect would have been larger, but lysozyme concentrations of >50 g/L give poor quality spectra, again consistent with idea that its attractive interactions with GB1 increase its effective molecular weight, broadening the resonances.

Freeze-dried Lysate. Proteins comprise ~55% of the dry weight of the E. coli cytoplasm,50 and the molecular masses and isoelectric points of the proteome ranges from ~5 kDa to >200 kDa and from 4 to 12, respectively. The E. coli proteome has an abundance of acidic proteins, and GB1 also has a net negative charge at the pH studied

39 here. Therefore, we expect a stabilizing effect from the cellular environment. To test this hypothesis we examined the effect of freeze-dried E. coli lysates36 at 75 g/L. These lysates comprise solely of proteins and nucleic acids as the small molecules were removed during the preparation process. As predicted, lysate increases dimer stability by 0.7 kcal/mol. It is difficult to parse this stability increase between macromolecular effects and repulsive charge-charge interactions, but the charge-charge portion of the stabilization arises from a combination of protein charge and nucleic acids charge.

The protein stabilizing or destabilizing effect of most cosolutes, including reconstituted lysates,14 increases with cosolute concentration.44, 51 Therefore, we anticipate that concentrations of lysates approaching the concentration in cells,1 will have a greater influence on protein stability. Therefore, the intracellular environment might have a much larger effect on protein-protein interactions that what is observed in our model studies.

Conclusions

A recent review of macromolecular crowding highlighted the need for studies of protein association under crowded conditions.52 We undertook this challenge using a simple homodimeric system. We find that the attractive and repulsive interactions that govern the effects of cosolutes on protein stability also govern their effects on this protein-protein interaction. Polymer crowders, which are believed to stabilize proteins through hard-core repulsions, did not show a large stabilizing effect. Reconstituted E. coli cytosol was the most stabilizing cosolute, which highlights the important differences between physiologically relevant cosolutes and synthetic polymer crowders, a difference that is also noted for effects on protein stability. Our results highlight the importance of chemical interactions as a mechanism for regulating protein complex stability. This

40 study lays the foundation for defining the role of chemical interactions in protein-protein interactions.

Supporting Information

Figure S2.1: ESI-FT-ICR analysis of fluorinated A34F GB1 reveals 3 populations of fluorinated GB1 containing 0 (6371.90 Da), 1 (6390.89 Da), or 2 (6407.87 Da) fluorotyrosine residues.

41

Figure S2.2: 1H-15N HSQC of A34F GB1 dimer (3.3 mM, red) and 1H-15N HMQC of A34F GB1 monomer (10 μM, blue) shows chemical shift changes between monomer and dimer states, consistent with previously published spectra

42

Figure S2.3: (A)19F NMR spectrum of A34F GB1 shows six peaks, two for Tyr-45 and two Tyr-3, all 4 are in slow exchange, and two resonances near -136.5 ppm that correspond to Tyr-33 in the dimer state (upfield) and monomer state (down field). (B) Peaks were assigned using the published assignments and the A34F Y3F variant data shown here.

43

Figure S2.4:1H-15N HSQC spectra of 19F 15N GB1 (red) and 15N A34F GB1 (blue) indicate that fluorination has a small effect on A34F GB1 structure.

Figure S2.5: Analytical ultracentrifugation curves for 15N A34F GB1 and 19F 15N A34F GB1 yield the same dissociation constant and uncertainty (59 ± 10 μM), showing that 19F labeling does not affect dimerization.

44 REFERENCES

(1) Zimmerman, S. B., and Trach, S. O. (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J. Mol. Biol. 222, 599-620.

(2) Sarkar, M., Li, C., and Pielak, G. J. (2013) Soft interactions and crowding. Biophys. Rev. 5, 187-194.

(3) Cohen, R. D., and Pielak, G. J. (2017) A cell is more than the sum of its (dilute) parts: A brief history of quinary structure. Protein Sci. 26, 403-413.

(4) McPhie, P., Ni, Y., and Minton, A. P. (2006) Macromolecular crowding stabilizes the molten globule form of apomyoglobin with respect to both cold and heat unfolding. J. Mol. Biol. 361, 7-10.

(5) Hong, J., and Gierasch, L. M. (2010) Macromolecular crowding remodels the energy landscape of a protein by favoring a more compact unfolded state. J. Am. Chem. Soc. 132, 10445-10452.

(6) Christiansen, A., and Wittung-Stafshede, P. (2013) Quantification of excluded volume effects on the folding landscape of Pseudomonas aeruginosa apoazurin in vitro. Biophys. J. 105, 1689-1699.

(7) Miklos, A. C., Sarkar, M., Wang, Y., and Pielak, G. J. (2011) Protein crowding tunes protein stability. J. Am. Chem. Soc. 133, 7116-7120.

(8) Gnutt, D., Gao, M., Brylski, O., Heyden, M., and Ebbinghaus, S. (2015) Excluded- volume effects in living cells. Angew. Chem., Int. Ed. 54, 2548-2551.

(9) Danielsson, J., Mu, X., Lang, L., Wang, H., Binolfi, A., Theillet, F.-X., Bekei, B., Logan, D. T., Selenko, P., Wennerström, H., and Oliveberg, M. (2015) Thermodynamics of protein destabilization in live cells. Proc. Natl. Acad. Sci. USA 112, 12402-12407.

(10) Monteith, W. B., Cohen, R. D., Smith, A. E., Guzman-Cisneros, E., and Pielak, G. J. (2015) Quinary structure modulates protein stability in cells. Proc. Natl. Acad. Sci. USA 112, 1739-1742.

(11) Wang, Q., Zhuravleva, A., and Gierasch, L. M. (2011) Exploring weak, transient protein–protein interactions in crowded in vivo environments by in-cell nuclear magnetic resonance spectroscopy. Biochemistry 50, 9225-9236.

(12) Cohen, R. D., Guseman, A. J., and Pielak, G. J. (2015) Intracellular pH modulates quinary structure. Protein Sci. 24, 1748-1755.

45 (13) Crowley, P. B., Chow, E., and Papkovskaia, T. (2011) Protein interactions in the Escherichia coli cytosol: An impediment to in-cell NMR spectroscopy. ChemBioChem 12, 1043-1048.

(14) Sarkar, M., Smith, A. E., and Pielak, G. J. (2013) Impact of reconstituted cytosol on protein stability. Proc. Natl. Acad. Sci. USA 110, 19342-19347.

(15) Sarkar, M., Lu, J., and Pielak, G. J. (2014) Protein crowder charge and protein stability. Biochemistry 53, 1601-1606.

(16) Minton, A. P. (1981) Excluded volume as a determinant of macromolecular structure and reactivity. Biopolymers 20, 2093-2120.

(17) Cohen, R. D., and Pielak, G. J. (2016) Electrostatic contributions to protein quinary structure. J. Am. Chem. Soc. 138, 13139-13142.

(18) Phillip, Y., Sherman, E., Haran, G., and Schreiber, G. (2009) Common crowding agents have only a small effect on protein-protein interactions. Biophys. J. 97, 875-885.

(19) Jiao, M., Li, H.-T., Chen, J., Minton, A. P., and Liang, Y. (2010) Attractive protein- polymer interactions markedly alter the effect of macromolecular crowding on protein association equilibria. Biophys. J. 99, 914-923.

(20) Gronenborn, A., Filpula, D., Essig, N., Achari, A., Whitlow, M., Wingfield, P., and Clore, G. (1991) A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein G. Science 253, 657-661.

(21) Smith, C. K., Bu, Z., Engelman, D. M., Regan, L., Anderson, K. S., and Sturtevant, J. M. (1996) Surface point mutations that significantly alter the structure and stability of a protein's denatured state. Protein Sci. 5, 2009-2019.

(22) Honda, S., Kobayashi, N., and Munekata, E. (2000) Thermodynamics of a β-hairpin structure: Evidence for cooperative formation of folding nucleus. J. Mol. Biol. 295, 269-278.

(23) Jee, J., Byeon, I.-J. L., Louis, J. M., and Gronenborn, A. M. (2008) The point mutation A34F causes dimerization of GB1. Proteins 71, 1420-1431.

(24) Berg, O. G. (1990) The influence of macromolecular crowding on thermodynamic activity: Solubility and dimerization constants for spherical and dumbbell-shaped molecules in a hard-sphere mixture. Biopolymers 30, 1027-1037.

(25) Harper, D. B., and O'Hagan, D. (1994) The fluorinated natural products. Nat. Prod. Rep. 11, 123-133.

(26) O’Hagan, D., and B. Harper, D. (1999) Fluorine-containing natural products. J. Fluorine Chem. 100, 127-133.

46 (27) Chen, H., Viel, S., Ziarelli, F., and Peng, L. (2013) 19F NMR: A valuable tool for studying biological events. Chem. Soc. Rev. 42, 7971-7982.

(28) Campos-Olivas, R., Aziz, R., Helms, G. L., Evans, J. N. S., and Gronenborn, A. M. (2002) Placement of 19F into the center of GB1: Effects on structure and stability. FEBS Lett. 517, 55-60.

(29) Crowley, P. B., Kyne, C., and Monteith, W. B. (2012) Simple and inexpensive incorporation of 19F-tryptophan for protein NMR spectroscopy. Chem. Commun. 48, 10681-10683.

(30) Ye, Y., Liu, X., Zhang, Z., Wu, Q., Jiang, B., Jiang, L., Zhang, X., Liu, M., Pielak, G. J., and Li, C. (2013) 19F NMR spectroscopy as a probe of cytoplasmic viscosity and weak protein interactions in living cells. Chem. -Eur. J. 19, 12705-12710.

(31) Smith, C. K., Withka, J. M., and Regan, L. (1994) A thermodynamic scale for the b sheet forming tendencies of the amino acids. Biochemistry 33, 5510-5517.

(32) Monteith, W. B., and Pielak, G. J. (2014) Residue level quantification of protein stability in living cells. Proc. Natl. Acad. Sci. USA 111, 11335-11340.

(33) Khan, F., Kuprov, I., Craggs, T. D., Hore, P. J., and Jackson, S. E. (2006) 19F NMR studies of the native and denatured states of green fluorescent protein. J. Am. Chem. Soc. 128, 10729-10737.

(34) Putnam, F. (1984) Progress in plasma proteins, In The plasma proteins 2nd ed. (Putnam, F., Ed.), pp 1-44, Academic Press, New York.

(35) Aune, K. C., and Tanford, C. (1969) Thermodynamics of the denaturation of lysozyme by guanidine hydrochloride. I. Dependence on pH at 25°. Biochemistry 8, 4579-4585.

(36) Wang, Y., Li, C., and Pielak, G. J. (2010) Effects of proteins on protein diffusion. J. Am. Chem. Soc. 132, 9392-9397.

(37) Willard, L., Ranjan, A., Zhang, H., Monzavi, H., Boyko, R. F., Sykes, B. D., and Wishart, D. S. (2003) VADAR: A web server for quantitative evaluation of protein structure quality. Nucleic Acids Res. 31, 3316-3319.

(38) Cavallo, L., Kleinjung, J., and Fraternali, F. (2003) POPS: A fast algorithm for solvent accessible surface areas at atomic and residue level. Nucleic Acids Res. 31, 3364-3366.

(39) Aramini, J., Hamilton, K., Ma, L.-C., Swapna, G. V. T., Leonard, P., Ladbury, J., Krug, R., and Montelione, G. (2014) 19F NMR reveals multiple conformations at the dimer interface of the nonstructural protein 1 effector domain from Influenza A virus. Structure 22, 515-525.

47 (40) Guinn, E. J., Pegram, L. M., Capp, M. W., Pollock, M. N., and Record, M. T. (2011) Quantifying why urea is a protein denaturant, whereas glycine betaine is a protein stabilizer. Proc. Natl. Acad. Sci. U. S. A. 108, 16932-16937.

(41) Knowles, D. B., Shkel, I. A., Phan, N. M., Sternke, M., Lingeman, E., Cheng, X., Cheng, L., O’Connor, K., and Record, M. T. (2015) Chemical interactions of polyethylene glycols (PEGs) and glycerol with protein functional groups: Applications to effects of PEG and glycerol on protein processes. Biochemistry 54, 3528-3542.

(42) Shkel, I. A., Knowles, D. B., and Record, M. T. (2015) Separating chemical and excluded volume interactions of polyethylene glycols with native proteins: Comparison with PEG effects on DNA helix formation. Biopolymers 103, 517- 527.

(43) Street, T. O., Bolen, D. W., and Rose, G. D. (2006) A molecular mechanism for osmolyte-induced protein stability. Proc. Natl. Acad. Sci. U. S. A. 103, 13997- 14002.

(44) Auton, M., Rösgen, J., Sinev, M., Holthauzen, L. M. F., and Bolen, D. W. (2011) Osmolyte effects on protein stability and solubility: A balancing act between backbone and side-chains. Biophys. Chem. 159, 90-99.

(45) Wang, Y., Sarkar, M., Smith, A. E., Krois, A. S., and Pielak, G. J. (2012) Macromolecular crowding and protein stability. J. Am. Chem. Soc. 134, 16614- 16618.

(46) Benton, L. A., Smith, A. E., Young, G. B., and Pielak, G. J. (2012) Unexpected effects of macromolecular crowding on protein stability. Biochemistry 51, 9773- 9775.

(47) Senske, M., Törk, L., Born, B., Havenith, M., Herrmann, C., and Ebbinghaus, S. (2014) Protein stabilization by macromolecular crowding through enthalpy rather than entropy. J. Am. Chem. Soc. 136, 9036-9041.

(48) Politi, R., and Harries, D. (2010) Enthalpically driven peptide stabilization by protective osmolytes. Chem. Commun. 46, 6449-6451.

(49) Sapir, L., and Harries, D. (2015) Is the depletion force entropic? Molecular crowding beyond steric interactions. Curr. Opin. Colloid Interface Sci. 20, 3-10.

(50) Ingraham, J. (1987) Chemical composition of Escherichia coli, In Escherichia coli and Salmonella typhimurium (F. C. Neidhardt, J. L. Ingraham, K. B. Low, B.Magasanik, M. Schaechter, and Umbarger., H. E., Eds.), pp 1543-1554, American Society for Microbiology, Washington DC.

48 (51) Schellman, J. A. (1990) A simple model for solvation in mixed solvents. Applications to the stabilization and destabilization of macromolecular structures. Biophys. Chem. 37, 121-140.

(52) Rivas, G., and Minton, A. P. (2016) Macromolecular crowding in vitro, in vivo, and in between. Trends Biochem. Sci. 41, 970-981.

49

CHAPTER 3: SURFACE-CHARGE MODULATES PROTEIN-PROTEIN INTERACTIONS IN PHYSIOLOGICALLY-RELEVANT ENVIRONMENTS.

Edited from Alex J. Guseman Shannon L. Speer, Gerardo M. Perez Goncalves, and Gary J. Pielak, Biochemistry 57:1681-1684

Introduction

The formation and dissociation of protein complexes regulate processes from signaling, to transcription and metabolism, all of which are essential to maintaining cellular homeostasis.1 Traditionally, these interactions were studied in dilute buffered solution. In their native cellular environments, however, the concentration of macromolecules can exceed 300 g/L.2, 3 These high concentrations of macromolecules are the source of the contacts that organize the cytoplasm.4, 5 These interactions, which define quinary structure,6-8 comprise two components: hard-core steric repulsions and

“soft” chemical interactions.9 Their influence is currently investigated in the context of protein folding and stability.9-17 Here, we shift the emphasis to protein-protein interactions.18-20

There are over three decades of speculation about how crowding influences the stability of a test protein.21-24 Originally, crowding effects were attributed solely to hard- core repulsions, which occur at short crowder-test protein distances, because of the large and unfavorable energy associated with the interpenetration of electron shells.9

Hard-core repulsions favor compact states. The stabilization from synthetic polymers was attributed to these repulsions, because the folded state occupies less space than the unfolded ensemble. Results from early studies of protein dimerization in crowded

50 conditions using inert polymers suggest that hard-core repulsions play but a small role in dimerization.25 The minor increase in stability can be explained by the small decrease in volume when two monomers become a dimer.

Synthetic polymers, although traditionally used to simulate the crowded cellular environment, are a poor representation of biology, because they do not have the same surface properties as biological macromolecules.26, 27 These so-called “soft” interactions include charge-charge contacts, hydrogen bonding, and hydrophobic interactions between the surface of a test protein and physiologically-relevant crowded environments,9, 28-34 but even synthetic polymers have chemical interactions with test proteins.27, 35, 36

Nevertheless, hard-core repulsions are always present, and they are stabilizing when the products of a reaction occupy less space than the reactants. The use of proteins and cellular lysates as cosolutes revealed the importance of chemical interactions, because globular proteins are destabilized in these environments despite the stabilizing effects of hard-core repulsions.19, 30, 37 In these environments, side chains and exposed backbone on the surface of the test protein interact with the whole surface of the macromolecules [as opposed to specific (i.e., ligand) binding, which would be stabilizing]. The current model used to explain crowding effects predicts that chemical interactions stabilize proteins when repulsive and destabilize proteins when attractive.

Extending this idea to protein-protein interactions, where hard-core repulsions are less important,25 suggested to us that soft interactions can tune the protein-complex stability.

We used a model homodimer system, the A34F variant of the B1 domain of protein G, GB1 (6 kDa, pI 4.5),38, 39 to test this idea. Dimerization is driven by

51 displacement of the Tyr-33 side chain from the hydrophobic core by the phenylalanine side chain at position 34. To compensate, the C-terminus of the -helix unravels, forming a hydrophobic pocket for the Tyr-33 from a second molecule. The interaction is stabilized by hydrogen bonding between antiparallel -strands at the dimer interface.

The components of the 12-kDa homodimer retain the tertiary structure of the monomer, forming a dimer resembling kissing spheres. As a variant of a thermostable monomer, the A34F variant provides access to nearly all GB1 variants in the Protherm database,40 allowing us to control its surface properties.

We measured dimer stability in several cosolutes using 19F NMR. Fluorine is easily incorporated by supplementing minimal media with 3-fluorotyrosine prior to inducing GB1 expression. Spectra of the purified protein show two resonances in slow exchange on the NMR timescale, allowing the monomer-dimer equilibrium to be determined from their areas.19 Dimer stability is minimally influenced by 100-g/L concentrations of the neutral synthetic polymers, but more influenced by protein cosolutes, suggesting a role for charge-charge interactions. We showed previously that attractive electrostatic interactions can be diminished by increasing the ionic strength.30

Here, we examine the effect of electrostatics by manipulating the charge on GB1, by changing the charge of the cosolute proteins and by manipulating the pH.

The A34F monomer has a charge of -4 at pH 7.5 and there are no groups that ionize from pH 7.5 to pH 6.2, the range used here (Figure S1). The charge on GB1 was changed by altering aspartic acid 40, which forms part of an acid patch on the surface remote from the dimer interface (Figures 1A and S2). The D40N and D40K variants

52 result in a +1 and +2 charge-change with respect to the monomer. We refer to these proteins as A34F-4, A34F;D40N-3 and A34F;D40K-2.

Figure 3.1. Dimer structure (PDBID: 2RMM) with the atoms of residue 40 shown as spheres. (A) 19F NMR spectra of A34F-4 (blue), A34F;D40N-3 (black), and A34F;D40K-2 (red) show nearly identical spectra. (B) 1H-15N HSQC spectra of the proteins using the same coloring as panel B. (C)

High concentrations of globular proteins have been used to mimic cellular conditions.10, 12, 30, 41 The charge on the surroundings can be altered by using bovine serum albumin (BSA, 66 kDa, pI 4.5) and hen egg white lysozyme (14 kDa, pI 9.7) as cosolutes. Both proteins are highly soluble and commercially available in pure form, but have different surfaces. At physiologically-relevant pH values, BSA is a polyanion

53 whereas lysozyme is a polycation (Figure S3). We used these proteins to create repulsive- and attractive- interactions with GB1.

풐′ 풐′ a KD→M 휟푮푫→푴 휟휟푮푫→푴 Condition μM kcal/mol kcal/mol

A34F-4, pH 7.5b

Buffer 59 ± 2 5.75 ± 0.03 N/Ac

BSA 100 g/L 26 ± 2 6.23 ± 0.05 0.48 ± 0.06

Lysozyme 50 g/L 80 ± 5 5.57 ± 0.04 -0.18 ± 0.05

A34F-4, pH 6.8

Buffer 47 ± 2 5.88 ± 0.03 N/A

BSA 28 ± 1 6.19 ± 0.02 0.31 ± 0.04

Lysozyme 73 ± 2 5.62 ± 0.02 -0.26 ± 0.04

A34F-3, pH 6.2

Buffer 39 ± 1 5.99 ± 0.02 N/A

BSA 35 ± 2 6.05 ± 0.03 0.06 ± 0.04

Lysozyme 66 ± 2 5.68 ± 0.02 -0.31 ± 0.03

A34F;D40N-3, pH 7.5

Buffer 20 ± 1 6.38 ± 0.03 N/A

BSA 16 ± 1 6.52 ± 0.04 0.13 ± 0.05

Lysozyme 31 ± 2 6.13 ± 0.04 -0.26 ± 0.05

A34F;D40K-2, pH 7.5

Buffer 21 ± 2 6.36 ± 0.03 N/A

BSA 18 ± 1 6.45 ± 0.03 0.09 ± 0.07

Lysozyme 36 ± 3 6.04 ± 0.05 -0.32 ± 0.07

Table 3.1: Equilibrium dissociation parameters at 298 K. aPositive values indicate stabilization. Uncertainties are reported as the standard deviation of the mean from triplicate analysis. bfrom19 cN/A, not applicable.

54 Results and discussion

We altered the charge on the protein cosolutes by controlling the pH. The net charge on BSA changes from -18 at pH 7.5, -12 at pH 6.8 and -4 at pH 6.2 (Table S1 and Figure S3). The net charge on lysozyme changes from +7 at pH 7.5, to +8 at pH 6.8 and +9 at pH 6.2.

풐′ We first examined dimer stability in buffer at pH 7.5 and 298 K. 횫푮푫→푴 increases from 5.8 to 6.4 kcal/mol for both A34F;D40N-3 and A34F;D40K-2 (Table 1, Figure 2).

NMR data indicate that A34F;D40N-3and A34F;D40K-2 retain the structure of A34F -4

(Figure 1B and C). These data are consistent with the idea that stabilization arises from the mutation-induced decrease in inter-monomer charge-charge repulsion (Figure S2), although we do not fully understand why the A34F;D40N-3 and A34F;D40K-2 dimers have the same stability.

풐′ Figure 3.2. Side chains at position 40 and thermometer representations of 휟휟푮푫→푴 (pH 7.5, 298 K).

55 We have shown that 100 g/L BSA and 50 g/L lysozyme affect A34F-4 dimer stability at pH 7.5,19 and that under crowded conditions, charge-charge interactions become weaker at high ionic strength.30 BSA has a charge of -18 at this pH. Therefore, we expected, and observed (Table 1, Figure 2), stabilization via electrostatic repulsions when polyanionic A34F is surrounded by polyanionic BSA. The net charge on lysozyme is +7 at pH 7.5. As expected (Table 1), attractive electrostatic interactions between

A34F and lysozyme destabilize the dimer.

To elucidate further the role of electrostatics, we investigated A34F-4 dimerization in BSA as a function of pH (Figure 3). As discussed above, the charge on the dimer does not change over our pH range, but the charge on the protein cosolutes does

(Table S1). If electrostatics interactions between A34F-4 proteins and BSA are important, then the stabilizing effect of BSA should diminish with decreasing pH, because of the decrease in repulsion between the dimer and BSA as BSA gains protons. Supporting this prediction, the stabilization decreases from 0.48 ± 0.06 kcal/mol at pH 7.5, 0.31 ±

0.04 at pH 6.8 to 0.06 ± 0.04 kcal/mol at pH 6.2. These results and those from the charge-change variants suggest that the charge of both GB1 and BSA are responsible for the repulsive electrostatic interactions that stabilize the dimer, supporting the idea of a key role for charge-charge repulsion.

56

Figure 3.3. Graphical representations of A34F-4 dimerization in BSA and lysozyme at pH 7.5 (A), pH 6.8 (B) and pH 6.2 (C)

If the charge of the environment is important for dimer formation, we expect that as the cationic nature of lysozyme increases, the strength of attractive chemical interactions with GB1 will increase, resulting in destabilization. The stability of the dimer in lysozyme relative to its stability in buffer decreases upon dropping the pH from 7.5 to

풐′ 6.8 (휟휟푮푫→푴= -0.26 ± 0.04 kcal/mol), but the value is within the uncertainty of that measured for lysozyme at pH 7.5. At pH 6.2, however, the decrease is larger (-0.31 ±

0.03 kcal/mol), showing that as the positive charge on lysozyme increases the strength of destabilizing attractive interactions increases.

Consistent with the results from the charge-change variants in buffer, there is no increase in destabilization in lysozyme upon changing the charge of GB1 (Table 1).

Nevertheless, stability decreases when lysozyme is more cationic, suggesting an electrostatic role in the destabilization. Other attractive interactions, e.g., hydrophobic contacts, polar interactions, and hydrogen bonds, likely also contribute to destabilization

57 of the dimer, and the same effect is observed for the stability of an SH3 domain in lysozyme at pH 3 where lysozyme was destabilizing despite the existence of net charge-charge repulsion.30

The charge-change variants were then tested in protein cosolutes at pH 7.5.

Given the polyanionic nature of the dimer, we expect the stabilization arising from polyanionic BSA to decrease for A34F;D40N-3 and A34F;D40K-2. The results match the predictions (Figure 2). Thus, decreasing the repulsion between the dimer and BSA, decreases the stabilization from polyanionic BSA.

If the A34F-4-lysozyme interaction is solely electrostatic, the A34F;D40N-3 and

A34F;D40K-2 proteins should restore the stability lost in A34F-4. However, the stabilities of all three complexes are equal to within the uncertainty of the measurements (Table 1,

Figure 2). This observation shows that charge is not uniquely responsible for lysozyme’s destabilizing influences, consistent with studies of protein stability.30, 41 The result can be rationalized because the surfaces of GB1 and lysozyme both possess hydrogen bond donors and acceptors capable of forming attractive interactions. Another important generalization is that electrostatic repulsions may overcome other sources of destabilization, but electrostatic attractions can only reinforce the inherent attractive interactions that exist between proteins and can lead to misfolding and disease.42

Our data highlight the importance of protein surface in controlling the strength of protein-protein interactions under physiologically-relevant conditions. Manipulating the surface interactions between macromolecules resulted in changes on the order of 0.6 kcal/mol, the available thermal energy as defined by product of the gas constant and the temperature (RT). In biology small changes can lead to amazing effects. For instance,

58 increasing the incubation temperature of alligator eggs by 4 oC, representing 0.01 kcal/mol of thermal energy, changes the sex of the hatchlings from 100% female to

100% male.43 More importantly, we conducted our experiments at the highest concentration of protein cosolutes that provide high quality spectra. These macromolecular concentrations are only 1/3rd to 1/6th the intracellular macromolecule concentration,2, 3 and as we have shown,10 increasing the macromolecule concentration increases the influence of the interactions. Therefore, the charge effects we observe are probably much more pronounced in cells. In summary, the spatial dependence of interactions based on protein charge and local pH suggest that manipulating chemical interactions provides a mechanism for regulating physiologically crucial events in the nonhomogeneous, crowded milieu of macromolecules that comprise the cellular interior.

Supporting Information

Materials and Methods

The gene for A34F GB1 in pET-11a was used to produce the D40N and D40K variants using the Agilent Quick-Change kit. Proteins were 19F labeled, expressed, purified and stored as a lyophilized powder.19 GB1 and its variants were resuspended in

20 mM, sodium phosphate, pH 7.5, in the presence or absence of cosolutes. Data were also acquired at pH 6.8 and 6.2 in buffer or buffer plus BSA or lysozyme. For those samples, the conductivity was adjusted with NaCl to match the value at pH 7.5.

NMR spectra were acquired at 298 K on a Bruker Avance III HD spectrometer equipped with a cryogenic QCI probe containing an H/F channel operating at a 19F

Larmor frequency of 470 MHz. Spectra comprised between 128 and 1000 free-induction decays of 8-K points each, a 20-ppm spectral width, a delay time of 2 s and an

59 acquisition time of 1.4 s. Assignments, areas and populations were determined as described.19

Data were obtained from several batches of A34F and its variants and fitted to equation 1, where Fd is the fraction dimer, 푃푇 is the total GB1 concentration and 퐾푑 is the equilibrium constant for dissociation. Uncertainties for the pH 7.5 data are the standard deviation of the mean from triplicate datasets. Binding curves at pH 6.2 and pH 6.8 were acquired once. The uncertainties for these experiments, were obtained by using the standard deviations of the mean for Fd at pH 7.5 in buffer (500 µM, σFd 0.023;

250 µM, σFd 0.015; 125 µM, σFd 0.004; 62.5 µM, σFd 0.020; 31.25 µM, σFd 0.001) BSA

(500 µM, σFd 0.028; 250 µM, σFd 0.004; 125 µM, σFd 0.041; 62.5 µM, σFd 0.012; 31.25

µM, σFd 0.007), and lysozyme (500 µM, σFd 0.031; 250 µM, σFd 0.027; 125 µM, σFd

0.012; 62.5 µM, σFd 0.005; 31.25 µM, σFd 0.020) to drive Monte Carlo analysis using equation 1 and 1000 randomly-generated data sets using.

2 [4Pt+Kd-√Kd +8PtKd] Fd= 1 4PT

Table S3.1: Formal charge of A34F GB1, BSA, and lysozyme at pH 6.2, 6.8, and 7.5 pH A34F GB1 BSA Lysozyme 6.2 -4 -4 +9 6.8 -4 -11 +8 7.5 -4 -19 +7

60

Figure S3.1: Formal charge of A34F GB1, A34F;D40N GB1, and A34F;D40K GB1 between pH 1 and 14. The charge does not changes between pH 6.2 and pH 7.5, the range used here.

Figure S3.2: Electrostatic potential map of A34F GB1. D40, shown in sticks, is in the midst of an acidic patch.

61

Figure S3.3: Formal charge on (A) lysozyme (red), A34F GB1 (blue) and (B) BSA (black) and A34F GB1 (blue) between pH 1 and 14.

62 REFERENCES

(1) Ryan, D. P., and Matthews, J. M. (2005) Protein–protein interactions in human disease. Curr. Opin. Struct. Biol. 15, 441-446.

(2) Zimmerman, S. B., and Trach, S. O. (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J. Mol. Biol. 222, 599-620.

(3) Theillet, F.-X., Binolfi, A., Frembgen-Kesner, T., Hingorani, K., Sarkar, M., Kyne, C., Li, C., Crowley, P. B., Gierasch, L., Pielak, G. J., Elcock, A. H., Gershenson, A., and Selenko, P. (2014) Physicochemical properties of cells and their effects on intrinsically disordered proteins (IDPs). Chem. Rev. 114, 6661-6714.

(4) Srere, P. A. (1987) Complexes of sequential metabolic enzymes. Annu. Rev. Biochem. 56, 89-124.

(5) Cohen, R. D., and Pielak, G. J. (2017) A cell is more than the sum of its (dilute) parts: A brief history of quinary structure. Protein Sci. 26, 403-413.

(6) Vaĭnshteĭn, B. K. (1973) Three-dimensional electron microscopy of biological macromolecules. Physics-Uspekhi 16, 185.

(7) Edelstein, S. J. (1980) Patterns in the quinary structures of proteins. Plasticity and inequivalence of individual molecules in helical arrays of sickle cell hemoglobin and tubulin. Biophys. J. 32, 347-360.

(8) McConkey, E. H. (1982) Molecular evolution, intracellular organization, and the quinary structure of proteins. Proc. Natl. Acad. Sci. U.S.A. 79, 3236-3240.

(9) Sarkar, M., Li, C., and Pielak, G. J. (2013) Soft interactions and crowding. Biophys. Rev. 5, 187-194.

(10) Sarkar, M., Smith, A. E., and Pielak, G. J. (2013) Impact of reconstituted cytosol on protein stability. Proc. Natl. Acad. Sci. U. S. A. 110, 19342-19347.

(11) Senske, M., Törk, L., Born, B., Havenith, M., Herrmann, C., and Ebbinghaus, S. (2014) Protein stabilization by macromolecular crowding through enthalpy rather than entropy. J. Am. Chem. Soc. 136, 9036-9041.

(12) Sarkar, M., Lu, J., and Pielak, G. J. (2014) Protein crowder charge and protein stability. Biochemistry 53, 1601-1606.

(13) Smith, A. E., Zhang, Z., Pielak, G. J., and Li, C. (2015) NMR studies of protein folding and binding in cells and cell-like environments. Curr. Opin. Struct. Biol. 30, 7-16.

(14) Danielsson, J., Mu, X., Lang, L., Wang, H., Binolfi, A., Theillet, F.-X., Bekei, B., Logan, D. T., Selenko, P., Wennerström, H., and Oliveberg, M. (2015) Thermodynamics of protein destabilization in live cells. Proc. Natl. Acad. Sci. U. S. A. 112, 12402-12407.

63 (15) Gnutt, D., Gao, M., Brylski, O., Heyden, M., and Ebbinghaus, S. (2015) Excluded- volume effects in living cells. Angew. Chem., Int. Ed. 54, 2548-2551.

(16) Cohen, R. D., and Pielak, G. J. (2016) Electrostatic contributions to protein quinary structure. J. Am. Chem. Soc. 138, 13139-13142.

(17) Cohen, R. D., and Pielak, G. J. (2017) Quinary interactions with an unfolded state ensemble. Protein Sci 26, 1698-1703.

(18) Rivas, G., and Minton, A. P. (2016) Macromolecular crowding in vitro, in vivo, and in between. Trends Biochem. Sci. 41, 970-981.

(19) Guseman, A. J., and Pielak, G. J. (2017) Cosolute and crowding effects on a side-by- side protein dimer. Biochemistry 56, 971-976.

(20) Sukenik, S., Ren, P., and Gruebele, M. (2017) Weak protein–protein interactions in live cells are quantified by cell-volume modulation. Proc. Natl. Acad. Sci. U.S.A. 114, 6776-6781.

(21) Minton, A. P. (1981) Excluded volume as a determinant of macromolecular structure and reactivity. Biopolymers 20, 2093-2120.

(22) Minton, A. P. (1983) The effect of volume occupancy upon the thermodynamic activity of proteins: Some biochemical consequences. Mol. Cell. Biochem. 55, 119-140.

(23) Ellis, R. J. Macromolecular crowding: Obvious but underappreciated. Trends Biochem. Sci. 26, 597-604.

(24) Ellis, R. J., and Hartl, F. U. (1999) Principles of protein folding in the cellular environment. Curr. Opin. Struct. Biol. 9, 102-110.

(25) Phillip, Y., Sherman, E., Haran, G., and Schreiber, G. (2009) Common crowding agents have only a small effect on protein-protein interactions. Biophys. J. 97, 875- 885.

(26) Benton, L. A., Smith, A. E., Young, G. B., and Pielak, G. J. (2012) Unexpected effects of macromolecular crowding on protein stability. Biochemistry 51, 9773-9775.

(27) Sapir, L., and Harries, D. (2015) Is the depletion force entropic? Molecular crowding beyond steric interactions. Curr. Opin. Colloid Interface Sci. 20, 3-10.

(28) Crowley, P. B., Chow, E., and Papkovskaia, T. (2011) Protein interactions in the Escherichia coli cytosol: An impediment to in-cell NMR spectroscopy. ChemBioChem 12, 1043-1048.

(29) Wang, Q., Zhuravleva, A., and Gierasch, L. M. (2011) Exploring weak, transient protein–protein interactions in crowded in vivo environments by in-cell nuclear magnetic resonance spectroscopy. Biochemistry 50, 9225-9236.

64 (30) Smith, A. E., Zhou, L. Z., Gorensek, A. H., Senske, M., and Pielak, G. J. (2016) In-cell thermodynamics and a new role for protein surfaces. Proc. Natl. Acad. Sci 113, 1725-1730.

(31) Yu, I., Mori, T., Ando, T., Harada, R., Jung, J., Sugita, Y., and Feig, M. (2016) Biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm. eLife 5, e19274.

(32) Zhang, N., An, L., Li, J., Liu, Z., and Yao, L. (2017) Quinary interactions weaken the electric field generated by protein side-chain charges in the cell-like environment. J. Am. Chem. Soc. 139, 647-654.

(33) Kyne, C., and Crowley, P. B. (2017) Short arginine motifs drive protein stickiness in the Escherichia coli cytoplasm. Biochemistry 56, 5026-5032.

(34) Kyne, C., Jordon, K., Filoti, D. I., Laue, T. M., and Crowley, P. B. (2017) Protein charge determination and implications for interactions in cell extracts. Protein Sci. 26, 258–267.

(35) Crowley, P. B., Brett, K., and Muldoon, J. (2008) NMR spectroscopy reveals cytochrome c–poly(ethylene glycol) interactions. ChemBioChem 9, 685-688.

(36) Benton, L. A., Smith, A. E., Young, G. B., and Pielak, G. J. (2012) Unexpected effects of macromolecular crowding on protein stability. Biochemistry 51, 9773–9775.

(37) Monteith, W. B., Cohen, R. D., Smith, A. E., Guzman-Cisneros, E., and Pielak, G. J. (2015) Quinary structure modulates protein stability in cells. Proc. Natl. Acad. Sci. U. S. A. 112, 1739-1742.

(38) Jee, J., Byeon, I.-J. L., Louis, J. M., and Gronenborn, A. M. (2008) The point mutation A34F causes dimerization of GB1. Proteins 71, 1420-1431.

(39) Jee, J., Ishima, R., and Gronenborn, A. M. (2008) Characterization of specific protein association by 15N CPMG relaxation dispersion NMR: The GB1A34F monomer−dimer equilibrium. J. Phys. Chem. B 112, 6008-6012.

(40) Bava, K. A., Gromiha, M. M., Uedaira, H., Kitajima, K., and Sarai, A. (2004) Protherm, version 4.0: Thermodynamic database for proteins and mutants. Nucleic Acids Res. 32, D120-D121.

(41) Wang, Y., Sarkar, M., Smith, A. E., Krois, A. S., and Pielak, G. J. (2012) Macromolecular crowding and protein stability. J. Am. Chem. Soc. 134, 16614- 16618.

(42) Dobson, C. M. (2003) Protein folding and misfolding. Nature 426, 884-890.

(43) Ferguson, M. W. J., and Joanen, T. (1982) Temperature of egg incubation determines sex in Alligator mississippiensis. Nature 296, 850-853.

65

CHAPTER 4: PROTEIN SHAPE MODULATES CROWDING EFFECTS

Edited from Guseman, A.J.; Perez Goncalves, G.M.; Speer, S.L.; Young, G.B.; Pielak G.J.; “Protein Shape Modulates Crowding effects” (submitted)

Introduction

Protein-protein interactions are essential for maintaining cellular homeostasis (1).

Details of their equilibria under thermodynamically ideal conditions have provided a trove of information. Ideality in this sense refers to dilute solutions, where each monomer contacts only solvent or another monomer, conditions far removed from those in cells where protein-protein interactions evolved. In the cytoplasm, and other cellular compartments and biological fluids, macromolecules can occupy up to 30% of the volume and their concentrations often exceed 300 g/L (2).

Protein molecules take part in more complex interactions under nonideal conditions. The surrounding macromolecules influence proteins in two ways, neither of which is significant in dilute solution. Hard-core repulsions arise from high volume- occupancy, because two molecules cannot occupy the same space at the same time.

This volume exclusion favors the most compact state of a protein (3). Chemical interactions comprise transient contacts between protein surfaces arising from the diverse chemical landscapes of proteins (4). When repulsive (i.e., like charges), they favor the state that maximizes the distance between charges, adding to the hard-core repulsions and stabilizing the native state. Attractive chemical interactions (e.g., opposite charges, hydrogen bonds, etc.) are destabilizing. We are beginning to understand how hard-core repulsions and chemical interactions affect protein stability

66 (5), but there are few studies about crowding effects on protein-protein interactions (6-

9).

Given the existential roles of both protein-protein interactions and crowding in biology (10), we are undertaking efforts to determine the effect of crowding on the simplest of protein complexes, homodimers. Our first endeavor, which involved a side- by-side homodimer, highlighted chemical interactions and showed a small contribution from hard-core repulsions (11, 12). These results were predicted by Berg who used scaled-particle theory to suggest that side-by-side dimers would be only mildly influenced by hard-core repulsions (12). Berg also predicted that more compact dimers are more likely to be stabilized by hard-core repulsions. Here, we test this idea by changing the shape of a dimer in a controlled fashion.

Scaled-particle theory is based on statistical mechanics. For situations like those investigated here, the theory considers solution nonideality as arising from the presence of cosolutes (13-15). As often applied, the solvent and cosolutes are considered hard spheres, which leads to three consequences: molecules only interact when they touch, the interaction energy is purely repulsive, and the energy has a strong distance dependence. The inability of stones to interpenetrate is a macroscopic example. Water and cosolute concentration is expressed as volume occupancy, 횽, the unit-less parameter indicating the fraction of the volume they occupy. The reversible work (i.e., free energy) to produce a monomer- and dimer- sized hole in water and in a cosolute- containing solution is calculated, and a thermodynamic cycle (16) is used to estimate the change in the free energy of dissociation (KD→M) in cosolute solution minus the value in water, i.e., ΔΔG°`D→M, where a positive value indicates stabilization.

67 The theory has several shortcomings, including its spherical assumptions, neglect of chemical interactions and its extreme sensitivity to sphere size and density

(17, 18). Nevertheless, it has been successfully applied to assess the effects of crowding-induced hard-core interactions on protein stability and protein-protein interactions (18-22). For instance, its application to protein stability led to the idea that crowding effects comprise both hard-core repulsions and attractive or repulsive chemical interactions (16, 23).

o’ Figure 4.1: Scaled-particle-theory derived values of ΔΔG D→ M as a function of eccentricity, L, at ten cosolute concentrations (solid symbols) of the indicated radii and volume occupancies, 횽. As predicted by Berg (12). the more compact spherical-, but not the side-by-side- dimer, is stabilized. Shaded areas indicate the estimated eccentricity of the GB1 dimers. The 4.1-Å species represents sucrose, and the 35-Å species represents bovine serum albumin.

We use the implementation of scaled-particle theory (24) described by Berg (12) to assess crowding effects. The monomers are assumed to be spheres. The dimers have twice the volume of the monomer, but their shapes, quantified as the eccentricity,

L, is varied (Figure 4.1). At the compact extreme, L = 0, the dimer is a sphere. At the other extreme, L = 1, the dimer comprises touching monomers (kissing spheres). Berg’s application shows that shape matters. Treating water as a sphere of radius 1.4 Å,

68 crowding favors more compact (smaller L) dimers, but the stabilization disappears at L =

~0.9 and then becomes slightly destabilizing. Using a related form of the theory that treats the solvent as a continuum also predicts crowding-induced stabilization that decreases, but never disappears, with decreasing L (12). The diminution arises because two spherical cavities anywhere in the solution are more likely than two cavities next to each other (12).

The system we chose has several favorable attributes for testing ideas about crowding and dimer formation. The B1 domain of protein G (GB1; 6.2 kDa, pI 4.5) is a

56-residue model globular protein with a thermally-stable 4β+α globular fold (25). The

Gronenborn laboratory developed two GB1 homodimers of known structure (26-28) with

95% sequence identity but different eccentricities, allowing us to test the effect of hard- core repulsions and chemical interactions on dimer shape.

Figure 4.2: GB1 and its dimers (PDBIDs 1GB1, 2RMM, and 1Q10).

The A34F variant (Figure 4.2) forms a dimer reminiscent of kissing spheres

(Figure 4.3A and Figure 4.1) that does not require a large conformational change (27,

29). Dimerization occurs because the Phe 34 side chain becomes part of the hydrophobic core, displacing the Tyr 33 side chain at the surface and causing the C-

69 terminal end of the sole helix to unfold, forming a pocket for the Tyr 33 side chain of the other monomer. The dimer is also stabilized by hydrogen bonds along the now adjacent

β-sheets.

The L5V;F30V;Y33F;A34F variant forms a more compact domain-swapped dimer

(Figure 4.2) whose monomer is a partially-folded, molten-globule-like species (26, 28)

(Figure S4.1) that undergoes refolding to form the dimer (Figure 4.3A). Swapping the turn between β-strand three and β-strand four results in interchanged secondary structural units with concomitant formation of several interprotein interactions.

Figure 4.3: van der Waals volumes, solvent-accessible surface areas (SASAs), coarse-grained- (top) and geometric- representations (bottom) of the dimers (A) and their electrostatic surface- potentials from PYMOL (B) (30).

The properties of protein surfaces determine the chemical interactions with cosolutes (31-35). An advantageous aspect of the GB1 system is that the interactions are similar for both dimers because their primary structures, and hence their surfaces are highly similar (Figure 4.3B). In summary, the GB1 system is well suited for studying crowding effects on protein complex shape, because the monomers are not far removed from spheres, and the dimers have different shapes but similar surfaces.

70 Results

Scaled-particle theory predictions. Positive values of ΔΔG°`D→M, indicate an increase in dimer stability compared to dilute solution. Scaled-particle theory calculations (Table 4.1) were performed as described by Berg using the dimerization reaction with a variable cavity shape (Berg’s equations S1-S8) (12). Small molecule radii were estimated using molecular-volume data from the ChemAxon chemicalize database (36). Protein radii were estimated from their surface areas and volumes obtained using VADAR and their structures (PDBID3V03 for BSA and 1DPX for lysozyme) (37).

To apply Berg’s ideas (12) we first estimated the dimer eccentricities. Distances between the centrally-located 5’-proton on trp43 (38) and the protein surface were measured using Pymol (39) giving a monomer radius of 11.8 Å, and an eccentricity of

0.71 for the side-by-side dimer and 0.56 for the domain-swapped dimer (Figure 4.1). We also assessed eccentricity by fitting the solvent-accessible surface areas and volumes to geometric shapes, which yielded values of 0.77 and 0.53, respectively. The ranges are shaded in the figure. A complete list of input parameters is compiled in the

Supporting Information (Table S4.1).

71

Figure 4.4: 19F NMR spectra of 5-fluorotryptophan-43 labeled domain-swapped dimer at pH 7.5 and 298 K at the indicated protein concentrations.

72 Table 4.1: ΔΔG°`D→M, values at 298 K predicted by scaled particle theory (SPT) and NMR (pH 7.5) for the domain-swapped (L=0.5) and side-by-side dimer (L=0.7). Positive values indicate increase dimer stability.

ΔΔG°`D→M, kcal/mol

Domain-swapped Side-by-side

Cosolute g/L SPT1 NMR2 SPT1 NMR2,3

Sucrose 300 1.38 0.26 ± 0.06 0.67 0.03 ± 0.04

Ethylene glycol 200 1.50 -0.30 ± 0.06 0.76 -0.29 ± 0.05

TMAO 38 0.25 0.16 ± 0.06 0.13 0.50 ± 0.06

Urea 100 0.61 -1.31 ± 0.05 0.31 -0.25 ± 0.04

BSA 100 0.14 0.51 ± 0.05 0.10 0.48 ± 0.06

Lysozyme 50 0.09 -0.12 ± 0.05 0.05 -0.18 ± 0.05

Ficoll-70 300 N/A4 0.71 ± 0.06 N/A4 0.15 ± 0.04

8 kDa PEG 200 N/A4 0.39 ± 0.06 N/A4 -0.22 ± 0.03 1As described by Berg(12) using parameters from Table S4.1. 2Uncertainties are the standard deviation of the mean from triplicate analysis. 3From Guseman and Pielak.(40) 4Ficoll and PEG cannot be simulated with scaled particle theory at these concentrations.

Observations using 19F NMR. To test the predictions from scaled-particle theory, we measured ΔΔG°`D→M using the approach described in Materials and

Methods and 5-fluorotryptophan-labeled GB1 in buffer (Figure 4.4) and in a variety of small and large cosolutes at pH 7.5 and 298 K (Table 4.1). Comparison of 15N-1H HSQC data for the domain-swapped dimer (Figure S4.4) and the side-by-side dimer (40) show that fluorine labeling does not significantly affect structure of the proteins.

73

o’ Figure 4.5: ΔΔG D→ M at pH 7.5 and 298 K of the domain-swapped dimer and the side-by-side dimer in synthetic polymers and their monomers (A) and urea, trimethylamine N-oxide (TMAO) and globular proteins (B). Positive values indicate increased dimer stability.

Discussion

Scaled-Particle Theory. The predictions for the cosolutes tested (Table 4.1) agree with Berg’s prediction (12) that the hard-core component of crowding has a larger stabilizing effect on the more compact domain-swapped dimer (41). We did not apply the theory to the synthetic polymers because these macromolecules cannot be accurately modeled as spheres, because, at the high concentrations used here (42-44) the individual polymer molecules overlap to form a mesh (45). The reason is exemplified by the concentration dependence of viscosity. At low concentrations (the so-called dilute regime), there is an approximately linear relationship between viscosity and concentration, as is observed for small molecules. At higher concentrations, however, synthetic polymers form a mesh (the semi-dilute regime), and there is a much steeper dependence. The intersection of the linear between these regimes is called the overlap concentration, c*. Perhaps the best rationale for not treating synthetic polymers as

74 spheres is that polymers form these highly viscous solutions whereas spheres jam at a volume occupancy of ~0.6 (46).

Control Small-Molecule Cosolutes. We divided the cosolutes into three classes: small-molecule denaturants and stabilizers, synthetic polymers and their monomers, and proteins (Figure 4.5). The denaturant urea at 100 g/L is predicted by scaled-particle theory to stabilize the domain-swapped and side-by-side dimers by 0.61 kcal/mol and 0.31 kcal/mol, respectively. We observe destabilizations of 1.31 ± 0.05 kcal/mol and 0.25 ± 0.04 kcal/mol for the domain-swapped and side-by-side dimers, respectively. The predicted and observed results are contradictory in both sign and magnitude. The sign discrepancy highlights a key shortcoming of scaled-particle theory; it ignores chemical interactions between the cosolute and the test protein, in this instance the attractive interaction between urea and the protein backbone (47). The magnitude discrepancy means the domain-swapped dimer should be more stabilized

(less destabilized) than the side-by-side dimer. This discrepancy probably arises because the monomers are fundamentally different. For the side-by-side dimer, the monomer is a stable folded protein (27), but for the domain-swapped dimer the monomer is a molten globule (Figure S4.1) (26, 28). Urea likely unfolds the globule, resulting in a destabilizing chemical interaction larger than the predicated stabilization from scaled-particle theory.

Trimethylamine-N-oxide (TMAO) at 39 g/L is predicted to stabilize the domain- swapped and side-by-side dimers by 0.25 kcal/mol and 0.13 kcal/mol, respectively. We observe stabilizations of 0.16 ± 0.06 and 0.50 ± 0.06 kcal/mol, respectively. The less than expected stabilization of the domain-swapped dimer likely arises because its

75 monomer is only partially structured (Figure S4.1) (26-28). That is, TMAO, a known protein stabilizer, likely favors a conformation of the monomer that is similar to that of the wild-type monomer, reducing the tendency to swap domains and dimerize.

Synthetic Polymer Cosolutes. These macromolecules are traditionally used as cosolutes to mimic hard-core repulsions under crowded conditions. We tested Ficoll-70, a 70-kDa branched sucrose-polymer at 300 g/L and 8-kDa polyethylene (PEG) at 200 g/L (Figure 4.5A).

We first studied their monomers, sucrose and ethylene glycol. Sucrose stabilizes the domain-swapped dimer by 0.26 ± 0.06 kcal/ mol, most likely by increasing the influence of hard-core repulsions as predicted by scaled-particle theory (12, 22).

Ethylene glycol destabilizes the domain-swapped dimer by -0.30 ± 0.06 kcal/mol.

Compared to buffer alone, sucrose at 300 g/L has no effect on ΔG°`D→M of the side-by-

side dimer, but ethylene glycol decreases ΔG°`D→M by -0.29 ± 0.05 kcal/mol (40). The increased stabilization of the more compact domain-swapped dimer by sucrose is also consistent with theory (12, 22). The destabilization of the dimers by ethylene glycol is likely due to attractive interactions with the test protein; ethylene glycol interacts favorably with protein surfaces (48-50).

Having assessed the monomers, we tested for, and found, a stabilizing macromolecular effect. That is, both polymers are more stabilizing than their respective monomers at the same mass-per-volume concentration. Ficoll stabilizes the side-by- side dimer by 0.15 ± 0.04 kcal/mol compared to buffer, and PEG destabilizes this dimer by -0.22 ± 0.03 kcal/mol (40), whereas both synthetic polymers stabilize the domain- swapped dimer by 0.71 ± 0.06 kcal/mol, for Ficoll, and 0.39 ± 0.06 kcal/mol for PEG.

76 This result is consistent with observations on other protein-protein interactions (40, 51), but differs from observations on protein folding (52). As discussed above, we cannot make direct comparisons with scaled-particle theory, but the Ficoll and PEG results are consistent with Berg’s proposal that the influence of hard-core repulsions on dimer formation depends on the shape of the dimer complex. Specifically, the more compact domain-swapped dimer is stabilized more than the less compact kissing-sphere-shaped side-by-side dimer.

Proteins Cosolutes. Unlike synthetic polymers, globular proteins are roughly spherical (53, 54) allowing the application of scaled particle theory. To understand how chemical interactions affect the dimers, we used globular proteins with opposite net charges at pH 7.5, bovine serum albumin (BSA, 66 kDa, pI 5) and lysozyme (14 kDa, pI

9) as cosolutes (Figure 4.5B).

Theory predicts that BSA at 100 g/L stabilizes the domain-swapped dimer by

0.14 kcal/mol and the side-by-side dimer by 0.10 kcal/mol. We observe larger stabilizations, 0.51 ± 0.05 kcal/mol and 0.48 ± 0.06 kcal/mol, respectively. At pH 7.5,

BSA is predicted to have a charge of -19, while both GB1 dimers are predicted to have a charge of -4. The more-than-predicted stabilization can be explained by electrostatic repulsions between BSA and acidic patches on the surface of each dimer (31).

For lysozyme (50 g/L), scaled-particle theory predicts stabilizations of 0.09 kcal/mol and 0.05 kcal/mol for the domain-swapped and side-by-side dimers, respectively, but we observe destabilizations of 0.12 ± 0.05 kcal/mol and 0.18 ± 0.05 kcal/mol, respectively. These results are also explained by charge-charge interactions, because lysozyme has a positive charge of +7 at pH 7.5 while the dimers have a charge

77 of -4. In summary, these results show for both dimers that chemical interactions can modulate, and even overcome, the effects of hard-core repulsions.

The effect of protein cosolutes on ΔG°`D→M is the same for each dimer. This agreement between values supports our hypothesis that similar protein surfaces result in similar chemical interactions. The GB1 dimer system effectively decouples the difference in hard-core repulsions by changing the shape of the protein complex while producing indistinguishable differences in chemical interactions, making it the ideal system to test scaled particle theory predictions.

Conclusions

We tested Berg’s (12) idea that shape can control the effects of crowding on protein complex stability by using two nearly identical proteins that form dimers with different shapes but with nearly identical surfaces. Scaled-particle theory predicts that the more compact dimer is generally more stabilized by crowding. This observation suggests that shape dependence may have been used by biology to control which proteins interact. That the compact domain-swapped dimer is more influenced by hard- core repulsions from the synthetic polymers than is the side-by-side dimer might be important for two reasons. First, it highlights which architecture would be favored by the crowded cellular interior, providing a means of stabilizing complexes important for metabolism, signaling, and maintaining biological homeostasis. Second, cells could use more self-contained proteins that would form dumbbell-shaped dimers to prevent weak protein-protein interactions from being stabilized by the crowded interior. We observe these differences in concentrated solutions of the relatively inert synthetic polymers,

PEG and Ficoll. However, the predicted differences can be entirely counteracted by electrostatic interactions as shown by the stabilizing effect of BSA and the destabilizing

78 effect of lysozyme and attractive interactions, as shown by ethylene glycol. We conclude that to gain a biologically useful knowledge about protein-protein interactions, we must study them, as Anfinsen wrote, in “the environment for which they were selected, the so called physiological state (55).”

Materials and Methods

Vector. A pET11a plasmid containing the GB1 T2Q mutant was used as the wild-type vector. The T2Q change prevents N-terminal degradation (56). Agilent’s

QuickChange mutagenesis kit was used to produce the A34F mutant, which we call the side-by-side dimer, and the L5V;F30V;Y33F;A34F mutant, which we call the domain- swapped dimer.

Protein expression and purification. The plasmid encoding the domain- swapped dimer was transformed into BL21(DE3) Escherichia coli cells. Following overnight incubation at 37 ºC, a single colony was used to inoculate a 25-mL overnight

LB culture containing 1 mM ampicillin. The culture was incubated with shaking at 225 rpm at 37 ºC (New Brunswick Scientific, model I26). The overnight culture was used to inoculate 975-mL of M9 media (50 mM Na2HPO4, 20 mM KH2PO4, 9 mM NaCl, 1 g/L

NH4Cl, 4 g/L glucose, 2 mM MgSO4, 10 mg/mL thiamine HCl, 10 mg/mL biotin, 100 µM

CaCl2, 100 µg/mL ampicillin). The culture was grown with shaking at 37 ºC, and its optical density at 600 nm (OD600) was monitored (Bio-Rad Spectra Plus). On reaching an OD600 of 0.4, 500 mg of glyphosate, 60 mg of L-phenylalanine, 60 mg of L-tyrosine, and 70 mg of 5-fluoroindole were added. At an OD600 of 0.6, protein expression was induced by adding isopropyl-β-D-thiogalactopyranoside to a final concentration of 1 mM.

After 2 h, the cells were pelleted for 30 min at 4000g (RC-3B Refrigerated Centrifuge;

Sorvall Instruments).

79 The pellet was resuspended in 25 mL of buffer A (20 mM tris-HCl, pH 7.5), and

300 µL of Roche protease inhibitor cocktail added. Cell lysis was carried out by sonication (Fischer Scientific Sonic Dismembrator model 500, 15% amplitude, 0.50 s on, 0.50 s off for 10 min). The lysate was centrifuged (Sorvall RC-5B) at 27,000g for 1 h.

The supernatant was filtered using a 0.22-µm syringe-driven unit (Millex) and loaded onto a Q Sepharose anion exchange column (16 mm x 100 mm; GE Healthcare) equilibrated at 4 ºC with 20 mM tris-HCl, pH 7.5, on an AKTA Pure FPLC (GE

Healthcare). Protein was eluted over a 0-50% linear gradient of 20 mM tris, pH 7.5 to 20 mM tris-HCl, 2 M NaCl, pH 7.5. Fractions were assessed using SDS-PAGE (4-20%

Criterion TGX gels; Bio-Rad) stained with Coomassie Brilliant Blue R-250. Fractions containing the domain-swapped dimer were pooled and concentrated using a 3000 kDa cut-off centrifugal concentrator (Millipore). The concentrated sample was filtered (0.22-

µm) and loaded onto a 16 mm x 600 mm Superdex 75 (GE Healthcare) size exclusion column at 4 ºC. The column was eluted with two column volumes of 5 mM Na2HPO4, 2 mM KH2PO4, 9 mM NaCl, pH 7.5. The eluent was assessed using SDS PAGE.

Fractions containing pure GB1 variant were concentrated using an Amicon centrifugal concentrator with a 3000 kDa cut off. The concentrated protein was exchanged thrice into 18-MΩ deionized H2O. The protein concentration was quantified using a NanoDrop

One spectrophotometer (ThermoFisher) and an extinction coefficient of 8400L M-1 cm-1, split into 500-µM aliquots, flash frozen, and lyophilized for between 12 and 16 h

(Labonco Freezone) (57). Purified protein was analyzed by Fourier-transform ion- cyclotron-resonance mass spectrometry (observed 6296.9 Da, expected 6297 Da,

80 Figure S4.2) and NMR resulting in HSQC spectra consistent with literature (Figure S4.3)

(26, 28).

Cosolutes. Solutions were prepared to the desired concentration in 20 mM sodium phosphate buffer. The pH was adjusted to 7.5 using concentrated HCl or NaOH.

Lyophilized lysozyme and bovine serum albumin were purchased from Sigma-Aldrich.

Their concentrations were monitored using extinction coefficients at 280 nm of 6700 L mg-1 cm-1 and 26400 L mg-1 cm-1, respectively (58, 59).

19 F NMR. Samples were resuspended in buffer containing 10% D2O to a final concentration of 500 µM. Experiments were performed at pH 7.5 and 298 K on a Bruker

Avance III HD spectrometer operating at a 19F Larmor frequency of 470 MHz equipped with a cryogenic QCI probe and a tunable H/F channel. Spectra comprising 31047 points were acquired with a 2 s delay, an acquisition time of 1.4 s, an offset of -130 ppm, and a sweep width of 22 ppm.

Data analysis. NMR spectra were analyzed using Topspin3.5pl6. A 10-Hz exponential line broadening was applied to each free-induction decay prior to Fourier transformation. 5-Fluorotryptophan 43 in the domain-swapped dimer produces two resonances in slow exchange (40). The downfield resonance at -124.2 ppm corresponds to the monomer. The upfield resonance at -125.6 ppm corresponds to the dimer (Figure 4.4) (Shifts for the side-by-side dimer are published) (40). Integration of each resonance was used to obtain their relative populations. The fraction of dimer (Fd) was calculated by dividing the area of the dimer peak by the sum of the areas. Data were fit to equation (1) using MATLAB (R2017A), where Pt is the total protein concentration and KD→M is the equilibrium constant for dissociation. Dissociation

81 constants were converted to ΔG°`D→M values using the equation ΔG°`D→M = -

RTln(KD→M) where R is the universal gas constant, and T is the absolute temperature.

2 4푃푡+퐾푑−√퐾퐷→푀+∗8푃푡퐾푑 퐹푑 = (1) 4푃푡

Supporting Information

Table S1: Parameters and increase in free energy of dissociation compared to dilute solution (ΔΔG°`) predicted by scaled-particle theory for the domain-swapped dimer [eccentricity (L) = 0.5] and side-by-side dimer (L = 0.7) dimers in various cosolutes.

Condition Crowder Volume ΔΔG°` radius, Å occupancy kcal/mol

L=0.5 0.7

Sucrose (300 g/L) 4.10 0.185 -1.380 -0.670

Ethylene glycol 2.50 0.127 -1.500 -0.760

(200 g/L)

TMAO (38 g/L) 2.72 0.0254 -0.25 -0.13

Urea (100 g/L) 2.32 0.0524 -0.61 -0.31

BSA (100 g/L) 35.01 0.1629 -0.14 -0.10

Lysozyme (50 g/L) 15.67 0.0678 -0.09 -0.05

Scaled Particle Theory equations:

∆퐺푒푥(푎, 푏, 훷) = 휇푒푥,푑 − 2휇푒푥,푚 (S1)

휇푒푥 = 푔(퐿, 푅푑, 훷) − 푔(퐿, 푅푑, 0) (S2)

2 퐿 퐿 2 6푆 (1+ ) 6(3+3퐿+( )푆2 ) 2 2 4 2 4 3 3 푔(퐿, 푅푑, 훷) = −퐿푛(1 − 푆3) + 푅푑 + 2 푅푑 + ( ) 휋푃(1 + 1.5퐿 − 0.5퐿 )푅푑 (S3) 1−푆3 (1−푆3) 3

푟 퐿 = (S4) 2푅푑

82 1 푆 = ( ) ∗ (푎3 훷 + 훷)푅3 0 8 푤 푎

1 푆 = ( ) ∗ (푎2 훷 + 훷)푅2 1 4 푤 푎

1 푆 = ( ) ∗ (푎 훷 + 훷)푅 2 2 푤 푎

푆3 = 훷푤 + 훷 (5)

3 6 푆0 3푆1푆2 3푆2 푃 = ( ) + 2 + 3 (S6) 휋 1−푆3 (1−푆3) (1−푆3)

( ) 훷푤 = 훷푤0(1 − 훷 − (훷푓 푎 ) (S7)

1−훷 2 ( ) ( ) 푤0 푓 푎 = 3 1 − 훷푤 (푎(1 + 2훷푤0) + ) /(푎(1 + 2훷푤0) (S7a) 3푎훷푤0

3 3 3 2푅푏 = 푅푑(1 + 1.5퐿 − 0.5퐿 ) (S8)

푅푎 푅푏 4 3 4 3 푎 = , 푏 = 훷 = ( ) 휋푅퐴휌푎, 훷푤 = ( ) 휋푅푤휌푤 푅푤 푅푎 3 3

푅푤 = 1.38 Å 푅푎 = 푐표푠표푙푢푡푒 푟푎푑푖푢푠 푅푏 = 푚표푛표푚푒푟 푟푎푑푖푢푠

Figure S4.1: GB1 backbone overlaid with the protein surface. For the monomer of the domain- swapped dimer, the blue highlighted region corresponds to structured residues as determined by the presence of nuclear Overhauser effects (26, 28).

83

Figure S4.2: High-resolution ESI FT-ICR mass spectra of 5-fluorotryptophan-labeled, 15N- enriched domain-swapped dimer.

84

Figure S4.3: 1H-15N HSQC of 19F-Trp labeled domain swapped dimer (black, 500 µM) and monomer (red, 50 µM). The limited number of cross peaks and the broadness of the extant cross peaks suggest that the monomer is a molten globule, in agreement with the results of Byeon et al (26, 28).

Figure S4.4: Overlay of 15N-1H HSQC spectra of 15N-enriched domain-swapped dimer (black, 500 µM) and 15N-enriched, 19F fluorotryptophan-labeled domain-swapped dimer (red, 700 µM).

85 REFERENCES

(1) Sahni N, et al. (2015) Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161(3):647-660.

(2) Zimmerman SB and Trach SO (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J. Mol. Biol. 222(3):599-620.

(3) Minton AP (1981) Excluded volume as a determinant of macromolecular structure and reactivity. Biopolymers 20(10):2093-2120.

(4) Cohen RD and Pielak GJ (2017) A cell is more than the sum of its (dilute) parts: A brief history of quinary structure. Protein Sci. 26(3):403-413.

(5) Davis CM, Gruebele M, and Sukenik S (2018) How does solvation in the cell affect protein folding and binding? Curr. Opin. Struct. Biol. 48:23-29.

(6) Phillip Y, Kiss V, and Schreiber G (2012) Protein-binding dynamics imaged in a living cell. Proc. Natl. Acad. Sci. USA 109(5):1461-1466.

(7) Phillip Y, Sherman E, Haran G, and Schreiber G (2009) Common crowding agents have only a small effect on protein-protein interactions. Biophys. J. 97(3):875- 885.

(8) Kozer N, Kuttner YY, Haran G, and Schreiber G (2007) Protein-protein association in polymer solutions: From dilute to semidilute to concentrated. Biophys. J. 92(6):2139-2149.

(9) Jiao M, Li H-T, Chen J, Minton AP, and Liang Y (2010) Attractive protein-polymer interactions markedly alter the effect of macromolecular crowding on protein association equilibria. Biophys. J. 99(3):914-923.

(10) Spitzer J and Poolman B (2009) The role of biomacromolecular crowding, ionic strength, and physicochemical gradients in the complexities of life's emergence. Microbiol. Mol. Biol. Rev. 73(2):371-388.

(11) Zhou H-X, Rivas G, and Minton AP (2008) Macromolecular crowding and confinement: Biochemical, biophysical, and potential physiological consequences. Annu. Rev. Biophys. 37(1):375-397.

(12) Berg OG (1990) The influence of macromolecular crowding on thermodynamic activity: Solubility and dimerization constants for spherical and dumbbell-shaped molecules in a hard-sphere mixture. Biopolymers 30(11-12):1027-1037.

(13) Lebowitz JL, Helfand E, and Praestgaard E (1965) Scaled particle theory of fluid mixtures. J. Chem. Phys. 43(3):774-779.

86 (14) Reiss H (1966) Scaled particle methods in the statistical thermodynamics of fluids. Adv. Chem. Phys. (9):1-84.

(15) Gibbons RM (1969) The scaled particle theory for particles of arbitrary shape. Mol. Phys. 17(1):81-86.

(16) Saunders AJ, Davis-Searles PR, Allen DL, Pielak GJ, and Erie DA (2000) Osmolyte-induced changes in protein conformational equilibria. Biopolymers 53(4):293-307.

(17) Tang KES and Bloomfield VA (2000) Excluded volume in solvation: Sensitivity of scaled-particle theory to solvent size and density. Biophys. J. 79(5):2222-2234.

(18) Davis-Searles PR, Saunders AJ, Erie DA, Winzor DJ, and Pielak GJ (2001) Interpreting the effects of small uncharged solutes on protein-folding equilibria. Annu. Rev. Biophys. Biomol. Struct. 30:271-306.

(19) Christiansen A, Wang Q, Samiotakis A, Cheung MS, and Wittung-Stafshede P (2010) Factors defining effects of macromolecular crowding on protein stability: An in vitro / in silico case study using cytochrome c. Biochemistry 49(31):6519- 6530.

(20) Chalikian TV (2016) Excluded volume contribution to cosolvent-mediated modulation of macromolecular folding and binding reactions. Biophys. Chem. 209:1-8.

(21) Rivas G and Minton AP (2016) Macromolecular crowding in vitro, in vivo, and in between. Trends Biochem. Sci. 41(11):970-981.

(22) Sharp KA (2015) Analysis of the size dependence of macromolecular crowding shows that smaller is better. Proc. Natl. Acad. Sci. USA 112(26):7990-7995.

(23) Schellman J (2003) Protein stability in mixed solvents: A balance of contact interactions and excluded volume. Biophys. J. 85:108-125.

(24) Lebowitz JL, Helfand E, and Praestgaard E (1965) Scaled particle theory of fluid mixtures. The Journal of Chemical Physics 43(3):774-779.

(25) Gronenborn A, et al. (1991) A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein G. Science 253(5020):657-661.

(26) Byeon I-JL, Louis JM, and Gronenborn AM (2003) A protein contortionist: Core mutations of GB1 that induce dimerization and domain swapping. J. Mol. Biol. 333(1):141-152.

(27) Jee J, Byeon I-JL, Louis JM, and Gronenborn AM (2008) The point mutation A34F causes dimerization of GB1. Proteins 71(3):1420-1431.

87 (28) Byeon I-JL, Louis JM, and Gronenborn AM (2004) A captured folding intermediate involved in dimerization and domain-swapping of GB1. J. Mol. Biol. 340(3):615- 625.

(29) Jee J, Ishima R, and Gronenborn AM (2007) Characterization of specific protein association by 15N CPMG relaxation dispersion NMR: The GB1A34F monomer−dimer equilibrium. J. Phys. Chem. B 112(19):6008-6012.

(30) Schrödinger L (2015) Th PYMOL molecular graphics system version 2.0.

(31) Guseman AJ, Speer SL, Perez Goncalves GM, and Pielak GJ (2018) Surface charge modulates protein–protein interactions in physiologically relevant environments. Biochemistry 57(11):1681-1684.

(32) Sarkar M, Li C, and Pielak GJ (2013) Soft interactions and crowding. Biophys. Rev. 5(2):187-194.

(33) Miklos AC, Sarkar M, Wang Y, and Pielak GJ (2011) Protein crowding tunes protein stability. J. Am. Chem. Soc. 133(18):7116-7120.

(34) Sarkar M, Lu J, and Pielak GJ (2014) Protein crowder charge and protein stability. Biochemistry 53(10):1601-1606.

(35) Kyne C and Crowley PB (2017) Short arginine motifs drive protein stickiness in the Escherichia coli cytoplasm. Biochemistry 56(37):5026-5032.

(36) ChemAxon (2017) Chemicalize chemoinformatics database. in Chemicalize (ChemAxon).

(37) Willard L, et al. (2003) VADAR: A web server for quantitative evaluation of protein structure quality. Nucleic Acids Res. 31(13):3316-3319.

(38) Campos-Olivas R, Aziz R, Helms GL, Evans JNS, and Gronenborn AM (2002) Placement of 19F into the center of GB1: Effects on structure and stability. FEBS Lett. 517(1–3):55-60.

(39) Schrodinger, LLC (2015) The pymol molecular graphics system, version 1.8.

(40) Guseman AJ and Pielak GJ (2017) Cosolute and crowding effects on a side-by-side protein dimer. Biochemistry 56(7):971-976.

(41) Christiansen A and Wittung-Stafshede P (2013) Quantification of excluded volume effects on the folding landscape of Pseudomonas aeruginosa apoazurin in vitro. Biophys. J. 105(7):1689-1699.

(42) Fissell WH, et al. (2007) Ficoll is not a rigid sphere. Am. J. Physiol.: Renal, Fluid Electrolyte Physiol. 293(4):F1209-F1213.

88 (43) Acosta LC, Perez Goncalves GM, Pielak GJ, and Gorensek‐Benitez AH (2017) Large cosolutes, small cosolutes, and dihydrofolate reductase activity. Protein Sci. 26(12):2417-2425.

(44) Lee H, Venable RM, MacKerell AD, and Pastor RW (2008) Molecular dynamics studies of polyethylene oxide and polyethylene glycol: Hydrodynamic radius and shape anisotropy. Biophys. J. 95(4):1590-1599.

(45) Rubinstein M and Colby RH (2003) Polymer physics (Oxford University Press, Oxford).

(46) Song C, Wang P, and Makse HA (2008) A phase diagram for jammed matter. Nature 453:629.

(47) Guinn EJ, Pegram LM, Capp MW, Pollock MN, and Record MT (2011) Quantifying why urea is a protein denaturant, whereas glycine betaine is a protein stabilizer. Proc. Natl. Acad. Sci. USA 108(41):16932-16937.

(48) Knowles DB, et al. (2015) Chemical interactions of polyethylene glycols (PEGs) and glycerol with protein functional groups: Applications to effects of PEG and glycerol on protein processes. Biochemistry 54(22):3528-3542.

(49) Shkel IA, Knowles DB, and Record MT (2015) Separating chemical and excluded volume interactions of polyethylene glycols with native proteins: Comparison with PEG effects on DNA helix formation. Biopolymers 103(9):517-527.

(50) Chao S-H, Schäfer J, and Gruebele M (2017) The surface of protein λ6–85 can act as a template for recurring poly(ethylene glycol) structure. Biochemistry 56(42):5671-5678.

(51) Munari F, et al. (2017) Identification of primary and secondary UBA footprints on the surface of ubiquitin in cell‐mimicking crowded solution. FEBS Lett. 591(7):979- 990.

(52) Smith AE, Zhou LZ, Gorensek AH, Senske M, and Pielak GJ (2016) In-cell thermodynamics and a new role for protein surfaces. Proc. Natl. Acad. Sci. USA 113(7):1725-1730.

(53) Majorek KA, et al. (2012) Structural and immunologic characterization of bovine, horse, and rabbit serum albumins. Mol Immunol 52(3-4):174-182.

(54) Weiss MS, Palm GJ, and Hilgenfeld R (2000) Crystallization, structure solution and refinement of hen egg-white lysozyme at pH 8.0 in the presence of MPD. Acta Crystallogr D Biol Crystallogr 56(Pt 8):952-958.

(55) Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181(4096):223-230.

89 (56) Smith CK, Withka JM, and Regan L (1994) A thermodynamic scale for the b sheet forming tendencies of the amino acids. Biochemistry 33(18):5510-5517.

(57) Gill SC and von Hippel PH (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182(2):319-326.

(58) Putnam F (1984) Progress in plasma proteins. The plasma proteins 2nd ed., ed Putnam F (Academic Press, New York), pp 1-44.

(59) Aune KC and Tanford C (1969) Thermodynamics of the denaturation of lysozyme by guanidine hydrochloride. I. Dependence on pH at 25°. Biochemistry 8(11):4579-4585.

90

CHAPTER 5: QUINARY INTERACTIONS AND EXPANDED PROTEIN NATIVE STATE ENSEMBLES, A MEANS TO NEW FUNCTIONS.

Edited from Alex J Guseman “Quinary interactions and expanded protein native state ensembles, a means to new functions.” (In preparation)

Introduction

The structure function paradigm dictates that the structure of a protein is intimately linked to the function it was evolved to perform.1 Born from this, structural biologists have used techniques such as X-ray crystallography, Cryo-electron microscopy, and NMR to interrogate protein structures to understand their functions.

The fruits of these labors produced the atomics coordinates for more than 45,000 protein structures to be deposited in the Protein Data Bank (PDB).2 The majority of these structures represent the global minimum of protein folding funnel, where the protein occupies the most thermodynamically stable state under the conditions for which the structure was solved. While studies of protein structures in dilute buffered solutions have provided a trove of information about protein function, often ignored is the influence of the cellular environment on protein structure.3

The cellular environment is a crowded, diverse, and dynamic environment where macromolecule are concentrated between >300 g/L.4 In these complex and crowded environments, proteins and nucleic acids are influenced by transient interactions that are absent in dilute solution.5-9 These interactions can be generalized into two types; hard-core steric repulsions and “soft” chemical interactions.5, 10-14 Hard-core repulsions arise because no two molecules can occupy the same space resulting in a packing

91 problem that stabilizes the most compact state of a protein.5 Chemical interactions arise from the diverse chemical landscapes on the surfaces of proteins, where electrostatics, hydrophobic, and polar interactions cause destabilizing repulsive and stabilizing attractive chemical interactions.7, 8, 12-17 These interactions are referred to as quinary interactions as their totality comprise the fifth level of protein structure, quinary structure.3, 6, 18 The predominant function of quinary structure is to organize the cellular interior and can be made evident by colocalization of metabolic enzymes, sequestering macromolecules with similar functions, and formation of biomolecular condensates.3

Here I outline the influence of quinary interactions on protein structure, and define the importance of weak interactions on the native state ensemble of a protein.

Numerous groups have studied the influence of quinary interactions on proteins structure and stability both in vitro under crowded conditions, and in living cells.6, 9, 19-24

From these studies, it was gleaned that the cellular environment could be destabilizing to proteins by 2.0 kcal/mol via weak chemical interactions, or roughly 1-3 units of available thermal energy (RT) at 300 K. The average thermodynamic stability of a protein ranges from, 5-15 kcal/mol (8-25 RT at 300 K),25 suggesting these interactions would have minimal effects to protein structures. While these interactions are not strong enough to fully unfolded a protein, these interactions raise the overall energy of the system, allowing the protein to access conformations that exist at higher energy states.

92

Figure 5.1: Simulation of attractive interactions between cytosolic proteins (gray, yellow, and red) and Pyruvate Dehydrogenase (Green) at the start of the simulation (T=0), halfway through the simulation (T=0.5) and at the end of the simulation (T=1). Adapted from Yu et al 2016 ELife.

While early in cell studies showed destabilization of protein structures in cells and cell like environments, simulations of the microplasma genitalium cytoplasm’s influence on the tertiary structure of pyruvate dehydrogenase (Figure 1) were performed by

Michael Feig’s group and helped visualize the effects of quinary interactions.26 As the molecular dynamics simulation evolves, pyruvate dehydrogenase starts in its native state structure and undergoes local unfolding as it interacts with its environments, but retains a folded “core” structure.26 These local dynamics and unfolding would be consistent with destabilizing attractive interactions relative with the observed unfolding free energies measured experimentally.

93

Figure 5.2: Protein structures at different energy levels in the protein folding funnel.

To put these experiments and simulations in context, it is important to think of the folding funnel free energy landscape or a protein. Protein structures are often determined using x-ray crystallography. These crystal structures provide a snap-shot of a protein structure in a single microstate near the global minimum of the folding funnel.

As a single microstate, the observed structure has minimal configurational entropy as the number of microstates accessible one. To expand on this, solution structures such as those solved by NMR, provides an ensemble of microstates that are populated at a given temperature. This ensemble approach increases the configurational entropy available to a given polypeptide as it is located at a higher energy level (Figure 2) and can populate states available at RT. As the protein access higher states in the folding funnel, it unfolds, first into molten globular ensemble, and finally into disordered ensembles, further increasing the configurational entropy available to the protein.

Attractive chemical interactions function to raise the location of the native state ensemble in the folding funnel, similar to changing the temperature of the system by 1-3 units of thermal energy (RT).27 In doing so a larger ensembles of states is accessed that

94 has more microstates and thus a greater configurational entropy. The increased number of microstates available to a protein, increases the number of structures that can potentially perform a function. If a functional state in this ensemble increases the fitness of the organism, it can be selected for, thus providing a mechanism for proteins to evolve new functions within the same sequence.

In theory one way to populate these higher energy states would be by changing the temperature. Changes in temperature influence both the free energy available to the

` system, and the temperature dependent thermodynamic parameters ΔCp and ΔS∘ . In terms of available thermal energy, in order to produce 0.1 kcal/mol free energy a temperature change of 50 ∘ C is required. The majority of the proteins in the E. coli proteome have a thermal stability below 85∘C, thus at optimal growth temperature

(37∘C) provides at maximum 48∘ C, or 0.096 kcal/mol before 50% of the protein is unfolded and aggregates. To further complicate this problem, in living cells such as E. coli, increasing the temperature from 37-42∘ C can be lethal despite this adding only

0.02 kcal/mol to the system. Thus biological systems are unable to use temperature to manipulate the native state ensemble of proteins.

The temperature dependent thermodynamic parameters that influence protein

` folding include both the heat capacity ΔCp and entropy of unfolding ΔS∘ U. From this,

` ΔCp produces a minimal contribution to the average protein, but ΔS∘ U can be on the

2 ` order of 10-10 cal/mol*K. In the higher extremes of ΔS∘ U a ΔT 5 degree can result in a destabilization of the protein by 0.5 kcal/mol.

95

Figure 5.3: Population predictions from a 3 state molecular partition function with a native state N, a native state ensemble (N*), and the unfolded state (U). predictions were made for two theorized proteins and intermediates.

In order to produce the same transitions in cells, the free energy comes from attractive chemical interactions. Following in Einstein’s footsteps of thought experiments, statistical mechanics allows us to write a three state partition function using the native state (N), native state ensemble (N*) and the unfolded state (U) calculations of using a three state partition function, would predict that a 5 kcal/ mol stablilize protein would see a 20-40% increase in population of an excited state that exists at a 2-3 RT higher energy level. If this calculation is adapted to a 15 kcal/mol stable protein, we would see a 10-20% increase in population of a state that excists 3 kcal/mol higher in the energy from an excess 2-3 RT worth of free energy being added to the system (figure 3).

Furthermore, in living cells, proteins do not exist on their own, E. coli are predicted to have 2-4 million protein molecules per cell, fission yeast increase that number to 90-140 million protein molecules, humans are predict to have upwards of

96 10,000,000,000 protein molecules per cell. Under this logic, the cellular environment could provide an increasing in configurational entropy for a given polypeptide chain allowing to access N more states. Assuming N new states per protein and 2-3 million protein molecules for E. coli, the ensemble of states in any proteome increases from 2-3 million microstates to the sum of all new states N, for 2-3 million proteins M. This new ensemble of states allows the proteome to find protein structures that can provide additional functions.

It is important to furthermore consider the effects of external pressures on protein biophysics. Selective pressures such as desiccation/dehydration have been demonstrated to modulate quinary interactions. Using a metastable SH3 domain,

Stadmiller and colleagues were able to demonstrate that the strength of quianry interactions increases under desiccation, seeing the destabilization of SH3 in cells increase when cells are osmotically shocked. This demonstrates, that proteins would be able to access even higher energy states, and greater configurational entropy when presented with environmental pressures.

In theory, probing the functional ensemble hypothesis can be done by applying established techniques to a crowded system. Two techniques that will be able to glean information about protein structures under crowded conditions includes, NMR based

Paramagnetic Relaxation Enhancement (PREs) experiments and Small Angles Neutron

Scattering (SANS). PREs act as molecular rulers, providing distance information between a paramagnetic spin label and diamagnetic nuclei. Their measurements rely on the introduction of a paramagnetic group into a protein, most frequently by conjugation to a judiciously chosen cysteine. The relaxation effect can be measured on all amide protons,

97 providing a measure of distance between the paramagnetic center and the surrounding amide groups. As an NMR based experiment, specific labelling strategies can be used to observe only our protein of interest in a crowded milieu of proteins. A complimentary approach, SANS is analogous to small angle x-ray scattering (SAXS), but instead of scattering x-rays off of electrons, SANS scatters neutrons off of atomic nuclei. Uniquely, protons and deuterons scatter neutrons with opposite signs, allowing each signal to be isolated by a technique called contrast matching. By observing a protonated protein in a sea of deuterated cosolutes, structural expansion can be observed for our protein of interest.

Combining novel methods such as NMR detected PREs, SANS, in combination with molecular dynamics simulations, the functional ensemble hypothesis can be test to look at an increase in N* populations for given polypeptides. The results of these studies will have lasting implications on protein structure, function and evolution.

98 REFERENCES

(1) Linderstrøm-Lang, K. U. (1952) Proteins and Enzymes, In Lane Medical Lectures, pp 52-73, Stanford University Press, Stanford Califorinia

(2) Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., and Bourne, P. E. (2000) The Protein Data Bank, Nucleic Acids Res. 28, 235-242.

(3) Cohen, R. D., and Pielak, G. J. (2017) A cell is more than the sum of its (dilute) parts: A brief history of quinary structure, Protein Sci. 26, 403-413.

(4) Zimmerman, S. B., and Trach, S. O. (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli, J. Mol. Biol. 222, 599-620.

(5) Minton, A. P. (1981) Excluded volume as a determinant of macromolecular structure and reactivity, Biopolymers 20, 2093-2120.

(6) Monteith, W. B., Cohen, R. D., Smith, A. E., Guzman-Cisneros, E., and Pielak, G. J. (2015) Quinary structure modulates protein stability in cells, Proc. Natl. Acad. Sci. USA 112, 1739-1742.

(7) Stadmiller, S. S., Gorensek-Benitez, A. H., Guseman, A. J., and Pielak, G. J. (2017) Osmotic Shock Induced Protein Destabilization in Living Cells and Its Reversal by Glycine Betaine, J. Mol. Biol. 429, 1155-1161.

(8) Smith, A. E., Zhou, L. Z., Gorensek, A. H., Senske, M., and Pielak, G. J. (2016) In- cell thermodynamics and a new role for protein surfaces, Proc. Natl. Acad. Sci. USA 113, 1725-1730.

(9) Danielsson, J., Mu, X., Lang, L., Wang, H., Binolfi, A., Theillet, F.-X., Bekei, B., Logan, D. T., Selenko, P., Wennerström, H., and Oliveberg, M. (2015) Thermodynamics of protein destabilization in live cells, Proc. Natl. Acad. Sci. USA 112, 12402-12407.

(10) Minton, A. P. (2013) Quantitative assessment of the relative contributions of steric repulsion and chemical interactions to macromolecular crowding, Biopolymers 99, 239-244.

(11) Minton, A. P. (1983) The effect of volume occupancy upon the thermodynamic activity of proteins: some biochemical consequences, Mol. Cell. Biochem. 55, 119-140.

(12) Sarkar, M., Li, C., and Pielak, G. J. (2013) Soft interactions and crowding, Biophys. Rev. 5, 187-194.

(13) Sarkar, M., Lu, J., and Pielak, G. J. (2014) Protein crowder charge and protein stability, Biochemistry 53, 1601-1606.

99 (14) Sarkar, M., Smith, A. E., and Pielak, G. J. (2013) Impact of reconstituted cytosol on protein stability, Proc. Natl. Acad. Sci. USA 110, 19342-19347.

(15) Cohen, R. D., Guseman, A. J., and Pielak, G. J. (2015) Intracellular pH modulates quinary structure, Protein Sci. 24, 1748-1755.

(16) Guseman, A. J., and Pielak, G. J. (2017) Cosolute and Crowding Effects on a Side- By-Side Protein Dimer, Biochemistry 56, 971-976.

(17) Guseman, A. J., Speer, S. L., Perez Goncalves, G. M., and Pielak, G. J. (2018) Surface Charge Modulates Protein–Protein Interactions in Physiologically Relevant Environments, Biochemistry 57, 1681-1684.

(18) McConkey, E. H. (1982) Molecular evolution, intracellular organization, and the quinary structure of proteins, Proc. Natl. Acad. Sci. USA 79, 3236-3240.

(19) Monteith, W. B., and Pielak, G. J. (2014) Residue level quantification of protein stability in living cells, Proc. Natl. Acad. Sci. USA 111, 11335-11340.

(20) Cohen, R. D., and Pielak, G. J. (2016) Electrostatic Contributions to Protein Quinary Structure, J. Am. Chem. Soc. 138, 13139-13142.

(21) Cohen, R. D., and Pielak, G. J. (2017) Quinary interactions with an unfolded state ensemble, Protein Sci. 26, 1698-1703.

(22) Davis, C. M., and Gruebele, M. (2018) Labeling for Quantitative Comparison of Imaging Measurements in Vitro and in Cells, Biochemistry 57, 1929-1938.

(23) Sukenik, S., Ren, P., and Gruebele, M. (2017) Weak protein–protein interactions in live cells are quantified by cell-volume modulation, Proc. Natl. Acad. Sci. USA 114, 6776-6781.

(24) Timothy, C., Kapil, D., and Martin, G. (2018) Pressure- and heat-induced protein unfolding in bacterial cells: crowding vs. sticking, FEBS Lett. 592, 1357-1365.

(25) Creighton, T. E. (2010) The Biophysical Chemistry of Nucleic Acids & Proteins, In The Biophysical Chemistry of Nucleic Acids & Proteins, pp 488-497, Helvetian Press.

(26) Yu, I., Mori, T., Ando, T., Harada, R., Jung, J., Sugita, Y., and Feig, M. (2016) Biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm, eLife 5, 19274-19296.

(27) Ribeiro, S., Ebbinghaus, S., and Marcos, J. C. (2018) Protein folding and quinary interactions: creating cellular organisation through functional disorder, FEBS Lett. 592, 3040-3053.

100