<<

Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available.

Title Protein- interactions and structural characterization of Ralstonia solanacearum lectin

Author(s) Antonik, Pawel

Publication Date 2017-03-20

Item record http://hdl.handle.net/10379/6412

Downloaded 2021-09-23T13:45:56Z

Some rights reserved. For more information, please see the item record link above.

Protein-Carbohydrate Interactions and Structural Characterization of Ralstonia solanacearum Lectin

Paweł Antonik, M.Sc. Eng.

School of Chemistry National University of Ireland, Galway

Protein-Carbohydrate Interactions and Structural Characterization of Ralstonia solanacearum Lectin

Ph.D. Thesis

This thesis was prepared at the School of Chemistry, National University of Ireland, Galway, from October 2011 to November 2016.

I declare that the work included in this thesis is my own work and has not been previously submitted for a degree to this or any other academic institution.

Cover Art: “Candy bound” form of Ralstonia solanacearum lectin. Structure adapted from 2BT9 deposition.

Paweł Michał Antonik

Born in Zawiercie, Poland 1987

Advisor: Dr Peter B. Crowley

Teagasc Advisor: Dr Dilip Rai

External Examiner: Prof. Mike P. Williamson (University of Sheffield)

Internal Examiner: Dr Michelle Kilcoyne

Viva voce: 30th January 2017

Thesis Abstract

Ralstonia solanacearum is a bacterium which causes fatal damage to hundreds of agricultural plants such as potato or tobacco. It produces two lectins that are involved in attachment of the bacterium to the host. One of them is RSL that was recently characterized by X-ray crystallography.1 Here, we present the first studies of RSL by NMR spectroscopy. The NMR assignments and dynamic studies of sugar-free and –bound RSL were the first milestone in our study of lectin-carbohydrate interactions in aqueous solutions (Chapter 2).2 The NMR results provided valuable and unique information about anomeric-recognition effects that were not reported previously. This study also presented new insights that complement the RSL crystal structure.1 The abundance of tryptophan residues provided useful probes in the HSQC spectrum.1,2 With this broad background information we screened numerous and could interpret their binding events using CSP (Chapter 3). The recent interest in noncovalent protein-polymer conjugates sparked our imagination to test the concept with RSL. Reversible, noncovalent PEGylation of RSL using fucosylated PEG was demonstrated and characterized by NMR spectroscopy, size exclusion chromatography and native gel electrophoresis. A structural model of the RSL conjugate with fucosylated PEG was obtained by small- angle X-ray scattering experiments. Using ITC the affinity of the complex was assessed (Chapter 4).3 In the last part of this thesis we demonstrate studies of RSL in physiological and crowded environments. In cell NMR spectroscopy was not eligible to obtain a spectrum of RSL in native conditions. Instead, intra-cellular mimicry was achieved using protein crowders, gelatin and agarose gels. Using NMR spectroscopy, significant differences in the interactions of sugar-free and -bound forms of RSL were demonstrated. The studies were supported by native agarose gel electrophoresis and SDS-PAGE revealing the aspect of confinement related to pore size (Chapter 5).

(1) Kostlanova et al. J. Biol. Chem. 2005, 280, 27839–27849 (2) Antonik et al. Biochemistry. 2016, 55,1195–1203 (3) Antonik et al. Biomacromolecules 2016, 17, 2719–2725

Contents

Abbreviations 8

Chapter 1 Introduction 10

Chapter 2 Anomer-Specific Recognition and Dynamics in 26 RSL

Chapter 3 The Impact of Stereochemistry on Chemical 48 Shift Changes in Sugar-bound RSL

Chapter 4 Noncovalent PEGylation of RSL 86

Chapter 5 RSL Interactions in Physiological and Crowded 104 Environments

Chapter 6 Discussion 122

Bibliography 130

Acknowledgments 152

Curriculum Vitae 153

Appendices 156

Abbreviations

BSA bovine serum albumin CSP chemical shift perturbation DMSO dimethyl sulfoxide HSQC heteronuclear single quantum coherence ITC isothermal titration calorimetry NMR nuclear magnetic resonance PDB protein data bank PEG polyethylene glycol RSL Ralstonia solanacearum lectin SAXS small-angle X-ray scattering SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis SEC size exclusion chromatography SPR surface plasmon resonance TROSY transverse relaxation optimized spectroscopy

8

9

Chapter I

Introduction

10 Introduction

1. Lectins Lectins are known from the end of 19th century when the first sugar-binding protein was isolated from seeds of the castor tree (Ricinus communis) and named ricin. Firstly, lectins were called hemagglutins due to their feature to agglutinate the red blood cells. The name lectin, from the Latin lectus, (the past participle of legere, meaning to pick, choose, or select) was proposed in 19541 when the ability of plant agglutinins to distinguish between erythrocytes of different blood types was shown. The isolation of pure crystalline protein from jack bean (Canavalia ensiformis) later named as concanavalin A, was a milestone in lectin studies. Subsequently, concanavalin A was also the first lectin primary structure to be established and the first lectin to be solved by X-ray crystallography.2

Table 1.1. Lectin specificity groups.

Carbohydrate Source Lectin Mannose Jack bean ConA3 Lathyrus ochrus LOL N-acetylglucosamine Wheat germ WGA Agrocybe aegerita AAL23 Galactose Ricin RCA II Human GAL14 Fucose Ralstonia solanacearum RSL5 Burkholderia ambifaria BambL6 Sialic acid Influenza A virus Hemagglutin H3N27

Proteins interacting with carbohydrates are commonly found in nature, in numerous organisms like bacteria, viruses, plants or animals. Amongst biological research of carbohydrate binding proteins we find enzymes, antibodies and mainly lectins.8 The lectins are classified in five main groups (Table 1.1.) according to the saccharide for which they reveal the highest affinity which is usually weak (Kd ~ μM-mM) but highly selective.9

11 Chapter I

Figure 1.1. Cell surface lectin-carbohydrate interactions. Figure adapted from Sharon & Lis,

2004.2 The main function that lectins possess is to recognize molecules inside cells, on cell surfaces, or in physiological fluids (Figure 1.1.).2 The involvement of lectins in recognition events implies that they take part in many biological processes such as cell-cell communication, cell growth,10 cell signalling, adhesion and apoptosis.11 They also play key roles in host recognition and infection.12 The examples of lectin- specific roles are listed in Table 1.2.

Table 1.2. Functions of lectins.2

Lectin Role Microorganism Bacteria Infection, host-pathogen attachment Virus Infection Plants Legumes Symbiosis with nitrogen-fixing bacteria, defence Animals Collectins Innate immunity Galectins Regulation of cell-growth, apoptosis L-selectin Lymphocyte homing Siglecs Cell-cell interactions in the immune and neural system

1.1 Fucose Of all the six-carbon sugars present in mammals fucose is the only one with the L- configuration and it lacks a hydroxyl group on the C6 position13 (i.e. 6-deoxy-L- galactose, Figure 1.2. A). Fucose is a common component of many N- and O- linked glycans9 consequently it is co-responsible for numerous functions.13 Fucose can be a carbon source for some bacteria14 but can also play fundamental role in cell adhesion, cell-cell communication and immune defence.15,16 Fucose is present in the placenta of many species including human17 and as part of the Sialyl-LewisX moiety

12 Introduction

X Figure 1.2. Structures of (A) L-fucose and (B) Sialyl-Lewis , an example of the fucose containing glycan. (Figure 1.2. B) it has roles in functioning of erythrocytes18 and in fertilization.19 On the other hand, fucose is involved in pathological recognition events20, invasion of viruses or bacteria, metastasis and progression of tumours21 or cancer.22

1.2 Ralstonia solanacearum Lectin (RSL) Ralstonia solanacearum is a Gram-negative plant pathogen that causes the common disease lethal wilt23 in over 200 plants including economically important crops such as potato, tomato, tobacco, banana and flowering plants like Pelargonium.24 R. solanacearum is a soil-borne pathogen that enters the plant roots via wounds or secondary roots and spreads throughout plant via the xylem vessels.24,25 Ralstonia solanacearum is a β-proteobacterium and therefore belongs to a group of bacteria in which the genomic organization is still poorly characterized. Ralstonia solanacearum is the second representative of this group for which the complete genome sequence was determined and described.26 R. solanacearum produces 14 haemagglutinin-type proteins (class of surface molecules containing variable internal repeats that are structurally related to haemagglutinins) and 13 additional ORFs coding for proteins which are responsible for adhesion and attachment to the host.26 R. solanacearum uses two lectins to bind to fucose-containing xyloglucan oligomers.5,27 RS-IIL binds to L-fucose but has a higher affinity for D-mannose. RSL is a fucose binding lectin, characterized in this thesis.5 Host cell-wall carbohydrates constitute specific attachment sites for pathogen protein receptors.28 Lectin-mediated attachment to target cell-walls enables cell destruction by bacterial virulence factors. RSL has been identified as a receptor for plant-pathogen interactions.5 Inhibitors, which block the active site of RSL could reduce or eliminate the interaction between Ralstonia and plant hosts. Therefore, the discovery of RSL inhibitors is a necessary step towards the development of effective pesticides for

13 Chapter I

Ralstonia. The availability of such pesticides will help combat Ralstonia wilting and thus prevent agronomic and economic losses. However to design such inhibitors, detailed studies of ligand / sugar binding modes are necessary. The first steps were done by X-ray crystallography (Table 1.3.) but solution state research was not available until the work presented in this thesis. RSL has µM affinity towards L-fucose and fucose-containing oligosaccharides.5 RSL is formed by trimerization (Figure 1.3 A) with three active sites set between adjacent monomers (inter-monomeric) and three in the monomer between two β-sheets (intra-monomeric).5 RSL belongs to a small family of fucose- binding proteins with a β-propeller fold, that is categorized by six binding sites oriented for binding to glycoconjugates on membranes.5,6,29,30 The common feature for the lectins of this family is the presence of two Trp or Trp and Tyr residues in the binding pocket where they are responsible for saccharide binding via polar and CH···π stacking interactions (Figure 1.3. B).

Figure 1.3. (A) Cartoon representation of the RSL trimer bound to methyl-α-L-fucoside (PDB 2BT9).5 Carbohydrates are shown as spheres. (B) Two Trp side chains from RSL inter-monomeric binding site responsible for binding methyl-α-L-fucoside.

Structural characterization of RSL is relatively new with the first crystal structure obtained by Kostlanova et al. in 2005. To date, RSL was primarily studied by X-ray crystallography resulting in eight sugar-bound RSL structures deposited in the Protein Data Bank (PDB) (Table 1.3.). In addition to experimental studies some computational modelling was performed to quantify the role of the CH··· interaction31 or using in silico mutagenesis and docking to calculate saccharides binding.32 Also, the free energies of RSL-monosaccharide complexes were evaluated.33,34 The NMR spectroscopy experiments reported in this thesis are so far the only NMR studies of RSL.

14 Introduction

Table 1.3. The 8 structures of RSL deposited in the PDB.

PDB Ligand Notes 2BT95 methyl-α-L-fucoside Atomic resolution (0.94 Å) 2BS55 2-fucosyllactose 2BS65 xyloglucan fragment D77G / G84S 5AJB35 LewisX 5AJC35 Sialyl-LewisX tetrasaccharides 3ZI836 α-L-fucose, (4R)-2-methylpentane-2,4-diol R17A 4I6S36 α-L-fucose W36A, domain swapped 4CSD37 methyl-α-L-fucoside, ‘Monomeric’ RSL

Many pathogens use lectins for multivalent attachment to oligosaccharide epitopes on human epithelia.38 The modular architecture of the RSL fold is well adapted for multivalent presentation and molecular recognition of cell surfaces.39 In recent years, RSL became a new interesting model to study membrane dynamics. For instance Arnaud et al. used RSL to design neolectins with modified valency.16 The combination of synthetic systems and the modified forms of RSL led to new insights in the role of lectin multivalency. The location of all carbohydrate binding sites on the same face of RSL promoted strong avidity for membranes but was not always sufficient to induce bending of the membrane. Continuation of this research confirmed that the ability of multivalent lectins to invaginate membranes is not directly linked to avidity to the glycosylated surface.37 Modified RSL (neolectin) was used to show that the distance between two sugar binding sites must exceed 27 Å, to facilitate lectin binding to glycolipids. However, when two binding sites are separated by 15 Å the neolectin could bend the membrane. 37 Another noteworthy feature of RSL is its close similarity to Burkholderia ambifaria lectin (BambL).6 Burkholderia ambifaria, is an organism with dual action that from one side protects plants against fungal infection but also has ability to cause infections in human, particularly problematic for patients with cystic fibrosis.40,41 RSL and BambL share ~ 80% sequence identity, both are a fucose specific trimmers. The one key residue that differs is in the intra-monomeric binding sites where Trp (responsible for stacking interactions, Figure 1.3.B) is replaced by Tyr. The close relation of RSL to BambL may impact future studies as BambL is the

15 Chapter I first fucose-binding lectin with a β-propeller fold to have been identified in a human pathogen genome.6

2. Structural Biology Structural biology is a branch of molecular biology that is focused on the molecular structure of biological macromolecules, particularly proteins. Nowadays, the structural biologist can use many experimental tools such as X-ray crystallography, NMR spectroscopy, electron microscopy or small-angle scattering methods. Each method has its advantages and limitations, therefore, the best results are obtained when these techniques are combined.

2.1 X-ray Crystallography Since 1958, when the structure of myoglobin was obtained, X-ray crystallography has become the main tool to derive protein structures.42 Today, X-ray crystal structure determination of proteins is a common method that can determine atomic- resolution structures. However, X-ray crystallography has its limitations that are not simple to overcome. Firstly, to obtain any structure the sample must be crystallized. This step is still the least understood and remains the bottleneck in structure determination. Secondly, the X-ray structure is based on the protein in the crystal lattice and so any dynamics or multiple conformations may not be observed. Another limitation is the fact that the electrons of the atoms scatter the X-rays, so observation of light atoms such as hydrogen is difficult.43

2.2 Small Angle Scattering (SAS) Small-angle X-ray scattering (SAXS) and small angle neutron scattering (SANS) are the two methods that have been applied to determine the structural features of proteins in solution, with particular utility for flexible systems.44 It is also popular to use small-angle scattering data for structure validation. Atomic models obtained by crystallography or NMR can be inserted into the molecular envelopes determined by SAS methods.45 As stated already the main advantage of SAS techniques is the possibility to measure flexible molecules directly in solution. For example, SAXS and SANS have

16 Introduction been used to study PEGylated proteins46–48, while the first PEGylated protein structure determined by X-ray crystallography was just recently published by the Crowley lab.49 Small angle scattering has its limitations as well. Firstly, the obtained models are low resolution (~10 Å) and show mostly the overall shape of the structure, e.g. relative domain orientations.50 Also sample preparation is a critically important step. Small-angle scattering patterns can be obtained from any quality of sample. It means that the sample to be measured must be highly pure i.e. composed of monodisperse, identical particles to have confidence in the structural models. SAS yields data that can be easily over interpreted so the absence of quality control in sample characterization can mislead the results.44

2.3 Structure and Dynamics by Nuclear Magnetic Resonance Spectroscopy The first globular protein structure was solved by NMR in 1984,51 which was relatively late as X-ray crystallography was already used for over two decades to provide protein structures. For that matter the validity of the NMR method in its early stages was questioned.52 Despite the initial problems, NMR spectroscopy alongside X-ray crystallography has become one of the most important techniques used to determine bio-macromolecular structures.42 Typically, structure determination from NMR relies on data obtained from liquid samples thus eliminating the crystal growing step. However, structure determination of microcrystalline protein samples by solid state NMR is also possible.53 This strategy is useful for membrane proteins that are challenging to crystallize as well as proteins with large intrinsically disordered regions.54,55 NMR as a solution method has limitations including the tumbling time of the protein, the resolution and assignment of overlapping peaks, both of which are dependent on the molecular weight. Selective labelling strategies have proven especially useful for tackling high molecular weight proteins.56–58 In addition, NMR data collection and analysis can be time-consuming, and it can be difficult to assess the quality of the final structures, in particular for novel folds.55 NMR is an ideal tool to probe protein dynamics. The atomic resolution and sensitivity to motion on a wide range of time scales provides information that is not as forthcoming from other experimental techniques. Difficulties occur in the specific interpretation and quantification of the dynamic processes.55 There are many reviews which deal with the theory and applications of relaxation dispersion experiments.

17 Chapter I

The theory of biomolecular dynamics was described by Palmer.59 The general application of NMR relaxation dispersion experiments was reviewed by Mittermaier and Kay60 while Loria et al. focused on using these measurements to characterise enzyme motions.61 Ishima presented an overview of 15N relaxation experiments to characterise protein dynamics.62 NMR experiments such as the Carr-Purcell-Meiboom-Gill (CPMG) is capable of detecting protein motions in the time scale of micro- to millisecond. These experiments allow characterization of conformational exchange rate constants, the equilibrium populations of the relevant conformations, and the chemical shift differences between the conformations, for example, domain and loop motions.61,63

Classical NMR relaxation R1, R2 and heteronuclear NOE map motions in the pico- to nanosecond timescales.64 NMR relaxation dispersion experiments have been successfully used to characterize internal protein motions,65,66 protein folding63,67 and protein–ligand interactions.68 Perhaps the most consistent conclusion from the many dynamic studies is that the fast backbone dynamics of a protein almost always change upon binding to a ligand. The binding of a ligand often reduces the mobility of backbone groups in or near the binding site. In addition to influencing the stability of protein-ligand complexes, there is evidence that changes in fast backbone dynamics can affect the stability of the folded protein.69

2.4 NMR Techniques to Investigate Protein-Ligand Interactions. NMR spectroscopy is a uniquely powerful technique to investigate protein–ligand binding from weak (transient) to strong (long-lived) interactions. Four most frequent methods include chemical shift perturbation studies (CSP),70 saturation transfer difference NMR (STD NMR),71 transferred nuclear Overhauser effect (TrNOE),72 and standard nuclear Overhauser effect spectroscopy (NOESY).73,74 CSP is the major method used in this thesis so the main aspects are briefly summarized. Chemical shift perturbation is the simplest and almost always the first step for detecting interactions between protein and ligand. CSP is very sensitive to structural changes, and can be measured accurately, providing both structural and thermodynamic information on the complex. Perturbations can take the form of changes in chemical shifts but sometimes binding event or structural changes can be also observed by line broadening. CSP is usually used to detect binding and identify

18 Introduction binding sites / interfaces and often to measure binding affinities. CSP is the only method that can directly provide both a Kd value and a binding site from the same set of measurements but it must be used with caution because75 CSP is not as precise as other biophysical techniques such as ITC. To obtain a Kd value, the system must be in fast exchange.64,70

Figure 1.4. The relationship between 2D NMR resonances and the exchange rate. (Left) Slow exchange: the free peak (blue) decreases in intensity as the bound peak (red) increases. (Right) Fast exchange: peaks move smoothly from free (blue) to bound (red). Figure adapted from Williamson 2013.70

The exchange rate is defined on the NMR timescale by considering the lifetimes of the free and bound state relative to the difference in the chemical shift of these two states. Typically, strong (Kd < 5 μM) and weak ligand binding correspond to slow and fast exchange, respectively. During an NMR titration (monitored by e.g. a 2D HSQC experiment, Figure 1.4.)76 the slow exchange limit is characterized by the disappearance of the signal corresponding to the free species, with concomitant appearance of a new signal corresponding to the bound state. In contrast, in fast exchange the observed signal is an average of the two states and it appears to shift from the free to bound position.64,70

2.5 Protein-Carbohydrate Interactions by NMR. NMR has proven particularly useful to study lectins. For example, galectins are well studied from both the ligand and receptor side. Using STD and HSQC experiments molecular recognition features from the perspective of the ligand and receptor were demonstrated for the threonine O-linked TF antigen binding to galectin-3.77 Miller et al. using the same NMR methods combined with molecular dynamic simulations investigated structural aspects of galectin-1 binding to two α-linked digalactosides

19 Chapter I and melibiose.78 Examples of binding and NMR dynamics studies for other galectins like galectin-779 or galectin-980,81 have also been reported. NMR spectroscopy has been used to study other types of lectins and carbohydrate binding domains. Combined NMR experiments and high-resolution X- ray structures were implemented to clarify the interaction of the bacterial adhesion FimH lectin domain with antagonists.82 The C-type lectin DC-SIGNR that promotes infection of pathogens such as HIV was characterized by NMR in the ligand-free and -bound states of a variety of glycans.83 The significant effect of decreasing pH on structural rearrangement of DC-SIGNR was also presented.84 Simpson et al. using NMR structure determination have shown the ligand specificity of enzymatic, non-catalytic carbohydrate-binding modules. It was determined that a single residue, Arg can control the orientation of one of the tryptophan residues that binds the saccharide. The mutation of this residue to Gly caused a change in the carbohydrate binding specificity from cellulose to xylan.85 Propster et al. used NMR structure determination to elucidate the molecular basis of how Siglec-8, a human immune-inhibitory receptor selectively recognizes sulfated carbohydrates.86

2.6 Noncovalent Interactions in Protein-Carbohydrate Complexes The main aspects of protein-carbohydrate interactions derived from structural studies were summarized in recent reviews. Marchetti et al. tackled the joint approach of NMR and molecular modeling87 while Asensio et al. elaborated the recognition of carbohydrate-aromatic interactions.88 Fernandez-Alonso et al. showed how the knowledge of molecular recognition between sugars and lectins can be translated into drug design.89 CH···π stacking interactions play a key role in protein-carbohydrate molecular recognition processes. A quantitative assessment of the interactions made between protein side chains and the pyranose forms of the most-common monosaccharides was provided. A comparison across all high-resolution structures of protein−carbohydrate complexes in the PDB was performed.90 An extensive analysis of CH···π interactions in water has been completed by Jimenez-Moreno et al. They employed dynamic combinatorial methodology to allow the accurate determination of the binding free energies for 117 isolated carbohydrate/aromatic complexes. The results indicated a large diversity of chemical features at the CH

20 Introduction donor and  acceptor units.91 In a recent study Hsu et al. found that the free energy derived from sugar−aromatic interactions is moderately affected by changes in the stereochemistry of the hydroxyl groups on the sugar (e.g. glucose versus galactose versus mannose) or by hydroxyl group deletion (e.g. glucose versus cyclohexane).92

3. Biomedical Applications Selectivity, the major feature of lectins can be used on many different fronts. This section outlines some of the main considerations in terms of potential applications of proteins in a biomedical context. Also highlights some areas, where RSL can be potentially used.

3.1 Lectins as Therapeutic Tools Most cell surface proteins and many lipids in cell membranes are glycosylated and these glycans are a target for lectins. Therefore, there is a rationale to use lectins to mediate drug targeting. For example, cancer cells, often express different glycans compared with healthy ones so lectins could be used as carrier molecules to target drugs to disease-relevant cells and tissues.93 Lectins can be used as imaging agents that recognize tumours. The lectin can be attached to a fluorescent probe and play a recognition role94,95 or the lectin can be fluorescent itself and have dual roles of attachment and marker.96 It was also shown that lectins can be used to protect drugs from hydrolysis and proteolysis by stomach acids and enzymes. Leong et al. proved that lectins can be a part of novel improved carrier for the oral delivery of insulin.97

3.2 PEGylation Protein-based therapeutics are becoming increasingly important. PEGylation, the covalent conjugation of polyethylene glycol (PEG) to proteins is a common strategy in drug development. The attachment of PEG shields the therapeutic protein from interactions with the immune system.98 In addition, PEGylation reduces renal clearance by increasing the size of the protein and it improves physicochemical properties such as stability and solubility leading to improved pharmacokinetics.99– 102 Since the first appearance of a PEGylated drug, (Adagen®) on the market in 1990, about ten PEGylated drugs are currently available.103 However, PEGylation

21 Chapter I can also reduce drug recognition and in consequence hinder biological activities.104 This disadvantage led to the idea of noncovalent PEGylation. A possible route to create such noncovalent conjugates is described in Chapter 4.

3.3 Physiological environments Understanding of protein behaviour in crowded environment is a key to comprehend proteins roles within their native cell environment. Proteins are the molecular machines responsible for numerous functions in the crowded environment of the cell interior so the particular physical properties of in cell milieu need to be considered. Macromolecular crowding is important but often ignored aspects of biological systems.105,106 The composition of E. coli is relatively well-known, making it a useful model for studies of molecular biology.107 The total concentration of protein, RNA and DNA is in range of 300-400 g/L, which makes most of the cell volume unavailable to other molecules. This physical property is important as the bulk of biochemical activity occurs in the crowded cytoplasm. In addition the total concentration of inorganic ions and metabolites of the E. coli cytoplasm can reach up to 300 mM.108,109 Fundamental protein properties like charge, stability or diffusion are influenced by the surrounding environment. Thus, the proteins behaviour in complex, physiological environments can vary significantly to simple in vitro conditions. Alteration of protein structure, changes to diffusion rates and thermal stability of a protein are a few of the crowding effects that could be over-looked in simple buffer systems.110–113 The need for physiologically-relevant measurements led to the development of in-cell NMR.114 Though in-cell NMR is limited due to inability to detect most globular proteins by standard HSQC measurements.115 To overcome such an issue mimicry of the intracellular conditions (confined/crowded environments) has been developed by using e.g. polyacrylamide gels,116 proteins and polymers as potential crowders.117–120 RSL appears as great model for NMR study. It has high thermal stability and despite of relatively large molecular mass (~29 kDa trimer) gives well resolved spectrum comparable to 10 kDa protein. It is comparable to well studied model such as protein G domain B1 (GB1) which is 8 kDa detectable by in-cell NMR.121,122 Also, RSL is almost neutral (pI ~6.5) protein that should not be much affected by charge-charge interactions that can alter detection protein in cells.121,122 The studies of RSL in physiological environment were elucidated in Chapter 5.

22 Introduction

Thesis Outline This thesis comprises four results chapters presenting structural studies of RSL by using mainly NMR spectroscopy supported with other techniques such as electrophoresis, small-angle X-ray scattering, or isothermal titration calorimetry. Chapter 2 presents the NMR assignments and dynamics which were a stepping stone to the characterization of RSL interactions. The chapter comprises NMR assignments of the RSL backbone and tryptophan side chain resonances for sugar-free, L-fucose-bound and D-mannose-bound forms. The assignment of L- fucose bound RSL revealed an anomeric recognition effect that was not previously reported. Split resonances corresponding to both L-fucose anomers were observed in the 2D 1H-15N HSQC spectrum for L-fucose bound RSL. This chapter also presents relaxation data of sugar-free and -bound RSL. The changes in rigidity of RSL upon sugar binding are described. The work in this chapter was developed in collaboration with Prof. Nico van Nuland and Dr Alex Volkov (Brussels). Chapter 3 is a continuation of the story about RSL-sugar interactions. The binding of a variety of monosaccharides to RSL was revealed by chemical shift perturbation studies. It was shown that it is possible to determine the monosaccharide orientation in complex with RSL based on crystal structure and Δδ analysis. Simple modelling based on translation and rotation of the sugar coordinates in the RSL binding site supported the results. The binding affinities of two weakly binding monosaccharides were estimated by using NMR titrations. In addition to monosaccharides, studies of the interactions between RSL and open-chain sugar such as glycerol and L- were performed revealing the adaptations of flexible molecules to the lectin binding side. Also the studies of RSL in vegetable extracts resulted in the binding of D-fructose, the major sugar found in tested extracts. Chapter 4 reports a possible route to noncovalent PEGylation of proteins. In collaboration with Prof. Neil Cameron (Durham) a fucosylated-PEG was prepared. The polymer bound to the lectin creating a noncovalently PEGylated conjugated. The RSL-glycopolymer interactions were characterized using NMR spectroscopy, size exclusion chromatography and electrophoresis. The “Medusa complex” of RSL and six PEG chains was determined by using small-angle X-ray scattering. Also the glycopolymer affinity and conjugate reversibility were determined by ITC and NMR experiments.

23 Chapter I

Chapter 5 presents a preliminary study of RSL in physiological and crowded environments. We used in-cell NMR experiments to study RSL inside E. coli. As RSL was not detectable in cell it was necessary to shift the approach towards cell mimicry. Using anionic BSA and cationic lysozyme crowding and electrostatics effects were studied by NMR spectroscopy. The experiments were followed by encapsulating RSL (in both sugar-free or -bound forms) in gelatin or agarose gels. Using NMR spectroscopy and electrophoresis the effect of molecular confinement on sugar-free and –bound RSL was elucidated.

24

25

Chapter II Anomer-Specific Recognition and Dynamics in RSL

This work was published as Antonik, et al. Anomer-Specific Recognition and Dynamics in a Fucose- Binding Lectin. Biochemistry 2016, 55, 1195–1203.

The NMR assignment and dynamics data were acquired at the Jean Jeener NMR Centre, Vrije Universiteit Brussel, Belgium. The author of the thesis acquired and processed all data under supervision of Dr Alexander Volkov and Prof. Nico van Nuland. The sugar-free and L-fucose bound forms of RSL were fully assigned by P.M.A. The dynamics analyses and figures were prepared by Dr Volkov and Dr Crowley.

26 NMR Characterization of RSL

Abstract

Sugar binding by a cell surface ~29 kDa lectin (RSL) from the bacterium Ralstonia solanacearum was characterized by NMR spectroscopy. The complexes formed with four monosaccharides and four fucosides were studied. Complete resonance assignments and backbone dynamics were determined for RSL in the sugar-free form and when bound to L-fucose or D-mannose. RSL was found to interact with both the α- and the β-anomer of L-fucose and the “fucose like” sugars, D-arabinose and L-galactose. Peak splitting was observed for some resonances of the binding site residues. The assignment of the split signals to the α- or β-anomer was confirmed by comparison with the spectra of RSL bound to methyl-α-L-fucoside or methyl-β-L- fucoside. The backbone dynamics of RSL were sensitive to the presence of ligand, with the protein adopting a more compact structure upon binding to L-fucose. Taking advantage of tryptophan residues in the binding sites, we show that the indole resonance is an excellent reporter on ligand binding. Each sugar resulted in a distinct signature of chemical shift perturbations – suggesting that tryptophan signals are a sufficient probe of sugar binding.

27 Chapter II

Introduction

Carbohydrate recognition is a topic of far-reaching importance123 that continues to challenge the protein scientist82,124–127 and the synthetic chemist128–131 alike. The principal challenge lies in understanding the balance of noncovalent interactions that favour receptor binding over solvation in water. Complex formation in the case of monosaccharides has the added feature of selection between the α and the β anomeric forms. NMR spectroscopy offers the powerful tools of chemical shift perturbation (Δδ) analysis to characterize protein-ligand binding. Here, we have applied this technique to study the fucose-binding lectin5 (RSL) from the bacterium Ralstonia solanacearum. Knowledge of fucose recognition by lectins is expected to aid the development of new strategies to combat bacterial disease. For example, host-pathogen interactions may be mediated by lectins that bind to fucosylated oligosaccharides present in plant cell walls or mammalian epithelial tissue.132 In the case of R. solanacearum, extracellular fucose-binding lectins likely contribute to infection and progression of wilt disease in potato and tomato.5 RSL is a ~29 kDa trimer with a 6-bladed β-propeller fold.5 Each monomer comprises two blades, corresponding to the N- and C-terminal halves of the protein, which share ~40% sequence identity (Figure 2.1. A). There are two sugar-binding sites per monomer. Site 1 is intra-monomeric and is nestled between the N- and C- terminal blades of the monomer. Site 2 is inter-monomeric and occurs between the blades of adjacent monomers (Figure 2.1. B). Each site contains a WiXGXGWi+5 31,88 motif. The indole ring of Wi forms CH- bonds with the monosaccharide while 5 the indole of Wi+5 donates a hydrogen bond to hydroxyl O3 of fucose. The occurrence of two almost identical binding sites makes RSL an interesting model protein for Δδ analysis. The protein is exceptionally stable (Tm = 86 °C in the sugar- free form),36 which favours NMR studies. There are numerous crystal structures (including PDB 2BT9 at 0.94 Å resolution) of RSL5 and related lectins6,29,36,133 in sugar-free and -bound forms. Furthermore, ligand binding to RSL has been characterized by isothermal titration calorimetry5,31 and computational methods.31,32,34 The interactions of RSL with four monosaccharides and four fucosides were investigated by chemical shift perturbation studies on 15N-labelled RSL.

28 NMR Characterization of RSL

Figure 2.1. (A) The primary structure of RSL comprises an N- and C-terminal half, which share ~40% sequence identity. Colons and dots indicate identical and similar residues, respectively, as revealed by (B) a structural alignment. Residues belonging to the intra- and inter-monomeric binding sites are shown in grey and black, respectively. The panel on the right shows how the methyl substituent of the sugar is exposed to the solvent. Side chains shown as sticks (except Thr82) make at least one non-covalent bond with the sugar.5 Remarkably, we observed anomer-specific effects with resonance splitting in the presence of L-fucose or related monosaccharides. While many lectin-carbohydrate complexes have been characterized by HSQC spectroscopy78,79,82,83,126,127,134–140 we have not found evidence of such anomer-specific effects in the literature. NMR binding data for α- and β-methyl- or -nitrophenyl-L-fucosides together with analysis of the RSL crystal structure suggest that a Tyr side chain that flanks the intra- monomeric site may act as a steric impediment to the binding of β-L-fucosides. The backbone dynamics of the sugar-free and sugar-bound forms of RSL were characterized in the ns to s timescales. Our data indicate an increased rigidity of the RSL structure upon sugar binding, consistent with a ligand-induced fit. We show also that the tryptophan indole –NH resonance is an excellent probe of ligand binding.141,142 In the case of RSL with three Trp side chains per binding site, sugar- specific signatures were obtained.

29 Chapter II

Results and Discussion

Backbone Resonance Assignments. The 1H-15N HSQC spectrum, with a total of 105 well-resolved cross-peaks, confirmed that RSL is a symmetric trimer in solution (Appendix B). Despite the relatively high molecular weight (29.2 kDa) the protein yielded an excellent HSQC comparable to that of a 10 kDa protein. Almost complete 1HN, 15N, C′, Cα and Cβ assignments were determined for sugar-free RSL. Figure 2.2. shows the assigned 1H- 15N HSQC spectrum. Pairs of homologous residues, due to the sequence and structural similarity of the N- and C-terminal sub-domains, had resonances with closely similar chemical shifts and line widths (Figure 2.2.; S9/S52, I16/I61, R17/R62, W31/W76, D32/D77, W36/W81, and A40/A85, F41/Y86, see also Figure 8). Interestingly, no peaks were observed for the glycines that occur in the

WiXGXGWi+5 motifs (residues 31-36 and 76-81, Figure 2.1.), suggestive of exchange broadening for these turn residues.

30 NMR Characterization of RSL

Figure 2.2. Assigned 1H-15N HSQC spectrum of RSL in sugar-free form. Sample conditions were 0.25 mM protein in 20 mM potassium phosphate, 50 mM NaCl, pH 6.0 at 303 K.

31 Chapter II

Figure 2.3. Assigned 1H-15N HSQC spectrum of RSL in L-fucose-bound form. Split resonances due to α or β-L-fucose are boxed; Y37, F41, W76s, K83 and G84. Sample conditions were 0.25 mM protein in 20 mM potassium phosphate, 50 mM NaCl, pH 6.0 at 303 K.

32 NMR Characterization of RSL

Figure 2.4. Assigned 1H-15N HSQC spectrum of RSL in D-mannose-bound form. Sample conditions were 0.25 mM protein in 20 mM potassium phosphate, 50 mM NaCl, pH 6.0 at 303 K.

33 Chapter II

Chemical Shift Perturbation Due to Sugar Binding. Monosaccharide/fucoside binding to RSL was in slow-exchange on the NMR time scale. The presence of D-arabinose, L-fucose, methyl-α-L-fucoside, methyl-β-L- fucoside, nitrophenyl-α-L-fucoside, nitrophenyl-β-L-fucoside, L-galactose or D- 5,32 mannose (with Kd values ranging from < 1 - 100 µM ), resulted in distinct perturbations of the backbone 1HN, 15N resonances (Figure 2.5.). Complete backbone assignments were determined for RSL bound to L-fucose or D-mannose (see Figures 2.3. and 2.4. , respectively for the assigned HSQCs). In the case of the other sugars, only the 1H-15N HSQC spectrum was assigned. The sugars D-arabinose, L-fucose, and L-galactose have identical stereochemistry and differ only in the nature of the C5 substituent; –H, –CH3 and –

CH2OH, respectively (Figure 2.7.). In the crystal structure of RSL bound to methyl- α-L-fucoside5 the C6 methyl is inserted into a shallow hydrophobic pocket formed by the side chains of Trp10, Ile59, Ile61, and Trp76 in the intra-monomeric site or Trp53, Pro14, Ile16, and Trp31 in the inter-monomeric site (Figure 2.1.). In addition to interactions with these side chains the methyl substituent is also in van der Waals contact with two water molecules. The nature of this portion of the binding site, in particular, the water accessibility suggests a molecular basis for the broad specificity of RSL. In the case of D-arabinose the absence of the C6 methyl is likely compensated by the insertion of a water molecule (Figure 2.8. B). When L-galactose binds, its hydroxyl substituent can orient towards the solvent and there are no unfavourable clashes. The tolerance of this hydrophobic pocket to polar substituents is evidenced by the binding of D-mannose. By virtue of their similar stereochemistry, L-fucose and D-mannose can bind to the same lectins (Figure 3.2.).5,8,29

34 NMR Characterization of RSL

Figure 2.5. Plot of chemical shift perturbations for RSL backbone resonances in the presence of L-fucose, methyl-α-L-fucoside, methyl-β-L-fucoside, nitrophenyl-α-L-fucoside or nitro-phenyl-β- L-fucoside. In the L-fucose complex some resonances were split in two (red data points) due to the presence of the α- and β-monomers. Specifically, the protein-sugar contacts mediated by hydroxyls O2, O3 and O4 of L-fucose can be established likewise by the O4, O3 and O2 groups of D- mannose (Figure 2.7.). Consequently, when D-mannose binds to RSL its anomeric

35 Chapter II

hydroxyl is inserted into the hydrophobic pocket. And considering the stereochemistry of C5 in L-fucose it is apparent that RSL must bind β-D-mannose. One might also speculate that the ~50 fold lower affinity for D-mannose (compared to L-fucose) is the result of a weakened hydrophobic effect at this site. In the presence of D- arabinose, L-fucose or L- galactose, the number of resonances in the 1H-15N HSQC

Figure 2.6. Spectral regions from the overlaid 1H-15N spectrum of RSL increased by HSQC spectra of sugar-free RSL (black contours) and ~10%. The assignment of RSL RSL bound to L-fucose (red), to methyl-α-L-fucoside (navy) or methyl-β-L-fucoside (orange). (A) An bound to L-fucose revealed that example of structurally homologous residues, R17 the backbone amide resonances of and R62, with resonances of similar chemical shift. (B) Example of a resonance, A85, that is sensitive to six residues (Y37, F41, K83, G84, both anomers of L-fucose. (C) The seven tryptophan A85 and Y86) were split into two indoles are useful reporters of sugar binding (refer to Figure 2.8.). clearly resolved peaks. These pairs of resonances were attributed to the binding of the α and β forms of L-fucose, which is present in solution in a ratio of 0.3 α-L-fucose to 0.7 β-L-fucose.143 When bound to RSL, L-fucose is positioned with the anomeric hydroxyl exposed to the solvent.5,36 Thus, the α and β forms can bind with minimal steric clashes between the protein and the anomeric hydroxyl. Crystallographic evidence for this effect was reported in a RSL crystal structure (PDB 4I6S), where both anomers of L-fucose were bound.36 As such RSL is suited to binding fucose-terminated oligosaccharides. While the lack of α/β anomer discrimination or selection is known for lectins,29,133,144 we have not found NMR evidence for this effect in the literature. The assignment of the split peaks to the binding of the α and β forms was confirmed by comparison with the spectra of RSL

36 NMR Characterization of RSL

Figure 2.7. Δδavg plots of RSL in the presence of 2 equivalents of different monosaccharides. Black bars indicate resonances that were split in two due to the presence of the α- and β-anomer. Blanks correspond to prolines (P14 and P44) and residues which lacked a resonance in the sugar- 5,32 free or -bound RSL. The structure of each sugar and their Kd values are indicated. D-mannose is oriented to show how hydroxyls O2, O3 and O4 match the stereochemistry of L-fucose. in complex with methyl-α-L-fucoside or methyl-β-L-fucoside (Figure 2.6.). There was good overlap between the fucose-split resonances and the single resonance in the fucoside complex. Note that the complex of RSL with D-mannose did not result in peak splitting in the HSQC. Only the β-D-mannose can bind for reasons already discussed.

Pronounced chemical shift changes (Δδavg = 0.4-0.6 ppm) were measured for the resonances of equivalent residues Ala40 and Ala85 (Figure 2.7.). These large effects can be rationalized by comparison with the X-ray structure.5 The amide NH of Ala40 and Ala85 donate a hydrogen bond to hydroxyl O2 of methyl-α-L-fucoside (N---O2, 2.9 Å) in the intra- and inter-monomeric binding sites, respectively.

Tryptophan Indole Probes of Sugar Binding. The pattern of backbone chemical shift perturbations was similar for D-arabinose, L- fucose, L-galactose and D-mannose (Figure 2.7.). Despite the differences in chemical structure, the main interactions (e.g. hydrogen bonds and CH- bonds) were maintained by each sugar and the backbone resonances of RSL sensed similar changes to the chemical environment. In contrast, the –NH resonance of the tryptophan indole was a more sensitive reporter,141,142 and a unique pattern of shifts was observed for each monosaccharide. There are seven tryptophans in RSL, three in the intra-monomeric site (Trp10, Trp76 and Trp81), and three in the inter- monomeric (Trp53, Trp31 and Trp36). The seventh tryptophan (Trp74) does not contribute directly to either site.

37 Chapter II

Figure 2.8. (A) Plot of chemical shift perturbations for the tryptophan indole resonance in the presence of four monosaccharides (refer to Figure 9). Filled and open circles correspond to the 1 N 15 H and N data, respectively. Some resonances were split in two (e.g. L-fucose data) due to the presence of the α- and β-anomer. (B) A simplified representation of the binding site showing the homologous pairs (refer to Figure 1.1.) of tryptophan side chains as sticks and colour-coded to

match panel A. The methyl-α-L-fucoside is shown as lines and the protein backbone is represented as a ribbon. Two crystallographic water molecules are indicated by red spheres. Dashed lines ε indicate the distances between the Trp N atoms and C6 or O3 of methyl-α-L-fucoside.

Figure 10A shows the indole –NH perturbations in the presence of the four sugars. As discussed, the differences in binding for this set of sugars were focused on the C5 substituent (or C1 in the case of D-mannose, Figure 2.7.). In the crystal structure, the methyl C6 of methyl-α-L-fucoside makes van der Waals contacts with the indoles of Trp 31/76 and Trp53/10. The shortest point of contact with C6 is to the indole –NH in the Trp31/76 pair (Figure 2.8. B). Therefore, it could be expected that these indole resonances will have the largest differences in perturbation when the various sugars bind. This was clearly the case (Figure 2.8. A, yellow bars). The Trp36/81 pair is furthest from C5, and the indole –NH of these side chains makes a hydrogen bond to hydroxyl O3, which is likely to dominate the chemical shift. Consistent with this structural interpretation, the Trp36/81 resonances had similar, large perturbations irrespective of which sugar was bound (Figure 2.8, A, red bars). The Trp53/10 pair had similar perturbations for D-arabinose, L-fucose and L- galactose, while shifts of similar size but opposite sign were observed for D-mannose (Figure 2.8. A, blue bars). Finally, the indole resonance of Trp74 had negligible chemical shift perturbations as this group was not involved in sugar binding. As with the backbone amide resonances, some of the indole resonances were also split due to the binding of α- or β-L-fucose (Figures 2.6. C and 2.8 A). Again the assignment of the split signals was confirmed by comparison with the data for methyl-α- or methyl- β-L-fucoside.

38 NMR Characterization of RSL

Considering the sugar-specific signatures observed for the indole resonances (versus the similar pattern of backbone amide shifts), the use of 15N-tryptophan labelled141 samples (rather than uniform labelling) may provide a simpler probe for the characterization of sugar binding. An effective application would require two or more binding site tryptophans with structurally distinct ligand interactions.

Steric Effects in the Binding Site. The observation of distinct chemical shift perturbations for the α- and β-anomers of L-fucose and methyl-L-fucoside prompted further analysis of this effect. In the atomic resolution crystal structure of RSL5 (PDB 2BT9) the methyl substituent of methyl-α-L-fucoside is solvent exposed (Figure 2.1. B, right panel). The solvent exposure of the anomeric substituent implies that β-L-fucosides can be accommodated. However, the side chain of Tyr37 adjacent to the intra-monomeric binding site may impede the binding of bulky β-L-fucosides. The corresponding residue in the inter-monomeric site is Thr82 (Figure 2.1. B). Figure 2.9. shows overlaid spectral regions of RSL bound to nitrophenyl-α-L-fucoside, nitrophenyl-β- L-fucoside or the corresponding methyl-L-fucoside. In some cases the same chemical shift perturbation (with respect to sugar-free RSL) was observed for the α- and β-, methyl- or nitrophenyl-L-fucoside. For other resonances, the β-L-fucosides gave rise to perturbations that differed significantly from the α-L-fucoside shifts. For example, the R62 resonance was similarly altered in the presence of all four fucosides while the equivalent R17 in the intra-monomeric site was not shifted by the β-anomers. While it is difficult to pinpoint chemical shift changes to specific structural effects (due to the proximity of the intra- and inter-monomeric sites) it seems reasonable to assume that steric interactions between the β-substituent and the Tyr37 side chain (and the lack of such an interaction at Thr82) results in a different contribution to the chemical shift perturbations of binding site resonances. A role for tyrosine side chains in lectin selectivity has been proposed previously, for example, the tyrosine gate in fimH.82,127 Perhaps, the single Tyr37 serves to modulate the binding affinity of the three intra-monomeric sites in RSL.

39 Chapter II

Figure 2.9. Regions from the overlaid 1H-15N HSQC spectra of pure RSL (black contours) and

RSL bound to methyl-α-L-fucoside (navy), methyl-β-L-fucoside (dark orange), nitrophenyl-α-L- fucoside (light blue) or nitrophenyl-β-L-fucoside (light orange). The similarities / dissimilarities in chemical shift perturbations due to sugar binding are evident. β substituents resulted in a distinct pattern (see C75, T82 and G84).

40 NMR Characterization of RSL

Dynamics of Sugar-free and Sugar-bound RSL. The dynamics of the RSL backbone on the ps-ns timescale were assessed by using the residue-based squared-order parameter (S2).145 With an average S2 = 0.82 ± 0.08, the sugar-free RSL backbone was rigid, except for the homologous loop regions comprising residues 32-35 and 77-80 (Figure 2.10., upper panels and compare Figure 2.1.). These considerably flexible (S2 < 0.75) loops contain two glycines (no amide tryptophans in the intra- and inter-monomeric binding sites. The flexibility of

2 15 Figure 2.10. Squared order parameters (S ), fast dynamics ( N R2/R1 profiles), and exchange 2 contribution to the transverse relaxation rates (R2ex) of RSL in the sugar-free and -bound forms. S values were calculated in RCI145 from the backbone amide secondary chemical shifts. Note the flexible (S2 < 0.75) loop regions comprising residues 32-35 and 77-80. The dashed lines are the averages for the sugar-free data. The RSL secondary structure is indicated.

41 Chapter II

Figure 2.11. The RSL monomer (PDB 2BT9)5 coloured by

R2ex values (see scale bar) in the sugar-free (upper), D- mannose-bound (middle) and L-fucose-bound (lower panel)

forms. The residues for which R2ex data could not be obtained are grey. The intra-monomeric binding site residues are shown as sticks, with the backbone nitrogen

shown as spheres. Methyl-α-L-fucoside is shown as sticks.

these loops was not affected by binding to either L- fucose or D-mannose. Residues 40-44 and 64-72 exhibited minor changes in the order parameters (|ΔS2| < 0.05), while S2 for the rest of the backbone remained constant. These data suggest that sugar binding did not affect the RSL backbone dynamics on the ps-ns timescale. 2 15 To complement the S analysis, N R1 and

R2 relaxation parameters were also measured. Governed by the rotational diffusion of a molecule,

the R1 and R2 relaxation rates can provide information on the hydrodynamic properties of the

protein. A small, uniform decrease in the R2/R1 ratios was observed for the sugar-bound forms of RSL compared to sugar-free RSL (Figure 2.10.,

middle panels). Quantitative analysis of the R2/R1 profiles yielded isotropic diffusion tensors with

rotational correlation time c = 12.42 or 11.75 ns for the sugar-free and -bound RSL, respectively.69 These data are consistent with an RSL trimer in solution and suggest that the protein adopts a more compact structure upon sugar binding. RSL dynamics were probed further by Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion spectroscopy.61,146 In this experiment, protein groups that undergo conformational exchange on the μs-ms timescale give rise to relaxation dispersion curves, which provide an exchange contribution (R2ex) to the transverse relaxation rate R2. In sugar-free RSL, ~40% of residues experienced R2ex > 0 (Figure

42 NMR Characterization of RSL

2.10., lower panels), suggesting that the protein was in conformational exchange between two or more lowly-populated species. The clustering of these exchange effects suggests the possibility of concerted motions. Mannose binding moderately

altered the distribution of the R2ex terms, with a reduction in the

number of residues with R2ex contributions. In contrast, fucose

binding abolished the R2ex terms for most of the protein (Figures 2.10. and 2.11.). Such a drastic

decrease in R2ex values suggests that the conformational dynamics observed in sugar-free RSL were quenched upon fucose binding.

These data, together with the c Figure 2.12. Slow dynamics of RSL bound to L- fucose. Selected spectral regions of the 15N ZZ- results, indicate an increased

exchange experiments acquired with tmix = 0 s rigidity of the sugar-bound (black), 0.23 s (blue), and 0.5 s (red) showing the protein.123,134,139 resonances of (A) W81 and W36 (indoles) and (B) Finally, by using 15N ZZ- A85. Dashed rectangles connect the four exchange exchange spectroscopy,146 we cross-peaks. Note the intensity decrease of the auto- peaks (AA, BB) and the concomitant increase of the studied the slow dynamics of RSL cross-peaks (AB, BA) at longer mixing times. (C) bound to the α- and β-anomer of L- Quantitative analysis of the build-up curve for the fucose. In this experiment, exchange cross-peaks of A85. See Methods for the magnetization arising from species fit parameters. -1 3 in slow exchange (1 s ≤ kex ≤ 10 s-1) gives rise to symmetry-related cross-peaks. The build-up of the cross-peak intensities, followed in a series of spectra recorded with varying mixing time, can be used to extract the kinetic parameters. At long mixing times (0.5 - 1 s), the

43 Chapter II resonances of RSL bound to α- or β-L-fucose featured distinct cross-peaks (Figure 2.12.), indicating that the two bound forms were in exchange. Only the A85 resonances were resolved sufficiently for quantitative analysis (Figure 2.12. C) and -1 an exchange constant kex = 2.22 ± 0.25 s was obtained (see Methods). Consistent with the µM binding affinity5 the NMR data suggest that the α- and β-L-fucose- bound forms of RSL undergo very slow exchange. The dynamics of sugar-binding proteins have been assessed in numerous studies, some of which have pointed to a role for conformational entropy.82,127,137,138 In some examples, it was found that the sugar-bound form was more dynamic than the sugar-free form, with the entropy gain being thermodynamically favorable for ligand binding.138 In the case of RSL, the protein dynamics were reduced in the sugar-bound form. Upon binding to L-fucose the protein became more compact

(decreased c) and motions on the μs-ms timescale were frozen out. Interestingly, the fast dynamics analysis revealed flexible loops (S2 < 0.75) adjacent to both the intra- and inter-monomeric binding sites. Here, the flexibility was not affected by binding to monosaccharides. These loops extend away from the binding site and may play a structural role in RSL interactions with cell surface oligosaccharides.

Conclusions

Using NMR spectroscopy, we have shown that the bacterial lectin RSL binds both the α- and the β-anomer of L-fucose and “fucose like” sugars. Anomer-specific chemical shift perturbations were observed for methyl- and nitrophenyl-substituted L-fucosides. A 15N ZZ-exchange experiment confirmed that RSL binds to α- or β-L- fucose, which exchange on the second timescale. This appears to be the first example of such anomer-specific interactions measured by NMR. Analysis of the atomic resolution crystal structure suggests further that the accessibility of the intra- and inter-monomeric binding sites is different. A Tyr side chain (Tyr37) adjacent to the intra-monomeric may act as a steric impediment to the binding of β-L-fucosides with bulky substituents. The relaxation data reveal that RSL adopts a more compact structure when bound to L-fucose, suggesting a ligand-induced fit. The protein compaction due to sugar-binding was accompanied by decreased backbone motions on the μs-ms timescale. Finally, in addition to the conventional analysis of backbone

44 NMR Characterization of RSL amide chemical shifts we show that the tryptophan indole resonance is an excellent reporter of ligand binding, with characteristic Δδ signatures observed for each sugar.

Experimental

Materials: Carbohydrates were purchased from SIGMA (D-arabinose, L-fucose, L- galactose, D-mannose) or Carbosynth (methyl-α-L-fucoside, methyl-β-L-fucoside, 4- nitrophenyl-α-L-fucoside and 4-nitrophenyl-β-L-fucoside). 13C-labelled D-glucose 15 and (NH4)2SO4 were bought from Cortecnet.

Protein Production: 15N- and 13C/15N-RSL samples were produced in E. coli BL21 transformed with the plasmid pET25rsl.5 Cultures were grown to mid-log phase on LB and then transferred to minimal medium (supplemented with 75 mg/ml 15 15 carbenicillin). For N-labeling the medium contained 1 g/L (NH4)2SO4 as the sole nitrogen source. For 13C/15N-labeling 2 g/l 13C-labeled glucose was added as the sole carbon source. RSL was purified by affinity chromatography (D-mannose-agarose resin, SIGMA) on an ÄKTA FPLC, as described previously.5 Extensive dialysis was required to remove D-mannose.5 The protein concentration was determined by UV -1 -1 spectroscopy using an extinction coefficient ε280 = 44.6 mM cm .

NMR Chemical Shift Assignments: The samples contained 2 mM 13C/15N-RSL in

20 mM potassium phosphate, 50 mM NaCl pH 6.1 and 6% D2O for the lock. The sugar-bound samples were prepared by the addition of D-mannose or L-fucose to a final concentration of 24 mM (i.e. two equivalents). NMR experiments were performed at 303 K on Varian NMR Direct-Drive System 600 and 800 MHz spectrometers, the latter equipped with a salt-tolerant 13C-enhanced PFG-Z cold probe. All NMR data were processed in NMRPipe147 and analysed in CCPN.148 The assignment of the backbone resonances were determined from a set of 2D 1H-15N HSQC and 3D HNCACB, CBCA(CO)NH, and HNCO experiments (Appendix A).149 The tryptophan indoles were assigned from a combination of 2D (HB)CB(CGCD)HD150 and 3D 15N- and 13C-NOESY-HSQC spectra, that provide Cβ-Hδ correlations and strong intra-residue Hδ-Hε NOEs, respectively. Finally, the side chain NH2 resonances of asparagine and glutamine residues were assigned from

45 Chapter II the CBCA(CO)NH spectrum by correlating with the corresponding Cα/Cβ (Asn) or Cβ (Gln) chemical shifts. The assignments for RSL (Appendix B) have been deposited in the BMRB with accession numbers 25950 (bound to D-mannose), 25951 (bound to L-fucose) and 25952 (sugar-free).

Sugar Binding and Chemical Shift Perturbations: Samples contained 0.25 mM 15 N-RSL in 20 mM potassium phosphate, 50 mM NaCl pH 6.0 and 10% D2O. Monosaccharides or fucosides were added to a final concentration of 3 mM. Spectra were acquired on pure RSL and sugar-bound RSL samples at 303 K on an Agilent 600 MHz NMR spectrometer equipped with a HCN cold probe. The differences in chemical shifts () between the pure and sugar-bound RSL were measured in 148 2 CCPN and the average perturbations were calculated as Δδavg = {[ΔδH + (0.2 × 2 ½ 70 ΔδN )]/2} .

NMR Dynamics: Data were acquired on samples of sugar-free RSL and RSL bound 15 to L-fucose or D-mannose. The backbone N R1, R2, CPMG R2ex relaxation dispersion, and 15N ZZ-exchange experiments were recorded on a 600 MHz 15 spectrometer at 303 K. The N R1 and R2 relaxation rates were obtained from a series of 2D experiments with coherence selection achieved by pulse field gradients.151 The relaxation parameters and corresponding errors were extracted with 148 CCPN, and the transverse relaxation rates were corrected for the R2ex contribution (see below). The diffusion tensors of the free and bound RSL were obtained from the experimental R2/R1 values using the programme r2r1_diffusion (A. Palmer). 15 Residues with a N R2 higher than one standard deviation from the average (obtained for the entire set of the backbone amides) likely exhibit significant internal motions152 and were excluded from the analysis. The model selection was performed using F statistics as described elsewhere.153 The CPMG relaxation dispersion experiments154 were recorded with 0, 25, 50 (in duplicate), 75, 125, 175, 225, 350, 550, 750 (in duplicate), and 1000 Hz pulse repetition rates. The peak intensities at each repetition rate were determined in 155 CCPN, and the subsequent analysis was performed in NESSY, providing the R2ex values and their errors.

46 NMR Characterization of RSL

The 15N ZZ-exchange experiments156 were acquired as a series of interleaved 2D spectra with mixing times of 0 (in duplicate), 20, 50 (in duplicate), 90, 150, 230, 340, 500, 720, 1030, and 1500 ms. The ZZ exchange intensity build-up curves were analyzed using a composite ratio of four peak intensities (two auto-peaks, AA and BB, and two cross-peaks, AB and BA) according to Eq. 1:157

where I(t) is the corresponding peak intensity at the mixing time t and kon and koff are, respectively, the association and dissociation rate constants for the exchange 2 process. A good fit ( red = 0.00069) was obtained for the A85 data, with  = 1.235 ± 0.022 s-2. For a system in a two-state A ↔ B equilibrium, the equilibrium constant is given by K = [A]/[B] = k1/k-1, where k1 and k-1 are first-order association and dissociation rate constants, respectively, and the relative populations of the two species, [A]/[B], can be estimated from the ratio of the corresponding peak 157 intensities in the fully-relaxed HSQC spectrum. With K = [A]/[B] = k1/k-1 = 0.90 ± 0.12 for the A85 backbone amide resonances of RSL bound to the α- or β-anomers -2 -1 of L-fucose and  = k1k-1 = 1.235 ± 0.022 s , we obtain k1 = 1.05 ± 0.16 s and k-1 = -1 1.17 ± 0.20 s , yielding the final value of the exchange rate constant kex = k1 + k-1 = 2.22 ± 0.25 s-1.

47

Chapter III The Impact of Sugar Stereochemistry on Chemical Shift Changes in Sugar-bound RSL

Antonik and Crowley, manuscript in preparation

49 Chapter III

Abstract

Ralstonia solanacearum lectin was characterized by NMR spectroscopy in the presence of fifteen carbohydrates including twelve monosaccharides, one disaccharide, one open-chain sugar and glycerol. Chemical shift perturbation analysis revealed a relationship between the magnitude of the chemical shift changes, the sugar stereochemistry and binding affinity. Simple models of each carbohydrate in complex with RSL were built using the NMR data and the crystal structure of methyl--L-fucoside bound to RSL. We further revealed the importance of stereochemistry and type of C5 substituent to defining why the ‘L-fucose like’ scaffold is the most favoured for binding to RSL. Also, the binding affinities of D- galactose, D-glucose, glycerol, and L-fucitol were estimated from NMR titrations. L- fucitol was found to bind RSL with a ~1000-fold lower affinity than L-fucose, suggesting a large entropic contribution due to preorganization of the ligand.

50 Stereochemistry and Chemical Shift Changes

Introduction

Lectin-carbohydrate interactions is a topic of broad interest and crucial to understanding the ‘sugar code’.123,158 Carbohydrates are the most abundant type of biomolecule in nature and are involved in numerous biological processes as polysaccharides, oligosaccharides, glycopeptides or glycoproteins.15 As a consequence, carbohydrate-biomolecule recognition has attracted the attention of both carbohydrate159–161 and protein scientists. 82,127,162 At first glance, sugars can be tricky to analyse due to features such as stereochemistry, anomerization or isomerisation.123,158 Lectins are carbohydrate-binding proteins (including carbohydrate recognition domains) that can be considered as the “sugar detectors” and help to “encode” the message provided by sugars. Lectins are also known for their high specificity to one type of monosaccharide, often exclusively to one anomer but sometimes they can interact with structurally unrelated monosaccharides at the cost of binding affinity.8 Lectins can bind sugars with common structural features like D- mannose and L-fucose or N-acetylneuraminic acid and N-acetyl-glucosamine8 involving various weak forces like van der Waals interactions, hydrogen bonding and CH–π interactions.90 This complexity of interactions implies specialization in recognition of a specific epimer.123 NMR spectroscopy is a powerful technique to monitor proteins at the atomic level. In particular, chemical shift perturbation (Δδ) measurements provide insight into protein-ligand interactions. CSP is highly sensitive to structural changes and can be measured accurately by using relatively simple 2D HSQC experiments. In the 1H−15N HSQC experiment, one cross peak for each amide is observed; therefore upon the addition of a ligand, changes in chemical shifts belonging to residues of the receptor can be monitored.87 This type of experiment enables screening multiple ligands to obtain rapidly (in contrast to X-ray crystallography) comparative binding data. Moreover, in some cases both the binding site identity and the binding affinity can be obtained from one set of experiments.70 These features have resulted in the wide application of NMR spectroscopy lectin-carbohydrate investigation.86,140,163,164 Here, the question we addressed was whether it is possible to explain chemical shift changes based on sugar stereochemistry and knowledge of the binding site structure.

51 Chapter III

As the model lectin we used RSL, a well characterised protein with a high resolution crystal structure (0.94 Å), biophysical studies for a range of carbohydrates5 and NMR assignments for sugar-free and -bound forms (Chapter 2).165 RSL has two almost identical intra- and inter-monomeric binding site,5 (Figure 3.1.) that include six tryptophans that are directly involved in binding to the sugar. This unique feature is advantageous in studying carbohydrate binding by NMR.165 Analysis of only the tryptophan indole resonances can confirm binding and help to understand the effect of the sugar stereochemistry.

Figure 3.1. Model of methyl-α-L-fucoside (sticks) in the intra-monomeric binding site of RSL 5 (PDB 2BT9). Non-covalent interactions between RSL residues and methyl-α-L-fucoside are indicated (dashed lines, distances in Å) and match the colour scheme of the Δδ plots in Figure 3.4. Numbers of homologues residues are indicated as intra- and inter-monomeric binding site, respectively. The grey dashed lines show interactions between R17, D32 and Y37. . The interactions of RSL with fifteen carbohydrates were studied by chemical shift perturbation analysis; despite the specificity towards L-fucose, chemical shift changes were observed for RSL in the presence of all fifteen tested carbohydrates. Previously, lectins have been studied with a variety of saccharides by ITC5,166, X-ray crystallography,144,167 and NMR,91,92,136 but the diversity of carbohydrates in our work is much broader. Here, we attempt to show how sugar stereochemistry can affect binding to RSL. Note that a similar study involving a small, designed glycoprotein with different monosaccharides was published recently.92 Structural models of each sugar bound to RSL were built based on both modelling and NMR experiments. Thus, we were able to generalize how the substituent size and stereochemistry can enhance or decrease the RSL-carbohydrate interactions. We show also chemical shift perturbations for RSL in the presence of glycerol or L-

52 Stereochemistry and Chemical Shift Changes fucitol. These data indicate that flexible alcohols can be accommodated in the sugar binding site. Finally, we present data for sucrose that suggests RSL selectivity to pyranose sugars. RSL was screened in the presence of alternative materials such as vegetable extract. Data obtained for RSL-carbohydrate interactions was used to characterize content of vegetable juice that resulted in determination of D-fructose in e.g. carrot juice.

53 Chapter III

Results and Discussion

The interactions of RSL with twelve monosaccharides, a disaccharide, an acyclic sugar and glycerol were screened by 1H-15N HSQC spectroscopy. Unless otherwise indicated all spectra were acquired on 0.25 mM RSL in the presence of 3 or 100 mM sugar for strong and weak binders, respectively in 20 mM KPi, 50 mM NaCl, pH 6.0. The data for D-arabinose, L-fucose, L-galactose and D-mannose were presented in Chapter 2. The data are shown here also for ease of comparison.

The crystal structure of RSL shows methyl--L-fucoside (Kd ~2 μM) bound in the 1 5 energetically favourable C4 conformation (Figure 3.1.). In principle, D/L pyranoses 4 1 prefer the C1 / C4 chair conformations, respectively with D-arabinose being an 4 168 4 exception that favours C1. As shown in Chapter 2, D-mannose is a C1 sugar with “fucose-like” stereochemistry. D-mannose can be oriented such that hydroxyls O2, O3 and O4 correspond to L-fucose hydroxyls O4, O3, and O2. Consequently the anomeric –OH of D-mannose replaces the L-fucose C5 methyl (Figure 3.2. A). The term “L-fucose like” was introduced to indicate monosaccharides that 1 have the same ring conformation ( C4) and stereochemistry as L-fucose but with different substituents at C5 (e.g. L-galactose). Sugars that can mimic L-fucose stereochemistry, as in the case of D-mannose, were included in the “L-fucose like” group. The tested pyranoses (Table 3.1. and Figure 3.3.) revealed two different binding regimes. Sugars that bind tightly to RSL (Figure 3.3., left column: L-fucose, L-galactose, D-arabinose, D-fructose, D-mannose, D-lyxose, L-gulose) were observed in slow exchange while weakly binding sugars (Figure 3.3., right column: L- rhamnose, D-altrose, D-galactose, D-glucose, D-xylose) were in fast exchange on the NMR time scale. Upon binding to sugar-free RSL, the strong or weak binders resulted in large (Figure 3.3. left) or medium/small (Figure 3.3. right) chemical shift changes. Furthermore, the CSP pattern was similar for all strong binding sugars (Figure 3.4., left column), while weak binders (Figure 3.4., right column) presented more diversity (e.g. L-rhamnose) and / or resonance broadening (e.g. D-altrose). The indole spectral region for all of the sugars (Figure 3.5.) revealed a similar magnitude of CSP for homologous residues. Those indoles with strong specific interactions (e.g. hydrogen bond at Trp86/31, Figure 1) had larger CSPs compared to

54 Stereochemistry and Chemical Shift Changes weakly involved groups (e.g. Trp53/10 indole -NH is remote from sugar). The indole of the non-binding Trp74 was least affected by the presence of the different carbohydrates. Figure 3.6. shows the Δδ plots for the indole resonances. Here, the analysis is limited to only seven resonances (compared to ~85 backbone resonances), which aids the initial prediction of sugar binding.

Figure 3.2. (A) D-mannose mimicry of L-fucose and (B) possible reorientation of D-glucose to mimic D-mannose. L-fucose, the tightest binding monosaccharide to RSL (Table 3.1.) is treated as the sugar template which presents the most favourable topology. We assumed that if another sugar binds to RSL, it adopts the orientation with the closest match to the L-fucose framework. Analyses of the CSPs and simple sugar-RSL models (described in Methods) revealed that monosaccharides can mimic the conformation of RSL- bound L-fucose. In Chapter 2 we have shown how D-mannose obtains almost perfect mimicry of L-fucose. For clarity, this feature of D-mannose is shown in Figure 3.2. A. Figure 3.2. B presents two from six possible orientations of D-glucose in RSL binding site. The probable model was picked based on NMR data and structural comparison with other monosaccharides. The detailed NMR data analyses are described in the following sections.

55 Chapter III

Table 3.1. Summary of all carbohydrates mentioned in this chapter. Carbohydrate structures are shown in Figures 3.3. and 3.21.

a b Carbohydrates Kd (μM) Complex Structure Δδ / Exchange

5 Crystal L-fucosec SPR ~2 large / slow PDB 4I6S36

5 Docking L-galactosec SPR ~10 large / slow α/β-L-galactose34

Docking 32 34 D-arabinosec SPR ~50 Me-β-D-arabinose large / slow β-D-arabinose32

34 Docking D-fructose SPR ~90 large / slow α/β-D-fructopyranose34

Docking 32 34 D-mannosec SPR ~100 Me-α-D-mannose large / slow α-D-mannose32

d D-lyxose N.D. N.D. medium / slow

L-gulose N.D. N.D. medium / slow

Docking

α/β-L-rhamnose34 L-rhamnose N.D. medium / fast α-L-rhamnose32

D-altrose N.D. N.D. small / fast*

Docking D-galactose NMR ~1500 small / fast* α/β-D-galactose34

D-glucose NMR ~3500 N.D. small / fast*

D-xylose N.D. N.D. small / fast*

L-fucitol NMR ~3500 N.D. small / fast*

Glycerol NMR ~2500 N.D. medium / fast

Sucrose N.D. N.D. minor / fast* a b c d The method and Kd are indicated, See Figure 3.4. for Δδ plots, See Chapter 2, Not determined, *indicates fast exchange with a few resonances in intermediate regime

56 Stereochemistry and Chemical Shift Changes

Figure 3.3. Strongly (left column) and weakly (right column) binding carbohydrates in their preferred conformation. The strong binders are ranked by Kd (Table 1). The substituents in red 1 4 ( C4) / blue ( C1) differ from the topologically equivalent groups in L-fucose / D-mannose, respectively.

57 Chapter III

Figure 3.4. Plots of 1HN backbone chemical shift perturbations for RSL in the presence of each sugar. The coloured lines highlight residues that form noncovalent bonds with methyl--L- fucoside (see Figure 3.1.). Dashed lines indicate resonances that were broadened beyond detection or not assigned. Blanks correspond to prolines (P14 and P44) and residues (S2, G33, G35, G78, G80), which lacked a resonance in the sugar-free or -bound RSL. See Chapter 2 for details of the anomer effect on CSP.

58 Stereochemistry and Chemical Shift Changes

Figure 3.5. Indole spectral region from the overlaid 1H-15N spectra of RSL bound to 15 carbohydrates. Bright and dark colour-code corresponds to peaks from the inter- and intra- monomeric binding site, respectively. The colour scheme matches Figure 2.8. B. Dashed contours indicate resonances that were detectable down in the noise. Grey contours correspond to backbone resonances.

59 Chapter III

Figure 3.6. Plots of chemical shift perturbations for tryptophan indole resonances in the presence of each sugar. Filled and open circles correspond to the 1HN and 15N data, respectively. Colour- code matches Figure 3.5.; bright and dark colours correspond to Trp side chains in the inter- and intra-monomeric binding site, respectively. Trp74 is a non-binding residue. Dashed lines indicate resonances that were broadened beyond detection.

60 Stereochemistry and Chemical Shift Changes

Tight Binders / Slow Exchange / Large  Protein sugar binding was analysed by HSQC spectra of 15N labelled RSL in the presence of 2 eq. (3 mM) of each sugar. The tightly binding sugars resulted in a similar pattern of backbone chemical shift changes in RSL (Figure 3.4., left column). The binding affinity of the tight binders is determined by the C5 substituent in the order -CH3 > -CH2OH > -H > -OH (Figure 3.3.). This observation is based on the Kd values for L-fucose, L-galactose, D-arabinose / D-fructose and D-mannose (Table 3.1.). Three examples of “L-fucose-like” sugars (with the same ring conformation or perfect mimicry of stereochemistry) were described in Chapter 2 (L-galactose, D- arabinose and D-mannose). Here, we will focus on the other three D-fructose, D- lyxose and L-gulose. The analyses of each sugar model (Methods) were performed similarly. The initial pick of the model was based on Δδ plots of Trp indoles (Figure 3.5.), which provide unique information about the binding event. For example, indoles of

Trp53/10 can sense accurately the presence of C5 -H, -CH3, -CH2OH or -OH groups (Chapter 2) which helps to identify how a sugar binds. Further, more detailed analyses were elucidated from Δδ plots of RSL backbone resonance with a focus on CSPs for residues that belong to the binding sites. For instance, the Ile16/61 pair is sensitive to the same environment as Trp53/10 indoles so it was used to confirm initial pick. Diversity between the binding sites assists the deduction of any steric effects observed by CSPs at Tyr37/Thr82.

D-fructose is an example of a “fucose-like” sugar. In aqueous solution D-fructose is a mixture of pyranose and furanose forms (Figure 3.7. B). At equilibrium, D-Fructose is 57% β-D-pyranose, 31% β-D-furanose,

Figure 3.7. (A) Fischer projection of D-fructose 9% α-D-furanose, and 3% α-D- (B) The four conformations of D-fructose and 169 pyranose. Structurally D- their % abundance at chemical equilibrium in solution.169

61 Chapter III

fructopyranose is like D- arabinopyranose with a hydroxymethyl substituent at the anomeric carbon (Figure 3.3.). As D-fructopyranose has identical stereochemistry to L-fucose, the strength of RSL binding is likely to depend on the C5 substituent (-H, like D-arabinose) and the steric effect of the C1 substituent. This analysis is in agreement with the known binding affinities for D-arabinose, D-fructose and D-mannose, ~50, 90, and 100 μM, respectively.32,34 Interestingly, some of the backbone amide spectral changes in RSL were more similar for D-fructose and D-mannose rather than D-arabinose Figure 3.8. Spectral regions from the overlaid (Figure 3.8.). However, the indole Δδ 1H-15N spectra of sugar-free RSL (black contours) and RSL bound to D-fructose plots (Figure 3.6.) were more similar (magenta), D-mannose (green) or D-arabinose (light blue). Note homologous residues for D-fructose and D-arabinose, Ile16/61, Arg17/62, Cys30/75 and Tyr37/Thr82 implying that the C5 proton is adjacent to Trp53/10. This effect was also observed for the Ile16/61 resonances (Figure 3.8. C), residues which complete the C5 binding pocket. D-mannose like interactions, were observed at resonances such as Cys30/75, Thr82 (Figure 3.8. A), or Arg17/62,

Glu73 (Figure 3.8. B). The bulky -CH2OH group at the anomeric carbon of D- fructose (similar to the -CH2OH at C5 in D-mannose) is expected to affect differently the Tyr37/Thr82 pair. The former was shifted similarly but the latter had a smaller Δδ than observed for D-mannose bound RSL (Figure 3.8. A). Structural differences between the binding pockets are also evident from the Arg17/62 resonances. A small hydrogen bond network from the bulky Tyr37 to Glu32 and Arg17 (Figure 3.1., grey lines), cannot be formed by the corresponding Thr82, Glu77 and Arg62. The bulky -

CH2OH substituents that interact with Tyr37 likely contribute to chemical shift perturbations of Arg17 for both D-mannose and D-fructose (Figure 3.8. B). The absence of this interaction in the inter-monomeric binding site was suggested by the

62 Stereochemistry and Chemical Shift Changes same CSP for Arg62 when D-mannose, D-fructose or D-arabinose was bound. Structural and NMR studies resulted in the conclusion that RSL must bind to β-D- fructopyranose that is the most abundant form in solution. The β anomer is also energetically favoured due to a hydrogen bond with Tyr37, identified in a computational study.34 Although, the NMR experiments did not indicate any anomeric effect (split resonances, see Chapter 2), binding to the α anomer cannot be excluded34 but is unlikely considering the low abundance (3% α-D-fructopyranose).

D-arabinose (Kd ~50 μM) was the weakest binding carbohydrate with split resonances. Therefore, the anomeric effect might be less pronounced for monosaccharides with Kd > 50 μM such as D-fructose (Kd ~90 μM).

Figure 3.9. (A) Simplified representation of L-fucose bound to RSL based on 2BT9 crystal 5 structure. Non-covalent interactions (dashed lines) between RSL residues and L-fucose are indicated and match the colour scheme of the Δδ plots in Figures 3.4. and 3.6. Residue numbering is the same as Figure 3.1. (B) Representation of D-lyxose and L-gulose in the likely RSL-bound orientation. Both D-lyxose and L-gulose in complex with RSL (Figure 3.9.) gave rise to Δδ plots that were similar to the tight binding sugars (Figure 3.4.). Homologous resonances e.g. Ile16/61, Arg17/62, or Cys30/75 had comparable CSPs suggesting similar binding events (Figure 3.10.). While D-lyxose was a tight binder (saturation at 3 mM) the same concentration of L-gulose resulted in two sets of peaks in the HSQC, corresponding to L-gulose-bound and sugar-free RSL. At 30 mM sugar the protein was saturated. D-lyxose is similar to D-mannose without the C5 substituent and because it better retains the overall topology of L-fucose (Figure 3.9.) it binds tighter than L-gulose. L-gulose is a C3 epimer of L-galactose, (Figure 3.3.). Although, the axial C3-hydroxyl may be accommodated in the binding site with minimal clashes at Trp76/31 the loss of hydrogen bond interactions with Glu28/73 and Trp81/36 (Figure 3.9.) is the likely cause of the lower affinity. Apart from C3 the remaining substituents may interact with RSL similar to L-galactose.

63 Chapter III

Figure 3.10. Spectral regions from the overlaid 1H-15N spectra of sugar-free RSL (black contours) and RSL bound to D-lyxose (orange), L-gulose (blue), D-mannose (green) or L-fucose (red).

Weak Binders / Fast Exchange / Small  The five weakly binding sugars share at least one difference in stereochemistry with respect to L-fucose or D-mannose (Figure 3.3., right column). L-rhamnose is a C2/4 epimer of L-fucose. D-altrose is a C3 epimer of D-mannose. D- galactose is the enantiomer of the tightly binding L-galactose and the C2/4 epimer of D-mannose. D-glucose and D-xylose are C2 epimers of D-mannose, and D-xylose lacks the C5 substituent.

64 Stereochemistry and Chemical Shift Changes

Figure 3.11. (A) Simplified representation of L-fucose bound to RSL. Legend in Figure 3.8. (B) Representation of L-rhamnose, D-altrose, D-galactose, D-glucose and D-xylose in RSL-bound orientations that best mimic the L-fucose interactions . The presence of 3 mM D-altrose, D-galactose, D-glucose or D-xylose resulted in minor chemical shift perturbations of RSL. These complexes were studied also at 30 and 100 mM of sugar and binding was observed to be in fast-exchange on the NMR time scale. In contrast, 30 mM of L-rhamnose resulted in RSL saturation, indicating that it was the tightest binder of this group. With increasing sugar concentration, larger CSPs were observed and line-widths increased or were broadened beyond detection suggesting intermediate exchange effects for some resonances. For example, the peaks of Ala40 and/or Ala85 were broadened beyond detection in D-altrose, D-galactose, D-glucose and D-xylose. Broadening is not always an indication that the affinity is strong.70 In fact, the spectra of RSL in complex with strongly binding carbohydrates had decreased line widths compared to unbound RSL. For instance, methyl-α-L-fucoside bound to RSL resulted in an average ~3 Hz decrease in line-width compared to the sugar-free form. Therefore, the broadening with weak binders suggests some other events like a pre-equilibrium, conformational change of the protein before binding or a structural rearrangement of the protein-ligand complex after binding.70 The weakly binding sugars were ordered from the strongest to weakest (Figure 3.3., right column) based on the size of CSP in the presence of 30 mM of each sugar. Larger Δδ at 30 mM sugar suggested faster saturation, hence stronger interactions.

65 Chapter III

D-altrose and D-galactose resulted in similar broadening effects (Trp36/81) so should interact in a similar manner. D-glucose and D-xylose showed almost the same CSP effects, as expected considering their structural similarity. The C5 substituent of D- glucose likely protrudes from the binding site, as described for D- mannose. The initial analyses of the indole Δδ plots (Figure 3.6., right column) suggested that L-rhamnose binds in a similar mode to “fucose-like” sugars. The pattern for L-rhamnose was most similar to D-mannose suggesting insertion of a hydroxyl substituent in place of the C5 methyl in L-fucose. Perturbations of the Ile16/61 resonances (Figure 3.12. A) further support this suggestion. Of the weak binders L-rhamnose had the largest CSP Figure 3.12. Spectral regions from the overlaid 1H-15N spectra of sugar-free RSL (black for Arg17/62 (Figure 3.12. B), perhaps contours) and RSL bound to L-rhamnose (cyan), D-altrose (magenta), D-galactose consistent with L-rhamnose being the (green), D-glucose (orange) or D-xylose (blue) at 100 mM of each sugar (except L-rhamnose at only sugar in which the substituents at 30 mM). C3, C2, C1 fully retain the L-fucose topology at C5, C4, C3, respectively (Figure 3.11.). The unusual CSP for Ala40/85 (Figures 3.4. and 3.12. C) is likely caused by O5. The pronounced chemical shift changes for Tyr37 (Figure 3.12. D) is due to the equatorial C5 methyl group causing similar steric effects as the -CH2OH group in D-mannose or D-fructose. The homologous Thr82 did not have as large a CSP. D-altrose is a C3 epimer of D-mannose. L-gulose, the C3 epimer of L- galactose was shown to have the lowest affinity out of all of the strong binders due to this single stereochemical difference. In the case of D-altrose, there are two other

66 Stereochemistry and Chemical Shift Changes

4 features which make it a weaker binder than L-gulose. As D sugar, C1 conformation, D-altrose inserts the C1-hydroxyl (same as D-mannose) into the binding pocket (In contrast, L-gulose has a C5-hydroxymethyl, with improved binding at this site). D- altrose is also weaker than L-rhamnose because it cannot maintain the C5, C4, C3 topology. For example, the resonance of Glu73 that is directly affected by lack of interactions with O3 shows a different CSP to L-rhamnose (Figure 3.12. A).

D-galactose is C2/4 epimer of D-mannose. The Kds of ~15 and ~35 mM for D-galactose and D-glucose respectively (see next section) show that despite two epimers, D-galactose still binds twice tighter than D-glucose. It may be counterintuitive because D-glucose shares with D-galactose the same C2 epimer (related to D-mannose). Therefore, this structural analysis suggested that D-galactose has to bind in alternative, less obvious orientation. In proposed orientation, D- galactose still breaks two hydroxyl interactions but, more important O4 (in L-fucose) was conserved by O1 that maintains the ‘bridge interactions’ between Glu28 and Arg17 (Figure 3.1. B). Δδ showed similar plots for Trp53/10 side chains and Ile16/61 suggesting C2-hydroxyl insertion similar to C1 substituent in D-mannose (Figure 3.12. A). The lack of Δδ for Tyr37 (Figure 3.12. D) excluded the possibility of steric interactions with the C5-hydroxymethyl. Lastly, D-glucose and D-xylose are both C2 epimers of D-mannose and vary only at C5 between each other. Therefore, Δδ plots for RSL-bound to D-glucose/D- xylose are similar. The D-mannose like orientation result in D-glucose/D-xylose lacked -O2 mediated bonds (O4 in L-fucose) that disturbed the ‘bridge interactions’ shown in Figure 3.1. B. As the binding to O4 cannot be compensated as it was in the case of C3-hydroxyl (e.g. L-gulose) the affinity is significantly reduced. Based on structural and NMR studies we observed a relation between changes of stereochemistry and size of chemical shift perturbation. The difference in stereochemistry at each pyranose carbon, was assessed in order from the least to the most important as C1

67 Chapter III affects interactions with Glu28/73 and Trp36/81 (Figure 3.1. B). Although, the Trp Nε atom is sensitive to stereochemical changes, the hydrogen bond between indole and O3 seems to be less important for sugar binding. The lack of a hydrogen bond between Glu28/73 and O3 lower the affinity significantly (e.g. L-gulose or D-altrose) but this site can be compensated by a water molecule as long as the O4 interaction was sustained. Glu28/73 and Arg17/62 form a type of ‘bridge interaction’ via axial O4 (Figure 3.1. B). If these interactions were abolished the binding affinity dropped significantly like for D-glucose or D-xylose. The stereochemistry of the C5 substituent appears crucial for interactions with RSL. There is no example within the tested twelve pyranosides with an axial C5. Moreover, the C5 substituent (-CH3, -

CH2OH, -H, -OH) was responsible for the affinity of strong binders (Table 3.1. and see Chapter 2).

68 Stereochemistry and Chemical Shift Changes

Binding Affinity of D-galactose and D-glucose.

Figure 3.13. (A) Spectral regions from overlaid 1H-15N HSQC spectra of 0.25 mM sugar-free RSL (black) and RSL in the presence of up to 320 mM D-galactose (blue scale) or 300 mM D-glucose (red scale). (B) Binding isotherms for the RSL-D-galactose or RSL-D-glucose interactions determined by NMR spectroscopy. Open and filled symbols correspond to residues in the intra- and inter-monomeric binding sites, respectively. RSL-sugar titrations were performed by the addition of microliter aliquots of D- glucose or D-galactose to 15N-labelled protein and monitored by 1H–15N HSQC spectroscopy. Increasing chemical shift perturbations were observed as a function of the sugar concentration, indicative of fast exchange on the NMR time scale. D- glucose and D-galactose are C2 and C2/4 epimers of D-mannose, respectively so it could be expected that the more similar D-glucose would bind tighter to RSL as stated in previous section. Figure 3.13. A shows regions of the HSQC spectra and the chemical shift perturbations that occurred during titrations. Analyses of the chemical shift perturbations as a function of ligand concentration yielded hyperbolic binding curves (Figure 3.13. B). The curves were fit to a 1:1 binding model. The obtained Kd values are shown in Table 3.2.

69 Chapter III

Table 3.2. Dissociation constants (Kd) of D-galactose or D-glucose calculated for resonances in the inter- and intra-monomeric binding site. 1 N Sugar H resonance D-galactose Kd (mM) D-glucose Kd (mM) Inter-monomeric Arg62 18 (±3) 28 (±6) Cys75 15 (±1) 32 (±2) Gly84 16 (±1) 42 (±2) Intra-monomeric Arg17 5 (±1) 29 (±4) Ile61 14 (±2) 34 (±3) Average 14 (±5) 33 (±6)

D-galactose and D-glucose NMR titrations yielded Kds of ~15 and 35 mM, respectively. No significant differences were observed between the inter- and intra- monomeric binding sites. Arg17 in the presence of D-galactose was broad and therefore Δδ was difficult to measure. This data was included for contrast with D- glucose. The tighter binding of D-galactose (in spite of the two epimer differences with respect to D-mannose) suggested that D-galactose can orient differently to the the D-mannose sugar group. This different orientation (Figure 3.11. B) may be responsible for the tighter interaction than D-glucose.

70 Stereochemistry and Chemical Shift Changes

RSL Interactions with Sugar Alcohols.

Figure 3.14. Representation of the molecular mimicry of lactose (orange lines) by glycerol (blue sticks) bound to Galectin-3 (PDBs 3ZSK and 3ZSJ).170 Protein is shown in cartoon representation with side chains as lines. Two sugar alcohols, glycerol and L-fucitol were examined for binding to RSL. Glycerol is a three carbon polyalcohol with 2 rotatable bonds that yield different conformers. Glycerol, on account of its use as a cryoprotectant, is often found in crystal structures as a ligand bound to the protein, with >12,000 structures in PDB. NMR data proving the interactions of glycerol with protein in solution was not found. For example Seraboji et al.170 reported a 0.9 Å resolution crystal structure of Galectin-3 bound to glycerol (Figure 3.14.).

Figure 3.15. (A) Representation of the molecular mimicry of methyl-α-L-fucoside by two molecules of glycerol (orange shade, similar to L-galactose). (B) Glycerol molecules (orange) 5 were superposed onto methyl-α-L-fucoside (grey) bound to RSL (PDB 2BT9). Protein backbone is shown as ribbons and side chains as lines.

71 Chapter III

Galectin-3 is not structurally related to RSL but the availability of high resolution crystal structures provide evidence of glycerol binding to lectin (glycerol-bound PDB 3ZSJ) and imply mimicry of carbohydrate binding (lactose-bound PDB 3ZSK). In contrast to the studies reported in the following paragraphs, glycerol binding to Galectin-3 was not observed by NMR.170 Based on crystal structures of Galectin-3 bound to glycerol, or lactose, (Figure 3.14.), we modelled glycerol to match the C4-C6 fragment of methyl-α-L-fucoside (PDB 2BT9). A second glycerol molecule was positioned to mimic the C1-C3 half of the fucoside. The model suggests that two glycerol molecules can be accommodated in the binding site with a good match Figure 3.16. (A) Spectral region from overlaid 1H- to the methyl-α-L-fucoside bound to 15N HSQC spectra of 0.25 mM sugar-free RSL (black) and RSL in the presence of 1-300 mM of RSL (Figure 3.15.). glycerol (red scale) (B) Binding isotherms for the RSL-glycerol interactions in inter- (top) and itra- Initial screening of RSL in monomeric binding site determined by NMR spectroscopy. the presence of 3 mM glycerol did not reveal significant CSP suggesting that glycerol is a weak binder. An NMR titration was prepared and the Kd was estimated (Figure 3.15. and Table 3.3.) in a manner identical to that described for D-glucose and D-galactose. The average Kd was 25 mM. Interestingly, although the glycerol Δδ plot for backbone resonances was different to the other strong binders, the Δδ plot for the indole resonances was similar (Figures 3.4. and 3.6.). Based on the model built for two glycerol molecules (Figure 3.15. A), potentially the closest comparison could be observed for L- galactose because of the closest structural mimicry. However, the chemical shift

72 Stereochemistry and Chemical Shift Changes

perturbations were different from all tested pyranosides. The indole resonances yielded a pattern suggesting that the pyranose ring was mimicked (Figure 3.6.). Also the resonances of Arg17/62 and Ile16/61 presented matching patterns (Figures 3.4. and 3.17.). The essential difference was observed only for resonances of Ala40/85. These resonances have large shifts due to a hydrogen bond with the equatorial –O2 in L-fucose. The smaller chemical shift perturbation in the presence of glycerol suggests that this interaction was not retained.

Figure 3.17. Spectral regions from the overlaid 1H- 15N HSQC of sugar-free RSL (black contours) and RSL in the presence of 300 mM glycerol (orange) or 3 mM L-galactose (mauve).

73 Chapter III

Figure 3.18. (A) Representation of the molecular mimicry of methyl-α-L-fucoside by L-fucitol (blue). (B) Two possible models of L-fucitol (blue) when bound to RSL made by superposition of 5 L-fucitol onto methyl-α-L-fucoside (grey). L-fucitol is another example of a sugar , but with greater flexibility (4 rotatable bonds) than glycerol. As an acyclic derivative of L-fucose it can be expected to adopt a conformation that matches the L- fucose framework. Such a process may be entropically unfavoured. In contrast to thousands of crystal structures containing glycerol, to date, there are only 3 structures of L-fucitol bound to proteins (PDBs: 1FUI, 3A9T, 4C21).171–173 In each of these structures L-fucitol adopts the fully extended form suggesting that this is the energetically favoured

Figure 3.19. (A) Spectral regions from overlaid conformation. 1 15 H- N HSQC spectra of 0.25 mM sugar-free Two hypothetical models for RSL (black) and RSL in the presence of L-fucitol (blue scale). (B) Binding isotherms for the RSL- closed ring and extended chain L- L-fucitol interactions determined by NMR spectroscopy. fucitol were prepared to support NMR

74 Stereochemistry and Chemical Shift Changes

data (Figure 3.18.). The same set of NMR experiments as for glycerol was performed for L-fucitol. Interestingly, the binding affinity of L-fucitol (~35 mM) was slightly lower glycerol (~25 mM). Comparison of 1H-15N HSQC spectra for RSL bound to L-fucitol and L-fucose revealed that L-fucitol presented Δδ mostly for resonances in the inter-monomeric binding site. Possibly, the more spacious inter- monomeric site (Thr82) allows entry by the long chain sugar. It may be more difficult to enter the intra-monomeric binding site (flanked by Tyr37) Figure 3.20. Spectral regions from the overlaid 1H-15N spectra of sugar-free RSL (black therefore much smaller CSPs were contours) and RSL in the presence of 320 mM L-fucitol (blue) or 3mM L-fucose (red). observed. For example, chemical shift perturbations of Arg62, Glu73 and Thr82 in the presence of L-fucitol were larger than those of homologous Arg17, Glu28 and Tyr37. The inter-monomeric binding site can be bound by L-fucitol in the linear form as indicated by backbone CSPs but lack of Δδ for the indole resonances (only Trp81 presented CSP).

Table 3.3. Dissociation constants (Kd) of Glycerol and L-fucitol calculated for resonances in the inter- and intra- monomeric binding site. 1 N Sugar H resonance Glycerol Kd (mM) L-fucitol Kd (mM) Inter-monomeric Ile16 12 (±1) Arg62 10 (±0.5) 28 (±6) Cys75 13 (±1) 32 (±2) Gly84 37 (±3) 42 (±2) Ala85 40 (±1) Intra-monomeric Gly39 33 (±3) Ala40 34 (±2) Ile61 15 (±1) Average 24 (±13) 34 (±7)

75 Chapter III

RSL Binding to a Disaccharide.

Figure 3.21. Sucrose structure showing the D-glucose (red) and D-fructose (blue) units. Sucrose is a disaccharide of D-glucose and D-fructose linked via an acetal. It could be expected that sucrose would interact with RSL. We and others5,32,34 have shown that RSL can bind a range of pyranoses, but there is no evidence in the literature for binding to furanoses. NMR studies of D-fructose also indicated a preference for binding of β-D-fructopyranose, therefore it is expected that sucrose can bind via the D-glucose unit only.

Figure 3.22. Model of possible conformation for β-D-glucose (pink) and sucrose (teal) in the 5 intra-monomeric binding site (PDB 2BT9). As shown in the weak binders section, D-glucose needs to reorient the pyranose ring to favour its interactions to RSL (Figure 3.21. A). As the D-glucose unit in sucrose is in α conformation then sucrose could bind to RSL using a similar orientation as D- glucose (Figure 3.22. B, left). In this case there should be no steric clashes with Tyr37 and sucrose would bind both, intra- and inter-monomeric binding sites. NMR screening revealed that RSL-sucrose interactions are similar to the weak complex formed with D-glucose (Figure 3.23.). The Δδ plot for backbone resonances showed almost no changes in the presence of 30 mM sucrose with respect to D-glucose at the same concentration (Figure 3.23. C). Detailed NMR data analyses of sucrose binding to RSL revealed significant CSP for resonances in the

76 Stereochemistry and Chemical Shift Changes inter-monomeric binding site (e.g. Cys75, T82) but no Δδ was observed for homologous residues (Cys30, Tyr37). Also perturbations of the indole resonances (Figure 3.23. B) were minor in the inter-monomeric binding site (Trp36 and Trp53). The homologous indole cross-peaks for Trp81 and Trp10 (intra-monomeric) did not show CSP. Lack of interactions in the intra-monomeric binding site suggested that the D-glucose unit binds in a different orientation than the D-glucose monosaccharide. The different binding mode may depend on steric effects of the Tyr37 side chain. A possible conformation of sucrose is shown in Figure 3.22. C. Based on the similarity of CSP between D-glucose and sucrose it was assumed that RSL does not bind D-fructofuranose.

Figure 3.23. (A-B) Spectral regions from the overlaid 1H-15N spectra of sugar-free RSL (black contours) and RSL bound to D-glucose (pink) or sucrose (teal) (C) Plots of chemical shift perturbations in the presence of 30 mM D-glucose or sucrose . Blanks correspond to prolines (P14 and P44) and resonances that were broadened beyond detection in sugar-free or -bound RSL. NMR of RSL in Plant Extracts RSL was screened in the presence of six vegetable extracts made of carrot, cucumber, tomato, potato, onion or parsnip. All of the samples were prepared identically (see Methods). RSL in the presence of each extract revealed significant chemical shift perturbations. Spectra acquired for RSL in the presence of carrot, cucumber, tomato, potato and onion extracts showed identical CSPs. Based on the similarity of the HSQC spectra, it was assumed that the chemical composition of each extract was similar. Thus Δδ analyses for the extracts were made based on the spectrum of RSL in the presence of carrot extract. Carrot juice comprises ~10%

77 Chapter III

Figure 3.24. 1H-15N HSQC spectra of 0.25 mM sugar-free RSL (black) or RSL in the presence of D-fructose (blue) or extract of carrot (red) or parsnip (green). (A) Proof of D-fructose in carrot extract. (B) Differences in CSP for RSL in the presence of carrot or parsnip extract. sugars mainly glucose, sucrose, xylose and fructose.174 Fructose is the strongest binding sugar (~90 μM, Table 3.1.) of the sugars that occur in carrot. A comparison of RSL in the presence of the extract or 3 mM D-fructose confirmed that RSL interacted with D-fructose in the extracts (Figure 3.24. A). Only the spectrum of RSL in the presence of parsnip extract presented smaller CSPs. The small differences in CSPs for parsnip extract suggest that RSL was bound to a fructoside (Figure 24B). We can exclude the possibility of concentration effects because D-fructose binds to RSL in the slow exchange regime. As described in the thesis introduction, RSL is a lectin used by plant pathogens to invade the host. Finding potential inhibitors of RSL-sugar binding

78 Stereochemistry and Chemical Shift Changes could help to prevent agricultural losses. For this reason, RSL was screened in the presence of water soluble crude extracts that were prepared in Teagasc, Ashtown. Extracts were made of potentially cheap sources like potato peels, marjoram, rosemary, sage and oregano. These extracts were screened to check if they consist of any non-carbohydrate based compounds that can bind to RSL. Water-based extracts yielded spectra that did not show any CSP suggesting lack of interactions with any compound in the extracts. The crude apolar extracts were firstly dissolved in DMSO. Unfortunately, any contact with water caused immediate precipitation of the extracts. In consequence, the extracts prepared from organic solvents were not suitable for this screening.

Conclusions

NMR spectroscopy was used to observe RSL-sugar interactions. We tested fifteen different carbohydrates including twelve monosaccharides, two sugar alcohols and sucrose as an example of disaccharide. RSL shows specificity towards the pyranose ring. Some resonances of homologous residues like Ala40/85 or Arg17/62 gave rise to distinct chemical shift patterns because the intra- and inter-monomeric binding sites of RSL are not identical as it is usually assumed.31–34 RSL binding sites have evolved to recognise the pyranose framework. Changes in the topology (with respect to L-fucose) affect the strength of interactions between RSL and a carbohydrate. The equatorial position of the C5 substituent seems to be crucial for interactions between RSL and ligand. The type of C5 substituent can change leading to altered binding affinities, for example apolar -CH3

(L-fucose, Kd ~2 µM) versus polar -OH (D-mannose, Kd ~100 µM). Based on data acquired for 14 monosaccharides, the importance of sugar topological similarity to L-fucose was assessed as C5>C4>C3>C2>C1 in order from the most to the least important. Strong binders were always observed in slow exchange binding mode and for those with Kd < 50 μM the anomeric effect was observed. Weakly binding sugars showed common factors like small Δδ in the presence of 3 mM sugar, mM affinity and fast exchange regime. Also severe line width broadening (mostly Trp81/36 indoles and Ala40/85) for binding site resonances was observed. The weakly binding

79 Chapter III sugars are often considered non binders like L-rhamnose and D-galactose.5 Here, dissociation constants of D-galactose and D-glucose were estimated with values of 15 and 35 mM, respectively. Although, it was evident that glycerol can bind to protein and even mimic sugar binding of galectin,170 here we have shown the proof of interactions in solution presented by chemical shift perturbation. We showed that two glycerol molecules can bind simultaneously to RSL mimicking pyranose scaffold. L-fucitol, despite close similarity to L-fucose, binds with mM affinity at the inter-monomeric site only. It suggests incomplete mimicry of L-fucose ring by open chain derivative. Figure 3.25. presents all the strong and weak binders (Figure 3.3.) bound to RSL in the intra-monomeric binding site in the form elucidated from the analyses above. Lastly, RSL was screened with a range of plant extracts. Results indicate binding of RSL to D-fructose, which is a major carbohydrate found in vegetables. Screening of sugar-free extracts revealed no Δδ implying that none of the tested extracts contained a compound that could bind to RSL at the tested concentrations. Lectins, by definition, are highly selective towards a specific sugar moiety which makes them precise drug delivery tools.93 However, RSL promiscuity may also be harnessed in therapeutic design. RSL binding to multiple sugars may be advantageous in the case of poorly-defined receptors i.e. when cell-surface consist different carbohydrates. In this case L-fucose binding RSL could be used as a transporter to D-arabinose or L-galactose terminated receptors. Other carbohydrates like D-mannose require different binding (monosaccharide orientation) and therefore binding would be substantially limited. For example, RSL could bind only to D- mannose terminated polysaccharides linked via hydroxyl at C6 like synthetically prepared D-mannose agarose.175 Other physiologically relevant oligosaccharides such as mannans (where sugar units are linked with β(1→4) glycosidic linkage) would no longer interact to RSL. The same phenomenon applies to other glycans 4 terminated with C1 sugars e.g. D-glucose or D-galactose.

80 Stereochemistry and Chemical Shift Changes

Figure 3.25. Models of each carbohydrate in a possible conformation at the intra-monomeric binding site of RSL. See Methods for modelling strategy.

81 Chapter III

Experimental

Materials: Carbohydrates were purchased from SIGMA (Glycerol, Sucrose, D- galactose, D-glucose, D-altrose) or Carbosynth (L-fucitol, L-gulose, D-lyxose). D- xylose, D-fructose and L-rhamnose (SIGMA) were provided by Prof. Paul Murphy. Plant material from potato, marjoram, rosemary, sage and oregano was provided by Teagasc, Ashtown, Dublin.

NMR spectroscopy. 15N-RSL was prepared as described previously (Chapter 2).165 The typical sample conditions comprised 0.25 mM 15N-RSL in 20 mM potassium phosphate, 50 mM NaCl, pH 6.0 and 10% D2O. Carbohydrates were added to a final concentration of 3 or 100 mM. Titrations were performed using aliquots of 3 M stocks of D-glucose or D-galactose, 2 M L-fucitol or 5 M glycerol. Over the course of the titration the protein was diluted 1.2-fold. The sample pH was checked and if necessary corrected to pH 6.0 after each addition of the ligand. Two-dimensional 1H- 15N HSQC spectra were acquired on samples at 303 K on an Agilent 600 MHz NMR spectrometer equipped with a HCN cold probe. All NMR data were processed in NMRPipe.147 The differences in chemical shifts between the pure and sugar-bound RSL were measured in CCPN.148

Binding isotherms. Binding isotherms were obtained by plotting the magnitude of the chemical shift change (Δδ) as a function of the ligand concentration. The data were fit (nonlinear least squares in Origin) to a one-site binding model, with Δδ and the ligand concentration as the dependent and independent variables, respectively, and the dissociation constant (Kd) and maximum chemical shift change (Δδmax) as the fit parameters.

Models of RSL-sugar interactions. Sugars structures (ideal conformation) were downloaded from PDBe Chem. Least squares (LSQ) superposition to methyl-α-L- fucoside in RSL (PDB 2BT9)5 was performed in COOT. Manual modelling (rotation and translation) was used to obtain the best match to methyl-α-fucoside i.e. when all carbons and O5 fit pyranose framework (Figure 3.26.).

82 Stereochemistry and Chemical Shift Changes

Figure 3.26. The examples of the models for β-D-mannose with (A) good and (B) bad match to methyl-α-L-fucoside. This type of modelling was limited to one of six possible orientations. If we assume that the starting orientation matches methyl-α-L-fucoside as found in the crystal then the first model would involve rotation such that C1/C2/C3/C4/C5/O of the fucoside match the modelled sugar atoms C2/C3/C4/C5/O/C1 and so on. The final conformation of the sugar (Figure 3.25.) was picked based on the chemical shift perturbation analyses (Figures 3.4. and 3.6.). The structure of D-altrose is not available in the PDB. A model of D-altrose was made by combining the D-mannose and L-gulose structures. α-L-fucose was modelled from the methyl-α-L-fucoside by deleting the methyl group.

Plant extract preparation: Vegetable extracts. Each vegetable (carrot, onion, parsnip, cucumber, tomato and potato) was crushed using a ceramic mortar and pestle, 1.5 mL of extract (juice) was centrifuged for 10 min and stored at -20 ˚C. Teagasc extracts. Plant materials from five sources: potato peels, marjoram, rosemary, sage and oregano were used. The crude plant extracts were obtained using methods described previously.176,177 The solid/liquid extracts were generated in four different solvents (dichloromethane, ethyl acetate, and water). Only the water crude fractions were subjected to solvent-solvent portioning using MeOH to remove sugars. Fractions of each crude extract were subjected to further fractionation by flash chromatography.177 All of the solvents were of analytical grade and purchased from SIGMA.

83 Chapter III

Preparation of NMR samples for extracts screening: Vegetable extracts: The typical sample comprised 0.15 mM sugar-free RSL, in 20 mM KPi, 50 mM NaCl pH

6.0, 10% D2O and 50 μL of concentrated vegetable extract. If necessary, the pH was adjusted to 6.0. Teagasc extracts: 1 mg of crude/flash purified extract in water was dissolved in 1 mL of standard buffer and the pH was adjusted to 6.0 if necessary. Extracts obtained from organic solvents were dissolved in DMSO but were unstable upon addition of water aliquots. NMR samples contained 0.2 mM of sugar-free RSL, 20 mM KPi, 50 mM NaCl pH 6.0, 10% D2O and 10 μL of 1 mg/mL extract stock.

84 Stereochemistry and Chemical Shift Changes

85

Chapter IV Noncovalent PEGylation of RSL

This work was published as Antonik, et al. Noncovalent PEGylation via Lectin−Glycopolymer Interactions. Biomacromoleculs, 2016, 17, 2719−2725.

The synthesis of a fucose-capped polyethylene glycol (Fuc-PEG) was prepared at Durham University, Durham, UK by Dr Ahmed Eissa. The small-angle x-ray scattering (SAXS) experiments were performed at EMBL, Grenoble, France. SAXS analyses and figures were prepared by Dr Adam Round and Dr Peter Crowley.

86 Noncovalent PEGylation of RSL

Abstract

PEGylation, the covalent modification of proteins with polyethylene glycol, is an abundantly used technique to improve the pharmacokinetics of therapeutic proteins. The drawback with this methodology is that the covalently attached PEG can impede the biological activity (e.g. reduced receptor-binding capacity). Protein therapeutics with “disposable” PEG modifiers have potential advantages over the current technology. Here, we show that a protein-polymer “Medusa complex” is formed by the combination of a hexavalent lectin with a glycopolymer. Using NMR spectroscopy, small angle X-ray scattering (SAXS), size exclusion chromatography and native gel electrophoresis it was demonstrated that the fucose-binding lectin RSL and a fucose-capped polyethylene glycol (Fuc-PEG) form a multimeric assembly. All of the experimental methods provided evidence of noncovalent PEGylation with a concomitant increase in molecular mass and hydrodynamic radius. The affinity of the protein-polymer complex was determined by ITC and competition experiments to be in the µM range, suggesting that such systems have potential biomedical applications.

87 Chapter IV

Introduction

The modification of proteins with synthetic polymers is a powerful route to novel functionalized macromolecules with applications in biotechnology and medicine.178–184 PEGylation, the covalent attachment of polyethylene glycol, is a versatile means to increase a protein’s size, stability and solubility.49,104,179,185,186 The increased radius of protein-PEG conjugates leads to a reduction in renal filtration, which together with decreased immunogenicity can result in longer circulation half lives compared to the native protein.104,186 Such improvements in pharmacokinetics are crucial for the development of cost-effective and patient-friendly protein therapeutics. However, PEGylation can also interfere with the processes of molecular recognition and lead to reduced biological activity. Steric hindrance of the binding site by PEG can impede complex formation with the target receptor.104,186 This effect may be eliminated in poly(zwitterion)-based conjugates.181 Furthermore, reversible PEGylation strategies have been developed that take advantage of slowly hydrolysable,187,188 light-sensitive189 or thiol-sensitive190 linkers to release the PEG component. With the growing demand for tunable protein-polymer macromolecules the focus of attention is shifting towards noncovalent conjugates.191–201 Examples include polymer conjugation via: co-factor reconstitution,191 the biotin-streptavidin complex,193 supramolecular complexation with cucurbituril,194 charge-charge195 and metal affinity196,199 interactions, as well as lectin-mediated complexation.200,202,203 The advantage of noncovalent systems is that temporal control of conjugate assembly and disassembly can be asserted by the presence of competitor molecules.194 In the context of protein therapeutics it is envisaged that the biologically-active ligand-receptor complex may act as a competitor to displace the lower affinity polymer interaction. Thus, the potential interference caused by covalently attached PEG chains will be minimised. Noncovalent PEGylation involves several challenges, the foremost being the control of the protein-polymer affinity which must be sufficiently high to enable the positive effects of PEGylation (i.e. increased circulation half life) and yet low enough to permit transient unbinding and competition by the host receptor (Kd ~nM-µM). Coupled to this challenge is the need for simple linker chemistries which are biocompatible, synthetically attractive

88 Noncovalent PEGylation of RSL

Figure 4.1. Simplified representations of (A) fucose-capped PEG (Fuc-PEG, see Scheme S1 for details) and (B) noncovalent PEGylation of RSL with Fuc-PEG (~2.3 kDa) to yield a protein- polymer “Medusa complex”. Note that RSL is a hexavalent trimer with three intra-monomeric and three inter-monomeric sugar binding sites.5,165 and inexpensive. At this point we note that the study of noncovalent protein-PEG assemblies is not new. Early applications were focused on protein purification by phase separation in mixtures of dextran and dye-conjugated PEG.204–206 Figure 4 in reference 205 is startling in its originality and relevance to current developments. Here, we have combined a fucose-binding lectin with a fucose-capped polyethylene glycol (Fuc-PEG) to produce a noncovalent protein-polymer “Medusa complex” (Figure 4.1.). RSL, a hexavalent fucose-binding lectin from Ralstonia solanacearum, was chosen as the model protein for this study. RSL is a well- characterized ~29 kDa trimer with three intra-monomeric and three inter-monomeric binding sites for L-fucose and related sugars.5,36,37,165 Building on current developments in the structural characterization of PEGylated proteins,49,207 we have used TROSY NMR spectroscopy to demonstrate protein-polymer noncovalent conjugation via the fucose-binding site. The formation of high molecular weight protein-polymer particles was confirmed by size exclusion chromatography (SEC), small angle X-ray scattering (SAXS) and electrophoresis experiments. SAXS analysis further revealed a core structure bearing six PEG tails with random coil conformations. Competition binding experiments demonstrated that the Fuc-PEG appendages could be displaced by methyl-α-L-fucoside (Me-fuc) but not by L- galactose (L-gal). Together, these data indicate that reversible protein-polymer conjugates were formed from the combination of a carbohydrate binding protein and

89 Chapter IV a sugar-functionalized polymer. This strategy of noncovalent PEGylation using glyco-PEG may benefit the development of lectin-mediated drug delivery systems.93,208,209

Results and Discussion

Preliminary characterization of RSL bound to Fuc-PEG. The formation of a noncovalent conjugate between RSL and Fuc- PEG was expected to yield a Figure 4.2. SEC of RSL in the presence of 12 particle of ~50% increased mass equivalents of Fuc-PEG (red) or Me-fuc (blue). The (compared to RSL, ~29 kDa). elution buffer was 20 mM potassium phosphate, 50 mM NaCl, pH 6.0. Assuming that all six sugar- binding sites are occupied by Fuc- PEG (~2.3 kDa) the conjugate will have a mass of ~43 kDa. An initial characterization of the conjugate was performed by using size exclusion chromatography. RSL in the presence of 12 equivalents of Me- 5 fuc (Kd ~0.6 M ) eluted at ~75 mL (Figure 4.2.). This elution volume was higher than expected (the ~31 kDa DNase eluted at ~65 mL) and suggests that the RSL weakly interacted with the

Figure 4.3. Native gel electrophoresis in 5 % Superdex resin. In the presence of polyacrylamide (top) or 2 % agarose (bottom panel) Fuc-PEG, elution was observed at of RSL in the sugar-free or polymer bound form. The ~55 mL, which corresponds to equivalents of Fuc-PEG, the buffer pH and the proteins of ~66 kDa (as indicated electrode locations are indicated. by a BSA standard). This higher-

90 Noncovalent PEGylation of RSL

than-expected mass is characteristic of PEGylated proteins as the random coil nature of PEG occupies a larger volume than a corresponding mass of folded protein.49,210,211 The peak at ~43 mL, corresponding to the void volume, suggested the presence of high molecular weight aggregates.121 The conjugate was further characterized by native gel electrophoresis. Samples of sugar- free RSL or RSL in the presence of Fuc-PEG were analysed in both polyacrylamide and agarose gels

Figure 4.4. Two dimensional 1H-15N HSQC (Figure 4.3.). Consistent with the watergate258 (top) and TROSY-HSQC218,219 (bottom) theoretical pI of ~6.5, RSL and the spectra of 0.5 mM RSL bound to Fuc-PEG. Sample conjugate migrated towards the conditions were 20 mM potassium phosphate, 50 mM anode or cathode at pH 8.3 or pH NaCl, pH 6.0 at 303 K 6.0, respectively. In the sugar-free form a diffuse protein band was observed. With increasing amounts of Fuc-PEG the migration distance was decreased. This observation suggests that noncovalent PEGylation produced larger particles that were more effectively retarded by the gel.

NMR Characterization. 15N-labelled RSL was titrated with µL aliquots of Fuc-PEG (10 mM stock) and monitored by 1H-15N TROSY-HSQC spectroscopy. The TROSY was necessary to improve spectral quality, consistent with the increased molecular mass and longer tumbling time of the conjugate (Figure 4.4. shows a comparison of the HSQC and TROSY-HSQC spectra of RSL bound to Fuc-PEG). The protein-polymer interaction was in slow-exchange on the NMR timescale consistent with an equilibrium dissociation constant in the micromolar range.70,165 Considering the overall similarity of the spectra of RSL bound to Me-fuc and RSL bound to Fuc-PEG (Figure 4.5.) it

91 Chapter IV

was evident that the highly-stable RSL structure5 was unaltered by interaction with the polymer. Figure 4.6. shows spectral regions of the superposed 1H-15N TROSY-HSQC of sugar-free RSL and RSL in the presence of Fuc- PEG or Me-fuc. Recently, we assigned the backbone resonances

Figure 4.5. The overlaid 1H−15N TROSY-HSQC of RSL in the sugar-free form and spectra of RSL bound to Me-fuc (navy contours) and in the presence of fucose-like RSL bound to Fuc-PEG (red). sugars including Me-fuc.165 Numerous resonances assigned to backbone and tryptophan indole nuclei in the sugar-binding sites had similar c hemical shift perturbations (Δδ) in the presence of Fuc-PEG or Me-fuc (Figure 4.6. C). The similar pattern of shifts for each ligand indicated that Fuc-PEG binds to RSL in a fashion similar to Me-fuc. Interestingly, only two backbone resonances showed substantially different  due to Fuc-PEG binding. The avg for Lys34 and Asn79 in the presence of Fuc-PEG were >2 times larger than in the Me-fuc form (Figure 4.6. C). Lys34 and Asn79 are structurally equivalent residues located in loops that flank the inter- and intra-monomeric sugar binding sites, respectively.5,165 The backbone N atom of these residues is within 7.8 Å of the methyl substituent of Me-fuc in the RSL crystal structure (PDB 2BT9).5 The large  observed for these resonances (relative to the Me-fuc data) is evidence that the bulky Fuc-PEG polymer interacts at these sites. Note that the adjacent residues (33, 35 and 78, 80) are glycines and no amide resonances were observed for these either in the sugar-free form165 or in the polymer bound form.

92 Noncovalent PEGylation of RSL

The ∼50% increase in molecular mass of the RSL conjugate was expected to result in increased line widths of the NMR signals. In the presence of Me-fuc, the average 1HN resonance line width of RSL was 19.8 (±2.3) Hz. In the presence of 12equiv of Fuc-PEG, the majority of the resonances demonstrated increased line widths and the average was 28.6 (±4.1) Hz (Table 4.1.). This line broadening indicates an increased rotational correlation time consistent with the higher molecular weight of the complex with Fuc-PEG. The average broadening increase of ∼8 Hz due to six PEG chains (each ∼2 kDa) contrasts with the ∼1 Hz broadening observed for a Figure 4.6. Spectral regions (A) and (B) from the overlaid 1H-15N TROSY-HSQC spectra of sugar-free monoPEGylated protein with 5 49 RSL (black contours) and RSL bound to Fuc-PEG kDa PEG. Considering the (red) or Me-fuc (blue). (C) Plot of the average possible contribution of chemical shift perturbations (Δδavg) upon interaction aggregation (Figure 4.2.) to the of RSL with 12 eq. of Fuc-PEG (red) or Me-fuc NMR data, a sample of RSL and (blue). The Me-fuc data were reported previously.165 12 equiv of Fuc-PEG was analyzed by TROSY NMR before and after purification by SEC. The pre- and post- SEC samples yielded similar NMR spectra and the average line width was 26.0 (±4.4) Hz for the purified complex. Therefore, it can be concluded that the observed line width changes were due to noncovalent PEGylation (rather than aggregation).

93 Chapter IV

Table 4.1. Line-widths of 1HN resonances extracted from the TROSY-HSQC spectra of RSL bound to Me-fuc or Fuc-PEG. Line-widths were measured in CCPNmr for non-overlapping cross- peaks (~65 % of total). Six resonances with line widths ≥ 40 Hz were considered outliers and were excluded from the analysis

1HN line-width (Hz) 1HN line-width (Hz) Residue Residue Me-fuc Fuc-PEG Me-fuc Fuc-PEG

V3 16.0 26.1 G45 17.3 25.9 Q4 21.6 34.3 D46 22.1 32.9 T5 22.3 34.8 N47 20.6 25.4 A7 22.0 31.4 V48 20.9 30.0 T8 19.6 29.1 S52 17.7 24.6 S9 18.2 29.5 W53 20.2 30.9 W10 19.5 38.2 V55 20.7 28.2 G11 17.8 31.4 S57 24.3 31.1 T12 20.3 22.3 A58 19.8 23.7 V13 20.0 27.8 I59 18.6 31.0 S15 22.2 32.2 R62 21.0 27.4 I16 25.2 25.9 Y64 21.1 33.4 R17 18.7 27.5 A65 20.2 34.1 Y19 23.6 29.5 G68 16.5 38.4 T20 24.7 30.0 T70 20.0 25.0 A21 22.8 23.3 T71 21.1 26.6 N22 18.7 29.5 T72 20.1 30.2 N23 17.2 23.3 E73 19.8 30.8 G24 16.9 26.3 C75 19.7 33.5 K25 19.0 22.9 W76 21.8 30.9 T27 21.7 24.4 N79 20.5 25.2 E28 17.9 28.4 T82 21.4 23.5 W31 18.8 32.3 K83 15.6 21.1 Y37 20.6 28.1 G84 21.2 30.5 T38 16.4 28.3 A85 17.7 25.9 G39 22.8 30.7 T87 13.1 35.0 F41 19.3 25.8 A88 19.3 30.8 N42 19.4 27.9 T89 18.1 21.4 E43 19.3 32.0 N90 16.7 20.9 Me-fuc = 19.8 (±2.3) Average line-width (Hz) Fuc-PEG = 28.6 (±4.1)

94 Noncovalent PEGylation of RSL

To assess whether the PEG alone contributed to the observed chemical shift perturbations control experiments were performed using PEG 2000 (the starting material for the Fuc- PEG synthesis). There were no appreciable effects on the NMR

Figure 4.7. Spectral region of overlaid 1H-15N spectrum of RSL (Figure 4.7.) TROSY-HSQC spectra of sugar-free RSL (black indicating that PEG 2000 did not contours) and RSL in the presence of 6 (magenta) or bind at this concentration (3 12 (cyan) equivalents of PEG 2000. mM).212 This result confirmed that the L-fucose cap was necessary for interaction. Competition binding experiments were used to assess the binding affinity of RSL for Fuc-PEG. The pre-formed complex of 0.25 mM RSL in presence of 6 equivalents (1.5 mM) of Fuc-PEG was treated with 1.5 mM of Me-fuc or L-gal. Examination of the tryptophan

indole resonances (Figure 4.8.), Figure 4.8. Spectral region from the overlaid TROSY-HSQC spectra of RSL in the presence of 6 which are excellent reporters of equivalents of Fuc-PEG (red) and after the addition of sugar binding,165 indicated that the 6 equivalents of Me-fuc (top) or L-gal (bottom panel). Fuc-PEG appendages were 5 5 displaced by Me-fuc (Kd ~ 0.6 µM ) but not by L-gal (Kd ~ 9.0 µM ). The indole resonances of RSL bound to Fuc-PEG were shifted upon the addition of Me-fuc (compare Figure 4.6. B). Interestingly, the structural equivalent W31 and W76 had split resonances (Figure 4.8., blue contours) suggesting that the Fuc-PEG was not completely displaced. In contrast the spectrum of RSL bound to Fuc-PEG was completely unchanged by the addition of the weaker binding L-gal. These

95 Chapter IV experiments proved the noncovalent / reversible nature of the interaction between RSL and Fuc-PEG ITC was used to further characterize the complex of RSL with Fuc-PEG (Figure 4.9.). The fit parameters matched previously published data,5,37 and the measured Kd (1.3 ±0.3 µM) was consistent with the results of the competition experiments.

Figure 4.9. Isothermal titration calorimetry of RSL (24 µM) binding to Fuc-PEG (1.5 mM) in 0.1

M Tris-HCl pH 7.5 at 25° C. The fit parameters were Kd = 1.3 ±0.3 µM, n = 2.5 ±0.2, -ΔG = 33.7 ±0.5 kJ/mol, -ΔH = 38.1 ±0.8 kJ/mol and TΔS = -4.4 ±1.1 kJ/mol.

Small angle X-ray scattering. RSL samples, at a maximum concentration of ~10 mg/mL, bound to Me-fuc or Fuc-PEG were characterised by SAXS. Data collection was performed using both the automated sample changer and the online SEC at

Figure 4.10. SAXS data and model fits for RSL BM29. The online SEC was used bound to Me-fuc (lower curves) and RSL bound to to remove high molecular weight Fuc-PEG (upper curves). aggregates (>150 kDa, Figure 4.2.) immediately prior to data collection. The molecular mass (MM), the radius of gyration (Rg), the Dmax, and the volume (V) of RSL bound to Me-fuc were estimated

96 Noncovalent PEGylation of RSL

Figure 4.11. Top down and side views of the SAXS models of (A) RSL bound to Me-fuc and (B) RSL bound to Fuc-PEG. The models are color-coded to match the fits in Figure 4.10. The

DAMMIF models of RSL generated with P1 or P6 symmetry are colored magenta and cyan,

respectively. The DAMMIF (P6) or CORAL (PEG chains only) models of RSL bound to Fuc-PEG are colored green and blue, respectively at 27 ±3 kDa, 1.8 ±0.1 nm, 4.8 ±0.2 nm, and 42 nm3, respectively. These parameters coincide, within error, with those computed from the RSL crystal structure (PDB 2BT9). The predicted scattering curve (CRYSOL) from the crystal structure fits neatly to the experimental data (χ = 1.97, Figure 4.10.). Ab initio models calculated 213 in DAMMIF (with P1 or P6 symmetry, Figure 4.10.) matched the protein crystal structure (Figure 4.11.). All of these data confirmed that RSL is a trimer in solution. Samples of RSL bound to Fuc-PEG resulted in a significant change to the scattering pattern (Figure 4.10.). The parameters MM, Rg, Dmax, and V increased to 42 ±5 kDa, 2.9 ±0.1 nm, 10.0 ±0.2 nm, and 72 nm3, respectively, consistent with the formation of the protein-polymer conjugate. The difference in molecular weight between RSL bound to Me-fuc and RSL bound to Fuc-PEG was ~15 kDa in agreement with the expected mass difference of 13.8 kDa. Ab initio models generated in DAMMIF213 had six, three or one elongated protrusions that fit the experimental data with χ = 3.5, 2.8 or 2.4 for P6, P3 and P1 symmetry, respectively.

The fit and corresponding envelope for P6 are shown in Figures 4.10. and 4.11. B. As the resolution of the ab initio modelling was insufficient to determine the locations

97 Chapter IV of the PEG chains, constrained rigid body modelling was performed in CORAL.214 Chains of 22 dummy residues, based on Svergun’s approximation,46 were used to model the Fuc-PEG. The fuc-PEG appendages were constrained to the known fucose-binding sites (at residues Trp31 and Trp76 of the inter- and intra-monomeric binding sites).5,165 The resulting model gave good fits to the scattering data (χ = 2.4) and overlaid neatly with the P6 ab initio envelope (Figure 4.11. B). Irrespective of the modelling strategy and in agreement with earlier studies46,48,49 the Fuc-PEG chains were observed to extend away from the protein core in random coil conformations, rather than interact with the protein surface. Therefore, the SAXS data were consistent with the proposed “Medusa complex”. RSL and Fuc-PEG are an interesting model system in the context of the 46,49 “grafting distance” (DG) and the Flory dimension (RF) of the polymer. The fucose-binding sites in RSL have a hexagonal arrangement with an edge length of ~2 nm (distance between the methyl substituents in Me-fuc crystal structure, PDB

2BT9). PEG 2000 has a RF of ~3.4 nm, which is ~1.7-fold the grafting distance

(nearest neighbours). As DG < RF the tendency is for the PEG to adopt an extended “brush” conformation and thereby reduce steric interactions between the polymer chains. This model is well-represented by the SAXS data (Figure 4.11. B).

Conclusions

We have demonstrated the possibility of noncovalent PEGylation in a protein- polymer conjugate based on the hexavalent lectin RSL and a fucose-capped PEG. A variety of biophysical techniques were used to characterize complex formation. By using NMR spectroscopy it was confirmed that the glycopolymer interacts with the fucose-binding sites of the protein. The chemical shift perturbation plot was consistent with previously obtained data for the binding of “fucose like” sugars.165 The ~1.5 fold increase in NMR line widths and the necessity of the TROSY pulse sequence were indicative of an increased molecular weight (slower tumbling time) of the conjugate. The Kd for Fuc-PEG binding was estimated at ~1.3 M by ITC measurements and the reversible nature of the complex was demonstrated by competition experiments. The Fuc-PEG appendages were displaced by the tighter binding Me-fuc (while the weaker binding L-gal had no effect). Small angle X-ray

98 Noncovalent PEGylation of RSL scattering was used to elucidate the structure of the conjugate. The scattering data were consistent with a protein-polymer “Medusa complex” in which the PEG chains were extended from the protein core. In terms of broader applications, noncovalent PEGylation could involve target proteins with an engineered sugar-binding site and a glycol-PEG with appropriate sugar functionality and PEG size.

Experimental

Materials: Methyl-α-L-fucoside (Me-fuc) and L-galactose (L-gal) were purchased from Carbosynth and Sigma, respectively. L-fucose-capped polyethylene glycol (Fuc-PEG) was synthesized via a copper catalyzed azide-alkyne cycloaddition215 (Scheme 4.1.). Alkyne terminated PEG (Alk-PEG) was prepared according to an adapted literature procedure,216 by using PEG monomethyl ether (∼2 kDa, Sigma 202509) and propargyl bromide. The 1H NMR spectrum of Alk-PEG included peaks at 2.42 and 4.18 ppm, assigned to the alkyne proton and the adjacent CH2 protons, respectively. 2-azidoethyl α-L-fucose was synthesized as described.217 For the click reaction, stoichiometric amounts of Alk-PEG and 2-azidoethyl α-L-fucose were combined in t-/water (5:1) in the presence of copper sulfate (0.1 eq.) and sodium ascorbate (0.2 eq.) at 50 °C for 24 h. The product was purified by flash chromatography and formation of the triazole was confirmed by 1H NMR and ATR- FTIR spectroscopy.

Scheme 4.1. Synthesis of fucose-capped PEG (Fuc-PEG) via click chemistry.

99 Chapter IV

Protein samples: RSL was produced in E. coli BL21, transformed with the plasmid pET25rsl, and purified by mannose-affinity chromatography.5 The 15N-labeled protein was prepared as described recently.165

Size exclusion chromatography: SEC was carried out at 21° C on an Äkta FPLC equipped with an XK 16/70 column (1.6 cm diameter, 65 cm bed-height) packed with Superdex 75 (GE Healthcare). Filtered, degassed buffer (20 mM KPi, 50 mM NaCl, pH 6.0) was used at a constant flow rate of 1.5 mL/min. The samples (800 µL) contained RSL (250 µM) with 12 eq. of Fuc-PEG or Me-fuc. Sample elution was monitored at 280 nm.

Electrophoretic characterization of noncovalent PEGylation: Native electrophoresis was performed using 5% polyacrylamide or 2% agarose gels (6.0 cm x 7.0 cm) in horizontal mode. The samples (15 L) contained 100 M RSL and up to 12 equivalents of Fuc-PEG. Acrylamide gels were run at a constant voltage of 150 V in Tris/Gly buffer at pH 8.3 for 30 min. Agarose gels were run at 100 V in 20 mM KPi at pH 6.0 for 20 min. Visualization of the protein migration was achieved with Coomassie staining and a flatbed scanner.

NMR characterization: 1H-15N TROSY-HSQC218,219 spectra were acquired at 303 K on an Agilent 600 MHz NMR spectrometer equipped with a HCN cold probe. The samples contained 0.25 or 0.5 mM 15N-RSL in the sugar-free form or in the presence of 1-12 equivalents of Fuc-PEG, methyl-α-L-fucoside or L-galactose. The buffer (20 mM potassium phosphate, 50 mM NaCl, pH 6.0) was identical for all samples. The data were processed in NMRPipe.147 The differences in chemical shifts (Δδ) between the sugar-free and bound forms of RSL were measured in CCPN148 and the average 2 2 ½ 70,165 perturbations were calculated as Δδavg = {[ΔδH + (0.2 × ΔδN )]/2} .

Isothermal titration calorimetry: Experiments were performed at 25 °C using a Nano ITC (TA Instruments) and conditions similar to those previously published.5 Samples of protein (24 µM) and ligands (1-2 mM) were prepared in identical buffer (0.1 M Tris-HCl, pH 7.5). Ligand solutions were added in 30 injections of 3 µl at intervals of 3 min while stirring at 300 rpm. A control experiment was performed by

100 Noncovalent PEGylation of RSL injecting Fuc-PEG into buffer, which yielded negligible heats of dilution. The data analysis was performed in NanoAnalyze.

SAXS characterization: Small-angle X-ray scattering experiments were performed at the ESRF BioSAXS beamline BM29 equipped with a Pilatus 1M detector (Dectris).220,221 RSL bound to Me-fuc or to Fuc-PEG was characterized at four different protein concentrations (10, 5, 2 and 1 mg/ml) using the automated sample changer.222 Data were collected also on a sample of the conjugate that was pre- treated via an online SEC at BM29.223 All of the samples were prepared in the same buffer, identical to that used for the NMR experiments. 55 (automated sample changer) or 100 (SEC) μl volumes were exposed to X-rays for 10 individual frames, each 1 second in duration. Individual frames were processed with the automatic processing pipeline224, to yield individual radially averaged curves of normalized intensity versus scattering angle s = 4πsinθ/λ. Additional data reduction in EDNA utilized the automatic data processing tools of the EMBL-Hamburg ATSAS package214,225 to combine timeframes (excluding any data points affected by radiation damage), yielding the average scattering curve for each exposure series. Matched buffer measurements taken before and after every sample were averaged and used for background subtraction. Merging of separate concentrations and further analysis were performed manually using the ATSAS package.226

The forward scattering I(0) and radius of gyration Rg were calculated from the Guinier approximation.227 The hydrated particle volume was computed using the 228 Porod invariant and the maximum particle size Dmax was determined from the pair distribution function computed by GNOM229 in PRIMUS.230 An estimate of the molecular mass (MM) of each construct was made from the I(0), with the partial specific volumes of RSL and PEG taken as 0.74 and 0.83 cm3/g, respectively.46 Ab initio models were calculated using DAMMIF213 and then averaged, aligned and compared in DAMAVER.231 CRYSOL232 was used to calculate the scattering of RSL based on the crystal structure coordinates5 (PDB 2BT9). In CORAL, 6 x 22 dummy residues with random coil conformations (P1 symmetry) were added to account for the six Fuc-PEG chains. The assumption of 22 residues per Fuc-PEG was based on the molecular weight (~2.3 kDa) and Svergun’s demonstration that the electron density of PEG (~0.4 e/Å3) is comparable to that of protein.46 The averaged and filtered model was used to visualize the protein-polymer assembly and

101 Chapter IV

SUPCOMB233 was used to optimally orient the RSL crystal structure (with minimal spatial discrepancy) into the final model. Fits of the models to the experimental data were prepared in SAXSVIEW (http://saxsview.sourceforge.net).

102

103

Chapter V RSL Interactions in Physiological and Crowded Environments

Antonik et al., manuscript in preparation

All of the NMR and electrophoretic experiments were performed by the author. The gelatin section was part of the final year project of Michael Cleary who prepared all of the gelatin and agarose NMR samples.

104 RSL in Physiological and Crowded Environments

Abstract Protein characterization in the native cellular environment is a challenge that protein scientists are still trying to overcome. An alternative strategy is to mimic the cell interior by using cell extracts or macromolecular crowders such as proteins, polysaccharides or synthetic polymers. Here, we report studies of RSL (pI ~6.5) in physiological and crowded environments. In-cell NMR spectroscopy proved to be ineffective to study RSL so we shifted our focus toward crowded and confined environments. BSA (pI ~5) or lysozyme (pI ~10) were used as protein crowders. It was observed that high concentrations of crowder (300 g/L) resulted in completely broadened spectra of sugar-free RSL, suggesting that the protein formed high molecular weight assemblies, either homooligomers or complexes with the crowder. Addition of the tightly binding methyl-α-L-fucoside recovered partially the spectrum of RSL in the presence of BSA but not in the presence of lysozyme. This suggests that the more flexible, sugar-free form of RSL is more susceptible to crowding effects while the lysozyme data suggest additional involvement of complementary charge interactions. To explore the contributions of confinement and electrostatic interactions, RSL was studied in cationic bovine-, anionic porcine-gelatin and in neutral agarose gels. Additional data in polyacrylamide gels provided further insights into the relationship between pore size and confinement effects.

105 Chapter V

Introduction

To date, the majority of protein studies have been carried out in buffered aqueous solutions.234,235 As the nature of macromolecular organization within the cell has gained more attention in recent years, protein studies are shifting towards in- cell experiments.236–240 The concentration of macromolecules in the cell reaches hundreds of grams per litre, limiting the cell volume available to other molecules.239,240 This crowded environment raises the potential for the formation of transient complexes that may not to be observed in dilute, aqueous solutions.110,111,113 For instance, crowding and confinement can lead to protein aggregation, due to changes in kinetics or protein stability.241 NMR spectroscopy is a unique technique that allows the observation of protein structure in solution at the atomic level. In-cell NMR can provide information on how the crowded intracellular environment affects protein structure and dynamics.114,236,242–244 Nevertheless, this method has limitations. It was shown that many globular proteins are undetectable in E. coli cells by NMR spectroscopy.114,245,246 This technique also requires careful sample preparation to avoid the protein leakage problem.247 The generation of an artificial cell interior provides a solution for these issues and presents a straightforward method to examine proteins in physiologically- relevant environments. Cell mimics such as macromolecular crowding/confinement agents have been used to study biological macromolecules in physiological environments.120,241 To replicate the in-cell environment potential crowders like proteins118, polysaccharides248, and synthetic polymers119 have been tested. The model protein RSL5 is a ~29 kDa symmetric trimer with a low net charge (Asp × 3, Glu × 3, Lys × 3, Arg × 3 and His × 1, theoretical pI ~6.5). RSL has small positively and negatively charged patches evenly distributed on the surface (Figure 5.1.) making it an interesting molecule to examine against differently charged proteins (including gelatins) as charge-charge interactions should be weak. Also RSL is relatively large for in-cell NMR but the lack of strong electrostatics can be an interesting feature to investigate. Here, we present NMR studies of RSL in physiological and crowded environments. RSL is anionic at physiological pH 7.6 such as in human

106 RSL in Physiological and Crowded Environments

Figure 5.1. Electrostatic surface representations of RSL (PDB 2BT9), where positive and negative potentials are coloured blue and red, respectively. serum. We tested RSL using in-cell NMR but the spectrum was not detectable. We decided to mimic the intracellular environment and test RSL in the presence of protein crowders. pH 7.6 was chosen for our experiments to be consistent with physiological conditions. BSA and hen egg white lysozyme are standard proteins used as crowders with the size of ~66 and ~14 kDa, respectively. Values of pI ~5 and ~10 for BSA and lysozyme, respectively make these two proteins negatively or positively charged at physiological pH. We used agarose and gelatin gels to encapsulate RSL in confined environments. Bovine and porcine gelatins have different physicochemical properties. Bovine gelatin is anionic with a narrow pI of 4.7-5.4 while porcine gelatin is basic with a broad pI of 7.0-9.0.249 Bovine gelatin (at 100 mg/mL) has pores size of 5-20 nm.250,251 Experiments on RSL encapsulated in gelatin or agarose were carried out at pH 6, which makes bovine and porcine gelatins anionic and cationic, respectively. In contrast to gelatin that is a network of charged peptides, the agarose gel is a network of neutral polysaccharides. To investigate aspects of gel pore size on confinement we performed additional electrophoresis experiments in 2% agarose and 15% polyacrylamide gels, with pore sizes of 130-170 nm and below 20 nm, respectively.252,253

Results and Discussion

In-Cell / Extract NMR. Sugar-free RSL in aqueous solution yields a well resolved 2D HSQC spectrum (Figure 5.2. A and Chapter 2). The in-cell 1H-15N HSQC (Figure 5.2. B)

107 Chapter V showed that RSL was undetectable in E. coli cells.114,245,246 The observed peaks were due to mobile side chain amides and 15N-labelled metabolites.236 Typical over expression of 15N-labelled RSL yields ~50 mg/L concentration. The high expression level (above the detection limit of ~50 µM) suggests that the lack of an in-cell spectrum was caused by interactions between RSL and macromolecules in the E. coli cytosol. A cell extract, prepared by a standard freeze-thaw procedure of the in-cell sample, yielded a well-resolved 1H-15N HSQC spectrum. Interestingly, the extract spectrum revealed chemical shift perturbations relative to sugar-free RSL in buffer (Figure 5.2. C).

Figure 5.2. 1H-15N HSQC spectra of (A) sugar-free RSL in 20 mM KPi, 50 mM NaCl, pH 6.0 (B) a suspension of E. coli cells containing over-expressed RSL (brown) and (C - E) Overlay of RSL

in extract (yellow) and RSL in the presence of 30 mM D-glucose (red) or D-galactose (blue). The black dotted circle indicates where the Trp36 indole typically resonates.

As presented in Chapter 3, the weakly binding sugars have similar, small CSPs. Therefore, it is hardly distinguishable which exact sugar in the cell extract was bound to RSL. Panels D and E of Figure 5.2. show spectral regions from overlaid

108 RSL in Physiological and Crowded Environments

HSQC spectra of RSL in a cell extract and in the presence of 30 mM D-glucose or D- galactose. In general, the regions presented similar CSPs. Distinctive differences in Δδ was observed for Glu73 (Figure 5.2. D) and the indole of Trp31 (Figure 5.2. E) suggesting binding of D-glucose instead of D-galactose. Bennet et al. have shown that the E. coli metabolome contains mM concentration of glucosides.109 The overall spectra of RSL in extract and in the presence of 30 mM D-glucose are similar but not identical, suggesting that RSL could be bound to glucosides found in E. coli. Tyr64, Thr71 and Lys83 in the presence of D-glucose presented similar CSPs as RSL in cell extract while the cross-peaks of Arg29 and Asp32 were clearly different (Figure 5. 2. D).

RSL in the Presence of Protein Crowders or Human Serum. 2D 1H-15N TROSY-HSQC experiments were acquired on 0.25 mM samples of sugar-free 15N-labelled RSL in the presence of BSA or lysozyme in 20 mM KPi, 50 mM NaCl at pH 7.6. Samples were prepared at 100 g/L concentration of each crowder and adjusted to pH 7.6. After data acquisition, additional crowder was added to 300 g/L, the pH was readjusted and samples were remeasured. Spectra were acquired also after the addition of 2 eq. (3 mM) of methyl-α-L-fucoside to check the sugar-bound form of RSL in the crowded environment. In the presence of 100 g/L of BSA, RSL yielded a high quality HSQC spectrum (Figure 5.3. top) while in 300 g/L of BSA the spectrum of RSL was broadened beyond detection (Figure 5.3. middle). Upon addition of 3 mM methyl-α- L-fucoside the spectrum was partially recovered (Figure 5.3. bottom). The sugar-free RSL is more flexible than sugar-bound RSL165 (Chapter 2). The addition of tightly binding monosaccharides such as methyl-α-L-fucoside decrease spectrum line width resulting in ~3 Hz sharper peaks (Chapters 2 and 5). The presence of 300 g/L BSA might prompt aggregation of RSL and hence causing severe broadening. Addition of methyl-α-L-fucoside increases RSL rigidity and also decrease the spectrum line width so resonances were again detectable. Both RSL and BSA at pH 6 are anionic, so charge-charge interactions between RSL and the crowder are repulsive. In contrast, only 100 g/L of lysozyme caused significant broadening. At 300 g/L lysozyme the spectrum of RSL was completely broadened beyond detection and addition of methyl-α-L-fucoside did not recover the spectrum. Possibly anionic RSL interacted with the cationic lysozyme to form large molecular weight complexes.

109 Chapter V

Figure 5.3. 1H-15N TROSY-HSQC spectra of 0.25 mM RSL in the presence of 100 (top) or 300 g/L (middle) of the protein crowder. Samples with 300 g/L of the crowder were treated with 3

mM (2eq.) of methyl-α-L-fucoside (bottom). The buffer was 20 mM Kpi, 50 mM NaCl at pH 7.6. The spectra are contoured identically. The crowding and attractive charge-charge interactions in 300 g/L of lysozyme may have caused RSL aggregation such that RSL was not detectable by NMR in either sugar-free or -bound forms. In serum, the NMR spectrum of sugar-free RSL was also broadened beyond detection (Figure 4B). The heterogeneous composition of serum and increased viscosity relative to water hinders molecular tumbling and prompts broadening. Small molecules such as lactate studied in body fluids showed broader 1H NMR spectra in serum and urine.254,255 The similarity of the spectra of RSL in serum (Figure 5.4.) or in the presence of BSA (Figure 5.3.) suggests that human serum albumin (HSA) (the most abundant protein in serum256,257) was responsible for spectral broadening. These results suggest complex formation (and increased τc) in

110 RSL in Physiological and Crowded Environments

serum. The role of charge-charge interactions on complex formation for RSL in either crowders or serum cannot be fully considered without additional experiments at different salt concentrations.

RSL in Confined Environments – Gelatin or Agarose

To our knowledge, gelatin has not been used previously as a crowder or confinement agent in NMR samples. First, it was necessary to determine what gelatin concentration would be suitable for physiologically–relevant experiments without impacting spectral quality. Bovine gelatin samples were 1 15 Figure 5.4. H- N TROSY-HSQC spectra of examined by 1H-15N HSQC (A) 0.25 mM RSL in 20 mM KPi, 50 mM watergate258 spectroscopy at NaCl, pH 7.6 (B) in human serum (C) and after concentrations of 50, 75, 100 and 125 the addition of 3 mM methyl-α-L-fucoside mg/mL (Figure 5.5.). Bovine gelatin spectra revealed background signals in the proton dimension (7.8 - 8.8 ppm) and throughout the 15N dimension. The noise (7.8 - 8.8 ppm) is consistent with background signals from the amides of collagen, which is present at high concentration. This noise partly obscures the spectrum and makes gelatin samples more difficult to analyze and interpret. 75 mg/mL gelatin was selected for further experiments as this concentration was the highest, most physiologically relevant concentration which did not significantly affect the spectrum (Figure 5.2. B). Also it was found that concentrations above 100 mg/mL were difficult to prepare and resulted in heterogeneous gels.1H-15N HSQC experiments were performed for sugar- free RSL and RSL bound to methyl-α-L-fucoside in 75 mg/mL porcine or bovine gelatin, or 0.5 – 2% agarose gels. In porcine gelatin the spectrum of sugar-free RSL was broadened beyond detection with only side chain cross-peaks detected. In contrast, in bovine gelatin the

111 Chapter V

spectrum was broad but detectable. Interestingly, the spectral quality improved in both gels when methyl-α- L-fucoside was added. The experiments were also performed using the TROSY pulse sequence but this approach did not improve the spectral quality. At pH 6.0 RSL is slightly cationic. Crowding experiments suggest that RSL can interact with oppositely charged proteins (e.g. lysozyme, Figure 5.3.). Similar effects might be expected in the highly confined charge-rich environment of the gel. Interestingly, the results were counter-intuitive as cationic RSL was undetectable in the cationic porcine gelatin, yet detectable in the anionic bovine gelatin (Figure 5.6.). Perhaps one of the explanations is that the pores in porcine gelatin are smaller than in bovine, resulting in more confined Figure 5.5. 1H-15N HSQC watergate spectra of bovine gelatin at (A) 50 (B) 75 (C) 100 and (D) environment at the same concentration. 125 mg/mL in 20 mM KPi, 50 mM NaCl, pH However, a literature value for the pore 6.0 size of porcine gelatin was not found so this is just a deduction. More insight into this phenomenon was provided by NMR experiments in agarose gels (Figure 5.7.). Resonance broadening of sugar-free RSL was dependent on the concentration of agarose (0.5 or 2%, Figure 5.7. top). The decreased pore size in the more concentrated agarose gel has a greater confinement effect and more strongly influences the τc of RSL. Similarly as it was in the case of experiments with crowders, the addition of methyl-α-L-fucoside improved the spectral line-widths.

112 RSL in Physiological and Crowded Environments

Figure 5.6. 1H-15N HSQC spectra of 0.1 mM sugar-free RSL (top row) or RSL bound to methyl-

α-L-fucoside (bottom row) in porcine (left) or bovine (right) gelatin. The sample conditions were 75 mg/mL gelatin in 20 mM Kpi, 50 mM NaCl, pH 6.0

1 15 Figure 5.7. H- N HSQC spectra of sugar-free RSL (top row) and RSL bound to methyl-α-L- fucoside (bottom row) encapsulated in 0.5% (left) or 2% (right) agarose gel. The gels were prepared in 20 mM KPi, 50 mM NaCl, pH 6.0

113 Chapter V

Difference in line-widths between the 0.5% and 2% agarose samples remained (Figure 5.7. bottom). A detailed comparison of the line-widths in agarose gels is presented in Table 1. The agarose gel is a polysaccharide network of D-galactose and 3,6-anhydro- L-galactose repeat units.252 It could be expected that this sugar network binds to RSL and the resulting “high molecular complex” could cause CSPs and/or severe resonance broadening. However, no CSPs were observed and there was only ~3 Hz difference in line-width between 0.5% and 2% agarose gels (Figure 5.7. and Table 5.1.) suggesting minimal interactions between RSL and the gel. RSL binds to both galactose enantiomers (Chapter 3) that are the constituents of agarose so binding to the gel could occur. However, if we consider structure of the agarose, then interactions with the gel are impossible. Agarose is a linear polymer consisting of alternating D-galactose and 3,6-anhydro-L-galactopyranose linked by α-(1→3) and β-(1→4) glycosidic bonds. In this case only the polymer termini could bind to RSL. Otherwise, we would not be able to observe migration of RSL in agarose gels as it would stick to the gel. Also CSP would be observed in spectra of sugar-free RSL acquired in agarose unless the complex was too large to detect by NMR (>100 kDa).

114 RSL in Physiological and Crowded Environments

Table 5.1. Line-widths of 1HN resonances extracted from the HSQC spectra of sugar-free RSL or RSL bound to methyl-α-L-fucoside in agarose gels. Line-widths were measured in CCPNmr for non-overlapping cross-peaks (~54% of total).

1HN line-width (Hz) 1HN line-width (Hz) 0.5% Agarose 2% Agarose 0.5% Agarose 2% Agarose Residue Residue Sugar Me- Sugar Me- Sugar Me- Sugar Me- Free fuc Free fuc Free fuc Free fuc V3 21.8 19.3 29.6 24.5 V48 24.1 24.1 28.1 28.1 Q4 22.8 25.6 29.1 23.3 S49 27.5 28.6 33.4 33.6 T5 27.2 25.9 32.9 27.2 S52 19.6 21.1 28.7 31.4 T8 26.6 20.4 34.0 29.2 V55 25.2 25.5 32.4 31.6 S9 24.3 22.1 26.1 24.6 A58 21.6 18.4 22.4 20.9 W10 24.5 26.5 34.6 30.5 I59 26.1 24.3 30.6 30.4 G11 26.6 21.2 15.9 24.0 R62 30.1 18.6 16.7 32.2 S15 29.0 30.1 47.9 29.7 Y64 25.4 24.2 27.5 42.2 Y19 26.6 26.8 24.9 43.4 A65 28.2 21.8 20.1 33.5 T20 21.8 23.5 32.2 23.1 G68 23.8 25.6 31.3 26.9 A21 31.2 21.3 36.3 26.4 T70 21.3 21.3 31.0 21.0 N23 26.6 22.6 31.2 37.1 T71 25.6 21.0 35.3 29.3 G24 20.6 20.6 26.5 26.6 T72 25.7 28.6 31.8 30.5 K25 20.9 21.7 28.1 21.9 E73 22.2 22.7 26.2 30.5 T27 25.8 23.9 34.2 35.8 W74 24.7 25.6 27.8 38.0 D32 31.1 23.3 20.1 19.6 C75 30.8 23.9 17.4 35.4 Y37 21.9 24.4 33.5 21.2 W76 36.0 22.1 16.1 18.5 T38 22.4 18.5 27.3 28.0 T82 25.1 27.2 20.7 25.8 G39 21.3 27.9 20.4 28.5 K83 17.6 20.4 26.2 21.8 F41 27.1 18.0 28.8 28.3 G84 27.3 20.0 31.0 24.6 N42 22.0 27.8 27.6 25.3 A85 32.7 24.6 25.0 36.6 E43 33.2 23.0 42.7 19.6 A88 25.3 26.8 22.5 25.3 G45 21.4 20.7 33.9 20.3 T89 22.2 19.2 21.0 20.7 D46 26.2 35.1 29.3 33.6 N90 17.7 17.3 20.1 18.9 N47 24.8 24.2 25.2 21.4 0.5%, Sugar Free = 25.2 (±3.9) 0.5%, Me-fuc = 23.4 (±3.5) Average line-width (Hz) 2%, Sugar Free = 28.1 (±6.5) 2%, Me-fuc = 27.8 (±6.1)

115 Chapter V

Electrophoretic characterization. To further investigate pore size effects in different types of gels, electrophoresis experiments were performed. RSL was characterized by SDS-PAGE and native gel electrophoresis. Samples of sugar-free or -bound RSL were analyzed in 15% polyacrylamide (vertical) or in 2% agarose (horizontal) gels. For native gel analysis, the agarose gel was prepared in KPi buffer at pH 6.0. Therefore, the migration of RSL (pI ~6.5) was towards the cathode. Samples of sugar-free RSL and methyl-

Figure 5.8. 2% agarose electrophoresis (A) and α-L-fucoside bound RSL migrated to 15% SDS-PAGE (B) of 0.1 mM RSL. (A) the different distances (Figure 5.8. A) in migration of sugar-free RSL (1) and RSL in the native agarose gels. Sugar-free RSL presence of 0.2 mM of D-glucose (2), D- migrated the shortest distance (3 mm), galactose (3), D-mannose (4), D-arabinose (5), but when bound to L-galactose (K ~10 L-galactose (6), L-fucose (7), or methyl-α-L- d fucoside (8). (B) Unboiled and boiled (indicated µM) or other tightly binding

with asterisk) sugar-free RSL (1) or RSL bound monosaccharides (Kd ≥ 2 µM) the to methyl-α-L-fucoside (8) migration distance was increased (Figure 5.8. A). The largest migration into the gel was for the tightest binding methyl-α-L-fucoside (Kd ~0.5 µM). The diameter of RSL (PDB 2BT9) is ~5 nm, that is ~30 fold smaller than the pores of 2% agarose gel. In this case migration should not be hampered by the pore size but could be slowed down by weak binding to the gel. Methyl-α-L-fucoside bound RSL would lack interactions with the agarose network resulting in the greatest migration into the gel. SDS-PAGE experiments were performed using 15% polyacrylamide gels that have at least 10 fold smaller pores than the agarose gels.252 RSL is a ~29 kDa trimer in solution (confirmed by SAXS259, Chapter 4). SDS and boiling should unfold the protein resulting in a band in the gel at ~10kDa. If unfolding is unsuccessful, then the native protein could migrate into the gel as far as a typical ~30 kDa protein

116 RSL in Physiological and Crowded Environments would travel. Samples of RSL for SDS-PAGE analysis were usually boiled for 10 minutes to completely denature the protein due to high thermal stability of RSL (Appendix D).36 The RSL diameter (5 nm) is ~3 times smaller than the pore size of a 15% polyacrylamide gel. To investigate the role of the pore size, samples of sugar- free and -bound RSL were compared. Interestingly, Figure 5.8. B shows that the native samples of sugar-free and -bound RSL did not migrate into the gel at all, suggesting that RSL forms assemblies larger than the pore size. The less dense stacking gel (5% polyacrylamide) resulted in “tailing” of the sample. When the sample was boiled and the protein was denatured the monomer was observed for both sugar-free and -bound RSL. Interestingly, the boiled sample for methyl-α-L-fucoside bound RSL gave rise to two distinct bands on the gel. The lower faint band corresponds to the unfolded monomer. The upper band was similar to the band observed for unboiled, native samples. The methyl-α-L-fucoside bound RSL is extremely hard to denature with a thermal stability 96 °C.36 The standard SDS-PAGE protocol requires boiling at 95 °C so heating for 10 minutes was insufficient to fully denature the fucoside bound RSL. Consequently, the loaded sample contained both folded and unfolded species of RSL which were observed in the gel.

Conclusions

RSL, similar to other globular proteins,121 was undetectable by in-cell NMR spectroscopy. However, a well-resolved HSQC spectrum of RSL was observed in concentrated cell extract. What is more interesting, RSL was found to be sugar- bound in cell extracts presumably to glucosides present in E. coli at mM concentration. The study of RSL in crowded environments revealed that protein crowders at 300 g/L can lead to complete the loss of NMR signal. This result is likely a consequence of the increased sample viscosity and the aggregation of RSL. Remarkably, addition of the tightly binding methyl-α-L-fucoside reversed the broadening effect, but only for RSL in the presence of BSA. RSL in the presence of lysozyme revealed significant broadening already at 100 g/L of the crowder. Spectral broadening may suggest complex formation between anionic RSL (at pH 7.6) and

117 Chapter V the highly cationic lysozyme via charge-charge interactions. However, this statement requires further experimental investigation such as NMR measurements at higher concentration of NaCl (100-150 mM) that may disrupt the electrostatic effects. Similar observations as for BSA experiment were made for RSL in human serum, presumably due to the abundance of HSA. Gelatin was examined as a potential macromolecular confinement agent using a procedure that relies on proteins that are stable at 40 °C. The HSQC spectra were impeded by background noise from gelatin. The results for RSL encapsulated in gelatin were firstly ambiguous. Bearing in mind results obtained for protein crowders it was hypothesized that RSL would not stick to either type of gelatin or just to the positively charged bovine gelatine. However, RSL encapsulated in either gelatin resulted in broad spectra. Upon addition of methyl-α-L-fucoside the quality of the spectra improved noticeably. RSL detection was also weakened with increasing agarose concentration. In 2% agarose gel the RSL spectrum was broad but no CSP was observed suggesting that RSL aggregates in the confined environment leading to impaired spectral detection. However, binding to agarose gel network could be too weak to be observed by NMR. The results for RSL encapsulated in a range of gels suggest that broadening detected in the spectra was due to different pore sizes that affected the rotational correlation time rather than electrostatics or strong binding effects between RSL and the gel network. To investigate the effect of confinement related to the pores size, the additional electrophoretic characterization for sugar-free and –bound RSL was performed in polyacrylamide and agarose gels. Sugar-bound RSL migrated further into the agarose gel than sugar-free form suggesting weak binding of the protein to gel network. The experiments in polyacrylamide gel showed that RSL may create high molecular aggregates in either sugar-free or -bound form that cannot migrate into 15% polyacrylamide gel. Methyl-α-L-fucoside bound form of RSL is highly thermal stable that even extensive boiling was not able to fully denature the protein sample.

118 RSL in Physiological and Crowded Environments

Experimental

Materials: Bovine and porcine gelatin, hen egg white lysozyme, BSA and DMSO were purchased form SIGMA.

Cell Slurry Preparation for In-Cell NMR: Four hours post-induction, 50 mL E. coli culture expressing RSL was harvested (1500 rpm, 20 ˚C, 15 minutes) and the 260 pellet was gently resuspended in a final volume of 0.5 mL containing 20% D2O.

Cell Extract Preparation: Concentrated cell extract samples were prepared as previously described.260

NMR Spectroscopy: 1H-15N TROSY-HSQC218,219 or HSQC watergate258 spectra were acquired at 303 K on an Agilent 600 MHz NMR spectrometer equipped with a HCN cold probe. The data were processed in NMRPipe147 and analyzed in CCPN.148

Gelatin/Agarose NMR sample preparation: The procedure for preparing gelatin samples was adapted from the protocol used for the gelatin strength test261,262 and the methods of Leuenberger.263 The gelatin was weighed out and combined with NMR buffer and D2O. The mixture was left to stand for 1 hour to rehydrate the gelatin particles, and then heated in a water bath for 20 min at 40 °C to ensure complete solvation. RSL (or RSL plus methyl-α-L-fucoside) was also heated for 5 min at 40 °C prior to transfer to the gelatin mixture. Samples were inverted several times to ensure homogeneity and transferred to a NMR tube using a Pasteur pipette. To aid sample transfer, the Pasteur pipette was incubated at 60 °C to prevent the gelatin from solidifying. The samples were incubated overnight at 4 °C to ensure complete gelation. A method similar to Crowley et al. was used to prepare agarose-gel NMR samples. 212 The agarose was weighed out and combined with NMR buffer, D2O and NaN3. The suspension was heated until melting (90 - 95 °C) and maintained at 70 °C followed by addition of RSL. The sample was inverted several times and transferred rapidly using a pre-heated pipette to an NMR tube. The sample was incubated at 4 °C overnight.

119 Chapter V

A typical NMR sample contained 0.1 mM RSL and 75 mg/mL of porcine/bovine gelatin or 0.5/2% of agarose in standard NMR buffer (20 mM KPi, 50 mM NaCl at pH 6.0, 10% D2O) and 0.04% NaN3 as a preservative. The pH of the protein-gelatin mixtures in the buffer was verified to be 6.0.

Electrophoretic characterization: 20 μL samples containing 0.1 mM RSL and 0.2 mM carbohydrate were analyzed by native gel electrophoresis. 0.5 or 2% agarose gels, 6.0 cm x 7.0 cm, were run at 100 V in 20 mM KPi at pH 6.0 for 20 min at 4 °C. SDS-PAGE (15% polyacrylamide) was used to analyze boiled or unboiled protein samples. Visualization of the protein migration was achieved with Coomassie staining and a flatbed scanner.

120

121

Chapter VI Discussion

122 Discussion

1. Overview of Main Results Chapters 2 and 3 presented analyses of RSL interactions with numerous carbohydrates. Chapter 1 was focused on NMR assignments of RSL backbone and Trp indole resonances for sugar-free and -bound forms (full list of assignments in Appendix B). Additional assignments were also provided for RSL at pH 4.0 and pH 7.6 and in 20% DMSO. The assigned HSQC spectra of RSL were essential fingerprints for future NMR work on RSL. Additionally, the assignments of tryptophan side chains provided a useful tool to observe binding events and narrowed the analyses to only 7 cross-peaks. Chapter 3 is a continuation of the research on RSL sugar interactions. RSL was screened in the presence of 15 different carbohydrates and monitored by NMR spectroscopy. The screening included examples of monosaccharides (e.g. D-fructose, D-glucose, D-galactose), a disaccharide (sucrose) and sugar alcohols (glycerol, L-fucitol). Interactions and possible sugar conformations were analysed by using chemical shift perturbation and manual modelling of each sugar into the RSL binding site. This broad variety of sugars helped to create a library of RSL-sugar bound spectra. This knowledge was used to analyse e.g. the content of vegetable extracts. The data confirmed binding of RSL to D-fructose found in vegetable juice. Chapter 4 presented a possible route to noncovalent protein PEGylation. We took advantage of the lectin specificity to bind a certain sugar moiety. In our case, the PEG polymer was terminally modified with one L-fucose moiety, resulting in noncovalent attachment to RSL. The protein- glycopolymer conjugate could be disassembled by competing interactions with tighter binding carbohydrates like methyl-α-L-fucoside. Chapter 5 is an initial study of RSL in physiological environments. Although, RSL was undetectable by in-cell NMR it raised the interesting suggestion of electrostatic interactions of the almost neutral RSL in highly crowded environment. All of the fundamental characteristics defined in this thesis give a good starting point to future RSL projects.

2 Future Projects Developments and New Ideas 2.1 RSL-Sugar Interactions There are many monosaccharides that could be tested to complete our knowledge on RSL-sugar interactions. For example, of the aldohexoses it may be worth to test D- allose, D-idose and D-talose. D-mannose is the tightest binding to RSL among the

123 Chapter VI tested aldohexoses in D configuration. We have shown that D-galactose, C2 epimer of D-mannose, binds twice tighter than D-glucose, C2/4 epimer (Chapter 3). The test of D-talose, C4 epimer of D-mannose, could aid information about RSL preferences i.e. whether axial / equatorial hydroxyl group is more important at C2 or C4. D- and volemitol are other examples of short and long open chain sugars that could provide new insight on conformation adaptation when a binds to a protein. We showed that the shorter glycerol binds tighter than the longer (more conformationally flexible) L-fucitol (Chapter 3). D-arabitol has an intermediate size and its affinity could be lower / higher than glycerol and L-fucitol, respectively. Volemitol as the longest among the tested open chain sugar is predicted to be the weakest binding or even non-binding. Protein-sugar interactions have their crucial role in nature as saccharides partake in numerous biological processes.158 Considering the biological importance of such recognition events an obvious continuation on the RSL-sugar interactions is to study the binding of complex carbohydrates. For example, LewisX (PDB 5AJB) and Sialyl- LewisX (PDB 5AJC) interactions with RSL have been characterized by X-ray crystallography.35 Perhaps, useful information can be obtained by NMR, in particular, how the flexible carbohydrates might interact with parts of the protein adjacent to the binding site. Lastly, screening more disaccharides like maltose (4-O-α-D-glucopyranosyl-D- glucose) or lactose (4-O-β-D-galactopyranosyl-D-glucose) may give new ideas on weak binders. Studies of RSL binding to weak sugars are practically impossible using other experimental techniques. For instance, Kostlanova et al. showed that D- galactose was classified as a non-binder5 while we proved that it is a weak binder with Kd ~ 15 mM. Disaccharides composed of tight binders could also be interesting to characterize, for example, the various mannobioses (dimannosides with α(1→X) linkages).

2.2 PEGylation PEGylation is a widely used modification to increase protein size, solubility and stability.100,102 In Chapter 4 we demonstrated noncovalent PEGylation. A study of standard RSL PEGylation can be worthwhile as it would enable a direct comparison of line broadening effects caused by covalent or noncovalent modifications. Also,

124 Discussion the PEG chains attached to RSL can presumably shield RSL from other interactions. For example, if PEG masks the binding site, protein-polysaccharide interactions may be inhibited. The relevance to physiological studies is presented in the next section. The RSL trimer has 84 residues that can be targeted for modification. The RSL monomer contains 28 reactive side chains (Tyr × 4, Asp × 3, Glu × 3, Lys × 3, Arg × 3, Cys × 2, and His × 1), 7 tryptophans and the C-/N-termini, which can be derivatized for conjugation purposes.5,264 The two most common conjugation strategies are via modification of cysteine or lysine residues.264 Therefore, narrowing target residues to just Cys and Lys, gives rise to a total of 18 potential modification sites (including the N-terminus) in the RSL trimer. The initial study of Cys modification in RSL was carried out by Leah Kearney (a BSc. project student). It was found that cysteine modification by maleimide was limited due to these residues being buried in the binding site. Instead, the modification of the well exposed lysine residues and the N-terminus was successfully performed and further studies are currently being carried out in the Crowley lab.

2.3 Physiological Environments Our preliminary studies show that RSL, like many globular proteins, is undetectable by in-cell NMR. The studies in crowded / confined environment are preliminary but an obvious new starting point is to investigate charge-charge interactions with the cationic lysozyme at pH 6. For this reason measurements of RSL-lysozyme mixtures in different salt concentrations are required. RSL is a new model protein for studies in physiological environments, therefore, many already developed systems can be implemented. For instance, using synthetic polymers may provide a feasible comparison with protein crowders. Crowley et al. showed the ability of high concentrations of PEG to bind to hydrophobic regions of cytochrome c.212 RSL was tested in the presence of 3 mM PEG 2000 and at this concentration no interactions were observed (Chapter 4). At much higher concentrations i.e. 100 / 300 g/L interactions might occur at hydrophobic sites in RSL (see Figure 4.1.). Another widely used polymer crowder is Ficoll70, a branched and cross-linked polymer of sucrose and epichlorohydrine that, due to its more spherical shape can mimic globular proteins of ~70 kDa (e.g. BSA). This synthetic crowder is more interesting as Ficoll70 can seemingly bind to RSL. It has a branched structure with chains

125 Chapter VI terminated by sucrose. We showed that sucrose, the main component of Ficoll70 is the weakest binding sugar among all tested but in the case of Ficoll70, multivalency plays a role. Even if the terminal sucrose binds in a different, less favoured orientation compared to the free disaccharide, the interactions between RSL and multiple Ficoll chains can occur at high concentration of the crowder resulting in high molecular weight complexes. Here, the PEGylated protein can be masked from other molecules so new insight on crowding and binding events can be elucidated using PEGylated RSL in the presence of Ficoll70. Presumably, PEGylated RSL can be protected from binding to Ficoll.

2.4 In Plant NMR The uses of unusual crowders to mimic physiological environments have been reported. For example, Martorell et al. studied CyaY protein in hen egg white,241 while Sarkar et al. prepared reconstituted cytosol to study chymotrypsin inhibitor 2 stability.265 In Chapter 5 we have presented the method of using gelatin as the confinement agent that is also quite novel approach. The effects of confinement can be developed further by encapsulating RSL in polyacrylamide gels that would provide direct comparison to the data in gelatin and agarose gel. Another alternative to crowded / confined environments is the idea of in-plant NMR. The transfer of sugar-free or -bound RSL into plant tissue could provide a natural confined environment. This idea can provide information about both crowding and binding effects. RSL screened in the presence of vegetable juice showed binding to D-fructose in the extract (Chapter 3). Sugar-free RSL transferred into plant tissue, could be observed by NMR and provide information on direct binding to the tissue. Presumably, if RSL binds to the tissue it will result in a high molecular weight assembly and therefore be undetectable by NMR. These interactions could be hindered by tighter binding monosaccharides (as in the case of assembly with fucosylated PEG, Chapter 3) and restoration of the RSL spectrum. On the other hand, the crowding in plant tissue may be higher than obtained in gels (Chapter 4) and NMR detection may be hampered due to the lack of motion (infinitely long tumbling time). This idea is wide open and might be tricky from the point of experiment design. Firstly, we need to choose a plant tissue that will be easy to work with. Here, the

126 Discussion natural candidate seems to be potato (cheap source) and RSL is from a pathogen that attacks potato (see Introduction and next section). Secondly, RSL needs to be transferred into the tissue, in a way which ensures that the resulting spectrum does not belong to RSL on the tissue surface. For example, protein can be injected into the plant tissue (rather than dipping the tissue into an RSL solution). Partial drying of the plant material followed by soaking in the RSL solution can be another practice. The other difficulty may be transfer of the plant tissue into the NMR tube and obtaining reliable shims. All of these disadvantages may be worth to overcome because the outcome can be interesting and surprising.

2.5 Plant Pathogen Inhibitors Ralstonia solanacearum is a threat to hundreds of plants of agricultural importance. The bacterium causes fatal damage to crops such as potato, tobacco and banana causing major agricultural and economical losses.24 A pesticide that helps combat Ralstonia wilting could be based on knowledge of RSL, which mediates plant- pathogen interactions.5 Since the fingerprint HSQC spectrum of RSL is available, every water soluble (or 80:20 water:DMSO) compound can be screened for binding to RSL. Additionally, extended studies on sugars can help to eliminate misleading results. The first effort to screen plant extracts was already performed without much success (Chapter 3) but the attempts to obtain a positive hit will be continued.

127

128

Bibliography

130 Bibliography

1. Boyd, W. C. & Shapleigh, E. Specific Precipitating Activity of Plant Agglutinins (Lectins). Science 119, 419 LP-419 (1954). 2. Sharon, N. & Lis, H. History of lectins: from hemagglutinins to biological recognition molecules. Glycobiology 14, 53R–62R (2004). 3. Ren, X.-M., Li, D.-F., Jiang, S., Lan, X.-Q., Hu, Y., Sun, H. & Wang, D.-C. Structural Basis of Specific Recognition of Non-Reducing Terminal N-Acetylglucosamine by an Agrocybe aegerita Lectin. PLoS One 10, e0129608 (2015). 4. Hsieh, T.-J., Lin, H.-Y., Tu, Z., Huang, B.-S., Wu, S.-C. & Lin, C.-H. Structural Basis Underlying the Binding Preference of Human Galectins-1, -3 and -7 for Galβ1-3/4GlcNAc. PLoS One 10, e0125946 (2015). 5. Kostlanova, N., Mitchell, E. P., Lortat-Jacob, H., Oscarson, S., Lahmann, M., Gilboa-Garber, N., Chambat, G., Wimmerova, M. & Imberty, A. The fucose-binding lectin from Ralstonia solanacearum - A new type of beta-propeller architecture formed by oligomerization and interacting with fucoside, fucosyllactose, and plant xyloglucan. J. Biol. Chem. 280, 27839– 27849 (2005). 6. Audfray, A., Claudinon, J., Abounit, S., Ruvoen-Clouet, N., Larson, G., Smith, D. F., Wimmerova, M., Le Pendu, J., Romer, W., Varrot, A. & Imberty, A. Fucose-binding Lectin from Opportunistic Pathogen Burkholderia ambifaria Binds to Both Plant and Human Oligosaccharidic Epitopes. J. Biol. Chem. 287, 4335–4347 (2012). 7. Lin, Y. P., Xiong, X., Wharton, S. A., Martin, S. R., Coombs, P. J., Vachieri, S. G., Christodoulou, E., Walker, P. A., Liu, J., Skehel, J. J., Gamblin, S. J., Hay, A. J., Daniels, R. S. & McCauley, J. W. Evolution of the receptor binding properties of the influenza A(H3N2) hemagglutinin. Proc. Natl. Acad. Sci. 109, 21474–21479 (2012). 8. Lis, H. & Sharon, N. Lectins: Carbohydrate-specific proteins that mediate cellular recognition. Chem. Rev. 98, 637–674 (1998). 9. Staudacher, E., Altmann, F., Wilson, I. B. H. & März, L. Fucose in N-glycans: from plant to man. Biochim. Biophys. Acta - Gen. Subj. 1473, 216–236 (1999). 10. Yu, L.-G. & Rhodes, J. M. in Lectin Methods Protoc. (eds. Rhodes, J. M. & Milton, J. D.) 379–384 (Humana Press, 1998). 11. Kilpatrick, D. C. Animal lectins: a historical introduction and overview. Biochim. Biophys. Acta - Gen. Subj. 1572, 187–197 (2002). 12. Chrispeels, M. J. & Raikhel, N. V. Lectins, lectin genes, and their role in plant defense. Plant Cell 3, 1–9 (1991). 13. Becker, D. J. & Lowe, J. B. Fucose: biosynthesis and biological function in mammals. Glycobiology 13, 41R–53R (2003). 14. Dang, S. Y., Sun, L. F., Huang, Y. J., Lu, F. R., Liu, Y. F., Gong, H. P., Wang, J. W. & Yan, N. E. Structure of a fucose transporter in an outward-open conformation. Nature 467, 734- U130 (2010). 15. Varki, A. Essentials of glycobiology. (Cold Spring Harbor Laboratory Press, 2009). 16. Arnaud, J., Audfray, A. & Imberty, A. Binding sugars: from natural lectins to synthetic

131 Bibliography

receptors and engineered neolectins. Chem. Soc. Rev. 42, 4798–4813 (2013). 17. Aplin, J. D. & Jones, C. J. P. Fucose, placental evolution and the glycocode. Glycobiology 22, 470–478 (2012). 18. Marcus, D. M. & Cass, L. E. Glycosphingolipids with Lewis Blood Group Activity: Uptake by Human Erythrocytes. Science 164, 553 LP-555 (1969). 19. Pang, P. C., Chiu, P. C. N., Lee, C. L., Chang, L. Y., Panico, M., Morris, H. R., Haslam, S. M., Khoo, K. H., Clark, G. F., Yeung, W. S. B. & Dell, A. Human Sperm Binding Is Mediated by the Sialyl-Lewis(x) Oligosaccharide on the Zona Pellucida. Science 333, 1761– 1764 (2011). 20. Listinsky, J. J., Siegal, G. P. & Listinsky, C. M. alpha-L-fucose - A potentially critical molecule in pathologic processes including neoplasia. Am. J. Clin. Pathol. 110, 425–440 (1998). 21. Fern ndez-Rodr guez, J., de la Cadena, M. P., Mart n ez- orzano, . S. Rodr guez- Berrocal, F. J. Fucose levels in sera and in tumours of colorectal adenocarcinoma patients. Cancer Lett. 121, 147–153 (1997). 22. Listinsky, J. J., Siegal, G. P. & Listinsky, C. M. The emerging importance of alpha-L-fucose in human breast cancer: a review. Am. J. Transl. Res. 3, 292–322 (2011). 23. Smith, E. F. A bacterial disease of tomato, pepper, eggplant and Irish potato (Bacillus solanacearum nov. sp.). Bull. (United States. Div. Veg. Physiol. Pathol. 12, 1–28 (1896). 24. Hayward, A. C. Biology and epidemiology of bacterial wilt caused by Pseudomonas solacacearum. Annu. Rev. Phytopathol. 29, 65–87 (1991). 25. Schell, M. A. Control of virulence and pathogenicity genes of Ralstonia solanacearum by an elaborate sensory network. Annu. Rev. Phytopathol. 38, 263–292 (2000). 26. Salanoubat, M., Genin, S., Artiguenave, F., Gouzy, J., Mangenot, S., Arlat, M., Billault, A., Brottier, P., Camus, J. C., Cattolico, L., Chandler, M., Choisne, N., Claudel-Renard, C., Cunnac, S., Demange, N., Gaspin, C., Lavie, M., Moisan, A., Robert, C., Saurin, W., Schiex, T., Siguier, P., Thebault, P., Whalen, M., Wincker, P., Levy, M., Weissenbach, J. & Boucher, C. A. Genome sequence of the plant pathogen Ralstonia solanacearum. Nature 415, 497–502 (2002). 27. Carpita, N. C. & Gibeaut, D. M. Structural models of primary cell walls in flowering plants: consistency of molecular structure with the physical properties of the walls during growth. Plant J. 3, 1–30 (1993). 28. Hooper, L. V & Gordon, J. I. Glycans as legislators of host–microbial interactions: spanning the spectrum from symbiosis to pathogenicity. Glycobiology 11, 1R–10R (2001). 29. Wimmerova, M., Mitchell, E., Sanchez, J. F., Gautier, C. & Imberty, A. Crystal structure of fungal lectin - Six-bladed beta-propeller fold and novel fucose recognition mode for Aleuria aurantia lectin. J. Biol. Chem. 278, 27059–27067 (2003). 30. Houser, J., Komarek, J., Kostlanova, N., Cioci, G., Varrot, A., Kerr, S. C., Lahmann, M., Balloy, V., Fahy, J. V, Chignard, M., Imberty, A. & Wimmerova, M. A Soluble Fucose- Specific Lectin from Aspergillus fumigatus Conidia - Structure, Specificity and Possible Role

132 Bibliography

in Fungal Pathogenicity. PLoS One 8, e83077 (2013). 31. Wimmerova, M., Kozmon, S., Necasova, I., Mishra, S. K., Komarek, J. & Koca, J. Stacking Interactions between Carbohydrate and Protein Quantified by Combination of Theoretical and Experimental Methods. PLoS One 7, (2012). 32. Mishra, S. K., Adam, J., Wimmerova, M. & Koca, J. In Silico Mutagenesis and Docking Study of Ralstonia solanacearum RSL Lectin: Performance of Docking Software To Predict Saccharide Binding. J. Chem. Inf. Model. 52, 1250–1261 (2012). 33. Mishra, S. K., Sund, J., Aqvist, J. & Koca, J. Computational prediction of monosaccharide binding free energies to lectins with linear interaction energy models. J. Comput. Chem. 33, 2340–2350 (2012). 34. Mishra, S. K., Calabró, G., Loeffler, H. H., Michel, J. Koča, J. Evaluation of Selected Classical Force Fields for Alchemical Binding Free Energy Calculations of Protein- Carbohydrate Complexes. J. Chem. Theory Comput. 11, 3333–3345 (2015). 35. Topin, J., Lelimousin, M., Arnaud, J., Audfray, A., Pérez, S., Varrot, A. & Imberty, A. The Hidden Conformation of LewisX, a Human Histo-Blood Group Antigen, Is a Determinant for Recognition by Pathogen Lectins. ACS Chem. Biol. (2016). 36. Arnaud, J., Claudinon, J., Tröndle, K., Trovaslet, M., Larson, G., Thomas, A., Varrot, A., Römer, W., Imberty, A. & Audfray, A. Reduction of Lectin Valency Drastically Changes Glycolipid Dynamics in Membranes but Not Surface Avidity. ACS Chem. Biol. 8, 1918–1924 (2013). 37. Arnaud, J., Tröndle, K., Claudinon, J., Audfray, A., Varrot, A., Römer, W. & Imberty, A. Membrane Deformation by Neolectins with Engineered Glycolipid Binding Sites. Angew. Chemie Int. Ed. 53, 9267–9270 (2014). 38. Imberty, A. & Varrot, A. Microbial recognition of human cell surface glycoconjugates. Curr. Opin. Struct. Biol. 18, 567–576 (2008). 39. Ambrosi, M., Cameron, N. R. & Davis, B. G. Lectins: tools for the molecular understanding of the glycocode. Org. Biomol. Chem. 3, 1593–1608 (2005). 40. Chiarini, L., Bevivino, A., Dalmastri, C., Tabacchioni, S. & Visca, P. Burkholderia cepacia complex species: health hazards and biotechnological potential. Trends Microbiol. 14, 277– 286 (2006). 41. Coenye, T., Mahenthiralingam, E., Henry, D., LiPuma, J. J., Laevens, S., Gillis, M., Speert, D. P. & Vandamme, P. Burkholderia ambifaria sp. nov., a novel member of the Burkholderia cepacia complex including biocontrol and cystic fibrosis-related isolates. Int. J. Syst. Evol. Microbiol. 51, 1481–1490 (2001). 42. Campbell, I. D. The march of structural biology. Nat Rev Mol Cell Biol 3, 377–381 (2002). 43. Fisher, A. J. in Amin. Acids, Pept. Proteins Org. Chem. 51–95 (Wiley-VCH Verlag GmbH & Co. KGaA, 2011). 44. Jacques, D. A. & Trewhella, J. Small-angle scattering for structural biology—Expanding the frontier while avoiding the pitfalls. Protein Sci. 19, 642–657 (2010). 45. Arnold, K., Bordoli, L., Kopp, J. & Schwede, T. The SWISS-MODEL workspace: a web-

133 Bibliography

based environment for protein structure homology modelling. Bioinforma. 22, 195–201 (2006). 46. Svergun, D. I., Ekstrom, F., Vandegriff, K. D., Malavalli, A., Baker, D. A., Nilsson, C. & Winslow, R. M. Solution structure of poly(ethylene) glycol-conjugated hemoglobin revealed by small-angle x-ray scattering: Implications for a new oxygen therapeutic. Biophys. J. 94, 173–181 (2008). 47. He, L., Wang, H., Garamus, V. M., Hanley, T., Lensch, M., Gabius, H.-J., Fee, C. J. & Middelberg, A. Analysis of MonoPEGylated Human Galectin-2 by Small-Angle X-ray and Neutron Scattering: Concentration Dependence of PEG Conformation in the Conjugate. Biomacromolecules 11, 3504–3510 (2010). 48. Pai, S. S., Hammouda, B., Hong, K. L., Pozzo, D. C., Przybycien, T. M. & Tilton, R. D. The Conformation of the Poly() Chain in Mono-PEGylated Lysozyme and Mono- PEGylated Human Growth Hormone. Bioconjug. Chem. 22, 2317–2323 (2011). 49. Cattani, G., Vogeley, L. & Crowley, P. B. Structure of a PEGylated protein reveals a highly porous double-helical assembly. Nat Chem 7, 823–828 (2015). 50. Kikhney, A. G. & Svergun, D. I. A practical guide to small angle X-ray scattering (SAXS) of flexible and intrinsically disordered proteins. FEBS Lett. 589, 2570–2577 (2015). 51. Williamson, M. P., Havel, T. F. & Wüthrich, K. Solution conformation of proteinase inhibitor IIA from bull seminal plasma by 1H nuclear magnetic resonance and distance geometry. J. Mol. Biol. 182, 295–315 (1985). 52. Wuthrich, K. The way to NMR structures of proteins. Nat Struct Mol Biol 8, 923–925 (2001). 53. Loquet, A., Bardiaux, B., Gardiennet, C., Blanchet, C., Baldus, M., Nilges, M., Malliavin, T. & Böckmann, A. 3D Structure Determination of the Crh Protein from Highly Ambiguous Solid-State NMR Restraints. J. Am. Chem. Soc. 130, 3579–3589 (2008). 54. Wagner, G. An account of NMR in structural biology. Nat. Struct. Biol. 4, 841–844 (1997). 55. Markwick, P. R. L., Malliavin, T. & Nilges, M. Structural Biology by NMR: Structure, Dynamics, and Interactions. PLoS Comput Biol 4, e1000168 (2008). 56. Waugh, D. S. Genetic tools for selective labeling of proteins with α-15N-amino acids. J. Biomol. NMR 8, 184–192 (1996). 57. Löhr, F., Reckel, S., Karbyshev, M., Connolly, P. J., Abdul-Manan, N., Bernhard, F., Moore, J. M. & Dötsch, V. Combinatorial triple-selective labeling as a tool to assist membrane protein backbone resonance assignment. J. Biomol. NMR 52, 197–210 (2012). 58. Jaipuria, G., Krishnarjuna, B., Mondal, S., Dubey, A. & Atreya, H. S. in Isot. labeling Biomol. NMR (ed. Atreya, H. S.) 95–118 (Springer Netherlands, 2012). 59. Palmer, A. G. NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 104, 3623–3640 (2004). 60. Mittermaier, A. K. & Kay, L. E. Observing biological dynamics at atomic resolution using NMR. Trends Biochem. Sci. 34, 601–611 (2009). 61. Loria, J. P., Berlow, R. B. & Watt, E. D. Characterization of enzyme motions by solution NMR relaxation dispersion. Acc Chem Res 41, 214–221 (2008).

134 Bibliography

62. Ishima, R. in NMR of Proteins and Small Biomolecules (ed. Zhu, G.) 99–122 (Springer Berlin Heidelberg, 2012). 63. Korzhnev, D. M., Salvatella, X., Vendruscolo, M., Di Nardo, A. A., Davidson, A. R., Dobson, C. M. & Kay, L. E. Low-populated folding intermediates of Fyn SH3 characterized by relaxation dispersion NMR. Nature 430, 586–590 (2004). 64. Lian, L.-Y. NMR studies of weak protein–protein interactions. Prog. Nucl. Magn. Reson. Spectrosc. 71, 59–72 (2013). 65. Mulder, F. A. A., Hon, B., Mittermaier, A., Dahlquist, F. W. & Kay, L. E. Slow Internal Dynamics in Proteins: Application of NMR Relaxation Dispersion Spectroscopy to Methyl Groups in a Cavity Mutant of T4 Lysozyme. J. Am. Chem. Soc. 124, 1443–1451 (2002). 66. Massi, F., Grey, M. J. & Palmer, A. G. Microsecond timescale backbone conformational dynamics in ubiquitin studied with NMR R(1ρ) relaxation experiments. Protein Sci. 14, 735– 742 (2005). 67. Grey, M. J., Tang, Y., Alexov, E., McKnight, C. J., Raleigh, D. P. & Palmer III, A. G. Characterizing a Partially Folded Intermediate of the Villin Headpiece Domain Under Non- denaturing Conditions: Contribution of His41 to the pH-dependent Stability of the N-terminal Subdomain. J. Mol. Biol. 355, 1078–1094 (2006). 68. Tolkatchev, D., Xu, P. Ni, F. Probing the Kinetic Landscape of Transient Peptide−Protein Interactions by Use of Peptide 15N NMR Relaxation Dispersion Spectroscopy: Binding of an Antithrombin Peptide to Human Prothrombin. J. Am. Chem. Soc. 125, 12432–12442 (2003). 69. Jarymowycz, V. A. & Stone, M. J. Fast time scale dynamics of protein backbones: NMR relaxation methods, applications, and functional consequences. Chem. Rev. 106, 1624–1671 (2006). 70. Williamson, M. P. Using chemical shift perturbation to characterise ligand binding. Prog. Nucl. Magn. Reson. Spectrosc. 73, 1–16 (2013). 71. Angulo, J. & Nieto, P. M. STD-NMR: application to transient interactions between biomolecules - a quantitative approach. Eur. Biophys. J. 40, 1357–1369 (2011). 72. Jiménez-Barbero, J. & Peters, T. in NMR Spectrosc. Glycoconjugates 289–310 (Wiley-VCH Verlag GmbH & Co. KGaA, 2002). 73. Siebert, H.-C., Jiménez-Barbero, J., André, S., Kaltner, H. & Gabius, H.-J. in Recognit. Carbohydrates Biol. Syst. Part A Gen. Proced. (ed. Enzymology, B. T.-M. in) Volume 362, 417–434 (Academic Press, 2003). 74. Pérez, S. Tvaroška, I. in Advances in Carbohydrate Chemistry and Biochemistry (ed. Biochemistry, D. H. B. T.-A. in C. C. and) Volume 71, 9–136 (Academic Press, 2014). 75. McCoy, M. A. & Wyss, D. F. Spatial Localization of Ligand Binding Sites from Electron Current Density Surfaces Calculated from NMR Chemical Shift Perturbations. J. Am. Chem. Soc. 124, 11758–11763 (2002). 76. uiderweg, E. R. P. Mapping Protein−Protein Interactions in Solution by NMR Spectroscopy. Biochemistry 41, 1–7 (2002). 77. Yongye, A. B., Calle, L., Arda, A., Jimenez-Barbero, J., Andre, S., Gabius, H. J., Martinez-

135 Bibliography

Mayorga, K. & Cudic, M. Molecular Recognition of the Thomsen-Friedenreich Antigen- Threonine Conjugate by Adhesion/Growth Regulatory Galectin-3: Nuclear Magnetic Resonance Studies and Molecular Dynamics Simulations. Biochemistry 51, 7278–7289 (2012). 78. Miller, M. C., Ribeiro, J. P., Roldos, V., Martin-Santamaria, S., Canada, F. J., Nesmelova, I. A., Andre, S., Pang, M., Klyosov, A. A., Baum, L. G., Jimenez-Barbero, J., Gabius, H. J. & Mayo, K. H. Structural aspects of binding of α-linked digalactosides to human galectin-1. Glycobiology 21, 1627–1641 (2011). 79. Ermakova, E., Miller, M. C., Nesmelova, I. V, Lopez-Merino, L., Berbis, M. A., Nesmelov, Y., Tkachev, Y. V, Lagartera, L., Daragan, V. A., Andre, S., Canada, F. J., Jimenez-Barbero, J., Solis, D., Gabius, H. J. & Mayo, K. H. Lactose binding to human galectin-7 (p53-induced gene 1) induces long-range effects through the protein resulting in increased dimer stability and evidence for positive cooperativity. Glycobiology 23, 508–523 (2013). 80. Solis, D., Mate, M. J., Lohr, M., Ribeiro, J. P., Lopez-Merino, L., Andre, S., Buzamet, E., Canada, F. J., Kaltner, H., Lenschd, M., Ruiz, F. M., Haroske, G., Wollina, U., Kloor, M., Kopitz, J., Saiz, J. L., Menendez, M., Jimenez-Barbero, J., Romero, A. & Gabius, H. J. N- domain of human adhesion/growth-regulatory galectin-9: Preference for distinct conformers and non-sialylated N-glycans and detection of ligand-induced structural changes in crystal and solution. Int. J. Biochem. Cell Biol. 42, 1019–1029 (2010). 81. Nonaka, Y., Ogawa, T., Oomizu, S., Nakakita, S., Nishi, N., Kamitori, S., Hirashima, M. & Nakamura, T. Self-association of the galectin-9 C-terminal domain via the opposite surface of the sugar-binding site. J. Biochem. 153, 463–471 (2013). 82. Fiege, B., Rabbani, S., Preston, R. C., Jakob, R. P., Zihlmann, P., Schwardt, O., Jiang, X., Maier, T. & Ernst, B. The Tyrosine Gate of the Bacterial Lectin FimH: A Conformational Analysis by NMR Spectroscopy and X-ray Crystallography. ChemBioChem 16, 1235–1246 (2015). 83. Probert, F., Whittaker, S. B. M., Crispin, M., Mitchell, D. A. & Dixon, A. M. Solution NMR analyses of the C-type carbohydrate recognition domain of DC-SIGNR protein reveal different binding modes for HIV-derived oligosaccharides and smaller glycan fragments. J. Biol. Chem. 288, 22745–22757 (2013). 84. Probert, F., Mitchell, D. A. & Dixon, A. M. NMR evidence for oligosaccharide release from the dendritic-cell specific intercellular adhesion molecule 3-grabbing non-integrin-related (CLEC4M) carbohydrate recognition domain at low pH. FEBS J. 281, 3739–3750 (2014). 85. Simpson, P. J., Xie, H., Bolam, D. N., Gilbert, H. J. & Williamson, M. P. The Structural Basis for the Ligand Specificity of Family 2 Carbohydrate-binding Modules. J. Biol. Chem. 275, 41137–41142 (2000). 86. Pröpster, J. M., Yang, F., Rabbani, S., Ernst, B., Allain, F. H.-T. & Schubert, M. Structural basis for sulfation-dependent self-glycan recognition by the human immune-inhibitory receptor Siglec-8. Proc. Natl. Acad. Sci. 113, E4170–E4179 (2016). 87. Marchetti, R., Perez, S., Arda, A., Imberty, A., Jimenez-Barbero, J., Silipo, A. & Molinaro, A.

136 Bibliography

‘Rules of Engagement’ of Protein–Glycoconjugate Interactions: A Molecular View Achievable by using NMR Spectroscopy and Molecular Modeling. ChemistryOpen 5, 274– 296 (2016). 88. Asensio, J. L., Arda, A., Canada, F. J. & Jimenez-Barbero, J. Carbohydrate-Aromatic Interactions. Acc. Chem. Res. 46, 946–954 (2013). 89. Fernández-Alonso, M. d. C., Díaz, D., Berbis, M. Á., Marcelo, F., Cañada, J. & Jiménez- Barbero, J. Protein-Carbohydrate interactions studied by NMR: From molecular recognition to drug design. Curr. Protein Pept. Sci. 13, 816–830 (2012). 90. Hudson, K. L., Bartlett, G. J., Diehl, R. C., Agirre, J., Gallagher, T., Kiessling, L. L. & Woolfson, D. N. Carbohydrate-Aromatic Interactions in Proteins. J. Am. Chem. Soc. 137, 15152–15160 (2015). 91. Jimenez-Moreno, E., Jimenez-Oses, G., Gomez, A. M., Santana, A. G., Corzana, F., Bastida, A., Jimenez-Barbero, J. & Asensio, J. L. A thorough experimental study of CH/ interactions in water: quantitative structure-stability relationships for carbohydrate/aromatic complexes. Chem. Sci. 6, 6076–6085 (2015). 92. Hsu, C.-H., Park, S., Mortenson, D. E., Foley, B. L., Wang, X., Woods, R. J., Case, D. A., Powers, E. T., Wong, C.-H., Dyson, H. J. & Kelly, J. W. The Dependence of Carbohydrate- Aromatic Interaction Strengths on the Structure of the Carbohydrate. J. Am. Chem. Soc. 138, 7636–7648 (2016). 93. Bies, C., Lehr, C.-M. & Woodley, J. F. Lectin-mediated drug targeting: history and applications. Adv. Drug Deliv. Rev. 56, 425–435 (2004). 94. Sakuma, S., Yano, T., Masaoka, Y., Kataoka, M., Hiwatari, K., Tachikawa, H., Shoji, Y., Kimura, R., Ma, H., Yang, Z., Tang, L., Hoffman, R. M. & Yamashita, S. In vitro/in vivo biorecognition of lectin-immobilized fluorescent nanospheres for human colorectal cancer cells. J. Control. Release 134, 2–10 (2009). 95. Olive, K. P., Jacobetz, M. A., Davidson, C. J., Gopinathan, A., McIntyre, D., Honess, D., Madhu, B., Goldgraben, M. A., Caldwell, M. E., Allard, D., Frese, K. K., DeNicola, G., Feig, C., Combs, C., Winter, S. P., Ireland-Zecchini, H., Reichelt, S., Howat, W. J., Chang, A., Dhara, M., Wang, L., Rückert, F., Grützmann, R., Pilarsky, C., Izeradjene, K., Hingorani, S. R., Huang, P., Davies, S. E., Plunkett, W., Egorin, M., Hruban, R. H., Whitebread, N., McGovern, K., Adams, J., Iacobuzio-Donahue, C., Griffiths, J. & Tuveson, D. A. Inhibition of Hedgehog Signaling Enhances Delivery of Chemotherapy in a Mouse Model of Pancreatic Cancer. Science 324, 1457 LP-1461 (2009). 96. Chang, Y. S., di Tomaso, E., McDonald, D. M., Jones, R., Jain, R. K. & Munn, L. L. Mosaic blood vessels in tumors: Frequency of cancer cells in contact with flowing blood. Proc. Natl. Acad. Sci. 97, 14608–14613 (2000). 97. Leong, K. H., Chung, L. Y., Noordin, M. I., Onuki, Y., Morishita, M. & Takayama, K. Lectin-functionalized carboxymethylated kappa-carrageenan microparticles for oral insulin delivery. Carbohydr. Polym. 86, 555–565 (2011). 98. Bass, L. A., Ho, S. V, Mo, J., Buckley, J. J. & Finn, R. F. in Process Chem. Pharm. Ind. Vol.

137 Bibliography

2 383–402 (CRC Press, 2007). 99. Zalipsky, S. Chemistry of polyethylene glycol conjugates with biologically active molecules. Adv. Drug Deliv. Rev. 16, 157–182 (1995). 100. Harris, J. M. & Chess, R. B. Effect of pegylation on pharmaceuticals. Nat Rev Drug Discov 2, 214–221 (2003). 101. Rajan, R. S., Li, T., Aras, M., Sloey, C., Sutherland, W., Arai, H., Briddell, R., Kinstler, O., Lueras, A. M. K., Zhang, Y., Yeghnazar, H., Treuheit, M. & Brems, D. N. Modulation of protein aggregation by polyethylene glycol conjugation: GCSF as a case study. Protein Sci. 15, 1063–1075 (2006). 102. Shashwat, B. S., Naval, A., Rajesh, P. & Khandare, J. Poly(ethylene glycol)-Prodrug Conjugates: Concept, Design, and Applications. J. Drug Deliv. 2012, 17 (2012). 103. Pfister, D. & Morbidelli, M. Process for protein PEGylation. J. Control. Release 180, 134– 149 (2014). 104. Kontos, S. & Hubbell, J. A. Drug development: longer-lived proteins. Chem. Soc. Rev. 41, 2686–2695 (2012). 105. Ellis, R. J. Macromolecular crowding: an important but neglected aspect of the intracellular environment. Curr. Opin. Struct. Biol. 11, 114–119 (2001). 106. Ellis, R. J. Macromolecular crowding: obvious but underappreciated. Trends Biochem. Sci. 26, 597–604 (2001). 107. Cronan, J. E. in eLS (John Wiley & Sons, Ltd, 2014). 108. Sundararaj, S., Guo, A., Habibi-Nazhad, B., Rouani, M., Stothard, P., Ellison, M. & Wishart, D. S. The CyberCell Database (CCDB): a comprehensive, self-updating, relational database to coordinate and facilitate in silico modeling of Escherichia coli. Nucleic Acids Res. 32, D293–D295 (2004). 109. Bennett, B. D., Kimball, E. H., Gao, M., Osterhout, R., Van Dien, S. J. & Rabinowitz, J. D. Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat Chem Biol 5, 593–599 (2009). 110. Minton, A. P. Implications of macromolecular crowding for protein assembly. Curr. Opin. Struct. Biol. 10, 34–39 (2000). 111. Minton, A. P. The Influence of Macromolecular Crowding and Macromolecular Confinement on Biochemical Reactions in Physiological Media. J. Biol. Chem. 276, 10577–10580 (2001). 112. Bismuto, E., Martelli, P. L., De Maio, A., Mita, D. G., Irace, G. & Casadio, R. Effect of molecular confinement on internal enzyme dynamics: Frequency domain fluorometry and molecular dynamics simulation studies. Biopolymers 67, 85–95 (2002). 113. Bernadó, P., de la Torre, J. G. & Pons, M. Macromolecular crowding in biological systems: hydrodynamics and NMR methods. J. Mol. Recognit. 17, 397–407 (2004). 114. Maldonado, A. Y., Burz, D. S. & Shekhtman, A. In-cell NMR spectroscopy. Prog. Nucl. Magn. Reson. Spectrosc. 59, 197–212 (2011). 115. Xu, G., Ye, Y., Liu, X., Cao, S., Wu, Q., Cheng, K., Liu, M., Pielak, G. J. & Li, C. Strategies for protein NMR in Escherichia coli. Biochemistry 53, 1971–1981 (2014).

138 Bibliography

116. Bolis, D., Politou, A. S., Kelly, G., Pastore, A. & Andrea Temussi, P. Protein Stability in Nanocages: A Novel Approach for Influencing Protein Stability by Molecular Confinement. J. Mol. Biol. 336, 203–212 (2004). 117. Phillip, Y., Sherman, E., Haran, G. & Schreiber, G. Common Crowding Agents Have Only a Small Effect on Protein-Protein Interactions. Biophys. J. 97, 875–885 (2009). 118. Rivas, G., Fernández, J. A. & Minton, A. P. Direct observation of the enhancement of noncooperative protein self-assembly by macromolecular crowding: Indefinite linear self- association of bacterial cell division protein FtsZ. Proc. Natl. Acad. Sci. 98, 3150–3155 (2001). 119. Miklos, A. C., Li, C., Sharaf, N. G. & Pielak, G. J. Volume Exclusion and Soft Interaction Effects on Protein Stability under Crowded Conditions. Biochemistry 49, 6984–6991 (2010). 120. Wang, Y., Li, C. & Pielak, G. J. Effects of Proteins on Protein Diffusion. J. Am. Chem. Soc. 132, 9392–9397 (2010). 121. Crowley, P. B., Chow, E. & Papkovskaia, T. Protein Interactions in the Escherichia coli Cytosol: An Impediment to In-Cell NMR Spectroscopy. ChemBioChem 12, 1043–1048 (2011). 122. Wang, Q., Zhuravleva, A. & Gierasch, L. M. Exploring Weak, Transient Protein–Protein Interactions in Crowded In Vivo Environments by In-Cell Nuclear Magnetic Resonance Spectroscopy. Biochemistry 50, 9225–9236 (2011). 123. Gabius, H.-J., Andre, S., Jimenez-Barbero, J., Romero, A. & Solis, D. From lectin structure to functional glycomics: principles of the sugar code. Trends Biochem. Sci. 36, 298–313 (2011). 124. Fiege, B., Rademacher, C., Cartmell, J., Kitov, P. I., Parra, F. & Peters, T. Molecular Details of the Recognition of Blood Group Antigens by a Human Norovirus as Determined by STD NMR Spectroscopy. Angew. Chemie Int. Ed. 51, 928–932 (2012). 125. de Bentzmann, S., Varrot, A. & Imberty, A. Monitoring Lectin Interactions with Carbohydrates. Methods Mol. Biol. 1149, 403–414 (2014). 126. Pederson, K., Mitchell, D. A. & Prestegard, J. H. Structural Characterization of the DC- SIGN–LewisX Complex. Biochemistry 53, 5700–5709 (2014). 127. Vanwetswinkel, S., Volkov, A. N., Sterckx, Y. G. J., Garcia-Pino, A., Buts, L., Vranken, W. F., Bouckaert, J., Roy, R., Wyns, L. & van Nuland, N. A. J. Study of the Structural and Dynamic Effects in the FimH Adhesin upon α-d-Heptyl Mannose Binding. J. Med. Chem. 57, 1416–1427 (2014). 128. Segura, M., Bricoli, B., Casnati, A., Muñoz, E. M., Sansone, F., Ungaro, R. & Vicent, C. A Prototype Calix[4]arene-Based Receptor for Carbohydrate Recognition Containing Peptide and Phosphate Binding Groups. J. Org. Chem. 68, 6296–6303 (2003). 129. Ferrand, Y., Crump, M. P. & Davis, A. P. A synthetic lectin analog for biomimetic disaccharide recognition. Science 318, 619–622 (2007). 130. Jang, Y., Natarajan, R., Ko, Y. H. & Kim, K. Cucurbit[7]uril: A High-Affinity Host for Encapsulation of Amino Saccharides and Supramolecular Stabilization of Their α-Anomers in Water. Angew. Chemie Int. Ed. 53, 1003–1007 (2014).

139 Bibliography

131. Chandramouli, N., Ferrand, Y., Lautrette, G., Kauffmann, B., Mackereth, C. D., Laguerre, M., Dubreuil, D. & Huc, I. Iterative design of a helically folded aromatic oligoamide sequence for the selective encapsulation of fructose. Nat. Chem. 7, 334–341 (2015). 132. Esko, J. D. & Sharon, N. in Essentials Glycobiol. (eds. Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W. & Etzler, M. E.) 489–500 (Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press, 2009). 133. Fujihashi, M., Peapus, D. H., Kamiya, N., Nagata, Y. & Miki, K. Crystal Structure of Fucose- Specific Lectin from Aleuria aurantia Binding Ligands at Three of Its Five Sugar Recognition Sites. Biochemistry 42, 11093–11099 (2003). 134. Canales-Mayordomo, A., Fayos, R., Angulo, J., Ojeda, R., Martin-Pastor, M., Nieto, P. M., Martin-Lomas, M., Lozano, R., Gimenez-Gallego, G. & Jimenez-Barbero, J. Backbone dynamics of a biologically active human FGF-1 monomer, complexed to a hexasaccharide heparin-analogue, by N-15 NMR relaxation methods. J. Biomol. NMR 35, 225–239 (2006). 135. Li, J., Wu, H., Hong, J., Xu, X., Yang, H., Wu, B., Wang, Y., Zhu, J., Lai, R., Jiang, X., Lin, D., Prescott, M. C. & Rees, H. H. Odorranalectin Is a Small Peptide Lectin with Potential for Drug Delivery and Targeting. PLoS One 3, (2008). 136. Vakonakis, I., Langenhan, T., Promel, S., Russ, A. & Cambell, I. D. Solution structure and sugar-binding mechanism of mouse latrophilin-1 RBL: a 7TM receptor-attached lectin-like domain. Structure 16, 944–953 (2008). 137. Nesmelova, I. V, Ermakova, E., Daragan, V. A., Pang, M., Menéndez, M., Lagartera, L., Solís, D., Baum, L. G. & Mayo, K. H. Lactose Binding to Galectin-1 Modulates Structural Dynamics, Increases Conformational Entropy, and Occurs with Apparent Negative Cooperativity. J. Mol. Biol. 397, 1209–1230 (2010). 138. Diehl, C., Engström, O., Delaine, T., Håkansson, M., Genheden, S., Modig, K., Leffler, H., Ryde, U., Nilsson, U. J. & Akke, M. Protein Flexibility and Conformational Entropy in Ligand Design Targeting the Carbohydrate Recognition Domain of Galectin-3. J. Am. Chem. Soc. 132, 14577–14589 (2010). 139. Ortega, G., Castaño, D., Diercks, T. & Millet, O. Carbohydrate Affinity for the Glucose– Galactose Binding Protein Is Regulated by Allosteric Domain Motions. J. Am. Chem. Soc. 134, 19869–19876 (2012). 140. Canales, Á., Mallagaray, Á., Berbís, M. Á., Navarro-Vázquez, A., Domínguez, G., Cañada, F. J., André, S., Gabius, H.-J., Pérez-Castells, J. & Jiménez-Barbero, J. Lanthanide-Chelating Carbohydrate Conjugates Are Useful Tools To Characterize Carbohydrate Conformation in Solution and Sensitive Sensors to Detect Carbohydrate–Protein Interactions. J. Am. Chem. Soc. 136, 8011–8017 (2014). 141. Rodriguez-Mias, R. A. & Pellecchia, M. Use of Selective Trp Side Chain Labeling To Characterize Protein−Protein and Protein−Ligand Interactions by NMR Spectroscopy. J. Am. Chem. Soc. 125, 2892–2893 (2003). 142. Bista, M., Kowalska, K., Janczyk, W., Dömling, A. & Holak, T. A. Robust NMR Screening for Lead Compounds Using Tryptophan-Containing Proteins. J. Am. Chem. Soc. 131, 7500–

140 Bibliography

7501 (2009). 143. Park, D., Ryu, K.-S., Choi, D., Kwak, J. & Park, C. Characterization and role of fucose mutarotase in mammalian cells. Glycobiology 17, 955–962 (2007). 144. Loris, R., Tielker, D., Jaeger, K.-E. & Wyns, L. Structural Basis of Carbohydrate Recognition by the Lectin LecB from Pseudomonas aeruginosa. J. Mol. Biol. 331, 861–870 (2003). 145. Berjanskii, M. V & Wishart, D. S. A Simple Method To Predict Protein Flexibility Using Secondary Chemical Shifts. J. Am. Chem. Soc. 127, 14970–14971 (2005). 146. Kleckner, I. R. & Foster, M. P. An introduction to NMR-based approaches for measuring protein dynamics. Biochim. Biophys. Acta-Proteins Proteomics 1814, 942–968 (2011). 147. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J. & Bax, A. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277– 293 (1995). 148. Vranken, W. F., Boucher, W., Stevens, T. J., Fogh, R. H., Pajon, A., Llinas, P., Ulrich, E. L., Markley, J. L., Ionides, J. & Laue, E. D. The CCPN data model for NMR spectroscopy: Development of a software pipeline. Proteins-Structure Funct. Bioinforma. 59, 687–696 (2005). 149. Sattler, M., Schleucher, J. & Griesinger, C. Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog. Nucl. Magn. Reson. Spectrosc. 34, 93–158 (1999). 150. Yamazaki, T., Forman-Kay, J. D. & Kay, L. E. Two-dimensional NMR experiments for correlating C13β and HΔ/ε chemical shifts of aromatic residues in 13C-labeled proteins via scalar couplings. J. Am. Chem. Soc. 115, 11054–11055 (1993). 151. Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Formankay, J. D. & Kay, L. E. Backbone Dynamics of a Free and a Phosphopeptide-Complexed Src Homology 2 Domain Studied by 15N NMR Relaxation. Biochemistry 33, 5984–6003 (1994). 152. Rule, G. S. & Hitchens, T. K. in Focus Struct. Biol. (Springer, Dordrecht, Netherlands, 2006). 153. Mandel, A. M., Akke, M. & Palmer III, A. G. Backbone Dynamics of Escherichia coli Ribonuclease HI: Correlations with Structure and Function in an Active Enzyme. J. Mol. Biol. 246, 144–163 (1995). 154. Palmer III, A. G., Kroenke, C. D. & Loria, J. P. in Methods Enzymol. (eds. Thomas L. James, V. D. & Uli, S.) Volume 339, 204–238 (Academic Press, 2001). 155. Bieri, M. & Gooley, P. Automated NMR relaxation dispersion data analysis using NESSY. BMC Bioinformatics 12, 421 (2011). 156. Farrow, N., Zhang, O., Forman-Kay, J. & Kay, L. A heteronuclear correlation experiment for simultaneous determination of 15N longitudinal decay and chemical exchange rates of systems in slow equilibrium. J. Biomol. NMR 4, 727–734 (1994). 157. Miloushev, V. Z., Bahna, F., Ciatto, C., Ahlsen, G., Honig, B., Shapiro, L. & Palmer III, A. G. Dynamic Properties of a Type II Cadherin Adhesive Domain: Implications for the Mechanism of Strand-Swapping of Classical Cadherins. Structure 16, 1195–1205 (2008).

141 Bibliography

158. Gabius, H.-J., Siebert, H.-C., André, S., Jiménez-Barbero, J. & Rüdiger, H. Chemical Biology of the Sugar Code. ChemBioChem 5, 740–764 (2004). 159. Giuliani, M., Morbioli, I., Sansone, F. & Casnati, A. Moulding calixarenes for biomacromolecule targeting. Chem. Commun. 51, 14140–14159 (2015). 160. Roy, R., Murphy, V. P. & Gabius, H.-J. Multivalent Carbohydrate-Lectin Interactions: How Synthetic Chemistry Enables Insights into Nanometric Recognition. Mol. 21, (2016). 161. Fang, T., Gu, Y., Huang, W. & Boons, G.-J. Mechanism of Glycosylation of Anomeric Sulfonium Ions. J. Am. Chem. Soc. 138, 3002–3011 (2016). 162. Tomašić, T., Rabbani, S., Gobec, M., Raščan, I. M., Podlipnik, Č., Ernst, B. Anderluh, M. Branched α-d-mannopyranosides: A new class of potent FimH antagonists. Medchemcomm 5, 1247–1253 (2014). 163. Miller, M. C., Ippel, H., Suylen, D., Klyosov, A. A., Traber, P. G., Hackeng, T. & Mayo, K. H. Binding of polysaccharides to human galectin-3 at a noncanonical site in its carbohydrate recognition domain. Glycobiol. 26, 88–99 (2016). 164. Ippel, H., Miller, M. C., Vértesy, S., Zhang, Y., Cañada, F. J., Suylen, D., Umemoto, K., Romanò, C., Hackeng, T., Tai, G., Leffler, H., Kopitz, J., André, S., Kübler, D., Jiménez- Barbero, J., Oscarson, S., Gabius, H.-J. & Mayo, K. H. Intra- and inter-molecular interactions of human galectin-3: assessment by full-assignment-based NMR. Glycobiol. 26 (8): 888-903 (2016). 165. Antonik, P. M., Volkov, A. N., Broder, U. N., Lo Re, D., Van Nuland, N. A. J. & Crowley, P. B. Anomer-Specific Recognition and Dynamics in a Fucose-binding Lectin. Biochemistry 55, 1195−1203 (2016). 166. Surolia, A., Sharon, N. & Schwarz, F. P. Thermodynamics of Monosaccharide and Disaccharide Binding to Erythrina corallodendron Lectin. J. Biol. Chem. 271, 17697–17703 (1996). 167. Shirai, T., Watanabe, Y., Lee, M., Ogawa, T. & Muramoto, K. Structure of Rhamnose- binding Lectin CSL3: Unique Pseudo-tetrameric Architecture of a Pattern Recognition Protein. J. Mol. Biol. 391, 390–403 (2009). 168. Peter C. Collins, R. J. F. Monosaccharides: Their Chemistry and Their Roles in Natural Products. (John Wiley and Sons, 1995). 169. Bermúdez, C., Peña, I., Cabezas, C., Daly, A. M. & Alonso, J. L. Unveiling the Sweet Conformations of D-Fructopyranose. ChemPhysChem 14, 893–895 (2013). 170. Saraboji, K., Håkansson, M., Genheden, S., Diehl, C., Qvist, J., Weininger, U., Nilsson, U. J., Leffler, H., Ryde, U., Akke, M. & Logan, D. T. The Carbohydrate-Binding Site in Galectin-3 Is Preorganized To Recognize a Sugarlike Framework of Oxygens: Ultra-High-Resolution Structures and Water Dynamics. Biochemistry 51, 296–306 (2012). 171. Seemann, J. E. & Schulz, G. E. Structure and mechanism of L-fucose isomerase from Escherichia coli Edited by R. Huber. J. Mol. Biol. 273, 256–268 (1997). 172. Takeda, K., Yoshida, H., Izumori, K. & Kamitori, S. X-ray structures of Bacillus pallidus D- arabinose isomerase and its complex with l-fucitol. Biochim. Biophys. Acta - Proteins

142 Bibliography

Proteomics 1804, 1359–1368 (2010). 173. Higgins, M. A., Suits, M. D., Marsters, C. & Boraston, A. B. Structural and Functional Analysis of Fucose-Processing Enzymes from Streptococcus pneumoniae. J. Mol. Biol. 426, 1469–1482 (2014). 174. Sharma, K. D., Karki, S., Thakur, N. S. & Attri, S. Chemical composition, functional properties and processing of carrot - a review. J. Food Sci. Technol. 49, 22–32 (2012). 175. Uy, R. & Wold, F. 1,4-Butanediol diglycidyl ether coupling of carbohydrates to sepharose: Affinity adsorbents for lectins and glycosidases. Anal. Biochem. 81, 98–107 (1977). 176. Hossain, M. B., Rai, D. K., Brunton, N. P., Martin-Diana, A. B. & Barry-Ryan, C. Characterization of Phenolic Composition in Lamiaceae Spices by LC-ESI-MS/MS. J. Agric. Food Chem. 58, 10576–10581 (2010). 177. Hossain, M. B., Camphuis, G., Aguiló-Aguayo, I., Gangopadhyay, N. & Rai, D. K. Antioxidant activity guided separation of major polyphenols of marjoram (Origanum majorana L.) using flash chromatography and their identification by liquid chromatography coupled with electrospray ionization tandem mass spectrometry†. J. Sep. Sci. 37, 3205–3213 (2014). 178. Kochendoerfer, G. G., Chen, S.-Y., Mao, F., Cressman, S., Traviglia, S., Shao, H., Hunter, C. L., Low, D. W., Cagle, E. N., Carnevali, M., Gueriguian, V., Keogh, P. J., Porter, H., Stratton, S. M., Wiedeke, M. C., Wilken, J., Tang, J., Levy, J. J., Miranda, L. P., Crnogorac, M. M., Kalbag, S., Botti, P., Schindler-Horvat, J., Savatski, L., Adamson, J. W., Kung, A., Kent, S. B. H. & Bradburne, J. A. Design and Chemical Synthesis of a Homogeneous Polymer- Modified Erythropoiesis Protein. Science 299, 884–887 (2003). 179. Tao, L., Mantovani, G., Lecolley, F. Haddleton, D. M. α-Aldehyde Terminally Functional Methacrylic Polymers from Living Radical Polymerization: Application in Protein Conjugation ‘Pegylation’. J. Am. Chem. Soc. 126, 13220–13221 (2004). 180. Canalle, L. A., Lowik, D. W. P. M. & van Hest, J. C. M. Polypeptide-polymer bioconjugates. Chem. Soc. Rev. 39, 329–353 (2010). 181. Keefe, A. J. & Jiang, S. Poly(zwitterionic)protein conjugates offer increased stability without sacrificing binding affinity or bioactivity. Nat. Chem. 4, 59–63 (2012). 182. Mancini, R. J., Lee, J. & Maynard, H. D. Trehalose Glycopolymers for Stabilization of Protein Conjugates to Environmental Stressors. J. Am. Chem. Soc. 134, 8474–8479 (2012). 183. Nguyen, T. H., Kim, S.-H., Decker, C. G., Wong, D. Y., Loo, J. A. & Maynard, H. D. A Heparin-Mimicking Polymer Conjugate Stabilizes Basic Fibroblast Growth Factor (bFGF). Nat. Chem. 5, 221–227 (2013). 184. van Eldijk, M. B., Smits, F. C. M., Vermue, N., Debets, M. F., Schoffelen, S. & van Hest, J. C. M. Synthesis and Self-Assembly of Well-Defined Elastin-Like Polypeptide–Poly(ethylene glycol) Conjugates. Biomacromolecules 15, 2751–2759 (2014). 185. Broyer, R. M., Grover, G. N. & Maynard, H. D. Emerging synthetic approaches for protein- polymer conjugations. Chem. Commun. 47, 2212–2226 (2011). 186. Caliceti, P. & Veronese, F. M. Pharmacokinetic and biodistribution properties of

143 Bibliography

poly(ethylene glycol)–protein conjugates. Adv. Drug Deliv. Rev. 55, 1261–1277 (2003). 187. Peleg-Shulman, T., Tsubery, H., Mironchik, M., Fridkin, M., Schreiber, G. & Shechter, Y. Reversible PEGylation: A Novel Technology To Release Native Interferon α2 over a Prolonged Time Period. J. Med. Chem. 47, 4897–4904 (2004). 188. Pasut, G., Mero, A., Caboi, F., Scaramuzza, S., Sollai, L. eronese, F. M. A New PEG−β- Alanine Active Derivative for Releasable Protein Conjugation. Bioconjug. Chem. 19, 2427– 2431 (2008). 189. Serwa, R., Majkut, P., Horstmann, B., Swiecicki, J.-M., Gerrits, M., Krause, E. & Hackenberger, C. P. R. Site-specific PEGylation of proteins by a Staudinger-phosphite reaction. Chem. Sci. 1, 596–602 (2010). 190. Chen, J., Zhao, M., Feng, F., Sizovs, A. Wang, J. Tunable Thioesters as ‘Reduction’ Responsive Functionality for Traceless Reversible Protein PEGylation. J. Am. Chem. Soc. 135, 10938–10941 (2013). 191. Boerakker, M. J., Hannink, J. M., Bomans, P. H. H., Frederik, P. M., Nolte, R. J. M., Meijer, E. M. & Sommerdijk, N. A. J. M. Giant Amphiphiles by Cofactor Reconstitution. Angew. Chemie Int. Ed. 41, 4239–4241 (2002). 192. Sandanaraj, B. S., Vutukuri, D. R., Simard, J. M., Klaikherd, A., Hong, R., Rotello, V. M. & Thayumanavan, S. Noncovalent Modification of Chymotrypsin Surface Using an Amphiphilic Polymer Scaffold: Implications in Modulating Protein Function. J. Am. Chem. Soc. 127, 10693–10698 (2005). 193. Wurm, F., Klos, J., Rader, H. J. & Frey, H. Synthesis and Noncovalent Protein Conjugation of Linear-Hyperbranched PEG-Poly(glycerol) alpha,omega(n)-Telechelics. J. Am. Chem. Soc. 131, 7954–+ (2009). 194. Biedermann, F., Rauwald, U., Zayed, J. M. & Scherman, O. A. A supramolecular route for reversible protein-polymer conjugation. Chem. Sci. 2, 279–286 (2011). 195. Khondee, S., Olsen, C. M., Zeng, Y., Middaugh, C. R. & Berkland, C. Noncovalent PEGylation by Polyanion Complexation as a Means To Stabilize Keratinocyte Growth Factor-2 (KGF-2). Biomacromolecules 12, 3880–3894 (2011). 196. Mero, A., Ishino, T., Chaiken, I., Veronese, F. M. & Pasut, G. Multivalent and Flexible PEG- Nitrilotriacetic Acid Derivatives for Non-covalent Protein Pegylation. Pharm. Res. 28, 2412– 2421 (2011). 197. Kurinomaru, T., Tomita, S., Kudo, S., Ganguli, S., Nagasaki, Y. & Shiraki, K. Improved Complementary Polymer Pair System: Switching for Enzyme Activity by PEGylated Polymers. Langmuir 28, 4334–4338 (2012). 198. Mueller, C., Capelle, M. A. H., Seyrek, E., Martel, S., Carrupt, P.-A., Arvinte, T. & Borchard, G. Noncovalent PEGylation: Different effects of dansyl-, L-tryptophan–, phenylbutylamino-, benzyl- and cholesteryl-PEGs on the aggregation of salmon calcitonin and lysozyme. J. Pharm. Sci. 101, 1995–2008 (2012). 199. Kim, T. H., Swierczewska, M., Oh, Y., Kim, A., Jo, D. G., Park, J. H., Byun, Y., Sadegh- Nasseri, S., Pomper, M. G., Lee, K. C. & Lee, S. Mix to Validate: A Facile, Reversible

144 Bibliography

PEGylation for Fast Screening of Potential Therapeutic Proteins In Vivo. Angew. Chemie Int. Ed. 52, 6880–6884 (2013). 200. Wei, K., Li, J., Chen, G. & Jiang, M. Dual Molecular Recognition Leading to a Protein– Polymer Conjugate and Further Self-Assembly. ACS Macro Lett. 2, 278–283 (2013). 201. Ambrosio, E., Barattin, M., Bersani, S., Shubber, S., Uddin, S., van der Walle, C. F., Caliceti, P. & Salmaso, S. A novel combined strategy for the physical PEGylation of polypeptides. J. Control. Release 226, 35–46 (2016). 202. Sakai, F., Yang, G., Weiss, M. S., Liu, Y., Chen, G. & Jiang, M. Protein crystalline frameworks with controllable interpenetration directed by dual supramolecular interactions. Nat Commun 5, (2014). 203. Ng, S., Lin, E., Kitov, P. I., Tjhung, K. F., Gerlits, O. O., Deng, L., Kasper, B., Sood, A., Paschal, B. M., Zhang, P., Ling, C.-C., Klassen, J. S., Noren, C. J., Mahal, L. K., Woods, R. J., Coates, L. & Derda, R. Genetically Encoded Fragment-Based Discovery of Glycopeptide Ligands for Carbohydrate-Binding Proteins. J. Am. Chem. Soc. 137, 5248–5251 (2015). 204. Kula, M.-R., Johansson, G. & Bückmann, A. F. Large-Scale Isolation of Enzymes. Biochem. Soc. Trans. 7, 1–5 (1979). 205. Johansson, G. & Joelsson, M. Specifically increased solubility of enzymes in polyethyleneglycol solutions using polymer-bound triazine dyes. Anal. Biochem. 158, 104– 110 (1986). 206. Kopperschläger, G. & Kirchberger, J. in Aqueous Two-Phase Syst. Methods Protoc. (ed. Hatti-Kaul, R.) 329–344 (Humana Press, 2000). 207. Ravera, E., Ciambellotti, S., Cerofolini, L., Martelli, T., Kozyreva, T., Bernacchioni, C., Giuntini, S., Fragai, M., Turano, P. & Luchinat, C. Solid-State NMR of PEGylated Proteins. Angew. Chemie Int. Ed. 55, 2446–2449 (2016). 208. Lu, Z.-R., Gao, S.-Q., Kopečkov , P. Kopeček, J. Synthesis of Bioadhesive Lectin-HPMA Copolymer−Cyclosporin Conjugates. Bioconjug. Chem. 11, 3–7 (2000). 209. Neutsch, L., Wirth, E.-M., Spijker, S., Pichl, C., Kählig, H., Gabor, F. & Wirth, M. Synergistic targeting/prodrug strategies for intravesical drug delivery — Lectin-modified PLGA microparticles enhance cytotoxicity of stearoyl gemcitabine by contact-dependent transfer. J. Control. Release 169, 62–72 (10AD). 210. Fee, C. J. & Van Alstine, J. M. Prediction of the viscosity radius and the size exclusion chromatography behavior of PEGylated proteins. Bioconjug. Chem. 15, 1304–1313 (2004). 211. Fee, C. J. & Van Alstine, J. M. in Protein Purif. 339–362 (John Wiley & Sons, Inc., 2011). 212. Crowley, P. B., Brett, K. & Muldoon, J. NMR Spectroscopy Reveals Cytochrome c– Poly(ethylene glycol) Interactions. ChemBioChem 9, 685–688 (2008). 213. Franke, D. & Svergun, D. I. DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J. Appl. Crystallogr. 42, 342–346 (2009). 214. Petoukhov, M. V, Franke, D., Shkumatov, A. V, Tria, G., Kikhney, A. G., Gajda, M., Gorba, C., Mertens, H. D. T., Konarev, P. V & Svergun, D. I. New developments in the ATSAS program package for small-angle scattering data analysis. J. Appl. Crystallogr. 45, 342–350

145 Bibliography

(2012). 215. Kolb, H. C., Finn, M. G. & Sharpless, K. B. Click Chemistry: Diverse Chemical Function from a Few Good Reactions. Angew. Chemie Int. Ed. 40, 2004–2021 (2001). 216. Eissa, A. M., Khosravi, E. & Cimecioglu, A. L. A versatile method for functionalization and grafting of 2-hydroxyethyl cellulose (HEC) via Click chemistry. Carbohydr. Polym. 90, 859– 869 (2012). 217. Ni, J., Singh, S. & Wang, L.-X. Synthesis of Maleimide-Activated Carbohydrates as Chemoselective Tags for Site-Specific Glycosylation of Peptides and Proteins. Bioconjug. Chem. 14, 232–238 (2003). 218. Pervushin, K., Riek, R., Wider, G. & Wüthrich, K. Attenuated T2 relaxation by mutual cancellation of dipole–dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc. Natl. Acad. Sci. 94, 12366–12371 (1997). 219. Weigelt, J. Single Scan, Sensitivity- and Gradient-Enhanced TROSY for Multidimensional NMR Experiments J. Am. Chem. Soc. 1998, 120, 10778−10779. J. Am. Chem. Soc. 120, 12706 (1998). 220. Pernot, P., Theveneau, P., Giraud, T., Fernandes, N. R., Nurizzo, D., Spruce, D., Surr, J., McSweeney, S., Round, A., Felisaz, F., Foedinger, L., Gobbo, A., Huet, J., Villard, C. & Cipriani, F. New beamline dedicated to solution scattering from biological macromolecules at the ESRF. J. Phys. Conf. Ser. 247, 12009 (2010). 221. Pernot, P., Round, A., Barrett, R., De Maria Antolinos, A., Gobbo, A., Gordon, E., Huet, J., Kieffer, J., Lentini, M., Mattenet, M., Morawe, C., Mueller-Dieckmann, C., Ohlsson, S., Schmid, W., Surr, J., Theveneau, P., Zerrad, L. & McSweeney, S. Upgraded ESRF BM29 beamline for SAXS on macromolecules in solution. J. Synchrotron Radiat. 20, 660–664 (2013). 222. Round, A., Felisaz, F., Fodinger, L., Gobbo, A., Huet, J., Villard, C., Blanchet, C. E., Pernot, P., McSweeney, S., Roessle, M., Svergun, D. I. & Cipriani, F. BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments. Acta Crystallogr. Sect. D Biol. Crystallogr. 71, 67–75 (2015). 223. Round, A., Brown, E., Marcellin, R., Kapp, U., Westfall, C. S., Jez, J. M. & Zubieta, C. Determination of the GH3.12 protein conformation through HPLC-integrated SAXS measurements combined with X-ray crystallography. Acta Crystallogr. Sect. D 69, 2072– 2080 (2013). 224. Brennich, M. E., Kieffer, J., Bonamis, G., De Maria Antolinos, A., Hutin, S., Pernot, P. & Round, A. Online data analysis at the ESRF bioSAXS beamline, BM29. J. Appl. Crystallogr. 49, 203–212 (2016). 225. Petoukhov, M. V, Konarev, P. V, Kikhney, A. G. & Svergun, D. I. ATSAS 2.1 towards automated and web-supported small-angle scattering data analysis. J. Appl. Crystallogr. 40, s223--s228 (2007). 226. Konarev, P. V, Petoukhov, M. V, Volkov, V. V & Svergun, D. I. ATSAS 2.1, a program

146 Bibliography

package for small-angle scattering data analysis. J. Appl. Crystallogr. 39, 277–286 (2006). 227. Guinier, A. Diffraction of x-rays of very small angles-application to the study of ultramicroscopic phenomenon. Ann. Phys. (Paris). 12, 161–237 (1939). 228. Porod, G. in Small angle X-ray Scatt. (eds. Glatter, O. & Kratky, O.) 17–51 (Academic Press, 1982). 229. Svergun, D. I. Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J. Appl. Crystallogr. 25, 495–503 (1992). 230. Konarev, P. V, Volkov, V. V, Sokolova, A. V, Koch, M. H. J. & Svergun, D. I. PRIMUS: a Windows PC-based system for small-angle scattering data analysis. J. Appl. Crystallogr. 36, 1277–1282 (2003). 231. Volkov, V. V & Svergun, D. I. Uniqueness of ab initio shape determination in small-angle scattering. J. Appl. Crystallogr. 36, 860–864 (2003). 232. Svergun, D., Barberato, C. & Koch, M. H. J. CRYSOL a Program to Evaluate X-ray Solution Scattering of Biological Macromolecules from Atomic Coordinates. J. Appl. Crystallogr. 28, 768–773 (1995). 233. Kozin, M. B. & Svergun, D. I. Automated matching of high- and low-resolution structural models. J. Appl. Crystallogr. 34, 33–41 (2001). 234. Laue, T. & Demeler, B. A postreductionist framework for protein biochemistry. Nat Chem Biol 7, 331–334 (2011). 235. Kyne, C. & Crowley, P. B. Grasping the nature of the cell interior: from Physiological Chemistry to Chemical Biology. FEBS J. 283, 3016–3028 (2016). 236. Serber, Z. & Dotsch, V. In-cell NMR spectroscopy. Biochemistry 40, 14317–14323 (2001). 237. Li, C., Charlton, L. M., Lakkavaram, A., Seagle, C., Wang, G., Young, G. B., Macdonald, J. M. & Pielak, G. J. Differential Dynamical Effects of Macromolecular Crowding on an Intrinsically Disordered Protein and a Globular Protein: Implications for In-Cell NMR Spectroscopy. J. Am. Chem. Soc. 130, 6310–6311 (2008). 238. Gierasch, L. M. & Gershenson, A. Post-reductionist protein science, or putting Humpty Dumpty back together again. Nat Chem Biol 5, 774–777 (2009). 239. Spitzer, J. From Water and Ions to Crowded Biomacromolecules: In Vivo Structuring of a Prokaryotic Cell. Microbiol. Mol. Biol. Rev. 75, 491–506 (2011). 240. Spitzer, J. & Poolman, B. How crowded is the prokaryotic cytoplasm? FEBS Lett. 587, 2094– 2098 (2013). 241. Martorell, G., Adrover, M., Kelly, G., Temussi, P. A. & Pastore, A. A natural and readily available crowding agent: NMR studies of proteins in hen egg white. Proteins Struct. Funct. Bioinforma. 79, 1408–1415 (2011). 242. Sakai, T., Tochio, H., Tenno, T., Ito, Y., Kokubo, T., Hiroaki, H. & Shirakawa, M. In-cell NMR spectroscopy of proteins inside Xenopus laevis oocytes. J. Biomol. NMR 36, 179–188 (2006). 243. Selenko, P. & Wagner, G. Looking into live cells with in-cell NMR spectroscopy. J. Struct. Biol. 158, 244–253 (2007).

147 Bibliography

244. Li, C., Wang, Y. & Pielak, G. J. Translational and Rotational Diffusion of a Small Globular Protein under Crowded Conditions. J. Phys. Chem. B 113, 13390–13392 (2009). 245. Theillet, F.-X., Binolfi, A., Frembgen-Kesner, T., Hingorani, K., Sarkar, M., Kyne, C., Li, C., Crowley, P. B., Gierasch, L., Pielak, G. J., Elcock, A. H., Gershenson, A. & Selenko, P. Physicochemical Properties of Cells and Their Effects on Intrinsically Disordered Proteins (IDPs). Chem. Rev. 114, 6661–6714 (2014). 246. Monteith, W. B., Cohen, R. D., Smith, A. E., Guzman-Cisneros, E. & Pielak, G. J. Quinary structure modulates protein stability in cells. Proc. Natl. Acad. Sci. 112, 1739–1742 (2015). 247. Barnes, C. O. & Pielak, G. J. In-cell protein NMR and protein leakage. Proteins 79, 347–351 (2011). 248. Aguilar, X., F. Weise, C., Sparrman, T., Wolf-Watz, M. & Wittung-Stafshede, P. Macromolecular Crowding Extended to a Heptameric System: The Co-chaperonin Protein 10. Biochemistry 50, 3034–3044 (2011). 249. Gómez-Guillén, M. C., Giménez, B., López-Caballero, M. E. & Montero, M. P. Functional and bioactive properties of collagen and gelatin from alternative sources: A review. Food Hydrocoll. 25, 1813–1827 (2011). 250. Saltzman, W. M., Radomsky, M. L., Whaley, K. J. & Cone, R. A. Antibody diffusion in human cervical mucus. Biophys. J. 66, 508–515 (1994). 251. Varghese, J. S., Chellappa, N. & Fathima, N. N. Gelatin–carrageenan hydrogels: Role of pore size distribution on drug delivery process. Colloids Surfaces B Biointerfaces 113, 346–351 (2014). 252. Stellwagen, N. C. Electrophoresis of DNA in agarose gels, polyacrylamide gels and in free solution. Electrophoresis 30, S188–S195 (2009). 253. Pluen, A., Netti, P. A., Jain, R. K. & Berk, D. A. Diffusion of Macromolecules in Agarose Gels: Comparison of Linear and Globular Configurations. Biophys. J. 77, 542–552 (1999). 254. Bell, J. D., Brown, J. C. C., Kubal, G. & Sadler, P. J. NMR-invisible lactate in blood plasma. FEBS Lett. 235, 81–86 (1988). 255. Foxall, P. J. D., Mellotte, G. J., Bending, M. R., Lindon, J. C. & Nicholson, J. K. NMR spectroscopy as a novel approach to the monitoring of renal transplant function. Kidney Int. 43, 234–245 (1993). 256. Farrugia, A. Albumin Usage in Clinical Medicine: Tradition or Therapeutic? Transfus. Med. Rev. 24, 53–63 (2010). 257. Fanali, G., di Masi, A., Trezza, V., Marino, M., Fasano, M. & Ascenzi, P. Human serum albumin: From bench to bedside. Mol. Aspects Med. 33, 209–290 (2012). 258. Mori, S., Abeygunawardana, C., Johnson, M. O. & Vanzijl, P. C. M. Improved Sensitivity of HSQC Spectra of Exchanging Protons at Short Interscan Delays Using a New Fast HSQC (FHSQC) Detection Scheme That Avoids Water Saturation. J. Magn. Reson. Ser. B 108, 94– 98 (1995). 259. Antonik, P. M., Eissa, A. M., Round, A. R., Cameron, N. R. & Crowley, P. B. Noncovalent PEGylation via Lectin–Glycopolymer Interactions. Biomacromolecules 17, 2719–2725

148 Bibliography

(2016). 260. Kyne, C., Ruhle, B., Gautier, V. W. & Crowley, P. B. Specific ion effects on macromolecular interactions in Escherichia coli extracts. Protein Sci. 24, 310–318 (2015). 261. Stainsby, G. Gelatin Gels, in Advances in Meat Research: Collagen as Food. (Van Nostrand Reinhold, 1987). 262. Slade, L. & Levine, H. Polymer-Chemical Properties of Gelatin in Foods, in Advances in Meat Research: Collagen as Food. (Van Nostrand Reinhold, 1987). 263. Leuenberger, B. H. Investigation of viscosity and gelation properties of different mammalian and fish gelatins. Food Hydrocoll. 5, 353–361 (1991). 264. Hermanson, G. T. in Bioconjugate Tech. 169–212 (Academic Press, 2008). 265. Sarkar, M., Smith, A. E. & Pielak, G. J. Impact of reconstituted cytosol on protein stability. Proc. Natl. Acad. Sci. U. S. A. 110, 19342–19347 (2013). 266. Arakawa, T., Kita, Y. & Timasheff, S. N. Protein precipitation and denaturation by dimethyl sulfoxide. Biophys. Chem. 131, 62–70 (2007).

149

150

151

Acknowledgements

Since joining the Crowley lab, my journey to Ph.D. was enriched by many people with their constant support and guidance.

Firstly, I want to thank my supervisor Dr Peter Crowley for his constant support and patience especially in the last months. His energy and enthusiasm were one of the major factors that helped to finish this journey with the happy end.

I am also grateful to my Teagasc supervisor Dr Dilip Rai for all the helpful discussions and understanding before and after my placement in Teagasc Food Research Centre, Ashtown, Dublin.

I am very grateful for the financial support of the Teagasc Walsh fellowship and the write-up bursary from National University of Ireland, Galway.

I wish to thank various people for their contribution to this Ph.D.

 The Crowley group members provided useful discussions on my work and a friendly atmosphere in and outside the lab.

 Prof. Michaela Wimmerová for providing the pET25rsl plasmid.

 The School of Chemistry technical staff, especially Dr Róisín Doohan for NMR facility maintenance and helpful comments on NMR experiment design. I also thank Marian Vignoles for the acquisition of mass spectrometry data.

 I wish to thank Prof. Nico van Nuland and Dr Alex Volkov (Vrije Universiteit Brussel, Belgium) for allowing me to work in the lab and all the help and NMR knowledge shared in the early stage of my PhD.

 I am also grateful to our collaborators Dr Ahmed Eissa, Prof. Neil Cameron (Durham University) and Dr Adam Round (ESRF Grenoble) for their contributions to the project on noncovalent PEGylation.

 Leah Kearney for the initial study of Cys modification in RSL.

 Michael Cleary for the initial studies of RSL in gelatin and agarose.

Na koniec chciałbym podziękować rodzicom za cierpliwość i wytrwałość przez te ostatnie pięć lat na wygnaniu. Niech również mają swoje dwie linijki.

152

Curriculum Vitae

Paweł Michal Antonik was born on February the 28th 1987 in Zawiercie, Poland. In 2006 Paweł started his Master of Science degree in Chemistry at Wroclaw University of Technology, Poland. Also during this period (November 2009 – February 2010) Paweł was an Erasmus programme visiting student in the Crowley lab. In autumn 2011 Paweł graduated with M.Sc Eng. after defending his master thesis about chemical degradation of the macromolecular polyphenolic compounds isolated from selected medicinal plants under the supervision of Dr Izabela Pawlaczyk. The same year he decided to rejoin the Crowley lab at National University of Ireland, Galway to pursue a Ph.D. in protein science. This work focused on the characterisation of lectin-carbohydrate interactions and structural characterization of the lectin using NMR spectroscopy and was performed under the supervision of Dr Peter Crowley. As a part of his PhD, Pawel performed two placements in the laboratory of Prof. Nico van Nuland at the Vrije Universiteit Brussel, Belgium where he worked under the supervision of Dr Alex Volkov. The work involved NMR assignments and dynamics of Ralstonia solanacearum lectin. The results of these projects are presented in this thesis.

153

Publications

Antonik, PM, Eissa, AM, Round, AR, Cameron, NR, Crowley, PB Noncovalent PEGylation via Lectin–Glycopolymer Interactions. Biomacromolecules 2016, 17, 2719–2725.

Antonik, PM, Volkov, AN, Broder, UN, Lo Re, D, Van Nuland, NAJ, Crowley, PB Anomer-Specific Recognition and Dynamics in a Fucose-binding Lectin. Biochemistry 2016, 55, 1195−1203.

Oral Presentations

Jun. 2016 Lectin–Glycopolymer Interactions as a Route to Noncovalent PEGylation, 68th Irish Universities Chemistry Research Colloquium, UCC, Cork, Ireland

Conferences & Workshops

Jun. 2016 68th Irish Universities Chemistry Research Colloquium, Cork, Ireland Oct. 2014 Disordered Motifs and Domains in Cell Control, Dublin, Ireland* Sept. 2014 Soft-Inter 2014: Soft Interactions in Biological and Biomimetic Self- Assemblies, Saint Malo, France Jun. 2014 66th Irish Universities Chemistry Research Colloquium, Galway, Ireland Apr. 2013 Glycoproteins: From structure to disease. Palma de Mallorca, Spain* Apr. 2013 2nd Irish NMR Meeting, Dublin, Ireland* May 2012 EuroMAR, Dublin, Ireland* *Selected to Present a Poster

Research Visits

Mar. – Apr. and Sept. 2012, placements in Jean Jeener NMR Centre, at Vrije Universiteit Brussel, Belgium to perform NMR assignments and Dynamic studies of RSL Jun. – Aug. 2012, placement in Teagasc Food Research Centre, Ashtown, Dublin 15, Ireland, to prepare plant extracts for NMR screening

154

155 Appendix A

Appendix A

Figure A1. (A) Typical Cα and Cβ chemical shifts for 20 amino acids. (B) The schemes of magnetization transfer in CBCANNH (top) and CBCA(CO)NNH (down) experiments (http://www.protein-nmr.org.uk/) that were used in (C) sequential assignment presented on example of last four RSL sequence residues.

156 Appendix B

Appendix B

Table B1. NMR assignments of sugar-free RSL and RSL bound to D-Mannose or L-Fucose*

Residue Residue δ δ δ Atom No. Type Sugar-free D-Mannose L-Fucose 1 SER CA 57.62 1 SER CB 63.22 2 SER H 8.65 2 SER C 174.21 174.23 174.21 2 SER CA 58.30 58.27 58.30 2 SER CB 65.46 65.40 65.52 2 SER N 116.85 3 VAL H 7.97 7.94 7.93 3 VAL C 173.62 173.60 173.57 3 VAL CA 61.93 61.89 61.87 3 VAL CB 32.61 32.59 32.62 3 VAL N 116.18 116.04 115.85 4 GLN H 7.63 7.57 7.58 4 GLN HE21 6.99 6.98 6.97 4 GLN HE22 7.60 7.54 7.57 4 GLN C 175.35 175.38 175.39 4 GLN CA 54.74 54.72 54.77 4 GLN CB 31.31 31.32 31.26 4 GLN CG 33.67 33.70 4 GLN CD 179.05 4 GLN N 120.58 120.38 120.43 4 GLN NE2 111.43 111.46 111.43 5 THR H 8.11 8.08 8.08 5 THR C 174.01 173.98 174.05 5 THR CA 59.47 59.41 59.42 5 THR CB 73.64 73.66 73.67 5 THR N 112.05 111.99 112.07 6 ALA H 8.91 8.82 8.86 6 ALA C 174.19 174.14 174.17 6 ALA CA 51.92 51.77 51.79 6 ALA CB 24.37 24.38 24.47 6 ALA N 121.33 120.88 121.10 7 ALA H 8.94 8.94 8.97 7 ALA C 176.06 175.92 175.91 7 ALA CA 50.96 51.12 51.09 7 ALA CB 23.67 23.79 23.66 7 ALA N 123.81 123.31 123.27 8 THR H 9.00 8.98 8.94 8 THR C 170.25 170.19 170.25 8 THR CA 60.58 60.53 60.49 8 THR CB 70.13 70.29 70.27 8 THR N 114.30 114.51 114.38 9 SER H 8.20 8.17 8.21 9 SER C 173.80 173.80 173.67 9 SER CA 57.48 57.42 57.41 9 SER CB 67.26 67.35 67.55 9 SER N 116.83 116.73 117.26 10 TRP H 8.42 8.53 8.43 10 TRP HD1 7.00 7.03 10 TRP HE1 10.06 9.87 10.08 10 TRP C 174.80 174.82 174.95

157 Appendix B

10 TRP CA 57.81 57.68 57.57 10 TRP CB 32.07 32.16 32.24 10 TRP N 116.33 116.33 116.14 10 TRP NE1 129.22 128.88 129.53 11 GLY H 8.88 8.83 8.89 11 GLY C 173.65 173.56 173.43 11 GLY CA 44.75 44.65 44.71 11 GLY N 105.71 105.83 106.18 12 THR H 8.17 8.14 8.18 12 THR C 175.62 175.42 175.40 12 THR CA 61.76 61.56 61.38 12 THR CB 69.71 69.84 69.80 12 THR N 105.79 105.52 105.31 13 VAL H 8.01 7.97 7.97 13 VAL CA 61.76 67.94 67.94 13 VAL CB 69.71 29.96 29.86 13 VAL N 124.00 124.01 123.85 14 PRO C 175.05 174.95 174.92 14 PRO CA 62.17 62.47 62.35 14 PRO CB 32.77 32.58 32.66 15 SER H 7.17 7.12 6.99 15 SER C 173.23 173.17 173.35 15 SER CA 58.99 58.83 58.88 15 SER CB 63.96 63.94 64.22 15 SER N 110.47 110.14 109.93 16 ILE H 8.71 8.76 8.73 16 ILE C 177.18 177.42 177.47 16 ILE CA 60.00 59.98 59.86 16 ILE CB 42.55 42.24 42.42 16 ILE N 118.78 118.99 118.16 17 ARG H 8.89 8.89 8.90 17 ARG C 173.96 174.03 173.84 17 ARG CA 53.68 53.66 53.78 17 ARG CB 34.58 33.96 34.26 17 ARG N 127.96 128.20 128.38 18 VAL H 8.69 8.61 8.67 18 VAL C 175.95 175.93 175.87 18 VAL CA 60.99 60.86 60.92 18 VAL CB 34.47 34.51 34.47 18 VAL N 120.94 120.91 121.22 19 TYR H 9.70 9.68 9.69 19 TYR C 174.89 174.92 174.86 19 TYR CA 57.02 56.97 56.96 19 TYR CB 41.09 41.17 41.22 19 TYR N 132.32 132.65 132.46 20 THR H 8.75 8.67 8.74 20 THR C 172.87 172.83 172.81 20 THR CA 61.65 61.66 61.69 20 THR CB 71.01 71.01 70.97 20 THR N 117.60 117.06 117.45 21 ALA H 9.52 9.49 9.50 21 ALA C 176.15 176.12 176.17 21 ALA CA 50.85 50.81 50.84 21 ALA CB 19.48 19.42 19.43 21 ALA N 131.62 131.79 131.81 22 ASN H 8.99 8.99 8.97 22 ASN HD21 6.54 6.52 6.52 22 ASN HD22 7.50 7.49 7.49 22 ASN C 175.28 175.28 175.26 22 ASN CA 52.50 52.45 52.52

158 Appendix B

22 ASN CB 41.16 41.17 41.19 22 ASN CG 175.40 22 ASN N 121.94 121.97 121.89 22 ASN ND2 109.86 109.85 109.94 23 ASN H 7.52 9.26 9.27 23 ASN HD21 6.76 6.74 6.73 23 ASN HD22 7.50 7.50 7.50 23 ASN C 175.26 175.24 175.22 23 ASN CA 54.24 54.21 54.24 23 ASN CB 37.34 37.29 37.33 23 ASN CG 178.26 178.17 23 ASN N 112.29 125.47 125.45 23 ASN ND2 112.20 112.22 112.22 24 GLY H 8.89 8.87 8.87 24 GLY C 175.39 175.35 175.34 24 GLY CA 45.78 45.79 45.80 24 GLY N 104.04 104.07 104.10 25 LYS H 7.81 7.80 7.80 25 LYS C 174.10 174.10 174.10 25 LYS CA 55.10 55.16 55.14 25 LYS CB 35.80 35.66 35.72 25 LYS N 122.63 122.86 122.78 26 ILE H 8.92 8.97 8.91 26 ILE C 175.45 175.50 175.48 26 ILE CA 59.37 59.40 59.46 26 ILE CB 39.34 39.48 39.38 26 ILE N 123.67 123.93 123.78 27 THR H 8.29 8.15 8.22 27 THR C 173.01 173.00 172.97 27 THR CA 60.09 60.06 60.14 27 THR CB 71.77 71.70 71.66 27 THR N 115.21 114.94 115.24 28 GLU H 8.87 8.79 8.79 28 GLU C 175.49 175.54 175.55 28 GLU CA 55.02 55.00 54.96 28 GLU CB 37.32 37.71 37.62 28 GLU N 123.64 122.71 123.22 29 ARG H 8.89 8.88 8.93 29 ARG C 175.40 175.13 175.51 29 ARG CA 53.03 52.72 52.84 29 ARG CB 33.64 33.50 33.50 29 ARG N 127.48 127.39 127.49 30 CYS H 9.18 9.23 9.21 30 CYS C 171.29 171.06 171.12 30 CYS CA 56.26 56.48 56.27 30 CYS CB 30.51 30.12 30.38 30 CYS N 119.08 119.15 119.30 31 TRP H 8.14 7.98 8.07 31 TRP HD1 6.91 7.06 31 TRP HE1 10.42 10.16 10.50 / 10.58* 31 TRP C 174.98 175.02 175.03 31 TRP CA 56.46 56.36 56.39 31 TRP CB 30.10 30.34 30.14 31 TRP N 121.74 121.80 122.07 31 TRP NE1 129.75 129.22 130.36 / 130.59* 32 ASP H 8.75 8.70 8.73 32 ASP CA 52.94 52.66 52.92 32 ASP CB 42.87 42.77 42.84 32 ASP N 126.64 126.31 126.69 33 GLY C 174.11 174.32 174.23

159 Appendix B

33 GLY CA 45.58 45.66 45.50 34 LYS H 7.46 7.42 7.45 34 LYS CA 54.79 54.77 54.72 34 LYS CB 33.16 33.22 33.18 34 LYS N 119.55 119.19 119.45 35 GLY C 173.78 173.82 173.64 35 GLY CA 44.21 44.19 44.24 36 TRP H 8.94 8.91 8.82 36 TRP HD1 7.23 7.28 36 TRP HE1 10.86 11.13 11.38 / 11.26* 36 TRP C 177.35 177.42 177.43 36 TRP CA 57.81 57.83 57.75 36 TRP CB 30.17 30.07 30.11 36 TRP N 122.95 122.82 122.52 36 TRP NE1 131.61 133.09 133.15 / 132.59* 37 TYR H 9.39 9.28 9.37 / 9.28* 37 TYR C 174.30 174.36 174.53 37 TYR CA 56.18 56.41 56.13 / 56.19* 37 TYR CB 40.83 40.31 40.98 / 40.97* 37 TYR N 119.50 119.18 119.37 / 119.30* 38 THR H 9.30 9.26 9.33 38 THR C 175.01 174.87 174.77 38 THR CA 63.99 64.51 64.31 38 THR CB 69.32 69.36 69.11 38 THR N 121.46 121.50 121.70 39 GLY H 8.72 8.51 8.68 39 GLY C 174.09 174.14 174.27 39 GLY CA 44.46 44.52 44.74 39 GLY N 116.32 115.72 116.56 40 ALA H 8.18 8.72 8.88 40 ALA C 178.30 178.60 178.63 40 ALA CA 53.09 53.35 53.32 40 ALA CB 19.46 19.56 19.34 40 ALA N 121.19 120.33 121.18 41 PHE H 7.74 7.68 7.68 / 7.49* 41 PHE C 175.15 175.27 175.14 41 PHE CA 60.36 60.48 60.39 / 60.35* 41 PHE CB 39.15 38.70 38.83 / 38.81* 41 PHE N 118.73 118.26 118.24 / 118.00* 42 ASN H 7.65 7.53 7.67 42 ASN HD21 6.50 6.45 6.46 42 ASN HD22 7.25 7.22 7.24 42 ASN C 173.53 173.45 173.45 42 ASN CA 52.78 52.82 52.87 42 ASN CB 39.32 39.41 39.48 42 ASN CG 177.20 42 ASN N 128.15 128.48 128.70 42 ASN ND2 110.61 110.92 110.81 43 GLU H 6.93 7.37 7.11 43 GLU C 175.27 43 GLU CA 52.19 51.94 52.22 43 GLU CB 30.31 29.96 30.42 43 GLU N 120.60 121.03 120.87 44 PRO C 176.54 176.47 176.49 44 PRO CA 62.39 62.29 62.38 44 PRO CB 32.88 32.80 32.88 45 GLY H 7.71 7.72 7.69 45 GLY C 170.14 170.15 170.13 45 GLY CA 46.34 46.19 46.22 45 GLY N 107.67 107.89 107.64

160 Appendix B

46 ASP H 8.56 8.51 8.52 46 ASP C 175.76 175.74 175.76 46 ASP CA 54.45 54.39 54.44 46 ASP CB 43.76 43.71 43.79 46 ASP N 117.93 117.30 117.52 47 ASN H 7.91 7.87 7.88 47 ASN HD21 7.79 7.73 7.76 47 ASN HD22 8.05 8.00 8.04 47 ASN C 172.77 172.75 172.76 47 ASN CA 52.63 52.57 52.65 47 ASN CB 44.55 44.43 44.57 47 ASN N 116.73 116.70 116.68 47 ASN ND2 114.87 114.82 114.83 48 VAL H 9.94 9.89 9.93 48 VAL C 172.84 172.79 172.81 48 VAL CA 59.15 59.15 59.18 48 VAL CB 36.56 36.52 36.54 48 VAL N 125.16 125.14 125.34 49 SER H 8.40 8.35 8.37 49 SER C 172.10 172.03 172.07 49 SER CA 58.56 58.56 58.59 49 SER CB 67.25 67.33 67.32 49 SER N 121.46 121.60 121.49 50 VAL H 8.57 8.55 8.54 50 VAL C 172.65 176.78 172.41 50 VAL CA 59.07 59.18 59.28 50 VAL CB 35.51 35.24 35.07 50 VAL N 116.00 116.15 116.02 51 THR H 8.95 8.93 8.90 51 THR C 170.15 170.12 170.10 51 THR CA 61.05 61.05 61.10 51 THR CB 69.88 69.91 69.93 51 THR N 118.68 118.82 118.80 52 SER H 8.31 8.27 8.30 52 SER C 172.19 172.16 171.97 52 SER CA 57.74 57.79 57.82 52 SER CB 67.37 67.27 67.44 52 SER N 118.63 118.51 118.70 53 TRP H 8.64 8.65 8.66 53 TRP HD1 6.62 6.63 53 TRP HE1 9.57 9.43 9.55 53 TRP C 172.22 172.23 172.10 53 TRP CA 57.83 57.72 57.79 53 TRP CB 31.66 31.77 31.84 53 TRP N 118.48 118.39 118.80 53 TRP NE1 128.32 128.04 128.46 54 LEU H 8.54 8.53 8.50 54 LEU C 178.32 178.31 178.30 54 LEU CA 53.60 53.54 53.55 54 LEU CB 45.49 45.42 45.48 54 LEU N 121.91 122.12 122.00 55 VAL H 8.68 8.65 8.61 55 VAL C 177.31 177.33 177.35 55 VAL CA 62.40 62.41 62.39 55 VAL CB 32.53 32.47 32.38 55 VAL N 123.04 123.01 122.68 56 GLY H 9.20 9.17 9.16 56 GLY C 174.48 174.47 174.44 56 GLY CA 47.35 47.32 47.33 56 GLY N 119.23 119.16 119.13

161 Appendix B

57 SER H 8.93 8.90 8.91 57 SER C 173.54 173.51 173.53 57 SER CA 58.14 58.11 58.11 57 SER CB 64.06 64.03 64.06 57 SER N 122.16 122.10 122.19 58 ALA H 8.16 8.12 8.13 58 ALA C 175.09 174.97 175.13 58 ALA CA 51.81 51.77 51.79 58 ALA CB 20.12 20.13 20.11 58 ALA N 127.13 127.19 127.14 59 ILE H 7.59 7.52 7.56 59 ILE C 172.44 172.25 172.48 59 ILE CA 56.73 56.31 56.44 59 ILE CB 39.99 39.88 39.89 59 ILE N 123.88 123.81 123.69 60 HIS H 7.62 7.54 7.63 60 HIS HD2 6.72 60 HIS C 172.74 172.60 172.76 60 HIS CA 53.31 53.14 53.01 60 HIS CB 29.89 29.89 30.02 60 HIS N 123.01 123.15 123.17 61 ILE H 8.59 8.59 8.58 61 ILE C 176.61 176.99 177.00 61 ILE CA 60.42 60.45 60.30 61 ILE CB 41.71 41.41 41.40 61 ILE N 119.12 119.31 118.70 62 ARG H 9.06 9.18 9.15 62 ARG C 173.04 172.88 172.79 62 ARG CA 53.27 53.36 53.49 62 ARG CB 33.17 32.60 32.86 62 ARG N 128.67 129.92 129.71 63 VAL H 8.94 8.87 8.88 63 VAL C 174.15 174.12 174.12 63 VAL CA 60.29 60.22 60.34 63 VAL CB 34.26 34.11 34.14 63 VAL N 123.42 123.46 123.78 64 TYR H 8.67 8.63 8.64 64 TYR C 175.30 175.24 175.17 64 TYR CA 57.98 57.87 57.94 64 TYR CB 36.95 36.85 36.85 64 TYR N 125.23 125.41 125.37 65 ALA H 9.32 9.29 9.30 65 ALA C 176.76 172.53 176.72 65 ALA CA 50.56 50.50 50.56 65 ALA CB 22.44 22.41 22.37 65 ALA N 132.16 132.16 132.20 66 SER H 8.97 8.91 8.95 66 SER C 173.31 173.29 173.27 66 SER CA 58.20 58.12 58.15 66 SER CB 65.25 65.20 65.22 66 SER N 118.85 118.68 118.85 67 THR H 8.57 8.54 8.54 67 THR C 174.62 174.64 174.64 67 THR CA 62.09 62.13 62.15 67 THR CB 71.05 71.05 71.07 67 THR N 121.59 121.55 121.57 68 GLY H 9.44 9.40 9.43 68 GLY C 174.92 174.91 68 GLY CA 47.28 47.29 47.29 68 GLY N 118.14 118.23 118.34

162 Appendix B

69 THR H 8.58 8.58 8.56 69 THR C 174.57 174.53 174.54 69 THR CA 61.17 61.16 61.15 69 THR CB 68.55 68.52 68.53 69 THR N 115.45 115.66 115.51 70 THR H 8.36 8.35 8.35 70 THR C 173.80 173.81 173.83 70 THR CA 62.45 62.41 62.42 70 THR CB 70.50 70.44 70.43 70 THR N 119.34 119.39 119.44 71 THR H 8.80 8.84 8.81 71 THR C 172.80 172.78 172.78 71 THR CA 61.71 61.68 61.74 71 THR CB 75.65 70.55 70.61 71 THR N 126.11 126.30 126.11 72 THR H 9.52 9.45 9.47 72 THR C 171.36 171.32 171.27 72 THR CA 62.01 62.01 62.09 72 THR CB 70.97 70.95 70.96 72 THR N 125.21 125.17 125.24 73 GLU H 8.62 8.60 8.62 73 GLU C 173.41 173.35 173.32 73 GLU CA 54.01 53.82 53.79 73 GLU CB 34.71 34.85 34.98 73 GLU N 129.05 128.78 128.74 74 TRP H 9.56 9.55 9.58 74 TRP HD1 6.74 6.74 74 TRP HE1 10.50 10.50 10.49 74 TRP C 176.17 176.31 176.35 74 TRP CA 55.62 55.45 55.47 74 TRP CB 33.25 33.70 33.60 74 TRP N 129.44 129.33 129.55 74 TRP NE1 128.11 128.08 128.10 75 CYS H 10.06 10.14 10.22 75 CYS C 173.47 173.45 173.29 75 CYS CA 57.12 57.43 57.45 75 CYS CB 29.75 29.50 30.09 75 CYS N 120.57 120.56 120.57 76 TRP H 8.68 8.64 8.57 76 TRP HD1 6.32 6.36 76 TRP HE1 10.21 10.08 10.23 / 10.32* 76 TRP C 175.61 175.55 175.83 76 TRP CA 57.15 56.74 56.90 76 TRP CB 28.69 28.44 28.55 76 TRP N 124.25 124.74 123.67 76 TRP NE1 128.83 127.91 128.68 / 129.02* 77 ASP H 8.22 8.10 8.36 77 ASP CA 52.39 52.29 52.31 77 ASP CB 42.84 42.77 42.89 77 ASP N 126.35 126.56 126.61 78 GLY C 173.93 174.11 174.08 78 GLY CA 45.63 45.73 45.75 79 ASN H 7.83 7.88 7.81 79 ASN HD21 6.69 6.65 6.64 79 ASN HD22 7.36 7.32 7.32 79 ASN CA 52.62 52.53 52.49 79 ASN CB 39.48 39.38 39.35 79 ASN CG 177.68 177.69 79 ASN N 118.29 118.19 118.21 79 ASN ND2 112.36 112.18 112.19

163 Appendix B

80 GLY C 172.75 172.84 172.56 80 GLY CA 44.21 44.19 44.10 81 TRP H 8.81 8.69 8.64 81 TRP HD1 7.40 7.42 81 TRP HE1 11.15 11.29 11.55 / 11.37* 81 TRP C 178.14 178.15 178.22 81 TRP CA 57.42 57.41 57.33 81 TRP CB 30.33 30.20 30.28 81 TRP N 121.55 121.69 121.11 81 TRP NE1 131.38 132.03 132.31 / 131.43* 82 THR H 9.73 9.64 9.67 82 THR C 173.42 173.49 173.53 82 THR CA 60.14 60.12 60.15 82 THR CB 72.01 72.15 71.98 82 THR N 116.62 116.57 116.45 83 LYS H 8.82 8.81 8.84 / 8.80* 83 LYS C 178.29 178.14 178.21 / 178.19* 83 LYS CA 57.32 57.43 57.53 / 57.43* 83 LYS CB 32.62 32.59 32.59 / 32.74* 83 LYS N 125.52 125.58 125.33 / 125.16* 84 GLY H 9.15 9.05 9.08 / 9.01* 84 GLY C 174.27 174.58 174.65 / 174.61* 84 GLY CA 44.25 44.27 44.33 / 44.35* 84 GLY N 115.31 115.29 115.53 / 114.93* 85 ALA H 7.88 8.44 8.57 / 8.19* 85 ALA C 176.83 176.92 177.18 / 177.03* 85 ALA CA 52.05 52.50 52.64 / 52.63* 85 ALA CB 19.27 19.36 19.38 / 19.52* 85 ALA N 120.95 120.47 121.00 / 120.31* 86 TYR H 7.47 7.29 7.34 / 7.26* 86 TYR C 175.04 175.04 174.97 86 TYR CA 61.43 61.58 61.58 / 61.59* 86 TYR CB 40.01 40.09 40.01 / 40.05* 86 TYR N 119.78 119.20 119.17 / 119.08* 87 THR H 5.45 5.33 5.37 87 THR C 172.90 172.95 172.90 87 THR CA 58.75 58.56 58.62 87 THR CB 72.27 72.49 72.47 87 THR N 115.44 115.29 115.32 88 ALA H 8.39 8.36 8.37 88 ALA C 176.58 176.52 176.46 88 ALA CA 52.92 52.87 52.90 88 ALA CB 20.58 20.64 20.64 88 ALA N 124.16 123.99 123.98 89 THR H 7.66 7.64 7.65 89 THR C 173.03 172.98 173.00 89 THR CA 60.74 60.68 60.71 89 THR CB 70.56 70.53 70.52 89 THR N 109.09 109.21 109.30 90 ASN H 8.09 8.09 8.08 90 ASN HD21 6.91 6.88 6.88 90 ASN HD22 7.53 7.50 7.50 90 ASN CA 55.27 55.25 55.25 90 ASN CB 39.99 39.94 39.99 90 ASN CG 178.42 178.31 90 ASN N 125.20 125.17 125.28 90 ASN ND2 112.93 112.97 112.96 *chemical shift due to β-L-Fucose

164 Appendix C

Appendix C

NMR studies of RSL at different pH or in DMSO

RSL and pH Effects The chemical shift is sensitive to pH changes, therefore, to facilitate studies of RSL in acidic or physiological pH it was necessary to assign spectra in new conditions. CSPs for sugar-free RSL were observed at lower/higher pH compared to the typically used pH 6.0 (Figure C1). The majority of the resonances were not affected by pH under the range tested. The assignments for peaks, which presented CSPs, were transferred from the reference spectrum (pH 6.0). The crowded region in the middle of the spectra (Figures 2.2. – 2.4., small panel) has 4 overlapping peaks (Ala7, Ile26, Glu28, Val63) for which assignment transfer was ambiguous so these resonances remained unassigned at either pH 4.0 or 7.6.

Figure C1. (A) 1H-15N HSQC spectra of sugar-free RSL at pH 4.0 (red), pH 6.0 (black), and pH 7.6 (blue). Sample conditions were 0.25 mM protein in 20 mM KPi, 50 mM NaCl Boxed were

resonances that revealed Δδavg > 0.05 ppm and Trp indole resonances. (B) Plot of chemical shift perturbations for RSL backbone resonances at pH 4.0 (red) and 7.6 (blue)

165 Appendix C

Sugar-free RSL at pH 4.0 had large Δδ for 7 peaks (Δδavg > 0.03 ppm) with only 3 resonances (Asn42, Glu43, Gly45) > 0.05 ppm. On the other hand, at pH 7.6, sugar-free RSL revealed CSPs (> 0.03 ppm) for 22 cross-peaks including 11 resonances with Δδ > 0.05 ppm. (Table B1 and Figure 2B). Interestingly, the majority of the peaks affected by pH change correspond to residues from the inter- monomeric binding site (e.g. Arg62, Cys75, Asp77, Thr82 and Gly84, Figure 2.1.) that is more solvent exposed due to the lack of the bulky Tyr side chain, which partially shields the intra-monomeric binding pocket (Tyr37, Figure 2.1.). The only homologues residues that revealed CSPs for both binding sites (in pH 7.6) were

Gly39/84, though the Δδavg of Gly39 was half that of Gly84. The large Δδavg for Ile61 (intra-monomeric binding site) was likely due to the large CSPs for adjacent His60 and Arg62 residues (both inter-monomeric binding site). His60 is likely to deprotonate at pH >7.

Table B1. pH effect on Δδavg of sugar-free RSL.

pH 4.0 pH 7.6

0.03 ≤ Δδavg ≤0.05 Δδavg > 0.05 0.03 ≤ Δδavg ≤0.05 Δδavg > 0.05 W76 N42 Q4 T27 D77 E43 T20 N42 T82 G45 N23 E43 G84 G39 G45 D46 H60 V55 I61 S57 R62 S66 W76 T71 D77 W74 T82 T89 G84

The largest CSP in both pH 4.0 and 7.6 were observed for resonances of Asn42 and particularly Glu43. Additionally, the Glu43 side chains is highly solvent exposed and changes in buffer pH can be well sensed by ionisable side chain. The asparagine backbone is presumably sensitive to the pH effects at Glu43. In general, the spectrum of RSL at pH 4.0 was more similar to the spectrum at pH 6.0, with greater differences at pH 7.6. At pH 6.0 RSL is slightly cationic (pI ~ 6.5) and the protein will be slightly more charged at pH 4.0. In contrast, at pH 7.6 the net charge of RSL is anionic which results in more CSPs.

166 Appendix C

Interestingly, at pH 4.0 a new cross-peak for sugar-free RSL was observed. The new resonance belongs probably to Ser2, which at pH 6.0 overlaps with the cross-peak of Thr20 (like in the D-mannose assignment, Figure 2.4. and Appendix B). The change of pH helped to resolve the two peaks.

RSL in DMSO To facilitate future screening of ligands that have low water solubility, RSL stability was tested at 20% and higher concentrations of DMSO. Preliminary studies with unlabelled RSL in the range of 20-100% DMSO did not reveal any precipitation, Figure B2. 1H-15N HSQC spectra of sugar-free therefore, the experiment was repeated RSL at (A) 0 (black) and 20 (cyan), (B) 60 (C) 15 and 100% DMSO. Only sample C contained with N-labelled RSL and monitored deuterated DMSO. Samples were 0.25 mM by NMR spectroscopy. protein in 20 mM KPi, 50 mM NaCl, pH 6.0 15N-labelled sugar-free RSL at 20% DMSO had a well-resolved spectrum. CSPs were observed in comparison to the reference spectrum of RSL in buffer (Figure B2 A). The whole spectrum was shifted upfield in the proton dimension due to solvent effect (pH 6.0 remained). The Δδavg caused by DMSO was ~0.08 ppm on average (Figure B2 D). Observed CSPs are due to altered solvent and possibly binding of DMSO. Also, DMSO may cause local unfolding266 that can be observed as chemical shift perturbations. Above 20% of DMSO, the spectra were broader and broadened beyond detection in the presence of 60% DMSO (Figure B2 B). The high concentration of DMSO obscured the protein signals. Only two pairs of Asn side chains and the N-terminus resonance were detected. Lyophilized 15N-labelled sugar-free RSL in deutereated DMSO (100%) remained completely unfolded (Figure B2 C) because RSL lacked water molecules that typically would stabilize the whole structure.

167 Appendix D

Appendix D

Figure D1. 1H-15N HSQC spectrum of RSL in sugar-free form. Sample conditions were 0.25 mM protein in 20 mM potassium phosphate, 50 mM NaCl, pH 6.0 at 75 °C. The dashed box indicates the region where amide side chains typically resonate (at 30 °C).

168