
MECHANISTIC STUDIES OF RETAINING α-GLYCOSIDASES

by

RAN ZHANG

B.Sc. Peking University, 2003

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF

THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES

(Chemistry)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

August 2010

© Ran Zhang, 2010

Abstract

The majority of retaining α-glycosidases are believed to follow the classical double displacement mechanism, which features a catalytic nucleophile, a general acid/base residue, two oxocarbenium ion-like transition states and one covalent glycosyl-enzyme intermediate. In this thesis, the catalytic mechanisms of three retaining α-glycosidases were investigated in detail, as follows.

Human pancreatic α-amylase (HPA) is an enzyme responsible for hydrolyzing starch into shorter oligosaccharides. Several 2-deoxy-2,2-dihalo maltosyl chlorides were synthesized and tested as potential mechanism-based inhibitors of HPA, in the hope of trapping its covalent glycosyl-enzyme intermediate for crystallographic studies. Unfortunately, none of the newly synthesized compounds caused time-dependent inactivation of HPA. When our newly developed in situ elongation strategy was employed, 5-fluoro-α-D-glucopyranosyl fluoride and 5-fluoro-β-L-idopyranosyl fluoride showed kinetic behavior consistent with the proposed in situ elongation-inactivation process, allowing the trapping and further kinetic and structural analysis of the covalent intermediate of HPA. The resulting structures provide mechanistic insights into catalysis by HPA.

Trehalose synthase (TreS) is an enzyme that catalyzes the reversible interconversion of maltose and trehalose. 5-Fluoro glycosyl fluorides were shown to be mechanism-based inhibitors of this enzyme that act by accumulating the covalent glycosyl-enzyme intermediate. The trapped intermediate was subjected to protease digestion followed by MS analysis of the resultant peptides, which identified the catalytic nucleophile as D230. The inability of TreS to carry out transglycosylation reactions onto exogenously added acceptors establishes the intramolecular nature of the rearrangement reaction, consistent with previous studies on other TreS enzymes. All studies support a double displacement mechanism involving an intramolecular “glucose flipping” step as the catalytic mechanism of this enzyme.

SpGH101 is an enzyme which specifically removes an O-linked disaccharide Gal-β-1,3-GalNAc-α from glycoproteins. Using the recently solved three-dimensional structure of this protein as a guide, we carried out a detailed mechanistic investigation of this retaining α-glycosidase using a combination of synthetic and natural substrates. Based on a model of the substrate complex of SpGH101, we proposed D764 and E796 as the nucleophile and general acid/base residues, respectively. These roles were confirmed by kinetic and mechanistic analysis of mutants at those positions using synthetic substrates and anion rescue experiments.

Table of Contents

Abstract…………………………………………………………………………...ii

Table of Contents………………………………………………………………...iv

List of Tables……………………………………………………………………..xi

List of Figures…………………………………………………………………...xii

List of Schemes………………………………………………………………….xxi

List of Abbreviations………………………………………………………….xxiii

List of Amino Acid Abbreviations……………………………………………xxvi

Acknowledgements…………………………………………………………...xxvii

Dedication……………………………………………………………………..xxix

Chapter 1: Introduction………………………………………………………….1

1.1 Glycosidases and CAZy classification…………………………………………….2

1.2 Catalytic mechanisms of glycosidases…………………………………………….5

1.2.1 Inverting glycosidases……………………………………………………….6

1.2.2 Retaining glycosidases………………………………………………………6

1.2.3 Mechanistic anomalies………………………………………………………7

1.3 Important features of the double displacement mechanism……………………….8

1.3.1 The catalytic nucleophile…………………………………………………... 8

1.3.2 General acid/base catalyst…………………………………………………..9

1.3.3 Oxocarbenium-ion like transition state…………………………………….11

1.3.4 The covalent glycosyl-enzyme intermediate………………………………12

1.3.5 Covalent inhibitors of glycosidases………………………………………..14

1.3.6 Structural studies of glycosidases and non-covalent interactions………….21

1.3.7 Mechanistic features of α-glycosidases and β-glycosidases……………….24

1.4 Aims of this thesis……………………………………………………………….26

Chapter 2: Structural Studies of the Covalent Glycosyl-Enzyme Intermediate of Human Pancreatic α-Amylase (HPA)………………………………………28

2.1 General introduction to human pancreatic α-amylase…………………………..29

2.1.1 The GH13 family and α-amylases………………………………………....29

2.1.2 Previous mechanistic studies of HPA……………………………………...31

2.1.3 Trapping the covalent glycosyl-enzyme intermediate of GH13 enzymes....35

2.1.4 Non-covalent inhibitors of HPA…………………………………………...39

2.2 Specific aims of this study………………………………………………………44

2.3 Structural analysis of the covalent glycosyl-enzyme intermediate of HPA…….45

2.3.1 Expression, purification and crystallization of HPA………………………45

2.3.2 Synthesis of 2-deoxy-2,2-dihalo maltosides……………………………….46

2.3.2.1 Previous synthesis of 2-deoxy-2,2-difluoro maltosides……………46

2.3.2.2 Synthesis of 2-deoxy-2,2-dihalo maltosides……………………….49

2.3.3 Kinetic evaluations of 2-deoxy-2,2-dihalo maltosides as potential

mechanism-based inhibitors of HPA………………………………………54

2.3.3.1 Evaluation of 2-chloro-2-deoxy-2-fluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-glucopyranosyl chloride (2.3) as an HPA inactivator…………..55

2.3.3.2 Evaluation of 2-deoxy-2,2-difluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-arabinohexopyranosyl chloride (2.4) as an HPA inactivator….56

2.3.3.3 Evaluation of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12) as an HPA inactivator…………..57

2.3.3.4 General conclusions from the kinetic analysis of the interactions of

2-deoxy-2,2-dihalo maltosides with wild-type HPA………………58

2.3.4 Attempted synthesis of p-toluenesulfonyl 2-deoxy-2,2-difluoro-α-maltoside

(2.17)………………………………………………………………………60

2.3.5 Attempted trapping of the covalent intermediate of HPA by using the

acid/base mutant E233Q of HPA…………………………………………..61

2.3.6 Application of the “in situ” inhibitor elongation methodology to generate

mechanism-based inhibitors of HPA: 2-deoxy-2,2-dihalo glycosides……..64

2.3.6.1 Chemical synthesis of 4’-O-methyl α-maltosyl fluoride (MeG2F)...65

2.3.6.2 Attempted in situ elongation of 2-deoxy-2,2-difluoro α-maltosyl

chloride (2.4) with MeG2F (2.21) by HPA…………………………68

2.3.6.3 Attempted in situ elongation of 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2) with MeG2F by HPA……….71

2.3.7 Direct in situ inhibitor elongation as a strategy to structurally characterize

the covalent intermediate of HPA: the use of 5-fluoro glycosides…………73

2.3.7.1 Chemical synthesis of 5-fluoro-α-D-glucopyranosyl fluoride (2.30)

and 5-fluoro-β-L-idopyranosyl fluoride (2.31)……………………..73

2.3.7.2 Kinetic studies of wild type HPA in the presence of MeG2F and 5FGlcF or 5FIdoF…………………………………………………..76

2.3.8 Structural studies of covalent glycosyl-enzyme intermediate of HPA…….83

2.3.8.1 Structure of the monosaccharide 5-fluoroidosyl-HPA complex……83

2.3.8.2 Structure of the disaccharide G3F/5FIdoF/HPA complex…………...91

2.3.8.3 Structure of the trisaccharide MeG2F/5FIdoF/HPA complex……….94

2.3.8.4 Mechanistic implications of structural studies of covalent

intermediates of HPA…………………………………………………96

2.4 General conclusions………………………………………………………….....100

Chapter 3: Mechanistic Studies of Trehalose Synthase (TreS) from

Mycobacterium smegmatis……………………………………………………..103

3.1 Background…………………………………………………………………….104

3.1.1 Introduction to trehalose………………………………………………….104

3.1.2 Biosynthetic pathways of trehalose………………………………………108

3.1.3 Introduction to trehalose synthase (TreS)………………………………...110

3.2 Specific aims of this study……………………………………………………..113

3.3 Detailed kinetic analysis of TreS using different α-glucoside substrates………114

3.3.1 Design of substrates………………………………………………………114

3.3.2 Chemical synthesis of potential substrates of TreS………………………117

3.3.3 Kinetic analysis of aryl α-glucosides and αGlcF as TreS substrates……..118

3.3.4 pH Dependence of wild type TreS………………………………………..121

3.4 The trapping of the covalent intermediate of TreS and identification of the

catalytic nucleophile by mass spectrometric analysis of labeled peptides……..123

3.4.1 Mechanism-based inhibitors of TreS……………………………………..123

3.4.2 Identification of the catalytic nucleophile of TreS by mass spectrometry..128

3.5 Investigations of the rearrangement mechanism of TreS………………………133

3.5.1 Approach 1: Attempted isotope-labeled glucose incorporation into the

disaccharide by TreS………………………………………………………133

3.5.2 Approach 2: Attempted TreS-catalyzed transglycosylation of glucose from

αGlcF onto glucose……………………………………………………….135

3.6 Conclusions…………………………………………………………………….137

Chapter 4: Mechanistic Studies of Endo-α-N-acetylgalactosaminidase (SpGH101) from Streptococcus pneumoniae………………………………………………………139

4.1 Background…………………………………………………………………….140

4.1.1 Protein glycosylation……………………………………………………..140

4.1.2 Endo-α-N-acetylgalactosaminidases (endo-α-GalNAcases):

GH101 family...... 141

4.2 Specific aims of this study…………………………………………………….147

4.3 Detailed mechanistic studies of SpGH101……………………………………148

4.3.1 Chemo-enzymatic synthesis of the substrates of SpGH101……………..148

4.3.2 Determining stereochemical outcome of SpGH101 by NMR……………151

4.3.3 Detailed kinetic studies of SpGH101 enzymes…………………………..153

4.3.3.1 Kinetic analysis of wild type SpGH101 enzymes and pH dependence

study………………………………………………………………153

4.3.3.2 Kinetic analysis of catalytic residue mutants of SpGH101……….156

4.3.3.3 Relative activities of wild type SpGH101 and its mutants with a

“natural” glycopeptide substrate………………………………….157

4.3.3.4 Chemical rescue of catalytic residue mutants of SpGH101………158

4.3.4 Discussion of the kinetic analysis of SpGH101 and its mutants…………161

4.4 Conclusions and future directions…………………………………………….163

Chapter 5: Materials and Methods…………………………………………..165

5.1 Generous gifts and commercially available materials……………………….166

5.1.1 Synthetic …………………………………………………166

5.1.2 Enzymes and peptides……………………………………………………166

5.1.3 Commercially available carbohydrates, buffer salts and enzymes………166

5.2 Synthesis………………………………………………………………………166

5.2.1 General methods………………………………………………………….166

5.2.2 General synthetic methods……………………………………………….167

5.2.2.1 Fluorination conditions using acetyl hypofluorite…………………167

5.2.2.2 Acetylations………………………………………………………..167

5.2.2.3 Deprotection of acetyl groups under strongly basic conditions……168

5.2.2.4 Deprotection of acetyl groups under weakly basic conditions…….168

5.2.2.5 Deprotection of acetyl groups under acid conditions……………..168

5.2.3 Synthesis and compound characterization……………………………….169

5.3 Molecular biology…………………………………………………………….200

5.3.1 Expression and purification of wild type HPA………………………….200

5.3.2 Expression and purification of wild type CgtB enzyme…………………201

5.4 Enzymology…………………………………………………………………..202

5.4.1 Human pancreatic α-amylase (HPA)…………………………………….202

5.4.1.1 General assay conditions………………………………………….202

5.4.1.2 Time-dependent inactivation kinetics of potential mechanism-based

inhibitors…………………………………………………………...202

5.4.1.3 Reactivation kinetics……………………………………………….202

5.4.2 Trehalose synthase (TreS) from Mycobacterium smegmatis……………..203

5.4.2.1 Kinetic evaluation of different aryl α-glucosides as substrates……203

5.4.2.2 α-Glucosyl fluoride (αGlcF) kinetics………………………………203

5.4.2.3 pH profile studies………………………………………………….204

5.4.2.4 Fluorosugar inactivator studies…………………………………….204

5.4.2.5 Ki determination of 5FGlcF and GHIL/casuarine as competitive

inhibitors of TreS…………………………………………………...205

5.4.2.6 Reactivation experiments………………………………………….205

5.4.3 Endo-α-N-acetylgalactosaminidase from Streptococcus pneumoniae R6..205

5.4.3.1 Kinetic analysis of wild type SpGH101 and its various mutants…..205

5.4.3.2 pH profile studies…………………………………………………..206

5.4.3.3 Azide rescue product identification………………………………..207

5.5 Determination of stereochemical outcome of SpGH101 by NMR……………207

5.6 Summary of structure determination statistics for the covalent intermediate

of HPA…………………………………………………………………………207

References………………………………………………………………………209

Appendix I: Basic …………………………………………..220

Appendix II: Publications……………………………………………………..227

List of Tables

Table 2.1. Comparison of kinetic parameters for inactivation of related GH13 enzymes……...... 59

Table 3.1. The selected six aryl-α-glucosides to be tested as TreS substrates…………117

Table 3.2. Summary of kinetic parameters of TreS substrates at 37 °C……………….119

Table 3.3. The individual kcat/Km values of TreS at different pH values………………122

Table 4.1. Kinetic parameters for SpGH101 and its various mutants………………….154

Table 4.2. The individual kcat/Km values of SPOG07 under different pH……………...155

Table 4.3. Kinetic parameters for the cleavage of DNPTAg by the E796A and E796Q mutants of SpGH101 in the presence of different anions………………………………160

List of Figures

Figure 1.1. Different stereochemical outcomes catalyzed by either an inverting glycosidase or a retaining glycosidase, as illustrated using an α-glycoside………………3

Figure 1.2. Nucleophile mutant of a retaining β-glucosidase acting as an “inverting” glycosidase in the presence of azide…………………………………………………….9

Figure 1.3. Mechanism of azide rescue for the acid/base mutant of a retaining β-glucosidase……………………………………………………………………………….10

Figure 1.4. a) Chemical structures of deoxynojirimycin and isofagomine; b) Donation of lone pair electrons from the endocyclic oxygen to the anomeric carbon where positive charge is developed at the transition state……………………………………………….12

Figure 1.5. An example of the proposed ion-pair intermediate for a retaining β-glucosidase……………………………………………………………………………….13

Figure 1.6. a) Mechanism of N-bromoacetylglycosylamines as affinity labels for glycosidases; b) Mechanism of a photoreactive probe as an affinity label to covalently derivatise the glycosidase………………………………………………………………..15

Figure 1.7. a) Chemical structures of 2-deoxy-2-fluoro-β-D-glucopyranosyl fluoride and 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside; b) Mechanism of trapping the covalent glycosyl-enzyme intermediate of a retaining β-glucosidase by 2-deoxy-2-fluoro-β-D-glucopyranosyl fluoride……………………………………………………..18

Figure 1.8. a) Chemical structures of some 2-deoxy-2,2-dihaloglycosides; b) Chemical structures of several 5-fluoroglycosyl fluorides…………………………………………19

Figure 1.9. Representative structures for each clan of glycoside hydrolases…………...22

Figure 1.10. a) Proposed transition state of a retaining β-glycosidase; b) Proposed transition state of a retaining α-glycosidase……………………………………………..25

Figure 2.1. Structures of amylose and amylopectin…………………………………….29

Figure 2.2. Crystal structure of wild type HPA ………………………………………...32

Figure 2.3. Binding subsites of HPA……………………………………………………33

Figure 2.4. The double displacement mechanism of HPA……………………………...35

Figure 2.5. Several potential mechanism-based inactivators for GH13 enzymes……...36

Figure 2.6. Hypothesized unwanted nucleophilic aromatic substitution between HPA and trinitrophenyl 2-deoxy-2,2-difluoro α-maltoside in the HPA crystal……………………37

Figure 2.7. Rationale of accumulation of the covalent intermediate of CGTase by 4’’-deoxy-α-maltotriosyl fluoride……………………………………………………………38

Figure 2.8. Rearrangement of acarbose in HPA crystal…………………………………41

Figure 2.9. Selected examples of GH13 α-glucosidase inhibitors………………………42

Figure 2.10. Mechanism of in situ elongation of GHIL by HPA……………………….43

Figure 2.11: Chemical structure of Montbretin A………………………………………44

Figure 2.12: Example of an HPA crystal……………………………………………….46

Figure 2.13: Mechanism-based inhibitors of yeast α-glucosidase and proposed mechanism-based inhibitors of HPA……………………………………………………47

Figure 2.14: General approaches to synthesize 2-deoxy-2,2-difluoro glycosides………48

Figure 2.15: Fluorination of glycals by Selectfluor™…………………………………..50

Figure 2.16: Structure of 2-chloro-4-nitrophenyl α-maltotrioside (CNPG3)……………55

Figure 2.17: Kinetic evaluations of compound 2.3 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.3 (50 mM). b) Kinetic analysis of (2.3) as a reversible inhibitor of HPA in the presence of 5 mM CNPG3……………..56

Figure 2.18: Kinetic evaluations of compound 2.4 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.4 (50 mM). b) Kinetic analysis of 2.4 as a reversible inhibitor of HPA in the presence of 2 mM CNPG3……………….57

Figure 2.19: Kinetic evaluations of compound 2.12 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.12 (20 mM). b) Kinetic analysis of 2.12 as a reversible inhibitor of HPA in the presence of 5 mM CNPG3……………..58

Figure 2.20: Chemical structure of p-toluenesulfonyl 2-deoxy-2,2-difluoro α-maltoside (2.17)…………………………………………………………………………………….60

Figure 2.21: Residual enzyme activity of HPA E233Q mutant versus time in the presence of compound 2.19 (10 mM)……………………………………………………………..63

Figure 2.22: a) Hypothesized in situ elongation mechanism for weak-binding inhibitors of HPA; b) Chemical structures of two glycosyl donors used in the in situ inhibitor elongation methodology: α-maltotriosyl fluoride (G3F); 4’-O-methyl α-maltosyl fluoride (MeG2F)…………………………………………………………………………………65

Figure 2.23: Proposed mechanism of the methylation reaction by TMS-CHN2. “ROH” in this figure represents compound (2.23)………………………………………………….67

Figure 2.24: Residual enzyme activity of HPA versus time in the presence of 10 mM MeG2F (2.21) and 50 mM 2-deoxy-2,2-difluoro α-maltosyl chloride (2.4) after pre-incubation at 30 °C for an hour………………………………………………………….69

Figure 2.25: Mass spectrum of the incubation mixture of compound 2.4 (50 mM), MeG2F (10 mM) and wild type HPA. (a) the whole spectrum; (b) expanded Peak A; (c) expanded Peak B; (d) expanded Peak C…………………………………………………70

Figure 2.26: Monosaccharide inhibitors which will be incubated with HPA in the presence of MeG2F to generate elongated mechanism-based inhibitors in situ…………71

Figure 2.27: Residual enzyme activity of HPA versus time in the presence of MeG2F (20 mM) and 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (100 mM) after pre-incubation at 30 °C for an hour………………………………………………………….72

Figure 2.28: Synthesis of 5-fluoro derivatives (a) Photobromination route towards 5-fluoro sugar derivatives; (b) Mechanism of photobromination reaction; (c) Retrosynthetic analysis of 5-fluoro sugar derivatives using epoxide fluoridolysis method……………………………………………………………………………………74

Figure 2.29: Residual activity of HPA in the presence of (O) 50 mM 5FGlcF and (●) 50 mM 5FGlcF + 20 mM MeG2F, at 30 °C………………………………………………...77

Figure 2.30: (a) Residual activity of HPA in the presence of (O) 50 mM 5FIdoF alone; (●) 50 mM 5FIdoF + 20 mM MeG2F; (□) 50 mM 5FIdoF + 40 mM MeG2F; and, (▲) 25 mM 5FIdoF + 40 mM G3F, at 30 °C. (b) Long-term analysis of residual activity of HPA (T) in the absence of any 5FIdoF and (X) in the presence of 100 mM 5FIdoF………..79

Figure 2.31: Mass spectra of HPA (above) and HPA treated with 25 mM 5FIdoF + 20 mM MeG2F (below)…………………………………………………………………….80

Figure 2.32: Reactivation of inactivated HPA at 30 °C in the presence of (a) only buffer and (b) different concentrations of maltose: (○) 20 mM maltose; (□) 30 mM maltose; (●) 40 mM maltose; (Δ) 70 mM maltose; (■) 100 mM maltose; (▲) 150 mM maltose…………………………………………………………………………………...82

Figure 2.33: Residual activity of porcine pancreatic α-amylase in the presence of (●) 20 mM MeG2F; (O) 25 mM 5FIdoF; and, (□) 20 mM MeG2F + 25 mM 5FIdoF………....83

Figure 2.34: Omit difference electron density maps of the (a) Monosaccharide 5-Fluoroidosyl-HPA (Condition 1); (b) G3F/5FIdoF/HPA (Condition 2); (c) MeG2F/5FIdoF/HPA (Condition 3)……………………………………………………..84

Figure 2.35: Schematic drawings of the structures determined for the Conditions 1-3 covalent glycosyl-intermediate complexes in the active site of HPA……………………85

Figure 2.36: Schematic diagrams illustrating the hydrogen bonding interactions formed in the active site of HPA in the (a) -1 binding subsite (covalent portion) and (b) the +1 binding subsite (non-covalent portion) of the MeG2F/5FIdoF (Condition 1) glycosyl-enzyme intermediate complex…………………………………………………………...87

Figure 2.37: Overlays of the bound structures of the monosaccharide 5-fluoroidosyl-HPA (Condition 1) intermediate in green, on top of the structure found for the non-covalent transition state mimic acarbose in cyan, in the active site of HPA adjacent to catalytic residues…………………………………………………………………………90

Figure 2.38: A comparative overlap of the structures of the covalently bound monosaccharide 5-Fluoroidosyl-HPA (Condition 1) intermediate in green; the G3F/5FIdoF (Condition 2) intermediate in yellow; and, the MeG2F/5FIdoF (Condition 3) intermediate in magenta………………………………………………………………….92

Figure 2.39: Schematic diagram illustrating the hydrogen bonding interactions for the elongated MeG2F/5FIdoF (Condition 3) covalent glycosyl-enzyme intermediate complex…………………………………………………………………………….95

Figure 2.40: Overlays of the bound structures of the MeG2-5FIdo-HPA (Condition 3) intermediate in magenta, on top of the structure found for the non-covalent transition state mimic acarbose in cyan, in the active site of HPA adjacent to catalytic residues….96

Figure 2.41: Distortion of substrate by glycosidases……………………………………97

Figure 2.42: Stoddart’s pyranoside ring interconversion map…………………………..99

Figure 2.43: Possible pyranose conformational itinerary of α-amylase catalysis……...100

Figure 3.1: Chemical structure of trehalose……………………………………………104

Figure 3.2: Selected examples of trehalose-containing glycolipids from mycobacteria (a) trehalose dimycolate (cord factor) (b) sulfolipid-1 (SL-1)……………………………..107

Figure 3.3: Possible products formed from deuterated trehalose, normal trehalose and TreS in the different mechanistic scenarios…………………………………………….112

Figure 3.4: Potential substrates of TreS: aryl α-glucosides and α-glucosyl fluoride…..115

Figure 3.5: An illustrative example of a biphasic Brønsted plot obtained for hydrolysis of a series of aryl glycosides by a retaining β-glycosidase………………………………..116

Figure 3.6: Brønsted plot relating rates (log kcat) of TreS-catalyzed hydrolysis of five aryl glucoside substrates with the leaving group ability of their phenol aglycones (pKa)…..119

Figure 3.7: Methanol competition experiments of TreS with 2 mM DNPGlc and different concentrations of methanol at 37 °C……………………………………………………120

Figure 3.8: Residual enzyme activity of TreS after incubation at 37 °C for one hour at different pH values……………………………………………………………………...121

Figure 3.9: Dependence of kcat/Km upon pH for hydrolysis of DNPGlc by wild type TreS……………………………………………………………………………………..122

Figure 3.10: (a) Dixon plot of 5FGlcF as an apparent reversible inhibitor of TreS (b) reaction of 1 mM 5FGlcF and 0.5 μM TreS enzyme at 37 °C, monitored by a fluoride ion electrode………………………………………………………………………………...124

Figure 3.11: Inactivation of TreS by 5FIdoF at 37 °C. (a) Plot of residual activity versus time at the inhibitor concentrations shown (b) Replot of inactivation rate constants versus concentration of inactivator…………………………………………………………….125

Figure 3.12: (a) Dixon plots of GHIL as a reversible inhibitor of TreS (b) Dixon plot of casuarine as a reversible inhibitor of TreS (c) Chemical structure of GHIL (d) Chemical structure of casuarine…………………………………………………………………..126

Figure 3.13: Inactivation of TreS by 10 mM 5FIdoF in the absence and the presence of 5 μM casuarine at 37 °C………………………………………………………………….127

Figure 3.14: Mass spectrum of (a) intact TreS enzyme (b) mixture of TreS enzyme with 10 mM 5FIdoF, incubated at 37 °C for six hours………………………………………129

Figure 3.15: ESI-MS/MS analysis of unlabeled peptide P1 (a) MS/MS fragment-ion spectrum of peptide P1 (b) Fragmentation pattern of peptide P1 and corresponding m/z of singly charged b-ions and y-ions……………………………………………………….130

Figure 3.16: ESI-MS/MS analysis of labeled peptide P2 (a) MS/MS fragment-ion spectrum of peptide P2 (b) Fragmentation pattern of peptide P2 and corresponding m/z of singly charged b-ions and y-ions……………………………………………………….131

Figure 3.17: Partial sequence alignment of several GH13 glycosidases including TreS……………………………………………………………………………………..132

Figure 3.18: TLC analysis of reactions containing 10 mM U-13C-D-glucose, 2 mM maltose and (a) 0.14 μM TreS enzyme (b) no enzyme at 37 °C for 24 hours followed by lyophilization, acetylation and aqueous workup……………………………………….134

Figure 3.19: Mass spectrum of the reaction containing 10 mM U-13C-D-glucose, 2 mM maltose and 0.14 μM TreS enzyme at 37 °C for 24 hours followed by lyophilization, acetylation and aqueous workup……………………………………………………….135

Figure 3.20: TLC analysis of the reaction of 15 mM glucose, 5 mM αGlcF and 0.35 μM TreS enzyme at room temperature for 28 hours……………………………………….136

Figure 4.1: General biosynthesis of Core 1 subtype O-glycans………………………141

Figure 4.2: The overall architecture of SpGH101…………………………………….144

Figure 4.3: Sequence alignment of the proposed catalytic domain of different GH101 enzymes………………………………………………………………………………..145

Figure 4.4: Structural representation of the catalytic domain of SpGH101………….146

Figure 4.5: Several SpGH101 substrates……………………………………………..149

Figure 4.6: 1H-NMR determination of the anomeric configuration of the products of the reaction between wild type SpGH101 and pNPTAg……………………………..152

Figure 4.7: Dependence of kcat/Km upon pH for hydrolysis of DNPTAg by wild type SpGH101……………………………………………………………………………..155

Figure 4.8: TLC image of various reactions between SpGH101 enzymes and 0.5 mM TAg-IFN……………………………………………………………………………….157

Figure 4.9: Mechanisms of anion rescue of (A) Acid/base mutants, (B) Nucleophile mutants of retaining glycosidases……………………………………………………..159

Figure 4.10: Chemical rescue of the SpGH101 E796Q mutant at increasing concentrations of sodium azide……………………………………………………….160

List of Schemes

Scheme 1.1: Reactions catalyzed by glycosidases, transglycosidases and phosphorylases…………………………………………………………………………….2

Scheme 1.2: The catalytic mechanism of inverting glycosidases………………………...6

Scheme 1.3: Double displacement mechanism of a retaining α-glucosidase…………….7

Scheme 1.4: Mechanism of covalent labeling of glycosidases by ortho-(difluoromethyl)-arylglycoside……………………………………………………………………………..16

Scheme 1.5: Mechanism of covalent labeling of glycosidases by CBE…………………17

Scheme 2.1: Synthesis of protected 2-fluoro maltal (2.5) and 3,6-di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl bromide (2.6)………………………………………………………………….49

Scheme 2.2: Synthesis of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12)………………………………………………………………...51

Scheme 2.3: Synthesis of 2-chloro-2-deoxy-2-fluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-glucopyranosyl chloride (2.3)…………………………………………………………51

Scheme 2.4: Synthesis of 2-deoxy-2,2-difluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-arabinohexopyranosyl chloride (2.4)……………………………………………………52

Scheme 2.5: Attempted synthesis of tosyl 2-deoxy-2,2-difluoro α-maltoside (2.17)…...61

Scheme 2.6: Synthesis of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (2.19)………………………………………………………………...62

Scheme 2.7: Repeating Dr. Damager’s synthetic route towards 4’-O-methyl-α-maltosyl fluoride (2.21)……………………………………………………………………………66

Scheme 2.8: Improved synthesis of 4’-O-methyl α-maltosyl fluoride (2.21)…………...68

Scheme 2.9: Synthesis of 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride…..71

Scheme 2.10: Chemical synthesis of 5FGlcF (2.30) and 5FIdoF (2.31)………………...75

Scheme 2.11: Proposed in situ elongation-trapping strategy using 5FGlcF and MeG2F…78

Scheme 3.1: The TPS-TPP biosynthetic pathway of trehalose………………………...108

Scheme 3.2: The TreY-TreZ biosynthetic pathway of trehalose………………………109

Scheme 3.3: TreS-catalyzed isomerisation of maltose and trehalose…………………..110

Scheme 3.4: Proposed catalytic mechanism of TreS…………………………………..113

Scheme 3.5: Chemical synthesis of αGlcF (3.1)……………………………………….117

Scheme 3.6: Chemical synthesis of DNPGlc (3.2)…………………………………….118

Scheme 3.7: Chemical synthesis of 2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride….123

Scheme 3.8: Catalytic mechanism of PGM……………………………………………137

Scheme 4.1: Chemo-enzymatic synthesis of DNPTAg (4.1)………………………….150

List of Abbreviations

5FGlcF - 5-fluoro-α-D-glucopyranosyl fluoride

5FIdoF - 5-fluoro-β-L-idopyranosyl fluoride

Abg - Agrobacterium sp. β-glucosidase

AcCl - acetyl chloride

AcOF - acetyl hypofluorite

AcOH - acetic acid

αGlcF - α-glucosyl fluoride

BMGY - Buffered Glycerol-complex Medium

BMMY - Buffered Methanol-complex Medium

BnBr - benzyl bromide

CAZy - carbohydrate-active enzymes

CBE - conduritol B-epoxide

CBM - carbohydrate binding module

CGTase - cyclodextrin glucanotransferase

CNPG3 - 2-chloro-4-nitrophenyl α-maltotrioside

Da - Dalton

DAST - diethylaminosulfur trifluoride

DMAP - 4-dimethylaminopyridine

DMF - dimethylformamide

DTT - dithiothreitol

EDTA - ethylenediaminetetraacetic acid

Et2O - diethyl ether

EtOAc - ethyl acetate

ER - endoplasmic reticulum

ESI - electrospray ionization

GH - glycoside hydrolase

GHIL - D-gluconohydroximino-1,5-lactam

HPA - human pancreatic α-amylase

HPLC - high performance liquid chromatography

HRMS - high resolution mass spectrometry

IPTG - isopropyl β-D-1-thiogalactopyranoside

kcat - catalytic rate constant

kH/kD - ratio of catalytic rate constant of protio and deuterated substrates

Km - Michaelis constant

Ki - dissociation constant

LC/MS - liquid chromatography/mass spectrometry

MeG2F - 4’-O-methyl α-maltosyl fluoride

MeOH - methanol

MS - mass spectrometry

NaOMe - sodium methoxide

NMR - nuclear magnetic resonance

PDB - protein data bank

PGM - phosphoglucomutase

PPA - porcine pancreatic α-amylase

Q-TOF - quadrupole time-of-flight (mass spectrometry)

Rf - retention factor

Selectfluor™ - 1-chloromethyl-4-fluoro-1,4-diazoniabicyclo[2.2.2]octane bis(tetrafluoroborate)

SpGH101 - endo-α-N-acetylgalactosaminidase from Streptococcus pneumoniae

T-Ag - Thomsen-Friedenreich antigen

THF - tetrahydrofuran

TIM - triosephosphate isomerase

TLC - thin layer chromatography

TreS - trehalose synthase

TreY - maltooligosyltrehalose synthase

TreZ - maltooligotrehalose trehalohydrolase

TRIUMF - TRI-University Meson Facility

UDP-glucose - uridine diphosphate glucose

UV - ultra-violet

WT - wild type

List of Amino Acid Abbreviations

Ala - A - Alanine
Arg - R - Arginine
Asn - N - Asparagine
Asp - D - Aspartic acid
Cys - C - Cysteine
Glu - E - Glutamic acid
Gln - Q - Glutamine
Gly - G - Glycine
His - H - Histidine
Ile - I - Isoleucine
Leu - L - Leucine
Lys - K - Lysine
Met - M - Methionine
Phe - F - Phenylalanine
Pro - P - Proline
Ser - S - Serine
Thr - T - Threonine
Trp - W - Tryptophan
Tyr - Y - Tyrosine
Val - V - Valine

Acknowledgements

First of all, I would like to express my deepest gratitude to my research supervisor, Dr. Stephen G. Withers, for all his help over the past six years while I was working towards my Ph.D. degree in his group. Without his guidance, insight, encouragement and, most importantly, patience, this thesis could not have been accomplished. I feel so lucky to have been part of his research group. His enthusiasm for science and life has been, and will continue to be, a source of inspiration for me.

I am very grateful to the many great collaborators who have contributed to this work. My thanks go to Dr. Gary Brayer and Dr. Chunmin Li for their generous support and tireless efforts in solving numerous crystal structures for the α-amylase project; the late Dr. Alan Elbein and Dr. Y.T. Pan for providing purified TreS enzyme; and Dr. Warren Wakarchuk for a fruitful collaboration on the SPOG project.

I wish to thank all the past and present members of the Withers group for their help and friendship. Because of them, my research life in the past six years has been such an enjoyable journey. Special thanks go to Dr. Hongming Chen for his extensive help in synthesis and casual chats on subjects other than chemistry; Dr. Chris Tarling for his generous help in kinetics; Dr. Omid Hekmat for teaching me various aspects of research when I first joined the group; Dr. Shouming He for his tremendous help in ESI-MS; Dr. Jin-Hyo Kim and Dr. Stefan Juers for making my research life in A342 such a memorable time; Dr. Jacqueline Wicki for reading and correcting part of my Ph.D. thesis; Ms. Emily Kwan for assistance in the lab; and Ms. Miranda Joyce for help in preparing lots of paperwork.

In addition, I want to express my gratitude to many people in the Chemistry Department, UBC. In particular, my thanks go to Dr. Martin Tanner for his critical reading of this thesis; Dr. Nick Burlinson and Ms. Zorana Danilovic for their technical help with NMR spectroscopy; Dr. Yun Ling for many suggestions on mass spectrometry; and Dr. Mike Adam for letting me use the fluorination facility at TRIUMF.

I am very grateful to the British Columbia Innovation Council for offering me a BCIC Innovation Scholarship. The Department of Chemistry (UBC) is also thanked for providing me with the Arthur S. Hawkes Scholarship in Chemistry and the Gladys Estella Laird Research Fellowship.

My heartfelt personal thanks go to my parents, who are far away in China. Their support has always been an important source of inspiration for me. Lastly, I really want to thank my wife Suellen for her love, support and unwavering belief. Without her, this thesis would not have been possible.


For My Parents, and Suellen

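This thesis is respectfully dedicated to my father, Mr. Zhang Zhenyi (张振义), my mother, Ms. Lu Jianping (卢建萍), and my beloved wife, Ms. Zhou Yuan (周媛).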


Chapter 1: Introduction

1.1 Glycosidases and CAZy classification

Carbohydrates and glycoconjugates are among the most abundant biological molecules occurring in nature and play crucial roles such as the storage of metabolic energy, maintenance of cellular structural integrity and participation in a range of important biological recognition processes.1 The controlled biosynthesis and degradation of these structures is therefore vital to the function of all organisms. This task is rendered challenging by the considerable stability of the glycosidic bonds involved. For example, it has been estimated that the half-lives for the spontaneous hydrolysis of cellulose and starch are in the range of five million years!2,3 Fortunately, nature provides a solution to this problem in the form of highly proficient enzymes known as glycosidases and glycosyltransferases.
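To put the quoted half-life in kinetic terms, the following minimal sketch converts it to a first-order rate constant using k = ln 2 / t1/2. It assumes simple first-order kinetics and a 365.25-day year; the function and constant names are illustrative and do not come from the cited work.

```python
import math

# Assumption: a Julian year of 365.25 days is used for the unit conversion.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def first_order_rate_constant(half_life_years):
    """Return the first-order rate constant k (s^-1) for a given half-life in years."""
    half_life_seconds = half_life_years * SECONDS_PER_YEAR
    return math.log(2) / half_life_seconds


# Half-life of roughly five million years, as quoted in the text.
k_uncatalyzed = first_order_rate_constant(5e6)
print(f"k(uncatalyzed) ~ {k_uncatalyzed:.1e} s^-1")
```

Under these assumptions the uncatalyzed rate constant is on the order of 10^-15 s^-1, which gives a sense of the rate enhancement that glycosidases must provide.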

[Scheme 1.1 graphic: a glycoside (sugar-OR1) reacts with H2O (glycosidase), an alcohol R2OH (transglycosidase) or phosphate (phosphorylase) to give the sugar-OH, sugar-OR2 or sugar-OPi product, respectively, plus HOR1.]

Scheme 1.1: Reactions catalyzed by glycosidases, transglycosidases and phosphorylases.

Enzymes responsible for degrading various glycoconjugates fall into three types, depending on their specific glycosyl acceptors (Scheme 1.1).4 Glycosidases or glycoside hydrolases (GH) utilize water as the acceptor with the net reaction being hydrolysis. Considering the concentration of water under physiological conditions, this reaction is typically regarded as irreversible though this is not strictly correct thermodynamically.5 If the acceptor is an alcohol functionality from another molecule such as a sugar or a lipid, the term “transglycosidase” is used to describe the enzyme involved. Although the net reaction is formation of a new glycosidic bond, transglycosidases are still grouped together with glycosidases since these two types of enzymes share strong similarity in terms of both structures and catalytic machinery. Equilibrium constants for such reactions typically lie close to unity given the chemical similarity of the substrate and product. Well known examples of transglycosidases are the cyclodextrin glycosyltransferases which catalyze the formation of cyclic dextrins from starch. The third type of enzymes is that of the phosphorylases, which also typically have equilibrium constants close to unity.4 This reaction can therefore easily be driven in either direction by an excess of one reagent. Depending on the direction of the catalyzed reaction, they can either degrade a polysaccharide by transferring the glycosyl moiety to an inorganic phosphate acceptor or synthesize a new glycosidic bond by utilizing a glycosyl phosphate as the donor. For example, glycogen phosphorylase, working in conjunction with glycogen debranching enzyme, breaks glycogen down to glucose-1-phosphate, which is subsequently metabolized. The first two types of enzymes, glycosidases and transglycosidases, are the focus of this thesis.

Figure 1.1: Different stereochemical outcomes catalyzed by either an inverting glycosidase or a retaining glycosidase, as illustrated using an α-glycoside.

All the reactions catalyzed by glycosidases or transglycosidases are essentially nucleophilic displacement reactions at a saturated carbon, the anomeric center (a chiral acetal or ketal center). There are only two possible stereochemical outcomes for this type of reaction, namely inversion or retention of configuration at the anomeric center (Figure 1.1).5 Since different chemical mechanisms are associated with each scenario, determination of the stereochemical outcome is of fundamental importance to the discussion of catalytic mechanisms for any enzymatic glycosyl transfer. In addition to the classification as either “inverting” or “retaining” enzymes, glycosidases are also classified according to the anomeric stereochemistry of their substrates, thus names such as α-glycosidases or β-glycosidases are used.

Consistent with the occurrence of diverse glycan structures in nature, there exist a large number of glycosidases, which obviously requires a systematic classification. Besides the general criteria such as inverting/retaining and α-glycosidases/β-glycosidases mentioned before, one of the most powerful classifications of glycosidases was introduced by Bernard Henrissat in 1991, based on the premise that the primary sequences of proteins dictate their three-dimensional structures and hence their catalytic mechanisms.6,7 Glycosidases that share significant similarities in their primary sequences were grouped together and a family number was assigned for each such group. At the time of writing, there are 115 GH families and the number of enzymes entered into the database is still increasing rapidly. Valuable information on these Carbohydrate-Active enZYme (CAZy) families, including links to the primary sequences, three-dimensional structures as well as mechanistic information, is available at the following excellent, up-to-date website: http://www.cazy.org.7 Very recently, an encyclopedic website (CAZypedia), which provides more detailed mechanistic information on CAZy families, was also released on the internet: http://www.cazypedia.org/index.php/Main_Page.

The classification of glycosidases based upon sequence similarity can greatly simplify studies of this class of enzymes since enzymes within the same family are thought to share structural and mechanistic similarities. This allows generalizations to be made based on structural and mechanistic studies on representative members of a family. This concept was first tested by Withers and coworkers by examining the stereochemical outcome for ten β-1,4-glucanases and xylanases which were sampled from five different families.8 Indeed, enzymes from the same family were found to catalyze glycoside hydrolytic reactions with the same stereochemical outcome, which strongly suggests that the same catalytic mechanism is employed. With the availability of three-dimensional structures of many glycosidases in the past decade, it was also found that the general fold and active site topology are shared by members from the same family, validating the generality of this classification.9,10 Therefore, once the catalytic residues of one glycosidase have been identified, the corresponding residues for other members of the same family can be easily and accurately predicted by sequence alignment.11 Further, when structural information is available on one family member, homology models can be constructed between members from the same family and useful insights can be gleaned.

As the number of solved crystal structures of glycosidases has grown, it has been found that some families having little sequence similarity are nonetheless structurally related. Since three-dimensional structures are more conserved than primary sequences in the evolutionary process, the concept of “clans” was created to accommodate different families having similar structures.12 Currently, there are 14 clans, each of which contains at least two families. The largest clan is GH A, which currently includes families 1, 2, 5, 10, 17, 26, 30, 35, 39, 42, 50, 51, 53, 59, 72, 79 and 86. All these families have a (β/α)8 barrel fold with the catalytic general acid/base residue and nucleophile residue located on strands 4 and 7, respectively.13 They are therefore sometimes called the 4/7 superfamily.

1.2 Catalytic mechanisms of glycosidases

The two canonical mechanisms of enzymatic glycoside hydrolysis were first proposed by Koshland in 1953.14 Since then, considerable amounts of structural and mechanistic data have been accumulated in support of the proposed mechanisms, both of which involve nucleophilic displacement steps.5,15 The reactions result in either net inversion (Section 1.2.1) or retention (Section 1.2.2) of the substrate anomeric configuration.

1.2.1 Inverting glycosidases

The inverting glycosidases operate through a single displacement mechanism, as shown in Scheme 1.2. The key catalytic residues are usually a pair of carboxylic acids, which are typically separated by a distance of 6-12 Å, somewhat larger than that of the retaining glycosidases.16 This greater distance is necessary for the accommodation of a water molecule as well as the substrate at the active site and is one of the fundamental differences between the two general glycosidase mechanisms. One carboxylate acts as the general base and activates a water molecule for nucleophilic attack at the substrate anomeric center. At the same time, the second carboxylic acid functionality facilitates the departure of the leaving group via general acid catalysis. Reaction occurs via an oxocarbenium ion-like transition state.

Scheme 1.2: The catalytic mechanism of inverting glycosidases

1.2.2 Retaining glycosidases

Two key amino acid residues, a catalytic nucleophile and a general acid/base catalyst, can be found within the active site of retaining glycosidases. Although there are exceptions (such as GH33 sialidases), both catalytic residues are usually carboxylic acids, either Asp or Glu. The two carboxylic acid residues are separated by a distance of approximately 5 Å, which is shorter than that of the inverting enzymes.16 The catalytic nucleophile is suitably located for in-line attack on the anomeric center. The second amino acid is in close proximity to the glycosidic oxygen and donates a proton to this oxygen during the nucleophilic displacement step, thereby providing general acid catalysis to leaving group departure (Scheme 1.3). The glycosylation step results in formation of a covalent glycosyl-enzyme intermediate via an oxocarbenium ion-like transition state. The residue that functioned as the general acid catalyst in the first step then acts as a general base in the second step, activating a water molecule for nucleophilic attack at the anomeric center of the glycosyl-enzyme intermediate.5,15 This second step is called the deglycosylation step and also proceeds via an oxocarbenium ion-like transition state. Overall, two nucleophilic displacement steps are required for substrate hydrolysis, which therefore results in net retention of anomeric configuration. This mechanism is commonly known as the double-displacement mechanism.

[Scheme 1.3 graphic, panels: glycosylation step; covalent glycosyl-enzyme intermediate; deglycosylation step.]

Scheme 1.3: Double displacement mechanism of a retaining α-glucosidase

1.2.3 Mechanistic anomalies

The classical inverting and retaining mechanisms discussed above are adopted by the majority of glycosidases. However, with continuous structural and mechanistic characterization of glycosidases, several new mechanisms have been revealed for other groups of enzymes. For example, the N-acetyl-β-hexosaminidases and related enzymes belonging to GH18, 20, 84 and 85 were found to utilize the substrate acetamido carbonyl oxygen as an intramolecular nucleophile rather than a carboxylic amino acid residue.17,18 Sialidases and trans-sialidases from GH33, 34 and 83, on the other hand, were demonstrated to use a tyrosine residue as the catalytic nucleophile.19-21 Redox chemistry, which initially seemed irrelevant to glycoside hydrolysis reactions, was found to be involved in the catalytic mechanism of glycosidases from GH4 and 109.22-25 Several extensive reviews have been published on these atypical mechanisms,26,27 which are not the focus of this thesis. All the following discussions will only concern the canonical double displacement mechanism.

1.3 Important features of the double displacement mechanism

1.3.1 The catalytic nucleophile

Except for certain hexosaminidases and sialidases described in the previous section, all the biochemical studies so far have confirmed the catalytic nucleophile in the classical double displacement mechanism as being invariably a carboxylate residue (Asp or Glu) of the glycosidase. This conclusion came from crystallographic studies of a large number of retaining glycosidases16 and various trapping studies employing mechanism-based inhibitors15 (details of using mechanism-based inhibitors to study retaining glycosidases will be discussed in Section 1.3.5). From the perspective of organic chemistry, the choice of the catalytic nucleophile should satisfy at least three criteria. Firstly, it must be a reasonably good nucleophile to displace the anomeric group of substrates. Secondly, the glycosylated nucleophile in the glycosyl-enzyme intermediate should be modestly unstable, which allows further rapid hydrolysis. Thirdly, it should possess a certain degree of anionic character so that it can stabilize the cationic oxocarbenium ion-like transition state (to be discussed in Section 1.3.3). Therefore, it is not surprising to see that nature selects a carboxylate residue to fulfill this catalytic role.

The importance of the catalytic nucleophile to the function of glycosidases can be evaluated by studying the corresponding nucleophile mutants. Dramatic decreases of enzymatic activity, typically by more than six orders of magnitude, were observed in all cases in which the nucleophile was mutated.28-30 This loss of activity can be at least partially restored by adding exogenous nucleophilic anions such as azide and formate to occupy the vacancy created by the mutation and serve as alternate nucleophiles. A new product with inverted anomeric configuration will be generated, as shown in Figure 1.2. This formation of an inverted product essentially converts a retaining glycosidase into an inverting enzyme, as exemplified by the studies with GH1 Agrobacterium sp. β-glucosidase (Abg), and provides a useful method to identify the nucleophile residues of retaining glycosidases.28,30,31
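As a rough guide to what a rate decrease of six orders of magnitude means energetically, the sketch below converts a rate ratio into a difference in activation free energy using ΔΔG‡ = RT ln(k_wt/k_mut). This is only an illustration: it assumes transition-state theory with identical pre-exponential terms for both enzymes and a temperature of 25 °C, neither of which is specified in the text, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_rate_ratio(rate_ratio, temperature_k=298.15):
    """Return the activation free energy difference (kcal/mol) for a rate ratio.

    Assumption: the rate ratio reflects only a change in activation free
    energy (transition-state theory, same pre-exponential factor).
    """
    return R_KCAL * temperature_k * math.log(rate_ratio)


# A 10^6-fold loss of activity, as described for nucleophile mutants.
ddg = ddg_from_rate_ratio(1e6)
print(f"ddG(act) ~ {ddg:.1f} kcal/mol")
```

Under these assumptions, a 10^6-fold rate decrease corresponds to roughly 8 kcal/mol of lost transition-state stabilization.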

Figure 1.2: Nucleophile mutant of a retaining β-glucosidase acting as an “inverting” glycosidase in the presence of azide.

1.3.2 General acid/base catalyst

This catalytic residue plays two roles in the double displacement mechanism, both as a general acid to facilitate the cleavage of the glycosidic linkage and as a general base to activate the water which attacks the glycosyl-enzyme intermediate. Usually, inspecting the crystal structure of the glycosidase together with aligning its primary sequence with family members will give useful hints regarding the identity of the general acid/base residue. However, caution is needed when analyzing the kinetic data of the proposed acid/base mutants in order to confirm its role, as discussed below.

For substrates with leaving groups that do not need much acid catalysis, such as fluoride and 2,4-dinitrophenolate, the rate of the glycosylation step will be affected relatively little by the mutation, as evidenced by their kcat/Km values. However, the rate of the glycosylation step will be greatly lowered if the substrates bear poorer leaving groups that do need acid assistance for cleavage, such as simple alcohols. In both cases, the deglycosylation step will be equally slowed due to the lack of general base catalysis.15,30-33 Sometimes, this leads to the deglycosylation step becoming the rate-limiting step, and thus to accumulation of the covalent glycosyl-enzyme intermediate. An unusually small Km value is a good indicator of such circumstances, which often occur when substrates with good leaving groups react with the acid/base mutants.

Figure 1.3: Mechanism of azide rescue for the acid/base mutant of a retaining β-glucosidase

The finding of such kinetic behavior provides strong evidence to support the assignment of the acid/base residue, but is not conclusive. A simple, yet diagnostic test would be anion rescue experiments in which exogenous nucleophilic anions such as azide, formate, acetate and fluoride are added to the proposed acid/base mutant. If deglycosylation is rate-limiting for the substrate, the rate of its cleavage will be greatly enhanced since the anion will occupy the cavity created by mutation and react more rapidly with the glycosyl-enzyme intermediate than does water without base catalysis. In addition, a new product with retained anomeric configuration will be produced, as shown in Figure 1.3.30-33 Importantly, no such product can be generated by the wild type enzyme since the negatively charged acid/base residue will exclude the entry of the anion into the active site due to charge repulsion. Therefore the rate acceleration and detection of the new product will firmly establish the identity of the general acid/base residue.

The dual function of the general acid/base residue requires its pKa to be switchable during catalysis for optimal activity. Excitingly, this has been demonstrated in the case of Bacillus circulans xylanase (Bcx) by using 13C-NMR titration for both the free enzyme and its trapped glycosyl-enzyme intermediate.34 The pKa values of the nucleophile (Glu78) and the acid/base residue (Glu172) of the free enzyme were determined to be 4.6 and 6.7, respectively, well suited to their roles. However, when trapped with a mechanism-based inhibitor to form the covalent intermediate, the pKa value of Glu172 was demonstrated to be 4.2, lower by 2.5 units. The lowering of pKa is exactly what is needed for Glu172 to provide general base catalysis in the deglycosylation step, in contrast with its general acid role in the glycosylation step. Such a phenomenon arises from the change in charge of the catalytic nucleophile from anionic in the free enzyme to neutral in the intermediate. This phenomenon is believed to be general among retaining glycosidases and is called “pKa cycling”.
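The consequence of this pKa shift for the ionization state of Glu172 can be illustrated with the Henderson-Hasselbalch relationship. The sketch below uses the two pKa values quoted above; the working pH of 5.7 is an assumption chosen only because it lies between those values, and the function name is illustrative.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an acid present as its conjugate base (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PH = 5.7  # hypothetical working pH, not taken from the text

# Glu172 in the free enzyme (pKa 6.7) and in the trapped intermediate (pKa 4.2).
free_enzyme = fraction_deprotonated(6.7, ASSUMED_PH)
intermediate = fraction_deprotonated(4.2, ASSUMED_PH)

print(f"free enzyme:  {free_enzyme:.0%} deprotonated")
print(f"intermediate: {intermediate:.0%} deprotonated")
```

At the assumed pH, Glu172 would be mostly protonated in the free enzyme, as required of a general acid, and mostly deprotonated in the intermediate, as required of a general base.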

1.3.3 Oxocarbenium-ion like transition state

One of the most obvious features of the double displacement mechanism is the occurrence of two oxocarbenium ion-like transition states. The strongest support for such transition states was provided by the normal secondary kinetic isotope effect (kH/kD>1) determined for both the glycosylation and deglycosylation steps.35-38 These results can only be rationalized by the existence of sp2-hybridized carbon centers at the transition states for both steps. Further support came from the high affinity binding (μM to nM) of several azasugars as glycosidase inhibitors, which were designed to mimic the positive charge developed at oxocarbenium ion-like transition states,39 as shown in Figure 1.4(a).

Figure 1.4: a) Chemical structures of deoxynojirimycin and isofagomine b) donation of lone pair electrons from the endocyclic oxygen to the anomeric carbon where positive charge is developed at the transition state.

The positive charge developed at C-1 at the transition state is believed to be stabilized by the lone pair of p-electrons from the endocyclic oxygen (Figure 1.4(b)). For the best orbital overlap, the four atoms C-5, O-5, C-1 and C-2 are required to be coplanar, thus distorting the pyranose ring away from a chair conformation. It has been proposed that only four conformations of the pyranose ring can satisfy this requirement and they are B₂,₅, ²,⁵B, ³H₄ and ⁴H₃, respectively.40 Significant efforts have been put into synthesizing stable molecules which can mimic these structural features of the oxocarbenium ion-like transition state, since high-affinity glycosidase inhibitors are highly desirable for potential medical applications.

1.3.4 The covalent glycosyl-enzyme intermediate

The existence of a covalent glycosyl-enzyme intermediate in the retaining mechanism has been the center of much debate. The pioneering structural work on hen egg white lysozyme by Phillips suggested the occurrence of a long-lived oxocarbenium ion intermediate in the double displacement mechanism, as shown for a β-glucosidase in Figure 1.5(a).41,42 This subsequently became the paradigm for the catalytic mechanism of retaining β-glycosidases and has been presented in many biochemistry textbooks since then. However, accumulating evidence from various other studies supported a covalent glycosyl-enzyme intermediate, rather than the proposed “ion-pair”, as the principal intermediate involved in the double displacement mechanism. Many physical organic studies employing model systems have demonstrated that discrete oxocarbenium ions do exist in free solution, but their lifetimes were estimated to be usually less than 10⁻¹⁰ s.43 For example, the lifetime of a glucosyl cation was estimated by Jencks to be around 10⁻¹² s in water, by extrapolating the lifetimes of a series of derivatives of substituted benzaldehydes and aliphatic aldehydes.44,45 While such a lifetime is longer than the period of a bond vibration (~10⁻¹⁴ s), studies have shown that in the presence of negatively-charged nucleophiles such as azide and acetate the lifetime of sugar oxocarbenium ions is too short to be meaningful.45 It is therefore unreasonable to expect that the intermediate will take the form of a discrete oxocarbenium cation, especially in the presence of anionic carboxylate residues within the enzyme’s active site.

Figure 1.5: An example of the proposed ion-pair intermediate for a retaining β-glucosidase

The most convincing experimental evidence to establish the covalent nature of the intermediate came from kinetic isotope effect measurements. As mentioned in Section 1.3.3, a normal secondary kinetic isotope effect (kH/kD>1) was obtained when using substrates for which the deglycosylation step is rate-limiting with the enzyme. Such results indicate that the anomeric carbon undergoes a change from sp3 to sp2 hybridization during this step, which corresponds to a covalent intermediate and an oxocarbenium ion-like transition state, respectively. If an ion pair intermediate were involved, an inverse kinetic isotope effect (kH/kD<1), rather than a normal one, would have been observed.
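The qualitative reasoning behind this argument can be written down as a small lookup. The sketch below is only a mnemonic for the direction of the α-secondary deuterium kinetic isotope effect expected for a given change in hybridization at the anomeric carbon; the function name and string labels are hypothetical, and no magnitudes are implied.

```python
def expected_alpha_kie(ground_state, transition_state):
    """Return the qualitative direction of the alpha-secondary deuterium KIE.

    Arguments are hybridization labels ("sp3" or "sp2") for the anomeric
    carbon in the species preceding the step and at the transition state.
    """
    if ground_state == "sp3" and transition_state == "sp2":
        return "normal (kH/kD > 1)"
    if ground_state == "sp2" and transition_state == "sp3":
        return "inverse (kH/kD < 1)"
    return "near unity (kH/kD ~ 1)"


# Covalent intermediate (sp3) going to an oxocarbenium ion-like transition state.
covalent_case = expected_alpha_kie("sp3", "sp2")
# Ion-pair intermediate (sp2) gaining sp3 character as the nucleophile attacks.
ion_pair_case = expected_alpha_kie("sp2", "sp3")
print(covalent_case)
print(ion_pair_case)
```

Only the covalent-intermediate case reproduces the normal isotope effect observed for the deglycosylation step.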

Extensive research has been carried out to study the covalent glycosyl-enzyme intermediate and common techniques employed include mass spectrometry, NMR spectroscopy and X-ray crystallography.46 A prerequisite of applying these techniques to study the fleeting covalent intermediate, however, is to stabilize it so that its lifetime is long enough for it to be characterized. Fortunately, this goal has been achieved by designing and applying mechanism-based inhibitors or “slow substrates”.

1.3.5 Covalent inhibitors of glycosidases

Mechanism-based inhibitors belong to a broader category of inhibitors called covalent inhibitors, which have a long history of application to glycosidases.47 In contrast to many non-covalent, reversible inhibitors, covalent inhibitors (or covalent inactivators) “shut down” the enzymatic activity by forming a covalent bond with the enzyme. This usually blocks the entry of other molecules into the enzyme’s active site and achieves tight inhibition. Typically the covalent bond is formed via a reaction between a nucleophilic enzymatic residue and an electrophilic center installed on the inhibitor. One reason for using covalent inhibitors to study glycosidases is to identify the catalytically critical amino acids. Subsequent mutation and kinetic analysis of the corresponding mutants can confirm the importance of the identified residues. Indeed, through the design and use of more specific and reliable fluorinated sugar inhibitors (to be discussed later), the catalytic nucleophiles of a large number of retaining glycosidases have been successfully identified. Besides these mechanistic purposes, covalent inhibitors of glycosidases have found many other important applications in areas such as screening aglycone specificity of glycosidases and, more recently, proteomics.48-51

Covalent inhibitors can be divided into two classes: affinity labels and mechanism-based inhibitors. Affinity labels, as the name explains, contain a “specificity” portion which possesses affinity to the enzyme of interest, and a “warhead” portion which can form a covalent bond with the enzyme. The formation of the covalent bond can be the result of either the inherent high reactivity of “the warhead” or of being triggered by external activation such as light. By contrast, mechanism-based inhibitors are generally stable molecules in buffer solutions without enzymes. They only become reactive as a consequence of the catalytic machinery of the enzymes but then form a covalent bond with some enzymatic residue. For this reason, mechanism-based inhibitors are sometimes called “suicide inhibitors”.

Figure 1.6: (a) Mechanism of N-bromoacetylglycosylamines as affinity labels for glycosidases. (b) Mechanism of a photoreactive probe as an affinity label to covalently derivatise the glycosidase.

Two selected examples of affinity labels of glycosidases are given in Figure 1.6. The first example is an N-bromoacetylglycosylamine (Figure 1.6(a)), in which the high inherent reactivity of the N-bromoacetyl moiety leads to formation of a covalent bond with some nucleophilic residue. This class of reagents has been demonstrated to inactivate several retaining glycosidases such as Escherichia coli β-galactosidase52, Cellulomonas fimi exoglycanase (Cex)53 and Thermoanaerobacterium saccharolyticum β-xylosidase33. In some cases, the covalently attached amino acid was suspected to be the general acid/base residue, and this was subsequently confirmed by further detailed kinetic analysis of the corresponding mutants. This has made N-bromoacetylglycosylamines a useful class of affinity labels to identify potential acid/base residue candidates for retaining glycosidases. The second example is a C-galactoside derivative containing a photoreactive diazirine moiety, as shown in Figure 1.6(b). In the presence of UV irradiation, a highly reactive carbene species is generated, which subsequently reacts with nearby enzymatic nucleophilic residues. Such a reagent has been shown to achieve modest inactivation (~80% residual activity) with E. coli β-galactosidase, representing a low efficiency of covalent labeling.54,55 However, the maltotrioside version of this reagent, which also contains a photoreactive diazirine moiety at the reducing end, has been shown to irreversibly inactivate porcine pancreatic α-amylase to almost 100% when irradiated.56

Scheme 1.4: Mechanism of covalent labeling of glycosidases by ortho-(difluoromethyl)arylglycoside.

In contrast, mechanism-based inhibitors can only be “turned on” by normal enzymatic catalysis. Because of this feature, they are generally more specific and reliable mechanistic probes. One class of mechanism-based inhibitors for glycosidases is that of the ortho-/para-(difluoromethyl)arylglycoside derivatives, shown in Scheme 1.4. In the first step, the aglycone is enzymatically cleaved off from the sugar, then quickly liberates HF and is transformed into a reactive fluorinated quinone methide species. This serves as a very good electrophile and can react with any nearby enzymatic nucleophile residue. These types of reagent have been demonstrated to function as mechanism-based inhibitors of many glycosidases including β-glucosidases57, β-galactosidases58 and neuraminidases59,60. However, since the aglycone lacks specificity for the enzyme, it may diffuse out of the active site after being cleaved and this can lead to labeling of non-active site residues or of other proteins in the mixture. Another well known example of a mechanism-based inhibitor of retaining glycosidases is that of conduritol B-epoxide (CBE), as shown in Scheme 1.5. The epoxide functionality in CBE can be opened up by the concerted actions of a general acid residue and a general base or nucleophile residue within the enzyme’s active site, resulting in covalent labeling of the general base. Indeed, active site residues of many glycosidases (for example GH1 Sulfolobus solfataricus β-glycosidase61, GH3 barley β-D-glucan glucohydrolase62 and GH3 Flavobacterium meningosepticum β-glucosidase63) were identified via use of CBE and the identities of the residues subsequently confirmed by kinetic analysis of the corresponding mutants. This approach is particularly valuable when a crystal structure of the enzyme is not available and when no reliable predictions can be made from the CAZy database. Very interestingly, due to the structural symmetry of CBE, it can not only inactivate retaining β-glycosidases, but can also irreversibly inhibit retaining α-glycosidases. For example, the catalytic nucleophiles of GH31 sugar beet α-glucosidase and human lysosomal α-glucosidase were identified in this way.64,65 While the ability of CBE to inactivate both α- and β-glycosidases adds to its versatility, it also suggests that the relative binding flexibility of CBE within enzyme active sites could generate misleading labeling results. Indeed, in a few cases such as human lysosomal β-glucocerebrosidase66 and almond β-glucosidase67, the residues identified using CBE that were initially thought to be the nucleophile were later shown to be incorrectly assigned when more reliable activated fluorinated sugar inactivators were employed, as discussed below.

Scheme 1.5: Mechanism of covalent labeling of glycosidases by CBE.

The design of fluorinated sugar inactivators stems from deep insights into the double displacement mechanism. These compounds have proved to be highly successful in labeling the catalytic nucleophiles of retaining glycosidases. The first class of fluorinated sugar inactivators is that of the 2-deoxy-2-fluoro glycosides (Figure 1.7(a)).68,69 The presence of a fluorine at C2 destabilizes the oxocarbenium ion-like transition states both inductively and via the removal of key transition state-stabilizing interactions. As a consequence, both the formation and the turnover of the glycosyl-enzyme intermediate are slowed down. However, the incorporation of a good leaving group, such as fluoride or 2,4-dinitrophenolate at the anomeric center, ensures that the intermediate is kinetically accessible and accumulates (Figure 1.7(b)). These 2-deoxy-2-fluoro glycosides have proved to be effective agents for trapping intermediates on β-glycosidases and allowed identification of the catalytic nucleophiles of a large number of glycosidases.47,70 In addition, the glycosyl-enzyme intermediate trapped using 2-fluoroglycosides can be studied by MS, NMR and, most notably, X-ray crystallography.40,71,72 Importantly, the covalent intermediate trapped in this way can still be slowly hydrolyzed. This has been clearly demonstrated by removing excess 2-fluoro glycoside inactivators from the inactivated enzyme and monitoring the time-dependent regain of enzyme activity, thereby confirming the catalytic competence of such trapped covalent intermediates. Sometimes, the turnover of the trapped intermediate (thus “reactivation” of the enzyme) can be accelerated by adding suitable acceptors so that transglycosylation, rather than hydrolysis, occurs.48


Figure 1.7: (a) Chemical structures of 2-deoxy-2-fluoro-β-D-glucopyranosyl fluoride and 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside. (b) Mechanism of trapping the covalent glycosyl-enzyme intermediate of a retaining β-glucosidase by 2-deoxy-2-fluoro-β-D-glucopyranosyl fluoride.

Despite this success with β-glycosidases, 2-deoxy-2-fluoro glycosides are generally not mechanism-based inhibitors of retaining α-glycosidases, but rather act as slow substrates. Two strategies have been employed to overcome this problem. The first involves incorporation of two halogens (fluorine or chlorine) at C2 to dramatically slow both steps, in conjunction with a highly reactive leaving group, such as trinitrophenolate or chloride, at C1 to speed up glycosylation, as shown in Figure 1.8(a).73,74 This class of inactivators will be discussed in greater detail in Section 2.1.3.

Figure 1.8: (a) Chemical structures of some 2-deoxy-2,2-dihaloglycosides. (b) Chemical structures of several 5-fluoroglycosyl fluorides, including 5FGlcF and 5FIdoF.

An alternative strategy involves introducing a fluorine at C-5, as shown in Figure 1.8(b).75 As with 2-fluoro glycosides, this C-5 fluorine inductively destabilizes both oxocarbenium ion-like transition states, slowing both the glycosylation and the deglycosylation steps. The installation of a fluoride leaving group at the anomeric center accelerates only the formation of the intermediate, not its breakdown. As a consequence, the covalent intermediate is trapped and accumulates. Compared with 2-fluoro glycosides, the inductive effect of the C-5 fluorine on the oxocarbenium ion-like transition state is much more pronounced, since the fluorine replaces a hydrogen atom at C-5, as opposed to replacing an electronegative C-2 hydroxyl with a more electronegative fluorine. A second important factor, however, is the removal of hydrogen-bonding interactions that stabilize the transition state. These interactions can be very important at C2, so a large part of the rate reduction seen with the 2-fluoro glycosides is due to their removal. No such effect is in play for 5-fluoro sugars. The combined effects of these two competing factors make 5-fluoro glycosides a quite different class of glycosidase inactivators. Unlike their 2-fluoro counterparts, 5-fluoro glycosyl fluorides can act as mechanism-based inhibitors of α-glycosidases and have successfully trapped the intermediates of GH13 yeast α-glucosidase76, GH31 Aspergillus niger α-glucosidase77, GH31 E. coli α-xylosidase78 and GH38 Drosophila melanogaster Golgi α-mannosidase II79. Some of the trapped intermediates have been subjected to X-ray crystallographic analysis. Despite these successes, the exact reasons why 5-fluoro glycosides, but not 2-fluoro glycosides, can inactivate retaining α-glycosidases are not entirely clear. One possible explanation is that the different inhibition profiles of 2-fluoro and 5-fluoro glycosides stem from the different charge distributions at the transition states of α-glycosidases and β-glycosidases (to be discussed in Section 1.3.7). Besides inactivating α-glycosidases, 5-fluoro glycosides can also be efficient mechanism-based inhibitors of various retaining β-glycosidases, such as GH1 Agrobacterium sp. β-glucosidase75 and several GH3 β-N-acetylglucosaminidases80,81, illustrating the versatility of the 5-fluoro approach.

Interestingly, kinetic studies of various 5-fluoro glycosides with α-glycosidases revealed that the C-5 fluorinated substrate analogues with inverted configuration at C-5 typically function as better mechanism-based inhibitors than their “natural” 5-fluoro counterparts, because they form longer-lived glycosyl-enzyme intermediates. For example, although 5FGlcF (Figure 1.8(b)) could inhibit yeast α-glucosidase by accumulating the intermediate, both the inactivation and reactivation processes were very rapid. Complete inactivation was not observed; instead, a steady-state residual enzyme activity was detected.75 5FGlcF therefore acts as an “apparent” reversible inhibitor. By contrast, its epimer 5FIdoF (Figure 1.8(b)) was shown to inactivate yeast α-glucosidase in a normal time-dependent fashion with complete labeling. The half-life of this trapped intermediate was determined to be 330 min at 37 °C, representing a stable, yet catalytically competent, intermediate.76 Possibly the inversion of configuration at C-5 disrupts some non-covalent interactions between the enzyme and the “naturally configured” inactivator, so that both the inactivation and, more importantly, the reactivation step of the epimeric inactivator are further slowed.
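The half-life quoted above can be converted into a first-order reactivation rate constant, and the contrast between 5FGlcF and 5FIdoF can be pictured with a minimal two-state model in which free enzyme is converted to the covalent intermediate with a pseudo-first-order constant and regenerated by turnover of that intermediate. The sketch below is illustrative only: the two-state scheme, the function names and the example inactivation rate constants are assumptions made here, and only the 330 min half-life comes from the text.

```python
import math


def rate_constant_from_half_life(t_half_min):
    """First-order rate constant (min^-1) for a process with the given half-life."""
    return math.log(2) / t_half_min


def steady_state_residual_activity(k_inact_obs, k_react):
    """Fraction of enzyme left free at steady state in the two-state model
    E -> E-I (k_inact_obs) and E-I -> E (k_react). Both constants in min^-1."""
    return k_react / (k_inact_obs + k_react)


# Reactivation rate constant implied by the 330 min half-life quoted in the text.
k_react_slow = rate_constant_from_half_life(330.0)
print(f"k_react = {k_react_slow:.2e} min^-1")  # 2.10e-03 min^-1

# Hypothetical inactivation constant (assumption, not a measured value).
k_inact_obs = 0.5

# Slow turnover: almost all enzyme ends up as the covalent intermediate.
print(steady_state_residual_activity(k_inact_obs, k_react_slow))

# Turnover as fast as inactivation: half of the enzyme stays active, which
# would look like "apparent" reversible inhibition.
print(steady_state_residual_activity(k_inact_obs, 0.5))
```

In this simplified picture, a residual steady-state activity appears whenever turnover of the intermediate is not negligible relative to its formation, whereas essentially complete labeling requires turnover to be much slower than formation.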

1.3.6 Structural studies of glycosidases and non-covalent interactions

Structural characterization of glycosidases has a long history, with the very first crystal structure of an enzyme being that of hen egg white lysozyme.41 Of the current 115 families of glycosidases, more than 60 families already have at least one representative structure determined.46 An important feature emerging from these structural studies is the occurrence of widely diversified structural scaffolds, despite the fact that all these enzymes catalyze essentially the same chemical reaction: hydrolysis of an acetal.

Common folds for these enzymes include (β/α)8, (α/α)6, 5-fold-β-propeller, 6-fold-β-propeller and β-jelly roll, but a large number of others exist. Structural representatives of each clan of glycosidases are shown in Figure 1.9. In spite of the wide array of folds of glycosidases, however, it has been found that their active site topologies fall into only three categories, which strongly correlate with the action pattern of the corresponding enzyme.9,12 Enzymes having pocket-shaped active sites usually recognize and cleave the non-reducing ends of saccharide structures in an “exo” mode of action. Well known examples include monosaccharidases and exopolysaccharidases, such as glucoamylases, β-amylases and sialidases. The second type contains a cleft-shaped active site. These “open” structures allow access to their polymeric substrate in a relatively random manner, thus their action mode is “endo” and the internal linkages of the polysaccharide chain are cleaved. Enzymes that degrade polymeric substrates such as α-amylases, endocellulases and chitinases belong to this category. The last type of active site topology is tunnel-like. This topology probably evolved from the “clefts” by “closing” them with long loops. The advantage of a “tunnel-like” active site probably lies in the increase of processivity since polymeric substrates thread through the tunnel and the reaction product remains bound close to the enzyme active site, ready for the next cleavage reaction. Cellobiohydrolases were the first group of enzymes that were shown to have this active site topology.82


Figure 1.9: Representative structures for each clan of glycoside hydrolases (GH). GH-A: Family 1 β-glucosidase B from P. polymyxa (PDB: 2O9P); GH-B: Family 16 endo-β-1,3-glucanase from alkaliphilic Nocardiopsis sp. F96 (PDB: 2HYK); GH-C: Family 11 xylanase A (XlnA) from B. circulans (PDB: 1BCX); GH-D: Family 31 α-xylosidase (YicI) from E. coli K12 (PDB: 1XSI); GH-E: Family 33 trans-sialidase from T. cruzi (PDB: 1MR5); GH-F: Family 43 β-1,4-xylosidase from B. halodurans C-125 (PDB: 1YRZ); GH-G: Family 37 trehalase from E. coli (PDB: 2JF4); GH-H: Family 13 α-amylase from human pancreas (PDB: 1CPU); GH-I: Family 46 chitosanase from Streptomyces sp. N174 (PDB: 1CHK); GH-J: Family 32 β-fructosidase from T. maritima (PDB: 1UYP); GH-K: Family 18 chitinase B from S. marcescens (PDB: 1E15); GH-L: Family 15 glucoamylase from S. fibuligera (PDB: 1AYX); GH-M: Family 48 cellulase CEL48F from C. cellulolyticum (PDB: 1F9D); GH-N: Family 28 exopolygalacturonase from Y. enterocolitica (PDB: 2UVE).

With substrate analogues, transition state-mimicking inhibitors and mechanism-based inhibitors available, it has been possible to determine the crystal structures of glycosidases at different points along the reaction coordinate.40,83,84 Comparison of these structures gives unprecedented insight into how glycosidases carry out catalysis. A particularly interesting observation is that the sugar in the -1 subsite changes its conformation along the reaction coordinate. By determining the conformation of the -1 subsite sugar in the Michaelis complex and in the covalent glycosyl-enzyme intermediate, it has been possible to deduce the conformation of the sugar at the transition state, which is very important for the development of high-affinity glycosidase inhibitors.40 Further details on this topic will be discussed in Section 2.3.8.4.

Another important reason to study the structures of glycosidases complexed with various ligands is to examine the non-covalent interactions between the enzyme and the bound ligand. These interactions are the major driving forces that alter the conformations of substrates along the reaction coordinate, and they are particularly important in stabilizing the transition states of the catalyzed reactions.85 They mainly include hydrogen bonds, salt bridges, van der Waals interactions and hydrophobic interactions. Glycosidases are well evolved to provide such non-covalent interactions (and, perhaps more importantly, have to provide them). They are therefore very good model systems for the study of such interactions, since the substrates are polyhydroxylated, conformationally stable molecules. Two different, yet complementary, approaches can be used to measure the strength of such non-covalent interactions. The first is to mutate the active site residues involved in the interactions one by one and to analyze the mutants kinetically with a normal polyhydroxylated substrate. Comparison of these kinetic parameters with the corresponding data for the wild-type enzyme gives the total contribution of all the interactions provided by each side chain.86-88 In the second approach, each hydroxyl on the substrate is replaced, individually, by either fluorine or hydrogen. Kinetic analysis of the cleavage of these deoxygenated or deoxyfluorinated substrates, together with the normal substrate, by the wild-type enzyme allows the contribution of each hydroxyl to the stabilization of the transition state to be measured.89-91 These two approaches have been used to study both α-glycosidases and β-glycosidases. In the β-glycosidases studied, the contribution of the 2-OH to transition state stabilization (~18 kJ/mol in the case of Abg) is much more significant than that of the 3-OH, 4-OH and 6-OH (3–8 kJ/mol in the case of Abg).15,90 However, no such phenomenon was seen in the α-glycosidases studied86,87,91, possibly reflecting the subtle mechanistic differences between these two classes of glycosidases, as discussed below.
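Energetic contributions of this kind are conventionally related to measured rate constants through ΔΔG‡ = -RT ln[(kcat/Km) of the modified system / (kcat/Km) of the parent system]. The following sketch shows the conversion in both directions. The function names, the default temperature of 37 °C and the example rate ratio are assumptions chosen for illustration and are not taken from the cited studies.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def ddg_from_rate_ratio(kcat_km_modified, kcat_km_parent, temp_k=310.15):
    """Loss of transition state stabilization (kJ/mol) implied by a drop in
    kcat/Km on going from the parent to the modified substrate or enzyme.
    The default temperature (37 degrees C) is an assumption."""
    ratio = kcat_km_modified / kcat_km_parent
    return -R * temp_k * math.log(ratio) / 1000.0


def rate_ratio_from_ddg(ddg_kj_per_mol, temp_k=310.15):
    """Inverse conversion: the kcat/Km ratio corresponding to a given ddG."""
    return math.exp(-ddg_kj_per_mol * 1000.0 / (R * temp_k))


# Hypothetical example: a 1000-fold drop in kcat/Km.
print(round(ddg_from_rate_ratio(1.0e-3, 1.0), 1))  # 17.8 kJ/mol

# Rate ratio implied by a contribution of 18 kJ/mol, the approximate 2-OH
# value quoted in the text for Abg.
print(f"{rate_ratio_from_ddg(18.0):.1e}")
```

On this scale, a contribution of about 18 kJ/mol corresponds to roughly a thousand-fold change in kcat/Km, whereas contributions of 3–8 kJ/mol correspond to changes of only about three-fold to twenty-fold.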

1.3.7 Mechanistic features of α-glycosidases and β-glycosidases

Similar double displacement mechanisms should be adopted by these two classes of retaining glycosidases, both featuring a catalytic nucleophile, general acid/base catalysis, oxocarbenium ion-like transition states and a covalent glycosyl-enzyme intermediate. However, at each point in the mechanism, the anomeric configurations of the bound species are opposite. This stereochemical difference leads to some pronounced mechanistic variations between the two. The significant transition state-stabilizing effect of the 2-OH in retaining β-glycosidases, mentioned above, has been mainly attributed to strong hydrogen bonding between the 2-OH and the carbonyl oxygen of the catalytic nucleophile, as shown in Figure 1.10 (a). This was confirmed by crystallographic studies of the covalent intermediate of C. fimi β-1,4-glycanase (Cex), obtained using its double mutant (H205N/E127A) and the non-fluorinated substrate 2,4-dinitrophenyl β-cellobioside, in which an unusually short hydrogen bond (2.4 Å) was found between the 2-OH and the carbonyl oxygen of the nucleophile residue (E233).92 Structural studies of the covalent intermediate of GH26 Pseudomonas cellulosa β-(1,4)-mannanase also supported the existence of such strong hydrogen bonding.84 It has been postulated that this interaction becomes even stronger at the transition state owing to optimization of geometry through ring flattening, as shown in Figure 1.10 (a). However, the spatial arrangement of the catalytic nucleophile in retaining α-glycosidases precludes the formation of such a hydrogen-bonding interaction. In fact, various crystal structures of the trapped intermediates of α-glycosidases have revealed that the carbonyl oxygen of the catalytic nucleophile is instead very close to the endocyclic oxygen of the -1 subsite sugar79,93, as illustrated in Figure 1.10 (b).

Figure 1.10: (a) Proposed transition state of a retaining β-glycosidase. (b) Proposed transition state of a retaining α-glycosidase.

These different interaction patterns have a significant impact on the charge distribution of the oxocarbenium ion-like transition state. For β-glycosidases, the strong hydrogen bonding at the 2-OH would transiently lower its pKa. As a consequence, more negative charge resides on the oxygen of the 2-OH, which in turn favors a greater share of the positive charge on the anomeric carbon. For α-glycosidases, the proximity of the endocyclic oxygen to the carbonyl oxygen of the nucleophile favors more positive charge on the endocyclic oxygen rather than on the anomeric carbon. Two lines of evidence support these charge-distribution differences. First, the non-covalent azasugar inhibitor deoxynojirimycin (Figure 1.4 (a)), which can mimic the positive charge at the endocyclic oxygen, was demonstrated to be a better inhibitor of α-glycosidases than of β-glycosidases. Isofagomine (Figure 1.4 (a)), on the other hand, which mimics the positive charge developed at the anomeric carbon, has been shown to be a very potent inhibitor of β-glycosidases but only a modest one of α-glycosidases.15,39

The second line of evidence came from trapping studies using various fluorinated mechanism-based inhibitors. 2-Deoxy-2-fluoro glycosides are very efficient inactivators of β-glycosidases but cannot trap the intermediates of retaining α-glycosidases.47,94 By contrast, 5-fluoro glycosides have been demonstrated to be very good mechanism-based inhibitors of α-glycosidases. Substitution of fluorine at C-2 destabilizes the positive charge generated at C-1 most effectively, while substitution at C-5 has its maximal effect on charge generated at the endocyclic oxygen. This matches well with their inhibitory behavior.

1.4 Aims of this thesis

The aims of this thesis are as follows:

(1) The design and synthesis of several novel mechanism-based inhibitors of human pancreatic α-amylase (HPA) that function via accumulation of a covalent glycosyl-enzyme intermediate, followed by kinetic analysis of these newly-synthesized inhibitors. Two approaches, using 2-deoxy-2,2-dihalo glycosides and employing 5-fluoro glycosides, will be investigated. If any of the newly synthesized inhibitors turn out to be efficient covalent inactivators, crystallographic studies of the covalent glycosyl-enzyme intermediate of HPA will be carried out collaboratively. This should provide detailed structural and mechanistic insights into the covalent intermediate of HPA, which has proven elusive so far.

(2) Detailed mechanistic studies of the GH13 trehalose synthase (TreS) from Mycobacterium smegmatis, which is responsible for the interconversion of maltose and trehalose. A continuous assay will first be developed by synthesizing and testing a series of aryl α-glucosides. Both 2-fluoro glycosides and 5-fluoro glycosides will be tested as potential TreS inactivators to trap any covalent intermediate. If this is successful, the catalytic nucleophile will be identified by LC/MS analysis of the peptide digest. The question of whether the enzymatic rearrangement of the disaccharide is an intramolecular (involving only one molecule of disaccharide) or intermolecular (involving two molecules of disaccharide) process will be addressed using two different approaches. The first approach will investigate whether isotopically labeled glucose can be incorporated into the disaccharide substrates/products in the presence of TreS. The second approach will test whether exogenous glucose can react with the glucosyl-enzyme intermediate, formed using α-glucosyl fluoride as the substrate, to form disaccharide products.

(3) Detailed kinetic studies of GH101 endo-α-N-acetylgalactosaminidase from S. pneumoniae. Sequence alignment of this enzyme with other GH101 members, along with structural similarities to GH13 α-amylases, has suggested several candidate residues for the catalytic nucleophile and acid/base catalyst. Kinetic analysis of mutant enzymes modified at those positions, using substrates bearing different leaving groups and coupled with chemical rescue experiments, will be used to confirm the identities of the catalytic residues of the enzyme.


Chapter 2: Structural Studies of the Covalent Glycosyl-Enzyme Intermediate of Human Pancreatic α-Amylase

2.1 General introduction to human pancreatic α-amylase

2.1.1 The GH13 family and α-amylases

Starch, and the closely related polymer glycogen, serve as the most important storage forms of carbohydrate for a wide variety of living organisms. Starch is a glucose polymer made of two components: amylose and amylopectin.95,96 Amylose is a largely linear polymer of glucose with exclusively α-(1,4) glycosidic linkages, while amylopectin has some branch points involving α-(1,6) linkages in addition to the linear amylose structure (Figure 2.1). Owing to the importance of starch in energy storage, a large number of enzymes exist to hydrolyze or modify this polymer. Interestingly, most of the starch-modifying enzymes belong to only a handful of glycoside hydrolase families, and the majority of them are within the GH13 family.97

Figure 2.1: Structures of amylose and amylopectin.

Currently, GH13 is the largest glycoside hydrolase family in the CAZy classification system.7 Enzymes from this family exhibit a wide range of activities, including α-amylases (EC 3.2.1.1), pullulanases (EC 3.2.1.41), cyclodextrin glucanotransferases (EC 2.4.1.19), trehalose-6-phosphate hydrolases (EC 3.2.1.93), α-glucosidases (EC 3.2.1.20), trehalose synthases (EC 5.4.99.16) and amylosucrases (EC 2.4.1.4).96 Because of their important roles in starch metabolism, GH13 enzymes are found in all kinds of organisms, including archaea, bacteria, plants and animals.98 Generally, very low sequence similarity is found between GH13 enzymes, with overall sequence homology as low as 10% when α-amylases from different organisms are compared. Nonetheless, several highly conserved regions in the amino acid sequences have been identified.95-97 The existence of these conserved regions has prompted many evolutionary studies on GH13 enzymes.97,99

Numerous crystal structures of GH13 enzymes have been solved to date, both in their apo forms and as complexes with various substrate analogues/inhibitors. Selected examples of α-amylase structures include enzymes from Pyrococcus woesei100, Bacillus halmapalus101, Bacillus licheniformis102, Aspergillus niger103, Aspergillus oryzae104, barley105, porcine pancreas106, human saliva107 and human pancreas108. From these available structures, all the GH13 enzymes appear to be multi-domain proteins featuring a (β/α)8 barrel (TIM barrel) as the catalytic domain (domain A).97,109 The adoption of a common catalytic core by all the GH13 enzymes has prompted much speculation as to whether this is the result of divergent or convergent evolution.99 Besides the catalytic domain, in most of the crystal structures solved there is a large loop inserted between the third β strand and the third α helix of the TIM barrel, and this loop is described as domain B.110,111 The structure of domain B varies depending on the specific enzyme. A third domain, domain C, is found in most α-amylases.110 This domain is only loosely associated with domain A and is composed of an anti-parallel β-barrel type structure of unknown function. Suggestions regarding the function of domain C include facilitation of starch binding and stabilization of the catalytic domain A.111

The existence of many different enzyme activities and substrate specificities in the GH13 family prompted a further classification within this family, aimed at establishing “subfamilies” of enzymes with a better correlation between primary sequence and substrate specificity.112 By combining multiple sequence alignment, clustering analysis and phylogenetic methods, the majority of the GH13 enzyme sequences were reliably divided into 35 subfamilies. Interestingly, all the subfamilies were found to be either monospecific or composed of enzymes with highly related specificities, clearly demonstrating the value of this categorization. This excellent correlation between subfamily and enzyme activity makes the classification a powerful tool for predicting the function of GH13 enzymes once they have been assigned to the corresponding subfamilies.

Despite the diverse substrate specificities of GH13 enzymes, they all act exclusively on α-glycosidic linkages, making the GH13 family one of the most important and well-studied α-glycosidase families.111 Stereochemical outcome studies have clearly demonstrated that enzymes in this family catalyze their reactions with retention of anomeric configuration.113 It is therefore widely believed that GH13 enzymes utilize the classic double displacement mechanism involving formation of a covalent glycosyl-enzyme intermediate.111,114 The existence of this covalent intermediate has been confirmed by many studies. Initial low-temperature NMR studies of the reaction between porcine pancreatic α-amylase (PPA) and [1-13C]maltotetraose provided evidence for the presence of such an intermediate.115 5-Fluoro glycosyl fluorides have been used to trap the catalytically competent covalent intermediate of yeast α-glucosidase, and the catalytic nucleophile residue has been identified by LC-MS/MS analysis of proteolytic digests.76 Further proof comes from several studies in which the covalent intermediate has been directly visualized via crystallography, which will be discussed in detail in Section 2.1.3.

The active sites of most GH13 enzymes are believed to contain a number of subsites.110,111 Each subsite binds one glucose moiety of the substrate through many non-covalent interactions provided by the side chains of the amino acid residues making up that subsite. The number of subsites and their locations relative to the catalytic residues provide important mechanistic information on these enzymes. Early determinations of the number of subsites relied on kinetic measurements and product analysis using several well-defined oligosaccharide substrates, with the earliest example being studies on porcine pancreatic α-amylase.116 Such subsite information can be obtained more directly from crystallographic studies of GH13 enzymes in complex with various substrate analogues or inhibitors.94,106

2.1.2 Previous mechanistic studies of HPA

Our group and Prof. Gary Brayer’s group (Department of Biochemistry, UBC) have had a long-standing interest in studying one particular GH13 enzyme, human pancreatic α-amylase (HPA), not only because this enzyme serves as an excellent model for understanding the general catalytic mechanisms of α-amylases and GH13 enzymes, but also because inhibition of HPA has important medical implications. HPA is a key endoglycosidase involved in the digestion of dietary starch in the gut, generating a mixture of oligosaccharides, including maltose and a variety of α-(1,4) and α-(1,6) branched oligoglucans, that are then further hydrolyzed to glucose by other glucosidases. Consequently, the activity of HPA in the small intestine has been shown to correlate directly with post-prandial sugar levels in the bloodstream.117-119 Control of the activity of HPA thus provides a valuable therapeutic approach for diseases such as diabetes and obesity. Indeed, several α-glucosidase inhibitors (Acarbose, Miglitol and Voglibose) have been used clinically for the treatment of Type II diabetes. However, their non-specific binding to a wide range of glucosidases may be the cause of undesirable side effects that limit their effectiveness. A better understanding of the structure and mechanism of action of HPA would therefore assist in the design of more specific inhibitors for therapeutic use.

Figure 2.2: Crystal structure of wild type HPA (photo courtesy of Prof. Gary Brayer).

The first crystal structure of HPA was published in 1995,108 and, not surprisingly, three domains were found within this enzyme, as shown in Figure 2.2. The largest domain, domain A, has a (β/α)8 barrel structure in which the active site is located. An essential chloride ion was also found within domain A, in close proximity to the active site. Site-directed mutagenesis, together with kinetic analysis of the resultant HPA mutants, suggested a role for the chloride ion in maintaining the proper pKa of the catalytic residue E233 in the active site.120-122 Domain B protrudes from the side wall of domain A and forms a calcium binding site. Domain C has an anti-parallel β-barrel type structure of unknown function, though a role in starch binding seems likely. Subsite mapping studies have clearly demonstrated the presence of at least five high-affinity glucose-binding subsites within the active site, with three on the non-reducing side of the scissile bond and two on the reducing side (Figure 2.3).113 Indeed, when wild-type HPA crystals were soaked with acarbose, a naturally occurring pseudo-tetrasaccharide, a rearranged pseudo-pentasaccharide was found within the active site. Presumably, this rearranged product can occupy all of the high-affinity glucose-binding sites for maximum affinity with the enzyme.94 (Further discussion of acarbose can be found in Section 2.1.4.)

Figure 2.3: Binding subsites of HPA. The subsites are labeled -3, -2, -1, +1 and +2, with bond cleavage occurring between the -1 and +1 subsites.
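The subsite numbering can be made concrete by enumerating the registers in which a linear malto-oligosaccharide could place a glycosidic bond between subsites -1 and +1. The sketch below is pure bookkeeping under simplifying assumptions: it uses only the five subsites described in the text, treats the substrate as an unbranched α-(1,4) chain, and says nothing about which register the enzyme actually prefers. The function names are hypothetical.

```python
# Subsite layout taken from the text: three subsites on the non-reducing side
# of the scissile bond (-3, -2, -1) and two on the reducing side (+1, +2).
MINUS_SUBSITES = 3
PLUS_SUBSITES = 2


def binding_modes(n_glucose):
    """Enumerate registers of a linear chain of n_glucose units that place a
    glycosidic bond between subsites -1 and +1.

    Returns tuples (glycone_units, aglycone_units, occupied_subsites), where
    glycone_units is the size of the fragment carrying the non-reducing end.
    """
    modes = []
    for glycone_units in range(1, n_glucose):
        aglycone_units = n_glucose - glycone_units
        occupied = (min(glycone_units, MINUS_SUBSITES)
                    + min(aglycone_units, PLUS_SUBSITES))
        modes.append((glycone_units, aglycone_units, occupied))
    return modes


def fully_occupied_modes(n_glucose):
    """Registers that fill all five subsites, as (glycone, aglycone) sizes."""
    total = MINUS_SUBSITES + PLUS_SUBSITES
    return [(g, a) for g, a, occ in binding_modes(n_glucose) if occ == total]


if __name__ == "__main__":
    # Maltoheptaose (seven glucose units) as an example substrate.
    for g, a, occ in binding_modes(7):
        print(f"products G{g} + G{a}: {occ} of 5 subsites occupied")
    print(fully_occupied_modes(7))  # [(3, 4), (4, 3), (5, 2)]
```

Under these assumptions a pentasaccharide is the shortest chain that can fill all five subsites, and it can do so in only one register.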

NMR stereochemical outcome studies using both an artificial substrate, α-maltosyl fluoride, and a natural substrate, maltoheptaose, have clearly shown that HPA is a retaining α-glycosidase, as are the other GH13 enzymes.123 It is therefore believed that HPA follows a double displacement mechanism (Figure 2.4).15 Briefly, the first part of the reaction sequence involves an acid-catalyzed nucleophilic displacement of the aglycone by a carboxylate residue at the active site (glycosylation step), leading to the formation of a covalent glycosyl-enzyme intermediate. In the second part of the reaction, this covalent intermediate is hydrolyzed, with general base catalysis provided by the same group that earlier served as the acid catalyst (deglycosylation step). Overall, the reaction pathway proceeds through two oxocarbenium ion-like transition states.
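The two-step pathway described above is often summarized by a minimal kinetic scheme. The block below is a sketch under stated assumptions rather than a result from this work: it assumes rapid-equilibrium substrate binding with dissociation constant $K_s$, and the symbols $k_2$ (glycosylation), $k_3$ (deglycosylation) and $K_s$ are introduced here for illustration only.

```latex
% Minimal double displacement scheme (requires amsmath).
% K_s, k_2 and k_3 are illustrative symbols, not taken from the thesis.
\begin{align*}
\mathrm{E} + \mathrm{S}
  \;\overset{K_s}{\rightleftharpoons}\; \mathrm{E{\cdot}S}
  \;\xrightarrow{\;k_2\;}\; \mathrm{E\text{-}Glc} + \mathrm{ROH}
  \;\xrightarrow[\mathrm{H_2O}]{\;k_3\;}\; \mathrm{E} + \mathrm{P} \\[4pt]
k_{cat} = \frac{k_2 k_3}{k_2 + k_3}, \qquad
K_m = K_s \, \frac{k_3}{k_2 + k_3}, \qquad
\frac{k_{cat}}{K_m} = \frac{k_2}{K_s}
\end{align*}
```

In this picture, a reagent for which $k_3$ is much smaller than $k_2$ causes the covalent glycosyl-enzyme intermediate to accumulate, which is the situation exploited by the trapping reagents discussed in Section 1.3.5.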

Sequence comparisons among various GH13 enzymes have identified three carboxylic acid residues that are highly conserved throughout the whole family.111 Mutation of these residues significantly decreases the activity of all the GH13 enzymes studied, confirming their important roles in catalysis. In the case of HPA, structural analyses revealed that these three residues (D197, E233 and D300) are located at the bottom of a “V”-shaped active site cleft in domain A. The role of D197 as the catalytic nucleophile was quite well established, as deletion of this residue decreases enzymatic activity by at least 10^6-fold.29 The identity of the general acid/base residue was somewhat less clear, since both E233 and D300 were suitably poised for this role and replacement of either residue with the equivalent amide or alanine led to comparable decreases in enzymatic activity (~10^3-fold). However, pH profiles and the significantly greater second-order rate constants observed for E233 variants with “activated” substrates relative to the natural substrate starch are strongly suggestive of E233 filling the role of acid/base catalyst for HPA.29 In fact, the assignment of E233 as the general acid/base residue is consistent with studies on other GH13 enzymes. Crystal structures of porcine pancreatic α-amylase (PPA)106, cyclodextrin glucanotransferase (CGTase)124 and barley α-amylase125 complexed with acarbose all support the role of E233 (HPA numbering) as the proton donor in the catalytic mechanism.

Figure 2.4: The double displacement mechanism of HPA

2.1.3 Trapping the covalent glycosyl-enzyme intermediate of GH13 enzymes

Some of the most important mechanistic insights into glycosidases have come from studies of the covalent glycosyl-enzyme intermediate, especially via X-ray crystallography. In the case of β-glycosidases, activated 2-deoxy-2-fluoro glycosides can function as mechanism-based inhibitors by forming a covalent enzyme-inhibitor intermediate that can be crystallized. Examples of such crystal structures include the GH10 C. fimi xylanase/glucanase (Cex)92,126, GH5 Bacillus agaradhaerens endoglucanase (Cel5A)83, GH11 Bacillus circulans 1,4-β-xylanase (Bcx)127, GH26 P. cellulosa mannanase 26A (Man26A)128 and, most notably, GH22 hen egg white lysozyme (HEWL)72. The success of 2-deoxy-2-fluoro glycosides lies in their ability to make the deglycosylation step rate-limiting for β-glycosidases, as explained in Section 1.3.5.

Figure 2.5: Several potential mechanism-based inactivators for GH13 enzymes

For retaining α-glycosidases, activated 5-fluoro glycosides (Figure 2.5) have been used to trap the covalent intermediates, and these have allowed structural elucidation of the covalent intermediates of the exo-glycosidases GH38 Golgi α-mannosidase II from D. melanogaster and GH31 α-xylosidase from E. coli.78,79 However, application of this strategy to oligosaccharide-processing, endo-acting GH13 enzymes such as HPA would require the synthesis of a malto-oligosaccharyl version of the 5-fluoro glycosyl fluorides. Unfortunately, the chemical synthesis of such elongated 5-fluoro glycosyl fluorides has proven to be extremely challenging: only one report of a 5-fluoro oligosaccharide has appeared in the literature to date, and it describes disaccharides containing non-activated anomeric leaving groups.129

A possible solution for trapping the covalent intermediate of GH13 enzymes would involve introducing a second electron-withdrawing halogen, such as fluorine or chlorine, at the C-2 position of 2-deoxy-2-fluoro glycosides (Figure 2.5). This will further destabilize the oxocarbenium ion-like transition states and dramatically slow down both the glycosylation and the deglycosylation steps. At the same time, a highly reactive leaving group, such as chloride or 2,4,6-trinitrophenolate, could be installed at the anomeric center to render the covalent intermediate kinetically accessible. This strategy has been successfully applied to several cases including yeast α-glucosidase and HPA in the solution phase.73 In addition, the catalytic nucleophile of the GH27 α-galactosidase from Phanerochaete chrysosporium was also identified by using 2’,4’,6’-trinitrophenyl 2-deoxy-2,2-difluoro-α-D-lyxo-hexopyranoside as the inactivator,74 the use of which very recently led to the structural elucidation of the covalent glycosyl-enzyme intermediate of GH27 human α-galactosidase.130 While 2’,4’,6’-trinitrophenyl 2-deoxy-2,2-difluoro-α-maltoside trapped the covalent intermediate of HPA in the solution phase, crystallographers have not had any success in obtaining the crystal structure of the trapped intermediate. They observed that the protein crystal gradually turned yellow after being soaked with this compound and eventually collapsed. A possible explanation would be that an unwanted nucleophilic aromatic substitution reaction occurs between a nucleophilic residue within HPA’s active site and the trinitrophenol moiety in the aglycone, as shown in Figure 2.6.


Figure 2.6: Hypothesized unwanted nucleophilic aromatic substitution between HPA and trinitrophenyl 2-deoxy-2,2-difluoro α-maltoside in the HPA crystal

Some GH13 enzymes preferentially perform transglycosylation (transglycosidase) rather than hydrolysis (glycosidase). Selected examples of GH13 transglycosidases include cyclodextrin glucanotransferase (CGTase), glycogen debranching enzyme and amylosucrase. A useful strategy for trapping the covalent intermediate of these transglycosidases was developed in which the rate of the deglycosylation step was greatly reduced by mutating the general base residue and providing an incompetent “blocked” acceptor for the transglycosylation. An activated substrate, usually a glycosyl fluoride, was used to facilitate the formation of the intermediate in the absence of the general acid/base catalyst. For example, in the case of B. circulans CGTase, an artificial substrate, 4’’-deoxy-α-maltotriosyl fluoride was designed and synthesized since the deletion of 4’’-OH would stop transglycosylation of the donor onto itself.131 Meanwhile the acid/base mutant of CGTase (Glu257Gln) was also employed to further slow down the hydrolysis of the covalent intermediate, as illustrated in Figure 2.7.132 This successfully accumulated enough intermediate for subsequent MS analysis and crystallographic studies.93

Figure 2.7: Rationale for accumulation of the covalent intermediate of the acid/base mutant of CGTase using 4’’-deoxy-α-maltotriosyl fluoride

Unfortunately, this strategy is not applicable to hydrolytic GH13 glycosidases such as HPA since there is no way to exclude the acceptor, water, from the active site. Simply deleting the acid/base residue does not prevent the turnover of the intermediate as the enzyme still retains enough hydrolytic activity to prohibit intermediate characterization. Indeed, for the HPA acid/base mutant E233A, α-maltotriosyl fluoride (αG3F) turned out to be a reasonably good substrate with kcat = 0.82 s⁻¹, Km = 1.1 mM, confirming the insufficiency of this strategy for hydrolytic enzymes.29
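The E233A numbers above can be turned into a rough bound on how long the covalent intermediate survives. The minimal Python sketch below assumes a simple two-step double displacement scheme, in which kcat = k2·k3/(k2 + k3) and therefore kcat can never exceed the deglycosylation rate constant k3. Under that assumption the half-life of the intermediate is at most ln 2/kcat. The function names are illustrative only and do not come from the original work.

```python
import math


def max_intermediate_half_life(kcat_per_s: float) -> float:
    """Upper bound (in seconds) on the half-life of the covalent intermediate.

    Assumption: two-step scheme with kcat = k2*k3/(k2 + k3) <= k3, so the
    intermediate half-life ln(2)/k3 cannot exceed ln(2)/kcat.
    """
    return math.log(2) / kcat_per_s


def specificity_constant(kcat_per_s: float, km_mM: float) -> float:
    """Second-order rate constant kcat/Km in s^-1 mM^-1."""
    return kcat_per_s / km_mM


# Values quoted in the text for HPA E233A with alpha-maltotriosyl fluoride.
KCAT_E233A = 0.82  # s^-1
KM_E233A = 1.1     # mM

if __name__ == "__main__":
    half_life = max_intermediate_half_life(KCAT_E233A)
    print(f"intermediate half-life <= {half_life:.2f} s")
    print(f"kcat/Km = {specificity_constant(KCAT_E233A, KM_E233A):.2f} s^-1 mM^-1")
```

On these numbers the intermediate would persist for less than a second, which is consistent with the statement that the mutant turns over too quickly for the intermediate to be characterized.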

In a very recent report, the covalent intermediate of another GH13 transglycosidase, glycogen-debranching enzyme (TreX) from Sulfolobus solfataricus was surprisingly trapped in the enzyme crystal by using acarbose.133 While acarbose is a well-known transition-state analogue for HPA, it was unclear why this would trap the covalent intermediate of TreX. The authors proposed that this could possibly be the result of crystallization at pH 8.0, a pH value at which acarbose is rarely cleaved by the enzyme. In another related report, the covalent intermediate of a GH77 amylomaltase was also trapped with acarbose, though no convincing rationale was given by those authors either,134 raising concerns about the generality of using acarbose as the trapping reagent for other GH13 enzymes. These two cases could possibly be coincidental rather than the result of rational design.

2.1.4 Non-covalent inhibitors of HPA

Tight-binding inhibitors of HPA offer good potential for being developed into novel therapeutics for diabetes and obesity. Not surprisingly, many research groups have put significant effort over the past several decades into searching for HPA inhibitors of both high affinity and high specificity.135-137 Generally all the inhibitors fall into two broad categories: proteinaceous inhibitors and carbohydrate analogues.

Naturally occurring proteinaceous inhibitors of α-amylases have been isolated from both bacteria and higher plants.138 It is believed that one of the natural functions of these inhibitors is to protect the host organisms from pathogens or pests by “shutting down” their α-amylases. Comparison of the primary sequences and three-dimensional structures can further be used to classify these proteinaceous inhibitors into seven families, with one family from Streptomyces species and the other six from higher plants.137 So far the best-known member of these seven families is tendamistat from Streptomyces tendae, not only because of its formidable potency against mammalian α-amylases (Ki = 9 × 10⁻¹² M for PPA),139 but also because it was the first protein whose three-dimensional structure was solved by both NMR spectroscopy and X-ray crystallography.140,141 It is a very small protein with 74 amino acid residues. The crystal structure of the complex of PPA and tendamistat revealed its mechanism of inhibition.142 Tendamistat binds to an extended cleft of PPA to exclude substrate access to the enzyme’s active site. Four short segments of tendamistat in particular are involved in direct interactions with PPA. Besides various salt bridges, hydrogen bonding and hydrophobic interactions between PPA and these four short regions, one particular amino acid, Arg19 in tendamistat, directly binds to one of the catalytic residues, E233, via strong electrostatic interactions. In fact, this Arg19 is one of the amino acids in the tripeptide sequence Trp18–Arg19–Tyr20, which is highly conserved in this class of proteinaceous inhibitors. This triad is located at the tip of a β-turn structure, which is thought to be a very important structural feature for achieving α-amylase inhibition. Many studies have been carried out to synthesize short peptide mimics with this β-turn structure in order to find tight-binding but simpler α-amylase inhibitors. However, these mimics are found to bind considerably more weakly than the intact proteinaceous inhibitors, with Ki values usually in the micromolar range.137,143

Extensive screening efforts have led to the discovery of several carbohydrate-based α-amylase inhibitors.135 Most of these inhibitors belong to the trestatin family, which contains acarviosine (a valienamine unit linked to 4-amino-4,6-dideoxy-α-D-glucose) and a varying number of glucose residues.144 The most notable member of this family is acarbose, a clinically used drug for treating type II diabetes that was originally isolated from Streptomyces culture supernatant.117 As shown in Figure 2.8 (top structure), it is a pseudo-tetrasaccharide with the acarviosine attached to a maltose moiety at the non-reducing end. Acarbose is a very potent inhibitor of HPA with Ki = 0.02 μM.145 It has been widely used in crystallographic studies with various α-amylases such as PPA106, barley α-amylase125, Aspergillus oryzae α-amylase146 and HPA94 to identify their active sites and key non-covalent interactions therein. Several structural features of acarbose make it such a good α-amylase inhibitor. Firstly, the half chair (²H₃) conformation of the valienamine is a very good mimic of the planar oxocarbenium ion-like transition state of α-amylase catalysis. Therefore this moiety should fit in the -1 subsite of HPA and other α-amylases very well, where sugar ring distortion is thought to occur at the transition state. Secondly, the nitrogen between valienamine and 4-amino-4,6-dideoxy-α-D-glucose is protonated at physiological pH and this positive charge resembles the protonated glycosidic oxygen at the transition state.


Figure 2.8: Rearrangement of acarbose in HPA crystal

Consistent with its transition-state character, the valienamine moiety is indeed found in the -1 subsite in crystal structures of acarbose complexed with various α-amylases. In most cases, a rearranged acarbose, rather than the original pseudo-tetrasaccharide, was found within the enzyme’s active site. In the case of HPA, a pseudo-pentasaccharide occupies all five glucose-binding subsites (Figure 2.8).94 Presumably this rearranged product was formed through a series of transglycosylation reactions catalyzed by HPA and a plausible mechanism has been proposed.147

Figure 2.9: Selected examples of GH13 α-glucosidase inhibitors (valiolamine, voglibose, deoxynojirimycin, miglitol, salacinol and GHIL)

One of the important reasons why only a limited number of HPA inhibitors exist lies in the enzyme’s extended active site. By contrast, many good inhibitors of monosaccharide-processing GH13 α-glucosidases are known.39,135 Most of these inhibitors are monosaccharide-based molecules containing nitrogen to mimic the positive charge developed at the transition state. Selected examples include valiolamine, voglibose, 1-deoxynojirimycin (DNJ), miglitol, salacinol148 and D-gluconohydroximino-1,5-lactam (GHIL) (Figure 2.9). Since the chemistry occurring in the -1 subsite of α-glucosidases and HPA is essentially the same, a significant degree of structural similarity would be expected in these regions of the enzymes, in order to stabilize similar transition-state structures. If a facile method to elongate those monosaccharide-based inhibitors could be developed, the elongated products would be expected to be very good HPA inhibitors by interacting with both the -1 subsite and the rest of the extended substrate-binding site. Indeed our group made some exciting discoveries recently using our “inhibitor extension assay” coupled with an “in crystal” version of the same process for definitive structural elucidation.149 This approach has its basis in the observation that the inhibitor acarbose is rearranged “in crystal” to generate a longer, tighter-binding version. We reasoned that it is possible to use HPA’s catalytic machinery for in situ elongation of a monosaccharide inhibitor by adding an activated glycosyl fluoride substrate at the same time as the inhibitor. Indeed we discovered that, after pre-incubation with α-maltotriosyl fluoride (G3F) and HPA, the monosaccharide inhibitor GHIL (Ki = 18 mM) was converted into a pseudo-trisaccharide with almost a thousand-fold higher affinity (Ki = 25 μM, Figure 2.10). What made it more interesting is that this in situ elongation reaction could be directly performed in HPA crystals for structural elucidation.
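The affinity gain from in situ elongation can be put in numbers directly from the two Ki values quoted above. The short Python sketch below computes the fold improvement and the corresponding gain in binding free energy, ΔΔG = RT ln(Ki,before/Ki,after). The temperature of 298.15 K is an assumption made for illustration, since the assay temperature is not stated at this point, and the function names are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def fold_improvement(ki_before_M: float, ki_after_M: float) -> float:
    """Ratio of inhibition constants (how many times tighter the binding is)."""
    return ki_before_M / ki_after_M


def binding_energy_gain_kJ(ki_before_M: float, ki_after_M: float,
                           temperature_K: float = 298.15) -> float:
    """Gain in binding free energy, RT*ln(Ki_before/Ki_after), in kJ/mol.

    The default temperature is an assumption, not a value from the text.
    """
    return R * temperature_K * math.log(ki_before_M / ki_after_M) / 1000.0


# Ki values quoted in the text.
KI_GHIL = 18e-3      # M (18 mM)
KI_G2_GHIL = 25e-6   # M (25 uM)

if __name__ == "__main__":
    print(f"fold improvement: {fold_improvement(KI_GHIL, KI_G2_GHIL):.0f}")
    print(f"binding energy gain: {binding_energy_gain_kJ(KI_GHIL, KI_G2_GHIL):.1f} kJ/mol")
```

With these inputs the ratio is about 720, the same order of magnitude as the "almost a thousand-fold" description, and corresponds to roughly 16 kJ/mol of additional binding energy at the assumed temperature.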

Figure 2.10: Mechanism of in situ elongation of GHIL (Ki = 18 mM) to G2-GHIL (Ki = 25 μM) by wild type HPA

Encouraged by this new development, Dr. Chris Tarling in our group carried out high-throughput screening of several large libraries of crude extracts of various organisms and pure compounds by using this double screening methodology (with and without the addition of glycosyl fluoride for in situ elongation).136 The crude extract library was provided by the National Cancer Institute, USA and the pure compound library came from the Canadian Chemical Biology Network (CCBN). In total around thirty “hits” were identified and the most potent one was subjected to bioassay-guided purification followed by structural elucidation via MS and NMR spectroscopy (these experiments were carried out by Dr. Kate Wood from Prof. Raymond Andersen’s group, UBC). Interestingly, this most potent hit turned out to be a glycosylated acyl flavonol named Montbretin A. It was originally isolated from an extract of Crocosmia crocosmiiflora, a garden hybrid of C. aurea and C. pottsii (Figure 2.11). It is a very potent, competitive inhibitor of HPA with Ki = 8 nM. What makes it more appealing is that Montbretin A showed a high level of selectivity towards HPA when tested against a series of glycosidases including yeast α-glucosidase. Further studies of its in vivo efficacy are currently underway at the Center for Drug Research and Development (CDRD) at UBC.

Figure 2.11: Chemical structure of Montbretin A

2.2 Specific aims of this study

(1) Develop a new methodology to trap the covalent glycosyl-enzyme intermediate formed on GH13 hydrolytic α-amylases

While various mechanism-based inhibitors have been developed for GH13 transglycosidases and monosaccharide-processing glycoside hydrolases, no effective method that is suitable for crystallographic studies has been available to trap the covalent intermediate of hydrolytic α-amylases. A series of activated 2-deoxy-2,2-dihalo-maltosides will be first chemically synthesized and kinetically evaluated as potential mechanism-based inhibitors of HPA. In addition, the “in situ elongation” methodology recently developed for discovering better non-covalent inhibitors will also be employed as a strategy to trap the covalent glycosyl-enzyme intermediate of HPA. This will be achieved by reacting the enzyme simultaneously with both an activated glycosyl donor and a suitable acceptor, such as the newly-synthesized 2-deoxy-2,2-dihalo maltosides or 5-fluoro glycosyl fluorides. The goal is for at least one of these options to cause time-dependent inactivation of HPA.

(2) Structural analysis of the trapped covalent glycosyl-enzyme intermediate of HPA

The newly-developed mechanism-based inhibitor(s) will be employed to trap the covalent intermediate of HPA in the crystal, which will be subjected to crystallographic studies through our collaboration with Prof. Gary Brayer. By examining the crystal structure of the trapped intermediate of HPA, we hope to reveal more mechanistic details about HPA catalysis. Particular attention will be given to elucidation of the exact roles of side-chain residues within HPA’s active site and the conformation of the sugar ring which is covalently attached to the enzyme.

2.3 Structural analysis of the covalent glycosyl-enzyme intermediate of HPA

2.3.1 Expression, purification and crystallization of HPA

The first step in structurally characterizing the covalent glycosyl-enzyme intermediate was to ensure an adequate supply of the wild-type HPA enzyme. A former group member, Dr. Edwin Rydberg, developed an efficient method for the expression and purification of this mammalian protein.150 Very briefly, Pichia pastoris transformed with the wild-type HPA gene was grown and induced in the proper medium. The recombinant HPA protein was purified by loading the supernatant of the cell culture onto a phenyl Sepharose column. The fractions containing pure HPA protein were combined and concentrated. Previous studies have shown that this recombinant enzyme could not be crystallized due to N-glycosylation at Asn461. Therefore the obtained HPA protein was subjected to deglycosylation by an endoglycosidase F-cellulose binding domain fusion protein. The deglycosylated HPA was then loaded onto a small anion exchange Q-Sepharose column to remove small quantities of colored contaminant, generating pure wild-type amylase for kinetic studies and crystallization experiments. His protocol was adopted in my studies and will be described in detail in the experimental section 5.3.1.

All crystallographic experiments were performed by Dr. Chunmin Li (Prof. Gary Brayer’s group). An example of an HPA protein crystal is shown in Figure 2.12.

Figure 2.12: Example of an HPA crystal (photo courtesy of Dr. Chunmin Li)

2.3.2 Synthesis of 2-deoxy-2,2-dihalo maltosides

2.3.2.1 Previous synthesis of 2-deoxy-2,2-difluoro maltosides

Two maltosyl chlorides bearing two halogens at C-2, either two fluorines or one fluorine and one chlorine, were designed, as shown in Figure 2.13 (compounds 2.3 and 2.4). The halogens at C-2 are expected to strongly destabilize the oxocarbenium ion-like transition state and slow down both the glycosylation and deglycosylation steps. Chloride was selected as the highly activated anomeric leaving group both to speed up the glycosylation step and to avoid the problem of nucleophilic aromatic substitution that occurred when trinitrophenol was used as the leaving group (Figure 2.6).

Figure 2.13: Mechanism-based inhibitors of yeast α-glucosidase (2.1 and 2.2) and proposed mechanism-based inhibitors of HPA (2.3 and 2.4)

There are three general strategies by which sugars containing geminal fluorine substituents have been synthesized (Figure 2.14), though in most of the published cases the products have not additionally included a highly reactive leaving group. The most common approach is to selectively oxidize the hydroxyl group at the site in question and then react the resulting ketone with DAST to introduce the gem-difluoro functionality.151-154 However, the success of this reaction is highly dependent on the structure of the substrate, particularly with respect to the identities and stereochemistry of neighboring groups.155,156 In some cases yields of over 80% were obtained while in other cases, moderate yields or even no detectable product were observed due to competing side reactions such as migrations and eliminations. A second approach involves a Reformatsky reaction in which acyclic gem-difluoro precursors such as ethyl bromodifluoroacetate or bromodifluoromethyl phenyl acetylene are coupled with a correctly configured aldehyde, followed by product cyclization to complete the synthesis of the gem-difluoro substituted sugar ring.157,158 While this approach was successful for the assembly of 2-deoxy-2,2-difluororibose, it requires maintenance of exquisite control over the stereochemistry of the ring hydroxyls and would be especially challenging for the synthesis of disaccharides. The third approach, which was the one we decided to pursue, involves the initial synthesis of a 2-fluoroglycal, which is then reacted with an electrophilic fluorine source such as Selectfluor™, F2, acetyl hypofluorite or trifluoromethyl hypofluorite (CF3OF).159,160 The gem-difluoro sugars can be obtained in respectable yields in this manner.

Figure 2.14: General approaches to synthesize 2-deoxy-2,2-difluoro glycosides (Approach 1: oxidation, then DAST; Approach 2: Zn-mediated coupling, then deprotection and cyclization; Approach 3: fluorination)

An additional advantage of the third route (Figure 2.14) lies in its compatibility with the synthesis of 2-deoxy-2-chloro-2-fluoro sugars such as compound (2.1) or (2.3) (Figure 2.13), since the same fluoroglycal intermediate can be reacted with electrophilic chlorine sources. Indeed, by using Cl2 as the chlorinating agent it should be possible to introduce chlorines at both the C-2 and C-1 positions, with the latter potentially serving as a suitably reactive leaving group. We have found that sugars bearing anomeric chlorides have generally proved to be too labile to be deprotected, even in the presence of a fluorine substituent at C-2, vividly illustrating the relatively high reactivity achievable with such an aglycone. However, it seems probable that, when accompanied by dihalo substitution at C-2, such deprotected sugars may prove sufficiently stable to survive aqueous conditions, but still reactive enough to undergo the first step of a retaining glycosidase mechanism and thereby trap the intermediate. An additional potential advantage is that, by introducing halides of different sizes at C-2 it may be possible to generate inhibitors that are selective for α-glucosidases over α-mannosidases, by incorporating an equatorial chlorine (mimicking a hydroxyl due to its similar size) and an axial fluorine (mimicking a hydrogen atom), or vice versa to produce selective α-mannosidase inhibitors.

2.3.2.2 Synthesis of 2-deoxy-2,2-dihalo maltosides

Adoption of the third approach to synthesize 2.3 and 2.4 (Figure 2.14) requires an efficient preparation of the protected 2-fluoro maltal (2.5). This was accomplished as shown in Scheme 2.1.

Scheme 2.1: Synthesis of protected 2-fluoro maltal (2.5) and 3,6-di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl bromide (2.6). i) HBr/AcOH (33%), glacial acetic acid; ii) Zn, H2O-AcOH (1:1), 44% yield over two steps; iii) Selectfluor™, AcOH, CH3NO2; iv) HBr/AcOH (33%), CH2Cl2; v) triethylamine, CH3CN, 25% yield over three steps.

Our synthesis started from commercially available maltose. Acetylation of all the hydroxyl groups was simply achieved using acetic anhydride in pyridine. Bromination of the anomeric center followed by Zn elimination generated the per-O-acetylated maltal 2.8. Installation of the first fluorine at the C-2 position requires a suitable electrophilic fluorination source and earlier examples of such reagents include fluorine gas (F2), XeF2, acetyl hypofluorite and trifluoromethyl hypofluorite.159,160 However, all these substances are very reactive and difficult to handle. In addition, the choice of further functionalization of the anomeric substituent will be very limited once the above mentioned fluorination methods are employed. Fortunately, several milder, yet effective electrophilic fluorination reagents have been developed in the last two decades, one good example being Selectfluor™.161,162 It has been demonstrated that in the presence of a suitable nucleophile, Selectfluor™ can be reacted with glycals to give regioselective fluorination at C-2 while simultaneously introducing the nucleophile at the C-1 position in a one-pot process (Figure 2.15).163,164 When this reagent was reacted with per-O-acetylated maltal 2.8 in the presence of AcOH as the nucleophile, a satisfying 70% yield of fluorinated product 2.9 was obtained. 19F-NMR indicated that this “purified” product 2.9 in fact contained all four possible compounds with the C-2 fluorine and anomeric acetate in either of the axial or equatorial configurations. Since separation of these four compounds was extremely difficult at this stage, the mixture was directly reacted with HBr/AcOH to generate the glycosyl bromide. In order to prevent cleavage of the inter-glycosidic bond by the strong acid, a lower temperature (4 °C) and dilute HBr/AcOH were employed. NMR analysis revealed that the four compounds were converted into two brominated products 2.10. The α-configured anomeric bromide was clearly demonstrated by the small coupling constant between H-1 and H-2 for both the “gluco”-configured product (J1,2 = 4.4 Hz) and the “manno”-configured product (J1,2 = 1.5 Hz). Elimination of the anomeric bromide was simply accomplished by using the weak base triethylamine to afford the protected 2-fluoro maltal (2.5). While all the “gluco”-configured bromide was consumed in this reaction, the “manno”-configured bromide 2.6 remained intact (due to its undesired stereochemistry at C-1 and C-2) and could be easily separated from product 2.5 via flash chromatography. The overall yield of the protected 2-fluoro maltal (2.5) from the protected maltal (2.8) was 25% over three steps.

Figure 2.15: Fluorination of glycals by Selectfluor™.

Scheme 2.2: Synthesis of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12). i) HF/pyridine, 59% yield; ii) NH3/MeOH, 51% yield.

The major side product in Scheme 2.1, compound 2.6, is a very useful precursor for the synthesis of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12). Fluorination of the anomeric center by HF/pyridine (59% yield) followed by deprotection under weakly basic conditions (NH3/MeOH, 51% yield) afforded compound 2.12, as shown in Scheme 2.2.

Scheme 2.3: Synthesis of 2-chloro-2-deoxy-2-fluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-glucopyranosyl chloride (2.3). i) Cl2, CCl4, 49%; ii) NaOMe, MeOH, 82%.

Per-O-acetylated 2-fluoro-maltal (2.5) was dissolved in cooled carbon tetrachloride (-23 °C) and subjected to chlorination with chlorine gas.165-167 Early studies on the addition of chlorine to the double bond of tri-O-acetyl-D-glucal had shown that the stereochemical outcome for this addition depends on the solvent polarity, with non-polar solvents favoring syn addition while polar solvents favor anti addition.167 Indeed, the major product isolated in this case was that of syn addition from the α-face, namely 3,6-di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-chloro-2-deoxy-2-fluoro-α-D-glucopyranosyl chloride (2.13), in 49% yield. This compound was readily deprotected using NaOMe in methanol to yield the desired target 2.3, which was purified by silica gel chromatography (Scheme 2.3). The syn addition stereochemistry in this case was established by analysis of the 1H and 19F-NMR spectra of the addition products. The large J3,F coupling constant of 23.3 Hz indicated an axial fluorine at C-2 with a trans-diaxial coupling to H-3. In addition, the relatively small J1,F coupling constant of 6.1 Hz indicated the absence of a trans-diaxial coupling in that case, and thus the presence of an equatorial proton at C-1 and therefore of an axial anomeric chlorine.

Scheme 2.4: Synthesis of 2-deoxy-2,2-difluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-arabinohexopyranosyl chloride (2.4). i) AcOF, CFCl3, 59%; ii) hydrazine acetate, DMF, 68%; iii) SOCl2, BiOCl, CH2Cl2, 13%; iv) NH3, MeOH, 83%; v) Selectfluor™, CH3NO2/water (v/v = 4:1), 84%.

For the synthesis of the target molecule 2.4, per-O-acetylated 2-fluoro-maltal (2.5) was subjected to further electrophilic fluorination with acetyl hypofluorite to afford 1,3,6-tri-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-deoxy-2,2-difluoro-α/β-D-arabinohexopyranose (2.14) in a good yield (59%). This reaction was carried out in the radio-synthesis facility of TRIUMF with the help of Dr. Nathaniel C. Lim. A two-step anomeric chlorination process was subsequently developed in which hydrazine acetate was first used to selectively deprotect the anomeric acetate, yielding the α-hemiacetal 2.15 in a good yield (68%), as shown in Scheme 2.4. Our original plan was to replace this anomeric hydroxyl with chloride by reacting it with thionyl chloride. Indeed, a previous group member, Dr. John McCarter, found that the monosaccharide version of the hemiacetal 2.15, namely 3,4,6-tri-O-acetyl-2-deoxy-2,2-difluoro-α-D-arabino-hexopyranose, could be converted to the corresponding α-glycosyl chloride when exposed to SOCl2 at 60 °C for three days. The formation of the α-chloride is most likely a consequence of an initial β-chloride formation from displacement of the α-sulfinyl chloride, followed by equilibration to the thermodynamically favored α-chloride driven by the excess chloride in solution. However, under similar conditions, the disaccharide hemiacetal 2.15 underwent cleavage of its inter-glycosidic linkage and the major product isolated was the degraded monosaccharide. Exposure to strongly acidic conditions at high temperature was most likely the primary reason for the cleavage. This chlorination reaction was therefore attempted at lower temperature in either neat SOCl2 or SOCl2/CH3CN, but still none of the desired product was obtained, even after prolonged stirring. Presumably the geminal fluorine substituents at C-2 severely slow down the displacement chemistry at the anomeric center of 2.15.

After unsuccessful experimentation with a range of chlorination conditions, including the use of triphenylphosphine with N-chlorosuccinimide168, we were delighted to discover that the addition of bismuth (III) oxychloride (BiOCl) to the thionyl chloride allowed the chlorination reaction to proceed at room temperature to generate the desired product 2.16, albeit in a low yield (Scheme 2.4).169 BiOCl is thought to act as a pro-catalyst for BiCl3, which is difficult to handle, but which itself acts as an efficient Lewis acid catalyst for a range of reactions.170,171 The α-configured chloride of 2.16 is clearly evidenced by the small JH1,F2axial coupling constant of 6.4 Hz measured, with the other coupling constant, JH1,F2equatorial, being undetectable by NMR. By contrast, an axial anomeric proton would exhibit one large trans-diaxial coupling and one small coupling. Deprotection of 2.16 was simply achieved using ammonia in methanol to afford the pure target compound 2.4.

Another much simpler preparation method for compound 2.15 was developed after finishing the synthesis of the target 2.4.172 We found that the synthesis of this hemiacetal could be achieved by directly reacting per-O-acetylated 2-fluoro-maltal (2.5) with Selectfluor™ in a nitromethane/water mixed solvent, as shown in Scheme 2.4. This route employed a much more stable and much safer fluorinating agent, and an excellent yield (>80%) was obtained.
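The practical benefit of the shorter route can be estimated by multiplying the step yields listed in Scheme 2.4. The Python sketch below compares the original sequence from 2.5 to 2.4 (steps i to iv) with the revised sequence in which step v replaces steps i and ii. It assumes the reported isolated yields simply multiply, which ignores any losses or gains on scale-up; the function and variable names are illustrative.

```python
from math import prod


def overall_yield(step_yields_percent):
    """Overall percentage yield of a linear sequence of steps."""
    return 100.0 * prod(y / 100.0 for y in step_yields_percent)


# Step yields (%) as listed in Scheme 2.4.
ORIGINAL_ROUTE = [59, 68, 13, 83]  # steps i, ii, iii, iv
REVISED_ROUTE = [84, 13, 83]       # step v replaces steps i and ii

if __name__ == "__main__":
    print(f"original route: {overall_yield(ORIGINAL_ROUTE):.1f}% overall")
    print(f"revised route:  {overall_yield(REVISED_ROUTE):.1f}% overall")
```

On these figures the revised route roughly doubles the overall yield (about 9% versus about 4%), although the low-yielding chlorination step (13%) remains the bottleneck in both cases.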

2.3.3 Kinetic evaluations of 2-deoxy-2,2-dihalo maltosides as potential mechanism-based inhibitors of HPA

In order to evaluate potential inhibitors of HPA, a suitable kinetic assay is needed to quantify the activity of this enzyme. There exist many ways to assay activities of α-amylases and each one has its advantages and drawbacks.173 For example, the natural substrate starch can be used with a reducing sugar assay. However, the large molecular weight and heterogeneous nature of this polymer limits its use for detailed kinetic analysis. Use of the shorter and defined versions of starch, maltooligosaccharides, can overcome these limitations. Their hydrolysis can be quantified by either HPLC analysis or by reaction of the liberated reducing ends with suitable reagents (such as 3,5-dinitrosalicylic acid) in a stopped photometric assay. However, both these approaches involve stopped assays, which are time consuming and not ideal for inhibitor and inactivation studies. Fortunately, several synthetic substrates have been developed for continuously monitoring reactions catalyzed by α-amylases174 and the one we chose is the commercially available 2-chloro-4-nitrophenyl α-maltotrioside (CNPG3), as shown in Figure 2.16. A previous group member, Dr. Shin Numao evaluated CNPG3 as a substrate of HPA and demonstrated that it was a good HPA substrate with Km = 3.6 mM and kcat = 1.9 s⁻¹.173 The enzyme-catalyzed release of the chromophore, 2-chloro-4-nitrophenolate, is conveniently followed by monitoring the increase in absorbance at 400 nm in a UV-Vis spectrometer.

Figure 2.16: Structure of 2-chloro-4-nitrophenyl α-maltotrioside (CNPG3)

Kinetic analysis of mechanism-based inactivators usually involves incubating the enzyme with the compound of interest and monitoring the loss of activity with time. In the case of HPA, aliquots of this mixture were removed at time intervals and diluted into assay cells containing a large volume (~1 mL) of CNPG3 (final concentration: 2 mM). This effectively stops the inactivation, both by dilution of the inactivator and by competition with an excess of substrate. The residual enzymatic activity was determined from the rate of hydrolysis of the substrate, which is directly proportional to the amount of active enzyme. If the compound is indeed a mechanism-based inhibitor, a time-dependent exponential loss of enzyme activity will be observed. Further discussion of the inactivation kinetics can be found in Appendix I of this thesis.
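The time-dependent exponential loss of activity described above is commonly analysed as a pseudo-first-order decay, v(t) = v0·exp(−k_obs·t). The sketch below shows one way such a fit could be done by linear regression on ln(v). The data are synthetic and the function name is hypothetical; it is not the fitting procedure used in this thesis, which is described in Appendix I.

```python
import math


def fit_kobs(times_min, rates):
    """Least-squares fit of ln(rate) = ln(v0) - k_obs * t.

    Returns (k_obs in min^-1, v0). Assumes all rates are positive.
    """
    n = len(times_min)
    logs = [math.log(v) for v in rates]
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic examples (not experimental data): one decaying trace and one flat
# trace of the kind expected when no inactivation takes place.
times = [0.0, 20.0, 40.0, 80.0, 120.0, 160.0]
decaying = [0.3 * math.exp(-0.01 * t) for t in times]
flat = [0.3 for _ in times]

if __name__ == "__main__":
    print(fit_kobs(times, decaying))  # k_obs close to 0.01 min^-1
    print(fit_kobs(times, flat))      # k_obs close to zero
```

A fitted k_obs that is indistinguishable from zero over the whole time course corresponds to the "no time-dependent inactivation" outcome.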

2.3.3.1 Evaluation of 2-chloro-2-deoxy-2-fluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-glucopyranosyl chloride (2.3) as an HPA inactivator.

Potential inactivator 2.3 (50 mM) was incubated with wild-type HPA at room temperature and the residual enzyme activity was monitored as a function of time, as shown in Figure 2.17 (a). Surprisingly, no time-dependent loss of enzymatic activity was observed. Two possibilities could rationalize the lack of HPA inactivation by compound 2.3. The first is that 2.3 is actually a slow substrate of HPA for which the glycosylation step is rate-limiting, as is the case for the reaction of 2-deoxy-2-fluoro-α-maltosyl fluoride with HPA. However, both TLC and mass spectrometric analysis of the mixture of HPA and compound 2.3 after overnight incubation demonstrated that no hydrolysis had occurred, ruling out this possibility. The second is that this compound does not bind to HPA. This hypothesis was tested by evaluating compound 2.3 as a reversible HPA inhibitor. As Figure 2.17 (b) demonstrates, no significant inhibition of the hydrolysis of CNPG3 (5 mM) was observed at concentrations of 2.3 up to 20 mM, indicating that 2.3 does not bind appreciably at the active site.

[Figure 2.17, panels a) and b): initial rate (A400/min) plotted against time (min) and against [compound 2.3] (mM), respectively.]

Figure 2.17: Kinetic evaluations of compound 2.3 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.3 (50 mM). b) Kinetic analysis of 2.3 as a reversible inhibitor of HPA in the presence of 5 mM CNPG3.

2.3.3.2 Evaluation of 2-deoxy-2,2-difluoro-4-O-[α-(1,4)-D-glucopyranosyl]-α-D-arabinohexopyranosyl chloride (2.4) as an inactivator of HPA.

In a similar manner, compound 2.4 (50 mM) was incubated with wild-type HPA at room temperature and the residual enzyme activity was measured as a function of time, as shown in Figure 2.18 (a). Disappointingly, no time-dependent loss of enzymatic activity was detected in this case either. Mass spectrometric analysis of the incubation mixture confirmed that no hydrolysis of compound 2.4 had occurred, indicating that it was not a substrate of the enzyme. Compound 2.4 was also tested as a reversible inhibitor of HPA, and only very weak binding was observed, as shown in Figure 2.18 (b). Consistent with this, no ligand was observed at the active site when HPA crystals were soaked with a concentrated solution of this compound (data not shown).

[Figure 2.18, panels a) and b): initial rate (A400/min) plotted against time (min) and against [compound 2.4] (mM), respectively.]

Figure 2.18: Kinetic evaluations of compound 2.4 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.4 (50 mM). b) Kinetic analysis of 2.4 as a reversible inhibitor of HPA in the presence of 2 mM CNPG3.

2.3.3.3 Evaluation of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12) as an HPA inactivator

Previous studies showed that 2-deoxy-2-fluoro α-maltosyl fluoride was a slow substrate, rather than a mechanism-based inhibitor, of HPA with Km = (4.7 ± 0.9) mM and kcat = (0.17 ± 0.02) s⁻¹.94 Since its epimer, compound 2.12, was readily available from my previous syntheses, we decided to test it as an HPA inactivator. Experimental methods similar to those described for compounds 2.3 and 2.4 were employed. Interestingly, not only is compound 2.12 not a mechanism-based inhibitor of HPA (Figure 2.19 (a)), but it is also not turned over by the enzyme, in contrast to what was observed for 2-deoxy-2-fluoro-α-maltosyl fluoride. This conclusion was supported by TLC, MS analysis and fluoride electrode monitoring of the incubation mixture of compound 2.12 and HPA. Kinetic analysis of this compound as a reversible HPA inhibitor was also carried out (Figure 2.19 (b)) and showed that 2.12 did not bind significantly to HPA at concentrations up to 15 mM, which explains why it was not turned over by the enzyme.

[Figure 2.19, panels a) and b): initial rate (A400/min) plotted against time (min) and against [compound 2.12] (mM), respectively.]

Figure 2.19: Kinetic evaluations of compound 2.12 with wild type HPA. a) Residual enzyme activity of HPA versus time in the presence of 2.12 (20 mM). b) Kinetic analysis of 2.12 as a reversible inhibitor of HPA in the presence of 5 mM CNPG3.

2.3.3.4 General conclusions from the kinetic analysis of the interactions of 2-deoxy-2,2-dihalo maltosides with wild-type HPA.

The lack of inactivation of HPA by compounds 2.3 and 2.4 was initially unexpected, since trinitrophenyl 2-deoxy-2,2-difluoro maltoside had previously been shown to be a slow but effective time-dependent inactivator of human α-amylase73, and since chloride should inherently be a better leaving group than trinitrophenolate. Presumably, binding interactions between the aryl moiety and the enzyme active site accelerate, in relative terms, the reaction of the amylase with the trinitrophenyl glycoside. Indeed, a similar phenomenon is seen with yeast α-glucosidase when the inactivation parameters for the different inactivators are compared (Table 2.1). Previous group members, Dr. John McCarter and Dr. Curtis Braun, demonstrated that all three compounds, (2.1), (2.2) and trinitrophenyl 2,2-difluoroglucoside, are mechanism-based inhibitors of GH13 yeast α-glucosidase73, but with widely varying inactivation parameters, as shown in Table 2.1. The trinitrophenyl 2,2-difluoroglucoside (ki/Ki = 0.25 min⁻¹ mM⁻¹) inactivates yeast α-glucosidase 2,000-fold faster than does the corresponding α-chloride. On that basis, the lack of inactivation of human pancreatic α-amylase by the 2,2-difluoro maltosyl chloride is not so surprising, given that even the trinitrophenyl 2,2-difluoromaltoside is a slow inactivator (ki/Ki = 7.3 × 10⁻³ min⁻¹ mM⁻¹). Interestingly, the replacement of a chlorine at the C-2 position by a fluorine in the 2,2-dihaloglucosyl chloride results in a 50-fold slower inactivation of yeast α-glucosidase. This rate difference is undoubtedly due, at least in part, to the greater electronegativity of fluorine compared with chlorine, and thus its larger inductive effect on the transition state.

Table 2.1: Comparison of kinetic parameters for inactivation of related GH13 enzymes by 2-deoxy-2,2-dihaloglycosides

Potential Inactivator                    Enzyme                 ki/Ki (min⁻¹ mM⁻¹)
Trinitrophenyl 2,2-difluoroglucoside     Yeast α-glucosidase    0.25 (a)
Compound 2.1                             Yeast α-glucosidase    (5.3 ± 3.4) × 10⁻³ (a)
Compound 2.2                             Yeast α-glucosidase    (9.1 ± 3.3) × 10⁻⁵ (a)
Trinitrophenyl 2,2-difluoromaltoside     HPA                    0.0073 (a)
Compound 2.3                             HPA                    No inactivation (b)
Compound 2.4                             HPA                    No inactivation (b)

(a) previous work73,175 (b) this thesis
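The fold differences discussed in this section can be checked with simple arithmetic on the ki/Ki values in Table 2.1. The sketch below uses only the central values from the table and ignores the quoted uncertainties (which are large for compounds 2.1 and 2.2), so the ratios should be read as order-of-magnitude estimates. The dictionary keys and function name are illustrative.

```python
# Central ki/Ki values (min^-1 mM^-1) copied from Table 2.1; uncertainties are
# deliberately ignored in this rough comparison.
KI_OVER_KI = {
    "TNP 2,2-difluoroglucoside": 0.25,
    "compound 2.1": 5.3e-3,
    "compound 2.2": 9.1e-5,
    "TNP 2,2-difluoromaltoside": 7.3e-3,
}


def fold_difference(faster, slower):
    """Ratio of two second-order inactivation rate constants."""
    return KI_OVER_KI[faster] / KI_OVER_KI[slower]


if __name__ == "__main__":
    # Aryl glycoside versus the 2,2-difluoro chloride with yeast alpha-glucosidase.
    print(round(fold_difference("TNP 2,2-difluoroglucoside", "compound 2.2")))
    # 2.1 versus 2.2 (chlorine replaced by fluorine at C-2).
    print(round(fold_difference("compound 2.1", "compound 2.2")))
```

With the central values alone the first ratio comes out in the low thousands and the second in the tens, in line with the trends described in the text.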

The faster inactivation of retaining glycosidases by compounds containing an aryl aglycone, compared with those containing a non-aryl aglycone, is not without precedent. In the case of GH1 Agrobacterium sp. β-glucosidase (Abg), the second-order inactivation rate constant ki/Ki was determined to be 14.6 min⁻¹ mM⁻¹ for 2-deoxy-2-fluoro-β-glucosyl fluoride.69 When the fluoride aglycone was replaced by 2,4-dinitrophenol, this inactivation rate constant increased roughly 30-fold, to 502 min⁻¹ mM⁻¹.176 Presumably, the aryl aglycone makes some non-covalent interactions with the enzyme active site, leading to more efficient inactivation. We reasoned that an ideal mechanism-based inactivator of HPA should still bear two fluorines at C-2 and an activated aryl aglycone (but not trinitrophenyl). Among the common leaving groups, p-toluenesulfonyl, or tosyl, is an excellent leaving group that also has an aryl moiety. Therefore tosyl 2-deoxy-2,2-difluoro α-maltoside (2.17) was designed, as shown in Figure 2.20.

Figure 2.20: Chemical structure of p-toluenesulfonyl 2-deoxy-2,2-difluoro α-maltoside (2.17)

2.3.4 Attempted synthesis of p-toluenesulfonyl 2-deoxy-2,2-difluoro-α-maltoside (2.17).

The general strategy for converting an alcohol into a tosylate involves simply reacting the alcohol with tosyl chloride in the presence of a base. If the substrate is stable under strongly basic conditions, NaH can conveniently be used as the base for this transformation. Unfortunately, our starting material for the tosylation reaction, compound 2.15, bears six acetyl groups and is very labile under strongly basic conditions. Therefore, weak bases such as pyridine, DMAP or triethylamine were used.

As shown in Scheme 2.5, the first method attempted involved reacting compound 2.15 with tosyl chloride in pyridine with DMAP as the catalyst.177 At room temperature this reaction proceeded extremely slowly. After three days’ stirring, only a trace amount of product could be detected by MS, with no product visible on TLC. Heating the reaction to 50 °C still did not result in significant conversion. Presumably, the electron-withdrawing effect of the gem-difluoro group at C-2 greatly decreases the nucleophilicity of the anomeric hydroxyl. Similar problems were encountered during the chlorination of this anomeric hydroxyl in the preparation of compound 2.16. Consequently this method was abandoned.

[Scheme 2.5 reaction conditions: first route, 2.15 with TsCl and DMAP in pyridine at R.T. (too slow); second route, 2.15 with TsCl and triethylamine in dry CH2Cl2 at 45 °C for two days, followed by AcCl/MeOH at 4 °C for 3 days to give 2.17 (anomeric mixture).]

Scheme 2.5: Attempted synthesis of tosyl 2-deoxy-2,2-difluoro α-maltoside (2.17)

The second tosylation method attempted (Scheme 2.5) employed triethylamine as the base and involved reacting 2.15 with tosyl chloride in dry CH2Cl2 at 45 °C.178 Encouragingly, a new UV-active spot was detected on TLC after two days of reaction, and all the starting material had been consumed. MS analysis showed that the new product had the molecular weight expected for the tosylated product. After column chromatography, NMR revealed that the product was an anomeric mixture of the protected tosylated disaccharides. Since these two products could not be separated, the mixture was deprotected using acetyl chloride/MeOH. Again, TLC showed a single major UV-active spot and, after column purification, NMR and MS analysis confirmed that this was still an anomeric mixture of the deprotected tosylated disaccharides. Because only a small amount of starting material 2.15 was available, the separation of this tosylated anomeric mixture was not pursued further. However, it would be very interesting to incubate HPA directly with this anomeric mixture and monitor the residual enzyme activity, since it is probable that only the “right” anomer will be selected by the enzyme.

2.3.5 Attempted trapping of the covalent intermediate of HPA by using the acid/base mutant E233Q of HPA.

One of the strategies for slowing down the deglycosylation step of retaining glycosidases involves mutating the enzyme’s general acid/base residue and reacting such mutants with an activated substrate that does not need acid catalysis. As mentioned in Section 2.1.3, this has been applied successfully to accumulate the covalent intermediate of GH13 transglycosidases such as CGTase from Bacillus circulans.132 However, this strategy alone is not enough to accumulate a significant amount of covalent intermediate for hydrolytic α-amylases, such as HPA. We envisioned that it might be possible to decrease the turnover rate of the covalent intermediate further by combining the fluorosugar strategy with mutation of the acid/base catalyst. Indeed, the covalent intermediates of several retaining glycosidases have been successfully trapped and subsequently studied by combining these two strategies, with notable examples including GH22 hen egg white lysozyme (HEWL)72, GH26 β-mannanase 26A from Pseudomonas cellulosa84 and GH38 Golgi α-mannosidase II from Drosophila melanogaster79. Since 2-deoxy-2-fluoro-α-maltosyl fluoride (2.19) has been shown to be a slow substrate of wild-type HPA, it was logical for us to test this disaccharide as a potential mechanism-based inhibitor of the general acid/base mutant (E233Q) of HPA.

[Scheme 2.6: 2.8 → 2.18 (step i) → 2.19 (step ii).]

Scheme 2.6: Synthesis of 4-O-[α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (2.19). i) XeF2, BF3·Et2O, Et2O/CH2Cl2 mixed solvent (V(Et2O)/V(CH2Cl2) = 5:1), 42% yield; ii) NaOMe, MeOH, 76% yield.

An earlier synthesis of 2-deoxy-2-fluoro α-maltosyl fluoride in our group involved reacting the protected maltal with F2 to generate two syn addition products. Subsequent deprotection of the principal “malto”-configured product afforded 2-deoxy-2-fluoro-α-maltosyl fluoride in good yield. Because of the inconvenience of using F2, we decided to change the fluorine source to XeF2. Compared with F2, XeF2 is much safer, easier to handle and can be used in glassware.179 Despite being considered a tamer version of F2, XeF2 is still a very reactive substance that has found wide application in many types of reactions, such as fluorodeiodination, fluorodecarboxylation, aromatic ring fluorination and perfluoroalkylation.179,180 In carbohydrate chemistry, XeF2 has been demonstrated to be an excellent electrophilic fluorine source for reactions with glycals, generating the corresponding 1,2-difluorinated sugars. For example, when per-O-acetylated D-glucal was treated with XeF2, the major isolated product was 3,4,6-tri-O-acetyl 2-deoxy-2-fluoro α-D-glucopyranosyl fluoride, in 61% yield.181 Encouraged by this, we treated per-O-acetylated maltal 2.8 with 1.1 equivalents of XeF2 and a catalytic amount of BF3 etherate at room temperature (Scheme 2.6). The major product isolated was per-O-acetylated 2-deoxy-2-fluoro α-maltosyl fluoride (2.18), in 42% yield. The equatorial configuration of F-2 was firmly established by the relatively small JH3,F2 = 10.9 Hz, since an axial F-2 would exhibit one large trans-diaxial coupling constant. The small JH1,H2 = 2.7 Hz clearly demonstrates the absence of a trans-diaxial coupling in that case, and thus the presence of an equatorial proton at C-1 and therefore of an axial anomeric fluorine. Subsequent Zemplén deprotection of 2.18 afforded the target compound 2.19.

[Figure 2.21: initial rate (A400/min) plotted against time (min).]

Figure 2.21: Residual enzyme activity of HPA E233Q mutant versus time in the presence of compound 2.19 (10 mM).

Incubation of compound 2.19 (10 mM) with the acid/base mutant of HPA (E233Q) at room temperature, however, did not result in a time-dependent loss of enzyme activity, as shown in Figure 2.21. Several reasons could explain this lack of time-dependent inactivation. One possibility is that compound 2.19 does not bind to the mutant enzyme, or binds to it only very weakly. Another possibility is that the mutant enzyme cannot form the covalent intermediate with 2.19 because it lacks the general acid residue E233. Owing to the limited availability of the mutant enzyme, this was not investigated further.

2.3.6 Application of the “in situ” inhibitor elongation methodology to generate mechanism-based inhibitors of HPA: using 2-deoxy-2,2-dihalo glycosides.

One of the problems that hinders the use of 2-deoxy-2,2-dihalo maltosides as mechanism-based inhibitors of HPA is their weak binding to the enzyme, as mentioned in Section 2.3.3. This is partially because α-amylases, including HPA, have an extended active site featuring a number of glucose-binding subsites and generally good occupancy of these subsites is required for good inhibition. Indeed, it has been demonstrated that HPA can transfer glucose residues onto some weak-binding inhibitors in situ, forming an elongated product that has much improved binding affinity to the enzyme, as illustrated in Figure 2.10.149 Therefore, we hypothesized that if we could utilize HPA’s own “transglycosylation machinery” to elongate the 2,2-dihalo maltosides in situ, the resultant 2,2-dihaloglycosides could possibly act as mechanism-based inhibitors, trapping the covalent intermediate, as illustrated in Figure 2.22 (a).

In the original report of this “in situ” elongation methodology, α-maltotriosyl fluoride (G3F, compound 2.20; Km = (0.26 ± 0.02) mM, kcat = (215 ± 5) s⁻¹) was used as the glycosyl donor to elongate the inhibitor GHIL. HPLC analysis of the reaction of HPA with G3F revealed that, besides the hydrolysis product maltotriose, significant amounts of elongated products such as α-maltohexaosyl fluoride were formed via self-transglycosylation.123 In order to maximize the chance of transglycosylation occurring between the glycosyl donor and weak-binding inhibitors such as compound 2.4, this self-transglycosylation of the glycosyl donor must be avoided. One way to do this is to use the methylated version of the glycosyl donor, MeG2F (2.21), shown in Figure 2.22 (b).174

Figure 2.22: a) Hypothesized in situ elongation mechanism for weak-binding inhibitors of HPA; b) Chemical structures of two glycosyl donors used in the in situ inhibitor elongation methodology: α-maltotriosyl fluoride (G3F, 2.20); 4’-O-methyl α-maltosyl fluoride (MeG2F, 2.21)

2.3.6.1 Chemical synthesis of 4’-O-methyl α-maltosyl fluoride (MeG2F)

A previous group member, Dr. Iben Damager, developed the synthetic route to MeG2F (2.21) shown in Scheme 2.7.174 Briefly, maltose was reacted with benzaldehyde dimethylacetal in the presence of an acid catalyst to protect 4’-OH and 6’-OH. Acetylation of the remaining hydroxyls was carried out using acetic anhydride in pyridine. Regioselective opening of the benzylidene acetal to expose the 4’-OH was achieved using sodium cyanoborohydride and HCl in diethyl ether. Presumably this selectivity is the result of the stronger basicity of 4’-OH over 6’-OH, which leads to preferential protonation at 4’-OH.182,183 Subsequent methylation of the free 4’-OH was accomplished using trimethylsilyldiazomethane catalyzed by BF3 etherate. The benzyl group at 6’-OH was then removed by catalytic hydrogenation, followed by acetylation in acetic anhydride/pyridine. Fluorination of the anomeric acetate was carried out in HF/pyridine, yielding the α-fluoride as the major isolated product as a result of the anomeric effect. The final compound 2.21 was obtained by deprotection of the fluorinated product using NH3/MeOH.

[Scheme 2.7: maltose → 2.22 (steps i, ii) → 2.23 (step iii) → 2.24 (step iv) → 2.25 (step v) → 2.26 (step vi) → 2.27 (step vii) → 2.21 (step viii).]

Scheme 2.7: Repeating Dr. Damager’s synthetic route towards 4’-O-methyl-α-maltosyl fluoride (2.21). i) benzaldehyde dimethylacetal, p-toluenesulfonic acid, DMF; ii) acetic anhydride, pyridine, 22% yield over two steps; iii) Na(CN)BH3, 1 M HCl in diethyl ether, THF, 78% yield; iv) trimethylsilyldiazomethane, BF3·Et2O, CH2Cl2, 29% yield; v) H2, Pd/C (10 wt. %), EtOAc, 99% yield; vi) acetic anhydride, pyridine, 96% yield; vii) HF/pyridine, 71% yield; viii) NH3/MeOH, 99% yield.

Since large amounts of MeG2F were needed for the studies envisaged, Dr. Damager’s synthetic route was initially repeated on a much larger scale. Most of the steps in this pathway proceeded reasonably well, as shown in Scheme 2.7. However, the methylation step converting 2.23 to 2.24 using trimethylsilyldiazomethane (TMS-CHN2) gave a very low yield (≤ 29%), which seriously lowered the overall yield of the final product. A similar phenomenon was observed by Dr. Damager, who noted in her publication that no higher than a 50% yield could be obtained for this methylation reaction.174 TMS-CHN2, as a stable version of diazomethane, has been widely used in the methylation of carboxylic acids.184 Recent mechanistic studies of this methyl esterification reaction revealed that the real methylating agent is not TMS-CHN2, but rather the free CH2N2 generated in situ from the reaction between TMS-CHN2 and the common co-solvent for this reaction, methanol.185 We can therefore assume that the methylation reaction in Scheme 2.7 follows a similar mechanism (Figure 2.23), in which half of compound 2.23 is consumed in generating the free diazomethane, which in turn methylates the remaining half of the free alcohol 2.23. This mechanism satisfyingly rationalizes why a maximum yield of 50% can be obtained for this reaction.

[Figure 2.23: reaction scheme showing TMS-CHN2, ROH, TMS-OR, CH2N2 and ROMe.]

Figure 2.23: Proposed mechanism of the methylation reaction by TMS-CHN2. “ROH” in this figure represents compound (2.23).

In order to overcome the limitation of this low-yielding methylation reaction, we designed another synthetic route that should be suitable for large-scale production of MeG2F, as shown in Scheme 2.8. Since the most common high-yielding alcohol methylation method employs NaH/MeI, the acetate protecting groups of the original route (Scheme 2.7) were replaced by benzyl groups. This new synthetic route again started from maltose, whose 4’-OH and 6’-OH were protected with a benzylidene group. Benzylation of the remaining hydroxyls was conveniently carried out in the crude reaction mixture by adding NaH and subsequently benzyl bromide. Separation of the per-O-benzylated maltose from the per-O-benzylated 4’,6’-benzylidene maltose was difficult at this stage, so this “purified” mixture was subjected directly to regioselective ring opening at 4’-OH using Na(CN)BH3 and HCl in ether. The overall yield of compound 2.28 from these three steps after column purification was 23%. Methylation of the free 4’-OH was accomplished in an excellent yield (91%) using NaH and MeI. The methylation was clearly indicated by the significant decrease in the Rf value of the product 2.29 on TLC and by an increase of 14 Da in the MS of the product compared with that of the starting material 2.28. Subsequent catalytic hydrogenation and acetylation converted 2.29 into 2.26 in good yield (82%) over two steps. The same chemistry as in Scheme 2.7 was then used to generate the final compound 2.21 from 2.26, as shown in Scheme 2.8.

[Scheme 2.8: maltose → 2.28 (steps i to iii) → 2.29 (step iv) → 2.26 (steps v, vi) → 2.27 (step vii) → 2.21 (step viii).]

Scheme 2.8: Improved synthesis of 4’-O-methyl α-maltosyl fluoride (2.21). i) benzaldehyde dimethylacetal, p-toluenesulfonic acid, DMF; ii) NaH, BnBr, DMF; iii) Na(CN)BH3, 1 M HCl in diethyl ether, THF, 23% yield over three steps; iv) NaH, MeI, THF, 91% yield; v) H2, Pd/C (10 wt. %), EtOAc/MeOH; vi) acetic anhydride, pyridine, 82% yield over two steps; vii) HF/pyridine, 71% yield; viii) NH3/MeOH, 99% yield.
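To put the two routes side by side, the overall yield of a linear sequence can be estimated as the product of its step yields. The sketch below does this using the yields listed in the captions of Schemes 2.7 and 2.8; combined multi-step yields are entered as a single factor, and the variable and function names are illustrative only.

```python
from math import prod

# Step yields (as fractions) taken from the captions of Schemes 2.7 and 2.8.
# Yields reported over several steps are entered as one combined factor.
ROUTE_SCHEME_2_7 = [0.22, 0.78, 0.29, 0.99, 0.96, 0.71, 0.99]
ROUTE_SCHEME_2_8 = [0.23, 0.91, 0.82, 0.71, 0.99]


def overall_yield(step_yields):
    """Overall yield of a linear sequence as the product of the step yields."""
    return prod(step_yields)


if __name__ == "__main__":
    y_old = overall_yield(ROUTE_SCHEME_2_7)
    y_new = overall_yield(ROUTE_SCHEME_2_8)
    print(f"Scheme 2.7: {y_old:.1%}")
    print(f"Scheme 2.8: {y_new:.1%}")
    print(f"Improvement: {y_new / y_old:.1f}-fold")
```

On these figures the benzyl-protected route gives a several-fold higher overall yield from maltose, with most of the gain coming from the methylation step.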

2.3.6.2 Attempted in situ elongation of 2-deoxy-2,2-difluoro α-maltosyl chloride (2.4) with MeG2F (2.21) by HPA

High concentrations of both the glycosyl donor MeG2F (10 mM) and the acceptor 2-deoxy-2,2-difluoro α-maltosyl chloride (50 mM) were incubated with wild-type HPA at 30 °C for an hour before aliquots were taken at time intervals and assayed with CNPG3. If HPA does indeed transfer a 4’-O-methyl maltose unit from MeG2F onto the non-reducing end of compound 2.4, and if the resultant tetrasaccharide is a mechanism-based inhibitor, a time-dependent loss of enzyme activity should be detected. The residual enzyme activity measured at different times is shown in Figure 2.24.

[Figure 2.24: initial rate (A400/min) plotted against time (min).]

Figure 2.24: Residual enzyme activity of HPA versus time in the presence of 10 mM MeG2F (2.21) and 50 mM 2-deoxy-2,2-difluoro α-maltosyl chloride (2.4) after pre-incubation at 30 °C for an hour.

Disappointingly, the enzyme remained fully active even after incubation for six hours. An aliquot of this mixture was subjected to MS analysis to investigate this lack of inhibition. Interestingly, the mass spectrum revealed the existence of at least three species in the mixture, as shown in Figure 2.25 (a). Peak A (Figure 2.25 (b)) corresponds to hydrolyzed MeG2F, while Peak B (Figure 2.25 (c)) has the mass of 2-deoxy-2,2-difluoro α-maltosyl chloride (2.4). The third, minor but detectable, peak, Peak C, matched the mass of the hypothesized elongated tetrasaccharide generated in situ, with 4’-O-methyl maltose at the non-reducing end and 2-deoxy-2,2-difluoro α-maltosyl chloride at the reducing end (Figure 2.25 (d)). The two isotopic peaks due to the two chlorine isotopes (³⁵Cl and ³⁷Cl) are clearly visible in the expansion of Peak C. Overall, these data showed that HPA was not inactivated under these conditions and that the elongated 2.4 did not act as a mechanism-based inactivator.
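The chlorine isotope pattern mentioned above can be estimated from the natural abundances of ³⁵Cl and ³⁷Cl. The sketch below computes the relative intensities expected from chlorine alone for a species containing a given number of chlorine atoms. The abundance values are approximate literature values, the function name is illustrative, and contributions from other elements (for example ¹³C) are ignored.

```python
from math import comb

# Approximate natural abundances of the two stable chlorine isotopes.
ABUNDANCE_CL35 = 0.7576
ABUNDANCE_CL37 = 0.2424


def chlorine_pattern(n_chlorine):
    """Relative intensities of the M, M+2, M+4, ... peaks from chlorine alone.

    Intensities follow a binomial distribution over the number of 37Cl atoms
    and are normalised so that the most intense peak equals 1.0.
    """
    raw = [
        comb(n_chlorine, k)
        * ABUNDANCE_CL37 ** k
        * ABUNDANCE_CL35 ** (n_chlorine - k)
        for k in range(n_chlorine + 1)
    ]
    top = max(raw)
    return [p / top for p in raw]


if __name__ == "__main__":
    m, m_plus_2 = chlorine_pattern(1)
    print(f"M : M+2 = {m:.2f} : {m_plus_2:.2f}")
```

For a monochlorinated species this gives an M+2 peak at roughly one third the height of the M peak, two mass units higher, which is the kind of pattern expected for the pairs of peaks labelled in Figure 2.25 (c) and (d).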

[Figure 2.25, panels (a) to (d): intensity plotted against m/z. Labelled peaks: A at 379.3; B at 403.3 and 405.2; C at 741.3 and 743.2.]

Figure 2.25: Mass spectrum of the incubation mixture of compound 2.4 (50 mM), MeG2F (10 mM) and wild type HPA. (a) the whole spectrum; (b) expanded Peak A; (c) expanded Peak B; (d) expanded Peak C.

Although this elongation attempt did not result in inactivation of HPA, it does provide important hints that might be useful for further trapping studies. Firstly, elongation of a weak-binding HPA inhibitor in situ is indeed possible, as evidenced by the mass spectrum. Secondly, elongation of a disaccharide inhibitor such as 2.4 with MeG2F produces a tetrasaccharide, which may well be cleaved by HPA and will certainly have non-productive binding modes as an inactivator. It may therefore be a better plan to elongate a monosaccharide inactivator to a trisaccharide which, bound in the -1 to -3 subsites, would derive optimal interactions for inactivation while minimizing non-productive binding modes. Two classes of monosaccharide inhibitors were therefore envisaged for elongation: (1) 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2); (2) 5-fluoro-α-D-glucopyranosyl fluoride (2.30) and 5-fluoro-β-L-idopyranosyl fluoride (2.31), as shown in Figure 2.26.

Figure 2.26: Monosaccharide inhibitors (2.2, 2.30 and 2.31) which were incubated with HPA in the presence of MeG2F to generate elongated mechanism-based inhibitors in situ.

2.3.6.3 Attempted in situ elongation of 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2) with MeG2F by HPA

[Scheme 2.9: 2.32 → 2.33 (step i) → 2.34 (steps ii, iii) → 2.35 (step iv) → 2.36 (step v) → 2.37 (step vi) → 2.2 (step vii); alternatively 2.34 → 2.36 (step viii).]

Scheme 2.9: Synthesis of 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2). i) Selectfluor™, AcOH, CH3NO2, 27% yield; ii) HBr (33% in AcOH), CH2Cl2; iii) triethylamine, CH3CN, 87% yield over two steps; iv) FOAc, CFCl3, 76% yield; v) hydrazine acetate, DMF, 66% yield; vi) SOCl2, BiOCl, 19% yield; vii) NH3/MeOH, 52% yield; viii) Selectfluor™, CH3NO2/water (v/v = 4:1), 20% yield.

The synthesis of 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2) was achieved by a route similar to that which produced 2.4, as shown in Scheme 2.9. Fluorination of the commercially available 3,4,6-tri-O-acetyl D-glucal (2.32) was achieved using Selectfluor™ and acetic acid. The resulting anomeric mixture of 2.33 was subjected to bromination at the anomeric center followed by elimination in triethylamine/CH3CN to afford the protected 2-fluoro glucal 2.34 in an excellent yield of 87% over two steps. The double bond between C-1 and C-2 was clearly demonstrated by the high chemical shift of H-1 (δ = 6.76) and the small JH1,F2 coupling constant (4.7 Hz). The second fluorine at C-2 was introduced using acetyl hypofluorite. Subsequent selective deprotection of the fluorinated product 2.35 with hydrazine acetate generated the hemiacetal 2.36. Alternatively, compound 2.36 could be obtained by reacting 2-fluoro glucal 2.34 with Selectfluor™ and water, albeit in a lower yield (20%) than in the synthesis of its maltose analogue (Section 2.3.2.2). Chlorination of the anomeric center of 2.36 was carried out in neat SOCl2 in the presence of BiOCl at room temperature. The α-configuration of the chloride in the product 2.37 was evidenced by the two small JH1,F2 coupling constants (4.8 Hz and 2.0 Hz); a β-chloride would give rise to a large trans-diaxial coupling constant between H-1 and the axial F-2. Deprotection of 2.37 in ammonia/MeOH afforded the target 2.2 in a modest yield (52%).

[Figure 2.27: initial rate (A400/min) plotted against time (min).]

Figure 2.27: Residual enzyme activity of HPA versus time in the presence of MeG2F (20 mM) and 2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (100 mM) after pre-incubation at 30 °C for an hour.

High concentrations of compound 2.2 (100 mM) and MeG2F (20 mM) were incubated with wild-type HPA at 30 °C for an hour. Aliquots of this mixture were then taken at different time intervals and assayed with a fixed concentration of CNPG3. The residual enzyme activity versus time is shown in Figure 2.27. It is clear from this graph that no inactivation occurred in this case. The reasons for this result were not investigated further since, at the same time, another trapping strategy worked very well, as discussed in detail in the next section.

2.3.7 Direct in situ inhibitor elongation as a strategy to structurally characterize the covalent intermediate of HPA: the use of 5-fluoro glycosides

5-Fluoro glycosyl fluorides have proved to be more successful mechanism-based inhibitors of retaining α-glycosidases than 2-deoxy-2-fluoro glycosides.75,76 The bottleneck in applying 5-fluoro glycosyl fluorides to amylases, however, lies in the difficulty of synthesizing elongated versions. Our in situ elongation strategy may overcome this problem, since only the monosaccharide versions of 5-fluoro glycosides are needed for these elongation attempts.

2.3.7.1 Chemical synthesis of 5-fluoro-α-D-glucopyranosyl fluoride (2.30) and 5-fluoro-β-L-idopyranosyl fluoride (2.31)

The selective introduction of fluorine at C-5 of carbohydrates has been well documented in the literature. So far two synthetic methods have been developed for the synthesis of 5-fluoro sugars, as illustrated in Figure 2.28. The first route starts with a photobromination reaction via a radical mechanism to install bromine at C-5 (Figure 2.28(a)).186,187 The regioselectivity of this bromination reaction is believed to be determined by the relative stability of the radicals generated at different carbon centers of the carbohydrate. For a normal glucopyranoside, the radical generated at C-5 is the most stable since C-5 is the most substituted carbon and is coupled to an “ether” oxygen (Figure 2.28(b)). The C-5 bromide is thus the preferred product of this reaction and will predominantly adopt an axial configuration due to the anomeric effect of the endocyclic oxygen. Subsequent fluorination of the C-5 bromide is usually achieved using a silver salt.75 The conditions of the fluorination reaction largely control the stereochemistry of the product. When AgF is used, the major fluorinated product is the kinetic product with the stereochemistry at C-5 inverted. This fluorinated product needs to be exposed to HF/pyridine in order to generate the thermodynamically-favored 5-fluoro glycosyl

fluoride with an axial C-5 fluorine. Alternatively, a mixture of 5-fluoro glycosyl fluorides with both inverted and retained stereochemistry at C-5 can be directly obtained by treating the corresponding 5-bromo sugars with AgBF4 and the two components can be subsequently purified by chromatography. Overall the photobromination pathway is very straightforward but often suffers from narrow substrate scope and modest yield.

Figure 2.28: Synthesis of 5-fluoro sugar derivatives. (a) Photobromination route towards 5-fluoro sugar derivatives; (b) mechanism of the photobromination reaction; (c) retrosynthetic analysis of 5-fluoro sugar derivatives using the epoxide fluoridolysis method.

The second synthetic approach to 5-fluoro glycosyl fluorides was developed recently, with the key step for the introduction of C-5 fluorine being fluoridolysis of a C-5,6 epoxide group using HF/pyridine (Figure 2.28(c)).188 This C-5,6 epoxy group can be obtained by oxidation of a C-5,6 functionality, which can be prepared from a C-6 selenide. Compared with the previous photobromination method, this route definitely requires more steps but is suitable for substrates in which light-sensitive functional

groups are present. This epoxide fluoridolysis method has been successfully applied to the synthesis of several complicated targets such as uridine 5’-diphospho-5-fluoro-N-acetylglucosamine189 and 5-fluoro lactosamine glycosides129, which could be very difficult to assemble using the photobromination route. Due to the simplicity of our two targets, compounds 2.30 and 2.31, the photobromination method was employed for their synthesis, as shown in Scheme 2.10.

Scheme 2.10: Chemical synthesis of 5FGlcF (2.30) and 5FIdoF (2.31). i) HF/pyridine, 74% yield; ii) NBS, CCl4, hν, 62% yield; iii) AgF, CH3CN, 72% yield; iv) HF/pyridine, 23% yield; v) NH3/MeOH, 71% yield; vi) NH3/MeOH, 88% yield.

The commercially available per-O-acetylated glucopyranose was fluorinated using HF/pyridine to afford 2,3,4,6-tetra-O-acetyl-α-D-glucopyranosyl fluoride (2.38) in a good yield (74%). This glycosyl fluoride was subjected to photobromination in the presence of N-bromosuccinimide (NBS) to generate the 5-bromo glycosyl fluoride 2.39 in 62% yield. The replacement of H-5 by bromine was clearly indicated by the disappearance of the JH4,H5 coupling constant in the NMR spectra of 2.39, compared with that of 2.38. Fluorination of C-5 was achieved using AgF in CH3CN to produce 2,3,4,6-tetra-O-acetyl-5-fluoro-β-L-idopyranosyl fluoride (2.40) in 72% yield, with inversion of the C-5 configuration. Subsequent exposure to HF/pyridine epimerized the C-5 fluoride in 2.40 to generate the thermodynamically favored 2,3,4,6-tetra-O-acetyl-5-fluoro-α-D-glucopyranosyl fluoride (2.41) in a poor yield (23%). Deacetylation of 2.41 and 2.40 in ammonia/MeOH produced the desired targets, 5FGlcF (2.30) and 5FIdoF (2.31), respectively.
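The step yields quoted above can be combined into cumulative yields for the protected intermediates. The following is only an arithmetic sketch: the step labels and the helper name `cumulative_yield` are illustrative, and the two deacetylation steps are left out.

```python
from math import prod

# Isolated yields for steps i) to iv), as quoted in the text and in Scheme 2.10.
# The dictionary keys are illustrative labels, not nomenclature from the thesis.
STEP_YIELDS = {
    "i: HF/pyridine (peracetate -> 2.38)": 0.74,
    "ii: NBS, hv (2.38 -> 2.39)": 0.62,
    "iii: AgF, CH3CN (2.39 -> 2.40)": 0.72,
    "iv: HF/pyridine (2.40 -> 2.41)": 0.23,
}


def cumulative_yield(fractions):
    """Overall yield of a linear sequence, as the product of the step yields."""
    return prod(fractions)


yields = list(STEP_YIELDS.values())
yield_2_40 = cumulative_yield(yields[:3])  # protected 5FIdoF precursor
yield_2_41 = cumulative_yield(yields)      # protected 5FGlcF precursor

print(f"Overall yield to 2.40: {yield_2_40:.1%}")
print(f"Overall yield to 2.41: {yield_2_41:.1%}")
```

The low-yielding epimerization step iv) dominates the overall yield of the gluco-configured product.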

Inspection of the NMR spectra of 2.41 and its deprotected product 5FGlcF (2.30) revealed that both of them adopt an undistorted 4C1 chair conformation, as all the coupling constants are very similar to those of standard D-glucosides. However, this is not the case for the acetylated 5FIdoF (2.40). The relatively large JH2,H3 (8.5 Hz) implies a trans-diaxial relationship while the small JH3,H4 (1.6 Hz) clearly indicates the absence of such relative orientations for H-3 and H-4. In fact, based on similar observations, Dr. John McCarter in our group concluded that both the protected 5-fluoro-β-L-idopyranosyl fluoride (2.40) and its deprotected form 5FIdoF (2.31) adopt a 2,5B boat conformation.175 Importantly, this 2,5B boat conformation represents the true structure of the free 5FIdoF in solution. Presumably, the distortion of 5FIdoF from the standard 4C1 chair conformation arises from satisfying both the requirements of avoiding the steric repulsions associated with an axial hydroxymethyl group at C-5 and maximizing anomeric effects associated with an electronegative “anomeric” substituent at C-5. For simplicity, all the chemical structures of the 5FIdoF in this thesis will be drawn in standard chair conformations. However, it should be borne in mind that the true conformation of the free 5FIdoF is a 2,5B boat in solution.
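The link between coupling constants and relative proton orientation can be illustrated with a Karplus-type relationship. The sketch below is not part of the original analysis: it uses the commonly quoted textbook form of the original Karplus equation, whose coefficients are generic and not parameterized for fluorinated carbohydrates, so the numbers are indicative only. The function name `karplus_j` is hypothetical.

```python
import math


def karplus_j(phi_deg):
    """Approximate vicinal 3J(H,H) in Hz for an H-C-C-H dihedral angle in degrees.

    Assumption: generic textbook Karplus coefficients (8.5 or 9.5 for the
    cos^2 term, -0.28 offset), not values fitted to the compounds in this work.
    """
    cos_sq = math.cos(math.radians(phi_deg)) ** 2
    amplitude = 8.5 if abs(phi_deg) <= 90 else 9.5
    return amplitude * cos_sq - 0.28


# Anti-periplanar protons (trans-diaxial, about 180 degrees) versus gauche
# protons (about 60 degrees).
for label, phi in (("trans-diaxial", 180.0), ("gauche", 60.0)):
    print(f"{label:>13s} (phi = {phi:5.1f} deg): J ~ {karplus_j(phi):.1f} Hz")
```

On this rough scale, a coupling of 8.5 Hz falls in the range expected for near anti-periplanar protons, whereas 1.6 Hz is typical of a gauche arrangement, in line with the assignment made above.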

2.3.7.2 Kinetic studies of wild type HPA in the presence of MeG2F and 5FGlcF or 5FIdoF

Incubation of 50 mM 5FGlcF (acceptor) and 20 mM MeG2F (donor) with HPA at 30 °C resulted in a time-dependent loss of activity, such that less than 45% activity remained after one hour of incubation, as shown in Figure 2.29. However, further incubation resulted in no further significant reduction of enzymatic activity. Instead, a "steady-state" of residual activity was reached which lasted for ~2 hours. Upon even more prolonged incubation, enzymatic activity gradually recovered to that of the wild-type enzyme. This behavior is consistent with the “in situ” formation of an elongated inactivator, which then forms a transient intermediate that is slowly hydrolyzed (Scheme 2.11). Recovery of activity occurs when all the reagent has been consumed. Importantly, incubation of either 50 mM 5FGlcF (Figure 2.29) or 20 mM MeG2F alone (data not shown) with HPA resulted in no inactivation, clearly illustrating the need for formation of a reactive species via transglycosylation before inactivation can occur. This inability to obtain complete and long-term inactivation is reminiscent of previous studies with 5FGlcF and yeast α-glucosidase, where similar "steady-state" kinetic results were observed, due to relatively rapid inactivation and then reactivation by hydrolysis.75
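The qualitative shape of such a time course (loss of activity, a plateau, then recovery once the reagent is consumed) can be reproduced with a deliberately simple toy model. The sketch below is an assumption-laden illustration, not a fit to Figure 2.29: it lumps in situ elongation and inactivation into a single second-order step, expresses the inactivator in enzyme equivalents, and uses hypothetical rate constants chosen only to make the three phases visible.

```python
def simulate_transient_inactivation(k_inact=0.002, k_react=0.05,
                                    inhibitor0=50.0, dt=0.1, t_end=5000.0):
    """Forward-Euler integration of a toy inactivation/turnover cycle.

    All parameters are hypothetical (arbitrary time units):
      k_inact    rate constant for E + I -> E-I, per enzyme equivalent of I
      k_react    first-order rate constant for E-I -> E + product
      inhibitor0 initial inactivator, in enzyme equivalents
    Returns a list of (time, fraction of free enzyme) pairs.
    """
    free, inhibitor = 1.0, inhibitor0
    trace = [(0.0, free)]
    for step in range(1, int(round(t_end / dt)) + 1):
        inactivation = k_inact * inhibitor * free   # E + I -> E-I
        reactivation = k_react * (1.0 - free)       # E-I -> E + product
        free += (reactivation - inactivation) * dt
        # Each inactivation event consumes one equivalent of inactivator.
        inhibitor = max(inhibitor - inactivation * dt, 0.0)
        trace.append((step * dt, free))
    return trace


trace = simulate_transient_inactivation()
activities = [activity for _, activity in trace]
print(f"minimum residual activity: {min(activities):.2f}")
print(f"residual activity at end:  {activities[-1]:.2f}")
```

In this toy model the plateau level is set by the ratio of inactivation to reactivation rates, and its duration by how long the inactivator supply lasts.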

Figure 2.29: Residual activity of HPA in the presence of (○) 50 mM 5FGlcF and (●) 50 mM 5FGlcF + 20 mM MeG2F, at 30 °C.

While these results with 5FGlcF were encouraging, the relatively rapid turnover of the accumulated intermediate made a detailed examination of its structural characteristics via X-ray crystallographic methods problematic. A potential solution to this problem was suggested by kinetic studies on other α-glycosidases, which have shown that C5 fluorinated substrate analogues with inverted configuration at C5 typically function as better mechanism-based inhibitors than their “natural” 5-fluoro counterparts, by forming longer-lived glycosyl-enzyme intermediates.76,79 In the case of HPA, the utility of this approach depended on whether the C5 epimer, 5FIdoF, could first act as an acceptor for transglycosylation and then, if so, whether it would react and form a long-lived intermediate.

Scheme 2.11: Proposed in situ elongation-trapping strategy using 5FGlcF and MeG2F, leading to a trapped covalent glycosyl-enzyme intermediate that undergoes slow turnover.

As can be seen in Figure 2.30 (a), incubation of 50 mM 5FIdoF and 20 mM MeG2F with HPA at 30 °C resulted in rapid and essentially complete inactivation of HPA. Significantly, no recovery of activity was seen, even after prolonged (up to 6 hours) incubation, suggesting that with elongated 5FIdoF, a more stable covalent glycosyl-enzyme intermediate had been formed. Interestingly, when a higher concentration of the donor molecule (40 mM MeG2F) was used, inactivation was slower, presumably because MeG2F acts as a competitive inhibitor, “protecting” the active site of HPA from reaction with elongated 5FIdoF (Figure 2.30 (a)). Even slower inactivation was observed when MeG2F was replaced by G3F, presumably because the elongated oligosaccharyl fluorides are even more effective “protecting agents”. Again, control experiments revealed that either 20 mM MeG2F or 50 mM 5FIdoF alone had no impact on enzyme activity within two hours, providing evidence that the elongated 5FIdoF generated in situ serves as the real inactivator. Interestingly, however, in longer term experiments in which HPA, either in solution or in the crystalline state (discussed later herein), was exposed to a very high concentration of 5FIdoF alone (100 mM or more), an extremely slow inactivation process was observed (Figure 2.30 (b)).

Figure 2.30: (a) Residual activity of HPA in the presence of (○) 50 mM 5FIdoF alone; (●) 50 mM 5FIdoF + 20 mM MeG2F; (□) 50 mM 5FIdoF + 40 mM MeG2F; and, (▲) 25 mM 5FIdoF + 40 mM G3F, at 30 °C. (b) Long-term analysis of residual activity of HPA (T) in the absence of any 5FIdoF and (X) in the presence of 100 mM 5FIdoF, at 30 °C. To prevent a change in pH during these experiments, higher concentrations of buffer salts (200 mM NaPi, 100 mM NaCl) were used.


Figure 2.31: Mass spectra of HPA (above) and HPA treated with 25 mM 5FIdoF + 20 mM MeG2F (below).

Evidence for formation of an elongated covalent 5FIdoF species and insights into its composition were obtained using a mass spectrometry approach. Both the wild-type HPA and MeG2F-5FIdoF-treated HPA were analyzed by electrospray ionization mass spectrometry and individual masses determined to be 56,071 and 56,590 Da, respectively (Figure 2.31). The mass difference between these two species is 519 Da, which corresponds exactly with the molecular weight change (519 Da) due to the predicted covalent attachment of a trisaccharyl moiety containing a 5-fluoro-α-L-idosyl moiety at the reducing end. It is noteworthy that the mass spectra of the MeG2F-5FIdoF-treated

HPA revealed essentially no free enzyme, indicating nearly complete covalent labeling, consistent with our kinetic results.

The catalytic competence of the trapped covalent intermediate could also be demonstrated by reactivation experiments, in which MeG2F-5FIdoF-treated HPA was first freed of excess inactivator, then incubated in buffer and aliquots assayed at time intervals. At 30 °C, reactivation was extremely slow with kreact = (4.8 ± 1.2) × 10⁻⁵ min⁻¹, corresponding to a half life of 240 hours (Figure 2.32 (a)). Since many glycosidases carry out transglycosylation more rapidly than hydrolysis, we investigated the effects of addition of maltose on reactivation rates.48 In these experiments, the presence of 20 mM maltose dramatically reduced the half life of the trapped intermediate to 53 min. This observation opened up the possibility of measuring the binding affinity of maltose to the aglycone site of the glycosyl-enzyme intermediate of HPA. This could be done in a direct manner by measuring the rates of reactivation at different concentrations of maltose and then plotting the apparent reactivation rate constants as a function of maltose concentration (see Figure 2.32 (b)). From this work a dissociation constant for maltose of Kd = (69 ± 17) mM and a maximal reactivation rate constant of kreact = (0.073 ± 0.008) min⁻¹, corresponding to a half life of 10 minutes, were determined (Figure 2.32 (c)).
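The half lives quoted above follow from the first-order relationship t1/2 = ln 2 / k, and the maltose dependence from a hyperbolic (Michaelis-Menten type) saturation curve. The short sketch below reproduces this arithmetic using the fitted values from the text; the function names are illustrative, and the small buffer-only hydrolysis rate is neglected in the saturation curve as a simplifying assumption.

```python
import math


def half_life(k):
    """Half life of a first-order process with rate constant k (same time unit)."""
    return math.log(2) / k


def k_react_apparent(maltose_mM, k_max=0.073, kd_mM=69.0):
    """Apparent reactivation rate constant (min^-1) at a given maltose concentration.

    Assumes simple hyperbolic saturation, k_max * [M] / (Kd + [M]), with the
    fitted values from the text as defaults; spontaneous hydrolysis is ignored.
    """
    return k_max * maltose_mM / (kd_mM + maltose_mM)


print(f"buffer only:   t1/2 = {half_life(4.8e-5) / 60:.0f} h")
print(f"saturating:    t1/2 = {half_life(0.073):.1f} min")
print(f"20 mM maltose: t1/2 = {half_life(k_react_apparent(20.0)):.0f} min (from the fitted curve)")
```

The value predicted by the fitted curve at 20 mM maltose is of the same order as the 53 min measured directly at that concentration; exact agreement is not expected, since the fit is constrained by all six maltose concentrations and carries the stated uncertainties.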

To determine the generality of this elongation approach to the study of covalent intermediates of α-amylases, we carried out a comparable series of experiments with porcine pancreatic α-amylase (PPA).106,190 When both the acceptor (25 mM 5FIdoF) and donor (20 mM MeG2F) were incubated with PPA, very fast inactivation was seen, with essentially no enzymatic activity left after 10 minutes (Figure 2.33). As expected, incubation of 20 mM MeG2F (donor) alone with PPA resulted in no time-dependent inhibition. Interestingly, incubation of 25 mM 5FIdoF (acceptor) alone with PPA yielded a very slow, but detectable loss of enzymatic activity, with this process being much faster than that observed for HPA (Figure 2.33). This faster inactivation of PPA may indicate a broader specificity of PPA for binding and turnover of the 5FIdoF monosaccharide in the -1 binding subsite than is the case with HPA. Most importantly, this observation of greatly enhanced inactivation of both PPA and HPA in the presence of 5FIdoF and MeG2F illustrates the generality of this in situ elongation strategy for the study of covalent intermediates in α-amylases and quite likely other retaining endo-glycosidases.

[Figure 2.32, panel (a): "Reactivation in Buffer Alone"; axes: initial rate (A400/min) versus time (min).]

Figure 2.32: Reactivation of inactivated HPA at 30 °C (a) in the presence of only buffer and (b) in the presence of different concentrations of maltose: (○) 20 mM maltose; (□) 30 mM maltose; (●) 40 mM maltose; (Δ) 70 mM maltose; (■) 100 mM maltose; (▲) 150 mM maltose. Lines represent fits to first order expressions, yielding apparent first order rate constants for reactivation at each concentration. A replot of these apparent reactivation rate constants as a function of maltose concentration, fitted to the Michaelis-Menten equation, is shown in Frame (c).


Figure 2.33: Residual activity of porcine pancreatic α-amylase in the presence of (●) 20 mM MeG2F; (○) 25 mM 5FIdoF; and, (□) 20 mM MeG2F + 25 mM 5FIdoF, at 30 °C.

2.3.8 Structural studies of covalent glycosyl-enzyme intermediate of HPA

All the following crystallographic experiments were carried out by Dr. Chunmin Li from Prof. Gary Brayer’s research group (Department of Biochemistry and Molecular Biology, UBC) through our collaboration. A summary of structure determination statistics for all data sets used in this thesis can be found in Section 5.6.

2.3.8.1 Structure of the monosaccharide 5-fluoroidosyl-HPA complex

Excellent high resolution diffraction data (1.43 Å) were obtained from an HPA crystal initially soaked in a solution of 100 mM 5FIdoF overnight, after which 150 mM MeG2F was added for an additional 2 hours (Condition 1). Strong continuous electron density in subsequent electron density maps clearly demonstrated the covalent attachment of a monosaccharide 5-fluoro idosyl moiety to the side chain OD1 oxygen atom of the putative active site nucleophile D197 (Figure 2.34 (a)). In forming this covalent bond the anomeric fluoride of the α-configured 5FIdoF has been displaced and replaced by the side chain of D197 in the β-configuration, as expected. The identity of the bound 5-fluoro-L-idosyl intermediate was further confirmed by the observation of the fluorine atom at C5, and the axial orientation of the C-6 CH2OH substituent.

Figure 2.34: Omit difference electron density maps of the (a) Monosaccharide 5-Fluoroidosyl-HPA (Condition 1); (b) G3F/5FIdoF/HPA (Condition 2); (c) MeG2F/5FIdoF/HPA (Condition 3). Note that Frame (a) is drawn as a stereo diagram to more clearly show both the bound covalent intermediate and the associated non-covalently bound 5FIdoF moiety in the +1 binding subsite. All of the electron density maps shown are drawn at the 2.8 σ level and overlaid with the final refined structures of each intermediate complex (blue backbone). Also drawn are the active site residues D197, E233 and D300 (green backbone), with their side chain omit difference electron densities. The identity of each individual sugar ring has been indicated along with the binding subsite in which it resides.


Figure 2.35: Schematic drawings of the structures determined for the Conditions 1-3 covalent glycosyl-intermediate complexes in the active site of HPA. Unique to the Condition 1 structure is a non-covalently bound 5FIdoF moiety in the +1 binding subsite. Well defined, high affinity binding subsites in the active site cleft of HPA have been identified according to the convention indicated in the lower portion of this diagram. Also shown is the expected binding mode for a normal starch substrate, where hydrolysis would occur between subsites -1 and +1.

Notably, in these structural studies only the single sugar moiety derived from 5FIdoF is found covalently bound to D197 in the active site (at full occupancy) with no evidence of further elongation by MeG2F in electron density maps. Clearly overnight exposure to high concentrations (100 mM) of 5FIdoF leads to complete inactivation of HPA and negates the ability of the enzyme to elongate this inhibitor upon subsequent incubation with MeG2F. This is consistent with the results of kinetic studies, which indicate that 5FIdoF by itself, will slowly inactivate HPA and that the covalent intermediate formed is stable (Figure 2.30(b)). As will be discussed later, quite different results are obtained when crystals are soaked simultaneously in a combined 5FIdoF and

MeG2F solution (see Condition 3 results), to give a covalent complex that consists entirely of elongated intermediate. Our results also demonstrate the apparent inability of 5FIdoF to elongate via transglycosylation with itself.

As is evident in the electron density map for this complex in Figure 2.34(a) and the resultant schematic representation in Figure 2.35, the covalently attached 5FIdo ring is bound in a 4C1 chair conformation and is positioned in the -1 binding subsite, attached to the side chain of D197. This correlates well with the expected bound conformation of a glucose residue at this site in a normal polymeric starch substrate (Figure 2.35). However, binding in this conformation might at first seem surprising for the analog in question since L-idosyl rings, and especially 5-fluoro-L-idosyl rings, are substantially distorted in solution in order to avoid the steric repulsions associated with an axial hydroxymethyl group at C-5 and in order to maximize anomeric effects associated with an electronegative “anomeric” substituent at C-5. As is apparent from the structure, a relatively large cavity exists on the α-face of the sugar that can accommodate the bulky axial CH2OH substituent of the 5-fluoroidosyl moiety, making such binding possible. In addition, since the hydrogen bonding interactions at the active site have been optimized for binding of a sugar in a 4C1 conformation, the enzyme will stabilize the sugar in this conformation. A very similar result was seen in the structure of another α-glycosidase for which a trapped 5-fluoroglycosyl-enzyme intermediate has been described, that being a GH38 α-mannosidase.79 In that example, the 5-fluoro-L-gulosyl-enzyme intermediate adopted exactly the same (1S5) conformation as a 2-fluoro-D-mannosyl-enzyme intermediate trapped on the same enzyme, showing that a bulky epimeric substituent at C5 could be accommodated. Indeed, in that case a large protein cavity was also observed on the back face of the sugar. Presumably such space is necessary to accommodate the conformational changes occurring within the sugar ring along the reaction coordinate.

Figure 2.36: Schematic diagrams illustrating the hydrogen bonding interactions formed in the active site of HPA in the (a) -1 binding subsite (covalent portion) and (b) the +1 binding subsite (non-covalent portion) of the MeG2F/5FIdoF (Condition 1) glycosyl-enzyme intermediate complex.

Satisfyingly, the side chain carbonyl oxygen of D197 in this covalent complex has as its nearest neighbor (2.7 Å) the endocyclic oxygen of the covalently attached 5-fluoroidosyl moiety. This is exactly what was seen in the 3-dimensional structures of trapped covalent intermediates in CGTase93 and Golgi α-mannosidase II79. The suggestion was made that such an O-O interaction would serve to destabilize the ‘ground state’ structure of the covalent intermediate, but that upon transition state formation, the relative positive charge formed on the endocyclic oxygen would reduce this unfavourable interaction, thereby contributing to relative transition state stabilization. A similar mechanistic strategy therefore seems to be in play in HPA.

Beyond the covalent bond made to D197, several other hydrogen bonding interactions with other 5FIdo ring substituents are seen (Figure 2.36). For example, the conserved side chain of the active site residue D300 forms two strong hydrogen bonds to the C2 and C3 hydroxyls of the 5FIdo ring, serving to confirm the key role played by this residue in orienting intermediates within the -1 binding subsite. Interestingly the side

chain of D300 also binds a water molecule, in concert with the side chain of E233, the acid/base catalyst. Indeed E233 is ideally poised to carry out this function in the structure determined here. Other important interactions include hydrogen bonds to the side chains of H201, H299 and R195. The involvement of the side chain of R195 is of particular interest as it forms a bridging interaction between the side chain of E233 and the “ type” oxygen atom of D197 that forms the covalent bond to the 5-fluoroidosyl moiety. The importance of R195 to catalysis has been demonstrated by the 450-fold reduction in rate observed upon mutation.120 Notably this residue is itself influenced by direct interactions with a bound chloride ion that has been shown to be important for efficient catalytic activity. This more remotely bound anion resides in a well defined binding pocket off to one side of the active site cleft, where it also forms an important interaction with E233.

An unexpected finding of this structural determination was the observation of a second, non-covalently bound, 5FIdoF ring in the +1 binding subsite of the active site binding cleft (Figure 2.35). This moiety is also well defined (Figure 2.34(a)) and present at full occupancy. As is evident in Figure 2.35, this 5FIdoF group is positioned where one would normally expect the first product to reside during a hydrolytic cycle, preceding its expulsion off the surface of the enzyme. Furthermore, like the expected normal product, the 5FIdoF ring binds such that only 3.7 Å separates its C4-OH and the C1 carbon of the covalently bound 5FIdo ring. A number of additional stabilizing interactions are made to the ring substituents of the non-covalently bound 5FIdoF, including hydrogen bonds from the side chains of H201, H305 and the catalytic residue E233 (Figure 2.36(b)).

The surprising presence of 5FIdoF in the +1 binding subsite is likely due, in part, to the high concentration of this inhibitor present under the experimental conditions used for structural studies and its structural similarity to glucose, which would normally occupy this subsite. However, hydrogen bond interactions present between the covalently and non-covalently bound 5FIdoF groups may also be important. Indeed, these interactions may be mutually beneficial in the initial binding of both moieties of 5FIdoF in their respective binding subsites, and may facilitate the formation of the monosaccharide covalent intermediate observed. These interactions include hydrogen bonds from the axial C6-CH2OH of the covalently linked sugar to the C5-F (2.7 Å) and C4-OH (2.7 Å) groups of the non-covalently bound inhibitor. The normal orientation of the C6-CH2OH group in a starch substrate would preclude such an interaction.

Interestingly, the C6-CH2OH group of the non-covalently bound 5FIdoF inhibitor is directed out towards solvent, and thus does not disrupt binding in the +1 binding subsite. Also notable is the close interaction (2.7 Å) formed between the C2-OH of the covalently bound 5FIdo ring and the C4-OH of its non-covalently bound counterpart. This interaction would be possible for normal substrates and could constitute an important transition state stabilizing interaction with the first product of the reaction pathway as it leaves the enzyme surface. Equivalently, this interaction might also serve to stabilize the transition state in the transglycosylation mode of the reaction pathway.

Two additional 5FIdoF moieties are found bound to HPA, but at remote locations on the enzyme surface that are far too distant to influence binding in the active site region. The existence of such remote binding sites on HPA has been noted before, and the suggestion has been made that such sites may be involved in starch binding, but no definitive proof has been obtained.

It is instructive to compare the structure of the 5-fluoro idosyl-enzyme intermediate with non-covalent complexes such as that formed by acarbose94 (Figure 2.37). The small shifts observed in catalytic residues and the active site as a whole, between the wild-type, and the 5FIdoF and acarbose complexed structures, clearly suggest the relative rigidity with which the enzyme surface is held and the key role that substrate conformational flexibility must play in catalysis. Among the enzyme movements observed, is a small shift in the highly mobile residues 301-308, which reside along one side of the active site cleft. This readjustment in polypeptide chain serves to optimize inhibitor-enzyme interactions, primarily in more remote binding subsites. Also observed is the reorientation of the plane of the side chain of Y62, which rotates so as to track orientational changes in the ring bound in the -1 binding subsite. This is shown in

Figure 2.37 and presumably serves to contribute to both binding affinity and the proper orientation of substrate adjacent to the point of cleavage.

Figure 2.37: Overlays of the bound structures of the monosaccharide 5-fluoroidosyl-HPA (Condition 1) intermediate in green, on top of the structure found for the non-covalent transition state mimic acarbose in cyan, in the active site of HPA adjacent to catalytic residues.

Taken together, the 5FIdoF-HPA and acarbose/HPA complexes provide insights into the conformational changes occurring in the substrate on going from a non-covalent intermediate complex through to covalent intermediate formation with D197. As can be seen in Figure 2.37 the conformational changes would appear to be largely limited to the ring bound in the -1 binding subsite, and primarily involve the C2, C1, O5 and C5 ring atoms, with the largest displacements involving the C1 and C5 atoms (~2.2 Å). Interestingly, the nitrogen atom of acarbose lines up almost exactly with the C4-OH of the first product mimic represented by the non-covalently bound 5FIdoF in the +1 binding subsite, suggesting that the +1 binding subsite portion of a substrate undergoes relatively little reorientation during the formation of the covalent intermediate.

2.3.8.2 Structure of the disaccharide G3F/5FIdoF/HPA complex

In a second approach, we co-soaked a crystal of HPA in a solution containing both 100 mM 5FIdoF and 150 mM G3F overnight before collecting X-ray diffraction data (Condition 2). In subsequent electron density maps, strong contiguous electron density confirmed both a covalent linkage between a bound 5FIdo ring and the side chain of D197, as well as the elongation of this intermediate by one glucose unit (Figures 2.34(b) and 2.35). As observed for the monosaccharide 5FIdoF complex, the sugar is attached to D197 through a β-linkage. The identities of the two sugar rings present were confirmed by the L-configuration and the presence of the C-5 fluorine in the case of the 5FIdo ring, and the absence of these features for the glucose unit. This elongated intermediate was found to be bound at full occupancy.

Based on the composition of the disaccharide intermediate observed, it is clear that both transglycosylation and hydrolysis reactions have played a role in converting the initial mix of 5FIdoF and G3F, to the final product. Based on earlier studies of the preferred substrate cleavage patterns of HPA, a number of hydrolysis/transglycosylation routes are possible in this case. Most likely is the transglycosylation of a maltotriosyl unit from G3F onto 5FIdoF to form a tetrasaccharide, followed by the hydrolysis of the terminal maltose to yield Glc-5FIdoF, which could then react to form the covalent intermediate, blocking the active site. A similar scenario has been encountered in previous studies of in situ elongation attempts with 2-deoxy-2,2-difluoro α-maltosyl chloride (2.4) and MeG2F (Section 2.3.6.2), in which the product of the proposed elongation-cleavage reactions was still not a good inactivator of HPA and resulted in no inactivation of the enzyme.

Interestingly, the two additional 5FIdoF moieties found remotely bound outside the active site region in the monosaccharide covalent intermediate, are also observed bound in similar conformations in the disaccharide complex. This indicates that a substantial concentration of free 5FIdoF remains at the end of the various hydrolysis/transglycosylation reactions that occur in the active site. In view of this result

and given its initially higher concentration in the crystal soaking mixture, it would seem that much of the G3F originally present is involved in self transglycosylation and hydrolysis reactions, which would ultimately be expected to result in maltose. It is also clear from Figure 2.30(a) that these various G3F products act as competitive inhibitors of 5FIdoF elongation and subsequent covalent intermediate complex formation. These conclusions are supported by the absence of 5FIdoF binding at remote binding sites when a non-transglycosylatable substrate such as MeG2F is used as the elongation agent, as will be discussed later.

Figure 2.38: A comparative overlap of the structures of the covalently bound monosaccharide 5-fluoroidosyl-HPA (Condition 1) intermediate in green; the G3F/5FIdoF (Condition 2) intermediate in yellow; and, the MeG2F/5FIdoF (Condition 3) intermediate in magenta. Active site residues are color coded to correspond to the bound intermediates and binding subsites for individual sugar residues are indicated (see also Figure 9). Water molecules bound between E233 and D300 have been drawn and color coded. Although related hydrogen bonding interactions are observed for these water molecules in all three intermediate complexes, for clarity only those for the MeG2F/5FIdoF (Condition 3) structure are drawn. The water molecule (labeled 'N') located closest to the C1 carbon of the 5FIdo residue in the -1 binding subsite, would be expected to act as the nucleophile in the next step of catalysis.

In contrast to what is seen in the monosaccharide covalent intermediate complex, no 5FIdoF is seen in the +1 subsite in the disaccharide intermediate (Figures 2.35 and 2.38). Instead, a new water molecule has been substituted and positioned comparably to the C4-OH of the replaced 5FIdoF. Given the similarity of covalent intermediate attachment to D197 and the evident availability of free 5FIdoF in the soaking solution, as apparent from the observation of this group bound at remote surface sites, the origin of this difference at the +1 binding subsite must lie in subtle conformational displacements. Indeed a comparison of the mono- and disaccharide covalent intermediates (Figure 2.38) shows that the C6-CH2OH of the covalently bound 5FIdo ring in the monosaccharide complex, which hydrogen bonds with the C5-F and C4-OH groups of the 5FIdoF bound in the +1 binding subsite, is slightly reoriented in the disaccharide complex. This displacement appears to be a consequence of the elongated nature of the disaccharide covalent intermediate, which places additional conformational restraints on the C6-CH2OH group of the 5FIdo ring. As will be discussed later, a similar, but even more pronounced C6-CH2OH shift is observed in the additionally elongated trisaccharide covalent intermediate structure.

As can be seen in Figure 2.38, the conformation of the 5-fluoroidosyl moiety is the same in the mono- and disaccharide covalent intermediates, as are the hydrogen bond interactions present in the -1 binding subsite (Figure 2.36(a)). Furthermore, no appreciable shifts are observed in the active site residues D197, E233, D300 or R195. Also present, although slightly shifted, is a water molecule jointly bound between D300 and E233, which had been identified in earlier studies as potentially playing a role in catalysis. Our current studies suggest that a new water molecule in the disaccharide complex (labeled 'N' in Figure 2.38), which is positioned much closer to the C1 carbon atom of the covalent intermediate, would seem to be more representative of a nucleophilic water. It is striking that this water molecule is located nearly identically to both the C4-OH of the +1 binding subsite 5FIdoF of the monosaccharide intermediate and the inter-ring nitrogen in the complex formed by acarbose. From this position, this water molecule is ideally poised, from both the perspective of distance and trajectory, to proceed with nucleophilic attack on the C1 carbon of the covalent intermediate formed with D197 and at the same time donate a proton to the side chain of E233, the acid/base catalyst.

2.3.8.3 Structure of the trisaccharide MeG2F/5FIdoF/HPA complex

To achieve greater control over the elongation process for kinetic and structural studies and extend the length of the covalent intermediate into the -3 subsite, the capped oligosaccharyl fluoride donor MeG2F was employed in further studies. To this end a crystal of HPA was co-soaked overnight in a solution containing both 100 mM 5FIdoF and 150 mM MeG2F (Condition 3). Following the collection of diffraction data, calculated electron density maps clearly illustrated the covalent nature of binding to the side chain of D197 and the fact that the bound intermediate included a total of three sugar rings (Figure 2.34(c)). As observed for the mono- and disaccharide complexes already discussed, the ring directly attached to D197 via a β-linkage could be definitively identified as a 5-fluoroidosyl moiety. Furthermore, the rings bound in the more distant -2 and -3 binding subsites could also be clearly identified as glucose residues, consistent with transglycosylation of a 4’-O-methylmaltosyl moiety onto 5FIdoF to give the Glc-Glc-5FIdoF inactivator, which then forms the trapped intermediate. Interestingly, even though mass spectrometry confirmed the structure of the intermediate (Figure 2.31), the O-methyl substituent could not be seen in electron density maps, likely due to motional disorder. This has proved to be a consistent observation in structural studies wherein MeG2F has been used as an elongation reagent.

As illustrated in Figure 2.39, the trisaccharide covalent intermediate forms extensive hydrogen bonding interactions with the enzyme surface, with many of these involving the extended intermediate in the -2 and -3 binding subsites. As with the mono- and disaccharide complexes, the 5FIdo ring is well resolved and present at full occupancy. Attesting to the affinity of the additional interactions present in the -3 binding subsite, the glucose unit in the -2 binding subsite is better resolved and somewhat shifted from its position in the disaccharide intermediate (Figure 2.38). Notably, both the di- and trisaccharide intermediates share a small reorientation of the C6-CH2OH group of the covalently bound 5FIdo ring in comparison to the monosaccharide intermediate, accounting for the finding that neither of these complexes contains an additional 5FIdoF group in the +1 binding site. Also in common is the new water molecule in the +1 binding site that would appear to be appropriately positioned to act as the nucleophilic water (labeled 'N' in Figure 2.40).

Figure 2.39: Schematic diagrams illustrating the hydrogen bonding interactions in the elongated MeG2F/5FIdoF (Condition 3) covalent glycosyl-enzyme intermediate complex. The 5FIdo moieties are drawn in blue, while catalytic residues are shown in red.

Interestingly, the two remotely bound 5FIdoF moieties found outside the active site region in the mono- and disaccharide intermediate complexes are absent in the trisaccharide complex structure. Instead, a single MeG2F is found bound to the enzyme surface at an alternate remote binding site. Previous studies utilizing MeG2F as an elongation agent have also demonstrated binding at this site, the role of which would appear to relate to the initial global positioning of large scale polymeric substrates, such as starch, adjacent to HPA.149


Figure 2.40: Overlays of the bound structures of the MeG2-5FIdo-HPA (Condition 3) intermediate in magenta, on top of the structure found for the non-covalent transition state mimic acarbose in cyan, in the active site of HPA adjacent to catalytic residues

2.3.8.4 Mechanistic implications of structural studies of covalent intermediates of HPA

Distortion of sugar rings during glycosidase catalysis has been proposed for many years since the pioneering work on hen egg white lysozyme by Sir David Phillips, but was only directly demonstrated via crystallography after the 1990s.15 Presumably enzymes can achieve substrate distortion primarily through their non-covalent interactions with the substrates. Such distortion is believed to facilitate glycosidic bond cleavage in many ways, as illustrated by an example of a β-glycosidase in Figure 2.41. Firstly, the distortion usually re-orients the anomeric leaving group into an axial position, which is suitable for an in-line attack by the enzyme nucleophile residue. This axial placement of the aglycone can also achieve a better orbital overlap between the lone pair electrons (p orbital) of the endocyclic oxygen and the σ* orbital of the anomeric substituent, satisfying stereoelectronic requirements. Secondly, this distortion will raise the free energy of the substrate and hence lower the activation barrier of the cleavage. The distortion will also move the glycosidic oxygen into a closer position for protonation by the general acid catalyst.

(Scheme: undistorted substrate + enzyme, leading to substrate distortion)

Figure 2.41: Distortion of substrate by glycosidases

In the case of HPA, the last, trisaccharyl-enzyme intermediate structure (Figures 2.39 and 2.40) provides the best view of the covalent intermediate formed on this enzyme, since all three sugars are highly resolved and even the nucleophilic water molecule is clearly seen. It is particularly interesting to note that all three sugars are bound in a 4C1 conformation, despite the considerable potential for distortion imposed by the axial hydroxymethyl group and equatorial fluorine at C5 on the sugar in the -1 subsite. Clearly interactions with the enzyme are strongly stabilizing this otherwise unfavourable conformation, much as was seen in an equivalent case with the Golgi α-mannosidase II intermediate structures. This strongly suggests that HPA, and presumably other GH13 amylases, stabilize their intermediates in the normal 4C1 (chair) conformation. This has raised concerns since a distorted conformation that optimizes in-line attack by minimizing 1,3-diaxial-like repulsive interactions between the incoming water and H-3 and H-5 had been expected for retaining α-glucosidases, much as has been seen for bound substrates with β-glucosidases.15,40 While the intermediates formed on transglycosidases from GH13 had previously been seen to adopt the 4C1 conformation, it was suggested that this could be a mechanism for increasing the lifetime of the intermediate to allow time for the leaving group to diffuse out of the active site and the acceptor group to enter the +1 subsite. Such was not expected for the hydrolytic amylases. While undoubtedly a distorted conformation must be formed at the transition state, as also evidenced by the tight binding of acarbose, possibly the need for significant distortion within the covalent intermediate has been over-rated. The β-configured acylal intermediate is inherently quite reactive, and appears to be further destabilized electronically by the interaction of the endocyclic oxygen with the side chain carbonyl oxygen of D197. This may provide sufficient activation to drive efficient hydrolytic cleavage.

The observation of an undistorted 4C1 conformation for the covalently bound sugar raised some speculation regarding the possible conformation adopted by the sugar at the transition state during HPA catalysis. Such information would be very valuable since stable molecules mimicking this specific conformation are expected to possess high affinity towards HPA, a highly desirable attribute for development into a potential diabetes drug. One of the handful of ways to study the transition state conformation of sugars during enzyme catalysis is through structural studies of different stable species of enzyme/substrate complexes along the reaction coordinate.40 By carefully inspecting the sugar conformations in the Michaelis complex and the covalent intermediate, we can predict the sugar conformation at the oxocarbenium-ion like transition state, since it is flanked by those two species in the catalytic mechanism. For example, the conformations of the -1 subsite glucose in the Michaelis complex and the covalent intermediate of a GH5 cellulase (Cel5A) were demonstrated to be 1S3 skew boat and 4C1 chair, respectively.83 A half boat conformation, 4H3, was found to be flanked by these two conformations on Stoddart’s pyranoside ring interconversion map (Figure 2.42).40 Interestingly, this 4H3 conformation also satisfies the stereoelectronic requirements of an oxocarbenium-ion like transition state (C5, O5, C1 and C2 are within the same plane) and is predicted to be adopted by the pyranose ring at the transition state. Similarly, through structural studies of the Michaelis complex and the covalent intermediate, the transition state of GH26 β-mannanases has been proposed to adopt a B2,5 boat conformation, which was flanked by a 1S5 skew boat in its Michaelis complex and an OS2 skew boat in the corresponding covalent intermediate.84,191

(Map centred on the 4C1 chair, with the H, S and B conformers arranged around it)

Figure 2.42: Stoddart’s pyranoside ring interconversion map

A similar logic can be applied to predict the transition state conformation of HPA catalysis. Structural studies of the Michaelis complexes of several GH13 enzymes have been performed, though not HPA. In cyclodextrin glucanotransferase (CGTase), the glucose in the -1 subsite in the Michaelis complex was clearly distorted from the normal 4C1 chair conformation. However, the relatively low resolution (2.1 Å) of the structure precluded the exact assignment of this distorted conformation.93 In a very recent report, the crystal structure of a wild type α-amylase from Halothermothrix orenii complexed with maltotetraose spanning from its -1 to its +3 subsite was described.192 Very interestingly, the glucose in the -1 subsite was found to adopt a 2SO skew boat conformation. On Stoddart’s pyranoside ring interconversion map (Figure 2.42), a half boat conformation, 2H3, was found to be flanked by the skew boat 2SO (Michaelis complex) and 4C1 chair (covalent intermediate). This seems to suggest that at the transition state of HPA (or more generally α-amylase) catalysis, the sugar pyranose ring should adopt a 2H3 conformation (Figure 2.43). On the other hand, the valienamine moiety of acarbose is found to adopt a 2H3 conformation. Interestingly, the Ki values for acarbose of several active site mutants of CGTase were measured along with kcat/Km values for α-glycosyl fluoride substrates (αGF and αG3F). A good linear correlation was obtained in the plot of log(Km/kcat) for either substrate with each mutant versus the corresponding log(Ki) for acarbose, indicating that changes in binding interactions with the inhibitor as a consequence of each mutation correlated quite well with changes in transition state binding interactions with the substrate, suggesting that acarbose exhibits substantial transition state mimicry.193 The 2H3 half boat conformation of acarbose likely does not fully satisfy the stereoelectronic requirements for an oxocarbenium-ion like transition state. However, both our structural studies on HPA and the earlier physical organic studies on CGTase suggest that 2H3 at least should be very close to the true conformation of the transition state for α-amylase catalysis.
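The reasoning behind such a correlation can be made explicit with the standard transition state theory argument. The relations below are a sketch of that general argument, not the analysis reported in reference 193; the symbols k_non (rate constant of the uncatalysed reaction), K_TS (virtual dissociation constant of the enzyme-transition state complex), and the fitted slope n and intercept c are notation introduced here for illustration.

```latex
\begin{align}
K_{\mathrm{TS}} &= k_{\mathrm{non}} \cdot \frac{K_m}{k_{\mathrm{cat}}} \\
\log K_i &= n \, \log\!\left(\frac{K_m}{k_{\mathrm{cat}}}\right) + c
\end{align}
```

Because k_non does not depend on the enzyme, a mutation that weakens transition state binding raises Km/kcat by the same factor as K_TS. If the inhibitor responds to each mutation as the transition state does, Ki is proportional to K_TS and the logarithmic plot is linear, with n close to 1 for an ideal transition state analogue. The slope actually obtained for acarbose is not stated in the text above.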

(Scheme: 2SO (Michaelis complex) → oxocarbenium-ion like transition state (close to 2H3 conformer?) → 4C1 (covalent intermediate), catalyzed by α-amylase)

Figure 2.43: Possible pyranose conformational itinerary of α-amylase catalysis

2.4 General conclusions

The synthesis of a series of 2-deoxy-2,2-dihalo-α-maltosyl chlorides (2.3 and 2.4) as potential HPA inactivators has been achieved via the halogenation of protected 2-fluoromaltal precursors. Direct chlorination of per-O-acetylated 2-fluoro-maltal followed by basic deprotection yielded the corresponding 2-chloro-2-deoxy-2-fluoro-α-maltosyl chloride (2.3). Reaction of the per-O-acetylated 2-fluoro-maltal with acetyl hypofluorite or Selectfluor™ yielded the 2-deoxy-2,2-difluoromaltosyl derivative, which was converted to its α-chloride using thionyl chloride in the presence of BiOCl as catalyst and deprotected under basic conditions. In contrast to their monosaccharide derivatives, which caused active site-directed, time-dependent inactivation of yeast α-glucosidase via the trapping of covalent glycosyl-enzyme intermediates, neither of the 2-deoxy-2,2-dihalo-α-maltosyl chlorides caused time-dependent inactivation of HPA, despite the fact that the trinitrophenyl 2-deoxy-2,2-difluoro-α-maltoside functioned in that mode. Inspection of the kinetic parameters for inactivation of yeast α-glucosidase by 2-deoxy-2,2-dihalo monosaccharide derivatives revealed that the trinitrophenyl glycoside is approximately 1000-fold more reactive than the corresponding chloride. Therefore, given the fact that even trinitrophenyl 2-deoxy-2,2-difluoro-α-maltoside is a slow HPA inactivator (ki/Ki = 0.0073 min-1 mM-1),73 the absence of any inactivation by the newly-synthesized 2-deoxy-2,2-dihalo maltosyl chlorides was not so surprising. This study demonstrates the importance of interactions between the aromatic aglycone and the +1 subsite of HPA in generating effective mechanism-based inhibitors of HPA. Unfortunately my synthesis of tosyl 2-deoxy-2,2-difluoro α-maltoside, which might have proved effective, was not successful.
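The scale of the argument can be illustrated with a rough estimate. The sketch below assumes that the approximately 1000-fold reactivity ratio measured for the monosaccharides with yeast α-glucosidase carries over unchanged to the maltosyl series with HPA, and that the inhibitor concentration is well below Ki so that the observed rate constant is simply (ki/Ki)[I]. The 10 mM inhibitor concentration and all function and variable names are hypothetical, chosen only for illustration.

```python
import math

# Second-order inactivation constant quoted in the text for the
# trinitrophenyl 2-deoxy-2,2-difluoro-alpha-maltoside with HPA.
TNP_CONSTANT = 0.0073  # min^-1 mM^-1

# Approximate reactivity ratio (trinitrophenyl glycoside vs chloride) quoted
# for the monosaccharides; ASSUMED here to apply to the maltosides as well.
REACTIVITY_RATIO = 1000.0

# HYPOTHETICAL inhibitor concentration, for illustration only.
INHIBITOR_MM = 10.0


def estimated_chloride_constant(tnp_constant, ratio):
    """Scale the trinitrophenyl constant down by the reactivity ratio."""
    return tnp_constant / ratio


def inactivation_half_life_hours(second_order_constant, inhibitor_mm):
    """Half-life of active enzyme, assuming [I] << Ki (pseudo-first-order)."""
    k_obs = second_order_constant * inhibitor_mm  # min^-1
    return math.log(2) / k_obs / 60.0


k_chloride = estimated_chloride_constant(TNP_CONSTANT, REACTIVITY_RATIO)
t_half_h = inactivation_half_life_hours(k_chloride, INHIBITOR_MM)

print(f"estimated ki/Ki for the chloride: {k_chloride:.1e} min^-1 mM^-1")
print(f"estimated half-life at {INHIBITOR_MM:g} mM: {t_half_h:.0f} h")
```

Under these assumptions the estimated half-life of inactivation is roughly 160 hours, far longer than a typical inactivation time course, which is consistent with no time-dependent inactivation being observed.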

An improved synthesis of the elongation reagent, 4’-O-methyl-α-maltosyl fluoride (MeG2F), was developed featuring a much higher yielding methylation step using NaH and MeI on a benzyl-protected sugar. Employing the in situ elongation strategy, several potential mechanism-based inhibitors were tested with HPA. Excitingly, 5-fluoro-α-D-glucopyranosyl fluoride (5FGlcF) and 5-fluoro-β-L-idopyranosyl fluoride (5FIdoF) showed kinetic behavior consistent with the proposed in situ elongation-inactivation process, allowing the trapping and further kinetic and structural analysis of the covalent intermediate of HPA trapped by 5FIdoF and MeG2F. Reactivation experiments revealed a hydrolytic half-life on the order of 240 hours. However, in the presence of saturating maltose, the trapped intermediate undergoes turnover via transglycosylation with a half-life of less than 1 hour. The scope of this methodology was further proven with a closely related GH13 member, porcine pancreatic α-amylase (PPA). This observation of greatly enhanced inactivation of both PPA and HPA in the presence of 5FIdoF and MeG2F illustrates the generality of this in situ elongation strategy for the study of covalent intermediates in α-amylases and quite likely other retaining endo-glycosidases.
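The two half-lives quoted above can be compared on a common footing by converting them to first-order rate constants, k = ln 2 / t½. The short sketch below does this, treating 240 hours as an approximate value and 1 hour as an upper limit, so the resulting rate enhancement is a lower bound. The function and variable names are introduced here for illustration only.

```python
import math


def first_order_rate_constant(half_life_h):
    """Convert a half-life in hours to a first-order rate constant in h^-1."""
    return math.log(2) / half_life_h


# Hydrolytic reactivation: half-life on the order of 240 h (approximate).
k_hydrolysis = first_order_rate_constant(240.0)

# Turnover with saturating maltose: half-life below 1 h, so this rate
# constant is a lower bound.
k_transglycosylation_min = first_order_rate_constant(1.0)

# Minimum factor by which maltose accelerates turnover of the intermediate.
rate_enhancement_min = k_transglycosylation_min / k_hydrolysis

print(f"k(hydrolysis) ~ {k_hydrolysis:.4f} h^-1")
print(f"k(transglycosylation) > {k_transglycosylation_min:.3f} h^-1")
print(f"rate enhancement > {rate_enhancement_min:.0f}-fold")
```

On these figures, turnover of the trapped intermediate is at least about 240-fold faster with maltose as acceptor than with water.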

Using this newly developed elongation strategy, crystal structures of HPA trapped as its covalent intermediate were obtained in the forms of a monosaccharide complex, a disaccharide complex and a trisaccharide complex. In all three structures, the 5-fluoroidosyl moiety was covalently linked to D197 via a β-glycosidic bond. The role of E233 as the general acid/base residue was confirmed as it is ideally poised to deprotonate the water molecule which will in turn attack the anomeric center of the bound sugar. D300, another important catalytic residue, was found to form two strong hydrogen bonds to the C2 and C3 hydroxyls of the 5FIdo ring, confirming its key role in orienting intermediates within the -1 binding subsite. Non-covalent interactions between the covalent intermediate and other active site residues were also identified from these three structures. Interestingly, the sugar in the -1 subsite was found to adopt a 4C1 conformation in all three structures. Although the same chair conformation has been observed in the trapped covalent intermediate of GH13 α-transglycosidases, such a conformation was unexpected for a purely hydrolytic enzyme such as HPA. Earlier studies predicted that the intermediate of α-amylases should be distorted in order to satisfy the stereoelectronic requirements of subsequent water attack. However, the adoption of an undistorted chair in our structures suggests that the intermediate formed is sufficiently reactive for efficient hydrolysis without the need for conformational activation. While undoubtedly a distorted conformation must be formed at the transition state, as also evidenced by the tight binding of acarbose, possibly the need for significant distortion within the covalent intermediate has been over-rated.


Chapter 3: Mechanistic Studies of Trehalose Synthase (TreS) from Mycobacterium smegmatis

3.1 Background

3.1.1 Introduction to trehalose

Trehalose is a non-reducing disaccharide in which two glucose molecules are connected with an α,α-(1,1) glycosidic bond (Figure 3.1). Two closely related stereoisomers have been synthesized: neotrehalose and isotrehalose. In neotrehalose, the inter-glycosidic linkage is an α,β-(1,1) bond while isotrehalose has a β,β-(1,1) glycosidic linkage. However, only trehalose (α,α-(1,1)-linkage) has been isolated from, and is synthesized by, living organisms.


Figure 3.1: Chemical structure of trehalose

Trehalose was first discovered in the fungus ergot by H. A. L. Wiggers in 1832.194 Since then, this disaccharide has been found in a wide spectrum of living organisms ranging from mycobacteria, bacteria, yeast and fungi up to higher orders of the plant kingdom and lower orders of the animal kingdom.195 In some of these organisms, the amount of trehalose found can be very significant, implying important biological roles for this simple disaccharide. For example, trehalose was reported to comprise 7% dry weight of the spores and macrocysts of Dictyostelium mucoroides.196 In the animal kingdom, 8% dry weight of eggs from the roundworm Ascaris lumbricoides was found to be trehalose.195 Further, the concentration of trehalose in insect haemolymph (blood) is also high, usually between 1% and 2%.197 By contrast, human blood glucose levels are controlled at around 0.1%. Apparently the non-reducing property of trehalose is an important factor for the tolerance of high concentrations in insect haemolymph.

Such a diverse occurrence of this simple sugar has led to many studies aimed at elucidating its biological functions in various organisms. The early view was that trehalose served as the storehouse of glucose, similar to the role of starch or glycogen, and this has proved to be the case in some organisms. For example, trehalose mobilization has been demonstrated to be an important process for fungal spore germination and related developmental processes.198 The biosynthesis of trehalose is very intensive during this rapid growth period and the accumulation of this sugar in fungi is only possible when the growth rate drops. Similarly, high concentrations of trehalose and a high hydrolysis rate have been shown to be vital to the flight of many insects.197 More recently, however, many studies have revealed that trehalose is not just a simple storage compound but may serve many other important biological roles, as discussed below.

One of the most studied functions is that trehalose can stabilize membranes and proteins against extreme adverse conditions such as dehydration, oxygen radicals, heat and cold.195 Although water is vital to the survival of all living organisms, some organisms such as nematodes, brine shrimps, spores of certain fungi and baker’s yeast can survive extreme dehydration, even when 99% of their water has been removed.199 Such a state, named anhydrobiosis, can persist for years until these organisms contact water again, when they rapidly swell and resume normal activities. Trehalose synthesis was found to be associated with these dehydration processes as its concentration is typically very high in such anhydrobiotic organisms. For example, up to 20% dry weight of the nematode Aphelenchus avenae during dehydration was found to be trehalose.200 Many studies have focused on understanding how this particular disaccharide can stabilize cellular membranes and proteins during extreme dehydration processes. It has been proposed and confirmed experimentally that trehalose can stabilize membranes by both inhibiting the fusion of vesicles and lowering the phase transition temperature of the dry lipids.201,202 Such stabilizing effects have been attributed to the specific configurations of the hydroxyls, which allow efficient hydrogen bonding with the polar head groups of the lipids.199 Likewise, trehalose has also been shown to stabilize labile proteins under extreme dehydration processes and again the interactions between the trehalose hydroxyls and the polar residues of the proteins are believed to be key to the stabilization.203 Essentially the hydroxyl groups of trehalose replace the missing hydrogen bonds otherwise provided by water.

Trehalose has also been reported to protect cellular proteins from other environmental stresses such as heat, cold and oxygen radicals.195 High concentrations of trehalose can be found in organisms after being exposed to heat shock, oxygen stress and lowered temperature. For example, yeast and other organisms can accumulate up to 0.5 M of trehalose to protect their proteins in the presence of heat inactivation in vitro.194 By contrast, mutant E. coli strains which cannot synthesize trehalose are found to be much more vulnerable at 4 °C than the wild type.204 Undoubtedly the hydrogen bonding interactions between trehalose and protein residues are important contributors to the stabilization, thereby preventing the aggregation of denatured proteins. These can then refold rapidly in the presence of proper chaperone molecules after the removal of the environmental stress.

These stabilizing effects of trehalose against environmental stresses (dehydration, heat, cold, oxygen radicals) have led to many interesting applications. In one report, it was found that the introduction of low concentrations of trehalose could significantly improve the survival of mammalian cells during cryopreservation, which is very important for the long-term storage of living cells.205 In another interesting case, a trehalose biosynthetic gene (to be discussed in Section 3.1.2) was introduced into the tobacco chloroplast. While no difference of growth rate and fertility between the chloroplast transgenic plants and the control plants was observed, the transgenic tobacco showed significantly improved drought tolerance.206 It was therefore proposed that it should be possible to generate various transgenic crop plants which can withstand environmental stresses by introducing trehalose biosynthetic genes.

Besides its role as a protectant of membranes and proteins, trehalose has been found to be an important component of cell walls of many mycobacteria and corynebacteria in the form of glycolipids. The most well-studied example is trehalose dimycolate, or cord factor.207 It contains a trehalose core structure, with the unusual fatty acid mycolic acid esterified at its 6-OH and 6’-OH, respectively (Figure 3.2(a)). The cord factor is present in the outer envelopes of most mycobacteria and is the most abundant extractable glycolipid in the cell wall of virulent Mycobacterium tuberculosis. The cord factor is the most toxic lipid produced by M. tuberculosis and has been identified as its virulence factor.207,208 In addition, its presence significantly increases the impermeability of the cell wall of these mycobacteria and protects them from various antibiotics.209 Another common class of trehalose-containing glycolipids found in the cell walls of the various mycobacteria is that of the sulfolipids. A good example of these sulfolipids, sulfolipid-1 (SL-1), is shown in Figure 3.2(b).210 Trehalose serves as SL-1’s structural core, with four acyl groups attached at 2-OH, 3-OH, 6-OH and 6’-OH and one sulfation at 2’-OH. It is the most abundant sulfur-containing lipid in M. tuberculosis (~1% of the dry weight) and has been demonstrated to be important for the survival and virulence of the organism.211,212 The biosynthetic enzymes of SL-1 have subsequently become interesting targets for developing potential tuberculosis drugs.213 Non-sulfated glycolipids containing trehalose have also been identified from several mycobacteria. For example, diacylated trehalose derivatives at 2-OH and 3-OH have been found from the cell walls of M. tuberculosis and Mycobacterium fortuitum.214 Their exact biological functions, however, are still unknown.

Figure 3.2: Selected examples of trehalose-containing glycolipids from mycobacteria (a) trehalose dimycolate (cord factor) (b) sulfolipid-1 (SL-1)

3.1.2 Biosynthetic pathways of trehalose

The important functions of trehalose, as discussed in the previous section, have resulted in significant interest in its biosynthetic pathways. It was initially thought that enzymes involved in trehalose synthesis from M. tuberculosis should serve as good targets to develop tuberculosis drugs, since humans cannot synthesize trehalose.215 However, the discovery of at least three independent biosynthetic pathways for trehalose in M. tuberculosis has decreased interest in this approach.216

The prevalent biosynthetic pathway for trehalose involves a two-step enzymatic process, as shown in Scheme 3.1. The first step is a glucosyl transfer reaction, catalyzed by trehalose-6-phosphate synthase (TPS; this enzyme is also called OtsA in E. coli).217 It involves transfer of a glucosyl moiety from UDP-glucose to the acceptor glucose-6-phosphate (its α-anomer) to produce trehalose-6-phosphate. This enzyme activity has been found in most of the trehalose-producing organisms including mycobacteria, insects and yeasts.195 The second step is dephosphorylation of the trehalose-6-phosphate to generate the free trehalose, catalyzed by trehalose-6-phosphate phosphatase (TPP, OtsB in E. coli).215 The TPP enzyme is present in most of the organisms where the TPS enzyme is present. Interestingly, in the yeast Saccharomyces cerevisiae, the TPS and TPP enzymes, together with two other regulatory proteins, are all part of a trehalose synthase complex.218 The involvement of these regulatory proteins in trehalose synthesis has led to proposals that trehalose metabolism in this organism has close interactions with glycolysis/fermentation and is a highly regulated process.

(Scheme: UDP-glucose + glucose-6-phosphate → trehalose-6-phosphate + UDP, catalyzed by TPS; trehalose-6-phosphate → trehalose + phosphate, catalyzed by TPP)

Scheme 3.1: The TPS-TPP biosynthetic pathway of trehalose

Alternative biosynthetic routes to trehalose have been discovered in several organisms for which trehalose is probably vital to their survival, such as mycobacteria and corynebacteria. One of the elucidated pathways involves two enzymatic steps, as shown in Scheme 3.2. In the first step, the reducing end of glycogen or a maltooligosaccharide is converted to a trehalose by isomerizing the terminal α-(1,4) linkage into an α-(1,1) linkage. Maltooligosyltrehalose synthase (TreY) is responsible for this rearrangement. This is followed by cleavage of the terminal trehalose by the second enzyme, maltooligosyltrehalose trehalohydrolase (TreZ). This TreY-TreZ pathway was first identified in Arthrobacter219-221 and was subsequently found in Rhizobia222, Sulfolobus acidocaldarius223 and M. tuberculosis195. Interestingly, the combined use of the TreY-TreZ enzymes has allowed facile, inexpensive industrial production of trehalose.224

(Scheme: TreY isomerizes the terminal α-(1,4) linkage to an α-(1,1) linkage; TreZ then cleaves off the terminal trehalose)

Scheme 3.2: The TreY-TreZ biosynthetic pathway of trehalose

Another novel biosynthetic pathway employs only one enzyme, trehalose synthase (TreS), to catalyze the reversible inter-conversion of maltose and trehalose, as shown in Scheme 3.3, arriving at an equilibrium mixture of roughly equal amounts of maltose and trehalose (40 – 45% of each, with glucose as the side product). Essentially the reaction catalyzed by TreS is very similar to that performed by TreY, but the substrate of TreS is much shorter. This enzyme activity was first demonstrated in Pimelobacter and Pseudomonas putida through an extensive screening of 2,500 strains of soil bacteria.225 Subsequently TreS enzyme has been purified and characterized from many other microorganisms such as Thermus aquaticus226, Thermus caldophilus227, Thermus thermophilus228, Picrophilus torridus229, M. tuberculosis216 and M. smegmatis216.

(Scheme: maltose ⇌ trehalose, catalyzed by TreS)

Scheme 3.3: TreS-catalyzed isomerisation of maltose and trehalose

The presence of three biosynthetic pathways in M. tuberculosis raised questions about the roles of each. This was first addressed by generating mutant strains of the closely related M. smegmatis which lacked either two or all three of the biosynthetic pathways.214 As expected, the mutant which lacked all three pathways could not proliferate and enter the stationary phase in the absence of exogenous trehalose. However, mutants with any one of the three pathways could actually grow at wild type levels. This suggested that the three pathways (TPS-TPP, TreY-TreZ and TreS) are mutually redundant for the growth of M. smegmatis. In sharp contrast, later studies employing a similar approach on M. tuberculosis provided evidence that the TPS-TPP pathway is actually the dominant one among the three pathways.230 Mutant strains of M. tuberculosis which lacked this pathway showed significant defects in growth both in vitro and in vivo. This study also demonstrated the strict requirement for a functional TPP enzyme for the survival of M. tuberculosis, suggesting it as a good target for developing potential anti-tubercular drugs.

3.1.3 Introduction to trehalose synthase (TreS)

Detailed mechanistic studies on trehalose synthase (TreS) are of interest for the following reasons. Firstly, TreS is a mechanistically interesting enzyme since it catalyzes an intriguing rearrangement between α-(1,1) and α-(1,4) glycosidic linkages. Further, it is quite likely that TreS shares significant mechanistic similarities with TreY, thus mechanistic insights into TreS should shed light on TreY, another poorly understood enzyme. Secondly, there is still significant interest in finding economic ways to produce trehalose on an industrial scale since this sugar has found many applications in the food, cosmetics and pharmaceutical industries.224 TreS could potentially provide a simple alternative to the TreY-TreZ pathway for commercial production of trehalose. Despite the fact that TreS enzymes have been cloned, expressed and purified from a range of microorganisms,216,225-227,229,231-236 only limited mechanistic information is available on this class of enzymes. Most of the published studies have focused only on studying the effects of changing various factors such as temperature, pH and metal cations on the enzymatic activity of TreS, but have not probed its mechanism.

TreS is an α-retaining transglycosidase in the α-amylase family (GH13). Sequence alignment of several TreS enzymes with other GH13 glycosidases has already identified several conserved regions.226,229,234,236,237 Therefore TreS is believed to adopt the classical double displacement mechanism in catalyzing the inter-conversion of maltose and trehalose. Two central questions exist regarding this mechanism. Firstly, although a double displacement mechanism seems to be the most probable, there has been no direct demonstration of the formation of a covalent intermediate on this particular class of enzymes, so it is not certain that one exists. Secondly, in order to catalyze the rearrangement, the glucose in the aglycone binding site must be flipped before it re-attacks the covalent glycosyl-enzyme intermediate. The question is whether this flipping event occurs inside or outside of the enzymatic active site. If outside, can an exogenous glucose molecule be incorporated into the disaccharide product by TreS catalysis?

Figure 3.3: Possible products formed from deuterated trehalose, normal trehalose and TreS in the different mechanistic scenarios. Intramolecular mechanism: maltose + d8-maltose. Intermolecular mechanism: d4-maltose + d4-trehalose in addition.

The second question has been addressed in two interesting studies on different TreS enzymes. In the earlier report, 14C-labeled glucose was added to a reaction mixture containing maltose and TreS from Pimelobacter sp. R48.231 After 24 hours of reaction, most of the maltose had been converted to a mixture of trehalose and maltose, as expected. However, no 14C-labeled glucose was incorporated into either maltose or trehalose, supporting the intramolecular nature of the rearrangement. The reverse reaction pathway, from trehalose to maltose, was investigated in the second report.232 Both deuterated (2,4,6,6’,2’,4’,6’’,6’’’-d8) trehalose and normal trehalose were incubated with TreS from Thermus caldophilus GK24, as shown in Figure 3.3. If the rearrangement occurs in an intramolecular fashion, the products will be normal maltose and d8-maltose plus normal trehalose and d8-trehalose. In contrast, if the rearrangement is an intermolecular process, d4-maltose and d4-trehalose will be generated as well. After 12 hours of incubation, the reaction mixture of d8-trehalose, normal trehalose and TreS was subjected to HPLC separation followed by MS analysis. No d4-maltose or d4-trehalose could be detected, confirming the intramolecular nature of the mechanism. With this information, our proposed catalytic mechanism for TreS is shown in Scheme 3.4.

Scheme 3.4: Proposed catalytic mechanism of TreS.

Prof. Alan Elbein’s research group (Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, USA) has successfully cloned, expressed and purified the active TreS enzyme from M. smegmatis.216 This enzyme, kindly provided by his group, was investigated in this study as the model system.

3.2 Specific aims of this study

(1) Developing a continuous assay to kinetically analyze TreS

All previously published studies on TreS employed stopped assays (as will be discussed in Section 3.3.1), which were time-consuming. To facilitate the mechanistic analysis of TreS, a suitable continuous assay will be developed. This will be achieved by synthesizing a series of aryl α-glucosides and α-glucosyl fluoride and evaluating them as substrates for this enzyme.

(2) Trapping the covalent glycosyl-enzyme intermediate of TreS and identifying the catalytic nucleophile by mass spectrometry

So far there has been no direct demonstration of a trapped covalent glycosyl-enzyme intermediate on TreS. Both 2-fluoro and 5-fluoro sugar inactivators will be synthesized and tested with the enzyme. If any of them is shown to function as a mechanism-based inhibitor, the trapped covalent intermediate of TreS will be subjected to protease digestion followed by MS analysis of the resultant peptides to identify the catalytic nucleophile residue.

(3) Confirming the intramolecular nature of the rearrangement between maltose and trehalose catalyzed by M. smegmatis TreS

While this issue has been addressed before in two other TreS enzymes, it has not been demonstrated in the M. smegmatis TreS used in this study. Since this is very important for understanding its mechanism, two different approaches will be employed to confirm the intramolecular character of this enzymatic conversion.

3.3 Detailed kinetic analysis of TreS using different α-glucoside substrates

3.3.1 Design of substrates

All previous kinetic studies of TreS utilized stopped assays to quantify its activity. Usually maltose was incubated with TreS for a certain amount of time, followed by heat inactivation of the enzyme. The amount of trehalose in the mixture was then determined by HPLC analysis. A slightly different kinetic assay was employed to study TreS from M. smegmatis by Prof. Elbein’s group.216 In their study, trehalose was used as the substrate. After defined time periods, the reaction was stopped by heating and the amount of newly formed maltose in the resultant mixtures was quantified by standard reducing sugar assays. While these stopped assays utilizing natural substrates are very straightforward, they can be very time-consuming and inconvenient for monitoring the kinetics of inactivation. A continuous assay is needed for carrying out such studies of TreS.

Figure 3.4: Potential substrates of TreS: aryl α-glucosides (aryl substituent X = NO2, Cl, etc.) and α-glucosyl fluoride.

Many hydrolytic α-glucosidases in the GH13 family can be conveniently assayed by using aryl α-glucosides such as 4-nitrophenyl α-glucoside.76,149 Indeed, previous reports on TreS have also indicated that this enzyme has weak hydrolytic activity along with its isomerization activity, producing small amounts of glucose, usually less than 10%, along with isomerized products.216,225 The relatively higher leaving group abilities of anomeric phenols, compared with the glucose aglycone of the natural substrates, should enhance reaction rates. If initial studies are promising, kinetic parameters (Km and kcat) will be determined for a series of aryl glycosides, as shown in Figure 3.4. A plot of log(kcat) or log(kcat/Km) for each substrate against the pKa value of the corresponding phenol leaving group can be used to identify the rate-limiting step for each substrate.

For many retaining β-glycosidases, biphasic plots of log(kcat) versus pKa have been obtained,31,36-38 as shown in Figure 3.5.

Figure 3.5: An illustrative example of a biphasic Brønsted plot (log(kcat) versus leaving group pKa) obtained for hydrolysis of a series of aryl glycosides by a retaining β-glycosidase.

For substrates with good leaving groups (usually pKa < 7), kcat is essentially independent of the aglycone structure, while for substrates with poorer leaving groups, kcat is highly dependent on the aglycone pKa. Consequently, a negative slope is seen on the right side of the plot, since the higher the pKa of the phenol, the worse it is as a leaving group and the lower the enzymatic rate will be. Overall, this biphasic, concave downward plot is diagnostic of a change in rate-determining step. For substrates in the flat region deglycosylation is rate-limiting, while for substrates in the downward-sloping region glycosylation is rate-limiting. The value of the negative slope reflects the sensitivity of the rate constant to changes in leaving group ability. A large negative value would indicate substantial charge development at the glycosidic oxygen, and thus also likely at the anomeric center.
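The origin of such a biphasic plot can be illustrated with a minimal numerical sketch. It assumes the standard two-step scheme for a retaining glycosidase, in which kcat = k2·k3/(k2 + k3), with the glycosylation rate constant k2 following a linear free-energy relationship in the leaving group pKa and the deglycosylation rate constant k3 independent of it. The function name and all parameter values (k3, the slope beta_lg and the value of log k2 at pKa 7) are hypothetical, chosen only to place the break near pKa 7; they are not measured values.

```python
import math

def log_kcat(pka, k3=10.0, beta_lg=-1.0, log_k2_at_pka7=1.0):
    """log10(kcat) for a two-step mechanism.

    k2 (glycosylation) depends on the leaving group pKa;
    k3 (deglycosylation) does not. All defaults are hypothetical.
    """
    log_k2 = log_k2_at_pka7 + beta_lg * (pka - 7.0)
    k2 = 10.0 ** log_k2
    kcat = k2 * k3 / (k2 + k3)
    return math.log10(kcat)

# Flat region at low pKa (kcat ~ k3), sloping region at high pKa (kcat ~ k2)
for pka in (4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0):
    print(f"pKa {pka:4.1f}: log kcat = {log_kcat(pka):6.2f}")
```

With these illustrative parameters, log kcat plateaus at log k3 for good leaving groups and falls with a slope approaching beta_lg for poor ones.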

Another very useful class of glycosidase substrates is that of the glycosyl fluorides.238 The inherent high reactivity and small size of the anomeric fluoride render glycosyl fluorides highly efficient substrates of glycosidases. In fact, no glycosidase is known that cannot process its corresponding glycosyl fluoride substrate. Therefore, the two classes of substrates, aryl α-glucosides and α-glucosyl fluoride (αGlcF), were synthesized and tested as potential TreS substrates (Figure 3.4).

3.3.2 Chemical synthesis of potential substrates of TreS

Scheme 3.5: Chemical synthesis of αGlcF (3.1). i) HF/pyridine, 74% yield; ii) NH3/MeOH, quantitative yield.

The synthesis of αGlcF (3.1) was very straightforward (Scheme 3.5). Simply fluorinating the anomeric center by reacting per-O-acetylated glucose with HF/pyridine afforded the desired per-O-acetylated α-glucosyl fluoride (2.38) in a good yield (74%). The α-configuration of the anomeric fluoride was clearly demonstrated by the large diaxial coupling constant JF,H2 (24.2 Hz). All the acetate groups were subsequently removed by using NH3/MeOH to generate αGlcF in a quantitative yield.

Table 3.1: The six aryl α-glucosides selected for testing as TreS substrates

Compound name                               Abbreviation   pKa of the phenol*
2,4-dinitrophenyl α-glucoside (3.2)         DNPGlc         3.96
3,4-dinitrophenyl α-glucoside (3.3)         34DNPGlc       5.36
4-chloro-2-nitrophenyl α-glucoside (3.4)    4C2NPGlc       6.45
4-nitrophenyl α-glucoside (3.5)             4NPGlc         7.18
2-nitrophenyl α-glucoside (3.6)             2NPGlc         7.22
3-nitrophenyl α-glucoside (3.7)             3NPGlc         8.39

* pKa values taken from reference [240]

Six different aryl α-glucosides were selected as potential TreS substrates, as shown in Table 3.1. Three of these compounds, 34DNPGlc (3.3), 4C2NPGlc (3.4) and 3NPGlc (3.7) were kindly provided by Dr. Vivian Yip from our group. 2NPGlc (3.6) was synthesized according to literature procedure.239 DNPGlc (3.2) was synthesized as follows.

Scheme 3.6: Chemical synthesis of DNPGlc (3.2). i) 1-fluoro-2,4-dinitrobenzene, K2CO3, DMF, 48% yield; ii) MeOH, acetyl chloride (4% v/v), quantitative yield.

The synthesis of DNPGlc (Scheme 3.6) started with installation of the 2,4-dinitrophenyl group onto the anomeric center of the glucoside. 2,3,4,6-Tetra-O-acetyl glucose (3.8) was reacted with Sanger’s reagent in DMF in the presence of K2CO3 as the catalyst.240 The presence of a strong base (K2CO3) catalyzes both the nucleophilic aromatic substitution and the isomerization of the initially formed β-glycoside to give the thermodynamically more stable α-product in one pot.241 As a result, the desired α-glycoside product 3.9 was obtained in an acceptable yield (48%). The α-configuration of the anomeric center was clearly demonstrated by the small coupling constant JH1,H2 = 3.5 Hz. Due to the instability of the anomeric DNP group under basic conditions, weakly acidic conditions were employed to deprotect the α-glycoside 3.9, and DNPGlc (3.2) was generated in quantitative yield.

3.3.3 Kinetic analysis of aryl α-glucosides and αGlcF as TreS substrates

Incubation of five of the aryl α-glucoside substrates (all except 3NPGlc) with TreS resulted in a continuous increase of absorbance at 400 nm over time, indicating that all five serve as TreS substrates. By contrast, the enzymatic turnover rate of 3NPGlc was too low to be measured. TLC analysis of selected mixtures of TreS and aryl α-glucoside substrates showed that only hydrolysis reactions occurred (data not shown). Michaelis-Menten kinetic analysis was performed on all five substrates to determine their kinetic parameters (kcat, Km), as shown in Table 3.2. Incubation of αGlcF with TreS also resulted in continuous release of fluoride ion, as monitored using a fluoride electrode. This allowed determination of kinetic parameters for αGlcF, as shown in Table 3.2. TLC analysis of the reaction mixture of TreS with αGlcF confirmed that glucose was the sole product (data not shown).

Table 3.2: Summary of kinetic parameters of TreS substrates at 37 °C

Substrate   pKa    Km (mM)        kcat (s⁻¹)           kcat/Km (mM⁻¹ s⁻¹)   Δε (M⁻¹ cm⁻¹)*
DNPGlc      3.96   2.9 ± 0.1      4.7 ± 0.1            1.6                  9800
34DNPGlc    5.36   2.5 ± 0.1      (4.3 ± 0.1) × 10⁻²   1.7 × 10⁻²           14220
4C2NPGlc    6.45   2.2 ± 0.1      8.3 ± 0.2            3.8                  1300
4NPGlc      7.18   5.8 ± 0.2      0.10 ± 0.01          1.7 × 10⁻²           6464
2NPGlc      7.22   0.7 ± 0.1      8.1 ± 0.3            12                   1650
αGlcF       3.17   0.15 ± 0.03    5.3 ± 0.3            35                   N/A
5FαGlcF     3.17   0.016 (Ki’)    2.9 × 10⁻³           N/A                  N/A

* All Δε values were determined at the wavelength of 400 nm.
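As a quick consistency check on the tabulated values, the specificity constants can be recomputed from the reported Km and kcat. The sketch below simply copies the central values from Table 3.2 (uncertainties are ignored); the dictionary and function names are illustrative only.

```python
# (Km in mM, kcat in s^-1, reported kcat/Km in mM^-1 s^-1), copied from Table 3.2
table_3_2 = {
    "DNPGlc":   (2.9, 4.7, 1.6),
    "34DNPGlc": (2.5, 4.3e-2, 1.7e-2),
    "4C2NPGlc": (2.2, 8.3, 3.8),
    "4NPGlc":   (5.8, 0.10, 1.7e-2),
    "2NPGlc":   (0.7, 8.1, 12.0),
    "αGlcF":    (0.15, 5.3, 35.0),
}

def specificity_constant(km_mM, kcat_per_s):
    """Return kcat/Km in mM^-1 s^-1."""
    return kcat_per_s / km_mM

recomputed = {}
for name, (km, kcat, reported) in table_3_2.items():
    recomputed[name] = specificity_constant(km, kcat)
    print(f"{name:9s} kcat/Km = {recomputed[name]:.3g} (reported {reported:g})")
```

Each recomputed value agrees with the reported one to within rounding of the two-significant-figure entries.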

While all five aryl α-glucosides and αGlcF serve as TreS substrates, their kcat values clearly divide them into two groups. For DNPGlc, 4C2NPGlc, 2NPGlc and αGlcF, very similar turnover numbers are obtained, which are much larger than those of 34DNPGlc and 4NPGlc, as depicted in Figure 3.6.

Figure 3.6: Brønsted plot relating rates (log kcat) of TreS-catalyzed hydrolysis of five aryl glucoside substrates to the leaving group ability of their phenol aglycones (pKa).

It is quite probable that, for the faster substrates (DNPGlc, 4C2NPGlc, 2NPGlc and αGlcF), deglycosylation is rate-limiting, since the corresponding enzymatic rates do not depend on the leaving group ability of the anomeric group. A common test for rate-limiting deglycosylation is to examine the effect of an exogenous neutral nucleophile, such as methanol or ethylene glycol, on the steady-state rate. This has been demonstrated for several retaining β-glycosidases, wherein kcat for substrates with rate-limiting deglycosylation increases in proportion to the nucleophile concentration, while no increase is seen for substrates with rate-limiting glycosylation.31,36,38 Methanol was selected for the study of TreS because of its small size. The addition of methanol at concentrations ranging from 100 mM to 800 mM did not result in any rate increase when using DNPGlc, 4C2NPGlc or 2NPGlc as substrates, as shown for DNPGlc in Figure 3.7. While this lack of a rate increase may suggest that deglycosylation is not rate-limiting, it might simply mean that methanol cannot access the active site of this trapped intermediate species.

Figure 3.7: Methanol competition experiments of TreS with 2 mM DNPGlc and different concentrations of methanol at 37 °C (relative activity versus [MeOH]/mM).

The finding of particularly low rates for 34DNPGlc and 4NPGlc, considering the relatively good leaving group abilities of their phenols, is surprising, especially given that their Km values resemble those of the other substrates. Possibly their aglycone structures interfere with a rate-limiting conformational change, or the interactions with the enzyme at the transition state are particularly unfavorable.

Unless otherwise noted, all the following kinetic studies only used DNPGlc (3.2) as the TreS substrate.

3.3.4 pH Dependence of wild type TreS

pH dependence studies generate important mechanistic insights into the ionization states of residues that are essential for enzymatic catalysis. In order to gain such information, stability tests were first carried out in which the enzyme was incubated in buffers at a series of pH values, and aliquots removed at different times were assayed under standard kinetic conditions. Figure 3.8 shows a plot of the residual activity of TreS incubated for one hour at 37 °C at different pH values. From this figure it is clear that the enzyme is stable from pH 5.8 to 8.0 for 60 minutes. This region was therefore selected for pH dependence studies.

Figure 3.8: Residual enzyme activity of TreS after incubation at 37 °C for one hour at different pH values.

The pH dependence of kcat/Km for TreS was measured using the substrate depletion method33 at low substrate concentrations ([S] << Km). Under those conditions, the reaction rate is given by the equation v = kcat[E]o[S]/Km, so the time course of the reaction progress curve can be fit to a first-order equation and a pseudo-first-order rate constant kobs can be extracted. Since the kobs values correspond to [E]okcat/Km, values of kcat/Km can be obtained by dividing these rate constants by the enzyme concentration. Assays were performed with 50 μM DNPGlc, since its Km = (2.9 ± 0.1) mM. The individual kcat/Km values of TreS determined at different pH values are shown in Table 3.3, and a plot of these values versus pH is shown in Figure 3.9.
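The substrate depletion analysis can be sketched numerically as follows. This is a minimal illustration rather than the analysis actually performed: it generates a noise-free first-order progress curve and recovers kobs from the slope of ln([S]o − [P]) versus time, then divides by the enzyme concentration. The enzyme concentration, the sampling times and the "true" kcat/Km used to simulate the curve are hypothetical; only the 50 μM substrate concentration comes from the text.

```python
import math

def fit_first_order(times, product, s0):
    """Least-squares slope of ln(S0 - P) versus t; returns kobs (positive)."""
    ys = [math.log(s0 - p) for p in product]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    den = sum((t - t_mean) ** 2 for t in times)
    return -num / den

s0_mM = 0.050                  # 50 uM DNPGlc, well below Km = 2.9 mM
e0_mM = 1.0e-4                 # hypothetical enzyme concentration (0.1 uM)
kcat_over_km_true = 1.6        # mM^-1 s^-1, used only to simulate data
kobs_true = kcat_over_km_true * e0_mM            # s^-1

times = [60.0 * i for i in range(1, 61)]         # one point per minute for 1 h
product = [s0_mM * (1.0 - math.exp(-kobs_true * t)) for t in times]

kobs = fit_first_order(times, product, s0_mM)
kcat_over_km = kobs / e0_mM
print(f"kobs = {kobs:.3e} s^-1, kcat/Km = {kcat_over_km:.3f} mM^-1 s^-1")
```

With real absorbance data the end-point value would also have to be fitted or measured; a nonlinear fit to the exponential is the more common choice in that case.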

Table 3.3: The individual kcat/Km values of TreS at different pH values

pH                    5.8    6.2    6.5    6.8    7.1    7.4    7.7    8.0
kcat/Km (s⁻¹ mM⁻¹)    1.37   1.86   1.91   1.62   1.49   1.01   0.76   0.60

Figure 3.9: Dependence of kcat/Km upon pH for hydrolysis of DNPGlc by wild type TreS (fitted values: pKa1 = 5.6 ± 0.2, pKa2 = 7.4 ± 0.1). The data were fit to the expression for the pH dependence of an enzyme reaction dependent upon two essential ionizations by nonlinear regression using the program GraFit 5.0.13 (Erithacus Software Limited, 2006).

The bell-shaped curve obtained from the plot of kcat/Km versus pH indicates that two ionizable groups are involved in catalysis, with pKa values of 5.6 ± 0.2 and 7.4 ± 0.1. These results are consistent with the classical double displacement mechanism adopted by retaining glycosidases. The smaller pKa value (5.6 ± 0.2) is believed to belong to the catalytic nucleophile and the larger value (7.4 ± 0.1) should represent that of the general acid/base residue. The value of pKa1 could not be determined accurately, since the instability of the enzyme under acidic conditions (pH < 5.8) prevented the collection of more data in this limb of the pH profile.
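The shape of such a profile can be sketched with the usual expression for a rate that requires one group deprotonated (pKa1) and another protonated (pKa2). The function below is an illustration under that assumption and is not necessarily the exact equation implemented in GraFit. The pKa values and the observed data are taken from the text and Table 3.3; the pH-independent limit is not reported, so it is scaled here (a hypothetical choice) to make the model pass through the observed value at pH 6.5.

```python
def kcat_over_km(ph, limit, pka1=5.6, pka2=7.4):
    """Bell-shaped pH profile for two essential ionizations."""
    return limit / (1.0 + 10.0 ** (pka1 - ph) + 10.0 ** (ph - pka2))

# Observed values from Table 3.3 (s^-1 mM^-1)
observed = {5.8: 1.37, 6.2: 1.86, 6.5: 1.91, 6.8: 1.62,
            7.1: 1.49, 7.4: 1.01, 7.7: 0.76, 8.0: 0.60}

# For this model the optimum lies midway between the two pKa values
ph_opt = (5.6 + 7.4) / 2.0

# Hypothetical scaling: force the model through the observed value at pH 6.5
limit = observed[6.5] * (1.0 + 10.0 ** (5.6 - 6.5) + 10.0 ** (6.5 - 7.4))

for ph, obs in observed.items():
    print(f"pH {ph:.1f}: model {kcat_over_km(ph, limit):.2f}, observed {obs:.2f}")
```

The predicted optimum at the midpoint of the two pKa values coincides with the pH of the largest tabulated value; the agreement elsewhere is only qualitative because the limit was not fitted.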

3.4 The trapping of the covalent intermediate of TreS and identification of the catalytic nucleophile by mass spectrometric analysis of labeled peptides

3.4.1 Mechanism-based inhibitors of TreS

2-Deoxy-2-fluoro glycosyl fluorides have not generally proved useful in trapping the covalent intermediate of retaining α-glycosidases. However, since TreS is a transglycosidase and 2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (3.13) was easily synthesized according to Scheme 3.7, compound 3.13 was tested as a potential mechanism-based inhibitor of TreS. Not surprisingly, no time-dependent inactivation was observed (data not shown). Consequently, our attention shifted to the use of 5-fluoro glycosyl fluorides, which have proved much more successful with retaining α-glycosidases.

Scheme 3.7: Chemical synthesis of 2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (3.13). i) HF/pyridine, 43% yield; ii) NH3/MeOH, quantitative yield.

The chemical synthesis of two 5-fluoro glycosyl fluorides, namely 5-fluoro-α-D-glucopyranosyl fluoride (5FGlcF, 2.30) and 5-fluoro-β-L-idopyranosyl fluoride (5FIdoF, 2.31), has been described in Section 2.3.7.1. Kinetic evaluation of these compounds as potential mechanism-based inhibitors of TreS followed a similar approach to that used for testing 2-deoxy-2,2-dihalo maltosides with HPA (Section 2.3.3). Incubation of TreS with 5FGlcF indeed resulted in inhibition. However, no time-dependent inactivation could be detected and complete inhibition was not observed. Instead, a steady-state residual activity was established. Essentially, 5FGlcF behaved like an apparent reversible inhibitor and its apparent Ki’ value was determined to be 16 μM, as shown in Figure 3.10 (a).

Considering that the Km of DNPGlc is 2.9 mM for TreS, the much smaller Ki’ value of 5FGlcF was inconsistent with a true binding constant, since 5FGlcF is only a simple substrate analogue with a small aglycone. This result therefore suggested significant accumulation of the covalent intermediate of TreS with 5FGlcF, and 5FGlcF is more accurately described as a slow substrate of TreS for which deglycosylation is rate-limiting. Similar phenomena have been observed when testing 5-fluorosugars as potential mechanism-based inhibitors of other α-glycosidases.75,79 Indeed, incubation of 5FGlcF with TreS resulted in continuous release of fluoride anion, as monitored using a fluoride electrode (Figure 3.10(b)). By using a large excess of substrate ([5FGlcF] >> Ki’), the turnover number of 5FGlcF by TreS was determined to be kcat = 2.9 × 10⁻³ s⁻¹, roughly 2,000-fold smaller than that of αGlcF (Table 3.2).
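The gap between the apparent Ki’ and a true dissociation constant can be made explicit with the standard steady-state expressions for a two-step mechanism. The block below is a sketch assuming the minimal scheme E + S ⇌ ES → E–S → E + P, where K_s is the dissociation constant of the Michaelis complex and k_2 and k_3 are the glycosylation and deglycosylation rate constants. These symbols are introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
  k_{\mathrm{cat}} &= \frac{k_2 k_3}{k_2 + k_3},
  &
  K_m &= K_s\,\frac{k_3}{k_2 + k_3},
  \\[4pt]
  % limiting case: deglycosylation much slower than glycosylation
  k_{\mathrm{cat}} &\approx k_3,
  &
  K_m &\approx K_s\,\frac{k_3}{k_2} \ll K_s
  \qquad (k_3 \ll k_2).
\end{align}
```

Under this assumption, a slow substrate assayed as a competitive inhibitor gives an apparent Ki’ equal to its own Km, which falls below K_s by the factor k_3/k_2 as the covalent intermediate accumulates.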

Figure 3.10: (a) Dixon plot (1/v versus [5FGlcF]) of 5FGlcF as an apparent reversible inhibitor of TreS; (b) reaction of 1 mM 5FGlcF and 0.5 μM TreS enzyme at 37 °C, monitored by a fluoride ion electrode (fluoride concentration versus time). Only the portion of the graph after addition of the TreS enzyme is shown.

A common strategy to further lower the deglycosylation rate involves using the C-5 epimer of 5FGlcF, 5-fluoro-β-L-idopyranosyl fluoride (5FIdoF, 2.31), as the inactivator, a strategy similar to that used in the trapping of HPA and other α-glycosidases. Incubation of 5FIdoF with TreS indeed resulted in time-dependent decay of enzymatic activity. Pseudo-first-order kinetics of inactivation were seen at each concentration of 5FIdoF, allowing the determination of a rate constant for inactivation (kobs) at each inactivator concentration (Figure 3.11(a)). A replot of these inactivation rate constants versus inactivator concentration allowed the determination of the inactivation parameters ki = (0.0225 ± 0.0009) min⁻¹ and Ki = (13 ± 1) mM (Figure 3.11 (b)), and thus a second-order rate constant for inactivation of ki/Ki = 0.0017 min⁻¹ mM⁻¹.

Figure 3.11: Inactivation of TreS by 5FIdoF at 37 °C. (a) Plot of residual activity versus time at inactivator concentrations of 5, 10, 15, 20, 25 and 30 mM, together with a control containing no 5FIdoF; (b) Replot of inactivation rate constants versus concentration of inactivator.
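The relationship between the reported inactivation parameters can be checked with a short sketch. It assumes the usual saturation form for mechanism-based inactivation, kobs = ki[I]/(Ki + [I]); the function and variable names are illustrative, while ki and Ki are the values quoted above.

```python
import math

k_i = 0.0225   # min^-1, maximal inactivation rate constant (from the text)
K_i = 13.0     # mM, apparent dissociation constant of the inactivator

def k_obs(conc_mM, k_i=k_i, K_i=K_i):
    """Pseudo-first-order inactivation rate constant at a given [I] (min^-1)."""
    return k_i * conc_mM / (K_i + conc_mM)

second_order = k_i / K_i             # min^-1 mM^-1
t_half_sat = math.log(2) / k_i       # half-life at saturating [I], in min

print(f"ki/Ki = {second_order:.4f} min^-1 mM^-1")
print(f"t1/2 at saturation = {t_half_sat:.0f} min")
for conc in (5, 10, 15, 20, 25, 30):
    print(f"[I] = {conc:2d} mM: kobs = {k_obs(conc):.4f} min^-1")
```

The computed ki/Ki rounds to the reported 0.0017 min⁻¹ mM⁻¹, and the half-life at saturating inactivator is about 31 min.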

A common test to confirm the active site-directed nature of the inactivation is the demonstration of protection against inactivation by a competitive inhibitor. Two α-glucosidase inhibitors were first tested as potential competitive inhibitors of TreS. The first, D-gluconohydroximino-1,5-lactam, or GHIL (Figure 3.12(a)), is a well-known inhibitor of GH13 yeast α-glucosidase (Ki = 2.9 μM).149 Kinetic measurements (performed by Mr. Michael Lam, a former BIOC 449 student) showed that GHIL was indeed a very good competitive inhibitor of TreS with Ki = (2.0 ± 0.5) μM (Figure 3.12(c)). Compared with the Km values of the substrates in Table 3.2, the much smaller Ki value of GHIL suggests that TreS catalyzes reactions via oxocarbenium ion-like transition states and therefore binds transition state analogues such as GHIL much more tightly than substrates. The second inhibitor, casuarine, is a polyhydroxylated pyrrolizidine (Figure 3.12(b)) that has been reported to be a good inhibitor of several related α-glycosidases, including yeast α-glucosidase, fungal glucoamylase and E. coli trehalase.242,243 Kinetic experiments showed that it was also a tight-binding inhibitor of TreS with Ki = (2.5 ± 0.1) μM (Figure 3.12(d)). Casuarine was selected for the protection experiment, and in the presence of 5 μM casuarine the inactivation rate of TreS by 10 mM 5FIdoF was indeed reduced from (0.0105 ± 0.0002) min⁻¹ to (0.0046 ± 0.0005) min⁻¹, confirming that inactivation was occurring at the active site of TreS (Figure 3.13).

Figure 3.12: (a) Chemical structure of GHIL; (b) Chemical structure of casuarine; (c) Lineweaver-Burk plot of GHIL as a reversible inhibitor of TreS; (d) Lineweaver-Burk plot of casuarine as a reversible inhibitor of TreS.

Figure 3.13: Inactivation of TreS by 10 mM 5FIdoF in the absence and the presence of 5 μM casuarine at 37 °C.
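The size of the protection effect can be compared with what simple competition would predict. The sketch below assumes that casuarine and 5FIdoF compete for the free enzyme, so that the apparent Ki of the inactivator is raised by the factor (1 + [P]/Kp); this model is an assumption made here for illustration and is not stated in the original text. All numerical inputs are the values reported above.

```python
k_i = 0.0225          # min^-1, from the 5FIdoF inactivation replot
K_i = 13.0            # mM, from the 5FIdoF inactivation replot
K_p = 2.5             # uM, Ki of casuarine as a reversible inhibitor
inactivator_mM = 10.0
protector_uM = 5.0

def k_obs_protected(i_mM, p_uM):
    """Inactivation rate constant with a competitive protecting ligand (min^-1)."""
    K_i_apparent = K_i * (1.0 + p_uM / K_p)
    return k_i * i_mM / (K_i_apparent + i_mM)

unprotected = k_obs_protected(inactivator_mM, 0.0)
protected = k_obs_protected(inactivator_mM, protector_uM)
print(f"predicted without casuarine: {unprotected:.4f} min^-1 (observed 0.0105)")
print(f"predicted with 5 uM casuarine: {protected:.4f} min^-1 (observed 0.0046)")
```

Under this assumed model the predicted protected rate lies within the reported uncertainty of the measured value, which is what active site-directed inactivation would lead one to expect.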

Reactivation experiments were attempted to demonstrate the catalytic competence of the covalent 5-fluoroidosyl-enzyme intermediate of TreS. Inactivated enzyme was freed from excess inactivator by ultrafiltration and then incubated in buffer at 37 °C. Aliquots of the reactivation mixture were taken at different times and assayed under standard assay conditions. Even after four days of incubation, there was no significant regain of enzymatic activity. The stability of the trapped intermediate is perhaps unsurprising, since the inactivation rate constant with 5FIdoF, ki = (0.0225 ± 0.0009) min⁻¹ (t1/2 = 31 min), was relatively low, and thus reactivation must be even slower. Possibly denaturation of the enzyme occurs faster than reactivation, so that reactivation cannot be detected. Indeed, control experiments showed that TreS lost roughly 50% of its activity after incubation at 37 °C for four days. Since the natural function of TreS is to catalyze transglycosylation, 100 mM glucose was added to the reactivation mixture in the hope of accelerating the reactivation, but none was seen over four days at 37 °C. Again, it seems likely that this is due to the conformationally closed nature of the active site when the covalent intermediate is formed.

3.4.2 Identification of the catalytic nucleophile of TreS by mass spectrometry

The complete inactivation of TreS by 5FIdoF opened up the possibility of identifying the amino acid to which the label is attached by using electrospray ionization mass spectrometry on a Q-TOF instrument. The stoichiometry of inactivation was first established by comparing the masses of intact TreS and 5FIdoF-labeled TreS. A mass of (65237 ± 2) Da was determined for the unlabeled wild type TreS [M+H+], as shown in Figure 3.14 (a). The predicted mass of the intact 593-amino-acid enzyme, however, is 68202 Da. Since TreS has been reported to undergo proteolysis during long-term storage on ice,244 the detected mass presumably reflects the proteolytic product that is detected best in the mass spectrometer. Upon incubation of the enzyme with 10 mM 5FIdoF at 37 °C for six hours, the major peak in the mass spectrum was found at (65418 ± 2) Da, an increase of 181 Da (Figure 3.14 (b)). This difference corresponds very well to the incorporation of a single 5-fluoro-α-L-idopyranosyl moiety (181 Da) into the protein.

Next, samples of 5FIdoF-labeled TreS and the non-labeled enzyme control were subjected to pepsin digestion at pH 2.0 at room temperature for two hours. The two resultant peptide mixtures were individually subjected to HPLC separation with analysis by ESI-MS. Comparative mapping of the elution profiles of the two samples allowed identification of a short peptide P2 (m/z 858.4), which was present in the labeled TreS mixture but not in the control sample. Instead, another peptide P1 (m/z 677.3) was found in the non-labeled TreS mixture. The difference in m/z between these two peptides (181.1) is consistent with P2 being a singly charged peptide to which is attached a label of mass 181.1 (the mass of the 5-fluoro-α-L-idopyranosyl moiety). P1 and P2 were therefore good candidates for the peptides of interest and were subjected to sequencing by collision-induced fragmentation.

Figure 3.14: Mass spectrum of (a) intact TreS enzyme (major peak at 65237 Da); (b) mixture of TreS enzyme with 10 mM 5FIdoF, incubated at 37 °C for six hours (major peak at 65418 Da).

Figure 3.15: ESI-MS/MS analysis of unlabeled peptide P1. (a) MS/MS fragment-ion spectrum of peptide P1 (labeled peaks, m/z: 187.09, 286.15, 295.18, 383.22, 392.24, 491.30, 546.28, 659.36, 677.38); (b) Fragmentation pattern of peptide P1 and corresponding m/z of singly charged b-ions and y-ions. Red-colored fragments can be found in the spectrum.

The sequence of peptide P1 was determined to be 230DAVPYL235, as evidenced by the fragment-ion spectrum shown in Figure 3.15. Daughter ion fragmentation analysis of the labeled peptide P2 (Figure 3.16) confirmed that it was the same peptide as P1, except for the additional peaks at m/z 858.4, corresponding to the 5-fluoro sugar-labeled peptide 230DAVPYL235, and at m/z 838.4, corresponding to its HF-elimination product. These results clearly confirmed that the six-amino-acid peptide starting from Asp230 was indeed the site of labeling by 5FIdoF. Considering that the majority of retaining glycosidases employ a carboxylic acid residue as the nucleophile, D230, as the only carboxylic acid in peptides P1 and P2, was believed to be the best candidate to fill this catalytic role. This conclusion was further supported by aligning the primary sequence of TreS with those of other GH13 enzymes. D230 was indeed found to be highly conserved in this family and corresponds to the catalytic nucleophile D197 of HPA (Figure 3.17).
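The assignment of P1 can be cross-checked by computing the expected singly charged b- and y-ion masses for the sequence DAVPYL. The sketch below uses standard monoisotopic residue masses; the function names are illustrative, and the observed values are the peak labels from the P1 fragment-ion spectrum (Figure 3.15). The pairing of each observed peak with a particular ion type is an inference made here, not a statement from the original text.

```python
# Standard monoisotopic residue masses (Da)
MONO = {"D": 115.02694, "A": 71.03711, "V": 99.06841,
        "P": 97.05276, "Y": 163.06333, "L": 113.08406}
WATER = 18.01056
PROTON = 1.00728

def b_ions(seq):
    """Singly charged b-ions b1..b(n-1): N-terminal fragments."""
    out, total = [], 0.0
    for res in seq[:-1]:
        total += MONO[res]
        out.append(total + PROTON)
    return out

def y_ions(seq):
    """Singly charged y-ions y1..y(n-1): C-terminal fragments."""
    out, total = [], 0.0
    for res in reversed(seq[1:]):
        total += MONO[res]
        out.append(total + WATER + PROTON)
    return out

def mh_plus(seq):
    """[M+H]+ of the intact peptide."""
    return sum(MONO[r] for r in seq) + WATER + PROTON

peptide = "DAVPYL"
print("b-ions:", [round(m, 2) for m in b_ions(peptide)])
print("y-ions:", [round(m, 2) for m in y_ions(peptide)])
print("[M+H]+:", round(mh_plus(peptide), 2))
print("[M+H-H2O]+:", round(mh_plus(peptide) - WATER, 2))
```

The computed b2 to b5 ions (about 187.07, 286.14, 383.19 and 546.26), y2 to y4 ions (about 295.17, 392.22 and 491.29), the water-loss ion (about 659.34) and [M+H]+ (about 677.35) each fall within roughly 0.03 of a labeled peak in the spectrum.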

Figure 3.16: ESI-MS/MS analysis of labeled peptide P2. (a) MS/MS fragment-ion spectrum of peptide P2 (labeled peaks, m/z: 187.09, 286.15, 295.18, 392.24, 491.29, 546.27, 659.36, 677.37, 838.42, 858.44); (b) Fragmentation pattern of peptide P2 and corresponding m/z of singly charged b-ions and y-ions. Red-colored fragments can be found in the spectrum.

Figure 3.17: Partial sequence alignment of several GH13 glycosidases including TreS. The three conserved carboxylic acid residues are labeled with arrows and the corresponding TreS numbering. The Swiss-Prot identifiers are indicated in parentheses below. The sequences shown are: Aspergillus niger α-amylase (P56271), Aspergillus oryzae α-amylase (P0C1B3), human pancreatic α-amylase (P04746), porcine pancreatic α-amylase (P00690), Bacillus circulans cyclomaltodextrin glucanotransferase (P43379) and Mycobacterium smegmatis trehalose synthase (A0R6E0).

Besides confirming the identity of the nucleophile as D230, the sequence alignment (Figure 3.17) also suggested that Glu272 is probably the general acid/base catalyst of TreS. The third carboxylic acid residue, which is highly conserved in the α-amylase family and thought to play a significant role in stabilizing the transition state, was identified as D342 in the sequence of TreS. Further detailed kinetic analysis of TreS mutants in which E272 and D342 have been modified is needed to confirm their proposed catalytic roles.

3.5 Investigations of the rearrangement mechanism of TreS

Previous studies on two other TreS enzymes have clearly shown that they catalyze the rearrangement between maltose and trehalose through an intramolecular mechanism.231,232 Thus the aglycone glucose must flip inside the active site after the covalent intermediate is formed, without diffusing out of it. We were interested in confirming this lack of exchange for the M. smegmatis TreS.

3.5.1 Approach 1: Attempted isotope-labeled glucose incorporation into the disaccharide by TreS

The first approach was designed to determine whether exogenous isotopically labeled glucose could be incorporated into disaccharide products during TreS catalysis. If TreS employs an intermolecular mechanism, the masses of the disaccharide products will reflect this exchange process, while if no exchange occurs, the masses will not change. Two kinds of isotope-labeled glucose, D-glucose-1-13C and D-glucose-6,6’-d2, were employed. Isotopically labeled glucose (10 mM) and maltose or trehalose (2 mM) were incubated with TreS at room temperature overnight. The reaction mixtures were directly subjected to MS analysis. As had been seen with TreS from other sources, no change in disaccharide mass was observed, regardless of which isotopically labeled glucose and disaccharide starting material were used (data not shown).

A second set of experiments was designed in which glucose and disaccharide products within the TreS reactions were acetylated before being analyzed by TLC and MS. Two reactions containing 10 mM glucose (either D-glucose-6,6’-d2 or U-13C-D-glucose), 2 mM maltose and TreS enzyme were incubated at 37 °C for 24 hours. A control sample with no enzyme added was also set up for comparison. All three mixtures were first lyophilized and then subjected to acetylation using acetic anhydride and pyridine. After aqueous workup, the resulting samples were analyzed by TLC and MS. As shown in Figure 3.18 (a), TLC analysis of the reaction of U-13C-D-glucose, maltose and TreS clearly demonstrated enzymatic conversion of maltose into trehalose, while no such conversion had occurred in the control sample (Figure 3.18 (b)). The mass spectrum of the mixture is shown in Figure 3.19. The two major peaks, at m/z 419.2 and 701.3, correspond to [per-O-acetylated (6×13C) glucose + Na+] and [per-O-acetylated maltose/trehalose + Na+], respectively. No incorporation of exogenous glucose was observed, as this would shift the mass of the disaccharide products by 6 Da to 707 Da.
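The peak assignments above can be checked with a short monoisotopic mass calculation. The following is only a sketch: it assumes the per-O-acetylated species are glucose pentaacetate (C16H22O11) and the disaccharide octaacetate (C28H38O19), uses standard monoisotopic masses, and neglects the electron mass; the function name is illustrative and not part of the original work.

```python
# Monoisotopic masses (u) of the most abundant isotopes, plus carbon-13.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
MASS_13C = 13.00335484


def sodium_adduct_mz(n_c, n_h, n_o, n_13c=0):
    """Approximate m/z of [M + Na]+ for a CxHyOz species.

    n_13c is the number of carbons replaced by carbon-13.
    The electron mass (about 0.0005 u) is neglected.
    """
    neutral = n_c * MASS["C"] + n_h * MASS["H"] + n_o * MASS["O"]
    neutral += n_13c * (MASS_13C - MASS["C"])
    return neutral + MASS["Na"]


# Assumed formulas: glucose pentaacetate C16H22O11,
# maltose/trehalose octaacetate C28H38O19.
glc_13c6 = sodium_adduct_mz(16, 22, 11, n_13c=6)
disaccharide = sodium_adduct_mz(28, 38, 19)
disaccharide_13c6 = sodium_adduct_mz(28, 38, 19, n_13c=6)

print(f"labeled glucose:            {glc_13c6:.1f}")
print(f"unlabeled disaccharide:     {disaccharide:.1f}")
print(f"disaccharide + 6 x 13C:     {disaccharide_13c6:.1f}")
```

Under these assumptions, the calculated values fall within a few tenths of a mass unit of the observed peaks, and incorporation of one U-13C glucose unit would raise the disaccharide signal by about 6 Da.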

Similar results were obtained for the reaction mixture of D-glucose-6,6’-d2, maltose and TreS. These results unambiguously demonstrated that, during TreS catalysis, there is no incorporation of exogenous glucose into the disaccharide reactants. This is fully consistent with earlier studies of other TreS enzymes, and all support an intramolecular “ring flipping” mechanism (Scheme 3.4).

Figure 3.18: TLC analysis of reactions containing 10 mM U-13C-D-glucose, 2 mM maltose and (a) 0.14 μM TreS enzyme or (b) no enzyme at 37 °C for 24 hours, followed by lyophilization, acetylation and aqueous workup. Lane 1: per-O-acetylated maltose; Lane 2: per-O-acetylated trehalose; Lane 3: reaction mixture; Lane 4: per-O-acetylated glucose. The TLC (aluminum backed sheets of silica gel) was developed using the mixed solvent of diethyl ether/petroleum ether (v/v = 2:1).


Figure 3.19: Mass spectrum of the reaction containing 10 mM U-13C-D-glucose, 2 mM maltose and 0.14 μM TreS enzyme at 37 °C for 24 hours, followed by lyophilization, acetylation and aqueous workup.

3.5.2 Approach 2: Attempted TreS-catalyzed transglycosylation of glucose from αGlcF onto glucose

αGlcF has been shown to be a very good substrate of TreS. The small size of the anomeric fluoride essentially leaves the aglycone subsite empty. If exogenous glucose is present and can enter the active site of TreS, it is quite possible that transglycosylation might occur, forming either maltose or trehalose as the product. In order to test this hypothesis, 15 mM glucose and 5 mM αGlcF were incubated with TreS at room temperature for 28 hours. Another sample containing all the components except the enzyme was also set up as a control. TLC analysis of both mixtures is shown in Figure 3.20. Comparison of the control (Lane 1) and the TreS reaction (Lane 2) clearly showed that most of the αGlcF had disappeared in the presence of TreS, with formation of glucose as the only product. No transglycosylation had occurred, again supporting the intramolecular rearrangement mechanism of TreS (Scheme 3.4).

Figure 3.20: TLC analysis of the reaction of 15 mM glucose, 5 mM αGlcF and 0.35 μM TreS enzyme at room temperature for 28 hours. Lane 1: control sample (no enzyme); Lane 2: TreS reaction; Lane 3: maltose. The positions of the αGlcF and glucose spots are marked. The TLC (aluminum backed sheets of silica gel) was developed using the mixed solvent of EtOAc/MeOH/water (v/v/v = 7:2:1).

Flipping of a substrate inside an enzyme active site during an isomerization reaction is not without precedent. A very well-studied example is that of phosphoglucomutase (PGM), which catalyzes the interconversion of glucose 1-phosphate (G1P) and glucose 6-phosphate (G6P) and plays a central role in energy metabolism.245 An active site serine in PGM must be phosphorylated for the enzyme to be properly functional.246 Upon binding of the substrate, the phosphate is transferred from the serine residue to the free 6-hydroxyl of G1P or the 1-hydroxyl of G6P to make a dephosphoenzyme/glucose-1,6-diphosphate (G-1,6-diP) complex. After a reorganization event, the other phosphoryl moiety is transferred back to the serine residue to generate the alternate glucose monophosphate product, as shown in Scheme 3.8. The details of this reorganization process have attracted considerable interest and two different scenarios have been proposed.247 The first one, named the exchange mechanism, proposes that two binding modes of G-1,6-diP are present in PGM’s active site. After formation of G-1,6-diP, the intermediate flips over inside the active site and adopts a second binding mode, which places the other phosphoryl group next to the active site serine so that phosphoryl transfer can occur. This mechanism, interestingly, is very similar to that which we propose for TreS. The second mechanism, named the minimal motion mechanism, involves a conformational change of the enzyme once the G-1,6-diP is formed inside the active site. It argues that G-1,6-diP has a single binding site and that the conformational change places the free serine next to the phosphoryl group for transfer. So far all the studies have supported the exchange mechanism, in which two binding modes of G-1,6-diP exist in the active site.248-251 As Jeremy Knowles noted, the construction of a catalytic site is much more difficult than the construction of a substrate binding site.246

Scheme 3.8: Catalytic mechanism of PGM (interconversion of G-1-P and G-6-P on the PGM enzyme via the G-1,6-diP intermediate).

3.6 Conclusions

TreS is an enzyme that catalyzes the reversible interconversion of maltose and trehalose. Several pieces of evidence support a double displacement mechanism involving an intramolecular “glucose flipping” step as the catalytic mechanism of this enzyme. The pH profile indicates that two ionizable groups are involved in catalysis, consistent with the nucleophile and general acid/base residues. Both 5FGlcF and 5FIdoF have been demonstrated to form covalent glycosyl-enzyme intermediates and this led to identification of the nucleophile as D230, consistent with the assignment in other GH13 glycosidases. The inability of TreS to carry out transglycosylation reactions onto exogenously added acceptors firmly establishes the intramolecular nature of the rearrangement reaction, consistent with previous studies on other TreS enzymes. Overall, TreS is a very typical GH13 transglycosidase containing several highly conserved short regions in its primary sequence, with its most interesting feature, perhaps, being its ability to exclude exogenous acceptors from its catalytic site after formation of its glycosyl-enzyme intermediate. Further structural studies of this enzyme and its complexes with various ligands are needed to reveal details of the “glucose flipping” process in its catalytic mechanism.


Chapter 4: Mechanistic studies of Endo-α-N-acetylgalactosaminidase from Streptococcus pneumoniae

4.1 Background

4.1.1 Protein glycosylation

Glycosylation is the most common and complex post-translational modification of proteins. In fact more than half of the proteins in organisms are found to be glycosylated.252 Consistent with the wide occurrence of glycosylation, the carbohydrate moieties of glycoproteins have been shown to play many vital biological roles such as modulating three dimensional structures of proteins, maintaining stability of glycoproteins, acting as crucial elements in protein-protein interactions, contributing to the specific activity of signaling molecules (such as cytokines) and modulating enzyme activities.253 Given these various important functions, it is not surprising to see that glycosylation is the most diverse protein modification, both in terms of the amino acid which can be modified and the carbohydrate structures attached. Depending on the site of attachment, protein glycosylation primarily occurs in two forms, as N-linked glycans and O-linked glycans.254

N-glycans are covalently linked to the Asn residue of a polypeptide chain via a β-glycosylamine bond,255 within a strict consensus sequence of Asn-X-Ser/Thr (X = any amino acid except Pro). However, the occurrence of such a sequence does not guarantee the existence of N-glycosylation. The biosynthesis of N-glycosylated proteins involves assembly of the polypeptide chain and the core oligosaccharide donor (with a lipid pyrophosphate group as the aglycone) individually in the endoplasmic reticulum (ER).255,256 A multi-subunit enzyme named an oligosaccharyltransferase (OTase) is then responsible for transferring the sugar moiety from the lipid-linked oligosaccharide precursor onto the selected Asn residues of the nascent polypeptide.257 This protein-bound N-glycan is then subjected to trimming and modification, both in the ER and Golgi apparatus, before becoming the final glycan found in the mature protein.

O-linked glycans are found in a great variety of glycoproteins such as mucins, fetuin, glycophorin and antifreeze glycoproteins.254 The most common mucin-type O-glycans are attached via an α-glycosidic linkage between a GalNAc moiety at the reducing end of an oligosaccharide and selected serine or threonine residues of proteins.258 In contrast to N-glycans, the biosynthesis of mucin-type O-glycans starts with the transfer of a single sugar (GalNAc) from UDP-GalNAc to the Ser/Thr residues by a class of enzymes called polypeptide GalNAc transferases (PP-GalNAc-T, EC: 2.4.1.41), as shown in Figure 4.1.252 Despite extensive statistical studies, no strict consensus sequence has been identified for O-glycosylation, possibly due to the multiplicity of GalNAc transferases with different substrate specificities. Nonetheless, computational methods which can predict the O-glycosylation sites of proteins have been developed to bridge the gap between the large number of known protein sequences and the small number of proteins whose glycosylation profiles have been experimentally investigated.259 After GalNAc is transferred onto the protein, further elongation of this monosaccharide can be achieved by different glycosyltransferases to yield eight core structures.252 One of the most common core subtypes, the Core 1 type O-glycan, is generated by the addition of a galactose in a β-(1,3) linkage to the GalNAc by the corresponding β-1,3-galactosyltransferase (β1,3GalT) (Figure 4.1). Core 1 type O-glycans have been found to be widespread in mucin glycoproteins, which are abundant in the intestinal tracts of humans and animals. The disaccharide involved in Core 1 structures, Gal-β-(1,3)-GalNAc, is sometimes referred to as the Thomsen-Friedenreich antigen or simply T-antigen (T-Ag).260 The eight core structures can be further elongated or terminated (for example, by sialylation) to generate a large number of O-glycan structures.

Figure 4.1: General biosynthesis of Core 1 subtype O-glycans (sequential action of PP-GalNAc-T and β1,3GalT on the protein).

4.1.2 Endo-α-N-acetylgalactosaminidases (endo-α-GalNAcase): GH101 family

Endo-α-N-acetylgalactosaminidase (EC: 3.2.1.97) belongs to a special class of glycosidases which can hydrolyze Core 1 type O-glycans from various glycoproteins. This enzyme activity was first demonstrated in the filtrates of Clostridium perfringens.261 Soon after this, evidence of the presence of such enzyme activity in the filtrates of Streptococcus pneumoniae was also presented.262 To date, endo-α-GalNAcases from Clostridium perfringens,261 S. pneumoniae,262 Alcaligenes sp.,263 Streptomyces sp.,264 Bacillus sp.265 and Bifidobacterium longum266 have been purified and characterized. A new glycoside hydrolase family, GH101, was established to accommodate these enzymes. Since they can cleave off the Core 1 type O-glycan without damaging the proteins, GH101 enzymes serve as useful tools to study the structures of O-glycans and elucidate their biological roles. In addition, these enzymes also hold the potential to act as de novo glycosylating proteins when used in transglycosylation mode, possibly in engineered versions. Therefore, there is considerable interest in the detailed characterization of the catalytic machinery of GH101 enzymes. Our collaborators, Dr. Warren W. Wakarchuk and Ms. Lisa Willis (National Research Council, Ottawa, ON), successfully cloned, expressed and purified the endo-α-N-acetylgalactosaminidase (SpGH101) from S. pneumoniae R6 and kindly provided the wild type SpGH101 enzyme and its various mutants for this mechanistic study.

S. pneumoniae is a gram-positive organism that colonizes the respiratory tract of humans as a commensal organism but can become a pathogen that causes pneumonia, meningitis, bacteraemia and otitis media with significant morbidity and mortality, especially in children.267 Because S. pneumoniae is considered a leading cause of death worldwide, a great deal of research has been directed to determining the factors that contribute to the virulence of this organism. Of major importance to this pathogen is the ability to degrade host glycoproteins, and to metabolize the resultant carbohydrates.268 Consistent with this, S. pneumoniae has a large number of glycoside hydrolases, with 21 CAZy glycoside hydrolase families represented by at least one gene in the sequenced reference strains.7 These hydrolases are both secreted and cell-associated. Indeed the genome sequences of the reference strain S. pneumoniae R6 and of a clinical isolate contain at least 6 cell surface-localized enzymes that are thought to play a role in virulence.269-271 Besides SpGH101, these cell-associated enzymes include , , β-galactosidases, and endo-β-N-acetylglucosaminidases, which have been studied in some detail.272 Although the SpGH101 enzyme has not been

identified on its own as a virulence factor,273 it may play a role in allowing the organism to remain commensal as the enzyme is secreted into the extracellular space, where mucin-type glycoproteins rich in Gal-β-1,3-GalNAc-α-Ser/Thr are abundant. Therefore this enzyme will aid the adherence of Streptococcus to human airway epithelial cells.273

The substrate specificities of several GH101 enzymes, including those of endo-α-GalNAcases from Clostridium perfringens (EngCP),274 B. longum (EngBF),266 Bacillus sp. (EngBA)265 and SpGH101,275 have been studied in some detail using both synthetic pNP-α-glycosides and native glycoproteins as the substrates. These studies confirmed the ability of all four endo-α-GalNAcases to remove the T-Ag disaccharide from both the artificial and the natural substrates, illustrating their fairly flexible aglycone specificities. However, three of the four enzymes, SpGH101, EngBF and EngBA, have been shown to have very strict requirements for the T-Ag disaccharide as the glycone.265,266,275 EngCP, on the other hand, preferentially hydrolyzes Core 1 type glycone structures but can also cleave other types of O-glycans, possibly reflecting its slightly different active site architecture from those of SpGH101 and EngBF. NMR monitoring of the reaction of pNP-TAg and EngBF demonstrated that it is a retaining glycoside hydrolase,266 implying a double displacement mechanism for this class of enzymes. Kinetic analyses of EngBF mutants at the conserved carboxylic amino acids (Glu and Asp) have suggested candidate catalytic residues. However, the exact assignment of each catalytic residue was not achieved.266

Recently, crystal structures of two GH101 enzymes, SpGH101 and EngBF, were solved independently by two research groups.276,277 The overall structures of these two enzymes are very similar. Both of them are multi-modular proteins with a catalytic domain as well as domains that resemble carbohydrate binding modules (CBMs, Figure 4.2). These include a CBM32 module, along with degenerate CBM4 and CBM16-like folds, plus modules that resemble a degenerate legume lectin fold. The catalytic domain in both cases is a distorted (β/α)8 barrel flanked by a domain of all β-sheet structure, very analogous to the arrangement seen in GH13 α-amylases. This flanking domain likely serves to shield the active site from water during catalysis.

Figure 4.2: The overall architecture of SpGH101. (a) A stereo representation of the SpGH101 monomer shown in a cartoon representation, with domains coloured separately. (b) A schematic illustrating the SpGH101 domain organization.

Sequence similarity between GH101 enzymes is highest in the catalytic domain, with many of the conserved residues being in the active site pocket (Figure 4.3). Earlier kinetic analyses of mutants made at the sites of conserved Glu and Asp residues of the EngBF enzyme had suggested two conserved residues to be important for enzyme activity, D682 and D789 (EngBF numbering), which correspond to D658 and D764 (SpGH101 numbering), respectively.266 However, inspection of the active site of the SpGH101 structure (Figure 4.4) with the substrate modeled into place, based on the closely related α-amylase structure, suggested that the likely catalytic residues are instead D764 (nucleophile) and E796 (acid/base).276 The third catalytically important residue, D658, was proposed to be involved in substrate binding based on the SpGH101 structure modeled with a T-Ag substrate. Later elucidation of the three-dimensional structure of EngBF also confirmed that Asp789 and Glu822 (corresponding to D764 and E796 of SpGH101) are suitably positioned to act as the catalytic nucleophile and general acid/base catalyst, respectively.277 Mutation of either residue indeed severely lowers the enzymatic activity. However, no conclusive kinetic evidence was given to prove the exact roles of these residues.

Figure 4.3: Sequence alignment of the proposed catalytic domain of different GH101 enzymes. Representative sequences from GH101 were used for a multiple sequence alignment using Clustal 2.0, displayed with Genedoc and the section proposed as the catalytic domain was coloured for those residues that are absolutely conserved in this region. Sequences are: Arthrobacter aurescens TC1 genebank ABM10140.1; Bacillus cereus AH187 genebank ACJ78453.1; Bifidobacterium longum subsp. longum JCM 1217 genebank AAX44931.1; Clostridium perfringens str. 13 BAB80399.1; Enterococcus faecalis genebank BAG75105.1; Propionibacterium acnes KPA171202 genebank AAT83312.1; Streptococcus pneumoniae R6 genebank AAK99132.1; Streptomyces coelicolor A3(2) genebank CAA20079.1. (Figure courtesy of Dr. Warren W. Wakarchuk)


Figure 4.4: Structural representation of the catalytic domain of SpGH101. Those residues which are absolutely conserved were mapped onto the alpha-carbon backbone (yellow) from PDB 3ECQ of the catalytic domain by coloring them blue. Those absolutely conserved residues which are in the active site pocket are represented by the stick structures on this backbone. Residues displayed are F618; H657; D658; N692; H694; E699; Y701; W724/726; Y762; D764; V765; E796; W797; W810; Y816; W867. W residues are magenta; E/D residues are red; H residues are orange; Y residues are green; F residues are yellow; other residues are CPK colored. Images were made in Pymol. (Figure courtesy of Dr. Warren W. Wakarchuk)

Another very interesting property of GH101 enzymes is their transglycosylation activity. There is considerable interest in understanding the biological roles of protein glycosylation due to its wide occurrence.278 However, native glycoproteins are usually heterogeneous and occur in multiple glycoforms, greatly hampering detailed structure-activity studies. On the other hand, the production of homogeneous glycoproteins is still a very immature field.279 Although there has been significant progress recently in producing such materials by using approaches such as glycosylation pathway engineering in microorganisms, chemical synthesis and enzymatic glycan re-modeling, there is no unified method for this purpose.279 Developing suitable preparation methods for homogeneous glycoproteins still depends on many factors such as the size of the peptide, the extent of glycosylation and the structure of the glycans. In this regard, the use of GH101 enzymes in the transglycosylation mode would provide a convenient way of constructing O-glycopeptide linkages. Retaining glycosidases have been applied for the synthesis of glycosidic linkages for a long time, both in their wild type forms and in engineered versions.280,281 In fact, various wild type GH101 enzymes such as endo-α-GalNAcases from Bacillus sp.,265,282 Streptomyces sp.,264 EngBF266 and SpGH101283 have been used to generate many T-Ag containing glycoconjugates. The acceptors which could be used by these enzymes include common monosaccharides (glucose, mannose, galactose, arabinose, fucose), disaccharides (maltose, sucrose), simple alcohols (methanol, ethanol, 1-propanol, 1-butanol, 1-pentanol, glycerol, allyl alcohol), amino acids (serine, threonine) and, most notably, a serine-containing hexapeptide. This again confirms the relative flexibility of the aglycone requirements of GH101 enzymes. However, transglycosylation yields in most cases were modest (~30% or less) and, in the case of the T-Ag linked hexapeptide generated by the endo-α-GalNAcase from Streptomyces sp., only a 10% yield of the glycopeptide was obtained.264 Presumably one of the most important reasons for this poor yield is product hydrolysis, since the transglycosylated product could in turn serve as a substrate of the wild type GH101 enzyme. From this perspective, mutant GH101 enzymes hold promise for synthesis, via the glycosynthase, thioglycosynthase or O-glycoligase approach.280,281 However, up to now, there has been no report on the synthesis of glycopeptide bonds using mutant GH101 enzymes, raising concerns about the feasibility of this approach and underlining the need for a detailed mechanistic analysis of this class of enzymes.

4.2 Specific aims of this study

Detailed kinetic studies of wild type SpGH101 and its various mutants

Although there have been several kinetic studies of GH101 enzymes, as mentioned in the introduction, no detailed mechanistic studies of wild type and mutant enzymes to assign the exact identities of the key catalytic residues have been reported. Such information is very valuable not only mechanistically, but also from the perspective of engineering these enzymes into useful biocatalysts for the synthesis of glycopeptides. The purpose of my studies is therefore to identify the catalytic nucleophile and the acid/base residue through detailed kinetic and mechanistic studies of mutants modified at candidate residues. Substrates with different leaving groups will first be employed to quantify kinetic parameters of both wild type and mutant enzymes. Chemical rescue experiments will also be carried out to confirm the functional roles of the key catalytic residues.

4.3 Detailed mechanistic studies of SpGH101

4.3.1 Chemo-enzymatic synthesis of the substrates of SpGH101

In order to assay both the wild type and mutants of SpGH101, an activated substrate such as a dinitrophenyl glycoside is preferred. Since most of the GH101 enzymes have very strict glycone specificity, it is important to keep the intact T-Ag disaccharide as the glycone. Therefore, 2,4-dinitrophenyl 2-acetamido-2-deoxy-3-O-[β-D-galactopyranosyl]-α-D-galactopyranoside (DNPTAg) was designed as the optimal SpGH101 substrate, as shown in Figure 4.5. The 2,4-dinitrophenyl glycoside was chosen because the greater leaving group ability of 2,4-dinitrophenolate (pKa = 4.0) makes it a more reactive substrate than its p-nitrophenyl analog (pKa = 7.2). In addition, the lower pKa of 2,4-dinitrophenolate ensures that its extinction coefficient does not change significantly with pH above pH = 5 and that the assay is much more sensitive than with the p-nitrophenyl substrate at pH 6.5, which is the optimal pH for SpGH101.
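The sensitivity argument can be illustrated with the Henderson-Hasselbalch relation. The sketch below assumes that the assay monitors the released phenolate anion and that each phenol behaves as a simple monoprotic acid with the pKa values quoted above; the function name is illustrative only.

```python
def fraction_ionized(pka, ph):
    """Fraction of a monoprotic acid present as its conjugate base
    at a given pH (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values as quoted in the text; pH 6.5 is the assay pH.
for name, pka in [("2,4-dinitrophenol", 4.0), ("p-nitrophenol", 7.2)]:
    print(f"{name}: {fraction_ionized(pka, 6.5):.3f} ionized at pH 6.5")
```

With these pKa values, nearly all of the released 2,4-dinitrophenol is in the phenolate form at pH 6.5, whereas only a minor fraction of p-nitrophenol is ionized, which is why the absorbance signal per mole of product is expected to be larger and less pH-dependent for the DNP substrate.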

Figure 4.5: Several SpGH101 substrates: DNPTAg (4.1), pNPTAg (4.2), DNPGalNAc (4.3) and TAg-IFN (4.4; peptide H2N-Val-Gly-Val-Thr-Glu-Thr-Pro-CONH2).

The synthesis of DNPTAg is very straightforward using a combined chemo-enzymatic approach. The group of our collaborator, Dr. Warren W. Wakarchuk, has recently cloned a β-1,3-galactosyltransferase (CgtB) from the bacterium Campylobacter jejuni, which showed flexible specificity toward a diverse range of acceptors including both α- and β-configured GalNAc structures.284 Therefore we first synthesized DNPGalNAc (4.3) chemically using the convenient route developed by Dr. Hongming Chen240 and then transferred a galactose onto it using CgtB.

The chemical synthesis of DNPGalNAc, as shown in Scheme 4.1, was very similar to the method employed for the preparation of DNPGlc (Scheme 3.5). Very briefly, it also started with installation of the 2,4-dinitrophenyl group onto the anomeric center of GalNAc. 3,4,6-Tri-O-acetyl GalNAc (4.5) was reacted with Sanger’s reagent in the presence of K2CO3 to provide the desired α-glycoside product 4.6 in a good yield (53%).240,241 The presence of the α-configured DNP group was clearly demonstrated by the small coupling constant JH1,H2 = 3.5 Hz. Due to the instability of the anomeric DNP group under basic conditions, weakly acidic conditions were employed to deprotect the α-glycoside 4.6, and DNPGalNAc (4.3) was thereby generated in quantitative yield. Compound 4.3 itself was also tested as a substrate of SpGH101.

Scheme 4.1: Chemo-enzymatic synthesis of DNPTAg (4.1) from 4.5 via 4.6 and 4.3. i) 1-Fluoro-2,4-dinitrobenzene, K2CO3, DMF, 53% yield; ii) MeOH, acetyl chloride (4% v/v), 99% yield; iii) UDP-galactose, MnCl2 (10 mM), dithiothreitol (DTT, 1 mM), NaOAc buffer (50 mM, pH = 6.0), wild type CgtB enzyme (0.1 mg/ml), 52% yield.

Addition of the galactosyl moiety was achieved with the β-1,3-galactosyltransferase CgtB (CAZy GT2 family), which catalyzes transfer of galactose from UDP-Gal to a terminal N-GalNAc residue in C. jejuni strain OH4384 lipooligosaccharide.284 CgtB has been demonstrated to utilize both α- and β-configured GalNAc acceptors and is thus useful for synthesis of the Galβ1,3GalNAc subunit common both to the ubiquitous Core 1 O-glycans and the ganglioside glycolipids. CgtB was purified as a fusion protein with maltose-binding protein (MalE) from E. coli according to the literature procedure.284 Equal amounts of the donor (UDP-Gal) and the acceptor (DNPGalNAc) were incubated with CgtB in the presence of MnCl2 (10 mM) and DTT (1 mM) at pH = 6.0 and room temperature overnight, as shown in Scheme 4.1. TLC analysis of the reaction mixture showed that a new UV-active spot appeared with an Rf value about half that of the glycosyl acceptor DNPGalNAc, as expected for the disaccharide product. The conversion of DNPGalNAc was estimated to be around 50% by TLC analysis after 24 hours. Since prolonged incubation did not further improve the yield, the reaction mixture was directly loaded onto a C-18 Sep-Pak Column (Waters) to remove all the buffer salts. The fractions containing both DNPGalNAc and DNPTAg from the C-18 column were then subjected to silica gel column chromatography to afford the pure DNPTAg (52% yield). High resolution mass spectra confirmed the molecular formula of the product as that expected for DNPTAg (C20H27N3O15). The β-configured inter-glycosidic bond was evidenced by the trans-diaxial coupling between H-1’ and H-2’ (J1’,2’ = 7.5 Hz). Further characterization of this compound can be found in Section 5.2.3.

Besides DNPGalNAc (4.3) and DNPTAg (4.1), two other substrates of SpGH101 were also obtained: pNPTAg (4.2) and TAg-IFN (4.4), as shown in Figure 4.5. The former substrate pNPTAg was directly purchased from Toronto Research Chemicals Inc. (Ontario, Canada). The latter glycopeptide, TAg-IFN (4.4), was provided by Dr. Warren W. Wakarchuk. TAg-IFN is a heptapeptide (primary sequence: VGVTETP) with a T-Ag disaccharide covalently linked to one of the threonines via an α-glycosidic linkage (Figure 4.5). It corresponds to positions 103 – 109 of the mature Interferon α-2b protein (165 amino acids), a recombinant human protein which is used clinically as an anti-viral drug.285 This should serve as a very good mimic of the natural substrates of SpGH101.

4.3.2 Determining stereochemical outcome of SpGH101 by NMR

The first step in characterizing a glycosidase mechanism involves determining the stereochemical outcome of the enzymatic reaction: inverting or retaining.5 The closely related enzyme EngBF was shown to be a retaining glycosidase by NMR monitoring of the hydrolysis of pNPTAg.266 We decided to confirm this result with the wild type SpGH101 enzyme using the same substrate pNPTAg in an NMR study.

The NMR spectrum of a 5 mM solution of pNPTAg is shown in Figure 4.6 (t = 0 min). The doublet at 5.82 ppm (J = 3.0 Hz) corresponds to the anomeric proton H-1 of pNPTAg. The small coupling constant is due to the coupling between the axial H-2 and the equatorial H-1. The large peak (δ = 4.80 ppm) is the residual water peak. Four minutes after wild type SpGH101 enzyme was added, the doublet (δ = 5.82 ppm) had disappeared (Figure 4.6, t = 4 min) and a new peak (δ = 5.21 ppm, J = 3.1 Hz) had appeared. This resonance arises from the α-anomer of the hydrolyzed product (T-Ag disaccharide), as the small coupling constant is again clearly indicative of an equatorial H-1. After nine minutes of reaction, another doublet (δ = 4.69 ppm, J = 8.5 Hz) from the β-anomer of the hydrolyzed product gradually appeared, as shown in Figure 4.6 (t > 9 min). This is due to mutarotation of the α-anomer of the T-Ag disaccharide into its β-anomer; the large J is the result of the trans-diaxial coupling between H-1 and H-2. Therefore, it is evident that SpGH101 catalyzes the hydrolysis of the substrate pNPTAg with retention of configuration at the anomeric center, consistent with results obtained for EngBF.

Figure 4.6: 1H-NMR determination of the anomeric stereochemistry of the products of the reaction between wild type SpGH101 and pNPTAg. pNPTAg (5 mM) was dissolved in 0.5 ml of SpGH101 enzyme buffer. The spectra were acquired after addition of 0.16 mg enzyme. Labeled resonances: H-1α (pNPTAg), H-1α (TAg) and H-1β (TAg).
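The anomeric assignments rest on the size of the H-1/H-2 coupling constant. As a rough illustration only, the rule can be written as a small helper; the 5 Hz cut-off is an assumed, illustrative threshold (not a value from this work) and the rule applies only to pyranosides with an axial H-2, such as the galacto-configured sugars discussed here.

```python
def h1_orientation(j_h1_h2_hz, threshold_hz=5.0):
    """Classify the anomeric proton from J(H-1, H-2) for a pyranoside
    with an axial H-2.

    A small coupling indicates an equatorial H-1 (alpha anomer here);
    a large coupling indicates a trans-diaxial arrangement (beta anomer).
    threshold_hz is an assumed, illustrative cut-off.
    """
    if j_h1_h2_hz < threshold_hz:
        return "alpha (equatorial H-1)"
    return "beta (axial H-1)"


# Coupling constants reported in the text for Figure 4.6.
signals = {
    "pNPTAg, 5.82 ppm": 3.0,
    "T-Ag product, 5.21 ppm": 3.1,
    "T-Ag product, 4.69 ppm": 8.5,
}
for label, j in signals.items():
    print(f"{label}: J = {j} Hz -> {h1_orientation(j)}")
```

Applied to the three doublets, this reproduces the assignments given above: the substrate and the first-formed product are α, and the later signal belongs to the β-anomer formed by mutarotation.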

4.3.3 Detailed kinetic studies of SpGH101 enzymes

The following SpGH101 enzymes were provided by Dr. Warren W. Wakarchuk’s research group: a wild type SpGH101 with a 40 amino acid deletion at the N-terminus (SPOG07), a wild type SpGH101 with a 40 amino acid deletion at the N-terminus and a 200 amino acid deletion at the C-terminus (SPOG10), a D764A mutant of SPOG07 (proposed nucleophile mutant), an E796A mutant of SPOG07 (proposed acid/base mutant) and an E796Q mutant of SPOG07 (proposed acid/base mutant).

4.3.3.1 Kinetic analysis of wild type SpGH101 enzymes and pH dependence study

Kinetic parameters for the hydrolysis of DNPGalNAc, DNPTAg and pNPTAg by SPOG07 and SPOG10 were determined and are shown in Table 4.1. Interestingly, the deletion of 200 amino acid residues from the C-terminus in SPOG10 vs. SPOG07 increased its kcat value by 40%. However, the overall catalytic efficiency (kcat/Km) of these two wild type SpGH101 enzymes was essentially the same. SPOG07 was therefore used for the rest of this study as the representative “wild type” SpGH101. The high specificity of SPOG07 for the T-Ag disaccharide was shown by the 26,500-fold greater kcat/Km for the disaccharide substrate DNPTAg than for the monosaccharide DNPGalNAc. This large difference in specificity is due not only to a higher affinity, as seen by the 70-fold decrease in Km, but also to a 400-fold increase in kcat. Surprisingly, the turnover number for hydrolysis of pNPTAg by SPOG07 is very similar to that of DNPTAg, even though a much worse leaving group (pNP) is present at the anomeric center of pNPTAg, implicating deglycosylation as the rate-determining step for both substrates.

Table 4.1: Kinetic parameters for SpGH101 and its various mutants

Enzyme    Substrate    Km (μM)       kcat (s⁻¹)     kcat/Km (s⁻¹ μM⁻¹)
SPOG07    DNPTAg       34 ± 2        541 ± 14       15.9
SPOG10    DNPTAg       48 ± 3        748 ± 19       15.6
D658A*    DNPTAg       ND            ND             0.987
D764A     DNPTAg       33 ± 2        0.76 ± 0.02    0.023
E796A     DNPTAg       1.3 ± 0.1     18 ± 1         13.8
E796Q     DNPTAg       1.2 ± 0.1     14 ± 1         11.7
SPOG07    DNPGalNAc    2400 ± 100    1.6 ± 0.1      0.0006
E796A     DNPGalNAc    2100 ± 220    5.0 ± 0.2      0.002
SPOG07    pNPTAg       62 ± 3        579 ± 11       9.3
E796A     pNPTAg       21 ± 1        1.0 ± 0.1      0.048
E796Q     pNPTAg       212 ± 17      2.1 ± 0.1      0.0099

* data from Dr. Warren Wakarchuk
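The fold differences quoted for SPOG07 can be checked directly from the Km and kcat entries of Table 4.1. The following is a minimal sketch of that arithmetic; the dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# Km (uM) and kcat (s^-1) for SPOG07, transcribed from Table 4.1.
SPOG07 = {
    "DNPTAg": {"km_um": 34.0, "kcat": 541.0},
    "DNPGalNAc": {"km_um": 2400.0, "kcat": 1.6},
}


def efficiency(entry):
    """Return kcat/Km in s^-1 uM^-1 for one table entry."""
    return entry["kcat"] / entry["km_um"]


def fold_change(larger, smaller):
    """Return how many times larger the first value is than the second."""
    return larger / smaller


di, mono = SPOG07["DNPTAg"], SPOG07["DNPGalNAc"]
km_fold = fold_change(mono["km_um"], di["km_um"])    # decrease in Km
kcat_fold = fold_change(di["kcat"], mono["kcat"])    # increase in kcat
spec_fold = fold_change(efficiency(di), efficiency(mono))

# Ratio of the rounded kcat/Km entries exactly as printed in Table 4.1.
spec_fold_rounded = fold_change(15.9, 0.0006)

print(f"Km fold: {km_fold:.0f}, kcat fold: {kcat_fold:.0f}")
print(f"kcat/Km fold (unrounded Km and kcat): {spec_fold:.0f}")
print(f"kcat/Km fold (rounded table entries): {spec_fold_rounded:.0f}")
```

The rounded kcat/Km entries reproduce the 26,500-fold figure quoted in the text, whereas the unrounded Km and kcat values give a somewhat smaller ratio, because the DNPGalNAc entry (0.0006) carries only one significant figure.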

As explained in Section 3.3.4, all the ionizable functional groups inside the enzyme’s active site need to be in the correct ionization state in order for the enzyme to function optimally. Studies of pH dependence can therefore generate important mechanistic insights into the residues which are essential for enzyme catalysis. Stability tests were first carried out in which SPOG07 was incubated in buffers of different pH for a period of time before being assayed under standard kinetic conditions. The results showed that this enzyme is stable for at least 10 minutes from pH 5.2 to 8.4 (data not shown). This region was therefore selected for pH dependence studies.

The pH dependence of kcat/Km for SPOG07 was measured using the substrate depletion method33 at low substrate concentrations ([S] << Km). Time course studies were performed with 5 μM DNPTAg, which is well below its Km of (34 ± 2) μM. The individual kcat/Km values of SPOG07 at different pH values are shown in Table 4.2 and a plot of these values versus pH is shown in Figure 4.7.
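The substrate depletion method rests on the fact that when [S] << Km the Michaelis-Menten rate reduces to v = (kcat/Km)[E][S], so the substrate decays in a first-order fashion with k_obs = (kcat/Km)[E]. The sketch below simulates such a time course and recovers kcat/Km from the slope of ln[S] versus time. The enzyme concentration and sampling times are hypothetical, since the actual assay values are not given in this section.

```python
import math


def depletion_curve(s0_um, kcat_over_km, enzyme_um, times_s):
    """Simulate [S](t) = S0 * exp(-k_obs * t), valid when [S] << Km."""
    k_obs = kcat_over_km * enzyme_um  # pseudo-first-order rate constant, s^-1
    return [s0_um * math.exp(-k_obs * t) for t in times_s]


def fit_first_order(times_s, substrate_um):
    """Return k_obs (s^-1) from a least-squares line through ln[S] versus t."""
    ys = [math.log(s) for s in substrate_um]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, ys))
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    return -sxy / sxx  # slope is -k_obs


ENZYME_UM = 0.0005   # hypothetical enzyme concentration (0.5 nM)
S0_UM = 5.0          # starting DNPTAg concentration used in the text
TRUE_KCAT_KM = 15.9  # s^-1 uM^-1, the pH 6.5 value from Table 4.2

times = list(range(0, 301, 30))  # hypothetical sampling times in seconds
trace = depletion_curve(S0_UM, TRUE_KCAT_KM, ENZYME_UM, times)
k_obs = fit_first_order(times, trace)
recovered = k_obs / ENZYME_UM
print(f"k_obs = {k_obs:.5f} s^-1, kcat/Km = {recovered:.1f} s^-1 uM^-1")
```

Because only the ratio kcat/Km is obtained, this approach avoids the need to reach saturating substrate concentrations at every pH.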

Table 4.2: The individual kcat/Km values of SPOG07 at different pH values

pH                    5.2    5.6    6.1    6.5    7.0    7.5    8.0    8.4
kcat/Km (s⁻¹ μM⁻¹)    7.4    11.6   14.7   15.9   13.7   12.3   5.8    4.1

[Figure 4.7 plot: kcat/Km (vertical axis) versus pH (horizontal axis)]

Figure 4.7: Dependence of kcat/Km upon pH for hydrolysis of DNPTAg by wild type SpGH101. The data were fit to a double ionization pH curve by nonlinear regression using the program GraFit 5.0.13 (Erithacus Software Limited, 2006).

The bell shaped curve obtained from the plot of kcat/Km versus pH indicated that two ionizable groups are involved in catalysis, with pKa values of 5.3 ± 0.1 and 7.8 ± 0.1. This result is consistent with the classical double displacement mechanism adopted by retaining glycosidases.286 The smaller pKa value (5.3 ± 0.1) is believed to belong to the catalytic nucleophile and the larger value (7.8 ± 0.1) should represent that of the general acid/base residue.
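A bell-shaped pH profile of this kind is commonly described by the standard double ionization equation, k(pH) = k_lim / (1 + 10^(pKa1 - pH) + 10^(pH - pKa2)). The sketch below evaluates that equation with the two pKa values quoted in this section. The exact parameterization used by GraFit is not stated in the text, and the pH-independent limit k_lim used here is a hypothetical value chosen only for illustration.

```python
def bell_curve(ph, limit, pka1, pka2):
    """Double ionization model: activity requires group 1 deprotonated
    and group 2 protonated."""
    return limit / (1.0 + 10.0 ** (pka1 - ph) + 10.0 ** (ph - pka2))


def ph_optimum(pka1, pka2):
    """The model is maximal midway between the two pKa values."""
    return (pka1 + pka2) / 2.0


PKA1, PKA2 = 5.3, 7.8  # values quoted in the text
LIMIT = 17.0           # hypothetical pH-independent kcat/Km (s^-1 uM^-1)

for ph in (5.2, 5.6, 6.1, 6.5, 7.0, 7.5, 8.0, 8.4):
    print(f"pH {ph:.1f}: model kcat/Km = {bell_curve(ph, LIMIT, PKA1, PKA2):.1f}")
print(f"Predicted optimum: pH {ph_optimum(PKA1, PKA2):.2f}")
```

In this model the activity falls to roughly half of its limiting value when the pH equals either pKa, which is how the two ionization constants are read off the ascending and descending limbs of the curve.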

4.3.3.2 Kinetic analysis of catalytic residue mutants of SpGH101.

Based on a model of the binding mode of the substrate within the active site,276 the mutants D658A, D764A, and E796A/Q were constructed and analyzed using the disaccharide DNPTAg substrate, as shown in Table 4.1. The least active of these mutants is D764A, with a kcat/Km value some 700 times lower than that of the wild type enzyme. In fact this is likely a maximum estimate of the activity since, based upon the similarity of its Km value to that of the wild type enzyme, it is quite probable that even this low activity derives from a small amount of contaminating wild type enzyme. The next most active is the D658A mutant, with a kcat/Km that is 16-fold lower than that of the wild type enzyme. However, it did not show saturation kinetics with the DNPTAg substrate, thus the individual kinetic constants kcat and Km could not be determined. The most active were the E796A and E796Q mutants, with kcat/Km values essentially the same as that of the wild type enzyme. Importantly, however, their kcat values are 30- to 40-fold lower than that of the wild type enzyme, with the Km values correspondingly lowered. A different scenario was encountered when using pNPTAg to assay the E796A/Q mutants. Their kcat/Km values were significantly lower than that of SPOG07, and this primarily came from the much lowered kcat values of the mutants, which were roughly 300 – 500 fold smaller than that of the WT enzyme.

The monosaccharide substrate DNP-GalNAc was hydrolyzed at a significant rate only by the WT enzyme and the E796A mutant, with an extremely low rate being found for the E796Q mutant (data not shown). Interestingly, the kcat value for this substrate with the E796A mutant was actually 3-fold higher than for the WT enzyme, though the Km values were similar. However, these kcat values are some 300-400-fold lower than those of the disaccharide substrate, and the Km value some 80-fold higher. These high Km values are suggestive of the glycosylation step being rate-limiting. Interestingly, an exactly parallel situation was seen with the xylanase Cex from Cellulomonas fimi, where the “addition” of a glucose residue to a monosaccharide substrate was seen to change the rate-limiting step from glycosylation to deglycosylation, and a rationale for this was developed.37,88

4.3.3.3 Relative activities of wild type SpGH101 and its mutants with a “natural” glycopeptide substrate

In order to investigate the enzymatic activity of the active site mutants on a more natural substrate, a synthetic glycopeptide, TAg-IFN (4.4, Figure 4.5), was tested as a substrate with both SPOG07 and the E796A/Q mutants. This substrate contains a T-antigen disaccharide attached to the threonine side chain of a seven-amino-acid peptide via an α-glycosidic linkage. Upon the addition of 34 nM of WT SpGH101, 0.5 mM of TAg-IFN was rapidly cleaved, with hydrolysis being complete within three hours at room temperature, as is shown in Figure 4.8 (Lane 2 from left). However, the addition of either 290 nM of E796A (Lane 3 in Figure 4.8) or 220 nM of E796Q (Lane 4 in Figure 4.8) to 0.5 mM of TAg-IFN resulted in no observable cleavage of the glycosidic bond, even after incubation at room temperature for two days.

Figure 4.8: TLC image of various reactions between SpGH101 enzymes and 0.5 mM TAg-IFN. From left, Lane 1: control experiment with only 0.5 mM TAg-IFN present; Lane 2: reaction mixture of 0.5 mM TAg-IFN and 34 nM SPOG07 enzyme after three hours incubation at room temperature; Lane 3: reaction mixture of 0.5 mM TAg-IFN and 290 nM E796A mutant enzyme after two days incubation at room temperature; Lane 4: reaction mixture of 0.5 mM TAg-IFN and 220 nM E796Q mutant enzyme after two days incubation at room temperature. The developing solvent system was isopropanol: water: NH4OH (v/v/v) = 6:1:2. The developed TLC plate was first dipped in 1% (v/v) triethylamine in acetone followed by drying in air for 5 min. Then it was dipped into a fluorescamine solution (0.1 mg/ml in acetone) followed by drying in air again. Peptides were revealed as fluorescent spots by illuminating the TLC plate with UV light (long wave).

4.3.3.4 Chemical rescue of catalytic residue mutants of SpGH101

A diagnostic tool for identification of the catalytic residues of retaining glycosidases involves “rescuing” the activities of catalytic residue mutants upon the addition of exogenous nucleophilic anions such as azide, formate, acetate or fluoride.15,287 For the acid/base mutants, the formation of a glycosyl-enzyme intermediate can be achieved by using activated substrates for which deglycosylation is rate-limiting. In the absence of a general base catalyst, the turnover of such intermediates by water is very slow. However, in the presence of exogenous anions such as azide, which can occupy the vacant space created by the mutation, the glycosyl-enzyme intermediate turns over much faster by azidolysis than by hydrolysis (Figure 4.9 (A)). The overall result is that kcat is dramatically increased and a glycosyl azide product is formed with retention of anomeric configuration. For catalytic nucleophile mutants, which are usually inactive as hydrolases (>10⁶-fold decrease of activity compared with WT enzyme), the anions will again accelerate the cleavage of substrates by displacing the anomeric leaving group and forming a glycosyl azide product with inverted anomeric configuration (Figure 4.9 (B)). Both of these “rescue” strategies have been successfully applied to various retaining β-glycosidases to confirm the identities of both nucleophile and acid/base residues. Selected examples include Agrobacterium sp. β-glucosidase (Abg),28 Cellulomonas fimi exoglucanase/xylanase (Cex)32 and Cellulomonas fimi β-mannosidase (Man2A).31 Indeed, the azide rescue of the nucleophile mutant (E358A) of Abg set the stage for the discovery of glycosynthases.28 However, application of such methodologies to retaining α-glycosidases has met with only limited success. Only a modest enhancement of activity was observed when exogenous anions were added to the mutants of retaining α-glycosidases. In the case of CGTase, for example, the addition of azide only increased the activity of its acid/base mutant by 1.8-fold.288 Similarly, a two-fold increase of activity was obtained by adding sodium azide to the E233A mutant of HPA and no glycosyl azide product could be isolated.29

Figure 4.9: Mechanisms of anion rescue of (A) acid/base mutants and (B) nucleophile mutants of retaining glycosidases.

To investigate the roles of D764 and E796 in catalysis, assays were performed in the presence of exogenous anionic nucleophiles (Table 4.3). In the case of the proposed acid/base (E796) mutants, all the anions including azide, formate and acetate increased kcat in a dose-dependent manner, with the greatest increase being seen with azide and the E796Q mutant. A representative example of the activity of E796Q versus azide concentration is shown in Figure 4.10. The only anion that restored activity to the nucleophile mutant (D764A) was azide (data not shown), which has the highest nucleophilicity of those tested (azide, formate, acetate and fluoride). In order to rule out the possibility of these changes in kcat and Km simply resulting from salt effects, the kinetic parameters of the E796 mutants in the presence of either 100 mM or 500 mM NaCl were measured. In none of these cases was kcat increased or Km substantially altered by the addition of Cl-. Anion rescue experiments with the E796A/Q mutants were also attempted using the monosaccharide substrate DNP-GalNAc. However, no anion rescue was observed, completely consistent with the conclusion that the glycosylation step is rate-limiting for the monosaccharide substrates.

Table 4.3: Kinetic parameters for the cleavage of DNPTAg by the E796A and E796Q mutants of SpGH101 in the presence of different anions

Mutant / anion                     Km (μM)      kcat (s⁻¹)    kcat/Km (s⁻¹ μM⁻¹)
E796A (no anion)                   1.3 ± 0.1    18 ± 1        13.8
E796A + 100 mM NaN3                5.6 ± 0.6    33 ± 1        5.9
E796A + 100 mM NaOAc               2.6 ± 0.2    33 ± 1        12.7
E796A + 500 mM sodium formate      11 ± 1       48 ± 1        4.4
E796A + 100 mM NaCl                1.5 ± 0.1    17 ± 1        11
E796A + 500 mM NaCl                1.6 ± 0.1    14 ± 1        8.8
E796Q (no anion)                   1.2 ± 0.1    14 ± 1        11.7
E796Q + 500 mM NaN3                15 ± 1       74 ± 2        4.9
E796Q + 500 mM NaOAc               6.4 ± 0.6    32 ± 1        5.0
E796Q + 500 mM sodium formate      9.7 ± 0.8    32 ± 1        3.3
E796Q + 500 mM NaCl                1.7 ± 0.2    9.1 ± 0.3     5.4

Figure 4.10: Chemical rescue of the SpGH101 E796Q mutant at increasing concentrations of sodium azide.

In order to isolate the glycosyl azide product formed with the acid/base mutant, a reaction mixture containing 3.5 mM DNPTAg, 1 M sodium azide and 5 nM of SpGH101 E796Q mutant was prepared and left at room temperature overnight. TLC (EtOAc: MeOH: water = 7:2:1) analysis of this reaction mixture revealed a new non-UV active spot (Rf = 0.32), which is distinct from DNPTAg (Rf = 0.49) and TAg (Rf = 0.14). After purification, high resolution mass spectrometry confirmed that the new product has the molecular formula of the TAg-azide (Calcd. for C14H24N4O10 + Na+: 431.1390; Found: 431.1393). The α configuration of the anomeric azide was clearly demonstrated by the small J1,2 coupling constant of 4.2 Hz in the 1H-NMR spectrum of the purified product. The relatively large chemical shift (δ = 5.52 ppm) is also typical of an α-glycosyl azide, since a much smaller chemical shift would be expected for a β-glycosyl azide.289 1H-NMR (300 MHz, CD3OD) δ: 5.52 (1H, d, J1,2 = 4.2 Hz, H-1), 4.45 (1H, dd, J2,3 = 11.2 Hz, J1,2 = 4.2 Hz, H-2), 4.41 (1H, d, J1’,2’ = 7.5 Hz, H-1’), 4.20 – 3.43 (m, 11H), 1.89 (3H, s, COCH3).
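The calculated mass quoted for the sodium adduct can be reproduced by summing monoisotopic atomic masses. The sketch below does this for the TAg-azide formula; the small element table and the function names are illustrative, and the optional electron-mass correction is included only as an assumption about how some instruments report cation masses (the value quoted in the text matches the sum without that correction).

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.0030740,
    "O": 15.9949146,
    "F": 18.9984032,
    "Na": 22.9897693,
}
ELECTRON = 0.00054858


def parse_formula(formula):
    """Turn a simple formula such as 'C14H24N4O10' into element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def sodium_adduct_mass(formula, subtract_electron=False):
    """Monoisotopic mass of M + Na, optionally corrected for the lost electron."""
    counts = parse_formula(formula)
    mass = sum(MONOISOTOPIC[el] * n for el, n in counts.items())
    mass += MONOISOTOPIC["Na"]
    if subtract_electron:
        mass -= ELECTRON
    return mass


def ppm_error(found, calculated):
    """Deviation of the measured mass from the calculated one, in ppm."""
    return (found - calculated) / calculated * 1e6


calc = sodium_adduct_mass("C14H24N4O10")
print(f"Calcd: {calc:.4f}, deviation of 431.1393: {ppm_error(431.1393, calc):.2f} ppm")
```

The same function can be applied to the other sodium adducts reported in Chapter 5, for example `sodium_adduct_mass("C12H16F2O8")` for the difluoro hemiacetal.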

4.3.4 Discussion of the kinetic analysis of SpGH101 and its mutants

The kinetic and mechanistic data accumulated provide very strong evidence that D764 acts as the catalytic nucleophile (pKa = 5.3 ± 0.1) and that E796 functions as the acid/base catalyst (pKa = 7.2 ± 0.1) in the double-displacement mechanism followed by SpGH101. These are the roles that were proposed on the basis of modeling of substrate binding into the active site using the structure of an α-amylase/substrate complex as a guide.276 Strong support for the assignment of D764 as the catalytic nucleophile comes from the observation that mutation to alanine results in essentially complete abrogation of enzyme activity. Even with the most active 2,4-dinitrophenyl T-antigen substrate the activity of the D764A mutant was some 700-fold lower than that of the wild type enzyme. Indeed, even this activity was most probably due to contaminating wild type enzyme. Such significant loss of activity has been seen in essentially all retaining glycosidases when their catalytic nucleophiles have been mutated.15,287

Evidence in support of a role for E796 as the acid/base catalyst is particularly strong. The mutants E796A and E796Q efficiently cleave substrates containing the highly reactive dinitrophenyl leaving group, which does not need acid catalytic assistance, with kcat/Km values essentially identical to that of the wild type enzyme. By contrast, cleavage of both pNPTAg (with a much worse leaving group) and the natural, non-activated glycopeptide substrate, which do need acid catalytic assistance, was severely compromised. Indeed, no cleavage whatsoever of the glycopeptide substrate was observed by fluorescent imaging, even with almost 10 times as much enzyme and incubation for two days. This contrasting behavior with the two types of substrate is exactly what has been seen for other retaining glycosidases in which the acid catalyst is mutated, and is entirely consistent with that role.286

A more detailed inspection of the kinetic data for the dinitrophenyl glycoside also reveals very low Km values for the acid mutants. This is not a consequence of improved affinity per se, but rather indicates the accumulation of the glycosyl enzyme intermediate. This arises because the glycosylation step remains fast (as shown by kcat/Km values) as a consequence of the excellent leaving group ability of the dinitrophenyl group, but the deglycosylation step is slowed (as seen in kcat) due to the removal of general base catalysis.
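The link between a slowed deglycosylation step and a lowered Km follows from the standard steady-state expressions for a two-step mechanism, E + S ⇌ ES → E-Gly → E + P, with binding constant Ks, glycosylation rate constant k2 and deglycosylation rate constant k3: kcat = k2·k3/(k2 + k3), Km = Ks·k3/(k2 + k3) and kcat/Km = k2/Ks. The sketch below illustrates this with hypothetical rate constants chosen only to resemble the magnitudes in Table 4.1; they are not fitted or measured values.

```python
def steady_state(ks_um, k2, k3):
    """Return (kcat, Km) for E + S <=> ES -> E-Gly -> E + P.

    ks_um: dissociation constant of ES (uM)
    k2:    glycosylation rate constant (s^-1)
    k3:    deglycosylation rate constant (s^-1)
    """
    kcat = k2 * k3 / (k2 + k3)
    km = ks_um * k3 / (k2 + k3)
    return kcat, km


# Hypothetical constants for illustration only.
KS_UM = 100.0
K2 = 1600.0

wt_kcat, wt_km = steady_state(KS_UM, K2, k3=800.0)   # general base present
mut_kcat, mut_km = steady_state(KS_UM, K2, k3=18.0)  # general base removed

for label, kcat, km in (("wild type", wt_kcat, wt_km), ("mutant", mut_kcat, mut_km)):
    print(f"{label}: kcat = {kcat:.1f} s^-1, Km = {km:.2f} uM, "
          f"kcat/Km = {kcat / km:.1f} s^-1 uM^-1")
```

Lowering k3 alone reduces kcat and Km by the same factor and leaves kcat/Km unchanged, which is the pattern described above for the acid/base mutants with the dinitrophenyl substrate.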

Further, very strong evidence for E796 as the acid/base catalyst derives from the rescue of steady state activity seen in the presence of the anionic nucleophiles azide, formate and acetate. Increases in kcat of up to 5-fold were observed, with no accompanying increase in kcat/Km. This arises because azide is a much better nucleophile than water, but cannot attack the glycosyl-enzyme formed on the wild-type enzyme due to electrostatic screening from the deprotonated E796. Removal of that charge in the alanine mutant allows direct attack of azide, formate or acetate at the anomeric centre, with associated increases in steady state rate. Parallel studies with sodium chloride confirm that the rate increases are not due to salt effects.

The relatively small increases in steady state rate observed in these anion rescue experiments are completely consistent with findings on other α-glycosidases where increases of 5-10 fold have been seen.29,290 This contrasts with the much larger increases typically seen for catalytic acid mutants of β-glycosidases.31,291 The ‘saturation’ behaviour seen in such experiments (Figure 4.10) is not a reflection of saturable reversible binding of the anion, but rather is a consequence of a change in rate-limiting step at higher anion concentrations. The smaller increases seen for the α-glycosidases compared to those seen for the β-glycosidases are, then, a reflection of the inherent relative rates of glycosylation and deglycosylation steps in α- and β-glycosidases. This most likely arises primarily from the somewhat greater reactivity of β-glycosides than their α-anomers. Interestingly, no anion rescue was observed when DNPGalNAc was used as substrate, consistent with the indications (from high Km values) that the glycosylation step, rather than the deglycosylation step, is rate-limiting in that case. Further confirmation of the mechanism shown in Figure 4.9 (A) is provided by the isolation of the α-configured azide adduct of the T-antigen substrate from such reaction mixtures.
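The ‘saturation’ argument can be illustrated with a minimal model in which the anion simply adds a second-order pathway to the breakdown of the glycosyl-enzyme intermediate, so that the effective deglycosylation rate constant is k3 + kN[N]. This is an assumed simplification rather than the model used to analyze Figure 4.10, and all rate constants below are hypothetical.

```python
def kcat_with_anion(k2, k3_water, k_anion, anion_mm):
    """kcat when intermediate breakdown is accelerated by an added anion.

    k2:       glycosylation rate constant (s^-1)
    k3_water: hydrolysis of the intermediate without anion (s^-1)
    k_anion:  second-order rate constant for anion attack (s^-1 mM^-1)
    anion_mm: anion concentration (mM)
    """
    k3_eff = k3_water + k_anion * anion_mm
    return k2 * k3_eff / (k2 + k3_eff)


# Hypothetical constants for illustration only.
K2, K3_WATER, K_ANION = 90.0, 14.0, 0.5

concentrations = [0, 50, 100, 250, 500, 1000]
profile = [kcat_with_anion(K2, K3_WATER, K_ANION, c) for c in concentrations]
for c, k in zip(concentrations, profile):
    print(f"[anion] = {c:4d} mM: kcat = {k:.1f} s^-1")
```

In this picture kcat rises with anion concentration but levels off as it approaches k2, because glycosylation becomes rate-limiting. The maximum achievable rescue is therefore set by the ratio of the glycosylation and deglycosylation rate constants, not by anion binding.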

The mechanistic role of D658 is less clear. Structural alignment with α-amylase places this residue on top of Y82, a residue that has been suggested to form hydrophobic interactions with the sugar. Clearly such a role is not possible for D658 since this residue is not hydrophobic. Indeed, modeling would suggest the formation of hydrogen bonds between D658 and the axial hydroxyl of the substrate. The large increase in Km value (no saturation was seen) is consistent with a role in substrate binding, as is the ≈ 15-fold decrease in kcat/Km seen upon mutation. Indeed, in the related EngBF enzyme, a similar substrate model also suggests that this residue is involved in hydrogen bonding with the substrate.

4.4 Conclusions and future directions

The large (1767 amino acid) endo-α-N-acetylgalactosaminidase from S. pneumoniae (SpGH101) specifically removes an O-linked disaccharide Gal-β-1,3-GalNAc-α from glycoproteins. While the enzyme from natural sources has been used as a reagent for deglycosylating cells for many years, very few mechanistic studies have been performed. Using the recently solved 3-dimensional structure of the recombinant protein as a guide, we carried out a detailed mechanistic investigation of the SpGH101 retaining α-glycoside hydrolase using a combination of synthetic and natural substrates. Based on a model of the substrate complex of SpGH101, we proposed D764 and E796 as the nucleophile and general acid/base residues, respectively. These roles were confirmed by kinetic and mechanistic analysis of mutants at those positions using synthetic substrates and anion rescue experiments. pKa values of (5.3 ± 0.1) and (7.2 ± 0.1) were assigned to D764 and E796 on the basis of the pKa values derived from the bell-shaped dependence of kcat/Km upon pH. The mechanistic information gathered from these studies may help in engineering GH101 family enzymes into useful biocatalysts for the in vitro synthesis of T-antigen-containing glycoconjugates.


Chapter 5: Materials and Methods

5.1 Generous gifts and commercially available materials

5.1.1 Synthetic carbohydrates

2,3,4,6-Tetra-O-acetyl-D-glucopyranose (3.8) and 2-acetamido-3,4,6-tri-O-acetyl-2-deoxy-D-galactopyranose (4.5) were provided by Dr. Hongming Chen from our group.

5.1.2 Enzymes and peptides

Endoglycosidase F-cellulose binding domain fusion protein (endo-F) was provided by Ms. Emily Kwan from Dr. R. A. J. Warren’s group in the Department of Microbiology, UBC. Trehalose Synthase (TreS) was expressed and purified by Dr. Y.T. Pan from Dr. Alan Elbein’s group in the Department of Biochemistry and Molecular Biology, University of Arkansas, USA. All the SpGH101 enzymes (wild type and various mutants) and [TAg]-IFNα2b glycopeptides were provided by Dr. Warren Wakarchuk from the National Research Council of Canada. The E233A mutant of human pancreatic α-amylase (HPA) was a generous gift from Dr. Chunmin Li (Prof. Gary Brayer’s group).

5.1.3 Commercially available carbohydrates, buffer salts and enzymes

2-Chloro-4-nitrophenyl α-maltotrioside (CNPG3) was purchased from Genzyme Corporation, Cambridge, MA. 4-Nitrophenyl 2-acetamido-2-deoxy-3-O-[β-D-galactopyranosyl]-α-D-galactopyranoside (pNPTAg) was purchased from Toronto Research Chemicals Inc. (Ontario, Canada).

All buffer salts used in kinetic studies, as well as porcine pancreatic α-amylase (PPA, EC 3.2.1.1, Type I-A) were purchased from Sigma-Aldrich unless otherwise noted.

5.2 Synthesis

5.2.1 General methods

Unless otherwise noted, all the reagents were purchased from commercial chemical suppliers (Sigma-Aldrich, Fluka and Cambridge Isotope Laboratories) and were used without further purification. Reactions were monitored by thin layer chromatography (aluminum backed sheets of silica gel 60F254, 0.2 mm) and were visualized using UV light (254 nm) and by exposure to 10% ammonium molybdate in 2 M H2SO4 or by exposure to 10% sulfuric acid in methanol followed by charring. Flash column chromatography was performed on 230-400 mesh silica gel. 1H-NMR and 13C-NMR spectra were recorded on 200 MHz, 300 MHz, 400 MHz or 600 MHz spectrometers and calibrated to the relevant residual solvent peak (e.g. CDCl3, D2O and CD3OD). 19F-NMR spectra were obtained at 188 MHz on a 200 MHz spectrometer or at 282 MHz on a 300 MHz spectrometer. 19F-NMR chemical shifts were reported using the δ scale referenced to CFCl3 (δ = 0.00 ppm). Low resolution mass spectra were recorded using a triple quadrupole mass spectrometer equipped with an electrospray ionization ion source. High resolution mass spectra were measured in the mass spectrometry laboratory of the Department of Chemistry, UBC. Anhydrous solvents were prepared as follows: CH2Cl2, pyridine, CH3CN and CCl4 were dried over CaH2 and distilled prior to use. DMF was dried over 4 Å molecular sieves. MeOH was dried over Mg and distilled prior to use. THF was distilled over Na.

5.2.2 General synthetic methods

5.2.2.1 Fluorination conditions using acetyl hypofluorite

The general fluorination procedure using acetyl hypofluorite is the following: F2 (5%, 3 × 40 psi) was bubbled into a slurry of NaOAc (1 g) and glacial acetic acid (3 mL) in CFCl3 (30 mL) at -78 °C. A solution of 2-fluoro glycal per-O-acetate (1.5 mmol) in CFCl3 (10 mL) was then added to the slurry containing acetyl hypofluorite (generated in situ). The cooling bath was removed and the reaction was allowed to proceed, with its progress being followed by TLC analysis. The CFCl3 was removed by evaporation, and an aqueous workup followed by column chromatography gave the purified compound. Note: These are hazardous reagents. All such experiments were performed within the synthetic facility at TRIUMF.

5.2.2.2 Acetylations

Unprotected sugar was dissolved in dry pyridine and cooled down to 0 °C. Acetic anhydride (Vpyridine : VAc2O = 3:2) was slowly added and the reaction mixture was allowed to warm up to room temperature and stirred until TLC showed that the reaction was complete. Most of the organic solvent was evaporated off and the residue was re-dissolved in CHCl3. The organic phase was then successively washed with 1 M HCl, H2O, saturated NaHCO3 and brine. After drying over anhydrous MgSO4, all the organic solvent was removed by evaporation in vacuo followed by further purification by recrystallization or flash chromatography.

5.2.2.3 Deprotection of acetyl groups under strongly basic conditions

The acetylated sugar was dissolved in dry MeOH (in a ratio of 10 mg protected sugar to 1 mL of methanol), then a very small piece of sodium metal was cut and carefully added to the reaction solution. After all the metal had reacted and no more hydrogen gas was released, the reaction was stirred at room temperature until complete. Acidic resin was added to the reaction solution and stirred until the solution just turned acidic. After filtration of the resin, the organic solvent was removed by evaporation in vacuo and the product was purified by flash column chromatography.

5.2.2.4 Deprotection of acetyl groups under weakly basic conditions

The acetylated sugar was dissolved in dry MeOH (in a ratio of 10 mg protected sugar to 1 mL of methanol) and cooled down to 0 °C in an ice water bath. Dry ammonia gas was bubbled in for 5 minutes. The reaction mixture was then allowed to warm up to room temperature and stirred until the reaction was complete. Solvent was removed by evaporation in vacuo and the product was purified by flash column chromatography.

5.2.2.5 Deprotection of acetyl groups under acidic conditions

The acetylated sugar was dissolved in dry MeOH (in a ratio of 10 mg protected sugar to 1 mL of methanol) and cooled down to 0 °C. Acetyl chloride (4% v/v of MeOH) was slowly added and the reaction mixture was vigorously stirred at 4 °C until no starting material remained. Solvent was removed by evaporation in vacuo and the product was purified by flash column chromatography.

5.2.3 Synthesis and compound characterization

1,3,4,6-Tetra-O-acetyl-2-deoxy-2-fluoro-α/β-D-glucopyranose (2.33)292

Per-O-acetylated D-glucal (2.32) (10 g, 36.7 mmol) was dissolved in 350 mL of CH3NO2. Selectfluor™ (14.3 g, 40 mmol, 1.1 eq.) was added portion-wise to the solution and the reaction mixture was stirred at 70 °C overnight. Acetic acid (250 mL) was added and the temperature was raised to 90 °C and maintained there for two more days until TLC showed that all the starting material was consumed. Most of the organic solvent was evaporated in vacuo and the residue was re-dissolved in CHCl3. The organic phase was successively washed with water, saturated NaHCO3 and brine. After drying over MgSO4, all the organic solvent was evaporated and the residue was further purified by gradient flash column chromatography (pet-ether: EtOAc = 3:1 to 2:1) to afford the product as a white foam (3.55 g, 10 mmol, 27%, α/β = 1.0 : 2.2). MS: Calcd. for C14H19FO9 + Na+: 373.1; Found: 373.0. 1H-NMR data (CDCl3, 300 MHz): δ 6.42 (d, 1H, J1α,2α = 4.0 Hz, H-1α); 5.78 (dd, 1H, J1β,2β = 8.1 Hz, J1β,F2β = 3.1 Hz, H-1β); 5.55 (dt, 1H, J3α,F2α = 12.1 Hz, J3α,2α = J3α,4α = 9.6 Hz, H-3α); 5.37 (dt, 1H, J3β,F2β = 14.3 Hz, J3β,2β = J3β,4β = 9.2 Hz, H-3β); 5.09 (t, 1H, J4α,3α = J4α,5α = 9.6 Hz, H-4α), 5.07 (t, 1H, J4β,3β = J4β,5β = 9.2 Hz, H-4β); 4.64 (ddd, 1H, J2α,F2α = 48.5 Hz, J2α,3α = 9.6 Hz, J2α,1α = 4.0 Hz, H-2α); 4.44 (dt, 1H, J2β,F2β = 50.8 Hz, J2β,3β = J2β,1β = 8.1 Hz, H-2β); 4.32-3.83 (m, H-5α, H-6α, H-6α’, H-5β, H-6β, H-6β’); 2.19 (s, 3H, OAc), 2.17 (s, 3H, OAc), 2.08 (s, 3H, OAc), 2.07 (s, 3H, OAc), 2.03 (s, 3H, OAc), 2.02 (s, 3H, OAc). 19F-NMR data (CDCl3, 282 MHz): δ 201.6 (ddd, JF2β,H2β = 51 Hz, JF2β,H3β = 14 Hz, JF2β,H1β = 3 Hz, F-2β), 203.0 (dd, JF2α,H2α = 49 Hz, JF2α,H3α = 12 Hz, F-2α).

3,4,6-Tri-O-acetyl-2-fluoro-glucal (2.34)293

1,3,4,6-Tetra-O-acetyl-2-deoxy-2-fluoro-α/β-D-glucopyranose (2.33, 2.7 g, 7.7 mmol) was dissolved in 100 mL of dry CH2Cl2. HBr (33% in AcOH, 30 mL) was added and the mixture was stirred at room temperature for two days until TLC showed that all the starting material was consumed. After aqueous workup, the crude product (MS: Calcd. for C12H16BrFO7 + Na+: 393.0/395.0; Found: 393.1/395.0) was directly dissolved in 100 mL of CH3CN. Triethylamine (25 mL) was added and the reaction mixture was stirred at 80 °C overnight. The organic solvent was evaporated in vacuo, and the residue was re-dissolved in CHCl3 and washed with water, saturated NaHCO3 and brine. After drying over MgSO4, the solvent was removed by evaporation and the resultant crude product was further purified by flash column chromatography (petroleum ether: EtOAc = 3:2) to yield the pure 3,4,6-tri-O-acetyl-2-fluoro-glucal as a colorless oil (1.95 g, 6.7 mmol, 87% over two steps). 1H-NMR data (CDCl3, 300 MHz): δ 6.76 (d, 1H, J1,F2 = 4.7 Hz, H-1), 5.61 (t, 1H, J3,4 = J3,F2 = 4.0 Hz, H-3), 5.19 (dd, 1H, J4,5 = J4,3 = 4.0 Hz, H-4), 4.45-4.13 (m, 3H, H-5, H-6, H-6’), 2.11 (s, 3H, OAc), 2.10 (s, 3H, OAc), 2.09 (s, 3H, OAc). 19F-NMR data (CDCl3, 282 MHz): δ 166.6 (dd, JF2,H1 = 4.7 Hz, JF2,H3 = 4.0 Hz, F-2). MS: Calcd. for C12H15FO7 + Na+: 313.2; Found: 313.1.

1,3,4,6-Tetra-O-acetyl-2-deoxy-2,2-difluoro-α-D-arabinohexopyranose (2.35)293

(Acetyl hypofluorite method) Sodium acetate (280 mg), glacial acetic acid (3.5 mL) and CFCl3 (35 mL) were mixed in a 3-necked round bottom flask and cooled in a dry ice/acetone bath (-78 °C). A gas reservoir (~ 1 L) was filled with 20% fluorine in neon (10 psi), which was diluted 4-fold with helium to 40 psi. This mixture was then bubbled through the slurry. The flask was thus charged twice before protected 2-fluoro glucal (2.34, 0.800 g, 2.8 mmol) dissolved in CFCl3 (10 mL) was added to the mechanically-stirred mixture. The loosely-stoppered flask of reaction mixture was allowed to warm to room temperature. After general work-up, column chromatography (petroleum ether: ethyl acetate = 4:1) afforded the product (0.77 g, 2.1 mmol, 76%) as an anomeric mixture; only the spectrum of the α-anomer is given here. 1H-NMR data (CDCl3, 200 MHz): δ 6.19 (t, 1 H, J1,Fe = J1,Fa = 3.0 Hz, H-1α), 5.50 (dt, 1 H, J3,Fa = 20 Hz, J3,Fe = J3,4 = 10 Hz, H-3), 5.20 (t, 1 H, J4,3 = J4,5 = 10 Hz, H-4), 4.3 - 4.0 (m, 3 H, H-5, H-6, H-6′), 3.88 (s, 3 H, OAc), 3.85 (s, 3 H, OAc), 3.81 (s, 3 H, OAc), 3.74 (s, 3 H, OAc). 19F-NMR data (CDCl3, 188 MHz): δ -121.0 (F-2e, F-2a coincident). MS: Calcd. for C14H18F2O9 + Na+: 391.1; Found: 391.0.

3,4,6-Tri-O-acetyl-2-deoxy-2,2-difluoro-α-D-arabinohexopyranose (2.36)

Fully acetylated difluorosugar (2.35, 0.5 g, 1.36 mmol) was dissolved in DMF (10 mL), hydrazine acetate (280 mg, 3.0 mmol) was added and the mixture was allowed to react for 3 days at 50 °C. Evaporation of the solvent in vacuo, followed by flash chromatography (hexanes : ethyl acetate = 1:1), afforded the hemiacetal (280 mg, 66%) as a colourless gum. 1H-NMR data (CDCl3, 200 MHz): δ 5.56 (ddd, 1 H, J3,Fa = 19 Hz, J3,4 = 10 Hz, J3,Fe = 6.0 Hz, H-3), 5.24 - 5.10 (m, 2 H, H-1, H-4), 4.87 (broad, 1 H, J1,OH = 3 Hz, 1-OH), 4.32 - 4.00 (m, 3 H, H-5, H-6, H-6′), 2.09 (s, 3 H, OAc), 2.06 (s, 3 H, OAc), 2.00 (s, 3 H, OAc). 13C-NMR data (CDCl3, 75 MHz): δ 171.1, 170.0, 169.5, 115.6 (dd, JC,F = 257.0 Hz, JC,F = 246.0 Hz), 91.4 (dd, JC,F = 36.0 Hz, JC,F = 28.7 Hz), 68.7 (t, JC,F = 18.5 Hz), 68.2, 67.7 (d, JC,F = 6.3 Hz), 61.9, 20.8, 20.7, 20.6. 19F-NMR data (CDCl3, 188 MHz): δ -120.7 (dd, JFe,Fa = 251 Hz, JFe,3 = 6.0 Hz, F-2e), -122.7 (ddd, JFa,Fe = 251 Hz, JFa,3 = 19 Hz, JFa,1 = 5.0 Hz, F-2a). HRMS: Calcd. for C12H16F2O8 + Na+: 349.0711; Found: 349.0710.

3,4,6-Tri-O-acetyl-2-deoxy-2,2-difluoro-α-D-arabinohexopyranose (2.36)

(Selectfluor method) Protected 2-fluoro-glucal (2.34, 0.64 g, 2.2 mmol) was dissolved in 25 mL of a CH3NO2/water (v/v = 4:1) mixed solvent and Selectfluor (1.17 g, 3.3 mmol) was added. The reaction mixture was stirred at room temperature overnight and was then heated to 95 °C for four hours. After the mixture had cooled, most of the solvent was evaporated under reduced pressure. The residue was re-dissolved in EtOAc and washed successively with water, saturated NaHCO3 (aq.) and brine. After drying over MgSO4 and evaporation of the solvent, flash chromatography (petroleum ether: ethyl acetate = 1:1) yielded a colorless syrup (130 mg, 20%).

3,4,6-Tri-O-acetyl-2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.37)

The hemiacetal (2.36, 0.104 g, 0.32 mmol) was dissolved in 5 mL of thionyl chloride. BiOCl (100 mg) was added and the reaction was stirred vigorously under N2 at room temperature in the dark for two days. After removing all the solvent, the residue was re-dissolved in EtOAc followed by aqueous workup. Column chromatography (petroleum ether: ethyl acetate = 3:1) yielded the acetylated glycosyl chloride (0.020 g, 0.06 mmol, 19%). 1H-NMR data (CDCl3, 300 MHz): δ 5.92 (dd, 1 H, J1,Fa = 4.8 Hz, J1,Fe = 2.0 Hz, H-1), 5.68 (dt, 1 H, J3,Fa = 15 Hz, J3,4 = J3,Fe = 10 Hz, H-3), 5.19 (t, 1 H, J4,5 = J4,3 = 10 Hz, H-4), 4.2 - 4.0 (m, 3 H, H-5, H-6, H-6′), 2.13 (s, 3 H), 2.09 (s, 3 H), 2.05 (s, 3 H). 13C-NMR data (CDCl3, 75 MHz): δ 170.6, 169.6, 169.2, 115.3 (dd, JC,F = 257.4 Hz, JC,F = 252.6 Hz), 88.2 (t, JC,F = 34.0 Hz), 71.1, 68.1 (t, JC,F = 19.4 Hz), 66.7 (d, JC,F = 4.2 Hz), 61.0, 20.7, 20.6, 20.5. 19F-NMR data (CDCl3, 188 MHz): δ -115.5 (ddd, JFa,Fe = 249.8 Hz, JFa,3 = 15 Hz, JFa,1 = 4.8 Hz, F-2a), -116.9 (ddd, JFa,Fe = 249.8 Hz, JFe,3 = 10 Hz, F-2e).

2-Deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.2)

The acetylated glycosyl chloride (2.37, 0.049 g, 0.14 mmol) was dissolved in methanol (10 mL), cooled to 0 °C and dry ammonia gas was bubbled in for 5 minutes. The reaction was then left stirring at room temperature overnight. The resulting light orange oil was purified by column chromatography (ethyl acetate: methanol: water = 147:2:1) to afford the product as a colourless oil (0.016 g, 0.073 mmol, 52%). ¹H-NMR data (D2O, 300 MHz): δ 6.15 (d, 1 H, J1,Fa = 7.6 Hz, H-1), 4.25 (ddd, 1 H, J3,Fa = 21.0 Hz, J3,4 = 9.5 Hz, J3,Fe = 5.0 Hz, H-3), 3.92-3.62 (m, 4 H, H-4, H-5, H-6, H-6′). ¹³C-NMR data (CDCl3, 75 MHz): δ 170.6, 169.6, 169.2, 115.3 (dd, JC,F = 258.0 Hz, JC,F = 253.0 Hz), 88.2 (t, JC,F = 34.0 Hz), 71.1, 68.1 (t, JC,F = 19.4 Hz), 66.7 (d, JC,F = 4.2 Hz), 61.0, 20.7, 20.6, 20.5; ¹⁹F-NMR data (D2O, 188 MHz): δ -116.1 (dd, JFe,Fa = 248.7 Hz, JFe,3 = 5.0 Hz, F-2e), -119.45 (ddd, JFa,Fe = 248.7 Hz, JFa,3 = 21.0 Hz, JFa,1 = 6.0 Hz, F-2a).

Hexa-O-acetyl-maltal (2.8)²⁹⁴

Maltose per-O-acetate (2.7, 37 g, 54.5 mmol) was dissolved in 250 mL of glacial acetic acid and 90 mL of 33% HBr in acetic acid was added to the solution. The reaction mixture was stirred under N2 at room temperature for 3 h. Cold CHCl3 was added to the reaction mixture, which was then successively washed with cold water, cold saturated NaHCO3 and cold brine. The organic phase was then dried over MgSO4 and evaporated in vacuo. The product was used directly in the next reaction without further purification.

The crude bromide was dissolved in 300 mL of H2O-AcOH (1:1) and cooled to 0 °C. Activated zinc powder (90 g, 1.38 mol) was added to the solution and the reaction mixture was stirred vigorously at 0 °C overnight. After filtration, the reaction mixture was re-dissolved in cold CHCl3 and washed successively with H2O, saturated NaHCO3 and brine. After drying over MgSO4, the organic phase was concentrated by evaporation in vacuo. The remaining crude product was further purified by gradient flash chromatography (EtOAc: pet-ether = 3:7 to 5:5). Yield: 13.62 g (white powder, 24 mmol, 44% over two steps). MS: Calcd for C24H32O15+Na⁺: 583.2; Found: 583.2. ¹H-NMR data (CDCl3, 300 MHz): δ 6.44 (d, 1H, J1, 2 = 6.1 Hz, H-1), 5.51 (d, 1H, J1’, 2’ = 3.9 Hz, H-1’), 5.41 – 4.00 (m, 12H), 2.13 (s, 3H, CH3CO), 2.11 (s, 3H, CH3CO), 2.05 (2s, 6H, CH3CO), 2.03 (s, 3H, CH3CO), 2.01 (s, 3H, CH3CO).

Hexa-O-acetyl-2-fluoro-maltal (2.5)

Hexa-O-acetyl-maltal (2.8, 11.95 g, 21.3 mmol) was dissolved in 350 mL of CH3NO2, and 10.0 g (28.2 mmol, 1.3 eq.) of Selectfluor was then added portion-wise to the solution over one hour. The mixture was stirred vigorously for two days at room temperature until TLC (petroleum ether: ethyl acetate = 1:1) showed that most of the starting material had been consumed. AcOH (250 mL) was then added and the reaction mixture was heated at 100 °C for three days. Most of the solvent was evaporated and the residue was re-dissolved in CH2Cl2. The organic phase was washed with water, saturated NaHCO3 (aq.), water and brine, dried over MgSO4 and concentrated under reduced pressure. Gradient flash column chromatography (petroleum ether: ethyl acetate = 2:1 to 3:2 to 1:1) gave the major fraction as a white foam (2.9, 9.6 g, 15.0 mmol, 70%). NMR spectra showed that the major fraction contained four isomeric products, with the anomeric acetate and fluorine having all possible configurations. These proved difficult to separate and ¹H-NMR assignment of the mixture was not feasible. ¹⁹F-NMR data (CDCl3, 282 MHz): -220.3 (ddd, JF2, 2 = 51.0 Hz, JF2, 3 = 26.5 Hz, JF2, 1β = 19.0 Hz, F-2 “β-manno”), -204.6 (ddd, JF2, 2 = 48.5 Hz, JF2, 3 = 27.5 Hz, JF2, 1α = 6.0 Hz, F-2 “α-manno”), -203.2 (dd, JF2, 2 = 48.5 Hz, JF2, 3 = 11.5 Hz, F-2 “gluco”), -201.4 (dd, JF2, 2 = 51.0 Hz, JF2, 3 = 14.0 Hz, F-2 “gluco”). MS: Calcd for C26H35FO17+Na⁺: 661.2. Found: 661.2. This material was used directly in the next step without any further purification.

The mixture from the previous step (2.9, 9.6 g, 15.0 mmol) was dissolved in dry CH2Cl2 (300 mL). HBr in acetic acid (32 mL, 33 wt %) was added and the reaction was allowed to proceed for 3 days at 0 °C. The reaction was stopped by washing the organic phase successively with water, saturated aqueous NaHCO3, water and brine, and the organic phase was then dried over MgSO4. After evaporating the solvent in vacuo, two major products were identified by NMR as 3,6-di-O-acetyl-4-O-(2',3',4',6'-tetra-O-acetyl-α-(1,4)-D-glucosyl)-2-deoxy-2-fluoro-α-glucosyl bromide and 3,6-di-O-acetyl-4-O-(2',3',4',6'-tetra-O-acetyl-α-D-glucosyl)-2-deoxy-2-fluoro-α-mannosyl bromide (2.10). ¹H-NMR (CDCl3) data (only the anomeric region of the NMR spectrum of the fluorosugars in the product mixture is described here): δ 6.43 (d, 1H, J1, 2 = 4.4 Hz, H-1 “gluco”), 6.34 (dd, 1H, J1, F2 = 9.7 Hz, J1, 2 = 1.5 Hz, H-1 “manno”); ¹⁹F NMR (CDCl3) data: δ -182.8 (ddd, JF2, 2 = 49.6 Hz, JF2, 3 = 26.3 Hz, JF2, 1 = 9.7 Hz, F-2 “manno”), -190.3 (dd, JF2, 2 = 49.3 Hz, JF2, 3 = 10.2 Hz, F-2 “gluco”). MS: Calcd for C24H32BrFO15+Na⁺: 681.1, 683.1. Found: 681.0, 683.1. This mixture was used without further purification.

The mixture of 2-fluoro-"gluco" and "manno" bromides (2.10) was dissolved in acetonitrile (250 mL) and triethylamine (50 mL), and allowed to stir for two days at room temperature. The excess base was removed in vacuo, CHCl3 was added, and the organic layer was washed with water, saturated aqueous NaHCO3, and water, and dried (MgSO4). Gradient flash column chromatography (petroleum ether: ethyl acetate = 2:1 to 3:2) gave two major products, the unreacted manno compound (2.6) and the desired elimination product hexa-O-acetyl-2-fluoro-maltal (2.5, 3.1 g, 5.4 mmol, 25% over three steps). ¹H NMR (CDCl3, 400 MHz): δ 6.77 (d, 1H, J1,F 5.2 Hz, H-1), 5.45-5.37 (m, 3H, H-3, H-1’, H-3’), 5.05 (t, 1H, J4’, 3’ = J4’, 5’ = 10.0 Hz, H-4’), 4.85 (dd, 1H, J2’, 3’ 10.4 Hz, J2’, 1’ 4.0 Hz, H-2’), 4.47 (m, 1H, H-5), 4.40-4.11 (m, 4H, H-6a, H-6b, H-6’a, H-6’b), 4.08 (ddd, 1H, J5’, 4’ 10.0 Hz, J5’, 6’a 4.8 Hz, J5’, 6’b 2.4 Hz, H-5’), 4.00 (m, 1H, H-4), 2.111 (s, 3H, OAc), 2.108 (s, 3H, OAc), 2.106 (s, 3H, OAc), 2.099 (s, 3H, OAc), 2.044 (s, 3H, OAc), 2.027 (s, 3H, OAc). ¹³C-NMR (CDCl3, 100 MHz): 170.80, 170.68, 170.53, 170.29, 170.19, 169.81, 142.20 (d, JC,F 239 Hz), 132.16 (d, JC,F 40 Hz), 96.96, 74.61, 74.53, 74.07, 70.62, 69.95, 68.46, 65.87 (d, JC,F 23 Hz), 61.99, 60.70, 20.95 (2 CH3), 20.91, 20.88, 20.82, 20.75. ¹⁹F NMR (CDCl3, 282 MHz): δ -164.7 (d, JF2,1 5.2 Hz, F-2). HRMS: Calcd for C24H31FO15+Na⁺: 601.1545. Found: 601.1543.
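The calculated HRMS value for the sodium adduct of 2.5 can be reproduced by summing monoisotopic masses. The Python sketch below is illustrative only: the function name `adduct_mass` and the table of rounded isotope masses are assumptions introduced here, and the electron mass is neglected, which appears to match the convention of the calculated values quoted in this section.

```python
# Rounded monoisotopic masses (u) of the most abundant isotopes.
# These are standard reference values supplied for illustration,
# not numbers taken from the thesis.
MONOISOTOPIC = {
    "C": 12.000000, "H": 1.007825, "O": 15.994915, "F": 18.998403,
    "Na": 22.989770, "Cl35": 34.968853,
}

def adduct_mass(composition, adduct="Na"):
    """Sum monoisotopic masses for a formula plus one adduct atom.

    The electron mass (about 0.0005 u) is ignored in this sketch.
    """
    total = sum(MONOISOTOPIC[el] * n for el, n in composition.items())
    return total + MONOISOTOPIC[adduct]

# Hexa-O-acetyl-2-fluoro-maltal (2.5): C24H31FO15 + Na
mz = adduct_mass({"C": 24, "H": 31, "F": 1, "O": 15})
print(f"{mz:.4f}")  # agrees with the quoted calculated value to within 0.001
```

The same helper can be applied to the other sodium adducts reported here by changing the composition dictionary.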

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl bromide (2.6)

The other major product from the above elimination reaction was the unreacted bromide. Starting from 6.0 g (9.4 mmol) of per-O-acetylated 2-fluoro maltoside mixture, 2.21 g (3.4 mmol) of this product was generated. ¹H-NMR (CDCl3, 300 MHz) data: δ 6.36 (dd, 1H, J1, F2 = 9.6 Hz, J1,2 = 0.8 Hz, H-1), 5.55 (d, 1H, J1’, 2’ = 3.8 Hz, H-1’), 5.52 (ddd, 1H, J3, F2 = 27.5 Hz, J3, 4 = 10.0 Hz, J3, 2 = 2.0 Hz, H-3), 5.38 (t, 1H, J4,3 = J4,5 = 10.0 Hz, H-4), 5.11-3.90 (m, 10H, H-2, H-5, H-6, H-6’, H-2’, H-3’, H-4’, H-5’, H-6’, H-6’’), 2.14 (s, 3H, OAc), 2.13 (s, 3H, OAc), 2.09 (s, 3H, OAc), 2.04 (s, 3H, OAc), 2.03 (s, 3H, OAc), 2.01 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -184.27 (ddd, JF2, H2 = 49 Hz, JF2, H3 = 28 Hz, JF2, H1 = 10 Hz, F-2). MS: Calcd for C24H32BrFO15 + Na⁺: 681.1/683.1, Found: 681.1/683.1.
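The paired values 681.1/683.1 reflect the two stable bromine isotopes, ⁷⁹Br and ⁸¹Br, which occur in roughly equal abundance. A minimal Python sketch of that arithmetic follows; the helper name `bromide_doublet` and the rounded isotope masses are assumptions made for illustration, and the electron mass is neglected.

```python
# Rounded monoisotopic masses (u); standard reference values supplied
# for illustration, not numbers taken from the thesis.
MASS = {"C": 12.000000, "H": 1.007825, "O": 15.994915,
        "F": 18.998403, "Na": 22.989770}
BR_ISOTOPES = {"79Br": 78.918338, "81Br": 80.916291}

def bromide_doublet(c, h, f, o):
    """Return the [M+Na]+ m/z (one decimal) for each Br isotopologue
    of a compound containing a single bromine atom."""
    base = (c * MASS["C"] + h * MASS["H"] + f * MASS["F"]
            + o * MASS["O"] + MASS["Na"])
    return {iso: round(base + m, 1) for iso, m in BR_ISOTOPES.items()}

# C24H32BrFO15 + Na (compound 2.6)
print(bromide_doublet(c=24, h=32, f=1, o=15))
```

The two peaks are separated by about 2 mass units, which is the signature used above to confirm that bromine is retained in the product.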

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.11)²⁹⁵

Protected “manno” disaccharide bromide (2.6, 700 mg, 1.1 mmol) was dissolved in 18 mL of HF/pyridine and the reaction mixture was stirred at 0 °C for 12 hours. The mixture was then diluted with EtOAc and successively washed with saturated NaHCO3 (until gas evolution ceased), H2O and brine. After drying over anhydrous MgSO4, the organic solvent was evaporated in vacuo and the crude product was subjected to gradient flash column chromatography (petroleum ether: EtOAc = 2:1 to 3:2) to yield the pure product as a white foam (0.39 g, 0.65 mmol, 59%). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.68 (ddd, 1H, J1, F1 = 48 Hz, J1, F2 = 3.9 Hz, J1, 2 = 2.0 Hz, H-1), 5.55 (d, 1H, J1’, 2’ = 4.1 Hz, H-1’), 5.40-3.95 (m, 12H), 2.15 (s, 3H, OAc), 2.13 (s, 3H, OAc), 2.09 (s, 3H, OAc), 2.03 (s, 6H, 2OAc), 2.00 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -144.3 (dd, JF1, H1 = 48 Hz, JF1, F2 = 20 Hz, F-1), -208.7 (ddd, JF2, H2 = 48 Hz, JF2, F1 = 20 Hz, JF2, H3 = 24 Hz, F-2). MS: Calcd for C24H32F2O15+Na⁺: 621.2, Found: 621.2.

4-O-[α-D-Glucopyranosyl]-2-deoxy-2-fluoro-α-D-mannopyranosyl fluoride (2.12)

Protected disaccharide fluoride (2.11, 0.39 g, 0.65 mmol) was dissolved in 40 mL MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash chromatography (EtOAc: MeOH: H2O = 7:2:1) afforded the pure product (0.115 g, 0.33 mmol, 51%). ¹H-NMR (D2O, 300 MHz) data: δ 5.72 (ddd, 1H, J1, F1 = 48.7 Hz, J1, F2 = 3.9 Hz, J1, 2 = 2.0 Hz, H-1), 5.29 (d, 1H, J1’, 2’ = 3.8 Hz, H-1’), 4.80 (dt, 1H, J2, F2 = 50.7 Hz, J2, 3 = J2, 1 = 2.1 Hz, H-2), 4.20-3.22 (m, 11H). ¹⁹F-NMR (D2O, 282 MHz) data: δ -144.4 (dd, JF1, H1 = 48 Hz, JF1, F2 = 18 Hz, F-1), -209.4 (dddd, JF2, H2 = 49 Hz, JF2, H3 = 30 Hz, JF2, F1 = 18 Hz, JF2, H1 = 4 Hz, F-2). MS: Calcd for C12H20F2O9+Na⁺: 369.1, Found: 369.0.

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-chloro-2-deoxy-2-fluoro-α-D-glucopyranosyl chloride (2.13)

Protected 2-fluoro maltal (2.5, 0.80 g, 1.38 mmol) was dissolved in dry carbon tetrachloride (80 mL) over 4 Å molecular sieves and cooled in a CCl4/dry ice bath (-23 °C). Chlorine was bubbled through the solution until it turned yellowish green (5 minutes), then the flask was wrapped in aluminum foil to exclude light, allowed to slowly warm up to room temperature and stirred overnight. Upon completion of the reaction, excess chlorine was purged with a stream of dry nitrogen for several minutes until the solution was colorless, and the solvent was evaporated in vacuo. The resulting material was chromatographed twice (petroleum ether: ethyl acetate = 3:2; then petroleum ether: diethyl ether = 2:3), giving the disaccharide chloride as a white foam (0.44 g, 0.68 mmol, 49%). ¹H-NMR data (CDCl3, 300 MHz): δ 6.03 (d, 1H, J1, F 6.1 Hz, H-1), 5.83 (dd, 1H, J3, F 23.3 Hz, J3, 4 9.2 Hz, H-3), 5.37 (d, 1H, J1’, 2’ 4.0 Hz, H-1’), 5.36 (t, 1H, J3’, 4’ = J3’, 2’ = 10.6 Hz, H-3’), 5.08 (t, 1H, J4’, 3’ = J4’, 5’ = 9.7 Hz, H-4’), 4.88 (dd, 1H, J2’, 3’ 10.6 Hz, J2’, 1’ 4.0 Hz, H-2’), 4.59 (dd, 1H, J6a, 6b 12.5 Hz, J6a, 5 2.44 Hz, H-6a), 4.50-4.20 (m, 4H, H-4, H-5, H-6b, H-6’a), 4.06 (dd, J6’b, 6’a 13.4 Hz, J6’b, 5’ 2.3 Hz, H-6’b), 3.98 (dt, 1H, J5’, 4’ 9.7 Hz, J5’, 6’a = J5’, 6’b = 2.3 Hz, H-5’), 2.18 (s, 3H, OAc), 2.15 (s, 3H, OAc), 2.10 (s, 3H, OAc), 2.09 (s, 3H, OAc), 2.03 (s, 3H, OAc), 2.01 (s, 3H, OAc). ¹³C-NMR (CDCl3, 75 MHz): δ 170.8, 170.6, 170.4, 170.0, 169.6, 169.5, 108.6 (d, JC, F2 = 248.0 Hz), 96.1, 91.5 (d, JC, F2 = 30.7 Hz), 73.1 (d, JC, F2 = 17.9 Hz), 71.7, 71.4, 70.1, 69.3, 68.9, 68.0, 61.8, 61.5, 20.9, 20.8 (3C), 20.7 (2C); ¹⁹F-NMR data (CDCl3, 282 MHz): -119.9 (dd, JF, 3 23.0 Hz, JF, 1 5.0 Hz, F-2). HRMS data: Calcd for C24H31O15F³⁵Cl2+Na⁺: 671.0922. Found: 671.0923.

2-Chloro-2-deoxy-2-fluoro-4-O-[α-D-glucopyranosyl]-α-D-glucopyranosyl chloride (2.3)

Protected disaccharide chloride (2.13, 80 mg, 0.12 mmol) was dissolved in 10 mL of dry methanol, a small piece of sodium metal was added, and the mixture was stirred at room temperature for two hours. Acidic ion exchange resin was added and the mixture was stirred for 10 min until the solution became weakly acidic; the resin was then filtered off and the solvent was evaporated in vacuo. The product was purified by flash column chromatography (ethyl acetate: methanol: water = 7:2:1) and obtained as a white foam (41 mg, 0.10 mmol, 82%). ¹H-NMR data (D2O, 300 MHz): 6.26 (d, 1H, J1, F 6.3 Hz, H-1), 5.35 (d, 1H, J1’, 2’ 4.0 Hz, H-1’), 4.40 (dd, 1H, J3, F 24.3 Hz, J3, 4 9.8 Hz, H-3), 4.13 (dt, 1H, J5, 4 9.8 Hz, J5, 6a = J5, 6b = 2.3 Hz, H-5), 3.90 (t, 1H, J4, 3 = J4, 5 = 9.8 Hz, H-4), 3.77-3.53 (m, 6H, H-6a, H-6b, H-3’, H-5’, H-6’a, H-6’b), 3.45 (dd, 1H, J2’, 3’ 10.0 Hz, J2’, 1’ 4.0 Hz, H-2’), 3.32 (t, 1H, J4’, 3’ = J4’, 5’ = 9.7 Hz, H-4’). ¹³C-NMR data (D2O, 100 MHz): 110.52 (d, JC, F = 242 Hz), 99.71, 91.79 (d, JC, F = 31 Hz), 73.86, 73.83, 73.68, 72.99, 72.86, 71.64, 69.40, 60.53, 59.98. ¹⁹F-NMR data (D2O, 282 MHz): -123.87 (dd, JF, 3 24.3 Hz, JF, 1 6.3 Hz, F-2). HRMS data: Calcd for C12H19O9FCl2 + Na⁺: 419.0288. Found: 419.0289.

1,3,6-Tri-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-deoxy-2,2-difluoro-α/β-D-arabinohexopyranose (2.14)

(Acetyl hypofluorite method) Per-O-acetyl-2-fluoromaltal (2.5, 0.99 g, 1.7 mmol) was fluorinated according to the fluorination procedure described in the general method. The reaction mixture was partially purified by flash chromatography (hexanes: ethyl acetate = 1:1) to afford a syrup containing both the α- and β-anomers of the gem-difluoro product (2.14, 0.67 g, 1.0 mmol, 59%). Only the spectra of the α-anomer are given here. ¹H-NMR (CDCl3, 300 MHz): δ 6.14 (d, 1 H, J1, F2a = 4.1 Hz, H-1), 5.57 (dt, 1 H, J3, F2a = 16.7 Hz, J3,4 = J3, F2e = 8.4 Hz, H-3), 5.43 (d, 1 H, J1’, 2’ = 3.9 Hz, H-1’), 5.36 (t, 1 H, J3’, 2’ = J3’, 4’ = 9.8 Hz, H-3’), 5.07 (t, 1 H, J4’, 3’ = J4’, 5’ = 9.8 Hz, H-4’), 4.86 (dd, 1 H, J2’, 3’ = 10.5 Hz, J2’, 1’ = 3.9 Hz), 4.50-3.90 (m, 7 H), 2.24 (s, 3 H), 2.14 (s, 3 H), 2.13 (s, 3 H), 2.09 (s, 3 H), 2.06 (s, 3 H), 2.02 (s, 3 H), 2.00 (s, 3 H). ¹³C-NMR (CDCl3, 150 MHz): δ 170.8, 170.7, 170.6, 170.2, 169.8, 169.6, 168.0, 114.8 (dd, JC,F = 258 Hz, JC,F = 246 Hz), 96.0, 88.8 (dd, JC,F = 37.7 Hz, JC,F = 31.7 Hz), 71.5 (d, JC,F = 4.5 Hz), 71.1 (t, JC,F = 19.6 Hz), 70.7, 70.3, 69.3, 68.9, 68.0, 62.2, 61.5, 21.0, 20.94, 20.91, 20.88, 20.80 (3C). HRMS: Calcd. for C26H34F2O17+Na⁺: 679.1662. Found: 679.1650. This anomeric mixture was used directly in the next step without further purification.

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-(1,4)-D-glucopyranosyl]-2-deoxy-2,2-difluoro-α-D-arabinohexopyranose (2.15)

To 0.67 g (1.0 mmol) of the anomeric mixture of per-O-acetylated 2-deoxy-2,2-difluoro maltoside (2.14) was added 0.13 g (1.4 mmol, 1.4 eq.) of hydrazine acetate in DMF (20 mL) and the reaction was continued for 3 days at 50 °C. The reaction was stopped by re-dissolving the mixture in ethyl acetate, and the organic layer was then washed successively with water and brine. Evaporation of the solvent in vacuo, followed by flash chromatography (hexanes: ethyl acetate = 1:1) gave the α-hemiacetal 12 (0.42 g, 0.68 mmol, 68%) as a white foam. ¹H-NMR data: δ 5.63 (ddd, 1 H, J3,F2a = 18.5 Hz, J3,4 = 9.2 Hz, J3,F2e = 6.5 Hz, H-3), 5.42 (d, 1 H, J1',2' = 4.0 Hz, H-1'), 5.35 (dd, 1 H, J3',2' = 10.4 Hz, J3',4' = 9.6 Hz, H-3'), 5.18 (d, 1 H, J1,F2a = 4.7 Hz, H-1), 5.05 (dd, 1 H, J4',3' = J4',5' = 9.8 Hz, H-4'), 4.84 (dd, 1 H, J2',3' = 10.4 Hz, J2',1' = 4.0 Hz, H-2'), 4.6-4.0 (m, 7 H, H-4, H-5, H-6a, H-6b, H-5', H-6a', H-6b'), and 2.2-1.9 (5 s, 6 OAc); ¹³C-NMR data (CDCl3, 75 MHz): δ 170.9, 170.8 (2C), 170.2, 169.8, 169.6, 115.8 (dd, JC,F = 258.0 Hz, JC,F = 245.0 Hz), 95.7, 91.0 (dd, JC,F = 35.0 Hz, JC,F = 28.8 Hz), 71.9 (d, JC,F = 5.4 Hz), 70.9 (t, JC,F = 19.0 Hz), 70.2, 69.5, 68.6, 68.5, 68.1, 62.6, 61.5, 20.93, 20.85, 20.79, 20.7 (3C). ¹⁹F NMR (CDCl3, 282 MHz): δ -120.9 (dd, JF2e,F2a = 253 Hz, JF2e,3 = 6.3 Hz, Fe-2), -122.7 (ddd, JF2a,F2e = 253 Hz, JF2a,3 = 18.4 Hz, JF2a,1 = 4.5 Hz, Fa-2). HRMS data: Calcd. for C24H32F2O16+Na⁺: 637.1556. Found: 637.1562.

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-deoxy-2,2-difluoro-α-D-arabinohexopyranose (2.15)

(Selectfluor Method) Protected 2-fluoro-maltal (2.5, 20 mg, 0.035 mmol) and 18 mg (0.05 mmol) of Selectfluor were dissolved in 5 mL of CH3NO2/water (v/v = 4:1) mixed solvent. The reaction mixture was stirred at room temperature overnight and then heated to 95 °C for four hours. After the mixture had cooled, most of the solvent was evaporated under reduced pressure. The residue was re-dissolved in EtOAc and washed successively with water, saturated NaHCO3 (aq.) and brine. After drying over MgSO4 and evaporation of the solvent, flash chromatography (petroleum ether: ethyl acetate = 1:1) yielded compound 2.15 as a colorless syrup (18 mg, 84%).

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-deoxy-2,2-difluoro-α-D-arabinohexopyranosyl chloride (2.16)

The hemiacetal (2.15, 250 mg, 0.41 mmol) was dissolved in 10 mL of dry CH2Cl2 along with 3.5 mL of SOCl2 (large excess to prevent the evaporation of SOCl2) and then BiOCl (0.35 g) was added. The reaction flask was wrapped with aluminum foil to exclude light, then the mixture was stirred vigorously at room temperature under a nitrogen atmosphere for three days, at which point TLC showed that most of the starting material had been consumed. The reaction mixture was poured into ice-cold water, stirred, then transferred to a separatory funnel. The organic phase was washed successively with water, saturated NaHCO3 (aq.), water and brine. The organic layer was dried with MgSO4, filtered and the solvent was evaporated in vacuo. Two rounds of flash column chromatography (hexanes: ethyl acetate = 1:1 then CHCl3: acetone = 20:1) yielded the pure product (34 mg, 0.054 mmol, 13%) as a colorless syrup. ¹H-NMR data (CDCl3, 400 MHz): δ 5.93 (d, 1H, J1, F = 6.4 Hz, H-1), 5.78 (ddd, 1H, J3, Fa = 18.8 Hz, J3, 4 = 9.6 Hz, J3, Fe = 5.2 Hz, H-3), 5.45 (d, 1H, J1’, 2’ = 4.0 Hz, H-1’), 5.37 (t, 1H, J3’, 2’ = J3’, 4’ = 10.4 Hz, H-3’), 5.09 (t, 1H, J4’, 3’ = J4’, 5’ = 10.0 Hz, H-4’), 4.87 (dd, 1H, J2’, 3’ = 10.4 Hz, J2’, 1’ = 4.0 Hz, H-2’), 4.59 (dd, 1H, J6a, 6b = 12.6 Hz, J6a, 5 = 2.4 Hz, H-6a), 4.34 (dt, 1H, J5, 4 = 9.6 Hz, J5, 6a = J5, 6b = 2.4 Hz, H-5), 4.29-4.23 (m, 2H, H-6b, H-6’b), 4.18 (t, 1H, J4, 3 = J4, 5 = 9.6 Hz, H-4), 4.08 (dd, 1H, J6’a, 6’b = 12.6 Hz, J6’a, 5’ = 2.4 Hz, H-6’a), 3.95 (dt, 1H, J5’, 4’ = 10.0 Hz, J5’, 6’a = J5’, 6’b = 2.4 Hz, H-5’), 2.17 (s, 3H, OAc), 2.16 (s, 3H, OAc), 2.11 (s, 3H, OAc), 2.08 (s, 3H, OAc), 2.04 (s, 3H, OAc), 2.02 (s, 3H, OAc). ¹³C-NMR data (CDCl3, 100 MHz): δ 171.6, 171.5, 171.2, 170.8, 170.4, 170.3, 116.2 (dd, JC,F = 258.6 Hz, JC,F = 249.5 Hz), 96.9, 88.8 (t, JC,F = 35.0 Hz), 72.3, 72.0 (d, JC,F = 5.4 Hz), 71.1, 71.0 (t, JC,F = 19.3 Hz), 70.2, 69.8, 68.9, 62.6, 62.3, 21.7, 21.63, 21.61, 21.56 (3C); ¹⁹F-NMR data (CDCl3, 282 MHz): δ -115.8 (ddd, JFa, Fe = 247.5 Hz, JFa, 3 = 18.8 Hz, JFa, 1 = 7.0 Hz, F2a), -116.9 (dd, JFe, Fa = 247.5 Hz, JFe, H3 = 5.2 Hz, F2e). HRMS data: Calcd. for C24H31O15³⁵ClF2+Na⁺: 655.1217. Found: 655.1215.

2-Deoxy-2,2-difluoro-4-O-[α-D-glucopyranosyl]-α-D-arabinohexopyranosyl chloride (2.4)

Protected disaccharide (2.16, 20 mg, 0.03 mmol) was dissolved in 5 mL of methanol (HPLC grade) and cooled to 0 °C, and dry ammonia was bubbled in for one minute. After removal of the ice bath the reaction was stirred at room temperature overnight. The solvent was then evaporated to dryness and the residue purified by flash column chromatography (ethyl acetate: methanol: water = 7:2:1) to afford the final product (10 mg, 0.026 mmol, 83%) as a white foam. ¹H-NMR data (D2O, 300 MHz): δ 6.13 (d, 1H, J1, Fa = 7.3 Hz, H-1), 5.36 (d, 1H, J1’, 2’ = 3.8 Hz, H-1’), 4.46 (ddd, 1H, J3, F2a = 21.2 Hz, J3, 4 = 9.2 Hz, J3, F2e = 5.5 Hz, H-3), 4.07 (dt, 1H, J5, 4 = 9.2 Hz, J5, 6a = J5, 6b = 3.0 Hz, H-5), 3.85 (t, 1H, J4, 3 = J4, 5 = 9.2 Hz, H-4), 3.77-3.52 (m, 6H, H-6a, H-6b, H-3’, H-5’, H-6’a, H-6’b), 3.44 (dd, 1H, J2’, 3’ = 10.0 Hz, J2’, 1’ = 3.8 Hz, H-2’), 3.31 (t, 1H, J4’, 3’ = J4’, 5’ = 9.1 Hz, H-4’). ¹³C-NMR data (D2O, 100 MHz): δ 119.42 (t, JC,F = 244.0 Hz), 102.24, 91.05 (t, JC,F = 36.0 Hz), 76.32, 76.15, 75.71, 75.59, 74.39, 73.14 (t, JC,F = 19.0 Hz), 72.14, 63.28, 62.65. ¹⁹F-NMR data (D2O, 282 MHz): δ -117.7 (dd, JFe, Fa = 249.0 Hz, JFe, 3 = 5.5 Hz, F-2e), -119.7 (ddd, JFa, Fe = 249.0 Hz, JFa, 3 = 21.2 Hz, JFa, 1 = 7.3 Hz, F-2a). HRMS data: Calcd. for C12H19³⁵ClF2O9+Na⁺: 403.0583. Found: 403.0589.

2,3,4,6-Tetra-O-acetyl-α-D-glucopyranosyl fluoride (2.38)²⁹⁶

Per-O-acetylated glucose (5.03 g, 12.9 mmol) was dissolved in 25 mL of HF/pyridine and stirred at 4 °C overnight. The reaction mixture was diluted with EtOAc and saturated NaHCO3 was then added portion-wise until gas evolution ceased. The organic phase was then washed successively with water and brine. After drying over anhydrous MgSO4, the solvent was evaporated in vacuo. Flash column chromatography (petroleum ether: EtOAc = 2:1) yielded the pure product as a white solid (3.33 g, 9.5 mmol, 74%). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.73 (dd, 1H, J1, F1 = 52.9 Hz, J1, 2 = 2.8 Hz, H-1), 5.48 (t, 1H, J3, 2 = J3, 4 = 9.8 Hz, H-3), 5.13 (t, J4, 5 = J4, 3 = 9.8 Hz, H-4), 4.94 (ddd, 1H, J2, F1 = 24.2 Hz, J2, 3 = 9.8 Hz, J2,1 = 2.8 Hz, H-2), 4.28 – 4.10 (m, 3H, H-5, H-6, H-6’), 2.08 (2s, 6H, 2OAc), 2.02 (s, 3H, OAc), 2.01 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -150.4 (dd, JF1, H1 = 52.9 Hz, JF1, H2 = 24.2 Hz, F-1).

α-D-Glucopyranosyl fluoride (3.1)²⁹⁷

Protected glycosyl fluoride (2.38, 0.29 g, 0.83 mmol) was dissolved in 20 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash column chromatography (EtOAc: MeOH: H2O = 17:2:1) afforded the pure product as a white powder (150 mg, 0.83 mmol, quantitative yield). ¹H-NMR (D2O, 300 MHz) data: δ 5.58 (dd, 1H, J1, F1 = 53.4 Hz, J1, 2 = 2.4 Hz, H-1), 3.78 – 3.40 (m, 6H, H-2, H-3, H-4, H-5, H-6, H-6’). ¹⁹F-NMR (D2O, 282 MHz) data: δ -150.9 (dd, JF1, H1 = 53.4 Hz, JF1, H2 = 27 Hz, F-1).

2,3,4,6-Tetra-O-acetyl-5-bromo-α-D-glucopyranosyl fluoride (2.39)¹⁷⁵

Per-O-acetylated glucosyl fluoride (2.38, 1.52 g, 4.34 mmol) was dissolved in 60 mL of dry carbon tetrachloride and poured into the bottom of an Ace water jacket-cooled immersion-well photo-reactor equipped with a N2 gas inlet/outlet. N-Bromosuccinimide (recrystallized from water, 3.07 g, 17.2 mmol, 4 eq.) was added and the reaction mixture was stirred vigorously under N2 with illumination by a 600 W lamp for 26 hours. The reaction was stopped by turning off the light and the mixture was cooled to room temperature. The solvent was then evaporated under vacuum and the product was purified by flash column chromatography (petroleum ether: EtOAc = 3:1) to yield 2.39 as a white foam (1.15 g, 2.68 mmol, 62%). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.91 (dd, 1H, J1, F1 = 53.3 Hz, J1,2 = 3.3 Hz, H-1), 5.86 (t, 1H, J2, 3 = J3, 4 = 10.3 Hz, H-3), 5.20 (d, 1H, J4, 3 = 10.1 Hz, H-4), 5.05 (ddd, J2, F1 = 23.6 Hz, J2, 3 = 10.4 Hz, J2, 1 = 3.3 Hz, H-2), 4.38 (dd, 2H, J6, 6’ = 12.4 Hz, H-6, H-6’), 2.14 (s, 3H, OAc), 2.10 (s, 3H, OAc), 2.09 (s, 3H, OAc), 2.04 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -146.7 (dd, JF1, H1 = 53 Hz, JF1, H2 = 24 Hz, F-1).

2,3,4,6-Tetra-O-acetyl-5-fluoro-β-L-idopyranosyl fluoride (2.40)¹⁷⁵

Per-O-acetylated 5-bromo-α-glucosyl fluoride (2.39, 1.15 g, 2.68 mmol) was dissolved in 20 mL of dry CH3CN. AgF (1.54 g, 12.1 mmol, 4.5 eq.) was added and the reaction was stirred vigorously under N2 in the dark (wrapped with aluminum foil) at room temperature for 48 hours. The reaction mixture was filtered to remove the black silver by-products and then the solvent was evaporated in vacuo. Flash column chromatography (hexanes: EtOAc = 2:1) afforded the pure product as a colorless syrup (0.561 g, 1.52 mmol, 72%). The reaction did not go to completion under these conditions and 0.25 g of starting material was recovered after column chromatography; the reported yield is corrected for the recovered starting material. ¹H-NMR (CDCl3, 300 MHz) data: δ 5.78 (dd, 1H, J1, F1 = 58.1 Hz, J1, 2 = 2.0 Hz, H-1), 5.34 (ddd, 1H, J2, F1 = 25.8 Hz, J2, 3 = 8.5 Hz, J2, 1 = 2.0 Hz, H-2), 5.27 (dd, 1H, J3,2 = 8.5 Hz, J3, 4 = 1.6 Hz, H-3), 5.20 (dd, 1H, J4, 3 = 1.6 Hz, J4, F5 = 5.1 Hz, H-4), 4.41 (dd, 1H, J6, F5 = 24.4 Hz, J6, 6’ = 12.4 Hz, H-6), 4.16 (t, 1H, J6’, F5 = J6’, 6 = 12.4 Hz, H-6’), 2.10 (s, 3H, OAc), 2.09 (s, 6H, 2OAc), 2.06 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -106.7 (m, F-5), -139.0 (ddd, JF1, H1 = 58 Hz, JF1, F5 = JF1, H2 = 26 Hz, F-1). MS: Calcd. for C14H18F2O9 + Na⁺: 391.1, Found: 391.0.
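The 72% figure for 2.40 is a yield corrected for recovered starting material. The arithmetic can be sketched in Python as follows; the function name `corrected_yield` is hypothetical, and the molar mass of bromide 2.39 (C14H18BrFO9, about 429.2 g/mol) is computed here from standard atomic weights rather than quoted from the text.

```python
def corrected_yield(product_mmol, start_mmol, recovered_g, start_molar_mass):
    """Percent yield based on the starting material actually consumed."""
    recovered_mmol = recovered_g / start_molar_mass * 1000.0
    consumed_mmol = start_mmol - recovered_mmol
    return 100.0 * product_mmol / consumed_mmol

# Average molar mass (g/mol) of 2.39, C14H18BrFO9, from standard
# atomic weights (an assumption of this sketch, not a value from the text).
M_239 = 14 * 12.011 + 18 * 1.008 + 79.904 + 18.998 + 9 * 15.999

y = corrected_yield(product_mmol=1.52, start_mmol=2.68,
                    recovered_g=0.25, start_molar_mass=M_239)
print(f"{y:.0f}%")
```

Without the correction, the same 1.52 mmol of product from 2.68 mmol of bromide would correspond to a lower uncorrected yield, which is why the basis of the reported figure is stated explicitly.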

5-Fluoro-β-L-idopyranosyl fluoride (2.31)¹⁷⁵

Protected 5-fluoro idosyl fluoride (2.40, 89 mg, 0.24 mmol) was dissolved in 14 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash column chromatography (EtOAc: MeOH: H2O = 27:2:1) afforded the pure product as a colorless syrup (34 mg, 0.17 mmol, 71%). ¹H-NMR (D2O, 300 MHz) data: δ 5.69 (dd, 1H, J1, F1 = 55.8 Hz, J1, 2 = 2.3 Hz, H-1), 3.89-3.5 (m, 5H, H-2, H-3, H-4, H-6, H-6’). ¹⁹F-NMR (D2O, 282 MHz) data: δ -116.4 (ddd, F-5), -142.8 (ddd, JF1, H1 = 56 Hz, JF1, H2 = 26 Hz, JF1, F5 = 12 Hz, F-1). MS: Calcd. for C6H10F2O5 + Na⁺: 223.0, Found: 223.1.

2,3,4,6-Tetra-O-acetyl-5-fluoro-α-D-glucopyranosyl fluoride (2.41)¹⁷⁵

Protected 5-fluoro idosyl fluoride (2.40, 200 mg, 0.54 mmol, dried under vacuum for 2 hours) was dissolved in 4 mL of HF/pyridine. A catalytic amount of AgF was added and the mixture was stirred under argon in the dark at room temperature for 16 hours. The mixture was diluted with EtOAc and the solution was washed successively with saturated NaHCO3 (until gas evolution ceased), water and brine. After drying over anhydrous MgSO4, the solvent was evaporated in vacuo and the crude product was further purified by flash column chromatography (hexanes: EtOAc = 2:1). Yield: 45 mg (0.12 mmol, 23%, colorless syrup). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.85 (dd, 1H, J1, F1 = 52.5 Hz, J1, 2 = 3.1 Hz, H-1), 5.77 (t, 1H, J3, 4 = J3, 2 = 10.3 Hz, H-3), 5.32 (dd, 1H, J4, F5 = 22.3 Hz, J4, 3 = 10.1 Hz, H-4), 5.06 (ddd, J2, F1 = 23.4 Hz, J2, 3 = 10.0 Hz, J2, 1 = 3.1 Hz, H-2), 4.31 (dd, 1H, J6, 6’ = 12.0 Hz, J6, F5 = 6.0 Hz, H-6), 4.08 (dd, 1H, J6’, 6 = 12.0 Hz, J6’, F5 = 3.8 Hz, H-6’), 2.13 (s, 3H, OAc), 2.12 (s, 3H, OAc), 2.11 (s, 3H, OAc), 2.05 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -125.2 (F-5), -141.5 (F-1).

5-Fluoro-α-D-glucopyranosyl fluoride (2.30)¹⁷⁵

Protected 5-fluoro glucosyl fluoride (2.41, 54 mg, 0.15 mmol) was dissolved in 7 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash column chromatography (EtOAc: MeOH: H2O = 17:2:1) afforded the pure product as a colorless syrup (27 mg, 0.14 mmol, 88%). ¹H-NMR (D2O, 300 MHz) data: δ 5.73 (dd, 1H, J1, F1 = 53.7 Hz, J1, 2 = 2.5 Hz, H-1), 3.94 (t, 1H, J3, 2 = J3, 4 = 9.9 Hz, H-3), 3.75-3.60 (m, 4H, H-2, H-4, H-6, H-6’). ¹⁹F-NMR (D2O, 282 MHz) data: δ -130.5 (F-5), -142.9 (dt, JF1, H1 = 54 Hz, JF1, F2 = JF1, H2 = 26 Hz, F-1). MS: Calcd. for C6H10F2O5 + Na⁺: 223.0, Found: 223.1.

3,4,6-Tri-O-acetyl-2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (3.12)¹⁸¹

Per-O-acetylated 2-fluoro glucoside (2.33, 0.75 g, 2.1 mmol) was dissolved in 20 mL of HF/pyridine and this mixture was stirred at 0 °C for 12 hours. The mixture was diluted with EtOAc and washed successively with saturated NaHCO3 (until gas evolution ceased), water and brine. After drying over anhydrous MgSO4, the solvent was evaporated in vacuo and the crude product was further purified by flash column chromatography (petroleum ether: EtOAc = 3:2). Yield: 0.28 g (0.90 mmol, 43%, white powder). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.80 (dd, 1H, J1, F1 = 52.7 Hz, J1, 2 = 2.8 Hz, H-1), 5.56 (dt, 1H, J3, F2 = 12.3 Hz, J3, 2 = J3, 4 = 9.6 Hz, H-3), 5.10 (t, 1H, J4, 5 = J4, 3 = 9.6 Hz, H-4), 4.54 (dddd, 1H, J2, F2 = 48.1 Hz, J2, F1 = 23.7 Hz, J2, 3 = 9.6 Hz, J2, 1 = 2.9 Hz, H-2), 4.31 – 4.10 (m, 3H, H-5, H-6, H-6’), 2.08 (s, 3H, OAc), 2.07 (s, 3H, OAc), 2.04 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -151.8 (ddd, JF1, H1 = 52 Hz, JF1, H2 = 24 Hz, JF1, F2 = 19 Hz, F-1), -204.8 (ddd, JF2, H2 = 48 Hz, JF2, F1 = 19 Hz, JF2, H3 = 12 Hz, F-2). MS: Calcd. for C12H16F2O7+Na⁺: 333.1, Found: 333.1.

2-Deoxy-2-fluoro-α-D-glucopyranosyl fluoride (3.13)¹⁸¹

Protected 2-fluoro glucosyl fluoride (3.12, 16 mg, 0.05 mmol) was dissolved in 5 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash column chromatography (EtOAc: MeOH: H2O = 7:2:1) afforded the pure product as a colorless syrup (10 mg, 0.05 mmol, quantitative yield). ¹H-NMR (D2O, 300 MHz) data: δ 5.78 (dd, 1H, J1, F1 = 53.8 Hz, J1, 2 = 2.9 Hz, H-1), 4.37 (dddd, 1H, J2, F2 = 48.1 Hz, J2, F1 = 24.6 Hz, J2, 3 = 9.6 Hz, J2, 1 = 2.9 Hz, H-2), 3.90 (dt, 1H, J3, F2 = 13.6 Hz, J3, 2 = J3, 4 = 9.5 Hz, H-3), 3.79 – 3.64 (m, 3H, H-5, H-6, H-6’), 3.45 (t, 1H, J4, 3 = J4, 5 = 9.6 Hz, H-4). ¹⁹F-NMR (D2O, 282 MHz) data: δ -150.7 (ddd, JF1, H1 = 53 Hz, JF1, H2 = 24 Hz, JF1, F2 = 20 Hz, F-1), -204.6 (ddd, JF2, 2 = 48 Hz, JF2, F1 = 20 Hz, JF2, H3 = 14 Hz, F-2). MS: Calcd. for C6H10F2O4 + NH4⁺: 202.1, Found: 202.1.

3,6-Di-O-acetyl-4-O-[2’,3’,4’,6’-tetra-O-acetyl-α-D-glucopyranosyl]-2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (2.18)²⁹⁵

Per-O-acetylated maltal (2.8, 24 mg, 43 μmol) was dissolved in anhydrous Et2O/CH2Cl2 mixed solvent (VEt2O/VCH2Cl2 = 5:1). XeF2 (8 mg, 47 μmol, 1.1 eq.) and a catalytic amount of BF3·Et2O were then added and the mixture was stirred vigorously at room temperature for three hours. The reaction mixture was redissolved in EtOAc, washed successively with water, saturated NaHCO3 and brine, and dried over anhydrous MgSO4, followed by evaporation. Flash column chromatography (petroleum ether: EtOAc = 3:2) afforded the final product as a white foam (11 mg, 18 μmol, 42%). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.74 (dd, 1H, J1, F1 = 52.7 Hz, J1, 2 = 2.7 Hz, H-1), 5.63 (dt, 1H, J3, F2 = 10.9 Hz, J3, 2 = J3, 4 = 9.4 Hz, H-3), 5.46 (d, 1H, J1’, 2’ = 4.0 Hz, H-1’), 5.40 – 3.91 (m, 11H), 2.14 (s, 3H, OAc), 2.10 (s, 6H, 2OAc), 2.08 (s, 3H, OAc), 2.03 (s, 3H, OAc), 2.01 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -151.2 (dt, JF1, H1 = 53 Hz, JF1, H2 = JF1, F2 = 22 Hz, F-1), -205.6 (ddd, JF2, H2 = 50 Hz, JF2, F1 = 22 Hz, JF2, H3 = 11 Hz, F-2). MS: Calcd. for C24H32F2O15+Na⁺: 621.2, Found: 621.1.

4-O-[α-D-Glucopyranosyl]-2-deoxy-2-fluoro-α-D-glucopyranosyl fluoride (2.19)¹²³

Protected 2-deoxy-2-fluoro maltosyl fluoride (2.18, 48 mg, 80 μmol) was dissolved in 10 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.3. Flash column chromatography conditions: EtOAc: MeOH: water = 12:2:1. Yield: 21 mg (61 μmol, 76%, white foam). ¹H-NMR (D2O, 300 MHz) data: δ 5.78 (dd, 1H, J1, F1 = 53.9 Hz, J1, 2 = 2.8 Hz, H-1), 5.33 (d, 1H, J1’, 2’ = 3.8 Hz, H-1’), 4.41 (dddd, 1H, J2, F2 = 47.9 Hz, J2, F1 = 24.4 Hz, J2, 3 = 9.5 Hz, J2, 1 = 2.9 Hz, H-2), 4.18 (dt, 1H, J3, F2 = 13.8 Hz, J3, 2 = J3, 4 = 9.5 Hz, H-3), 3.90 – 3.28 (m, 10H). ¹⁹F-NMR (D2O, 282 MHz) data: δ -151.0 (ddd, JF1, H1 = 53 Hz, JF1, H2 = 24 Hz, JF1, F2 = 22 Hz, F-1), -205.3 (ddd, JF2, H2 = 48 Hz, JF2, F1 = 22 Hz, JF2, H3 = 14 Hz, F-2). MS: Calcd. for C12H20F2O9+Na⁺: 369.1, Found: 369.1.

2,4-Dinitrophenyl 2,3,4,6-tetra-O-acetyl-α-D-glucopyranoside (3.9)²³⁹

The protected hemiacetal (3.8, 1.0 g, 2.9 mmol) was dissolved in 50 mL of dry DMF. 1-Fluoro-2,4-dinitrobenzene (0.70 g, 3.8 mmol, 1.3 eq.) and K2CO3 (1.0 g, 7.2 mmol, 2.5 eq.) were then added to the solution and the reaction was stirred at room temperature for 5 hours. The reaction mixture was redissolved in EtOAc, washed successively with water, saturated NaHCO3 and brine, dried over anhydrous MgSO4 and evaporated. Flash column chromatography (petroleum ether: EtOAc = 3:2) afforded the final product as a white solid. Yield: 0.73 g (1.4 mmol, 48%). ¹H-NMR (CDCl3, 300 MHz) data: δ 8.74 (d, 1H, JAr-H3’,Ar-H5’ = 2.6 Hz, Ar-H3’), 8.43 (dd, 1H, JAr-H5’,Ar-H6’ = 9.2 Hz, JAr-H5’,Ar-H3’ = 2.6 Hz, Ar-H5’), 7.52 (d, 1H, JAr-H6’,Ar-H5’ = 9.2 Hz, Ar-H6’), 6.02 (d, 1H, J1,2 = 3.5 Hz, H-1), 5.63 (t, 1H, J3,2 = J3,4 = 9.9 Hz, H-3), 5.19 (t, 1H, J4,3 = J4,5 = 9.9 Hz, H-4), 5.05 (dd, 1H, J2,3 = 9.9 Hz, J2,1 = 3.5 Hz, H-2), 4.26 – 4.07 (m, 3H, H-5, H-6, H-6’), 2.09 (s, 3H, OAc), 2.05 (s, 6H, 2OAc). MS: Calcd. for C20H22N2O14 + Na⁺: 537.1, Found: 537.0.

2,4-Dinitrophenyl α-D-glucopyranoside (3.2)²³⁹

The protected aryl glycoside (3.9, 0.73 g, 1.4 mmol) was dissolved in 80 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.5. Two flash column chromatography conditions were used: the first was EtOAc: MeOH: water = 17:2:1 and the second was a CH2Cl2: MeOH gradient from 20:1 to 5:1. Yield: 0.50 g (1.4 mmol, quantitative yield, white solid). ¹H-NMR (D2O, 300 MHz) data: δ 8.80 (d, 1H, JAr-H3’,Ar-H5’ = 2.8 Hz, Ar-H3’), 8.43 (dd, 1H, JAr-H5’,Ar-H6’ = 9.4 Hz, JAr-H5’,Ar-H3’ = 2.8 Hz, Ar-H5’), 7.56 (d, 1H, JAr-H6’,Ar-H5’ = 9.4 Hz, Ar-H6’), 5.98 (d, 1H, J1,2 = 3.4 Hz, H-1), 3.87 (t, 1H, J3,4 = J3,2 = 9.4 Hz, H-3), 3.72 (dd, 1H, J2,3 = 9.4 Hz, J2,1 = 3.4 Hz, H-2), 3.63 – 3.58 (m, 3H, H-5, H-6, H-6’), 3.45 (t, 1H, J4,5 = J4,3 = 9.4 Hz, H-4). MS: Calcd. for C12H14N2O10 + Na⁺: 369.1, Found: 369.1.

1,2,3,6,2’,3’-Hexa-O-acetyl-4’,6’-O-benzylidene-maltose (2.22)¹⁷⁴

To 100 mL of dry DMF, 40 g of maltose (monohydrate, 111 mmol), 18 mL of benzaldehyde dimethylacetal and 100 mg of p-toluenesulfonic acid were added and the reaction mixture was rotated on a rotary evaporator at 65 °C under a weak vacuum for 7 hours. After the reaction was complete, the mixture was neutralized with basic exchange resin, filtered, and concentrated under vacuum by adding toluene as co-evaporant. Water was added to dissolve the remaining syrup and EtOAc was added to extract the non-polar by-products. This organic layer was again washed with water and all the aqueous fractions were combined and concentrated under vacuum, again by adding toluene as co-evaporant. The remaining mixture was purified by flash column chromatography (EtOAc: MeOH: water = 8:1:1) to yield the benzylidene product, which was acetylated without characterization, as follows.

To the product from the previous step, 240 mL of pyridine and 160 mL of acetic anhydride were added according to the general acetylation method described in Section 5.2.2.2. The product (2.22) was obtained as a mixture of anomers. Yield: 16.09 g (24 mmol, 22% from maltose, white powder). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 7.42 – 7.27 (m, 5H, Ar-H), 6.21 (d, J1α,2α = 3.7 Hz, H-1α), 5.72 (d, J1β,2β = 8.1 Hz, H-1β). MS: Calcd. for C31H38O17 + Na⁺: 705.2. Found: 705.3.

1,2,3,6,2’,3’-Hexa-O-acetyl-6’-O-benzyl-maltose (2.23)¹⁷⁴

To 200 mL of dry THF, 15.5 g (23 mmol) of the acetylated benzylidene maltose derivative (2.22) and 15 g of Na(CN)BH3 were added. 1 M HCl in ether solution was then added portion-wise to the reaction mixture until the solution remained acidic and gas evolution ceased. The mixture was stirred for another hour until TLC showed the reaction was complete. After most of the solvent had been evaporated in vacuo, the residue was re-dissolved in CH2Cl2 and washed with saturated NaHCO3. The aqueous phases were extracted three times with CH2Cl2. The combined organic phases were washed with water and brine, then dried over anhydrous MgSO4 and concentrated. Gradient flash column chromatography (petroleum ether: EtOAc = 4:1 to 1:1) yielded the product (2.23, anomeric mixture) as a white foam (12.1 g, 18 mmol, 78%). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 7.32 (m, 5H, Ar-H), 6.21 (d, J1α,2α = 3.6 Hz, H-1α), 5.71 (d, J1β,2β = 8.2 Hz, H-1β). MS: Calcd. for C31H40O17 + Na⁺: 707.2. Found: 707.3.

1,2,3,6,2’,3’-Hexa-O-acetyl-6’-O-benzyl-4’-O-methyl-maltose (2.24)¹⁷⁴

The 4’-O-unprotected maltose derivative (2.23, 0.26 g, 0.38 mmol) was dissolved in 10 mL of dry CH2Cl2 and cooled to 0 °C. BF3·Et2O (200 μL) and trimethylsilyldiazomethane (0.8 mL) were added and the mixture was stirred at 0 °C until the reaction was complete. After concentration in vacuo, flash column chromatography (diethyl ether: petroleum ether = 4:1) afforded the product (2.24, anomeric mixture) as a white foam (80 mg, 0.11 mmol, 29%). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 7.35 (m, 5H, Ar-H), 6.18 (d, J1α,2α = 3.8 Hz, H-1α), 5.69 (d, J1β,2β = 8.1 Hz, H-1β), 3.33 (s, 4’OCH3-α), 3.32 (s, 4’OCH3-β). MS: Calcd. for C32H42O17 + Na⁺: 721.2. Found: 721.4.

1,2,3,6,2’,3’-Hexa-O-acetyl-4’-O-methyl-maltose (2.25)¹⁷⁴

To a solution of 1.10 g (1.6 mmol) of protected maltose derivative (2.24) in 40 mL EtOAc, 1.5 g of Pd/C catalyst (10 wt.%) was added and the mixture was stirred under H2 (1 atm.) at room temperature for 2 days. The catalyst was then filtered off and the solvent was evaporated in vacuo. Flash column chromatography (petroleum ether: EtOAc = 2:3) yielded the product as a mixture of anomers (0.95 g, 1.6 mmol, quantitative yield). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 6.18 (d, J1α,2α = 3.7 Hz, H-1α), 5.69 (d, J1β,2β = 8.1 Hz, H-1β), 3.40 (s, 4’OCH3). MS: Calcd. for C25H36O17 + Na⁺: 631.2. Found: 631.3.

1,2,3,6,2’,3’,6’-Hepta-O-acetyl-4’-O-methyl-maltose (2.26)¹⁷⁴

1,2,3,6,2’,3’-Hexa-O-acetyl-4’-O-methyl-maltose (2.25, 0.95 g, 1.6 mmol) was dissolved in 30 mL of pyridine and the solution was cooled to 0 °C. Acetic anhydride (15 mL) was added according to the general acetylation method described in Section 5.2.2.2. Yield: 1.0 g (1.5 mmol, 96%, colorless syrup). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 6.21 (d, J1α,2α = 3.7 Hz, H-1α), 5.71 (d, J1β,2β = 8.1 Hz, H-1β), 3.40 (s, 4’OCH3-α), 3.39 (s, 4’OCH3-β). MS: Calcd. for C27H38O18 + Na⁺: 673.2. Found: 673.3.

2,3,6,2’,3’,6’-Hexa-O-acetyl-4’-O-methyl-α-maltosyl fluoride (2.27)¹⁷⁴

To 200 mg (0.31 mmol) of acetylated 4’-O-methyl maltose (2.26), HF/pyridine (6 mL) was added and the solution was stirred at 0 °C for 9 hours. The reaction mixture was diluted with EtOAc and saturated NaHCO3 was added portion-wise until there was no further release of gas bubbles. The organic phase was then washed successively with water and brine. After drying over anhydrous MgSO4, the solvent was evaporated in vacuo. Flash column chromatography (petroleum ether: EtOAc = 2:3) yielded the pure product as a white foam. Yield: 136 mg (0.22 mmol, 71%). ¹H-NMR (CDCl3, 300 MHz) data: δ 5.65 (dd, 1H, J1,F = 53.3 Hz, J1,2 = 2.7 Hz, H-1), 5.54 (t, 1H, J3,2 = J3,4 = 9.5 Hz, H-3), 5.35 (d, 1H, J1’,2’ = 4.0 Hz, H-1’), 5.34 (t, 1H, J3’,2’ = J3’,4’ = 9.9 Hz, H-3’), 4.84 (ddd, 1H, J2,F = 26.0 Hz, J2,3 = 9.5 Hz, J2,1 = 2.7 Hz, H-2), 4.77 (dd, 1H, J2’,3’ = 9.9 Hz, J2’,1’ = 4.0 Hz, H-2’), 4.54 – 4.16 (m, 5H, H-5, H-6a, H-6b, H-6’a, H-6’b), 4.03 (t, 1H, J4,3 = J4,5 = 9.5 Hz, H-4), 3.77 (m, 1H, H-5’), 3.42 (s, 3H, 4’OMe), 3.33 (t, 1H, J4’,3’ = J4’,5’ = 9.9 Hz, H-4’), 2.15 (s, 3H, OAc), 2.13 (s, 3H, OAc), 2.08 (s, 3H, OAc), 2.07 (s, 3H, OAc), 2.06 (s, 3H, OAc), 2.02 (s, 3H, OAc). ¹⁹F-NMR (CDCl3, 282 MHz) data: δ -149.2 (dd, JF,H1 = 53 Hz, JF,H2 = 26 Hz, F-1). MS: Calcd. for C25H35FO16 + Na⁺: 633.2. Found: 633.2.

4’-O-Methyl-maltosyl fluoride (2.21)¹⁷⁴

Protected 4’OMeG2F (2.27, 136 mg, 0.22 mmol) was dissolved in 10 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.4. Flash column chromatography (EtOAc: MeOH: H2O = 7:2:1) afforded the pure product (80 mg, 0.22 mmol, quantitative yield, white solid). ¹H-NMR (D2O, 300 MHz) data: δ 5.56 (dd, 1H, J1,F = 53.4 Hz, J1,2 = 2.5 Hz, H-1), 5.29 (d, 1H, J1’,2’ = 3.9 Hz, H-1’), 3.86 – 3.46 (m, 11H), 3.45 (s, 3H, 4’OMe), 3.14 (t, 1H, J4’,3’ = J4’,5’ = 9.5 Hz, H-4’). ¹⁹F-NMR (D2O, 282 MHz) data: δ -150.5 (dd, JF,H1 = 53 Hz, JF,H2 = 27 Hz, F-1). MS: Calcd. for C13H23FO10 + Na⁺: 381.1. Found: 381.1.

1,2,3,6,2’,3’,6’-Hepta-O-benzyl-maltose (2.28)

To 150 mL of dry DMF, 12.85 g of maltose (monohydrate, 35.7 mmol), 7 mL of benzaldehyde dimethylacetal and 600 mg of p-toluenesulfonic acid were added. The reaction mixture was rotated on a rotary evaporator at 65 °C under a weak vacuum for 5 hours. After this, 21 g of NaH (60% in mineral oil) was added to the solution and the mixture was stirred at room temperature for 1.5 hours. The mixture was then cooled to 0 °C, 70 mL of benzyl bromide was added and the reaction mixture was gradually allowed to warm up to room temperature, then stirred overnight. The reaction was cooled to 0 °C again before methanol was added to quench the reaction. After concentration of the solvent in vacuo, EtOAc was added to dilute the organic phase, which was washed with water and brine, then dried over anhydrous MgSO4 and concentrated in vacuo. Gradient flash column chromatography (petroleum ether: EtOAc = 10:1 to 6:1) was used to purify the major products (18.8 g, white solid) as a mixture of four compounds: 1,2,3,6,2’,3’-hexa-O-benzyl-4’,6’-O-benzylidene-α/β-maltose (MS: Calcd. for C61H62O11 + Na⁺: 993.4. Found: 994.0.) and 1,2,3,6,2’,3’,4’,6’-octa-O-benzyl-α/β-maltose (MS: Calcd. for C68H70O11 + Na⁺: 1085.5. Found: 1085.9.)

To 200 mL of dry THF, 18.8 g of the above mixture of benzylidene/benzyl protected maltose derivatives and 6 g of Na(CN)BH3 were added. 1 M HCl in ether solution was then added portion-wise to the reaction mixture until the solution remained acidic and gas evolution ceased. The mixture was then stirred for another hour until TLC showed that the reaction was complete. After most of the solvent had been evaporated in vacuo, the residue was re-dissolved in CH2Cl2 and washed with saturated NaHCO3. The aqueous phase was extracted three times with CH2Cl2 and all the organic phases were then combined, washed with water and brine, dried over anhydrous MgSO4 and concentrated. Gradient flash column chromatography (petroleum ether: EtOAc = 5:1 to 3:1) afforded the product 2.28 as a white foam. Yield: 8.0 g (8.2 mmol, 23% from maltose). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 7.42 – 7.15 (m, Ar-H), 5.68 (d, 1H, J1,2 = 3.7 Hz, H-1α). MS: Calcd. for C61H64O11 + Na⁺: 995.4. Found: 996.0.

1,2,3,6,2’,3’,6’-Hepta-O-benzyl-4’-O-methyl-maltose (2.29)

The 4’-O-unprotected maltose derivative (2.28, 8.0 g, 8.2 mmol) was dissolved in 250 mL of dry THF. NaH (750 mg, 60% in mineral oil) was added, the mixture was stirred at room temperature for 1 hour under argon, then MeI (1.1 mL) was added. The next day TLC showed that 70% of the starting material had been consumed. Another 700 mg of NaH (60% in mineral oil) and 1.1 mL of MeI were added and the mixture was stirred for a total of 42 hours until TLC showed that the reaction was complete. The reaction was quenched by adding methanol and the mixture was evaporated in vacuo. The residue was re-dissolved in EtOAc and washed with water and brine, then dried over anhydrous MgSO4 and concentrated in vacuo. Flash column chromatography (petroleum ether: EtOAc = 5:1) yielded the title product 2.29 (7.37 g, 7.5 mmol, 91%). ¹H-NMR (CDCl3, 300 MHz) data: selected data only δ 7.44 – 7.13 (m, Ar-H), 5.67 (d, 1H, J1,2 = 3.9 Hz, H-1α), 3.45 (s, 3H, H-4’α). MS: Calcd. for C62H66O11 + Na⁺: 1009.5. Found: 1009.8.

2,4-Dinitrophenyl 2-acetamido-3,4,6-tri-O-acetyl-2-deoxy-α-D-galactopyranoside (4.6)²⁴⁰

2-Acetamido-3,4,6-tri-O-acetyl-2-deoxy-α-D-galactopyranose (4.5, 420 mg, 1.2 mmol) was dissolved in 12 mL of dry DMF. To this solution, 260 mg (1.2 eq.) of 1-fluoro-2,4-dinitrobenzene and 420 mg (2.5 eq.) of K2CO3 were added and the mixture was stirred at room temperature under argon for 3 hours. After aqueous workup, gradient flash column chromatography (petroleum ether: EtOAc = 3:2 to 1:1) yielded pure protected DNPGalNAc as a white solid (325 mg, 0.63 mmol, 53%). ¹H-NMR (CDCl3, 400 MHz) data: δ 8.79 (d, 1H, JAr-H3’,Ar-H5’ = 2.7 Hz, Ar-H3’), 8.44 (dd, 1H, JAr-H5’,Ar-H6’ = 9.3 Hz, JAr-H5’,Ar-H3’ = 2.7 Hz, Ar-H5’), 7.58 (d, 1H, JAr-H6’,Ar-H5’ = 9.3 Hz, Ar-H6’), 6.25 (d, 1H, JNHAc,2 = 8.9 Hz, NHAc), 5.86 (d, 1H, J1,2 = 3.5 Hz, H-1), 5.52 (dd, 1H, J4,3 = 3.4 Hz, J4,5 = 1.2 Hz, H-4), 5.34 (dd, 1H, J3,2 = 11.5 Hz, J3,4 = 3.4 Hz, H-3), 4.82 (ddd, 1H, J2,3 = 11.5 Hz, J2,NHAc = 8.9 Hz, J2,1 = 3.5 Hz, H-2), 4.29 (m, 1H, H-5), 4.12 – 4.09 (m, 2H, H-6, H-6’), 2.18 (s, 3H, OAc), 2.02 (s, 3H, OAc), 1.99 (s, 3H, OAc), 1.95 (s, 3H, OAc). MS: Calcd. for C20H23N3O13 + Na⁺: 536.1. Found: 536.3.

2,4-Dinitrophenyl 2-acetamido-2-deoxy-α-D-galactopyranoside (4.3)²⁴⁰

Protected DNPGalNAc (4.6, 144 mg, 0.28 mmol) was dissolved in 25 mL of dry MeOH and deacetylated by the general deprotection method described in Section 5.2.2.5. Flash column chromatography conditions: EtOAc: MeOH: water = 17:2:1. Yield: 108 mg (0.28 mmol, 99%, white solid). ¹H-NMR (MeOD, 300 MHz) data: δ 8.77 (d, 1H, JAr-H3’,Ar-H5’ = 2.8 Hz, Ar-H3’), 8.48 (dd, 1H, JAr-H5’,Ar-H6’ = 9.4 Hz, JAr-H5’,Ar-H3’ = 2.8 Hz, Ar-H5’), 7.74 (d, 1H, JAr-H6’,Ar-H5’ = 9.4 Hz, Ar-H6’), 6.04 (d, 1H, J1,2 = 3.3 Hz, H-1), 4.51 (dd, J2,3 = 10.6 Hz, J2,1 = 3.3 Hz, H-2), 4.03 (m, 2H, H-3, H-4), 3.91 (m, 1H, H-5), 3.74 – 3.71 (m, 2H, H-6, H-6’), 1.92 (s, 3H, COCH3). MS: Calcd. for C14H17N3O10 + Na⁺: 410.1. Found: 410.2.

2,4-Dinitrophenyl 2-acetamido-2-deoxy-3-O-[β-D-galactopyranosyl]-α-D-galactopyranoside (4.1)

To 3850 μL of d-H2O, the following components were added: 14.3 mg of UDP-galactose (25 μmol, 1.0 eq.), 9.8 mg of αDNPGalNAc (25 μmol, 1.0 eq.), 500 μL of NaOAc buffer (500 mM, pH = 6.0), 500 μL of MnCl2 (100 mM), 50 μL of DTT solution in water (100 mM) and 100 μL of wild type CgtB enzyme²⁸⁴ (5.4 mg/mL), giving a total volume of 5.0 mL. This mixture was incubated at room temperature for 24 hours, then loaded onto a C18 Sep-Pak column (Waters) and eluted with a MeOH/H2O mixed solvent (MeOH: H2O = 1:9 to 3:7) to remove all the salts. The collected fractions were freeze-dried and further purified by silica gel flash chromatography (EtOAc: MeOH: water = 17:2:1 to 7:2:1) to afford the pure product as a light yellow foam (7.2 mg, 13 μmol, 52%). ¹H-NMR (MeOD, 300 MHz) data: δ 8.76 (d, 1H, JAr-H3’,Ar-H5’ = 2.8 Hz, Ar-H3’), 8.48 (dd, 1H, JAr-H5’,Ar-H6’ = 9.3 Hz, JAr-H5’,Ar-H3’ = 2.8 Hz, Ar-H5’), 7.75 (d, 1H, JAr-H6’,Ar-H5’ = 9.3 Hz, Ar-H6’), 6.05 (d, 1H, J1,2 = 3.4 Hz, H-1), 4.62 (dd, 1H, J2,3 = 11.2 Hz, J2,1 = 3.4 Hz, H-2), 4.52 (d, 1H, J1’,2’ = 7.5 Hz, H-1’), 4.33 (d, 1H, J4,3 = 2.8 Hz, H-4), 4.12 (dd, 1H, J3,2 = 11.2 Hz, J3,4 = 2.8 Hz, H-3), 3.93 – 3.51 (m, 9H, H-5, H-6a, H-6b, H-2’, H-3’, H-4’, H-5’, H-6’a, H-6’b), 1.98 (s, 3H, OAc). HRMS: Calcd. for C20H27N3O15 + Na⁺: 572.1340. Found: 572.1336.

5.3 Molecular biology

5.3.1 Expression and purification of wild type HPA

The protocol used for expression and purification of HPA was first developed by Dr. Edwin Rydberg and was adapted as follows.¹⁵⁰ A colony from a YPD plate (Yeast extract Peptone Dextrose plate, 1% yeast extract (w/v), 2% peptone (w/v), 2% glucose (w/v)) was used to inoculate 5 ml of BMGY medium (Buffered Glycerol-complex Medium, 1% yeast extract (w/v), 2% peptone (w/v), 100 mM potassium phosphate, pH 6.0, 1.34% yeast nitrogen base (w/v), 4 × 10⁻⁵% biotin (w/v), 1% glycerol (v/v)) and this was grown at 30 °C overnight. This 5 ml of cell culture was transferred to 500 ml of BMGY medium and grown at 30 °C for two days, in the presence of 200 μl of antifoam C. The cells were centrifuged at 6000 rpm, 4 °C for 15 minutes. After the supernatant was discarded, the cells were re-suspended in 100 ml of BMMY medium (Buffered Methanol-complex Medium, 1% yeast extract (w/v), 2% peptone (w/v), 100 mM potassium phosphate, pH 6.0, 1.34% yeast nitrogen base (w/v), 4 × 10⁻⁵% biotin (w/v), 0.5% methanol (v/v)) with addition of 50 μl of antifoam C. 1 ml of 50% (v/v) methanol in water was added twice a day to the cell culture, which was grown at 30 °C for one day. The cells were spun down by centrifugation at 6000 rpm, 4 °C for 15 minutes. The salt concentration of the supernatant was adjusted to 0.5 M by adding 5 M NaCl solution and the solution was filtered before loading onto the Phenyl Sepharose column.

The filtered supernatant (with 0.5 M NaCl) was loaded onto a Phenyl Sepharose column which was pre-equilibrated with loading buffer (0.5 M NaCl, 100 mM potassium phosphate, pH = 7.5). The column was washed with additional loading buffer (3 × bed volume) followed by elution with de-ionized water. All the fractions were checked on SDS-PAGE and positive fractions were pooled and concentrated to 4 – 8 ml using an Amicon centricon centrifugal concentrator. Concentrated buffer was added to this deep green solution to make a final buffer concentration of 20 mM potassium phosphate, 25 mM NaCl, pH = 6.9 (cutting buffer). Endoglycosidase F-cellulose binding domain (endo-F) fusion protein (5 μL) was added and the mixture was left at room temperature for two days. This was then loaded onto a Q-Sepharose column that was pre-equilibrated with the cutting buffer. While the green “impurity” stuck to the column, the HPA passed through and was collected. After loading of the sample, the Q-Sepharose column was further washed with additional cutting buffer (4 × bed volume) and all the fractions from this column were combined and concentrated.

5.3.2 Expression and purification of wild type CgtB enzyme²⁸⁴

A colony (E. coli BL21(DE3) containing the cgtB gene) was picked and grown in LB medium (5 ml with 5 μl of ampicillin (100 mg/ml) added) at 37 °C overnight. 500 μl of this culture was transferred into 50 ml of LB medium (with 50 μl of ampicillin (100 mg/ml) added) and grown at 37 °C for four hours until OD600 was between 0.6 and 1.0. 500 μl of this culture was again transferred into 500 ml of 2YD medium (1.6% tryptone (w/v), 1% yeast extract (w/v), 0.5% NaCl (w/v)), with addition of 500 μl of ampicillin (100 mg/ml), and grown at 37 °C for another two hours. Expression was induced by adding 625 μl of IPTG (800 mM stock, giving a final concentration of 1 mM) and the culture was incubated at 20 °C for one day. The cells were harvested by centrifugation at 5000 rpm for 30 minutes at 4 °C and re-suspended in 40 ml of buffer A (20 mM HEPES, 200 mM NaCl, 5 mM 2-mercaptoethanol, 1 mM EDTA, 10% glycerol (v/v), pH = 7.0). The cells were lysed using a French press at 10,000 psi and the lysate was centrifuged at 15,000 rpm for 30 minutes at 4 °C. The supernatant was loaded onto an amylose column (bed volume: 10 ml) which was pre-equilibrated with buffer A. The column was washed with another 50 ml of buffer A followed by elution with buffer B (20 mM maltose in buffer A). The fractions containing CgtB enzyme were pooled and concentrated by centrifugation at 4000 rpm for 15 minutes at 4 °C. This was dialyzed overnight against 100 volumes of the dialysis solution (50 mM NaOAc buffer, pH = 6.0, 20% glycerol (v/v)) at 4 °C. The concentration of the protein was determined by the Bradford method.

5.4 Enzymology

5.4.1 Human pancreatic α-amylase (HPA)

5.4.1.1 General assay conditions.

All kinetic studies were performed at 30 °C in 50 mM phosphate buffer containing 100 mM NaCl, pH 6.9, unless otherwise noted. Hydrolysis of CNPG3 by either HPA or PPA was monitored by the increase of absorbance at 400 nm using a Varian CARY 300 spectrophotometer equipped with a circulating water bath. Plastic cuvettes with a path length of 1 cm were used. All enzyme kinetic data were processed using the program GraFit 5.0.13 (Erithacus Software Limited, 2006).

5.4.1.2 Kinetic evaluation of potential mechanism-based inhibitors of HPA

Samples of HPA (0.32 μM) were incubated in buffer in the presence of a range of concentrations of inactivator at 30 °C. Aliquots (10 μL) of these inactivation mixtures were removed at time intervals and diluted into assay cells containing a large volume (1 mL) of CNPG3 substrate (2 mM) pre-incubated at 30 °C. This effectively stops the inactivation both by dilution of the inactivator and by competition with an excess of substrate. The residual enzymatic activity was determined from the rate of hydrolysis of the substrate CNPG3, which is directly proportional to the amount of active enzyme. For testing 2-deoxy-2,2-dihalo glycosides, concentrations ranging from 10 mM to 50 mM of each compound were used. For elongation experiments, both donor (20 mM to 40 mM MeG2F or G3F) and acceptor (25 mM to 50 mM 5FGlcF or 5FIdoF) were incubated together with HPA.

5.4.1.3 Reactivation kinetics.

Fully inactivated HPA was freed of excess inactivator by 10-fold dilution with buffer (500 μL), followed by concentration at 4 °C using a 10 kDa nominal cut-off centrifugal concentrator to a volume of approximately 50 μL. This procedure was repeated a total of eight times. The resultant solution was then diluted to 200 μL with either buffer alone or with maltose solutions at different concentrations (20 mM – 150 mM), and then incubated at 30 °C. Reactivation was monitored by removal of aliquots (10 μL) at appropriate time intervals and assaying with CNPG3 under standard assay conditions.

5.4.2 Trehalose synthase (TreS) from M. smegmatis

5.4.2.1 Kinetic evaluation of different aryl α-glucosides as substrates

All kinetic studies of TreS were carried out at 37 °C in 40 mM potassium phosphate buffer, pH 6.8, unless otherwise noted. Michaelis-Menten kinetic parameters of various aryl α-glucosides were measured by monitoring the increase of absorbance at 400 nm using a Varian CARY 300 spectrophotometer equipped with a circulating water bath. Quartz cuvettes (200 μL) with a path length of 1 cm were used. An approximate Km value for each substrate was first determined by measuring initial rates at three widely different concentrations of the substrate. Accurate values of kcat and Km were then determined using 6 – 8 different substrate concentrations ranging from 0.3 Km to 5 Km (depending on the availability and solubility of the substrate). All enzyme kinetic data were then processed using the program GraFit 5.0.13. The extinction coefficients of phenols and the corresponding aryl α-glucosides were determined by measuring the absorbance of carefully prepared stock solutions of each compound in the same buffer at 37 °C. The calculated extinction coefficient differences for each phenol/aryl glycoside pair were in good agreement with earlier literature values.³⁶

5.4.2.2 α-Glucosyl fluoride (αGlcF) kinetics

Reaction rates were determined by monitoring the release of fluoride using a fluoride ion electrode. Glass vials containing various concentrations of αGlcF substrate (0.1 mM – 1 mM) were incubated at 37 °C to establish a steady-state spontaneous hydrolysis rate before 10 μL of stock solution of TreS enzyme (0.47 μM) was added to give a final volume of 300 μL. The initial rates determined (after subtraction of the spontaneous hydrolysis rate) were then processed using the GraFit program.

5.4.2.3 pH profile studies

The pH stability of TreS was first studied by incubating the enzyme in buffers of different pH values ranging from 5.4 – 8.4. Aliquots of the mixtures were then assayed using 2 mM 2,4-dinitrophenyl α-glucoside (DNPGlc) in standard assay buffer (40 mM potassium phosphate, pH = 6.8) which was pre-incubated at 37 °C. The following buffers were used: 40 mM citrate-phosphate pH 5.4 – 5.8, 40 mM sodium phosphate pH 5.8 – 8.0, 40 mM sodium borate pH 8.0 – 8.4. This study revealed that TreS was stable from pH = 5.8 – 8.0 for one hour. This range was selected for pH profile studies.

The pH dependence of kcat/Km for TreS was measured using the substrate depletion method at low substrate concentrations of DNPGlc ([S] << Km). Assays were performed with 50 μM DNPGlc. TreS was added to the pre-incubated substrate solution at 37 °C (final enzyme concentration: 0.7 μM) and the increase of absorbance at 400 nm was followed for 60 minutes until the reaction was complete. The data were fitted to a first order curve using GraFit 5.0 software. The kcat/Km values were obtained from these fits by dividing the first order rate constant by the enzyme concentration.
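When [S] << Km the Michaelis-Menten rate law reduces to v = (kcat/Km)[E][S], so the progress curve is a single exponential with rate constant kobs = (kcat/Km)[E]. The sketch below illustrates that relationship on synthetic data. It is not the GraFit procedure: it swaps the nonlinear fit for a log-linear least-squares fit and assumes the end-point absorbance is known. All names and the data values are hypothetical, apart from the 0.7 μM enzyme concentration quoted above.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kcat_over_km(times_s, absorbances, a_end, enzyme_molar):
    """Estimate kcat/Km (M^-1 s^-1) from a first-order progress curve.

    a_end is the absorbance at completion; points at or above it are skipped
    because ln(a_end - A) is undefined there.
    """
    points = [(t, math.log(a_end - a))
              for t, a in zip(times_s, absorbances) if a < a_end]
    slope, _ = linear_fit([p[0] for p in points], [p[1] for p in points])
    k_obs = -slope                      # first-order rate constant, s^-1
    return k_obs / enzyme_molar


# Synthetic, noise-free illustration (invented values, not thesis data).
E = 0.7e-6             # enzyme concentration in M, as quoted in the text
TRUE_KOBS = 2.0e-3     # s^-1, hypothetical
times = [0, 120, 240, 480, 900, 1500]
trace = [0.50 * (1 - math.exp(-TRUE_KOBS * t)) for t in times]
estimate = kcat_over_km(times, trace, a_end=0.50, enzyme_molar=E)
print(f"kcat/Km ~ {estimate:.3g} M^-1 s^-1")
```

With real, noisy traces a direct nonlinear fit of the exponential is generally preferable, since the logarithm amplifies noise near the end point.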

5.4.2.4 Fluorosugar inactivator studies

Samples of TreS (0.28 μM) were incubated in buffer in the presence of a range of concentrations of the inhibitors (1 mM – 10 mM 5FGlcF or 5 mM – 30 mM 5FIdoF) at 37 °C. Aliquots (10 μL) of these inactivation mixtures were removed at time intervals and diluted into assay cells containing 190 μL of DNPGlc substrate ([S] = 3 mM) pre-incubated at 37 °C. The residual enzymatic activity was determined from the rate of hydrolysis of the substrate, which is directly proportional to the amount of active enzyme. The process was monitored until 80 – 90% of the enzymatic activity had been lost. Pseudo-first order rate constants (kobs) for each inactivator concentration were calculated by fitting plots of the residual activity versus time to a single exponential equation using GraFit. The values of kobs were then fitted to the following Michaelis-Menten-type equation describing the inactivation process to obtain values of ki and Ki using GraFit software.

\[ k_{\mathrm{obs}} = \frac{k_i\,[\mathrm{I}]}{K_i + [\mathrm{I}]} \]

Protection experiments were carried out in a similar fashion to the above except that 5 μM casuarine was added to the inactivation mixture.
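The saturation equation kobs = ki[I]/(Ki + [I]) can be rearranged to 1/kobs = (Ki/ki)(1/[I]) + 1/ki, so ki and Ki follow from the intercept and slope of a double-reciprocal line. The sketch below uses that linearisation as a stand-in for the nonlinear GraFit fit described above. The function names and the data are hypothetical and serve only to show the arithmetic.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def inactivation_parameters(inhibitor_mM, k_obs_per_min):
    """Return (k_i in min^-1, K_i in mM) from a double-reciprocal fit.

    1/k_obs = (K_i / k_i) * (1/[I]) + 1/k_i
    """
    inv_conc = [1.0 / c for c in inhibitor_mM]
    inv_rate = [1.0 / k for k in k_obs_per_min]
    slope, intercept = linear_fit(inv_conc, inv_rate)
    k_inact = 1.0 / intercept
    K_I = slope * k_inact
    return k_inact, K_I


# Synthetic, noise-free illustration (invented values, not thesis data).
TRUE_K_INACT = 0.05    # min^-1, hypothetical
TRUE_K_I = 5.0         # mM, hypothetical
conc = [1.0, 2.0, 4.0, 6.0, 10.0]   # mM
rates = [TRUE_K_INACT * c / (TRUE_K_I + c) for c in conc]
k_inact, K_I = inactivation_parameters(conc, rates)
print(f"k_i ~ {k_inact:.3g} min^-1, K_i ~ {K_I:.3g} mM")
```

Double-reciprocal fits weight the lowest inactivator concentrations most heavily, which is one reason a direct nonlinear fit is usually preferred for experimental data.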

5.4.2.5 Ki or Ki’ determination of 5FGlcF/GHIL/casuarine as competitive inhibitors of TreS

The apparent Ki value of 5FGlcF was determined by measuring the rate of reaction at differing 5FGlcF concentrations (0 μM – 150 μM) for a single DNPGlc concentration (3 mM) in the presence of 23 nM TreS enzyme. The same concentration of enzyme was also used to react with different concentrations (1 mM – 10 mM) of DNPGlc to determine the Vmax value. The residual enzymatic rates in the presence of 5FGlcF were then plotted in the form of a Dixon plot (1/rate vs. [5FGlcF]); the line intersects y = 1/Vmax at [5FGlcF] = −Ki.

Similar experiments were performed to determine the Ki values of GHIL and casuarine as reversible inhibitors of TreS, except that multiple concentrations of DNPGlc (2 – 4 mM) were used. Lineweaver-Burk plots were employed to process the data.
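For a competitive inhibitor, 1/v is linear in [I] at fixed [S], and the Dixon line crosses y = 1/Vmax at [I] = −Ki. The sketch below shows that construction under stated assumptions: purely competitive inhibition, with Vmax measured separately. The function names and all numerical values are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ki_from_dixon(inhibitor_uM, rates, v_max):
    """Return K_i (uM) from the intersection of the Dixon line with y = 1/Vmax."""
    slope, intercept = linear_fit(inhibitor_uM, [1.0 / v for v in rates])
    x_cross = (1.0 / v_max - intercept) / slope   # equals -K_i
    return -x_cross


# Synthetic, noise-free illustration (invented values, not thesis data).
VMAX = 1.0       # arbitrary rate units
KM_MM = 1.5      # mM, hypothetical
S_MM = 3.0       # mM DNPGlc, the single substrate concentration in the text
TRUE_KI = 20.0   # uM, hypothetical
inhibitor = [0.0, 25.0, 50.0, 100.0, 150.0]   # uM
rates = [VMAX * S_MM / (KM_MM * (1 + i / TRUE_KI) + S_MM) for i in inhibitor]
print(f"K_i ~ {ki_from_dixon(inhibitor, rates, VMAX):.3g} uM")
```

The estimate depends directly on the separately measured Vmax, so any error in Vmax propagates into Ki.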

5.4.2.6 Reactivation experiments

TreS fully inactivated by 5FIdoF was freed of excess inactivator by 5-fold dilution with buffer into a total volume of 500 μL, followed by concentration at 4 °C using a 10 kDa nominal cut-off centrifugal concentrator to a volume of approximately 50 μL. This procedure was repeated a total of eight times. The resultant solution was then diluted to 100 μL with either buffer alone or with glucose solution (final [glucose] = 100 mM), and then incubated at 37 °C. Reactivation was monitored by removal of aliquots (10 μL) at appropriate time intervals, which were assayed using 3 mM DNPGlc substrate at 37 °C.

5.4.3 Endo-α-N-acetylgalactosaminidase from S. pneumoniae R6 (SpGH101)

5.4.3.1 Kinetic analysis of wild type SpGH101 and its various mutants

Enzyme activity was measured using the substrates Gal-β-1,3-GalNAc-α-pNP (pNP-TAg, obtained from Toronto Research Chemicals), Gal-β-1,3-GalNAc-α-DNP (DNP-TAg), or DNP-GalNAc. Assays were performed at 37 °C in 50 mM citrate-phosphate buffer pH 6.5 with 0.1 mg/mL acetylated bovine serum albumin. All kinetic assays with pNP-TAg, DNP-TAg and DNP-GalNAc were performed in continuous mode by monitoring the increase of absorbance at 400 nm. All Michaelis-Menten kinetic parameters were determined in a similar fashion to those described in Section 5.4.2.1. The extinction coefficients of 2,4-dinitrophenolate and 4-nitrophenolate were determined to be 10,800 M⁻¹cm⁻¹ and 3,700 M⁻¹cm⁻¹, respectively, under these conditions. For the chemical rescue experiments, 10 mM – 2 M of sodium acetate, sodium formate, sodium azide or potassium fluoride was added to the enzyme reactions and the pH of the solution was checked to ensure it remained at 6.5.
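The extinction coefficients above convert an absorbance slope into a rate of product release via the Beer-Lambert law, rate = (dA/dt)/(ε·l). The sketch below assumes a 1 cm path length and negligible substrate absorbance at 400 nm. The function name and the example slope are hypothetical; the ε values are those quoted in the text.

```python
def rate_from_absorbance(slope_abs_per_min, epsilon_per_M_cm, path_cm=1.0):
    """Convert dA/dt (absorbance units per minute) to a rate in uM per minute."""
    molar_per_min = slope_abs_per_min / (epsilon_per_M_cm * path_cm)
    return molar_per_min * 1e6


EPS_DNP = 10800.0   # M^-1 cm^-1, 2,4-dinitrophenolate at 400 nm (from the text)
EPS_PNP = 3700.0    # M^-1 cm^-1, 4-nitrophenolate at 400 nm (from the text)

# Hypothetical slope of 0.054 absorbance units per minute for a DNP substrate.
print(rate_from_absorbance(0.054, EPS_DNP))
```

The same slope measured with a pNP substrate corresponds to a roughly threefold higher rate, because 4-nitrophenolate absorbs less strongly under these conditions.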

5.4.3.2 pH profile studies

The pH stability of SpGH101 was first studied by incubating the enzyme in buffers of different pH values ranging from 4.0 – 9.0. Aliquots of the mixtures were then assayed using 40 μM DNP-TAg in standard assay buffer (citrate-phosphate buffer pH 6.5 with 0.1 mg/mL acetylated bovine serum albumin) which was pre-incubated at 37 °C. The following buffers were used: 50 mM citrate-phosphate pH 4.0 – 6.5, 50 mM sodium phosphate pH 6.5 – 8.0, 50 mM sodium borate pH 8.0 – 9.0, all of which contained 0.1 mg/mL acetylated BSA. This study revealed that SpGH101 was stable from pH = 5.2 – 8.4 for 10 minutes. This range was selected for pH profile studies.

The pH dependence of kcat/Km for SpGH101 was measured using the substrate depletion method at low substrate concentrations of DNP-TAg ([S] << Km). Assays were performed with 5 μM DNP-TAg. SpGH101 was added to the pre-incubated substrate solution at 37 °C (final enzyme concentration: 1.7 nM) and the increase of absorbance at 400 nm was followed for 10 minutes until the reaction was complete. The data were fitted to a first order curve using GraFit software. The kcat/Km values were obtained from these fits by dividing the first order rate constant by the enzyme concentration.

5.4.3.3 Azide rescue product identification.

A reaction mixture containing 3.5 mM DNP-TAg, 1 M sodium azide and 5 nM of SpGH101 E796Q mutant was prepared and left at room temperature overnight. Product analysis was performed by thin layer chromatography on 60 F254 silica gel aluminum plates (Merck) run in 7:2:1 (v/v/v) ethyl acetate/methanol/water and developed with 10% ammonium molybdate in 2 M H2SO4 followed by charring. The final product was purified by loading the crude reaction mixture directly onto a silica gel flash chromatography column. The NMR spectrum was recorded using a Bruker AV-300 MHz spectrometer. A high resolution mass spectrum was collected in the mass spectrometry laboratory of the Department of Chemistry at the University of British Columbia.

5.5 Determination of stereochemical outcome of SpGH101 by NMR

The substrate pNP-TAg was dissolved in SpGH101 buffer (final concentration: 5 mM) and was freeze-dried four times with the addition of 0.5 ml of D2O each time. The ¹H-NMR spectrum (600 MHz) of the substrate was obtained before enzyme was added. Wild type SpGH101 enzyme exchanged into D2O buffer was then added to begin the reaction (final enzyme concentration: 1.6 μM). Spectra were acquired every five minutes, first to observe the increase in the amount of one anomer, and then for long enough to observe the formation of the other anomer by mutarotation. Based on chemical shifts and coupling constants it was possible to assign the stereochemistry of the reaction products.

5.6 Summary of structure determination statistics for the covalent intermediate of HPA (Data courtesy of Dr. Chunmin Li)

| Complex structure (a) | MeG2F/5FIdoF/HPA (Condition 1) | G3F/5FIdoF/HPA (Condition 2) | MeG2F/5FIdoF/HPA (Condition 3) |
|---|---|---|---|
| Data collection parameters | | | |
| Space group | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ |
| Unit cell dimension a (Å) | 52.0 | 52.4 | 52.1 |
| Unit cell dimension b (Å) | 68.5 | 68.0 | 67.8 |
| Unit cell dimension c (Å) | 129.9 | 130.0 | 129.9 |
| No. of measurements | 489089 | 229973 | 197577 |
| No. of unique reflections | 85356 | 40346 | 30917 |
| Mean I/σI (b) | 48.9 (10.1) | 40.8 (9.3) | 41.2 (11.6) |
| Multiplicity (b) | 5.7 (5.5) | 5.7 (5.6) | 6.4 (3.6) |
| Merging R-factor (%) (b) | 3.0 (16.5) | 7.0 (15.1) | 7.0 (19.4) |
| Maximum resolution (Å) | 1.43 | 1.85 | 2.0 |
| Structure refinement values | | | |
| Number of reflections | 85356 | 40346 | 30917 |
| Resolution range (Å) | 50-1.43 | 33.4-1.85 | 50-2.0 |
| Completeness (%) (b) | 98.8 (98.1) | 99.7 (97.4) | 96.9 (94.0) |
| No. protein atoms | 3946 | 3946 | 3946 |
| No. inhibitor atoms | 51 | 49 | 59 |
| No. solvent atoms | 406 | 307 | 231 |
| Average thermal factor, protein atoms (Å²) | 18.8 | 19.7 | 19.7 |
| Average thermal factor, inhibitor atoms (Å²) | 30.0 | 36.3 | 29.8 |
| Average thermal factor, solvent atoms (Å²) | 32.8 | 30.9 | 26.7 |
| Final R-free value (%) (c) | 21.3 | 20.7 | 22.9 |
| Final R-factor (%) | 19.7 | 18.4 | 19.3 |
| Structure stereochemistry | | | |
| r.m.s. deviation, bonds (Å) | 0.005 | 0.006 | 0.006 |
| r.m.s. deviation, angles (°) | 1.01 | 1.02 | 0.99 |

(a) HPA crystal soaking procedures were as follows: Condition 1, first in 100 mM 5FIdoF overnight, then in 150 mM MeG2F solution for two hours; Condition 2, 100 mM 5FIdoF and 150 mM G3F overnight; Condition 3, 100 mM 5FIdoF and 150 mM MeG2F overnight.
(b) Values in parentheses refer to the highest resolution shell: 1.43-1.51 Å for the Condition 1 complex; 1.85-2.03 Å for the Condition 2 complex; and 2.0-2.07 Å for the Condition 3 complex.
(c) 5% of the data was set aside to calculate R-free.

Appendix I: Basic Enzyme Kinetics

Kinetics is a very important research tool for understanding enzymes. By measuring the rates of enzyme-catalyzed reactions in the presence of different substrates, inhibitors or activators, we can gain valuable insights into enzyme catalytic mechanisms.

Michaelis-Menten Kinetics

One of the most fundamental relationships in enzyme kinetics is the Michaelis-Menten equation, which describes how the rate of an enzyme-catalyzed reaction depends on the substrate concentration. The reaction scheme for a single substrate (S) reacting with the enzyme (E) is shown below:

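$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{ES} \xrightarrow{k_{\mathrm{cat}}} \mathrm{E} + \mathrm{P}$$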

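The central concept in the above scheme is the existence of a non-covalent complex, ES, which forms before the enzyme turns over the substrate to generate the product (P). This ES complex is sometimes referred to as the Michaelis complex, named after Prof. Leonor Michaelis. Since in most cases the chemical step is much slower than the rate at which E, S and ES reach equilibrium, an assumption can be made that greatly simplifies the calculation: under steady state conditions, [ES] can be regarded as constant:

$$\frac{d[\mathrm{ES}]}{dt} = k_{1}[\mathrm{E}]_{\mathrm{free}}[\mathrm{S}] - k_{-1}[\mathrm{ES}] - k_{\mathrm{cat}}[\mathrm{ES}] = 0$$

Therefore,

$$[\mathrm{ES}] = \frac{k_{1}[\mathrm{E}]_{\mathrm{free}}[\mathrm{S}]}{k_{-1} + k_{\mathrm{cat}}}$$

Km is defined as:

$$K_{\mathrm{m}} = \frac{k_{-1} + k_{\mathrm{cat}}}{k_{1}}$$

so that

$$[\mathrm{ES}] = \frac{[\mathrm{E}]_{\mathrm{free}}[\mathrm{S}]}{K_{\mathrm{m}}}$$

Since [E]free = [E]total – [ES]:

$$[\mathrm{ES}] = \frac{([\mathrm{E}]_{\mathrm{total}} - [\mathrm{ES}])[\mathrm{S}]}{K_{\mathrm{m}}}$$

which rearranges to

$$[\mathrm{ES}] = \frac{[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]}$$

Therefore, the overall rate is:

$$V = \frac{d[\mathrm{P}]}{dt} = k_{\mathrm{cat}}[\mathrm{ES}] = \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]}$$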

This is the Michaelis-Menten equation, and in most cases it satisfactorily describes the relationship between the enzymatic rate and the substrate concentration.
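As a numerical check on the derivation above, the following minimal Python sketch evaluates the steady-state [ES] and confirms that it makes d[ES]/dt vanish, and that kcat[ES] agrees with the Michaelis-Menten rate. The rate constants and concentrations are hypothetical values chosen only for illustration; they are not data from this thesis, and the function names are likewise illustrative.

```python
def michaelis_menten_rate(s, e_total, kcat, km):
    """Return V = kcat*[E]total*[S] / (Km + [S])."""
    return kcat * e_total * s / (km + s)


def steady_state_es(s, e_total, k1, k_minus1, kcat):
    """Return [ES] at steady state, using [E]free = [E]total - [ES]."""
    km = (k_minus1 + kcat) / k1
    return e_total * s / (km + s)


# Hypothetical rate constants and concentrations (illustration only).
k1 = 1.0e6        # M^-1 s^-1
k_minus1 = 50.0   # s^-1
kcat = 10.0       # s^-1
e_total = 1.0e-8  # M
s = 1.0e-4        # M

km = (k_minus1 + kcat) / k1
es = steady_state_es(s, e_total, k1, k_minus1, kcat)

# d[ES]/dt evaluated at the steady-state [ES]; should be zero up to rounding.
d_es_dt = k1 * (e_total - es) * s - k_minus1 * es - kcat * es
rate = michaelis_menten_rate(s, e_total, kcat, km)

print(f"Km = {km:.3g} M, [ES] = {es:.3g} M, V = {rate:.3g} M/s")
print(f"d[ES]/dt at steady state = {d_es_dt:.1e}")
```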

There are two important kinetic parameters in this equation, namely kcat and Km. Km is roughly equivalent to the dissociation constant of the ES complex: the smaller it is, the more tightly the substrate binds to the enzyme. Mathematically, it equals the substrate concentration at which 50% of the enzyme is bound to substrate. kcat is the number of substrate molecules that can be turned over by one enzyme molecule per unit of time: the bigger it is, the better the enzyme catalyzes the reaction. Vmax is defined as the product of kcat and [E]total, and it represents the maximum rate for a fixed amount of enzyme:

$$V_{\max} = k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}$$

When [S] >> Km:

$$V = \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]} \approx \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{[\mathrm{S}]} = V_{\max}$$

When [S] << Km:

$$V = \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]} \approx \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{K_{\mathrm{m}}} = \frac{k_{\mathrm{cat}}}{K_{\mathrm{m}}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]$$

In this regime the rate V is directly proportional to [S], and the reaction becomes first order in substrate. The constant kcat/Km is a very useful index for comparing the relative specificity of one enzyme acting on different substrates and is referred to as the specificity constant.
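The limiting cases of the Michaelis-Menten equation can be illustrated numerically. The sketch below uses hypothetical parameter values (not measurements from this work) to show that V equals Vmax/2 at [S] = Km, approaches Vmax when [S] >> Km, and approaches (kcat/Km)[E]total[S] when [S] << Km.

```python
def rate(s, e_total, kcat, km):
    """Michaelis-Menten rate V = kcat*[E]total*[S] / (Km + [S])."""
    return kcat * e_total * s / (km + s)


# Hypothetical parameters (illustration only).
kcat = 10.0       # s^-1
km = 6.0e-5       # M
e_total = 1.0e-8  # M
vmax = kcat * e_total

# At [S] = Km, half of the enzyme is in the ES complex, so V = Vmax / 2.
fraction_at_km = rate(km, e_total, kcat, km) / vmax

# At [S] = 1000 * Km the enzyme is nearly saturated.
fraction_at_high_s = rate(1000 * km, e_total, kcat, km) / vmax

# At [S] = Km / 1000 the rate is nearly first order in [S].
low_s = km / 1000
first_order_estimate = (kcat / km) * e_total * low_s
ratio_at_low_s = rate(low_s, e_total, kcat, km) / first_order_estimate

print(f"V/Vmax at [S] = Km:        {fraction_at_km:.3f}")
print(f"V/Vmax at [S] = 1000 Km:   {fraction_at_high_s:.4f}")
print(f"V / ((kcat/Km)[E][S]) at [S] = Km/1000: {ratio_at_low_s:.4f}")
```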

For glycosidases, which are the main focus of this thesis, the kcat value is actually the overall rate constant of a two-step reaction, which includes glycosylation (k2) and deglycosylation (k3) steps, as shown in the following scheme:

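$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{ES} \xrightarrow{k_{2}} \mathrm{E\text{-}S} \xrightarrow{k_{3}} \mathrm{E} + \mathrm{P}$$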

There are many ways to derive the relationship between the overall rate constant kcat and the individual rate constants k2 and k3. A very simple way is used below. Since kcat is the number of substrate molecules turned over by one enzyme molecule per unit of time, its reciprocal, 1/kcat, can be regarded as the average time the enzyme needs to turn over one substrate molecule. Likewise, 1/k2 and 1/k3 can be interpreted as the average time spent on the glycosylation step and the deglycosylation step, respectively. Because the two steps occur in sequence, these times add:

$$\frac{1}{k_{\mathrm{cat}}} = \frac{1}{k_{2}} + \frac{1}{k_{3}}$$

$$k_{\mathrm{cat}} = \frac{k_{2}k_{3}}{k_{2} + k_{3}}$$

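When k2 >> k3 (deglycosylation step is the rate-limiting step):

$$k_{\mathrm{cat}} = \frac{k_{2}k_{3}}{k_{2} + k_{3}} \approx \frac{k_{2}k_{3}}{k_{2}} = k_{3}$$

When k3 >> k2 (glycosylation step is the rate-limiting step):

$$k_{\mathrm{cat}} = \frac{k_{2}k_{3}}{k_{2} + k_{3}} \approx \frac{k_{2}k_{3}}{k_{3}} = k_{2}$$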

Interestingly, it follows from the above equations that specially designed substrates for which either the glycosylation or the deglycosylation step is rate-limiting can be used to study the individual steps. The overall rate measured experimentally will correspond to the rate of the slower step.
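The way kcat tracks the slower of the two steps can be seen with a short calculation. The following sketch is illustrative only: the k2 and k3 values are hypothetical and the function name is not taken from any library.

```python
def overall_kcat(k2, k3):
    """Overall turnover constant for sequential glycosylation (k2)
    and deglycosylation (k3) steps: kcat = k2*k3 / (k2 + k3)."""
    return k2 * k3 / (k2 + k3)


# Hypothetical rate constants in s^-1 (illustration only).
deglycosylation_limited = overall_kcat(k2=1000.0, k3=1.0)  # k2 >> k3
glycosylation_limited = overall_kcat(k2=1.0, k3=1000.0)    # k3 >> k2
balanced = overall_kcat(k2=2.0, k3=2.0)                    # neither step dominates

print(f"k2 >> k3: kcat = {deglycosylation_limited:.4f} s^-1 (close to k3)")
print(f"k3 >> k2: kcat = {glycosylation_limited:.4f} s^-1 (close to k2)")
print(f"k2 = k3:  kcat = {balanced:.4f} s^-1 (half of either constant)")
```

When the two constants are comparable, neither approximation holds and kcat is smaller than both k2 and k3.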

Mechanism-based Inhibition Kinetics of Glycosidases

A significant part of this thesis focused on developing mechanism-based inhibitors of retaining α-glycosidases. Therefore, the kinetics involved are discussed in the following. For a covalent mechanism-based inhibitor reacting with an enzyme, the following scheme applies:

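$$\mathrm{E} + \mathrm{I} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{EI} \xrightarrow{k_{\mathrm{i}}} \mathrm{E\text{-}I} \xrightarrow{k_{\mathrm{reactivation}}} \mathrm{E} + \mathrm{P}$$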

Since kreactivation is negligible compared with the inactivation rate constant ki, this scheme can be simplified to:

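$$\mathrm{E} + \mathrm{I} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{EI} \xrightarrow{k_{\mathrm{i}}} \mathrm{E\text{-}I}$$

In most cases, [I] >> [E]. Therefore, the kinetics are very similar to Michaelis-Menten kinetics:

$$V = \frac{k_{\mathrm{i}}[\mathrm{E}][\mathrm{I}]}{K_{\mathrm{i}} + [\mathrm{I}]} = k_{\mathrm{obs}}[\mathrm{E}]$$

$$k_{\mathrm{obs}} = \frac{k_{\mathrm{i}}[\mathrm{I}]}{K_{\mathrm{i}} + [\mathrm{I}]}$$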

Since kobs depends only on the inactivation rate constant (ki), the apparent dissociation constant of the enzyme with the inactivator (Ki) and the inactivator concentration ([I]), the above process can be regarded as a pseudo first-order reaction for the enzyme. This explains why an exponential decay of enzymatic activity is observed when the enzyme is incubated with a mechanism-based inactivator. By plotting kobs against the inactivator concentration [I], the values of ki and Ki can be determined; these are important indicators of how good an inactivator is. Sometimes the Ki value of an inactivator is much larger than the inactivator concentration that can be achieved experimentally (owing to the limited availability or solubility of the compound). In these cases, when Ki >> [I]:

$$k_{\mathrm{obs}} = \frac{k_{\mathrm{i}}[\mathrm{I}]}{K_{\mathrm{i}} + [\mathrm{I}]} \approx \frac{k_{\mathrm{i}}}{K_{\mathrm{i}}}[\mathrm{I}]$$

Instead of obtaining individual values of ki and Ki, only the second-order inactivation constant ki/Ki can be accurately determined.
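One way to see how ki and Ki follow from a set of kobs values is the double-reciprocal rearrangement 1/kobs = (Ki/ki)(1/[I]) + 1/ki, which is linear in 1/[I]. The sketch below applies an ordinary least-squares line to noise-free synthetic data. It is an illustration under stated assumptions, not the fitting procedure used in this thesis: the "true" constants, the concentrations, and all function names are hypothetical, and with real data a direct non-linear fit is often preferable because the reciprocal transform distorts the error structure.

```python
import math


def k_obs(i, k_inact, K_i):
    """Pseudo first-order inactivation constant: ki*[I] / (Ki + [I])."""
    return k_inact * i / (K_i + i)


def residual_activity(t, kobs):
    """Fraction of enzyme activity left after incubation time t."""
    return math.exp(-kobs * t)


def fit_double_reciprocal(inhibitor_concs, kobs_values):
    """Least-squares line through (1/[I], 1/kobs); returns (ki, Ki)."""
    xs = [1.0 / i for i in inhibitor_concs]
    ys = [1.0 / k for k in kobs_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals Ki / ki
    intercept = mean_y - slope * mean_x # equals 1 / ki
    k_inact = 1.0 / intercept
    return k_inact, slope * k_inact


# Hypothetical "true" constants used to generate synthetic data.
true_ki = 0.05   # min^-1
true_Ki = 2.0    # mM
concs = [0.5, 1.0, 2.0, 4.0, 8.0]  # mM
kobs_data = [k_obs(i, true_ki, true_Ki) for i in concs]

fit_ki, fit_Ki = fit_double_reciprocal(concs, kobs_data)
print(f"fitted ki = {fit_ki:.4f} min^-1, fitted Ki = {fit_Ki:.4f} mM")
print(f"activity left after 30 min at 2 mM: {residual_activity(30.0, kobs_data[2]):.3f}")
```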

Enzyme Kinetics in the presence of reversible inhibitors

Depending on how they bind to the enzyme or the enzyme-substrate complex, reversible inhibitors can exhibit either competitive inhibition or mixed inhibition, as discussed in the following.

Competitive Inhibition: A competitive inhibitor only binds to the free enzyme and a general scheme is shown below:

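$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{ES} \xrightarrow{k_{\mathrm{cat}}} \mathrm{E} + \mathrm{P}$$

$$\mathrm{E} + \mathrm{I} \overset{K_{\mathrm{i}}}{\rightleftharpoons} \mathrm{EI}$$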

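Ki: the dissociation constant of the enzyme-inhibitor complex EI.

Under steady state conditions:

$$\frac{d[\mathrm{ES}]}{dt} = k_{1}[\mathrm{E}]_{\mathrm{free}}[\mathrm{S}] - k_{-1}[\mathrm{ES}] - k_{\mathrm{cat}}[\mathrm{ES}] = 0$$

$$[\mathrm{ES}] = [\mathrm{E}]_{\mathrm{free}}\frac{[\mathrm{S}]}{K_{\mathrm{m}}}$$

while

$$[\mathrm{E}]_{\mathrm{free}} = [\mathrm{E}]_{\mathrm{total}} - [\mathrm{ES}] - [\mathrm{EI}] = [\mathrm{E}]_{\mathrm{total}} - [\mathrm{E}]_{\mathrm{free}}\frac{[\mathrm{S}]}{K_{\mathrm{m}}} - [\mathrm{E}]_{\mathrm{free}}\frac{[\mathrm{I}]}{K_{\mathrm{i}}}$$

Therefore,

$$[\mathrm{E}]_{\mathrm{free}} = \frac{[\mathrm{E}]_{\mathrm{total}}}{1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}} + \dfrac{[\mathrm{S}]}{K_{\mathrm{m}}}}$$

$$V = \frac{d[\mathrm{P}]}{dt} = k_{\mathrm{cat}}[\mathrm{ES}] = k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{free}}\frac{[\mathrm{S}]}{K_{\mathrm{m}}} = \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{[\mathrm{S}] + K_{\mathrm{m}}\left(1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}}\right)}$$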

Interestingly, the above equation shows that Vmax of the enzyme does not change, while Km is increased by a factor of (1 + [I]/Ki). In other words, the addition of a competitive inhibitor makes the substrate appear to bind more weakly, but once the substrate is bound to the enzyme, the turnover number (kcat) is not affected by the inhibitor.
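The effect of a competitive inhibitor on the apparent Km, with Vmax unchanged, can be checked numerically. The sketch below uses hypothetical parameter values in arbitrary but consistent units; the function names are illustrative only.

```python
def apparent_km(km, i, K_i):
    """Apparent Km in the presence of a competitive inhibitor: Km*(1 + [I]/Ki)."""
    return km * (1.0 + i / K_i)


def competitive_rate(s, i, e_total, kcat, km, K_i):
    """V = kcat*[E]total*[S] / ([S] + Km*(1 + [I]/Ki))."""
    return kcat * e_total * s / (s + apparent_km(km, i, K_i))


# Hypothetical parameters (illustration only).
kcat, e_total, km, K_i = 10.0, 1.0, 2.0, 0.5
vmax = kcat * e_total
inhibitor = 1.5

km_app = apparent_km(km, inhibitor, K_i)  # Km is scaled by (1 + [I]/Ki)

# Half-maximal rate is now reached at [S] = apparent Km ...
fraction_at_km_app = competitive_rate(km_app, inhibitor, e_total, kcat, km, K_i) / vmax

# ... but a large excess of substrate still drives the rate towards Vmax.
fraction_at_high_s = competitive_rate(1.0e6, inhibitor, e_total, kcat, km, K_i) / vmax

print(f"apparent Km = {km_app:.2f} (Km = {km:.2f})")
print(f"V/Vmax at [S] = apparent Km: {fraction_at_km_app:.3f}")
print(f"V/Vmax at very high [S]:     {fraction_at_high_s:.5f}")
```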

Mixed Inhibition: In a more general scenario, a reversible inhibitor binds not only to the free enzyme but also to the enzyme-substrate Michaelis complex, as shown in the following scheme:

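$$\mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{ES} \xrightarrow{k_{\mathrm{cat}}} \mathrm{E} + \mathrm{P}$$

$$\mathrm{E} + \mathrm{I} \overset{K_{\mathrm{i}}}{\rightleftharpoons} \mathrm{EI} \qquad \mathrm{ES} + \mathrm{I} \overset{K_{\mathrm{i}}'}{\rightleftharpoons} \mathrm{ESI}$$

$$\mathrm{EI} + \mathrm{S} \underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}} \mathrm{ESI}$$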

Ki: dissociation constant of the enzyme-inhibitor complex EI.

Ki’: dissociation constant of the enzyme-substrate-inhibitor (ESI) complex (into ES and I).

Under steady state conditions, the following equation can be derived:

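$$V = \frac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}[\mathrm{S}]}{[\mathrm{S}]\left(1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}'}\right) + K_{\mathrm{m}}\left(1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}}\right)}$$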

When the inhibitor cannot bind to the ES complex, Ki’ can be regarded as infinite and, consequently, the term (1 + [I]/Ki’) approaches 1. This simplifies the mixed inhibition equation to the competitive inhibition equation.

When the inhibitor cannot bind to the free enzyme and binds only to the enzyme-substrate complex, Ki can be regarded as infinite and, consequently, the term (1 + [I]/Ki) approaches 1. The mixed inhibition equation then becomes:

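$$V = \frac{\dfrac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}}{1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}'}}[\mathrm{S}]}{\dfrac{K_{\mathrm{m}}}{1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}'}} + [\mathrm{S}]}$$

This type of inhibition is defined as uncompetitive inhibition; in this case, both Km and Vmax are reduced by the factor (1 + [I]/Ki’), while the catalytic efficiency kcat/Km remains constant.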

Another special case of mixed inhibition is non-competitive inhibition, in which the inhibitor binds to the free enzyme and the enzyme-substrate complex with the same affinity: Ki = Ki’. Consequently, the mixed inhibition equation becomes:

$$V = \frac{\dfrac{k_{\mathrm{cat}}[\mathrm{E}]_{\mathrm{total}}}{1 + \dfrac{[\mathrm{I}]}{K_{\mathrm{i}}}}[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]}$$

In this scenario, Km does not change, while the maximum rate Vmax is decreased by a factor of (1 + [I]/Ki) in the presence of a non-competitive inhibitor.

The three types of inhibition, namely, competitive inhibition, non-competitive inhibition and uncompetitive inhibition, can be distinguished from each other by observing their corresponding Dixon plots (1/V vs. [I]) or the Lineweaver-Burk plots (1/V vs. 1/[S]).
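To illustrate how the Lineweaver-Burk plot separates the three cases, the mixed inhibition equation can be inverted to give 1/V = [Km(1 + [I]/Ki)/Vmax](1/[S]) + (1 + [I]/Ki’)/Vmax, so the slope carries the Ki term and the y-intercept carries the Ki’ term. The sketch below is a minimal illustration with hypothetical parameter values; an infinite dissociation constant (math.inf) is used to represent an inhibitor that does not bind to that enzyme form.

```python
import math


def mixed_rate(s, i, e_total, kcat, km, K_i, K_i_prime):
    """V = kcat*[E]total*[S] / ([S]*(1 + [I]/Ki') + Km*(1 + [I]/Ki))."""
    return kcat * e_total * s / (s * (1.0 + i / K_i_prime) + km * (1.0 + i / K_i))


def lineweaver_burk(i, e_total, kcat, km, K_i, K_i_prime):
    """Slope and y-intercept of the 1/V versus 1/[S] line."""
    vmax = kcat * e_total
    slope = km * (1.0 + i / K_i) / vmax
    intercept = (1.0 + i / K_i_prime) / vmax
    return slope, intercept


# Hypothetical parameters (illustration only, arbitrary consistent units).
e_total, kcat, km, inhibitor = 1.0, 1.0, 2.0, 3.0

uninhibited = lineweaver_burk(0.0, e_total, kcat, km, 1.5, 1.5)
competitive = lineweaver_burk(inhibitor, e_total, kcat, km, 1.5, math.inf)
uncompetitive = lineweaver_burk(inhibitor, e_total, kcat, km, math.inf, 1.5)
non_competitive = lineweaver_burk(inhibitor, e_total, kcat, km, 1.5, 1.5)

for name, (slope, intercept) in [
    ("no inhibitor", uninhibited),
    ("competitive", competitive),
    ("uncompetitive", uncompetitive),
    ("non-competitive", non_competitive),
]:
    print(f"{name:16s} slope = {slope:.2f}  y-intercept = {intercept:.2f}  "
          f"x-intercept = {-intercept / slope:.2f}")
```

In this sketch the competitive case changes only the slope (lines cross on the y-axis), the uncompetitive case changes only the y-intercept (parallel lines), and the non-competitive case changes both while leaving the x-intercept at -1/Km.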

Appendix II: Publications

Parts of the contents in this thesis have been reported in the following publications:

1) Zhang, R.; Yip, V. L. Y.; Withers, S. G. “Mechanisms of Enzymatic Glycosyl Transfer” In Comprehensive Natural Products II: Chemistry and Biology; Mander, L., Liu, H.-W., Eds.; Elsevier: Oxford, 2010; [Vol. 8], pp 385-422.

2) Zhang, R.; Li, C.; Williams, L. K.; Rempel, B. P.; Brayer, G. D.; Withers, S. G. “Directed 'in situ' Inhibitor Elongation as a Strategy to Structurally Characterize the Covalent Glycosyl-Enzyme Intermediate of Human Pancreatic Alpha-Amylase” Biochemistry, 2009, 48, 10752–10764.

3) Willis, L.*; Zhang, R.*; Houliston, S.; Withers, S. G.; Wakarchuk, W. W. “Mechanistic investigation of the endo-α-N-acetylgalactosaminidase from Streptococcus pneumoniae R1” (*Co-first Authors) Biochemistry, 2009, 48, 10334–10341.

4) Zhang, R.; McCarter, J. D.; Braun, C.; Yeung, W.; Brayer, G. D.; Withers, S. G. “Synthesis and Testing of 2-Deoxy-2,2-Dihalo Glycosides as Mechanism-based Inhibitors of α-Glycosidases” Journal of Organic Chemistry, 2008, 73, 3070-3077

5) Tarling, C. A.; Woods, K.; Zhang, R.; Brastianos, H. C.; Brayer, G. D.; Andersen, R. J.; Withers, S. G. “The Search for Novel Human Pancreatic α-Amylase Inhibitors: High-Throughput Screening of Terrestrial and Marine Natural Product Extracts” ChemBioChem, 2008, 9, 433-438

6) Maurus, R.; Begum, A.; Williams, L. K.; Fredriksen, J. R.; Zhang, R.; Withers, S. G.; Brayer, G. D. “Alternative Catalytic Anions Differentially Modulate Human α-Amylase Activity and Specificity” Biochemistry, 2008, 47, 3332–3344
