Identification, Characterization, and Utilization of Glycosyltransferases

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Nicholas Roman Pettit

Graduate Program in Chemistry

The Ohio State University

2011

Dissertation Committee:

Dr. Peng George Wang, Advisor

Dr. Zucai Suo

Dr. Karin Musier-Forsyth

Copyright by

Nicholas Pettit

2011

ABSTRACT

Among the many biomolecules found in nature, such as amino acids and DNA, carbohydrates are among the most important and abundant. The assembly of carbohydrates into oligosaccharides, or glycans, is of considerable interest to the glycobiology community, because glycans are directly involved in many critical biological processes, such as molecular recognition, cell-to-cell communication, and protein folding and stability, and are associated with many disease-state phenotypes. Understanding, controlling, and manipulating glycan assembly can therefore yield valuable insights. Glycans are commonly assembled by a class of enzymes called glycosyltransferases, and these enzymes have been exploited in vitro for the synthesis of attractive oligosaccharides. However, the identification, characterization, and subsequent exploitation of a glycosyltransferase is a long and tedious process, because the primary function of a glycosyltransferase must be determined experimentally. Glycans are not primary gene products, so identifying the function and specificity of a glycosyltransferase is not straightforward. The work in this thesis focuses on developing a method by which function can be quickly assigned to a putative glycosyltransferase, on characterizing several bacterial glycosyltransferases, and on exploiting glycosyltransferase activities for various applications.

Chapter 1 provides a general introduction to the function of glycans, the function and biosynthesis of cell surface oligosaccharides, an overview of the methodologies used to determine glycosyltransferase functions, and lastly a review of chemical approaches to study and exploit oligosaccharide biosynthesis.

Chapters 2 and 3 are primarily concerned with the characterization of bacterial glycosyltransferases and the biosynthesis of O-antigen structures in two different strains of E. coli. Chapter 2 investigates the biosynthesis of the O-antigen repeating unit in E. coli O127 and E. coli O128, whereas Chapter 3 focuses on the detailed biochemical characterization of WbnI, a bacterial homologue of the important human blood group B galactosyltransferase.

In Chapter 4, a high-throughput technology is developed by which a putative glycosyltransferase can be quickly cloned and expressed and its relative substrate specificity determined. This work combines ligation-independent cloning, in vitro protein expression, and SAMDI mass spectrometry with a library of donor/acceptor substrates to assess the enzymatic activity of putative glycosyltransferases.

In Chapter 5, glycosyltransferases and sugar nucleotide biosynthesis are exploited both in vivo and in vitro for the synthesis of interesting and novel oligosaccharides.

Finally, Chapter 6 summarizes the main results of all the studies included in this thesis, and also provides further directions for each project.

DEDICATION

Dedicated to my mother and father, Susan and Paul Pettit for all their love and support.

ACKNOWLEDGMENTS

My years at The Ohio State University have turned out to be the most intellectually challenging and important years in my life. I was challenged in ways I never anticipated by my advisor, my lab mates, my peers, and my teachers. For this, I feel that there are many people that deserve my gratitude. Let me start by first thanking my advisor, Professor Peng George Wang. His unmatched passion, love, and enthusiasm for science have been an inspiration for me, and will continue to be in my future endeavors. Through his constant challenges and encouragement I feel that I have matured as a scientist and as an individual. This dissertation would not have been possible without Dr. Wang's guidance and advice.

I also need to acknowledge my lab-mates, who have played an integral part in my years at The Ohio State University. Firstly, I want to thank Dr. Weiqing Han for putting up with my constant bombardment of questions pertaining to my research. He was a terrific mentor for the development of my biochemical and genetic skills, and without his daily advice and conversations I would have certainly been lost. I also thank Dr. Guohui Zhao for allowing me to theorize with him about my projects, as he always made time in his day to aid me. I also owe gratitude to several of the organic chemists in the lab for their advice, specifically TJ Styslinger and Cai Li. TJ was kind enough to synthesize many compounds that I required for the progress of my research, as well as to perform NMR and mass spectrometric analysis of the compounds that I synthesized. Cai Li played a vital role in one of my projects, and his knowledge of synthetic chemistry and its application was very useful in the progress of our collaboration.

In addition to my current lab-mates, there are also several former lab-mates that deserve my gratitude. First and foremost, I want to thank my first mentor while in the Wang Lab, Dr. Wen Yi. He was responsible for the initial development of my biochemistry and molecular biology techniques, as well as teaching me the ins and outs of the Wang Lab. Dr. Lei Li generously taught me many useful tricks and strategies in molecular cloning, which have become essential techniques in my research. Dr. Robert Woodward has been extremely helpful in the proofreading of, and discussions pertaining to, my publications. There are also many other lab-mates and friends from the past and present Wang Lab that I want to thank for their support and company.

I also want to thank our collaborator, Dr. Milan Mrksich from The University of Chicago. His students, Lan Ban and Andreea Stuparu, were helpful in obtaining and analyzing our data, and were always available should we have questions.

I also want to thank my committee members, who generously offered their valuable time in aiding me in my dissertation and defense.

Lastly, I need to thank my parents, friends, and family, who have been critical for support and love during my time at Ohio State. I wouldn't be who I am today if it wasn't for these people. No words can express how thankful I am for all the support, love, and advice that these people have given me during the last four years.

VITA

November 21, 1984 ...... Born, Allison Park, PA

2007...... Bachelor of Science, Chemistry ...... Miami University, Oxford, OH

2007-2009 ...... Graduate Teaching Assistant ...... The Ohio State University, ...... Columbus, OH

2007 – 2011...... Research Assistant ...... The Ohio State University ...... Columbus, OH

PUBLICATIONS

Research Publications

1. Pettit, N.; Cai, L.; Han, W.; Liang, Z.; Guan, W.; Wang, P. G. American Journal of Biomedical Science 2011, 3, 107-115

2. Pettit, N.; Styslinger, T.; Mei, Z.; Han, W.; Zhao, G.; Wang, P.G. Biochemical and Biophysical Research Communications 2010, 402, 190-195

3. Han, W.; Li, L.; Pettit, N.; Yi, W.; Woodward, R.; Liu, X.; Guan, W.; Bhatt, V.; Song, J.; Wang, P.G. Methods Mol Biol. 2010, 600, 93-110

4. Liu, X.; Xia, C.; Lei, L.; Guan, W.; Pettit, N.; Zhang, H.; Chen, M.; Wang, P.G. Bioorganic & Medicinal Chemistry 2009, 17, 4910-4915

FIELDS OF STUDY

Major Field: Chemistry

TABLE OF CONTENTS

Page

Abstract ...... ii

Dedication ...... iv

Acknowledgments...... v

Vita ...... viii

List of Schemes ...... xiv

List of Tables ...... xv

List of Figures ...... xvii

List of Abbreviations ...... xxi

Chapters

1. Introduction ...... 1

1.1 Bacterial Cell Surface Oligosaccharides ...... 3

1.2 Polysaccharide Synthesis and Exploiting Nature to do Chemistry . 6

1.2.1 Traditional Chemical Synthesis ...... 7

1.2.2 Enzymatic Synthesis ...... 11

1.3 Current Methodologies for Glycosyltransferase Identification and Function ...... 17

1.4 Chemical Biology Approaches to Investigate Oligosaccharides In Living Systems ...... 23

1.5 Outline of the Work Described in this Thesis ...... 27

References ...... 29

2. Investigation of Polysaccharide Biosynthesis in E. coli ...... 34

2.1 Investigation of Two Glycosyltransferases from E. coli O127...... 34

2.1.1 Introduction ...... 34

2.1.2 Experimental Methods ...... 37

2.1.3 Results and Discussion ...... 43

2.1.4 Conclusions ...... 55

2.2 Investigation of the O-antigen Biosynthesis in E. coli O128...... 58

2.2.1 Introduction ...... 58

2.2.2 Experimental Methods ...... 61

2.2.3 Results ...... 67

2.2.4 Discussion ...... 77

References ...... 82

3. Biochemical Characterization of WbnI ...... 86

3.1 Introduction ...... 86

3.2 Experimental Methods ...... 89

3.3 Results ...... 95

3.4 Discussion ...... 111

References ...... 121

4. Development of a High Throughput Assay for the Identification of Glycosyltransferase Activity ...... 124

4.1 Introduction ...... 124

4.2 Experimental Methods ...... 129

4.3 Results ...... 139

4.4 Discussion ...... 146

References ...... 152

5. Chemical Biology Approach to Investigate Bacterial Polysaccharide Glycosyltransferases ...... 154

5.1 Promiscuity of GlcNAc Pathway Towards GlcNAc Analogs ...... 154

5.1.1 Introduction ...... 154

5.1.2 Experimental Methods ...... 156

5.1.3 Results ...... 158

5.1.4 Conclusion ...... 164

5.2 Investigation of Donor Specificity of Bacterial α1,2-Fucosyltransferases...... 167

5.2.1 Introduction ...... 167

5.2.2 Experimental Methods ...... 170

5.2.3 Results and Discussion ...... 173

References ...... 180

6. Conclusions and Perspectives ...... 184

Bibliography ...... 187

Appendix: 1H NMR, 13C NMR Spectra, Misc. Supporting Data...... 201

Appendix A, Supporting Data for Chapter 2 ...... 202

Appendix B, Supporting Data for Chapter 3 ...... 213

Appendix C, Supporting Data for Chapter 4 ...... 217

Appendix D, Supporting Data for Chapter 5 ...... 236

LIST OF SCHEMES

Scheme Page

Scheme 2.1 Enzymatic synthesis of Gal-β1,3-GalNAc ...... 42

Scheme 2.2 WbiN catalyzed reaction ...... 55

Scheme 3.1 Enzymatic synthesis of Type III/IV acceptor ...... 92

Scheme 4.1 Timeline to travel from putative GT gene to GT specificity . 126

Scheme 4.2 Method to go from putative GT to functional assignment ..... 126

Scheme 5.1 FKP catalyzed reaction for the synthesis of GDP-fucose derivatives ...... 170

LIST OF TABLES

Table Page

Table 2.1 Acceptor substrate specificity of GST-WbiQ ...... 49

Table 2.2 Acceptor substrate for each Wbs* enzyme ...... 73

Table 2.3 Metal cation dependence assay for each Wbs* enzyme ...... 74

Table 3.1 Substrate specificity of WbnI ...... 98

Table 3.2 Kinetic parameters comparison between three α1,3-galactosyltransferases ...... 104

Table 3.3 Kinetic parameters for WbnI site directed mutants ...... 106

Table 4.1 Kinetic parameters for 3 identified GTs using SAMDI ...... 144

Table 4.2 Influence of divalent metal cations for identified GTs ...... 146

Table 5.1 Donor substrate specificity of WbiQ, WbwK, WbsJ, and FutC ...... 176

Table A.1 Primers used in the E. coli O128 Wbs* study ...... 207

Table B.1 Primers used for WbnI investigation ...... 213

Table C.1 Putative and known GTs used in Chapter 4 ...... 217

Table C.2 Acceptors used in the screening experiments for GT activity ...... 222

Table C.3 Synthetic route for preparation of several acceptors ...... 223

Table C.4 List of activities observed by new and previously characterized GTs ...... 224

Table C.5 Kinetic parameters measured by using traditional radiolabelled donors ...... 225

Table C.6 Acceptors and donors used for preparative oligosaccharide synthesis ...... 225

LIST OF FIGURES

Figure Page

Figure 1.1 Example of glycan interactions ...... 2

Figure 1.2 LPS biosynthesis in E. coli O86 ...... 6

Figure 1.3 Chemical Synthesis of an oligosaccharide ...... 9

Figure 1.4 Mannose-6-phosphate N-glycan ...... 10

Figure 1.5 Glycosyltransferase catalyzed reactions ...... 13

Figure 1.6 Glycosynthase catalyzed reaction ...... 17

Figure 1.7 Methodologies for detecting glycosyltransferase activity ...... 22

Figure 1.8 Incorporation of monosaccharides into cell surface oligosaccharides ...... 26

Figure 1.9 Visualization of ketone/aldehyde on the cell surface...... 27

Figure 2.1 O-antigen repeating unit structure of E. coli O127 ...... 36

Figure 2.2 Sequence alignment for WbiQ and other FucTs ...... 45

Figure 2.3 Purification of GST-WbiQ (SDS-PAGE and Western blot) ... 46

Figure 2.4 Subcellular location of GST-WbiQ...... 48

Figure 2.5 TLC demonstrating α1,2-FucT activity ...... 50

Figure 2.6 pH profile for GST-WbiQ enzymatic activity ...... 51

Figure 2.7 Metal dependence assay for GST-WbiQ ...... 52

Figure 2.8 Observed crystals for His6-WbiQ ...... 54

Figure 2.9 Expression of His6-WbiN...... 55

Figure 2.10 Model of LPS biosynthesis in E. coli O127 ...... 57

Figure 2.11 O-antigen Structure of E. coli O128 ...... 60

Figure 2.12 Protein expression of three GST-Wbs* enzymes ...... 69

Figure 2.13 LPS phenotype for wbs* deficient strains ...... 75

Figure 2.14 In vitro GST-pull down assay ...... 77

Figure 2.15 Model for LPS biosynthesis in E. coli O128 ...... 81

Figure 3.1 WbnI and BgtA catalyzed reactions...... 88

Figure 3.2 Sequence alignment for CAZy family 6 glycosyltransferases 99

Figure 3.3 Effect of metal cations on WbnI‘s enzymatic activity ...... 100

Figure 3.4 Effect of pH on WbnI‘s enzymatic activity ...... 101

Figure 3.5 Double reciprocal plots for WbnI activity ...... 103

Figure 3.6 LPS phenotype of ∆wbnI E. coli O86 and complemented strains ...... 108

Figure 3.7 Crystal structure of GTB with Type I & II acceptors bound . 112

Figure 3.8 Molecular model of WbnI ...... 113

Figure 4.1 Example demonstrated enzymatic activity using SAMDI ..... 128

Figure 4.2 Illustration demonstrating high throughput technique ...... 140

Figure 4.3 SAMDI spectra of the four newly identified GTs...... 142

Figure 4.4 Kinetic analysis of the newly identified GTs ...... 145

Figure 4.5 LacNAc synthesis and LacNAc containing structure ...... 150

Figure 5.1 Chemical structure of GlcNAc/GalNAc and their 2-carbon isosteres ...... 155

Figure 5.2 Metabolic incorporation of ketone isotope on E. coli surface 157

Figure 5.3 Flow cytometry result showing successful incorporation of 2-ketoGlc ...... 161

Figure 5.4 Fluorescence microscopy visualizing cell surface oligosaccharides ...... 162

Figure 5.5 LPS phenotype for E. coli O86 ∆wecA/∆waaL ...... 165

Figure 5.6 GlcNAc biosynthetic pathways ...... 167

Figure 5.7 Six fucose analogues used in the chemoenzymatic synthesis of GDP-fucose derivatives ...... 174

Figure 5.8 Method outlining procedure for FucT detection ...... 175

Figure 5.9 Sequence alignment for the four studied α1,2-FucTs ...... 179

Figure A.1 WbiN α1,3GalNAcT activity ...... 206

Figure A.2 GST-WbsH GalT activity ...... 209

Figure A.3 GST-WbsL GalNAc activity ...... 210

Figure A.4 GST-WbsK GalT activity ...... 211

Figure B.1 Size exclusion profile for WbnI ...... 215

Figure B.2 Tetrasaccharide GalNAc-α1,3-(Fuc-α1,2)-Gal-β1,3GalNAc -OMe ...... 215

Figure B.3 WbnI Q148A activity ...... 216

Figure C.1 SDS-PAGE and anti-His western blot for newly discovered GTs ...... 226

Figure C.2 Synthetic route for preparation of acceptor 9...... 227

Figure C.3 Enzymatic route for preparation of acceptors 13 and 14 ...... 228

Figure C.4 Synthetic route for preparation of acceptors 20-25 ...... 228

Figure C.6 SAMDI data & respective statistical analysis ...... 229

Figure C.7 Double reciprocal plots for newly discovered GT activities.. 231

Figure C.8 Assay for testing the effect of different divalent metal cations on GT activities ...... 232

Figure D.1 SAMDI spectra for reactions with each GDP-donor ...... 241

Figure D.2 Synthesis of fucose derivatives 4-6 ...... 242

LIST OF ABBREVIATIONS

α alpha

ADP adenosine diphosphate

ATP adenosine triphosphate

β beta

Bn Benzyl

°C degrees Celsius

CE capillary electrophoresis

E. coli Escherichia coli

FITC Fluorescein isothiocyanate

Fuc fucose

FucT fucosyltransferase

g gram(s)

Gal galactose

GalNAc N-acetylgalactosamine

Glc glucose

GlcNAc N-acetylglucosamine

GDP guanosine diphosphate

GT glycosyltransferase

GTA human α1,3-GalNAc-transferase

GTB human α1,3-Gal-transferase

GT-A glycosyltransferase fold type A

GT-B glycosyltransferase fold type B

GTP guanosine triphosphate

h hour(s)

HPLC high performance liquid chromatography

HS Heparan sulfate

Km Michaelis constant

L liter(s)

LB Lysogeny broth

Lex Lewis X

LPS lipopolysaccharide

m milli

min minute(s)

µ micro

M moles per liter

Man mannose

Me methyl

MS mass spectrometry

m/z mass to charge ratio (MS)

ndp random diphosphate nucleotide

nm nanometer

NMR nuclear magnetic resonance

OPS O-polysaccharide

PBS phosphate buffered saline

PCR polymerase chain reaction

ppm parts per million

rt room temperature

RU repeating unit

TLC thin layer chromatography

UDP uridine diphosphate

Und Undecaprenyl

Undc Undecyl

UTP uridine triphosphate

CHAPTER 1

INTRODUCTION

Glycoconjugates, such as glycoproteins and glycolipids, are essential biopolymers that mediate many vital biological processes.1 Such glycoconjugates are composed of monosaccharide building blocks, which are commonly assembled by sequential glycosylation reactions mediated by a class of enzymes called glycosyltransferases (GTs).2 As observed in many prokaryotic systems, glycans in eukaryotic organisms are commonly found as cell surface oligosaccharides which directly impart biological functions, such as the interactions regulated by heparan sulfate (HS) oligosaccharides.3 Figure 1.1 illustrates some putative interactions in which a eukaryotic cell can participate as a result of its cell surface oligosaccharides. A common function of cell surface oligosaccharides is to act as a mode of communication with other cells, whether a foreign bacterium or virus particle or another cell in the same organism.4,5 This type of communication is often exploited by pathogens that display molecular mimicry of the host organism's cell surface oligosaccharides, thereby allowing the pathogen to evade the host's immune system.6

Figure 1.1. Interactions between various biological systems via glycans.

Furthermore, unnatural modification of a host's cell surface oligosaccharides can be characteristic of various diseases, such as the sialyl Tn (STn) antigen observed with various cancer cell lines.7 Tumor-associated carbohydrate antigens (TACAs) such as the STn antigen are highly associated with prostate, breast, colorectal, and ovarian cancers, and are absent from normal tissues.8 Conjugate vaccines have been designed using modified STn derivatives, demonstrating the potential of glycoconjugate-based cancer vaccines. Thus, the phenotypic changes associated with various disease states demonstrate the utility of cell surface oligosaccharides as a tool for making diagnoses or as a potential source of therapeutics.9 Furthermore, glycans are a common post-translational modification observed in eukaryotic systems, and as such they have many other functions involved with protein folding, stability, and function.10 Therefore, understanding the complete biogenesis of glycans is essential to fully elucidating the observed structure-function relationships, from which the biological properties of various glycans can then be further exploited for medicinal purposes.

1.1 Bacterial cell surface oligosaccharides

1.1.1 Function and characterization of the polysaccharides

While eukaryotic glycosylation is generally regarded as more important or interesting in the glycobiology community, because of its relevance to many disease states, understanding bacterial glycosylation can provide insights for further understanding eukaryotic glycosylation. A common example of prokaryotic glycosylation is the biosynthesis of cell surface oligosaccharides, polysaccharides that are commonly found connected to or associated with the outer membrane in bacterial organisms. These polysaccharides coat the cell surface and are responsible for protecting the bacteria from host immune clearance, as well as acting as an essential virulence factor for infection of the host.5 Such polysaccharides can be classified into three distinct groups depending on their association with the outer membrane: capsular polysaccharides (CPS), exopolysaccharides (EPS), and O-polysaccharides (OPS; O-antigen).11 The CPS are linear polymers that are covalently linked either by phospholipids or lipid-A molecules to the cell surface, and are highly variable among the different strains of bacteria. Meanwhile, EPS are high molecular weight polymers (up to 1000 repeating units (RUs)) which exhibit no physical attachment to the cell surface, and are commonly referred to as biofilms.12 The last type of polysaccharide commonly found in bacterial organisms is the OPS, which for the purposes of this thesis is covered in more detail than either CPS or EPS. The OPS is one of three major constituents of the bacterial cell surface lipopolysaccharide (LPS), and is composed of as many as 100 RUs. LPS appears to be essential for Gram-negative bacteria, and serves as a chemical barrier allowing the bacterium to colonize harsh physiological environments.13 More specifically, the O-polysaccharides are highly variable and can contain unique glycosidic linkages and unique monosaccharide building blocks, both of which contribute to the large number of serotypes observed for both E. coli and Salmonella. Bacterial infections (e.g., Escherichia coli O157, Streptococcus pneumoniae, and Haemophilus influenzae) constitute one of the major health problems worldwide, especially in developing countries, and as such understanding the mechanisms and processes involved in glycosylation may be beneficial for future therapeutics and vaccinations.

1.1.2 Biosynthesis of polysaccharides

While nature has designed the cell surface polysaccharides to be highly variable in structure, it has developed only a few methods to synthesize them. The biosynthesis of OPS, CPS, and EPS has been studied in vivo through gene deletion experiments, and very recently the enzymes involved in polysaccharide biosynthesis have been studied and characterized in vitro. Because of the relevance of OPS to the research in this work, this introduction focuses on the biosynthesis of the OPS, which is illustrated in Figure 1.2. This biosynthetic process begins at the cytosolic face of the plasma membrane, with the addition of a sugar phosphate to an undecaprenyl phosphate (Und-P) carrier, forming an Und-PP-linked sugar. Then, via sequential enzymatic glycosylations by specific, individual glycosyltransferases, a single repeating unit (RU) is synthesized on the Und-PP-sugar, creating an RU-PP-Und. In the example of E. coli O86 (Figure 1.2), a pentasaccharide RU is synthesized on the Und lipid carrier, and the resultant RU-PP-Und is translocated to the periplasmic side of the inner membrane by the integral membrane protein Wzx. In the periplasm, the RU-PP-Und conjugates are polymerized into a sugar polymer by the integral membrane protein Wzy,14 and the length of the polymer is regulated by Wzz.15 The completed polysaccharide is then ligated to the Lipid-A-core structure by the integral membrane protein WaaL.16 The completed LPS is then translocated to and inserted into the outer leaflet of the outer membrane by at least seven essential Lpt* enzymes.17 This process is regarded as the Wzy-dependent pathway, and the genes involved in the biosynthesis of OPS are clustered at the locus known as rfa.18 Found in this locus are the genes required for the complete processing of the OPS, such as GTs, Wzy, Wzx, Wzz, and other genes required for synthesis of the sugar nucleotide donors.19 Currently, the Wzy-dependent pathway has been reconstituted in vitro and the function of WaaL has been demonstrated in vitro; thus the last step towards in vitro chemoenzymatic synthesis of LPS is to combine the Wzy/Wzz and WaaL catalyzed reactions.20 Lastly, the difference between OPS, CPS, and EPS biosynthesis occurs at the stage of translocation and polymerization, for which two pathways other than the Wzy-dependent pathway exist. These are the ABC transporter dependent pathway and the synthase dependent pathway, both of which are less characterized and less abundant in nature.21,22

Figure 1.2. LPS biosynthesis in E. coli O86.
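The order of events in the Wzy-dependent pathway described above can be summarized as a small ordered data structure. The following Python sketch is only an illustration of the sequence given in the text, not a model of the biochemistry; the names PathwayStep, WZY_DEPENDENT_PATHWAY, and next_step are hypothetical, and the first two steps are labeled generically because the text does not name the individual enzymes.

```python
from typing import NamedTuple, Optional


class PathwayStep(NamedTuple):
    component: str  # protein(s) responsible, as named in the text
    action: str     # what happens at this step


# Ordered summary of the Wzy-dependent pathway as described in the text.
WZY_DEPENDENT_PATHWAY = [
    PathwayStep("initiating enzyme (not named in the text)",
                "add a sugar phosphate to Und-P, forming an Und-PP-linked sugar"),
    PathwayStep("glycosyltransferases",
                "sequentially build one repeating unit, giving RU-PP-Und"),
    PathwayStep("Wzx",
                "translocate RU-PP-Und to the periplasmic side of the inner membrane"),
    PathwayStep("Wzy", "polymerize the RU-PP-Und conjugates"),
    PathwayStep("Wzz", "regulate the length of the polymer"),
    PathwayStep("WaaL", "ligate the polysaccharide to the Lipid-A-core"),
    PathwayStep("Lpt proteins",
                "move the completed LPS to the outer leaflet of the outer membrane"),
]


def next_step(component: str) -> Optional[PathwayStep]:
    """Return the step that follows the named component, or None if it is last."""
    names = [step.component for step in WZY_DEPENDENT_PATHWAY]
    index = names.index(component)  # raises ValueError for an unknown component
    if index + 1 < len(WZY_DEPENDENT_PATHWAY):
        return WZY_DEPENDENT_PATHWAY[index + 1]
    return None


if __name__ == "__main__":
    for number, step in enumerate(WZY_DEPENDENT_PATHWAY, start=1):
        print(f"{number}. {step.component}: {step.action}")
```

Listing the steps this way makes the dependency explicit: polymerization by Wzy acts on repeating units that Wzx has already translocated, and ligation by WaaL follows chain-length regulation by Wzz.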

1.2 Polysaccharide synthesis and exploiting nature to do chemistry

A significant problem in the glycobiology community is understanding the exact roles that various polysaccharides play in nature, specifically with respect to their structure-function relationships. To further investigate the roles of various oligosaccharides, which can aid in the design of glycan-related vaccines or other therapeutics, one must be able to obtain pure, complex oligosaccharides. Two primary strategies are employed for obtaining pure oligosaccharides: traditional chemical synthesis and enzymatic synthesis. While both methodologies suffer from various limitations, there are numerous examples in which each has been applied to the in vitro synthesis of biologically relevant oligosaccharides.

1.2.1 Traditional chemical synthesis

Assembling complex glycans using traditional organic chemistry has suffered from the same limitation for several decades, which originates from the fact that each sugar moiety contains several hydroxyl groups of similar reactivity. Thus, chemically synthesizing oligosaccharides that contain specific stereospecific and regiospecific glycosidic linkages requires numerous cumbersome and laborious preparation steps.23 In the chemical synthesis of complex, branched oligosaccharides, limited yields are frequently observed because of the necessity of constant protection and deprotection steps to control the reactivity of the numerous hydroxyl groups. The basic premise behind the protection of specific hydroxyl groups is to prevent reactions from occurring at an undesired hydroxyl group. The protecting groups can later be removed to yield the free, unreacted hydroxyl groups. Figure 1.3 briefly illustrates the challenges in chemically synthesizing a trisaccharide from the nonreducing end to the reducing end.24 Initially, in frame I, one monosaccharide (C) is fully protected with differing protecting groups Px, and is to be converted into an activated donor. Secondly, another monosaccharide (D) is protected at all but one hydroxyl group (C2, C3, C4, or C6), which will serve as the free hydroxyl group allowing for the creation of a glycosidic linkage. Then, in frame II, monosaccharide C has its anomeric protecting group converted into a leaving group (LG). Next, in the presence of a promoter and monosaccharide D, a new stereoselective glycosidic linkage is generated through a glycosylation reaction. This protocol can either be repeated to assemble large oligosaccharides or glycoconjugates, or the entire oligosaccharide can be globally deprotected, yielding the final oligosaccharide. This example, while seemingly very basic, exploits very intricate chemical reactions, which allow for the desired stereoselectivity of the newly formed glycosidic linkages. The entire process involves many individual reactions, protection/deprotection steps, and purifications. Because of all the steps required to synthesize a complex oligosaccharide using traditional chemistry, it is common to observe very low yields, which currently makes complex carbohydrate research a difficult field.

Figure 1.3. Chemical synthesis of an oligosaccharide.

While chemical synthesis of complex oligosaccharides is difficult, there have been several prominent examples in which researchers have exploited new and interesting chemistry for the synthesis of complex glycans. Firstly, Wong et al. developed an efficient orthogonal protection-deprotection strategy, which has proven useful for a programmable approach to combinatorial carbohydrate synthesis.25 Another approach to synthesizing complex oligosaccharides, similar to peptide chemistry, is Seeberger's automated solid-phase synthesis, which utilizes glycosyl phosphates as building blocks, allowing for the stepwise incorporation of sugar moieties.26 A third method, which exploits a combinatorial one-pot synthesis approach, was developed by Wang et al., who designed a new method for distinguishing the reactivity of all non-anomeric hydroxyl groups, thereby facilitating the synthesis of β1,6-glucans and a library of oligosaccharides based on the influenza virus binding trisaccharide.24 In an example more closely related to glycobiology, Liu et al. synthesized a bisphosphorylated mannose-6-phosphate N-glycan, which was utilized for in vivo fluorescence imaging, demonstrating that organic synthesis can be utilized for the synthesis of attractive and practical oligosaccharides (Figure 1.4).27 Related to the research discussed in this thesis, recent work has demonstrated that chemical synthesis can be used for relatively large scale production of O-antigens (minus the lipid-PP-moiety), such as in the synthesis of the O-antigen tetrasaccharide from Azospirillum lipoferum.28 Lastly, while there are many good examples of new and innovative approaches towards complex oligosaccharide synthesis, there exists no universal method for the chemical synthesis of complex oligosaccharides in good yields.

Figure 1.4. Mannose-6-phosphate N-glycan; the glycan and linker were synthesized by traditional chemical approaches.

1.2.2 Enzymatic synthesis

Nature has developed an efficient method for creating regiospecific and stereospecific glycosidic bonds by utilizing a unique class of enzymes. In comparison to chemical methods for oligosaccharide synthesis, enzymatic glycosylation avoids the tedious protection/deprotection steps and can proceed at physiological pH and in aqueous solution. Thus, enzymes are an attractive alternative to chemical approaches and have been widely studied in the last decade for the synthesis of bio-attractive oligosaccharides.29 Two classes of enzymes are employed for the synthesis of oligosaccharides: glycosyltransferases and glycosidases.30,31 The former class is the most widely used in glycan synthesis and has been exploited for the synthesis of many interesting and relevant oligosaccharides, whereas the latter typically hydrolyzes glycosidic linkages but, under controlled conditions, can be used to synthesize glycosidic bonds.

Glycosyltransferases

The glycosyltransferase, one of the most important and abundant enzymes in nature, is able to transfer a monosaccharide moiety from the corresponding sugar nucleotide donor to a specific hydroxyl group of a sugar acceptor, protein, or lipid. There are currently nine Leloir sugar nucleotides, which are the most common sugar nucleotide donors in eukaryotic systems (UDP-Glc, UDP-Gal, UDP-GlcNAc, UDP-GalNAc, UDP-GlcA, UDP-Xyl, GDP-Fuc, GDP-Man, and CMP-Neu5Ac). These serve as the high energy substrate, thereby allowing catalysis to occur.32 Therefore, a Leloir glycosyltransferase will utilize one of these nine donors, transfer the corresponding sugar to the acceptor, and release the nucleoside di- or monophosphate (Figure 1.5 A).

Furthermore, GTs are able to regulate the stereochemistry of the newly created glycosidic linkage between the donor and acceptor molecules, which is defined as either retention or inversion of the configuration at the anomeric carbon with respect to the sugar donor (Figure 1.5 B). Because the newly created glycosidic linkage can adopt either of two stereochemical outcomes, the mechanistic details that lead to inversion or retention are likely to differ. It is proposed from the available crystal structures of the inverting GTs that these enzymes utilize a direct displacement mechanism that resembles an SN2 reaction. Conversely, the mechanism of the retaining GTs is less well characterized; however, a prominent view in the field is that these GTs utilize a double displacement mechanism that proceeds through a covalently bound glycosyl-enzyme intermediate.33
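The donor bookkeeping described above can be summarized in a short sketch. The following Python snippet is illustrative only: it tabulates the nine Leloir donors named in the text and derives the nucleotide by-product from the donor name. The helper names `released_nucleotide` and `transfer` are hypothetical, and the sketch deliberately ignores the linkage position and the anomeric configuration of the product.

```python
# Nine Leloir sugar nucleotide donors, as listed in the text.
LELOIR_DONORS = [
    "UDP-Glc", "UDP-Gal", "UDP-GlcNAc", "UDP-GalNAc", "UDP-GlcA",
    "UDP-Xyl", "GDP-Fuc", "GDP-Man", "CMP-Neu5Ac",
]


def released_nucleotide(donor: str) -> str:
    """Return the nucleotide by-product (UDP, GDP or CMP) for a Leloir donor.

    Hypothetical helper: the by-product is the nucleotide prefix of the
    donor name, e.g. "UDP-Gal" -> "UDP".
    """
    if donor not in LELOIR_DONORS:
        raise ValueError(f"{donor!r} is not one of the nine Leloir donors")
    return donor.split("-", 1)[0]


def transfer(donor: str, acceptor: str) -> tuple[str, str]:
    """Model one glycosyl transfer as (product name, released nucleotide).

    Linkage and stereochemistry are not represented in this sketch.
    """
    nucleotide = released_nucleotide(donor)
    sugar = donor.split("-", 1)[1]
    return f"{sugar}-{acceptor}", nucleotide


# The galactosyltransferase reaction of Figure 1.5 A: Gal moves from UDP-Gal to GalNAc.
print(transfer("UDP-Gal", "GalNAc"))
```

For the galactosyltransferase example of Figure 1.5 A, the sketch pairs the disaccharide product with UDP as the released by-product.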

As a result of genetic studies, over 60,000 putative GTs are housed in the CAZy database (www.cazy.org). Over the last ten years many of them have had their three-dimensional structures resolved, generally by X-ray crystallography. The available crystal structures revealed that there are only two general folds that nucleotide-sugar-dependent GTs adopt, which are referred to as GT-A and GT-B.34 Further threading analysis has revealed that many of the uncharacterized GTs adopt one of these two folds, suggesting that the majority of GTs may have evolved from a small number of progenitor sequences.30 The two folds are analogous in that they both adopt similar β/α/β Rossmann domains; however, the orientation in which the domains are situated appears to differ.

Figure 1.5. Glycosyltransferase catalyzed reactions: A) Reaction catalyzed by a galactosyltransferase, transferring Gal from UDP-Gal to GalNAc; B) Comparison in mechanism between a retaining and inverting glycosyltransferase.

The GT-A fold consists of an open twisted β-sheet surrounded by α-helices on both sides, and is usually generalized as two abutting Rossmann-like folds. The two tightly associated β/α/β domains are usually similar in size, which leads to the formation of a continuous central β-sheet. This has led some to describe the GT-A fold as a single domain fold; however, there are distinct nucleotide and acceptor binding domains.35

Furthermore, most GT-A enzymes possess a signature DXD (Asp-X-Asp) motif, which is responsible for coordinating a divalent metal cation, thereby promoting catalysis.36 This motif is not a determining characteristic of the GT-A type GTs, as will be shown in this thesis, but it is a frequent characteristic of this class of transferases.
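As a minimal sketch of how such a motif might be located in a primary sequence, the Python snippet below scans a one-letter amino acid string for every Asp-X-Asp occurrence. The function name `find_dxd_motifs` and the example sequence are made up for illustration; the sequence is not a real glycosyltransferase. Because the motif is only three residues long it will often occur by chance, so a hit is at best a hint and, as noted above, neither its presence nor its absence determines the fold.

```python
import re

# D-X-D: aspartate, any residue, aspartate (one-letter amino acid codes).
# The lookahead allows overlapping occurrences (e.g. "DADAD") to be reported.
DXD_PATTERN = re.compile(r"(?=(D[A-Z]D))")


def find_dxd_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, motif) for every DXD occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in DXD_PATTERN.finditer(seq)]


# Invented sequence, used only to exercise the function.
example = "MKTLLVDVDGSAHRDADADK"
print(find_dxd_motifs(example))  # [(7, 'DVD'), (15, 'DAD'), (17, 'DAD')]
```

Positions are reported with 1-based numbering, following the usual convention for protein sequences.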

Like the GT-A fold, GT-B enzymes consist of two Rossmann-like domains; however, the domains are much less tightly associated, and the active site is created within the cleft between the two domains. The two domains are most likely associated with binding of either the acceptor or donor substrate. Regardless of the fold that a GT adopts (GT-A vs. GT-B), the overall fold of the enzyme does not appear to dictate the stereochemical outcome of the reaction that is catalyzed, as there are numerous examples of both retaining and inverting GTs within each fold.

There is a third type of fold that has recently been described, the GT-C fold. This type of GT is proposed to bind lipid phosphate-activated donor sugar substrates, as several lipid anchored GTs cannot adopt the traditional dual-Rossmann domains. There is little structural evidence proving the existence of the GT-C fold, and this hampers the ability to definitively characterize a GT as adopting a GT-C like fold. However, further work is being done in order to determine if the putative GT-C folds are evolutionarily related, and what their relationship is to other major classes of carbohydrate-active enzymes.30 Unlike the well characterized and studied GT-A and GT-B folds, to date, all enzymes that adopt the predicted GT-C fold belong to inverting glycosyltransferase families, which may be a consequence of utilizing the lipid-linked activated donor.

The GT-A, GT-B, and putative GT-C folds appear to apply to GTs from all domains of life, including prokaryotes and eukaryotes. While there are some similarities between the GTs from prokaryotes and eukaryotes, there are many distinct differences, which make eukaryotic GTs more challenging than prokaryotic GTs to exploit for the synthesis of oligosaccharides.37 For example, among the putative GT sequences from various eukaryotic organisms, many if not most contain at least one putative membrane domain, and thus require the membrane for complete enzymatic function.38 Membrane bound or anchored GTs are traditionally difficult to express in large amounts, and have been known to display very tight substrate specificities. Conversely, bacterial GTs are usually only membrane "associated", not membrane bound, and can easily be purified in soluble, active form. Furthermore, it is relatively easy to obtain genomic DNA from bacterial species, whereas cDNA libraries are usually required for the cloning of eukaryotic transferases. Lastly, there are numerous examples where bacterial GTs have exhibited promiscuous substrate specificities and thus are more advantageous than eukaryotic GTs for the in vitro synthesis of oligosaccharides.

While enzymatic synthesis of oligosaccharides has been demonstrated to be useful, there are many drawbacks, and chemical synthesis is still advantageous in some cases. The yield for enzymatic glycosylation is usually high; however, the purification and handling of certain GTs is not always easy, and many of the reagents required for GT-mediated reactions (such as the sugar nucleotide donor substrates) are very expensive or not even commercially available. Furthermore, there can be significant competitive inhibition of the glycosyltransferases by the by-products, such as the nucleotide phosphates. Thus, similar to chemical syntheses, cost effective methods are sought for large scale oligosaccharide synthesis, such as the regeneration of the donor molecules using the Super-bug and Super-bead technology, as well as alternative routes to the required donors/acceptors.39-41

Glycosidases

During the synthesis of glycoproteins, glycosidases are responsible for the cleavage of glycosidic linkages, thereby controlling the size of the glycan portion of the glycoprotein. However, under certain conditions (such as point mutations), glycosidases (then termed glycosynthases) can efficiently synthesize oligosaccharides by creating new glycosidic bonds instead of hydrolyzing them. Generally, to create a competent glycosynthase, one needs to mutate the catalytic nucleophile, such as Glu, to a non-nucleophilic residue, which abolishes the enzyme's ability to attack the glycosidic bond. After this mutation is made, it has been shown that when glycosyl fluoride substrates with the opposite anomeric configuration to that of the native donors are used, the enzyme is able to catalyze the formation of a glycosidic bond between this glycosyl donor and a suitable acceptor.

An example of a glycosynthase mediated reaction is shown in Figure 1.6, whereby an alanine nucleophile mutant of the β-galactosidase/β-glucosidase from Agrobacterium sp. was created, and it could efficiently create a glycosidic linkage between the glucosyl fluoride substrate and a glucose derived acceptor. While these enzymes may require donor molecules that are less expensive or less difficult to obtain than those required by GTs, the glycosynthases exhibit weak regiospecificity; thus various linkages can be formed. While many examples of glycosynthases exist, genetic manipulation and in vitro enzymatic assays must be performed to create an efficient glycosynthase.42,43

Figure 1.6. Glycosynthase-catalyzed reaction.

1.3 Current methodologies for glycosyltransferase identification and characterization

Currently, more than 200 GTs have been identified in the , and further predictions estimate that GT sequences account for about 1% of ORFs across all sequenced genomes. Two recent publications have accurately assessed the current state of glycosyltransferase functional characterization: "The precise biochemical function of many (putative GT) sequences is still unknown, and elucidating the donor/acceptor specificity of these GT 'orphan sequences' remains a major challenge in glycobiology,"44 and "Their functional characterization remains an enormous challenge given that the donor and acceptor substrates of less than 5% of the entries are known."45

The reason for the difficulty observed in assigning a function to a putative glycosyltransferase is that there is little correlation between the primary amino acid sequence and function. While homology may exist between different glycosyltransferases, homology does not necessarily translate to similarity in mechanism and/or acceptor/donor specificity. To address this problem, the CAZy (carbohydrate-active enzymes) database (http://www.cazy.org) was introduced, which organizes putative or known glycosyltransferases into 92 different GT families, based on sequence similarities.46 While GTs of a given family have sequence similarity, which may correlate to a similar mechanism (retaining vs. inverting) and a similar topological fold (GT-A vs. GT-B), the donor and acceptor specificities can vary significantly. One example is GT family 1, which contains 3724 GT gene sequences that all share the GT-B fold and an inverting mechanism; however, the donor specificity varies between galactose, glucose, xylose, fucose, and even rhamnose. Therefore, while the CAZy database conveniently organizes the approximately 60,000 putative GT sequences into 92 distinct families, much remains to be addressed before a function and specificity can be assigned to a putative GT.

To date there has been little done to address the issue of assigning function to a putative gene without any previous knowledge of the putative in vivo function. The exact specificities of most of the GTs are not yet determined because in vitro experiments must be performed, which is challenging because of the limited availability of the donor and acceptor substrates. Many prokaryotic GTs may require very obscure sugar donors which may either not be commercially available or are very difficult to synthesize either enzymatically or chemically. One example is GDP-fucose: while it is commercially available at $600 per 5 mg, it is difficult to synthesize chemically, and in order to fully characterize a putative GT one typically needs more than just 5 mg.
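To put the quoted price in perspective, the arithmetic can be sketched as follows. The price and pack size are those quoted in the text; the 50 mg requirement is a purely hypothetical figure chosen for illustration, and `donor_cost_usd` is an illustrative helper name rather than an established tool.

```python
PRICE_USD = 600.0    # price quoted in the text
PACK_SIZE_MG = 5.0   # pack size quoted in the text


def donor_cost_usd(required_mg: float) -> float:
    """Cost of a given mass of donor at the quoted price, scaled linearly."""
    if required_mg < 0:
        raise ValueError("required_mg must be non-negative")
    return required_mg * PRICE_USD / PACK_SIZE_MG


print(donor_cost_usd(1.0))   # 120.0 USD per mg
print(donor_cost_usd(50.0))  # 6000.0 USD for a hypothetical 50 mg requirement
```

Linear scaling is itself an assumption, since bulk pricing may differ, but it indicates why purchasing the donor quickly becomes impractical for a full characterization.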

Chemoenzymatic approaches, such as utilizing the biosynthetic pathway in vitro to synthesize the required sugar nucleotide derivatives, are becoming an attractive alternative for producing the required donors. Similarly, unusual or difficult to obtain acceptor substrates are frequently required, such as Und-PP-sugar linked substrates, which are not commercially available and can take weeks to months to synthesize chemically. Regardless of the ability to obtain a sugar nucleotide or suitable acceptor, current technologies are still very primitive, and the availability of the donor and acceptor molecules is a substantial bottleneck in the ability to assign function to a putative GT.

Traditionally, a few approaches have been undertaken to determine the function of a putative gene. One method that is performed in the Wang lab deals with the characterization of GTs that are involved in O-antigen biosynthesis. Commonly, O-antigen structures are determined by NMR spectroscopy, in which the O-antigen is isolated from the bacterial organism and further subjected to many different chemical analyses.47,48 In addition, the genes responsible for O-antigen biosynthesis can be easily identified by gene sequencing, because most of the genes involved in the biosynthesis are housed between the conserved galF and gnd genes.49 After the gene cluster and O-antigen structure have been determined, putative functions can be assigned to the identified glycosyltransferases, and further in vitro experiments must be performed to fully elucidate the exact function of each GT.

Currently there are several methods by which glycosyltransferase activity can be detected, and these are briefly summarized in Figure 1.7.44 Figure 1.7 A shows a common method for monitoring the GT catalyzed reaction, which involves detection of the NDP by-product by HPLC/UV, chemosensor, or immunodetection. Due to the presence of the UV-active nucleobase, simple UV detection can follow the consumption of the sugar-nucleotide donor by HPLC, capillary electrophoresis (CE), or ion exchange chromatography; however, it is imperative to determine the amount of enzymatic hydrolysis of the donor under the assay conditions.50,51 An attractive method for determining the activity of a GT is a mass spectrometry-based assay, such as that used by Davis and co-workers for the high throughput screening of two GTs, yielding an acceptor specificity map for each enzyme.52 This method is very useful for identifying glycosyltransferase activity, but not linkage stereochemistry. Another widely used approach, although very limiting for the screening of GT mediated reactions, is the use of a "tag" molecule, as shown in Figure 1.7 B. One such "tag" is a radiolabeled (3H or 14C) donor molecule, which affords high sensitivity and is particularly effective if only low levels of enzymatic activity can be obtained.53 In this method the "tag" or radiolabel is located on the sugar moiety of the sugar-nucleotide donor; the radiolabeled sugar is enzymatically transferred to the acceptor substrate, and the product can subsequently be quantified by scintillation counting. The unreacted radiolabeled donor can be selectively removed by various chromatographic methods, such as ion-exchange chromatography, or the product can be purified away from the donor by reversed phase chromatography. One group has demonstrated that the radiolabel approach can be exploited for a high throughput screen for inhibitors of a known fucosyltransferase, utilizing scintillation proximity assays with scintillant coated microspheres.54 While the radiolabel approach is excellent and highly utilized in the current literature, it suffers from the limited availability and high cost of radiolabeled donors, making this approach difficult for high throughput screening of GT activities. Other approaches for detecting GT activities in vitro have utilized various UV-active or fluorescent analogues which allow for quick analysis of product formation (Figure 1.7 B & C), as well as pH based assays.55 Examples of fluorescence based assays are the commercially available "Transcreener assay",56 Alexa Fluor labeled donors,57 fluorescein/BODIPY labeled acceptor substrates,58 and a FRET-based GT assay,59 in each of which the product or by-product has some form of fluorescence that can be easily detected.

Because of the complexity involved in a GT mediated reaction, many of the methods already mentioned, while proven to be very powerful techniques, are not well suited to the high throughput screening of GTs for activity. Each methodology suffers from similar limitations, most commonly the lack of a library of available substrates or the tight selectivity of GTs, which limits their ability to accept donors/acceptors bearing UV/fluorescent probes. Many GTs have exhibited very tight substrate specificity for the donor and acceptor molecules, and thus using acceptors/donors with bulky probes may not be realistic for the high throughput screening of GT activity. Thus, a technique for quickly assaying a putative GT for activity will likely need to utilize the optimal acceptor/donor combination, and is likely to involve nanotechnology or microarray technology, thereby minimizing the amounts of acceptors/donors required.44 Progress has already been made in the carbohydrate microarray field, which has demonstrated the principle of screening GTs for activity,60 selectivity,61 and potential inhibitors.62

Figure 1.7. General principles for determining GT activity. A) Detection methods for the primary and secondary products of GT mediated reactions; B) Method of detection using "tagged" donors or acceptors; C) FRET based assay; D) fluorescence based ligand displacement assay.

1.4 Chemical biology approaches to investigate oligosaccharides in living systems

The emerging field of glycomics has seen substantial growth in the last decade, due in part to a great deal of interesting and creative chemical biology. The study of glycans has been limited for various reasons, one of which is that the synthesis of oligosaccharides is not a direct result of primary gene products. Glycan biosynthesis is neither template driven nor under direct transcriptional control; therefore, in comparison to nucleic acids or proteins, genetic manipulation of genes related to glycan synthesis can result in undesired phenotypic effects. For example, disruption of specific genes related to O-antigen biosynthesis in E. coli can result in lethal phenotypes due to the accumulation or biosynthesis of undesired glycans. Another reason why investigating glycan biogenesis is challenging is that glycans are extremely heterogeneous. The structures of complex glycans are usually highly branched, with branches extending from various sugars, proteins, and lipids, suggesting that the biosynthesis of complex glycans may be more challenging to study than that of either proteins or DNA. Conversely, the subjects of proteomics and genomics appear, at least initially, easier to manipulate and to study at both the transcriptional level and the cellular level. For example, using various fluorescent reporter probes, such as green fluorescent protein (GFP), one can quickly assess the location of a certain protein in living organisms, usually without any concerns of lethality for the organism. This type of technology has only begun to surface in the glycomics literature, primarily because such probes are not readily available or controllable at the carbohydrate level. Thus chemical biology approaches have begun to replace genetic approaches for glycan imaging and the investigation of glycan biogenesis.

The current trend of monitoring and altering glycan structures in living organisms, which allows for probing the glycans' biological functions, has been pursued by two different approaches: 1) incorporation of unnatural carbohydrates or carbohydrate mimics into existing glycans; and 2) inhibition of biosynthetic pathways by small molecules. Due to the relevance of the former method to this thesis, only this strategy will be discussed.

Glycosyltransferases are the universal machines which assemble glycans in vivo, and it is these enzymes (and other processing enzymes) which can be exploited for the introduction of unnatural substrates into the glycan biosynthetic pathways. This method utilizes traditional chemistry techniques to synthesize monosaccharide derivatives that include novel functional groups, which may further result in interesting phenotypic changes. The premise of this technique is that the derivatized monosaccharides can be recognized by the organism's endogenous glycan biosynthetic pathways, resulting in incorporation of the modified sugar moiety into the cell's glycans. It is this incorporation and presentation of the unnatural monosaccharide moiety that can be exploited to modulate or investigate various cellular communications, as well as to image the glycans.63,64 The latter subject has become immensely popular in the last several years, as many groups have begun to incorporate bioorthogonal groups into monosaccharides, which can be selectively targeted by chemical reporters, allowing imaging of the cell surface oligosaccharides. Such bioorthogonal groups include the azide and ketone modifications, which can be chemically incorporated into monosaccharide precursors; while the modified monosaccharide is cycled through the glycan biosynthetic pathways, the chemical modification remains unreacted and is eventually displayed on the cell surface (Figure 1.8).65 Figure 1.8 illustrates the incorporation of GalNAz, the azido modified derivative of GalNAc, into the cell surface oligosaccharides. This novel sugar is first imported into the cytoplasm, after which it can be processed by the enzymes of the GalNAc salvage pathway. The host cell utilizes GalNAz as it would GalNAc, converting GalNAz into the corresponding sugar nucleotide donor, which is a substrate for one or more GalNAcTs.

The two functional groups mentioned (azide and ketone) are among the most popular bioorthogonal groups which have been used to image cell surface glycans in both eukaryotic and prokaryotic organisms. The azide has demonstrated immense utility as a chemical reporter for labeling various molecules in nature since it is absent from most living organisms and has little reactivity towards endogenous substrates (at physiological pH and temperatures). Through either click chemistry (with either an alkyne and a copper catalyst or a strained alkyne) or the Staudinger ligation (with phosphines), the azide can be selectively targeted, allowing efficient labeling of the cell surface of various organisms.65-68 In addition to the azide, the ketone and aldehyde are functional groups not commonly found in living organisms, and are also able to act as bioorthogonal groups for labeling cell surface molecules. The ketone and aldehyde can form reversible Schiff bases with primary amines, and can be labeled by reaction with aminooxy or hydrazide groups, typically at lower pH values (5-6) (Figure 1.9).69 This pH range is typically unachievable inside the cell, and thus the ketone/aldehyde functional groups have mainly been utilized for the labeling of cell surface proteins and oligosaccharides.70

Figure 1.8. Incorporation of a modified monosaccharide into the cell surface oligosaccharides.

1.5 Outline of the work described in this thesis

The work in this thesis focuses on the class of enzymes called glycosyltransferases, more specifically on the identification, characterization, and utilization of these enzymes. The first two chapters of this work focus on the biochemical characterization of putative glycosyltransferases, then we move on to identifying the function of putative glycosyltransferases, and lastly we exploit several glycosyltransferases to perform new chemistries. Chapter 2 begins by characterizing the uncharacterized glycosyltransferases from the E. coli O128 and E. coli O127 O-antigen gene clusters, thereby expanding the available models for O-antigen biosynthesis beyond the traditional E. coli O86. Using E. coli O128, we first elucidate the enzymatic function of three wbs* genes, identify the order of sequential glycosylation of the O-antigen repeating unit, demonstrate the importance of the glycosyltransferase genes for O-antigen biosynthesis, and lastly show in vitro protein-protein interactions between the four glycosyltransferases. Similarly, for E. coli O127, we demonstrate the function of the two remaining uncharacterized glycosyltransferases, and perform a detailed biochemical investigation of WbiQ. In Chapter 3, a complete biochemical characterization was performed of WbnI, an α1,3-galactosyltransferase that is an interesting homologue of the human blood group B α1,3-galactosyltransferase. In Chapter 4, in collaboration with the Mrksich laboratory, a "high throughput" methodology was developed which allowed for the assignment of function to putative glycosyltransferases. Using a panel of sugar donors and acceptors, we expressed 85 putative GTs in vitro and assayed their functions using SAMDI technology. Lastly, in Chapter 5, we exploit the biosynthetic pathways for nucleotide donor biosynthesis both in vivo and in vitro to synthesize novel sugar nucleotide donors which can be utilized by various glycosyltransferases in vivo and in vitro. This work further supplements previous work done by the Wang lab in understanding, characterizing, and exploiting glycosyltransferases to perform new chemistries in vivo and in vitro.

Figure 1.9. Incorporation and visualization of ketone/aldehyde moieties on the cell surface.

REFERENCES

(1) Ohtsubo, K.; Marth, J. D. Cell 2006, 126, 855-67.

(2) Taniguchi, N.; Honke, K.; Fukuda, M. Handbook of Glycosyltransferase and Related Genes; Springer: Tokyo, 2002.

(3) Cole, C. L.; Hansen, S. U.; Barath, M.; Rushton, G.; Gardiner, J. M.; Avizienyte, E.; Jayson, G. C. PLoS One 2010, 5, e11644.

(4) Comstock, L. E.; Kasper, D. L. Cell 2006, 126, 847-50.

(5) Varki, A. Nature 2007, 446, 1023-9.

(6) Yi, W.; Bystricky, P.; Yao, Q.; Guo, H.; Zhu, L.; Li, H.; Shen, J.; Li, M.; Ganguly, S.; Bush, C. A.; Wang, P. G. Carbohydr Res 2006, 341, 100-8.

(7) Julien, S.; Krzewinski-Recchi, M. A.; Harduin-Lepers, A.; Gouyer, V.; Huet, G.; Le Bourhis, X.; Delannoy, P. Glycoconj J 2001, 18, 883-93.

(8) Wu, J.; Guo, Z. Bioconjug Chem 2006, 17, 1537-44.

(9) Julien, S.; Picco, G.; Sewell, R.; Vercoutter-Edouart, A. S.; Tarp, M.; Miles, D.; Clausen, H.; Taylor-Papadimitriou, J.; Burchell, J. M. Br J Cancer 2009, 100, 1746-54.

(10) Helenius, A.; Aebi, M. Annu Rev Biochem 2004, 73, 1019-49.

(11) Evrard, B.; Balestrino, D.; Dosgilbert, A.; Bouya-Gachancard, J. L.; Charbonnel, N.; Forestier, C.; Tridon, A. Infect Immun 2010, 78, 210-9.

(12) Guttenplan, S. B.; Blair, K. M.; Kearns, D. B. PLoS Genet 2010, 6, e1001243.

(13) Duda, K. A.; Lindner, B.; Brade, H.; Leimbach, A.; Brzuszkiewicz, E.; Dobrindt, U.; Holst, O. Microbiology 2011.

(14) Kim, T. H.; Sebastian, S.; Pinkham, J. T.; Ross, R. A.; Blalock, L. T.; Kasper, D. L. J Biol Chem 2010, 285, 27839-49.

(15) Whitfield, C. Nat Chem Biol 2010, 6, 403-4.

(16) Islam, S. T.; Taylor, V. L.; Qi, M.; Lam, J. S. MBio 2010, 1.

(17) Freinkman, E.; Chng, S. S.; Kahne, D. Proc Natl Acad Sci U S A 2011, 108, 2486-91.

(18) Valvano, M. A. Front Biosci 2003, 8, s452-71.

(19) Cai, C. S.; Zhu, Y. Z.; Zhong, Y.; Xin, X. F.; Jiang, X. G.; Lou, X. L.; He, P.; Qin, J. H.; Zhao, G. P.; Wang, S. Y.; Guo, X. K. BMC Microbiol 2010, 10, 67.

(20) Woodward, R.; Yi, W.; Li, L.; Zhao, G.; Eguchi, H.; Sridhar, P. R.; Guo, H.; Song, J. K.; Motari, E.; Cai, L.; Kelleher, P.; Liu, X.; Han, W.; Zhang, W.; Ding, Y.; Li, M.; Wang, P. G. Nat Chem Biol 2010, 6, 418-23.

(21) Young, J.; Holland, I. B. Biochim Biophys Acta 1999, 1461, 177-200.

(22) Raetz, C. R.; Whitfield, C. Annu Rev Biochem 2002, 71, 635-700.

(23) Wang, P. G. Nat Chem Biol 2007, 3, 309-10.

(24) Wang, C. C.; Lee, J. C.; Luo, S. Y.; Kulkarni, S. S.; Huang, Y. W.; Lee, C. C.; Chang, K. L.; Hung, S. C. Nature 2007, 446, 896-9.

(25) Sears, P.; Wong, C. H. Science 2001, 291, 2344-50.

(26) Burkhart, F.; Zhang, Z.; Wacowich-Sgarbi, S.; Wong, C. H. Angew Chem Int Ed Engl 2001, 40, 1274-1277.

(27) Liu, Y.; Chan, Y. M.; Wu, J.; Chen, C.; Benesi, A.; Hu, J.; Wang, Y.; Chen, G. Chembiochem 2011, 12, 685-90.

(28) Verma, P. R.; Mukhopadhyay, B. Carbohydr Res 2010, 345, 432-6.

(29) Zhou, G.; Liu, X.; Su, D.; Li, L.; Xiao, M.; Wang, P. G. Bioorg Med Chem Lett 2011, 21, 311-4.

(30) Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu Rev Biochem 2008, 77, 521-55.

(31) Hidaka, M.; Fushinobu, S.; Honda, Y.; Wakagi, T.; Shoun, H.; Kitaoka, M. J Biochem 2010, 147, 237-44.

(32) Palcic, M. M. Methods Enzymol 1994, 230, 300-16.

(33) Zechel, D. L.; Withers, S. G. Acc Chem Res 2000, 33, 11-8.

(34) Bourne, Y.; Henrissat, B. Curr Opin Struct Biol 2001, 11, 593-600.

(35) Unligil, U. M.; Rini, J. M. Curr Opin Struct Biol 2000, 10, 510-7.

(36) Wiggins, C. A.; Munro, S. Proc Natl Acad Sci U S A 1998, 95, 7945-50.

(37) Roychoudhury, R.; Pohl, N. L. Curr Opin Chem Biol, 14, 168-73.

(38) Pak, J. E.; Rini, J. M. Methods Enzymol 2006, 416, 30-48.

(39) Zhang, J.; Chen, X.; Shao, J.; Liu, Z.; Kowal, P.; Lu, Y.; Wang, P. G. Methods Enzymol 2003, 362, 106-24.

(40) Chen, X.; Liu, Z.; Zhang, J.; Zhang, W.; Kowal, P.; Wang, P. G. Chembiochem 2002, 3, 47-53.

(41) Zhao, G.; Guan, W.; Cai, L.; Wang, P. G. Nat Protoc 2010, 5, 636-46.

(42) Shaikh, F. A.; Withers, S. G. Biochem Cell Biol 2008, 86, 169-77.

(43) Mullegger, J.; Chen, H. M.; Warren, R. A.; Withers, S. G. Angew Chem Int Ed Engl 2006, 45, 2585-8.

(44) Wagner, G. K.; Pesnot, T. Chembiochem 2010, 11, 1939-49.

(45) Palcic, M. M. Curr Opin Chem Biol 2011.

(46) Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res 2009, 37, D233-8.

(47) Widmalm, G. Comprehensive Glycoscience 2007, 101-132.

(48) Lundborg, M.; Modhukur, V.; Widmalm, G. Glycobiology 2010, 20, 366-8.

(49) Samuel, G.; Reeves, P. Carbohydr Res 2003, 338, 2503-19.

(50) Taniguchi, N.; Nishikawa, A.; Fujii, S.; Gu, J. G. Methods Enzymol 1989, 179, 397-408.

(51) Kopp, M.; Rupprath, C.; Irschik, H.; Bechthold, A.; Elling, L.; Muller, R. Chembiochem 2007, 8, 813-9.

(52) Yang, M.; Brazier, M.; Edwards, R.; Davis, B. G. Chembiochem 2005, 6, 346-57.

(53) Palcic, M. M.; Pierce, M.; Hindsgaul, O. Methods Enzymol 1994, 247, 215-27.

(54) Ahsen, O.; Voigtmann, U.; Klotz, M.; Nifantiev, N.; Schottelius, A.; Ernst, A.; Muller-Tiemann, B.; Parczyk, K. Anal Biochem 2008, 372, 96-105.

(55) Persson, M.; Palcic, M. M. Anal Biochem 2008, 378, 1-7.

(56) Owicki, J. C. J Biomol Screen 2000, 5, 297-306.

(57) Helm, J. S.; Hu, Y.; Chen, L.; Gross, B.; Walker, S. J Am Chem Soc 2003, 125, 11168-9.

(58) Aharoni, A.; Thieme, K.; Chiu, C. P.; Buchini, S.; Lairson, L. L.; Chen, H.; Strynadka, N. C.; Wakarchuk, W. W.; Withers, S. G. Nat Methods 2006, 3, 609-14.

(59) Li, J. J.; Bugg, T. D. Chem Commun (Camb) 2004, 182-3.

(60) Park, S.; Shin, I. Org Lett 2007, 9, 1675-8.

(61) Blixt, O.; Allin, K.; Bohorov, O.; Liu, X.; Andersson-Sand, H.; Hoffmann, J.; Razi, N. Glycoconj J 2008, 25, 59-68.

(62) Bryan, M. C.; Lee, L. V.; Wong, C. H. Bioorg Med Chem Lett 2004, 14, 3185-8.

(63) Chang, P. V.; Prescher, J. A.; Hangauer, M. J.; Bertozzi, C. R. J Am Chem Soc 2007, 129, 8400-1.

(64) Campbell, C. T.; Sampathkumar, S. G.; Yarema, K. J. Mol Biosyst 2007, 3, 187-94.

(65) Laughlin, S. T.; Baskin, J. M.; Amacher, S. L.; Bertozzi, C. R. Science 2008, 320, 664-7.

(66) Kohn, M.; Breinbauer, R. Angew Chem Int Ed Engl 2004, 43, 3106-16.

(67) Wang, Q.; Chan, T. R.; Hilgraf, R.; Fokin, V. V.; Sharpless, K. B.; Finn, M. G. J Am Chem Soc 2003, 125, 3192-3.

(68) Agard, N. J.; Baskin, J. M.; Prescher, J. A.; Lo, A.; Bertozzi, C. R. ACS Chem Biol 2006, 1, 644-8.

(69) Zeng, Y.; Ramya, T. N.; Dirksen, A.; Dawson, P. E.; Paulson, J. C. Nat Methods 2009, 6, 207-9.

(70) Luchansky, S. J.; Goon, S.; Bertozzi, C. R. Chembiochem 2004, 5, 371-4.

CHAPTER 2

INVESTIGATION OF POLYSACCHARIDE BIOSYNTHESIS IN E. COLI

2.1 Investigation of two glycosyltransferases from E. coli O127

2.1.1 Introduction

Fucosylation, the enzymatic transfer of L-fucose to either an oligosaccharide or a protein is accomplished by a class of enzymes called fucosyltransferases (FucTs). FucTs are an important class of enzymes for both mammals as well as bacteria. In mammalian systems, fucose containing glycoconjugates are directly involved in many biological processes, such as fertilization, neuronal development, immune responses, cell adhesion, and in many human diseases.1-3 For example, fucosylation occurs during the synthesis of the ABO(H) and Lewis antigens, which play important roles in human physiology.4,5

In contrast, fucosylation in prokaryotes is commonly observed in the O-antigens of Gram-negative bacteria, the exposed portion of the lipopolysaccharides.6 Functions attributed to the O-antigens include, but are not limited to, virulence, molecular mimicry, clearance from the host's immune system, cell adhesion, and localization.7,8


FucTs catalyze the transfer of one fucose residue from the donor, guanosine-5'-diphospho-β-L-fucose (GDP-Fuc), to a saccharide acceptor, forming a new glycosidic linkage. Based on the glycosidic linkage formed (typically α1,2-, α1,3-, α1,4-, or α1,6-), FucTs can be classified into four different subfamilies. Among them, α1,2-FucTs belong to glycosyltransferase family 11 (http://www.cazy.org/fam/acc_GT.html) and are responsible for the transfer of fucose to galactose (Gal), forming an α1,2-linkage. This family includes many other α1,2-FucTs from humans, other mammals, viruses, plants, and bacteria. Several of the genes encoding eukaryotic α1,2-fucosyltransferases have been cloned and characterized, some of which have come from humans.9-12 FUT1 and FUT2 are two human α1,2-FucTs that are responsible for the biosynthesis of different H-antigens.13 Importantly, only a few α1,2-FucTs have been cloned from bacterial sources and subsequently characterized: WbsJ from E. coli O128, WbwK from E. coli O86, and FutC from Helicobacter pylori.14-16 Of these, WbsJ and FutC have unique substrate specificities and have demonstrated applicability in the synthesis of relevant fucose-containing oligosaccharides.14,16 Regardless of the species from which an α1,2-FucT was cloned, no structural information is yet available for the α1,2-FucT subfamily (unlike the evolutionarily related α1,6-FucT subfamily).17 As such, our understanding of the α1,2-FucT mechanism and of the roles of specific amino acid motifs is limited and is based on the available α1,6-FucT and α1,3-FucT crystal structures.

Enteropathogenic Escherichia coli O127:K63(B8) (EPEC) is associated with infantile diarrhea in developing countries and is an example of a pathogen that displays blood group antigens on its cell surface.18,19

Figure 2.1. O-antigen repeat unit structure of E. coli O127 and its biosynthetic gene cluster. The bond formed by WbiQ is indicated with an arrow, and the residues highlighted in red form the H-antigen mimic. The bond formed by WbiN is indicated with an arrow.

The O-antigen structure of E. coli O127 (Figure 2.1) exhibits molecular mimicry of the human blood group H-antigen and was reported to possess human blood group H (O) activity.20 From the O-antigen biosynthetic gene cluster multiple genes were identified as glycosyltransferases involved in the assembly of the E. coli O127 polysaccharide, from which orf13 was identified as a putative α1,2-FucT.15,21 Thus, we propose that wbiQ encodes an α1,2-FucT that makes the Fuc-α1,2-Gal-β1,3-GalNAc structure (human blood group H-antigen mimic) present in the O-antigen repeating unit. This work describes the overexpression, purification, and subcellular localization of GST-WbiQ. After overexpression in E. coli, the activity of WbiQ was optimized under different pH conditions and the influence of metal cations was tested. Furthermore, using a panel of acceptors, WbiQ showed strict acceptor substrate specificity and was only active toward acceptors that contained the Gal-β1,3-GalNAc-α-OR structure, forming Fuc-α1,2-Gal-β1,3-GalNAc-α-OR. Based on the acceptor substrate specificity, WbiQ was used in the preparative synthesis of H-type 3 blood group antigen (Fuc-α1,2-Gal-β1,3-GalNAc-α-OMe). Lastly, we cloned and expressed WbiN, the putative α1,3-GalNAcT, which is responsible for forming the linkage GalNAc-α1,3-GalNAc-PP-Und in vivo.

2.1.2 Experimental methods

Bacterial strains, plasmids, and reagents

E. coli competent cell DH5α [lacZΔM15 hsdR recA] was obtained from Invitrogen. E. coli competent cell BL21 (DE3) [F- ompT hsdSB (rB- mB-) gal dcm (DE3)] was obtained from Stratagene. The plasmid pGEX-4T-1 was obtained from GE Healthcare Life Sciences. Restriction enzymes were obtained from New England Biolabs. All reagents were from Sigma Aldrich unless otherwise noted.

Cloning and construction of wbiQ and wbiN recombinant vector

The wbiQ gene was amplified by polymerase chain reaction (PCR) from chromosomal DNA of E. coli O127 with the forward primer 5'-ATGCGAATTCATGATGTATTGCTGTCTATCC (EcoRI restriction site: GAATTC) and the reverse primer 5'-ATGCCTCGAGCTACATTGCTATCCAGTTT (XhoI restriction site: CTCGAG). The PCR product was digested with EcoRI and XhoI and inserted into the EcoRI/XhoI sites of plasmid pGEX-4T-1 such that the resulting expression plasmid, pGEX-wbiQ, has wbiQ fused to the gene encoding glutathione S-transferase (GST) in the same open reading frame. The constructs were transformed into E. coli DH5α cells, and the resulting recombinant plasmid was characterized by restriction mapping and DNA sequencing. The correct constructs were transformed into E. coli BL21 (DE3) for protein expression.

The wbiN gene was amplified by polymerase chain reaction (PCR) from chromosomal DNA of E. coli O127 with the forward primer 5'-GTACCATATGATGAAAAATGTTGGTTTTATTG (NdeI restriction site: CATATG) and the reverse primer 5'-TCAGGGATCCTCAACCTAAAATAATGCTTTTATATG (BamHI restriction site: GGATCC). The PCR product was digested with NdeI and BamHI and inserted into the NdeI/BamHI sites of plasmid pET15b, resulting in the recombinant plasmid pET15b-wbiN, with WbiN containing an N-terminal His6 fusion tag. The constructs were transformed into E. coli DH5α cells, and the resulting recombinant plasmid was characterized by restriction mapping and DNA sequencing. The correct constructs were transformed into E. coli BL21 (DE3) for protein expression.
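As a quick sanity check on the cloning design, the primers can be scanned for the recognition sequence of the enzyme each one is meant to introduce. The sketch below is illustrative only and was not part of the original work; the function and dictionary names are hypothetical, while the primer sequences are copied from the text and the recognition sequences are the standard ones for EcoRI, XhoI, NdeI, and BamHI.

```python
# Standard recognition sequences for the enzymes named in the cloning protocol.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
}

# Primer sequences (5' to 3') as given in the text, paired with the
# enzyme whose site each primer is reported to carry.
PRIMERS = {
    "wbiQ_forward": ("ATGCGAATTCATGATGTATTGCTGTCTATCC", "EcoRI"),
    "wbiQ_reverse": ("ATGCCTCGAGCTACATTGCTATCCAGTTT", "XhoI"),
    "wbiN_forward": ("GTACCATATGATGAAAAATGTTGGTTTTATTG", "NdeI"),
    "wbiN_reverse": ("TCAGGGATCCTCAACCTAAAATAATGCTTTTATATG", "BamHI"),
}


def find_site(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RESTRICTION_SITES[enzyme])


def check_primers(primers=PRIMERS) -> dict:
    """Map each primer name to the position of its expected restriction site."""
    return {name: find_site(seq, enzyme) for name, (seq, enzyme) in primers.items()}


if __name__ == "__main__":
    for name, index in check_primers().items():
        status = f"site starts at base {index + 1}" if index >= 0 else "site NOT found"
        print(f"{name}: {status}")
```

In all four primers the site begins at the fifth base, so each primer carries a four-base 5' overhang ahead of the recognition sequence.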

Overexpression and purification of GST-WbiQ and His6-WbiN

E. coli BL21 (DE3) strain harboring the pGEX-4T-1-wbiQ plasmid was grown in 1 L of LB medium at 37 °C. Once the OD600 reached 0.8, isopropyl-1-thio-β-D-galactopyranoside (IPTG) was added to a final concentration of 0.5 mM for induction. Protein expression proceeded for 15 hours at 16 °C. Cells were harvested by centrifugation (5000 g) and stored at -20 °C until needed. The cell pellet was resuspended in GST binding buffer (1x PBS, pH 7.4) and disrupted by sonication on ice (Branson Sonifier 450). The cell lysate was cleared by centrifugation (10000 g, 45 min, 4 °C) and the supernatant was loaded onto 4 mL of Glutathione Sepharose 4B slurry (GE Healthcare Life Sciences). The protein was subsequently eluted with GST elution buffer (50 mM Tris-HCl, pH 8.0, 10 mM reduced glutathione). Size exclusion chromatography was performed using a Superdex 200 10/300 GL column (GE Healthcare Life Sciences) equilibrated with 50 mM Tris-HCl, pH 7.5, 10% glycerol. Following the manufacturer's protocol, the column was calibrated using both the high and low molecular weight kits (GE Healthcare), and the molecular weight of the eluted GST-fusion protein was determined. The homogeneous GST-WbiQ was stored at -80 °C in a buffer containing 50 mM Tris-HCl, pH 7.5, and 10% glycerol.

E. coli BL21 (DE3) strain harboring the pET-15b-wbiN recombinant plasmid was grown in 1 L of LB medium at 37 °C. Once the OD600 reached 0.8, isopropyl-1-thio-β-D-galactopyranoside (IPTG) was added to a final concentration of 0.5 mM for induction. Protein expression proceeded for 15 hours at 16 °C. Cells were harvested by centrifugation (5000 g) and stored at -20 °C until needed. The cell pellet was resuspended in His binding buffer (5 mM imidazole, 500 mM NaCl, Tris-HCl, pH 7.5, 10% glycerol) and disrupted by sonication on ice (Branson Sonifier 450). The cell lysate was cleared by centrifugation (10000 g, 45 min, 4 °C) and the supernatant was loaded onto 3 mL of Ni-NTA Agarose (GE Healthcare Life Sciences). The Ni-NTA Agarose with bound His6-WbiN was washed with 5 column volumes of binding buffer, then 5 column volumes of wash buffer (500 mM NaCl, 50 mM imidazole, 10% glycerol, 20 mM Tris-HCl, pH 7.5), and finally eluted with 10 mL of elution buffer (500 mM NaCl, 500 mM imidazole, 10% glycerol, 20 mM Tris-HCl, pH 7.5). The homogeneous His6-WbiN was stored at -80 °C in a buffer containing 50 mM Tris-HCl, pH 7.5, and 10% glycerol.

SDS-PAGE analysis, western blot analysis, and protein quantification

Protein expression and purification were analyzed by 12% SDS-PAGE and stained with Coomassie Brilliant Blue R250. For western blot analysis, GST-WbiQ was separated by 12% SDS-PAGE and then electrophoretically transferred onto a nitrocellulose membrane (Invitrogen), followed by blocking with 5% nonfat dry milk in 1x PBS buffer. All incubations were performed for 1 hour at room temperature, followed by 3 washings (10 min each) with 1x PBS-T. The GST-tagged protein was first probed with rabbit anti-GST polyclonal antibody (1:1000, Cell Signaling Technology). The blot was then probed with HRP-conjugated goat anti-rabbit IgG (1:2000, GE Healthcare Life Sciences) and developed using either 3,3',5,5'-tetramethylbenzidine (TMB) or the ECL Western Blotting Detection Reagents (GE Healthcare). Protein concentration was determined by using the BCA Protein Assay Kit (Thermo Scientific).

For western blot analysis, His6-WbiN was separated by 12% SDS-PAGE and then electrophoretically transferred onto a nitrocellulose membrane (Invitrogen), followed by blocking with 5% nonfat dry milk in 1x PBS buffer. All incubations were performed for 1 hour at room temperature, followed by 3 washings (10 min each) with 1x PBS-T. The His6-tagged protein was first probed with mouse anti-His monoclonal antibody (1:1000). The blot was then probed with HRP-conjugated goat anti-mouse IgG (1:2000, GE Healthcare Life Sciences) and developed using ECL Western Blotting Detection Reagents (GE Healthcare).


Fucosyltransferase activity assay, effects of pH and metal cations

Enzyme activity was determined at 37 °C for 1 hour in a final volume of 100 µL containing 20 mM Tris-HCl (pH 7.5), 0.3 mM GDP-β-L-fucose (supplemented with GDP-L-[U-14C]fucose; 9000 cpm, American Radiolabeled Chemicals), 20 mM acceptor, and 10 µg GST-WbiQ. The acceptor was omitted in the control reaction. The reaction was terminated by adding 100 µL of ice-cold water followed by 800 µL of Dowex 1x8 200-400 anion exchange resin (v/v = 1/1). The mixture was centrifuged and the resulting supernatant was collected in a 20 mL plastic vial containing 10 mL of Scintiverse BD (Fisher Scientific). After thorough vortexing, the radioactivity of the mixture was counted in a Beckman LS-3801 liquid scintillation counter. The activity of WbiQ under varying pH conditions was determined with 10 µg of GST-WbiQ in a 100 µL reaction mixture at pH values from 5.0 to 9.5 containing 0.3 mM GDP-fucose and 20 mM Gal-β1,3-GalNAc-OMe for one hour. The activity of WbiQ in the presence of various divalent metal cations was determined in a 100 µL solution containing 10 µg GST-WbiQ, 20 mM Tris-HCl (pH 7.5), 0.3 mM GDP-fucose, 20 mM Gal-β1,3-GalNAc-OMe, and 10 mM of a divalent metal, reacting for one hour.
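The conversion from scintillation counts to a specific activity can be sketched as follows. This is a minimal illustration, not the calculation used in the original work: it assumes that the 9000 cpm is the total label per 100 µL reaction, that the mass of radiolabeled donor is negligible next to the 0.3 mM unlabeled GDP-fucose, that the whole supernatant is counted, and that the no-acceptor control supplies the background. The function name and the example counts are hypothetical.

```python
def specific_activity(
    product_cpm: float,
    background_cpm: float,
    total_cpm: float = 9000.0,   # label added per reaction (from the text)
    donor_mM: float = 0.3,       # GDP-fucose concentration (from the text)
    volume_uL: float = 100.0,    # reaction volume (from the text)
    minutes: float = 60.0,       # incubation time (from the text)
    enzyme_ug: float = 10.0,     # GST-WbiQ per reaction (from the text)
) -> float:
    """Return specific activity in nmol min^-1 mg^-1.

    The anion exchange resin retains unreacted, negatively charged GDP-fucose,
    so counts left in the supernatant are attributed to the neutral product.
    """
    donor_nmol = donor_mM * volume_uL           # mM x µL = nmol
    cpm_per_nmol = total_cpm / donor_nmol       # specific radioactivity of the donor pool
    product_nmol = (product_cpm - background_cpm) / cpm_per_nmol
    enzyme_mg = enzyme_ug / 1000.0
    return product_nmol / minutes / enzyme_mg


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    activity = specific_activity(product_cpm=892.0, background_cpm=100.0)
    print(f"{activity:.1f} nmol/min/mg")
```

Under these assumptions the donor pool is 30 nmol at 300 cpm per nmol, so a background-corrected signal of 792 cpm corresponds to 2.64 nmol of product.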

Enzymatic synthesis of H-type 3 blood group

Prior to synthesizing the H-type 3 blood group antigen, the disaccharide acceptor Gal-β1,3-GalNAc-OMe was first enzymatically synthesized. This disaccharide acceptor was synthesized following Scheme 2.1, using the previously characterized β1,3GalT, LgtD.22


Scheme 2.1. LgtD catalyzed reaction for the synthesis of Gal-β1,3-GalNAc.

Using 1 mg of GST-WbiQ, milligram-scale synthesis of Fuc-α1,2-Gal-β1,3-GalNAc-OMe was performed in a final volume of 3.0 mL at 37 °C containing 20 mM Tris-HCl (pH 7.5), 10 mM Gal-β1,3-GalNAc-OMe (as prepared in 21), and 15 mM GDP-fucose (as prepared in 23). The reaction was monitored by thin-layer chromatography [i-PrOH/H2O/NH4OH = 7:3:2 (v/v/v)]. Products were stained with anisaldehyde/MeOH/H2SO4 = 1:15:2 (v/v/v). After complete conversion of acceptor to product, the protein was removed by boiling, followed by centrifugation (12000 g, 15 min). Excess GDP-fucose and the byproduct, GDP, were removed by anion exchange chromatography, and the final trisaccharide product was purified by Bio-Gel P-2 gel filtration (Bio-Rad) with a water mobile phase. The desired fractions were pooled, lyophilized, and stored at -20 °C.

GalNAc transferase activity determination of His6-WbiN

The reaction for WbiN was carried out in the same way as that for WbiQ but with a different acceptor and donor. The transferase activity was detected by reacting 10 µg of His6-WbiN with 10 mM UDP-GalNAc and 5 mM GalNAc-PP-Undecyl in Tris-HCl (pH 7.5).


Mass spectrometry and NMR

Electrospray ionization mass spectrometry (ESI-MS) was conducted at The Ohio State University mass spectrometry facility on a Bruker micrOTOF instrument provided by a grant from the Ohio BioProducts Innovation Center. 1H NMR and 13C NMR (Bruker Avance 500 MHz NMR spectrometer) were used for product confirmation. The trisaccharide product was dissolved in D2O and lyophilized before the NMR spectra were recorded at 303 K in a 5 mm tube.
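For reference, the m/z values expected in ESI-MS for the trisaccharide product can be estimated from its elemental composition. The sketch below is an illustration rather than part of the original analysis: it assumes the product is the methyl glycoside Fuc-α1,2-Gal-β1,3-GalNAc-OMe, builds its formula by condensing the free monosaccharides and methanol with loss of one water per linkage, and uses standard monoisotopic masses. All names in the code are hypothetical.

```python
from collections import Counter

# Monoisotopic atomic masses (u) and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858
ADDUCT_ATOMS = {"H": 1.00782503, "Na": 22.98976928}

# Elemental compositions of the free building blocks.
FUCOSE = Counter(C=6, H=12, O=5)
GALACTOSE = Counter(C=6, H=12, O=6)
GALNAC = Counter(C=8, H=15, N=1, O=6)
METHANOL = Counter(C=1, H=4, O=1)
WATER = Counter(H=2, O=1)


def condense(*units: Counter) -> Counter:
    """Join n units through n-1 glycosidic bonds, losing one water per bond."""
    total = Counter()
    for unit in units:
        total.update(unit)
    for _ in range(len(units) - 1):
        total.subtract(WATER)
    return total


def monoisotopic_mass(formula: Counter) -> float:
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def adduct_mz(neutral_mass: float, adduct: str) -> float:
    """m/z of a singly charged positive ion such as [M+H]+ or [M+Na]+."""
    return neutral_mass + ADDUCT_ATOMS[adduct] - ELECTRON


if __name__ == "__main__":
    trisaccharide = condense(FUCOSE, GALACTOSE, GALNAC, METHANOL)
    mass = monoisotopic_mass(trisaccharide)
    print(dict(trisaccharide))
    print(f"M = {mass:.4f}, [M+H]+ = {adduct_mz(mass, 'H'):.4f}, "
          f"[M+Na]+ = {adduct_mz(mass, 'Na'):.4f}")
```

Sodium adducts are common for neutral oligosaccharides in positive-ion ESI, which is why both ions are computed.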

Crystallographic studies of WbiQ

After demonstrating the activity of GST-WbiQ, we set up crystallographic trials of both GST-WbiQ and His6-WbiQ. Expression and purification of both the GST and His6 fusion proteins followed the protocols already described, and the purified proteins were stored in 20 mM Tris-HCl and 100 mM NaCl. The fusion proteins were set up in microbatch trays with the PEGion and MembFacHT screens, supplemented with additional cofactors, such as GDP and GDP-fucose.

2.1.3 Results and discussion

Characterization of WbiQ

From previous work, the O-antigen biosynthesis gene cluster of E. coli O127 was sequenced and several glycosyltransferases and processing enzymes were identified (GenBank Accession no. AY493508). Among them, wbiQ was identified as a putative α1,2-fucosyltransferase. A BLAST search with the WbiQ sequence placed it in glycosyltransferase family 11, characterized by a putative conserved GT-B domain. Glycosyltransferase family 11 contains α1,2-FucTs from diverse sources such as bacteria, viruses, mammals, and humans.7 The 299-amino-acid WbiQ shows high amino acid identity with several other bacterial α1,2-FucTs: WfbI from Salmonella enterica O13 (61%) and WbwK from E. coli O86 (48%). Also in this family, with lower sequence identity, are several other characterized fucosyltransferases such as WbsJ of E. coli O128:B12 (26%), FutC from H. pylori (25%), and human FUT1 (15%).

Similar to the recently characterized WbsJ, WbiQ contains several conserved motifs, as shown by the sequence alignment (Figure 2.2). The three motifs labeled I, II, and III are conserved across both bacterial and mammalian α1,2-FucTs, α1,6-FucTs, and O-FucTs. Motif I contains several basic residues, notably HxRRxD, which has been shown to be important for interacting with the donor GDP-fucose. Motifs II and III were observed in the crystal structure of human α1,6-FucT (FUT8) and may be involved in binding GDP-fucose.24 However, to fully elucidate the roles of these motifs in α1,2-FucTs, a three-dimensional structure would need to be determined.
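A motif such as HxRRxD, where x stands for any residue, can be located in a protein sequence with a simple pattern search. The sketch below is only an illustration of that idea: the function name is hypothetical, and the example sequence is a made-up toy string, not the WbiQ sequence.

```python
import re

# Motif I as written in the text: H, any residue, R, R, any residue, D.
MOTIF_I = re.compile(r"H.RR.D")


def find_motif(sequence: str, pattern: re.Pattern = MOTIF_I) -> list:
    """Return (start, end, match) tuples using 1-based, inclusive residue numbering."""
    return [
        (match.start() + 1, match.end(), match.group())
        for match in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    toy_sequence = "MKLSGHVRRGDYLT"  # hypothetical sequence for illustration only
    for start, end, residues in find_motif(toy_sequence):
        print(f"{start}{residues}{end}")
```

The printed form mirrors the residue-numbered notation used for motifs in this chapter, with the first and last residue positions flanking the matched residues.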

WbiQ and the recently characterized WbwK exhibit approximately 48% sequence similarity.15 Similar to WbwK, WbiQ contains a putative transmembrane domain (amino acids 246 to 264), as identified by using TMpred (Prediction of Transmembrane Regions and Orientation). Bacterial glycosyltransferases involved in O-antigen biosynthesis are associated with the inner membrane, facing the cytoplasmic side.25,26 However, this putative transmembrane segment is unlikely to be important for anchoring WbiQ to the cytoplasmic face of the inner membrane.


Figure 2.2. Sequence alignment between WbiQ and many other FucTs, highlighting 3 important motifs.

While previous attempts at expressing WbiQ with a His6 tag using pET-15b produced large concentrations of enzyme (200 mg/mL), the purified protein was not enzymatically active. Thus pGEX-4T-1 was chosen to express WbiQ with a GST affinity tag in order to improve the enzyme's solubility and stability. Expression of WbiQ with an N-terminal GST tag was carried out in 1 L of LB under induction with IPTG. The fusion protein GST-WbiQ was purified by one-step GST-affinity chromatography and analyzed by 12% SDS-PAGE, as shown in Figure 2.3 A, Lane 4. The recombinant protein has an apparent molecular weight of 60 kDa as estimated by SDS-PAGE and anti-GST western blot (Figure 2.3 B), which is similar to the theoretical molecular weight (61 kDa) calculated from its primary amino acid sequence. The major impurity seen by SDS-PAGE (Lane 4) and anti-GST western blot appears to be soluble GST and/or truncated forms of GST-WbiQ.

Figure 2.3. Purification of GST-WbiQ fusion protein. (A) SDS-PAGE; Lane 1: Molecular weight marker; Lane 2: Pre-induction; Lane 3: Post-induction with IPTG; Lane 4: GST-WbiQ after elution from GST-resin; Lane 5: GST-WbiQ after elution from Superdex 200 gel filtration. (B) Anti-GST western blot from sample in Figure 2.3 A, Lane 4.

As such, we attempted to cleave the GST tag from the fusion protein by using thrombin; however, the cleavage efficiency was low and there appeared to be nonspecific cleavage. Thus, GST-WbiQ was further purified to near homogeneity using gel filtration chromatography (Figure 2.3 A, Lane 5). From the size exclusion chromatography, the molecular weight of GST-WbiQ is approximately 120 kDa, indicating dimerization, which may be due to the fusion tag, considering that GST exists as a homodimer in nature. The GST-fusion protein purified using gel filtration was subsequently used for all experiments.
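The dimer assignment follows from comparing the apparent mass on the calibrated size exclusion column with the theoretical monomer mass. A minimal sketch of that comparison is given below, using the two values reported above; the function name is hypothetical, and rounding to the nearest integer is a simplifying assumption that ignores shape effects on elution volume.

```python
def oligomeric_state(apparent_kda: float, monomer_kda: float) -> tuple:
    """Return (mass ratio, nearest whole number of subunits)."""
    ratio = apparent_kda / monomer_kda
    return ratio, round(ratio)


if __name__ == "__main__":
    # Values from the text: ~120 kDa by gel filtration, 61 kDa theoretical monomer.
    ratio, subunits = oligomeric_state(apparent_kda=120.0, monomer_kda=61.0)
    print(f"ratio = {ratio:.2f}, nearest oligomer = {subunits}")
```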

The subcellular localization of GST-WbiQ was investigated by differential centrifugation and subsequently analyzed by SDS-PAGE and anti-GST western blot.27 After lysis of the strain harboring pGEX-4T-1-wbiQ by sonication, the resulting lysate was subjected to centrifugation at 12000 g. Soluble GST-WbiQ (60 kDa) was present in the 12000 g supernatant, as shown by the anti-GST western blot in Figure 2.4 B, Lane 3. The pellet formed after 12000 g centrifugation contained a significant amount of recombinant protein (Figure 2.4 A and 2.4 B, Lane 4), suggesting that GST-WbiQ forms inclusion bodies. The supernatant after the 12000 g centrifugation was subjected to centrifugation at 50000 g for two hours, after which the supernatant was removed and centrifuged at 50000 g for another hour. The supernatant after 50000 g centrifugation contained soluble GST, as visualized by the anti-GST western blot (Figure 2.4 B, Lane 5). After purification of the membranes by ultracentrifugation, the resulting pellet contained both GST-WbiQ and GST (Figure 2.4 B, Lane 6), suggesting that GST-WbiQ is associated with the membrane in the E. coli host. This result is consistent with other bacterial glycosyltransferases, whereby the proteins are soluble but have some association with the inner membrane in their E. coli hosts.27 While GST-WbiQ appears to associate with the inner membrane according to this study, the role of the GST-fusion tag in this membrane association is unknown.

WbiQ belongs to glycosyltransferase family 11 and, as such, is predicted to transfer L-fucose from GDP-β-L-fucose to β-D-Gal through an α1,2 linkage. Based on the O-antigen repeating unit of E. coli O127, a panel of acceptors was chosen to detect α1,2-FucT activity and to provide the relative activity toward various acceptors. The results show that WbiQ is active with the Gal-β1,3-GalNAc-OR acceptors, which are derivatives of blood group T-antigen (Table 2.1).

Figure 2.4. Subcellular localization of GST-WbiQ. (A) SDS-PAGE; Lane 1: Pre-induction whole cell lysate; Lane 2: Post-induction whole cell lysate; Lane 3: Supernatant after 12000 g centrifugation; Lane 4: Cell pellet formed after 12000 g centrifugation; Lane 5: Supernatant after 50000 g ultracentrifugation; Lane 6: Cell pellet formed after 50000 g ultracentrifugation. (B) Anti-GST western blot; Lane 1: Pre-induction whole cell lysate; Lane 2: Post-induction whole cell lysate; Lane 3: Supernatant after 12000 g centrifugation; Lane 4: Cell pellet formed after 12000 g centrifugation; Lane 5: Supernatant after 50000 g ultracentrifugation; Lane 6: Cell pellet formed after 50000 g ultracentrifugation.

These acceptors are also structurally similar to the native O-antigen repeating unit. WbiQ exhibits strict acceptor substrate specificity, as it did not recognize any of the other disaccharides (lactose, lactulose, Gal-β1,4-glucitol) that contained the β-D-Gal residue at the nonreducing end. Acceptors Gb3 and α-Gal both contain β-D-Gal, but not at the nonreducing end, and neither was a suitable acceptor for WbiQ. Lastly, the monosaccharide β-D-Gal did not serve as a suitable acceptor for WbiQ. The inability to accept β-D-Gal contrasts with many other α1,2-FucTs from family 11 that readily accept β-D-Gal as an acceptor.14,16

Table 2.1. Acceptor substrate specificity of GST-WbiQ

Acceptor                               Specific activity (nmol min-1 mg-1)
Gal-β1,3-GalNAc-OMe (T-antigen)        4.4
Gal-β1,3-GalNAc-OH (T-antigen)         3.9
Gal-β1,4-Glc (Lactose)                 ND*
Gal-β1,4-Fru (Lactulose)               ND
Gal-β1,4-glucitol                      ND
Gal-β-OMe                              ND
Gal-α1,4-Gal-β1,4-Glc (Gb3)            ND
Gal-α1,3-Gal-β1,4-Glc (α-Gal)          ND

*ND: not detectable.

Further verification of enzymatic activity was demonstrated by using TLC as the method of detection. A 100 µL reaction mixture was set up containing 10 µg of GST-WbiQ, with GDP-fucose as the donor and Gal-β1,3-GalNAc-OH as the acceptor. The trisaccharide product was visualized by TLC: after 12 hours of incubation at 37 °C, a third spot forms which runs slower than the acceptor (Figure 2.5 Lane 4). After 48 hours, we observe complete consumption of GDP-fucose and formation of the trisaccharide product (Figure 2.5 Lane 5).

Figure 2.5. TLC demonstrating α1,2-FucT activity of GST-WbiQ. Lane 1: GDP-Fucose; Lane 2: Gal-β1,3-GalNAc-OH; Lane 3: Reaction mixture at time 0 hours; Lane 4: Reaction mixture after 12 hours; Lane 5: Reaction mixture after 48 hours.

The effect of pH on α1,2-FucT activity was determined under varying pH conditions at 37 °C using Gal-β1,3-GalNAc-OMe as the acceptor. Figure 2.6 shows the pH profile, which has an optimal pH range from 6.5 to 7.5. While the shape of the pH curve is the characteristic "bell shape," there are two maxima, at pH 6.5 and 7.5. This result may be due to the effect of pH on the ionization of a specific catalytic residue, on the binding affinity, on the stability of the protein, or a combination of all these effects.14

Figure 2.6. pH profile for GST-WbiQ enzymatic activity.

WbiQ was labeled as a putative α1,2-fucosyltransferase belonging to glycosyltransferase family 11, which is characterized by a GT-B type fold. Glycosyltransferases characterized by a GT-B type fold typically exhibit activities independent of divalent metals. In contrast, GT-A type glycosyltransferases have a DXD motif, which coordinates a divalent metal, meaning a divalent metal is required for catalysis. WbiQ does not contain a DXD motif, and as such, it was expected not to require a metal for catalysis.28 The effects of various divalent metal cations and EDTA on the α1,2-FucT activity of WbiQ were tested. Figure 2.7 shows that WbiQ does not require a divalent metal cation for catalysis, as it exhibited full activity when no divalent metal was added or when 10 mM EDTA was present. Upon adding 10 mM of any of the various divalent metal cations, inhibition of enzymatic activity was observed. These results are in agreement with other GT-B type fucosyltransferases, in which a metal binding site (DXD motif) is not present in the primary amino acid sequence and enzymatic activity is therefore independent of metal ions.14

Figure 2.7. Metal dependence assay for GST-WbiQ.

WbiQ was used to create the H-type 3 blood group antigen trisaccharide on a milligram scale. The reaction was carried out for 4 days and approximately 19 mg of Fuc-α1,2-Gal-β1,3-GalNAc-OMe was obtained from the reaction containing GDP-fucose, Gal-β1,3-GalNAc-OMe, and purified GST-WbiQ. To confirm the correct linkage and structure, the trisaccharide purified by gel filtration was analyzed by electrospray mass spectrometry and NMR. The assignments of the 1H NMR and 13C NMR spectra of Fuc-α1,2-Gal-β1,3-GalNAc-OMe are found in the Appendix and are consistent with those reported previously.29

Using both the pGEX-4T-1 and pET-15b expression vectors, high concentrations of pure WbiQ were obtainable, as demonstrated by SDS-PAGE and western blot analysis. Using concentrated GST-WbiQ and His6-WbiQ, microbatch trays were set up in an attempt to obtain the three-dimensional structure of WbiQ. No structural information is available for the α1,2-FucTs, and because WbiQ expresses at high levels there was strong motivation to obtain crystals for X-ray crystallography. Microbatch trays were first prepared with the WbiQ fusion proteins using the PEGion and MembFacHT kits. Initially, very small crystals (5-20 µm) were observed using the His6-WbiQ fusion protein and were confirmed to be protein by the standard dye assay. Crystals were observed under the following conditions: 1) 20% PEG 3350, 0.2 M LiCl (#4 of PEGionHT); 2) 20% PEG 3350, 0.2 M NaCl (#6 of PEGionHT); and 3) 20% PEG 3350, 0.2 M KCl (#8 of PEGionHT). The best of these three conditions was #6 of PEGionHT; however, all three conditions share the chloride anion and 20% PEG 3350 (Figure 2.8). A further observation from the PEG ion screen was that the solubility of WbiQ depends strongly on ionic strength. The protein was soluble at 800 mM NaCl and above, and was increasingly soluble up to 3200 mM NaCl (the maximum concentration used in the screen). This observation was consistent throughout the pH range of 5-9; meanwhile, the protein was insoluble below pH 5 in any of the salt and PEG combinations. Additional crystals were obtained in other buffer screens, such as PEGRx F7; however, the protein crystals were generally very small (5-10 µm) and not of diffraction quality. After repeating the different screens and obtaining only small crystals (20 µm was the largest), further work on WbiQ was discontinued.

Characterization of WbiN

Next, WbiN was identified as a putative α1,3-GalNAcT belonging to CAZy GT family 4, which comprises GT-B type transferases with a retaining mechanism. This enzyme has 100% sequence similarity to the previously characterized WbnH from E. coli O86, and thus the goal of this work was simply to confirm its enzymatic function in vitro, so that a complete model of O-antigen biosynthesis could be obtained.30 WbnH also belongs to CAZy GT family 4 and its function was confirmed by mass spectrometry analysis and NMR; given the sequence similarity, we sought only to demonstrate enzymatic activity by using a traditional radiolabeled assay and mass spectrometry. The purified His6-WbiN can be observed in Figure 2.9 (Lane 2), with the western blot confirmation in Lane 3. Using the reaction displayed in Scheme 2.2, we were able to detect GalNAcT activity from the purified His6-WbiN. Then, using the traditional radiolabeled donor assay, we confirmed that the enzyme does not require a divalent metal cation for activity and that a lipid moiety on the acceptor is required to observe any glycosylation product. Due to limiting substrates we did not confirm the linkage formed by WbiN, but because it is 100% similar to WbnH we believe that wbiN encodes an α1,3-GalNAcT.

Figure 2.8. Observed crystals of His6-WbiQ after blue dye staining.


Figure 2.9. Expression of His6-WbiN. Lane 1: Molecular weight marker; Lane 2: His6-WbiN after one step Ni2+ affinity chromatography; Lane 3: Anti-His western blot.

Scheme 2.2. WbiN catalyzed reaction using GalNAc-αPP-Undecyl as an acceptor.

2.1.4 Conclusions

In this work we identified, purified, and characterized WbiQ, the putative α1,2-FucT from E. coli O127:K63(B8), and determined its subcellular location. The wbiQ gene was biochemically proven to encode an α1,2-FucT through a radioactivity-based assay, a TLC-monitored assay, ESI-MS, and NMR. While the endogenous substrate (Gal-β1,3-GalNAc-α1,3-GalNAc-O-PP-Und) was not available for testing in our experiments, we demonstrate that WbiQ can efficiently recognize Gal-β1,3-GalNAc-α-OMe as a suitable substrate, possibly suggesting that the reducing end of the O-antigen repeating unit beyond this disaccharide may not be essential for WbiQ activity. Based on this substrate specificity, WbiQ could be characterized as a Family 4 α1,2-FucT, whereby it recognizes Gal-β1,3-GalNAc-α acceptors but not other Gal-β containing acceptors, similar to WbwK from E. coli O86 and an α1,2-FucT from Caenorhabditis elegans.31 Furthermore, we show that WbiQ can be used for the efficient synthesis of H-type 3 blood group antigen.

WbiQ also appears to be a characteristic α1,2-FucT from glycosyltransferase family 11, containing several well conserved motifs and lacking a DxD metal cation binding motif. Based on the sequence similarity to other α1,2-FucTs, 170HxRRxD175 in WbiQ likely coordinates GDP-fucose. To further confirm the roles of the individual amino acids in motifs I-III within α1,2-FucTs, the three-dimensional structure needs to be resolved. Additional work with WbiQ can be performed to examine the importance of its association with the cytoplasmic face of the inner membrane. While GST-WbiQ was purified as a soluble protein, bacterial glycosyltransferases, especially those involved in LPS biosynthesis, are commonly membrane associated. With the exception of transferases that genetically encode a transmembrane domain, many glycosyltransferases interact non-covalently with the inner membrane. One suggestion that can be probed further is the investigation of basic-hydrophobic-basic amino acid sequence motifs. Such motifs have been identified as lipid binding regions in certain proteins, such as myosin I from Dictyostelium, and may be identifiable by using the Wimley and White hydrophobicity scale.32,33

In addition to the functional characterization of WbiQ, we also demonstrated that wbiN encodes a GalNAcT, which requires a GalNAc-PP-lipid acceptor, much like WbnH. Since WbiP (Figure 2.1) had already been biochemically characterized, we now have a complete picture of O-antigen biosynthesis in E. coli O127 (Figure 2.10).

Figure 2.10. Model of LPS biosynthesis in E. coli O127.


2.2 Investigation of the O-antigen biosynthesis in E. coli O128

2.2.1 Introduction

The cell surface of Gram-negative bacteria is decorated with an abundance of important biomolecules, and one of the major components found on the outer membrane is lipopolysaccharide (LPS).34 Functionally, LPS has been confirmed to contribute to the structural integrity of the bacterium, as well as playing an integral role in pathogenicity by acting as a molecular camouflage.35 The functions that are attributed to LPS are a direct result of its structure, which is composed of three unique structural domains: (i) lipid A, (ii) core oligosaccharide, and (iii) the O-polysaccharide (O-antigen).19,36 All three structural components have been well studied, and the biosynthetic pathways and mechanisms resulting in the various components have been elucidated.37,38

For purposes of this work, the O-antigen consists of multiple copies of a repeating oligosaccharide unit, which are polymerized, resulting in the outermost portion of the LPS. Three primary biosynthetic pathways have been reported for the processing of the O-antigen.34 The first two, the ABC transporter dependent pathway and the synthase dependent pathway, are less common than the third. The most common pathway for the processing of O-antigens is the wzy dependent pathway, which has been well characterized by our lab using E. coli O86 as a model strain.39,40 In this pathway, the O-repeating unit is assembled on the cytoplasmic face of the inner membrane, beginning with the addition of a sugar phosphate onto undecaprenyl phosphate (Und-P) to form an Und-PP-linked sugar; this step is commonly catalyzed by the integral membrane protein WecA or WbaP (N-acetylhexosamine or hexose, respectively).41,42 Following this initial reaction, subsequent glycosyltransferases (GTs) sequentially add different sugar residues onto the Und-PP-sugar intermediate at the cytoplasmic face of the inner membrane.29,43 Once the complete O-unit is assembled, it is translocated by the O-antigen flippase, Wzx, to the periplasmic face of the inner membrane, where it is polymerized and chain length regulated by the enzymes Wzy and Wzz.34,44-46

In developing countries, enteropathogenic Escherichia coli (EPEC) O128 is a leading cause of infantile diarrhea. The O-antigen repeating unit is the exposed portion of the LPS, and in E. coli O128 it consists of the pentasaccharide –[3-GalNAc-β1,4-Gal-α1,3GalNAc-β1,6(Fucα1,2)Gal-β1].47 Within this pentasaccharide, the trisaccharide β1,6(Fucα1,2)Gal-β1,3-GalNAc was reported to be the immunodominant part, while the disaccharide Fuc-α1,2-Galβ appears almost as vital for the immunodominance.48

Previously, we reported the DNA sequence of the E. coli O128:B12 O-antigen biosynthesis gene cluster and identified several genes needed for the synthesis of the O-antigen, including several glycosyltransferases which are vital for the O-antigen polysaccharide biosynthesis.49 Among the four identified putative glycosyltransferase genes, wbsH (orf1), wbsJ (orf9), wbsK (orf10), and wbsL (orf11), wbsJ was biochemically characterized to encode an α1,2-fucosyltransferase (FucT) which exhibited very broad acceptor substrate specificity.14 While the composition of the monosaccharides found in the O-antigen is known, the order of transfer and the identity of the repeating unit are currently unknown. Orf1 (WbsH) contains a conserved domain found in CAZy glycosyltransferase family 4, which comprises retaining, GT-B type glycosyltransferases. Since the O-antigen primary structure includes only one possible retaining linkage (aside from that formed by the already characterized FucT), we propose that wbsH encodes an α1,3-galactosyltransferase that makes the Gal-α1,3-GalNAc moiety present in the O-antigen repeating unit. Orf10 (WbsK) contains a conserved domain found in CAZy glycosyltransferase family 2, which comprises inverting, GT-A type glycosyltransferases. This family includes many β1,3/4-GalTs and β1,3/4-GalNAcTs, of which WbiP (28% similarity) and LgtD (22% similarity) have been previously characterized by our lab.

While the sequence similarity is relatively low, WbsK contains several well characterized motifs which have been identified in WbiP (such as the DxD motif at 88DSD90, which is required for interaction with the sugar nucleotide donor). Thus we propose that wbsK encodes a β1,3-galactosyltransferase that makes the Gal-β1,3-GalNAc (T-antigen mimicry) moiety present in the O-antigen repeating unit. Orf11 (WbsL) is the remaining glycosyltransferase in the O-antigen biosynthesis gene cluster; it contains a GT-B type fold, but is currently not included in a GT family in the CAZy database. Lastly, studies of eukaryotic glycosylation systems (such as O-linked glycosylation) have demonstrated that the enzymes involved in glycosylation engage in protein-protein interactions and form protein complexes.50,51 However, it is currently unknown whether the enzymes in the O-antigen biosynthesis pathway engage in such interactions. In this chapter, we present a detailed biochemical characterization of the four glycosyltransferases found in the O-antigen biosynthesis gene cluster, and characterize their involvement in the biosynthesis of the O-antigen repeating unit.

Figure 2.11. The O-polysaccharide structure of E. coli O128 and its biosynthetic gene cluster. The relationships between the putative GTs and their corresponding linkages in the O-antigen are distinguished by different colors.

2.2.2 Experimental methods

Bacterial strains, plasmids, and materials

Escherichia coli O128:B12:H (ATCC 12810) was obtained from the American Type Culture Collection. E. coli competent cell DH5α [lacZ∆M15 hsdR recA] was from Invitrogen, and E. coli competent cell BL21 (DE3) [F– ompT hsdSB (rB– mB–) gal dcm (DE3)] was from Agilent Technologies. Expression plasmid pET-15b was purchased from Novagen. Expression plasmid pGEX-4T-1 was purchased from Amersham Biosciences. All other chemicals and reagents were from Sigma-Aldrich unless otherwise noted.


Cloning and construction of recombinant plasmids

The four glycosyltransferase genes were amplified by polymerase chain reaction (PCR) from chromosomal DNA of E. coli O128. Each glycosyltransferase gene was cloned into both the pET-15b vector (N-terminal His6 tag) and the pGEX-4T-1 vector (N-terminal GST tag). All primers used in this study are found in the Appendix of this thesis. For cloning into the pET-15b vector, the PCR product and empty pET-15b vector were doubly digested with the restriction enzymes NdeI and BamHI, and the digested PCR product was inserted into the NdeI/BamHI sites of plasmid pET-15b. For cloning into pGEX-4T-1, the PCR product and empty pGEX-4T-1 plasmid were doubly digested with the restriction enzymes BamHI and XhoI, and the digested PCR product was inserted into the BamHI/XhoI sites of plasmid pGEX-4T-1. The constructs were subsequently transformed into E. coli DH5α cells. Selected clones were characterized by restriction mapping and DNA sequencing. The correct constructs were then transformed into E. coli BL21 (DE3) for protein expression. The cloning and construction of the wbsJ-containing expression plasmids was done previously.14

Overexpression, enzyme purification, SDS-PAGE, and western blot analysis

E. coli BL21 (DE3) cells harboring the recombinant plasmid were grown in 1 L of LB medium with shaking at 220 rpm at 37 °C. When the OD600 reached 0.8, isopropyl-1-thio-β-D-galactopyranoside (IPTG) was added to a final concentration of 0.5 mM for induction, after which protein expression was allowed to proceed for 15 hours at 16 °C. Cells were harvested by centrifugation at 4 °C and stored at -80 °C until needed.


For the purification of GST-fusion proteins (using the pGEX-4T-1 recombinant plasmids), the cell pellet was suspended in 20 mM Tris-HCl, pH 7.5, 500 mM NaCl, and disrupted by sonication on ice. The lysate was cleared by centrifugation (10,000 × g, 50 minutes) and the supernatant was loaded onto 3 mL of Glutathione Sepharose (GE Healthcare), followed by washing with 1x PBS. The GST-tagged protein was subsequently eluted with 10 mL of GST elution buffer (50 mM Tris-HCl, 20 mM reduced glutathione, pH 8.0), and stored in 20 mM Tris-HCl, pH 7.5, and 10% glycerol at -80 °C. The total protein concentration was determined by using the BCA Protein Assay Kit (Thermo) with bovine serum albumin as the standard.

For the detection of protein expression, both SDS-PAGE and western blot analyses were used. Proteins were resolved by 12% SDS-PAGE and stained with Coomassie Brilliant Blue R-250. The western blot was performed as previously described.14 The His6 fusion proteins were detected by incubation with mouse anti-His monoclonal IgG. ECL Anti-mouse IgG Horseradish Peroxidase antibody (GE Healthcare) was then used as the secondary antibody. The GST fusion proteins were detected by incubation with GST Rabbit Antibody (Cell Signaling Technology). ECL Anti-rabbit IgG Horseradish Peroxidase antibody (GE Healthcare) was then used as the secondary antibody. Blots were then developed by using ECL Plus Western Blotting Detection Reagents (GE Healthcare), followed by exposure to Kodak BioMax MR Film.


In vitro characterization of GST-Wbs* enzymes

To determine the enzymatic function and the configuration of the newly created glycosidic linkage of each Wbs* enzyme, we used purified GST-Wbs* as the source of the enzyme. Reactions for the enzyme activity identification were performed on a 100 µL scale at 37 °C, containing 10 µg enzyme, 10 mM donor, 10 mM acceptor, 20 mM Tris-HCl, pH 7.5, and, in the case of WbsK, 10 mM Mn2+. The synthesis of the GalNAc-PP-Undecyl and GalNAc-PP-PhU acceptors was performed as previously described.52 When using a monosaccharide acceptor, the reaction was monitored by TLC (developed with i-PrOH/H2O/NH4OH = 8:2:2 (v/v/v)) and ESI/MS. When using the sugar-PP-lipid acceptor, the reaction was monitored by TLC (developed with n-BuOH/H2O/acetic acid = 2:1:1 (v/v/v)) and ESI/MS. After detection of the product, the reactions were scaled up for glycosidic linkage determination, using the same concentrations as already mentioned, and monitored by ESI/MS.

The substrate specificity and metal dependence assays were performed using a modification of the radioactivity based assays described previously.53,54 The standard assay for glycosyl transfer to exogenously added acceptor was carried out in reaction mixtures of 40 µL total volume, which contained 0.5 mM acceptor, 20 mM Tris-HCl, pH 7.5, 0.5 mM [3H] UDP-donor, approximately 10 µg of GST fusion protein, and, for the WbsK specificity assay, 10 mM Mn2+. For those acceptors containing a C11 lipid chain, the reaction was terminated by the addition of 700 µL ice cold water, and the mixtures were applied to a 1 mL C18 Sep-Pak column (Waters) which had been equilibrated by washing with 4 mL of methanol followed by 6 mL of water. Columns were then washed with 5 mL of water and the product was eluted with 5 mL methanol. Fractions of 1 mL were collected and added to 10 mL Ready Safe scintillation fluid. For those acceptors which did not contain a C11 lipid chain, the reaction was terminated by the addition of 100 µL ice cold water, and was immediately added to 800 µL Dowex 1×8-200 anion exchange resin (v/v = 1/1). After thorough mixing and centrifugation the supernatant was added to 10 mL of the scintillation fluid. The radioactivity of all reactions was measured using a Perkin Elmer Tri-Carb 2810TR Liquid Scintillation Analyzer. The divalent metal cations and EDTA were each tested at a concentration of 10 mM. The acceptors used for each Wbs* enzyme were the optimal acceptors identified in the substrate specificity assay and were as follows: WbsH, GalNAc-PP-O(CH2)11-OPh; WbsL, Gal-α1,3-GalNAc-PP-O(CH2)11-OPh; and WbsK, GalNAc-OH. The UDP-[3H]Gal and UDP-[3H]GalNAc were purchased from American Radiolabeled Chemicals.
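The relative activities reported from the radioactivity based assays can be illustrated with a short calculation. The following is a minimal sketch, assuming that each reaction yields a scintillation count in counts per minute (CPM), that the count of a no-enzyme blank is subtracted, and that activity is normalized to a chosen reference condition. The function name, the blank subtraction step, and the example counts are hypothetical and are not data from this work.

```python
def relative_activity(cpm_by_condition, reference, blank_cpm=0.0):
    """Express each condition as a percentage of the reference condition.

    cpm_by_condition: dict mapping a condition label to its measured CPM.
    reference: label of the condition defined as 100%.
    blank_cpm: CPM of a no-enzyme blank (assumed background correction).
    """
    ref_signal = cpm_by_condition[reference] - blank_cpm
    if ref_signal <= 0:
        raise ValueError("reference signal must exceed the blank")
    result = {}
    for condition, cpm in cpm_by_condition.items():
        # Clamp at zero so readings at or below background report 0%.
        signal = max(cpm - blank_cpm, 0.0)
        result[condition] = 100.0 * signal / ref_signal
    return result


# Hypothetical counts, for illustration only.
counts = {"Mn2+": 5200.0, "Mg2+": 4200.0, "EDTA": 200.0}
activities = relative_activity(counts, reference="Mn2+", blank_cpm=200.0)
for condition, percent in activities.items():
    print(f"{condition}: {percent:.0f}%")
```

Normalizing to a single reference condition makes values comparable across enzymes even when the absolute counts differ between preparations.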

Functional inactivation of GTs, complementation, and LPS analysis

The wbsH, wbsJ, wbsK, and wbsL genes were individually replaced by a kanamycin resistance cassette by using the Quick and Easy E. coli Gene Deletion Kit (Gene Bridges). The kanamycin resistance cassette was amplified from the provided plasmid by using primers carrying the 5′ and 3′ ends of the target gene (Appendix). The PCR product was then transformed into the E. coli O128 type strain carrying the pRedET (tetr) plasmid, and the kanamycin-resistant transformants were selected after induction of the RED genes, according to the manufacturer's instructions. PCR primers specific to the kanamycin gene and to the E. coli O128 DNA flanking the targeted gene were used to confirm the replacement. To complement the GT deficient mutants of E. coli O128, the pET-15b recombinant plasmids used for protein expression were transformed into each respective deficient strain, and protein expression was induced. The LPS was extracted according to a previously published protocol and analyzed by SDS-PAGE followed by silver staining.38

In vitro GST pull down assays

The in vitro GST pull-down assay was performed to demonstrate protein-protein interactions in vitro, by using GST tagged proteins as bait and His6 tagged proteins as prey. Using E. coli BL21 (DE3) as the host strain, IPTG (0.5 mM) induced strains harboring the pGEX-4T-1-GT plasmid were lysed in the presence of 1 mM dithiothreitol and protease inhibitor (Roche Prime Supply). The GST protein from the empty vector was used as the negative control. After centrifugation to remove cell debris, the cell lysate was added to GST beads that were equilibrated with GST equilibration buffer (50 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 1% CA-630, 1 mM DTT, and 10 mM MgCl2). The mixture was incubated at 4 °C for 4 hours on a roller. The beads bound with the GST fusion proteins or GST alone were washed five times with the GST equilibration buffer. The prey for the in vitro pull-down assays was obtained by overexpressing the His6 fusion GTs in E. coli BL21 (DE3). IPTG (0.5 mM) induced strains harboring the pET-15b-wbs* plasmid were lysed in the presence of 1 mM dithiothreitol and protease inhibitor. After centrifugation to remove cell debris, the cell lysate was added to the GST beads that contained either the GST protein or the GST-GT fusion proteins. The mixture was incubated at 4 °C overnight on a roller, after which the beads were washed five times with the GST equilibration buffer. The bound proteins were then eluted with GST pull-down elution buffer (50 mM Tris-HCl, pH 8.0, 200 mM NaCl, 50 mM reduced glutathione, 1 mM EDTA, and 1 mM DTT), mixed with SDS sample loading buffer, and subjected to Western blotting analysis using anti-His antibody.

2.2.3 Results

Expression of glycosyltransferases

Previously, our lab had sequenced the E. coli O128 O-antigen biosynthetic gene cluster and identified four putative glycosyltransferases, several genes responsible for sugar nucleotide biosynthesis, and other O-antigen processing genes.49 One of the genes, wbsJ, was previously cloned, expressed in E. coli BL21 (DE3), and shown to encode an α1,2-fucosyltransferase. Furthermore, this enzyme demonstrated very broad acceptor substrate specificity and was used in the enzymatic synthesis of the Globo-H hexasaccharide.55 In addition, the whole O-antigen from E. coli O86 has been chemo-enzymatically synthesized and used as a substrate for the processing enzymes Wzy and Wzz.39 Thus, we sought to investigate the O-antigen biosynthesis in E. coli O128, in the hope of determining the function and importance of the remaining three glycosyltransferases found in the O-antigen biosynthesis gene cluster.

The full open reading frames for wbsH (1122 bp), wbsK (876 bp), and wbsL (900 bp) were PCR amplified from E. coli O128 genomic DNA and subsequently cloned into both the pET-15b and pGEX-4T-1 vectors. The recombinant vectors were transformed into E. coli BL21 (DE3) for induced expression with 0.5 mM IPTG at 16 °C. The proteins utilized for the in vitro enzymatic activity determination experiments were all GST fusion proteins, and thus the protein expression yields are described as such. Furthermore, thrombin cleavage was not complete, and it appears that the GST affinity tag poses little hindrance to the enzymatic activity. GST-WbsH could be obtained at an approximate yield of 2.5 mg/L of bacterial culture. The recombinant protein has an apparent molecular weight of 70 kDa, similar to the theoretical value (69.97 kDa) calculated from its primary amino acid sequence. Similarly, WbsK and WbsL were purified by one-step GST-affinity chromatography and were obtained at approximate yields of 4 mg/L and 2 mg/L, respectively. The apparent molecular weights of WbsK and WbsL were 61 and 62 kDa, respectively, which are similar to the theoretical values (60.61 and 61.71 kDa) calculated from their primary amino acid sequences. Analysis of the primary amino acid sequences (WbsH, WbsJ, WbsK, and WbsL) shows that none of the proteins has a putative transmembrane domain, which is consistent with being able to purify the GTs as soluble proteins. However, although not integrated into the inner membrane, bacterial GTs are commonly membrane-associated enzymes, and ultracentrifugation experiments showed that all of the GTs did associate with the inner membrane (data not shown). Lastly, all four GTs have a nascent lipid linked acceptor (GalNAc-PP-Und), indicating that membrane association may be beneficial for enzymatic activity.
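As a quick plausibility check on fusion protein sizes of this kind, the mass can be roughly estimated from the ORF length alone. The sketch below is an approximation under stated assumptions: the quoted ORF lengths are taken to include the stop codon, an average residue mass of about 0.110 kDa is used, and the GST tag is taken as about 26 kDa. None of these constants come from this work, linker residues are ignored, and the estimates are not expected to reproduce the sequence-derived theoretical values exactly.

```python
AVG_RESIDUE_MASS_KDA = 0.110  # assumed average mass per amino acid residue
GST_TAG_KDA = 26.0            # assumed approximate mass of the GST tag


def estimate_fusion_mass_kda(orf_length_bp, includes_stop=True):
    """Rough mass (kDa) of a GST fusion protein from its ORF length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon does not encode a residue.
    residues = codons - 1 if includes_stop else codons
    return residues * AVG_RESIDUE_MASS_KDA + GST_TAG_KDA


# ORF lengths as reported in the text.
orf_lengths_bp = {"wbsH": 1122, "wbsK": 876, "wbsL": 900}
estimates = {gene: estimate_fusion_mass_kda(bp) for gene, bp in orf_lengths_bp.items()}
for gene, mass in estimates.items():
    print(f"GST-{gene}: ~{mass:.1f} kDa")
```

An estimate within a few kDa of the apparent mass on SDS-PAGE is usually enough to confirm that the band corresponds to the intact fusion rather than to free GST.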

In vitro determination of enzymatic function

The O-antigen repeating unit is assembled on the cytoplasmic face of the inner membrane, on a sugar-PP-lipid domain. Sequential enzymatic glycosylations occur in a stepwise fashion; however, the order of glycosylation of the E. coli O128 O-antigen repeat unit has not yet been elucidated. To determine the order of enzymatic addition, we performed in vitro reactions using purified Wbs* enzyme, with various combinations of donors, acceptors, cofactors, and buffers.

Figure 2.12. The protein expression of the three uncharacterized GTs involved with the O-antigen biosynthesis. A) SDS-PAGE with Coomassie staining; Lane 1: Molecular weight marker; Lane 2: GST-WbsH; Lane 3: GST-WbsK; Lane 4: GST-WbsL. B) Western blot analysis, probing with anti-GST antibody; Lane 1: GST-WbsH; Lane 2: GST-WbsK; Lane 3: GST-WbsL.

Many E. coli strains include the well conserved wecA gene, which is responsible for adding either a GalNAc or GlcNAc onto the undecaprenyl lipid carrier, which is then used in the assembly of the O-antigen. To determine if E. coli O128 has this wecA gene, we performed PCR amplification using primers for the conserved wecA gene, and as predicted we were able to amplify it (data not shown). Thus, the first sugar of the O-antigen repeat unit in E. coli O128 must be either a GalNAc or GlcNAc, and based on the determined O-antigen structure, the first sugar must be GalNAc. The O-antigen repeating unit contains two GalNAc moieties, and since fucose is commonly a terminal sugar, the repeating unit was proposed to be as shown in Figure 2.11.

Basing the experiments on this model, we first proposed that WbsH would transfer Gal to the C3 hydroxyl group of GalNAc-PP-Und. The native GalNAc-PP-Und is difficult to obtain from natural bacterial sources and is challenging to synthesize chemically. Furthermore, since studies of other enzymes that perform similar functions have demonstrated that the complete 55-carbon lipid chain is not strictly necessary for the enzymatic reaction, we avoided handling and synthesizing GalNAc-PP-Und by preparing a GalNAc-PP-O(CH2)11-OPh acceptor (GalNAc-PP-PhU), which contains an 11-carbon lipid chain.30,53 This synthetic lipid acceptor was incubated with GST-WbsH enzyme at 37 °C overnight, after which the enzyme was removed by brief boiling and the reaction mixture was subjected to ESI-MS and TLC. The mass spectrum showed prominent peaks at m/z 788.6 (M2-) and 626.5 (M2-, unreacted acceptor), consistent with the formation of the Gal-GalNAc-PP-PhU product. Interestingly, no product was observed when using WbsH-His, possibly due to the very low level of protein expression observed. Due to the low yield of the observed reaction and limited amounts of acceptor substrate, the purified product was not obtained in sufficient amounts for NMR characterization. The obtained result correlates well with similar results seen for the O-antigen biosynthesis in E. coli O86, suggesting that WbsH is responsible for the addition of the second sugar in the repeating unit in E. coli O128.30 In addition, no galactosyltransferase activity was observed with WbsK when using the lipid-linked acceptor, further suggesting that WbsH appends galactose onto GalNAc as the second sugar moiety in the repeating unit.
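The assignment of the 788.6 peak to a galactosylated product rests on simple mass arithmetic: the two reported peaks differ by 162.1, which matches the mass of one added hexose residue (C6H10O5, monoisotopic mass 162.0528) observed at unit charge. The sketch below illustrates that check. The function names and the 0.2 tolerance are assumptions chosen for illustration and do not describe the instrument settings or analysis software used in this work.

```python
HEXOSE_RESIDUE_MASS = 162.0528  # monoisotopic mass of C6H10O5 (hexose minus water)


def expected_product_mz(acceptor_mz, added_residue_mass, charge=1):
    """Expected m/z of the product ion, assuming the acceptor and product
    ions carry the same charge."""
    return acceptor_mz + added_residue_mass / abs(charge)


def matches(observed_mz, expected_mz, tolerance=0.2):
    """True if the observed peak lies within an assumed m/z tolerance."""
    return abs(observed_mz - expected_mz) <= tolerance


acceptor_peak = 626.5  # unreacted acceptor, as reported in the text
product_peak = 788.6   # putative product, as reported in the text

expected = expected_product_mz(acceptor_peak, HEXOSE_RESIDUE_MASS)
print(f"expected product m/z: {expected:.2f}")
print("consistent with one added hexose:", matches(product_peak, expected))
```

The same check can be repeated with the residue mass of an N-acetylhexosamine when evaluating a GalNAc transfer.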

In addition to determining the function of WbsH, purified enzyme was also used to probe the acceptor substrate specificity of WbsH. Table 2.2 indicates that the combination of the lipid moiety and the pyrophosphate is a substrate requirement for WbsH activity, as no detectable activity was identified with any of the other acceptors. Neither UDP-GalNAc nor GalNAc-P served as a suitable acceptor for WbsH, suggesting that the lipid portion of the acceptor is critical for enzymatic activity. Furthermore, similar activities were observed for both GalNAc-PP-lipid acceptors, possibly suggesting that the non-sugar terminus is not essential or influential in the WbsH activity. This result is very interesting for two reasons. Firstly, WbsH has no identifiable transmembrane domains and the native 55-carbon acceptor is embedded in the inner membrane, which raises the interesting question of how this enzyme recognizes the lipid domain. Secondly, although the natural GalNAc-PP-Und contains a 55-carbon lipid chain, we demonstrate that the shorter 11-carbon lipid is sufficient, though not necessarily optimal; due to limiting substrates we could not determine the full lipid chain length specificity of WbsH.

The next carbohydrate found in the O-antigen repeating unit is GalNAc, which in vivo likely yields GalNAc-β1,4-Gal-α1,3-GalNAc-PP-Und. Initially, we were unsure which gene encoded the β1,4GalNAcT, and through a combination of experiments we were able to detect β1,4GalNAcT activity from GST-WbsL. Among the tested combinations of donors (UDP-Gal, UDP-GalNAc), acceptors, and buffers (of different pH and containing different detergents), only one reaction yielded a product detectable by ESI/MS. When soluble GST-WbsL was incubated with UDP-GalNAc and Gal-α1,3-GalNAc-α-PP-Undc, a product was detected with an m/z of 899.3 (M-2) and 449.2 (M- +H). Similar to the result obtained for WbsH, no product was observed when using any acceptor other than the carbohydrate-PP-lipid acceptor. This result suggests that WbsL requires the disaccharide moiety, pyrophosphate, and the lipid moiety for complete enzymatic function.

The last enzyme for which we were able to determine the activity in vitro was WbsK. Initial enzymatic assays were performed using both soluble GST-WbsK and His6-WbsK, and no reaction was detectable using a combination of donors and acceptors. Since many prokaryotic GTs are associated with the inner membrane, we attempted to isolate the membrane fraction of IPTG induced E. coli BL21 (DE3) harboring pGEX-4T-1-wbsK and to use the crude membrane fraction as the source of enzyme (using E. coli BL21 (DE3) harboring empty pGEX-4T-1 as the negative control). Interestingly, we were able to detect product formation when using UDP-Gal as the donor and GalNAc as the acceptor. The corresponding product was detected by ESI/MS with an m/z of 406, which was not present in the negative control, suggesting that WbsK is a GalT with GalNAc as a suitable acceptor. In the same initial mass spectrum we observed significant hydrolysis of UDP-Gal, which limited the obtainable reaction yield. Neither pure GST-WbsK nor His6-WbsK was enzymatically active in the presence of various detergents and buffers when using a monosaccharide acceptor.

Table 2.2. Acceptor Substrate Specificity of Each Wbs* Enzyme

WbsH (α1,3-GalT)               WbsL (β1,4-GalNAcT)                    WbsK (β1,3-GalT)
Acceptor            R/A (%)a   Acceptor                    R/A (%)    Acceptor            R/A (%)
GalNAc-α-PP-PhU     100c       Gal-α1,3-GalNAc-α-PP-PhU    100c       GalNAc-OH           100d
GalNAc-α-PP-Undc    96%        Gal                         N/D        GalNAc-α-PP-Undc    N/D
UDP-GalNAc          N/Db       Lactose                     N/D        Gal                 N/D
GalNAc-α-P          N/D        GalNAc                      N/D        Lactose             N/D
GalNAc-OH           N/D        UDP-Gal                     N/D

a R/A: Relative Activity
b N/D: No product detected using TLC or ESI/MS
c %: Percentage determined using soluble protein
d %: Percentage determined using membrane fraction as enzyme source

Table 2.3. Metal Cation Dependence Assay

Cation    WbsH     WbsL    WbsK
Mn2+      100%a    100%    100%
Mg2+      97%      81%     86%
Ca2+      84%      77%     15%
Water     124%     87%     0%
EDTA      117%     81%     0%

a %: Relative activity based on 100% for Mn2+

Gene disruption

To determine whether or not wbsH, wbsJ, wbsK, and wbsL are required for E. coli O128 LPS biosynthesis, we constructed several knockout strains by individually replacing the glycosyltransferase genes with the kanamycin resistance cassette by utilizing RED recombination. The positive replacement by the kanamycin gene was confirmed by PCR amplification and sequencing of the regions upstream and downstream of the wbsH, wbsJ, wbsK, and wbsL genes. LPS from the wild type and wbs* deficient strains was extracted, separated electrophoretically, and visualized by silver staining. The wild-type E. coli O128 strain, as expected, displayed the typical LPS modality (Figure 2.13 Lane 1), with the number of O-antigen repeating units around 20. Lanes 2-5 are the glycosyltransferase deficient strains, wbsH, wbsK, wbsL, and wbsJ, respectively. In each of the wbs* knockout strains we observe the absence of the high molecular weight polymer LPS, as well as a non-discernible low molecular weight range. When any of the GTs involved in the O-antigen repeat unit biosynthesis is removed, the complete O-antigen cannot be formed, which likely leads to the lack of polymerization by Wzy and results in the non-discernible low molecular weight LPS. Furthermore, the absence of any bands in Lanes 2-5 further verifies that we have successfully disrupted the O-antigen biosynthesis. In all cases, the mutant strain was able to display wild type like LPS modality upon introduction of a recombinant plasmid containing the gene that was replaced on the chromosome. Figure 2.13, Lanes 6-9, demonstrate the restoration of wild type like LPS by incorporation of pET15b-wbsH, pET15b-wbsK, pET15b-wbsL, and pET15b-wbsJ into the wbsH, wbsK, wbsL, and wbsJ deficient strains, respectively. Furthermore, Figure 2.13 B is the western blot of Lanes 1-5 from Figure 2.13 A, probed with E. coli O128 antiserum. Interestingly, we only observe bands in Lane 1, which likely suggests that the complete O-antigen repeat unit is required for detection by the O128 antiserum. The obtained results suggest that the wbsH, wbsK, wbsL, and wbsJ genes encode enzymes that play critical roles in the formation of LPS in E. coli O128.

Figure 2.13. LPS analysis of E. coli O128 wild type and wbs* deficient strains. A) LPS analysis visualized by silver staining; Lane 1: Wild type E. coli O128; Lane 2: wbsH deficient strain; Lane 3: wbsJ deficient strain; Lane 4: wbsK deficient strain; Lane 5: wbsL deficient strain; Lane 6: wbsH deficient strain complemented by pET-15b-wbsH; Lane 7: wbsJ deficient strain complemented by pET-15b-wbsJ; Lane 8: wbsK deficient strain complemented by pET-15b-wbsK; Lane 9: wbsL deficient strain complemented by pET-15b-wbsL.

In vitro pull down assay

Many enzymes, including glycosyltransferases, interact in vivo, and such interactions can be demonstrated in vitro. To determine the interactions between the GTs involved in the O-antigen biosynthesis, we performed in vitro GST pull-down assays. WbsH, WbsJ, WbsK, and WbsL were all expressed as GST fusion proteins, and the GST fusion proteins were used as bait to capture His6-WbsH, His6-WbsJ, His6-WbsK, and His6-WbsL fusion proteins. Figure 2.14 A is the anti-His western blot obtained with GST-WbsH immobilized on the GST beads. Lanes 1 through 4 are pull-down assays using His6-WbsH, His6-WbsK, His6-WbsL, and His6-WbsJ, respectively. In all four lanes, bands are present, indicating that WbsH interacts with all four prey fusion proteins in vitro. It is worth noting that in all the Lane 1 images, where His6-WbsH is the prey, the bands are very faint, which is likely due to the low level of expression observed for His6-WbsH in E. coli BL21 (DE3). Figure 2.14 B, C, and D show the same pull-down experiments with different GST fusion proteins immobilized on the GST resin; frames B, C, and D are GST-WbsK, GST-WbsL, and GST-WbsJ, respectively. In all lanes (1-4) in frames B, C, and D, proteins were observed, suggesting that all four GTs involved in the O-antigen biosynthesis interact with each other in vitro. Interestingly, interactions appear to occur between copies of the same protein (i.e., GST-WbsJ interacts with His6-WbsJ), which may correlate with the evidence from size exclusion chromatography suggesting that WbsJ (and other GTs) act as oligomers.56,57 Lastly, Figure 2.14 E uses GST as the bait, demonstrating that no interaction exists between the His6-Wbs* fusion proteins and the GST protein.


Figure 2.14. Interaction between Wbs* proteins determined by in vitro GST pull-down assays, using GST fusion proteins as bait and His6 fusion proteins as prey. The captured protein complexes were subjected to Western blot analyses using anti-His antibody. A) GST-WbsH immobilized as bait. B) GST-WbsK immobilized as bait. C) GST-WbsL immobilized as bait. D) GST-WbsJ immobilized as bait. In each frame: Lane 1: His6-WbsH as prey; Lane 2: His6-WbsK as prey; Lane 3: His6-WbsL as prey; and Lane 4: His6-WbsJ as prey.

2.2.4 Discussion

Glycoconjugates, and more specifically bacterial cell surface glycoconjugates, have demonstrated important structure-function relationships when it comes to cell pathogenesis, immune system clearance, and many other critical biological processes.58 Exploitation of the glycosylation machinery that synthesizes these glycoconjugates has demonstrated immense utility for the in vitro synthesis of biologically attractive oligosaccharides. Chemoenzymatic synthesis, compared to traditional chemical synthesis, provides an efficient method to produce complex carbohydrates.59 The traditional chemical synthesis of carbohydrates suffers from many limitations, one being difficult protection/deprotection steps, which limit the obtainable yields of specific glycosidic linkages. However, because of the rapid development of genomic sequencing, there exists a vast supply of putative glycosyltransferases (over 60,000 GT modules) in the CAZy database, which can be probed for their ability to be used in facile chemoenzymatic synthesis of interesting carbohydrates.60 More specifically, bacterial glycosyltransferases have extensively been shown to adopt flexible substrate specificities and, because of the ease of obtaining large quantities of soluble protein, are good targets for the exploitation of the glycosylation machinery. In contrast, eukaryotic glycosyltransferases, while still applicable for the synthesis of biologically relevant oligosaccharides, may have more rigid specificities, and are more challenging to clone and express due to the lack of cDNA and the fact that eukaryotic GTs tend to be membrane bound. Furthermore, like E. coli O86, which exhibits molecular mimicry of mammalian glycoconjugates on the cell surface, various bacterial sources may contain interesting GTs which are able to synthesize molecular mimic glycoconjugates.29 Lastly, bacterial sources are also rich in GTs which transfer uncommon, or non-Leloir type, sugars, which can potentially be explored for the development of novel oligosaccharides which may have interesting medicinal properties.61

The O-antigen biosynthesis gene cluster from E. coli O128 contains sixteen unique open reading frames, four of which are the actual glycosyltransferases involved in the synthesis of the glycosidic linkages in the O-antigen. Three of the genes, wzx, wzy, and wzz, are involved in further downstream reactions which process the completed O-antigen repeating unit. This leaves nine remaining genes which are involved in the biosynthesis of the required sugar nucleotide donors, such as gmd and fcl, which are required for the biosynthesis of GDP-fucose.62 The synthesis of O-antigens proceeds by GT-mediated initiation and elongation in a stepwise manner, and many of the enzymes involved in this type of sequential reaction have been shown to form protein-protein complexes. In one glycosylation example, Gtf1 and Gtf2 form a protein complex, as demonstrated by a yeast two-hybrid system as well as an in vitro pull-down assay, and it is suspected that this complex is required for glycosylation.63

We have shown that the four glycosyltransferases display protein-protein interactions by using an in vitro GST pull-down assay, which may suggest that the enzymes involved in O-antigen biosynthesis form a protein complex in vivo. While the interaction that we demonstrate occurs between all four GTs in the O-antigen gene cluster, the role that this interaction plays in the actual biosynthesis of the O-antigen is currently unknown.

Protein-protein interactions between GTs have been well studied in eukaryotic systems, and together with data from other bacterial GT systems, this suggests that interactions between GTs are a trait shared by prokaryotic and eukaryotic organisms.64 Furthermore, while it has not been extensively studied in other O-antigen systems, there exists sequence homology between several GTs in the O128 O-antigen gene cluster and those of other E. coli systems, such as E. coli O127 and E. coli O86. Due to the sequence similarity observed between these glycosyltransferases, it is reasonable to assume that interactions exist between those O-antigen GTs as well. Also found in most O-antigen biosynthetic gene clusters are numerous enzymes other than GTs, such as Wzx and Wzy, which are used for downstream processing of the O-antigen. While Wzy and Wzz have been shown to interact in vitro (unpublished data), it would be interesting and possibly beneficial to determine the interactions that occur between all the enzymes that are involved in processing of the O-antigen (GTs, Wzx, Wzy, Wzz, WaaL). Since protein-protein interactions readily occur among GTs, the formation of large protein complexes is likely a regulatory method for the synthesis of these biopolymers.65 Thus, since glycopolymers such as LPS are so critical for pathogenicity, it is conceivable that disrupting these protein-protein interaction complexes could provide a method of reducing the pathogenicity of the bacterium.66

E. coli O86 has served as a platform for the in vitro chemoenzymatic synthesis of the O-antigen, as well as for further development toward synthesizing complete LPS in vitro. In this work we elucidate the order of the enzymatic processes which constitute the O-antigen repeating unit biosynthesis in vitro, and we provide the groundwork for large scale synthesis of the O128 O-antigen. Large scale chemoenzymatic synthesis of the immunodominant oligosaccharide, or even just the complete O-antigen repeat unit, could pave the way for the development of a well structured glycoconjugate vaccine against E. coli O128. Lastly, from this study we are able to provide a model of the sequential enzymatic glycosylation in E. coli O128, which can then be further expanded into a complete LPS biosynthesis model (Figure 2.15).

Figure 2.15. Model for LPS biosynthesis in E. coli O128.


REFERENCES

(1) Pang, P. C.; Tissot, B.; Drobnis, E. Z.; Sutovsky, P.; Morris, H. R.; Clark, G. F.; Dell, A. J Biol Chem 2007, 282, 36593-602.

(2) Nishihara, S.; Iwasaki, H.; Nakajima, K.; Togayachi, A.; Ikehara, Y.; Kudo, T.; Kushi, Y.; Furuya, A.; Shitara, K.; Narimatsu, H. Glycobiology 2003, 13, 445-55.

(3) Becker, D. J.; Lowe, J. B. Glycobiology 2003, 13, 41R-53R.

(4) Orntoft, T. F.; Greenwell, P.; Clausen, H.; Watkins, W. M. Gut 1991, 32, 287-93.

(5) Oriol, R.; Samuelsson, B. E.; Messeter, L. J Immunogenet 1990, 17, 279-99.

(6) Lerouge, I.; Vanderleyden, J. FEMS Microbiol Rev 2002, 26, 17-47.

(7) Ohtsubo, K.; Marth, J. D. Cell 2006, 126, 855-67.

(8) Moran, A. P. Carbohydr Res 2008, 343, 1952-65.

(9) Larsen, R. D.; Ernst, L. K.; Nair, R. P.; Lowe, J. B. Proc Natl Acad Sci U S A 1990, 87, 6674-8.

(10) Sarnesto, A.; Kohlin, T.; Hindsgaul, O.; Thurin, J.; Blaszczyk-Thurin, M. J Biol Chem 1992, 267, 2737-44.

(11) Hitoshi, S.; Kusunoki, S.; Kanazawa, I.; Tsuji, S. J Biol Chem 1995, 270, 8844-50.

(12) Wang, G.; Rasko, D. A.; Sherburne, R.; Taylor, D. E. Mol Microbiol 1999, 31, 1265-74.

(13) Oriol, R. J Immunogenet 1990, 17, 235-45.

(14) Li, M.; Liu, X. W.; Shao, J.; Shen, J.; Jia, Q.; Yi, W.; Song, J. K.; Woodward, R.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 378-87.

(15) Li, M.; Shen, J.; Liu, X.; Shao, J.; Yi, W.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 11590-7.

(16) Daniel, B. S.; Yu-Nong, L.; Chun-Hung, L. Adv Synth Catal 2008, 350, 2313-21.


(17) Chazalet, V.; Uehara, K.; Geremia, R. A.; Breton, C. J Bacteriol 2001, 183, 7067-75.

(18) Rivas, M.; Miliwebsky, E.; Balbi, L.; Garcia, B.; Leardini, N.; Tous, M.; Chillemi, G.; Baschkier, A.; Strugo, L. Medicina (B Aires) 2000, 60, 249-52.

(19) Stenutz, R.; Weintraub, A.; Widmalm, G. FEMS Microbiol Rev 2006, 30, 382-403.

(20) Widmalm, G.; Leontein, K. Carbohydr Res 1993, 247, 255-62.

(21) Yi, W.; Shen, J.; Zhou, G.; Li, J.; Wang, P. G. J Am Chem Soc 2008, 130, 14420-1.

(22) Randriantsoa, M.; Drouillard, S.; Breton, C.; Samain, E. FEBS Lett 2007, 581, 2652-6.

(23) Zhao, G.; Guan, W.; Cai, L.; Wang, P. G. Nat Protoc 2010, 5, 636-46.

(24) Ihara, H.; Ikeda, Y.; Toma, S.; Wang, X.; Suzuki, T.; Gu, J.; Miyoshi, E.; Tsukihara, T.; Honke, K.; Matsumoto, A.; Nakagawa, A.; Taniguchi, N. Glycobiology 2007, 17, 455-66.

(25) Wang, Q. M.; Peery, R. B.; Johnson, R. B.; Alborn, W. E.; Yeh, W. K.; Skatrud, P. L. J Bacteriol 2001, 183, 4779-85.

(26) Wang, G.; Boulton, P. G.; Chan, N. W.; Palcic, M. M.; Taylor, D. E. Microbiology 1999, 145 (Pt 11), 3245-53.

(27) Saksouk, N.; Pelosi, L.; Colin-Morel, P.; Boumedienne, M.; Abdian, P. L.; Geremia, R. A. Biochem J 2005, 389, 63-72.

(28) Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu Rev Biochem 2008, 77, 521-55.

(29) Yi, W.; Shao, J.; Zhu, L.; Li, M.; Singh, M.; Lu, Y.; Lin, S.; Li, H.; Ryu, K.; Shen, J.; Guo, H.; Yao, Q.; Bush, C. A.; Wang, P. G. J Am Chem Soc 2005, 127, 2040-1.

(30) Yi, W.; Yao, Q.; Zhang, Y.; Motari, E.; Lin, S.; Wang, P. G. Biochem Biophys Res Commun 2006, 344, 631-9.

(31) Zheng, Q.; Van Die, I.; Cummings, R. D. Glycobiology 2008, 18, 290-302.


(32) Brzeska, H.; Guag, J.; Remmert, K.; Chacko, S.; Korn, E. D. J Biol Chem 2010, 285, 5738-47.

(33) Wimley, W. C.; White, S. H. Nat Struct Biol 1996, 3, 842-8.

(34) Raetz, C. R.; Whitfield, C. Annu Rev Biochem 2002, 71, 635-700.

(35) Joiner, K. A. Curr Top Microbiol Immunol 1985, 121, 99-133.

(36) Holst, O.; Bock, K.; Brade, L.; Brade, H. Eur J Biochem 1995, 229, 194-200.

(37) Raetz, C. R.; Reynolds, C. M.; Trent, M. S.; Bishop, R. E. Annu Rev Biochem 2007, 76, 295-329.

(38) Abeyrathne, P. D.; Lam, J. S. Mol Microbiol 2007, 65, 1345-59.

(39) Woodward, R.; Yi, W.; Li, L.; Zhao, G.; Eguchi, H.; Sridhar, P. R.; Guo, H.; Song, J. K.; Motari, E.; Cai, L.; Kelleher, P.; Liu, X.; Han, W.; Zhang, W.; Ding, Y.; Li, M.; Wang, P. G. Nat Chem Biol 2010, 6, 418-23.

(40) Lundborg, M.; Modhukur, V.; Widmalm, G. Glycobiology 2010, 20, 366-8.

(41) Amer, A. O.; Valvano, M. A. Microbiology 2002, 148, 571-82.

(42) Amer, A. O.; Valvano, M. A. Microbiology 2001, 147, 3015-25.

(43) Lehrer, J.; Vigeant, K. A.; Tatar, L. D.; Valvano, M. A. J Bacteriol 2007, 189, 2618-28.

(44) Liu, D.; Cole, R. A.; Reeves, P. R. J Bacteriol 1996, 178, 2102-7.

(45) Robbins, P. W.; Bray, D.; Dankert, B. M.; Wright, A. Science 1967, 158, 1536-42.

(46) Kanegasaki, S.; Wright, A. Proc Natl Acad Sci U S A 1970, 67, 951-8.

(47) Sengupta, P.; Bhattacharyya, T.; Shashkov, A. S.; Kochanowski, H.; Basu, S. Carbohydr Res 1995, 277, 283-90.

(48) Sengupta, P.; Bhattacharyya, T.; Majumder, M.; Chatterjee, B. P. FEMS Immunol Med Microbiol 2000, 28, 133-7.

(49) Shao, J.; Li, M.; Jia, Q.; Lu, Y.; Wang, P. G. FEBS Lett 2003, 553, 99-103.


(50) de Graffenried, C. L.; Bertozzi, C. R. Curr Opin Cell Biol 2004, 16, 356-63.

(51) Seko, A.; Yamashita, K. Glycobiology 2005, 15, 943-51.

(52) Riley, J. G.; Xu, C.; Brockhausen, I. Carbohydr Res 2010, 345, 586-97.

(53) Montoya-Peleaz, P. J.; Riley, J. G.; Szarek, W. A.; Valvano, M. A.; Schutzbach, J. S.; Brockhausen, I. Bioorg Med Chem Lett 2005, 15, 1205-11.

(54) Yi, W.; Perali, R. S.; Eguchi, H.; Motari, E.; Woodward, R.; Wang, P. G. Biochemistry 2008, 47, 1241-8.

(55) Su, D. M.; Eguchi, H.; Yi, W.; Li, L.; Wang, P. G.; Xia, C. Org Lett 2008, 10, 1009-12.

(56) Blixt, O.; Allin, K.; Bohorov, O.; Liu, X.; Andersson-Sand, H.; Hoffmann, J.; Razi, N. Glycoconj J 2008, 25, 59-68.

(57) Pettit, N.; Styslinger, T.; Mei, Z.; Han, W.; Zhao, G.; Wang, P. G. Biochem Biophys Res Commun 2010, 402, 190-5.

(58) Feizi, T. Immunol Rev 2000, 173, 79-88.

(59) Blixt, O.; Razi, N. Methods Enzymol 2006, 415, 137-53.

(60) Campbell, J. A.; Davies, G. J.; Bulone, V.; Henrissat, B. Biochem J 1997, 326 (Pt 3), 929-39.

(61) Samuel, G.; Reeves, P. Carbohydr Res 2003, 338, 2503-19.

(62) Andrianopoulos, K.; Wang, L.; Reeves, P. R. J Bacteriol 1998, 180, 998-1001.

(63) Bu, S.; Li, Y.; Zhou, M.; Azadin, P.; Zeng, M.; Fives-Taylor, P.; Wu, H. J Bacteriol 2008, 190, 1256-66.

(64) Lehle, L.; Strahl, S.; Tanner, W. Angew Chem Int Ed Engl 2006, 45, 6802-18.

(65) Kos, V.; Whitfield, C. J Biol Chem 2010, 285, 19668-87.

(66) Hashimoto, K.; Madej, T.; Bryant, S. H.; Panchenko, A. R. J Mol Biol 2010, 399, 196-206.


CHAPTER 3

BIOCHEMICAL CHARACTERIZATION OF WBNI

3.1 Introduction

One family of glycosyltransferases, CAZy family 6 (GT6), has been characterized as adopting the GT-A type fold, and catalyzes the transfer of either galactose (Gal) or N-acetylgalactosamine (GalNAc) from the corresponding sugar nucleotide donor to a β-linked Gal or GalNAc.1 This GT family uses a retaining mechanism, as indicated by the α-1,3 glycosidic linkages that are obtained after catalysis. Among the 143 currently identified GT6 transferases, 13 have been functionally characterized through crystallography, mutagenesis, kinetic analysis, and investigation of substrate specificity and cofactors. Three of the most prevalent eukaryotic GTs that have been characterized are the histo-blood group A and B GTs (GTA and GTB, not to be confused with the GT-A and GT-B type folds), as well as the bovine α1,3-galactosyltransferase (α3GT), which is involved in the synthesis of the α-gal epitope.2,3

Despite the breadth of knowledge about the mammalian GTs from the GT6 family, significantly less is known about the GTs from the other domains of life. Only three bacterial GT6 enzymes have been identified and subsequently characterized in vitro. Most recently, Brew and co-workers have identified and thoroughly characterized a GalNAcT from Bacteroides ovatus, BoGT6a.4 This bacterial GT was determined to have a similar overall mechanism and substrate specificity to human GTA, preferring a fucosylated acceptor and catalyzing the reaction by a bi-substrate sequential mechanism.5 The striking difference between this and other bacterial GT6 enzymes, in comparison to the eukaryotic, cyanophage, and marine metagenome GTs, is that the bacterial GTs are missing the DXD motif. Most, but not necessarily all, GT-A type GTs have a DXD motif, whereas the bacterial GTs of the GT6 family have an NXN motif which replaces the metal binding DXD motif.4,6 These bacterial enzymes seem to catalyze similar reactions in vivo, but the bacterial analogs are missing what is considered, for the eukaryotic GTs, a critical motif for enzymatic activity. It was later shown that the GalNAcT BoGT6a, while catalyzing a reaction similar to that of GTA, does not require the presence of a divalent metal cation for enzymatic activity, thereby suggesting that this enzyme does not need a DXD motif for catalysis. Our lab recently characterized another bacterial GalNAcT from this family of enzymes, BgtA from Helicobacter mustelae (Figure 3.1).7 Like BoGT6a, this enzyme promiscuously transfers GalNAc from the donor UDP-GalNAc to fucosylated acceptors, notably H-antigen derivatives. Also similar to the other identified bacterial transferases from the GT6 family, the DXD motif which is highly conserved in the eukaryotic transferases is replaced with an NXN motif in BgtA (Figure 3.2).


Figure 3.1. WbnI catalyzed reaction to form the blood group B antigen. Replacement of WbnI with BgtA by complementation synthesizes the blood group A antigen.

The enteropathogenic Escherichia coli (EPEC) strain O86:H2 is a gram-negative bacterium which is a known pathogen of humans and is a major cause of acute and persistent infantile diarrhea in developing countries.8 The virulence factor of this strain is the lipopolysaccharide (LPS), and more specifically the O-polysaccharide (O-antigen), which is a structurally diverse polysaccharide containing the repeat unit Gal-α1,3-(Fuc-α1,2)-Gal-β1,3-GalNAc-α1,3-GalNAc-β.9 In previous work we demonstrated that WbnI is a glycosyltransferase equivalent to the human blood group B transferase, and that it is responsible for the step of O-antigen biosynthesis which results in the blood group B activity of E. coli O86.10 Using purified WbnI, GTB, and bovine α3GT we demonstrate that the acceptor substrate specificity of WbnI is distinctly different from that of its eukaryotic GT6 family counterparts. Although WbnI is able to accept several fucosylated H-antigen derivatives, it heavily prefers the Fuc-α1,2-Gal-β1,3-GalNAc acceptor, suggesting the importance of the axial hydroxyl group at the 4 position on GalNAc and the presence of the fucose moiety. More interestingly, WbnI is able to retain full enzymatic activity in the presence of EDTA, further suggesting that GT6 bacterial enzymes do not share the requirement of a divalent metal for catalysis with their eukaryotic equivalents. However, the kinetic data support the concept that this class of enzymes, whether prokaryotic or eukaryotic, catalyzes a very similar reaction. In agreement with BoGT6a, the effects of single point amino acid mutations suggest that several key amino acids in WbnI have structure-function relationships similar to those of the eukaryotic GTs in the GT6 family.

3.2 Experimental methods

Expression and purification of glycosyltransferases

The expression plasmids of wbnI and bovine α1,3-galactosyltransferase (α3GT) were cloned in previous studies.3,10 The plasmid harboring the GTB gene was a generous gift from Dr. Monica Palcic. The constructs pET15b-WbnI, pET15b-α3GT, and pCWΔlac-GTB were transformed into E. coli BL21 (DE3) for protein expression. E. coli BL21 (DE3) harboring the recombinant plasmid was grown in 1 L Luria-Bertani (LB) medium at 37 °C until the OD600 reached 0.8. Isopropyl-1-thio-β-D-galactopyranoside (IPTG, USB) was added to a final concentration of 0.5 mM, and the expression was allowed to proceed for 18 h at 16 °C. Cells were harvested and stored at -80 °C until needed. WbnI and α3GT were purified by Ni2+ affinity chromatography. The cell pellet was resuspended in chilled lysis buffer (50 mM Tris-HCl, pH 7.0, 0.5 M NaCl, 0.5% (w/v) Triton X-100, and 10% (w/v) glycerol) and disrupted by brief sonication (Branson Sonifier 450, VWR Scientific) on ice. The lysate was cleared by centrifugation (12,000g, 40 min, 4 °C) and loaded onto a Ni2+-NTA (nickel-nitrilotriacetic acid) agarose affinity column (5 mL bed volume, Qiagen) equilibrated with 20 mM Tris-HCl, pH 7.5, 500 mM NaCl, and 5 mM imidazole, at 4 °C. The column was washed with 5 column volumes of 20 mM imidazole in the same buffer, and the protein was eluted with 500 mM imidazole. In the case of pCWΔlac-GTB, the cell lysate was loaded onto an SP Sepharose FF column (GE Healthcare), using 50 mM MOPS pH 7.0, 1 mM DTT as equilibration, loading, and washing buffer (5 mL/min). For the elution of GTB, FF-elution buffer (50 mM MOPS pH 7.0, 0.5 M NaCl, and 1 mM DTT) was used. Size exclusion chromatography was performed on all proteins using a Superdex 200 10/300 GL column (GE Healthcare Life Sciences) equilibrated with 50 mM Tris-HCl, pH 7.5, 10% glycerol. The homogeneous proteins were stored at -80 °C in a buffer containing 50 mM Tris-HCl, pH 7.5, and 10% glycerol. All eluted proteins were analyzed by 12% SDS-PAGE. WbnI, WbnI mutants, and α3GT were confirmed by anti-His (primary antibody) western blot analysis following a previously established method.11 Protein concentration was determined using a BCA Protein Assay Kit (Thermo Scientific).

Mutagenesis

Thirteen site directed mutants (D41A, D41E, D43A, D43E, N91A, N91A, N93A, N93D, E185A, E185D, E185Q, and Q148A) were constructed using the QuikChange site-directed mutagenesis kit (Agilent). Following the manufacturer's protocol, 25 ng of the template pET15b-WbnI plasmid and 125 ng of each primer were used in the 100 µL polymerase chain reaction. The incubation conditions for PCR were: 95 °C for 3 min, followed by 16 cycles of 95 °C for 0.5 min, 55 °C for 1 min, and 68 °C for 7.5 min. The final PCR product was digested with DpnI for 60 min at 37 °C, then transformed into competent cells. All site directed mutants were verified by DNA sequencing. The primers for the site directed mutants are listed in the Appendix of this thesis.

Molecular size

The oligomeric status of His6-WbnI was investigated by size exclusion chromatography using a Superdex 200 10/300 GL column equilibrated with 50 mM Tris-HCl, pH 7.5, 10% glycerol. Using a flow rate of 0.4 mL/min and a pressure of 1 MPa, 250 µL of protein was loaded onto the column. The column was calibrated using both the Gel Filtration LMW and HMW Calibration Kits (GE Healthcare), and the molecular weight of the native protein was estimated from a linear regression analysis of a plot of elution volume versus log (molecular weight).
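The calibration step described above can be sketched as a least-squares line through the standards, which is then used to interpolate the molecular weight of an unknown from its elution volume. This is a minimal illustration only: the function names are hypothetical, log10(molecular weight) is regressed on elution volume as one possible choice of axes, and the calibration points are made-up values, not those of the calibration kits.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(elution_ml, standards):
    """Estimate a molecular weight (Da) from an elution volume (mL).

    standards is a list of (molecular_weight_da, elution_volume_ml) pairs.
    """
    volumes = [ve for _, ve in standards]
    log_mw = [math.log10(mw) for mw, _ in standards]
    slope, intercept = fit_line(volumes, log_mw)
    return 10 ** (slope * elution_ml + intercept)


# Hypothetical calibration points (Da, mL); NOT the actual kit standards.
STANDARDS = [(100000.0, 10.0), (10000.0, 15.0), (1000.0, 20.0)]

if __name__ == "__main__":
    print(f"Estimated MW: {estimate_mw(12.5, STANDARDS):.0f} Da")
```

With real data, the standards would be replaced by the measured elution volumes of the calibration proteins, and an unknown eluting between two standards would be assigned an intermediate molecular weight.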

Enzymatic synthesis of Type III/IV acceptor

Three of the acceptors used in this work, the Type I, II, and V oligosaccharides Fuc-α1,2-Gal-β1,3-GlcNAc-β-Sp, Fuc-α1,2-Gal-β1,4-GlcNAc-β-Sp, and Fuc-α1,2-Gal-β1,4-Glc-β-Sp, were kindly provided by The Scripps Research Institute. The T-antigen (Gal-β1,3-GalNAc) and Type III/IV trisaccharide (Fuc-α1,2-Gal-β1,3-GalNAc) were enzymatically synthesized using previously characterized glycosyltransferases (LgtD and WbsJ), as illustrated in Scheme 3.1. The di- and trisaccharides were purified by P2 size exclusion chromatography.


Scheme 3.1. Enzymatic synthesis of T-antigen mimic and the Type III/IV acceptor.

Enzyme activity assays

For all enzymatic assays, 1 unit of enzyme activity was defined as the amount of enzyme required to transfer 1 μmol of sugar from donor to acceptor per minute at 37 °C. Enzyme assays were performed at 37 °C for 30 minutes in a final volume of 100 μL containing 20 mM Tris-HCl, pH 7.5, 0.3 mM UDP-D-[6-3H]galactose (10,000 cpm, American Radiolabeled Chemicals) as the sugar donor, 10 mM acceptor, and 10 μg of enzyme. Acceptor was omitted in the blank control. Since His6-WbnI reactions could not be terminated by addition of EDTA, the 100 µL reaction mixture was immediately applied to a 2 mL polystyrene column (Thermo Scientific) containing 1 mL bed volume of Dowex 1x8 200-400 anion exchange resin (Sigma). The radioactive product was eluted with 1.5 mL of water and collected in a plastic vial containing 10 mL of Scintiverse BD (Fisher Scientific). The vial was vortexed thoroughly before the radioactivity of the mixture was counted in a liquid scintillation counter (Beckman LS-3801 counter). The units of activity were calculated for each acceptor, and the acceptor with the highest activity was assigned a relative activity (RA) of 100%.
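As a worked illustration of the unit definition and the relative activity calculation, the sketch below converts scintillation counts into units and normalizes them to the best acceptor. It assumes that the 10,000 cpm refers to the total radioactivity per 100 µL reaction, so that the fraction of counts recovered in the eluate equals the fraction of donor transferred. The function names, acceptor labels, and count values are hypothetical.

```python
def units_from_counts(product_cpm, blank_cpm, total_cpm, donor_umol, minutes):
    """Enzyme units (umol of sugar transferred per minute) from counts.

    Assumes the blank-corrected fraction of counts in the eluate equals
    the fraction of donor sugar transferred to the acceptor.
    """
    if total_cpm <= 0 or minutes <= 0:
        raise ValueError("total_cpm and minutes must be positive")
    fraction_transferred = (product_cpm - blank_cpm) / total_cpm
    return fraction_transferred * donor_umol / minutes


def relative_activities(units_by_acceptor):
    """Scale activities so that the best acceptor is 100%."""
    best = max(units_by_acceptor.values())
    if best <= 0:
        raise ValueError("no acceptor shows positive activity")
    return {name: 100.0 * u / best for name, u in units_by_acceptor.items()}


# Assay conditions from the text: 0.3 mM donor in 100 uL = 0.03 umol, 30 min.
DONOR_UMOL = 0.03
MINUTES = 30.0
TOTAL_CPM = 10000.0   # assumed to be total counts per reaction
BLANK_CPM = 100.0     # hypothetical no-acceptor control

# Hypothetical eluate counts for two illustrative acceptors.
counts = {"acceptor_1": 2100.0, "acceptor_2": 500.0}
units = {
    name: units_from_counts(cpm, BLANK_CPM, TOTAL_CPM, DONOR_UMOL, MINUTES)
    for name, cpm in counts.items()
}
ra = relative_activities(units)

if __name__ == "__main__":
    for name in counts:
        print(f"{name}: {units[name]:.2e} U, RA = {ra[name]:.0f}%")
```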

The influence of divalent metal ions and pH on enzymatic activity was determined at 37 °C for 30 minutes in a final volume of 100 μL. The metal dependence experiments contained 20 mM Tris-HCl, pH 7.5, 0.3 mM UDP-D-[6-3H]galactose, 10 mM of the type III/IV acceptor, 10 μg WbnI, and 10 mM of the variable metal ion or EDTA. Reactions under varying pH were performed using different buffer systems (sodium citrate, pH 5.0, 5.5, 6.0, and 6.5; Tris-HCl, pH 7.0, 7.5, 8.0, 8.5, 9.0, and 9.5) at a concentration of 50 mM in the presence of 0.3 mM UDP-D-[6-3H]galactose, 10 mM of the type III/IV acceptor, and 10 μg WbnI.

Kinetic properties of WbnI

Reactions were performed at 37 °C for 15 min in a 100 µL reaction mixture containing 20 mM Tris-HCl, pH 7.5, 1 µg of enzyme, Fuc-α1,2-Gal-β1,3-GalNAc, and UDP-D-[6-3H]Gal. Conditions were chosen such that 15-25% of the radioactivity from the UDP-D-[6-3H]Gal was incorporated into product in most experiments. The Fuc-α1,2-Gal-β1,3-GalNAc acceptor was varied from 0.7-10 mM, while the donor was varied from 0.075-0.6 mM. To demonstrate UDP-Gal hydrolase activity by WbnI, 10 µg of His6-WbnI was incubated at 37 °C for 30 minutes in the absence of an acceptor. Products containing the [6-3H]Gal moiety were purified and analyzed as described in the previous section.

The kinetic data were analyzed by unweighted, non-linear regression using SigmaPlot v11. The data for the wild type WbnI were fit to the Michaelis-Menten equations for both a single substrate reaction and a general two substrate sequential reaction, Equations 1 and 2, respectively.

$$v = \frac{V_{max}^{app}[S]}{K_m^{app} + [S]}$$ (Equation 1)

$$v = \frac{V_{max}[A][B]}{K_{ia}K_b + K_b[A] + K_a[B] + [A][B]}$$ (Equation 2)

In Equation 1, [S] is the concentration of the substrate that was varied, and $K_m^{app}$ and $V_{max}^{app}$ are the apparent Michaelis constant and maximum velocity, respectively. The concentrations [A] and [B] in the two substrate sequential reaction equation are those of UDP-Gal and Fuc-α1,2-Gal-β1,3-GalNAc, respectively. Ka and Kb represent the cognate Michaelis constants, and Kia is the dissociation constant of substrate A. All mutant enzymes were characterized by varying one substrate while the second substrate was held at a fixed concentration, and the data were subsequently fitted to Equation 1 to yield the apparent Km and Vmax values.
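The relationship between the two rate laws can be made concrete with a short sketch. It assumes the standard forms of the single substrate Michaelis-Menten equation and of the general two substrate sequential rate equation, with A as the donor and B as the acceptor; the function names are illustrative. Holding [B] fixed collapses the two substrate equation into the single substrate form with apparent constants, which is why data collected with one substrate varied can be fitted to the simpler equation. The parameter values are those reported for WbnI in Table 3.2.

```python
def rate_single(s, vmax_app, km_app):
    """Single substrate Michaelis-Menten rate."""
    return vmax_app * s / (km_app + s)


def rate_sequential(a, b, vmax, ka, kb, kia):
    """General two substrate sequential rate (A = donor, B = acceptor)."""
    return vmax * a * b / (kia * kb + kb * a + ka * b + a * b)


def apparent_constants_for_a(b, vmax, ka, kb, kia):
    """Apparent Vmax and Km for A when B is held at a fixed concentration.

    Obtained by dividing the numerator and denominator of the two
    substrate equation by (kb + b).
    """
    vmax_app = vmax * b / (kb + b)
    km_app = (kia * kb + ka * b) / (kb + b)
    return vmax_app, km_app


# WbnI values from Table 3.2: concentrations in uM, kcat in 1/s.
WBNI = {"vmax": 0.20, "ka": 326.0, "kb": 2352.0, "kia": 220.0}

if __name__ == "__main__":
    donor_um, acceptor_um = 300.0, 5000.0  # illustrative concentrations
    v_two = rate_sequential(donor_um, acceptor_um, **WBNI)
    vmax_app, km_app = apparent_constants_for_a(acceptor_um, **WBNI)
    v_one = rate_single(donor_um, vmax_app, km_app)
    print(f"two substrate form: {v_two:.4f} 1/s")
    print(f"single substrate form with apparent constants: {v_one:.4f} 1/s")
```

Both forms return the same rate, and the apparent constants depend on the fixed acceptor concentration, so apparent values obtained at different fixed concentrations are not directly comparable.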

Gene disruption and complementation

The wbnI gene was functionally replaced by a kanamycin resistance cassette using traditional Red recombination, following the procedure and primers used previously by Yi et al.9 A pBAD/Myc-His A vector containing either wbnI, GTB, or α3GT was then electroporated into the wbnI-deficient strain. Expression of the α1,3-GalTs was induced by the addition of 0.2% L-arabinose. The LPS was then isolated using the previously mentioned proteinase K method and analyzed by silver staining and western blot.12


Molecular modeling

Both of the mammalian enzymes examined in this work have had their three-dimensional structures resolved by X-ray crystallography. WbnI was modeled with the molecular modeling software SWISS-MODEL (http://swissmodel.expasy.org/), using GTB as the template (PDB ID: 2RJ8). The model was further analyzed using the software PyMOL.

3.3 Results

Expression of WbnI

WbnI from E. coli O86:H2 was expressed cytoplasmically in E. coli BL21 (DE3) as a soluble active enzyme, and purified by one step nickel-affinity chromatography in yields of approximately 5 mg/liter. Active enzyme is produced by expression at 37 °C, but the formation of inclusion bodies reduced the amount of purified protein, and expression was determined to be optimal at 16 °C. SDS-PAGE of WbnI under reducing conditions resulted in a single band with an apparent molecular weight of 28 kDa. The enzyme retains high catalytic activity when stored at -20 °C; however, it is sensitive to freeze/thaw exposure and readily precipitated after several thawings. In agreement with the SDS-PAGE under reducing conditions, WbnI eluted from a size exclusion column as a single peak with an elution time corresponding to an apparent molecular weight of 28,000 Da, corresponding to the monomeric protein (Appendix Figure B.1).


Catalytic properties and comparative substrate specificity

Previous studies showed that WbnI is responsible for the formation of the α1,3-Gal linkage on the O-antigen repeating unit of E. coli O86:H2, and is a bacterial glycosyltransferase equivalent to human blood group B transferase (GTB).10 In this study, the acceptor substrate specificity of WbnI was investigated and compared to the acceptor specificity of GTB and bovine α3GT. The acceptors were chosen on the basis of their importance in nature.7 Acceptor types I through V are naturally occurring H-antigens which are common acceptors for GTA and GTB, whereas the T-antigen and lactose are non-fucosylated derivatives of the Type III/IV and Type V acceptors, respectively.13 From Table 3.1, it is evident that WbnI preferentially accepts the Type III/IV acceptor over any other type of acceptor. Among all the acceptors, this one is the most structurally similar to the native E. coli O86 O-antigen repeating unit.10 However, unlike GTB, WbnI does not readily accept the other H-antigen derivatives, which vary only in the linkage connecting the Gal moiety to the reducing end sugar and in the identity of the reducing end sugar. For example, the type I acceptor is the next best acceptor for WbnI, with a relative activity of 18% as compared to the type III/IV acceptor. These acceptors vary only in the stereochemistry at the C4 position of the reducing end sugar (GalNAc vs GlcNAc); thus, it appears that the axial position of the C4 hydroxyl group on GalNAc is very important for catalysis. Furthermore, it seems that the β1,3 linkage between Gal and HexNAc is the preferred linkage for the acceptor, as both the type II and V acceptors have Gal-β1,4-Hex(NAc) linkages and are not readily accepted by WbnI. This result is quite surprising given that the reducing end sugar imparts so much specificity on WbnI activity, whereas GTB and BgtA readily accept all of the H-antigens. Lastly, WbnI prefers fucosylated acceptors, as the T-antigen and lactose have much lower relative activities than the type III/IV fucosylated acceptor, suggesting the presence of a fucose binding site in the WbnI active site.

In contrast to WbnI, GTB readily accepts all of the fucosylated acceptors but does display a modest preference for type I and II acceptors. Consistent with previous studies, the fucose containing acceptor is preferred over the nonfucosylated substrates.14,15 Thus, the results suggest that the acceptor binding pocket in WbnI is likely very rigid compared to that in GTB, which can accommodate different types of acceptors. Also as expected, α3GT only transfers galactose to non-fucosylated substrates, consistent with its native acceptor specificity.3 Furthermore, it appears the Gal-β1,4-Hex linkage is fairly important for catalysis, as both lactose and the type V acceptor contain this moiety and have larger relative activities than those having Gal-β1,3-Hex linkages.16

All three GTs included in this thesis are classified as belonging to the GT6 family, which is defined as exhibiting the GT-A type fold. Glycosyltransferases exhibiting the GT-A type fold commonly share a DXD motif which requires a divalent metal for catalysis.17,18 Both GTB and α3GT contain a DXD motif (Figure 3.2), and crystallography has demonstrated that this conserved motif interacts with a divalent metal during catalysis. However, bacterial BoGT6a showed activity independent of a divalent metal cation although it belongs to the GT6 family, suggesting a novel coordination of the sugar nucleotide donor.4 To test whether or not WbnI requires divalent metal ions for catalysis, the effect of five different divalent metal ions was tested on the enzyme activity


Table 3.1. Substrate specificity of WbnI

                      Enzymes
Acceptors        WbnI     GTB      α3GalT
Type I           18%      100%     0%
Type II          5%       83%      0%
Type III/IV      100%     82%      0%
Type V           1%       81%      10%
T-antigen        12%      9%       0%
Lactose          0%       6%       100%

using the type III/IV acceptor. The results show that WbnI appears to be divalent metal ion independent (Figure 3.3). In the absence of a divalent metal cation or in the presence of 10 mM EDTA, WbnI retains full enzymatic activity. Introducing Mn2+, Mg2+, or Ca2+ to the WbnI reaction mixture slightly inhibits the enzymatic activity. Significant loss in activity is observed with the metal ions Cu2+ and Zn2+, which is

Figure 3.2. Sequence alignment of selected transferases from the GT6 family. The abbreviations, donor specificity, source, and corresponding accession number are as follows. WbnI, α1,3-GalT, E. coli O86:H2, AAV80756; BoGT6a, α1,3-GalNAcT, B. ovatus, ZP_02064961; BgtA, α1,3-GalNAcT, H. mustelae 12198, CBG40459; Bov_a3GT, α1,3-GalT, Bos taurus, AAA30558; GTB, α1,3-GalT, H. sapiens, AAD26574; Gbgt1, α1,3-GalNAcT, H. sapiens, AAF06145; GTA, α1,3-GalNAcT, H. sapiens, AAA36792; gi|5723316, putative α1,3-GalT, Cyanophage P-SSm2, AAW48160. The DXD/NAN and HDES motifs are marked with Roman numerals I and II, respectively. The amino acid residues selected for mutation in WbnI are marked with a *.

similar to results seen with other glycosyltransferases.19 However, parallel to BoGT6a, it appears that the bacterial glycosyltransferases from GT Family 6 may function


Figure 3.3. Effect of metal cations on α1,3-galactosyltransferase activity of WbnI. The specific activity of WbnI was determined in the presence of various divalent metals and EDTA at a final concentration of 10 mM.

independent of metal cations, despite both having DXD motifs within their primary amino acid sequences. In WbnI there is a putative DXD motif at amino acids D41 and D43. Figure 3.3 clearly shows that the enzymatic activity is independent of a divalent metal ion, suggesting that this DXD motif may not actually be important for catalysis. Lastly, the effect of pH on the α1,3-galactosyltransferase activity was also determined, and WbnI is active over a range of pH values with optimal activity at pH 7.5, which is typical for glycosyltransferases (Figure 3.4).19

Kinetic characterization of His6-WbnI demonstrates a sequential mechanism

We performed a steady state kinetic analysis on WbnI, by varying the concentration of UDP-Gal at a series of fixed concentrations of the type III/IV acceptor.


Figure 3.4. Effect of pH on α1,3-galactosyltransferase activity of WbnI.

The data fit to Equation 2, and were displayed as a set of intersecting lines in the double reciprocal plot of velocity versus the concentration of the type III/IV acceptor (Figure 3.5). Equation 2 is a sequential mechanistic equation, and the fit indicates that WbnI likely exhibits a bisubstrate sequential mechanism. From the fitting of the data to this equation, the values Ka, Kb, Kia, and kcat were obtained. Following previously characterized enzymes with bisubstrate ordered mechanisms, Ka and Kb were assigned as the apparent Michaelis constants for the donor and acceptor, respectively.4 In Table 3.2, the kinetic parameters obtained for WbnI are compared to reported literature values for GTB and bovine α3GT. The binding constants are more similar to those of the bovine α3GT; however, the kcat is significantly lower than that of either GTB or α3GT. One explanation for such a low kcat is that the optimal acceptor substrate was not used, which is commonly a


Figure 3.5. Double-reciprocal plots for the transfer of galactose to Fuc-α1,2-Gal-β1,3-GalNAc. The type III/IV acceptor is plotted as the variable substrate at fixed concentrations of UDP-Gal (0.075, 0.15, 0.3, and 0.6 mM). The lines are based on a global fit of the data to Equation 2.

lipid-linked-tetrasaccharide. These results suggest that the catalytic mechanism is either a sequential mechanism or a random equilibrium mechanism, such that UDP-Gal and the acceptor bind to the enzyme, prior to any product being released. Since bisubstrate reactions can either be sequential or ping-pong mechanisms, WbnI can be distinguished as a sequential mechanism since the Kia is nonzero; whereas the ping pong type mechanisms produce double reciprocal plots displaying sets of parallel lines. Based on the results obtained for other α1,3GTs in the GT6 family, it was expected that WbnI would display an ordered type mechanism, whereby one substrate binds prior to the

102 binding of another substrate. To determine the order of binding, whether UDP-Gal or the acceptor bound first, the hydrolase activity of WbnI was determined, following previously used methods.20 The hydrolase activity of WbnI was determined by incubating larger concentrations of WbnI (20 µg) with various concentrations of (3H)UDP-Gal for

30 minutes, and the data was fit to Equation 1. A low, but noticeable, rate of hydrolase activity was identified, displaying a Kcat that amounts to 0.45% of the transferase activity.

This value is larger than both the hydrolase activities observed for BoGT6a (0.06%) and for α3GT (0.25%), and thus suggests that for the WbnI catalyzes the reaction by binding

UDP-Gal prior to the binding of the acceptor. Since WbnI catalyzes the cleavage between UDP and Gal, it is unlikely that the mechanism is obligatory ordered with the acceptor binding prior to the donor, and is more likely to have the donor bind first or to be random ordered.
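As an illustration of why intersecting versus parallel double-reciprocal lines discriminate between the two mechanisms, the following minimal sketch may be helpful. It assumes that Equation 2 has the standard steady-state form for a sequential bisubstrate reaction, v/[E] = kcat[A][B] / (Kia·Kb + Kb[A] + Ka[B] + [A][B]), with A the donor and B the acceptor; this form is an assumption, since the equation itself is defined elsewhere in the text. The function names and the two acceptor concentrations used to measure the slope are hypothetical, and the WbnI constants are those listed in Table 3.2.

```python
# WbnI constants from Table 3.2 (concentrations in µM, kcat in s^-1).
KA, KB, KIA, KCAT = 326.0, 2352.0, 220.0, 0.20


def sequential_rate(a, b, ka=KA, kb=KB, kia=KIA, kcat=KCAT):
    """Turnover v/[E] for an assumed sequential bi-bi rate law (A = donor, B = acceptor)."""
    return kcat * a * b / (kia * kb + kb * a + ka * b + a * b)


def ping_pong_rate(a, b, ka=KA, kb=KB, kcat=KCAT):
    """Turnover v/[E] for a ping-pong rate law: the same form without the Kia*Kb term."""
    return kcat * a * b / (kb * a + ka * b + a * b)


def reciprocal_slope(rate, a, b_low=500.0, b_high=5000.0):
    """Slope of 1/v versus 1/[B] at fixed [A]; two points suffice because the plot is linear."""
    return (1 / rate(a, b_low) - 1 / rate(a, b_high)) / (1 / b_low - 1 / b_high)


# Fixed UDP-Gal concentrations used in Figure 3.5, converted to µM.
donor_concs = [75.0, 150.0, 300.0, 600.0]
seq_slopes = [reciprocal_slope(sequential_rate, a) for a in donor_concs]
pp_slopes = [reciprocal_slope(ping_pong_rate, a) for a in donor_concs]

for a, s_seq, s_pp in zip(donor_concs, seq_slopes, pp_slopes):
    print(f"[UDP-Gal] = {a:5.0f} µM  sequential slope = {s_seq:9.1f}  ping-pong slope = {s_pp:9.1f}")
```

Under these assumptions the sequential slope, (Kb/kcat)(1 + Kia/[A]), falls as the UDP-Gal concentration rises, so the lines intersect, whereas the ping-pong slope stays at Kb/kcat for every donor concentration, giving parallel lines.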

Table 3.2. Kinetic parameters for WbnI and other published α-1,3-GalTs

Kinetic parameter       GTB (a)     α3GT (b)      WbnI (c)
UDP-Gal activity
  Ka (µM)               54          430           326 ± 40
  Kb (µM)               34          19 x 10^3     2352 ± 250
  Kia (µM)              27          140           220 ± 24
  kcat (s^-1)           6.5         6.4           0.20 ± 0.01
Hydrolase activity
  Ka (µM)               ND (d)      100           70
  kcat (s^-1)           ND          6.4           8.7 x 10^-4

(a) Data reported by Seto et al., using the acceptor Fuc-α1,2-Gal-β-O-(CH2)7-CH3.
(b) Data reported by Zhang et al., using lactose as the substrate.
(c) Determined using the acceptor Fuc-α1,2-Gal-β-1,3-GalNAc-OMe.
(d) ND, not determined.

Investigation of potential active site residues

The bacterial glycosyltransferases in GT family 6 share some sequence similarity with their eukaryotic homologues, and as such we investigated the effects of mutating several of the potential active site residues in WbnI. The amino acids were chosen based on their structure-function relationships as demonstrated in the three-dimensional structures of GTB and α3GT, and by comparison to the work done on BoGT6a.4,16,21 As mentioned previously, WbnI does have a DXD motif at aspartic acids 41 and 43, and thus site-directed mutants of these residues were designed by individually mutating both aspartic acids to both alanine and glutamate. Similar to the other bacterial GT6 family transferases, WbnI contains an interestingly positioned 91NAN93 sequence, which aligns with the DXD motif present in the eukaryotic GT family 6 glycosyltransferases, and each asparagine was mutated to alanine and aspartate. Glutamine 148 was identified as an amino acid that may be involved in differentiating between the donors UDP-Gal and UDP-GalNAc. This amino acid corresponds to leucine 266 in human GTA and methionine 266 in human GTB, a position which allows for the distinction between UDP-GalNAc and UDP-Gal. Thus, glutamine 148 was mutated to alanine in an attempt to alter the donor specificity of WbnI. Lastly, glutamate 185 was individually replaced with alanine, aspartate, and glutamine, on the basis that this amino acid may be directly involved in the catalytic mechanism. WbnI, GTB, and bovine α3GT belong to a class of enzymes that retain the stereochemistry of the donor in the newly created glycosidic linkage (α1,3-GalT), and E185 is found within an HDES amino acid motif which is highly conserved among the GT6 family.22,23 In α3GT, glutamate 317 is thought to be an important nucleophile which promotes and stabilizes catalysis.24 Most of the site-directed mutants showed wild-type-like expression and were purified using nickel-chelating chromatography. We chose to compare the apparent Michaelis-Menten constant for the acceptor (appKm), as well as the apparent Vmax (appVmax), which are summarized in Table 3.3.

To investigate whether the 41DSD43 motif is involved in catalysis, four site-directed mutants were made. The kinetic parameters for D41A, D41E, D43A, and D43E do not agree with previous results for similar mutations in other GT-A type glycosyltransferases,25 and these aspartates are therefore not likely to be involved in coordinating a divalent metal cation during catalysis. Mutations D41A, D41E, D43A, and D43E all displayed comparable levels of enzymatic activity, which is consistent with the notion that these mutations affect structural components.26 Since comparable levels of enzymatic activity were observed, it is likely that the size and charge of the aspartic acids are not essential for activity. Typically, when a DXD motif is directly involved in the coordination of a divalent metal cation, a distinct loss of enzymatic activity is seen with the aspartate to alanine mutation, and activity is restored when the aspartate is instead replaced with glutamate.25


Table 3.3. Kinetic parameters for WbnI site-directed mutants (Type III acceptor)

Enzyme     appKm (mM)       appVmax (nmol min^-1)
WT         0.62 ± 0.04      0.51 ± 0.07
D41A       1.36 ± 0.7       0.22 ± 0.07
D41E       1.05 ± 0.6       0.34 ± 0.08
D43A       1.47 ± 0.6       0.29 ± 0.04
D43E       1.36 ± 0.3       0.32 ± 0.03
N91A       2.14 ± 0.4       0.34 ± 0.05
N91D       2.56 ± 0.3       0.23 ± 0.02
N93A       2.60 ± 0.4       0.28 ± 0.07
N93D       1.87 ± 0.5       0.32 ± 0.03
Q148A      3.11 ± 0.9       0.32 ± 0.07
E185A      NDA (a)          NDA
E185Q      1.21 ± 0.1       0.27 ± 0.01
E185D      0.92 ± 0.05      0.34 ± 0.08

(a) NDA: no detectable activity with ESI/MS or TLC.

Interestingly, D41 appears to be found in a highly conserved motif, and in the crystal structure of GTB the corresponding conserved aspartic acid is found in a solvent-exposed loop. Further mutations converted Asn91 and Asn93 to both aspartic acid and alanine. Mutating either asparagine to alanine resulted in larger apparent Km values for the type III/IV acceptor. Subsequent mutation of these asparagines to aspartates resulted in proteins that were difficult to purify to near homogeneity. It is possible that the burying of the negative charge had some impact on the yields observed for N91D and N93D. Regardless, mutagenesis of either asparagine to aspartic acid resulted in a decrease in the apparent Vmax and an increase in the apparent Km. Mutation of E185 to alanine resulted in an enzyme void of any detectable activity; however, activity was rescued by the Glu185 to Asp mutation. The E185Q mutation also displayed a decrease in the apparent Vmax and an increase in the apparent Km relative to the wild type. Lastly, while no UDP-GalNAcT activity was observed with the wild type WbnI, mutation of Gln148 to alanine resulted in a WbnI mutant which displayed marginal UDP-GalNAcT activity (Appendix) as detected by mass spectrometry. The kinetic parameters could not be accurately determined for the UDP-GalNAcT activity, likely because the native kcat is very low.
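One way to compare the mutants on a single scale is to take the ratio of apparent Vmax to apparent Km and normalize it to the wild type. The sketch below does this with the values reported in Table 3.3. It is only a rough comparison: it assumes that equal amounts of enzyme were used in each assay (which is not stated here), it ignores the reported uncertainties, and the function name is hypothetical.

```python
# Apparent Km (mM) and apparent Vmax (nmol/min) copied from Table 3.3.
# E185A is left out because it showed no detectable activity.
table_3_3 = {
    "WT": (0.62, 0.51),
    "D41A": (1.36, 0.22),
    "D41E": (1.05, 0.34),
    "D43A": (1.47, 0.29),
    "D43E": (1.36, 0.32),
    "N91A": (2.14, 0.34),
    "N91D": (2.56, 0.23),
    "N93A": (2.60, 0.28),
    "N93D": (1.87, 0.32),
    "Q148A": (3.11, 0.32),
    "E185Q": (1.21, 0.27),
    "E185D": (0.92, 0.34),
}


def relative_efficiency(params, reference="WT"):
    """Return Vmax/Km for each enzyme divided by Vmax/Km of the reference enzyme."""
    ref_km, ref_vmax = params[reference]
    ref_ratio = ref_vmax / ref_km
    return {name: (vmax / km) / ref_ratio for name, (km, vmax) in params.items()}


rel = relative_efficiency(table_3_3)

# List the enzymes from most to least efficient relative to the wild type.
for name, value in sorted(rel.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:6s} {value:5.2f}")
```

By this measure every mutant falls below the wild type, with E185D retaining the most activity among the mutants and N91D the least among those with detectable activity.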

WbnI knockout and complementation experiments

Previous work with E. coli O86 demonstrated that removing the wbnI gene from the E. coli O86 chromosome generated a new LPS phenotype.9 Furthermore, Yi and co-workers were able to alter the LPS phenotype of this deficient strain by incorporating an α1,3-GalNAcT (BgtA), as shown in Figure 3.1.7 In this work, instead of complementing the wbnI deletion with a bacterial glycosyltransferase, we attempted to alter the LPS phenotype of ∆wbnI E. coli O86 by introducing a recombinant expression plasmid harboring either the GTB or the α3GT gene. Following the work previously done with WbnI, the wbnI gene was functionally replaced by a kanamycin resistance cassette, and the replacement was confirmed by PCR analysis. The resulting LPS phenotype can be seen in Figure 3.6, where Figure 3.6 A is visualized by silver staining and Figure 3.6 B is visualized by western blot analysis using rabbit anti-O86 serum. Lane 1 in both A and B shows the wild type LPS phenotype, whereas Lane 2 shows the phenotype of ∆wbnI E. coli O86. The LPS phenotype change appears as a slight shift in the low molecular weight LPS. When pBAD-wbnI was introduced into the wbnI deficient strain, the resulting LPS phenotype resembled that of the wild type (Lane 3). As predicted from the acceptor substrate specificity of α3GT, no change in the LPS phenotype was observed when pBAD-α3GT was introduced into the wbnI deficient strain (Lane 5). This enzyme does not have a fucose-binding site, and thus should not tolerate the fucosylated O-antigen. Unexpectedly, however, GTB was unable to complement the activity of WbnI (Lane 4). One plausible explanation for the lack of complementation by GTB is that this enzyme is commonly membrane bound, and this membrane anchoring may be important for in vivo function.

Figure 3.6. LPS phenotype for ∆wbnI E. coli O86 and complemented strains. A) Visualized by silver staining; B) Visualized by western blot probing with rabbit anti-O86 serum. Lane 1: wild type E. coli O86; Lane 2: ∆wbnI E. coli O86; Lane 3: ∆wbnI E. coli O86 with pBAD-wbnI; Lane 4: ∆wbnI E. coli O86 with pBAD-GTB; Lane 5: ∆wbnI E. coli O86 with pBAD-α3GT.

Molecular modeling

Fortunately for the purposes of this work, GTB and bovine α3GT have been characterized using X-ray crystallography, and as such their structures could be used as templates to model WbnI.15,21,27,28 Using SWISS-MODEL, the primary amino acid sequence of WbnI was aligned and modeled against the structure of GTB containing the substrates H-antigen (Type I) and UDP-Gal. Additional GTB structures were available which contained other cofactors as well as the H-antigen Type II acceptor, and these were further used for modeling the acceptor substrates in WbnI. The GTB structure was chosen over the available α3GT structures for modeling because GTB's native substrate is fucosylated whereas α3GT's native acceptor is non-fucosylated. Since WbnI exhibits a clear preference for fucosylated substrates, modeling after GTB is more likely to reveal the reason for WbnI's specificity.

Figure 3.7 is an image of the acceptor/donor binding site of GTB, where A) illustrates the interaction of UDP-Gal and the Type I acceptor, and B) illustrates the interaction of UDP-Gal and the Type II acceptor. The UDP-Gal moiety is shown as a "line" model (in yellow/orange), whereas the trisaccharide acceptors are shown as green or blue (Types I or II, respectively) stick models (PyMOL defined terms). While the structure and active site residues of GTB have been experimentally determined, it is interesting to observe how the acceptors fit into the defined acceptor pocket. The Type I and II acceptors vary in the linkage between the Gal and GlcNAc moieties, either β-1,3 or β-1,4, respectively. Depending on that linkage, the entire GlcNAc moiety interacts with the binding pocket differently and is seemingly allowed free rotation in the pocket, suggesting a promiscuous binding pocket. Also in Figure 3.7, the fucose moiety sits in a crevice, putting the Gal moiety in a specific orientation, which is likely vital for catalysis. The base of this fucose-binding pocket also lacks a tryptophan residue which, by sequence alignment, is found in α3GT.

Using the GTB template, WbnI was modeled, as seen in Figure 3.8. Frame A shows the GTB template in purple, with the WbnI model in teal (with UDP-Gal and Type I substrates), whereas Frame B zooms in on the potential acceptor active site (showing only the Type I acceptor). As mentioned previously, WbnI and GTB share common acceptor and donor substrates; however, the mechanism of donor binding appears to differ. This idea is further illustrated by the molecular model of WbnI, as only the potential acceptor binding domain could be modeled (amino acids 110-206). From the model, three amino acids in GTB appear to form a binding cleft which is in direct contact with the GlcNAc moiety of the acceptor (GTB: F235, S234, L329). The corresponding amino acids in WbnI are Y118, G117, and Y209, respectively. The bulkier tyrosine residues found in this potential binding cleft of WbnI may contribute to the difference in specificity between WbnI and GTB, suggesting that the hydrogen bonding pattern of the tyrosines may not tolerate the non-type III acceptors. Site-directed mutants of these active site residues were made in an attempt to loosen the substrate specificity, but no considerable change in specificity was observed towards the Type I, II, or V acceptors. This suggests that while these amino acids may be involved in specificity, it is the sum of many interactions between multiple amino acid residues in this binding pocket which results in the tight acceptor substrate specificity of WbnI.

3.4 Discussion

In comparison to the characterized eukaryotic transferases from the GT6 family, WbnI has a shorter primary amino acid sequence, is a soluble protein, and has very rigid acceptor substrate specificity. It also catalyzes the formation of blood group B structures found on the E. coli O86 O-antigen via a metal independent mechanism. WbnI displays very strict substrate specificity and is able to catalyze the hydrolysis of the UDP-Gal donor, suggesting a common sequential mechanism among the GT6 family transferases.

Metal dependent glycosyltransferases, such as GTB and bovine α3GT, demonstrate the necessity of a DXD motif for glycosyltransferase activity. In other glycosyltransferase families, metal independent enzymes exist; however, only one has been characterized from the GT6 family.4 One example of a metal independent glycosyltransferase, the sialyltransferase CstII, utilizes hydrogen bonding interactions between the CMP-NeuAc donor and two tyrosine hydroxyl groups during the glycosyltransferase mechanism.29 Other metal independent GTs are common, and one viable explanation for this metal independence comes from the metal independent glycosyltransferases of the GT11 family. These transferases contain a motif which is responsible for mimicking the DXD-metal cation interaction, which coordinates the sugar nucleotide and aids in the stabilization of the UDP after catalysis.30,31

Figure 3.7. Crystal structure of GTB with UDP-Gal and A) Type I acceptor bound; and B) Type II acceptor bound.

Figure 3.8. Molecular model of WbnI, where the GTB template is purple and WbnI is teal. A) Complete model; B) Zoom in on acceptor binding domain with Type I acceptor bound.

WbsJ, the α1,2-fucosyltransferase from E. coli O128, which belongs to the GT11 family, demonstrates metal independent fucosyltransferase activity. Through mutagenesis studies, it was identified that a highly conserved HXRRXD motif, rich in basic residues, was likely in direct contact with the sugar nucleotide donor, GDP-fucose.32 Similar motifs rich in basic amino acids are present in all the members of the GT11 family, suggesting that these fucosyltransferases originated from a common ancestral gene or from duplication of a common ancestral gene.33 In contrast, while the GT11 family fucosyltransferases are metal independent, the presence of a divalent metal does not inhibit them; in the case of WbsJ and FucT, enzyme activity is actually slightly enhanced in the presence of Mn2+ at various concentrations.32,34 Such enhancement is not observed for WbnI; instead, the presence of a divalent metal seems to have only an inhibitory effect. In other metal independent glycosyltransferases, such as the leukocyte-type core 2 β1,6-GlcNAcT, basic amino acids may substitute for the metal in mediating the interaction between the enzyme and the sugar nucleotide donor.6 As noted for BoGT6a, the metal independence observed for these bacterial glycosyltransferases is likely not because of mechanistic changes or alterations in the three-dimensional structure of the protein.4 Because the reason for such metal independence could not be identified, and based on other metal independent GTs, catalysis is likely mediated by an interaction of basic amino acids with the sugar nucleotide donor.

In addition to displaying differential activity in the presence of exogenous divalent metal as compared to the eukaryotic GT6 family enzymes, WbnI also exhibits much more stringent acceptor substrate specificity than the other characterized GT6 family enzymes. Crystallographic studies of GTA and GTB pinpointed the amino acid residues that are involved in binding the type I and II acceptors, which allows for some explanation as to the cause of substrate promiscuity. Despite being much less promiscuous toward the acceptor substrate, WbnI does display some similarity to its human homologue (GTB), in that a significant increase in specific activity is observed with the type III/IV trisaccharide as compared to the non-fucosylated acceptors. The higher enzymatic activity in the presence of a fucosylated acceptor indicates that both GTB and WbnI have designated pockets with which the fucose moiety interacts. To explore where the fucose specificity arises, a comparison between the bovine α3GT and GTB/WbnI can be made. In the available crystal structure of bovine α3GT there are five identifiable tryptophan residues in the catalytic pocket, two of which are unique to the bovine α3GT.16 Tryptophan residues 249 and 356 are specific to the bovine α3GT, which has little to no ability to transfer galactose to the fucose-containing acceptor. The amino acids corresponding to W249 and W356 in WbnI and BoGT6a are glycine and methionine/leucine, respectively. Furthermore, the corresponding residues in GTA/GTB, both of which prefer fucose-containing acceptors, are serine/glycine and alanine, respectively. The absence of bulky tryptophan residues at these positions (G117 and M220 in WbnI), as well as the relatively conserved smaller amino acids in the other fucose-accepting GTs, may explain these enzymes' ability to accept and even prefer acceptors that contain the fucose moiety.

Further structure-function comparisons can be made between WbnI and the eukaryotic glycosyltransferases. Initial studies of WbnI were performed in the presence of a divalent metal, primarily because a putative DXD motif was recognized. Further studies tested the actual necessity of a divalent metal for enzyme catalysis, and determined that WbnI is metal independent. To verify that the DXD motif is not involved in the coordination of UDP-Gal, mutations of each aspartic acid to both alanine and glutamate were made. The kinetic data obtained support the conclusion that the DXD motif present in WbnI is not required for enzymatic activity. Furthermore, although D41 is highly conserved among GT6 family glycosyltransferases, it is likely not responsible for the binding of the sugar nucleotide donor. More interesting is the exact alignment of the NAN motif observed in the bacterial glycosyltransferases with the well characterized and conserved DXD motifs. Both WbnI and BoGT6a are active independent of a divalent metal, and both contain the NAN motif. The exact role of the NAN motif in the bacterial enzymes is unknown due to the unavailability of a three-dimensional structure; however, the alignment of this motif with the DXD motif in the metal dependent GTs of GT family 6 is intriguing.

While the source of this metal independence is unknown, it is possible that bacterial metal independence reflects the ancestral gene from which the GT6 transferases originated. WbnI is an intracellular enzyme and is likely associated with the inner membrane of E. coli O86. E. coli O86, like many commensal bacteria, inhabits the intestinal tracts of vertebrates, and the physiological concentration of metal cations in the host is low due to intestinal absorption.4 Thus, as in many bacteria, the uptake and efflux of exogenous cations, especially Mn2+, is tightly regulated such that the intracellular concentration may be low.35,36 Furthermore, the in vivo and in vitro enzymatic synthesis of the E. coli O86 O-antigen repeat unit is completed by a series of sequential enzymatic glycosylations by the enzymes WbnH, WbnJ, WbnK, and WbnI in the Wzy/Wzz dependent pathway.11 However, during the in vitro synthesis of the O-repeat unit, only WbnJ (β1,3-GalT) requires the presence of a divalent metal cation, further suggesting that the metal independence in bacterial GTs may have been the result of evolutionary adaptation.37,38

All the GTs characterized from GT family 6 have been shown to use a retaining mechanism, meaning that the glycosidic linkage created by WbnI retains the stereochemistry of UDP-Gal in the resulting product. It has further been proposed that the mammalian CAZy family 6 transferases adopt a mechanism similar to that of glycogen phosphorylases, in which UDP-Gal is cleaved with all other substrates present in the active site, generating a planar cationic oxocarbenium ion that is stabilized by various anionic groups.39,40 However, while this mechanism may be similar due to similar active-site residues, the putative anionic residues (DXD) are absent in the bacterial CAZy family 6 GTs, suggesting that this function may be replaced by a novel substructure of the protein.39 Furthermore, one residue that may be involved in the retaining mechanism is Glu185, which was identified because it belongs to a well conserved HDES motif, as seen in many other GT6 family transferases. The corresponding amino acid in bovine α3GT was shown to be directly involved in the catalytic mechanism, functioning in the stabilization of the transition state by interacting with the acceptor LacNAc.23 Mutation of Glu185 to alanine in WbnI resulted in an enzyme void of any detectable function, suggesting the importance of this conserved, negatively charged residue. Mutation of Glu185 to Gln, which is similar in size, restored activity relative to the alanine mutant, and the E185D mutant demonstrated higher activity than the corresponding glutamine mutant. This suggests that both the size and negative charge of the wild type glutamate residue are important for enzymatic activity, which correlates with similar mutations made in other GT6 family transferases. Similar to Glu185, Gln148 corresponds to amino acids present in the other enzymes from this family. Specifically, it was shown that this amino acid position alone in BoGT6a, GTA, and GTB directly determines the specificity for the donor, either UDP-GalNAc for BoGT6a/GTA or UDP-Gal for GTB. Mutation of Gln148 to alanine allowed for tolerance of UDP-GalNAc by WbnI, which was not detected with the wild type WbnI. The other α1,3GalTs display similarly bulky amino acids at this position, suggesting that the N-acetyl group is too bulky for these enzymes to accommodate the donor UDP-GalNAc. In contrast, BoGT6a and GTA have the smaller amino acids alanine and leucine, respectively, at this position, and both are able to accommodate and prefer the bulkier UDP-GalNAc donor. However, unlike BoGT6a and GTA, BgtA (UDP-GalNAcT) has a bulkier tyrosine at this position, suggesting that while this amino acid site may be involved in donor substrate specificity, there may be other amino acids in the bacterial GTs that help distinguish between UDP-Gal and UDP-GalNAc. This idea is also supported by the weak UDP-GalNAcT activity of the WbnI Q148A mutant, in that only limited activity could be detected by mass spectrometry.

Even though the bacterial glycosyltransferases from GT family 6 differ slightly from their eukaryotic homologues in specificity, size, and metal dependence, it appears that similar reactions are catalyzed using similar bisubstrate catalytic mechanisms, likely through the interaction of key conserved amino acid residues. WbnI catalyzes the formation of the blood group B antigen in E. coli O86 via a sequential mechanism and exhibits UDP-Gal hydrolase activity, suggesting either random or ordered sequential binding of the donor and acceptor. In addition to the amino acids focused on in this study, there are several other key amino acids which are highly conserved between the prokaryotic and eukaryotic GTs, such as Tyr13 in WbnI. This amino acid is 100% conserved among the enzymes in Figure 3.2, and was shown in the crystal structures of GTB to be important for the interaction with the sugar nucleotide donor.21 Because of all the similarities identified between the prokaryotic and eukaryotic glycosyltransferases from GT family 6, it is likely that the activity of the bacterial transferases without a metal ion cofactor is the result of a few amino acid changes in either the catalytic site or the functional mechanism. How the NAN motif, which seems to be functionally important and is present in the bacterial transferases, affects the metal independence of these enzymes is unknown, but given all the amino acid similarities between these enzymes the NAN motif may have some involvement in this novel interaction. This idea is further supported by the fact that WbnI's N-terminus, which contains the NAN motif, could not be modeled using SWISS-MODEL, whereas the acceptor binding domain could. One review proposes that the NAN modification came about because of horizontal gene transfer during the evolutionary development of phages, bacteria, and eukaryotes.39 Lastly, while the bacterial glycosyltransferases mentioned exhibit a similar mechanism despite the difference in metal dependence, no structure-based information has allowed a conclusion on why WbnI exhibits such strict specificity towards only the type III/IV acceptor, whereas the other functionally characterized enzymes from this family exhibit much more promiscuous activities.


REFERENCES

(1) Hashimoto, K.; Madej, T.; Bryant, S. H.; Panchenko, A. R. J Mol Biol 2010, 399, 196-206.

(2) Yamamoto, F.; Clausen, H.; White, T.; Marken, J.; Hakomori, S. Nature 1990, 345, 229-33.

(3) Fang, J. W.; Li, J.; Chen, X.; Zhang, Y. N.; Wang, J. Q.; Guo, Z. M.; Zhang, W.; Yu, L. B.; Brew, K.; Wang, P. G. J Am Chem Soc 1998, 120, 6635-8.

(4) Tumbale, P.; Brew, K. J Biol Chem 2009, 284, 25126-34.

(5) Seto, N. O.; Palcic, M. M.; Compston, C. A.; Li, H.; Bundle, D. R.; Narang, S. A. J Biol Chem 1997, 272, 14133-8.

(6) Pak, J. E.; Arnoux, P.; Zhou, S.; Sivarajah, P.; Satkunarajah, M.; Xing, X.; Rini, J. M. J Biol Chem 2006, 281, 26693-701.

(7) Yi, W.; Shen, J.; Zhou, G.; Li, J.; Wang, P. G. J Am Chem Soc 2008, 130, 14420-1.

(8) Feng, L.; Han, W.; Wang, Q.; Bastin, D. A.; Wang, L. Vet Microbiol 2005, 106, 241-8.

(9) Yi, W.; Zhu, L.; Guo, H.; Li, M.; Li, J.; Wang, P. G. Carbohydr Res 2006, 341, 2254-60.

(10) Yi, W.; Shao, J.; Zhu, L.; Li, M.; Singh, M.; Lu, Y.; Lin, S.; Li, H.; Ryu, K.; Shen, J.; Guo, H.; Yao, Q.; Bush, C. A.; Wang, P. G. J Am Chem Soc 2005, 127, 2040-1.

(11) Woodward, R.; Yi, W.; Li, L.; Zhao, G.; Eguchi, H.; Sridhar, P. R.; Guo, H.; Song, J. K.; Motari, E.; Cai, L.; Kelleher, P.; Liu, X.; Han, W.; Zhang, W.; Ding, Y.; Li, M.; Wang, P. G. Nat Chem Biol 2010, 6, 418-23.

(12) Abeyrathne, P. D.; Daniels, C.; Poon, K. K.; Matewish, M. J.; Lam, J. S. J Bacteriol 2005, 187, 3002-12.

(13) Letts, J. A.; Rose, N. L.; Fang, Y. R.; Barry, C. H.; Borisova, S. N.; Seto, N. O.; Palcic, M. M.; Evans, S. V. J Biol Chem 2006, 281, 3625-32.

(14) Nguyen, H. P.; Seto, N. O.; Cai, Y.; Leinala, E. K.; Borisova, S. N.; Palcic, M. M.; Evans, S. V. J Biol Chem 2003, 278, 49191-5.

(15) Alfaro, J. A.; Zheng, R. B.; Persson, M.; Letts, J. A.; Polakowski, R.; Bai, Y.; Borisova, S. N.; Seto, N. O.; Lowary, T. L.; Palcic, M. M.; Evans, S. V. J Biol Chem 2008, 283, 10097-108.

(16) Gastinel, L. N.; Bignon, C.; Misra, A. K.; Hindsgaul, O.; Shaper, J. H.; Joziasse, D. H. Embo J 2001, 20, 638-49.

(17) Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu Rev Biochem 2008, 77, 521-55.

(18) Breton, C.; Snajdrova, L.; Jeanneau, C.; Koca, J.; Imberty, A. Glycobiology 2006, 16, 29R-37R.

(19) Liu, X. W.; Xia, C.; Li, L.; Guan, W. Y.; Pettit, N.; Zhang, H. C.; Chen, M.; Wang, P. G. Bioorg Med Chem 2009, 17, 4910-5.

(20) Soya, N.; Shoemaker, G. K.; Palcic, M. M.; Klassen, J. S. Glycobiology 2009, 19, 1224-34.

(21) Patenaude, S. I.; Seto, N. O.; Borisova, S. N.; Szpacenko, A.; Marcus, S. L.; Palcic, M. M.; Evans, S. V. Nat Struct Biol 2002, 9, 685-90.

(22) Heissigerova, H.; Breton, C.; Moravcova, J.; Imberty, A. Glycobiology 2003, 13, 377-86.

(23) Tumbale, P.; Jamaluddin, H.; Thiyagarajan, N.; Brew, K.; Acharya, K. R. Biochemistry 2008, 47, 8711-8.

(24) Boix, E.; Zhang, Y.; Swaminathan, G. J.; Brew, K.; Acharya, K. R. J Biol Chem 2002, 277, 28310-8.

(25) Yi, W.; Perali, R. S.; Eguchi, H.; Motari, E.; Woodward, R.; Wang, P. G. Biochemistry 2008, 47, 1241-8.

(26) Breton, C.; Imberty, A. Curr Opin Struct Biol 1999, 9, 563-71.

(27) Persson, M.; Letts, J. A.; Hosseini-Maaf, B.; Borisova, S. N.; Palcic, M. M.; Evans, S. V.; Olsson, M. L. J Biol Chem 2007, 282, 9564-70.

(28) Rao, M.; Tvaroska, I. Proteins 2001, 44, 428-34.

(29) Chiu, C. P.; Watts, A. G.; Lairson, L. L.; Gilbert, M.; Lim, D.; Wakarchuk, W. W.; Withers, S. G.; Strynadka, N. C. Nat Struct Mol Biol 2004, 11, 163-70.


(30) Takahashi, T.; Ikeda, Y.; Tateishi, A.; Yamaguchi, Y.; Ishikawa, M.; Taniguchi, N. Glycobiology 2000, 10, 503-10.

(31) Zhang, Y.; Wang, P. G.; Brew, K. J Biol Chem 2001, 276, 11567-74.

(32) Li, M.; Liu, X. W.; Shao, J.; Shen, J.; Jia, Q.; Yi, W.; Song, J. K.; Woodward, R.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 378-87.

(33) Martinez-Duncker, I.; Mollicone, R.; Candelier, J. J.; Breton, C.; Oriol, R. Glycobiology 2003, 13, 1C-5C.

(34) Stein, D.; Lin, Y.-N.; Lin, C.-H. Adv Synth Catal 2008, 350, 2313-21.

(35) Hantke, K. Curr Opin Microbiol 2001, 4, 172-7.

(36) Rosch, J. W.; Gao, G.; Ridout, G.; Wang, Y. D.; Tuomanen, E. I. Mol Microbiol 2009, 72, 12-25.

(37) Yi, W.; Yao, Q.; Zhang, Y.; Motari, E.; Lin, S.; Wang, P. G. Biochem Biophys Res Commun 2006, 344, 631-9.

(38) Li, M.; Shen, J.; Liu, X.; Shao, J.; Yi, W.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 11590-7.

(39) Brew, K.; Tumbale, P.; Acharya, K. R. J Biol Chem 2010, 285, 37121-7.

(40) Watson, K. A.; McCleverty, C.; Geremia, S.; Cottaz, S.; Driguez, H.; Johnson, L. N. Embo J 1999, 18, 4619-32.


CHAPTER 4

DEVELOPMENT OF A HIGH THROUGHPUT ASSAY FOR THE IDENTIFICATION OF GLYCOSYLTRANSFERASE ACTIVITY

4.1 Introduction

Glycosyltransferases (GTs) are among the most abundant enzymes in nature and are important for the biosynthesis of glycans.1 In vivo, GTs display high specificity in joining nucleotide donors and acceptor substrates to form glycosidic linkages. However, GT-mediated reactions that are performed in vitro will tolerate a broader set of nucleotide donors and acceptors, and GTs have therefore been useful catalysts in the synthesis of oligosaccharides.2 The use of GTs to synthesize complex oligosaccharides offers the benefits that the enzymes create very specific glycosidic linkages, avoid the tedious protection and deprotection steps that are required in organic synthesis, and are efficient at neutral pH.3 Genomic analysis can be used to identify a putative GT, but functional assays are needed to validate the activity.4 Furthermore, the discovery of GTs demonstrating novel donor and acceptor specificities remains a difficult task, reflecting a lack of efficient tools and systematic strategies to functionally characterize GTs.5,6 For example, among the 60,000 putative glycosyltransferases organized in the CAZy database (http://www.cazy.org/GlycosylTransferases.html), only a small fraction have been functionally characterized.7

Traditional approaches for accurately assigning a function to a putative glycosyltransferase involve thorough genetic analyses, in vivo experiments, and in vitro experiments all of which culminate in identifying a transferase‘s donor and acceptor specificities, as well as assignment of the newly formed glycosidic linkage. Because there is very little correlation between the primary amino acid sequence and the overall specificity of a GT, it is challenging to predict function from just the sequence alone.

Commonly, a putative GT is identified after a specific glycan has been identified or isolated from an organism, and therefore some function may be putatively assigned after the glycan is isolated. For example, the O-antigen of a bacterium is easily isolated and studied using NMR and mass spectrometry. After isolating and purifying the O-antigen, its carbohydrate composition can be determined. Then, because the transferases responsible for O-antigen biosynthesis are clustered in a locus, a putative function can be assigned to specific transferases. Even after a gene is recognized and putatively assigned a donor and acceptor function, the further in vitro assignment is still very challenging and tedious, as represented in Scheme 4.1. Wagner and Pesnot accurately describe the current understanding of GT assignment in a 2010 review of GTs: "The precise biochemical function of many (putative) sequences is presently still unknown, and elucidating the donor/acceptor specificity of these GT 'orphan sequences' remains a major challenge in glycobiology".8


Scheme 4.1. Example illustrating the timeline to travel from putative GT gene to understanding the GT specificity.

Scheme 4.2. Proposed method to go from putative GT to functional assignment.

To address this challenge, we wanted to design a technology that bypasses the need for any prior information other than knowing that a gene encodes a putative glycosyltransferase, and that assigns donor, acceptor, and cofactor specificities. Since the CAZy database houses 60,000 GTs, our idea is to go from putative transferase to functional assignment quickly and in a high throughput manner (Scheme 4.2).


To achieve the idea proposed in Scheme 4.2, we needed to address several individual aspects in order to generate a technology that is high throughput and more efficient than the traditional GT functional assignment assays. These included how to rapidly clone a putative GT, how to rapidly express active GTs, how to screen various sugar nucleotide donors, acceptors, and metal cofactors cost-effectively, and lastly how to detect the glycosylation reaction. The cloning of a putative GT was accomplished in a relatively high throughput manner, whereby we utilized ligation independent cloning (LIC) to quickly generate recombinant plasmids harboring the gene of interest.9 Then, using in vitro protein expression, we were able to use the recombinant plasmids as templates, which allowed for the rapid expression of the GTs. However, in vitro expression has one major drawback: the amounts of recombinant protein that are commonly obtainable can be very low. In addition, the sugar nucleotide donors needed for such a large assay are cost prohibitive. Thus, in collaboration with the Mrksich group, we developed a label-free assay that is based on a combination of self-assembled monolayers (SAMs) on gold that present the carbohydrate substrate and mass spectrometry to identify products of the glycosylation reactions.10 The monolayers are well-suited to solid-phase assays of biochemical activities, in part because the substrates are presented in a regular environment and the tri(ethylene glycol) groups that surround the substrate are highly effective at preventing nonspecific interactions with proteins and thereby ensure that the interactions of soluble proteins occur only by way of the immobilized ligands.11,12 More importantly, the monolayers are compatible with matrix-assisted laser desorption-ionization mass spectrometry (in a technique referred to as "SAMDI").13 Irradiation of the monolayer with a laser results in release of the alkanethiolates from the gold substrate, and because the chains undergo little fragmentation, the spectra provide direct masses for the ligand-substituted alkanethiols. As an example, in Figure 4.1, the SAMDI spectrum of a monolayer revealed a peak at m/z 1296 that corresponds to the mixed disulfide with a single lactose group. The peak at m/z 693 represented the tri(ethylene glycol) disulfide. After treatment of the monolayer with bovine α1,3-galactosyltransferase (50 mM Tris-HCl, pH 8.0, 10 mM MnCl2 and 2 mM uridine diphosphogalactose) for 1 h at 37 °C, a SAMDI spectrum showed the product peak at m/z 1458 that corresponds to the mixed disulfide containing the trisaccharide that resulted from enzymatic galactosylation of lactose.14 The absence of a peak at m/z 1296 demonstrates that the enzymatic reaction was essentially complete. Hence, the SAMDI assay provides a rapid and label-free assay of GT activities, which is especially useful in high throughput screening.

Figure 4.1. Glycosyltransferase assays were performed by applying solutions containing a GT and a sugar donor to regions of a self-assembled monolayer presenting carbohydrate acceptors. Bovine α1,3-galactosyltransferase and the sugar donor UDP-galactose were combined to form Gal-α1,3-Lactose, and the formation of the product (m/z 1458) was monitored by MALDI mass spectrometry.

4.2 Experimental methods

Materials

LIC-qualified T4 DNA Polymerase was purchased from Novagen. dGTP, dCTP, restriction enzyme SspI, Purelink Hipure Plasmid MiniPrep Kit and Expressway Maxi Cell-Free E. coli Expression System were purchased from Invitrogen. Monoclonal Anti-polyHistidine antibody (mouse) was purchased from Sigma. Anti-Mouse IgG Horseradish Peroxidase linked antibody and ECL Plus Western Blotting Detection Reagents were purchased from GE Healthcare. Ni-NTA agarose was purchased from Qiagen. Bio-Gel P-2 Gel was purchased from Bio-Rad. All the nucleotide sugar donors and reagents for sugar acceptor synthesis were purchased from Sigma. Oligosaccharides Gal1-4GlcNAc-Sp, GalNAc1-4GlcNAc-Sp, Fuc1-2Gal1-3GlcNAc-Sp, and GalNAc1-4[Fuc1-3]GlcNAc-Sp were kindly provided by The Scripps Research Institute. Proton nuclear magnetic resonance (1H NMR) spectra and carbon-13 (13C NMR) spectra were recorded on a Bruker DRX 500 spectrometer at 500 MHz and 125 MHz, respectively. Chemical shifts and coupling constants are reported in ppm and Hz, respectively. Direct on-chip MALDI-TOF mass spectrometry was performed on an Applied Biosystems 4800 Plus MALDI TOF/TOF mass spectrometer.

Bacterial strains, genomic DNAs, plasmids and growth conditions

Escherichia coli K12 (ATCC 29425), E. coli O3 (ATCC 23501), E. coli O6 (ATCC 19138), E. coli O77 (ATCC 23537), and E. coli O142 (ATCC 23985) were purchased from the American Type Culture Collection (ATCC). E. coli O24, E. coli O56, and E. coli O142 were purchased from The E. coli Reference Center, The Pennsylvania State University. E. coli DH5α used for cloning was purchased from Invitrogen. E. coli BL21(DE3) used for in vivo expression was purchased from Stratagene. Genomic DNAs of Streptococcus pyogenes (ATCC BAA-1065), Haemophilus ducreyi 35000HP (ATCC 700724), Streptococcus pneumoniae TIGR4 (ATCC BAA-334), Methanococcus maripaludis C5 (ATCC BAA-1333), Bifidobacterium infantis S12 (ATCC 15697), Campylobacter jejuni (ATCC 700819), Caulobacter vibrioides CB15 (ATCC 19089), Helicobacter pylori (ATCC 43504), and Bacteroides fragilis NCTC 9343 (ATCC 25285D) were purchased from ATCC. The plasmid carrying the DNA of Photobacterium damselae Pd2,6ST was kindly provided by Dr. Xi Chen from UC Davis. Plasmids carrying the DNAs of Haemophilus influenzae LgtD, bovine α1,3GalT, and Helicobacter mustelae BgtA were constructed previously by the Wang Lab. The vector pMCSG7 was kindly provided by Dr. Zhi-jie Liu from the Institute of Biophysics, China. E. coli strains were grown on LB media at 37 °C. Ampicillin at 100 µg mL⁻¹ was added to the media for selection as needed.


Cloning of putative glycosyltransferases

Putative glycosyltransferases were PCR amplified using genomic DNAs, cDNAs or plasmids harboring glycosyltransferases as templates. The forward primer was generated using the first 20 base pairs of the putative GT gene and the nucleotide sequence 5′-TACTTCCAATCCAATGCATG, where the 3′-terminal ATG represents the start codon for the gene of interest. The reverse primer was generated using the last 20 base pairs of the putative GT gene and the nucleotide sequence 5′-TTATCCACTTCCAATGCTA, where the 3′-terminal CTA is the chain termination codon (the reverse complement of TAG). The gene name, Protein Accession Number, Protein GI number, Gene GI number, and the bacterial source are listed in Appendix Table C.1. The gel-purified PCR products were then treated with LIC-qualified T4 DNA polymerase at 22 °C for 45 min in the presence of 2.5 mM dCTP and 5 mM DTT. The reactions were quenched by heating at 75 °C for 20 min. The cloning vector, pMCSG7, was linearized using the restriction enzyme SspI and then gel purified. The linearized vector was then treated with LIC-qualified T4 DNA polymerase at 22 °C for 45 min in the presence of 2.5 mM dGTP and 5 mM DTT and quenched using the aforementioned method. Treated vector and PCR product were then mixed in a 1:2 molar ratio (approximately 0.1 pmol of vector and 0.2 pmol of PCR product) and incubated at 22 °C for 5 min. Subsequently, EDTA was added to a final concentration of 5 mM, and the mixture was incubated for a further 10 min. The annealing reaction mixture was then transformed into E. coli DH5α competent cells. Positive colonies were selected by plating the cells onto LB agar containing ampicillin. The constructs were checked by DNA sequencing. The plasmids were purified using the Purelink Hipure Plasmid MiniPrep Kit, concentrated to at least 0.5 µg µL⁻¹, and stored at -80 °C until use.
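The primer construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in this work: it assumes that the quoted LIC sequences are split so that the gene's own start codon and the reverse complement of a TAG stop codon supply the terminal ATG and CTA, and the function names, the overhang constants and the demonstration gene are hypothetical.

```python
# LIC overhangs taken from the primer sequences quoted in the text, split so
# that the gene's own ATG (forward) and the reverse complement of its stop
# codon (reverse) complete them. This split is an assumption.
FWD_LIC_OVERHANG = "TACTTCCAATCCAATGC"
REV_LIC_OVERHANG = "TTATCCACTTCCAATG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_lic_primers(gene: str, anneal_len: int = 20) -> tuple[str, str]:
    """Build forward and reverse LIC primers for an ORF (start to stop codon)."""
    gene = gene.upper()
    if set(gene) - set("ACGT"):
        raise ValueError("gene contains non-ACGT characters")
    if len(gene) < anneal_len:
        raise ValueError("gene is shorter than the annealing region")
    if not gene.startswith("ATG"):
        raise ValueError("gene must begin with the ATG start codon")
    forward = FWD_LIC_OVERHANG + gene[:anneal_len]
    reverse = REV_LIC_OVERHANG + reverse_complement(gene[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 30-nt ORF used only to exercise the function.
    demo_gene = "ATGAAACGTCTGGCTATCGTTGGTCTGTAG"
    fwd, rev = design_lic_primers(demo_gene)
    print("forward:", fwd)
    print("reverse:", rev)
```

For a gene ending in TAG, the reverse primer produced this way begins with the quoted 5′-TTATCCACTTCCAATGCTA; genes with other stop codons would need the stop codon replaced, which the sketch does not attempt.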

In vitro expression and detection of glycosyltransferase activity

The in vitro expression of the putative glycosyltransferases was performed using the Expressway Cell-Free E. coli Expression System according to the manufacturer's instructions. Specifically, 2 µg of plasmid was used for each 100 µL reaction, and the reaction was incubated at 37 °C for 4 h. Subsequently, the insoluble particulates were removed by brief centrifugation and the supernatant containing soluble proteins was stored at -80 °C until use. The positive control for protein expression was provided by the manufacturer, and the negative control was carried out using empty vector pMCSG7.

The expression of the target glycosyltransferase was confirmed by both SDS-PAGE and anti-His western blot. Briefly, 5 µL of the aforementioned supernatant containing soluble proteins was precipitated using acetone. The protein pellet was resuspended in 1x SDS loading buffer and separated using SDS-PAGE. The western blot was performed using anti-polyhistidine antibody and anti-mouse IgG Horseradish Peroxidase linked antibody. The blot was developed via treatment with ECL Western Blotting Detection reagent.

In vivo expression and purification of glycosyltransferases

Plasmids harboring glycosyltransferase genes were transformed into E. coli BL21 (DE3) for overexpression. Protein expression was induced with 0.2 mM IPTG at 16 °C for 20 h. Cells were then harvested by brief centrifugation and disrupted by sonication in Buffer A (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 20 mM imidazole). After centrifugation (12,000 × g, 30 min, 4 °C), the supernatant was loaded onto 2 mL of Ni-NTA agarose for purification. Subsequently, the agarose was washed using 100 mL of Buffer A, and the proteins were eluted using 10 mL of Buffer B (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 250 mM imidazole). The fractions were then desalted against Buffer C (50 mM Tris-HCl, pH 8.0, 100 mM NaCl, 15% glycerol) and stored at -20 °C until use.

SDS-PAGE and Western blot confirmed the expression of the new enzymes discovered in this study (Figure C.1).

The screening and SAMDI detection

Screening reactions were performed on slides having a 5 x 10 array of gold islands modified with monolayers. The preparation of the oligosaccharide-terminated SAMs is described in detail in the Appendix to this thesis. Reactions were performed by applying a solution containing Tris-HCl (50 mM, pH 8.0), MnCl2 (10 mM), and the sugar donor (0.5 mM, in a total volume of 2 µL) to each individual gold island that presented one of the sugar acceptors, followed by the addition of the unpurified glycosyltransferase (0.5 µL). We estimate that enzymes were present at concentrations in the low nM range. The slides were kept in a humidified chamber at 37 °C for 2 hours, and then rinsed with water followed by ethanol. The slides were dried under nitrogen and then analyzed by SAMDI mass spectrometry. The monolayers in each island were treated with a solution of 2,4,6-trihydroxyacetophenone matrix (THAP, 5 mg mL⁻¹ in acetone, 0.5 µL). The slide was allowed to dry and then fixed to the sample holder for the Applied Biosystems 4800 Plus MALDI TOF/TOF mass spectrometer. A 355 nm Nd:YAG laser was used as the desorption/ionization source with an accelerating voltage of 20 kV and an extraction delay time of 50 ns. All spectra were acquired in positive reflector mode, and were Gaussian-smoothed and baseline corrected. The screening yielded several hits, which are summarized in Appendix Table C.4. Among these hits, four proteins were discovered as novel enzymes. The new enzymes were expressed again in E. coli and were used to synthesize oligosaccharide products at preparative scale. The oligosaccharide preparations and structure characterizations are detailed in the Appendix.

Synthesis of oligosaccharides by new enzymes

Once the enzymatic activities from the screening were confirmed, the new enzymes were expressed again in vivo and purified. Milligram scale synthesis of the new oligosaccharide products was performed to identify their stereochemistry. Briefly, a 3 mL reaction mixture containing the enzyme, Tris-HCl (20 mM, pH 8.0), MnCl2 (10 mM), acceptor (20 mM) and sugar nucleotide donor (25 mM) was incubated for 24 hours at 37 °C. The acceptor and sugar nucleotide used for each glycosyltransferase are listed in Table C.6. The progress of the reactions was monitored by TLC (i-PrOH/H2O/NH4OH = 7:3:2 (v:v:v), stained with anisaldehyde/methanol/H2SO4 = 1:15:2 (v:v:v)) and by MALDI-TOF MS using THAP as matrix. The reactions were quenched by boiling, and the insoluble debris was removed by centrifugation. The resulting mixture was lyophilized and then applied to a Bio-Gel P-2 column using water as the mobile phase. The fractions containing the desired product were pooled, lyophilized, and subjected to NMR analysis.


Kinetic parameters measured by SAMDI

To obtain kinetic constants for the new glycosyltransferases, pull-down assays were carried out.15 In general, the reactions were performed in solution and stopped at different time points. The crude mixtures were then applied to the monolayer to permit immobilization of the substrate and product prior to analysis by SAMDI. We also observed that in the absence of metal ions no product formation could be detected, and we therefore used an EDTA solution to quench the reactions. The solution-phase enzyme reactions were performed in a conventional 384 well plate, and each reaction contained one of the purified enzymes, the sugar donor, the acceptor, Tris-HCl (50 mM, pH 8.0) and MnCl2 (10 mM) in a total volume of 10 µL. The enzymes were used at the following concentrations: BF0009, 0.14 mg mL⁻¹; BF0614, 0.3 mg mL⁻¹; NC_002940.2, 0.3 mg mL⁻¹. The concentrations of donors ranged from 100 µM to 5 mM and those of the acceptors ranged from 250 µM to 5 mM. For each set of concentrations, the reactions were carried out for times ranging from 2 to 30 min at intervals of 2 to 3 min and terminated by adding a mixture of cold ethanol and EDTA (10 mM, 20 µL).

Each reaction mixture was then applied to an individual gold circle of the array (using a volume of 2 µL) that was modified with the alkyne-terminated monolayer. An aqueous solution (1 µL per reaction) containing CuBr (2 mM) and triethylamine (0.5 mM) was applied to each circle and the reactions were incubated at room temperature for 30 min to 6 hours, depending on the concentrations of the sugar azides. The completion of the reactions was monitored by SAMDI. The slide was then rinsed with water, followed by ethanol, and dried under nitrogen. For quantification, the extent of glycosylation (R) was determined from the peak intensities for product (Ip) and substrate (Is) in the SAMDI spectra using the relation R = Ip/(Ip + Is). We confirmed that the measured ratio reflects the actual ratio of the two azido sugars in solution by performing a calibration experiment, which is described in detail in the Appendix. The yield of the glycosylation, calculated from the equation in Appendix Figure C.6A, was plotted against the reaction time. The linear region of the plot was fitted in Microsoft Excel and the slope obtained was used as the initial velocity (v0). Double reciprocal plots of initial velocities are shown in Appendix Figure C.7. The plots used the donors as the variable substrates and the acceptors as the constant substrates. The shape of the plots suggested that all the enzymatic reactions followed a general two-substrate sequential mechanism. Thus the data were fitted in Microsoft Excel using equation (1)

$$ v_0 = \frac{V_{max}[A][B]}{K_{ia}K_b + K_b[A] + K_a[B] + [A][B]} \qquad (1) $$

where [A] is the concentration of the donor, [B] is the concentration of the acceptor, Vmax is the apparent maximum velocity, Ka and Kb represent the cognate Michaelis constants for substrates A and B, respectively, and Kia is the dissociation constant of substrate A. The graphs in Appendix Figure C.6 were obtained using least-squares linear fitting in Microsoft Excel. The data were also fitted in Microsoft Excel with equation (2) to obtain the apparent Michaelis constant KM,apparent:

$$ v_0 = \frac{V_{max}[S]}{K_{M,apparent} + [S]} \qquad (2) $$

where [S] is either the donor or the acceptor and Vmax is the apparent maximum velocity.

The metal activity studies were carried out in a similar fashion and the details can be found in the Appendix.
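The quantification steps above (peak-intensity ratio, then a linear fit of the early time points) can be illustrated with a short Python sketch. It is not the Excel workflow used in this work: it assumes that R can be used directly as the fractional yield (the calibration from Appendix Figure C.6A is skipped), and the function names, intensities and acceptor amount in the example are hypothetical.

```python
def extent_of_glycosylation(i_product, i_substrate):
    """R = Ip / (Ip + Is), from SAMDI peak intensities."""
    total = i_product + i_substrate
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return i_product / total


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_velocity(times_min, yields, acceptor_nmol):
    """Slope of product (nmol) versus time (min) over the linear region."""
    product_nmol = [r * acceptor_nmol for r in yields]
    slope, _ = linear_fit(times_min, product_nmol)
    return slope  # nmol per min


if __name__ == "__main__":
    # Hypothetical (Ip, Is) intensity pairs at 2, 4 and 6 min.
    times = [2.0, 4.0, 6.0]
    peaks = [(100.0, 900.0), (200.0, 800.0), (300.0, 700.0)]
    yields = [extent_of_glycosylation(ip, i_s) for ip, i_s in peaks]
    print(initial_velocity(times, yields, acceptor_nmol=10.0))
```

Only time points from the linear region should be passed to `initial_velocity`; choosing that region is left to the user, as it was in the spreadsheet-based analysis.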

Metal dependence measurement by SAMDI

We screened the new enzymes against a panel of divalent metal ions composed of Mn²⁺, Mg²⁺, Ca²⁺, Mo²⁺, Co²⁺, Fe²⁺, Cu²⁺, Ni²⁺ and Zn²⁺. As a negative control we replaced the metal ions with a 1 mM EDTA solution. Appendix Figure C.8 illustrates the strategies used in this measurement.

The assays for BF0009 were performed directly on -Glucose-terminated SAMs (acceptor 5, Figure C.A). The reaction mixture contained Tris-HCl (50 mM, pH 8.0), divalent metal ion (10 mM) and UDP-GalNAc (5 mM). The final enzyme concentration in the reaction mixture was 0.28 mg mL⁻¹. The reactions were incubated for time intervals ranging from 5 to 90 min. After each time interval the individual gold substrates were rinsed with water and ethanol and dried under a stream of nitrogen, and SAMDI analysis was then performed to determine the reaction yield at each time point.

The assays for BF0614 were performed in solution using thiol-terminated -GlcNAc (acceptor 11, Figure C.7 B) as the acceptor. The reactions were performed in a total volume of 7.5 µL in a cocktail containing Tris-HCl (50 mM, pH 8.0), divalent metal ion (10 mM), -GlcNAc (8.0 mM) and UDP-Gal (2 mM). The final enzyme concentration was 0.06 mg mL⁻¹. The reactions were run for time intervals ranging from 5 to 90 min and then stopped by adding 1 µL of 100 mM EDTA solution. The reaction mixtures from the different time points were incubated on maleimide-terminated SAMs on individual gold substrates for 30 min, and SAMDI analysis was performed to quantify the reaction yields.

The metal dependence assay for NC_002940.2 was performed on -Lactose-terminated SAMs (acceptor 4, Figure C.7 C). The reactions were performed in a cocktail containing Tris-HCl (50 mM, pH 8.0), divalent metal ion (10 mM) and UDP-GlcNAc (1 mM). The reactions were incubated for time intervals ranging from 3 to 50 min. After each time point, the reactions were stopped by rinsing the individual gold substrates with water and ethanol and drying them under a stream of nitrogen, and SAMDI analysis was then performed to determine the reaction yields.

Kinetic parameters measured using traditional radiolabeled donors

Reactions were performed at 37 °C for 30 min in a reaction buffer containing Tris-HCl (20 mM, pH 7.5) and 10 mM MnCl2. The Kapp,m and Vapp,max values are the averages of duplicate experiments. To determine kinetic parameters for the acceptors, the concentration of each acceptor was varied (1 mM, 2 mM, 10 mM and 20 mM) with the donor UDP-D-[6-³H]Hex(NAc) held at 0.3 mM. To determine the Kapp,m value for UDP-Hex(NAc), UDP-D-[6-³H]Hex(NAc) (fixed radioactivity of 10,000 cpm) was supplemented with different amounts of unlabeled UDP-Hex(NAc) to achieve varying final concentrations (0.08, 0.1, 0.3 and 0.6 mM), with the acceptor at a fixed concentration of 20 mM. The parameters Kapp,m and Vapp,max were obtained from the respective Lineweaver-Burk plots and are summarized in Appendix Figure C.6.
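A Lineweaver-Burk analysis of this kind fits 1/v against 1/[S]; the intercept gives 1/Vmax and the slope gives Km/Vmax. The Python sketch below illustrates that calculation only. The function names are hypothetical, and the example rates are synthetic values generated from assumed constants (Km = 0.4 mM, Vmax = 0.2 nmol min⁻¹) at the donor concentrations listed above; they are not measured data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate_mM, velocities):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax and return (Km, Vmax)."""
    if any(s <= 0 for s in substrate_mM) or any(v <= 0 for v in velocities):
        raise ValueError("concentrations and velocities must be positive")
    inv_s = [1.0 / s for s in substrate_mM]
    inv_v = [1.0 / v for v in velocities]
    slope, intercept = linear_fit(inv_s, inv_v)
    if intercept <= 0:
        raise ValueError("non-positive intercept; Vmax is undefined")
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


if __name__ == "__main__":
    donor_mM = [0.08, 0.1, 0.3, 0.6]   # donor concentrations from the text
    true_km, true_vmax = 0.4, 0.2      # assumed constants for synthetic data
    rates = [true_vmax * s / (true_km + s) for s in donor_mM]
    km, vmax = lineweaver_burk(donor_mM, rates)
    print(round(km, 3), round(vmax, 3))
```

Double-reciprocal fits weight the lowest concentrations most heavily, so a direct nonlinear fit of the Michaelis-Menten equation is a common alternative when the data are noisy.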


4.3 Results

Reaction screening

We applied the SAMDI assay to screen 85 bacterial GTs, including 76 putative enzymes that had not been previously characterized, to identify new glycosyltransferase activities. To perform the analysis in a high throughput mode, we prepared glass slides that had a 5 x 10 array of gold-coated islands (Figure 4.2). We prepared 25 oligosaccharide acceptors (Appendix Table C.2) and immobilized individual acceptors on the gold features using either a reaction of a thiol-functionalized carbohydrate with a maleimide-terminated monolayer or an azido-functionalized carbohydrate with an alkyne-terminated monolayer (Appendix Figure C.5). To prepare the recombinant vectors containing the putative GTs, we used ligation independent cloning (LIC) into the pMCSG7 vector, and the recombinant plasmids were used as the templates for in vitro expression.16 Because traditional methods for protein expression and purification are tedious and time-consuming, we used in vitro expression to rapidly prepare the GTs, which were then assayed in unpurified form. The proteins were expressed using a cell-free in vitro E. coli expression system (Appendix Table C.1), and each individual GT was first mixed with one of the 7 sugar donors and then applied to individual gold islands presenting the sugar acceptor (Figure 4.2). All the reactions were performed in a cocktail containing Tris-HCl (50 mM, pH 8.0) and MnCl2 (10 mM). The array was incubated for 2 hours at 37 °C and then rinsed with water and ethanol, and each individual circular region, representing a unique combination of GT, acceptor and donor, was analyzed by SAMDI.


Figure 4.2. A total of 14,875 reactions were performed by individually combining 85 putative GTs, 7 sugar donors (D) and 25 acceptors (A). (a) Reaction mixtures were prepared by mixing each GT with one donor and were then applied to monolayer spots having one acceptor. (b) The reaction mixtures were incubated, rinsed and then the arrays were analyzed by SAMDI to identify those combinations that give a glycosylation reaction.
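The size of the screen follows directly from the three counts given above. The short Python sketch below enumerates the GT, donor and acceptor combinations and, as an illustrative lower bound only, estimates how many 5 x 10 slides would be needed if every island hosted exactly one reaction; that packing assumption and the function names are hypothetical and are not taken from the experimental record.

```python
from itertools import product
from math import ceil

N_GTS, N_DONORS, N_ACCEPTORS = 85, 7, 25   # counts stated in the text
ISLANDS_PER_SLIDE = 5 * 10                  # 5 x 10 array of gold islands


def enumerate_reactions(n_gts, n_donors, n_acceptors):
    """Return an iterator over every (gt, donor, acceptor) index triple."""
    return product(range(n_gts), range(n_donors), range(n_acceptors))


def slides_needed(n_reactions, islands_per_slide=ISLANDS_PER_SLIDE):
    """Minimum slide count, assuming one reaction per island (an assumption)."""
    return ceil(n_reactions / islands_per_slide)


reactions = list(enumerate_reactions(N_GTS, N_DONORS, N_ACCEPTORS))
print(len(reactions))                 # 14875
print(slides_needed(len(reactions)))  # 298
```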

Using this assay, we evaluated the 14,875 combinations of GT, nucleotide donor and immobilized acceptor in less than one week. The complete set of assays required approximately 50 µL of each unpurified enzyme and less than 2 mg of each donor and acceptor. Further, the SAMDI spectra were acquired in an automated mode and required approximately two seconds per assay. This combination of minimal reagent consumption and high throughput was critical to the scale of the experiment. Of the combinations tested, the great majority showed no activity; that is, the SAMDI spectra showed that only the substrate was present on the monolayer. Twelve of the reactions, however, showed new peaks that corresponded to individual glycosylation products. The combinations of enzyme, acceptor, and donor that give glycosylation reactions are summarized in Appendix Table C.4. We identified new GT activities for four enzymes that had not previously been characterized (Figure 4.3). Two of these, BF0009 and BF0614, are from Bacteroides fragilis (B. fragilis), and catalyze the GalNAcylation of α-glucose and the galactosylation of N-acetylglucosamine (GlcNAc), respectively. The two other GTs are NC_002940.2 and AAF28363.1 from Haemophilus ducreyi (H. ducreyi), and transfer GlcNAc from UDP-GlcNAc to lactose and to GlcNAc, respectively. To take one example, the protein expressed from the gene BF0009 was mixed with the donor uridine 5′-diphospho-N-acetylgalactosamine (UDP-GalNAc) and applied to a spot presenting α-glucose (α-Glc). The SAMDI spectrum revealed a new peak at m/z 1337 (Figure 4.3 A), which was 203 Da greater than the peak for the acceptor-terminated alkanethiolate (m/z 1134) and is consistent with the addition of GalNAc to the acceptor.
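The hit-calling logic described here, looking for a new peak offset from the acceptor peak by the mass of one transferred residue, can be sketched in Python as follows. The two nominal mass shifts are those reported in this chapter (+162 for a hexose such as Gal, +203 for an N-acetylhexosamine such as GalNAc); the tolerance, the intensity threshold, the function names and the example peak list are hypothetical, and this is not the software used to process the SAMDI spectra.

```python
# Nominal mass added to the acceptor when one residue is transferred
# (sugar mass minus water). Both values match shifts reported in the text:
# 1296 -> 1458 for galactosylation and 1134 -> 1337 for GalNAc transfer.
RESIDUE_MASS = {"Hex": 162.0, "HexNAc": 203.0}


def expected_product_mz(acceptor_mz, residue):
    """m/z expected for the acceptor species carrying one extra residue."""
    return acceptor_mz + RESIDUE_MASS[residue]


def is_hit(peaks, acceptor_mz, residue, tol=1.0, min_intensity=0.0):
    """True if any (m/z, intensity) peak lies within tol of the product m/z.

    tol and min_intensity are illustrative defaults, not validated settings.
    """
    target = expected_product_mz(acceptor_mz, residue)
    return any(
        abs(mz - target) <= tol and intensity > min_intensity
        for mz, intensity in peaks
    )


if __name__ == "__main__":
    # Hypothetical peak list: acceptor at m/z 1134 plus a new peak near 1337.
    spectrum = [(1134.0, 120.0), (1337.2, 300.0)]
    print(is_hit(spectrum, 1134.0, "HexNAc"))
    print(is_hit(spectrum, 1134.0, "Hex"))
```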

Synthesis and characterization of oligosaccharides produced by the new GTs

After the large-scale screening experiment, we expressed the four new enzymes in E. coli BL21 (DE3) (Appendix Figure C.1) and performed preparative glycosylation reactions to generate milligram quantities of the products to permit the structural characterization of the glycosidic linkages with 1D and 2D NMR (Appendix). We found that BF0009 catalyzes the formation of a β1,3 linkage between GalNAc and glucose. The BF0614 GT joins galactose and GlcNAc through a β1,4 bond, which could make this enzyme an efficient catalyst for the synthesis of the important antigen N-acetyllactosamine (LacNAc). The protein NC_002940.2 joins GlcNAc and lactose through a β1,3 linkage and AAF28363.1 creates a β1,4 linkage between two GlcNAc residues.

Figure 4.3. Four new enzyme activities were discovered in the screen: (a) BF0009 from B. fragilis. Donor: UDP-GalNAc; Acceptor: β-glucoside. (b) BF0614 from B. fragilis. Donor: UDP-Gal; Acceptor: GlcNAc (α and β). (c) NC_002940.2 from H. ducreyi. Donor: UDP-GlcNAc; Acceptor: β-lactose. (d) AAF28363.1 from H. ducreyi. Donor: UDP-GlcNAc; Acceptor: GlcNAc (α and β).

Finally, the set of enzymes that we tested included nine known GTs, of which six were active (Appendix Table C.4). We believe that this fraction provides a fair estimate of the yield of functional protein expression and is likely representative of the yield realized for the other putative enzymes examined in this work, though we have not investigated whether the inactivity arises from improper expression or folding of the enzymes or from inaccessibility of the immobilized substrates.


Biochemical characterization of the new GTs

We selected three GTs that gave new activities in the screen and characterized their kinetic parameters. We again used the SAMDI method but performed assays in a solution-phase format to avoid perturbations that may arise from presentation of the ligands at the surface.14 In this 'pull-down' format, the reactions were performed in solution with an azido-modified oligosaccharide as the acceptor and a sugar nucleotide as the donor (Appendix Table C.5). The reactions were then quenched and applied to a monolayer presenting terminal acetylene to selectively immobilize the substrate and product of the enzyme reaction. For example, we characterized the BF0009-mediated transfer of GalNAc from UDP-GalNAc to an azido-glucose substrate by performing a series of reactions in a 384 well plate, where each well had the enzyme, the glucose azide (at concentrations ranging from 0.25 to 2 mM) and UDP-GalNAc (at concentrations ranging from 0.1 to 5 mM), in a total volume of 5 µL. Further, we performed a time course by setting up identical sets of reactions and stopping each at a distinct time. The quenched reactions were combined with copper bromide and triethylamine and applied to individual islands to allow immobilization of the sugars to the array (Figure 4.4 A). The yields were determined by integrating the mass peaks for the product and substrate (Appendix Figure C.6 A and Figure 4.4 B). We determined the initial rate for each set of donor and acceptor concentrations using five concentrations of donor and four concentrations of acceptor. Kinetic parameters were determined from double reciprocal plots (Appendix Figure C.7 and Figure 4.4 C) and are summarized in Table 4.1. We also performed this experiment using the standard radio-labeling approach and found that those results agreed with our determination of kinetic parameters using the SAMDI pull-down assay (Appendix Table C.7).

Table 4.1. Kinetic parameters for glycosylation reactions mediated by three identified GTs.

Parameters              BF0009                          BF0614                          NC_002940.2
                        UDP-GalNAc      β-Glc azide     UDP-Gal         α-GlcNAc azide  UDP-GlcNAc      β-Lac azide
KMapp (mM)              0.384 (0.039)   4.51 (0.071)    0.446 (0.023)   5.62 (0.068)    0.366 (0.044)   2.89 (0.032)
VMaxapp (nmol min⁻¹)    0.025 (0.013)                   0.22 (0.011)                    0.19 (0.01)

Due to conserved folds and structural motifs, many GTs are known to require divalent metal ions to be active, and we therefore investigated the metal-dependent activities of the GTs that we discovered in the screen.17 We found that addition of EDTA abolished the activities of the three enzymes measured in this study. The GTs are most active in the presence of Mn²⁺ and Mg²⁺, but can also be activated by other divalent metal ions (Table 4.2). For example, BF0009 displayed the highest activity with Mg²⁺ and significant activity with Mn²⁺, Co²⁺ or Mo²⁺, but only minor activity with Cu²⁺ or Ni²⁺, while Ca²⁺, Fe²⁺ and Zn²⁺ elicited no activity.


Figure 4.4. Kinetic analyses of the GTs were performed using a homogeneous assay format followed by a selective immobilization of an azido-tagged acceptor to an alkyne-terminated monolayer (a). (b) A time course for the BF0009-mediated GalNAcylation of azido β-glucose gave initial rates that were then analyzed in double-reciprocal plots (c) to give Vmax and KM,apparent for both substrates.


Table 4.2. Influence of divalent metals on the activity of three GTs

            Relative Activities (%)
Metal       BF0009      BF0614      NC_002940.2
Co²⁺        58 (28)     89 (14)     <1
Cu²⁺        15 (6)      <1          <1
Ca²⁺        <1          <1          57 (17)
Fe²⁺        <1          <1          69 (19)
Mg²⁺        100 (11)    100 (29)    100 (10)
Mn²⁺        71 (17)     54 (19)     85 (19)
Mo²⁺        51 (30)     51 (22)     70 (16)
Ni²⁺        14 (11)     19 (4)      <1
Zn²⁺        <1          <1          <1
EDTA        <1          <1          <1
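Relative activities of the kind listed in Table 4.2 can be obtained by scaling each measured rate to the highest rate in the panel. The Python sketch below assumes that this is how the percentages were normalized, which is consistent with the most active metal being listed as 100% for every enzyme but is not stated explicitly; the function name, the reporting floor and the example rates are hypothetical.

```python
def relative_activities(rates, floor=1.0):
    """Scale rates to the panel maximum (percent); report values below floor as '<1'."""
    top = max(rates.values())
    if top <= 0:
        raise ValueError("at least one rate must be positive")
    result = {}
    for condition, rate in rates.items():
        percent = 100.0 * rate / top
        result[condition] = "<1" if percent < floor else round(percent)
    return result


if __name__ == "__main__":
    # Hypothetical initial rates (arbitrary units) for one enzyme.
    example = {"Mg2+": 4.0, "Mn2+": 2.0, "Zn2+": 0.01, "EDTA": 0.0}
    print(relative_activities(example))
```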

4.4 Discussion

While glycosyltransferases represent one of the most abundant protein families in nature, they are also among the least characterized. The in vitro study of GTs to determine substrate specificities requires significant quantities of protein, a donor and the acceptor, all of which are expensive or time consuming to obtain. Further, many assay formats use undesirable radioactive substrates, require many tedious steps, and are not adaptable to high throughput formats. Moreover, the in vitro characterization of a GT is typically pursued only after a relevant glycosylation function has been identified by other means (e.g., genetic experiments), reflecting the difficulty of discovering these activities by activity profiling. The example we describe here addresses this gap and offers an efficient strategy for discovering novel GTs. The requirement for minimal amounts of enzyme also allows us to use an in vitro expression system, which offers the potential of generating hundreds of proteins quickly, though only at pmol levels. These quantities are sufficient for performing thousands of reactions in the SAMDI-based assay. Finally, the label-free assay used here has the advantage that it avoids the possible interference of labels with enzymatic activities and enables the discovery of unanticipated events.18

In this study, we discovered four bacterial proteins that have distinctive GT activities. It will be interesting to assess their biological roles in vivo. These enzymes can also be efficient catalysts for generating oligosaccharides with novel structures. For example, the protein BF0009 from B. fragilis was particularly significant because the enzyme catalyzed a reaction that yields a linkage that has not previously been observed in bacterial systems. Further, this enzyme recognizes only UDP-GalNAc as the donor substrate among the seven donors screened, making it more selective than the common bovine milk galactosyltransferase, which accepts UDP-Gal, UDP-Glc and UDP-GalNAc.19 B. fragilis causes severe infection, diarrhea, and abscesses in human hosts, though it is not clear whether this novel linkage, GalNAcβ1,3-Glc, exists in B. fragilis and is relevant to the pathogenesis.20

More interesting is the enzyme BF0614, which has demonstrated applicability in the enzymatic synthesis of LacNAc. The traditional chemical synthesis of this disaccharide is briefly illustrated in Figure 4.5 A. Like many traditional chemical approaches to oligosaccharide synthesis, the chemical synthesis of LacNAc requires both an activated donor molecule and a singly deprotected acceptor, which together allow control over the stereochemistry and regiochemistry of the newly formed glycosidic bond. Overall, this approach to LacNAc requires numerous protection, deprotection, and purification steps, thereby limiting the overall yield. An alternative to the chemical synthesis of LacNAc is the newly discovered enzymatic reaction, which uses the donor UDP-Gal, the acceptor GlcNAc, and the enzyme BF0614. Using this process, the reaction is completed in approximately one day and is followed by a single purification step using size exclusion chromatography. In comparison, therefore, one can obtain higher yields of the important LacNAc disaccharide more rapidly (3-4 days versus 2-3 weeks) by exploiting the enzymatic process. However, while the enzymatic process is much more efficient and practical for large scale synthesis, in the case of LacNAc it requires the expensive, commercially available UDP-Gal. Whereas LacNAc is commercially available at approximately $300 per 100 mg, UDP-Gal costs approximately $400 per 100 mg. There are, however, alternatives for obtaining "cheaper" UDP-Gal, which coincidentally are also enzymatic processes.21 One process, which was actually used for the production of LacNAc on an industrial scale, uses an in situ UDP-Gal regeneration system exploiting GalE (UDP-Gal 4-epimerase), which converts UDP-glucose to UDP-galactose by epimerizing the 4' position. UDP-Glc is then regenerated by two enzymes, PK and UDP-Glc pyrophosphorylase, which efficiently generate UDP-Glc from pyruvate and UTP.22 However, the regeneration of UDP-Glc may not be necessary for the small scale synthesis of UDP-Gal, since UDP-Glc is commercially available at approximately $50 per 100 mg. Thus it is conceivable to simply exploit the GalE enzyme in vitro and, in a one-pot reaction, mix UDP-Glc, GalE, BF0614, and GlcNAc to yield LacNAc.23 A last method for obtaining the UDP-Gal donor cheaply uses engineered E. coli expressing the UDP-Gal biosynthetic genes (galT, galU, galK, and ppa), which were utilized in the production of globotriose.24
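The prices quoted above are per unit mass, whereas the enzymatic reaction consumes donor and forms product on a per-mole basis. As a rough illustration only, the sketch below converts the quoted prices to an approximate cost per millimole. The molar masses are assumptions that do not appear in the text (LacNAc as the free disaccharide, the UDP-sugars as their disodium salts), and the comparison assumes quantitative 1:1 conversion of donor to product while ignoring the cost of enzymes and acceptor.

```python
# Prices quoted in the text, in US dollars per 100 mg.
PRICE_PER_100_MG = {"LacNAc": 300.0, "UDP-Gal": 400.0, "UDP-Glc": 50.0}

# Assumed molar masses in g/mol (not given in the text):
# LacNAc as free disaccharide, UDP-sugars as disodium salts.
MOLAR_MASS = {"LacNAc": 383.35, "UDP-Gal": 610.27, "UDP-Glc": 610.27}


def cost_per_mmol(compound):
    """Return the approximate cost (USD) of one millimole of a compound."""
    price_per_mg = PRICE_PER_100_MG[compound] / 100.0
    mg_per_mmol = MOLAR_MASS[compound]  # g/mol is numerically mg/mmol
    return price_per_mg * mg_per_mmol


if __name__ == "__main__":
    for name in ("LacNAc", "UDP-Gal", "UDP-Glc"):
        print(f"{name}: about ${cost_per_mmol(name):.0f} per mmol")
```

Under these assumptions, purchased UDP-Gal costs more per millimole than purchased LacNAc, while UDP-Glc costs considerably less than either, which is consistent with the interest in generating UDP-Gal from UDP-Glc with GalE.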

Regardless of the method used to obtain LacNAc, this disaccharide is potentially useful because of its prevalence in nature. Figure 4.5 C shows an example of an important tetrasaccharide found in nature that contains the LacNAc disaccharide moiety, namely the sialyl Lewis X (sLex) tetrasaccharide. This tetrasaccharide is expressed on monocytes and granulocytes as part of O-glycans, which regulate leukocyte extravasation.25,26 Furthermore, upregulation of sLex has been shown to be related to various phenotypes of pancreas, breast, lung, and colon tumors.27,28 Dissecting sLex reveals the Lewis X (Lex) fucosylated trisaccharide epitope, which is found in many eukaryotes as well as in bacteria such as Helicobacter pylori, and is used by bacterial pathogens to evade the host immune system.29 Thus, large scale synthesis of the LacNAc disaccharide would be advantageous for the further enzymatic synthesis of Lex or sLex oligosaccharides or their derivatives, which could be combined with microarray technologies to scan or profile the interactions between the synthesized antigens and various lectins (carbohydrate binding proteins).30,31

Figure 4.5. A) Traditional chemical synthesis of LacNAc; B) Enzymatic synthesis of LacNAc; C) Sialyl Lewis X (sLex) oligosaccharide (LacNAc highlighted in red).

In summary, this work demonstrates an effective strategy for combining carbohydrate arrays and mass spectrometry for the functional annotation of enzyme classes. The method is significant because it combines immobilized arrays with a true label-free detection method that allows direct read-out of the biochemical activities on the array. Further, the method is compatible with solid-phase assays and also with a 'pull-down' format that allows for homogeneous phase assays. In applying this method to a library of in vitro expressed putative GTs, we discovered four new bacterial enzymes that catalyzed the formation of four distinct oligosaccharides. This method for the high throughput characterization of glycosyltransferase activity offers a new opportunity for the identification of novel and interesting GTs from both bacterial and eukaryotic sources. It will also be important for the functional annotation of other enzyme families. Further, the limited number of activities detected here is likely due in part to our choice to investigate bacterial GTs. Bacterial GTs are inherently biased towards unusual or complicated sugar donors and acceptors; thus, with a small library of donors and acceptors, we likely miss many of these obscure GTs. To circumvent this problem, further work could either expand the library of donors and acceptors and perform the screening robotically, or investigate the more challenging eukaryotic glycosyltransferases, which should require less obscure donor and acceptor molecules.

REFERENCES

(1) Ohtsubo, K.; Marth, J. D. Cell 2006, 126, 855-67.

(2) Li, M.; Liu, X. W.; Shao, J.; Shen, J.; Jia, Q.; Yi, W.; Song, J. K.; Woodward, R.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 378-87.

(3) Breton, C.; Snajdrova, L.; Jeanneau, C.; Koca, J.; Imberty, A. Glycobiology 2006, 16, 29R-37R.

(4) Yi, W.; Shen, J.; Zhou, G.; Li, J.; Wang, P. G. J Am Chem Soc 2008, 130, 14420-1.

(5) Koeller, K. M.; Wong, C. H. Nature 2001, 409, 232-40.

(6) Aharoni, A.; Weiner, L.; Lewis, A.; Ottolenghi, M.; Sheves, M. J Am Chem Soc 2001, 123, 6612-6.

(7) Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res 2009, 37, D233-8.

(8) Wagner, G. K.; Pesnot, T. Chembiochem 2010, 11, 1939-49.

(9) Stols, L.; Gu, M.; Dieckman, L.; Raffen, R.; Collart, F. R.; Donnelly, M. I. Protein Expr Purif 2002, 25, 8-15.

(10) Houseman, B. T.; Mrksich, M. Chem Biol 2002, 9, 443-54.

(11) Min, D. H.; Su, J.; Mrksich, M. Angew Chem Int Ed Engl 2004, 43, 5973-7.

(12) Ban, L.; Mrksich, M. Angew Chem Int Ed Engl 2008, 47, 3396-9.

(13) Su, J.; Mrksich, M. Angew Chem Int Ed Engl 2002, 41, 4715-8.

(14) Fang, J.; Li, J.; Chen, X.; Zhang, Y.; Wang, J.; Guo, Z.; Zhang, W.; Yu, L.; Brew, K.; Wang, P. G. J Am Chem Soc 1998, 120, 6635-8.

(15) Min, D. H.; Yeo, W. S.; Mrksich, M. Anal Chem 2004, 76, 3923-9.

(16) Stols, L.; Gu, M.; Dieckman, L.; Raffen, R.; Collart, F. R.; Donnelly, M. I. Protein Expr Purif 2002, 25, 8-15.

(17) Breton, C.; Oriol, R.; Imberty, A. Glycobiology 1997, 16, 29R-37R.

(18) Mrksich, M. ACS Nano 2008, 2, 7-18.

(19) Palcic, M. M.; Hindsgaul, O. Glycobiology 1991, 1, 205-9.

(20) Cerdeño-Tárraga, A. M.; Patrick, S.; Crossman, L. C.; Blakely, G.; Abratt, V.; Lennard, N.; Poxton, I.; Duerden, B.; Harris, B.; Quail, M. A.; Barron, A.; Clark, L.; Corton, C.; Doggett, J.; Holden, M. T. G.; Larke, N.; Line, A.; Lord, A.; Norbertczak, H.; Ormond, D.; Price, C.; Rabbinowitsch, E.; Woodward, J.; Barrell, B.; Parkhill, J. Science 2005, 307, 1463-5.

(21) Koeller, K. M.; Wong, C. H. Chem Rev 2000, 100, 4465-94.

(22) Wong, C. H.; Pollak, A.; McCurry, S. D.; Sue, J. M.; Knowles, J. R.; Whitesides, G. M. Methods Enzymol 1982, 89 Pt D, 108-21.

(23) Su, D. M.; Eguchi, H.; Yi, W.; Li, L.; Wang, P. G.; Xia, C. Org Lett 2008, 10, 1009-12.

(24) Koizumi, S.; Endo, T.; Tabata, K.; Ozaki, A. Nat Biotechnol 1998, 16, 847-50.

(25) Phillips, M. L.; Nudelman, E.; Gaeta, F. C.; Perez, M.; Singhal, A. K.; Hakomori, S.; Paulson, J. C. Science 1990, 250, 1130-2.

(26) Lowe, J. B.; Stoolman, L. M.; Nair, R. P.; Larsen, R. D.; Berhend, T. L.; Marks, R. M. Cell 1990, 63, 475-84.

(27) Kannagi, R.; Fukushi, Y.; Tachikawa, T.; Noda, A.; Shin, S.; Shigeta, K.; Hiraiwa, N.; Fukuda, Y.; Inamoto, T.; Hakomori, S.; et al. Cancer Res 1986, 46, 2619-26.

(28) Magnani, J. L.; Steplewski, Z.; Koprowski, H.; Ginsburg, V. Cancer Res 1983, 43, 5489-92.

(29) Moran, A. P.; Prendergast, M. M.; Appelmelk, B. J. FEMS Immunol Med Microbiol 1996, 16, 105-15.

(30) Feinberg, H.; Taylor, M. E.; Weis, W. I. J Biol Chem 2007, 282, 17250-8.

(31) Wang, W.; Hu, T.; Frantom, P. A.; Zheng, T.; Gerwe, B.; Del Amo, D. S.; Garret, S.; Seidel, R. D., 3rd; Wu, P. Proc Natl Acad Sci U S A 2009, 106, 16096-101.

CHAPTER 5

CHEMICAL BIOLOGY APPROACH TO INVESTIGATE BACTERIAL POLYSACCHARIDE GLYCOSYLTRANSFERASES

5.1 Promiscuity of GlcNAc pathway towards GlcNAc analogues

5.1.1 Introduction

2-Acetamido-2-deoxy-D-glycopyranosides (N-acetylglucosamine/galactosamine, also called GlcNAc/GalNAc) are widely distributed in living systems as oligosaccharides and glycoconjugates, and play significant roles in a wide range of biological processes.1,2 For example, GlcNAc is a major constituent of the backbone of bacterial peptidoglycan, and both sugars are fundamental components of many important polysaccharides such as glycosaminoglycans.3-7 Hence, there is considerable interest in replacing GlcNAc/GalNAc residues in such polysaccharide chains with unnatural analogues in order to further understand and modulate the targets of these glycosides.8-13 In particular, analogues containing bioorthogonal groups (e.g., reactive ketones, azides, alkynes, etc.) are of great interest since they can be further probed by chemoselective reactions.14-16 Among the various analogues, 2-acetonyl-2-deoxy-D-galactose (2-ketoGal, Figure 5.1) has attracted much attention because it closely resembles GalNAc and, most significantly, contains a ketone handle. This substrate serves as a carbon isostere of GalNAc and has been incorporated into cell surface glycoproteins through metabolic pathways. The ketone epitope was then chemoselectively reacted with biotin hydrazide, stained with fluorescein isothiocyanate (FITC) labeled avidin, and detected by flow cytometry for quantification.17 Moreover, 2-ketoGal, which is transferred from the corresponding donor molecule UDP-2-ketoGal by a mutated glycosyltransferase,18 has also been widely used as a ketone probe for in vitro detection of O-GlcNAc glycoproteins.14-16,19,20

Figure 5.1. Chemical structure of GlcNAc/GalNAc and their 2-carbon isosteres.

However, the corresponding 2-keto isostere of GlcNAc (2-ketoGlc) is missing from the current literature. Bertozzi and co-workers demonstrated that 2-ketoGal was successfully taken up as a substrate for metabolic glycoprotein engineering through the salvage pathway, while the GlcNAc counterpart could not be incorporated.17 One explanation for this observation is that GlcNAc is abundant, so 2-ketoGlc is in strong competition with endogenous GlcNAc. In addition, unlike for GalNAc, current data suggest that a suitable metabolic pathway (salvage pathway), which is critical for the generation of the UDP-sugar donor, is lacking.

In this work, we demonstrate that unnatural 2-ketoGlc can be processed by E. coli and displayed on the cell surface (Figure 5.2.). The incorporated ketone epitope allowed further detection by conjugation with commercially available aminooxybiotin reagent. Through western blot analysis, using purified lipopolysaccharide (LPS), we show that the LPS is modified with the ketone epitope.

5.1.2 Experimental methods

Bacterial strains and reagents

E. coli BL21 (DE3) competent cells [F– ompT hsdSB (rB– mB–) gal dcm (DE3)] were obtained from Stratagene. All reagents were purchased from Sigma unless otherwise noted.

Synthesis of ketosugar

Peracetylated 2-ketoGlc was chemically synthesized from glucal as previously described.21

Labeling live bacterial cells via aniline-catalyzed oxime ligation

Either E. coli BL21 (DE3) or E. coli O86:B7 ∆wecA/∆waaL was inoculated into 200 µL of LB medium supplemented with the sugar analogue (peracetylated GlcNAc was used as a negative control) and grown overnight at 37 °C with shaking at 250 rpm. Cells were collected by centrifugation at 4 °C, washed 3 times with PBS buffer, and subsequently resuspended in a solution containing 20 mM MOPS, pH 6.7, 1 mM N-(aminooxyacetyl)-N'-(D-biotinoyl) hydrazine (Invitrogen), and 10 mM aniline. Cells were then incubated at 4 °C for 90 minutes (on a rocker), after which the cells were harvested by centrifugation and washed 3 times with PBS buffer. Cells were then resuspended in 0.5 mL of PBS, to which 5 µL of FITC conjugated avidin (0.5 mg/mL, Invitrogen) was added. The suspension was incubated at 4 °C on a rocker for 90 minutes, after which the cells were extensively washed with PBS buffer and analyzed by both flow cytometry (BD FACS LSR II) and fluorescence microscopy (Olympus IX81 inverted microscope).

Figure 5.2. Metabolic incorporation of ketone epitope on E. coli surface.

Functional inactivation of wecA and waaL

The wecA and waaL genes were replaced by a kanamycin resistance cassette using the Red recombination system of phage lambda.22,23 To create the knockout strain we used the plasmids provided in the Quick and Easy E. coli Gene Deletion Kit (Gene Bridges). The wecA gene on the chromosome was first replaced with the kanamycin resistance cassette following the manufacturer's protocol, and the replacement was verified by PCR. The kanamycin resistance cassette was then removed by FLP/FLPe expression, and the resulting E. coli O86:B7 ∆wecA was used as the host to create the waaL inactivated strain, again following the manufacturer's protocol. The final E. coli O86:B7 ∆wecA/∆waaL strain was confirmed by PCR as well as by the change in LPS phenotype observed by LPS silver staining.

LPS isolation and analysis

To determine the relative location of the labeled sugar, E. coli O86:B7 ∆wecA/∆waaL was grown with 10 mM of the sugar analogue and subsequently labeled with aminooxybiotin as described above. The LPS was then extracted according to the published proteinase K digestion protocol and analyzed by both SDS-PAGE (with silver staining) and western blot.24 For the western blot analysis, the LPS was separated by SDS-PAGE, transferred to a nitrocellulose membrane, and probed with HRP conjugated streptavidin (1:5000, Thermo Scientific). The blot was developed using the ECL Western Blotting Detection Reagents (GE Healthcare).

5.1.3 Results

Incorporation of 2-ketoGlc in cell surface oligosaccharides

In this study we sought to remodel the bacterial cell surface by introducing an unnatural keto sugar (2-ketoGlc) that can be probed using bioorthogonal chemistry. Previous work by various groups has demonstrated the utility of modifying cell surface oligosaccharides in vivo, typically by exploiting very specific biosynthetic mechanisms, primarily in eukaryotic organisms. More specifically, we incubated 2-ketoGlc with various bacterial strains, with the notion that the sugar may act as a substrate for various enzymes within the cells' oligosaccharide biosynthetic pathways and could eventually be incorporated into the cell surface oligosaccharides. For example, there exist multiple salvage pathways, such as the GalNAc salvage pathway, by which a bacterium can obtain a specific sugar from the external medium and directly convert it to the required sugar nucleotide donor.25

Previously, we have demonstrated that the key sugar nucleotide donor molecule for 2-ketoGlc incorporation could be synthesized in vitro by a bacterial kinase (NahK) and pyrophosphorylase (GlmU).21 To test whether 2-ketoGlc could be incorporated into the cell surface oligosaccharides, we first incubated overnight cultures of E. coli BL21 (DE3) with 10 mM of 2-ketoGlc or peracetylated GlcNAc (negative control). Since the ketone is not a common functional group in living organisms, cells that incorporate 2-ketoGlc into the cell surface oligosaccharides can be easily identified by virtue of bioorthogonal chemistry. Therefore, N-(aminooxyacetyl)-N'-(D-biotinoyl) hydrazine (aminooxybiotin) was added, in the presence of the aniline catalyst at neutral pH, to the overnight cultures both containing and lacking 2-ketoGlc.26 Ketone groups that had been metabolically engineered into the cell surface polysaccharides can react with the aminooxybiotin reagent, resulting in biotin-coated bacterial cells. The biotinylated bacterial cells can be probed by staining with FITC conjugated avidin, and the resulting fluorescence signal can be analyzed by flow cytometry. As shown in Figure 5.3 A, E. coli BL21 (DE3) cells fed with 2-ketoGlc showed an increase in fluorescence intensity compared with control samples. Two negative controls are shown, with the teal line representing E. coli BL21 (DE3) cells that were treated with the aminooxybiotin reagent followed by coupling with FITC conjugated avidin, and the orange line representing the fluorescence intensity of the naked E. coli BL21 (DE3). Cells that were incubated with the biotin linker and then with FITC conjugated avidin, but that lacked the 2-ketoGlc sugar, showed no change in fluorescence (relative to the naked E. coli BL21 (DE3)), suggesting that the aminooxybiotin reagent reacts specifically with the ketone in 2-ketoGlc. A similar result was observed for the E. coli O86:B7 ∆wecA/∆waaL strain (Figure 5.3 B), which is used as a model strain to identify the location of the modified sugar. These results are surprising, given that E. coli has no endogenous GlcNAc-1-kinase such as NahK, which is utilized in vitro for the synthesis of the UDP-GlcNAc donor.21

Fluorescence microscopy imaging further confirmed successful incorporation of the ketone group on the E. coli cell surface (Figure 5.4). White arrows in Figure 5.4 A point to FITC labeled E. coli rods. Taken together, these results suggest that, in vivo, 2-ketoGlc may be processed either as a mock Glc or GlcNAc due to structural similarities, or is possibly epimerized by GalE to either a mock Gal or GalNAc. Thus, the results are difficult to interpret, as it is challenging to identify which biosynthetic pathway(s) 2-ketoGlc travels through.

Figure 5.3. Bacterial cell surface labeling by introduction of 2-ketoGlc, conjugation with aminooxybiotin, and detection by FITC conjugated avidin. The fluorescence signal was analyzed by flow cytometry. A) Fluorescence labeling of 2-ketoGlc fed E. coli BL21 (DE3). Red line: Positive reaction; Teal line: Negative control (no 2-ketoGlc but treated with aminooxybiotin and FITC-avidin); Orange fill: Negative control (naked E. coli BL21 (DE3)). B) Fluorescence labeling of 2-ketoGlc fed E. coli O86:B7 ∆wecA/∆waaL. Red line: Positive reaction; Teal line: Negative control (no 2-ketoGlc but treated with aminooxybiotin and FITC-avidin); Orange fill: Negative control (naked E. coli O86:B7 ∆wecA/∆waaL).

In order to regulate the glycosylation, we attempted to knock out the de novo UDP-GlcNAc biosynthesis pathway, specifically by removing glmM from the chromosome. In a strain of E. coli containing NahK (GlcNAc kinase), we attempted to knock out glmM and drive UDP-GlcNAc biosynthesis through NahK; however, we were unsuccessful. This is likely due in part to the abundance of UDP-GlcNAc required by the cell, such that eliminating this biosynthetic pathway is lethal for the bacterium.

Figure 5.4. Incorporation of 2-ketoGlc or GlcNAc (negative control) into E. coli. A and B: E. coli O86:B7 ∆wecA/∆waaL incubated with peracetylated 2-ketoGlc, labeled with aminooxybiotin, and probed with FITC-avidin; C and D: E. coli O86:B7 ∆wecA/∆waaL incubated with peracetylated GlcNAc, labeled with aminooxybiotin, and probed with FITC-avidin; A and C: Fluorescence images taken under the Olympus inverted microscope, white arrows point to FITC labeled E. coli rods; B and D: phase contrast images taken at the same fields as A and C, respectively.

Lastly, in addition to the flow cytometry results, incubation with 2-ketoGlc appeared to have a significant inhibitory effect on cell growth. No such effects were reported in previous studies in which unnatural sugars were incorporated into either eukaryotic or prokaryotic organisms.

Labeling LPS

Previous work by our lab has used E. coli O86 as a model system for the pinpoint modification of the fucosylated LPS by metabolic pathway engineering, and in this work we sought to expand our ability to modify this model strain with 2-ketoGlc.27 Structurally, 2-ketoGlc is very similar to GlcNAc, with the exception of the 2' amide, and thus may be incorporated as GlcNAc in the LPS. The R3-type core oligosaccharide contains multiple GlcNAc residues, several Glc residues, and several other hexoses, and therefore offers several possible glycosyltransferases that may tolerate and incorporate this unnatural sugar when synthesizing the core oligosaccharide.28

To examine one possible location where the unnatural sugar is incorporated, we created a double knockout strain of E. coli O86:B7 that is functionally missing wecA and waaL. These enzymes are involved in the biosynthetic pathway of the LPS, specifically in the synthesis and ligation of the O-antigen, respectively.29,30 By removing the wecA gene from the chromosome, we remove the glycosyltransferase responsible for the first sugar of the O-antigen, thus removing the O-antigen from the LPS. Furthermore, by disrupting the waaL gene, we ensure that no oligosaccharide is ligated onto the LPS core oligosaccharide. Thus, the resulting LPS displayed on the cell surface will contain only the core oligosaccharides, which can be verified by the LPS phenotype observed in Figure 5.5. Figure 5.5, Lane 1 shows the wild type LPS phenotype for E. coli O86:B7, whereas Lane 2 shows the LPS phenotype for the E. coli O86:B7 ∆wecA/∆waaL strain. The single band observed in lane 2 is characteristic of the core oligosaccharide, which contains several GlcNAc residues.30 Thus, to determine one possible location where 2-ketoGlc may be incorporated, E. coli O86:B7 ∆wecA/∆waaL was fed 2-ketoGlc and incubated overnight, after which the LPS was isolated and analyzed by western blot. Both the E. coli O86:B7 ∆wecA/∆waaL cells fed 2-ketoGlc and those lacking 2-ketoGlc were incubated with aminooxybiotin, which biotinylates cells that have incorporated 2-ketoGlc into the cell surface oligosaccharides. After extensive washing of the cells to remove unreacted aminooxybiotin, the LPS was isolated and separated electrophoretically by SDS-PAGE. After transfer of the LPS to a nitrocellulose membrane, the presence of biotinylated LPS was probed using HRP conjugated streptavidin. Figure 5.5, Lane 2 shows the single band of core oligosaccharide for E. coli O86:B7 ∆wecA/∆waaL, and in Figure 5.5, Lane 4 we observe the same band in the western blot. Furthermore, the absence of this band in lane 3 suggests that E. coli O86:B7 ∆wecA/∆waaL incorporates 2-ketoGlc into the core oligosaccharides.

5.1.4 Conclusion

In conclusion, we demonstrate that 2-ketoGlc is incorporated in some form into the LPS core oligosaccharide; however, due to the complexity of the biosynthetic pathways with which this sugar could conceivably interact, it remains a challenge to identify the exact location that this sugar occupies on the LPS.31 The key to this success, however, likely resides in the promiscuity of the sugar nucleotide biosynthetic pathways that create the sugar nucleotide donor UDP-2-ketoGlc. Furthermore, this result may also demonstrate the promiscuity of the glycosyltransferases responsible for the synthesis of the oligosaccharides that are eventually displayed on the cell surface. In conjunction with the incorporation of modified fucose derivatives into the LPS of E. coli O86:B7, we further demonstrate proof of principle that other unique monosaccharides can be engineered into the cell surface oligosaccharides of bacteria.27 Unlike in our previous work, where we could control the exact location of the modification to the LPS, the multiple pathways that 2-ketoGlc may take make it challenging to identify exactly which residue in the core oligosaccharides is modified (Figure 5.6). Once the sugar is localized in the cytoplasm, there are many processing enzymes, in addition to the desired UDP-GlcNAc biosynthetic enzymes, that can interact with 2-ketoGlc and possibly affect the location at which it is displayed. One such enzyme, as illustrated in Chapter 4, is GalE, the 4'-epimerase, which can epimerize the 4'-hydroxy group of GlcNAc to give GalNAc; the product can then interact with the GalNAc salvage pathway, thereby displaying 2-ketoGal on the cell surface.32 Also in agreement with previous observations, we observed a competition effect, whereby it was critical to have an extracellular concentration of 2-ketoGlc of 10 mM. This is likely important for competing with the high intracellular concentrations of UDP-GlcNAc.17 Further studies are ongoing to incorporate different monosaccharides (GalNAc, GlcNAc) bearing different modifications (e.g., azido, alkyne) at different positions, which may expand our understanding of the biosynthetic pathways with which 2-acetamido-2-deoxy-D-glycopyranoside analogues interact.

Figure 5.5. Lane 1: LPS phenotype for E. coli O86:B7, visualized by silver staining; lane 2: LPS phenotype for E. coli O86:B7 ∆wecA/∆waaL, visualized by silver staining; lane 3: LPS phenotype for E. coli O86:B7 ∆wecA/∆waaL incubated with peracetylated GlcNAc, visualized by HRP conjugated streptavidin; lane 4: LPS phenotype for E. coli O86:B7 ∆wecA/∆waaL incubated with peracetylated 2-ketoGlc, visualized by HRP conjugated streptavidin.

Figure 5.6. GlcNAc biosynthetic pathways.

5.2 Investigation of Donor Specificity of Bacterial α1,2-Fucosyltransferases

5.2.1 Introduction

As discussed in Chapter 2, fucosylation is an important enzymatic reaction that commonly results in terminal fucose moieties displayed on various oligosaccharides, and it is these terminal fucose moieties that are essential for the biological properties of the corresponding oligosaccharides.33 Based on the new glycosidic linkage formed (typically α1,2-, α1,3-, α1,4-, or α1,6-), FucTs can be classified into four different subfamilies. α1,2-FucTs belong to glycosyltransferase family 11 (http://www.cazy.org/fam/acc_GT.html) and are responsible for the transfer of fucose to galactose (Gal), forming an α1,2-linkage. FUT1 and FUT2 are two human α1,2-FucTs that are responsible for the biosynthesis of different H-antigens.34 Importantly, only a few α1,2-FucTs have been cloned from bacterial sources and subsequently characterized: WbsJ from E. coli O127, WbnK from E. coli O86, WbiQ from E. coli O127, and FutC from Helicobacter pylori.35-38 Of these, WbsJ and FutC have broad acceptor substrate specificities and have demonstrated applicability in the synthesis of relevant fucose containing oligosaccharides.35,37 Furthermore, recombinant WbnK and WbiQ were readily produced in E. coli and their reactions were carried out on a ten to twenty milligram scale; however, both FucTs have strict acceptor substrate specificity. Both showed activity only on the type III/IV precursor (Gal-β1,3-GalNAc),36 thereby synthesizing an H-antigen derivative. The H-antigen, defined by the terminal Fuc-α-1,2-Gal structure, constitutes the basic structure of the human ABO blood group system. The H-antigen was found to serve as a binding site for Campylobacter jejuni, one of the common causes of bacterial diarrhea,39 and for Norwalk virus, which is responsible for most cases of gastroenteritis.40 Large quantities of H-antigen oligosaccharides in breast milk were shown to protect infants against these pathogens via competitive inhibition of the binding between the pathogens and the epithelial ligands. Thus, the use of synthetic H-antigen saccharides as anti-infection therapeutics has been suggested.41

While each of the four FucTs mentioned above has been extensively characterized biochemically, especially regarding acceptor substrate specificity, little is known about the tolerance of these enzymes, and of GTs in general, to modified sugar nucleotide donors. GTs are a well characterized class of enzymes that contain two binding domains, one for the acceptor and one for the donor. When characterizing a putative GT, it is common to test an array of sugar nucleotide donors (e.g., UDP-Gal vs UDP-Glc), as well as a panel of possible acceptors. However, little work has been done to demonstrate the tolerance, or lack thereof, for modified sugar nucleotides (such as a modified GDP-fucose).

FucTs, belonging to the Leloir pathway, require GDP-fucose as a nucleotide activated substrate. Chemically synthesized GDP-fucose was used in most reported synthesis reactions. However, even after decades of improvement, the chemical procedures remain complicated and inefficient because of the stereoselectivity problem and the instability of the intermediate.42,43 There exist two biosynthetic pathways for GDP-fucose synthesis, namely the de novo pathway and the salvage pathway. With enzymes from the de novo pathway of E. coli, production was efficient, but the expensive starting material GDP-mannose and the cofactor NADPH had to be used.44 The enzymes involved in the salvage pathway have been identified in many eukaryotes, such as Homo sapiens45,46 and Arabidopsis thaliana.47 However, these enzymes have not been applied in practical GDP-fucose synthesis because they have low activity or are too challenging to obtain. The salvage pathway had not been identified in prokaryotes until it was found to exist in a human symbiont, Bacteroides fragilis.48 A bifunctional enzyme (FKP) with fucose kinase and fucose-1-phosphate guanylyltransferase activities is responsible for GDP-fucose biosynthesis in the salvage pathway in this organism. FKP catalyzes the conversion of fucose to GDP-fucose in two steps, with fucose, ATP, and GTP as starting materials (Scheme 5.1).48 Furthermore, the substrate specificity of FKP has been determined, and the enzyme readily accepts fucose derivatives bearing modifications at the 6' position.27 Knowing this, Wang and co-workers utilized FKP to synthesize GDP-fucose derivatives, which were then used as donors for the enzymatic synthesis of Lewis X glycan derivatives, demonstrating the applicability of exploiting the GDP-fucose salvage pathway in vitro for the milligram scale synthesis of fucose-modified Lewis X oligosaccharides.49

Scheme 5.1. FKP catalyzed reaction for the synthesis of GDP-fucose and derivatives.

SAMDI (Self-Assembled Monolayer analyzed by MALDI-TOF mass spectrometry) is emerging as a true label-free technique and has been widely used in various enzyme assays.50 In this work we sought to demonstrate that, by combining SAMDI technology with the synthesis of GDP-fucose derivatives, we could create a platform for determining the donor specificity of four α-1,2-FucTs. While there exist many unique and functional assays for determining the substrate specificity of a putative glycosyltransferase, most traditional and common approaches suffer from the limited availability of necessary reagents, such as unique fluorophores or radiolabeled substrates.51

5.2.2 Experimental Methods

Materials, plasmids, protein expression

All materials used in this work were purchased from Sigma Aldrich unless otherwise noted. WbiQ, WbwK, and WbsJ were constructed as N-terminal GST-fusion proteins, whereas FutC was constructed as an N-terminal His6-fusion protein in the pET-22a plasmid; each transferase was expressed and purified following the published work.35,36,38,52 FKP was expressed and purified as an N-terminal His6-fusion protein as described previously.27

Instruments

1H NMR spectra were recorded on a Bruker DRX 500 spectrometer. Chemical shifts and coupling constants are reported in ppm and Hz, respectively. SAMDI MS was performed on an Applied Biosystems 4800 Plus MALDI TOF/TOF mass spectrometer. Gold and titanium thin films were evaporated using an electron beam evaporator from Thermionics.

Synthesis of fucose derivatives and corresponding GDP-sugars

The fucose derivatives illustrated in Figure 5.7 were chemically synthesized following previously established protocols.49,53 The FKP catalyzed reactions were performed in 20 mM Tris-HCl, pH 7.5, containing 10 mM MgSO4, 20 mM ATP, 20 mM GTP, 20 mM L-fucose or fucose analogue, and inorganic pyrophosphatase. The reactions were incubated at 37 °C for 24 hours, and the formation of GDP-fucose or the GDP-analogue was monitored by thin-layer chromatography (TLC) and further confirmed by ESI-MS. The GDP-fucose sugars were purified by P2 size exclusion chromatography, followed by selective anion and cation exchange to form the pure disodium salt.

Determination of relative activity for four FucTs

The following enzyme concentrations were used for the fucosyltransferase activity assays: WbwK: 2.8 mg mL-1; WbsJ: 0.38 mg mL-1; WbiQ: 1.8 mg mL-1; FutC: 0.077 mg mL-1. Each set of reactions was performed in 50 mM Tris-HCl (pH 7.5) containing 0.5 mM thiolated Gal-β1,3-GalNAc and 1 mM of one of the GDP-sugars in a total volume of 100 μL. The reactions were incubated at 37 °C, and aliquots were taken at the following time points: 2, 4, 6, 8, 10, 12, 15, 30, 60, 90, 120, 150, 180, 210, 240, and 300 min, and 24 hours. At each time point, about 3 μL was removed from the tube and mixed with 3 μL of cold ethanol, and the resulting mixture was stored at -20 °C until the aliquot from the last time point had been taken. About 3 μL of the mixture from each time point was transferred onto an individual gold island presenting the maleimide group. The immobilization reaction was allowed to proceed for 10 min at room temperature. The completion of the reaction was confirmed by SAMDI. The entire plate was rinsed with water and ethanol and dried under a stream of nitrogen. The slide was then glued to a conventional MALDI metal plate and subjected to SAMDI analysis. For quantification, the extent of glycosylation (R) was determined from the peak intensities for the product (Ip) and the disaccharide substrate (Is) in the SAMDI spectra using the relation R = Ip/(Ip + Is). We confirmed that this ratio of peak intensities of the trisaccharide product and disaccharide substrate represents the ratio of their relative concentrations in solution. The yield of glycosylation at each time point can thus be calculated from the equation y = R × [A]0, where [A]0 is the initial concentration of the disaccharide substrate. The yield for each sugar donor was plotted against the reaction time. The linear region of the plot was fitted in Microsoft Excel, and the slope obtained was used as the initial velocity. The initial velocity for GDP-fucose with each enzyme was normalized to 100%. The relative activities of the other GDP-sugars are reported as initial velocities relative to that obtained with GDP-fucose.
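The quantification described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the analysis used in this work, where the linear fits were performed in Microsoft Excel. The function names and the example numbers are hypothetical, and the choice of which time points belong to the linear region is left to the user.

```python
def extent_of_glycosylation(i_product, i_substrate):
    """R = Ip / (Ip + Is), computed from SAMDI peak intensities."""
    total = i_product + i_substrate
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return i_product / total


def product_yield(r, a0):
    """y = R * [A]0, where [A]0 is the initial acceptor concentration."""
    return r * a0


def initial_velocity(times, yields):
    """Least-squares slope of yield versus time.

    The caller is assumed to pass only points from the linear region.
    """
    n = len(times)
    if n < 2 or n != len(yields):
        raise ValueError("need at least two matching (time, yield) points")
    mean_t = sum(times) / n
    mean_y = sum(yields) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, yields))
    return sxy / sxx


def relative_activity(v_donor, v_gdp_fucose):
    """Initial velocity as a percentage of the GDP-fucose velocity."""
    return 100.0 * v_donor / v_gdp_fucose


if __name__ == "__main__":
    # Hypothetical peak intensities (Ip, Is) at 2, 4 and 6 min.
    times_min = [2, 4, 6]
    intensities = [(100, 900), (200, 800), (300, 700)]
    a0_mM = 0.5  # acceptor concentration used in the assay
    yields_mM = [
        product_yield(extent_of_glycosylation(ip, i_s), a0_mM)
        for ip, i_s in intensities
    ]
    print(initial_velocity(times_min, yields_mM), "mM per min")
```

Dividing the slope obtained for a modified donor by the slope obtained for GDP-fucose with the same enzyme, as in relative_activity, gives the percentage scale on which the results are reported.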

5.2.3 Results and discussion

First, the fucose analogues shown in Figure 5.7 were prepared by traditional chemical synthesis. All of these fucose derivatives were previously shown to be tolerated by FKP, and thus provide a suitable panel of substrates with which to compare and contrast the donor specificities. Following Scheme 5.1, milligram quantities of the GDP-fucose sugars were obtained from the FKP catalyzed reaction. As previously reported, each fucosyltransferase investigated in this work readily accepts the T-antigen mimic acceptor (Gal-β1,3-GalNAc), and in this work a thiol-terminated T-antigen mimic (Gal-β1,3-GalNAc-SH) was used as the basis for comparing the donor substrate specificity of each FucT. Furthermore, WbiQ and WbwK exhibit stringent acceptor substrate specificity, and since the endogenous Und-PP-sugar acceptors are not readily obtainable, Gal-β1,3-GalNAc-SH was the most logical acceptor for an overall comparison of the donor substrate specificities.

Next, the donors were evaluated for their relative activities against the panel of α1,2-FucTs. The glycosylation reactions were performed in solution with the thiolated Gal-β1,3-GalNAc as the acceptor and one of the analogues as the donor. The reactions were stopped at different time points by removing an aliquot from the solution and adding cold ethanol to it. This mixture was spotted directly onto a gold island bearing a monolayer that presented terminal maleimide groups among tri(ethylene glycol) groups (Figure 5.8). In this way, a Michael addition took place between the thiolated sugars and the maleimide groups, and the monolayer served to immobilize both the trisaccharide product and the unreacted disaccharide acceptor on the gold surface. This procedure allows for SAMDI analysis, from which the extent of glycosylation can be determined. Figure 5.8 shows a representative SAMDI spectrum for a reaction performed with GDP-L-galactose, the thiolated acceptor, and FutC, which was stopped at 15 min. The disaccharide substrate was partially converted to the trisaccharide product, with peaks at m/z 1330 and 1493 corresponding to the sodium adducts of the substrate and product, respectively.

Figure 5.7. Six fucose analogues used in the chemoenzymatic synthesis of the respective GDP-fucose derivatives.

Using this method, we evaluated the relative activities of all six sugar donors against the four fucosyltransferases. We prepared reaction mixtures in Eppendorf tubes, where each tube contained one of the fucosyltransferases, one of the GDP-sugars (1 mM) and the thiolated acceptor (0.5 mM). We also confirmed that the immobilization reaction was complete in 5 minutes and that both the product and the Gal-β1,3-GalNAc substrate reacted with the monolayers with similar kinetics. The monolayer was then analyzed by SAMDI, and the yield of glycosylation was measured from the ratio of the peak intensity of the product to the sum of the intensities for the product and the disaccharide substrate (supporting material). The yields for each donor were plotted against the reaction time. Initial velocities were determined from the slopes obtained by least-squares linear regression. The initial velocity for GDP-fucose was normalized to 100% and the relative velocities were calculated for the other five donors. The results are summarized in Table 5.1, and the SAMDI spectra can be found in the Appendix.

Figure 5.8. Method for determining the relative activity of each fucose derivative for each of the four α1,2-FucTs.

Table 5.1. Donor substrate specificity of four FucTs

Fucose donor (derivative)     WbiQ            WbsJ            WbwK            FutC
GDP-fucose (1)                100%            100%            100%            100%
GDP-D-arabinose (2)           0%              67%             46%             77%
GDP-L-galactose (3)           28.8% (2.3%)    45.2% (3.2%)    48.7% (3.0%)    79.2% (3.5%)
GDP-6'-azido-fucose (4)       0%              0%              0%              33%
GDP-6'-fluoro-fucose (5)      10.6% (2.6%)    36.2% (2.7%)    26.7% (2.2%)    68.8% (3.1%)
GDP-6'-alkyne-fucose (6)      0%              8.2% (2.4%)     7.9% (3.4%)     12.9% (2.6%)
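For readers who wish to work with these numbers, the relative activities in Table 5.1 can be held in a small lookup structure and queried per enzyme. The following is a minimal sketch only: the values are transcribed from the table, while the variable names, the function name, and the threshold parameter are hypothetical and were not part of the original analysis.

```python
# Relative initial velocities (% of GDP-fucose), transcribed from Table 5.1.
RELATIVE_ACTIVITY = {
    "GDP-fucose (1)": {"WbiQ": 100.0, "WbsJ": 100.0, "WbwK": 100.0, "FutC": 100.0},
    "GDP-D-arabinose (2)": {"WbiQ": 0.0, "WbsJ": 67.0, "WbwK": 46.0, "FutC": 77.0},
    "GDP-L-galactose (3)": {"WbiQ": 28.8, "WbsJ": 45.2, "WbwK": 48.7, "FutC": 79.2},
    "GDP-6'-azido-fucose (4)": {"WbiQ": 0.0, "WbsJ": 0.0, "WbwK": 0.0, "FutC": 33.0},
    "GDP-6'-fluoro-fucose (5)": {"WbiQ": 10.6, "WbsJ": 36.2, "WbwK": 26.7, "FutC": 68.8},
    "GDP-6'-alkyne-fucose (6)": {"WbiQ": 0.0, "WbsJ": 8.2, "WbwK": 7.9, "FutC": 12.9},
}

NATURAL_DONOR = "GDP-fucose (1)"


def tolerated_donors(enzyme, threshold=0.0):
    """List the modified donors whose relative activity exceeds threshold (%).

    The threshold is a hypothetical convenience; with the default of 0.0,
    any donor with detectable activity counts as tolerated.
    """
    return [
        donor
        for donor, row in RELATIVE_ACTIVITY.items()
        if donor != NATURAL_DONOR and row[enzyme] > threshold
    ]


if __name__ == "__main__":
    for fuct in ("WbiQ", "WbsJ", "WbwK", "FutC"):
        print(fuct, tolerated_donors(fuct))
```

Raising the threshold, for example to 10, is one way to separate donors that are used efficiently from those that give only marginal turnover.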

From Table 5.1, it is evident that the studied fucosyltransferases exhibit a clear preference for the standard GDP-fucose donor. Beginning with WbiQ, it appears that WbiQ exhibits strict donor specificity, as the only two modified GDP-donors that it accepts with any detectable activity are derivatives 3 and 5. This result mirrors the result seen for the acceptor substrate specificity, as WbiQ only demonstrated in vitro activity towards one acceptor.38 In contrast, WbwK tolerates the D-arabinose and alkyne fucose derivatives, despite having an acceptor substrate specificity similar to that of WbiQ. However, this result does not correlate with the in vivo data, in which WbwK tolerates 6'-azido-fucose.27 A possible explanation for the observed result is that WbwK exhibits low activity in vitro, whereas in vivo WbwK uses a different acceptor and is membrane associated, and thus operates under optimal conditions for the enzymatic reaction. WbsJ exhibits enzymatic activity comparable to that of WbwK, tolerating the same donors with approximately the same relative activities. Lastly, FutC appears to tolerate all the chosen GDP-donors to some degree, displaying very promiscuous donor substrate specificity. As with WbiQ, this result correlates well with the results observed for the acceptor substrate specificity, where FutC exhibited promiscuous acceptor substrate specificity as well.52

Previously, the crystal structure of an α1,3-FucT was reported, and the authors of that work suggest that the transferase interacts significantly more with the GDP moiety than with the fucose moiety.54 While the evolutionary relationship between the α1,2-FucTs and α1,3-FucTs is currently unknown, it is clear that the α1,2-FucTs associate tightly with the GDP moiety and have a clear binding domain for the fucose moiety.35

Commonly, especially in GT-B type glycosyltransferases, the binding pocket is formed between the two Rossmann-like domains, where one domain houses the acceptor substrate pocket and the other serves as the donor substrate pocket. α1,2-FucTs are inverting, GT-B type transferases and, like most transferases, have been modeled three dimensionally; the models exhibit two distinct Rossmann fold-like domains. Each transferase contains the typical HxRRxD motif, which is responsible for coordinating the pyrophosphate moiety of the GDP-donor; however, because of the lack of three-dimensional structures, the importance of the remaining amino acids is currently unknown (Figure 5.9). Due to the low sequence similarity at the N-terminus, it was originally proposed that this domain may serve as the acceptor binding domain, because previous results showed that WbwK and WbsJ displayed very different acceptor specificities. However, domain swapping between WbwK and WbsJ has shown that two domains near the C-terminus likely regulate the binding of the acceptor molecules, which may suggest that the N-terminus regulates the binding of the donor molecule.36 From the substrate specificity results in Table 5.1, we propose that the fucosyltransferases have clear binding domains for the fucose moiety, as some GDP-sugars are tolerated and others are not. However, the exact reasons for the donor specificities cannot be elucidated until a three-dimensional structure is determined.

In conclusion, we demonstrated that the knowledge of the donor substrate specificity of four different α1,2-fucosyltransferases can be expanded easily and quickly. By exploiting the salvage pathway for GDP-fucose synthesis in vitro, various GDP-donors were easily synthesized and further utilized by the different FucTs. Furthermore, by utilizing SAMDI as the method of detection, we can quickly (and robotically) determine the activities of each transferase with minimal amounts of reagents. Because it uses minimal amounts of sugar donors that are potentially challenging to obtain, this technology exhibits a significant advantage over other GT assays. This technology could be applied to any type of glycosyltransferase, provided that methods for creating the derivatized donors are available. Furthermore, other biosynthetic pathways have been used in vitro for the synthesis of modified sugar nucleotide donors, such as UDP-GalNAc and UDP-GlcNAc analogues, and this technology could be used to investigate the donor specificity of some GalNAc and/or GlcNAc transferases.55,56 In addition, the SAMDI technology has the potential to be fully automated using robotics, and this type of study could be expanded to full donor and acceptor substrate characterizations, which would eliminate the need for the complex (fluorophore-labeled) or radiolabeled substrates that are commonly used for investigating glycosyltransferase substrate specificities.51 Lastly, it is conceivable that this type of technology could be quickly adapted for the screening of potential inhibitors of glycosyltransferases.

Figure 5.9. Sequence alignment of WbiQ, WbwK, WbsJ, and FutC.

REFERENCES

(1) Dwek, R. A. Chem. Rev. (Washington, D. C.) 1996, 96, 683-720.

(2) Zachara, N. E.; Hart, G. W. Chem. Rev. (Washington, D. C.) 2002, 102, 431-438.

(3) van Heijenoort, J. Glycobiology 2001, 11, 25R-36R.

(4) Barreteau, H.; Kovac, A.; Boniface, A.; Sova, M.; Gobec, S.; Blanot, D. FEMS Microbiol Rev 2008, 32, 168-207.

(5) Laurent, T. C.; Fraser, J. R. Faseb J 1992, 6, 2397-404.

(6) Salmivirta, M.; Lidholt, K.; Lindahl, U. Faseb J 1996, 10, 1270-9.

(7) Mittal, N.; Sanyal, S. N. American Journal of Biomedical Sciences 2010, 2, 190-201.

(8) Hang, H. C.; Yu, C.; Pratt, M. R.; Bertozzi, C. R. J. Am. Chem. Soc. 2004, 126, 6-7.

(9) Kristova, V.; Martinkova, L.; Husakova, L.; Kuzma, M.; Rauvolfova, J.; Kavan, D.; Pompach, P.; Bezouska, K.; Kren, V. J. Biotechnol. 2005, 115, 157-166.

(10) Vocadlo, D. J.; Hang, H. C.; Kim, E.-J.; Hanover, J. A.; Bertozzi, C. R. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 9116-9121.

(11) Saxon, E.; Luchansky, S. J.; Hang, H. C.; Yu, C.; Lee, S. C.; Bertozzi, C. R. J. Am. Chem. Soc. 2002, 124, 14893-14902.

(12) Gurcel, C.; Vercoutter-Edouart, A.-S.; Fonbonne, C.; Mortuaire, M.; Salvador, A.; Michalski, J.-C.; Lemoine, J. Anal. Bioanal. Chem. 2008, 390, 2089-2097.

(13) Laughlin, S. T.; Baskin, J. M.; Amacher, S. L.; Bertozzi, C. R. Science (Washington, DC, U. S.) 2008, 320, 664-667.

(14) Rexach, J. E.; Clark, P. M.; Hsieh-Wilson, L. C. Nat. Chem. Biol. 2008, 4, 97-106.

(15) Khidekel, N.; Arndt, S.; Lamarre-Vincent, N.; Lippert, A.; Poulin-Kerstien, K. G.; Ramakrishnan, B.; Qasba, P. K.; Hsieh-Wilson, L. C. J. Am. Chem. Soc. 2003, 125, 16162-16163.

(16) Khidekel, N.; Ficarro, S. B.; Peters, E. C.; Hsieh-Wilson, L. C. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 13132-13137.

(17) Hang, H. C.; Bertozzi, C. R. J. Am. Chem. Soc. 2001, 123, 1242-1243.

(18) Ramakrishnan, B.; Qasba, P. K. J. Biol. Chem. 2002, 277, 20833-20839.

(19) Khidekel, N.; Ficarro, S. B.; Clark, P. M.; Bryan, M. C.; Swaney, D. L.; Rexach, J. E.; Sun, Y. E.; Coon, J. J.; Peters, E. C.; Hsieh-Wilson, L. C. Nat. Chem. Biol. 2007, 3, 339-348.

(20) Rexach, J. E.; Rogers, C. J.; Yu, S. H.; Tao, J.; Sun, Y. E.; Hsieh-Wilson, L. C. Nat Chem Biol 2010, 6, 645-51.

(21) Cai, L.; Guan, W.; Chen, W.; Wang, P. G. J Org Chem 2010, 75, 3492-4.

(22) Yu, D.; Ellis, H. M.; Lee, E. C.; Jenkins, N. A.; Copeland, N. G.; Court, D. L. Proc Natl Acad Sci U S A 2000, 97, 5978-83.

(23) Datsenko, K. A.; Wanner, B. L. Proc Natl Acad Sci U S A 2000, 97, 6640-5.

(24) Abeyrathne, P. D.; Lam, J. S. Mol Microbiol 2007, 65, 1345-59.

(25) Baskin, J. M.; Dehnert, K. W.; Laughlin, S. T.; Amacher, S. L.; Bertozzi, C. R. Proc Natl Acad Sci U S A 2010, 107, 10360-5.

(26) Zeng, Y.; Ramya, T. N.; Dirksen, A.; Dawson, P. E.; Paulson, J. C. Nat Methods 2009, 6, 207-9.

(27) Yi, W.; Liu, X.; Li, Y.; Li, J.; Xia, C.; Zhou, G.; Zhang, W.; Zhao, W.; Chen, X.; Wang, P. G. Proc Natl Acad Sci U S A 2009, 106, 4207-12.

(28) Kaniuk, N. A.; Vinogradov, E.; Li, J.; Monteiro, M. A.; Whitfield, C. J Biol Chem 2004, 279, 31237-50.

(29) Lehrer, J.; Vigeant, K. A.; Tatar, L. D.; Valvano, M. A. J Bacteriol 2007, 189, 2618-28.

(30) Abeyrathne, P. D.; Daniels, C.; Poon, K. K.; Matewish, M. J.; Lam, J. S. J Bacteriol 2005, 187, 3002-12.

(31) Milewski, S.; Gabriel, I.; Olchowy, J. Yeast 2006, 23, 1-14.

(32) Hang, H. C.; Yu, C.; Kato, D. L.; Bertozzi, C. R. Proc Natl Acad Sci U S A 2003, 100, 14846-51.

(33) Becker, D. J.; Lowe, J. B. Glycobiology 2003, 13, 41R-53R.

(34) Oriol, R. J Immunogenet 1990, 17, 235-45.

(35) Li, M.; Liu, X. W.; Shao, J.; Shen, J.; Jia, Q.; Yi, W.; Song, J. K.; Woodward, R.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 378-87.

(36) Li, M.; Shen, J.; Liu, X.; Shao, J.; Yi, W.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 11590-7.

(37) Daniel, B. S.; Yu-Nong, L.; Chun-Hung, L. Advanced Synthesis & Catalysis 2008, 350, 2313-2321.

(38) Pettit, N.; Styslinger, T.; Mei, Z.; Han, W.; Zhao, G.; Wang, P. G. Biochem Biophys Res Commun 2010, 402, 190-5.

(39) Ruiz-Palacios, G. M.; Cervantes, L. E.; Ramos, P.; Chavez-Munguia, B.; Newburg, D. S. J Biol Chem 2003, 278, 14112-20.

(40) Le Pendu, J. Adv Exp Med Biol 2004, 554, 135-43.

(41) Newburg, D. S. J Nutr 1997, 127, 980S-984S.

(42) Adelhorst, K.; Whitesides, G. M. Carbohydr Res 1993, 242, 69-76.

(43) Ichikawa, Y.; Sim, M. M.; Wong, C. H. J Org Chem 1992, 57, 2943-2946.

(44) Albermann, C.; Piepersberg, W.; Wehmeier, U. F. Carbohydr Res 2001, 334, 97-103.

(45) Pastuszak, I.; Ketchum, C.; Hermanson, G.; Sjoberg, E. J.; Drake, R.; Elbein, A. D. J Biol Chem 1998, 273, 30165-74.

(46) Hinderlich, S.; Berger, M.; Blume, A.; Chen, H.; Ghaderi, D.; Bauer, C. Biochem Biophys Res Commun 2002, 294, 650-4.

(47) Kotake, T.; Hojo, S.; Tajima, N.; Matsuoka, K.; Koyama, T.; Tsumuraya, Y. J Biol Chem 2008, 283, 8125-35.

(48) Coyne, M. J.; Reinap, B.; Lee, M. M.; Comstock, L. E. Science 2005, 307, 1778-81.

(49) Wang, W.; Hu, T.; Frantom, P. A.; Zheng, T.; Gerwe, B.; Del Amo, D. S.; Garret, S.; Seidel, R. D., 3rd; Wu, P. Proc Natl Acad Sci U S A 2009, 106, 16096-101.

(50) Min, D.-H.; Yeo, W.-S.; Mrksich, M. Analytical Chemistry 2004, 76, 3923-3929.

(51) Wagner, G. K.; Pesnot, T. Chembiochem 2010, 11, 1939-49.

(52) Wang, G.; Boulton, P. G.; Chan, N. W.; Palcic, M. M.; Taylor, D. E. Microbiology 1999, 145 (Pt 11), 3245-53.

(53) Yi, W.; Shen, J.; Zhou, G.; Li, J.; Wang, P. G. J Am Chem Soc 2008, 130, 14420-1.

(54) Sun, H. Y.; Lin, S. W.; Ko, T. P.; Pan, J. F.; Liu, C. L.; Lin, C. N.; Wang, A. H.; Lin, C. H. J Biol Chem 2007, 282, 9973-82.

(55) Guan, W.; Cai, L.; Fang, J.; Wu, B.; George Wang, P. Chem Commun (Camb) 2009, 6976-8.

(56) Guan, W.; Cai, L.; Wang, P. G. Chemistry 2010, 16, 13343-5.

CHAPTER 6

CONCLUSIONS AND PERSPECTIVES

The work accomplished in chapters 2 and 3 provides a traditional biochemical characterization of several bacterial glycosyltransferases. By determining the order of sequential glycosylation for E. coli O128 and E. coli O127, this work lays the groundwork for large-scale in vitro synthesis of the repeating units, which can be used either for investigating the downstream enzymatic reactions or possibly for creating a pure, structurally defined polysaccharide-based vaccine. Furthermore, since the transferases involved in O-antigen biosynthesis were shown to interact in vitro, inhibitors of this interaction may provide a means of controlling or limiting O-antigen biosynthesis in vivo, thereby reducing the bacterium's pathogenicity. In addition to the O-antigen biosynthesis investigation, WbnI was shown to exhibit enzymatic activity in the absence of a divalent metal cation, unlike the eukaryotic transferases of the same CAZy family. Further work can be performed to determine why this metal independence occurs and where on the evolutionary timescale this genotype change occurred.

The work discussed in chapter 4 is one of the first contributions towards assigning a function to a putative glycosyltransferase without any prior knowledge of the enzyme's function. Prior work in the field of glycosyltransferase functional assignment typically applies prior knowledge of the putative GTs when experimentally determining the GT's function in vitro. This work, however, has demonstrated proof of concept for assigning an enzymatic function knowing only that a particular gene encodes a putative glycosyltransferase.

While the number of "hits" observed in this work was relatively low, this was likely not attributable to the methodology, but instead to a lack of suitable acceptors and donors. Thus, future work on the characterization of bacterial glycosyltransferases can be performed by expanding the donor and acceptor substrate libraries to include substrates more similar to the endogenous bacterial substrates. Furthermore, future work will investigate the functional assignment of eukaryotic glycosyltransferases using the developed technologies, which should require less complex donor and acceptor molecules. However, the protein expression of the eukaryotic transferases will be a significant hurdle to overcome, as it presents problems that are not observed for prokaryotic glycosyltransferase expression. Firstly, cDNA libraries are needed to obtain full-length glycosyltransferase clones, which is a time-consuming and expensive obstacle. Also, the expression of the "typical" membrane-bound eukaryotic glycosyltransferase will be challenging, as the expression of membrane-bound enzymes may not be easily performed in vitro.

The work described in chapter 5 complements previous work done by the Wang lab, both in imaging cell surface oligosaccharides in vivo and in investigating the chemoenzymatic synthesis and application of unique sugar donor substrates. Previously, the Wang lab demonstrated that the cell surface oligosaccharides of E. coli O86 were able to incorporate modified fucose moieties, thus allowing the visualization of these sugars through bioorthogonal chemistry. In this work, the E. coli O86 model system was expanded to allow the incorporation of a GlcNAc derivative into the cell surface oligosaccharides. Further work with this model strain could incorporate other modified monosaccharides, such as GlcNAz or GalNAz, into the cell surface oligosaccharides, allowing for alternative methods of visualization. Lastly, in chapter 5, the determination of the donor substrate specificity of four different fucosyltransferases again provides proof of concept: the biosynthetic pathways of sugar donor synthesis can be exploited in vitro, after which the chemoenzymatically synthesized substrates can be used for determining the donor substrate specificity of a transferase. Due to the method of detection, this idea can be expanded to other sugar nucleotides, as only milligram amounts of the donor substrate are required, which is advantageous since some of the biosynthetic pathways may be more challenging to reconstitute in vitro.

BIBLIOGRAPHY

Abeyrathne, P. D.; Daniels, C.; Poon, K. K.; Matewish, M. J.; Lam, J. S. J Bacteriol 2005, 187, 3002-12.

Abeyrathne, P. D.; Lam, J. S. Mol Microbiol 2007, 65, 1345-59.

Adelhorst, K.; Whitesides, G. M. Carbohydr Res 1993, 242, 69-76.

Agard, N. J.; Baskin, J. M.; Prescher, J. A.; Lo, A.; Bertozzi, C. R. ACS Chem Biol 2006, 1, 644-8.

Aharoni, A.; Thieme, K.; Chiu, C. P.; Buchini, S.; Lairson, L. L.; Chen, H.; Strynadka, N. C.; Wakarchuk, W. W.; Withers, S. G. Nat Methods 2006, 3, 609-14.

Aharoni, A.; Weiner, L.; Lewis, A.; Ottolenghi, M.; Sheves, M. J Am Chem Soc 2001, 123, 6612-6.

Ahsen, O.; Voigtmann, U.; Klotz, M.; Nifantiev, N.; Schottelius, A.; Ernst, A.; Muller-Tiemann, B.; Parczyk, K. Anal Biochem 2008, 372, 96-105.

Albermann, C.; Piepersberg, W.; Wehmeier, U. F. Carbohydr Res 2001, 334, 97-103.

Alfaro, J. A.; Zheng, R. B.; Persson, M.; Letts, J. A.; Polakowski, R.; Bai, Y.; Borisova, S. N.; Seto, N. O.; Lowary, T. L.; Palcic, M. M.; Evans, S. V. J Biol Chem 2008, 283, 10097-108.

Amer, A. O.; Valvano, M. A. Microbiology 2001, 147, 3015-25.

Amer, A. O.; Valvano, M. A. Microbiology 2002, 148, 571-82.

Andrianopoulos, K.; Wang, L.; Reeves, P. R. J Bacteriol 1998, 180, 998-1001.

Ban, L.; Mrksich, M. Angew Chem Int Ed Engl 2008, 47, 3396-9.

Barreteau, H.; Kovac, A.; Boniface, A.; Sova, M.; Gobec, S.; Blanot, D. FEMS Microbiol Rev 2008, 32, 168-207.

Baskin, J. M.; Dehnert, K. W.; Laughlin, S. T.; Amacher, S. L.; Bertozzi, C. R. Proc Natl Acad Sci U S A 2010, 107, 10360-5.

Becker, D. J.; Lowe, J. B. Glycobiology 2003, 13, 41R-53R.

Blixt, O.; Allin, K.; Bohorov, O.; Liu, X.; Andersson-Sand, H.; Hoffmann, J.; Razi, N. Glycoconjugate Journal 2008, 25, 59-68.

Blixt, O.; Razi, N. Methods Enzymol 2006, 415, 137-53.

Boix, E.; Zhang, Y.; Swaminathan, G. J.; Brew, K.; Acharya, K. R. J Biol Chem 2002, 277, 28310-8.

Bourne, Y.; Henrissat, B. Curr Opin Struct Biol 2001, 11, 593-600.

Breton, C.; Imberty, A. Curr Opin Struct Biol 1999, 9, 563-71.

Breton, C.; Oriol, R.; Imberty, A. Glycobiology 1997, 16, 29R-37R.

Breton, C.; Snajdrova, L.; Jeanneau, C.; Koca, J.; Imberty, A. Glycobiology 2006, 16, 29R-37R.

Brew, K.; Tumbale, P.; Acharya, K. R. J Biol Chem 2010, 285, 37121-7.

Bryan, M. C.; Lee, L. V.; Wong, C. H. Bioorg Med Chem Lett 2004, 14, 3185-8.

Brzeska, H.; Guag, J.; Remmert, K.; Chacko, S.; Korn, E. D. J Biol Chem 2010, 285, 5738-47.

Bu, S.; Li, Y.; Zhou, M.; Azadin, P.; Zeng, M.; Fives-Taylor, P.; Wu, H. J Bacteriol 2008, 190, 1256-66.

Burkhart, F.; Zhang, Z.; Wacowich-Sgarbi, S.; Wong, C. H. Angew Chem Int Ed Engl 2001, 40, 1274-1277.

Cai, C. S.; Zhu, Y. Z.; Zhong, Y.; Xin, X. F.; Jiang, X. G.; Lou, X. L.; He, P.; Qin, J. H.; Zhao, G. P.; Wang, S. Y.; Guo, X. K. BMC Microbiol 2010, 10, 67.

Cai, L.; Guan, W.; Chen, W.; Wang, P. G. J Org Chem 2010, 75, 3492-4.

Campbell, C. T.; Sampathkumar, S. G.; Yarema, K. J. Mol Biosyst 2007, 3, 187-94.

Campbell, J. A.; Davies, G. J.; Bulone, V.; Henrissat, B. Biochem J 1997, 326 (Pt 3), 929-39.

Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res 2009, 37, D233-8.

Cerdero-Tarraga, A. M.; Patrick, S.; Crossman, L. C.; Blakely, G.; Abratt, V.; Lennard, N.; Poxton, I.; Duerden, B.; Harris, B.; Quail, M. A.; Barron, A.; Clark, L.; Corton, C.; Doggett, J.; Holden, M. T. G.; Larke, N.; Line, A.; Lord, A.; Norbertczak, H.; Ormond, D.; Price, C.; Rabbinowitsch, E.; Woodward, J.; Barrell, B.; Parkhill, J. Science 2005, 307, 1463-1465.

Chang, P. V.; Prescher, J. A.; Hangauer, M. J.; Bertozzi, C. R. J Am Chem Soc 2007, 129, 8400-1.

Chazalet, V.; Uehara, K.; Geremia, R. A.; Breton, C. J Bacteriol 2001, 183, 7067-75.

Chen, X.; Liu, Z.; Zhang, J.; Zhang, W.; Kowal, P.; Wang, P. G. Chembiochem 2002, 3, 47-53.

Chiu, C. P.; Watts, A. G.; Lairson, L. L.; Gilbert, M.; Lim, D.; Wakarchuk, W. W.; Withers, S. G.; Strynadka, N. C. Nat Struct Mol Biol 2004, 11, 163-70.

Cole, C. L.; Hansen, S. U.; Barath, M.; Rushton, G.; Gardiner, J. M.; Avizienyte, E.; Jayson, G. C. PLoS One 2010, 5, e11644.

Comstock, L. E.; Kasper, D. L. Cell 2006, 126, 847-50.

Coyne, M. J.; Reinap, B.; Lee, M. M.; Comstock, L. E. Science 2005, 307, 1778-81.

Daniel, B. S.; Yu-Nong, L.; Chun-Hung, L. Advanced Synthesis & Catalysis 2008, 350, 2313-2321.

Datsenko, K. A.; Wanner, B. L. Proc Natl Acad Sci U S A 2000, 97, 6640-5.

de Graffenried, C. L.; Bertozzi, C. R. Curr Opin Cell Biol 2004, 16, 356-63.

Duda, K. A.; Lindner, B.; Brade, H.; Leimbach, A.; Brzuszkiewicz, E.; Dobrindt, U.; Holst, O. Microbiology 2011.

Dwek, R. A. Chem. Rev. (Washington, D. C.) 1996, 96, 683-720.

Evrard, B.; Balestrino, D.; Dosgilbert, A.; Bouya-Gachancard, J. L.; Charbonnel, N.; Forestier, C.; Tridon, A. Infect Immun 2010, 78, 210-9.

Fang, J.; Li, J.; Chen, X.; Zhang, Y.; Wang, J.; Guo, Z.; Zhang, W.; Yu, L.; Brew, K.; Wang, P. G. Journal of the American Chemical Society 1998, 120, 6635-6638.

Feinberg, H.; Taylor, M. E.; Weis, W. I. J Biol Chem 2007, 282, 17250-8.

Feizi, T. Immunol Rev 2000, 173, 79-88.

Feng, L.; Han, W.; Wang, Q.; Bastin, D. A.; Wang, L. Vet Microbiol 2005, 106, 241-8.

Freinkman, E.; Chng, S. S.; Kahne, D. Proc Natl Acad Sci U S A 2010, 108, 2486-91.

Gastinel, L. N.; Bignon, C.; Misra, A. K.; Hindsgaul, O.; Shaper, J. H.; Joziasse, D. H. Embo J 2001, 20, 638-49.

Guan, W.; Cai, L.; Fang, J.; Wu, B.; George Wang, P. Chem Commun (Camb) 2009, 6976-8.

Guan, W.; Cai, L.; Wang, P. G. Chemistry 2010, 16, 13343-5.

Gurcel, C.; Vercoutter-Edouart, A.-S.; Fonbonne, C.; Mortuaire, M.; Salvador, A.; Michalski, J.-C.; Lemoine, J. Anal. Bioanal. Chem. 2008, 390, 2089-2097.

Guttenplan, S. B.; Blair, K. M.; Kearns, D. B. PLoS Genet 2010, 6, e1001243.

Hang, H. C.; Bertozzi, C. R. J. Am. Chem. Soc. 2001, 123, 1242-1243.

Hang, H. C.; Yu, C.; Kato, D. L.; Bertozzi, C. R. Proc Natl Acad Sci U S A 2003, 100, 14846-51.

Hang, H. C.; Yu, C.; Pratt, M. R.; Bertozzi, C. R. J. Am. Chem. Soc. 2004, 126, 6-7.

Hantke, K. Curr Opin Microbiol 2001, 4, 172-7.

Hashimoto, K.; Madej, T.; Bryant, S. H.; Panchenko, A. R. J Mol Biol 2010, 399, 196-206.

Heissigerova, H.; Breton, C.; Moravcova, J.; Imberty, A. Glycobiology 2003, 13, 377-86.

Helenius, A.; Aebi, M. Annu Rev Biochem 2004, 73, 1019-49.

Helm, J. S.; Hu, Y.; Chen, L.; Gross, B.; Walker, S. J Am Chem Soc 2003, 125, 11168-9.

Hidaka, M.; Fushinobu, S.; Honda, Y.; Wakagi, T.; Shoun, H.; Kitaoka, M. J Biochem 2011, 147, 237-44.

Hinderlich, S.; Berger, M.; Blume, A.; Chen, H.; Ghaderi, D.; Bauer, C. Biochem Biophys Res Commun 2002, 294, 650-4.

Hitoshi, S.; Kusunoki, S.; Kanazawa, I.; Tsuji, S. J Biol Chem 1995, 270, 8844-50.

Holst, O.; Bock, K.; Brade, L.; Brade, H. Eur J Biochem 1995, 229, 194-200.

Houseman, B. T.; Mrksich, M. Chem Biol 2002, 9, 443-54.

Ichikawa, Y.; Sim, M. M.; Wong, C. H. J Org Chem 1992, 57, 2943-2946.

Ihara, H.; Ikeda, Y.; Toma, S.; Wang, X.; Suzuki, T.; Gu, J.; Miyoshi, E.; Tsukihara, T.; Honke, K.; Matsumoto, A.; Nakagawa, A.; Taniguchi, N. Glycobiology 2007, 17, 455-66.

Islam, S. T.; Taylor, V. L.; Qi, M.; Lam, J. S. MBio 2011, 1.

Joiner, K. A. Curr Top Microbiol Immunol 1985, 121, 99-133.

Julien, S.; Krzewinski-Recchi, M. A.; Harduin-Lepers, A.; Gouyer, V.; Huet, G.; Le Bourhis, X.; Delannoy, P. Glycoconj J 2001, 18, 883-93.

Julien, S.; Picco, G.; Sewell, R.; Vercoutter-Edouart, A. S.; Tarp, M.; Miles, D.; Clausen, H.; Taylor-Papadimitriou, J.; Burchell, J. M. Br J Cancer 2009, 100, 1746-54.

Ohtsubo, K.; Marth, J. D. Cell 2006, 126, 855-867.

Kanegasaki, S.; Wright, A. Proc Natl Acad Sci U S A 1970, 67, 951-8.

Kaniuk, N. A.; Vinogradov, E.; Li, J.; Monteiro, M. A.; Whitfield, C. J Biol Chem 2004, 279, 31237-50.

Kannagi, R.; Fukushi, Y.; Tachikawa, T.; Noda, A.; Shin, S.; Shigeta, K.; Hiraiwa, N.; Fukuda, Y.; Inamoto, T.; Hakomori, S.; et al. Cancer Res 1986, 46, 2619-26.

Khidekel, N.; Arndt, S.; Lamarre-Vincent, N.; Lippert, A.; Poulin-Kerstien, K. G.; Ramakrishnan, B.; Qasba, P. K.; Hsieh-Wilson, L. C. J. Am. Chem. Soc. 2003, 125, 16162-16163.

Khidekel, N.; Ficarro, S. B.; Clark, P. M.; Bryan, M. C.; Swaney, D. L.; Rexach, J. E.; Sun, Y. E.; Coon, J. J.; Peters, E. C.; Hsieh-Wilson, L. C. Nat. Chem. Biol. 2007, 3, 339-348.

Khidekel, N.; Ficarro, S. B.; Peters, E. C.; Hsieh-Wilson, L. C. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 13132-13137.

Kim, T. H.; Sebastian, S.; Pinkham, J. T.; Ross, R. A.; Blalock, L. T.; Kasper, D. L. J Biol Chem 2010, 285, 27839-49.

Koeller, K. M.; Wong, C. H. Nature 2001, 409, 232-40.

Koeller, K. M.; Wong, C. H. Chem Rev 2000, 100, 4465-94.

Kohn, M.; Breinbauer, R. Angew Chem Int Ed Engl 2004, 43, 3106-16.

Koizumi, S.; Endo, T.; Tabata, K.; Ozaki, A. Nat Biotechnol 1998, 16, 847-50.

Kopp, M.; Rupprath, C.; Irschik, H.; Bechthold, A.; Elling, L.; Muller, R. Chembiochem 2007, 8, 813-9.

Kos, V.; Whitfield, C. J Biol Chem 2010, 285, 19668-87.

Kotake, T.; Hojo, S.; Tajima, N.; Matsuoka, K.; Koyama, T.; Tsumuraya, Y. J Biol Chem 2008, 283, 8125-35.

Kristova, V.; Martinkova, L.; Husakova, L.; Kuzma, M.; Rauvolfova, J.; Kavan, D.; Pompach, P.; Bezouska, K.; Kren, V. J. Biotechnol. 2005, 115, 157-166.

Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu Rev Biochem 2008, 77, 521-55.

Larsen, R. D.; Ernst, L. K.; Nair, R. P.; Lowe, J. B. Proc Natl Acad Sci U S A 1990, 87, 6674-8.

Laughlin, S. T.; Baskin, J. M.; Amacher, S. L.; Bertozzi, C. R. Science 2008, 320, 664-7.

Laurent, T. C.; Fraser, J. R. Faseb J 1992, 6, 2397-404.

Le Pendu, J. Adv Exp Med Biol 2004, 554, 135-43.

Lehle, L.; Strahl, S.; Tanner, W. Angew Chem Int Ed Engl 2006, 45, 6802-18.

Lehrer, J.; Vigeant, K. A.; Tatar, L. D.; Valvano, M. A. J Bacteriol 2007, 189, 2618-28.

Lerouge, I.; Vanderleyden, J. FEMS Microbiol Rev 2002, 26, 17-47.

Letts, J. A.; Rose, N. L.; Fang, Y. R.; Barry, C. H.; Borisova, S. N.; Seto, N. O.; Palcic, M. M.; Evans, S. V. J Biol Chem 2006, 281, 3625-32.

Li, J. J.; Bugg, T. D. Chem Commun (Camb) 2004, 182-3.

Li, M.; Liu, X. W.; Shao, J.; Shen, J.; Jia, Q.; Yi, W.; Song, J. K.; Woodward, R.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 378-87.

Li, M.; Shen, J.; Liu, X.; Shao, J.; Yi, W.; Chow, C. S.; Wang, P. G. Biochemistry 2008, 47, 11590-7.

Liu, D.; Cole, R. A.; Reeves, P. R. J Bacteriol 1996, 178, 2102-7.

Liu, X. W.; Xia, C.; Li, L.; Guan, W. Y.; Pettit, N.; Zhang, H. C.; Chen, M.; Wang, P. G. Bioorg Med Chem 2009, 17, 4910-5.

Liu, Y.; Chan, Y. M.; Wu, J.; Chen, C.; Benesi, A.; Hu, J.; Wang, Y.; Chen, G. Chembiochem 2011, 12, 685-90.

Lowe, J. B.; Stoolman, L. M.; Nair, R. P.; Larsen, R. D.; Berhend, T. L.; Marks, R. M. Cell 1990, 63, 475-84.

Luchansky, S. J.; Goon, S.; Bertozzi, C. R. Chembiochem 2004, 5, 371-4.

Lundborg, M.; Modhukur, V.; Widmalm, G. Glycobiology 2010, 20, 366-8.

Magnani, J. L.; Steplewski, Z.; Koprowski, H.; Ginsburg, V. Cancer Res 1983, 43, 5489-92.

Martinez-Duncker, I.; Mollicone, R.; Candelier, J. J.; Breton, C.; Oriol, R. Glycobiology 2003, 13, 1C-5C.

Milewski, S.; Gabriel, I.; Olchowy, J. Yeast 2006, 23, 1-14.

Min, D.-H.; Yeo, W.-S.; Mrksich, M. Analytical Chemistry 2004, 76, 3923-3929.

Min, D. H.; Su, J.; Mrksich, M. Angew Chem Int Ed Engl 2004, 43, 5973-7.

Mittal, N.; Sanyal, S. N. American Journal of Biomedical Sciences 2010, 2, 190-201.

Montoya-Peleaz, P. J.; Riley, J. G.; Szarek, W. A.; Valvano, M. A.; Schutzbach, J. S.; Brockhausen, I. Bioorg Med Chem Lett 2005, 15, 1205-11.

Moran, A. P. Carbohydr Res 2008, 343, 1952-65.

Moran, A. P.; Prendergast, M. M.; Appelmelk, B. J. FEMS Immunol Med Microbiol 1996, 16, 105-15.

Mrksich, M. ACS Nano 2008, 2, 7-18.

Mullegger, J.; Chen, H. M.; Warren, R. A.; Withers, S. G. Angew Chem Int Ed Engl 2006, 45, 2585-8.

Newburg, D. S. J Nutr 1997, 127, 980S-984S.

Nguyen, H. P.; Seto, N. O.; Cai, Y.; Leinala, E. K.; Borisova, S. N.; Palcic, M. M.; Evans, S. V. J Biol Chem 2003, 278, 49191-5.

Nishihara, S.; Iwasaki, H.; Nakajima, K.; Togayachi, A.; Ikehara, Y.; Kudo, T.; Kushi, Y.; Furuya, A.; Shitara, K.; Narimatsu, H. Glycobiology 2003, 13, 445-55.

Ohtsubo, K.; Marth, J. D. Cell 2006, 126, 855-67.

Oriol, R. J Immunogenet 1990, 17, 235-45.

Oriol, R.; Samuelsson, B. E.; Messeter, L. J Immunogenet 1990, 17, 279-99.

Orntoft, T. F.; Greenwell, P.; Clausen, H.; Watkins, W. M. Gut 1991, 32, 287-93.

Owicki, J. C. J Biomol Screen 2000, 5, 297-306.

Pak, J. E.; Arnoux, P.; Zhou, S.; Sivarajah, P.; Satkunarajah, M.; Xing, X.; Rini, J. M. J Biol Chem 2006, 281, 26693-701.

Pak, J. E.; Rini, J. M. Methods Enzymol 2006, 416, 30-48.

Palcic, M. M. Curr Opin Chem Biol 2011.

Palcic, M. M. Methods Enzymol 1994, 230, 300-16.

Palcic, M. M.; Hindsgaul, O. Glycobiology 1991, 1, 205-209.

Palcic, M. M.; Pierce, M.; Hindsgaul, O. Methods Enzymol 1994, 247, 215-27.

Pang, P. C.; Tissot, B.; Drobnis, E. Z.; Sutovsky, P.; Morris, H. R.; Clark, G. F.; Dell, A. J Biol Chem 2007, 282, 36593-602.

Park, S.; Shin, I. Org Lett 2007, 9, 1675-8.

Pastuszak, I.; Ketchum, C.; Hermanson, G.; Sjoberg, E. J.; Drake, R.; Elbein, A. D. J Biol Chem 1998, 273, 30165-74.

Patenaude, S. I.; Seto, N. O.; Borisova, S. N.; Szpacenko, A.; Marcus, S. L.; Palcic, M. M.; Evans, S. V. Nat Struct Biol 2002, 9, 685-90.

Persson, M.; Letts, J. A.; Hosseini-Maaf, B.; Borisova, S. N.; Palcic, M. M.; Evans, S. V.; Olsson, M. L. J Biol Chem 2007, 282, 9564-70.

Persson, M.; Palcic, M. M. Anal Biochem 2008, 378, 1-7.

Pettit, N.; Styslinger, T.; Mei, Z.; Han, W.; Zhao, G.; Wang, P. G. Biochem Biophys Res Commun 2010, 402, 190-5.

Phillips, M. L.; Nudelman, E.; Gaeta, F. C.; Perez, M.; Singhal, A. K.; Hakomori, S.; Paulson, J. C. Science 1990, 250, 1130-2.

Raetz, C. R.; Reynolds, C. M.; Trent, M. S.; Bishop, R. E. Annu Rev Biochem 2007, 76, 295-329.

Raetz, C. R.; Whitfield, C. Annu Rev Biochem 2002, 71, 635-700.

Ramakrishnan, B.; Qasba, P. K. J. Biol. Chem. 2002, 277, 20833-20839.

Randriantsoa, M.; Drouillard, S.; Breton, C.; Samain, E. FEBS Lett 2007, 581, 2652-6.

Rao, M.; Tvaroska, I. Proteins 2001, 44, 428-34.

Rexach, J. E.; Clark, P. M.; Hsieh-Wilson, L. C. Nat. Chem. Biol. 2008, 4, 97-106.

Rexach, J. E.; Rogers, C. J.; Yu, S. H.; Tao, J.; Sun, Y. E.; Hsieh-Wilson, L. C. Nat Chem Biol 2010, 6, 645-51.

Riley, J. G.; Xu, C.; Brockhausen, I. Carbohydr Res 2010, 345, 586-97.

Rivas, M.; Miliwebsky, E.; Balbi, L.; Garcia, B.; Leardini, N.; Tous, M.; Chillemi, G.; Baschkier, A.; Strugo, L. Medicina (B Aires) 2000, 60, 249-52.

Robbins, P. W.; Bray, D.; Dankert, B. M.; Wright, A. Science 1967, 158, 1536-42.

Rosch, J. W.; Gao, G.; Ridout, G.; Wang, Y. D.; Tuomanen, E. I. Mol Microbiol 2009, 72, 12-25.

Roychoudhury, R.; Pohl, N. L. Curr Opin Chem Biol 2010, 14, 168-73.

Ruiz-Palacios, G. M.; Cervantes, L. E.; Ramos, P.; Chavez-Munguia, B.; Newburg, D. S. J Biol Chem 2003, 278, 14112-20.

Saksouk, N.; Pelosi, L.; Colin-Morel, P.; Boumedienne, M.; Abdian, P. L.; Geremia, R. A. Biochem J 2005, 389, 63-72.

Salmivirta, M.; Lidholt, K.; Lindahl, U. Faseb J 1996, 10, 1270-9.

Samuel, G.; Reeves, P. Carbohydr Res 2003, 338, 2503-19.

Sarnesto, A.; Kohlin, T.; Hindsgaul, O.; Thurin, J.; Blaszczyk-Thurin, M. J Biol Chem 1992, 267, 2737-44.

Saxon, E.; Luchansky, S. J.; Hang, H. C.; Yu, C.; Lee, S. C.; Bertozzi, C. R. J. Am. Chem. Soc. 2002, 124, 14893-14902.

Sears, P.; Wong, C. H. Science 2001, 291, 2344-50.

Seko, A.; Yamashita, K. Glycobiology 2005, 15, 943-51.

Sengupta, P.; Bhattacharyya, T.; Majumder, M.; Chatterjee, B. P. FEMS Immunol Med Microbiol 2000, 28, 133-7.

Sengupta, P.; Bhattacharyya, T.; Shashkov, A. S.; Kochanowski, H.; Basu, S. Carbohydr Res 1995, 277, 283-90.

Seto, N. O.; Palcic, M. M.; Compston, C. A.; Li, H.; Bundle, D. R.; Narang, S. A. J Biol Chem 1997, 272, 14133-8.

Shaikh, F. A.; Withers, S. G. Biochem Cell Biol 2008, 86, 169-77.

Shao, J.; Li, M.; Jia, Q.; Lu, Y.; Wang, P. G. FEBS Lett 2003, 553, 99-103.

Soya, N.; Shoemaker, G. K.; Palcic, M. M.; Klassen, J. S. Glycobiology 2009, 19, 1224-34.

Stein, D.; Lin, Y.-N.; Lin, C.-H. Advanced Synthesis & Catalysis 2008, 350, 2313-2321.

Stenutz, R.; Weintraub, A.; Widmalm, G. FEMS Microbiol Rev 2006, 30, 382-403.

Stols, L.; Gu, M.; Dieckman, L.; Raffen, R.; Collart, F. R.; Donnelly, M. I. Protein Expr Purif 2002, 25, 8-15.

Su, D. M.; Eguchi, H.; Yi, W.; Li, L.; Wang, P. G.; Xia, C. Org Lett 2008, 10, 1009-12.

Su, J.; Mrksich, M. Angew Chem Int Ed Engl 2002, 41, 4715-8.

Sun, H. Y.; Lin, S. W.; Ko, T. P.; Pan, J. F.; Liu, C. L.; Lin, C. N.; Wang, A. H.; Lin, C. H. J Biol Chem 2007, 282, 9973-82.

Takahashi, T.; Ikeda, Y.; Tateishi, A.; Yamaguchi, Y.; Ishikawa, M.; Taniguchi, N. Glycobiology 2000, 10, 503-10.

Taniguchi, N.; Honke, K.; Fukuda, M. Handbook of Glycosyltransferases and Related Genes; Springer: Tokyo, 2002.

Taniguchi, N.; Nishikawa, A.; Fujii, S.; Gu, J. G. Methods Enzymol 1989, 179, 397-408.

Tumbale, P.; Brew, K. J Biol Chem 2009, 284, 25126-34.

Tumbale, P.; Jamaluddin, H.; Thiyagarajan, N.; Brew, K.; Acharya, K. R. Biochemistry 2008, 47, 8711-8.

Unligil, U. M.; Rini, J. M. Curr Opin Struct Biol 2000, 10, 510-7.

Valvano, M. A. Front Biosci 2003, 8, s452-71.

van Heijenoort, J. Glycobiology 2001, 11, 25R-36R.

Varki, A. Nature 2007, 446, 1023-9.

Verma, P. R.; Mukhopadhyay, B. Carbohydr Res 2010, 345, 432-6.

Vocadlo, D. J.; Hang, H. C.; Kim, E.-J.; Hanover, J. A.; Bertozzi, C. R. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 9116-9121.

Wagner, G. K.; Pesnot, T. Chembiochem 2010, 11, 1939-49.

Wang, C. C.; Lee, J. C.; Luo, S. Y.; Kulkarni, S. S.; Huang, Y. W.; Lee, C. C.; Chang, K. L.; Hung, S. C. Nature 2007, 446, 896-9.

Wang, G.; Boulton, P. G.; Chan, N. W.; Palcic, M. M.; Taylor, D. E. Microbiology 1999, 145 ( Pt 11), 3245-53.

Wang, G.; Rasko, D. A.; Sherburne, R.; Taylor, D. E. Mol Microbiol 1999, 31, 1265-74.

Wang, P. G. Nat Chem Biol 2007, 3, 309-10.

Wang, Q.; Chan, T. R.; Hilgraf, R.; Fokin, V. V.; Sharpless, K. B.; Finn, M. G. J Am Chem Soc 2003, 125, 3192-3.

Wang, Q. M.; Peery, R. B.; Johnson, R. B.; Alborn, W. E.; Yeh, W. K.; Skatrud, P. L. J Bacteriol 2001, 183, 4779-85.

Wang, W.; Hu, T.; Frantom, P. A.; Zheng, T.; Gerwe, B.; Del Amo, D. S.; Garret, S.; Seidel, R. D., 3rd; Wu, P. Proc Natl Acad Sci U S A 2009, 106, 16096-101.

Watson, K. A.; McCleverty, C.; Geremia, S.; Cottaz, S.; Driguez, H.; Johnson, L. N. Embo J 1999, 18, 4619-32.

Whitfield, C. Nat Chem Biol 2010, 6, 403-4.

Widmalm, G. Comprehensive Glycoscience 2007, 101-132.

Widmalm, G.; Leontein, K. Carbohydr Res 1993, 247, 255-62.

Wiggins, C. A.; Munro, S. Proc Natl Acad Sci U S A 1998, 95, 7945-50.

Wimley, W. C.; White, S. H. Nat Struct Biol 1996, 3, 842-8.

Wong, C. H.; Pollak, A.; McCurry, S. D.; Sue, J. M.; Knowles, J. R.; Whitesides, G. M. Methods Enzymol 1982, 89 Pt D, 108-21.

Woodward, R.; Yi, W.; Li, L.; Zhao, G.; Eguchi, H.; Sridhar, P. R.; Guo, H.; Song, J. K.; Motari, E.; Cai, L.; Kelleher, P.; Liu, X.; Han, W.; Zhang, W.; Ding, Y.; Li, M.; Wang, P. G. Nat Chem Biol 2010, 6, 418-23.

Wu, J.; Guo, Z. Bioconjug Chem 2006, 17, 1537-44.

Yamamoto, F.; Clausen, H.; White, T.; Marken, J.; Hakomori, S. Nature 1990, 345, 229-33.

Yang, M.; Brazier, M.; Edwards, R.; Davis, B. G. Chembiochem 2005, 6, 346-57.

Yi, W.; Bystricky, P.; Yao, Q.; Guo, H.; Zhu, L.; Li, H.; Shen, J.; Li, M.; Ganguly, S.; Bush, C. A.; Wang, P. G. Carbohydr Res 2006, 341, 100-8.

Yi, W.; Liu, X.; Li, Y.; Li, J.; Xia, C.; Zhou, G.; Zhang, W.; Zhao, W.; Chen, X.; Wang, P. G. Proc Natl Acad Sci U S A 2009, 106, 4207-12.

Yi, W.; Perali, R. S.; Eguchi, H.; Motari, E.; Woodward, R.; Wang, P. G. Biochemistry 2008, 47, 1241-8.

Yi, W.; Shao, J.; Zhu, L.; Li, M.; Singh, M.; Lu, Y.; Lin, S.; Li, H.; Ryu, K.; Shen, J.; Guo, H.; Yao, Q.; Bush, C. A.; Wang, P. G. J Am Chem Soc 2005, 127, 2040-1.

Yi, W.; Shen, J.; Zhou, G.; Li, J.; Wang, P. G. J Am Chem Soc 2008, 130, 14420-1.

Yi, W.; Yao, Q.; Zhang, Y.; Motari, E.; Lin, S.; Wang, P. G. Biochem Biophys Res Commun 2006, 344, 631-9.

Yi, W.; Zhu, L.; Guo, H.; Li, M.; Li, J.; Wang, P. G. Carbohydr Res 2006, 341, 2254-60.

Young, J.; Holland, I. B. Biochim Biophys Acta 1999, 1461, 177-200.

Yu, D.; Ellis, H. M.; Lee, E. C.; Jenkins, N. A.; Copeland, N. G.; Court, D. L. Proc Natl Acad Sci U S A 2000, 97, 5978-83.

Zachara, N. E.; Hart, G. W. Chem. Rev. (Washington, D. C.) 2002, 102, 431-438.

Zechel, D. L.; Withers, S. G. Acc Chem Res 2000, 33, 11-8.

Zeng, Y.; Ramya, T. N.; Dirksen, A.; Dawson, P. E.; Paulson, J. C. Nat Methods 2009, 6, 207-9.

Zhang, J.; Chen, X.; Shao, J.; Liu, Z.; Kowal, P.; Lu, Y.; Wang, P. G. Methods Enzymol 2003, 362, 106-24.

Zhang, Y.; Wang, P. G.; Brew, K. J Biol Chem 2001, 276, 11567-74.

Zhao, G.; Guan, W.; Cai, L.; Wang, P. G. Nat Protoc 2010, 5, 636-46.

Zheng, Q.; Van Die, I.; Cummings, R. D. Glycobiology 2008, 18, 290-302.

Zhou, G.; Liu, X.; Su, D.; Li, L.; Xiao, M.; Wang, P. G. Bioorg Med Chem Lett 2011, 21, 311-4.

APPENDIX

NMR SPECTRA, ESI/MS DATA, SUPPORTING FIGURES AND TABLES

Appendix A. Spectra, Tables, Figures for Chapter 2

1H NMR for WbiQ reaction

GST-WbiQ catalyzed reaction: 1H NMR spectrum of Fuc-α1,2-Gal-β1,3-GalNAc-α-OMe (500 MHz) at 27 °C. α-L-Fucp(1→2)-β-D-Galp(1→3)-α-GalpNAc-OMe. 1H NMR (D2O, 500 MHz) δ 5.28 (d, J = 4 Hz, 1 H), 4.81 (d, J = 3.5 Hz, 1 H), 4.68 (d, J = 7.7 Hz, 1 H), 4.30-4.21 (m, 3 H), 4.17 (dd, J = 11.2, 2.9 Hz, 1 H), 4.02 (t, J = 6.2 Hz, 1 H), 3.95 (d, J = 3.3 Hz, 1 H), 3.88 (dd, J = 9.7, 3.4 Hz, 1 H), 3.85-3.77 (m, 5 H), 3.74-3.65 (m, 4 H), 3.42 (s, 3 H), 2.09 (s, 1 H), 1.25 (d, J = 6 Hz, 3 H) ppm; HRMS (ESI): m/z calculated for C21H37NO15Na [M+Na]+: 566.2061; found 566.2076.
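The HRMS comparison reported above can be checked with a short calculation. The following is a minimal sketch, not the procedure used in this work: it sums standard monoisotopic atomic masses for the formula C21H37NO15Na and expresses the difference from the found value in ppm. The function names are illustrative, and the electron mass is neglected, which is assumed here to match the convention behind the reported calculated value.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "Na": 22.98976928,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a flat formula such as 'C21H37NO15Na'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


def ppm_error(found: float, calculated: float) -> float:
    """Relative mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


calculated = monoisotopic_mass("C21H37NO15Na")
error = ppm_error(566.2076, calculated)
print(f"calculated m/z = {calculated:.4f}, error = {error:.1f} ppm")
```

Under these assumptions the calculated value agrees with the reported 566.2061 to four decimal places, and the found value lies within a few ppm of it.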

13C NMR for WbiQ reaction

GST-WbiQ catalyzed reaction: 13C NMR spectrum of Fuc-α1,2-Gal-β1,3-GalNAc-α-OMe (D2O, 125.7 MHz) δ = 173.69, 102.01, 99.32, 97.91, 76.33, 75.03, 73.83, 73.59, 71.89, 70.43, 69.60, 69.10, 69.08, 68.11, 66.82, 61.25, 60.95, 55.14, 49.43, 21.98, 15.38 ppm.

[ESI/MS spectrum. Intensity scale ×100,000; labeled peaks at m/z 251.19, 266.63, 368.26, 403.06, 534.42, 606.32, and 737.56.]

Figure A.1. WbiN α1,3GalNAcT activity identified by ESI/MS. Acceptor: GalNAc-PP-Undecyl (m/z 534); UDP-GalNAc (m/z 606); Product: GalNAc-GalNAc-PP-Undecyl (m/z 737).

Table A.1. Cloning and Knock Out Primers Used in O-128 Wbs* Study.

Primer                     Sequence (5' → 3')a
pGEX-4T-1-wbsH             ACGTGGATCCATGATTAAAATATTGCATATACA
                           ACTGCTCGAGTTATAAAAAATAATTATTAGT
pET15b-wbsH                ACGTCATATGATGATTAAAATATTGCATATACA
                           ACTGGGATCCTTATAAAAAATAATTATTAGT
pGEX-4T-1-wbsK             ACGTGGATCCATGATGGAAGATACAAAAGAATT
                           ACTGCTCGAGTTATTTTATTAACCATCGTAA
pET15b-wbsK                ACGTCATATGATGATGGAAGATACAAAAGAATT
                           ACTGGGATCCTTATTTTATTAACCATCGTAA
pGEX-4T-1-wbsL             ACGTGGATCCATGGTTAATAAAATAAAAATAGC
                           ACTGCTCGAGTCATTTTGTTACCTCAAAATAT
pET15b-wbsL                ACGTCATATGATGGTTAATAAAATAAAAATAGC
                           ACTGGGATCCTCATTTTGTTACCTCAAAATAT
wbsH gene disruption       ATTGCATATACATCTTAGTAGTAAAATTAGCGGAGCCCAAAGAGTTTCTTAATTAACCCTCACTAAAGGGCGG
                           GATTCTAATATCTTTGTACGTTCTGATAACTCAAAATCCTTAACTACTGCTAATACGACTCACTATAGGGCTCG
wbsJ gene disruption       GGCAATCAGATGTTTCAGTATGCAACTGCATTTGCTATTGCAAAAAGAACAATTAACCCTCACTAAAGGGCGG
                           TTCGGGAATAATATCATGTTTAATATCTTTCTTAAACCATTTACTCGGAGTAATACGACTCACTATAGGGCTCG
wbsK gene disruption       AAGAATTAATAACAGTAATAATGCCTGTTTATAATGCGGAAAAATACATAATTAACCCTCACTAAAGGGCGG
                           TCATGTTCATACTTTAATAATATATATAATACGGCCTGAAAGACCTTACTTAATACGACTCACTATAGGGCTCG
wbsL gene disruption       CTAATAAGTACAGAAAAAGCTAATCTTCCTGAAATTGAAGCCTATAAGGAGAATTAACCCTCACTAAAGGGCGG
                           TTTCTATCCCGCTATTCATAATAACCTCTCGCCAACTTGGGGGGATATGTTAATACGACTCACTATAGGGCTCG

aRestriction enzyme sites are underlined
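Because the underlining that marks the restriction sites in Table A.1 does not survive in plain text, the sites can be located programmatically. The sketch below is illustrative only: it assumes the cloning sites are BamHI (GGATCC), XhoI (CTCGAG), and NdeI (CATATG), an assignment inferred here from the recognition sequences present in the primers rather than stated in the table, and the function name is hypothetical.

```python
# Assumed enzymes and their recognition sequences (inferred, not from the table).
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG", "NdeI": "CATATG"}


def find_sites(primer: str) -> list[tuple[str, int]]:
    """Return (enzyme, 0-based start) for every recognition site in a primer."""
    hits = []
    for enzyme, site in SITES.items():
        start = primer.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = primer.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Forward and reverse cloning primers for pGEX-4T-1-wbsH from Table A.1.
forward = "ACGTGGATCCATGATTAAAATATTGCATATACA"
reverse = "ACTGCTCGAGTTATAAAAAATAATTATTAGT"
print(find_sites(forward), find_sites(reverse))
```

In each cloning primer the site follows a four-nucleotide 5' overhang, so the reported start index is 4.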

[ESI/MS spectrum, negative mode (-MS, 1.8-2.7 min). x-axis: m/z 200-900; labeled peaks at m/z 323.0, 393.6, 453.0, 522.0, 565.1, 587.0, 626.2, and 788.3.]

Figure A.2. ESI/MS (negative mode) demonstrating galactosyltransferase activity by GST-WbsH. The product Gal-GalNAc-PP-Phu can be seen at m/z 788.3, with the unreacted acceptor at m/z 626. UDP-Gal has an m/z of 565.
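The product assignments in Figures A.1 and A.2 rest on the mass added by one glycosyl transfer. As a hedged illustration, the sketch below adds the standard monoisotopic residue mass (monosaccharide minus water) to the acceptor m/z and compares the result with the labeled product peak. The 0.5 m/z tolerance and the function names are assumptions made for this example, and singly charged ions are assumed throughout.

```python
# Monoisotopic residue masses (monosaccharide minus H2O), in u.
RESIDUE_MASS = {
    "Hex": 162.0528,     # e.g. Gal, Glc
    "HexNAc": 203.0794,  # e.g. GalNAc, GlcNAc
    "dHex": 146.0579,    # e.g. Fuc
}


def expected_product_mz(acceptor_mz: float, residue: str) -> float:
    """Expected m/z after transfer of one residue to a singly charged acceptor."""
    return acceptor_mz + RESIDUE_MASS[residue]


def is_consistent(observed_mz: float, expected_mz: float, tol: float = 0.5) -> bool:
    """True when the observed peak lies within tol m/z units of the expectation."""
    return abs(observed_mz - expected_mz) <= tol


# Figure A.2: acceptor peak 626.2, Gal (Hex) transfer, product peak 788.3.
print(is_consistent(788.3, expected_product_mz(626.2, "Hex")))
# Figure A.1: acceptor peak 534.42, GalNAc (HexNAc) transfer, product peak 737.56.
print(is_consistent(737.56, expected_product_mz(534.42, "HexNAc")))
```

The same increment logic applies to any of the ESI/MS assignments in this appendix, provided the ion charge and adduct are unchanged between acceptor and product.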

[ESI/MS spectrum, negative mode (-MS, 0.7-0.9 min). Intensity scale ×10^4; x-axis: m/z 200-900; labeled peaks at m/z 323.0, 403.0, 449.2, 494.1, 522.0, 563.1, 606.1, 628.1, and 899.3.]

Figure A.3. ESI/MS after the GST-WbsL catalyzed reaction, run in the negative mode. The product GalNAc-Gal-GalNAc-PP-Undc can be seen at m/z 899, with UDP-Gal at m/z 565 and UDP-GalNAc at m/z 606. To obtain this peak, GST-WbsH was first incubated with GalNAc-PP-Undc and UDP-Gal for 24 hours, after which the enzyme was removed by boiling. GST-WbsL was then incubated with UDP-GalNAc and the reaction mixture from the GST-WbsH catalyzed reaction. Notice there is no unreacted Gal-PP-Undc (m/z 533) or Gal-GalNAc-PP-Undc (m/z 695).

[ESI/MS spectrum, positive mode (+MS, 0.7-0.9 min). Intensity scale ×10^4; x-axis: m/z 100-500; labeled peaks at m/z 151.1, 203.1, 230.3, 244.1, 277.2, 309.2, 343.2, 393.3, 406.2, 465.2, 489.8, and 546.4.]

Figure A.4. ESI/MS spectrum after the GST-WbsK catalyzed reaction between UDP-Gal and GalNAc. The product peak (m/z 406) was formed using the membrane fraction of E. coli BL21 containing pGEX-wbsK. Multiple peaks exist due to significant hydrolysis of UDP-Gal by endogenous hydrolases. The product peak of m/z 406 was purified by size exclusion chromatography and analyzed by NMR.

1H NMR for WbsK catalyzed reaction, showing β-linkage

Appendix B: Spectra, Tables, and Figures for Chapter 3

Table B.1. Primers used for WbnI investigation.

GENE    PRIMER  SEQUENCE (5' TO 3')
D41A    F       GGAAAAAAATATTATGTGTTTACCGCTTCTGATAGGATTTATTTTAG
        R       CTAAAATAAATCCTATCAGAAGCGGTAAACACATAATATTTTTTTC
D41E    F       GGAAAAAAATATTATGTGTTTACCGAATCTGATAGGATTTATTTTAG
        R       CTAAAATAAATCCTATCAGATTCGGTAAACACATAATATTTTTTTC
D43A    F       GAAAAAAATATTATGTGTTTACCGATTCTGCTAGGATTTATTTTAGTAAATATCTGAATGT
        R       AACATTCAGATATTTACTAAAATAAATCCTAGCAGAATCGGTAAACACATAATATTTTTTTC
D43E    F       GAAAAAAATATTATGTGTTTACCGATTCTGAGAGGATTTATTTTAGTAAATATCTG
        R       CAGATATTTACTAAAATAAATCCTCTCAGAATCGGTAAACACATAATATTTTTTTC
N91A    F       TTGATAAGTTACAAACTAACTCATATACTTTTTTCTTTGCTGCAAATGCAGTTATTGTCAAAG
        R       CTTTGACAATAACTGCATTTGCAGCAAAGAAAAAAGTATATGAGTTAGTTTGTAACTTATCAA
N91D    F       ATAAGTTACAAACTAACTCATATACTTTTTTCTTTGATGCAAATGCAGTTATTGTCAAA
        R       TTTGACAATAACTGCATTTGCATCAAAGAAAAAAGTATATGAGTTAGTTTGTAACTTAT
N93A    F       ATAAGTTACAAACTAACTCATATACTTTTTTCTTTAATGCAGCTGCAGTTATTGTCAAAGAGAT
        R       ATCTCTTTGACAATAACTGCAGCTGCATTAAAGAAAAAAGTATATGAGTTAGTTTGTAACTTAT
N93D    F       CTCATATACTTTTTTCTTTAATGCAGATGCAGTTATTGTCAAAGAGATTCC
        R       GGAATCTCTTTGACAATAACTGCATCTGCATTAAAGAAAAAAGTATATGAG
Q148A   F       GCTATCTTGGGTATTTAAAGAAAGGTATTTATTATGCAGGTTGTTTCAATGGAGG
        R       CCTCCATTGAAACAACCTGCATAATAAATACCTTTCTTTAAATACCCAAGATAGC
E185Q   F       AAAAAAACCTGATTGCTAAAGTACATGATCAGTCATATTTGAATTATTATTATTACTAC
        R       GTAGTAATAATAATAATTCAAATATGACTGATCATGTACTTTAGCAATCAGGTTTTTTT
E185A   F       GCTAAAGTACATGATGCGTCATATTTGAATTATTATTATTAC
        R       GTAATAATAATAATTCAAATATGACGCATCATGTACTTTAGC
E185D   F       CTGATTGCTAAAGTACATGATGATTCATATTTGAATTATTATTATTAC
        R       GTAATAATAATAATTCAAATATGAATCATCATGTACTTTAGCAATC
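Site-directed mutagenesis primer pairs of the kind listed in Table B.1 are normally designed as exact reverse complements of one another, which gives a quick way to catch transcription errors in the table. The sketch below is a minimal check under that assumption; the helper names are illustrative, and the E185A pair from Table B.1 is used as the example.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True when the reverse primer is the exact reverse complement of the forward."""
    return reverse_complement(forward.upper()) == reverse.upper()


# E185A forward and reverse primers as listed in Table B.1.
e185a_f = "GCTAAAGTACATGATGCGTCATATTTGAATTATTATTATTAC"
e185a_r = "GTAATAATAATAATTCAAATATGACGCATCATGTACTTTAGC"
print(is_complementary_pair(e185a_f, e185a_r))
```

A pair that fails this check is not necessarily wrong, since primers can be designed with offset ends, but a mismatch is a useful prompt to recheck the entry against the original order sheet.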

Figure B.1. Size exclusion profile for wild type WbnI. Tube 32 at 16.3 minutes represents the monomeric WbnI protein, as calibrated by a molecular weight standard, and as shown by SDS-PAGE.

Figure B.2. Structure of the tetrasaccharide GalNAc-α1,3-(Fuc-α1,2)-Gal-β1,3GalNAc- OMe.

[LC/MS spectrum. Intensity scale ×100,000; labeled peaks at m/z 708.86, 711.91, 747.64, 752.89, 769.66, and 796.94.]

Figure B.3. LC/MS spectrum from the crude reaction mixture containing UDP-GalNAc, type III/IV acceptor, and Q148A mutant WbnI in 20 mM Tris-HCl at pH 7.5. The peak at m/z 747.71 is the protonated product, and the peak at m/z 769.66 is its sodium adduct.

Appendix C. Spectra, Tables, Figures for Chapter 4

Table C.1. Putative Glycosyltransferase genes used in this study.

GT      Name           Protein        Protein     Gene GI     Source
Number                 Accession No.  GI No.      No.
GT01    LgtD           AAC23227       1574422     16271976    Haemophilus influenzae Rd
GT02    GGTA1          AAA30558       29135311    209954673   Bos taurus
GT03    Pd2,6ST        BAA25316       2988379     2988378     Photobacterium damselae
GT04    LgtA           AAC44084       973185      973183      Neisseria meningitidis
GT05    LgtB           AAC44085       973186      973183      Neisseria meningitidis
GT06    LgtC           AAL12839       15991373    15991370    Neisseria meningitidis
GT07    Blon_2388      ACJ53443       213524696   213690928   Bifidobacterium longum subsp. infantis
GT08    RfaI           BAE77665       85676415    85674274    E. coli K12 W3110
GT09    HD1090         AAP95956       33148436    33149228    Haemophilus ducreyi 35000HP
GT10    MmarC5_1235    ABO35533       132663887   132662671   Methanococcus maripaludis C5
GT11    Blon_2104      ACJ53166       213524419   231690928   Bifidobacterium longum subsp. infantis
GT12    WclV           ACH97157       203285040   203285031   Escherichia coli O3
GT13    WclS           ACH97151       203285034   203285031   Escherichia coli O3
GT14    WclT           ACH97155       203285038   203285031   Escherichia coli O3
GT15    WclU           ACH97156       203285039   203285031   Escherichia coli O3
GT16    WbaC           AAY23735       62959550    62959549    Escherichia coli O77
GT17    CAD19795       CAD19795       22002929    22002925    Escherichia coli O6
GT18    WcaA           CAD19793       22002927    22002925    Escherichia coli O6
GT19    RfaG           CAD19794       22002928    22002925    Escherichia coli O6
GT20    WfgS           ACA24896       168481415   168481405   Escherichia coli O159
GT21    WfgR           ACA24891       168481410   168481405   Escherichia coli O159
GT22    WfgQ           ACA24889       168481408   168481405   Escherichia coli O159
GT23    WfgP           ACA24888       168481407   168481405   Escherichia coli O159
GT24    HD0466 (LgtA)  AAP95426       33147905    33149228    Haemophilus ducreyi 35000HP
GT25    HD0472 (LgtB)  AAP95431       33147910    33149228    Haemophilus ducreyi 35000HP
GT26    WfaQ           ABB29914       78191389    78191383    E. coli O56
GT27    WfaO           ABB29908       78191382    78191373    E. coli O24
GT28    Cj1137c        CAL35254       218562751   15791399    Campylobacter jejuni NCTC 11168
GT29    Cj1039         CAL35157       218562655   15791399    Campylobacter jejuni NCTC 11168
GT30    Cj1136         CAL35253       218562750   15791399    Campylobacter jejuni NCTC 11168
GT31    Blon_0724      ACJ51828       213691620   213690928   Bifidobacterium longum subsp. infantis
GT32    Blon_2101      ACJ53163       213692955   213690928   Bifidobacterium longum subsp. infantis
GT33    MmarC5_1016    ABO35322       132663676   132662671   Methanococcus maripaludis C5
GT34    MmarC5_1310    ABO35608       132663962   132662671   Methanococcus maripaludis C5
GT35    MmarC5_0300    ABO34616       132662970   132662671   Methanococcus maripaludis C5
GT36    Blon_2105      ACJ53167       213692959   213690928   Bifidobacterium longum subsp. infantis
GT37    Blon_0629      ACJ51736       213691528   213690928   Bifidobacterium longum subsp. infantis
GT38    Blon_2107      ACJ53169       213524422   213522389   Bifidobacterium longum subsp. infantis
GT39    Blon_1936      ACJ53006       213524259   213522389   Bifidobacterium longum subsp. infantis
GT40    Blon_0543      ACJ51656       213522909   213522389   Bifidobacterium longum subsp. infantis
GT41    Blon_0541      ACJ51654       213522907   213522389   Bifidobacterium longum subsp. infantis
GT42    Blon_0041      ACJ51173       213522426   213522389   Bifidobacterium longum subsp. infantis
GT43    Blon_2109      ACJ53171       213524424   213522389   Bifidobacterium longum subsp. infantis
GT44    Blon_2381      ACJ53436       213524689   213522389   Bifidobacterium longum subsp. infantis
GT45    Blon_2106      ACJ53168       213524421   213522389   Bifidobacterium longum subsp. infantis
GT46    Cj1140         CAL35257       112360460   30407139    Campylobacter jejuni NCTC 11168
GT47    Cj1138         CAL35255       112360458   30407139    Campylobacter jejuni NCTC 11168
GT48    Cj1440c        CAL35549       112360750   30407139    Campylobacter jejuni NCTC 11168
GT49    b4GalT         BAA34385       3869131                 Mus musculus (house mouse)
GT50    mIGnTC         AAO86065       29650161                Mus musculus (house mouse)
GT51    Gb3_synth      AAR18365       38350359                Mus musculus (house mouse)
GT52    AAK24824       AAK24824       13424474                Caulobacter crescentus CB15
GT53    a3FucT         AAB93985       2240202                 Helicobacter pylori NCTC 11637
GT54    BgtA           CBG40459       290964606               Helicobacter mustelae
GT55    WbsK           AAO37699       37788089    37788079    Escherichia coli O128
GT56    WbuP           AAT77180       50429181    50429169    Escherichia coli O114
GT57    WbsT           ABE98422       93115461    93115457    Escherichia coli O126
GT58    WcmC           AAV85962       56384982    56384971    Escherichia coli O86
GT59    WbgO           BAG11952       168986414   68986398    Escherichia coli O55
GT60    WcaA           ADD57133       290763172   290760697   Escherichia coli O55
GT61    WcaE           ADD57129       290763168   290760697   Escherichia coli O55
GT62    WbgL           ABE98421       93115460    93115457    Escherichia coli O126
GT63    WbsU           ABE98423       93115462    93115457    Escherichia coli O126
GT64    WbtF           AAS73172       45644926    45644916    Escherichia coli 103
GT65    GTB            AAD26574       4590454     4590453     Homo sapiens
GT66    WbnI           AAV80756       56159892    56159882    Escherichia coli O86
GT67    WbsJ           AAO37698       37788088    37788079    Escherichia coli O128
GT68    SP_0102        AAK74289       14971569    193804931   Streptococcus pneumoniae TIGR4
GT69    SP_0135        AAK74318       14971600    193804931   Streptococcus pneumoniae TIGR4
GT70    SP_0136        AAK74319       14971601    193804931   Streptococcus pneumoniae TIGR4
GT71    SP_1075        AAK75188       14972550    193804931   Streptococcus pneumoniae TIGR4
GT72    SP_1076        AAK75189       14972551    193804931   Streptococcus pneumoniae TIGR4
GT73    SP_1365        AAK75463       14972850    193804931   Streptococcus pneumoniae TIGR4
GT74    SP_1366        AAK75464       14972851    193804931   Streptococcus pneumoniae TIGR4
GT75    SP_1765        193804931      14973262    193804931   Streptococcus pneumoniae TIGR4
GT76    SP_1766        AAK75841       14973263    193804931   Streptococcus pneumoniae TIGR4
GT77    SP_1771        AAK75845       14973268    193804931   Streptococcus pneumoniae TIGR4
GT78    SP_1838        AAK75911       14973339    193804931   Streptococcus pneumoniae TIGR4
GT79    BF0008         CAH05787       60491039    60491031    Bacteroides fragilis NCTC 9343
GT80    BF0009         CAH05788       60491040    60491031    Bacteroides fragilis NCTC 9343
GT81    BF0004         CAH05783       60491035    60491031    Bacteroides fragilis NCTC 9343
GT82    BF0186         CAH05963       60491215    60491031    Bacteroides fragilis NCTC 9343
GT83    BF0187         CAH05964       60491216    60491031    Bacteroides fragilis NCTC 9343
GT84    BF0614         CAH06364       60491612    60491031    Bacteroides fragilis NCTC 9343
GT85    BF4301         CAH09968       60495147    60491031    Bacteroides fragilis NCTC 9343

Table C.3. Synthetic route for the preparation of protected disaccharide intermediates for acceptors 8-12.

[Reaction scheme not recoverable from the extracted text; the substituents R1-R4 shown in the scheme are defined below.]

Compound  R1   R2   R3       R4
8         H    Ac   NHFmoc   OAc
9         Ac   H    NHFmoc   OAc
10        Ac   H    OAc      NHFmoc
11        Ac   H    OAc      OAc
12        H    Ac   NHFmoc   NHFmoc

Table C.4. Active glycosylation reactions identified from the screening. The enzymes, substrates and products are listed and new enzymes are indicated with an asterisk.

Gene Name                                        Donor        Acceptor  Products
BF0009 from B. fragilis *                        UDP-GalNAc   22        GalNAc β1,3-Glc
BF0614 from B. fragilis *                        UDP-Gal      3, 19     Gal β1,4-GlcNAc
NC_002940.2 Haemophilus ducreyi 35000HP *        UDP-GlcNAc   25        GlcNAc β1,3-Lactose
AAF28363.1 from Haemophilus ducreyi 35000HP *    UDP-GlcNAc   3         GlcNAc β1,4-GlcNAc
ggta1 from bovine                                UDP-Gal      25        Gal α1,3-Lactose
lgt C from Neisseria meningitidis                UDP-Gal      25        Gal α1,4-Lactose
lgt D from Haemophilus influenzae Rd             UDP-GalNAc   13        GalNAc β1,3-Gal α1,3-Galβ1,4Glc
lgt A from Neisseria meningitidis                UDP-GlcNAc             GlcNAc β1,3-Lactose
                                                 UDP-GalNAc   25        GalNAc β1,3-Lactose
Human gtb                                        UDP-Gal      25        Gal α1,3-Lactose
wbsJ from E. coli O126                           UDP-Gal      4         Gal 1,3-GalNAc

Table C.5. Kinetic parameters measured using traditional radiolabelled donors

Parameters            BF0009                 BF0614                 NC_002940.2
                      UDP-GalNAc   Glc       UDP-Gal    GlcNAc      UDP-GlcNAc   Lac
KM,apparent (mM)      0.581        0.523     0.429      1.36        0.48         1.38
Vmax (nmol min-1)     0.263                  0.211                  0.1478
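The apparent KM and Vmax values in Table C.5 can be used to estimate an initial rate at any single substrate concentration. The sketch below assumes simple Michaelis-Menten behaviour with the second substrate held constant, and it reads the first KM entry (0.581 mM) and first Vmax entry (0.263 nmol min-1) as belonging to BF0009 with UDP-GalNAc as the varied substrate; that column assignment is an interpretation of the table layout, and the function name is illustrative.

```python
def michaelis_menten(s_mM: float, km_mM: float, vmax: float) -> float:
    """Initial rate v = Vmax * [S] / (KM + [S]), in the units of vmax."""
    if s_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and KM must be > 0")
    return vmax * s_mM / (km_mM + s_mM)


# Assumed reading of Table C.5 for BF0009 with UDP-GalNAc as the varied substrate.
KM_UDP_GALNAC = 0.581   # mM
VMAX_BF0009 = 0.263     # nmol min-1

for conc in (0.25, 0.5, 1.0, 2.0):
    rate = michaelis_menten(conc, KM_UDP_GALNAC, VMAX_BF0009)
    print(f"[UDP-GalNAc] = {conc} mM -> v = {rate:.3f} nmol/min")
```

At a substrate concentration equal to KM the function returns half of Vmax, which is a convenient sanity check on any tabulated pair of parameters.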

Figure C.1. SDS-PAGE (upper) and anti-His western blot (lower) for the GTs that were discovered in the screen. The proteins are identified by their protein accession numbers. Lane 1: molecular weight marker; Lane 2: in vitro expressed ACJ53166; Lane 3: in vitro expressed AAP95426; Lane 4: in vitro expressed CAH05788; Lane 5: in vitro expressed CAH06364; Lane 6: in vivo expressed ACJ53166; Lane 7: in vivo expressed AAP95426; Lane 8: in vivo expressed CAH05788; Lane 9: in vivo expressed CAH06364.

[Reaction scheme with steps a-d starting from compound 26; structures not recoverable from the extracted text.]

Figure C.2. Synthetic route for the preparation of acceptor 9. a. Catalytic TMSOTf, dichloromethane, 0 °C, 4 hours. b. (1) 20 % piperidine in DMF, 10 min, room temperature; (2) Acetic anhydride in pyridine, 1 hour, room temperature. c. Thioacetic acid, AIBN, THF, UV irradiation for 4 hours. d. Sodium methoxide in methanol, 1 hour, room temperature.

[Reaction scheme; labels recovered from the extracted text: Alpha1,3-galactosyltransferase, UDP-Gal, 13; LgtA, UDP-GlcNAc, 14.]

Figure C.3. Enzymatic route for the preparation of acceptors 13 and 14.

[Reaction scheme with donors 27 and 28; labels recovered from the extracted text: "For 19, 22-24"; "For 20, 21 and 25"; R = OAc or NHFmoc; R' = the disulfide linker; R'' = OH or NHAc; R''' = H or Glc or Gal.]

Figure C.4. Synthetic route for the preparation of acceptors 20-25.

[Panel (a): SAMDI spectrum, m/z 1100-1500, with peaks at m/z 1210 (Na) and 1413 (Na), annotated with the equation Yield = [a]0 × Ip/(Ip + Is). Panels (b) and (c) are not recoverable from the extracted text.]

Figure C.6. (a) A representative SAMDI spectrum for a glycosylation reaction by BF0009. The yield was determined using the equation shown, where [a]0 is the initial concentration of the azido sugar acceptor, and Is and Ip are the peak intensities of the unreacted acceptor and the glycosylation product, respectively. (b) A residue plot showing the difference of the ratios of azido lactose and azido glucose in solution as determined by SAMDI following immobilization to a monolayer (see text for explanation). (c) One-sample t test showing the probability of the two ratios being the same.
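The yield equation given with Figure C.6 translates directly into code. The following is a minimal sketch of that single formula; the function name and the example intensities are hypothetical, and no baseline correction or adduct summing is attempted.

```python
def samdi_yield(a0: float, i_product: float, i_substrate: float) -> float:
    """Yield = [a]0 * Ip / (Ip + Is), in the concentration units of a0."""
    total = i_product + i_substrate
    if total <= 0:
        raise ValueError("at least one peak intensity must be positive")
    return a0 * i_product / total


# Hypothetical intensities: product peak three times the unreacted acceptor peak.
print(samdi_yield(a0=1.0, i_product=300.0, i_substrate=100.0))
```

Because only the ratio of the two intensities enters the expression, the result is insensitive to the absolute signal level of a given spectrum, provided both species ionize with comparable efficiency.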

[Double reciprocal plots of 1/v0 (min/mM) against 1/[donor] (1/mM), each at acceptor concentrations of 0.25, 0.5, 1, and 2 mM. Panel (a): 1/[UDP-GalNAc] at fixed [Glc]; panel (b): 1/[UDP-Gal] at fixed [GlcNAc]; panel (c): 1/[UDP-GlcNAc] at fixed [Lac].]

Figure C.7. Double reciprocal plots for three of the enzymes discovered in the screen: (a) BF0009; (b) BF0614; (c) NC_002940.2. For each set of plots, the acceptor was the constant substrate and the donor was the variable substrate.
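A double reciprocal plot such as those in Figure C.7 is a straight-line fit of 1/v0 against 1/[S], with slope KM/Vmax and intercept 1/Vmax. The sketch below shows that fit by ordinary least squares on synthetic data; it is an illustration of the general technique, not the fitting procedure used for the figure, and the function name and test values are assumptions.

```python
def lineweaver_burk_fit(substrate: list[float], rate: list[float]) -> tuple[float, float]:
    """Fit 1/v = (KM/Vmax) * (1/[S]) + 1/Vmax and return (KM, Vmax)."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rate]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free rates generated from KM = 0.5 mM and Vmax = 0.2.
conc = [0.125, 0.25, 0.5, 1.0, 2.0]
rates = [0.2 * s / (0.5 + s) for s in conc]
km_fit, vmax_fit = lineweaver_burk_fit(conc, rates)
print(f"KM = {km_fit:.3f} mM, Vmax = {vmax_fit:.3f}")
```

With real data the reciprocal transform weights the lowest concentrations most heavily, so a direct nonlinear fit of the rate equation is often preferred when the measurements are noisy.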

[Reaction schemes on gold (Au) monolayers: BF0009 with UDP-GalNAc, BF0614 with UDP-Gal, and NC_002940.2 with UDP-GlcNAc, each in the presence of a metal ion M2+. Structures not recoverable from the extracted text.]

Figure C.8. Three GT-mediated reactions were tested in the presence of a variety of metal ions.

NMR Characterization of Products Synthesized from the New Glycosyltransferases

The structures of the products synthesized from the preparative-scale glycosylation reactions were characterized by 1-D and 2-D NMR including 1H-13C HMQC, 1H-1H COSY and 1H-13C HMBC.

Product from BF0009 catalyzed reaction

1H NMR (500 MHz, D2O): δ = 7.60 (m, 2H, Ph), 7.05 (m, 3H, Ph), 5.12 (d, J = 8.0, H1), 4.49 (d, J = 8.0, H1'), 3.83 (m, 2H, H2' and H4'), 3.87-3.70 (m, H6a, H6b, H3', H5), 3.70-3.50 (m, H5', H3, H6a', H2, H6b').

13C NMR (125 MHz, D2O): δ = 157.1 (C1'' of Ph), 129.94 (C3'' of Ph), 123.33 (C4'' of Ph), 116.48 (C2'' of Ph), 101.76 (C1'), 99.83 (C1), 78.81 (C3), 75.34 (C5), 74.68 (C5'), 72.48 (C2), 72.05 (C4), 70.67 (C3'), 67.61 (C4'), 62.47 (C6), 60.96 (C6'), 52.57 (C2'), 22.18 (C(O)CH3). HMBC signal: 78.81 (C3) and 4.48 (H1').

Product from BF0614 catalyzed reaction

1H NMR (500 MHz, D2O): Gal: δ = 4.30 (d, J = 8.0, 1H, H1), 3.55 (m, 1H, H2), 3.58 (m, 1H, H3), 3.83 (d, J = 3.0, 1H, H4), 3.89 (m, 1H, H5), 3.78 (m, 1H, H6a), 3.61 (m, 1H, H6b). 13C NMR (125 MHz, D2O): 102.6 (C1), 70.92 (C2), 72.47 (C3), 68.50 (C4), 75.30 (C5), 60.98 (C6).

GlcNAc: 1H NMR: δ = 5.14 (d, J = 3.5, H1), 4.66 (d, J = 8.0, H1), 3.82 (m, H3), 3.77 (m, H2), 3.64 (br, H4, H2 and H4), 3.51 (m, H5), 3.49 (m, H5), 3.89 (m, H6a), 3.75 (m, H6b), 3.68 (m, H6a), 3.76 (m, H6b). 13C NMR: 94.80 (C1), 90.48 (C1), 78.77 (C4), 78.35 (C4), 74.80 (C5), 72.47 (C5), 70.21 (C3), 69.30 (C3), 59.92 (C6 and C6), 56.16 (C2), 53.66 (C2), 21.90 (C(O)CH3). HMBC signal: 4.30 (H1') and 78.7 (C4).

Product from NC_002940.2 catalyzed reaction

1H NMR (500 MHz, D2O): δ = 7.46-7.54 (m, 5H, Ph), 4.83 (d, J = 12, 1H, OCH2Ph), 4.78 (d, J = 12, 1H, OCH2Ph), 4.62 (d, J = 8.0, 1H, H1''), 4.48 (d, J = 8.0, 1H, H1), 4.37 (d, J = 8.0, 1H, H1'), 4.08 (d, J = 3.0, 1H, H4'), 3.92 (dd, J = 12, 2.3, 1H, H6a'), 3.84 (dd, J = 12, 2.3, 1H, H6a''), 3.73-3.62 (m, 6H, H3'', H5'', H6a, H6b'', H3'), 3.60-3.48 (m, 4H, H3, H5', H6b, H2'), 3.43-3.27 (m, 2H, H4, H4''), 3.27 (t, J = 9.3, 1H, H2), 2.09 (s, 3H, CH3CO).

13C NMR (125 MHz, D2O): δ = 175.0 (CH3CO), 128.72 (Ph), 102.9 (C1, C1''), 101.0 (C1'), 81.95 (C3'), 78.35 (C3), 75.65 (C4), 74.88 (C3''), 74.79 (OCH2Ph), 73.56 (C2'), 72.80 (C2), 69.99 (C5'), 69.68 (C4''), 68.33 (C4'), 60.94 (C6), 60.47 (C6''), 60.09 (C6'), 55.65 (C2''), 22.15 (CH3CO). HMBC signal: 81.95 (C3') and 4.62 (H1'').

Product from AAF28363.1 catalyzed reaction

1H NMR (500 MHz, D2O): δ = 5.17 (d, J = 3.0, H1), 4.67 (d, J = 7.5, H1), 4.57 (d, J = 8.5, H1, H1'), 4.56 (d, J = 8.0, H1'), 3.93-3.84 (m, H6a, H6a', H2, H6b), 3.79 (dd, J = 12, 2.0, H6a), 3.77-3.70 (m, H2', H6b'), 3.69-3.59 (m, H2, H3, H4, H4, H6b), 3.58-3.52 (m, H3', H3), 3.52-3.43 (m, H4', H5, H5), 2.05 (s, 3H, CH3CO), 2.02 (s, 3H, CH3CO).

13C NMR (125 MHz, D2O): 101.45 (C1'), 101.44 (C1'), 94.79 (C1), 90.39 (C1), 79.82 (C4), 79.37 (C4), 75.86 (C5'), 74.53 (C3), 73.43 (C3), 72.50 (C3), 69.92 (C4'), 69.25 (C4'), 60.50 (C6'), 60.13 (C6), 60.01 (C6), 55.57 (C2'), 55.55 (C2'), 56.04 (C2), 53.59 (C2), 22.14, 22.08, 21.84 (CH3CO). HMBC signal: 4.57 (H1') and 79.8 (C4).


Appendix D. Spectra, Tables, Figures for Chapter 5

1H NMR for 2-ketoGlc

13C NMR for 2-ketoGlc

1H NMR for Ac-2-keto-Glc

13C NMR for Ac-2-keto-Glc

Section 5.2

Figure D.1. Representative SAMDI spectra for reactions with each GDP-donor. 1. GDP-Fucose, FutC, 15 min; 2. GDP-azido Fuc, FutC; 3. GDP-fluoro Fuc, WbsJ, 30 min; 4. GDP-Fuc Alkyne, FucT, 24 h; 5. GDP-arabinose, FutC, 2 h.


Synthesis of fucose-derivatives

The synthesis of the fucose derivatives 4-6 was carried out as depicted in Figure D.2.

Figure D.2. Synthesis of fucose derivatives 4-6.


1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (7). Anhydrous CuSO4 (886 mg, 5.55 mmol) and anhydrous L-galactose (0.400 g, 2.22 mmol) were suspended in dry acetone (9 mL) and treated with concentrated H2SO4 (40 µL). The resulting mixture was stirred at room temperature for 20 h. The cupric sulfate was removed by filtration and washed with acetone. The combined organic phases were neutralized by addition of Ca(OH)2, filtered, and concentrated. Purification by flash chromatography (EtOAc:hexane, 1:1) gave the desired product as a colorless oil (470 mg, 81%). 1H NMR (500 MHz): δ = 5.50 (d, J = 5 Hz, 1H), 4.56 (dd, J = 2.3, 7.9 Hz, 1H), 4.28 (dd, J = 2.3, 5 Hz), 4.23 (dd, J = 1.6, 7.9 Hz), 3.83 (m, 1H), 3.76 (m, 1H), 3.67 (dd, J = 4.32, 11.3 Hz), 1.48 (s, 3H), 1.40 (s, 3H), 1.29 (s, 3H), 1.29 (s, 3H). 13C NMR (125 MHz): δ = 109.5, 108.7, 96.3, 71.5, 70.8, 70.6, 68.3, 62.1, 26.0, 26.0, 25.0, 24.4. HRMS: calculated for C12H20O6 [(M + Na)+] 283.1152, found 283.1136.
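The calculated HRMS values quoted for the sodium adducts in this section can be checked from the molecular formula alone. The following is a minimal sketch, assuming standard monoisotopic masses for the most abundant isotopes and a singly charged [M + Na]+ ion (neutral mass plus sodium, minus one electron). The function names `sodium_adduct_mz` and `ppm_error` are illustrative and are not part of the original work.

```python
from typing import Mapping

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {
    "C": 12.000000,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "Na": 22.98976928,
    "S": 31.97207117,
}
ELECTRON = 0.00054858  # electron mass (u), removed on forming the cation


def sodium_adduct_mz(formula: Mapping[str, int]) -> float:
    """Return m/z of the singly charged [M + Na]+ ion for a neutral formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + MONOISOTOPIC["Na"] - ELECTRON


def ppm_error(found: float, calculated: float) -> float:
    """Return the deviation of a measured m/z from the calculated m/z in ppm."""
    return (found - calculated) / calculated * 1e6


# Compound 7, C12H20O6 (formula and found value taken from the text above).
compound_7 = {"C": 12, "H": 20, "O": 6}
calculated = sodium_adduct_mz(compound_7)
print(f"calculated [M + Na]+ = {calculated:.4f}")
print(f"deviation of found value 283.1136 = {ppm_error(283.1136, calculated):.1f} ppm")
```

For compound 7 this reproduces the calculated value of 283.1152 reported above. The same function can be applied to the other formulas in this section by changing the element counts.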

1,2,3,4-Di-O-isopropylidene-6-deoxy-6-fluoro-L-galactopyranoside (8). To a solution of 1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (7) (0.123 g, 0.50 mmol) in CH2Cl2 (5 mL) at room temperature were added collidine (131 µL, 1.00 mmol) and DAST (133 µL, 1.00 mmol). The reaction mixture was then stirred and refluxed for 6 h. It was then diluted with Et2O (30 mL) and washed with NaHCO3 (2 x 30 mL), 0.5 N HCl (2 x 20 mL), and brine (1 x 30 mL), dried over MgSO4, and concentrated. Purification by column chromatography (EtOAc:hexane, 1:3) gave the desired product (50 mg, 60%). 1H NMR (500 MHz): δ = 5.53 (d, J = 5.0 Hz, 1H), 4.63-4.60 (m, 1H), 4.57 (ddd, J = 5.2, 14.3, 46.9 Hz, 1H), 4.56-4.44 (m, 1H), 4.33 (dd, J = 2.5, 5.0 Hz, 1H), 4.25 (dd, J = 2.0, 7.9 Hz, 1H), 4.10-4.03 (m, 1H), 1.53 (s, 3H), 1.44 (s, 3H), 1.33 (s, 6H). 13C NMR (125 MHz): δ = 110.0, 108.9, 96.3, 82.2 (d, J = 168.0 Hz), 70.7, 70.6 (d, J = 6.2 Hz), 70.5, 66.7 (d, J = 22.6 Hz), 26.1, 26.0, 25.0, 24.5. HRMS: calculated for C12H19O5F [(M + Na)+] 285.1114, found 285.1110.

6-Fluoro-L-galactopyranoside (5). A solution of 1,2,3,4-Di-O-isopropylidene-6-deoxy-6-fluoro-L-galactopyranoside (8) (48 mg, 0.18 mmol) in TFA/H2O (9/1, 5 mL) was stirred at room temperature for 2 h. Toluene (10 mL) was then added and the diluted reaction mixture was evaporated under reduced pressure. Purification by gel filtration (P2 resin) followed by lyophilization yielded the desired product as an off-white powder (31 mg, 94%). HRMS: calculated for C6H11O5F [(M + Na)+] 205.0483, found 205.0483.


6-O-Tosyl-6-deoxy-1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (9). To a solution of 1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (7) (0.085 g, 0.35 mmol) in pyridine (1 mL) was added dropwise a solution of p-toluenesulfonyl chloride (0.198 g, 1.04 mmol) in CH2Cl2 (1 mL), and the reaction mixture was stirred overnight. After the addition of water (1 mL), the reaction mixture was stirred for 10 min and then concentrated with toluene. The residue was then suspended in CH2Cl2 (15 mL), washed with water (3 x 15 mL) and NaHCO3 (2 x 15 mL), dried with MgSO4, and concentrated. Purification by column chromatography (EtOAc/hex, 1:2) gave the desired product as a syrup (122 mg, 85%). 1H NMR (500 MHz): δ = 7.79 (d, J = 8.3 Hz, 2H), 7.31 (d, J = 8.0 Hz, 2H), 5.43 (d, J = 5.0 Hz, 1H), 4.57 (dd, J = 2.5, 7.9 Hz, 1H), 4.27 (dd, J = 2.5, 5.0 Hz, 1H), 4.20 (m, 1H), 4.18 (m, 1H), 4.08 (m, 1H), 4.03 (m, 1H), 2.42 (s, 3H), 1.48 (s, 3H), 1.33 (s, 3H), 1.30 (s, 3H), 1.26 (s, 3H). 13C NMR (125 MHz): δ = 144.8, 133.0, 129.8, 128.2, 109.7, 109.2, 96.2, 70.7, 70.5, 70.4, 68.3, 66.0, 26.1, 25.9, 25.0, 24.5, 21.7. HRMS: calculated for C19H26O8S [(M + Na)+] 437.1241, found 437.1210.


6-Azido-6-deoxy-1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (10). To a solution of 6-O-Tosyl-6-deoxy-1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (9) (120 mg, 0.29 mmol) in DMF was added sodium azide (94 mg, 1.45 mmol). The reaction was stirred at 110 °C for 36 h. The reaction mixture was concentrated under reduced pressure. Purification by column chromatography (EtOAc/hex, 1:6) gave the desired product as a light yellow syrup (76 mg, 92%). 1H NMR (500 MHz): δ = 5.53 (d, J = 5 Hz, 1H), 4.62 (dd, J = 2.5, 7.9 Hz, 1H), 4.32 (dd, J = 2.5, 5.0 Hz, 1H), 4.18 (dd, J = 2.0, 7.9 Hz, 1H), 3.91 (m, 1H), 3.50 (dd, J = 7.8, 12.7 Hz, 1H), 3.35 (dd, J = 5.3, 12.7 Hz, 1H), 1.54 (s, 3H), 1.45 (s, 3H), 1.33 (s, 3H), 1.32 (s, 3H). 13C NMR (125 MHz): δ = 109.8, 108.9, 96.5, 71.3, 71.0, 70.5, 67.1, 50.8, 26.2, 26.1, 25.0, 24.6. HRMS: calculated for C12H19N3O5 [(M + Na)+] 308.1217, found 308.1217.
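The molar amounts and isolated yields quoted in these procedures follow from the masses and molecular formulas. As a rough illustration, the sketch below recomputes the figures for the conversion of tosylate 9 into azide 10, assuming rounded average atomic masses and a 1:1 stoichiometry between starting material and product. The helper names (`molar_mass`, `mmol`, `percent_yield`) are hypothetical and introduced only for this example.

```python
from typing import Mapping

# Average atomic masses (g/mol), rounded standard values.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(formula: Mapping[str, int]) -> float:
    """Return the molar mass in g/mol for a formula given as element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mmol(mass_mg: float, formula: Mapping[str, int]) -> float:
    """Convert a mass in mg to an amount in mmol (mg divided by g/mol)."""
    return mass_mg / molar_mass(formula)


def percent_yield(start_mg: float, start_formula: Mapping[str, int],
                  product_mg: float, product_formula: Mapping[str, int]) -> float:
    """Isolated yield in percent, assuming one mole of product per mole of start."""
    return 100.0 * mmol(product_mg, product_formula) / mmol(start_mg, start_formula)


# Formulas and masses taken from the procedures for compounds 9 and 10.
tosylate_9 = {"C": 19, "H": 26, "O": 8, "S": 1}
azide_10 = {"C": 12, "H": 19, "N": 3, "O": 5}

print(f"tosylate 9: {mmol(120, tosylate_9):.2f} mmol")
print(f"azide 10 yield: {percent_yield(120, tosylate_9, 76, azide_10):.0f}%")
```

With the quantities given for this step (120 mg of 9, 76 mg of 10), the sketch returns 0.29 mmol and 92%, in agreement with the values reported in the procedure.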

6-Azido-L-galactopyranoside (4). A solution of 6-Azido-6-deoxy-1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (10) (72 mg, 0.25 mmol) in TFA/H2O (9/1, 5 mL) was stirred at room temperature for 2 h. Toluene (10 mL) was then added and the diluted reaction mixture was evaporated under reduced pressure. Purification by gel filtration (P2 resin) followed by lyophilization yielded the desired product as an off-white powder (50 mg, 96%). HRMS: calculated for C6H11O5N3 [(M + Na)+] 228.0591, found 228.0598.

1,2,3,4-Di-O-isopropylidene-6-aldehydo-L-galactopyranoside (11). A suspension of 1,2,3,4-Di-O-isopropylidene-L-galactopyranoside (7) (0.080 g, 0.31 mmol) and 45% 2-iodoxybenzoic acid (IBX) (0.574 g, 0.92 mmol) in EtOAc (5 mL) was refluxed for 4 h. The reaction mixture was quenched by addition of saturated Na2CO3. The organic layer was separated and the aqueous layer was extracted with EtOAc (2 x 10 mL). The organic layers were combined, dried with MgSO4, and concentrated. Purification by column chromatography (EtOAc/hex, 1:1) gave the desired product as a colorless oil (59 mg, 75%). 1H NMR (500 MHz): δ = 9.60 (s, 1H), 5.65 (d, J = 4.9 Hz, 1H), 4.63 (dd, J = 2.5, 7.8 Hz, 1H), 4.58 (dd, J = 2.2, 7.8 Hz, 1H), 4.37 (dd, J = 2.5, 4.9 Hz, 1H), 4.18 (d, J = 2.2 Hz), 1.49 (s, 3H), 1.43 (s, 3H), 1.34 (s, 3H), 1.30 (s, 3H). 13C NMR (125 MHz): δ = 200.3, 110.2, 109.2, 96.4, 73.3, 71.9, 70.6, 70.5, 26.1, 25.9, 24.9, 24.4. HRMS: calculated for C12H18O6 [(M + Na)+] 281.0996, found 281.0889.


1,2,3,4-Di-O-isopropylidene-6-ethynyl-L-galactopyranoside (12). To a solution of 1,2,3,4-Di-O-isopropylidene-6-aldehydo-L-galactopyranoside (11) (40 mg, 0.15 mmol) and K2CO3 (43 mg, 0.31 mmol) in anhydrous MeOH (2 mL) was added dimethyl 1-diazo-2-oxopropylphosphonate (36 mg, 0.19 mmol). The reaction mixture was stirred for 5 minutes and then concentrated under reduced pressure. EtOAc (10 mL) and H2O (10 mL) were added. The aqueous layer was extracted with EtOAc (2 x 5 mL), and the organic layers were combined, dried with Na2SO4, and concentrated. Purification by column chromatography (EtOAc/hex, 1:3) gave the desired product as a colorless oil (24 mg, 63%). 1H NMR (500 MHz): δ = 5.55 (d, J = 5 Hz, 1H), 4.62 (dd, J = 2.5, 7.8 Hz, 1H), 4.60 (t, J = 2.2 Hz, 1H), 4.31 (dd, J = 2.5, 5.1 Hz, 1H), 4.28 (dd, J = 2.1, 7.7 Hz, 1H), 2.53 (d, J = 2.3 Hz, 1H), 1.54 (s, 3H), 1.52 (s, 3H), 1.38 (s, 3H), 1.33 (s, 3H). 13C NMR (125 MHz): δ = 110.1, 109.1, 96.5, 78.9, 74.6, 72.7, 70.8, 70.3, 60.2, 26.2, 26.1, 24.9, 24.5. HRMS: calculated for C13H18O5 [(M + Na)+] 277.1046, found 277.0980.

6-Ethynyl-L-galactopyranoside (6). A solution of 1,2,3,4-Di-O-isopropylidene-6-ethynyl-L-galactopyranoside (12) (20.1 mg, 0.08 mmol) in TFA/H2O (9/1, 5 mL) was stirred at room temperature for 3 h. Toluene (10 mL) was added and the diluted reaction mixture was evaporated under reduced pressure. Purification by gel filtration (P2 resin) followed by lyophilization yielded the desired product as a white powder (13 mg, 94%). HRMS: calculated for C7H10O5 [(M + Na)+] 197.0420, found 197.0419.


NMR Spectra for section 5.2.


1H NMR GDP-fucose derivative 2


1H NMR GDP-fucose derivative 3


1H NMR GDP-fucose derivative 4


1H NMR GDP-fucose derivative 5


1H NMR GDP-fucose derivative 6
