On the engineering of proteins: methods and applications for carbohydrate-active enzymes

Fredrika Gullfot

Doctoral Thesis in Biotechnology Stockholm, Sweden 2010

© Fredrika Gullfot

School of Biotechnology Royal Institute of Technology AlbaNova University Centre SE-106 91 Stockholm Sweden

Printed at US-AB Universitetsservice

TRITA-BIO Report 2010:14 ISSN 1654-2312

ISBN 978-91-7415-709-3


ABSTRACT

This thesis presents the application of different engineering methods on enzymes and non-catalytic proteins that act upon xyloglucans. Xyloglucans are polysaccharides found as storage polymers in seeds and tubers, and as cross-linking glucans in the cell wall of plants. Their structure is complex with intricate branching patterns, which contribute to the physical properties of the polysaccharide including its binding to and interaction with other glucans such as cellulose.

One important group of xyloglucan-active enzymes is encoded by the GH16 XTH gene family in plants, including xyloglucan endo-transglycosylases (XET) and xyloglucan endo-hydrolases (XEH). The molecular determinants behind the different catalytic routes of these homologous enzymes are still not fully understood. By combining structural data and molecular dynamics (MD) simulations, detailed insight was gained into enzyme-substrate interaction. Furthermore, a pilot study was performed using structure-guided recombination to generate a restricted library of XET/XEH chimeras.

Glycosynthases are hydrolytically inactive mutant glycoside hydrolases (GH) that catalyse the formation of glycosidic linkages between glycosyl fluoride donors and glycoside acceptors. Different enzymes with xyloglucan hydrolase activity were engineered into glycosynthases, and characterised as tools for the synthesis of well-defined, homogeneous xyloglucan oligo- and polysaccharides with regular substitution patterns.

Carbohydrate-binding modules (CBM) are non-catalytic protein domains that bind to polysaccharidic substrates. An important technical application involves their use as molecular probes to detect and localise specific carbohydrates in vivo. The three-dimensional structure of an evolved xyloglucan binding module (XGBM) was solved by X-ray diffraction. Affinity-guided directed evolution of this first generation XGBM resulted in highly specific probes that were used to localise non-fucosylated xyloglucans in plant tissue sections.

Keywords: enzyme engineering, rational design, directed evolution, DNA shuffling, glycosynthase, xyloglucan, xyloglucan endo-transglycosylase, retaining glycoside hydrolase, xyloglucanase, carbohydrate binding module, polysaccharide synthesis


SUMMARY

This thesis describes how various methods of so-called protein engineering have been applied to enzymes and non-catalytic proteins that are active on xyloglucans. Xyloglucans are polysaccharides that occur as storage carbohydrates in seeds and tubers, and that form cross-linking glucan chains in the cell walls of plants. The structure is complex, and different branching patterns contribute to the physical properties of the polysaccharide, such as its binding to and interaction with other glucans, for example cellulose.

One important group of xyloglucan-active enzymes is encoded by the plant gene family XTH in GH16: xyloglucan endo-transglycosylases (XET) and xyloglucan endo-hydrolases (XEH). The molecular causes of the different catalytic routes of these homologous enzymes are not yet known. By combining structural data and MD simulations, interesting facts were revealed about the interaction between enzyme and substrate. In addition, a pilot study was carried out using structure-based recombination to create a restricted library of XET/XEH hybrids.

Glycosynthases are hydrolytically inactive mutated glycoside hydrolases (GH) that catalyse the formation of glycosidic bonds between glycosyl fluorides and acceptor glycosides. Various enzymes with xyloglucanase activity were rebuilt into glycosynthases, and characterised as tools for synthesising well-defined and homogeneous xyloglucans with regular branching patterns.

Carbohydrate-binding modules (CBM) are non-catalytic protein domains that bind to polysaccharide substrates. An important technical application is their use as molecular probes to detect and localise specific carbohydrates in vivo. The three-dimensional structure of an evolved xyloglucan-binding module (XGBM) was solved by X-ray diffraction. Affinity-based directed evolution of this first-generation XGBM yielded highly specific probes that were used to detect non-fucosylated xyloglucans in plant tissue sections.


“The function of the scientist is to know, while that of the engineer is to do. The scientist adds to the store of verified, systematized knowledge of the physical world; the engineer brings this knowledge to bear on practical problems.” - Encyclopedia Britannica

To my family, a remarkable pool of genes and activities


LIST OF PUBLICATIONS

I Kathleen Piens,* Maria Henriksson,* Fredrika Gullfot, Marie Lopez, Régis Fauré, Farid M. Ibatullin, Tuula T. Teeri, Hugues Driguez and Harry Brumer (2007). Glycosynthase activity of hybrid aspen xyloglucan endo-transglycosylase PttXET16-34 mutants. Org. Biomol. Chem. 5 (24): 3971-3978. * These authors contributed equally to the work

II Fredrika Gullfot, Farid M. Ibatullin, Gustav Sundqvist, Gideon Davies and Harry Brumer (2009). Functional characterization of xyloglucan glycosynthases from GH7, GH12 and GH16 scaffolds. Biomacromolecules 10 (7): 1782-1788.

III Pekka B. Mark, Martin Baumann, Jens Eklöf, Fredrika Gullfot, Gurvan Michel, Åsa Kallas, Tuula T. Teeri, Harry Brumer and Mirjam Czjzek (2009). Analysis of nasturtium TmNXG1 complexes by crystallography and molecular dynamics provides detailed insight into substrate recognition by family GH16 xyloglucan endo-transglycosylases and endo-hydrolases. Proteins 75 (4): 820-836.

IV Fredrika Gullfot, Tuula T. Teeri and Harry Brumer (2010). Design of GH16 XET/XEH chimeric enzymes with SCHEMA. Manuscript.

V Laura von Schantz, Fredrika Gullfot, Sebastian Scheer, Lada Filonova, Lavinia Cicortas Gunnarsson, James E. Flint, Geoffrey Daniel, Eva Nordberg-Karlsson, Harry Brumer and Mats Ohlin (2009). Affinity maturation generates greatly improved xyloglucan-specific carbohydrate binding modules. BMC Biotechnology 9 (92).

VI Fredrika Gullfot, Tien-Chye Tan, Laura von Schantz, Eva Nordberg Karlsson, Mats Ohlin, Harry Brumer and Christina Divne (2009). The crystal structure of XG-34, an evolved xyloglucan-specific carbohydrate-binding module. Proteins 78 (3): 785-789.


The author’s contribution:

Publication I: Experimental design and mathematical modelling, pH profiling and kinetic experiments with PttXET16-34 glycosynthases together with Maria Henriksson.

Publication II: Design, cloning and expression of TmNXG1 glycosynthases, experimental design, characterisation including kinetics of all presented glycosynthases, synthesis of homoxyloglucans incl. product analysis by HPAEC-PAD and SEC-ELS. Writing of the manuscript including figures and tables.

Publication III: Protein expression and purification, comparison of structural and MD simulation data and drawing of ligand plots.

Publication IV: Design of study and all experimental work in silico and in vitro, writing of manuscript.

Publication V: Binding studies with isothermal titration calorimetry on presented modules, writing of relevant sections including figures.

Publication VI: Protein crystallisation and optimisation, drafting of manuscript (excluding data collection) and figures.

Other contributions relevant to this thesis:

Expression of TmNXG1, cloning and expression of TmNXG1-ΔYNIIG, assistance in drafting of the manuscript for: Baumann et al. (2007). Structural evidence for the evolution of xyloglucanase activity from xyloglucan endo-transglycosylases: biological implications for cell wall metabolism. Plant Cell 19(6):1947-1963.


LIST OF ABBREVIATIONS

AE      Affinity electrophoresis
CBM     Carbohydrate binding module
CNP     Chloro nitrophenyl
DMSO    Dimethyl sulfoxide
dNTP    Deoxyribonucleotide triphosphate
dsDNA   Double-stranded DNA
ELS     Evaporative light scattering
epPCR   Error-prone PCR
FITC    Fluorescein isothiocyanate
Fuc     Fucose
Gal     Galactose
GFC     Gel filtration chromatography
GH      Glycoside hydrolase
Glc     Glucose
GPC     Gel permeation chromatography
HPAEC   High-performance anion-exchange chromatography
HTS     High-throughput screening
ITC     Isothermal titration calorimetry
mAb     Monoclonal antibody
MD      Molecular dynamics
PAD     Pulsed amperometric detection
PCR     Polymerase chain reaction
SEC     Size exclusion chromatography
ssDNA   Single-stranded DNA
XET     Xyloglucan endo-transglycosylase
XEH     Xyloglucan endo-hydrolase
XGBM    Xyloglucan binding module
XGO     Xylogluco-oligosaccharide
Xyl     Xylose


TABLE OF CONTENTS

1 Introduction
  1.1 Carbohydrate-active enzymes
    1.1.1 Glycoside hydrolases
  1.2 Xyloglucan
    1.2.1 Structure and nomenclature
    1.2.2 Applications
  1.3 Proteins under investigation: GH, XET, XEH and CBM
    1.3.1 Xyloglucanase activity and CAZy classification
    1.3.2 GH16 XTH model enzymes: TmNXG1 and PttXET16-34
    1.3.3 Carbohydrate binding modules

2 Methods and applications
  2.1 Rational design: site-directed mutagenesis
    2.1.1 Application: glycosynthases
    2.1.2 Application: structure-function studies
  2.2 Directed evolution: non-recombination methods
    2.2.1 Screening and selection
    2.2.2 Application: engineered CBMs as xyloglucan-specific probes
  2.3 Directed evolution:
    2.3.1 Structure-guided recombination with SCHEMA
    2.3.2 Application: recombination of GH16 XET/XEH genes

3 Analytical Techniques
  3.1 Measuring glycosynthase activity with a fluoride ion selective electrode
  3.2 Colorimetric XET activity assay
  3.3 Protein-ligand binding studies by isothermal titration calorimetry (ITC)
  3.4 Protein crystallisation
  3.5 Carbohydrate analysis by HPAEC-PAD
  3.6 Polysaccharide analysis by SEC-ELS

4 Aim of investigation

5 Results and discussion
  5.1 Engineering of glycoside hydrolases into glycosynthases for the production of regularly substituted XGOs (publications I and II)
  5.2 Structure-function studies of an engineered xyloglucan hydrolase by crystallography and molecular dynamics simulations (publication III)
  5.3 Structure-guided recombination of GH16 XET/XEH with SCHEMA (paper IV)
  5.4 Engineered carbohydrate binding modules as molecular probes for xyloglucan (publications V and VI)
    5.4.1 The three-dimensional structure of XG-34
    5.4.2 Improved xyloglucan binding modules (XGBM)

6 Concluding remarks and outlook

7 Acknowledgements

8 References


1 INTRODUCTION

During the drafting of this thesis in spring 2010, the J. Craig Venter Institute (JCVI) announced the creation of the first self-replicating synthetic bacterial cell. At the heart of this accomplishment lies our capability to manipulate genetic material at will, including its redesign and synthesis de novo. Beyond the expected publication in Science (Gibson et al. 2010), the news caused a stir in media world-wide, from sensationalist items in the tabloids to featured essays in prestigious political magazines (Aftonbladet 2010; Economist 2010). Synthetic biology, a hot niche within the biomolecular sciences, had made it onto the public agenda.

Far from the media hype created by Craig Venter’s clever marketers (the news was released by his company, Synthetic Genomics Inc., on the day before the official announcement by the JCVI), protein scientists and molecular biologists carry on their daily work, routinely employing the very same methods for the modification of genes, proteins and microorganisms to obtain novel functions for an ever-growing plethora of applications. Already mainstream, this fairly recent area of engineering is one of the most empowering techniques available to mankind.

The ability to redesign existing proteins has been exploited extensively in both industry and academia, from mundane uses such as common washing powders (Vonderosten et al. 1993), to the latest state-of-the-art cancer therapy (Löfblom et al. 2010). Proteins are routinely re-engineered to provide clues about their inherent mechanisms (Baumann et al. 2007), or to attain entirely novel catalytic properties (Gullfot et al. 2009a). Protein engineering also plays a profound role in advanced projects aimed at complete redesign on the cellular level, for example when introducing foreign biochemical pathways into heterologous hosts for the production of important therapeutic compounds (Dietrich et al. 2009).

The presented work concerns the engineering of carbohydrate-active enzymes and non-catalytic proteins from both plant and microbial origin. While the scientific goals are related to structure-function relationships of the studied enzymes and the plant cell wall and storage polysaccharide xyloglucan, a further ambition is to showcase the power of various protein engineering approaches for the different scientific purposes as described in this thesis.


1.1 Carbohydrate-active enzymes

Carbohydrate-active enzymes are enzymes that are involved in the assembly and breakdown of carbohydrates and glycoconjugates. Due to the immense structural diversity of these substrates, carbohydrate-active enzymes comprise a vast family of proteins in terms of structure, function and specificities. The Carbohydrate Active enZymes (CAZy) database (http://www.cazy.org) (Cantarel et al. 2009), an essential tool in the glycosciences, organises these enzymes into five main classes:

• glycoside hydrolases (GH), comprising glycosidases and transglycosylases that are described in section 1.1.1 below (Davies and Henrissat 1995),

• glycosyl transferases (GT), that catalyse the formation of glycosidic bonds between phospho-activated sugar residues and polysaccharidic or alternate acceptors such as a lipid moiety or a protein (Campbell et al. 1997),

• polysaccharide lyases (PL), that cleave glycosidic bonds by β-elimination (Coutinho and Henrissat 1999),

• carbohydrate esterases (CE), that remove ester-based modifications (Coutinho and Henrissat 1999), and

• carbohydrate binding modules (CBM), non-catalytic protein domains described in section 1.3.3 below (Boraston et al. 2004).

1.1.1 Glycoside hydrolases

Glycoside hydrolases are enzymes that catalyse the hydrolysis of glycosidic linkages between sugar moieties (Sinnott 1990). In nature, their role is the degradation of poly- and oligosaccharides, for example during cell wall turnover in plants, or to gain access to nutrients by animals and microorganisms. The glycoside hydrolase (GH) class in CAZy also includes the transglycosylases, which share the same reaction mechanism. Their role is the re-arrangement of carbohydrates by transglycosylation (Davies and Henrissat 1995).

Catalytic mechanism

Generally, glycoside hydrolases will follow one of two main mechanistic routes: either the inverting or the retaining mechanism, resulting in the inversion or net retention of the anomeric configuration of the donor saccharide (Sinnott 1990; Zechel and Withers 1999).

The canonical inverting mechanism is a straightforward, one step, single-displacement reaction, shown in Figure 1.

Figure 1: Canonical inverting mechanism of glycoside hydrolysis.

A carboxylic acid residue acts as a general base, and activates a water molecule that will perform a nucleophilic attack on the anomeric carbon at the centre of the glycosidic bond. Simultaneously, a second carboxylic acid residue will act as a general acid catalyst and permit the breaking of the glycosidic bond and the departure of the leaving group. The overall stereochemistry of the anomeric centre will be inverted in this reaction, yielding an α-sugar from a β-linkage and vice versa.

Retaining glycoside hydrolases perform a more intricate procedure to hydrolyse their substrate with retention of the anomeric configuration. Hydrolysis, or transglycosylation, is performed by a two-step, double-displacement reaction according to a mechanism first described by Koshland (1953) (reviewed in Davies 1995, see Figure 2). In the first step, a carboxylic acid residue acts as a nucleophile and attacks the anomeric carbon, while a second carboxylic acid residue will act as a general acid catalyst and donate a proton to the leaving group. Through this nucleophilic substitution, an enzyme-glycosyl intermediate is formed with the anomeric carbon covalently bound to the nucleophilic residue, in opposite anomeric configuration. The second step of the reaction will differ depending on the catalytic route. In the case of hydrolysis, an incoming water molecule is activated by the now de-protonated acid/base, here acting as a general base catalyst, and in a second nucleophilic substitution the water oxygen attacks the anomeric carbon and releases the glycoside from the covalent bond to the enzyme. In the case of transglycosylation, a hydroxyl group on the incoming acceptor glycoside takes the role of the nucleophile in the reaction instead, resulting in the formation of a new glycosidic bond and thus elongation of the polysaccharide.

Figure 2: Canonical retaining mechanism of glycosyl transfer and hydrolysis. 1. Nucleophilic attack and formation of a covalently bound enzyme-glycosyl intermediate. 2. Nucleophilic substitution by an activated water molecule (R2 = H2O) or incoming acceptor sugar (R2 = sugar), and subsequent release of free sugar.
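The stereochemical outcome of the two mechanisms can be illustrated with a minimal sketch, under the simplifying assumption that every nucleophilic displacement at the anomeric carbon inverts its configuration. The function names and the string labels "alpha" and "beta" are hypothetical and chosen only for illustration; the code tracks the configuration of the anomeric centre and does not model the chemistry itself.

```python
# Illustrative bookkeeping only: each nucleophilic displacement at the
# anomeric carbon is assumed to invert its configuration.
DISPLACEMENTS = {
    "inverting": 1,  # single displacement by an activated water molecule
    "retaining": 2,  # covalent enzyme-glycosyl intermediate, then water or acceptor
}


def invert(config: str) -> str:
    """Return the opposite anomeric configuration."""
    if config not in ("alpha", "beta"):
        raise ValueError(f"unknown anomeric configuration: {config!r}")
    return "beta" if config == "alpha" else "alpha"


def product_configuration(substrate_config: str, mechanism: str) -> str:
    """Anomeric configuration of the product for a given mechanism."""
    if mechanism not in DISPLACEMENTS:
        raise ValueError(f"unknown mechanism: {mechanism!r}")
    config = substrate_config
    for _ in range(DISPLACEMENTS[mechanism]):
        config = invert(config)
    return config


if __name__ == "__main__":
    for mechanism in DISPLACEMENTS:
        result = product_configuration("beta", mechanism)
        print(f"beta-linked substrate, {mechanism} enzyme -> {result} product")
```

In this picture, the covalent intermediate of the retaining route corresponds to the state after the first inversion, which is why it carries the opposite anomeric configuration to both substrate and product.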

1.2 Xyloglucan

Xyloglucans are an important family of polysaccharides found as cross-linking glycans in the cell walls of plants and as storage polysaccharides in seeds (Carpita and McCann 2000). Cross-linking glycans (often referred to as hemicelluloses) are crucial for cell wall flexibility and cell expansion during growth and differentiation. In the so-called type 1 primary cell walls of dicots and non- commelinoid monocots, xyloglucans bind tightly to the exposed glucan chains of the paracrystalline cellulose microfibrils. They connect these cellulose microfibrils in a kind of network, by spanning the distance between them and binding to either other xyloglucan chains or other cellulose fibrils. This cellulose-xyloglucan framework is further embedded in a pectin matrix (Carpita and McCann 2000). In essence, this architecture provides necessary plasticity by permitting the rigid cellulose microfibrils to move relative to each other, without compromising cell wall stability.


1.2.1 Structure and nomenclature

Xyloglucans are based on a linear β-(1→4)-glucan backbone, branched with α-(1→6)-xylose units in regular, repeating patterns. The xylose residues can be further decorated with β-(1→2)-galactoses, which in turn are sometimes extended with α-(1→2)-fucoses. These decorations occur in a tissue- and species-dependent manner (Fry et al. 1993; Hoffman et al. 2005; Pena et al. 2008). An overview of the general structure of xyloglucans and the nomenclature of the building blocks is given in Figure 3 below.

Code   Structure
G      β-D-Glc-(1,4)
X      α-D-Xyl-(1,6)-β-D-Glc-(1,4)
L      β-D-Gal-(1,2)-α-D-Xyl-(1,6)-β-D-Glc-(1,4)
F      α-L-Fuc-(1,2)-β-D-Gal-(1,2)-α-D-Xyl-(1,6)-β-D-Glc-(1,4)

Figure 3: General structure (A) and nomenclature (B) of xyloglucan and its building blocks. In (A), x, y, z = 0 or 1.

Xyloglucans are phylogenetically diverse, and there is a large variety of backbone branching patterns. The most common backbone repeat is the XXXG unit. The widely used xyloglucan from tamarind (Tamarindus indica) seeds is composed of XXXG, XXLG, XLXG, and XLLG motifs, while the presence of fucosylated XXFG and XLFG motifs is characteristic of primary cell wall xyloglucans in dicots.
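The one-letter nomenclature lends itself to simple bookkeeping. As a minimal sketch, assuming each letter stands for one backbone glucose plus its side chain as defined in Figure 3, the monosaccharide composition of a motif can be tallied as follows. The names `UNIT_COMPOSITION`, `motif_composition` and `is_fucosylated` are hypothetical and introduced only for this illustration.

```python
from collections import Counter

# Monosaccharide content of each one-letter unit (G, X, L, F),
# following the nomenclature of Figure 3.
UNIT_COMPOSITION = {
    "G": {"Glc": 1},
    "X": {"Glc": 1, "Xyl": 1},
    "L": {"Glc": 1, "Xyl": 1, "Gal": 1},
    "F": {"Glc": 1, "Xyl": 1, "Gal": 1, "Fuc": 1},
}


def motif_composition(motif: str) -> Counter:
    """Tally the monosaccharide residues in a motif such as 'XXXG'."""
    total = Counter()
    for code in motif:
        if code not in UNIT_COMPOSITION:
            raise ValueError(f"unknown unit code: {code!r}")
        total.update(UNIT_COMPOSITION[code])
    return total


def is_fucosylated(motif: str) -> bool:
    """True if the motif carries at least one fucosylated side chain."""
    return "F" in motif


if __name__ == "__main__":
    for motif in ("XXXG", "XXLG", "XLXG", "XLLG", "XXFG", "XLFG"):
        composition = dict(motif_composition(motif))
        print(motif, composition, "fucosylated:", is_fucosylated(motif))
```

Run on the motifs named above, the sketch separates the tamarind motifs (XXXG, XXLG, XLXG, XLLG), which contain no fucose, from the fucosylated XXFG and XLFG motifs of dicot primary cell walls.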

This elaborate structure is thought to form the basis for xyloglucan solution properties and its interaction with cellulose, and there is a substantial and growing interest in deciphering the relationship between xyloglucan structure and function. Approaches include in muro investigations of the effects of genetically altered xyloglucan composition (Reiter 2002; Cavalier et al. 2008), studies of the association of xyloglucans with bacterial cellulose model systems (Whitney et al. 2006), and biophysical studies of the xyloglucan-cellulose interaction at the molecular level by modelling and experimental work (reviewed by Zhou et al. 2007). While intriguing, the structural complexity poses a serious drawback for relevant in vitro studies, since xyloglucan polymers as found in nature display a large degree of heterogeneity. Selective mutations permit an alternative in vivo approach, by studying genetically modified plants with altered xyloglucan composition such as the Arabidopsis mur1 and mur2 (Reiter 2002) or xxt1 and xxt2 mutants (Cavalier et al. 2008). Still, the lack of suitable substrates renders conclusive in vitro investigations on the role of particular structural elements practically impossible. This fact has stimulated an interest in synthetic methods to obtain xyloglucan mimics and analogues with well-defined structure and decoration patterns, forming the rationale behind the work presented in papers I and II of this thesis. Increased fundamental knowledge about the molecular determinants behind xyloglucan characteristics and behaviour will be essential to harness its full potential for technically advanced applications.

1.2.2 Applications

Bulk xyloglucan is used in several industries, in fairly low-tech applications that take advantage of its general gelling or smoothing properties. In the textile industry, it is used as a sizing agent, and in the food industry it is widely used as a gelling agent. Xyloglucan is also used in papermaking in functions similar to those of starches and galactomannans, to strengthen the sheet and to lower friction (Ahrenstedt et al. 2008). A comprehensive review of such applications can be found in Mishra and Malhotra (2009).

As in the case of many industrially promising polysaccharides, the extraordinary properties of xyloglucans have attracted interest in their application in more advanced technologies (Brumer et al. 2004; Gustavsson et al. 2005; Zhou et al. 2005; Bodin et al. 2007; Zhou et al. 2007). Perhaps the most compelling applications of xyloglucan are those reminiscent of its role in the plant cell wall, the binding to and crosslinking of cellulose microfibrils. In such biomimetic approaches, xyloglucan has a great and promising potential in the field of cutting-edge cellulose-based materials and biocomposites, permitting for example the production of super-hydrophobic or thermoresponsive surfaces (Lindqvist et al. 2008) or reinforced structures (Zhou et al. 2005; Lönnberg et al. 2006; Zhou et al. 2007). The novel and innovative materials produced show great potential for biomedical applications, as high-tech composites, or consumables with extraordinary properties (Bodin et al. 2007).

1.3 Proteins under investigation: GH, XET, XEH and CBM

The work presented in this thesis concerns different proteins which act upon xyloglucans: xyloglucan endo-transglycosylases (XET), xyloglucan endo-hydrolases (XEH), other endo-glucanases with xyloglucanase activity, and non-catalytic carbohydrate binding modules (CBM) with affinity for xyloglucan.

1.3.1 Xyloglucanase activity and CAZy classification

Xyloglucanase activity (EC 3.2.1.151), i.e. the ability to endolytically cleave xyloglucan, described in section 1.2 above, has been found in altogether six different GH families, GH 5, 7, 12, 16, 44 and 74, reviewed in Gilbert et al. (2008). Some glucan hydrolases conventionally denoted as cellulases hydrolyse xyloglucan as well (Vincken et al. 1997), or can perform transglycosylation of xyloglucan substrates (York and Hawkins 2000). In this work, we have considered examples of both strict xyloglucanases and glucan hydrolases with broader substrate specificity, from GH families 7, 12 and 16, all of which employ the canonical double-displacement mechanism of retaining glycosyl transfer described above.

Both families GH7 and GH16 belong to clan B according to the CAZy classification. Based on structural features but also the evolutionary relationships between the respective substrates, it is suggested that GH7 and GH16 share a common ancestor with a β-bulge active site (Michel et al. 2001), see Figure 4. From this ancestor, the cellulases (GH7) and laminarinases (GH16) evolved. The GH7 cellulases, namely endo-1,4-β-glucanases and cellobiohydrolases, have remained relatively well conserved. Family GH16 however evolved into a quite divergent group of glycoside hydrolases, both with regard to structure and specificity. It is also suggested that the lichenases (1,3-1,4-β-glucanases) and XETs (1,4-β-endotransglycosylases) emerged rather recently from the laminarinase branch (1,3-β-glucanase), having evolved a different, non-β-bulged active site. The remaining types of glycosidases in this family are the κ-carrageenases, agarases (1,3-1,4-β-galactanases) and the fungal CRH (for Congo Red Hypersensitive) gene products, putative transglycosylases involved in the transfer of chitin to β-glucans (Michel et al. 2001; Cabib et al. 2008; Eklöf and Brumer 2010).


Several GH7 endo-glucanases show activity on xyloglucan. The GH7 cellulase HiCel7B from Humicola insolens is included in the work on xyloglucan synthesis presented in papers I and II. Other examples of xyloglucanase activity within this family are the endoglucanase EGI/EndoV from Trichoderma viride (Vincken et al. 1997), or the Trichoderma reesei cellulase EG1 (York and Hawkins 2000). Within the GH16 family, xyloglucan activity is abundant, due to the large number of xyloglucan endo-hydrolases (XEH, EC 3.2.1.151) and xyloglucan endo-transglycosylases (XET, EC 2.4.1.207) encoded by the important plant XTH gene family (Eklöf and Brumer 2010).

Family GH12 belongs to clan C, together with GH11, a family of xylanases. The recently characterised Bacillus licheniformis XG12, briefly considered in paper II, was the first reported instance of xyloglucanase activity within the GH12 family (Gloster et al. 2007).

1.3.2 GH16 XTH model enzymes: TmNXG1 and PttXET16-34

Xyloglucan endo-transglycosylases or XETs are enzymes found in the cell walls of plants, first described in the early 1990s (independently by Fry et al. 1992; Nishitani and Tominaga 1992; and Farkas et al. 1992). Their suggested physiological role is to permit cell wall plasticity without compromising structural stability, an essential prerequisite for processes involving cell wall reconstruction such as germination, growth, vascular differentiation and fruit ripening (Carpita and McCann 2000; Popper and Fry 2004; Cosgrove 2005; Brummell 2006). By cleaving and re-ligating the cross-linking xyloglucan polymers, XETs allow the rigid cellulose microfibrils to “slide” relative to each other, and to expand or shrink the space between fibrils. The catalytic mechanism is the canonical retaining mechanism of glycosyl transfer shown in Figure 2 above. The donor polysaccharide is cleaved in the first step, and the nucleophilic substitution in the second step is performed by the C4 hydroxyl group of the incoming acceptor saccharide. Thus, a new glycosidic bond is formed, resulting in transglycosylation and elongation of the polysaccharide, as shown schematically in Figure 5 below.

While all other enzymes in the GH16 family are hydrolases, genuine XETs (EC 2.4.1.207) are strict transglycosylases, except for an important subgroup that has hydrolytic activity. These “hydrolytic XETs”, denoted XEHs (EC 3.2.1.151), are involved in the digestion of storage xyloglucan (Farkas et al. 1992), root elongation (Becnel et al. 2006), and/or fruit ripening by softening the cell wall (Brummell and Harpster 2001).


Figure 4: Proposed evolution of Clan B according to Eklöf and Brumer (2010). Representative structures for cellulases PDB 1CEL (Divne et al. 1994); κ-carrageenases PDB 1DYP (Michel et al. 2001); β-agarases PDB 1O4Y (Allouch et al. 2003); laminarinases PDB 2CL2 (Vasur et al. 2006); lichenases PDB 2AYH (Hahn et al. 1995); XETs and XEHs PDB 1UMZ (Johansson et al. 2004).


Also, both enzymes were used as scaffolds for xyloglucan glycosynthases presented in papers I and II (Piens et al. 2007; Gullfot 2009).

Figure 6: XET/XEH structure and topology. The structure, represented by TmNXG1, shows the two β-sheets stacked on each other, with the active site residues on top. The characteristic C-terminal α-helix extension is shown in the front (Baumann et al. 2007). A topological diagram is shown to the right.

1.3.3 Carbohydrate binding modules

The glycosidic bonds of polysaccharidic structures as found in nature are often difficult to access by the active site of carbohydrate-degrading enzymes. Therefore, many glycoside hydrolases feature carbohydrate binding modules (CBM) in addition to the catalytic domain. Carbohydrate binding modules in general are non-catalytic domains believed to facilitate the function of the catalytic module by serving three possible purposes: to increase the enzyme concentration on the substrate surface, to target the enzyme to its substrate polysaccharide specifically, and/or to loosen and disrupt the structure of the target polysaccharide (Boraston et al. 2004).

Structure and function

CBMs are relatively small domains (30-180 amino acids), usually separated from the catalytic module by a flexible linker. They are generally rich in aromatic amino acids and stabilising cysteines, and often feature metal ion coordination (Boraston et al. 2004). Common folds include β-sandwiches and β-trefoils, but also OB (oligosaccharide/oligonucleotide binding) and hevein folds.

There are presently 59 different families of CBM according to the CAZy classification, which in turn are divided into three main groups based on the topology of the binding site, see Figure 7 (Boraston et al. 2004). Type A CBMs display a flat or platform-like binding surface, where aromatic residues such as tryptophans and tyrosines bind to the substrate by hydrophobic stacking interactions. These surface-binding CBMs bind to insoluble, highly crystalline cellulose or to chitin, and show little or no affinity for soluble carbohydrates. Type B, or glycan-chain binding CBMs feature a binding cleft or groove where soluble polysaccharide chains can be accommodated. As with type A CBMs, aromatic residues are important for binding, but also for substrate specificity depending on side-chain orientation. Also, hydrogen bonds between protein residues and sugars are essential for affinity and specificity of chain-binding CBMs. Finally, type C CBMs have substrate pockets rather than grooves, binding mono-, di- or trisaccharides in a lectin-like manner. Here, the hydrogen-bonding network between protein and ligand is thought to be crucial, and more extensive than in type B modules.

Figure 7: CBM binding site topography. Binding sites are highlighted in red. Example structures for the three types drawn from A) CBM1 from Trichoderma reesei Cel7A (Kraulis et al. 1989); B) CBM4 from Cellulomonas fimi Cel9b (Johnson et al. 1996); C) CBM9 from Thermotoga maritima Xyn10A (Notenboom et al. 2001). Figure taken from Kallas (2006)

CBM function and the role of CBMs in assisting enzymatic catalysis have been the subject of several studies. The proximity effect responsible for increasing the enzyme concentration on the surface of the substrate has been shown by genetically removing the CBM from the catalytic module of the wild-type enzyme. Such truncated constructs display significantly lower activity on insoluble polysaccharides due to lower enzyme concentration at the substrate surface (Bolam et al. 1998; Boraston et al. 2004).

Perhaps most interesting from the perspective of potential technical applications is the ability of CBMs to target the hydrolytic enzyme to its specific substrate, or to specific regions of the polysaccharide. This has been shown in several studies (Carrard et al. 2000; Boraston et al. 2001a; Notenboom et al. 2001; McCartney et al. 2004). This specific targeting capability allows carbohydrate binding modules to be developed into molecular probes for polysaccharide localisation in situ (Knox 2008; von Schantz et al. 2009; Sandquist et al. 2010), which forms the rationale of the work performed in papers V and VI presented in this thesis.

Some carbohydrate binding modules also display a disruptive function and are capable of loosening the polysaccharide structure, such as a family 2 CBM from Cellulomonas fimi Cel6A (Din et al. 1994) and a CBM from Penicillium janthinellum cellobiohydrolase 1 (Gao et al. 2001).

Technical applications

CBMs are small domains that easily fold as independent proteins, attach to generally inexpensive and abundant matrices, and have binding specificities that can be controlled. As such, they are excellent candidates for numerous biotechnological applications. For example, CBMs have successfully been used as fusion tags for affinity-based purification of several proteins (Boraston et al. 2001b; Kavoosi et al. 2004; Rodriguez et al. 2004; Guerreiro et al. 2008). Similarly, CBMs have been used as affinity tags for enzyme immobilisation and processing (Kauffmann et al. 2000; Gustavsson et al. 2001; Rotticci-Mulder et al. 2001; Hwang et al. 2004; Kavoosi et al. 2004). Fusion to CBMs in many cases increases recombinant protein expression levels, and expression vectors (pET34 and pET38) have been developed including CBMs as fusion tags for this purpose (Novy et al. 1997).

The targeting capacity of CBMs to their substrate has been exploited extensively in the textile industry, where cellulases are used for enzymatic stonewashing of denim. Fusion to CBMs enriches the cellulase concentration on the fabric’s surface, resulting in the need for less enzyme and thus saved costs (Cavaco-Paulo 1998). In numerous textile washing powders, CBMs are fused to recombinant enzymes that lack native affinity for cellulosic fibres, to increase enzyme targeting to the fabric (von der Osten et al. 1997a; von der Osten et al. 1997b).

CBMs can also be used for cell immobilisation to different cellulosic surfaces in various applications such as ethanol production, mammalian cell attachment and whole-cell diagnostics, by displaying CBMs on the surface of the cells. This has been done in E. coli (Francisco et al. 1993), Staphylococcus carnosus (Lehtiö et al. 2001) and yeast (Nam et al. 2002).

Due to their substrate specificity, CBMs are valuable as analytical tools in research and diagnostics, for example as molecular probes for the detection of specific polysaccharides in plant tissues as described in this work (von Schantz et al. 2009), and as part of carbohydrate microarrays (Moller et al. 2007). A review of such applications is provided by Knox (2008). Also, novel techniques for the construction of protein microarrays have been developed, by conjugating the array proteins to CBM and printing the array on cellulose-covered glass slides (Ofir et al. 2005).

As a final example, CBMs can be used for fibre modification, for instance due to their non-hydrolytic fibre disruption activity resulting in roughening of cellulosic surfaces (Din et al. 1991). Addition of CBM has been shown to have beneficial effects on mechanical pulp (Suurnakki et al. 2000), and to improve the drainability and mechanical properties of paper when added to paper fibres (Pala et al. 2003). Also, genetically fused CBMs have been developed as novel cross-linking agents, for example for the binding of starch to cellulose (Levy et al. 2004). A comprehensive review on current uses of CBMs in biotechnology can be found in Shoseyov et al. (2006).

2 METHODS AND APPLICATIONS

Protein engineering aims at modifying the structure and function of a protein through mutagenesis by various means. Methods are conventionally grouped into two main categories: rational design and directed evolution (Böttcher and Bornscheuer 2010).

Figure 8: A schematic overview of protein engineering approaches. Methods are dependent on the availability of structural and mechanistic data, suitable high-throughput screens (HTS), and desired output. In practice, routes are not necessarily clear-cut, and approaches are often integrated and combined (figure adapted from Böttcher and Bornscheuer 2010).

Rational design relies on available information about protein sequence, structure and often phylogenetic relationships, and mutations are introduced after a careful analysis and consideration of known parameters in order to obtain the desired result. Directed evolution methods comprise both non-recombining techniques, which introduce random mutations on the parental gene, and recombination methods, where two or more parental genes are shuffled in order to obtain hybrid proteins with novel functions. Directed evolution experiments are typically performed as high-throughput projects, relying heavily on suitable screens for the identification of functional clones.

A comprehensive treatise on available methods is beyond the scope of this thesis, and reviews can be found in Tao and Cornish (2002); Arnold and Georgiou (2003); Lutz and Bornscheuer (2009); Böttcher and Bornscheuer (2010) and references therein. Instead, this chapter is an attempt at illustrating the different approaches, by highlighting their scientific application for the various projects presented in this work.

2.1 Rational design: site-directed mutagenesis

Site-directed mutagenesis was invented by Michael Smith (Hutchison et al. 1978), who received the Nobel Prize in Chemistry in 1993 for this landmark method in molecular biology, together with Kary B. Mullis for his invention of the polymerase chain reaction (PCR) (Saiki et al. 1988). Site-directed mutagenesis provides an example of rational protein engineering, where an existing protein is modified by altering certain, pre-defined residues. These residues are chosen by an informed decision based on three-dimensional protein structure, homology models, or phylogenetic relationships, depending on available information, engineering purpose and desired outcome. Site-directed mutagenesis is performed either for exploratory purposes or to tailor a protein towards a certain function or property.

Although technically simple, single-point mutations provide a very powerful tool in protein science. For example, structure-function relationships can be investigated by mutating certain residues hypothesised to play an important role in protein function, and analysing the results (Planas 2000; Proctor et al. 2005). So-called alanine scans, where presumptive key residues are mutated into alanine, are standard in the systematic analysis of active-site mechanism, and identifying the roles of contributing residues (Dembowski and Kantrowitz 1994). Other protein residues that preferably are mutated to clarify the role of particular features are for example those bearing N- or O-linked glycans (van den Steen et al. 1998), or residues suspected of being responsible for substrate binding (Proctor et al. 2005).

Often it is desirable to alter larger regions of a protein. For example, different N- or C-terminal truncations are common to obtain optimal expression levels, or to facilitate the formation of ordered crystals for X-ray diffraction experiments (Derewenda 2004). In other cases, binding-site loops or otherwise distinctive regions are interesting candidates for redesign, in order to elucidate more profound structure-function relationships (Baumann et al. 2007). However, such attempts often prove more difficult, since conformational effects and side-chain interactions crucial for proper folding are often compromised in the process, and successful redesign generally requires many steps of refinement before yielding a functional protein, if at all.

The methods in use for site-directed mutagenesis generally involve mutagenic oligonucleotides, encoding the desired mutation and flanking wild-type regions. Such mutagenic primers can contain single base pair mutations, or multiple substitutions, insertions or deletions. They are annealed to the DNA of interest, and the subsequently synthesised DNA will incorporate the desired mutations. Different PCR and non-PCR methods exist to amplify the desired mutant gene, and to prepare recombinant vectors for expression. In this work, mutations were introduced with synthetic oligonucleotide mismatch primers. Both forward and reverse primers contain the desired mutation(s). Following a standard thermal cycling protocol, plasmid template dsDNA is denatured, the mutagenic primers anneal to the ssDNA template, and are extended by a proof-reading DNA polymerase during the elongation phase. Resulting heteroduplexes serve as templates in subsequent reactions, thus amplifying mutated plasmid DNA. After thermal cycling, the PCR product is treated with DpnI endonuclease to digest methylated parental DNA template, and the mutated plasmid is transformed into competent E. coli cells for nick repair, amplification and preservation.

Figure 9: Site-directed mutagenesis with Stratagene QuikChange®, based on mutagenic mismatch primers. Picture from Stratagene QuikChange® XLII Kit manual (2004).
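The primer design step described above can be illustrated with a minimal sketch. It assumes a single codon substitution flanked by wild-type sequence on both sides; the function names, the toy template and the flank length of 12 bases are illustrative assumptions and do not reproduce the primers used in this work or the kit manufacturer's design rules.

```python
# Minimal sketch of mismatch-primer design for site-directed mutagenesis.
# All names, the toy template and the flank length are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_mismatch_primers(template: str, codon_start: int, new_codon: str,
                            flank: int = 12) -> tuple[str, str]:
    """Build a complementary primer pair carrying one codon substitution.

    template:    coding strand of the target gene, written 5' to 3'
    codon_start: 0-based index of the first base of the codon to replace
    new_codon:   three-base replacement codon
    flank:       number of wild-type bases kept on each side of the codon
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    if codon_start - flank < 0 or codon_start + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    left = template[codon_start - flank:codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    forward = left + new_codon + right
    # Both primers carry the mutation, so the reverse primer is simply the
    # reverse complement of the forward primer.
    return forward, reverse_complement(forward)


# Toy template: a glutamate codon (GAA) flanked by arbitrary sequence,
# replaced here by an alanine codon (GCT).
template = "ATGGCTAGCCATGAC" + "GAA" + "ATCGATTTCGAGTTC"
fwd, rev = design_mismatch_primers(template, codon_start=15, new_codon="GCT")
print(fwd)
print(rev)
```

In practice, primer length and melting temperature would also have to be checked, which this sketch deliberately leaves out.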

2.1.1 Application: glycosynthases

Site-directed mutagenesis can be performed in order to re-design a wild-type protein towards new functionality. The glycosynthase technology as used in the presented work (papers I and II, Piens et al. 2007; Gullfot et al. 2009a) is a conceptually elegant example of protein engineering aimed at altering the natural reaction mechanism of an enzyme in order to perform novel synthetic tasks. The technology relies on the canonical two-step double-displacement mechanism of glycosyl transfer as employed by retaining glycoside hydrolases presented in section 1.1.1 above.

The active site nucleophile residue, usually a glutamate, is replaced by an inactive residue such as alanine, glycine or serine by site-directed mutagenesis (Mackenzie et al. 1998; Malet and Planas 1998). This prevents the first step of the wild-type retaining mechanism, which is necessary for hydrolysis. However, since the rest of the active site remains intact, the enzyme still possesses the machinery to perform the second step of the retaining mechanism, i.e. the transfer of the enzyme-bound donor glycoside to an incoming acceptor molecule. The stereochemistry of the enzyme-glycosyl intermediate in the wild-type reaction can be mimicked by the provision of glycosyl fluoride donors, in the reverse anomeric configuration of the desired product, i.e. α for a glycosynthase based on a retaining β-glycosidase that will catalyse the formation of a β-linkage between the donor and the acceptor. The α-glycosyl fluoride will fit neatly in the active site donor cleft with the extra cavity created by the mutation, and the reaction required for transglycosylation will proceed from a stereochemically analogous situation to the second step of the canonical retaining mechanism (Figure 10).

Figure 10: Glycosynthase reaction mechanism. The α-glycosyl fluoride mimics the enzyme-glycosyl intermediate in the retaining mechanism of glycosyl transfer, as shown in Figure 2 above. By nucleophilic substitution with fluoride as the leaving group a new glycosidic bond is formed.

The general acid/base residue, here in its function as a general base catalyst analogous to step two of the canonical retaining mechanism of hydrolysis (refer to section 1.1.1 above), activates the acceptor oxygen which in turn attacks the anomeric carbon of the donor. Fluoride departs as the leaving group, and the new glycosidic linkage is formed.

The glycosynthase technology has proven very successful in the synthesis of different oligosaccharides, polysaccharides and glycocompounds, covering a variety of substrate specificities and linkages synthesized (Perugino et al. 2005; Hancock et al. 2006; Faijes and Planas 2007). The technology offers great advantages compared to other available synthetic methods, both traditional and enzymatic. Synthetic steps are fewer versus traditional organic carbohydrate synthesis, and compared to glycosyl transferases and nucleotide sugars, the glycoside hydrolases are generally easy to manipulate and glycosyl donor substrates inexpensive. Last but not least, yields are substantially higher in comparison to glycoside hydrolases employed for kinetically-controlled transglycosylation, due to the incapacitation of the hydrolytic machinery by mutation of the wild-type nucleophilic residue (Crout and Vic 1998; Vocadlo and Withers 2000; Wymer and Toone 2000; Faijes and Planas 2007). Comprehensive reviews of existing glycosynthases and applications can be found in the licentiate degree thesis preceding this work (Gullfot 2009, full text retrievable from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-10178), and in earlier publications by Perugino et al. (2005), Hancock et al. (2006), and Faijes and Planas (2007).

2.1.2 Application: structure-function studies

The explanation of the determinants behind the different catalytic routes in family GH16 XETs and XEHs, i.e. transglycosylation versus hydrolysis, has been an ongoing endeavour in our group (Baumann et al. 2007; Mark et al. 2009). For this purpose, PttXET16-34 and TmNXG1 have successfully been employed as model enzymes in several protein engineering approaches for subsequent structure-function studies.

An interesting and quite impressive structural difference between PttXET16A and TmNXG1 consists of two sequence insertions in TmNXG1 that are absent in PttXET16A, highlighted in Figure 11 below. Insert 1 in TmNXG1 corresponds to the loop connecting β-strands 6 and 7. It is located at the donor site right before the catalytic nucleophile. Insert 2 is on the acceptor side, connecting β-strands 8 and 9. Based on the sequence alignment, a deletion mutant of TmNXG1 was designed, lacking the five insert 2 residues. By site-directed mutagenesis with mismatch primers, these residues were removed from TmNXG1 as the template gene, resulting in the TmNXG1-ΔYNIIG hybrid that was cloned and expressed in Pichia pastoris (Baumann et al. 2007).

PttXET16-34  AALRKP------VDVAFGRNYVPTWAFDHIKYFNGGNEIQLHLDKYTGTGFQSKGSYL 52
TmNXG1       QGPPSPGYYPSSQITSLGFDQGYTNLWGPQHQRVDQGS--LTIWLDSTSGSGFKSINRYR 58
             . .* ..:.*.:.*. *. :* : :*. : : **. :*:**:* . *

insert 1
PttXET16-34  FGHFSMQMKLVPGDSAGTVTAFYLSSQN---SEHDEIDFEFLGNRTGQPYILQTNVFTGG 109
TmNXG1       SGYFGANIKLQSGYTAGVITSFYLSNNQDYPGKHDEIDIEFLGTIPGKPYTLQTNVFIEG 118
             *:*. ::** .* :**.:*:****.:: .:*****:****. .*:** ****** *

insert 2
PttXET16-34  KGD-----REQRIYLWFDPTKEFHYYSVLWNMYMIVFLVDDVPIRVFKNCKDLGVKFPFN 164
TmNXG1       SGDYNIIGREMRIHLWFDPTQDYHNYAIYWTPSEIIFFVDDVPIRRYP--RKSDATFPL- 175
             .** ** **:******:::* *:: *. *:*:******* : :. ...**:

PttXET16-34  QPMKIYSSLWNADDWATRGGLEKTDWSKAPFIASYRSFHIDGCEASVEAKFCATQGARWW 224
TmNXG1       RPLWVYGSVWDASSWATENGKYKADYRYQPFVGKYEDFKLG--SCTVEAASSCNPAS--- 230
             :*: :*.*:*:*..***..* *:*: **:..*..*::. ..:*** ... .:

PttXET16-34  DQKEFQDLDAFQYRRLSWVRQKYTIYNYCTDRSRYPSMPPECKRDRDI 272
TmNXG1       -VSPYGQLSQQQVAAMEWVQKNYMVYNYCDDPTRDHTLTPEC------ 271
             . : :*. * :.**:::* :**** * :* ::.***

Figure 11: Sequence alignment and overlay of PttXET16-34 and TmNXG1 structures. Loop inserts are marked by boxes in the sequence alignment and catalytic residues are highlighted in bold type. In the structural overlay, PttXET16-34 (Johansson et al. 2004) is shown in white with red loops, and TmNXG1 (Baumann et al. 2007) in grey with blue loop inserts.

TmNXG1-ΔYNIIG retained the overall structure of the hydrolytic XEH TmNXG1, but lacks the acceptor side loop, just like the transglycosylating XET PttXET16-34. Indeed, the removal of this loop altered the activity in favour of transglycosylation, with a 5.7-fold lower hydrolytic and two-fold higher transglycosylating activity compared to wild-type TmNXG1, thus suggesting an important role of this loop for the determination of the catalytic route in GH16. Combined with phylogenetic analyses, interesting conclusions could be drawn about evolutionary relationships and the development of enzymatic activities in the GH16 XTH gene family (Baumann et al. 2007). A second study based on the TmNXG1-ΔYNIIG construct, combining X-ray crystallographic data and molecular dynamics (MD) simulations, shed further light on subtle differences in substrate interaction caused by the loop deletion, as described in paper III in this work (Mark et al. 2009).
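The loop deletion and the resulting shift in activity can be made concrete with a short sketch. It operates at the protein level on the acceptor-side segment taken from the alignment in Figure 11; the function name is hypothetical, and the final figure simply multiplies the two reported fold changes at face value.

```python
# Sketch of the insert 2 deletion at the protein level, using the TmNXG1
# segment shown in the Figure 11 alignment. The function name is hypothetical.

def delete_motif(sequence: str, motif: str) -> str:
    """Remove a motif that must occur exactly once in the sequence."""
    if sequence.count(motif) != 1:
        raise ValueError("motif must occur exactly once")
    return sequence.replace(motif, "")


tm_segment = "SGDYNIIGREMRIHLWFDPTQ"   # TmNXG1, acceptor-side region
ptt_segment = "KGDREQRIYLWFDPTK"       # PttXET16-34, same region without gaps

deletion_mutant = delete_motif(tm_segment, "YNIIG")
print(deletion_mutant)
print(len(deletion_mutant) == len(ptt_segment))

# Relative shift in the transglycosylation/hydrolysis ratio implied by the
# reported fold changes (two-fold higher and 5.7-fold lower, respectively).
ratio_shift = 2.0 * 5.7
print(round(ratio_shift, 1))
```

Under this reading, the ratio of transglycosylation to hydrolysis shifts by roughly one order of magnitude relative to wild-type TmNXG1.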

Several attempts have been made to remove insert 1 located at the donor site as well. However, no constructs created by structure- and sequence-based rational design efforts have so far resulted in expressed protein (Baumann et al., unpublished). The region involves a complex network of electrostatic interactions between residues that might be responsible for unfavourable conformational effects upon disruption. Thus, the removal or redesign of insert 1 provides an example of a case where high-throughput engineering approaches might be more appropriate than rational, site-directed mutagenesis, see section 2.3.2 below.

Further site-directed mutagenesis efforts on our XET/XEH model involve active-site residues pin-pointed by the increasing information obtained by phylogenetic analyses, protein structure and MD data. These residues are involved in substrate binding, and differences between the respective enzymes hint towards their potential roles in catalysis. Analysis of these constructs is ongoing (Gullfot, Eklöf and Brumer, unpublished work). In conclusion, our work provides one of many examples of how conceptually simple site-directed mutagenesis can result in profound discoveries such as the revelation of one of the molecular determinants behind the catalytic route in GH16 XET/XEH, but also lead to conclusions about enzyme evolution and phylogenetic relationships (Baumann et al. 2007).

2.2 Directed evolution: non-recombination methods

Directed evolution aims at mimicking the process of Darwinian evolution in nature, where repeating cycles of mutation and selection provide a powerful algorithm to create diversity, as convincingly displayed by the plethora of life. This process can be dramatically accelerated in vitro, by use of various methods to introduce random mutations on a template gene, followed by recombinant expression in a suitable host organism such as E. coli or S. cerevisiae, and subsequent screening and selection of mutant clones with the desired properties. This process can be repeated in several rounds, until the desired degree of mutation is achieved (Figure 12).

Figure 12: General steps in a directed evolution experiment. The selected template gene is randomly mutated to generate a library of mutant genes. The genes are expressed to provide a library of mutant proteins. Screening or selection is performed, and the variants with desired properties are sequenced or used for subsequent rounds of mutagenesis and selection (figure adapted from Tao and Cornish 2002).
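The cycle of mutagenesis, screening and selection can be sketched as a simple loop. The example below is a toy model only: the target motif, the fitness function standing in for the screen, and the library size are illustrative assumptions and do not correspond to any experiment in this work.

```python
import random

# Toy directed evolution cycle: mutate, screen, select, repeat.
# The target string, fitness function and all parameters are illustrative
# assumptions, not taken from the thesis.

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
TARGET = "WYWGH"                    # hypothetical "ideal" binding motif


def fitness(variant: str) -> int:
    """Screen: number of positions matching the hypothetical target."""
    return sum(a == b for a, b in zip(variant, TARGET))


def mutate(parent: str, rng: random.Random) -> str:
    """Mutagenesis: replace one randomly chosen position."""
    pos = rng.randrange(len(parent))
    return parent[:pos] + rng.choice(ALPHABET) + parent[pos + 1:]


def evolve(parent: str, rounds: int, library_size: int, seed: int = 0) -> str:
    """Run several rounds of library generation and selection."""
    rng = random.Random(seed)
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Selection: keep the best variant; the parent stays in the pool,
        # so the score can never decrease from one round to the next.
        parent = max(library + [parent], key=fitness)
    return parent


start = "AAAAA"
best = evolve(start, rounds=10, library_size=200)
print(best, fitness(best))
```

Keeping the parent in the selection pool is a design choice of this sketch; real experiments may instead carry several variants forward into the next round.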

Different physical and chemical mutagens can be used for this purpose, or even special mutator strains of E. coli that exhibit unusually high rates of spontaneous DNA mutagenesis due to genetic deficiencies in their DNA proofreading and editing machinery (Greener et al. 1997). The most commonly used method to introduce random mutations in a sequence in vitro is error-prone PCR (epPCR), also employed in the work presented in paper V of this thesis (von Schantz et al. 2009). epPCR is typically performed with a non-proofreading Taq polymerase, unbalanced ratios of dNTPs, and increased concentration of MgCl2 to stabilise non-complementary nucleotide pairs. This results in the introduction of random mutations, due to replication errors during the PCR amplification of the gene. Dedicated DNA polymerases such as Mutazyme® II DNA polymerase (Stratagene) are commercially available to further increase mutagenic rates and improve the uniformity of the mutational spectrum, i.e. mutations occur with equal frequency at both A-T and G-C positions.
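The effect of epPCR on a gene library can be illustrated with a toy simulation. It assumes that each base is substituted independently with a fixed probability, which is a simplification: real mutational spectra are biased and depend on the polymerase and reaction conditions. The error rate, the template and all names are illustrative assumptions.

```python
import random

# Toy model of error-prone PCR: each base is substituted independently with
# a fixed probability. The rate and the template are illustrative assumptions.

BASES = "ACGT"


def error_prone_copy(template: str, error_rate: float,
                     rng: random.Random) -> str:
    """Copy a DNA template, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # A substitution always changes the base to one of the other three.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def count_mutations(a: str, b: str) -> int:
    """Number of positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(1)
template = "ATGGCTAGCCATGACGAAATCGATTTCGAG" * 10   # 300 bp toy gene
library = [error_prone_copy(template, 0.01, rng) for _ in range(1000)]
mean_mutations = sum(count_mutations(template, v) for v in library) / len(library)
print(mean_mutations)   # expected to lie close to 300 * 0.01 = 3
```

The mean number of mutations per gene scales with gene length and error rate, which is why the mutagenic conditions are usually tuned to the size of the template.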

The commercial significance of directed evolution is profound, permitting the improvement and modification of essential enzymatic properties such as stability, tolerance to non-natural conditions, substrate specificity, enantioselectivity and high catalytic turnover required for industrial biocatalysis (Cherry and Fidantsef 2003). For scientific purposes, directed evolution enables the generation of novel proteins and catalysts for the creation of new biomolecular tools, or to elucidate the natural evolutionary processes (Otten and Quax 2005).

2.2.1 Screening and selection

The great challenge in any directed evolution experiment is to identify and isolate functional clones with the desired properties out of the vast mass of generated mutant variants. This calls for careful design of libraries, and intelligent screening and selection strategies. Screening involves the analysis of clone characteristics, for example catalytic turnover by adding substrates that permit the detection of reaction products by colour or fluorescence. Typically, such screens are performed in a multi-well plate format, as for example the high-throughput screen for XET/XEH activity developed by Kaewthai et al. (2008) described in section 3.2 below. Selection, on the other hand, involves the application of certain conditions where only clones with desirable traits will survive and propagate. Classical examples include the addition of antibiotics to select for resistance, or to omit certain nutrients for selection against auxotrophy.

The choice of either screening or selection is obviously highly dependent on the desirable protein function. Many directed evolution experiments are aimed at enzyme activities for which no obvious high-throughput assays are available, and the development of suitable screening and selection conditions is a major scientific feat in itself. Also, any assay permits the detection of only those traits discerned by that particular method, which is a drawback in cases where directed evolution is employed with the explicit purpose to create great functional diversity within a library.

Phage display

An ingenious way to create combinatorial libraries and permit subsequent selection of clones with the desired affinity towards a specific molecule or substance is by phage display, employed in paper V of this thesis (von Schantz et al. 2009). With this method, the expressed functional protein is linked to the major coat protein of a phage and thus displayed on the surface, while the single-stranded gene encoding the protein is contained within the capsid. Phages encoding proteins with desired properties can thus be “fished” from the library by means of affinity to the substrate, using the very same substrate as the “bait” (Rapley 2000).

The process is conceptually straightforward: the template gene is cloned into a phagemid vector, and the needed rounds of mutations are performed to obtain the desired library of genetically diverse clones. E. coli cells are transformed with the phagemid vectors, and subsequent liquid E. coli cultures are infected with helper phages to provide the necessary accessory proteins for the construction of new phage particles. The DNA of the gene of interest is also translated, with the resulting protein linked to the major coat protein, and thus displayed on the surface of the phage, while the corresponding DNA is encapsulated. Phages are harvested by centrifugation of the culture and filtration of the supernatant. Screening and selection of mutant proteins with the desired properties is performed by exposing the phages to immobilised substrate (Gunnarsson et al. 2004). The phages displaying proteins with the desired affinity will bind to the substrate and are collected for subsequent sequencing and/or expression of larger protein amounts in E. coli.
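Why a few rounds of affinity selection can suffice is illustrated by the following back-of-the-envelope sketch. It assumes that binders and non-binders are each retained on the immobilised substrate with a fixed probability per round; the probabilities and the starting frequency are invented for illustration and are not measured values.

```python
# Toy calculation of binder enrichment over successive panning rounds.
# Recovery probabilities and the starting frequency are illustrative assumptions.

def binder_fraction_after_round(fraction: float, p_binder: float,
                                p_background: float) -> float:
    """Fraction of binders among recovered phages after one round.

    fraction:     fraction of binders in the input population
    p_binder:     probability that a binder phage is retained on the substrate
    p_background: probability that a non-binder is retained non-specifically
    """
    kept_binders = fraction * p_binder
    kept_others = (1.0 - fraction) * p_background
    return kept_binders / (kept_binders + kept_others)


fraction = 1e-6          # one binder per million clones in the starting library
for round_number in range(1, 4):
    fraction = binder_fraction_after_round(fraction, p_binder=0.5,
                                           p_background=0.001)
    print(round_number, f"{fraction:.3g}")
```

With these assumed values the odds of finding a binder improve 500-fold per round, so a rare clone comes to dominate the recovered population within three rounds.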

2.2.2 Application: engineered CBMs as xyloglucan-specific probes

Binding proteins play an important role in biotechnology, and are used extensively as molecular probes for the detection, visualisation and selection of specific biomolecules recognised by the specific binding protein in question. Immunoglobulins (antibodies) are most commonly used for this purpose, produced by injecting antigen into a suitable higher vertebrate organism such as rabbit, mouse, goat or hen, and harvesting the antibodies generated by the animal humoral immune response from the blood serum or egg yolk, or, in the case of monoclonal antibodies, by isolating lymphocytes for the production of hybridoma cells (Thorpe and Thorpe 2000).

There are huge benefits in obtaining binding proteins from non-animal sources, for economic, technical and ethical reasons. Interest has therefore turned to other suitable affinity scaffolds than antibodies, preferably smaller and more stable proteins with natural binding capacities, that can be further regulated by engineering approaches. Such scaffolds include e.g. antibody fragments, lipocalins (Skerra 2008), and the “Affibody” derived from the staphylococcal protein A triple-helix bundle domain (Nygren 2008). Carbohydrate-binding modules (CBM) as introduced in section 1.3.1 above fulfil the criteria of being small and stable proteins with innate affinity to their target molecules, and thus are interesting as scaffolds for the evolution of improved carbohydrate-specific binding probes.

In particular, suitable probes are needed for the detection of the important primary cell wall polysaccharide xyloglucan, to elucidate its role in cell development and microstructure. At the initiation of this study, only one monoclonal antibody (mAb) had been produced for this purpose, specific for fucosylated xyloglucan only (CCRC-M1, Puhlmann et al. 1994). It was therefore desirable to extend the range of molecular probes, such as for the detection of galactosylated xyloglucan. Recently, several new xyloglucan-specific mAbs have been made available, including LM15 for non-galactosylated xyloglucan (Marcus et al. 2008), and a whole set of plant polysaccharide mAbs including a number of mAbs specific for both fucosylated and non-fucosylated xyloglucans (Pattathil et al. 2010).

Preceding this work, the 18 kDa CBM 4-2 from the Rhodothermus marinus xylanase Xyn10A had been used to construct a combinatorial library of variants, based on the mutagenesis of twelve amino acids in the binding cleft that had been identified by NMR to undergo large chemical shifts upon titration with xylooligosaccharide substrate. Mutations were introduced with degenerate primers, and selection of clones was performed by phage display using xylan, cellulose, mannan and also a glycoprotein, human IgG4, as substrates in several selection rounds (Gunnarsson et al. 2004).

This library was further used for the selection of xyloglucan-specific variants by incubating the phages with non-fucosylated xyloglucan immobilised on beads, in the absence or presence of xylan to discard variants that retained the wild-type affinity for xylan (Gunnarsson et al. 2006). Two of the 21 obtained variants, XG-34 and XG-35, bound well to non-fucosylated xyloglucan, but only poorly or not at all to xylan, cellulose, arabinoxylan, β-glucan or fucosylated xyloglucan, thus proving that these two variants had retained the binding capacity to non-fucosylated xyloglucan of the parent, but also achieved exclusive specificity to this substrate, due to loss of affinity towards other substrates.

The three-dimensional structure of the XG-34 variant was solved by X-ray diffraction, and published as paper VI presented in this work (Gullfot et al. 2009b). Furthermore, XG-34 was chosen as the starting scaffold for the creation of two further libraries by affinity maturation, as presented in paper V (von Schantz et al. 2009). The first library was constructed by performing epPCR-based random mutagenesis on the XG-34 gene as the template, while a second round of epPCR was performed to obtain the second library. Selection was performed by phage display in three rounds, to identify clones binding tightly and exclusively to xyloglucan. This resulted in two variants, XG-34/1-X and XG-34/2-VI, whose strong and specific affinity permits their practical use as fluorescein-labeled molecular probes for the detection and visualisation of galactosylated xyloglucan in plant sections (von Schantz et al. 2009). The presented work on the evolved CBM4-2 from R. marinus Xyn10A thus provides an example of how directed evolution approaches can be used to obtain not only extended diversity but also more stringent specificity on suitable scaffolds, in this case to obtain essential new tools for carbohydrate research.

Figure 13: Engineering of CBM4-2 into XGBM by targeted mutagenesis and directed evolution. The first phase of targeted mutagenesis and selection by phage display generated the xyloglucan-specific XG-34 (Gunnarsson et al. 2006). By directed evolution of XG-34 as the parent and phage display selection, new XGBMs with high affinity for xyloglucan were obtained (figure taken from von Schantz et al. 2009).

2.3 Directed evolution: homologous recombination

One drawback of non-recombination directed evolution methods such as error-prone PCR is the limited degree of mutations that can practically be achieved without compromising protein integrity. The probability of a protein to retain its fold and function has been predicted to decrease exponentially with the number of random mutations introduced (Bloom et al. 2005). One strategy to create libraries with high levels of mutation while retaining structure and function is by recombination of homologous genes, as such mutations are generally more compatible with the protein backbone and thus less likely to disrupt the structure (Drummond et al. 2005).
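The exponential decline mentioned above can be written as a one-line model. The sketch assumes that each random substitution is tolerated independently with the same probability, here called nu; both the symbol and the value 0.6 are placeholders chosen for illustration, not measured parameters.

```python
# Fraction of variants expected to retain fold and function after m random
# substitutions, assuming each substitution is tolerated independently with
# probability nu. The value of nu below is a placeholder, not a measured one.

def fraction_functional(m: int, nu: float) -> float:
    """Exponential decline model: nu ** m."""
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie between 0 and 1")
    return nu ** m


for m in (1, 3, 5, 10, 20):
    print(m, round(fraction_functional(m, nu=0.6), 4))
```

Even with a fairly tolerant scaffold, the functional fraction of a library drops quickly once more than a handful of random mutations are introduced per gene, which motivates the recombination strategies described next.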

Recombination of homologous genes is the basis of several popular methods, known as DNA shuffling (Stemmer 1994) or family shuffling (Crameri et al. 1998). In principle, two or more parental genes are cut into fragments, which are then recombined into hybrid progeny, called chimeras. These chimeras often incorporate traits from the parental genes, but commonly also display completely novel functions, resulting in highly diverse libraries that can contain all possible combinations of the mutations present in the parental genes.

In the classical method invented by Stemmer (1994), the parental genes are digested with DNaseI into a pool of random DNA fragments. In a second step, reassembly PCR is performed, where the different fragments will prime each other based on homology. After 20-50 cycles of assembly, a final PCR is performed with primers to selectively amplify full-length sequences for subsequent cloning into an expression vector (Joern 2003). The drawback of this method is that > 70% sequence homology is required among the parental templates, and the propensity of DNaseI to hydrolyse dsDNA next to pyrimidine nucleotides results in a certain sequence bias in the gene fragment pool. Several derivative methods have therefore been developed to overcome these limitations by various means. See Arnold and Georgiou (2003) for a comprehensive overview including protocols.

2.3.1 Structure-guided recombination with SCHEMA

SCHEMA is a powerful computational tool for protein engineering by recombination (Voigt et al. 2002; Endelman et al. 2004; Silberg et al. 2004). The method has been used for the evolution of different enzymes such as β-lactamases (Meyer et al. 2003; Meyer et al. 2006), cytochrome P450s (Otey et al. 2004; Otey et al. 2006; Landwehr et al. 2007), and cellulases (Heinzelman et al. 2009a; Heinzelman et al. 2009b) into an impressive variety of novel enzymatic activities.

SCHEMA uses structural information to predict which fragments, or “schemas”, of homologous proteins can be swapped without disrupting the three-dimensional fold of resulting chimeras, in order to construct highly diverse libraries based on the underlying assumption that functional evolution is achieved by the recombination of conserved structural building blocks.

SCHEMA typically results in diverse libraries significantly enriched in functional proteins compared to other methods. In a SCHEMA-generated library based on cytochrome P450s with ~63% sequence similarity, 47% of the chimeras retained the correct fold (Otey et al. 2006). Similar results can be achieved even for quite distantly related parental sequences, as shown by a SCHEMA library based on three parental β-lactamases with only 34-42% sequence identity. Out of 553 unique characterised chimeras, 111 (20%) retained β-lactamase activity (Meyer et al. 2006). Moreover, both studies showed that the SCHEMA disruption value E is a strong predictive metric, since functionality is significantly enriched amongst chimeras with low E values.

Figure 14: SCHEMA disruption. Amino acid side chains are represented as dots, peptide bonds as grey lines, and interactions between side chains as dotted lines. From the black and white parents, two hybrids are generated. In the first case (left), three interactions are disrupted, resulting in E = 3. In the second case (right), no interactions are disrupted, E = 0. According to the theory, hybrids with least disruptions are most likely to fold. Figure adapted from Voigt et al. (2002).

Based on the known three-dimensional structure of parent proteins, the SCHEMA energy function calculates the SCHEMA disruption value (E) for a chimera according to

$$E = \sum_{i} \sum_{j>i} C_{ij}\,\Delta_{ij},$$

where $C_{ij} = 1$ if any side-chain atoms or main-chain carbons in amino acid residues i and j are within 4.5 Å distance (and $C_{ij} = 0$ otherwise), and $\Delta_{ij} = 0$ if contacts between amino acids i and j in the chimera are already found in any of the parental protein sequences; otherwise $\Delta_{ij} = 1$ (Voigt et al. 2002).
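The disruption count defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published SCHEMA scripts: the function names, the toy five-residue parents and the hand-picked contact list are all hypothetical, and the parental sequences are assumed to be aligned and of equal length.

```python
from itertools import combinations
from math import dist


def contact_pairs(residue_atoms, cutoff=4.5):
    """Residue index pairs (i, j), i < j, with any two atoms within cutoff (in Å).

    residue_atoms holds one list of (x, y, z) tuples per residue, e.g. the
    side-chain atoms and main-chain carbons taken from a parent structure.
    """
    pairs = set()
    for i, j in combinations(range(len(residue_atoms)), 2):
        if any(dist(a, b) <= cutoff
               for a in residue_atoms[i] for b in residue_atoms[j]):
            pairs.add((i, j))
    return pairs


def make_chimera(parents, crossovers, choice):
    """Assemble a chimera from aligned parents.

    crossovers lists the positions where a new block starts; choice gives
    the parent index used for each block (one more entry than crossovers).
    """
    bounds = [0, *crossovers, len(parents[0])]
    return "".join(parents[p][start:end]
                   for p, start, end in zip(choice, bounds, bounds[1:]))


def schema_disruption(contacts, parents, chimera):
    """E: number of contacts whose residue pair occurs in none of the parents."""
    disrupted = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((parent[i], parent[j]) != pair for parent in parents):
            disrupted += 1
    return disrupted


if __name__ == "__main__":
    parents = ["ACDEF", "AGHIF"]          # toy aligned sequences
    contacts = {(0, 1), (1, 3), (2, 4)}   # toy contact map
    chimera = make_chimera(parents, crossovers=[2], choice=[0, 1])
    print(chimera, schema_disruption(contacts, parents, chimera))
```

In the toy case only the contact between positions 1 and 3 pairs residues that never occur together in a parent, so a single interaction is counted as disrupted; each parent scores E = 0 by construction.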

The RASPP (Recombination as a Shortest Path Problem) algorithm selects crossover locations that minimize the average SCHEMA disruption E of the library, and generates optimal libraries for each level of diversity (m) (Endelman et al. 2004). Both the SCHEMA energy function and RASPP including necessary accessory tools are available as downloadable Python scripts from http://www.che.caltech.edu/groups/fha.

2.3.2 Application: recombination of GH16 XET/XEH genes

In continuation of our work with the homologues PttXET16-34 and TmNXG1 as a model system to uncover the structural determinants of transglycosylation versus hydrolysis within GH16 XET/XEH enzymes (see section 2.1.2 above), we were interested in swapping larger regions of these proteins in order to identify potential motifs responsible for the catalytic differences. Despite the success with the removal of the XEH-specific acceptor site loop resulting in the XET/XEH hybrid TmNXG1-ΔYNIIG (Baumann et al. 2007), attempts at re-engineering the TmNXG1-specific loop insert at the donor binding site remained fruitless, probably due to the complex network of electrostatic interactions between neighbouring residues. Since structural information was available for both PttXET16-34 and TmNXG1, we performed a pilot study for the application of SCHEMA to generate a restricted library of chimeras, included as paper IV in this thesis.

Optimal crossover locations were identified with the RASPP algorithm, and a set of crossover points was chosen to provide a fairly even distribution over the full-length sequence. The SCHEMA energy function was applied to calculate the disruption values for the resulting combinatorial library, and a limited set of chimeras was chosen based on low disruption (E) and high diversity (m) as cut-off criteria. The resulting restricted in silico library offered interesting insights on structural motifs biased against recombination, i.e. with a propensity to remain intact, while other regions of the protein were more prone to permit crossovers of the parental sequences. Intriguingly, these findings were in accordance with our experience gained from the loop engineering experiments (Baumann et al. 2007).

While empirical data are currently lacking to support solid conclusions about the validity of these predictions, the potential of SCHEMA as a method for larger-scale gene shuffling projects within the GH16 family is evident. Exploring the functional space inherent in the structural diversity displayed by these related enzymes could not only provide insights into structure-function relationships within this family, but also generate evolved enzymes with enhanced activity and novel substrate specificities.

3 ANALYTICAL TECHNIQUES

Apart from the protein engineering methods described in previous sections, several important experimental methods for the study and characterization of proteins, carbohydrates and protein-substrate interactions have been employed in this work. This chapter provides a brief overview of their theoretical background and practical application.

3.1 Measuring glycosynthase activity with a fluoride ion selective electrode

During enzymatic condensation of α-glycosyl fluorides, the anomeric fluoride of the donor glycoside is released upon condensation with the acceptor hydroxyl group. Thus, the reaction can be followed as an increase of fluoride ion concentration in the reaction solution with a fluoride ion-selective electrode. This permits the detection of glycosynthase activity and the analysis of reaction kinetics, both of which were performed for all glycosynthases characterised in this work.

While autocondensation of α-glycosyl fluorides is possible, it is negligible compared to the enzymatic reaction. Also, some spontaneous hydrolysis of the fluoro substrates occurs, and is accounted for by recording the baseline fluoride ion release before adding enzyme to the reaction solution. During final data analysis, this baseline is subtracted from the recorded rate during the enzymatic reaction.

Fluoride ion concentration is measured as an electrode potential (mV), which has to be converted to the actual concentration of fluoride ions ([F-] in mM). This is done via a standard curve constructed by recording mV readings at known fluoride ion concentrations. Above a certain concentration level (here: > … ppm or … µM), the relation between mV and fluoride ion concentration is a semilogarithmic function of the general form

$$V = p \ln[\mathrm{F}^-] + q \qquad (1)$$

where p and q are specific experimental constants depending on parameters such as pH. For this semilogarithmic region, the conversion function to compute fluoride ion concentration from mV readings is thus given by

$$[\mathrm{F}^-] = e^{(V - q)/p} \qquad (2)$$
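The calibration and baseline correction described in this section can be sketched as follows. This is a minimal illustration, assuming that readings lie in the semilogarithmic region and that fluoride release is approximately linear over the time window analysed; all function names and the synthetic standards (generated from made-up constants p = -25 and q = 100) are hypothetical and do not reproduce the data of this work.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_standard_curve(conc_mM, readings_mV):
    """Fit equation (1), V = p * ln[F-] + q, to the standards; returns (p, q)."""
    return linear_fit([math.log(c) for c in conc_mM], readings_mV)


def fluoride_mM(v_mV, p, q):
    """Equation (2): convert an mV reading into [F-] in mM."""
    return math.exp((v_mV - q) / p)


def enzymatic_rate(t_base, v_base, t_rxn, v_rxn, p, q):
    """Fluoride release rate (mM per time unit) after baseline subtraction.

    The baseline slope (before enzyme addition) is subtracted from the slope
    recorded during the enzymatic reaction.
    """
    base_rate, _ = linear_fit(t_base, [fluoride_mM(v, p, q) for v in v_base])
    rxn_rate, _ = linear_fit(t_rxn, [fluoride_mM(v, p, q) for v in v_rxn])
    return rxn_rate - base_rate


if __name__ == "__main__":
    # Synthetic standards from made-up constants, for illustration only.
    standards_mM = [0.1, 1.0, 10.0]
    readings_mV = [-25.0 * math.log(c) + 100.0 for c in standards_mM]
    p, q = fit_standard_curve(standards_mM, readings_mV)
    print(p, q, fluoride_mM(100.0, p, q))
```

Fitting V against ln[F-] keeps the regression linear, so p and q follow directly from the standards and equation (2) is simply the inverse of the fitted line.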

3.2 Colorimetric XET activity assay

A colorimetric method to measure transglycosylation and/or hydrolytic activity on xyloglucan was first described by Sulova et al. (1995). Detection is based on tri-iodide anions binding to xyloglucan chains, forming a greenish coloured complex. The enzyme is added to a mix of xyloglucan (XG) and xyloglucan oligosaccharides (XGO) to assay transglycosylation, or XG only for hydrolysis. The solution is incubated at 30 °C for a certain amount of time, typically 30 min. The enzymatic reaction is stopped by lowering the pH with HCl, a colouring solution with potassium tri-iodide and sodium sulfate is added, and the result is analyzed spectrophotometrically by measuring A620. The difference in absorbance compared to a control sample without enzyme indicates the extent of transglycosylation (XG + XGO) or hydrolysis (XG only). In the range A = 0.1 to A = 0.3, absorbance is linearly dependent on degree of transglycosylation (XG + XGO) or hydrolysis (XG only), and will provide a fair approximation of the level of activity.

This assay is routinely used in this work to test for XET/XEH activity, and has also been developed into a 96-well microtitre plate format suitable for high-throughput screening of XET/XEH expression (Kaewthai et al. 2008).

3.3 Protein-ligand binding studies by isothermal titration calorimetry (ITC)

Non-covalent interactions between biomolecules such as proteins and polysaccharides involve enthalpy changes as bonds are formed and released. Binding of a protein to its ligand can thus be followed by measuring the interaction enthalpy as heat released, by means of isothermal titration calorimetry (ITC). An analysis of the data obtained by a series of ITC experiments provides accurate quantification of affinity, stoichiometry and thermodynamics of binding between two molecules (Ladbury and Chowdhry 1996; O'Brien et al. 2000). ITC experiments in this work were performed to analyse the binding of the evolved xyloglucan binding modules to different ligands, and to determine the relevant thermodynamic constants.

The reaction cell of the titration calorimeter contains protein in solution, and the temperature of the cell is continuously measured. Ligand is injected at certain time intervals, and the heat of interaction is measured for each individual injection, by observing the temperature change of the reaction cell. As the protein in the reaction cell becomes saturated with ligand, the heat signal eventually diminishes. Heats of injection are then integrated with respect to time into a binding isotherm, and fitted to the relevant binding model by non-linear regression. In this work, this was performed using the Origin® software for the analysis of ITC data (Figure 15).

Figure 15: Example of data from a typical ITC experiment. The upper panel shows the raw data, i.e. the thermal output for each injection (heats of injection). The area under each peak is equal to the total heat released during the injection. The lower panel shows the sigmoidal binding isotherm obtained by integrating these heats, plotted against the molar ratio of ligand in the reaction cell. Curve fitting results including thermodynamic parameters are shown in the box.

The single-site binding model used in our case contains the fitting parameters n (number of sites), Ka (binding constant in M⁻¹) and ΔH (heat change in cal/mole), for which initial guesses are provided. Taking the change in volume for each injection into account, the corrected heat released from each injection is calculated and compared to the measured heat from the corresponding experimental injection. Ka and ΔH values are then improved by standard Marquardt methods in an iterative process (Microcal manual, ITC Data analysis in Origin®, 1998). Finally, from ΔH and Ka, ΔS (entropy change in cal/mole/deg) can be calculated from the fundamental thermodynamic relationships ΔG = -RT ln Ka = ΔH - TΔS (Gibbs energy) (Ladbury and Chowdhry 1996).
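The final step, deriving the Gibbs energy and the entropy change from the fitted parameters, can be written out as a short sketch. The function name, the default temperature of 298.15 K and the example values for Ka and ΔH are illustrative assumptions and not results from this work.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol·K)


def binding_thermodynamics(ka_per_M, dH_cal_per_mol, temp_K=298.15):
    """Return (dG, dS) from a fitted binding constant and enthalpy change.

    dG = -R * T * ln(Ka)      in cal/mol
    dS = (dH - dG) / T        in cal/(mol·K)
    """
    dG = -R_CAL * temp_K * math.log(ka_per_M)
    dS = (dH_cal_per_mol - dG) / temp_K
    return dG, dS


if __name__ == "__main__":
    # Hypothetical fit result: Ka = 1e5 M^-1, dH = -10 kcal/mol.
    dG, dS = binding_thermodynamics(1.0e5, -10000.0)
    print(round(dG, 1), round(dS, 2))
```

A negative ΔS obtained this way indicates that the favourable enthalpy of binding is partly offset by an entropic penalty.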

3.4 Protein crystallisation

Determination of the three-dimensional structure of a protein by X-ray diffraction requires the protein in a crystalline form. In principle, protein crystallisation is the process of getting a solubilised protein out of the liquid phase into a solid state in the form of ordered crystals. The path from liquid into solid phase is assisted by precipitants, in essence solubility reducing agents that facilitate supersaturation of the protein solution (Luft and DeTitta 2009).

Figure 16: Phase diagram of a protein crystallisation experiment. Phases are determined by the concentration of protein and precipitant. Arrows indicate the path from soluble protein to protein crystals. As the protein solution becomes supersaturated, nucleation of crystals can occur in the labile phase. Nuclei can eventually grow into ordered crystals, as the solution enters the metastable phase of supersaturation.

The formation of ordered crystals of sufficient quality for X-ray diffraction experiments is highly dependent on the path from the undersaturated state through the supersaturated region of the phase diagram as shown in Figure 16, proceeding from the soluble phase to a labile phase where nucleation of crystals will occur, and back again to a slightly less saturated metastable phase that permits growth of the crystals (Luft and DeTitta 2009).

The three crucial parameters for a successful crystallisation experiment are thus protein concentration, precipitant concentration, and precipitant type (Luft and DeTitta 2009). These are determined empirically for each single crystallisation project, by screening over a large range of different conditions and combinations thereof (Newman et al. 2005). The precipitant cocktail typically consists of a buffer to control pH and surface charge distribution, precipitating agents such as different salts and polymers, and often chemical additives that affect intermolecular interactions (Chruszcz et al. 2008).

A common method for achieving supersaturation to grow crystals is by vapour diffusion. In this work, a hanging drop method was used, where droplets of protein solution and precipitant are placed on the bottom of a thin glass plate that covers a reservoir filled with precipitant, typically in a 24-well format. This permits controlled dehydration and a smooth transition into supersaturation, due to differences in the vapour pressure of the drop compared to the reservoir solution. Diffusion through the vapour phase from the droplet to the reservoir will thus occur until vapour pressure equilibrium is reached (Benvenuti and Mangani 2007).

Initial screening of the droplets is performed on a daily basis to assess the level of saturation and precipitating patterns, and crystallisation conditions that seem favourable are fine-tuned by additional screens. Eventually, protein crystals might form. These can often be further improved by different seeding techniques, until sufficient quality for X-ray diffraction at satisfactory resolution is achieved (Benvenuti and Mangani 2007). Diffraction is dependent on the repetitive and symmetrical arrangement of atoms in the individual unit cells, and highly ordered crystals are thus a necessity. A comprehensive review on protein crystallography and how to interpret results from X-ray diffraction experiments is provided by Wlodawer et al. (2008).

3.5 Carbohydrate analysis by HPAEC-PAD

High-performance anion-exchange chromatography with pulsed amperometric detection (HPAEC-PAD) is a widely used method in carbohydrate analysis, facilitated by the Dionex ICS-3000 system and carbohydrate-specific CarboPac columns (here: Dionex PA-100 and Dionex PA-200). This method has been used extensively throughout the thesis work for the analysis of glycosynthase products and activity of GH16 XET/XEH chimeric mutants.

Carbohydrates are weakly acidic with pKa values around 12-14 (El Rassi and Nashabeh 1995), and will be partially ionised at pH > 12. This permits selective separation with an anion-exchange stationary phase such as provided by the CarboPac column matrix, which is specifically designed for the separation of mono- and oligosaccharides. In the experiments performed in the present work, the branched oligosaccharides are eluted by a sodium acetate gradient in sodium hydroxide, and the retention time of the different oligosaccharides depends on their size. An overview of the relevant chemistry is presented in Figure 17 below.

Figure 17: Schematic overview of anion exchange chromatography of carbohydrates. The pH is raised by sodium hydroxide and hydroxyl groups of the sugar are ionised to oxyanions. The ionised sugar binds to the anion exchange stationary phase. A gradient of sodium acetate is added as the eluent; the anionic acetate ions outcompete the sugar oxyanions on the stationary phase, and release the sugar from the stationary phase. The sugar is eluted.
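The extent of ionisation at a given pH can be estimated with the Henderson-Hasselbalch relationship. The sketch below treats the sugar as having a single ionisable hydroxyl group with one pKa, which is a simplifying assumption; the function name and the example values are illustrative only.

```python
def ionised_fraction(pH: float, pKa: float) -> float:
    """Fraction of an acidic group present as the anion at the given pH,
    from the Henderson-Hasselbalch relationship (single-site assumption)."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


if __name__ == "__main__":
    # Example pKa values within the 12-14 range quoted in the text.
    for pKa in (12.0, 13.0, 14.0):
        print(pKa, round(ionised_fraction(13.0, pKa), 3))
```

Even small differences in pKa translate into different anionic fractions at high pH, which contributes to the selectivity of the separation.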

Detection by pulsed amperometry is performed as a three-step sequence, schematically shown in Figure 18. Upon elution, the carbohydrates are detected by measuring the electrical current at a certain potential as they are oxidised on the surface of a gold electrode. The potential is then raised to oxidise the gold surface, which causes desorption of the oxidised carbohydrates and thus cleans the surface. The potential is lowered again, and the electrode surface is reduced back to gold, completing the pulse sequence, and the next measurement can be performed. The carbohydrate oxidation current in the first step is integrated over time after a certain delay to correct for a charging current due to the change of potential, yielding the detector response in coulomb (C). A review of HPAEC-PAD for carbohydrate analysis and recent developments can be found in Cataldi et al. (2000).

[Figure 18: plot of potential (V) against time (msec), showing the three steps E1/t1 (carbohydrate oxidation, with delay and integration periods), E2/t2 (gold oxidation) and E3/t3 (gold oxide reduction).]

Figure 18: Pulse sequence during pulsed amperometric detection (PAD). The three potentials E1 (oxidation of carbohydrates on gold surface), E2 (cleaning of gold surface by oxidation and desorption of carbohydrate oxidation products) and E3 (restoration of gold surface by reduction) are performed at fixed durations t1, t2 and t3. A full cycle typically lasts 500-1000 ms, depending on the settings for the different steps. The electric current resulting from the oxidation of carbohydrates on the gold surface is integrated over time to obtain the output expressed as electric charge.
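The conversion of the oxidation current into a detector response in coulombs can be sketched as a numerical integration over the part of step E1 that follows the delay. This is only an illustration using the trapezoidal rule on sampled data; the function name, the sampling scheme and the example numbers are assumptions and do not describe the instrument's internal processing.

```python
def pad_charge(times_s, currents_A, delay_s, t1_s):
    """Trapezoidal integral of the current (A) over the window delay_s..t1_s.

    times_s and currents_A are samples recorded during step E1; the result
    is the electric charge in coulombs.
    """
    window = [(t, i) for t, i in zip(times_s, currents_A)
              if delay_s <= t <= t1_s]
    if len(window) < 2:
        raise ValueError("need at least two samples inside the integration window")
    charge = 0.0
    for (t_a, i_a), (t_b, i_b) in zip(window, window[1:]):
        charge += 0.5 * (i_a + i_b) * (t_b - t_a)
    return charge


if __name__ == "__main__":
    # Hypothetical constant 2 nA oxidation current sampled every 100 ms.
    times = [0.0, 0.1, 0.2, 0.3, 0.4]
    currents = [2e-9] * 5
    print(pad_charge(times, currents, delay_s=0.2, t1_s=0.4))
```

Skipping the samples recorded before the delay excludes the charging current caused by the potential step, so that only the carbohydrate oxidation current contributes to the reported charge.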

3.6 Polysaccharide analysis by SEC-ELS

High-mass polysaccharides, such as the XXXG-based xyloglucan homopolymers, cannot be analysed by the Dionex system. For these experiments, size exclusion chromatography with evaporative light scattering detection (SEC-ELS) performed in organic solvent was the method of choice.

With size exclusion chromatography, sample components are separated according to size by physical means rather than adsorption to the stationary phase. The pores of the solid phase matrix permit smaller molecules to enter, while larger molecules can penetrate the pores only partially. This causes shorter retention times for larger molecules, relative to smaller ones that travel a longer path through the stationary phase. Size exclusion chromatography can be performed with organic solvent as the mobile phase, as in this work, where dimethyl sulfoxide was used. This is often referred to as gel permeation chromatography (GPC), in contrast to gel filtration chromatography (GFC) performed with an aqueous liquid phase.

The eluted samples were analysed by an evaporative light scattering detector (ELS). By this technique, the eluent is first nebulised to droplets, from which the solvent subsequently evaporates. A light source is applied to the remaining analyte, which will scatter the light beams to a different extent according to its mass.

4 AIM OF INVESTIGATION

The present investigation concerns the application of various protein engineering approaches on xyloglucan-active enzymes and non-catalytic proteins for the study of the plant cell wall and storage polysaccharide xyloglucan.

The specific goals were:

• The screening, engineering, expression and characterisation of enzymes with innate xyloglucanase activity as glycosynthases for the synthesis of homoxyloglucans with regular substitution patterns. These novel substrates not available in nature permit in vitro experiments regarding the role of particular structural elements for xyloglucan properties and its interaction with cellulose. Also, they are invaluable as ligands for structure-function analyses of xyloglucan-active enzymes.

• The study of engineered xyloglucan binding modules (XGBM) derived from the Rhodothermus marinus xylanase Xyn10A CBM4-2, including three-dimensional structure determination by X-ray crystallography and binding studies by isothermal titration calorimetry (ITC). The evolved XGBM can be used as molecular probes that bind specifically to xyloglucan, for example for the detection and localisation of xyloglucans in plant tissue sections.

• The design, cloning, expression and characterisation of engineered variants of PttXET16-34 and TmNXG1 for structure-function studies, including site-directed mutagenesis and structure-guided recombination of the parental genes. The obtained constructs are part of ongoing efforts to elucidate the determinants of transglycosylation versus hydrolysis among GH16 XETs and XEHs.

5 RESULTS AND DISCUSSION

5.1 Engineering of glycoside hydrolases into glycosynthases for the production of regularly substituted XGOs (publications I and II)

The Humicola insolens endo-β-glucanase glycosynthase HiCel7B E197A has been used extensively in the production of xyloglucan oligosaccharides based on XXXGαF donors (Fauré et al. 2006; Saura-Valls et al. 2006; Fauré et al. 2007). However, it is not able to polymerise XLLGαF donors at practically feasible rates. This limitation prompted the development of new xyloglucan-active glycosynthase variants.

The scope of the work presented in publication I was to investigate the potential of nucleophile mutants of a xyloglucan endo-transglycosylase from hybrid aspen (Populus tremula × tremuloides, PttXET16-34) as glycosynthases for the synthesis of xyloglucan oligosaccharides. The results were further compared to the HiCel7B E197A glycosynthase, serving as a benchmark for the synthesis of non-galactosylated xylogluco-oligosaccharides and analogues.

A growing scientific interest in semi-synthetic homoxyloglucans with controlled and well-defined decoration patterns encouraged our search for further glycosynthases based on enzymes with wild-type xyloglucanase activity as scaffolds, as presented in publication II. Xyloglucanase activity (EC 3.2.1.151) has been found amongst glucan hydrolases from six different GH families (Gilbert et al. 2008). Five of these families (GH 5, 7, 12, 16 and 44) employ the canonical double-displacement mechanism of retaining glycosyl transfer, which forms the basis for the classical glycosynthase reaction, see section 2.1.1 above. In the presented study, we screened potential glycosynthase scaffolds from three different glycoside hydrolase families (GH7, GH12, and GH16), based on demonstrable wild-type xyloglucanase activity (reviewed in Gilbert et al. 2008). The glycosynthases were characterised and compared, including those presented in publication I.

Xyloglucan synthases from PttXET16-34

The catalytic nucleophile of PttXET16-34, Glu85, was mutated into glycine, serine, or alanine, which generated glycosynthases with no detectable wild-type activity that were nevertheless capable of oligomerising both XXXGαF and XLLGαF donor substrates. The condensation of XXXGαF by PttXET16-34 E85A exhibited saturation kinetics with an apparent Km value comparable to other glycosynthases derived from endo-glycosidases, while the observed kcat value was relatively low. Compared to HiCel7B E197A, the kcat value was approximately five times lower, which was somewhat compensated by a lower Km value, see Table 1 and Piens et al. (2007).

Product analysis revealed the time-dependent formation of series of oligomers of the general structure (XXXG)nαF, along with products resulting from the spontaneous hydrolysis of the C1–F bond. The addition of an equimolar amount of XXXG as an alternate acceptor yielded nearly exclusively (XXXG)2-10 products in overnight reactions, biased toward lower Mw products compared with reactions with XXXGαF as both the donor and the exclusive acceptor substrate. The addition of alternate acceptors can therefore be used both to fine-tune the glycosynthase product distribution and to control the chemistry at the reducing end of the produced oligosaccharides.

No activity on XLLGαF donors was detected by fluoride ion release or product analysis for HiCel7B E197A. In contrast, PttXET16-34 E85A catalysed the condensation of XLLGαF donors with slightly increased kcat and Km values compared to those found for XXXGαF, see Table 1 and Piens et al. (2007).

Table 1: Apparent kinetic constants for HiCel7B E197A and PttXET16-34 E85A glycosynthases

                      XXXGαF                                  XLLGαF
Enzyme                kcat       Km         kcat/Km           kcat       Km         kcat/Km
                      (min⁻¹)    (mM)       (mM⁻¹ min⁻¹)      (min⁻¹)    (mM)       (mM⁻¹ min⁻¹)
HiCel7B E197A         5.2 ± 0.4  6.7 ± 1.1  0.77              -          -          -
PttXET16-34 E85A      1.1 ± 0.1  1.6 ± 0.4  0.66              1.4 ± 0.1  3.8 ± 0.2  0.38
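How a lower Km can partly compensate for a lower kcat is easiest to see by comparing turnover rates at different donor concentrations. The sketch below assumes that both glycosynthases follow the simple Michaelis-Menten rate law with the apparent constants for XXXGαF from Table 1; the function name and the chosen substrate concentrations are illustrative assumptions.

```python
def turnover_rate(kcat_per_min, km_mM, substrate_mM):
    """Michaelis-Menten turnover per enzyme molecule (min^-1)."""
    return kcat_per_min * substrate_mM / (km_mM + substrate_mM)


# Apparent (kcat in min^-1, Km in mM) for XXXGαF, taken from Table 1.
APPARENT = {
    "HiCel7B E197A": (5.2, 6.7),
    "PttXET16-34 E85A": (1.1, 1.6),
}

if __name__ == "__main__":
    for s_mM in (0.5, 1.0, 5.0, 50.0):  # arbitrary example concentrations
        rates = {name: turnover_rate(kcat, km, s_mM)
                 for name, (kcat, km) in APPARENT.items()}
        ratio = rates["HiCel7B E197A"] / rates["PttXET16-34 E85A"]
        print(s_mM, round(ratio, 2))
```

At donor concentrations well below Km the rate ratio approaches the ratio of the kcat/Km values, whereas at saturating concentrations it approaches the ratio of the kcat values, so the advantage of HiCel7B E197A is most pronounced at high substrate concentration.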

Similar to the results of product analysis observed for XXXGαF, PttXET16-34 E85A catalysed the homo-condensation of XLLGαF to produce a series of oligomers (XLLG)nαF (n = 2 - 6), which spontaneously hydrolysed to (XLLG)n. Incubation of XLLGαF with the alternate acceptor substrate XLLG led to near-exclusive production of (XLLG)n in overnight reactions, highly similar to the observations made in the case of XXXGαF with XXXG acceptors.

Interestingly, the degrees of polymerisation of products produced by condensation of XLLGαF were significantly lower than those from XXXGαF. It is likely that the PttXET16-34 E85A glycosynthase is inhibited by (XLLG)n products to a greater extent than by (XXXG)n products, possibly due to additional binding interactions to the pendant Gal residues. Nonetheless, (XLLG)n oligomers up to n = 6 were observed (M 8230), demonstrating that PttXET16-34 E85A is the first glycosynthase capable of producing homogeneously galactosylated xyloglucan fragments.
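The mass quoted for (XLLG)6 can be checked with a simple composition-based estimate. The sketch assumes the usual xyloglucan nomenclature (XXXG = 4 Glc + 3 Xyl; XLLG = 4 Glc + 3 Xyl + 2 Gal), average molar masses of the free monosaccharides, and loss of one water molecule per glycosidic bond; the function names are illustrative.

```python
WATER = 18.015     # g/mol
HEXOSE = 180.156   # glucose or galactose, C6H12O6
PENTOSE = 150.130  # xylose, C5H10O5


def oligosaccharide_mass(n_hexose, n_pentose):
    """Average molar mass of an oligosaccharide built from free sugars,
    losing one water per glycosidic bond."""
    n_units = n_hexose + n_pentose
    return n_hexose * HEXOSE + n_pentose * PENTOSE - (n_units - 1) * WATER


def oligomer_mass(repeat_mass, n):
    """Molar mass of n condensed repeats with a free reducing end."""
    return n * repeat_mass - (n - 1) * WATER


XXXG = oligosaccharide_mass(n_hexose=4, n_pentose=3)
XLLG = oligosaccharide_mass(n_hexose=6, n_pentose=3)

if __name__ == "__main__":
    print(round(XXXG, 1), round(XLLG, 1), round(oligomer_mass(XLLG, 6), 1))
```

The estimate for six XLLG repeats agrees with the reported value of about 8230 to within rounding.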

In summary, the study established that HiCel7B E197A, wild-type PttXET16-34, and PttXET16-34 E85A can all be employed for the synthesis of xyloglucans with regular backbone substitution using appropriate donor substrates. Whereas PttXET16-34 E85A is able to oligomerise both XXXGαF and XLLGαF, the HiCel7B E197A glycosynthase cannot be used for the synthesis of galactosylated xyloglucans. The unique ability of PttXET16-34 E85A to use both XXXGαF and XLLGαF as donor substrates unlocks the possibility to produce a larger variety of xyloglucan oligo- and polysaccharides of defined composition for research and practical applications.

Xyloglucan synthases from family GH7

Enzymatic polymerisation of oligosaccharides into products with high degree of polymerisation is largely dependent on the catalytic turnover (Faijes and Planas 2007). The Humicola insolens GH7 cellulase HiCel7B has served as a scaffold for several useful glycosynthases (Lin et al. 2004; Fauré et al. 2006; Saura-Valls et al. 2006; Blanchard et al. 2007; Fauré et al. 2007), and since the E197S variant displays a 40-fold higher kcat compared to HiCel7B E197A for the condensation of other substrates (Ducros et al. 2003), we were interested to see whether HiCel7B E197S could be used to synthesise xyloglucans with higher degree of polymerisation than those previously obtained with the E197A variant.

The E197S variant was tested on α-xyloglucosyl fluoride substrates under the same conditions as in our previous study (Piens et al. 2007). Using XXXGαF as both donor and acceptor, HiCel7B E197S exhibited remarkably increased reaction rates compared to its alanine counterpart, making it the fastest (XXXG)n-producing glycosynthase identified thus far. During the condensation of XXXGαF donors, a colourless precipitate of (XXXG)n products is formed, consistent with reported observations that partial enzymatic degalactosylation of native xyloglucan leads to chain aggregation and gelling (Shirakawa et al. 1998; Whitney et al. 2006). The reaction proceeds so quickly that oligosaccharides with low degree of polymerisation are barely discernible by HPAEC-PAD analysis of the products. The reaction is easily scaled up, and XXXG-based homoxyloglucan could be produced in 50 mg batches.

The partitioning of reaction products into water-soluble and insoluble fractions and subsequent analysis by size exclusion chromatography revealed the aqueous solubility limits of xyloglucan: (XXXG)n products up to M 14 000 mostly remain in the water-soluble fraction, while the insoluble fraction consists of products up to M 60 000. Most product is obtained in the range M 15 000 - 40 000, corresponding to n ≈ 14 - 36, among the largest products obtained with a glycosynthase so far.

Figure 19: SEC-ELS product analysis of XXXG-based homoxyloglucan obtained with HiCel7B E197S. The chromatogram shows the water soluble fraction (dotted line) and the insoluble fraction (solid line) (Gullfot et al. 2009a).

Despite the impressive performance on XXXGαF, the HiCel7B E197S glycosynthase showed no activity at all on XLLGαF as measured by fluoride ion release or during product profiling by HPAEC-PAD, reflecting the same observations as in our previous study on HiCel7B E197A. The GH7 HiCel7B constructs are therefore somewhat limited with regard to their substrate range, since they do not permit condensation of galactosylated α-xylogluco-oligosaccharyl fluoride donors. At the same time, they are superior to all xyloglucan glycosynthases produced thus far for synthetic purposes involving ungalactosylated building blocks. In particular, the HiCel7B E197S glycosynthase is an outstanding tool for the synthesis of pure, high molecular weight XXXG-based homoxyloglucan.

Xyloglucan synthases from family GH12

The recently characterised Bacillus licheniformis XG12 (Gloster et al. 2007) is the first reported instance of xyloglucanase activity within the GH12 family, and no glycosynthase derived from a family GH12 scaffold had been reported so far. Since wild-type BlXG12 hydrolytic activity on xyloglucan is significant (Gloster et al. 2007; Ibatullin et al. 2008), the BlXG12 E155A nucleophile mutant was an appealing candidate in our search for superior xyloglucan glycosynthases. However, only very low activity on xyloglucosyl fluorides was detected.

Some activity on XXXGαF was measured by fluoride ion release, and product analysis by HPAEC-PAD revealed that small amounts of the XXXGαF were consumed, resulting in traces of elongation products. In comparison, activity on XLLGαF was nearly undetectable, and product analysis did not reveal any elongation products or detectable consumption of XLLGαF substrate.

Interestingly, the results reflect the findings in the kinetic studies previously performed on wild-type BlXG12. Compared to the level of hydrolytic activity on tamarind xyloglucan, BlXG12 performed only poorly on branched aryl oligosaccharides, such as XXXG-β-2-chloro-4-nitrophenyl and XLLG-β-2-chloro-4-nitrophenyl. With such chloro-nitrophenyl (CNP) oligosaccharide donors, the rate of hydrolysis is inversely related to the degree of branching of the glucan backbone. Rates observed follow the order GGGG-β-CNP (fastest) > XXXG-β-CNP > XLLG-β-CNP (slowest) (Gloster et al. 2007; Ibatullin et al. 2008).

In conclusion, BlXG12 displays conceptual capability as a glycosynthase, but due to its poor performance in comparison to other available xyloglucan glycosynthases we do not regard BlXG12 E155A as an apt choice for the synthesis of xyloglucan-based polymers. However, it is possible that the BlXG12 E155A glycosynthase may find utility in the synthesis of un- or low-substituted glucans, which could be investigated in future studies.

Xyloglucan synthases from family GH16

Several GH16 enzymes have served as scaffolds for glycosynthases, including the PttXET16-34 nucleophile mutants reported in publication I. PttXET16-34 follows the retaining mechanism of glycosyl transfer, which is the basis for the glycosynthase reaction, but it is a strict transglycosylase (EC 2.4.1.207) while other glycosynthases are based on hydrolase scaffolds. For this reason, we were interested in deriving new xyloglucan synthases from a pronounced hydrolase within the GH16 family, such as the structurally closely related homolog xyloglucan endo-hydrolase from Tropaeolum majus (TmNXG1).

The apparent kinetic parameters for the hydrolysis of Glc8-based xyloglucan oligosaccharides by wild-type TmNXG1 indicated a potentially faster scaffold compared to PttXET16-34, if wild-type rates for hydrolysis were reflected in the glycosynthase reaction. Thus, the catalytic nucleophile Glu94 of TmNXG1 was replaced by alanine, glycine or serine, yielding hydrolytically inactive enzymes that all displayed activity on both XXXGαF and XLLGαF donors. Also included in this study was the alanine mutant E94A of the engineered XET activity hybrid TmNXG1-ΔYNIIG, which lacks the YNIIG loop adjacent to the acceptor binding site (Baumann et al. 2007).

The kinetic characterisation revealed saturation kinetics for the polymerisation of both XXXGαF and XLLGαF by TmNXG1 E94A, at faster rates than the previously reported PttXET16-34 E85A glycosynthase. Rates of condensation of the galactosylated XLLGαF donors compared well to HiCel7B E197A rates on XXXGαF, the benchmark glycosynthase used in the preparative synthesis of xyloglucan oligosaccharides (Fauré et al. 2006; Saura-Valls et al. 2006; Fauré et al. 2007); see Table 2 below. The TmNXG1 E94A glycosynthase is therefore well suited for the preparative synthesis of galactosylated xyloglucan oligosaccharides up to certain lengths.

Table 2: Apparent kinetic constants of investigated GH7 and GH16 glycosynthases

                    XXXGαF                                   XLLGαF
Enzyme              kcat        Km          kcat/Km          kcat        Km          kcat/Km
                    (min-1)     (mM)        (mM-1 min-1)     (min-1)     (mM)        (mM-1 min-1)
HiCel7B E197A       5.2 ± 0.4   6.7 ± 1.1   0.77             -           -           -
HiCel7B E197S       120 ± 10    8.2 ± 0.8   15               -           -           -
PttXET16-34 E85A    1.1 ± 0.1   1.6 ± 0.4   0.66             1.4 ± 0.1   3.8 ± 0.2   0.38
TmNXG1 E94A         2.6 ± 0.1   1.2 ± 0.2   2.2              1.9 ± 0.1   1.4 ± 0.2   1.4
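As a rough cross-check of Table 2, the specificity constant kcat/Km can be recomputed from the tabulated kcat and Km values. The following is a minimal sketch in Python; the dictionary layout and the function names are illustrative assumptions, and small deviations from the tabulated kcat/Km are to be expected because the table reports rounded values.

```python
# Mean values transcribed from Table 2 (errors omitted).
# Tuple layout: (kcat in min^-1, Km in mM, reported kcat/Km in mM^-1 min^-1).
# The data structure and all function names are illustrative only.
TABLE_2 = {
    ("HiCel7B E197A", "XXXGαF"): (5.2, 6.7, 0.77),
    ("HiCel7B E197S", "XXXGαF"): (120.0, 8.2, 15.0),
    ("PttXET16-34 E85A", "XXXGαF"): (1.1, 1.6, 0.66),
    ("PttXET16-34 E85A", "XLLGαF"): (1.4, 3.8, 0.38),
    ("TmNXG1 E94A", "XXXGαF"): (2.6, 1.2, 2.2),
    ("TmNXG1 E94A", "XLLGαF"): (1.9, 1.4, 1.4),
}


def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat/Km (mM^-1 min^-1) for kcat in min^-1 and Km in mM."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def relative_deviation(kcat: float, km: float, reported: float) -> float:
    """Relative difference between recomputed and reported kcat/Km."""
    return abs(catalytic_efficiency(kcat, km) - reported) / reported


def fold_improvement(table, better, reference) -> float:
    """Ratio of recomputed kcat/Km values for two table entries."""
    kcat_b, km_b, _ = table[better]
    kcat_r, km_r, _ = table[reference]
    return catalytic_efficiency(kcat_b, km_b) / catalytic_efficiency(kcat_r, km_r)


if __name__ == "__main__":
    for (enzyme, donor), (kcat, km, reported) in TABLE_2.items():
        value = catalytic_efficiency(kcat, km)
        print(f"{enzyme:18s} {donor}: {value:6.2f} (reported {reported})")
    fold = fold_improvement(
        TABLE_2, ("HiCel7B E197S", "XXXGαF"), ("HiCel7B E197A", "XXXGαF")
    )
    print(f"E197S vs E197A on XXXGαF: about {fold:.0f}-fold")
```

With these inputs the serine mutant comes out close to 19-fold more efficient than the alanine mutant on XXXGαF, which illustrates why HiCel7B E197S is described as the superior catalyst for ungalactosylated donors.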

Product analysis revealed time-dependent formation of a series of elongation products with the general structure (XXXG)n (n ≤ …) or (XLLG)n (n ≤ 8), respectively. These products are formed by successive condensations catalysed by TmNXG1 E94A and spontaneous hydrolysis of the C-F bond. Along with these products, transient peaks corresponding to (XXXG)nαF and (XLLG)nαF products are observed.

An interesting feature is that the product distribution can be controlled by the reaction conditions, to favour the formation of certain products. By including a free xyloglucosyl acceptor, i.e. XLLG, short-length condensation products such as XLLGXLLG accumulate, since reaction products from the condensation of XLLGαF with a fluoride-free acceptor can no longer act as a donor in subsequent reactions. The yield of XLLGXLLG can be enhanced substantially compared to other products in this way, by providing XLLGαF donors and XLLG acceptors at a 1:1 molar ratio. This is useful in the synthesis of specific shorter oligosaccharides, which could serve as ligands for structure-function analyses of xyloglucan-active enzymes, such as the octadecasaccharide XLLGXLLG. The enrichment of this particular product could probably be further increased by applying a donor-blocking strategy. Piens et al. (2008) used GalGXXXG and GalGXXXGXXXG oligosaccharides in a study to investigate the glycosyl-enzyme intermediate in PttXET16-34; similarly, Gal-blocked GalGXLLGαF donors that prevent elongation on the non-reducing end could perhaps be used with XLLG acceptors in glycosynthase reactions to yield XLLGXLLG only.

Figure 20: Product analysis with HPAEC-PAD. Peaks correspond to synthesised xyloglucans of different length, up to n = 8. The left panel shows reaction products of TmNXG1 E94A with XXXGαF, and the right panel with XLLGαF (Gullfot et al. 2009a).
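The oligosaccharide names used above follow the single-letter xyloglucan shorthand, in which each letter stands for one backbone glucose together with its side chain. The sketch below counts monosaccharide units from such a name. The letter-to-composition mapping reflects the commonly used nomenclature (G: unsubstituted Glc; X: Glc carrying Xyl; L: Glc carrying Xyl-Gal; F: Glc carrying Xyl-Gal-Fuc); the function names are illustrative assumptions.

```python
from collections import Counter

# Side-chain sugars attached to the backbone glucose for each letter
# of the xyloglucan shorthand (commonly used nomenclature).
SIDE_CHAINS = {
    "G": (),
    "X": ("Xyl",),
    "L": ("Xyl", "Gal"),
    "F": ("Xyl", "Gal", "Fuc"),
}


def composition(name: str) -> Counter:
    """Return the monosaccharide composition of e.g. 'XLLG'."""
    counts = Counter()
    for letter in name:
        if letter not in SIDE_CHAINS:
            raise ValueError(f"unknown xyloglucan unit: {letter!r}")
        counts["Glc"] += 1                  # one backbone glucose per letter
        counts.update(SIDE_CHAINS[letter])  # plus its side-chain sugars
    return counts


def n_sugars(name: str) -> int:
    """Total number of monosaccharide units in the oligosaccharide."""
    return sum(composition(name).values())


if __name__ == "__main__":
    for name in ("GGGG", "XXXG", "XLLG", "XLLGXLLG"):
        print(name, n_sugars(name), dict(composition(name)))
```

Under this mapping XLLG is a nonasaccharide and the condensation product XLLGXLLG an octadecasaccharide, in line with the terms used in the text.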

5.2 Structure-function studies of an engineered xyloglucan hydrolase by crystallography and molecular dynamics simulations (publication III)

The three-dimensional protein structures of PttXET16-34 (PDB ID 1UMZ), TmNXG1 (PDB ID 2UWA) and the engineered TmNXG1-ΔYNIIG hybrid (PDB ID 2UWB) had been solved in previous work (Johansson et al. 2004; Baumann et al. 2007). The structure of PttXET16-34 has a partial xyloglucan substrate bound in the positive subsites, revealing key features of protein-ligand interaction. For proper comparative analysis it is desirable to obtain similar structures with bound ligand for all three proteins. Ideally, these complex structures would display an intact xyloglucan chain spanning the entire binding cleft, in order to shed further light on the key interactions of the protein with the substrate, and the features that determine hydrolysis versus transglycosylation.

In publication III (Mark et al. 2009), we present the three-dimensional structure of the engineered TmNXG1-ΔYNIIG in complex with the xylogluco-nonasaccharide XLLG bound in the negative subsites (PDB ID 2VH9). Mirroring the experiences with PttXET16-34, it was not possible to obtain a structure with an intact ligand spanning the entire binding cleft, i.e. both positive and negative subsites. Instead, we turned to molecular dynamics (MD) simulations, and used the structural information on the ligand bound in the positive subsites of PttXET16-34 (Johansson et al. 2004) together with the structure of XLLG bound in the negative subsites of TmNXG1-ΔYNIIG from our data, to simulate a combined XLLGXLLG molecule that was modelled into the binding cleft of all three enzymes.

The crystal structure of TmNXG1-ΔYNIIG in complex with XLLG revealed that the glucose backbone sugars were more tightly bound than the xylose and galactose substituents, as indicated by the different B-factor values. The undecorated Glc(-1) unit was stacked against Tyr81, with tight hydrogen bonds between the C6 hydroxyl group and the catalytic acid/base Glu98-OH2. No electron density was observed for the anomeric C1 hydroxyl group, indicating a disordered configuration due to mutarotation. Glc(-2) was stacked against Trp180, while Trp27 formed a stacking platform for Glc(-3). These three hydrophobic platforms are strictly conserved in PttXET16-34 and TmNXG1 as well. In addition to the hydrophobic stacking, these platforms are arranged such that they also form hydrogen bonds to the neighbouring glucose units, as revealed by the TmNXG1-ΔYNIIG crystal structure. The Glc(-4) unit was loosely bound, forming only indirect hydrogen bonds mediated by water. However, the xylose attached to Glc(-4) was accommodated by a depression on the protein surface, and formed indirect hydrogen bonds via water molecules to protein residues. Direct hydrogen bonds between sugar hydroxyl groups and protein residues were observed for both Xyl and Gal attached to the Glc(-3) unit.

The MD simulations revealed a notable conformational change of the substrate upon binding. In all three cases, Glc(-1) was observed to flip from the initial ⁴C₁ chair conformation into a ¹S₃ skew boat. This is in accordance with previous studies performed on other endo-β(1-4)glucanases, which have shown that the Glc(-1) unit is forced by the active-site groove into a distorted form necessary for the pre-transition state (Sulzenbacher et al. 1996; Davies et al. 1998; Davies et al. 2003). In addition, the three catalytic residues Glu89, Asp87 and Glu85 (numbering according to PttXET16-34) stabilised the complex by forming hydrogen bonds with Glc(-1) in the skew boat conformation in all three enzymes.

Figure 21: Protein-ligand interactions as revealed by MD simulations. Sugars involved in hydrophobic interactions and/or hydrogen bonds (dashed lines) are shown in black. Grey sugars do not participate in any binding interactions (Mark et al. 2009).

The stacking interactions between the aforementioned three conserved hydrophobic residues (Trp27, Tyr81 and Trp180 in TmNXG1-ΔYNIIG) and the backbone glucose units in the -1 to -3 subsites were confirmed by the MD trajectories for PttXET16-34, but also for TmNXG1-ΔYNIIG and TmNXG1, thus further validating the correlation between the models and the actually observed crystal complex. Several conserved hydrogen bonds form a common xyloglucan recognition pattern shared by all three proteins. Notably, as in the crystal structure, the tight hydrogen bond between the catalytic Glu98/89 and the C6 hydroxyl group on Glc(-1) is retained in all three cases, confirming the prerequisite of an undecorated Glc in this position.

An interesting observation was made regarding Ser79, a residue that in PttXET16-34 and other XETs is replaced by a strictly conserved alanine. In TmNXG1, Ser79 forms a hydrogen bond with Glc(-1), while in TmNXG1-ΔYNIIG this hydrogen bond is formed by Trp180, the same residue that forms the stacking platform for Glc(-2). The potential role of Ser79 in the hydrolytic XEHs versus the transglycosylating XETs that lack this hydrophilic residue was deduced from previous structural and phylogenetic analyses, and the MD findings spurred us to produce the Ser79Ala mutant of both TmNXG1 and TmNXG1-ΔYNIIG to further investigate its role. The analysis of these new constructs is pending (Gullfot, Eklöf and Brumer, unpublished work). Another difference to PttXET16-34 concerns Asn85 of loop 1 in TmNXG1 and TmNXG1-ΔYNIIG, which forms a hydrogen bond to the xylosyl branch attached to Glc(-3). A corresponding residue is lacking in PttXET16-34 due to the absence of this loop, and we have thus produced the Asn85Ala mutants of both TmNXG1 and TmNXG1-ΔYNIIG for future investigations of its potential role.

With regard to the positive subsites, the conserved Trp179 (PttXET16-34 numbering) stacks against Glc(+1) and Glc(+2) in all three proteins. A major cleft that accommodates the xylosyl branch of Glc(+2) in all variants is formed by the conserved Arg116, Tyr250 and Arg258. Indeed, subsite mapping studies have shown that Xyl at Glc(+2) in PttXET16-34 has the largest contribution to transition state stabilisation, together with Glc(-2) and Glc(-3) (Saura-Valls et al. 2008). The differences in the binding patterns of the three enzymes were found in the hydrogen bonding pattern established with the ligand, full details of which can be found in Mark et al. (2009). Generally, in contrast to TmNXG1 and TmNXG1-ΔYNIIG, PttXET16-34 displayed more substrate interactions in the positive subsites than in the negative subsites.

In conclusion, the findings in the presented work support an active site cleft comprising 4 negative and 3-4 positive subsites in all three enzymes, which binds the glucan backbone of the xyloglucan oligo- and polysaccharide substrates (in accordance with Saura-Valls et al. 2008). The whole XLLGXLLG molecule is forced by these interactions into a slightly kinked shape, forcing Glc(-1) into the skew boat conformation while undergoing catalysis. It is also evident from both the crystallographic and MD data that the Glc unit in position -1 must be undecorated, i.e. unmodified at C6. Moreover, the analysis suggests that XEHs and XETs in GH16 display subtly divergent substrate binding affinities in the positive and negative subsites. The hydrolytic XEHs form a greater number of hydrogen bonds with the substrate in the negative subsites, while XETs display more interactions in the positive subsites, i.e. the part of the active-site cleft that receives the incoming glycosyl acceptor. This is in accordance with subsite mapping studies which showed that donor binding in PttXET16-34 is dominated by higher affinity to XXXG-based substrates on the positive subsites, while negative subsites are clearly less specific (Saura-Valls et al. 2008). Perhaps it is the greater plasticity permitted by the less rigid negative subsites in XET that prevents the XET glycosyl intermediate from adopting the conformation necessary for nucleophilic attack by a water molecule, as in the case of hydrolysis. In XEHs, the positioning of the intermediate, the water molecule, and the catalytic base seems more restricted, possibly in the exact obligate conformation for subsequent hydrolysis.

5.3 Structure-guided recombination of GH16 XET/XEH with SCHEMA (paper IV)

The unpublished manuscript included in this thesis presents findings from an exploratory pilot study aimed at structure-guided recombination of PttXET16-34 and TmNXG1 genes with SCHEMA. SCHEMA is a computational tool for protein engineering by recombination that uses structural information to predict which fragments, or “schemas”, of homologous proteins can be swapped without disrupting the three-dimensional fold of resulting chimeras (Voigt et al. 2002; Endelman et al. 2004; Silberg et al. 2004). Our rationale for pursuing this work was the success with the TmNXG1-ΔYNIIG loop mutant (Baumann et al. 2007); the aim was to identify further structural motifs responsible for the catalytic differences by swapping larger gene regions between PttXET16-34 and TmNXG1.

A set of 16 crossover locations distributed over the full-length sequence was obtained with the RASPP algorithm (Endelman et al. 2004). Conserved secondary structural motifs included single β-strands, β-turn-β motifs including up to three β-strands, combinations of β-strands and loops or α-helices, and a few disordered elements. In no cases were pronounced secondary structures such as β-strands or α-helices broken by the generated crossover points.

SCHEMA disruptions were calculated for the combinatorial library generated from the fragments defined by the crossover points, and a restricted library of chimeras with low disruption (E) and high diversity (m) values was chosen. This library contained 100 constructs and revealed interesting information about the propensity for recombination of different segments. The upper and lower β-sheet structures comprising β1-β6 and β13-β14, which form the half of the sandwich to which the donor substrate binds, displayed a strong bias towards an intact sequence from either parent. Instead, recombination was preferentially introduced at the N-terminus, the β7-β12 region that forms most of the acceptor half of the protein, and the loop following β14 including most of the C-terminal extension characteristic of GH16 XET/XEH enzymes. These findings were in accordance with our empirical experience from previously performed loop engineering work (Baumann et al. 2007), where the TmNXG1 loop insert recalcitrant to truncation is located in the preferentially conserved region, while the successfully truncated loop is located in an area favourable for recombination (Figure 22).
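The two library metrics mentioned above can be illustrated with a short sketch. It assumes the definitions commonly given for SCHEMA (Voigt et al. 2002; Endelman et al. 2004): a residue-residue contact counts as disrupted when the amino acid pair found in the chimera does not occur at those two positions in any parent, and m is the number of substitutions relative to the closest parent. The parent sequences, contact list and function names are invented toy values for illustration and are unrelated to PttXET16-34 or TmNXG1.

```python
from typing import List, Sequence, Tuple

# Toy data (hypothetical): two aligned parents and a list of residue
# pairs assumed to be in contact in the parental structure.
PARENTS = ["ACDEFG", "AKDQFW"]
CONTACTS = [(0, 3), (1, 3), (1, 5), (2, 4), (3, 5)]


def build_chimera(parents: Sequence[str], crossovers: Sequence[int],
                  choice: Sequence[int]) -> str:
    """Assemble a chimera from aligned parents.

    crossovers: sorted positions at which a new block starts.
    choice: index of the parent that donates each block.
    """
    bounds = [0, *crossovers, len(parents[0])]
    if len(choice) != len(bounds) - 1:
        raise ValueError("need exactly one parent choice per block")
    return "".join(parents[p][start:end]
                   for p, start, end in zip(choice, bounds, bounds[1:]))


def schema_disruption(chimera: str, parents: Sequence[str],
                      contacts: List[Tuple[int, int]]) -> int:
    """E: number of contacts whose residue pair occurs in no parent."""
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if not any((p[i], p[j]) == pair for p in parents):
            broken += 1
    return broken


def mutation_level(chimera: str, parents: Sequence[str]) -> int:
    """m: number of substitutions relative to the closest parent."""
    return min(sum(a != b for a, b in zip(chimera, p)) for p in parents)


if __name__ == "__main__":
    # One crossover after position 3: block 1 from parent 0, block 2 from parent 1.
    chimera = build_chimera(PARENTS, crossovers=[3], choice=(0, 1))
    print(chimera,
          "E =", schema_disruption(chimera, PARENTS, CONTACTS),
          "m =", mutation_level(chimera, PARENTS))
```

In this toy case the chimera ACDQFW breaks two of the five contacts (E = 2) and differs from its closest parent at one position (m = 1). Selecting a restricted library then amounts to keeping chimeras with low E and high m.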

Finally, in order to assess the predictive power of SCHEMA empirically, six constructs were chosen with low E values for subsequent cloning and expression in Pichia pastoris. All constructs lack the PttXET16-34 N-linked glycosylation site to facilitate expression, but display an interesting variety in recombined fragments, including donor and acceptor sides from opposite parents, see figures in the included manuscript (paper IV). The expression and characterisation of these constructs currently remain under investigation.

In conclusion, this work provided inspirational insights into the potential sensitivity of structural features in TmNXG1 and PttXET16-34 towards mutational disruption. However, further empirical studies would be required to test the robustness of these predictions. Since SCHEMA is designed for the generation of libraries, albeit significantly enriched in functional proteins, a more conventional application would involve structure-guided recombination of TmNXG1 and PttXET16-34 on a larger scale, perhaps including other suitable parents from the GH16 family, in order to fully explore the functional space inherent in this interesting group of enzymes.

Figure 22: Topology diagram of XET/XEH with preferred combinatorial segments. Dashed lines indicate crossover locations, i.e. points where recombination of PttXET16-34 and TmNXG1 preferentially occurs. Strand β with the active site is marked with an asterisk. The two loop inserts in TmNXG1 are indicated with arches.

5.4 Engineered carbohydrate binding modules as molecular probes for xyloglucan (publications V and VI)

The role of xyloglucan in plant cell development and microstructure is an area of great scientific interest. However, molecular probes for xyloglucan that can be used for the detection and visualisation of its distribution in plant tissues, for example at different growth stages, have been scarce. At the initiation of this study, only one monoclonal antibody was available for this purpose, and it was specific to fucosylated xyloglucan only (CCRC-M1, Puhlmann et al. 1994). It was therefore desirable to extend the range of xyloglucan-specific probes. While recent developments include a mAb for non-galactosylated xyloglucan (LM 15, Marcus et al. 2008) and a new set of plant polysaccharide mAbs including those targeted at different xyloglucans (Pattathil et al. 2010), non-animal alternatives to mAbs such as CBM-based affinity probes are still valuable to obtain.

The thermostable CBM4-2 from the Rhodothermus marinus xylanase Xyn10A shows strong affinity towards different xylans but also towards barley β-glucan and, to a lesser extent, laminarin and lichenan (Abou Hachem et al. 2000). By several rounds of mutations of twelve key amino acids in the binding cleft coupled with phage display selection (Gunnarsson et al. 2004; Gunnarsson et al. 2006), the XG-34 variant was obtained that bound well to non-fucosylated xyloglucan, but poorly or not at all to xylan, cellulose (Avicel), arabinoxylan, barley β-glucan or fucosylated xyloglucan, thus proving its specificity towards non-fucosylated xyloglucan.

5.4.1 The three-dimensional structure of XG-34

In paper VI of this thesis (Gullfot et al. 2009b), we present the three-dimensional structure of the XG-34 variant solved by X-ray diffraction at 1.60 Å resolution. The structure of the wild-type CBM4-2 from R. marinus Xyn10A had previously been solved by NMR (Simpson et al. 2002), and our comparison revealed key differences in the binding site topology that are thought to be responsible for the dramatically altered substrate specificity of the engineered XG-34 variant.

XG-34 displays the classical flattened jelly-roll β-sandwich fold belonging to CBM family 4 according to the CAZy classification, with the binding cleft formed on the twisted, inner concave β-sheet (Boraston et al. 2002; Simpson et al. 2002). The largest backbone and side-chain displacements compared to the wild-type structure are found in two loops that participate in forming the binding cleft in both CBM4-2 and XG-34 (Simpson et al. 2002). Five of the seven amino acid replacements that resulted in the highly xyloglucan-specific engineered XG-34 variant are found in this region.

The two aromatic residues Tyr69 and His110 in XG-34, which correspond to Trp69 and Phe110 (thought to be the main ligand-binding residues in the wild type), are situated much closer together and form a distinctly different and more restricted binding cleft, with a narrow waist in its middle. Molecular modelling shows that a xyloglucan oligosaccharide can be fitted into this cleft, with a backbone glucopyranosyl residue located in the waist, and branching decorations accommodated by the more spacious regions on either side.

Figure 23: Surface representations of XG-34 (a) and CBM4-2 (b). Mutated positions in XG-34 are highlighted in magenta, corresponding residues in CBM4-2 are highlighted in blue. (c) Cartoon representation of XG-34 with an XXXG oligosaccharide modelled into the binding cleft. A backbone glucose clearly fits into the narrow “waist” between Tyr69 and His110, while decorations are accommodated by the more spacious surrounding regions. Figure taken from Gullfot et al. (2009b).

While most of the altered residues in XG-34 are located around the cleft, the role of particular side-chains for the altered specificity cannot be deduced with certainty on the basis of the apo structure alone. For this, we would need to know the position of the bound xyloglucan substrate, to deduce the interactions between sugar groups and protein residues. Unfortunately, despite extensive efforts we were not able to obtain protein crystals with bound ligands, either by co-crystallisation or by soaking. This could partly be explained by the crystal packing, where tight interactions between molecules mediated by His110 and Asp29 preclude any carbohydrate ligands from access to the binding cleft. As was revealed by our subsequent work, the affinity towards xyloglucan oligosaccharides was perhaps also too weak for successful co-crystallisation.

5.4.2 Improved xyloglucan binding modules (XGBM)

The scope of the work presented in paper V was to further increase the binding affinity of XG-34 towards xyloglucan, by affinity maturation (von Schantz et al. 2009). Here, XG-34 was used as the scaffold for the creation of two libraries: the first by performing epPCR-based random mutagenesis on the XG-34 gene as the template, while a second round of mutations was performed to obtain the second library. Selection was performed by phage display in three rounds, to identify clones binding tightly and exclusively to xyloglucan, and discard clones with retained wild-type affinity towards xylan.

Two new variants obtained by this work, XG-34/1-X and XG-34/2-VI, showed strong and specific affinity towards xyloglucan, as revealed by affinity electrophoresis (AE), while a third clone, XG-34/2-I, showed binding characteristics similar to the parent XG-34. Sequencing revealed that the new variants had 2-6 amino acid substitutions, with an Asp112Glu mutation as a common denominator. Interestingly, this substitution reverses a mutation in XG-34 into the original glutamate found in wild-type CBM4-2. The significance of this residue was investigated by producing the Glu112Asp mutant of XG-34/2-VI and the Asp112Glu mutant of XG-34. AE revealed that Glu112 variants have high affinity towards xyloglucan, while Asp112 variants bind with only low affinity, indicating that the glutamate in position 112 is responsible for the increased affinity of XG-34/2-VI towards xyloglucan compared to the parent, and most likely for the other evolved variants as well.

Competition ELISA revealed that both XG-34/1-X and XG-34/2-VI had retained the ability displayed by XG-34 to discriminate between fucosylated and non-fucosylated xyloglucan, with exclusive affinity towards the latter.

The respective affinities for different xyloglucans were furthermore quantified by isothermal titration calorimetry (ITC), revealing that XG-34/1-X and XG-34/2-VI bound strongly to galactosylated xyloglucan oligosaccharide (XLLG) in an enthalpy-driven interaction, with affinities in the KA = 10⁵ M⁻¹ range. Binding to non-galactosylated XXXG was detected, however at too low levels for precise quantification by ITC. Also, the affinity of the parental scaffold XG-34 to both XLLG and XXXG could be measured, but again at too low levels for proper quantification, indicating significantly improved affinity for the evolved XG-34/1-X and XG-34/2-VI successors compared to the parent. The strong binding to XLLG but not to XXXG by XG-34/1-X and XG-34/2-VI suggests that the galactose decorations play an important role in the protein-ligand interaction. Further ITC experiments with galactose, resulting in no detectable affinity, suggest that a specific, ordered pattern of galactose decorations is necessary for proper recognition and binding.
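To put the association constant into more familiar terms, it can be converted into a dissociation constant (KD = 1/KA) and a standard binding free energy (ΔG° = -RT ln KA, with a 1 M standard state). The sketch below uses KA = 10⁵ M⁻¹ as an order-of-magnitude input and assumes a temperature of 25 °C (298.15 K); the temperature is an assumption for illustration and is not stated above, and the function names are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J mol^-1 K^-1


def dissociation_constant(ka: float) -> float:
    """K_D in M from an association constant K_A in M^-1."""
    if ka <= 0:
        raise ValueError("K_A must be positive")
    return 1.0 / ka


def binding_free_energy(ka: float, temperature: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol (1 M standard state).

    temperature defaults to 298.15 K, an assumed value.
    """
    if ka <= 0:
        raise ValueError("K_A must be positive")
    return -R * temperature * math.log(ka) / 1000.0


if __name__ == "__main__":
    ka = 1e5  # M^-1, order of magnitude quoted for XLLG binding
    print(f"K_D = {dissociation_constant(ka) * 1e6:.0f} µM")
    print(f"dG  = {binding_free_energy(ka):.1f} kJ/mol")
```

Under these assumptions, KA = 10⁵ M⁻¹ corresponds to a dissociation constant of about 10 µM and a binding free energy of roughly -28.5 kJ/mol.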

Figure 24: ITC thermograms from experiments with XLLG. A) XG-34/1-X, B) XG-34/2-VI (von Schantz et al. 2009)

Finally, experiments were performed to test the capability of the new tight binders as molecular probes for xyloglucan in plant tissues. Microtome sections of tamarind seeds, known to be rich in xyloglucan, were incubated with FITC-conjugated binding modules and analysed by fluorescence microscopy. This revealed rapid and specific binding to non-fucosylated xyloglucan in the endosperm, but not in the covering integument cells that are rich in fucosylated xyloglucan, while controls with the monoclonal antibody CCRC-M1 and CBM FXG-14b (both specific for fucosylated xyloglucan) bound to the integument cells only. This clearly demonstrates the capacity of these novel xyloglucan-binding modules as probes for the in vivo detection and localisation of xyloglucan in plant tissues.

6 CONCLUDING REMARKS AND OUTLOOK

This work has shown the power of different protein engineering approaches for a variety of applications in the field of carbohydrate-active enzymes, whether aimed at elucidating fundamental mechanisms, as in the case of GH16 XET/XEH hydrolysis versus transglycosylation, or the creation of novel materials and innovative molecular tools, such as the xyloglucan homopolymers and the evolved xyloglucan binding modules.

Clearly, the ability to alter protein function, from the mutation of individual residues to the creation of novel chimeras by the shuffling of homologous genes, is commonplace in modern biomolecular sciences, and as such is performed on a routine basis today. However, the full potential of protein engineering is still far from being exploited.

The function of a protein is in essence determined by its structure, which ultimately governs the nano-mechanical and molecular conditions of operation. Novel methods such as SCHEMA attempt to bridge the gap between features obvious to the eye and motifs which are too complex and obscured to be discovered by brainpower alone.

State-of-the-art approaches for the development of new protein engineering and design methods will perhaps be borrowed from fields other than the classical biosciences. This is the progress that can be seen for example in the emerging field of synthetic biology, where standard procedures from computer science and software development are applied to the design of new biochemical pathways and even whole organisms. Interdisciplinary cross-fertilisation can greatly accelerate the development of new methods, and empower the protein researcher with tools on a par with a contemporary engineering science.

7 ACKNOWLEDGEMENTS

I want to thank Harry Brumer, my supervisor and carbo-star, for taking me on the team, your unswerving belief in my capacities, and so much more along the way. While I might leave the nest now for other pursuits, I do it proudly for what we achieved, and promise to make my very best out of the excellent scientific training I received under your wings.

Tuula Teeri, my first co-supervisor, and the first to make me understand that there were actually some really cool things going on in this department with the obscure name Wood Biotechnology. Vincent Bulone, my second co-supervisor, for taking on from there and leading us into the sweet era of Glycoscience. Keep on listening, pushing, and inspiring us with your visions!

I also want to thank all the collaborators I had the pleasure of working with during these years, in particular enzymology guru Kathleen Piens, Harry The Great Gilbert, ITC-wizard James Flint, Laura von Schantz (my fellow protein slave, I almost feel I know you ;)), Mats Ohlin – keep them CBMs coming! Crystal queen Christina Divne, fellow geek TC Tan, and Pekka Mark, how much fun can you possibly pack into a few femtoseconds.

Dear colleagues in the lab, thank you, and that goes for everyone at floor 2. Oliver Spadiut for much appreciated help with cloning, I believe you’re the kind of guy that will run his own lab before 35. Jens Eklöf in particular for assays but so much more. If you ever want to move down south… Nomchit Kaewthai and all your great protocols. May life offer you abundance in all good things from now and forever. Farid Ibatullin for sugars and Martin Baumann for first steps, you know I miss you both. Gustav Sundqvist for all your MS work, Qi Zhou for help with the SEC-ELS. Ela Nilsson, Lotta Rosenfeldt and Marlene Johnsson for technical and administrative help, where would we be without you. Student Linnéa Granlund for your assistance with the XET/XEH chimera cloning. You will become a fantastic veterinarian. Or scientist. Or both.

Last but not least: I had the great fortune of performing my Ph.D. project within the interdisciplinary context of BIOMIME, the Swedish Centre for Biomimetic Fibre Engineering. I particularly want to thank Prof. Lars Berglund, Mats Brodén, Ulf Carlson, Niklas Nordgren and Susanne Rosén for interesting and inspiring discussions, putting our work into a broader context. Let’s continue pushing boundaries.

8 REFERENCES

Abou Hachem, M., Karlsson, E.N., Bartonek-Roxa, E., Raghothama, S., Simpson, P.J., Gilbert, H.J., Williamson, M.P. and Holst, O. (2000). Carbohydrate-binding modules from a thermostable Rhodothermus marinus xylanase: cloning, expression and binding studies. Biochem. J. 345: 53-60.

Aftonbladet (2010). "Forskare har skapat liv", 21 May 2010.

Ahrenstedt, L., Olksanen, A., Salminen, K. and Brumer, H. (2008). Paper dry strength improvement by xyloglucan addition: Wet-end application, spray coating and synergism with borate. Holzforschung 62(1): 8-14.

Allouch, J., Jam, M., Helbert, W., Barbeyron, T., Kloareg, B., Henrissat, B. and Czjzek, M. (2003). The three-dimensional structures of two beta-agarases. J. Biol. Chem. 278(47): 47171-47180.

Arnold, F.H. and Georgiou, G., Eds. (2003). Directed Evolution Library Creation: Methods and Protocols. Totowa, Humana Press Inc.

Baumann, M.J., Eklöf, J.M., Michel, G., Kallas, Å.M., Teeri, T.T., Czjzek, M. and Brumer, H., 3rd (2007). Structural evidence for the evolution of xyloglucanase activity from xyloglucan endo-transglycosylases: biological implications for cell wall metabolism. Plant Cell 19(6): 1947-63.

Becnel, J., Natarajan, M., Kipp, A. and Braam, J. (2006). Developmental expression patterns of Arabidopsis XTH genes reported by transgenes and Genevestigator. Plant Mol. Biol. 61(3): 451-467.

Benvenuti, M. and Mangani, S. (2007). Crystallization of soluble proteins in vapor diffusion for x-ray crystallography. Nat. Protoc. 2(7): 1633-1651.

Blanchard, S., Armand, S., Couthino, P., Patkar, S., Vind, J., Samain, E., Driguez, H. and Cottaz, S. (2007). Unexpected regioselectivity of Humicola insolens Cel7B glycosynthase mutants. Carbohydr. Res. 342(5): 710-716.

Bloom, J.D., Silberg, J.J., Wilke, C.O., Drummond, D.A., Adami, C. and Arnold, F.H. (2005). Thermodynamic prediction of protein neutrality. Proc. Natl. Acad. Sci. U. S. A. 102(3): 606-611.

Bodin, A., Ahrenstedt, L., Fink, H., Brumer, H., Risberg, B. and Gatenholm, P. (2007). Modification of nanocellulose with a xyloglucan-RGD conjugate enhances adhesion and proliferation of endothelial cells: Implications for tissue engineering. Biomacromolecules 8(12): 3697-3704.

Bolam, D.N., Ciruela, A., McQueen-Mason, S., Simpson, P., Williamson, M.P., Rixon, J.E., Boraston, A., Hazlewood, G.P. and Gilbert, H.J. (1998). Pseudomonas cellulose-binding domains mediate their effects by increasing enzyme substrate proximity. Biochem. J. 331: 775-781.

Boraston, A.B., Bolam, D.N., Gilbert, H.J. and Davies, G.J. (2004). Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem. J. 382: 769-781.

Boraston, A.B., Creagh, A.L., Alam, M.M., Kormos, J.M., Tomme, P., Haynes, C.A., Warren, R.A.J. and Kilburn, D.G. (2001a). Binding specificity and thermodynamics of a family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A. Biochemistry 40(21): 6240- 6247. Boraston, A.B., McLean, B.W., Guarna, M.M., Amandaron-Akow, E. and Kilburn, D.G. (2001b). A family 2a carbohydrate-binding module suitable as an affinity tag for proteins produced in Pichia pastoris. Protein Expr. Purif. 21(3): 417-423. Boraston, A.B., Nurizzo, D., Notenboom, V., Ducros, V., Rose, D.R., Kilburn, D.G. and Davies, G.J. (2002). Differential oligosaccharide recognition by evolutionarily-related beta-1,4 and beta-1,3 glucan-binding modules. J. Mol. Biol. 319(5): 1143-1156. Böttcher, D. and Bornscheuer, U.T. (2010). Protein engineering of microbial enzymes. Curr. Opin. Microbiol. 13(3): 274-282. Brumer, H., Zhou, Q., Baumann, M.J., Carlsson, K. and Teeri, T.T. (2004). Activation of crystalline cellulose surfaces through the chemoenzymatic modification of xyloglucan. J. Am. Chem. Soc. 126(18): 5715-5721. Brummell, D.A. (2006). Cell wall disassembly in ripening fruit. Funct. Plant Biol. 33(2): 103-119. Brummell, D.A. and Harpster, M.H. (2001). Cell wall metabolism in fruit softening and quality and its manipulation in transgenic plants. Plant. Mol. Biol. 47((1-2)): 311-40. Cabib, E., Farkas, V., Kosik, O., Blanco, N., Arroyo, J. and McPhie, P. (2008). Assembly of the Yeast Cell Wall Crh1p and Crh2p act as transglycosylases in vivo and in vitro. J. Biol. Chem. 283(44): 29859-29872. Campbell, J.A., Davies, G.J., Bulone, V. and Henrissat, B. (1997). A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities. Biochem. J. 326: 929-939. Cantarel, B.L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V. and Henrissat, B. (2009). The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res. 
37: D233-D238. Carpita, N. and McCann, M. (2000). The Cell Wall. Biochemistry and Molecular Biology of Plants. Buchanan, B., Gruissem, W., Jones, R. Rockville MD, American Society of Plant Physiologists: 52-108. Carrard, G., Koivula, A., Söderlund, H. and Beguin, P. (2000). Cellulose-binding domains promote hydrolysis of different sites on crystalline cellulose. Proc. Natl. Acad. Sci. U. S. A. 97(19): 10342-10347. Cataldi, T.R.I., Campa, C. and De Benedetto, G.E. (2000). Carbohydrate analysis by high- performance anion-exchange chromatography with pulsed amperometric detection: The potential is still growing. Fresenius J. Anal. Chem. 368(8): 739-758. Cavaco-Paulo, A. (1998). Mechanism of cellulase action in textile processes. Carbohydr. Polym. 37(3): 273-277. Cavalier, D.M., Lerouxel, O., Neumetzler, L., Yamauchi, K., Reinecke, A., Freshour, G., Zabotina, O.A., Hahn, M.G., Burgert, I., Pauly, M., Raikhel, N.V. and Keegstra, K. (2008). Disrupting two Arabidopsis thaliana xylosyltransferase genes results in plants deficient in

xyloglucan, a major primary cell wall component. Plant Cell 20(6): 1519-1537. Cherry, J.R. and Fidantsef, A.L. (2003). Directed evolution of industrial enzymes: an update. Curr. Opin. Biotech. 14(4): 438-443. Chruszcz, M., Wlodawer, A. and Minor, W. (2008). Determination of protein structures - A series of fortunate events. Biophys. J. 95(1): 1-9. Cosgrove, D.J. (2005). Growth of the plant cell wall. Nat. Rev. Mol. Cell Bio. 6(11): 850-861. Coutinho, P.M. and Henrissat, B. (1999). Carbohydrate-active enzymes: an integrated database approach. Recent Advances in Carbohydrate Bioengineering. H.J. Gilbert, G.D., B. Henrissat and B. Svensson eds. Cambridge, The Royal Society of Chemistry: 3-12. Crameri, A., Raillard, S.A., Bermudez, E. and Stemmer, W.P.C. (1998). DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature 391(6664): 288-291. Crout, D.H.G. and Vic, G. (1998). Glycosidases and glycosyl transferases in glycoside and oligosaccharide synthesis. Curr. Opin. Chem. Biol. 2(1): 98-111. Davies, G. and Henrissat, B. (1995). Structures and Mechanisms of Glycosyl Hydrolases. Structure 3(9): 853-859. Davies, G.J., Ducros, V.M.A., Varrot, A. and Zechel, D.L. (2003). Mapping the conformational itinerary of beta-glycosidases by X-ray crystallography. Biochem. Soc. Trans. 31: 523-527. Davies, G.J., Mackenzie, L., Varrot, A., Dauter, M., Brzozowski, A.M., Schülein, M. and Withers, S.G. (1998). Snapshots along an enzymatic reaction coordinate: Analysis of a retaining beta-glycoside hydrolase. Biochemistry 37(34): 11707-11713. Davies, G.J., Wilson, K.S. and Henrissat, B. (1997). Nomenclature for sugar-binding subsites in glycosyl hydrolases. Biochem. J. 321: 557-559. Dembowski, N.J. and Kantrowitz, E.R. (1994). The use of alanine scanning mutagenesis to determine the role of the N-terminus of the regulatory chain in the heterotropic mechanism of Escherichia coli aspartate transcarbamoylase. Protein Eng. 7(5): 673-679. Derewenda, Z.S. 
(2004). The use of recombinant methods and molecular engineering in protein crystallization. Methods 34(3): 354-363. Dietrich, J.A., Yoshikuni, Y., Fisher, K.J., Woolard, F.X., Ockey, D., McPhee, D.J., Renninger, N.S., Chang, M.C.Y., Baker, D. and Keasling, J.D. (2009). A novel semi- biosynthetic route for Artemisinin production using engineered substrate-promiscuous P450(BM3). ACS Chem. Biol. 4(4): 261-267. Din, N., Damude, H.G., Gilkes, N.R., Miller, R.C., Warren, R.A.J. and Kilburn, D.G. (1994). C-1-C-X revisited - intramolecular synergism in a cellulase. Proc. Natl. Acad. Sci. U. S. A. 91(24): 11383-11387. Din, N., Gilkes, N.R., Tekant, B., Miller, R.C., Warren, A.J. and Kilburn, D.G. (1991). Non-hydrolytic disruption of cellulose fibers by the binding domain of a bacterial cellulase. Biotechnology 9(11): 1096-1099. Divne, C., Ståhlberg, J., Reinikainen, T., Ruohonen, L., Pettersson, G., Knowles, J.K.C., Teeri, T.T. and Jones, T.A. (1994). The 3-dimensional crystal-structure of the catalytic core of

cellobiohydrolase-I from Trichoderma reesei. Science 265(5171): 524-528. Drummond, D.A., Silberg, J.J., Meyer, M.M., Wilke, C.O. and Arnold, F.H. (2005). On the conservative nature of intragenic recombination. Proc. Natl. Acad. Sci. U. S. A. 102(15): 5380- 5385. Ducros, V.M.A., Tarling, C.A., Zechel, D.L., Brzozowski, A.M., Frandsen, T.P., von Ossowski, I., Schülein, M., Withers, S.G. and Davies, G.J. (2003). Anatomy of glycosynthesis: Structure and kinetics of the Humicola insolens Cel7B E197A and E197S glycosynthase mutants. Chem. Biol. 10(7): 619-628. The Economist (2010). "Genesis redux", 20 May 2010. Eklöf, J.M. and Brumer, H. (2010). The XTH Gene Family: An update on enzyme structure, function, and phylogeny in xyloglucan remodeling. Plant Physiol 153(2): 456-466. El Rassi, Z. and Nashabeh, W. (1995). High Performance Capillary Electrophoresis of carbohydrates and glycoconjugates. Carbohydrate Analysis: High Performance Liquid Chromatography and Capillary Electrophoresis. El Rassi, Z. Amsterdam, Elsevier: 267-360. Endelman, J.B., Silberg, J.J., Wang, Z.G. and Arnold, F.H. (2004). Site-directed protein recombination as a shortest-path problem. Protein Eng. Des. Sel. 17(7): 589-594. Faijes, M. and Planas, A. (2007). In vitro synthesis of artificial polysaccharides by glycosidases and glycosynthases. Carbohydr. Res. 342(12-13): 1581-1594. Farkas, V., Sulova, Z., Stratilova, E., Hanna, R. and Maclachlan, G. (1992). Cleavage of xyloglucan by nasturtium seed xyloglucanase and transglycosylation to xyloglucan subunit oligosaccharides. Arch. Biochem. Biophys. 298(2): 365-370. Fauré, R., Cavalier, D., Keegstra, K., Cottaz, S. and Driguez, H. (2007). Glycosynthase- assisted synthesis of xylo-gluco-oligosaccharide probes for alpha-xylosyltransferases. Eur. J. Org. Chem. (26): 4313-4319. Fauré, R., Saura-Valls, M., Brumer, H., Planas, A., Cottaz, S. and Driguez, H. (2006). 
Synthesis of a library of xylogluco-oligosaccharides for active-site mapping of xyloglucan endo- transglycosylase. J. Org. Chem. 71(14): 5151-5161. Francisco, J.A., Stathopoulos, C., Warren, R.A.J., Kilburn, D.G. and Georgiou, G. (1993). Specific adhesion and hydrolysis of cellulose by intact Escherichia coli expressing surface anchored cellulase or cellulose binding domains. Biotechnology 11(4): 491-495. Fry, S.C., Smith, R.C., Renwick, K.F., Martin, D.J., Hodge, S.K. and Matthews, K.J. (1992). Xyloglucan endotransglycosylase, a new wall-loosening enzyme-activity from plants. Biochem. J. 282: 821-828. Fry, S.C., York, W.S., Albersheim, P., Darvill, A., Hayashi, T., Joseleau, J.P., Kato, Y., Lorences, E.P., Maclachlan, G.A., Mcneil, M., Mort, A.J., Reid, J.S.G., Seitz, H.U., Selvendran, R.R., Voragen, A.G.J. and White, A.R. (1993). An unambiguous nomenclature for xyloglucan-derived oligosaccharides. Physiol. Plant. 89(1): 1-3. Gao, P.J., Chen, G.J., Wang, T.H., Zhang, Y.S. and Liu, J. (2001). Non-hydrolytic disruption of crystalline structure of cellulose by cellulose binding domain and linker sequence of cellobiohydrolase I from Penicillium janthinellum. Acta Biochim. Biophys. Sin. 33(1): 13-18. Gibson, D.G., Glass, J.I., Lartigue, C., Noskov, V.N., Chuang, R.Y., Algire, M.A.,

Benders, G.A., Montague, M.G., Ma, L., Moodie, M.M., Merryman, C., Vashee, S., Krishnakumar, R., Assad-Garcia, N., Andrews-Pfannkoch, C., Denisova, E.A., Young, L., Qi, Z.Q., Segall-Shapiro, T.H., Calvey, C.H., Parmar, P.P., Hutchison, C.A., Smith, H.O. and Venter, J.C. (2010). Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329(5987): 52-56. Gilbert, H.J., Stålbrand, H. and Brumer, H. (2008). How the walls come crumbling down: recent structural biochemistry of plant polysaccharide degradation. Curr. Opin. Plant Biol. 11(3): 338-348. Gloster, T.M., Ibatullin, F.M., Macauley, K., Eklöf, J.M., Roberts, S., Turkenburg, J.P., Björnvad, M.E., Jörgensen, P.L., Danielsen, S., Johansen, K.S., Borchert, T.V., Wilson, K.S., Brumer, H. and Davies, G.J. (2007). Characterization and three-dimensional structures of two distinct bacterial xyloglucanases from families GH5 and GH12. J. Biol. Chem. 282(26): 19177- 19189. Greener, A., Callahan, M. and Jerpseth, B. (1997). An efficient random mutagenesis technique using an E-coli mutator strain. Mol. Biotech. 7(2): 189-195. Guerreiro, C., Fontes, C., Gama, M. and Domingues, L. (2008). Escherichia coli expression and purification of four antimicrobial peptides fused to a family 3 carbohydrate-binding module (CBM) from Clostridium thermocellum. Prot. Expr. Pur. 59(1): 161-168. Gullfot, F. (2009). Synthesis of xyloglucan oligo- and polysaccharides with glycosynthase technology. Licentiate thesis. Stockholm, KTH Royal Institute of Technology. Gullfot, F., Ibatullin, F.M., Sundqvist, G., Davies, G.J. and Brumer, H. (2009a). Functional characterization of xyloglucan glycosynthases from GH7, GH12, and GH16 scaffolds. Biomacromolecules 10(7): 1782-1788. Gullfot, F., Tan, T.C., von Schantz, L., Karlsson, E.N., Ohlin, M., Brumer, H. and Divne, C. (2009b). The crystal structure of XG-34, an evolved xyloglucan-specific carbohydrate-binding module. Proteins 78(3): 785-789. 
Gunnarsson, L.C., Karlsson, E.N., Albrekt, A.S., Andersson, M., Holst, O. and Ohlin, M. (2004). A carbohydrate binding module as a diversity-carrying scaffold. Protein Eng. Des. Sel. 17(3): 213-221. Gunnarsson, L.C., Zhou, Q., Montanier, C., Karlsson, E.N., Brumer, H., 3rd and Ohlin, M. (2006). Engineered xyloglucan specificity in a carbohydrate-binding module. Glycobiology 16(12): 1171-80. Gustavsson, M., Lehtiö, J., Denman, S., Teeri, T.T., Hult, K. and Martinelle, M. (2001). Stable linker peptides for a cellulose-binding domain-lipase expressed in Pichia pastoris. Protein Eng. 14(9): 711-715. Gustavsson, M.T., Persson, P.V., Iversen, T., Martinelle, M., Hult, K., Teeri, T.T. and Brumer, H. (2005). Modification of cellulose fiber surfaces by use of a lipase and a xyloglucan endotransglycosylase. Biomacromolecules 6(1): 196-203. Hahn, M., Keitel, T. and Heinemann, U. (1995). Crystal and molecular-structure at 0.16-nm resolution of the hybrid Bacillus endo-1,3-1,4-beta-D-glucan 4-glucanohydrolase H(A16-M). Eur. J. Biochem. 232(3): 849-858. Hancock, S.M., Vaughan, M.D. and Withers, S.G. (2006). Engineering of glycosidases and

glycosyltransferases. Curr. Opin. Chem. Biol. 10(5): 509-19. Heinzelman, P., Snow, C.D., Smith, M.A., Yu, X.L., Kannan, A., Boulware, K., Villalobos, A., Govindarajan, S., Minshull, J. and Arnold, F.H. (2009a). SCHEMA recombination of a fungal cellulase uncovers a single mutation that contributes markedly to stability. J. Biol. Chem. 284(39): 26229-26233. Heinzelman, P., Snow, C.D., Wu, I., Nguyen, C., Villalobos, A., Govindarajan, S., Minshull, J. and Arnold, F.H. (2009b). A family of thermostable fungal cellulases created by structure-guided recombination. Proc. Natl. Acad. Sci. U.S.A. 106(14): 5610-5615. Hoffman, M., Jia, Z.H., Pena, M.J., Cash, M., Harper, A., Blackburn, A.R., Darvill, A. and York, W.S. (2005). Structural analysis of xyloglucans in the primary cell walls of plants in the subclass Asteridae. Carbohydr. Res. 340(11): 1826-1840. Hutchison, C.A., Phillips, S., Edgell, M.H., Gillam, S., Jahnke, P. and Smith, M. (1978). Mutagenesis at a specific position in a DNA-sequence. J. Biol. Chem. 253(18): 6551-6560. Hwang, S., Ahn, J., Lee, S., Lee, T.G., Haam, S., Lee, K., Ahn, I.S. and Jung, J.K. (2004). Evaluation of cellulose-binding domain fused to a lipase for the lipase immobilization. Biotechnol. Lett. 26(7): 603-605. Ibatullin, F.M., Baumann, M.J., Greffe, L. and Brumer, H. (2008). Kinetic analyses of retaining endo-(xylo)glucanases from plant and microbial sources using new chromogenic xylogluco-oligosaccharide aryl glycosides. Biochemistry 47(29): 7762-7769. Joern, J.M. (2003). DNA shuffling. Directed Evolution Library Creation: Methods and Protocols. Arnold, F.H. and Georgiou, G. Totowa, Humana Press Inc. Johansson, P., Brumer, H., Baumann, M.J., Kallas, Å.M., Henriksson, H., Denman, S.E., Teeri, T.T. and Jones, T.A. (2004). Crystal structures of a poplar xyloglucan endotransglycosylase reveal details of transglycosylation acceptor binding. Plant Cell 16(4): 874- 886. Johnson, P.E., Joshi, M.D., Tomme, P., Kilburn, D.G. and McIntosh, L.P. (1996). 
Structure of the N-terminal cellulose-binding domain of Cellulomonas fimi CenC determined by nuclear magnetic resonance spectroscopy. Biochemistry 35(45): 14381-14394. Kaewthai, N., Eklöf, J.M., Klockare, L., Ezcurra, I. and Brumer, H. (2008). Development of a high-throughput assay for screening xyloglucan endo-transglycosylase and endo- xyloglucanase expression. 24th International Carbohydrate Symposium. Oslo, Norway. Kallas, Å. (2006). Heterologous expression, characterization and applications of carbohydrate active enzymes and binding modules. Doctorate thesis. Stockholm, KTH Royal Institute of Technology. Kauffmann, C., Shoseyov, O., Shpigel, E., Bayer, E.A., Lamed, R., Shoham, Y. and Mandelbaum, R.T. (2000). Novel methodology for enzymatic removal of atrazine from water by CBD-fusion protein immobilized on cellulose. Environ. Sci. Technol. 34(7): 1292-1296. Kavoosi, M., Meijer, J., Kwan, E., Creagh, A.L., Kilburn, D.G. and Haynes, C.A. (2004). Inexpensive one-step purification of polypeptides expressed in Escherichia coli as fusions with the family 9 carbohydrate-binding module of xylanase 10A from T-maritima. J. Chrom. B 807(1): 87- 94.

Knox, J.P. (2008). Revealing the structural and functional diversity of plant cell walls. Curr. Opin. Plant Biol. 11(3): 308-313. Koshland, D.E. (1953). Stereochemistry and the mechanism of enzymatic reactions. Biol. Rev. 28: 416-436. Kraulis, P.J., Clore, G.M., Nilges, M., Jones, T.A., Pettersson, G., Knowles, J. and Gronenborn, A.M. (1989). Determination of the 3-dimensional solution structure of the C- terminal domain of cellobiohydrolase-I from Trichoderma reesei - a study using nuclear magnetic resonance and hybrid distance geometry dynamical simulated annealing. Biochemistry 28(18): 7241- 7257. Ladbury, J.E. and Chowdhry, B.Z. (1996). Sensing the heat: The application of isothermal titration calorimetry to thermodynamic studies of biomolecular interactions. Chem. Biol. 3(10): 791-801. Landwehr, M., Carbone, M., Otey, C.R., Li, Y.G. and Arnold, F.H. (2007). Diversification of catalytic function in a synthetic family of chimeric cytochrome P450s. Chem. Biol. 14: 269-278. Lehtiö, J., Wernerus, H., Samuelson, P., Teeri, T.T. and Stahl, S. (2001). Directed immobilization of recombinant staphylococci on cotton fibers by functional display of a fungal cellulose-binding domain. FEMS Microbiol. Lett. 195(2): 197-204. Levy, I., Paldi, T. and Shoseyov, O. (2004). Engineering a bifunctional starch-cellulose cross- bridge protein. Biomaterials 25(10): 1841-1849. Lin, H.N., Tao, H.Y. and Cornish, V.W. (2004). Directed evolution of a glycosynthase via chemical complementation. J. Am. Chem. Soc. 126(46): 15051-15059. Lindqvist, J., Nyström, D., Ostmark, E., Antoni, P., Carlmark, A., Johansson, M., Hult, A. and Malmstrom, E. (2008). Intelligent dual-responsive cellulose surfaces via surface-initiated ATRP. Biomacromolecules 9(8): 2139-2145. Löfblom, J., Feldwisch, J., Tolmachev, V., Carlsson, J., Ståhl, S. and Frejd, F.Y. (2010). Affibody molecules: Engineered proteins for therapeutic, diagnostic and biotechnological applications. FEBS Lett. 584(12): 2670-2680. 
Lönnberg, H., Zhou, Q., Brumer, H., Teeri, T.T., Malmström, E. and Hult, A. (2006). Grafting of cellulose fibers with poly(epsilon-caprolactone) and poly(L-lactic acid) via ring- opening polymerization. Biomacromolecules 7(7): 2178-2185. Luft, J.R. and DeTitta, G.T. (2009). Rational selection of crystallization techniques. Protein Crystallisation. Bergfors, T. La Jolla, International University Line: 11-46. Lutz, S. and Bornscheuer, U.T., Eds. (2009). Protein Engineering Handbook. Weinheim, Wiley- VCH. Mackenzie, L.F., Wang, Q.P., Warren, R.A.J. and Withers, S.G. (1998). Glycosynthases: Mutant glycosidases for oligosaccharide synthesis. J. Am. Chem. Soc. 120(22): 5583-5584. Malet, C. and Planas, A. (1998). From beta-glucanase to beta-glucansynthase: glycosyl transfer to alpha-glycosyl fluorides catalyzed by a mutant endoglucanase lacking its catalytic nucleophile. FEBS Lett. 440(1-2): 208-212. Marcus, S.E., Verhertbruggen, Y., Herve, C., Ordaz-Ortiz, J.J., Farkas, V., Pedersen, H.L., Willats, W.G.T. and Knox, J.P. (2008). Pectic homogalacturonan masks abundant sets of

xyloglucan epitopes in plant cell walls. BMC Plant Biol. 8: 12. Mark, P., Baumann, M.J., Eklöf, J.M., Gullfot, F., Michel, G., Kallas, Å.M., Teeri, T.T., Brumer, H. and Czjzek, M. (2009). Analysis of nasturtium TmNXG1 complexes by crystallography and molecular dynamics provides detailed insight into substrate recognition by family GH16 xyloglucan endo-transglycosylases and endo-hydrolases. Proteins 75(4): 820-836. McCartney, L., Gilbert, H.J., Bolam, D.N., Boraston, A.B. and Knox, J.P. (2004). Glycoside hydrolase carbohydrate-binding modules as molecular probes for the analysis of plant cell wall polymers. Anal. Biochem. 326(1): 49-54. Meyer, M.M., Hochrein, L. and Arnold, F.H. (2006). Structure-guided SCHEMA recombination of distantly related beta-lactamases. Protein Eng. Des. Sel. 19: 563-570. Meyer, M.M., Silberg, J.J., Voigt, C.A., Endelman, J.B., Mayo, S.L., Wang, Z.G. and Arnold, F.H. (2003). Library analysis of SCHEMA-guided protein recombination. Protein Sci. 12(8): 1686-1693. Michel, G., Chantalat, L., Duee, E., Barbeyron, T., Henrissat, B., Kloareg, B. and Dideberg, O. (2001). The kappa-carrageenase of P-carrageenovora features a tunnel-shaped active site: A novel insight in the evolution of clan-B glycoside hydrolases. Structure 9(6): 513-525. Mishra, A. and Malhotra, A.V. (2009). Tamarind xyloglucan: a polysaccharide with versatile application potential. J. Mater. Chem. 19(45): 8528-8536. Moller, I., Sörensen, I., Bernal, A.J., Blaukopf, C., Lee, K., Öbro, J., Pettolino, F., Roberts, A., Mikkelsen, J.D., Knox, J.P., Bacic, A. and Willats, W.G.T. (2007). High- throughput mapping of cell-wall polymers within and between plants using novel microarrays. Plant J. 50(6): 1118-1128. Nam, J.M., Fujita, Y., Arai, T., Kondo, A., Morikawa, Y., Okada, H., Ueda, M. and Tanaka, A. (2002). Construction of engineered yeast with the ability of binding to cellulose. J. Mol. Cat. B 17(3-5): 197-202. 
Newman, J., Egan, D., Walter, T.S., Meged, R., Berry, I., Ben Jelloul, M., Sussman, J.L., Stuart, D.I. and Perrakis, A. (2005). Towards rationalization of crystallization screening for small- to medium-sized academic laboratories: the PACT/JCSG plus strategy. Acta Crystallogr. D 61: 1426-1431. Nishitani, K. and Tominaga, R. (1992). Endoxyloglucan transferase, a novel class of glycosyltransferase that catalyzes transfer of a segment of xyloglucan molecule to another xyloglucan molecule. J. Biol. Chem. 267(29): 21058-21064. Notenboom, V., Boraston, A.B., Kilburn, D.G. and Rose, D.R. (2001). Crystal structures of the family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A in native and ligand-bound forms. Biochemistry 40(21): 6248-6256. Novy, R., Yaeger, K., Monsma, S., McCormick, M., Berg, J., Shoseyov, O., Shpigel, E., Seigel, D., Goldlust, A., Efroni, G., Singer, Y., Kilburn, D., Tomme, P. and Gilkes, N. (1997). Cellulose binding domain expression vectors for the rapid, low cost purification of CBD- fusion proteins. FASEB J. 11(9): 1715. Nygren, P.Å. (2008). Alternative binding proteins: Affibody binding proteins developed from a small three-helix bundle scaffold. FEBS J. 275(11): 2668-2676.

O'Brien, R., Ladbury, J.E. and Chowdhry, B.Z. (2000). Isothermal titration calorimetry of biomolecules. Protein-Ligand Interactions: hydrodynamics and calorimetry. Harding, S.E. and Chowdhry, B.Z. Oxford, Oxford University Press: 263-286. Ofir, K., Berdichevsky, Y., Benhar, I., Azriel-Rosenfeld, R., Lamed, R., Barak, Y., Bayer, E.A. and Morag, E. (2005). Versatile protein microarray based on carbohydrate-binding modules. Proteomics 5(7): 1806-1814. Otey, C.R., Landwehr, M., Endelman, J.B., Hiraga, K., Bloom, J.D. and Arnold, F.H. (2006). Structure-guided recombination creates an artificial family of cytochromes P450. PLoS Biol. 4: 789-798. Otey, C.R., Silberg, J.J., Voigt, C.A., Endelman, J.B., Bandara, G. and Arnold, F.H. (2004). Functional evolution and structural conservation in chimeric cytochromes P450: Calibrating a structure-guided approach. Chem. Biol. 11(3): 309-318. Otten, L.G. and Quax, W.J. (2005). Directed evolution: selecting today's biocatalysts. Biomol. Eng. 22(1-3): 1-9. Pala, H., Pinto, R., Mota, M., Duarte, A.P. and Gama, F.M. (2003). Cellulose-binding domains as a tool for paper recycling. Appl. Enz. Lignocell. 855: 105-115. Pattathil, S., Avci, U., Baldwin, D., Swennes, A.G., McGill, J.A., Popper, Z., Bootten, T., Albert, A., Davis, R.H., Chennareddy, C., Dong, R.H., O'Shea, B., Rossi, R., Leoff, C., Freshour, G., Narra, R., O'Neil, M., York, W.S. and Hahn, M.G. (2010). A comprehensive toolkit of plant cell wall glycan-directed monoclonal antibodies. Plant Physiol. 153(2): 514-525. Pena, M.J., Darvill, A.G., Eberhard, S., York, W.S. and O'Neill, M.A. (2008). Moss and liverwort xyloglucans contain galacturonic acid and are structurally distinct from the xyloglucans synthesized by hornworts and vascular plants. Glycobiology 18(11): 891-904. Perugino, G., Cobucci-Ponzano, B., Rossi, M. and Moracci, M. (2005). Recent advances in the oligosaccharide synthesis promoted by catalytically engineered glycosidases. Adv. Synth. Catal. 347(7-8): 941-950. 
Piens, K., Fauré, R., Sundqvist, G., Baumann, M.J., Saura-Valls, M., Teeri, T.T., Cottaz, S., Planas, A., Driguez, H. and Brumer, H. (2008). Mechanism-based labeling defines the free energy change for formation of the covalent glycosyl-enzyme intermediate in a xyloglucan endo-transglycosylase. J. Biol. Chem. 283(32): 21864-21872. Piens, K., Henriksson, A.M., Gullfot, F., Lopez, M., Fauré, R., Ibatullin, F.M., Teeri, T.T., Driguez, H. and Brumer, H. (2007). Glycosynthase activity of hybrid aspen xyloglucan endo-transglycosylase PttXET16-34 nucleophile mutants. Org. Biomol. Chem. 5(24): 3971-3978. Planas, A. (2000). Bacterial 1,3-1,4-beta-glucanases: structure, function and protein engineering. Biochim. Biophys. Acta-Protein Struct. Molec. Enzym. 1543(2): 361-382. Popper, Z.A. and Fry, S.C. (2004). Primary cell wall composition of pteridophytes and spermatophytes. New Phytol. 164(1): 165-174. Proctor, M.R., Taylor, E.J., Nurizzo, D., Turkenburg, J.P., Lloyd, R.M., Vardakou, M., Davies, G.J. and Gilbert, H.J. (2005). Tailored catalysts for plant cell-wall degradation: Redesigning the exo/endo preference of Cellvibrio japonicus arabinanase 43A. Proc. Natl. Acad. Sci. U. S. A. 102(8): 2697-2702.

Puhlmann, J., Bucheli, E., Swain, M.J., Dunning, N., Albersheim, P., Darvill, A.G. and Hahn, M.G. (1994). Generation of monoclonal antibodies against plant cell-wall polysaccharides. 1. Characterization of a monoclonal antibody to a terminal alpha-(1-2)-linked fucosyl-containing epitope. Plant Physiol. 104(2): 699-710. Rapley, R. (2000). Molecular cloning and gene analysis. Principles and Techniques of Practical Biochemistry. Wilson, K. and Walker, J. Cambridge, Cambridge University Press. Reiter, W.D. (2002). Biosynthesis and properties of the plant cell wall. Curr. Opin. Plant Biol. 5(6): 536-42. Rodriguez, B., Kavoosi, M., Koska, J., Creagh, A.L., Kilburn, D.G. and Haynes, C.A. (2004). Inexpensive and generic affinity purification of recombinant proteins using a family 2a CBM fusion tag. Biotech. Progress 20(5): 1479-1489. Rotticci-Mulder, J.C., Gustavsson, M., Holmquist, M., Hult, K. and Martinelle, M. (2001). Expression in Pichia pastoris of Candida antarctica lipase B and lipase B fused to a cellulose-binding domain. Protein Expr. Purif. 21(3): 386-392. Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. and Erlich, H.A. (1988). Primer-directed enzymatic amplification of DNA with a thermostable DNA-polymerase. Science 239(4839): 487-491. Sandquist, D., Filonova, L., von Schantz, L., Ohlin, M. and Daniel, G. (2010). Microdistribution of xyloglucan in differentiating poplar cells. Bioresources 5(2): 796-807. Saura-Valls, M., Fauré, R., Brumer, H., Teeri, T.T., Cottaz, S., Driguez, H. and Planas, A. (2008). Active-site mapping of a Populus xyloglucan endo-transglycosylase with a library of xylogluco-oligosaccharides. J. Biol. Chem. 283(32): 21853-21863. Saura-Valls, M., Fauré, R., Ragas, S., Piens, K., Brumer, H., Teeri, T.T., Cottaz, S., Driguez, H. and Planas, A. (2006). Kinetic analysis using low-molecular mass xyloglucan oligosaccharides defines the catalytic mechanism of a Populus xyloglucan endotransglycosylase. 
Biochem. J. 395: 99-106. Shirakawa, M., Yamatoya, K. and Nishinari, K. (1998). Tailoring of xyloglucan properties using an enzyme. Food Hydrocolloids 12: 25-28. Shoseyov, O., Shani, Z. and Levy, I. (2006). Carbohydrate binding modules: Biochemical properties and novel applications. Microbiol. Mol. Biol. Rev. 70(2): 283-+. Silberg, J.J., Endelman, J.B. and Arnold, F.H. (2004). SCHEMA-guided protein recombination. Protein Eng. 388: 35-42. Simpson, P.J., Jamieson, S.J., Abou-Hachem, M., Karlsson, E.N., Gilbert, H.J., Holst, O. and Williamson, M.P. (2002). The solution structure of the CBM4-2 carbohydrate binding module from a thermostable Rhodothermus marinus xylanase. Biochemistry 41(18): 5712-9. Sinnott, M.L. (1990). Catalytic mechanisms of enzymatic glycosyl transfer. Chem. Rev. 90(7): 1171-1202. Skerra, A. (2008). Alternative binding proteins: Anticalins - harnessing the structural plasticity of the lipocalin ligand pocket to engineer novel binding activities. FEBS J. 275(11): 2677-2683. Stemmer, W.P.C. (1994). DNA shuffling by random fragmentation and reassembly - in-vitro recombination for molecular evolution. Proc. Natl. Acad. Sci. U. S. A. 91(22): 10747-10751.

Sulova, Z., Lednicka, M. and Farkas, V. (1995). A colorimetric assay for xyloglucan- endotransglycosylase from germinating seeds. Anal. Biochem. 229(1): 80-85. Sulzenbacher, G., Driguez, H., Henrissat, B., Schülein, M. and Davies, G.J. (1996). Structure of the Fusarium oxysporum endoglucanase I with a nonhydrolyzable substrate analogue: Substrate distortion gives rise to the preferred axial orientation for the leaving group. Biochemistry 35(48): 15280-15287. Suurnakki, A., Tenkanen, M., Siika-Aho, M., Niku-Paavola, M.L., Viikari, L. and Buchert, J. (2000). Trichoderma reesei cellulases and their core domains in the hydrolysis and modification of chemical pulp. Cellulose 7(2): 189-209. Tao, H.Y. and Cornish, V.W. (2002). Milestones in directed enzyme evolution. Curr. Opin. Chem. Biol. 6(6): 858-864. Thorpe, R. and Thorpe, S. (2000). Immunochemical techniques. Principles and Techniques of Practical Biochemistry. Wilson, K. and Walker, J. Cambridge, Cambridge University Press: 206-262. van den Steen, P., Rudd, P.M., Dwek, R.A. and Opdenakker, G. (1998). Concepts and principles of O-linked glycosylation. Crit. Rev. Biochem. Mol. Biol. 33(3): 151-208. Vasur, J., Kawai, R., Larsson, A.M., Igarashi, K., Sandgren, M., Samejima, M. and Ståhlberg, J. (2006). X-ray crystallographic native sulfur SAD structure determination of laminarinase Lam16A from Phanerochaete chrysosporium. Acta Crystallogr. D 62: 1422-1429. Vincken, J.P., Beldman, G. and Voragen, A.G.J. (1997). Substrate specificity of endoglucanases: What determines xyloglucanase activity? Carbohydr. Res. 298(4): 299-310. Vocadlo, D.J. and Withers, S.G. (2000). Glycosidase catalysed oligosaccharide synthesis. Carbohydrates in Chemistry and Biology. Ernst, B., Hart, G.W. and Sinay, P. Weinheim, Wiley-VCH GmbH. 2: 723–844. Voigt, C.A., Martinez, C., Wang, Z.G., Mayo, S.L. and Arnold, F.H. (2002). Protein building blocks preserved by recombination. Nat. Struct. Biol. 9(7): 553-558. 
von der Osten, C., Björnvad, M., Vind, J., Rasmussen, M.D. and Björnvad, M.E. (1997a). Modification (i.e. polishing or roughening) of saccharide fibres which comprises a catalytically active amino acid sequence of a non-cellulolytic enzyme linked to an amino acid sequence comprising a cellulose binding domain, Novo-Nordisk As. Patent numbers WO9728256-A; EP877799-A; WO9728256-A1; AU9714383-A; EP877799-A1; CN1209838-A; US6017751-A. von der Osten, C., Cherry, J.R., Björnvad, M., Vind, J., Rasmussen, M., Björnvad, M.E., Rasmussen, M.D., Cherry, R., Björnvad, E. and Rasmussen, D. (1997b). Enzyme exhibiting cellulase activity from Bacillus sp. using an enzyme hybrid comprising a sequence of a non- cellulolytic enzyme linked to a cellulose-binding domain sequence, Novo-Nordisk as; Novozymes As. Patent numbers WO9728243-A; EP882123-A; WO9728243-A1; AU9714384-A; EP882123- A1; CN1209833-A; US6015783-A; EP882123-B1; DE69730821-E; ES2230594-T3; CN1109740- C; DE69730821-T2 von der Osten, C., Branner, S., Hastrup, S., Hedegaard, L., Rasmussen, M.D., Bisgardfrantzen, H., Carlsen, S. and Mikkelsen, J.M. (1993). Protein engineering of subtilisins to improve stability in detergent formulations. J. Biotech. 28(1): 55-68. von Schantz, L., Gullfot, F., Scheer, S., Filonova, L., Gunnarsson, L.C., Flint, J.E., Daniel, G., Nordberg-Karlsson, E., Brumer, H. and Ohlin, M. (2009). Affinity maturation

generates greatly improved xyloglucan-specific carbohydrate binding modules. BMC Biotech. 9. Whitney, S.E.C., Wilson, E., Webster, J., Bacic, A., Reid, J.S.G. and Gidley, M.J. (2006). Effects of structural variation in xyloglucan polymers on interactions with bacterial cellulose. Am. J. Bot. 93: 1402-1414. Wlodawer, A., Minor, W., Dauter, Z. and Jaskolski, M. (2008). Protein crystallography for non-crystallographers, or how to get the best (but not more) from published macromolecular structures. FEBS J. 275(1): 1-21. Wymer, N. and Toone, E.J. (2000). Enzyme-catalyzed synthesis of carbohydrates. Curr. Opin. Chem. Biol. 4(1): 110-119. York, W.S. and Hawkins, R. (2000). Preparation of oligomeric beta-glycosides from cellulose and hemicellulosic polysaccharides via the glycosyl transferase activity of a Trichoderma reesei cellulase. Glycobiology 10(2): 193-201. Zechel, D.L. and Withers, S.G. (1999). Glycosidase mechanisms: Anatomy of a finely tuned catalyst. Acc. Chem. Res. 33(1): 11-18. Zhou, Q., Greffe, L., Baumann, M.J., Malmström, E., Teeri, T.T. and Brumer, H. (2005). Use of xyloglucan as a molecular anchor for the elaboration of polymers from cellulose surfaces: A general route for the design of biocomposites. Macromolecules 38(9): 3547-3549. Zhou, Q., Rutland, M.W., Teeri, T.T. and Brumer, H. (2007). Xyloglucan in cellulose modification. Cellulose 14: 625-641.
