Heterologous expression, characterization and applications of carbohydrate active enzymes and binding modules

Åsa Kallas

Royal Institute of Technology School of Biotechnology Department of Wood Biotechnology Stockholm, 2006 Copyright ¤ Åsa Kallas Stockholm, 2006 ISBN 91-7178-349-0

Royal Institute of Technology AlbaNova University Center School of Biotechnology SE – 106 91 Stockholm Sweden

Printed at Universitetsservice US-AB Box 700 14 100 44 Stockholm Sweden

Cover illustration: Top left: Fluorescence microscopy showing a birch tension wood section labeled with CBM1HjCel7A-FITC. Top right: FE-SEM micrograph showing a transverse section of latewood poplar fibers. Bottom left and right: FE-SEM micrographs showing labeling of poplar G-layer cellulose aggregates with CBM1HjCel7A-Au/Ag (courtesy of Prof. Geoffrey Daniel). Åsa Kallas (2006). Heterologous expression, characterization and applications of carbohydrate active enzymes and binding modules. Doctoral thesis in Wood Biotechnology. School of Biotechnology, Department of Wood Biotechnology, Royal Institute of Technology, AlbaNova University Center, Stockholm, Sweden. ISBN 91-7178-349-0 ABSTRACT Wood and wood products are of great economical and environmental importance, both in Sweden and globally. Biotechnology can be used both for achieving raw material of improved quality and for industrial processes such as biobleaching. Despite the enormous amount of carbon that is fixed as wood, the knowledge about the enzymes involved in the biosynthesis, re-organization and degradation of plant cell walls is relatively limited. In order to exploit enzymes more efficiently or to develop new biotechnological processes, it is crucial to gain a better understanding of the function and mechanism of the enzymes. This work has aimed to increase the knowledge about some of the enzymes putatively involved in the wood forming processes in Populus. Xyloglucan endotransglycosylases and a putative xylanase represent transglycosylating and hydrolytic enzymes, respectively. Carbohydrate binding modules represent non-catalytic modules, which bind to the substrate. Among 24 encoding for putative xyloglucan endotransglycosylases or xyloglucan endohydrolases that were identified in the Populus EST database, two were chosen for further studies (PttXTH16-34 and PttXTH16-35). The corresponding proteins, PttXET16-34 and PttXET16-35, were expressed in P. pastoris, purified and biochemically characterized. The importance of the N-glycans was investigated by comparing the recombinant wild-type proteins with their deglycosylated counterparts. In order to obtain the large amounts of PttXET16-34 that were needed for crystallization and development of biotechnological applications, the conditions for the large-scale production of PttXET16-34 in a fermenter were optimized. In microorganisms, endo-(1,4)-E-xylanases are important members of the xylan degrading machinery. These enzymes are also present in plants where they might fulfill a similar, but probably more restrictive function. One putative endo-(1,4)-E-xylanase, denoted PttXYN10A, was identified in the hybrid aspen EST library. Sequence analysis shows that this protein contains three putative carbohydrate-binding modules (CBM) from family 22 in addition to the catalytic module from GH10. Heterologous expression and reverse genetics were applied in order to elucidate the function of the catalytic module as well as the binding modules of PttXYN10A. Just as in microorganisms, some of the carbohydrate active enzymes from plants have one or more CBM attached to the catalytic module. So far, a very limited number of plant CBMs has been biochemically characterized. A detailed bio-informatic analysis of the CBM family 43 revealed interesting modularity patterns. In addition, one CBM43 (CBM43PttGH17_84) from a putative Populus E-(1,3)-glucanase was expressed in E. coli and shown to bind to laminarin (E-(1,3)-glucan), mixed-linked E-(1,3)(1,4)-glucans and crystalline cellulose. Due to their high specificity for different carbohydrates, CBMs can be used as probes for the analysis of plant materials. Generally, they are more specific than both staining techniques and carbohydrate-binding antibodies. We have used cellulose- and mannan binding modules from microorganisms as tools for the analysis of intact fibers as well as processed pulps.

Keywords: Populus, xyloglucan endotransglycosylase, carbohydrate binding modules, endo-(1,4)-E- xylanase, , , N-glycosylation, fiber analysis

Copyright ¤ Åsa Kallas, 2006 Åsa Kallas (2006). Heterologous expression, characterization and applications of carbohydrate active enzymes and binding modules. Doktorsavhandling i Träbioteknik. Skolan för Bioteknologi, Avdelningen för Träbioteknik, Kungliga Tekniska Högskolan, AlbaNova Universitetscenter, Stockholm, Sverige. ISBN 91-7178-349-0

SAMMANFATTNING Trä- och träprodukter har stor ekonomisk och miljömässig betydelse, både i Sverige och globalt. Bioteknik kan användas både för att förbättra råvaran och för industriella processer, såsom bioblekning. Trots de enorma mängder kol som är bundet i form av ved så är kunskapen om de enzymer som är involverade i uppbyggnaden, omorganisationen och nedbrytningen av ved relativt begränsad. För att kunna utnyttja enzymer mer effektivt eller utveckla nya bioteknologiska processer är det nödvändigt att erhålla en bättre förståelse av enzymernas mekanism och funktion. Målet med detta arbete har varit en ökad kunskap om några av de enzymer som potentiellt är involverade i vedbildningsprocessen i Populus. Xyloglukan endotransglykosylaserna respektive det potentiella xylanaset representerar transglykosylerande och hydrolyserande enzymer. De kolhydratbindande modulerna representerar icke-katalytiska moduler som binder till substratet. Av 24 gener som kodar för potentiella xyloglukan endotransglykosylaser eller xyloglucan endohydrolaser identifierade i Populus EST databasen valdes två ut för vidare studier (PttXTH16-34 och PttXTH16-35). De motsvarande proteinerna, PttXET16-34 och PttXET16- 35 uttrycktes i P. pastoris, renades och karaktäriserades biokemiskt. Betydelsen av N- glykosyleringen undersöktes genom att jämföra de rekombinanta vildtyps-proteinerna med de deglykosyerade varianterna. För att erhålla de mängder av PttXET16-34 som krävdes för kristallisering och utveckling av bioteknologiska applikationer, optimerades parametrarna för odling av PttXET16-34 i fermentor. I mikroorgansimer är endo-(1,4)-E-xylanaser viktiga delar av det xylan-nedbrytande maskineriet. Dessa enzymer uttrycks också i växter där de antas fylla en liknande, men troligen mer restriktiv funktion. Ett potentiellt endo-(1,4)-E-xylanas, kallat PttXYN10A, identifierades i hybrid asp-EST-biblioteket. Sekvensanalys visar att proteinet, förutom en katalytisk modul från GH10 också innehåller tre potentiella kolhydratbindande moduler (CBM) från familj 22. Rekombinant protein-uttryck och genmodifierade växter användes för att studera den katalytiska modulen samt de bindande modulerna av PttXYN10A. Somliga växtenzymer har, i likhet med många mikrobiella enzymer, en eller flera CBM bundna till den katalytiska modulen. Hittills har ett mycket begränsat antal växt-CBM karaktäriserats biokemiskt. En detaljerad bioinformatisk analys av CBM familj 43 påvisade intressanta modularitetsmönster. Dessutom uttrycktes en CBM43 (CBM43PttGH17_84) från ett potentiellt Populus E-(1,3)-glukanas i E. coli och visades binda till laminarin (E-(1,3)-glukan), E-(1,3)(1,4)-glukan och kristallin cellulosa. CBM kan tack vare deras höga specificitet för olika kolhydrater användas som prober för analys av växtmaterial. De är generellt mer specifika än både infärgningstekniker och kolhydrat-specifika antikroppar. Vi har använt cellulosa- och mannanbindande moduler från mikroorganismer som verktyg för analyser av både intakta fiberr och pappersmassor.

Nyckelord: Populus, xyloglukan endotransglykosylase, kolhydrat bindande moduler, endo-(1,4)-E- xylanas, Escherichia coli, Pichia pastoris, N-glykosylering, fiberanalys

Copyright ¤ Åsa Kallas, 2006 To my family

A tree’s leaves may be ever so good, So may its bark, so may its wood

Leaves Compared with Flowers, Robert Frost (1874-1963)

LIST OF PUBLICATIONS

This thesis is based on the following publications, which in the text are referred to by their roman numerals:

I. Å.M. Kallas, K. Piens, S.E. Denman, H. Henriksson, J. Fäldt, P. Johansson, H. Brumer III and T.T. Teeri (2005) Enzymatic properties of native and deglycosylated hybrid aspen (Populus tremula x tremuloides) xyloglucan endotransglycosylase 16A (PttXET16A) expressed in Pichia pastoris. Biochemical Journal, 390, 105-113

II. Å.M. Kallas, M.J. Baumann, J. Fäldt, H. Aspeborg, S. Denman, E.J. Mellerowicz, N.N. Nishikubo, H. Brumer III and T.T. Teeri. Enzymatic characterization of a recombinant xyloglucan endotransglycosylase PttXET16-35 from Populus tremula x tremuloides. Submitted to Planta.

III. M. Bollok, H. Henriksson, Å. Kallas, M. Jahic, T.T. Teeri and S-O. Enfors (2005) Production of poplar xyloglucan endotransglycosylase using the methylotrophic yeast Pichia pastoris. Applied Biochemistry and Biotechnology, 126, 61-77

IV. Å.M. Kallas, P.M. Coutinho, V. Bulone, E.J. Mellerowicz, H. Gilbert, B. Henrissat, T.T. Teeri. Characterization of a CBM43 module from hybrid aspen (Populus tremula x tremuloides): specificity of polysaccharide interaction and bioinformatic analysis. Manuscript.

V. L. Filonova, Å.M. Kallas, L. Greffe, T.T. Teeri, G. Daniel. Mapping of crystalline cellulose and mannan on the surfaces of wood tissues and pulp fibers using carbohydrate binding modules. Manuscript.

In addition, previously unpublished data is included.

Related publications:

i. P. Johansson, S. Denman, H. Brumer, Å.M. Kallas, H. Henriksson, T. Bergfors, T.T. Teeri and A. Jones (2003) Crystallization and preliminary X-ray analysis of a xyloglucan endotransglycosylase from Populus tremula X tremuloides. Acta Crystallographica, D59, 535-537

ii. P. Johansson, H. Brumer III, M. Baumann, Å. Kallas, H. Henriksson, S. Denman, T.T. Teeri and A. Jones (2004) Crystal structures of a xyloglucan endotransglycosylase reveal details of the transglycosylation acceptor binding. Plant Cell, 16, 874-886 iii. H. Aspeborg, J. Schrader, P.M. Coutinho, M. Stam, Å. Kallas, S. Djerbi, P. Nilsson, S. Denman, B. Amini, F. Sterky, E. Master, G. Sandberg, E. Mellerowicz, B. Sundberg, B. Henrissat and T.T. Teeri (2005) Carbohydrate-active enzymes involved in the secondary cell wall biogenesis in hybrid aspen. Plant Physiology, 137, 983-997

iv. G. Daniel, L. Filonova, Å.M. Kallas and T.T. Teeri. Use of the cellulose-binding module CBM1HjCel7A for morphological and biochemical characterization of the gelatinous cell wall layer in tension wood fibers of Populus tremula and Betula verrucosa using fluorescence and FE-SEM microscopy. Submitted to Holzforschung. TABLE OF CONTENTS

INTRODUCTION______1

1BIOTECHNOLOGY IN THE PULP AND PAPER INDUSTRY ______1 2WOOD FORMATION ______2 2.1 Components of the cell wall ______3 2.1.1 Cellulose and callose______3 2.1.2 Hemicelluloses and pectin______4 2.1.2.1 Xyloglucan ______4 2.1.2.2 Xylans______6 2.1.2.3 Glucomannan and galactoglucomannnan______6 2.1.2.4 Mixed linked E-glucans ______7 2.1.2.5 Pectins______9 2.1.3 Lignin ______9 2.2 Enzymes involved in cell wall synthesis, expansion and degradation______9 2.2.1 Classification of plant cell wall modifying enzymes ______10 2.2.2 Mechanisms of the plant cell wall modifying enzymes ______11 2.3 The XTH family______12 2.3.1 Role of XET and XEH in the plant cell wall______14 2.3.2 Mechanism and substrates for XET and XEH ______14 2.4 Carbohydrate binding modules______16 2.4.1 CBM classification and determination of binding properties ______16 2.4.2 Biotechnological applications of CBMs ______18 2.5 Xylanases ______18 2.5.1 Sequence and structural features of endo-(1,4)-E-xylanases from GH family 10______18 2.5.2 Xylanases in biotechnology______19 3ENZYME DISCOVERY IN POPULUS ______21 3.1 Populus as a model tree ______21 3.2 Identification of carbohydrate active enzymes in Populus ______21 3.3 Reverse genetics ______23 3.4 Immunolocalization and activity measurements in situ______24 3.5 Heterologous ______24 3.5.1 Heterologous protein production in E. coli ______25 3.5.2 Heterologous protein production in P. pastoris ______25 3.5.2.1 Fermentation of P. pastoris______26 3.5.2.2 Glycosylation ______26 3.6 Creating mutants and structural studies ______27 OBJECTIVES OF THE PRESENT INVESTIGATION______29 RESULTS AND DISCUSSION ______31

4HYBRID ASPEN XYLOGLUCAN ENDOTRANSGLYCOSYLASES (PUBLICATIONS I-III)______31 4.1 Identification of target genes: sequence analysis and gene expression ______31 4.2 Recombinant expression and purification of PttXET16-34 and PttXET16-35 ______32 4.3 Characterization of PttXET16-34, PttXET16-34/N53S and PttXET16-35/N29K ______34 4.4 Deglycosylation studies______34 4.5 Fermentation of PttXET16-34______35 5CARBOHYDRATE BINDING MODULES (PUBLICATIONS IV AND V) ______37 5.1 Bioinformatic analysis of the CBM family 43 ______37 5.2 Cloning, expression and purification of CBM43PttGH17_84 ______38 5.3 Binding properties of CBM43PttGH17_84 ______38 5.4 CBMs as analytical tools for fiber surface characterization______39 6PTTXYN10A: A PUTATIVE HYBRID ASPEN ENDO-(1,4)-E-XYLANASE (UNPUBLISHED WORK) ______40 6.1 Sequence and expression analysis of PttXYN10A ______40 6.2 Heterologous expression______41 6.3 Reverse genetics ______42 CONCLUDING REMARKS AND FUTURE PERSPECTIVES ______45 LIST OF ABBREVIATIONS______47 ACKNOWLEDGEMENTS ______49 REFERENCES ______51

INTRODUCTION

1 Biotechnology in the pulp and paper industry Forests cover approximately half of Sweden’s land area and the forest industry is since decades one of Sweden’s largest industries. In 2004, wood and wood products constituted 12% of Sweden’s total net export value (Skogsstatistisk Årsbok 2005, http://www.skogsstyrelsen.se). Also in a global perspective, forest and products derived from the forest industry have significant environmental and economic importance. The development of biotechnology for the pulp and paper industry started during the early 1980’s. The pulp and paper industry typically has heavy machinery, which is expensive to alter for new processes. Also, the up-scaling of new processes from lab- or pilot scale to the very large industry scale might sometimes be problematic. Taken together, these facts might explain why the introduction of biotechnology into the forest industry has been relatively slow. However, the advantages of biotechnology are becoming more and more obvious. The use of enzymes can in many cases lead to lower consumption of chemicals and energy, more specific reactions and last, but probably most important, improved product performance. Examples of enzymes that have already been used successfully to improve pulp processing are xylanases, cellulases, lipases and laccases (Bajpai 1999, Bhat 2000). So far, the focus has mainly been on hydrolytic enzymes from microorganisms but there is an increasing interest for other carbohydrate-active enzymes such as glycosynthases and transglycosylases. These enzymes have the potential to synthesize or modify carbohydrates via a transglycosylation mechanism, which allows altering the properties of the raw material without the risk of a simultaneous degradation. The use of enzymes is very promising yet expensive in the scale of forest product industries. So far, enzyme technology is therefore most suitable for the development of products or materials with new or enhanced properties where there are few alternatives and for which the market is prepared to pay a higher price. An alternative approach for the use of biotechnology in the pulp and paper industry is to use genetically modified trees. For example, trees with lower or modified lignin-content would decrease the consumption of chemicals during pulping. Genetically modified plants with lower requirements for growth and faster growth rates are also one of the more promising ways to meet the increasing demands for wood and wood products. However, it is important to remember that the technology of genetic modification is far ahead of the environmental impact studies. These studies have to be carried out before plant genetic engineering will be generally considered as ecologically safe, and thus a realistic alternative for forest industrial production. One of the main goals for the wood biotechnology research at KTH is to develop biotechnological tools for biomimetic material design via enzyme discovery in wood. As for all biomimicking techniques, it is crucial to understand the basic mechanisms in order to exploit the advantages of the natural process. Putatively interesting proteins are therefore submitted to detailed enzymatic characterization. Once the substrate and the mechanism of the enzyme are determined, directed mutagenesis can be used in order to make the enzyme better suited for the specific applications.

1 Åsa Kallas

2 Wood formation Wood (secondary xylem) results from the developmental process called secondary growth. Secondary growth makes the tree stem thicker, as opposed to primary growth that makes the plant taller. Only perennial plants that live for a number of years develop wood, whereas annual plants such as cereals and other grasses generally do not. However, it has been suggested that all seed plants contain the set of genes encoding for wood forming proteins and that these proteins can be expressed if the plant is exposed to specific conditions (Kirst et al. 2003). Indeed, the formation of secondary xylem in Arabidopsis thaliana has been proven (Nieminen et al. 2004) and trans-differentiation of mesophyll cells into tracheary elements have been successfully induced in Zinnia elegans (McCann 1997). Thus, the absence of wood in non-woody plants might be a consequence of silent genes rather than the absence of the wood forming genes. The cells that are developed during the secondary growth originate from the vascular cambium, a thin cell layer consisting of cambial initials and phloem/xylem mother cells. The vascular cambium contains two types of initial cells: radial initials which develop into ray cells, responsible for the radial transport of fluids in the stem, and fusiform initials which develop into secondary vascular tissue (Mellerowicz et al. 2001, Plomion et al. 2001). A periclinal division (i.e. a division parallel to the vascular cambium) of a fusiform initial cell gives rise to one initial cell and one xylem or phloem mother cell. An anticlinal division (i.e. a division parallel to the radius of the tree) gives rise to two cambial initial cells, which allows the diameter of the cambium to increase. Cells that differentiate towards the outer side of the vascular cambium become phloem cells, and later bark, and cells differentiating towards the inner side of the cambium become xylem cells (Fig 1a). Phloem cells transport nutrients from the photosynthesizing cells to the growing parts of the tree, whereas the water transport from the roots to the leaves takes place in the xylem. The xylem of softwood (wood from conifers such as spruce and pine) contains tracheids that transport water and provide mechanical support and a small amount of ray cells. In the xylem of hardwood (wood from angiosperms such as birch and aspen), tracheids and fibers (sclerenchyma cells) provide the mechanical support whereas specialized vessels transport water. The hardwood xylem also contains ray cells and longitudinal parenchyma cells. Wood formation, or xylogenesis, is a complex procedure that starts with the division of a xylem mother cell. During the expansion of the newly formed cell, there is only a primary cell wall. When the cell expansion is completed, a secondary cell wall is deposited between the primary cell wall and the plasma membrane. The last steps in the wood forming process are lignification of the cell wall followed by programmed cell death (Plomion et al. 2001). The ray cells and the parenchyma cells are not lignified and stay metabolically active for several years. However, when the tree stem achieves a certain width, the ray cells and the parenchyma cells of the innermost part of the tree trunk die and the fluid transporting sapwood is transformed to completely inactive heartwood. It is still not completely understood how the events of the xylogenesis are regulated, but most likely the regulation results from a combination of exogenous factors such as temperature and photoperiod (length of the days) and endogenous factors such as growth hormones (Plomion et al. 2001). The primary cell wall is only about 0.1-0.2 μm thick and relatively flexible, while the secondary cell wall most often is considerably thicker (up to 10 μm) and much more rigid. In many cases, the secondary cell wall is composed of three layers: S1, S2 and S3, with S2 being the thickest (accounting for 75-85% of the total cell wall, Timell 1967) (Fig 1b). The pectin- rich middle lamella separates the primary cell wall of adjacent cells. The role of the cell wall is to maintain the shape of the cell and give mechanical strength to the plant. Furthermore, it

2 Wood formation protects the cell against microbial invasions, mediates inter-cellular contact and regulates the uptake of water and nutrients.

a) b)

Figure 1a. A transversal view of a tree trunk showing the bark, the phloem, the vascular cambium and the xylem. b. A transversal section of a wood fiber showing the middle lamella, the primary cell wall and the three layers of the secondary cell wall. Picture adapted from (Raven et al. 1999).

2.1 Components of the cell wall The major components of plant cell walls are cellulose, lignin, hemicellulose and pectin. These components can be isolated on the basis of their solubility in different solvents. In addition, there are small amounts of extractives as well as structural proteins such as the hydroxyproline-rich extensins. The occurrence of the different components differs between the primary and the secondary cell wall, and also depends on the species and the type of cell (Timell 1967). Pectins and hemicelluloses are synthesized in the Golgi apparatus (Gibeaut 2000) whereas cellulose microfibrils are synthesized by rosette-like terminal complexes in the plasma membrane (reviewed by Doblin et al. 2002). Probably, the cellulose microfibrils are embedded into the matrix polymers simultaneously with their synthesis (Darley et al. 2001).

2.1.1 Cellulose and callose The secondary cell wall of fibers and tracheids contains 40-50% cellulose. In consequence, cellulose is one of the most abundant polysaccharides on earth. It is composed of a non- substituted chain of E-(1ĺ4)-linked glucopyranose units and the degree of polymerization can be up to 25000 (Fig 2). The cellulose chains are organized in microfibrils and provide mechanical strength to the cell wall. In higher plants, one microfibril has been estimated to contain on the average 36 cellulose chains, held together by hydrogen bonds. Cellulose microfibrils in the primary cell wall are randomly and loosely organized, whereas those in the secondary cell wall are strictly organized, contributing to the rigid structure of wood. The microfibrillar angle (MFA), which describes the orientation of the microfibrils in relation to the fiber axis, varies between the different secondary cell wall layers. The S1 and S3 layers have a large MFA (i.e. the microfibrils are close to perpendicular to the fiber axis), which gives a less rigid fiber compared to the S2 layer where the MFA is smaller (5-30q) (Timell 1967). According to ever increasing evidence, the microtubules, which are associated to the terminal complexes in the cell membrane, are involved in determining the orientation of the

3 Åsa Kallas cellulose microfibrils (Baskin 2001). Callose (E-(1,3)-glucan) is a much less frequently occurring carbohydrate than cellulose. It appears in the cell plate of dividing cells, in pollen mother cell walls and tubes and in the sieve plates of phloem elements. In addition, callose commonly occurs as a response to wounding, pathogen attack and abiotic and biotic stress (Scheible et al. 2004). The cellulose chain is a linear molecule, whereas the E-(1,3)-linkages of callose induce a helical structure (Carpita et al. 2000, Pelosi et al. 2003).

Figure 2. Cellulose is composed of unsubstituted E-(1ĺ4)-linked glucopyranose units.

2.1.2 Hemicelluloses and pectin Plant cell walls contain a number of different hemicelluloses such as xyloglucan, mannan, glucomannan, E-(1,3)(1,4)-glucans and different types of xylan (Timell 1967) (Table 1). The general structure of a hemicellulose is a sugar backbone substituted with side chains. The type of monosaccharides as well as the type of linkages in the respective hemicellulose determine properties such as solubility and three-dimensional conformation. Hemicelluloses generally have a lower degree of polymerization than cellulose. Some of the hemicelluloses are more common in the primary cell wall, whereas others occur in the secondary cell wall. The hemicellulose content also differs between different plant species as well as different cell types. Together with the pectins in the primary cell wall and the lignin in the secondary cell wall, the hemicelluloses stabilize the cell wall by building up a matrix between the cellulose microfibrils.

2.1.2.1 Xyloglucan Xyloglucan makes up 20-25% (dry weight) of the primary cell wall of type I cell walls (all dicots and monocot species except grasses and cereals) (Hayashi 1989). It also occurs to a very low extent (2-5%) in type II primary cell walls (grasses and cereals). By forming a hydrogen-bond network with cellulose, it makes the primary cell wall strong enough to resist cellular turgor pressure, but flexible enough to allow cell expansion. Some of the xyloglucan molecules are believed to coat the cellulose microfibrils, others are woven into the cellulose microfibrils and yet others are spanning the gap between two adjacent microfibrils (Fry 1989, Pauly et al. 1999, Rose et al. 1999) (Fig 3). Xyloglucan is also the predominant storage polysaccharide in fruits of Tamarindus (tamarind) as well as species of Primulaceae (primrose family), Linaceae (flax family) and Ranunculaceae (buttercup family) (Carpita & McCann 2000).

4 Wood formation

Figure 3. The xyloglucan in the primary cell wall cover the cellulose microfibrils and connect adjacent microfibrils. Picture adapted from (Rose & Bennett 1999).

Just like cellulose, xyloglucan is made up of a backbone of E-(1ĺ4)-linked glucose units. The chain-length of the xyloglucan backbone is from about 300 glucose units up to 3000 glucose units (Fry 1989). However, in xyloglucan, the backbone is substituted with a number of side chains that alter the physical properties of the xyloglucan compared to cellulose. For example, they prevent the xyloglucan molecules from forming ordered microfibrils in the way that cellulose does. The type of side chains and the degree of substitution depend on the type of plant and the localization of the xyloglucan (Vincken et al. 1997, Vierhuis et al. 2001). In most dicots, about 75% of the backbone glucans carry an Į-(1ĺ6)-linked xylose unit. These xyloses can be further substituted with galactose via a E-(1,2)-linkage and in turn; the galactose can be attached to a fucose via an Į-(1,6)-linkage (Fig 4). In some rare cases, xyloglucan also contains arabinose. It has been suggested that the side groups, especially the fucose units, influence the xyloglucan-cellulose interaction by determining the three- dimensional conformation of the xyloglucan (Levy et al. 1997). In line with this idea, it has been also been shown that the storage xyloglucan, which does not bind to cellulose, is generally less substituted than the structural xyloglucan in the primary cell wall (Fry 1989). However, A. thaliana mur2 mutants that lack fucosylation of the xyloglucan exhibit normal growth habit and wall strength (Vanzin et al. 2002). These findings imply that fucosylation has a different function for the in vivo properties of xyloglucan than previously suggested.

Figure 4. The structure of xyloglucan is made up of a backbone of E-(1o4)-linked glucose units. Around 75% of the glucose residues are substituted with xylose units, which in turn can be substituted with galactose. Some xyloglucan structures also include arabinose and fucose units (not shown).

5 Åsa Kallas

2.1.2.2 Xylans The backbone of xylans consists of approximately 200 E-(1ĺ4)-linked xylose units that are decorated with different side groups depending on the origin of the xylan. While two E- (1ĺ4)-linked glucose units rotate 180°, the E-(1ĺ4)-linked xylose units only rotate 120°, which creates a helical backbone structure (Xie et al. 2001a). Glucuronoarabinoxylan is the major hemicellulose in type II primary cell walls and also occurs as a minor component in type I primary cell walls. The backbone of these primary cell wall xylans can be substituted with a number of different side groups including L-arabinose, 4-O-methyl-D-glucuronic or D- glucuronic acid, galactose and xylose (Fig 5) (Carpita 1996, Reid 1997). O-acetyl-4-O- methyl-glucuronoxylan constitutes 20-35% of lignified cell walls of hardwoods. In O-acetyl- 4-O-methyl-glucuronoxylan, approximately 1 out of 10 xylose units in the backbone is substituted with an Į-(1ĺ2)-linked 4-O-methyl-D-glucuronic acid residue. In addition, approximately 7 out of 10 xylose units are acetylated at C-2 or C-3. In softwood, arabino-4- O-methyl-glucuronoxylan constitutes 7-14% of the lignified cell walls. Arabino-4-O-methyl- glucuronoxylan resembles the hardwood O-acetyl-4-O-methyl-glucuronoxylan, however, the content of 4-O-methyl-D-glucuronic acid is slightly higher (1 acid per 5-10 xylose units). In addition, arabino-4-O-methyl-glucuronoxylan is not acetylated but substituted with varying amounts of Į-(1ĺ3)-linked arabinose units (Timell 1967).

Figure 5. The xylan backbone (a) and the substituents occurring in different xylans (b-d).

2.1.2.3 Glucomannan and galactoglucomannnan The secondary cell walls of hardwood contain a few percent of glucomannan, whereas galactoglucomannan is the major hemicellulose in lignified cell walls of softwood. Both carbohydrates have a backbone of approximately 150 sugar residues. The hardwood glucomannan consists of a linear chain of non-substituted E-(1ĺ4)-linked glucose and mannose units (ratio 1:2). The galactoglucomannan of softwoods has a backbone similar to that of hardwood glucomannan. However, the glucose/mannose ratio is 1:3 and in addition, a few percent of the mannose units are decorated with Į-(1ĺ6)-linked galactose units (Fig 6). The galactoglucomannan is most likely acetylated (Timell 1967).

Figure 6. The glucomannan backbone substituted with an Į-(1ĺ6)-linked galactose unit.

6 Wood formation

2.1.2.4 Mixed linked E-glucans Mixed linked E-glucans constitute a non-substituted group of hemicelluloses, frequently occurring in type II primary cell walls. They are sometimes considered as storage carbohydrates since they are abundant in the endosperm of seeds and degraded upon seed germination. Mixed linked E-glucans have a backbone of E-(1ĺ3)(1ĺ4)-linked glucose units and only differ by the ratio of (1,4) and (1,3)-linkages. E-glucan from barley generally consists of blocks of 2-3 consecutive (1ĺ4)-linked glucose units connected by one (1,3)- linkage, thus giving a ratio of (1,4) and (1,3)-linkages of approximately 2.3-2.5 (Reid 1997, Grishutin et al. 2006) (Fig 7). The content of the respective linkage can significantly influence the three-dimensional structure of the substrate and consequently also the susceptibility to hydrolysis by different enzymes.

Figure 7. Mixed linked E-(1,3)(1,4)-glucan (E-(1,3)-linkages circled).

7 Åsa Kallas

Polysaccharide Occurrence Backbone Substituents DP

Xyloglucan Primary cell E-(1,4)-Glc Xylose (D-(1,6)) 300- walls, type I (20- Galactose 3000 25%) Fucose Acetyl (Arabinose)

Glucuronoarabinoxylan Primary cell E-(1,4)-Xyl L-arabinose (D- 200 walls, type II (1,2) or D-(1,3)) (minor amounts 4-O-methyl-D- in type I cell glucuronic acid or walls) D-glucuronic acid (D-(1,2)) Galactose Xylose

Mixed linked E- Primary cell E-(1,3)(1,4)-Glc - 1200 glucans walls, type II

O-acetyl-4-O-methyl- Lignified E-(1,4)-Xyl 4-O-methyl-D- 150- glucuronoxylan hardwood cells glucuronic acid 200 (20-35%) (D-(1,2)) Acetyl (at C-2 or C-3)

Arabino-4-O-methyl- Lignified E-(1,4)-Xyl 4-O-methyl-D- 200 glucuronoxylan softwood cells glucuronic acid (7-14%) (D-(1,2)) L-arabinose (D-(1,3))

Glucomannan Lignified E-(1,4)-Glc/E- - 150 hardwood cells (1,4)-Man (ratio (3-4%) 1:2)

Galactoglucomannan Lignified E-(1,4)-Glc/E- Galactose (D- 100- softwood cells (1,4)-Man (ratio (1,6)) 150 (12-18%) 1:3) Acetyl (at C-2 or C-3)

Table 1. Occurrence and composition of the most frequently occurring plant cell wall hemicelluloses.

8 Wood formation

2.1.2.5 Pectins Pectins are highly hydrophilic and easily extractable polysaccharides that are abundant in the middle lamella and the primary cell wall of type I. All pectins contain galacturonic acid and might in addition contain rhamnose, arabinose and galactose (Brett et al. 1996). Pectins are synthesized and methylated in the Golgi and thereafter secreted into the cell wall where pectin methyl esterases remove the methyl groups and leave a negatively charged carboxylate ion. The negative charge allows cross-linking of the pectins by calcium ions, making them appear as “egg-box structures”. The pectins in the middle lamella regulate the cell-cell adhesion whereas pectins in the primary cell wall might be involved in the growth control by regulating the accessibility of wall-loosening enzymes to their glycan substrates (Carpita & McCann 2000).

2.1.3 Lignin Lignin is a very complex molecule, composed of three different propylphenols - coumaryl alcohol, coniferyl alcohol and sinapyl alcohol (Grima-Pettenati et al. 1999). A large part of the lignin is found in the middle lamella where it glues adjacent cells together. Lignin is also an important part of the secondary cell wall where it, together with hemicelluloses, fills up the space between the cellulose microfibrils and adds strength to the cell wall (Timell 1967). The hydrophobic nature of lignin increases the resistance of wood to degradation and prevents the cell from taking up water. Since lignin is hard to remove, wood with lower or modified lignin contents would be very attractive to the pulp and paper industry.

2.2 Enzymes involved in cell wall synthesis, expansion and degradation In spite of a central biological role and a large commercial value of wood, the mechanisms for in vivo biosynthesis of the plant cell wall and its components are still relatively poorly understood. A number of cellulose synthases belonging to glycosyl transferase (GT) family 2 have been identified as members of the membrane bound cellulose synthesizing terminal complexes (Pear et al. 1996, Richmond et al. 2000, Tanaka et al. 2003, Burton et al. 2004, Djerbi et al. 2005). In several cases, in vitro synthesis of cellulose and callose from membrane extracts has been demonstrated (Kudlicka et al. 1997, Lai-Kee-Him et al. 2002, Pelosi et al. 2003). In addition, a number of other putative glycosyl transferases possibly involved in the synthesis of hemicelluloses have been identified and the activities of these enzymes in vitro have also been experimentally confirmed in a few cases (Keegstra et al. 2001, Scheible & Pauly 2004). Glycosyl transferases synthesize their respective plant cell wall carbohydrates by transferring a sugar moiety from a sugar donor (for example UDP-D-glucose for the synthesis of a E-(1,4)-glucose chain) to the elongating carbohydrate chain (Gibeaut 2000). The majority of the glycosyl transferases involved in the synthesis of plant cell wall carbohydrates are membrane-bound (to the plasmalemma or the Golgi membrane). This makes both heterologous expression and extraction of the enzymes in an active form more complicated than in the case of cytoplasmic or secreted proteins. The expansion or elongation of the plant cell is driven by internal turgor pressure. In order to maintain the wall thickness and strength, the mechanism for the cell enlargement must involve simultaneous loosening of the cell wall and incorporation of new material. Most likely, several different cell wall loosening agents are required for cell expansion (Rose & Bennett 1999, Cosgrove 2005). The acid-growth hypothesis suggests that auxin acidifies the cell wall by activation of a plasma-membrane proton pump. This acidification would, in turn, activate the cell-wall loosening enzymes (Rayle et al. 1992). Although they lack apparent hydrolytic activity, expansins seem to be among the key cell wall loosening agents (McQueen-Mason et al. 1992, Cosgrove 1998, Cosgrove 2000a, Cosgrove 2000b, Darley et

9 Åsa Kallas al. 2001, Cosgrove 2005). Expansins are believed to interfere with the non-covalent binding between adjacent cellulose microfibrils and/or the xyloglucan – cellulose network. Also, endo-(1,4)-E-glucanases attacking cellulose and glycosidases attacking for example the side groups of hemicelluloses are thought to be important for cell wall loosening (Rose & Bennett 1999). Thirdly, a group of enzymes encoded by the XTH (xyloglucan endotransglycosylase/hydrolase) gene family (Rose et al. 2002) most likely play an important role during cell expansion (Darley et al. 2001). XET enzymes cut xyloglucan molecules and religate them to xyloglucan acceptors by a transglycosylase reaction, whereas XEH enzymes cleave xyloglucan molecule by hydrolysis. Considering the importance of the cellulose- xyloglucan network in the primary cell wall, the ability of XET and XEH proteins to modify this structure makes them to highly interesting candidates as wall-loosening agents. Indeed, several studies have shown a correlation between XET expression and areas of cell wall expansion (Fry et al. 1992, Xu et al. 1996, Vissenberg et al. 2000). For the degradation of the cell wall, hydrolytic enzymes, such as glucanases, mannanases and xylanases are the key players. The first microbial cellulases were discovered over 50 years ago and a large number of hydrolases from both bacteria and fungi have been characterized in detail. In contrast to microbial hydrolases, hydrolytic enzymes of plant origin are less studied. Plant hydrolases have an obvious role of liberating carbon during processes such as fruit ripening and leaf senescence. However, many plant hydrolytic enzymes are probably involved in regulation and “proof-reading” during the carbohydrate synthesis rather than the complete break-down of the cell wall components. One interesting example is KORRIGAN, a membrane-bound endo-(1,4)-E-glucanase from A. thaliana involved in the cellulose synthesis by a not yet defined mechanism. Plants in which the KOR gene has been knocked out show a dwarf phenotype with reduced cellulose content and irregularly shaped cells (Nicol et al. 1998). A comparison between a recombinant hybrid aspen KOR homologue (PttCel9A) and a recombinant microbial homologue from Thermobifida fusca (TfCel9A) showed that the plant enzyme had a thousand-fold lower activity on cellohexaose than the bacterial enzyme. In addition, PttCel9A had narrower substrate specificity than TfCel9A (Master et al. 2004).

2.2.1 Classification of plant cell wall modifying enzymes On the basis of amino acid sequence similarity, carbohydrate active enzymes have been divided into glycoside hydrolases (106 families), glycosyl transferases (83 families), polysaccharide lyases (18 families) and carbohydrate esterases (14 families) and listed on the CAZy (Carbohydrate Active enZYmes) website (http://afmb.cnrs-mrs.fr/CAZY/). The members within a family share a similar three-dimensional fold and the same catalytic mechanism (Henrissat et al. 1997). The CAZy families are further divided into clans. A large number of the carbohydrate active enzymes has a modular structure with a catalytic module and one or more non-catalytic modules such as transmembrane domains or carbohydrate- binding modules (Gilkes et al. 1991). In addition to the families containing catalytic modules, the CAZy website also lists 45 families of non-catalytic carbohydrate binding modules (CBMs). While the CAZy classification is based on sequence similarity, the EC (Enzyme Commission) number describes the activity of an enzyme (http://www.chem.qmul.ac.uk/iubmb/enzyme/). For example, hydrolytic activity towards xyloglucan is denoted EC 3.2.1.151, whereas transglycosylase activity on xyloglucan is denoted EC 2.4.1.207. Many different enzyme activities can be found within one CAZy family. With the increasing number of organisms with a completely sequenced genome, it has become evident that plants have a larger proportion of CAZymes than other organisms (Coutinho et al. 2003b). This is consistent with the fact that carbohydrates are the main constituents of plants. In addition, it seems that woody plants have a slightly higher number of CAZymes than non-woody plants (Geisler-Lee et al. 2006), probably reflecting the large

10 Wood formation number of carbohydrate active enzymes required for the synthesis of the secondary cell wall. This thesis focuses on xyloglucan endotransglycosylases, xylanases and carbohydrate-binding modules and hence, these proteins will be discussed in detail below.

2.2.2 Mechanisms of the plant cell wall modifying enzymes Glycosyl hydrolysis can occur via an inverting or a retaining mechanism, whereas transglycosylase reactions always occur via the retaining mechanism (Koshland 1953). In both cases, two amino acids, generally aspartates or glutamates, are crucial for the activity by acting as catalytic residues (acid and base and acid/base and nucleophile, respectively) (reviewed by Sinnott 1990). The substrate sugar residues on the non-reducing side of the cleavage site are numbered with negative integers with –1 being the sugar residue closest to the cleavage site. Correspondingly, the sugar residues on the reducing side of the cleavage site are numbered with positive integers with +1 being the sugar residue closest to the cleavage site (Davies et al. 1997). The inverting mechanism involves a one step reaction that inverts the configuration of the anomeric carbon atom of the sugar moiety -1. In this reaction, one of the catalytic residues acts as a base, activating a water molecule that becomes the acceptor. The other catalytic residue activates the cleavage of the glycosidic bond by acting as an acid (Fig 8a). The retaining mechanism involves two inverting steps with an intermediary formation of a covalent glycosyl-enzyme intermediate. As the name implies, the retaining mechanism gives a net retention of the anomeric carbon atom of the sugar moiety -1. In the first step (the glycosylation step), one of the catalytic residues acts as an acid, activating the cleavage of the glycosidic bond, while the second catalytic residue performs a nucleophilic attack on the anomeric carbon atom of the -1 sugar residue and forms the glycosyl-enzyme intermediate. In the second step (the deglycosylation step) the residue that previously acted as an acid, now acts as a base that activates the acceptor molecule (Fig 8b). (Withers 2001) In most cases, water is the acceptor in the second step of the reaction, leading to hydrolysis. However, sometimes a second sugar molecule is preferred by the enzyme, leading to a transglycosylase reaction. In plants, there are only two types of transglycosylating enzymes identified so far: xyloglucan endotransglycosylases (Farkaš et al. 1992, Fry et al. 1992, Nishitani et al. 1992) and mannan transglycosylases (Schröder et al. 2004). While hydrolytic activity can be easily detected by measuring the net formation of reducing ends, the assays for a transglycosylating enzyme are more complicated since the product is not always distinguishable from the substrate. In addition to the inverting and the retaining mechanisms, hydrolytic enzymes can be divided into endo- and exo-acting, depending on the site of attack (reviewed by Teeri 1997). Endo- acting enzymes generally have an open cleft active site and cleave the substrate anywhere along the carbohydrate chain. The active site of exo-acting enzymes are tunnel-shaped or formed as a deep groove and degrade the substrate processively starting from one end (reviewed by Davies et al. 1995). However, it should be pointed out that this definition is not strict and that some enzymes can be both exo-and endo-acting.

11 Åsa Kallas

a)

b)

Figure 8. The reaction for hydrolysis can occur via the inverting mechanism (a) or the retaining mechanism (b). Transglycosylase reactions always occur via the retaining mechanism.

2.3 The XTH gene family The XTH (xyloglucan endotransglycosylase/hydrolase) gene family comprises genes encoding for both XET (xyloglucan endotransglycosylase, EC 2.4.1.207) and XEH (xyloglucan endohydrolase, EC 3.2.1.151) enzymes. Genes belonging to the XTH gene family are found in all plants so far studied. The XET and the XEH proteins are members of the GH family 16 and closely related to the endo-(1,3)(1,4)-E-glucanases (lichenases). The two different enzymes share the same catalytic motif – DEIDFEFLG (sometimes DEIDIEFLG for XET/XEH) – and they exhibit similar structures (Planas 2000). Together with enzymes in GH family 7, GH 16 members belong to clan B (Fig 9). Cellulases: Endoglucanases and cellobiohydrolases GH family 7

Common ancestor: N-carrageenases E-bulge + histidine E-agarases (1,3-D-1,4-E- galactanases) GH family 16 Laminarinases (1,3-E- glucanases) Lichenases (1,3-1,4- E-glucanases) Xyloglucan endotransglycosylases/ hydrolases

Figure 9. Evolution in clan B. A schematic tree based on the structural features conserved in each enzyme family. Picture adapted from (Michel et al. 2001).

12 Wood formation

There are protein structures published for two endo-(1,3)(1,4)-E-glucanases (Keitel et al. 1993, Hahn et al. 1995, Tsai et al. 2003), one N-carrageenase (Michel et al. 2001), two E- agarases (Allouch et al. 2003), one xyloglucan endotransglycosylase (Johansson et al. 2004) and one endo-E-galactosidase (Tempel et al. 2005). Despite the fact that the sequence identity within the subfamilies is very low (10-25%), all clan B enzymes exhibit similar structures consisting of two anti-parallel E-sheets forming a E-sandwich with one concave and one convex face (Fig 10). a) b)

Figure 10. Crystal structure of PttXET16-34 representing the typical clan B fold with two E-sheets forming a E-sandwich (a). In PttXET16-34, the convex E-sheet is made up of 7 E-strands and the concave E-sheet of 8 E-strands (b) (Johansson et al. 2004).

Common features for all XTH gene products are a hydrophobic N-terminus signal peptide and four or six cysteine residues capable of forming stabilizing cysteine bonds in the C-terminal portion of the protein. In addition, most XTH gene products have a conserved putative N- glycosylation site close to the catalytic domain. Their typical size is around 33 kDa (Campbell et al. 1999a). In many plant species, for example A. thaliana (Yokoyama et al. 2001), rice (Yokoyama et al. 2004) and Populus (Geisler-Lee et al. 2006), a large number of XTH genes has been identified. Numerous authors describe that XTH genes within a plant species exhibit different expression patterns in different tissues or show different response to hormone treatment (Arrowsmith et al. 1995, Rose et al. 1996, Burstin 2000, Steele et al. 2000, Uozu et al. 2000, Reidy et al. 2001, Nakamura et al. 2003, Sulová et al. 2003, Malinowski et al. 2004, Vissenberg et al. 2005b). In addition, enzymatic properties such as temperature and pH optima might show small variances between the gene products. For example, four XET isoenzymes from A. thaliana (TCH4, Meri-5, EXGT and XTR9) turned out to have different pH and temperature optima. In addition, they showed different substrate preferences. XTR9 was less active on fucosylated substrates, whereas the others were equally active on fucosylated as non-fucosylated substrates (Campbell et al. 1999b). Since storage xyloglucan is generally less fucosylated then the structural xyloglucan in the primary cell wall, this implies different functions for these isoenzymes. Although the majority of the characterized XTH gene products have a predominant transglycosylase activity (EC 2.4.1.207), a few XTH gene products purified from their native

13 Åsa Kallas source have also revealed various degrees of hydrolytic activity (EC 3.2.1.151) (Fanutti et al. 1993, Tabuchi et al. 1997, Schröder et al. 1998). In these cases, hydrolysis mainly occurs when there are very low amounts of sugars available for transglycosylation. In addition, the rate of hydrolysis is much lower than that of transglycosylation. One XTH gene product from azuki bean has been reported to be purely hydrolytic (Tabuchi et al. 2001).

2.3.1 Role of XET and XEH in the plant cell wall As discussed by (Rose et al. 2002) and (Nishitani 2005), the fact that most plants seem to express a number of XET/XEH isoenzymes makes it very likely that different isoenzymes fulfill different functions in the plant cell. Some proteins exhibiting XET activity might be involved in cell wall restructuring (Carpita et al. 1993, Darley et al. 2001), others in incorporation of newly synthesized xyloglucan into the cell wall (Thompson et al. 2001). Proteins with XEH activity can reduce the molecular weight of xyloglucan, for example during fruit ripening (Rose & Bennett 1999). An additional role for XET in restructuring primary walls and reinforcing the connection between the primary and the secondary cell walls at the time when secondary walls are deposited has been suggested by (Bourquin et al. 2002). By using an in situ XET activity assay, the authors could demonstrate XET activity in xylem and phloem fibers of hybrid aspen at the stage of secondary wall formation. Immunolocalization data further detected the presence of hybrid aspen PttXET16-34 (previously PttXET16A) or closely related isoenzymes in fibers at the stage of secondary wall formation.

2.3.2 Mechanism and substrates for XET and XEH Like all transglycosylating enzymes, XET and XEH enzymes act via the retaining mechanism described above. The glutamates of the sequence DEIDFEFLG, located on the E-strand number 7 on the concave face of the structure (Johansson et al. 2004), act as the nucleophile and the acid/base respectively. The aspartate between the glutamates stabilizes the catalytic domain by forming a hydrogen bond to the nucleophile. XET/XEH cleaves the xyloglucan donor molecule in two parts and forms a stable, covalent substrate-enzyme intermediate (Sulová et al. 1998, Sulová et al. 1998). Subsequently, it attaches the newly formed reducing end of the donor molecule to the non-reducing end of the acceptor xyloglucan molecule (XET activity) or in rare cases, a water molecule (XEH activity) (Fig 11).

Figure 11. The XET and the XEH reactions proceed via the retaining mechanism, which involves formation of a covalent substrate-enzyme intermediate.

Both longer xyloglucan molecules and shorter xyloglucan oligosaccharides can act as acceptors in the XET reaction. However, is has been shown that the smallest acceptor molecule that induces transglycosylation by XET proteins from bean and pea is composed of three glucose units substituted with two xylose units attached to the glucose units closest to the non-reducing end (Lorences et al. 1993). Neither the fucose nor the galactose units are

14 Wood formation important for the transglycosylation by these XET enzymes. The structure of PttXET16-34 shows that the loop connecting the E-strands 13 and 14 (amino acid residues A176-W190) strongly interacts with the sugar residues of the acceptor. Especially the highly conserved tryptophan, W179, is interacting with the xylose ligand by both hydrogen bonds and stacking interaction (Fig 12, Johansson et al. 2004). Recent work has shown that chemical modifications at the reducing end of xyloglucan oligosaccharides do not affect their capability to act as acceptors in the XET reaction (Teeri et al. 2003).

Figure 12. The crystal structure of PttXET16-34 shows the importance of the acceptor-binding loop (in green) for the interaction with the substrate (in gold) (Johansson et al. 2004).

Regarding the cleavage site of the donor molecule, it has been shown that xylose substitutions at +2 and perhaps also at –3 are required for nasturtium XET to cleave the xyloglucan backbone. This explains why cellulose is not a substrate for XET. Also, galactosyl substitution of the xylose at position +1 prevents chain-cleavage by the nasturtium enzyme (Fanutti et al. 1996). Furthermore, it has been reported that different XET isoenzymes show different donor substrates preferences regarding chain length and degree of substitution (summarized by Rose et al. 2002). Reflecting the difficulty of measuring transglycosylase activity, the number of assays available for the measurement of XET activity is limited. Both the radioactive assay developed by (Fry et al. 1992) and the colorimetric assay developed by (Sulová et al. 1995) measure the incorporation of xyloglucan oligomers into longer xyloglucan molecules. These assays suffer from having xyloglucan as a very large and polydisperse donor. A recent publication describes a method where a defined low-molecular mass xyloglucan oligosaccharide is used as glycosyl donor and a heptasaccharide derivatized with ANTS (8- aminonaphthalene-1,3,6-trisulfonic acid) as acceptor (Saura-Valls et al. 2006). The use of a well-defined donor enables kinetic and mechanistic studies.

15 Åsa Kallas

2.4 Carbohydrate binding modules Carbohydrate-binding modules (CBMs) are the most frequently occurring non-catalytic components of modular carbohydrate active enzymes. A CBM is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity (http://www.cazy.org/CAZY/). There are a few special cases such as CBMs in cellulosomal scaffoldins (Bayer et al. 2004) and some cases of independent CBMs (i.e. CBMs which are not associated to a catalytic module) (Barral et al. 2005, Vaaje- Kolstad et al. 2005). CBMs are relatively small proteins (30-180 amino acids) that are rich in aromatic amino acids and structure stabilizing cysteines. They are separated from the catalytic module by a flexible linker, which is often glycosylated and rich in serines, prolines and threonines (Shen et al. 1991). The first CBMs (named cellulose binding domains, CBD) were identified around 20 years ago (van Tilbeurgh et al. 1986, Langsford et al. 1987, Teeri et al. 1987). Proteolytic cleavage, which removed the CBM from the catalytic module, showed that the CBMs were crucial to the activity of the catalytic domain on insoluble substrates (Gilkes et al. 1988, Tomme et al. 1988). The mechanism by which the CBMs increase the efficiency of their catalytic modules has been thoroughly investigated. In many cases, the increased efficiency simply seems to be a consequence of a prolonged contact between the enzyme and the substrate mediated by the CBM (Bolam et al. 1998). For other proteins, such as CBM2a from Cellulomonas fimi Cel6a, it has been reported that the CBM disrupt the crystallinity of the cellulose microfibrils, thus making the substrate more available for the catalytic module (Din et al. 1994). An additional role for the carbohydrate binding modules can be to target the catalytic module to a specific region of the substrate. This could be for example reducing/non-reducing ends or amorphous regions of a cellulose microfibril (Carrard et al. 2000, Notenboom et al. 2001, Boraston et al. 2001a). In the later years, CBMs with affinity towards a variety of cell wall carbohydrates, for example amorphous cellulose, mannan, xylan and E-glucans have been identified in addition to the cellulose binding modules. Also, there are CBMs with affinity towards storage carbohydrates such as starch and glycogen (Boraston et al. 2004). Similar to what was earlier described for hydrolytic enzymes, CBMs originating from fungal and bacterial enzymes are far more studied than CBMs from plant enzymes. Although a substantial number of putative plant CBMs have been identified on the basis of amino acid sequence similarity, very few of them have been thoroughly characterized (Barral et al. 2005).

2.4.1 CBM classification and determination of binding properties The CBMs have been divided into three main groups, type A, B and C, based on the topology of the binding site (Boraston et al. 2004) (Fig 13). Type A CBMs (including CBM families 1, 2a, 3, 5 and 10, Boraston et al. 2004), which bind to crystalline cellulose and/or chitin, have a planar binding surface with three aromatic amino acids (mostly tryptophans and tyrosines) that interact with the substrate via hydrophobic stacking (Kraulis et al. 1989, Reinikainen et al. 1992, Linder et al. 1995a, Linder et al. 1995b, Tormo et al. 1996). The binding of type A CBMs is driven by an increase in entropy, which is a consequence of the dehydration of the binding site of the protein as well as the surface of the substrate (Creagh et al. 1996). Also, polar residues such as glutamines and asparagines contribute to the binding via hydrogen bonding (Tormo et al. 1996, McLean et al. 2000). Type B CBMs (found in CBM families 2b, 4, 6, 15, 17, 20, 22, 27, 28, 29, 34 and 36, Boraston et al. 2004) generally binding soluble polysaccharides and oligosaccharides with a DP of 4 or more have a cleft or a groove where the soluble substrate chain is accommodated. The binding of type B CBMs to their substrates is much dependent on hydrophobic stacking interactions through aromatic amino acids, but to

16 Wood formation some extent also on hydrogen bonding (Xie et al. 2001a, Pell et al. 2003, Flint et al. 2005). The last group, containing CBMs of type C (found in CBM families 9, 13, 14, 18 and 32, Boraston et al. 2004) are lectin-like and have affinity for very short oligosaccharides (mono-, di- and trisaccharides). The binding of type C CBMs is much dependent on hydrogen- bonding. Both the binding of type B and type C CBMs are enthalpically driven and it has been shown that the ǻH° upon substrate binding is larger for these types of CBMs than for type A CBMs. On the other hand, for type B and C CBMs, the increase in entropy caused by dehydration is not large enough to compensate for the loss of entropy caused by the conformational changes of the substrate that occur upon binding. (Creagh et al. 1996) In most cases, the affinity of type B and C CBMs corresponds to the preferred substrate of the catalytic module (Boraston et al. 2004). a) b) c)

Figure 13. Topology of the binding sites (in red) of CBMs from subgroups A, B and C, represented of structures of a. CBM1 from Trichoderma reesei Cel7A (Kraulis et al. 1989); b. CBM4 from Cellulomonas fimi Cel9B (Johnson et al. 1996) and c. CBM9 from Thermotoga maritima Xyn10A (Notenboom et al. 2001). Picture adapted from (Gustavsson 2003).

In a number of cases, two or even more CBMs from the same or different families occur in the same protein. It seems that the CBMs can act synergistically to yield a stronger binding or broader/narrower substrate specificity (Linder et al. 1996, Bolam et al. 2001, Boraston et al. 2002, Boraston et al. 2003). In one case, it has been demonstrated that one substrate molecule is bound at the interface between two CBMs (Flint et al. 2004). However, in other proteins, it seems that one of the CBMs is redundant with no or very little contribution to the binding (Charnock et al. 2000, Notenboom et al. 2001) or that the binding of two individual binding modules is additive rather than cooperative (Tomme et al. 1996, Brun et al. 2000). With a few exceptions, where the CBM families 6 and 13 provide interesting examples (Boraston et al. 2000, Notenboom et al. 2002, Schärpf et al. 2002, Henshaw et al. 2004), most CBMs only have one binding site. The topography of the binding site determines the binding specificity of CBMs binding to insoluble substrates as well as soluble substrates. Some CBMs that have a wide and flexible binding site can accommodate a large variety of substrates whereas others that have a more narrow and rigid binding site are stricter in their preferences. The increasing number of three- dimensional structures of different CBMs in complex with their ligands provides an invaluable tool for the understanding of the binding specificities (Gilbert 2003). There are many methods for determining the binding specificity of a CBM. For the initial screening of a number of substrates, affinity gels (Tomme et al. 2000), macroarray assays (McCartney et al. 2004) or qualitative binding assays are convenient methods. For the detailed analysis of the kinetics of binding, ITC (isothermal titration calorimetry), fluorescence analysis, UV spectroscopy analysis and quantitative binding assays are commonly used.

17 Åsa Kallas

2.4.2 Biotechnological applications of CBMs As reviewed by Levy et al. (2002a), CBMs have been used for a number of biotechnological applications. Examples of use are as affinity tags for protein purification (Tomme et al. 1998, Boraston et al. 2001b) and for targeting of hydrolytic enzymes to the desired substrate. This technique has frequently been used in the textile industry, for example in laundry powders. The targeting increases the efficiency of the enzymes, thus lowering the enzyme consumption and thereby also the cost. CBMs could also be used as cellulose cross-linking agents in paper production (Levy et al. 2002b). By using a CBM-scaffoldin, a number of enzymatic activities can be coupled to a tailor-made multienzyme complex, providing extraordinary properties for a number of applications (Bayer et al. 2004). Hildén et al. and McCartney et al. have used CBMs for the visualization of plant cell walls (Hildén et al. 2003, McCartney et al. 2004, McCartney et al. 2006). For all these applications, it is valuable to be able to control the properties of the CBM, both in terms of substrate specificity and binding affinities. Successful examples of efforts to create CBM-displaying combinatorial phage-libraries from which the most attractive candidates could be selected have been described by (Smith et al. 1998, Lehtiö et al. 2000, Cicortas Gunnarsson et al. 2004).

2.5 Xylanases Xylan-degrading enzymes can be divided into endo-(1,4)-E-xylanases (EC 3.2.1.8), which cleave the xylan backbone and xylan (1,4)-E-xylosidases that degrade short xylan oligomers into single xylose units from the non-reducing end (EC 3.2.1.37) (Sunna et al. 1997). In addition to these enzymes, which attack the E-(1o4)-linked xylose units, a number of other enzymatic activities are also required for the complete degradation of plant cell wall xylans. These include arabinofuranosidase activity (EC 3.2.1.55), removing the arabinose side chains, D-glucuronidase activity (EC 3.2.1.139), removing the glucuronic acids and xylan esterase activity (EC 3.1.1.6) that removes the acetate groups from the xylan backbone. Since the xylan side groups might constitute a steric hindrance for the backbone degrading enzymes, it has generally been considered that the side groups need to be removed before the complete degradation of the xylan backbone can take place (Gilbert et al. 1993). However, for a family 10 xylanase from Thermoascus aurantiacus, it was recently shown that the side groups are important determinants of the efficiency of the backbone degrading enzymes (Vardakou et al. 2005). Also, the deacetylating esterases have been shown to be more efficient on shorter xylo- oligosaccharides than on very long xylans (Thomson 1993). It therefore seems that the battery of xylan-degrading enzymes has to be coordinately expressed and optimized for the individual xylans. Endo-(1,4)-E-xylanases are found in the glycoside hydrolase families 5, 8, 10, 11 and 43 whereas (1,4)-E-xylosidases are found in GH families 3, 30, 39, 43, 52 and 54. While many xylanases from microorganisms have been characterized in detail and frequently used in pulp and paper processes, the situation is once again remarkably different for the plant enzymes and only a few plant xylanases have been biochemically investigated (Lienart et al. 1985, Benjavongkulchai et al. 1986, Slade et al. 1989, Cleemput et al. 1997a, Cleemput et al. 1997b, Bih et al. 1999, Suzuki et al. 2002). All plant endo-(1,4)-E-xylanases identified so far belong to GH family 10, whereas all putative plant (1,4)-E-xylosidases are classified into GH family 3.

2.5.1 Sequence and structural features of endo-(1,4)-E-xylanases from GH family 10 All enzymes in GH family 10 act via the retaining mechanism, using one glutamate as an acid/base and a second glutamate as a nucleophile. Also, two aspartates, one preceding the acid/base and the other one following the nucleophile have been shown to be important for

18 Wood formation the activity of a GH family 10 endo-(1,4)-E-xylanase from Cellulomonas fimi (MacLeod et al. 1994, MacLeod et al. 1996). A signature sequence consisting of the amino acids [G/T/A]-X- X-[L/I/V/N]-X-[I/V/M/F]-[S/T]-E-[L/I/Y]-[D/N]-[L/I/V/M/F], where X represents any amino acid and E is the nucleophilic glutamate, has been proposed for the active site of enzymes belonging to the GH family 10 (Suzuki et al. 2002). A comparison of the amino acid sequences of eleven putative endo-(1,4)-E-xylanases from A. thaliana, two from rice and one from maize, wheat and barley respectively showed that the glutamates, the aspartates as well as other catalytic and substrate-binding residues are well conserved in the plant enzymes (Simpson et al. 2003). The GH family 10 belongs to the clan A, exhibiting a (E/D)8-fold with an extended open cleft where the substrate is bound (Derewenda et al. 1994, Charnock et al. 1997, Charnock et al. 1998, Schmidt et al. 1998). The barley endo-(1,4)-E-xylanase X-1, expressed in the aleurone layer of seeds (Banik et al. 1997) and the maize endo-(1,4)-E-xylanase, expressed in the tapetum cell of the anther (Bih et al. 1999) are, similar to the other cereal endo-(1,4)-E-xylanases, believed to be involved in the degradation of xylan in these tissues. The AtXyn1 from A. thaliana is on the other hand predominantly expressed in the vascular bundles (phloem, cambium and xylem except for the vessels) and has been suggested to have a role during the formation of the secondary cell wall (Suzuki et al. 2002). GH10 endo-(1,4)-E-xylanases from microorganisms commonly contain one or more carbohydrate binding module from family 22 attached to the catalytic module. Members of the CBM family 22 have been shown to bind to xylan or mixed (1,3)(1,4)-E- glucans (Charnock et al. 2000, Meissner et al. 2000) and in one case it has been reported that the CBM22 alter the substrate preferences of the catalytic module (Araki et al. 2004). Not all plant endo-(1,4)-E-xylanases have CBM22s that include the aromatic amino acids believed to be important for substrate binding. Instead, it seems that some enzymes contain CBMs that are non-functional and required to be cleaved off before the catalytic module can exhibit full xylanase activity (Caspers et al. 2001, Simpson et al. 2003).

2.5.2 Xylanases in biotechnology Xylanases are widely used for biotechnological applications and account, together with cellulases and pectinases, for 20% of the world enzyme market (Polizeli et al. 2005). Although the most common use of xylanase is as a bleaching agent in the pulp and paper industry, there are also a number of other commercial applications (reviewed by Kulkarni et al. 1999, Collins et al. 2005, Polizeli et al. 2005). In juice-, wine- and beer-making, xylanases are used for clarification and in the feed industry both cellulases and xylanases have been shown to reduce the viscosity of the seed extracts, thereby ameliorating the uptake of nutrients. Also, xylanases can be used for the conversion of xylan into xylitol, a sweetener used for low-calorie products in the food industry. The advantages of xylanases for pulp bleaching have been exploited for a number of years and xylanases are probably the best example of the use of enzymes in the pulp and paper industry (Viikari et al. 1994, Suurnäkki et al. 1997). In the pulping process, the xylan structure is changed from relatively long, substituted chains to shorter, less substituted xylan chains. These shorter chains have a higher tendency to precipitate and bind to the cellulose molecules on the fiber surface than the naturally occurring xylan. The precipitating xylan can also entrap residual lignin, either by co-precipitation or by covering the lignin and render it less accessible for bleaching chemicals. This residual lignin has to be removed by bleaching agents such as chlorine components, ozone or peroxi-acids in order to avoid discoloration of the final product and here, the xylanase treatment offers an environmentally friendly alternative, since the hydrolysis of the xylan will also lead to the removal or at least the exposure of the lignin. Since the physiological conditions during pulp cooking are rather harsh, it is desirable to use a xylanase that is stable both at high temperatures (80-90 °C) and

19 Åsa Kallas alkaline conditions (pH 9-10). Large efforts have been made in order to identify xylanases from extremophilic microorganisms that could be active at these conditions. In addition, it is crucial that the xylanase does not possess any endo-glucanase activity, since this would have a detrimental effect on the quality of the pulp (Pere et al. 1995).

20 Enzyme discovery in Populus

3 Enzyme discovery in Populus

3.1 Populus as a model tree Although non-woody plants such as Zinnia elegans and A. thaliana are valuable for many aspects of the studies of wood formation, some questions can most likely only be addressed by studying a true tree. This includes the differences in gene expression between for example juvenile and mature wood as well variations in gene regulation due to seasonal variations (Han 2001, Taylor 2002). Populus is a convenient model tree for the studies of wood formation (Brunner et al. 2004). It has a relatively small genome (550 Mb, which is approximately forty times smaller than pine) and it is easy to transform. In addition, the Populus species are fast growing, as compared to conifers such as spruce and pine. This facilitates the development and evaluation of genetically modified plants in which candidate genes are over-expressed or suppressed. The complete genome sequence of Populus trichocarpa was recently released (http://genome.jgi-psf.org/Poptr1/Poptr1.home/html) and in addition, there are a number of EST libraries available (summarized by Brunner et al. 2004). Populus is also commercially interesting since it is used as raw material for timber, plywood and pulp. Just as many other plant genomes, the Populus genome has gone through a number of genome duplications (polyploidizations). In most cases, the duplicated genes are mutated, which lowers the functional redundancy in the genome. There are several possible consequences of these mutations, including pseudogenization (mutations leading to a loss of function) and neo-/subfunctionalization (mutations leading to the development of genes encoding for proteins with a new or altered function) (reviewed by Adams et al. 2005).

3.2 Identification of carbohydrate active enzymes in Populus There are two main approaches for the identification of new genes and proteins: genomic screening supported by expression data and proteomic screening based on biochemical data (Fig 14) (Perrin et al. 2001). Genomic screening implies bioinformatic mining for candidates in a set of genes, for example an EST library or a complete genome. The screening is based on sequence similarity to earlier identified genes and the number of target genes can be narrowed down by using results from expression analysis. Potentially interesting genes are isolated and submitted to further studies as described below (Fig 14). In proteomic screening, candidate proteins in a protein extract are identified for example on the basis of their expression profiles or enzymatic activities. The sequence of the isolated protein can subsequently be determined by Edman degradation or by mass spectrometry analysis (Pandey et al. 2000, Rossignol 2001). A third alternative for identification of target genes is to use forward genetics, which implies the screening of a large set of randomly created plant mutants. Plants with interesting phenotypes are analyzed in order to identify the target gene, in which the mutation has appeared. A large number of genes putatively involved in the formation of the secondary cell wall have been identified by forward genetics (Turner et al. 2001). In collaboration with Umeå Plant Science Center at Umeå University and SLU in Umeå, the Department of Biotechnology at KTH has applied a genomic approach in order to identify cell wall related proteins that ultimately can be used for industrial applications. In 1997, a hybrid aspen (Populus tremula x tremuloides) cambial EST-library was created with the aim to identify enzymes involved in wood fiber synthesis, reorganization and degradation (Sterky et al. 1998). Among the approximately 3000 unique genes in this library, 4% of the genes were predicted to be putatively cell wall related. For example, ESTs that show sequence similarity to endo-glucanases (GH family 9), cellulose synthases (GT family 2), xyloglucan

21 Åsa Kallas endotransglycosylases/hydrolases (GH family 16) and endo-(1,4)-E-xylanases (GH family 10) from other plant species were found. In addition, a number of genes without a predictable function were identified. Today, the Populus EST database contains over 120000 sequences from 19 different cDNA libraries derived from different stages of development or different growth conditions of Populus tremula (European aspen), Populus tremula x tremuloides T89 (hybrid aspen) and Populus trichocarpa (black cottonwood) (Sterky et al. 2004, http://poppel.fysbot.umu.se/). Together with the genomic sequence of Populus trichocarpa, these EST libraries provide a valuable tool for identification of putatively interesting carbohydrate active enzymes. Recently, Geisler-Lee et al. identified approximately 1600 genes encoding for CAZymes among 55000 P. trichocarpa gene models and a large number of these genes have homologues in the EST libraries (Geisler-Lee et al. 2006). A number of microarray experiments provide a detailed comparison of the gene expression in different tissues or growth conditions. The first microarray chip contained 3000 genes from the cambial EST library and was hybridized with RNA from the different stages of xylogenesis (Hertzberg et al. 2001). A larger chip containing 13000 EST clones, corresponding to approximately 9000-10000 unique genes was later created and used to compare gene expression in a number of different tissues/conditions (Andersson et al. 2004, Schrader et al. 2004, Aspeborg et al. 2005, Andersson-Gunnerås et al. 2006). Together with the EST frequency data, results from the microarray analysis give valuable information about where and when the gene is expressed. This might further indicate in which process the corresponding protein is involved. However, it is important to interpret gene expression data with some caution. In many cases, it has been observed that the amount of protein is considerably lower than predicted from the mRNA levels, emphasizing that protein expression is not only regulated on a transcriptional level. Also, numerous examples of false annotations (Coutinho et al. 2003a, Coutinho et al. 2003b) stress the importance of a biochemical characterization of the gene product. A very useful approach is to express the proteins in a heterologous host. This allows characterization of the enzyme in the absence of isoenzymes and other enzymes with contaminating activities. Other methods to obtain information about the protein function are immunolocalization, activity measurements in situ as well as studies of genetically modified plants (Fig 14).

22 Enzyme discovery in Populus

Figure 14. Putatively interesting genes are identified by bioinformatic analyses, a proteomic approach or forward genetics. Microarray analysis and analysis of EST clone distribution give information about the gene expression pattern. Reverse genetics can give suggestions about the in vivo function. Immunolocalizations and in situ activity measurements show where the protein is localized. Heterologous expression enables enzymatic characterization, assay development and crystallization. Finally, the characterized protein might be used for the development of new processes or products.

3.3 Reverse genetics Reverse genetics is commonly applied in order to study the in vivo function of a gene product. As opposed to forward genetics, where a large set of randomly created mutants is screened for a certain phenotype, reverse genetics implies the directed up-or down-regulation of a target gene. In some cases, the phenotype caused by the altered gene expression can be visually examined (for example shortened/elongated stem or different leaf size) whereas other cases require a chemical characterization (for example biochemical assays or measuring carbohydrate composition by FTIR, Fagard et al. 2000). A factor that might complicate the evaluation of a genetically modified plant is that the plant in many cases expresses other isoenzymes or enzymes with a similar function in order to compensate for the down- regulation (Fagard et al. 2000). Common techniques used for reverse genetics are sense/antisense constructs and RNAi (RNA interference) (Waterhouse et al. 2003). In the antisense method, plants are transformed with a plasmid encoding for a sequence that is complementary to the mRNA of the gene that should be down-regulated and the resulting hybridization will lead to lower levels of translation. The antisense approach only results in a partial down-regulation of the gene expression and is not always sufficient for elucidation of the gene function. The RNAi technique relies on the introduction of a double-stranded RNA, which triggers the RNAi pathway. The inserted double stranded RNA is cut into smaller pieces and associated into a complex that subsequently attacks and degrades the target gene. RNAi generally gives a much stronger down-regulation than the antisense approach and might give a more distinct phenotype. However, a too drastic down-regulation could have serious effects on the viability of the plant. Whereas the evaluation of a genetically modified poplar tree might take months or even

23 Åsa Kallas years, the evaluation of smaller plants such as A. thaliana is considerably faster. Therefore, if there are close homologues of the protein of interest, knock-out libraries from other plant species e.g. A. thaliana can be very valuable for elucidation of the protein function (Krysan et al. 1999, Parinov et al. 2000, Bouche et al. 2001).

3.4 Immunolocalization and activity measurements in situ The knowledge about where in the plant an enzyme is expressed can give valuable clues about its in vivo function. Two commonly used methods of protein localization are immunolocalization by specific antibodies and in situ activity measurements. The in situ activity measurements require an enzymatic assay, for example with fluorescently labeled reactants, which can be visualized in a microscope. Also, the in situ activity measurements readily detect an activity and can thus not distinguish between different isoenzymes or even different proteins with the same activity. XET activity has been successfully demonstrated in vivo by using xyloglucan oligosaccharides conjugated with fluorescent substances as acceptors (Vissenberg et al. 2000, Bourquin et al. 2002). The results of immunolocalization studies are very much dependent on the quality of the antibodies. With careful design of the antigens used to elicit antibodies, immunolocalization studies can even distinguish between two relatively similar isoenzymes. An alternative approach for the localization of a protein is to use a reporter gene, which encodes for a protein that is easy to detect by an assay or fluorescence. GUS (E- glucuronidase) hydrolyzes a color-less substrate (5-bromo-4-chloro-3-indolyl-E-D- glucuronide) to give a blue product (5,5’-dibromo-4,4’-dichloro-indigo) and is frequently used to determine in which tissues a protein is expressed (Jefferson et al. 1987, Guivarch et al. 1996). For subcellular localization, fusions with GFP (green fluorescent protein) that can be visualized in fluorescence microscope is commonly utilized (Dixit et al. 2006).

3.5 Heterologous protein production Heterologous expression is often a convenient way to produce large amounts of protein, either for lab-scale characterization or for industrial applications. The yield of protein that can be purified from the native source is in most cases greatly exceeded by the amount of protein that can be produced in a suitable heterologous host. Also, heterologous expression eliminates the risk of contaminating activities from other co-purified proteins from the native source. A drawback is that the recombinant enzyme is isolated from its in vivo environment. For example, compared to the plant cell wall, buffer systems are extremely artificial environments for a cell wall protein. Therefore, also results obtained with recombinant proteins have to be critically evaluated. There are many different systems for recombinant protein expression. Prokaryotic hosts (Escherichia coli, Bacillus etc) are generally easy to work with and generate large amounts of protein. However, these hosts are not capable of performing some of the post-translational modifications, which are common in eukaryotic organisms. Such modifications, for example glycosylation, acylation, phosphorylation and proteolytic cleavage are often crucial for achieving active protein. Commonly used eukaryotic expression systems are mammalian cells, plant cells, insect cells, filamentous fungi and yeasts (Saccharomyces cerevisiae and Pichia pastoris). Filamentous fungi have been especially central for the development of large- scale industrial applications because of the large amounts of enzymes that can be efficiently secreted (Maras et al. 1999). The two expression systems used in the present study are E. coli and P. pastoris.

24 Enzyme discovery in Populus

3.5.1 Heterologous protein production in E. coli The gram-negative E. coli is an attractive prokaryotic system for recombinant expression of peptides or full-length proteins. E. coli grows fast and the expression levels are often high. The first protein expressed in E. coli was the human hormone somatostatin (Itakura et al. 1977). The recombinant proteins can be expressed within the cell - in inclusion bodies or as intracellular soluble protein - or secreted into the periplasm of the bacterial cell (Baneyx 1999). Proteins that are expressed in inclusion bodies must be refolded in order to make them enzymatically active (Misawa et al. 1999). On the other hand, this approach can yield very large amounts of protein, making it attractive for industrial purposes. In addition, inclusion bodies normally contain relatively pure recombinant protein, which facilitates purification of the protein. There are a number of commercial expression vectors and E. coli strains available (reviewed by Sorensen et al. 2005). Some vectors have inducible promoters whereas others have constitutive promoters. Inducible promoters, such as the lac promoter, the T7 promoter and the ara promoter allow a more controlled expression since a chemical or another inducing agent has to be added in order to start the transcription of mRNA. This is useful if the protein is toxic to the cells. Constitutive promoters are useful for slower but constant protein expression, which generally avoids the formation of inclusion bodies. Many of the vectors express the recombinant protein fused to an affinity tag that can be used for the subsequent protein purification and also detection by tag-specific antibodies. When expressing proteins from higher organisms in E. coli, it is important to compare the codon usage. For amino acids that are encoded by several codons (for example arginine), it is important to make sure that the codons in the gene of interest are also used by E. coli (Kurland et al. 1996). A shortage of the corresponding tRNA molecules might induce pre-mature ending of translation or incorporation of false amino acids. In order to avoid these problems, the rare codons in the target gene can be mutated to corresponding codons more frequently used by E. coli (Hu et al. 1996). Another approach is to use an E. coli strain that over-expresses the missing tRNAs (Kleber-Janke et al. 2000).

3.5.2 Heterologous protein production in P. pastoris The two most frequently used yeasts for expression of recombinant proteins are S. cerevisiae (baker’s yeast) and the methylotrophic yeast, P. pastoris (Cereghino et al. 1999, Cereghino et al. 2000). The yeast expression systems are easy to scale up and the cultivation media are relatively cheap. This is important if the intention is to produce the enzymes in large scale for industrial applications. Another advantage is that the integration of the foreign DNA into the genome of the yeasts is generally stable. P. pastoris is capable of using methanol as a carbon source. One of the enzymes responsible for metabolizing methanol is alcohol oxidase (AOX). AOX oxidizes methanol into formaldehyde, with the formation of hydrogen peroxide as a bi-reaction. High levels of formaldehyde are toxic to the yeast cells, wherefore the methanol concentration has to be strictly controlled. The AOX promoter is the most frequently used promoter for induced expression of recombinant proteins, whereas the GAP (glyceraldehyde-3-phosphate dehydrogenase) promoter is frequently used for constitutive expression. Recombinant proteins can be expressed within the yeast cells or secreted into the cultivation media. A great advantage of P. pastoris is that the number of endogenous proteins that are secreted into the culture media is very low. Depending on the expression levels, the recombinant protein can account for more than 90% of the protein in the media and the secretion consequently serves as a first step of purification. Extra-cellular expression is obtained by fusing the foreign gene to a signal peptide. In some cases, it is possible to keep the signal protein of the protein that is going to be expressed. However, in most cases, greater

25 Åsa Kallas success has been obtained with other signal peptides, such as the secretion signal sequence from the S. cerevisiae Į-factor prepro peptide.

3.5.2.1 Fermentation of P. pastoris Growing P. pastoris in a fermenter is a convenient way to obtain high cell densities and consequently high yields of recombinant protein (Cereghino et al. 2002). In a fermenter, the growth conditions can be better controlled than in shake flasks. The fermentation of P. pastoris can be divided into three different phases. The aim of the first phase is to obtain a high cell density by growing the cells on glycerol. In the second phase, the cells are still fed with glycerol, but at a growth-limiting rate. This will prepare the cells for the transition to methanol feeding. The third phase is the methanol phase, where the cells are starting to produce the recombinant enzyme (Cereghino et al. 2002). One of the more important problems that might occur during fermentation is accumulation of proteases in the media, due to partial cell lysis (Jahic 2003). These proteases can significantly reduce the yield of active recombinant protein. There are several ways to overcome proteolysis including the use of protease deficient strains, in which some proteases have been inactivated by gene disruption (Cereghino & Cregg 2000). Also, by changing pH and growth temperature, the conditions can be less favorable for the proteases. A third way to decrease the rate of the proteolysis is to provide an amino acid rich alternative substrate, such as peptone into the growth media (Jahic 2003).

3.5.2.2 Glycosylation There are two main types of glycosylation: O-glycosylation and N-glycosylation. In O- glycosylation, which is less common in P. pastoris, mannose and galactose-based oligosaccharides are added to the hydroxyl-groups of serines and threonines. The N- glycosylation takes place on the nitrogen atom of the asparagine side chain in the sequence N- X-S/T (X can be any amino acid but proline) and starts co-translationally in the endoplasmatic reticulum where the Glc3Man9GlcNAc2 (Glc = glucose, NAc = N-acetylglucosaminyl, Man = mannose) structure is transferred from a pre-assembled Glc3Man9GlcNAc2-Dol-PP complex (Dol-PP=dolichol-pyrophoshate) (Jones et al. 2005). Once the sugar complex is transferred to the protein, the glucose units and one mannose unit are trimmed off. Further processing takes place in the Golgi apparatus, where more mannose or galactose units can be added to the core sugar by a number of transferase activities (Fig 15). The exact structure of the N-glycan depends on the species – for example there are more variants in plant cells (Rayon et al. 1998, Wilson 2002) than in yeast (Montesino et al. 1998, Bretthauer et al. 1999, Gemmill et al. 1999) (Fig 15). Glycosylation is believed to protect proteins from proteolytic degradation, to aid in folding and to stabilize the mature protein (Rayon et al. 1998). It has also been suggested that glycosylation could serve as a quality control before the protein enters the secretory pathway (Ellgaard et al.). Furthermore, glycosylation could be important for the activity of some enzymes, for example by facilitating the binding of the substrate. Several approaches can be used in order to investigate the function of glycosylation in a given protein. Deglycosylating enzymes such as N-glycosidase F and endoglycosidase H remove the sugar molecules completely or partially. The glycosylation site can be removed from the protein sequence by site directed mutagenesis of either the asparagine or the serine/threonine in the N-X-S/T motif. Furthermore, glycoswitch vectors, where parts of the glycosylation machinery have been genetically altered, can be used to control the length of the glycosylation chain during expression (Callewaert et al. 2001, Vervecken et al. 2004). This approach was originally developed for yeast expression in order to create mammalian-like glycan structures that do not induce immunogenic response.

26 Enzyme discovery in Populus

Figure 15. N-glycosylation starts in the endoplasmatic reticulum where Glc3Man9GlcNAc2 is transferred to the newly synthesized protein from the Glc3Man9GlcNAc2-Dol-PP precursor by the OST (oligosaccharyltransferase) complex. Thereafter, the glucoses (Glc) and one mannose (Man) unit are trimmed of by glucosidase I and II (GI and GII) and D-mannosidase I (MI) respectively. In the Golgi apparatus, the glycan structure is further modified. In yeast, the glycan structure only consists of mannose units, whereas glycosylation in plants can vary from basic mannose structures to more complex structures that also include galactose (Gal), fucose (Fuc) and xylose (Xyl) units.

3.6 Creating mutants and structural studies Heterologous protein expression allows modification of the genetic material prior to expression. By creating mutants, the role of specific amino acid residues can be thoroughly investigated. Furthermore, mutagenesis can create enzymes that are more stable to extreme temperatures or pH values or enzymes that have altered substrate specificity. Mutagenesis can be rational (site directed) or random. To be successful, site directed mutagenesis requires knowledge about the structure and mechanism of the enzyme. It is well suited for studies of specific amino acids and their contribution to the enzymatic properties. Random mutagenesis resembles evolution: the amino acids are randomly altered and the gene that encodes for the most viable protein is selected by the scientist (analogous to natural selection). Random mutagenesis requires no structural or mechanistic information, however, it is necessary to perform a screening for interesting candidates after expression. Three-dimensional structures of enzymes, determined by X-ray crystallography or NMR spectroscopy, are extremely useful tools for understanding of the enzymatic mechanism and the substrate specificity. Structures can be determined both of empty enzymes and enzymes in complex with their substrates. The latter is particularly interesting since it reveals putative conformational changes upon binding and how the enzyme interacts with its substrate.

27 28 OBJECTIVES OF THE PRESENT INVESTIGATION

The general objective of the present investigation was to perform a detailed characterization of selected carbohydrate active enzymes from the hybrid aspen EST library.

The specific goals were: ƒExpression, production, purification and characterization of different hybrid aspen isoenzymes of GH family 16 xyloglucan endotransglycosylases ƒEnzymatic characterization of the catalytic and binding modules of a putative xylanase; PttXYN10A ƒBioinformatic analysis and characterization of family 43 CBM43s ƒDevelopment of analytical methods involving CBMs for mapping of fiber surfaces

29

RESULTS AND DISCUSSION

4 Hybrid aspen xyloglucan endotransglycosylases (Publications I-III)

4.1 Identification of target genes: sequence analysis and gene expression 41 gene models belonging to the XTH gene family have been identified in the genome of Populus trichocarpa (Geisler-Lee et al. 2006). 24 of these have at least one EST homologue in the Populus cDNA libraries. In order to elucidate the function of the corresponding proteins and thereby putatively be able to explain the apparent redundancy of XTH genes in Populus, we are currently expressing and characterizing a number of Populus proteins encoded by the XTH gene family. This work describes the study of two members of the Populus XTH gene family: PttXTH16-34 encoding for the protein PttXET16-34 (previously PttXET16A) and PttXTH16-35 encoding for PttXET16-35 (previously PttXET16C). We have followed the recommended nomenclature for the genes (Rose et al. 2002), naming them xyloglucan endotransglycosylases/hydrolases (XTH). However, after characterization we found it appropriate to name the proteins xyloglucan endotransglycosylases (XET), reflecting their enzymatic activity (EC 2.4.1.207). Sequence analysis revealed that PttXET16-34 and PttXET16-35 are relatively similar over the first two thirds of the primary sequence, whereas the last thirds are much more divergent (Fig 1, Publication II). The sequence similarity between PttXET16-34 and PttXET16-35 over the complete amino acid sequences is 59%, however the sequence similarity downstream of amino acid number 201 (PttXET16-35 numbering) is only 31%. Both proteins have an N- terminal signal peptide, four conserved cysteines and a conserved site for putative N- glycosylation close after the catalytic residues. PttXET16-35 has an additional N- glycosylation site in the N-terminus of the protein (N29 to S31). PttXET16-34 has the typical DEIDFEFL catalytic motif, whereas the catalytic sequence is slightly different in PttXET16- 35 (NEFDFEFL). However, both enzymes possess the two catalytic glutamates and the stabilizing aspartate (underlined amino acids). The distribution of EST sequences corresponding to PttXTH16-34 and PttXTH16-35 in the different cDNA libraries is shown in Fig 16. The majority of the PttXTH16-34 clones are found in the EST libraries from the cambium, petiole and female catkin, whereas the majority of the PttXTH16-35 clones are found in the active cambium, tension wood and apical shoot. Although it is difficult to draw any definitive conclusions from these data, they do demonstrate a difference in expression pattern between the two isoenzymes.

31 Åsa Kallas

Figure 16. The number of PttXTH16-34 and PttXTH16-35 EST clones in the different cDNA libraries.

Microarray analysis showed that out of nine analyzed XTH genes, six genes (including PttXTH16-34 and PttXTH16-35) are up-regulated in the early phases of wood formation, two are evenly expressed and one is slightly up-regulated in the later phases (Fig 17) (Hertzberg et al. 2001, Schrader et al. 2004, Aspeborg et al. 2005). Further microarray experiments showed that, of all XTH genes analyzed, PttXTH16-35 has the highest relative level of expression in phloem versus xylem tissues (Djerbi et al. 2004 and Aspeborg H. et al., unpublished data) as well as in tension wood versus normally grown wood (Andersson-Gunnerås et al. 2006).

Figure 17. Results from microarray analysis showing the expression pattern of nine Populus XTH genes during the different stages of wood formation. A = meristematic cells, B = early expansion, C = late expansion, D = secondary wall formation, E = late maturation. Data from (Hertzberg et al. 2001, Schrader et al. 2004, Aspeborg et al. 2005).

4.2 Recombinant expression and purification of PttXET16-34 and PttXET16-35 Most of the XET and XEH proteins studied so far have been extracted from the native source. However, some of them have been expressed in heterologous hosts such as E. coli (Arrowsmith & de Silva 1995, Schröder et al. 1998), insect cells (Campbell et al. 1998, Oh et al. 1998, Campbell & Braam 1999b) or P. pastoris (Catalá et al. 2001, Henriksson et al. 2003, Chanliaud et al. 2004). We expressed recombinant PttXET16-34 and PttXET16-35 in

32 Hybrid aspen xyloglucan endotransglycosylases (Publications I-III)

P. pastoris in order to obtain the amount of protein that was needed for a detailed enzymatic characterization and, in the case of PttXET16-34, protein crystallization (Publications I and II). The genes encoding for PttXET16-34 and PttXET16-35 were obtained as full-length clones from the hybrid aspen cDNA library and cloned into the pPIC9 vector in frame with the secretion signal sequence from the S. cerevisiae Į-factor prepro protein. Also, constructs encoding for the glycosylation mutants of PttXET16-34 (PttXET16-34/N53S) and PttXET16- 35 (PttXET16-35/N29K) were created by site-directed mutagenesis. Transformed P. pastoris cells were screened for expression by colony blotting, using antibodies generated towards PttXET16-34 (Bourquin et al. 2002). Thereafter, small-scale cultures were used to verify the expression from candidate clones. When scaling up the cultivations, we could show that the enzymatic activity was considerably more stable when growing the cells at a lower temperature than recommended after induction. For PttXET16-34, cultivation at 22 qC gave an even protein production, whereas the growth at 30 qC gave a fluctuating and comparably low production of active PttXET16-34 (Fig 18). For PttXET16-35 as well as the glycosylation mutants, the best results were obtained when growing the P. pastoris cells at 16 qC.

Figure 18. Cultivation of PttXET16-34 at different temperatures shows that the yield of active protein is considerably higher at 22 qC (dotted line) than at 30 qC (solid line).

While PttXET16-34 and PttXET16-34/N53S were successfully purified by a two-step cation exchange (Publication I), the same or other conventional purification techniques could not be applied for the purification of PttXET16-35 or PttXET16-35/N29K. Instead, PttXET16-35 and PttXET16-35/N29K were purified on a xyloglucan oligosaccharide affinity column (Publication II). This method has proven excellent for the purification of a non-glycosylated, predominantly hydrolytic XTH gene product from nasturtium seeds, giving high yields of very pure enzyme (Baumann M., Eklöf J., Teeri T.T. and Brumer H., unpublished data). For PttXET16-35, only low amounts of the protein could be bound to the column. This might be explained by different enzymatic behavior, or maybe more likely, by steric hindrance due to the glycosylations of PttXET16-35. The latter theory was further supported by the fact that the yield of the glycosylation mutant PttXET16-35/N29K was considerably higher than the yield of the wild- type enzyme (Publication II). Since the amounts of wild-type PttXET16-35 obtained after purification was very low and the specific activity of the wild-type and the glycosylation mutant were very similar, we used the PttXET16-35/N29K for characterization (Publication II).

33 Åsa Kallas

4.3 Characterization of PttXET16-34, PttXET16-34/N53S and PttXET16- 35/N29K The radioactive XET assay (Fry et al. 1992) was used to determine the kinetic parameters of PttXET16-34, PttXET16-34/N53S and PttXET16-35/N29K. As shown in Table 1 (Publication II), the kinetic parameters are comparable for the different proteins. Both wild- type enzymes, as well as their respective glycosylation mutants were considered as transglycosylases, since no hydrolytic activity could be measured with the BCA assay. The pH optima of the recombinant PttXET16-34, PttXET16-34/N93S and PttXET16- 35/N29K was investigated as described in Publication I and II, showing pH optima between 5 and 5.7 (Fig 1a, Publication I and Fig 2, Publication II). The optimal temperature of PttXET16-34 was determined to be between 35 and 40 qC (Figure 1b, Publication I). The determination of the melting temperature (Tm) clearly showed that PttXET16-35/N29K was less thermostable than both PttXET16-34 and PttXET16-34/N53S (Table 2, Publication I and Table 2, Publication II). It is presently not clear if this is true also for the wild-type PttXET16-35 or if it is a consequence of the partial deglycosylation. By thermal denaturation studies we also showed that NaCl destabilized the enzymes, whereas millimolar amounts of XGO increased the Tm. 4.4 Deglycosylation studies As earlier described, the majority of the proteins encoded by the XTH gene family have at least one putative N-glycosylation site close after the active site. The crystal structure of PttXET16-34, showing two N-acetyl glucose amine rings and one mannose residue that belong to the N-glycosylation of asparagine 93 (Fig 3, Publication I), reveals that the glycan structure is interacting with the acceptor-binding loop (A176-W190) via hydrogen bonds. Thus, the conserved N-glycosylation might be important for stabilization of the correctly folded enzyme. The structure furthermore reveals that glutamate 186 in the acceptor-binding loop forms a salt bridge with arginine 182 in the acceptor-binding loop and arginine 94 located closely after the catalytic residues. In turn, arginine 94 forms a salt bridge with aspartate 66 (Fig 3, Publication I). In Publication I, we suggest that these electrostatic interactions could stabilize the correctly folded molecule and that the N-glycan structure would be especially important for XET and XEH proteins that lack the charged residues. Only a few publications have addressed the effect of removing the N-glycan of XET enzymes (Campbell & Braam 1999b, Henriksson et al. 2003). Some of the XET proteins (AtTCH4 and AtMeri5, Campbell & Braam 1999b) lose activity upon deglycosylation whereas others do not (AtEXGT, AtXTH4 and BobXET16A, Campbell & Braam 1999b, Henriksson et al. 2003). Interestingly, there seems to be a correlation between loss of activity upon deglycosylation and absence of charged residues putatively involved in electrostatic interactions. In PttXET16-34, all of the charged amino acids putatively involved in the structure stabilization are present and we therefore expected that the protein would not lose its activity upon deglycosylation. Analysis by mass spectrometry shows that the recombinant PttXET16-34 is glycosylated with 8-12 mannose units (Table 1, Publication I). The importance of the N-glycan was investigated by expression of PttXET16-34/N59S in which the codon for the asparagine in the glycosylation site had been mutated to a codon for serine (Publication I). This mutant was successfully expressed in P. pastoris when the cells were grown at 16 qC after induction. The specific activity of the glycosylation mutant was almost as high as that of the wild-type enzyme; however, the yield of secreted protein was much lower. A comparison between the ratio of intracellular and secreted protein indicated that the protein lacking the glycosylation might be retained in the cell. The Tm of the PttXET16- 34/N93S was similar to that of PttXET16-34, but we did observe a decreased solubility. A

34 Hybrid aspen xyloglucan endotransglycosylases (Publications I-III) second approach for investigating the importance of the N-glycosylation involved deglycosylation by endoglycosidase H followed by MS analysis and activity measurements. These studies confirmed that PttXET16-34 did not lose activity upon deglycosylation (Fig 2, Publication I). For PttXET16-35, which contains two N-glycosylation sites, the non-conserved site (N29 to S31) was removed by site-directed mutagenesis as described above, yielding PttXET16- 35/N29K. This mutant enzyme expressed as well as the wild-type enzyme and had similar activity, suggesting a less important function of the non-conserved sugar. Molecular modeling, based on the structure of PttXET16-34, further supported this hypothesis by showing that the N-glycan of the non-conserved N-glycosylation site is located well away from the active site. However, comparable to PttXET16-34/N93S, a slightly higher tendency of precipitation was observed for PttXET16-35/N29K. PttXET16-35 only possesses two of the charged residues putatively involved in active site stabilization and according to the salt- bridge hypothesis described above we had expected the conserved N-glycan of PttXET16-35 to be important for the activity and/or stability of the enzyme. However, after removing this sugar from the PttXET16-35/N29K by endoH, the protein had not lost more activity than the negative control (Fig 3, Publication II). These results suggest that both the salt bridges and the N-glycan might be less important for the activity than predicted, and that the loss of activity of the A. thaliana proteins simply might be an effect of harsher conditions for deglycosylation.

4.5 Fermentation of PttXET16-34 In order to obtain the large amount of PttXET16-34 needed for the development of a technical application (see discussion below); the enzyme production was scaled up from shake flask cultivations to cultivation in a 10 l fermenter (Publication III). This up-scaling involved a number of optimization steps aiming at minimizing proteolytic cleavage and denaturation of the recombinant protein. Initially, we showed that rich media, containing peptone and yeast extract, was crucial for stable protein expression both in shake flasks and in the fermenter (Fig 1 and 2, Publication III). SDS-PAGE analysis confirmed that the proteolytic cleavage of PttXET16-34 was considerably lower when using rich media during both the glycerol and the methanol phase (Fig 3, Publication III). However, despite the fact that the rich media led to an accumulation of PttXET16-34, we did not observe increased activity. In order to investigate the effect of the extensive foaming, XET activity was followed in supernatants from cell cultures as well as in whole-cell cultures, with and without anti-foam. In this experiment, we could clearly see a stabilizing effect of the antifoam agent MAZU on the activity of PttXET16-34 (Fig 4, Publication III). In order to further optimize the cultivation conditions, we examined the pH effect on the stability of PttXET16-34. Cell free samples were taken out after 142 hours of fermentation, adjusted to pH 3-7 and further incubated on a shaker. After 100 hours of incubation, the pH 4 sample had retained all of its activity, whereas the pH 5-7- samples had lost up to 50% of the activity (Fig 6, Publication III). SDS PAGE analysis confirmed that the PttXET16-34 is more extensively degraded by proteases at pH 4 than at pH 5 (Fig 5, Publication III). A pH 5 sample incubated with PMSF, a protease inhibitor active on alkaline serine proteases, also lost a significant portion of its activity, indicating that the active proteases are not of this type. Finally, the effect of the cultivation temperature was investigated. The optimal temperature for P. pastoris cell growth is 30 °C, giving high cell densities and a high production of the recombinant enzyme. However, very high cell densities can lead to cell death followed by cell lysis that releases intracellular proteases. In addition, both intra- and extracellular proteases are generally more active at a higher temperature. The investigation of the temperature-dependence on the stability of the protein was investigated in the same way as the pH-dependence. Cell-free samples were taken out after 142 hours and

35 Åsa Kallas further incubated at different temperatures. By analyzing the band volume from full length PttXET16-34, it could be shown that lower temperatures indeed decreased the rate of proteolysis (Fig 8, Publication III). Consequently, cultivations were grown at 15 °C. PttXET16-34 from fed-batch fermentations run under optimized conditions (rich media, MAZU anti-foam agent, pH 4 and 15 °C) was purified according to a slightly modified version of the purification protocol described in Publication I. The yield of purified PttXET16-34 was 54 mg/l filtrate, which is over four times more than what was obtained in shake flasks (Publications I and III).

36 Carbohydrate binding modules (Publications IV and V)

5 Carbohydrate binding modules (Publications IV and V) According to the CAZy website (http://afmb.cnrs-mrs.fr/CAZY/), putative carbohydrate binding modules from plant proteins have been identified in CBM families 13 (lectin-like, has been shown to bind to mannose, galactose and xylan), 18 (chitin-binding), 22 (xylan- and mixed linked E-(1,3)(1,4)-glucan-binding) and 43 (E-(1,3)-glucan-binding). Of these, only one representative (a CBM43 from olive tree) has been characterized in detail (Barral et al. 2005). It is of significant interest to address properties such as binding specificity and binding affinity of plant CBMs and compare them to CBMs from microbial proteins since this could possibly explain their role in proteins that might not be aimed for complete degradation. The Populus genome contain putative CBM candidates from family 18 (connected to putative chitinases from family 19, Mellerowicz E.J. and Sundberg B., unpublished data), family 22 (connected to a putative endo-(1,4)-E-xylanase, (Aspeborg et al. 2005) and Kallas Å. and Teeri T.T., unpublished data) and family 43 (Publication IV).

5.1 Bioinformatic analysis of the CBM family 43 The members of CBM family 43 (previously named X8) have an interesting modularity pattern, with some modules originating from proteins with a catalytic module and others occurring separately without a catalytic module (Henrissat et al. 2000, Henrissat et al. 2001). Many of the proteins also have a glycosylphosphatidylinositol (GPI) anchor. In Publication IV, we have performed a detailed bioinformatic analysis of all CBM43 members listed on the CAZy website (2 modules from Phytophtora infestans, 35 modules from filamentous fungi and 119 plant modules). In addition, we included 69 putative CBM43s identified in the Populus trichocarpa genome. The results show that all fungal modules originate from proteins containing a catalytic module from family 72 (E-(1,3)-glucanosyltransferases). A majority of these proteins also have a C-terminal GPI-anchor. The two modules originating from Phytophtora infestans contain catalytic modules from GH5 as well as putative GPI anchors. The plant proteins show a more divergent modularity pattern and were therefore divided into four subgroups: 1. GH17+CBM43+GPI, 2. GH17+CBM43, 3. CBM43, 4. CBM43+GPI (Fig 1, Publication IV). Table 1 (Publication IV) shows the occurrence of modules belonging to the different subgroups in the analyzed plant proteins. A phylogenetic tree including all plant modules as well as two fungal modules show that the fungal modules form a distinct outgroup (Fig 3, Publication IV). It also shows that the different subgroups, with a few exceptions, end up on separate branches. Some CBM43 modules occur, within the same protein, in pair with another CBM43. The phylogenetic tree indicates that these double CBM43s are a consequence of gene and/or genome duplication. Gene or genome duplication is also a possible explanation for the fact that the large majority of the CBM43s from Populus trichocarpa have a very close homologue. In order to get an idea about how many of the Populus trichocarpa gene models that are actively transcribed, we compared the gene model sequences with the sequences in the Populus cDNA library (http://poppel.fysbot.umu.se/). This comparison showed that 26 of the gene models have at least one homologue in the EST database (Table 1, Publication IV). This search also identified 17 EST clones encoding for putative GH17 proteins without a CBM43. Fig 4 (Publication IV) shows the EST distribution of modules belonging to subfamily 1+2, 3+4 and GH17 without a CBM43. Although these results alone do not prove anything, the differences in expression pattern might imply that there are also differences in physiological role in the plant.

37 Åsa Kallas

5.2 Cloning, expression and purification of CBM43PttGH17_84 In order to investigate its binding properties, one of the hybrid aspen CBM43 modules from subgroup 2 was cloned and expressed in E. coli. In the natural occurring protein, this CBM is fused to PttGH17_84 and we therefore denoted it CBM43PttGH17_84. Fig 19 shows the alignment of CBM43PttGH17_84 and Ole e 10, the earlier characterized CBM43 from olive tree. Conserved hydrophobic amino acids that might contribute to the binding as well as putatively stabilizing cysteines are marked.

# ! # ! PttCBM43 ------STTGN-TWCVANPDAGKEKLQAALDFACGEG 30 Ole_e_10 MRGTAGVPDQPVPTPTPSVPTSSSPVPKPPTQGNKKWCVPKAEATDAQLQSNIDYVCSQS 60 * ** *** * ** * * ! !# # ## ! ! PttCBM43 GADCRPIQPEATCYNPNTLVAHSSFAFNSYYQKKGRGMGDCYFGGAAFVVTQEPKFGECE 90 Ole_e_10 GMDCGPIQANGACFNPNTVRAHASYAMNSWYQSKGRNDFDCDFSGTGAITSSDPSNGSCS 120 * ** *** * **** ** * * ** ** *** ** * * * * * # PttCBM43 FPTGY 95 Ole_e_10 FLS-- 123

Figure 19. Amino acid sequence alignment of CBM43PttGH17_84 with the previously characterized Ole e 10 (Barral et al. 2005). * below letters indicate sequence identity, conserved cysteines are marked with ! above letters and hydrophobic amino acids with #.

A large number of CBMs have been successfully expressed in E. coli. Disulphide bridges that are normally reduced in the cytoplasm of the bacteria can be obtained either by secretion to the periplasm or by using a strain deficient in the reducing proteins (Sorensen & Mortensen 2005). We expressed CBM43PttGH17_84 in pEZZmp18 (Ståhl et al. 1989), a vector earlier used for expression of a number of cellulose binding modules (Lehtiö et al. 2003). The gene of interest is cloned in frame with the Staphylococcus aureus protein A (SpA) signal peptide directing the protein to the periplasm. The vector has a constitutive SpA promoter and fuses the recombinant protein to a ZZ-tag (originating from the IgG-binding SpA, Moks et al. 1987) that can be used for purification and detection. In addition, it has been shown that fusion partners of this type often improve the solubility of the recombinant protein (Baneyx 1999). Expression and subsequent purification (by osmotic shock, IgG affinity and polishing cation exchange) yielded approximately 0.5-1 mg ZZ-CBM43PttGH17_84 per liter of cultivation (Publication IV).

5.3 Binding properties of CBM43PttGH17_84

The binding properties of CBM43PttGH17_84 were investigated by affinity gels and qualitative binding studies evaluated with SDS PAGE analysis. The affinity gels showed that the recombinant ZZ-CBM43PttGH17_84 binds to laminarin (E-(1,3)-glucan) and barley E-glucan (E- (1,3)(1,4)-glucan, ratio 1:2.3-2.5), weakly to lichenan (E-(1,3)(1,4)-glucan, ratio 1:2) but not to CMC, glucomannan, glucuronoxylan or xyloglucan (Fig 5, Publication IV). The fact that the protein is binding stronger to barley E-glucan than to lichenan (although the latter is more similar to laminarin) might be explained by the three-dimensional conformation that the substrate adopts. Dissociation constants (KD) for the binding of ZZ-CBM43PttGH17_84 to laminarin (0.19 mg/ml or 47 μM) and barley E-glucan (2.4 mg/ml or 11μM) were obtained by running affinity gels with a range of different ligand concentrations (Fig 6, Publication IV). The binding studies to the insoluble substrates showed binding to curdlan (E-(1,3)-glucan), BMCC (bacterial microcrystalline cellulose), in vitro synthesized E-(1,3)-glucan of different crystallinity and a mixed (E-(1,3)(1,6)-glucan (Fig 7, Publication IV).

38 Carbohydrate binding modules (Publications IV and V)

5.4 CBMs as analytical tools for fiber surface characterization During pulping processes, the surface of the wood fibers can be significantly altered. In order to evaluate different processes and to obtain a final product with the desired properties, it is important to be able to analyze both the raw material and the processed material regarding carbohydrate contents, cellulose crystallinity etc. A number of microscopy techniques are available for sample analysis, ranging in resolution from 200 nm for light microscopy down to 0.1 nm for atomic force microscopy (AFM). Different staining techniques (specific for certain types of glycosidic bonds) and carbohydrate-specific antibodies have been used in order to label the different components of the plant cell wall (Knox 1997, Willats et al. 2000). However, these techniques are not as fine-tuned as CBMs and can for example not distinguish between crystalline/amorphous cellulose and reducing/non-reducing ends. Due to their high specificity and relatively high affinity for carbohydrates, CBMs have successfully been used for the analysis of plant cell walls, wood samples and pulp fibers (Hildén et al. 2003, McCartney et al. 2004, McCartney et al. 2006). In Publication V, we aimed to establish a protocol for the utilization of recombinant CBMs with different binding specificity as probes for the labeling of fibers and pulps. The method could also be used for other purposes, such as the evaluation of genetically modified plants. We used one cellulose binding module (CBM1 from Hypocrea jecorina Cel7A, CBM1HjCel7A, (Reinikainen et al. 1992, Linder et al. 1996, Lehtiö et al. 2003) which was successfully bound to BMCC, G-layer cellulose and spruce fibers (Fig 2, Publication V). Both directly and indirectly FITC/TRITC-labeled proteins (in fluorescence microscopy) as well as silver-enhanced gold labeling (for transmission electron microscope, Daniel et al. 2006) were used for visualization of the CBM. In addition, two different mannan-binding modules; CBM27TmMan5 (CBM27 from Thermotoga maritima Man5, Boraston et al. 2003) and CBM35CjMan5C (CBM35 from Cellvibrio japonicus Man5C, Hogg et al. 2003, Bolam et al. 2004) were compared, both to each other and to a E-(1,4)- mannan/galacto-E-(1,4)-mannan-specific antibody. The results show that CBM35CjMan5C binds uniformely on both spruce wood sections and spruce fibers whereas CBM27TmMan5 only binds to highly fibrillated parts of fibers and sections (Fig 3, Publication V). The differences in binding properties of the two mannan-binding CBMs can be explained by their three- dimensional structures showing that CBM35CjMan5C has a more flexible binding site than CBM27TmMan5, thus allowing a higher degree of substrate substitution (Boraston et al. 2003, Tunnicliffe et al. 2005). The mannan-specific antibody binds uniformly, but weaker than CBM35CjMan5C. The methods presented in Publication V should be possible to adapt for CBMs with any type of binding specificity. The CBMs could thereby function as molecular probes for detection of changes in the carbohydrate composition and distribution on pulp surfaces after different types of mechanical or chemical processing or for analysis of cell wall carbohydrates in for example genetically modified plants. The method could be further developed by using tailor- made CBMs, with even higher specificity than the naturally occurring proteins.

39 Åsa Kallas

6 PttXYN10A: A putative hybrid aspen endo-(1,4)-E-xylanase (unpublished work)

6.1 Sequence and expression analysis of PttXYN10A The genome of Populus trichocarpa contains 7 gene models encoding for putative GH10 endo-(1,4)-E-xylanases (EC 3.2.8.1) (Geisler-Lee et al. 2006). However, only one of these gene models (PttXYN10A) has corresponding cDNA clones in the Populus EST libraries. One PttXYN10A cDNA clone was found in the cambial library and the other in the wood cell death library (both originating from Populus tremula x tremuloides, Sterky et al. 2004). Microarray analysis comparing the relative gene expression pattern during the different stages of wood formation showed significant up-regulation of the putative endo-(1,4)-E-xylanase during the formation of the secondary cell wall (Fig 20, data from Hertzberg et al. 2001, Schrader et al. 2004).

Figure 20. Results from microarray analysis showing the expression pattern of PttXYN10A during the different stages of wood formation. A = meristematic cells, B = early expansion, C = late expansion, D = secondary wall formation, E = late maturation. Data from (Hertzberg et al. 2001) (2001, solid line) and (Schrader et al. 2004) (2004, dotted line).

The full-length PttXYN10A gene, obtained by 5’RACE was shown to encode for a GH10 catalytic module, preceded by three putative carbohydrate binding modules from family 22 (Fig 21a, Aspeborg et al. 2005). Further sequence analysis shows that PttXYN10A contains 6 putative N-glycosylation sites. Even though the absence of an obvious signal peptide suggests that PttXYN10A should be retained in the cytoplasm, the close homologue in A. thaliana (AtXyn1), which also lacks an apparent signal peptide has been reported to be translocated into the cell wall by an unknown mechanism (Suzuki et al. 2002). Just as AtXyn1, PttXYN10A possess the predicted catalytic motif GLPIWFTELDV (nucleophile underlined). Sequence alignment of the three PttXYN10A CBMs (CBM22_1PttXYN10A, CBM22_2PttXYN10A and CBM22_3PttXYN10A) with the xylan-binding CBM22-2 from Clostridium thermocellum xylanase Xyn10B shows that, despite a low overall homology, the majority of the amino acid residues required for CBM22_2CtXyn10B to be functional (Xie et al. 2001b) are conserved in the hybrid aspen CBMs (Fig 21b). However, the R25 (CBM22-2CtXyn10B numbering), important for structural stability and hydrogen bonding to the ligand, is shifted in position or even absent in two of the three the hybrid aspen CBMs.

40 PttXYN10A: A putative hybrid aspen endo-(1,4)-E-xylanase (unpublished work) a)

b)

PttCBM22_1 --NSNAPNIILNHDFSRGLNSWHPNCCDGFVLSAD------SGHSGFSTKP--GGNYAV 49 PttCBM22_2 ---DGDGNIILNPQFDDGLNNWSGRGCK-IAIHDS------IADGKIVPLS--GKVLAT 47 PttCBM22_3 -----GVNIIQNSNLSDGTNGWFPLGNCTLTVATGSPHILPPMARDSLGPHEPLSGRCIL 55 CtCBM22-2 PEEPDANGYYYHDTFEGSVGQWTARGPAEVLLSGR------TAYK--GSESLL 45

PttCBM22_1 VSNRKECWQGLEQDITSRISPCSTYSISARVGVSGPVQYPTDVLATLKLEYQNSATSYLL 109 PttCBM22_2 ATERTQSWNGIQQEITERVQRKLAYEATAVVRIFGNNVTSADIRATLWVQTPNLREQYIG 107 PttCBM22_3 VTKRTQTWMGPAQMITDKLKLLLTYQVSAWVKIGSGANGPQNVNVALGVDN-----QWVN 110 CtCBM22-2 VRNRTAAWNGAQRALNPRTFVPGNTYCFSVVASFIEGASSTTFCMKLQYVDGSGTQRYDT 105

PttCBM22_1 VGEISVSKEGWEKLEG-TFSLATMPDHVVFYLEGPAPGVDLLIESVIITCSCPSECNN 166 PttCBM22_2 IANLQATDKDWVQLQG-KFLLNGSPKRVVIYIEGPPAGTDILVNSFVVKHAEK----- 159 PttCBM22_3 GGQVEINDDRWHEIGG-SFRIEKQPSKVMVYVQGPAAGVDLMLAGLQIFPVDR----- 162 CtCBM22-2 IDMKTVGPNQWVHLYNPQYRIPSDATDMYVYVETADDTINFYIDEAIGAVAGTVI--- 160

Figure 21a. Modular structure of PttXYN10A. CBM22_1PttXyn10A: aa 25-190, CBM22_2PttXyn10A: aa 197-355, CBM22_3PttXyn10A: aa 370-531, GH10 catalytic module: aa 568-915. b. Sequence alignment of the three CBM22s from hybrid aspen with CBM22-2 from Clostridium thermocellum xylanase Xyn10B. Red letters indicate sequence identity to CBM22_2CtXyn10B, blue letters indicate sequence identity between the hybrid aspen CBMs. The amino acids that are important for activity of CBM22_2CtXyn10B are highlighted yellow.

6.2 Heterologous expression Despite extensive efforts to express the full-length PttXYN10A as well as the catalytic module without the carbohydrate binding modules in P. pastoris, we failed to obtain detectable amounts of either of the two recombinant proteins. However, the three putative hybrid aspen CBMs (CBM22_1PttXYN10A, CBM22_2 PttXYN10A and CBM22_3PttXYN10A) were individually expressed in E. coli, using the pET28a vector and the BL21 Tuner strain (Novagen). The yield of recombinant protein after His-tag purification was 10-50 mg/l of cultivation. Affinity gels with soluble carbohydrates indicate that CBM22_2PttXYN10A and CBM22_3PttXYN10A bind to different xylans such as glucuronoxylan, wheat arabinoxylan and rye arabinoxylan. However, they do not bind to mannan, xyloglucan, laminarin or lichenan. For both these CBMs, the binding to xylan was too weak to be able to determine the affinity constants by ITC analysis, indicating KD values above the millimolar range. Putatively, these CBMs would bind synergistically if they were expressed together, however, they could also constitute examples of plant CBMs with weak binding. CBM22_1PttXYN10A did not bind to any of the tested substrates. In most cases, the affinity of the CBM corresponds to the preferred substrate of the catalytic module that it is attached to (Boraston et al. 2004). Thus, the xylan binding of CBM22_2PttXYN10A and CBM22_3PttXYN10A suggests that the PttXYN10A indeed acts on xylan although the activity of the catalytic module still has to be confirmed experimentally.

41 Åsa Kallas

6.3 Reverse genetics In a parallel approach, we attempted to use reverse genetics to study the physiological role of PttXYN10A (Takahashi J, Awano T, Kallas Å, Berthold F., Teeri TT, Sundberg B and Mellerowicz E, unpublished data). An antisense construct was created in the pPCV702.kana vector and transferred into juvenile hybrid aspen plants via Agrobacterium tumefaciens, strain GV3101. Ten antisense lines were propagated in the green house and used for further analysis. The preliminary analysis of the antisense plants revealed a phenotype with increased height and larger leaves. Also, most antisense plants had larger stem diameter and more leaves than the wild-type plants (data not shown). Analysis of gene expression (measured by RT PCR) confirmed a down-regulation of the PttXYN10A gene in the xylem of all the antisense plants compared to the wild-type plants. Surprisingly, most antisense plants showed an up-regulation of the PttXYN10A transcript in the phloem (Fig 22).

Figure 22. Expression levels of PttXYN10A in the phloem and xylem tissue of the wild-type (WT) and the antisense (A2-A32) plants.

Based on the results from the RT PCR analysis, four antisense lines (A2, A3, A5 and A32) were selected for analysis of the carbohydrate content and protein extractions. Analysis of the carbohydrate content by FTIR showed a slightly higher content of xylan, mannan and lignin in the antisense plants compared to the wild-type plants (Table 2).

Composition (%) A2 A2 A3 A5 A32 A32 T89 (wt) Arabinan 0,28 0,29 0,31 0,28 0,30 0,28 0,30 Galactan 1 0,95 0,95 0,98 1,02 1,02 1,46 Glucan 44,90 45,09 45,49 44,87 44,84 44,82 47,01 Xylan 16,40 16,48 17,38 15,74 16,66 16,74 15,64 Mannan 2,62 2,65 2,71 2,47 2,37 2,54 2,17 Klason lignin 18,2 18,2 17,5 16,5 17,3 17,3 16,4 Acid soluble lignin 3,66 3,66 3,91 3,73 3,85 3,85 3,54 Yield 87,05 87,32 88,26 84,57 86,34 86,55 86,51

Table 2. Carbohydrate composition in four PttXYN10A antisense lines (A2, A3, A5 and A32) and the wild-type control (T89).

42 PttXYN10A: A putative hybrid aspen endo-(1,4)-E-xylanase (unpublished work)

Since both the xylan and mannan-contents were slightly altered in the antisense lines compared to the wild-type plants, we determined endo-mannanase as well as endo-xylanase activity by measuring the formation of reducing ends with the DNS assay (Sumner et al. 1949). Proteins were extracted from phloem and xylem scrapings. For the phloem tissue, the protein extracts from all the antisense lines had somewhat higher hydrolytic activity on both 4-O-methyl-D-glucuronoxylan and konjac glucomannan than the protein extract from the wild-type line. For the xylem tissue, one of the antisense lines (A2) showed a slightly lower activity on 4-O-methyl-D-glucuronoxylan compared to the wild-type line (93 % of the T89), whereas the antisense line A3 had a slightly higher activity on this substrate as compared to the wild-type line (111% of the T89). A5 and A32 showed no difference in activity on 4-O- methyl-D-glucuronoxylan as compared to T89. For the activity of the xylem extracts on glucomannan, the protein extracts from the all the anti-sense lines, except for A32, showed lower activity than T89. These results indicate that the down-regulation of PttXYN10A expression affects the activity on glucomannan more than the activity on glucuronoxylan. This might be a primary effect indicating that the in muro substrate of PttXYN10A is glucomannan rather than glucuronoxylan or a secondary effect caused by the genetic modifications of the plants. As discussed by Schröder et al. (2004), enzymatic assays on plant extracts may be disturbed by various factors such as the presence of inhibitory compounds, proteases or other co-extracted plant enzymes competing for the substrate (for example mannosidases or xylosidases). Furthermore, although we used a high-salt buffer (1M NaCl), the proteins may not have been completely extracted from the plant cell. In summary, the current results have to be interpreted with caution and in order to further elucidate the substrate specificity of PttXYN10A, pure and preferentially recombinant protein would be needed.

Figure 23. Activity on glucuronoxylan and glucomannan in protein extracts from the phloem and xylem side of antisense (A2-A32) and wild-type plants (T89).

43

CONCLUDING REMARKS AND FUTURE PERSPECTIVES

The development of a biotechnological process in which a protein is used for a technical application involves a number of steps. Genes encoding for putatively interesting proteins have to be identified, the corresponding proteins have to be characterized and finally, the conditions for the application have to be optimized and evaluated. The work presented in this thesis has focused on protein expression and characterization, as a continuation of the gene identification work and a precursor to the development of biotechnological applications. Of the investigated proteins, XET is the best example of an enzyme, which has been taken all the way from the gene to an application. This work provides a first insight into the properties of two xyloglucan endotransglycosylases (PttXET16-34 and PttXET16-35) earlier identified in hybrid aspen. Heterologous expression of the recombinant proteins with subsequent purification and biochemical characterization studies showed that they are pure transglycosylases. Also, the function of the N-glycans was investigated. The fermentation of PttXET16-34 yields large amounts of protein that is required for further characterization and development of biotechnological applications. A biotechnological application where recombinant PttXET16-34 is utilized for chemo-enzymatic modification of cellulose without disturbing its structure was recently developed in our laboratory (Brumer et al. 2004). In this approach, PttXET16-34 is used to incorporate chemically modified xyloglucan oligosaccharides into xyloglucan, which is subsequently adsorbed onto cellulose. Thus, by mimicking the cellulose-xyloglucan interaction that occurs naturally in the primary cell wall, this technique makes it possible to alter the properties of the fiber surface. Heterologous expression as well as reverse genetics was used in order to elucidate the function of a putative endo-(1,4)-E-xylanase from hybrid aspen (PttXYN10A). Several facts made it interesting to address the properties of this enzyme. Firstly, only a few plant endo- (1,4)-E-xylanases have so far been studied. Most of these originate from monocot species and are involved in carbohydrate degradation in processes such as seed germination. The expression profile of PttXYN10A obtained by microarray showed a significant up-regulation during the formation of the secondary cell wall when xylan degradation is expected to be limited. This gives rise to the question whether the protein might be a “proof-reading” enzyme similar to KORRIGAN. It is also possible that the PttXYN10A has transglycosylase activity rather than hydrolytic activity. If PttXYN10A is found to be located in the cell wall, the hybrid aspen enzyme could be involved in remodeling the cellulose-xylan network in the secondary cell wall in the same way as the XET proteins remodel the cellulose-xyloglucan in the primary cell wall. The analysis of the results from the reverse genetics gave some indications about the function of PttXYN10A within the plant. However, the down-regulation obtained with the antisense technique might have been insufficient and probably it would be necessary to use an alternative technique such as RNAi. Carbohydrate binding modules increase the efficiency of the catalytic module that they are attached to; a property that makes them interesting tools in a large number of biotechnological applications. So far, the research has been focused on CBMs originating from fungal and bacterial enzymes and the knowledge about plant CBMs is very limited. In plants, the catalytic module that the CBM is attached to might act on exogenous substrates (where the aim is complete degradation), such as chitin from an invading pathogen. In such a scenario, it is likely that the CBM has a role, which is comparable to that of the microbial CBMs. This

45 might be the case with the plant CBMs from family 18, which are normally associated to catalytic modules with chitinase activity. However, if the CBM is associated to a catalytic module that acts on endogenous substrates, for example xylan or callose, it might need to have narrower substrate specificity or weaker affinity. Although these substrates are completely degraded at some stages of plant development (for example seed germination and fruit ripening), other developmental phases only require very limited degradation. In these cases, it should be important that the CBMs target the catalytic module to the exact substrate and that the binding is not too tight, which could lead to severe degradation. The CBM43 expressed and characterized in this study indicated a relatively wide substrate specificity with dissociation constants comparable to those obtained for many CBMs from microorganisms. On the other hand, the CBM22s from PttXYN10A showed weaker substrate binding, perhaps significant for some plant CBMs. However, these are only examples of a few plant CBMs. In order to fully elucidate the role of CBMs in plants, it will be necessary to characterize more CBMs, both associated to and isolated from their catalytic modules. Also, similar to what has been done for a number of microbial CBMs, the activities of the isolated catalytic modules should be compared to the activities of the catalytic modules attached to the CBM. By obtaining increased knowledge about the plant CBMs, it might be possible to exploit them for various purposes. For example, CBMs with narrow substrate specificity would be very attractive as probes for carbohydrate analysis.

46 LIST OF ABBREVIATIONS

AFM atomic force microscopy ANTS 8-aminonaphthalene-1,3,6-trisulfonic acid AOX alcohol oxidase BCA bicinchoninic acid BMCC bacterial microcrystalline cellulose CAZyme carbohydrate active enzyme CBD cellulose binding domain CBM carbohydrate binding module cDNA complementary DNA CMC carboxy-methyl cellulose DNA deoxyribonucleic acid DNS 3,5-dinitrosalicylic acid ER endoplasmic reticulum EC enzyme commission ECF elementary chlorine free EST expressed sequence tag FITC fluorescein isothiocyanate FTIR Fourier transform infrared spectroscopy Gal galactose GAP glyceraldehyde-3-phosphate dehydrogenase GH glycoside hydrolase Glc glucose GlcNAc N-acetyl glucosamine GPI glycosylphosphatidylinositol GT glycosyltransferase ITC isothermal titration calorimetry IPTG isopropyl-beta-D-thiogalactopyranoside KD dissociation constant KOR KORRIGAN Man mannose MFA microfibrillar angle mRNA messenger RNA NMR nuclear magnetic resonance PCR polymerase chain reaction PME pectin methyl esterase PMSF phenylmethylsulfonyl fluoride Ptt Populus tremula x tremuloides SDS-PAGE sodium dodecyl sulfate-polyacrylamide gel electrophoresis RNAi RNA interference RT PCR reverse transcriptase PCR TCF totally chlorine free TEM transmission electron microscopy TRITC tetramethyl rhodamine isothiocyanate RACE rapid amplification of cDNA ends RNA ribonucleic acid

47 SpA Staphylococcus aureus protein A UDP uridine diphosphate UV ultraviolet XEH xyloglucan endohydrolase XET xyloglucan endotransglycosylase XGO xyloglucan oligosaccharide XTH xyloglucan endotransglycosylase / hydrolase XYN endo-1,4-E-xylanase Xyl xylose

48 ACKNOWLEDGEMENTS

Without colleagues, collaborators, friends and family it would never have been possible to write this thesis. I would especially like to thank:

Professor Tuula Teeri for leading the Wood Biotechnology group with great success and for letting me take part of many very exciting research projects. For sorting out my major and minor crisis’s and great help with manuscripts for papers and the thesis. For lending me Vinski when I needed a dog walk!

Dr. Stuart Denman for initial supervision when I was completely lost in the lab. Doc. Harry Brumer for invaluable help with enzymatic matters such as kinetics, pH-profiles etc. Dr. Kathleen Piens for great help and encouragment.

Many and fruitful collaborations outside our lab…Dr. Ewa Mellerowicz, Prof. Björn Sundberg, Dr. Nobuyuki Nishikubo and Junko Takahashi at SLU in Umeå for the work with the transgenic trees, bioinformatics etc. Prof. Geoff Daniel and Dr. Lada Filonova at SLU in Uppsala for excellent microscopy work with the CBMs. Prof. Alwyn Jones and Patrik Johansson at Uppsala University for the PttXET16A crystallization work. Dr. Monika Bollok, Dr. Mehmedalija Jahic and Prof. Sven-Olof Enfors in the Centre of Bioprocessing for the PttXET16A fermentation work. Thanks to Professor Harry Gilbert for welcoming me in Newcastle and the whole HJG group for taking care of me! Professor Bernard Henrissat and Dr. Pedro Coutinho at CNRS in Marseille for helping me to sort out the bioinformatics of the CBM family 43.

…and within the lab! Martin Baumann for excellent XET/NXG collaboration, fruitful discussions (alla talar svenska!) and enlightning company in the lab. Dr. Jenny Fäldt for making the XET mutants. Dr. Hongbin Henriksson for introduction to the ÄKTA systems. Dr. Janne Lehtiö and Dr. Malin Gustavsson for introducing me to the world of CBMs and for letting me inherit the “Can of Mystery”. Dr. Lionel Greffe for struggling with the CBM cloning. Dr. Joke Rotticci-Mulder for patiently answering all my questions. Dr. Henrik Aspeborg for guidance in the microarray djungle. Doc. Ines Ezcurra and Maria Henriksson for critical reading of the thesis manuscript.

All former and present colleagues in XVe and floor 2 for creating a nice working atmosphere! It would not have been as fun without fika, lunches, Christmas parties etc. Special thanks to my great room mates Henrik Aspeborg, Ulla Rudsander, Soraya Djerbi and Martin Baumann! Thanks to Marianne Wittrup Larsen for the parallel struggle to reach the end (soon it will be your turn!!). Jenny Ottosson for irregular training at KTH-hallen and for being a good friend. Joke and Didier Rotticci-Mulder for nice activities outside the lab. Fredrik Viklund for many cups of tea and amusing bus stories. Linda Fransson for your extra-ordinary pep-talks. Cecilia Branneby for making my working hours feeling normal. Nader Nourizad for your happy mood! Anna Olsson and Torkel Berglund for nice discussions about mountains, skiing and Mumintrollen. Anders Winzell and Alex Rajangam for nice chats in the lab (now you can whistle as much as you want Alex!). Lotta Rosenfäldt and Alphonsa Lourdossos for sorting out the paper business and for your efforts to keep the lunch conversations away from science. Thanks also to the new PhD students in Tuula’s

49 group (Christian, Maria, Johanna, Nomchit, Lage and Jens) for bringing in new enthusiasm, new ideas and a somewhat more updated wardrobe.

Jag vill också rikta ett STORT tack till min familj och mina vänner utanför labbet: Maria och Martina för lång och trogen vänskap, hoppas att vi kan ses lite mer framöver. Jenny för fantastiska resor och för att du är den du är. Viveca för att du tar emot oss och tar hand om oss i Paris och Bryssel. Karin och Ingrid för lysande prestationer (!?) på beachvolleyboll-planen och för att ni är såna super-vänner. Emma för att du är världens bästa brevvän -. KTH- maffian: Anna, Erika, Linda, Lisa, Sofie och Therese för många trevliga middagar, pratstunder och framtida barnvagnspromenader. Mina au pair-vänner från Annecy: Linda, Lisa, Cissi, Mette och Camilla. Lalle för festfixande, squash-spel och sällskap på udda filmer. Kattis och Fredrik för trevligt umgänge i Storvallen, på Fejan och i Sigtuna.

Peter – min älskling j. För ditt norrländska lugn. För oändlig kärlek, omtanke och markservice. För engagemang i såväl proteiner, sockermolekyler som utformning av avhandlingen. Malin och Ivar för att vi får komma till Härnösand och vila upp oss. Grynet för sällskap vid datorn och för att du satte perspektiv på min avhandling.

Nanna – min allra käraste syster! För uppmuntran, stöd samt ovärderlig kunskap om hur man klarar sig i livet. Henrik för att du sätter min forskning i ett sammanhang. Tilde och Elvira för att ni har koll på vad som egentligen är viktigt i livet!

Till sist vill jag tacka mina föräldrar – mamma Anette och pappa Lembit. För att ni alltid finns till hands och alltid ställer upp. För att ni alltid uppmuntrat mig till att göra det jag vill i livet (bortsett från en och annan tågluff) och låtit mig bli den jag är.

50 REFERENCES

Adams, K.L. and Wendel, J.F. (2005) Polyploidy and genome evolution in plants. Curr. Opin. Plant Biol. 8: 135-141

Allouch, J., Jam, M., Helbert, W., Barbeyron, T., Kloareg, B., Henrissat, B. and Czjzek, M. (2003) The three-dimensional structures of two beta-agarases. J. Biol. Chem. 278: 47171- 47180

Andersson, A., Keskitalo, J., Sjödin, A., Bhalerao, R., Sterky, F., Wissel, K., Tandre, K., Aspeborg, H., Moyle, R., Ohmiya, Y., Bhalerao, R., Brunner, A., Gustafsson, P., Karlsson, J., Lundeberg, J., Nilsson, O., Sandberg, G., Strauss, S., Sundberg, B., Uhlen, M., Jansson, S. and Nilsson, P. (2004) A transcriptional timetable of autumn senescence. Genome Biol. 5:

Andersson-Gunnerås, S., Mellerowicz, E., Love, J., Segerman, B., Ohmiya, Y., Coutinho, P., Nilsson, P., Henrissat, B., Moritz, T. and Sundberg, B. (2006) Biosynthesis of cellulose-enriched tension wood in Populus: global analysis of transcripts and metabolites identifies biochemical and developmental regulators in secondary wall biosynthesis. Plant J. 45: 144-165

Araki, R., Ali, M.K., Sakka, M., Kimura, T., Sakka, K. and Ohmiya, K. (2004) Essential role of the family-22 carbohydrate-binding modules for beta-1,3-1,4-glucanase activity of Clostridium stercorarium Xyn10B. FEBS Lett. 561: 155-158

Arrowsmith, D.A. and de Silva, J. (1995) Characterization of 2 tomato fruit-expressed cDNAs encoding xyloglucan endo-transglycosylase. Plant Mol. Biol. 28: 391-403

Aspeborg, H., Schrader, J., Coutinho, P., Stam, M., Kallas, Å., Djerbi, S., Nilsson, P., Denman, S., Amini, B., Sterky, F., Master, E., Sandberg, G., Mellerowicz, E., Sundberg, B., Henrissat, B. and Teeri, T.T. (2005) Carbohydrate-active enzymes involved in the secondary cell wall biogenesis in hybrid aspen. Plant Physiol. 137: 983-997

Bajpai, P. (1999) Application of enzymes in the pulp and paper industry. Biotechnol. Prog. 15: 147-157

Baneyx, F. (1999) Recombinant protein expression in Escerichia coli. Curr. Opin. Biotechnol. 10: 411-421

Banik, M., Li, C.D., Langridge, P. and Fincher, G.B. (1997) Structure, hormonal regulation, and chromosomal location of genes encoding barley (1->4)-beta-xylan endohydrolases. Mol. Gen. Genetics 253: 599-608

Barral, P., Suarez, C., Batanero, E., Alfonso, C., Alche Jde, D., Rodriguez-Garcia, M.I., Villalba, M., Rivas, G. and Rodriguez, R. (2005) An olive pollen protein with allergenic activity, Ole e 10, defines a novel family of carbohydrate-binding modules and is potentially implicated in pollen germination. Biochem. J. 390: 77-84

51 Baskin, T.I. (2001) On the alignment of cellulose microfibrils by cortical microtubules: a review and a model. Protoplasma 215: 150-171

Bayer, E.A., Belaich, J.P., Shoham, Y. and Lamed, R. (2004) The cellulosomes: Multienzyme machines for degradation of plant cell wall polysaccharides. Annu. Rev. Microbiol. 58: 521-554

Benjavongkulchai, E. and Spencer, M.S. (1986) Purification and Characterization of Barley-Aleurone Xylanase. Planta 169: 415-419

Bhat, M.K. (2000) Cellulases and related enzymes in biotechnology. Biotechnol. Adv. 18: 355-383

Bih, F.Y., Wu, S.S.H., Ratnayake, C., Walling, L.L., Nothnagel, E.A. and Huang, A.H.C. (1999) The predominant protein on the surface of maize pollen is an endoxylanase synthesized by a tapetum mRNA with a long 5 ' leader. J. Biol. Chem. 274: 22884-22894

Bolam, D.N., Ciruela, A., McQueen-Mason, S., Simpson, P., Williamson, M.P., Rixon, J.E., Boraston, A., Hazlewood, G.P. and Gilbert, H.J. (1998) Pseudomonas cellulose- binding domains mediate their effects by increasing enzyme substrate proximity. Biochem. J. 331: 775-781

Bolam, D.N., Xie, H., White, P., Simpson, P.J., Hancock, S.M., Williamson, M.P. and Gilbert, H.J. (2001) Evidence for synergy between family 2b carbohydrate binding modules in Cellulomonas fimi xylanase 11A. Biochemistry 40: 2468-2477

Bolam, D.N., Xie, H.F., Pell, G., Hogg, D., Galbraith, G., Henrissat, B. and Gilbert, H.J. (2004) X4 modules represent a new family of carbohydrate-binding modules that display novel properties. J. Biol. Chem. 279: 22953-22963

Boraston, A.B., Bolam, D.N., Gilbert, H.J. and Davies, G.J. (2004) Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem. J. 382: 769-781

Boraston, A.B., Creagh, A.L., Alam, M.M., Kormos, J.M., Tomme, P., Haynes, C.A., Warren, R.A.J. and Kilburn, D.G. (2001a) Binding specificity and thermodynamics of a family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A. Biochemistry 40: 6240-6247

Boraston, A.B., Kwan, E., Chiu, P., Warren, R.A. and Kilburn, D.G. (2003) Recognition and hydrolysis of noncrystalline cellulose. J. Biol. Chem. 278: 6120-6127

Boraston, A.B., McLean, B.W., Chen, G., Li, A., Warren, R.A. and Kilburn, D.G. (2002) Co-operative binding of triplicate carbohydrate-binding modules from a thermophilic xylanase. Mol. Microbiol. 43: 187-194

Boraston, A.B., McLean, B.W., Guarna, M.M., Amandaron-Akow, E. and Kilburn, D.G. (2001b) A family 2a carbohydrate-binding module suitable as an affinity tag for proteins produced in Pichia pastoris. Prot. Expr. Purif. 21: 417-423

52 Boraston, A.B., Revett, T.J., Boraston, C.M., Nurizzo, D. and Davies, G.J. (2003) Structural and thermodynamic dissection of specific mannan recognition by a carbohydrate binding module, TmCBM27. Structure 11: 665-675

Boraston, A.B., Tomme, P., Amandoron, E.A. and Kilburn, D.G. (2000) A novel mechanism of xylan binding by a lectin-like module from Streptomyces lividans xylanase 10A. Biochem. J. 350: 933-941

Bouche, N. and Bouchez, D. (2001) Arabidopsis gene knockout: phenotypes wanted. Curr. Opin. Plant Biol. 4: 111-117

Bourquin, V., Nishikubo, N., Abe, H., Brumer, H., Denman, S., Eklund, M., Christiernin, M., Teeri, T.T., Sundberg, B. and Mellerowicz, E.J. (2002) Xyloglucan endotransglycosylases have a function during the formation of secondary cell walls of vascular tissues. Plant Cell 14: 3073-3088

Brett, C.T. and Waldron, K.W. (1996) Physiology and biochemistry of plant cell walls, Chapman & Hall, Cambridge, UK

Bretthauer, R.K. and Castellino, F.J. (1999) Glycosylation of Pichia pastoris-derived proteins. Biotechnol. Appl. Biochem. 30: 193-200.

Brumer, H., Zhou, Q., Baumann, M.J., Carlsson, K. and Teeri, T.T. (2004) Activation of crystalline cellulose surfaces through the chemoenzymatic modification of xyloglucan. J. Am. Chem. Soc. 126: 5715-5721

Brun, E., Johnson, P.E., Creagh, A.L., Tomme, P., Webster, P., Haynes, C.A. and McIntosh, L.P. (2000) Structure and binding specificity of the second N-terminal cellulose- binding domain from Cellulomonas fimi endoglucanase C. Biochemistry 39: 2445-2458

Brunner, A.M., Busov, V.B. and Strauss, S.H. (2004) Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends Plant Sci. 9: 49-56

Burstin, J. (2000) Differential expression of two barley XET-related genes during coleoptile growth. J. Exp. Bot. 51: 847-852

Burton, R.A., Shirley, N.J., King, B.J., Harvey, A.J. and Fincher, G.B. (2004) The CesA gene family of barley. Quantitative analysis of transcripts reveals two groups of co-expressed genes. Plant Physiol. 134: 224-236

Callewaert, N., Laroy, W., Cadirgi, H., Geysens, S., Saelens, X., Jou, W.M. and Contreras, R. (2001) Use of HDEL-tagged Trichoderma reesei mannosyl oligosaccharide 1,2-alpha-D-mannosidase for N-glycan engineering in Pichia pastoris. FEBS Letters 503: 173-178

Campbell, P. and Braam, J. (1998) Co- and/or post-translational modifications are critical for TCH4 XET activity. Plant J. 15: 553-561

Campbell, P. and Braam, J. (1999a) Xyloglucan endotransglycosylases: diversity of genes, enzymes and potential wall-modifying functions. Trends Plant Sci. 4: 361-366

53 Campbell, P. and Braam, J. (1999b) In vitro activities of four xyloglucan endotransglycosylases from Arabidopsis. Plant J. 18: 371-382.

Carpita, N.C. (1996) Structure and biogenesis of the cell walls of grasses. Ann Rev Plant Physiol Plant Mol Biol 47: 445-476

Carpita, N.C. and Gibeaut, D.M. (1993) Structural models of primary-cell walls in flowering plants - consistency of molecular-structure with the physical-properties of the walls during growth. Plant J. 3: 1-30

Carpita, N.C. and McCann, M. (2000) in Biochemistry and Molecular Biology of Plants (Buchanan, B., Gruissem, W., and Jones, R., eds), pp. 52-108

Carrard, G., Koivula, A., Soderlund, H. and Beguin, P. (2000) Cellulose-binding domains promote hydrolysis of different sites on crystalline cellulose. Proc. Natl. Acad. Sci. U S A 97: 10342-10347

Caspers, M.P.M., Lok, F., Sinjorgo, K.M.C., van Zeijl, M.J., Nielsen, K.A. and Cameron-Mills, V. (2001) Synthesis, processing and export of cytoplasmic endo-beta-1,4- xylanase from barley aleurone during germination. Plant J. 26: 191-204

Catalá, C., Rose, J.K., York, W.S., Albersheim, P., Darvill, A.G. and Bennett, A.B. (2001) Characterization of a tomato xyloglucan endotransglycosylase gene that is down- regulated by auxin in etiolated hypocotyls. Plant Physiol. 127: 1180-1192.

Cereghino, G.P.L., Cereghino, J.L., Ilgen, C. and Cregg, J.M. (2002) Production of recombinant proteins in fermenter cultures of the yeast Pichia pastoris. Curr. Opin. Biotechnol. 13: 329-332

Cereghino, G.P.L. and Cregg, J.M. (1999) Applications of yeast in biotechnology: protein production and genetic analysis. Curr. Opin. Biotechnol. 10: 422-427

Cereghino, J.L. and Cregg, J.M. (2000) Heterologous protein expression in the methylotrophic yeast Pichia pastoris. FEMS Microbiol. Rev. 24: 45-66

Chanliaud, E., De Silva, J., Strongitharm, B., Jeronimidis, G. and Gidley, M.J. (2004) Mechanical effects of plant cell wall enzymes on cellulose/xyloglucan composites. Plant J. 38: 27-37

Charnock, S.J., Bolam, D.N., Turkenburg, J.P., Gilbert, H.J., Ferreira, L.M., Davies, G.J. and Fontes, C.M. (2000) The X6 "thermostabilizing" domains of xylanases are carbohydrate-binding modules: structure and biochemistry of the Clostridium thermocellum X6b domain. Biochemistry 39: 5013-5021

Charnock, S.J., Lakey, J.H., Virden, R., Hughes, N., Sinnott, M.L., Hazlewood, G.P., Pickersgill, R. and Gilbert, H.J. (1997) Key residues in subsite F play a critical role in the activity of Pseudomonas fluorescens subspecies cellulosa xylanase A against xylooligosaccharides but not against highly polymeric substrates such as xylan. J. Biol. Chem. 272: 2942-2951

54 Charnock, S.J., Spurway, T.D., Xie, H.F., Beylot, M.H., Virden, R., Warren, R.A.J., Hazlewood, G.P. and Gilbert, H.J. (1998) The topology of the substrate binding clefts of glycosyl hydrolase family 10 xylanases are not conserved. J. Biol. Chem. 273: 32187-32199

Cicortas Gunnarsson, L., Nordberg Karlsson, E., Albrekt, A.S., Andersson, M., Holst, O. and Ohlin, M. (2004) A carbohydrate binding module as a diversity-carrying scaffold. Prot. Eng. Des. Sel. 17: 213-221

Cleemput, G., Hessing, M., vanOort, M., Deconynck, M. and Delcour, J.A. (1997a) Purification and characterization of a beta-D-xylosidase and an endo-xylanase from wheat flour. Plant Physiol. 113: 377-386

Cleemput, G., Van Laere, K., Hessing, M., Van Leuven, F., Torrekens, S. and Delcour, J.A. (1997b) Identification and characterization of a novel arabinoxylanase from wheat flour. Plant Physiol. 115: 1619-1627

Collins, T., Gerday, C. and Feller, G. (2005) Xylanases, xylanase families and extremophilic xylanases. FEMS Microbiol. Rev. 29: 3-23

Cosgrove, D.J. (1998) Cell wall loosening by expansins. Plant Physiol. 118: 333-339

Cosgrove, D.J. (2000a) Expansive growth of plant cell walls. Plant Physiol. Biochem. 38: 109-124

Cosgrove, D.J. (2000b) Loosening of plant cell walls by expansins. Nature 407: 321-326

Cosgrove, D.J. (2005) Growth of the plant cell wall. Nat. Rev. Mol. Cell Biol. 6: 850-861

Coutinho, P.M., Deleury, E., Davies, G.J. and Henrissat, B. (2003a) An evolving hierarchical family classification for glycosyltransferases. J. Mol. Biol. 328: 307-317

Coutinho, P.M., Stam, M., Blanc, E. and Henrissat, B. (2003b) Why are there so many carbohydrate-active enzyme-related genes in plants? Trends Plant Sci. 8: 563-565

Creagh, A.L., Ong, E., Jervis, E., Kilburn, D.G. and Haynes, C.A. (1996) Binding of the cellulose-binding domain of exoglucanase Cex from Cellulomonas fimi to insoluble microcrystalline cellulose is entropically driven. Proc. Natl. Acad. Sci. U S A 93: 12229- 12234

Daniel, G., Filonova, L., Kallas, Å.M. and Teeri, T.T. (2006) Use of the cellulose-binding module CBM1HjCel7A for morphological and chemical characterization of the gelatinous cell wall layer in tensio wood fibres of Populus tremula and Betula verrucosa. Submitted to Holzforschung

Darley, C.P., Forrester, A.M. and McQueen-Mason, S.J. (2001) The molecular basis of plant cell wall extension. Plant Mol. Biol. 47: 179-195.

Davies, G. and Henrissat, B. (1995) Structures and mechanisms of glycosyl hydrolases. Structure 3: 853-859

55 Davies, G.J., Wilson, K.S. and Henrissat, B. (1997) Nomenclature for sugar-binding subsites in glycosyl hydrolases. Biochem. J. 321: 557-559

Derewenda, U., Swenson, L., Green, R., Wei, Y.Y., Morosoli, R., Shareck, F., Kluepfel, D. and Derewenda, Z.S. (1994) Crystal structure, at 2.6 Ångstrom sesolution, of the Streptomyces lividans Xylanase A, a member of the F-family of Beta-1,4-D-Glycanases. J. Biol. Chem. 269: 20811-20814

Din, N., Damude, H.G., Gilkes, N.R., Miller, R.C., Jr., Warren, R.A. and Kilburn, D.G. (1994) C1-Cx revisited: intramolecular synergism in a cellulase. Proc. Natl. Acad. Sci. U S A 91: 11383-11387

Dixit, R., Cyr, R. and Gilroy, S. (2006) Using intrinsically fluorescent proteins for plant cell imaging. Plant J. 45: 599-615

Djerbi, S., Aspeborg, H., Nilsson, P., Sundberg, B., Mellerowicz, E., Blomqvist, K. and Teeri, T.T. (2004) Identification and expression analysis of genes encoding putative cellulose synthases (CesA) in the hybrid aspen, Populus tremula (L.) x P. tremuloides (Michx.). Cellulose 11: 301-312

Djerbi, S., Lindskog, M., Arvestad, L., Sterky, F. and Teeri, T.T. (2005) The genome sequence of black cottonwood (Populus trichocarpa) reveals 18 conserved cellulose synthase (CesA) genes. Planta 221: 739-746

Doblin, M.S., Kurek, I., Jacob-Wilk, D. and Delmer, D.P. (2002) Cellulose biosynthesis in plants: from genes to rosettes. Plant Cell Physiol. 43: 1407-1420

Ellgaard, L. and Helenius, A. (2003) Quality control in the endoplasmic reticulum. Nat. Rev. Mol. Cell Biol. 4: 181-191

Fagard, M., Hofte, H. and Vernhettes, S. (2000) Cell wall mutants. Plant Physiol. Biochem. 38: 15-25

Fanutti, C., Gidley, M.J. and Reid, J.S. (1993) Action of a pure xyloglucan endo- transglycosylase (formerly called xyloglucan-specific endo-(1-->4)-beta-D-glucanase) from the cotyledons of germinated nasturtium seeds. Plant J. 3: 691-700

Fanutti, C., Gidley, M.J. and Reid, J.S. (1996) Substrate subsite recognition of the xyloglucan endo-transglycosylase or xyloglucan-specific endo-(1-->4)-beta-D-glucanase from the cotyledons of germinated nasturtium (Tropaeolum majus L.) seeds. Planta 200: 221-228

Farkaš, V., Sulová, Z., Stratilová, E., Hanna, R. and Maclachlan, G. (1992) Cleavage of xyloglucan by nasturtium seed xyloglucanase and transglycosylation to xyloglucan subunit oligosaccharides. Arch. Biochem. Biophys. 298: 365-370

Flint, J., Bolam, D.N., Nurizzo, D., Taylor, E.J., Williamson, M.P., Walters, C., Davies, G.J. and Gilbert, H.J. (2005) Probing the mechanism of ligand recognition in family 29 carbohydrate-binding modules. J. Biol. Chem. 280: 23718-23726

56 Flint, J., Nurizzo, D., Harding, S.E., Longman, E., Davies, G.J., Gilbert, H.J. and Bolam, D.N. (2004) Ligand-mediated dimerization of a carbohydrate-binding module reveals a novel mechanism for protein-carbohydrate recognition. J. Mol. Biol. 337: 417-426

Fry, S.C. (1989) The structure and functions of xyloglucan. J. Exp. Bot. 40: 1-11

Fry, S.C., Smith, R.C., Renwick, K.F., Martin, D.J., Hodge, S.K. and Matthews, K.J. (1992) Xyloglucan endotransglycosylase, a new wall-loosening enzyme activity from plants. Biochem. J. 282: 821-828

Geisler-Lee, J., Geisler, M., Coutinho, P.M., Segerman, B., Nishikubo, N., Takahashi, J., Aspeborg, H., Djerbi, S., Master, E., Andersson-Gunnerås, S., Sundberg, B., Karpinski, S., Teeri, T.T., Kleczkowski, L.A., Henrissat, B. and Mellerowicz, E.J. (2006) Poplar carbohydrate-active enzymes. Gene identification and expression analyses. Plant Physiol. 140: 946-962

Gemmill, T.R. and Trimble, R.B. (1999) Overview of N- and O-linked oligosaccharide structures found in various yeast species. Biochim. Biophys. Acta 1426: 227-237

Gibeaut, D.M. (2000) Nucleotide sugars and glycosyltransferases for synthesis of cell wall matrix polysaccharides. Plant Physiol. Biochem. 38: 69-80

Gilbert, H.J. (2003) How carbohydrate binding modules overcome ligand complexity. Structure 11: 609-610

Gilbert, H.J. and Hazlewood, G.P. (1993) Bacterial Cellulases and Xylanases. J. Gen. Microbiol. 139: 187-194

Gilkes, N.R., Henrissat, B., Kilburn, D.G., Miller, R.C. and Warren, R.A.J. (1991) Domains in Microbial Beta-1,4-Glycanases - Sequence Conservation, Function, and Enzyme Families. Microbiol. Rev. 55: 303-315

Gilkes, N.R., Warren, R.A., Miller, R.C., Jr. and Kilburn, D.G. (1988) Precise excision of the cellulose binding domains from two Cellulomonas fimi cellulases by a homologous protease and the effect on catalysis. J. Biol. Chem. 263: 10401-10407

Grima-Pettenati, J. and Goffner, D. (1999) Lignin genetic engineering revisited. Plant Sci. 145: 51-65

Grishutin, S.G., Gusakov, A.V., Dzedzyulya, E.I. and Sinitsyn, A.P. (2006) A lichenase- like family 12 endo-(l -> 4)-beta-glucanase from Aspergillus japonicus: study of the substrate specificity and mode of action on beta-glucans in comparison with other glycoside hydrolases. Carbohydr. Res. 341: 218-229

Guivarch, A., Caissard, J.C., Azmi, A., Elmayan, T., Chriqui, D. and Tepfer, M. (1996) In situ detection of expression of the gus reporter gene transgenic plants: Ten years of blue genes. Transgen. Res. 5: 281-288

Gustavsson, M. (2003) Design of enzymatic systems for the modification of cellulose fiber surfaces. PhD thesis, Royal Institute of Technology, Stockholm, Sweden

57 Hahn, M., Keitel, T. and Heinemann, U. (1995) Crystal and molecular-structure at 0.16-nm resolution of the hybrid Bacillus Endo-1,3-1,4-Beta-D-Glucan 4-Glucanohydrolase H (A16- M). Eur. J. Biochem. 232: 849-858

Han, K.-H. (2001) Molecular biology of secondary growth. J. Plant Biotechnol. 3: 45-57

Hayashi, T. (1989) Xyloglucans in the primary cell wall. Annu. Rev. Plant Physiol. Plant Mol. Biol. 40: 139-168

Henriksson, H., Denman, S.E., Campuzano, I.D.G., Ademark, P., Master, E.R., Teeri, T.T. and Brumer, H. (2003) N-linked glycosylation of native and recombinant cauliflower xyloglucan endotransglycosylase 16A. Biochem. J. 375: 61-73

Henrissat, B., Coutinho, P.M. and Davies, G.J. (2001) A census of carbohydrate-active enzymes in the genome of Arabidopsis thaliana. Plant Mol. Biol. 47: 55-72

Henrissat, B. and Davies, G. (1997) Structural and sequence-based classification of glycoside hydrolases. Curr. Opin. Struct. Biol. 7: 637-644

Henrissat, B. and Davies, G.J. (2000) Glycoside hydrolases and glycosyltransferases. Families, modules, and implications for genomics. Plant Physiol. 124: 1515-1519

Henshaw, J.L., Bolam, D.N., Pires, V.M.R., Czjzek, M., Henrissat, B., Ferreira, L.M.A., Fontes, C.M.G.A. and Gilbert, H.J. (2004) The family 6 carbohydrate binding module CmCBM6-2 contains two ligand-binding sites with distinct specificities. J. Biol. Chem. 279: 21552-21559

Hertzberg, M., Aspeborg, H., Schrader, J., Andersson, A., Erlandsson, R., Blomqvist, K., Bhalerao, R., Uhlén, M., Teeri, T.T., Lundeberg, J., Sundberg, B., Nilsson, P. and Sandberg, G. (2001) A transcriptional roadmap to wood formation. Proc. Natl. Acad. Sci. U S A 98: 14732-14737

Hildén, L., Daniel, G. and Johansson, G. (2003) Use of a fluorescence labelled, carbohydrate-binding module from Phanerochaete chrysosporium Cel7D for studying wood cell wall ultrastructure. Biotech. Lett. 25: 553-558

Hogg, D., Pell, G., Dupree, P., Goubet, F., Martin-Orue, S.M., Armand, S. and Gilbert, H.J. (2003) The modular architecture of Cellvibrio japonicus mannanases in glycoside hydrolase families 5 and 26 points to differences in their role in mannan degradation. Biochem. J. 371: 1027-1043

Hu, X.Y., Shi, Q.W., Yang, T. and Jackowski, G. (1996) Specific replacement of consecutive AGG codons results in high-level expression of human cardiac troponin T in Escherichia coli. Prot. Expr. Purif. 7: 289-293

Itakura, K., Hirose, T., Crea, R., Riggs, A.D., Heyneker, H.L., Bolivar, F. and Boyer, H.W. (1977) Expression in Escherichia coli of a chemically synthesized gene for hormone somatostatin. Science 198: 1056-1063

58 Jahic, M. (2003) Process techniques for production of recombinant proteins with Pichia pastoris. PhD thesis, Royal Institute of Technology, Stockholm, Sweden

Jefferson, R.A., Kavanagh, T.A. and Bevan, M.W. (1987) Gus Fusions: Beta- glucuronidase as a sensitive and versatile gene fusion marker in higher plants. Embo J. 6: 3901-3907

Johansson, P., Brumer III, H., Baumann, M.J., Kallas, Å.M., Henriksson, H., Denman, S.E., Teeri, T.T. and Jones, T.A. (2004) Crystal structures of a xyloglucan endotransglycosylase reveal details of the transglycosylation acceptor binding. Plant Cell 16: 874-886

Johnson, P.E., Joshi, M.D., Tomme, P., Kilburn, D.G. and McIntosh, L.P. (1996) Structure of the N-terminal cellulose-binding domain of Cellulomonas fimi CenC determined by nuclear magnetic resonance spectroscopy. Biochemistry 35: 14381-14394

Jones, J., Krag, S.S. and Betenbaugh, M.J. (2005) Controlling N-linked glycan site occupancy. Biochim. Biophys. Acta 1726: 121-137

Keegstra, K. and Raikhel, N. (2001) Plant glycosyltransferases. Current Opinion in Plant Biology 4: 219-224

Keitel, T., Simon, O., Borriss, R. and Heinemann, U. (1993) Molecular and active-site structure of a Bacillus 1,3-1,4-beta-glucanase. Proc. Natl. Acad. Sci. U S A 90: 5287-5291

Kirst, M., Johnson, A.F., Baucom, C., Ulrich, E., Hubbard, K., Staggs, R., Paule, C., Retzel, E., Whetten, R. and Sederoff, R. (2003) Apparent homology of expressed genes from wood-forming tissues of loblolly pine (Pinus taeda L.) with Arabidopsis thaliana. Proc. Natl. Acad. Sci. U S A 100: 7383-7388

Kleber-Janke, T. and Becker, W.M. (2000) Use of modified BL21(DE3) Escherichia coli cells for high-level expression of recombinant peanut allergens affected by poor codon usage. Prot. Expr. Purif. 19: 419-424

Knox, J.P. (1997) The use of antibodies to study the architecture and developmental regulation of plant cell walls. Int. Rev. Cytol. 171: 79-120

Koshland, D.E. (1953) Stereochemistry and the mechanism of enzymatic reactions. Biol. Rev. 28: 416-436

Kraulis, J., Clore, G.M., Nilges, M., Jones, T.A., Pettersson, G., Knowles, J. and Gronenborn, A.M. (1989) Determination of the three-dimensional solution structure of the C-terminal domain of cellobiohydrolase I from Trichoderma reesei. A study using nuclear magnetic resonance and hybrid distance geometry-dynamical simulated annealing. Biochemistry 28: 7241-7257

Krysan, P.J., Young, J.C. and Sussman, M.R. (1999) T-DNA as an insertional mutagen in Arabidopsis. Plant Cell 11: 2283-2290

59 Kudlicka, K. and Brown, J.R.M. (1997) Cellulose and callose biosynthesis in higher plants (solubilization and separation of (1->3)- and (1->4)-beta-glucan synthase activities from mung bean). Plant Physiol. 115: 643-656

Kulkarni, N., Shendye, A. and Rao, M. (1999) Molecular and biotechnological aspects of xylanases. FEMS Microbiol. Rev. 23: 411-456

Kurland, C. and Gallant, J. (1996) Errors of heterologous protein expression. Curr. Opin. Biotechnol. 7: 489-493

Lai-Kee-Him, J., Chanzy, H., Muller, M., Putaux, J.L., Imai, T. and Bulone, V. (2002) In vitro versus in vivo cellulose microfibrils from plant primary wall synthases: structural differences. J. Biol. Chem. 277: 36931-36939

Langsford, M.L., Gilkes, N.R., Singh, B., Moser, B., Miller, R.C., Warren, R.A.J. and Kilburn, D.G. (1987) Glycosylation of bacterial cellulases prevents proteolytic cleavage between functional domains. FEBS Lett 225: 163-167

Lehtiö, J., Sugiyama, J., Gustavsson, M., Fransson, L., Linder, M. and Teeri, T.T. (2003) The binding specificity and affinity determinants of family 1 and family 3 cellulose binding modules. Proc. Natl. Acad. Sci. U S A 100: 484-489

Lehtiö, J., Teeri, T.T. and Nygren, P.Å. (2000) Alpha-amylase inhibitors selected from a combinatorial library of a cellulose binding domain scaffold. Prot. Struct. Funct. Gen. 41: 316-322

Levy, I., Nussinovitch, A., Shpigel, E. and Shoseyov, O. (2002b) Recombinant cellulose crosslinking protein: a novel paper-modification biomaterial. Cellulose 9: 91-98

Levy, I. and Shoseyov, O. (2002a) Cellulose-binding domains biotechnological applications. Biotechnol. Adv. 20: 191-213

Levy, S., Maclachlan, G. and Staehelin, L.A. (1997) Xyloglucan sidechains modulate binding to cellulose during in vitro binding assays as predicted by conformational dynamics simulations. Plant J. 11: 373-386

Lienart, Y., Comtat, J. and Barnoud, F. (1985) Purification of cell wall beta-D-xylanases from Acacia cultured cells. Plant Sci. 41: 91-96

Linder, M., Lindeberg, G., Reinikainen, T., Teeri, T.T. and Pettersson, G. (1995b) The difference in affinity between 2 fungal cellulose-binding domains is dominated by a single amino acid substitution. FEBS Lett. 372: 96-98

Linder, M., Mattinen, M.L., Kontteli, M., Lindeberg, G., Stahlberg, J., Drakenberg, T., Reinikainen, T., Pettersson, G. and Annila, A. (1995a) Identification of functionally important amino acids in the cellulose-binding domain of Trichoderma Reesei Cellobiohydrolase I. Protein Sci. 4: 1056-1064

60 Linder, M., Salovuori, I., Ruohonen, L. and Teeri, T.T. (1996) Characterization of a double cellulose-binding domain. Synergistic high affinity binding to crystalline cellulose. J. Biol. Chem. 271: 21268-21272

Linder, M. and Teeri, T.T. (1996) The cellulose-binding domain of the major cellobiohydrolase of Trichoderma reesei exhibits true reversibility and a high exchange rate on crystalline cellulose. Proc. Natl. Acad. Sci. U S A 93: 12251-12255

Lorences, E.P. and Fry, S.C. (1993) Xyloglucan oligosaccharides with at least two alfa-D- xylose residues act as acceptor substrates for xyloglucan endotransglycosylase and promote the depolymerisation of xyloglucan. Physiol. Plant. 88: 105-112

MacLeod, A.M., Lindhorst, T., Withers, S.G. and Warren, R.A.J. (1994) The acid/base catalyst in the exoglucanase/xylanase from Cellulomonas Fimi is glutamic acid 127: evidence from detailed kinetic studies of mutants. Biochemistry 33: 6371-6376

MacLeod, A.M., Tull, D., Rupitz, K., Warren, R.A.J. and Withers, S.G. (1996) Mechanistic consequences of mutation of active site carboxylates in a retaining beta-1,4- glycanase from Cellulomonas fimi. Biochemistry 35: 13165-13172

Malinowski, R., Filipecki, M., Tagashira, N., Wisniewska, A., Gaj, P., Plader, W. and Malepszy, S. (2004) Xyloglucan endotransglucosylase/hydrolase genes in cucumber (Cucumis sativus) - differential expression during somatic embryogenesis. Physiol. Plant. 120: 678-685

Maras, M., van Die, I., Contreras, R. and van den Hondel, C.A.M.J.J. (1999) Filamentous fungi as production organisms for glycoproteins of bio-medical interest. Glycoconj. J. 16: 99- 107

Master, E.R., Rudsander, U.J., Zhou, W., Henriksson, H., Divne, C., Denman, S., Wilson, D.B. and Teeri, T.T. (2004) Recombinant expression and enzymatic characterization of PttCel9A, a KOR homologue from Populus tremula x tremuloides. Biochemistry 43: 10080-10089

McCann, M.C. (1997) Tracheary element formation: building up to a dead end. Trends Plant Sci. 2: 333-338

McCartney, L., Blake, A.W., Flint, J., Bolam, D.N., Boraston, A.B., Gilbert, H.J. and Knox, J.P. (2006) Differential recognition of plant cell walls by microbial xylan-specific carbohydrate-binding modules. Proc. Natl. Acad. Sci. U S A 103: 4765-4770

McCartney, L., Gilbert, H.J., Bolam, D.N., Boraston, A.B. and Knox, J.P. (2004) Glycoside hydrolase carbohydrate-binding modules as molecular probes for the analysis of plant cell wall polymers. Anal. Biochem. 326: 49-54

McLean, B.W., Bray, M.R., Boraston, A.B., Gilkes, N.R., Haynes, C.A. and Kilburn, D.G. (2000) Analysis of binding of the family 2a carbohydrate-binding module from Cellulomonas fimi xylanase 10A to cellulose: specificity and identification of functionally important amino acid residues. Prot. Engineering 13: 801-809

61 McQueen-Mason, S., Durachko, D.M. and Cosgrove, D.J. (1992) Two endogenous proteins that induce cell wall extension in plants. Plant Cell 4: 1425-1433

Meissner, K., Wassenberg, D. and Liebl, W. (2000) The thermostabilizing domain of the modular xylanase XynA of Thermotoga maritima represents a novel type of binding domain with affinity for soluble xylan and mixed-linkage beta-1,3/beta-1,4-glucan. Mol. Microbiol. 36: 898-912

Mellerowicz, E.J., Baucher, M., Sundberg, B. and Boerjan, W. (2001) Unravelling cell wall formation in the woody dicot stem. Plant Mol. Biol. 47: 239-274

Michel, G., Chantalat, L., Duee, E., Barbeyron, T., Henrissat, B., Kloareg, B. and Dideberg, O. (2001) The kappa-carrageenase of P. carrageenovora features a tunnel-shaped active site: A novel insight in the evolution of clan B glycoside hydrolases. Structure 9: 513- 525

Misawa, S. and Kumagai, I. (1999) Refolding of therapeutic proteins produced in Escherichia coli as inclusion bodies. Biopolymers 51: 297-307

Moks, T., Abrahmsen, L., Holmgren, E., Bilich, M., Olsson, A., Uhlen, M., Pohl, G., Sterky, C., Hultberg, H., Josephson, S., Holmgren, A., Jornvall, H. and Nilsson, B. (1987) Expression of human Insulin-like growth factor-I in Bacteria: Use of optimized gene fusion vectors to facilitate protein purification. Biochemistry 26: 5239-5244

Montesino, R., Garcia, R., Quintero, O. and Cremata, J.A. (1998) Variation in N-linked oligosaccharide structures on heterologous proteins secreted by the methylotrophic yeast Pichia pastoris. Prot. Expr. Purif. 14: 197-207.

Nakamura, T., Yokoyama, R., Tomita, E. and Nishitani, K. (2003) Two azuki bean XTH Genes, VaXTH1 and VaXTH2, with similar tissue- specific expression profiles, are differently regulated by auxin. Plant Cell Physiol. 44: 16-24

Nicol, F., His, I., Jauneau, A., Vernhettes, S., Canut, H. and Hofte, H. (1998) A plasma membrane-bound putative endo-1,4-beta-D-glucanase is required for normal wall assembly and cell elongation in Arabidopsis. Embo J 17: 5563-5576

Nieminen, K.M., Kauppinen, L. and Helariutta, Y. (2004) A weed for wood? Arabidopsis as a genetic model for xylem development. Plant Physiol. 135: 653-659

Nishitani, K. (2005) Division of roles among members of the XTH gene family in plants. Plant Biosystems 139: 98-101

Nishitani, K. and Tominaga, R. (1992) Endo-xyloglucan transferase, a novel class of glycosyltransferase that catalyzes transfer of a segment of xyloglucan molecule to another xyloglucan molecule. J. Biol. Chem. 267: 21058-21064

Notenboom, V., Boraston, A.B., Kilburn, D.G. and Rose, D.R. (2001) Crystal structures of the family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A in native and ligand-bound forms. Biochemistry 40: 6248-6256

62 Notenboom, V., Boraston, A.B., Williams, S.J., Kilburn, D.G. and Rose, D.R. (2002) High-resolution crystal structures of the lectin-like xylan binding domain from Streptomyces lividans xylanase 10A with bound substrates reveal a novel mode of xylan binding. Biochemistry 41: 4246-4254

Oh, M.H., Romanow, W.G., Smith, R.C., Zamski, E., Sasse, J. and Clouse, S.D. (1998) Soybean BRU1 encodes a functional xyloglucan endotransglycosylase that is highly expressed in inner epicotyl tissues during brassinosteroid-promoted elongation. Plant Cell Physiol. 39: 124-130

Pandey, A. and Mann, M. (2000) Proteomics to study genes and genomes. Nature 405: 837- 846

Parinov, S. and Sundaresan, V. (2000) Functional genomics in Arabidopsis: large-scale insertional mutagenesis complements the genome sequencing project. Curr. Opin. Biotech. 11: 157-161

Pauly, M., Albersheim, P., Darvill, A. and York, W.S. (1999) Molecular domains of the cellulose/xyloglucan network in the cell walls of higher plants. Plant J. 20: 629-639

Pear, J.R., Kawagoe, Y., Schrekengost, W.E., Delmer, D.P. and D.M., S. (1996) Higher plants contain homologs of the bacterial celA genes encoding the catalytic subunit of cellulose synthase. Proc. Natl. Acad. Sci. U S A 93: 12637-12642

Pell, G., Williamson, M.P., Walters, C., Du, H.M., Gilbert, H.J. and Bolam, D.N. (2003) Importance of hydrophobic and polar residues in ligand binding in the family 15 carbohydrate-binding module from Cellvibrio japonicus Xyn10C. Biochemistry 42: 9316- 9323

Pelosi, L., Imai, T., Chanzy, H., Heux, L., Buhler, E. and Bulone, V. (2003) Structural and morphological diversity of (1 -> 3)-beta-D-glucans synthesized in vitro by enzymes from Saprolegnia monoica. Comparison with a corresponding in vitro product from blackberry (Rubus fruticosus). Biochemistry 42: 6264-6274

Pere, J., Siikaaho, M., Buchert, J. and Viikari, L. (1995) Effects of purified Trichoderma reesei cellulases on the fiber properties of kraft pulp. Tappi Journal 78: 71-78

Perrin, R., Wilkerson, C. and Keegstra, K. (2001) Golgi enzymes that synthesize plant cell wall polysaccharides: finding and evaluating candidates in the genomic era. Plant Mol. Biol. 47: 115-130

Planas, A. (2000) Bacterial 1,3-1,4-beta-glucanases: structure, function and protein engineering. Biochim. Biophys. Acta 1543: 361-382

Plomion, C., Leprovost, G. and Stokes, A. (2001) Wood formation in trees. Plant Physiol. 127: 1513-1523

Polizeli, M.L.T.M., Rizzatti, A.C.S., Monti, R., Terenzi, H.F., Jorge, J.A. and Amorim, D.S. (2005) Xylanases from fungi: properties and industrial applications. Appl. Microbiol. Biotechnol. 67: 577-591

63 Raven, P.H., Evert, R.F. and Eichhorn, S.E. (1999) Biology of plants, W.H. Freeman and Company, New York, USA

Rayle, D.L. and Cleland, R.E. (1992) The acid growth theory of auxin-induced cell elongation is alive and well. Plant Physiol. 99: 1271-1274

Rayon, C., Lerouge, P. and Faye, L. (1998) The protein N-glycosylation in plants. J. Exp. Bot. 49: 1463-1472

Reid, J.S.G. (1997) in Plant biochemistry (Dey, P. M., and Harborne, J. B., eds), Academic Press, San Diego, USA

Reidy, B., Nosberger, J. and Fleming, A. (2001) Differential expression of XET-related genes in the leaf elongation zone of F. pratensis. J. Exp. Bot. 52: 1847-1856

Reinikainen, T., Ruohonen, L., Nevanen, T., Laaksonen, L., Kraulis, P., Jones, T.A., Knowles, J.K.C. and Teeri, T.T. (1992) Investigation of the function of mutated cellulose- binding domains of Trichoderma Reesei Cellobiohydrolase I. Prot. Struct. Func. Gen. 14: 475-482

Richmond, T.A. and Somerville, C.R. (2000) The cellulose synthase superfamily. Plant Physiol. 124: 495-498

Rose, J.K. and Bennett, A.B. (1999) Cooperative disassembly of the cellulose-xyloglucan network of plant cell walls: parallels between cell expansion and fruit ripening. Trends Plant Sci. 4: 176-183

Rose, J.K., Braam, J., Fry, S.C. and Nishitani, K. (2002) The XTH family of enzymes involved in xyloglucan endotransglucosylation and endohydrolysis: current perspectives and a new unifying nomenclature. Plant Cell Physiol. 43: 1421-1435

Rose, J.K., Brummell, D.A. and Bennett, A.B. (1996) Two divergent xyloglucan endotransglycosylases exhibit mutually exclusive patterns of expression in nasturtium. Plant Physiol. 110: 493-499

Rossignol, M. (2001) Analysis of the plant proteome. Curr. Opin. Biotechnol. 12: 131-134

Saura-Valls, M., Faure, R., Ragas, S., Piens, K., Brumer, H., Teeri, T.T., Cottaz, S., Driguez, H. and Planas, A. (2006) Kinetic analysis using low-molecular mass xyloglucan oligosaccharides defines the catalytic mechanism of a Populus xyloglucan endotransglycosylase. Biochemical Journal 395: 99-106

Schärpf, M., Connelly, G.P., Lee, G.M., Boraston, A.B., Warren, R.A.J. and McIntosh, L.P. (2002) Site-specific characterization of the association of xylooligosaccharides with the CBM13 lectin-like xylan binding domain from Streptomyces lividans xylanase 10A by NMR spectroscopy. Biochemistry 41: 4255-4263

Scheible, W.R. and Pauly, M. (2004) Glycosyltransferases and cell wall biosynthesis: novel players and insights. Curr. Opin. Plant Biol. 7: 285-295

64 Schmidt, A., Schlacher, A., Steiner, W., Schwab, H. and Kratky, C. (1998) Structure of the xylanase from Penicillium simplicissimum. Protein Sci. 7: 2081-2088

Schrader, J., Nilsson, J., Mellerowicz, E., Berglund, A., Nilsson, P., Hertzberg, M. and Sandberg, G. (2004) A high-resolution transcript profile across the wood-forming meristem of poplar identifies potential regulators of cambial stem cell identity. Plant Cell 16: 2278- 2292

Schröder, R., Atkinson, R.G., Langenkamper, G. and Redgwell, R.J. (1998) Biochemical and molecular characterisation of xyloglucan endotransglycosylase from ripe kiwifruit. Planta 204: 242-251

Schröder, R., Wegrzyn, T.F., Bolitho, K.M. and Redgwell, R.J. (2004) Mannan transglycosylase: a novel enzyme activity in cell walls of higher plants. Planta 219: 590-600

Shen, H., Schmuck, M., Pilz, I., Gilkes, N.R., Kilburn, D.G., Miller, R.C., Jr. and Warren, R.A. (1991) Deletion of the linker connecting the catalytic and cellulose-binding domains of endoglucanase A (CenA) of Cellulomonas fimi alters its conformation and catalytic activity. J. Biol. Chem. 266: 11335-11340

Simpson, D.J., Fincher, G.B., Huang, A.H.C. and Cameron-Mills, V. (2003) Structure and function of cereal and related higher plant (1 -> 4)-beta-xylan endohydrolases. J. Cereal Sci. 37: 111-127

Sinnott, M.L. (1990) Catalytic mechanisms of enzymic glycosyl transfer. Chem. Rev. 90: 1171-1202

Slade, A.M., Hoj, P.B., Morrice, N.A. and Fincher, G.B. (1989) Purification and characterization of three (1,4)-beta-D-xylan endohydrolases from germinated barley. Eur. J. Biochem. 185: 533-539

Smith, G.P., Patel, S.U., Windass, J.D., Thornton, J.M., Winter, G. and Griffiths, A.D. (1998) Small binding proteins selected from a combinatorial repertoire of knottins displayed on phage. J. Mol. Biol. 277: 317-332

Sorensen, H.P. and Mortensen, K.K. (2005) Advanced genetic strategies for recombinant protein expression in Escherichia coli. J. Biotechnol. 115: 113-128

Ståhl, S., Sjölander, A., Nygren, P.Å., Berzins, K., Perlmann, P. and Uhlén, M. (1989) A dual expression system for the generation, analysis and purification of antibodies to a repeated sequence of the Plasmodium falciparum antigen Pf155/RESA. J. Immunol. Meth. 124: 43-52

Steele, N.M. and Fry, S.C. (2000) Differences in catalytic properties between native isoenzymes of xyloglucan endotransglycosylase (XET). Phytochem. 54: 667-680

Sterky, F., Bhalerao, R.R., Unneberg, P., Segerman, B., Nilsson, P., Brunner, A.M., Charbonnel-Campaa, L., Lindvall, J.J., Tandre, K., Strauss, S.H., Sundberg, B., Gustafsson, P., Uhlen, M., Bhalerao, R.P., Nilsson, O., Sandberg, G., Karlsson, J.,

65 Lundeberg, J. and Jansson, S. (2004) A Populus EST resource for plant functional genomics. Proc. Natl. Acad. Sci. U S A 101: 13951-13956

Sterky, F., Regan, S., Karlsson, J., Hertzberg, M., Rohde, A., Holmberg, A., Amini, B., Bhalerao, R., Larsson, M., Villarroel, R., Van Montagu, M., Sandberg, G., Olsson, O., Teeri, T.T., Boerjan, W., Gustafsson, P., Uhlén, M., Sundberg, B. and Lundeberg, J. (1998) Gene discovery in the wood-forming tissues of poplar: analysis of 5,692 expressed sequence tags. Proc. Natl. Acad. Sci. U S A 95: 13330-13335

Sulová, Z., Baran, R. and Farkaš, V. (2003) Divergent modes of action on xyloglucan of two isoenzymes of xyloglucan endo-transglycosylase from Tropaeolum majus. Plant. Physiol. Biochem. 41: 431-437

Sulová, Z. and Farkaš, V. (1998) Kinetic evidence of the existence of a stable enzyme- glycosyl intermediary complex in the reaction catalyzed by endotransglycosylase. Gen. Physiol. Biophys. 17: 133-142

Sulová, Z., Lednická, M. and Farkaš, V. (1995) A colorimetric assay for xyloglucan- endotransglycosylase from germinating seeds. Anal. Biochem. 229: 80-85

Sulová, Z., Takácová, M., Steele, N.M., Fry, S.C. and Farkaš, V. (1998) Xyloglucan endotransglycosylase: evidence for the existence of a relatively stable glycosyl-enzyme intermediate. Biochem. J. 330: 1475-1480

Sumner, J.B. and Somers, G.F. (1949) in Laboratory experminets in biological chemistry, pp. 38-39, Academic Press, New York

Sunna, A. and Antranikian, G. (1997) Xylanolytic enzymes from fungi and bacteria. Crit. Rev. Biotechnol. 17: 39-67

Suurnäkki, A., Tenkanen, M., Buchert, J. and Viikari, L. (1997) Hemicellulases in the bleaching of chemical pulps. Adv. Biochem. Eng. Biotechnol. 57: 261-287

Suzuki, M., Kato, A., Nagata, N. and Komeda, Y. (2002) A xylanase, AtXyn1, is predominantly expressed in vascular bundles, and four putative xylanase genes were identified in the Arabidopsis thaliana genome. Plant Cell Physiol. 43: 759-767

Tabuchi, A., Kamisaka, S. and Hoson, T. (1997) Purification of xyloglucan hydrolase/endotransferase from cell walls of azuki bean epicotyls. Plant Cell Physiol. 38: 653-658

Tabuchi, A., Mori, H., Kamisaka, S. and Hoson, T. (2001) A new type of endo-xyloglucan transferase devoted to xyloglucan hydrolysis in the cell wall of azuki bean epicotyls. Plant Cell Physiol. 42: 154-161

Tanaka, K., Murata, K., Yamazaki, M., Onosato, K., Miyao, A. and Hirochika, H. (2003) Three distinct rice cellulose synthase catalytic subunit genes required for cellulose synthesis in the secondary wall. Plant Physiol. 133: 73-83

66 Taylor, G. (2002) Populus: Arabidopsis for forestry. Do we need a model tree? Annals of Botany 90: 681-689

Teeri, T.T. (1997) Crystalline cellulose degradation: New insight into the function of cellobiohydrolases. Trends Biotechnol. 15: 160-167

Teeri, T.T. and Brumer, H. (2003) Discovery, characterisation and applications of enzymes from the wood-forming tissues of poplar: Glycosyl transferases and xyloglucan endotransglycosylases. Biocat. Biotransf. 21: 173-179

Teeri, T.T., Lehtovaara, P., Kauppinen, S., Salovuori, I. and Knowles, J. (1987) Homologous domains in Trichoderma reesei cellulolytic enzymes: gene sequence and expression of cellobiohydrolase II. Gene 51: 43-52

Tempel, W., Liu, Z.J., Horanyi, P.S., Deng, L., Lee, D., Newton, G.N., Rose, J.P., Ashida, H., Li, S.C., Li, Y.T. and Wang, B.C. (2005) Three-dimensional structure of GlcNAc alpha- 1,4-Gal releasing endo-beta-galactosidase from Clostridium perfringens. Prot. Struct. Funct. Bioinform. 59: 141-144

Thompson, J.E. and Fry, S.C. (2001) Restructuring of wall-bound xyloglucan by transglycosylation in living plant cells. Plant J. 26: 23-34

Thomson, J.A. (1993) Molecular biology of xylan degradation. FEMS Microbiol. Rev. 104: 65-82

Timell, T.E. (1967) Recent progress in the chemistry of wood hemicelluloses. Wood. Sci. Techn. 1: 45-70

Tomme, P., Boraston, A., Kormos, J.M., Warren, R.A.J. and Kilburn, D.G. (2000) Affinity electrophoresis for the identification and characterization of soluble sugar binding by carbohydrate-binding modules. Enzyme Microb. Tech. 27: 453-458

Tomme, P., Boraston, A., McLean, B., Kormos, J., Creagh, A.L., Sturch, K., Gilkes, N.R., Haynes, C.A., Warren, R.A.J. and Kilburn, D.G. (1998) Characterization and affinity applications of cellulose-binding domains. J. Chromatography B 715: 283-296

Tomme, P., Creagh, A.L., Kilburn, D.G. and Haynes, C.A. (1996) Interaction of polysaccharides with the N-terminal cellulose-binding domain of Cellulomonas fimi CenC. Binding specificity and calorimetric analysis. Biochemistry 35: 13885-13894

Tomme, P., van Tilbeurgh, H., Pettersson, G., Vandamme, J., Vandekerckhove, J., Knowles, J., Teeri, T. and Claeyssens, M. (1988) Studies of the cellulolytic system of Trichoderma reesei QM 9414. Analysis of domain function in two cellobiohydrolases by limited proteolysis. Eur. J. Biochem. 170: 575-581

Tormo, J., Lamed, R., Chirino, A.J., Morag, E., Bayer, E.A., Shoham, Y. and Steitz, T.A. (1996) Crystal structure of a bacterial family-III cellulose-binding domain: A general mechanism for attachment to cellulose. Embo J. 15: 5739-5751

67 Tsai, L.C., Shyur, L.F., Lee, S.H., Lin, S.S. and Yuan, H.S. (2003) Crystal structure of a natural circularly permuted jellyroll protein: 1,3-1,4-beta-d-glucanase from Fibrobacter succinogenes. J. Mol. Biol. 330: 607-620

Tunnicliffe, R.B., Bolam, D.N., Pell, G., Gilbert, H.J. and Williamson, M.P. (2005) Structure of a mannan-specific family 35 carbohydrate-binding module: Evidence for significant conformational changes upon ligand binding. J. Mol. Biol. 347: 287-296

Turner, S.R., Taylor, N. and Jones, L. (2001) Mutations of the secondary cell wall. Plant Mol. Biol. 47: 209-219

Uozu, S., Tanaka-Ueguchi, M., Kitano, H., Hattori, K. and Matsuoka, M. (2000) Characterization of XET-related genes of rice. Plant Physiol. 122: 853-859

Vaaje-Kolstad, G., Houston, D.R., Riemen, A.H.K., Eijsink, V.G.H. and van Aalten, D.M.F. (2005) Crystal structure and binding properties of the Serratia marcescens chitin- binding protein CBP21. J. Biol. Chem. 280: 11313-11319 van Tilbeurgh, H., Tomme, P., Claeyssens, M., Bhikhabhai, R. and Pettersson, G. (1986) Limited proteolysis of the cellobiohydrolase I from Trichoderma reesei. FEBS Lett. 204: 223- 227

Vanzin, G.F., Madson, M., Carpita, N.C., Raikhel, N.V., Keegstra, K. and Reiter, W.D. (2002) The mur2 mutant of Arabidopsis thaliana lacks fucosylated xyloglucan because of a lesion in fucosyltransferase AtFUT1. Proc. Natl. Acad. Sci. U S A 99: 3340-3345

Vardakou, M., Flint, J., Christakopoulos, P., Lewis, R.J., Gilbert, H.J. and Murray, J.W. (2005) A family 10 Thermoascus aurantiacus xylanase utilizes arabinose decorations of xylan as significant substrate specificity determinants. J. Mol. Biol. 352: 1060-1067

Vervecken, W., Kaigorodov, V., Callewaert, N., Geysens, S., De Vusser, K. and Contreras, R. (2004) In vivo synthesis of mammalian-like, hybrid-type N-glycans in Pichia pastoris. Appl. Environ. Microb. 70: 2639-2646

Vierhuis, E., York, W.S., Kolli, V.S.K., Vincken, J.P., Schols, H.A., Van Alebeek, G.J.W.M. and Voragen, A.G.J. (2001) Structural analyses of two arabinose containing oligosaccharides derived from olive fruit xyloglucan: XXSG and XLSG. Carbohydr. Res. 332: 285-297

Viikari, L., Kantelinen, A., Sundquist, J. and Linko, M. (1994) Xylanases in bleaching - from an idea to the industry. FEMS Microbiol. Rev. 13: 335-350

Vincken, J.P., York, W.S., Beldman, G. and Voragen, A.G.J. (1997) Two general branching patterns of xyloglucan, XXXG and XXGG. Plant Physiol. 114: 9-13

Vissenberg, K., Martinez-Vilchez, I.M., Verbelen, J.P., Miller, J.G. and Fry, S.C. (2000) In vivo colocalization of xyloglucan endotransglycosylase activity and its donor substrate in the elongation zone of Arabidopsis roots. Plant Cell 12: 1229-1237

68 Vissenberg, K., Oyama, M., Osato, Y., Yokoyama, R., Verbelen, J.P. and Nishitani, K. (2005b) Differential expression of AtXTH17, AtXTH18, AtXTH19 and AtXTH20 genes in Arabidopsis roots. Physiological roles in specification in cell wall construction. Plant Cell Physiol. 46: 192-200

Waterhouse, P.M. and Helliwell, C.A. (2003) Exploring plant genomes by RNA-induced gene silencing. Nat. Rev. Gen. 4: 29-38

Willats, W.G.T., Steele-King, C.G., McCartney, L., Orfila, C., Marcus, S.E. and Knox, J.P. (2000) Making and using antibody probes to study plant cell walls. Plant Physiology and Biochemistry 38: 27-36

Wilson, I.B.H. (2002) Glycosylation of proteins in plants and invertebrates. Curr. Opin. Struct. Biol. 12: 569-577

Withers, S.G. (2001) Mechanisms of glycosyl transferases and hydrolases. Carbohydr. pol. 44: 325-337

Xie, H.F., Bolam, D.N., Nagy, T., Szabo, L., Cooper, A., Simpson, P.J., Lakey, J.H., Williamson, M.P. and Gilbert, H.J. (2001a) Role of hydrogen bonding in the interaction between a xylan binding module and xylan. Biochemistry 40: 5700-5707

Xie, H.F., Gilbert, H.J., Charnock, S.J., Davies, G.J., Williamson, M.P., Simpson, P.J., Raghothama, S., Fontes, C.M.G.A., Dias, F.M.V., Ferreira, L.M.A. and Bolam, D.N. (2001b) Clostridium thermocellum Xyn10B carbohydrate-binding module 22-2: The role of conserved amino acids in ligand binding. Biochemistry 40: 9167-9176

Xu, W., Campbell, P., Vargheese, A.K. and Braam, J. (1996) The Arabidopsis XET- related gene family: Environmental and hormonal regulation of expression. Plant J. 9: 879- 889

Yokoyama, R. and Nishitani, K. (2001) A comprehensive expression analysis of all members of a gene family encoding cell-wall enzymes allowed us to predict cis-regulatory regions involved in cell-wall construction in specific organs of Arabidopsis. Plant Cell Physiol. 42: 1025-1033

Yokoyama, R., Rose, J.K.C. and Nishitani, K. (2004) A surprising diversity and abundance of xyloglucan endotransglucosylase/hydrolases in rice. Classification and expression analysis. Plant Physiol. 134: 1088-1099

69