
© Johan Larsbrink, 2013

ISBN 978-91-7501-834-8 ISSN 1654-2312 TRITA-BIO Report 2013:13

Division of Glycoscience, School of Biotechnology, Royal Institute of Technology, AlbaNova University Centre, SE-106 91 Stockholm, Sweden

Printed at US-AB Universitetsservice

ABSTRACT

The focus of this thesis is a comparative study of approaches to the discovery of carbohydrate-active enzymes (CAZymes). CAZymes synthesise, bind to, and degrade the multitude of carbohydrates found in nature. As such, when aiming for sustainable methods to degrade plant biomass for the generation of biofuels, for which there is a strong drive in society, CAZymes are a natural source of environmentally friendly molecular tools.

In nature, microorganisms are the principal degraders of carbohydrates. Not only do they degrade plant matter in forests and aquatic habitats, they also break down the majority of carbohydrates ingested by animals. These symbiotic microorganisms, known as the microbiota, reside in animal digestive tracts in immense quantities, where complex carbohydrates are one of the key nutrient sources. Thus, microorganisms are a plentiful source of CAZymes, and strategies for the discovery of new enzymes from bacterial sources have been the basis for the work presented here, combined with biochemical characterisation of several enzymes.

Novel enzymatic activities for glycoside hydrolase family 31 have been described as a result of the initial projects of the thesis. These later evolved into projects studying bacterial multi-gene systems for the partial or complete degradation of the heterogeneous plant polysaccharide xyloglucan. These systems contain, in addition to various hydrolytic CAZymes, the necessary binding, transport, and regulatory proteins. The results presented here show, in detail, how very complex carbohydrates can efficiently be degraded by bacterial enzymes of industrial relevance.

Keywords: CAZyme discovery, xyloglucan, polysaccharide-utilisation locus, microbiota, α-xylosidase, GH31, transglucosidase, human gut

SUMMARY (SAMMANFATTNING)

The main focus of this thesis is the comparison of different methods for finding new enzymes that are active on carbohydrates (CAZymes). These enzymes synthesise, bind to, and degrade the multitude of carbohydrates found in nature. There is a great need in society for sustainable methods to degrade plant-based biomass for the production of biofuels, and here CAZymes are a natural source of environmentally friendly molecular tools.

Microorganisms are the principal degraders of carbohydrates in nature. Not only do they carry out these processes in forests and aquatic environments, they are also the main degraders of carbohydrates consumed by animals. These microorganisms are found in immense quantities in the gastrointestinal tract, where complex carbohydrates are one of the most important nutrient sources. Microorganisms are thus a rich source of CAZymes, and strategies for finding and characterising new enzymes from bacteria have been the main focus of this thesis.

Previously unknown enzymatic activities in glycoside hydrolase family 31 were discovered as a result of the initial projects of this thesis. These projects in turn developed into studies of bacterial systems, comprising several genes, that are involved in the degradation of the heterogeneous plant polysaccharide xyloglucan. In addition to various hydrolases, these systems contain proteins that bind to and transport carbohydrates, as well as proteins that regulate the systems themselves. The results presented show in detail how very complex carbohydrates can be efficiently degraded by industrially relevant bacterial enzymes.

LIST OF PUBLICATIONS

I. Larsbrink J*, Izumi A*, Ibatullin FM, Nakhai A, Gilbert HJ, Davies GJ, Brumer H (2011). Structural and enzymatic characterization of a glycoside hydrolase family 31 α-xylosidase from Cellvibrio japonicus involved in xyloglucan saccharification. Biochem J. 436(3): 567-80.

II. Cartmell A, McKee LS, Peña MJ, Larsbrink J, Brumer H, Kaneko S, Ichinose H, Lewis RJ, Viksø-Nielsen A, Gilbert HJ, Marles-Wright J (2011). The structure and function of an arabinan-specific alpha-1,2-arabinofuranosidase identified from screening the activities of bacterial GH43 glycoside hydrolases. J Biol Chem. 286(17): 15483-95.

III. Silipo A, Larsbrink J, Marchetti R, Lanzetta R, Brumer H, Molinaro A (2012). NMR spectroscopic analysis reveals extensive binding interactions of complex xylogluco-oligosaccharides with the Cellvibrio japonicus glycoside hydrolase family 31 α-xylosidase. Chemistry. 18(42): 13395-404.

IV. Larsbrink J*, Izumi A*, Hemsworth GR, Davies GJ, Brumer H (2012). Structural enzymology of Cellvibrio japonicus Agd31B protein reveals α-transglucosylase activity in glycoside hydrolase family 31. J Biol Chem. 287(52): 43288-99.

V. Larsbrink J*, Rogers T*, Hemsworth G*, McKee LS, Tauzin A, Spadiut O, Klinter S, Pudlo NA, Urs K, Koropatkin NM, Creagh AL, Haynes CA, Kelly AG, Nilsson Cederholm S, Davies GJ, Martens EC, Brumer H. A discrete genetic locus confers select Bacteroidetes with a niche capacity for xyloglucan in the human gut. Submitted.

VI. Larsbrink J, Lundqvist M, Brumer H. Identification and characterization of a locus in Cellvibrio japonicus involved in xylogluco-oligosaccharide saccharification. Manuscript.

VII. Larsbrink J, Brumer H. Screening and activity profiling of the glycoside hydrolase family 31 members from Bacteroides thetaiotaomicron reveal novel specificities. Manuscript.

* Authors contributed equally to the work

THE AUTHOR’S CONTRIBUTION

Publication I: Performed all molecular biology and biochemical experiments, from cloning, protein production and purification, to kinetic analyses and bioinformatic analyses. Major writing contributions.

Publication II: Made sequence analyses and alignments for the genes described, and constructed the phylogenetic tree.

Publication III: Constructed, produced and purified catalytically inert mutants, as well as wild-type for the STD-NMR studies. Produced and purified the GXXG oligosaccharide which was used with wild-type enzyme.

Publication IV: Performed all molecular biology and biochemical experiments, from cloning, protein production and purification to kinetic analyses. Major writing contributions.

Publication V: Was involved in the cloning, recombinant protein production and purification, and biochemical characterisation of all enzymes. Also extracted xyloglucan from tomato and lettuce and generated oligosaccharides for biochemical assays. Major writing contributions.

Publication VI: Discovered the putative locus and cloned both hydrolase genes together with ML. Performed the biochemical characterisation of the galactosidase and performed the experiments on xyloglucan and xylogluco-oligosaccharides for the . Performed localisation and qPCR studies. Wrote the manuscript.

Publication VII: Performed all biochemical experiments and wrote the manuscript.


LIST OF ABBREVIATIONS

AA Auxiliary activities
AGP Arabinogalactan protein
Araf Arabinofuranose
BACON Bacteroidetes-associated carbohydrate-binding, often N-terminal
CAZyme Carbohydrate-active enzyme
CBM Carbohydrate-binding module
CE Carbohydrate esterase
D-enzyme Disproportionation enzyme
Fuc Fucose
Gal Galactose
GalNAc N-acetylgalactosamine
GAX Glucuronoarabinoxylan
GH Glycoside hydrolase
Glc Glucose
GlcA Glucuronic acid
GRP Glycine-rich protein
GT Glycosyltransferase
HGRP Hydroxyproline-rich glycoprotein
HPAEC-PAD High performance anion-exchange chromatography with pulsed amperometric detection
HTCS Hybrid two-component system
MALDI-TOF Matrix-assisted laser desorption/ionisation – Time of flight mass spectrometry
Man Mannose
MLG Mixed-linkage β(1→3),(1→4)-glucan
NMR Nuclear magnetic resonance
PA14 (named after the) Protective antigen of anthrax toxin
PL Polysaccharide lyase
PNP para-nitrophenyl/4-nitrophenyl
PRP Proline-rich protein
PUL Polysaccharide utilisation locus
qPCR Quantitative polymerase chain reaction
RG Rhamnogalacturonan
SCFA Short-chain fatty acid
Sus Starch utilisation system
TLC Thin-layer chromatography
XyG Xyloglucan
XyGO Xylogluco-oligosaccharide
XyGUL Xyloglucan utilisation locus
Xyl Xylose

TABLE OF CONTENTS

ABSTRACT

SAMMANFATTNING

LIST OF PUBLICATIONS

THE AUTHOR’S CONTRIBUTION

LIST OF ABBREVIATIONS

TABLE OF CONTENTS

INTRODUCTION

BACKGROUND

CARBOHYDRATES

PLANT POLYSACCHARIDES

STRUCTURAL POLYSACCHARIDES

CELLULOSE

HEMICELLULOSES

PECTIN

OTHER PLANT CELL WALL RELATED MOLECULES

STRUCTURAL PROTEINS

LIGNIN

STORAGE POLYSACCHARIDES

STARCH

CARBOHYDRATE-ACTIVE ENZYMES

GLYCOSIDE HYDROLASES

GH MECHANISMS

CATALYTIC MODE OF ACTION OF GHS

CAZYME DISCOVERY

BACTERIAL HABITATS

THE SOIL BACTERIUM CELLVIBRIO JAPONICUS

THE HUMAN GUT BACTEROIDETES SYMBIONTS

HUMAN-MICROBE GUT SYMBIOSIS

THE STUDIED BACTEROIDES SPECIES

POLYSACCHARIDE UTILISATION LOCI AND THE SUS PARADIGM

AIM OF INVESTIGATION

RESULTS AND DISCUSSION

ENZYME DISCOVERY BY GLYCOSIDE HYDROLASE FAMILY ACTIVITY SCREENING

GH31

CJXYL31A

CJAGD31B

BACTEROIDES THETAIOTAOMICRON GH31 TARGETS

GH43

ENZYME DISCOVERY FROM GENETIC LOCI

XYLOGLUCAN UTILISATION IN CELLVIBRIO JAPONICUS

A DISCRETE LOCUS IN BACTEROIDES OVATUS, ENDOWING THE BACTERIUM WITH THE ABILITY TO UTILISE XYLOGLUCAN

CONCLUSIONS

ACKNOWLEDGEMENTS

REFERENCES

Strategies for the Discovery of Carbohydrate-Active Enzymes from Environmental Bacteria

INTRODUCTION

Johan Larsbrink 2013

Background

It is becoming increasingly evident that society as a whole faces a major challenge, with rising global temperatures and diminishing reserves of fossil fuels (Arkema et al. 2013, Himmel et al. 2007, Li et al. 2009). Several approaches have been suggested to alleviate the possible ramifications, one of which is a shift from our dependence on fossil-based fuels to renewable alternatives. That goal will unquestionably involve using plants and plant-derived biomass as a source of energy (Himmel, et al. 2007, Schubert 2006). However, plants, or more specifically plant cell walls, are highly recalcitrant to degradation, which is a key step in harnessing the stored energy. Standard industrial methods currently require harsh chemical treatments and high temperatures to deconstruct the cell walls.

In nature, plant cell walls are nevertheless constantly salvaged in the global carbon cycle, a feat typically accomplished by microbes at ambient temperatures and around neutral pH. Thus, understanding of the fundamental biochemical processes of plant cell wall degradation utilised in natural systems is essential in developing environmentally friendly methods to create renewable alternatives to fossil fuels (Li, et al. 2009).

Additionally, understanding the biological processes involved in plant cell wall degradation may not only lead to sustainable fuels, but also allow for the generation of new plant-based materials, as well as have an impact on human health (Flint et al. 2012b, Li, et al. 2009, Schnepp 2013). The reason for the latter is obvious: plants, both intact and processed, form the basis of a healthy diet. However, the human genome does not grant us the ability to degrade most of the ingested plant cell material. This feat is instead performed by the microorganisms residing in our gut, which endow us with a huge potential for carbohydrate degradation (El Kaoutari et al. 2013). Due to the huge complexity of the plant cell wall, finding enzymes with appropriate properties for carbohydrate degradation and/or modification is not a trivial task. It involves much screening, both bioinformatic and biochemical (Li, et al. 2009), and may even require re-design of individual enzymes to better fit industrial processes (Cobb et al. 2012, Lutz 2010).

Thus, the focus of this thesis is rooted in the need to discover enzymes with novel or superior qualities, which can expand the toolbox of carbohydrate-active enzymes as well as improve our general knowledge in the field of carbohydrate degradation and be a stepping stone for future endeavours. To provide a background for this thesis, the diversity of polymeric structures present in cell walls, with the main focus on carbohydrates, is presented, as understanding the architecture of the cell wall is central in finding enzymes that act upon it. Further, the types of enzymes that are involved in plant cell wall synthesis and degradation are discussed, followed by descriptions of the studied bacteria and their habitats, and how studying them may lead to the discovery of novel enzymes as well as shed light on these intricate biological systems.

Carbohydrates

Carbohydrates, or , are ubiquitous in nature and are involved in all major biological processes, from cell communication to metabolism, as well as being a major constituent of genetic material and micro- and macrostructures of living (and dead) organisms and tissues (Varki et al. 2009). Carbohydrates also serve as nutrients for organisms from all kingdoms of life, as building blocks for biosynthesis, and are additionally used as energy storage molecules for many species.


The simplest carbohydrates are the monosaccharides, since they cannot be hydrolysed into smaller sugars (Bertozzi et al. 2009). When joined together by glycosidic bonds, monosaccharides form chains, ranging from simple oligosaccharides up to complex polysaccharide chains, which can in turn interact to form intricate hierarchical structures. Examples of these polysaccharides are chitin, which is the main constituent of the robust exoskeletons of arthropods and a major part of fungal cell walls (Cummings et al. 2009, Tiemeyer et al. 2009), the protein- co-polymer peptidoglycan, which forms the cell walls in bacteria (Esko et al. 2009a), and the glycoprotein surface layer which is found in the cell envelopes of archaea (Albers et al. 2011, Esko, et al. 2009a). Animals also synthesise large amounts of polysaccharides such as the highly complex mucus polysaccharides lining mucosal membranes, and the heterogeneous sugar structures found in glycoproteins and proteoglycans (Brockhausen et al. 2009, Esko et al. 2009b, Koropatkin et al. 2012, Stanley et al. 2009).

The majority of carbohydrates on earth are however synthesised by plants, beginning with photosynthesis.

Plant polysaccharides

Though plant proteins are often heavily glycosylated (Rayon et al. 1998), the main bulk of carbohydrates synthesised by plants may be classified as either structural polysaccharides or storage polysaccharides. The structural polysaccharides are incorporated into the cell walls, which are the main load-bearing structures of plants, while storage polysaccharides can be found either inside the cell, such as starch and fructans, or in the cell wall, such as mannans and xyloglucan (Buckeridge et al. 2000).


Structural polysaccharides

In contrast to animal cells, plant cells are surrounded by a cell wall, as are, for instance, the cells of fungi and prokaryotes (Taiz et al. 2010). The cell wall confers rigidity and robustness to the cell, but is at the same time highly dynamic and flexible (Carpita et al. 2000). Plant cell walls are the most abundant reservoir of renewable carbon on the planet, and are as such of major importance in the global economy. The uses of plant cell walls and their constituents include familiar products such as lumber, paper, and textiles (Carpita, et al. 2000, Taiz, et al. 2010), but also more technically advanced products such as plastics, thickeners and coatings in a vast range of applications (Baumann et al. 2003, Mishra et al. 2009, Willför et al. 2008). Plant cell walls are also the main components (excluding water) of fruits and vegetables, and consequently have a significant impact on human diet and health (Carpita, et al. 2000).

The cell walls of plants largely consist of different polysaccharides, but also contain proteins and aromatic compounds (lignin). There are two major types of cell walls in plants: the primary and the secondary cell walls (Carpita, et al. 2000). The main load-bearing structure of all plant cell walls is cellulose, which forms fibrillar structures with strong mechanical properties (Carpita, et al. 2000, Carpita et al. 1993, Taiz, et al. 2010).

The primary cell wall is, as the name implies, the first cell wall created by growing cells, and governs the shape and size of the cell as well as being involved in defence against microbial attacks and in cell-cell communication, both within the organism and with other organisms, such as symbiotic bacteria or fungi. The cellulose fibrils of primary cell walls are enmeshed in a matrix of non-cellulosic polysaccharides, commonly known as hemicelluloses or cross-linking glycans, charged polysaccharides known as pectins, and structural proteins (Figure 1). There are different types of primary cell walls, depending on the plant species, and they differ mainly in hemicellulose composition. In type I cell walls, which are present in dicot plants and noncommelinoid monocots, xyloglucan is the main hemicellulose, while in type II walls, present in commelinoid monocots, the main hemicellulose is glucuronoarabinoxylan. Recently, a type III primary cell wall was discovered in fern leaves, having mannan as the main hemicellulose (Silva et al. 2011). However, the separation of primary cell walls into distinct types, though useful for comparative reasons, may be misleading, as there is likely a continuum between wall compositions rather than three definitive types (Fangel et al. 2012, Popper et al. 2011).

In certain cell types, layers of secondary cell wall are constructed inside the primary cell wall after the cell has adopted its final size and shape (Carpita, et al. 2000, Taiz, et al. 2010). These secondary cell walls are typically composed mainly of cellulose, hemicelluloses and lignin, and are much thicker than the primary cell walls, thus conferring rigidity on the plant and, in some tissues, water impermeability (Carpita, et al. 2000, Doblin et al. 2010). The proportion of cellulose is often much higher in secondary cell walls than in primary cell walls, and several layers of secondary cell wall are often laid down, with distinct orientations of the cellulose fibrils, which further strengthens the wall (Figure 1).


Figure 1: Models of plant cell walls. A. Cross-section of a primary cell wall, containing the load-bearing cellulose fibrils coated with hemicellulose chains, which are enmeshed in a pectin matrix. Cell wall proteins are exemplified by an extensin network. B. The cellulose-rich secondary cell wall and its different layers, with cellulose fibrils in different orientations. P – primary cell wall, S1-S3 – layers of secondary cell wall.

Cellulose

The most abundant of all the different plant polysaccharides is cellulose (Carpita, et al. 2000). It is a homopolymer, consisting exclusively of β(1→4)-linked glucose moieties, though the repeating unit of cellulose is often designated as cellobiose, as every other glucose unit is rotated 180° along the chain axis (Figure 2A). Cellulose is synthesised not only by plants, but also by bacteria, fungi and oomycetes (Bartnicki-Garcia 1968, Deinema et al. 1971). Cellulose is synthesised by cellulose synthase enzymes at the surface of the plasma membrane, where individual cellulose chains bind together through extensive inter-molecular hydrogen bonds and van der Waals forces to form microfibrils, which can consist of thousands of cellulose chains and reach lengths up to the micrometre scale (Carpita, et al. 2000, Endler et al. 2011). In plants, the cellulose microfibrils are enmeshed in a hemicellulose network, which is itself embedded in a pectic matrix, to form a so-called liquid crystal state (Carpita, et al. 2000). The combination of the matrix glycans and the cellulose microfibrils leads to a very robust structure, able to support such immense living structures as the giant sequoias and redwood trees.

Figure 2: Unbranched glucans of the cell wall. A. Cellulose, consisting of β(1→4)-linked glucose units, with the repeating unit cellobiose in parenthesis. B. Callose, with β(1→3) linkages. C. An exemplified stretch of a mixed-linkage β(1→3)/β(1→4)-glucan (MLG), consisting of stretches of cellotriose and cellotetraose, joined together by β(1→3) linkages.


Hemicelluloses

The term hemicellulose was coined in the late 19th century, and defined as plant cell components that can be extracted from plant material by alkaline solutions (O'Dwyer 1923). To distinguish the non-charged cell wall polysaccharides from other material extracted from the cell walls, and as the term was coined when nothing was known about plant polysaccharide structure or biosynthesis, cross-linking glycans has been suggested as an alternative term. The word hemicellulose nevertheless persists as an umbrella term for these polysaccharides, and will be used throughout this thesis. The hemicelluloses are polysaccharides that coat the cellulose microfibrils in the plant cell wall, and may also be covalently attached to lignin through ester bonds (Levasseur et al. 2013). The hemicellulose classification encompasses several structurally diverse polysaccharides, which can be found in various amounts in both primary and secondary cell walls, depending on plant species and types of tissues (Ebringerová et al. 2005).

Callose

A polysaccharide related to cellulose is callose. It also consists solely of glucose, but in contrast to cellulose, the linkage in callose is β(1→3) (Figure 2B), resulting in helical chains (Carpita, et al. 2000). Callose is not a general cell wall polysaccharide, but is present in certain cell types, such as pollen, and is also produced in response to certain microbial attacks, such as by fungi (Carpita, et al. 2000, Ellinger et al. 2013).

Mixed-linkage glucans

Another major group of hemicelluloses, mainly found in the order Poales, which includes the common cereals, is the mixed-linkage β(1→3),(1→4)-glucans (MLGs) (Carpita, et al. 2000, Ebringerová, et al. 2005, Fincher 2009). The MLGs are composed mainly of stretches of cellotriose and cellotetraose, connected by β(1→3)-linkages in an irregular fashion, making them soluble in contrast to cellulose, which only contains β(1→4)-linkages (Figure 2C). MLGs, like other hemicelluloses, bind cellulose, and constitute around 5 – 10 % of the biomass of such commercially important crops as barley and oat (Ebringerová, et al. 2005). In recent years, the realisation of the importance of fibres in human diet and their potential in alleviating such dietary problems as diabetes, colorectal cancer and certain has led to much focus on MLGs, as wholegrain cereal products are some of the main sources of dietary fibre in normal diets (Jacobs et al. 2004, Topping 2007).

Xyloglucan

Xyloglucan (XyG) is the main hemicellulose in primary cell walls in dicot plants, and is present in all terrestrial plants (Carpita, et al. 2000, Peña et al. 2008, Popper 2008). The amount of XyG can reach 20-25 % of the total carbohydrate content (dry weight) in some species (Ebringerová, et al. 2005), and XyG will as such have a major influence on a wide variety of plant-derived products. In the cell wall, XyG is believed to cross-link the cellulose microfibrils, thus conferring mechanical strength to the plant structure. This cross-linking may however not be as extensive as most cell wall models have stated in the past, where long XyG tethers (20 – 40 nm) were thought to connect the cellulose fibrils (Park et al. 2012). Recent studies have instead indicated that the main bulk of the XyG chains coat the cellulose fibrils, while only a few key chains tether cellulose fibrils close together. There may even be significant intertwining of the XyG and individual cellulose chains in regions of amorphous non-crystalline cellulose. The high affinity of XyG toward cellulose has been demonstrated in vitro, and the co-localisation of XyG and cellulose has been observed in plant cell walls using specific antibodies and electron microscopy (Carpita, et al. 2000, Pauly et al. 1999). XyG is also used as a storage polysaccharide in seeds in certain species, such as the tamarind tree (Tamarindus indica) and nasturtium (Tropaeolum majus), providing energy and building materials for the growing embryo (Buckeridge, et al. 2000, Gidley et al. 1991, Savur et al. 1948).

Structurally, XyG consists of a linear β(1→4)-glucan backbone substituted at regular intervals with xylosyl moieties (Carpita, et al. 2000, Ebringerová, et al. 2005). The xylosyl side chains are commonly substituted further with galactosyl, fucosyl and/or arabinofuranosyl units depending on the plant species and tissue (Figure 3A) (Hoffman et al. 2005, Hsieh et al. 2009). To more easily describe the various side chains of XyG, a distinct nomenclature was developed by Fry et al. (1993), where G denotes an unbranched Glc residue, and the most common substitutions are as follows: X (α-Xylp-(1→6)-β-Glcp), L (β-Galp-(1→2)-α-Xylp-(1→6)-β-Glcp), F (α-L-Fucp-(1→2)-β-Galp-(1→2)-α-Xylp-(1→6)-β-Glcp), and S (α-L-Araf-(1→2)-α-Xylp-(1→6)-β-Glcp). Typically, the XyG chains consist of repeating patterns of the XXGG or XXXG type, corresponding to a repeating backbone motif of four glucose residues substituted with two or three xylose residues, respectively, and carrying various amounts of additional substitutions (Vincken et al. 1997). The most common type of XyG in higher land plants is fucogalacto-XyG of the XXXG type, such as in lettuce (Lactuca sativa), carrot (Daucus carota), and soybean (Glycine max) (Figure 3A, left panel) (Hoffman, et al. 2005). Another important dietary plant family is the Solanaceae, including potato (Solanum tuberosum), tomato (Lycopersicon esculentum), and peppers (Capsicum), whose XyG is of the XXGG type, lacks fucose and instead contains arabinose (Figure 3A, right panel).
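The single-letter nomenclature lends itself to simple bookkeeping. The following is a minimal illustrative sketch in Python, not part of the work described in this thesis, which tallies the monosaccharide composition of a XyG motif from its letter code. The table `SIDE_CHAIN_RESIDUES` and the function `composition` are hypothetical names introduced here for illustration, and only the five codes described above (G, X, L, F, S) are assumed to be supported.

```python
# Residues contributed by each backbone position, following the letter codes
# described in the text (after Fry et al. 1993). Every code includes the
# backbone Glc itself. This table is an illustrative assumption and covers
# only G, X, L, F and S.
SIDE_CHAIN_RESIDUES = {
    "G": {"Glc": 1},                                  # unbranched glucose
    "X": {"Glc": 1, "Xyl": 1},                        # Xyl on Glc
    "L": {"Glc": 1, "Xyl": 1, "Gal": 1},              # Gal-Xyl on Glc
    "F": {"Glc": 1, "Xyl": 1, "Gal": 1, "Fuc": 1},    # Fuc-Gal-Xyl on Glc
    "S": {"Glc": 1, "Xyl": 1, "Araf": 1},             # Araf-Xyl on Glc
}


def composition(motif):
    """Return total residue counts for a XyG motif such as 'XXXG'."""
    totals = {}
    for code in motif:
        if code not in SIDE_CHAIN_RESIDUES:
            raise ValueError(f"unsupported side-chain code: {code!r}")
        for residue, count in SIDE_CHAIN_RESIDUES[code].items():
            totals[residue] = totals.get(residue, 0) + count
    return totals


if __name__ == "__main__":
    # Print the composition of a few motifs mentioned in the text.
    for motif in ("XXXG", "XXGG", "XLFG"):
        print(motif, composition(motif))
```

Under these assumptions, the XXXG and XXGG motifs both tally four glucose residues, with three and two xylose residues respectively, in line with the description of the repeating backbone motif above.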


Figure 3: Structures of common hemicelluloses. Colour code as follows: Glc – blue, Xyl – orange, Gal – yellow, L-Fuc – red, Araf – turquoise, GlcA – purple, Man – green. A. Xyloglucan oligosaccharide structures from different sources. Fucogalactoxyloglucan of the XXXG type, as can be obtained from dicot species (left panel), and so-called solanaceous arabinofuranosylated xyloglucan of the XXGG type, as can be obtained from for example tomato and peppers (right panel). B. Glucuronoarabinoxylan, substituted with arabinofuranose, glucuronic acid and ferulic ester moieties. The double substitution site at O-3 is highlighted by a parenthesis. C. Galactomannan. Pure mannan would lack the galactosyl decorations. D. Galactoglucomannan.


Other, rarer patterns also exist, such as fucosyl-lacking XXGGG-based XyG, which is the case for morning glory (Ipomoea purpurea), basil (Ocimum basilicum) and plantain (Plantago major) (Hoffman, et al. 2005). Furthermore, XyG carries many substitutions in addition to the common L, F and S side chains, and some plants, such as oleander (Nerium oleander), produce XyG which contains both arabinose and fucose (Hoffman, et al. 2005, Peña, et al. 2008).

XyG is readily available as a by-product from tamarind fruit production in the form of seeds (kernels) or as tamarind kernel powder, and can be used in varying applications from food thickeners to drug delivery agents (Mishra, et al. 2009, Yamatoya et al. 2003). The strong affinity of XyG to cellulose can also be exploited in biomimetics, to functionalise cellulose surfaces by adsorption of chemically modified XyG to gain novel properties while maintaining the strong physical properties of cellulose (Teeri et al. 2007, Zhou et al. 2005).

Xylan

Xylan is present in the cell walls of both monocots and dicots, but is the main hemicellulose found in the primary cell walls of grasses, the so-called type II walls, as well as in the secondary cell walls of hardwood trees (Carpita, et al. 2000, Carpita, et al. 1993). Xylan is composed of β(1→4)-linked xylopyranose moieties, which can be substituted in various ways, depending on the species (Figure 3B). The type of xylan found in primary cell walls of commelinoid monocots, glucuronoarabinoxylan (GAX), is often substituted with α-L-Araf at the O-3 position, while α(1→2)-linked glucuronic acid decorations and feruloyl moieties at the C-5 position of α-L-Araf moieties are rarer. In contrast, GAX in non-commelinoid monocots and dicot species may be decorated with α-L-Araf at both O-2 and O-3. In secondary cell walls however, arabinose decorations are not found, but glucuronic acid is still present. Xylan is often a major part of secondary cell walls, and may constitute up to 30 % of the biomass, making it available in large amounts in nature (Ebringerová, et al. 2005). Especially important for human use are the arabinofuranosylated xylans found in common cereals, where they may constitute approximately 13 % of the grain of barley and rye, and up to 30 % of wheat bran, and as such are a rich source of dietary fibre.

Mannans

There are several types of polysaccharides containing β(1→4)-linked mannose in their backbones: pure mannan, galactomannan, glucomannan and galactoglucomannan (Carpita, et al. 2000). Pure mannan is rare, and has only been found in the bulb of the orchid Oncidium (Wang et al. 2006). Polysaccharides consisting of a pure mannan backbone substituted with galactosyl units are called galactomannans, and the level of substitution affects their solubility in water, where a less substituted polysaccharide is less soluble, such as the largely unsubstituted mannan in ivory nut (Phytelephas macrocarpa; Figure 3C) (Buckeridge 2010). Galactomannans, consisting of a mannan backbone substituted with galactose in α(1→6)-linkages, are used as storage polysaccharides in various plants, such as coffee (Coffea arabica), palm tree (Phoenix dactylifera) and mimosa (Mimosa scabrella) (Buckeridge 2010, Ebringerová, et al. 2005). Mannan (or possibly galactomannan with a low level of galactosyl substitutions) is also the main hemicellulose in the recently described type III primary cell wall, which was discovered in fern leaves (Silva, et al. 2011). In secondary cell walls, glucomannans and galactoglucomannans are the major hemicelluloses in softwoods, where the backbone may also be acetylated (Figure 3D) (Ebringerová, et al. 2005). Structurally, glucomannans and galactoglucomannans differ from pure mannan and galactomannan in that the backbone is made up of alternating stretches of β(1→4)-linked mannose units and β(1→4)-linked glucose units.

Pectin

Pectin is a class of heterogeneous polysaccharides rich in galacturonic acid, classically defined as material that can be extracted from cell walls by chelators (Carpita, et al. 2000). Pectin is found in both dicot and monocot primary cell walls, as well as in the middle lamella and to a lesser extent in woody tissues (Mohnen 2008). The hemicellulose-coated cellulose fibrils are extruded into the cell wall during synthesis, where they are embedded in a matrix of pectic polysaccharides (Carpita, et al. 2000). The most common form of pectin is homogalacturonan, which is composed of α(1→4)-linked galacturonic acid units which can be methylated at the C-6 position and/or acetylated at O-2 or O-3. More complex types are the rhamnogalacturonans (RG) I and II, which are highly branched with a multitude of different monosaccharides and linkages. Examples of the side chains of RG I are galactan, arabinan and arabinogalactan. Pectins are not only a part of the cell walls, but also have many other roles, such as in cell-cell adhesion, signalling and fruit development. Pectin is also commonly used in the food and cosmetics industries as a gelling agent.

Other plant cell wall related molecules

Polysaccharides, such as the ones described above, are not the only components of plant cell walls. Many proteins, which are often heavily glycosylated, fulfil important roles in the cell walls, and a key factor for the strength of secondary cell walls is the complex network formed from phenolic substances, called lignin (Carpita, et al. 2000).


Structural proteins

Structural protein networks in the cell walls of plants consist mainly of four classes of proteins: the hydroxyproline-rich glycoproteins (HRGPs), the proline-rich proteins (PRPs), the glycine-rich proteins (GRPs) and the arabinogalactan proteins (AGPs). These proteins are secreted into the cell wall, and all but the GRPs are glycosylated. One of the more studied HRGPs is extensin. Extensin forms rod-like polypeptides that are glycosylated to form a brush-like structure. Extensins form networks in the cell wall, and are involved, for example, in defence against pathogen attack, during which they accumulate, and in the formation of the cell plate during cell division (Lamport et al. 2011). The most heavily glycosylated class of structural proteins is the AGPs, which may consist of more than 95 % carbohydrate, making them highly soluble (Carpita, et al. 2000, Doblin, et al. 2010). Most AGPs have membrane anchors, and may thus be connected to the cell membrane. AGPs are believed to be involved in root formation, pollen tube guidance and cell-cell interactions (Doblin, et al. 2010).

Lignin

Lignin is unique to secondary cell walls, and is a key agent in making cell walls water-impermeable and recalcitrant to enzymatic degradation, as well as making the plant cell wall strong enough to support the tall species we can observe today (Carpita, et al. 2000, Doblin, et al. 2010, Vanholme et al. 2012). The lignin network is created by radical reactions joining the basic monomer building blocks derived from p-hydroxycinnamyl alcohol: p-coumaryl, sinapyl and coniferyl alcohols (Figure 4) (Vanholme, et al. 2012). The synthesis of lignin begins with oxidation of the monomers to form monolignol radicals, which then combine through radical-radical couplings, and which also allow the network to couple to polysaccharides. The oxidation of lignin monomers is performed by peroxidases and/or laccases that are encoded by large gene families in plants. The lignin network not only strengthens the cell walls and, especially in vascular tissues, makes them hydrophobic, but also inhibits and sterically hinders enzymes from attacking the various polysaccharides of the wall (Dixon 2013, Vanholme, et al. 2012). Because of this, many studies aim to reduce the lignin content or change the ratio of the monomers in crops to facilitate saccharification for biofuels, without adversely affecting the plants.

Figure 4: The monomers making up the majority of lignin.

Storage polysaccharides

As mentioned previously, hemicelluloses often serve as energy sinks in various organisms, to be used during nutrient deficiency or as energy reserves in seeds (Buckeridge 2010). In most plants, however, carbohydrate energy reserves are stored in the form of starch (Zeeman et al. 2010).


Starch

While animals and certain microorganisms store energy in the form of glycogen, plants use the very similar polysaccharide starch (Elbein 2009, Wilson et al. 2010, Zeeman, et al. 2010). Both starch and glycogen are, in common with cellulose, composed purely of glucose, with the difference lying in the linkages and morphology of the glucan chains. In cellulose, as described earlier, the β(1→4) linkages give rise to straight polysaccharide chains. In starch, however, the glucan chains are linked in the α(1→4) configuration (Figure 5A), which leads to the formation of helices and results in a more soluble polysaccharide. The chains of starch are further branched by α(1→6) linkages to form brush-like structures (Figure 5B and C). One key difference between starch and the hemicellulose storage polysaccharides is that the former is synthesised within the cells in specialised compartments called plastids, while the latter are inserted into the cell wall (Buckeridge 2010, Pérez et al. 2010).

Chemically, the differences between starch and glycogen lie in the amount of branching, where glycogen is more heavily and continuously branched than starch. Glycogen particles are also smaller (20-60 nm) than starch granules (Zeeman, et al. 2010), which can reach diameters of around 200 µm (Pérez, et al. 2010). The amount of branching in starch affects its digestibility in the gut, and starches with less branching tend to be more crystalline and are more slowly digested (Gilbert et al. 2013). Starch that is not fully digested in the small intestine is called resistant starch, and is a nutrient source for many bacteria in the human distal gut (Flint 2012). The ingestion of resistant starch can improve colonic health by increasing the production of short-chain fatty acids by the gut microbiota (Koropatkin, et al. 2012). Many of the plants utilised by mankind as staple foods are either seeds or tubers containing predominantly starch, such as maize, potato, rice and other cereals.

Figure 5: Structures of starch. A. The main building block of starch: α(1→4)-linked glucose, exemplified by the trisaccharide maltotriose. B. The branching pattern of starch by an α(1→6) linkage. C. An example of the brush-like structure adopted by starch, where ovals represent glucose moieties. Several of these brushes aggregate to form starch granules.

Carbohydrate-active enzymes

As evidenced by the above discussion, the possible structural variations of carbohydrates are essentially endless, and consequently a large variety of enzymes capable of acting on these substrates have evolved over the millennia. As the number of possible carbohydrate structures vastly outnumbers that of protein folds, proteins acting on carbohydrates have evolved different specificities for oligo- and polysaccharides as well as glycoconjugates from a limited number of protein scaffolds. The enzymes acting on these substrates, collectively known as Carbohydrate Active enZymes (CAZymes) (Cantarel et al. 2009), have been grouped into distinct phylogenetic families based on their similarities in primary amino acid sequences, protein folds, and mechanisms of action. These families can be found in the CAZy database (www.cazy.org). The database, which currently contains over 300 protein families (Levasseur, et al. 2013), is continuously updated as more CAZymes are identified and functionally and/or structurally characterised. The protein families of CAZy have been clustered into five different enzyme classes and one class of associated modules, the carbohydrate binding modules, as follows:

Glycoside Hydrolases (GH) – these enzymes break the bonds between sugars by hydrolysis. There are to date over 130 GH families, comprising over 150 000 predicted catalytic modules. Additionally, some GH enzymes are able to perform transglycosylation reactions. The GHs are discussed in greater detail below.

Glycosyl Transferases (GT) – the GTs are enzymes which synthesise all the different di-, oligo- and polysaccharides found in nature. They perform this task by transferring sugar moieties from activated phospho-sugars; typically nucleotide-sugar donors are used, such as UDP-Glc for cellulose and callose synthases or UDP-Xyl for xylan synthases (Campbell et al. 1998, Lairson et al. 2008).

Polysaccharide Lyases (PL) – PL enzymes cleave uronic acid-containing sugars, such as pectins and many algal polysaccharides. The mechanism by which they cleave the glycosidic bonds is β-elimination, which yields an unsaturated hexenuronic acid or ester moiety at the non-reducing end of the polysaccharide (Garron et al. 2010, Lombard et al. 2010).

Carbohydrate Esterases (CE) – several types of carbohydrates are substituted with acyl moieties, and the enzymes that catalyse the cleavage of these ester bonds are the CEs. The most common mechanism of these enzymes is the classical Ser-His-Asp mechanism also employed by proteases (Dodd et al. 2009, Taylor et al. 2006).

Auxiliary Activities (AA) – the AA classification is a very recent addition to CAZy (Levasseur, et al. 2013), and comprises enzymes that break glycosidic bonds by oxidation. The discovery that the proteins of GH61 and CBM33 were actually metal-dependent polysaccharide monooxygenases, able to break the backbone of crystalline polysaccharides, and not GHs or CBMs, respectively, warranted their re-classification. GH61 and CBM33 were classified as AA families 9 and 10, respectively. Enzymes able to oxidise various positions on carbohydrates have also been identified: AA3 members can act on C1 and C2, AA7 on C1, and AA9 on C1, C4 and C6. In addition, enzymes involved in lignin degradation were included in the classification, as lignin is invariably associated with polysaccharides in plant cell walls.

Carbohydrate Binding Modules (CBM) – CBMs are non-catalytic proteins with the ability to bind carbohydrates. CBMs are by definition modules associated with other CAZymes (though exceptions exist), and function by bringing the catalytic module into close proximity of its substrate and prolonging the enzyme-carbohydrate interaction, thereby improving catalysis. CBMs can adopt a number of different folds, depending on their target substrates, which range from mono- and oligosaccharides up to both crystalline and non-crystalline polysaccharides (Boraston et al. 2004).

The GH enzymes are the most numerous class in CAZy, and when looking only at carbohydrate degrading enzymes (excluding GTs and CBMs), the number of GH modules completely dwarfs that of PLs, CEs and AAs (Cantarel, et al. 2009, Levasseur, et al. 2013). The main focus of this thesis is the discovery of new GH enzymes with either hydrolytic or transglycosylating activities, and as such, GH enzymes are discussed in greater detail below.

Glycoside hydrolases

Glycosidic bonds form some of the most stable polymers on earth. In fact, cellulose, which consists solely of β(1→4)-linked glucose units, has a half-life of close to 5 million years at 25 °C under neutral conditions (Wolfenden et al. 1998). Despite the extraordinary stability of glycosidic bonds, certain CAZymes are able to enhance the rate of hydrolysis of glycosides by a factor of up to 10¹⁷.

GH enzymes carry out the majority of carbohydrate degradation in nature, and are consequently responsible for some of the most important biochemical reactions on a global scale. The number of identified GHs is ever increasing as more (meta)genomic sequences are released, and GHs constitute around half the entries of the CAZy database (Cantarel, et al. 2009). As mentioned above, the GHs have been ordered into families based on amino acid sequence similarity, which correlates with protein folds and enzymatic mechanisms. A given GH family may be monospecific, i.e. having members active on a single substrate, such as GH11, which only contains xylanases, or polyspecific, containing a wide variety of different specificities, such as GH5, which contains close to 15 different activities (Aspeborg et al. 2012). Intensive work by crystallographers has led to about 80 % of the GH families having one or more representative structures solved to date (Fushinobu et al. 2013).

GH mechanisms

GHs generally perform their catalytic function following one of two routes: either by net inversion or by net retention of the anomeric configuration of their respective substrate, as was first described by Koshland 60 years ago (Koshland 1953). For inverting enzymes, hydrolysis occurs in a one-step direct displacement of the leaving group by water (Figure 6A). The catalytic amino acid residues are two carboxylic acids, typically spaced ~10 Å apart, where one acts as a general acid and the other as a general base (Davies et al. 1995, McCarter et al. 1994, Zechel et al. 2000). Retaining enzymes also employ two carboxylic acids, but hydrolysis is instead performed as a double-displacement reaction involving a glycosyl-enzyme intermediate, where one carboxylic acid residue acts as a nucleophile and leaving group and the other as a general acid/base (Figure 6B). The catalytic residues of retaining enzymes are also more closely spaced, at ~5.5 Å. The first step of the reaction, the glycosylation step, consists of a nucleophilic attack on the anomeric carbon by the catalytic nucleophile, leading to the formation of the glycosyl-enzyme intermediate. In the second step, the deglycosylation step, the general acid/base residue deprotonates a water molecule, which then displaces the sugar moiety from the enzyme.
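The stereochemical bookkeeping described above can be illustrated with a short Python sketch. It is a toy model rather than a predictive tool: each displacement at the anomeric carbon is treated as a single inversion, so one displacement gives net inversion and two give net retention. The function names, the dictionary layout and the tolerance of 1.5 Å are assumptions introduced purely for illustration; only the approximate spacings of ~5.5 Å and ~10 Å come from the text.

```python
# Illustrative sketch only; all names and the tolerance are hypothetical.

FLIP = {"alpha": "beta", "beta": "alpha"}

# Number of displacement steps at the anomeric carbon for each mechanism.
DISPLACEMENTS = {"inverting": 1, "retaining": 2}

# Approximate spacing (in Å) between the two catalytic carboxylic acids,
# as quoted in the text.
TYPICAL_SPACING = {"retaining": 5.5, "inverting": 10.0}


def product_anomer(substrate_anomer: str, mechanism: str) -> str:
    """Return the anomeric configuration of the product.

    Each displacement inverts the anomeric centre once, so a single
    displacement inverts and a double displacement retains.
    """
    anomer = substrate_anomer
    for _ in range(DISPLACEMENTS[mechanism]):
        anomer = FLIP[anomer]
    return anomer


def guess_mechanism(spacing: float, tolerance: float = 1.5) -> str:
    """Crude guess of the mechanism from the carboxylate spacing in Å.

    The tolerance is an arbitrary assumption; real assignments rely on
    product analysis, not on distance alone.
    """
    for mechanism, typical in TYPICAL_SPACING.items():
        if abs(spacing - typical) <= tolerance:
            return mechanism
    return "undetermined"


if __name__ == "__main__":
    print(product_anomer("beta", "inverting"))
    print(product_anomer("beta", "retaining"))
    print(guess_mechanism(5.3))
```

A spacing that falls between the two typical values is reported as undetermined, which reflects that the distance is a rule of thumb and not a definition of the mechanism.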


Hydrolysis is, however, not the only possible outcome of the deglycosylation step, as the glycosyl-enzyme can also be intercepted by other acceptor molecules; often, in these cases, the acceptor molecule is another sugar, which leads to transglycosylation. In fact, transglycosylation has been shown, through studies on the reactivation of trapped intermediates, to always be kinetically favoured over hydrolysis (Withers et al. 1990). How transglycosylating enzymes are able to exclude water, which is typically present at ~55 M, from the reaction has, however, long been a puzzle. One hypothesis is that strict transglycosylases do not allow water molecules to be positioned in a catalytically favourable manner (leading to deprotonation and activation) in the active site; instead, the enzymes are perfectly tailored to activate an acceptor glycoside moiety (Larsbrink et al. 2012).

Exceptions to the classical retaining mechanism have been discovered in many GH families. Families 18, 20, 25, 56, 84, and 85, for example, use substrate-assisted hydrolysis, where no nucleophile is provided by the enzyme. Instead, the N-acetyl group in a substrate such as chitin acts as an intramolecular nucleophile, forming an oxazoline intermediate, which is typically stabilised by a carboxylate residue (Figure 6C) (Mark et al. 2001, Terwisscha van Scheltinga et al. 1995). Another alternative mechanism is displayed by members of families 33 and 34, which contain sialidases and trans-sialidases, and use a tyrosine instead of a carboxylic acid as nucleophile (Watts et al. 2003). The nucleophilicity of the tyrosine is amplified by a neighbouring general base, and a reason for using a non-charged nucleophile is to avoid charge repulsion, as the anomeric centre itself carries a negatively charged carboxyl group. GHs may also be dependent on cofactors, such as NAD+ in families 4 and 109, whose members break the glycosidic bond by elimination, although the net reaction is still hydrolysis (Rajan et al. 2004, Yip et al. 2004).

Catalytic mode of action of GHs

In addition to their mechanisms, GHs are also classified by their mode of action on carbohydrates; that is, whether they cleave internal bonds in polysaccharides (endo-acting enzymes), or whether they act on the ends of poly- or oligosaccharides (exo-acting enzymes) (Davies, et al. 1995). Endo-acting enzymes typically have a cleft-like active site, able to recognise, bind and accommodate a stretch of the target polysaccharide in order to facilitate hydrolysis, thereby creating breaks in the backbone of the polysaccharide. The binding of polysaccharide substrates in the active site is achieved by a number of subsites which bind to individual sugar units along the polysaccharide chain. The subsites are numbered from the point of hydrolysis, where monosaccharides toward the non-reducing end are labelled -1, -2, -3, etc. and monosaccharides toward the reducing end are labelled +1, +2, +3, etc. (Davies et al. 1997). Exo-acting enzymes, on the other hand, typically recognise a limited number of monosaccharide units on their target substrate, and cleave off either a monosaccharide or a short oligosaccharide, thus depolymerising the substrate from its ends. Most commonly, this is performed at the non-reducing end of the sugar chains, but exceptions to this rule exist, such as a GH8 from Bacillus halodurans C-125, which releases xylose from the reducing end (Fushinobu et al. 2005, Honda et al. 2004), or the GH7 cellobiohydrolase I from Hypocrea jecorina (Trichoderma reesei), which processively cleaves cellobiose from the reducing end of cellulose chains (Divne et al. 1998). Certain enzymes, such as the aforementioned cellobiohydrolase I, are able to prolong the contact with their cognate substrate and carry out multiple catalytic events, for example by threading the substrate through a tunnel in the enzyme in which the active site residues are located, as seen in certain cellulases/cellobiohydrolases (Davies, et al. 1995, Zou et al. 1999). Enzymes acting in such a manner are called processive.
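The subsite numbering convention described above can be made concrete with a small Python sketch. The function names and the representation of a glycan as a list of residue labels written from the non-reducing end are assumptions made for illustration; the numbering itself (negative subsites toward the non-reducing end, positive subsites toward the reducing end, cleavage between -1 and +1, and no subsite 0) follows the convention cited in the text.

```python
# Illustrative sketch only; function names and data layout are hypothetical.

def label_subsites(n_minus, n_plus):
    """Return subsite labels ordered from non-reducing to reducing end.

    There is no subsite 0: cleavage occurs between subsites -1 and +1.
    """
    return list(range(-n_minus, 0)) + list(range(1, n_plus + 1))


def cleave(chain, bond_index):
    """Cleave a glycan between chain[bond_index - 1] and chain[bond_index].

    `chain` lists the sugar residues starting at the non-reducing end.
    Returns the subsite occupied by each residue, plus the two products.
    """
    if not 0 < bond_index < len(chain):
        raise ValueError("bond_index must point at an internal glycosidic bond")
    non_reducing_product = chain[:bond_index]
    reducing_product = chain[bond_index:]
    labels = label_subsites(len(non_reducing_product), len(reducing_product))
    return list(zip(labels, chain)), non_reducing_product, reducing_product


if __name__ == "__main__":
    hexasaccharide = ["Glc1", "Glc2", "Glc3", "Glc4", "Glc5", "Glc6"]
    # Endo-like cleavage in the middle of the chain.
    print(cleave(hexasaccharide, 3))
    # Exo-like cleavage releasing one unit from the non-reducing end.
    print(cleave(hexasaccharide, 1))
```

In this representation, an exo-acting enzyme releasing a monosaccharide from the non-reducing end corresponds to `bond_index = 1`, while an enzyme releasing a monosaccharide from the reducing end corresponds to `bond_index = len(chain) - 1`.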

CAZyme discovery

The break-down of carbohydrates into their constituent monosaccharides can be achieved by many means, but using enzymes is regarded as the most sustainable method, and is one which preserves the basic carbohydrate structures, in contrast to, for example, thermochemical processes, which can cause caramelisation and unwanted side-products (Horn et al. 2012). A key step in designing a biochemical process for biomass saccharification is finding enzymes with appropriate activities that can efficiently perform this feat, and there is a long history in the biotechnological field of trying to find new and better enzymes for biomass conversion (Falch 1991, Mandels et al. 1957). There are many approaches to discovering novel enzymes. One is in-depth proteomic, metabolic and biochemical studies of efficient biomass degraders, as in the extensive work done on enzymes from the wood-rotting fungi Hypocrea jecorina (formerly known as Trichoderma reesei) and Phanerochaete chrysosporium (Dashtban et al. 2009, MacDonald et al. 2012, Schuster et al. 2010). Another is metagenomic sequencing to identify all genes, including those encoding CAZymes, in entire environmental samples consisting of whole microbial communities (Li, et al. 2009). As bacteria, or prokaryotes, are truly ubiquitous in nature, they provide a practically limitless source of genetic information, and are as such fertile ground for discovering novel carbohydrate-degrading systems (Whitman et al. 1998).


Bacterial habitats

Wherever life is supported on earth, prokaryotes can be found. Prokaryotes have been found not only in lakes, oceans and forests, and in association with other species, but also in subsurface sediments several kilometres deep, in subglacial lakes, and even in such an unlikely location as the earth’s stratosphere (Shtarkman et al. 2013, Wainwright et al. 2003, Whitman, et al. 1998). As prokaryotes thrive in all kinds of environments, they play pivotal roles in the world’s ecosystems and in the build-up and recycling of the molecules of life. A typical cell density of prokaryotes in aquatic and subsurface environments is around 10⁶ cells/ml, while cell densities in forest floors range between 10⁶ and 10⁸ cells/ml (Whitman, et al. 1998). Extremely high prokaryote cell densities can be found on the surface and the inside of animals, although this habitat does not constitute a major fraction of prokaryotes globally.

The enzymes studied in this thesis were acquired from bacteria with high capabilities for degrading carbohydrates: Cellvibrio japonicus, whose habitat is the forest floor, and Bacteroides thetaiotaomicron and Bacteroides ovatus, which are dominant human gut symbionts.

The soil bacterium Cellvibrio japonicus

This bacterium was first isolated in 1952 from Japanese soil, and was originally named Pseudomonas fluorescens subsp. cellulosa (Ueda et al. 1952). The bacterium was, however, not highly similar to other members of the Pseudomonas genus, and was removed from the genus in a revision based on 16S rRNA sequences in 2000 (Anzai et al. 2000). Later, in 2003, a study revealed that the bacterium was more closely related to the genus Cellvibrio, and it was reclassified as a novel species under its current name (Humphry et al. 2003).


Many studies on C. japonicus enzymes had already recognised the bacterium’s potential for complex carbohydrate degradation (Hazlewood et al. 1998), and when the complete genome sequence of C. japonicus was released in 2008, not surprisingly, a vast number of enzymes predicted to be involved in the degradation of plant cell walls were identified (Deboy et al. 2008).

C. japonicus has become something of a model soil saprophyte organism, as light has been shed on not only its vast CAZyme repertoire and ability to degrade all major parts of the plant cell wall (Deboy, et al. 2008, Hazlewood, et al. 1998), but also related processes such as gene regulation and protein secretion (Emami et al. 2009, Gardner et al. 2010). Especially noteworthy is the work done on C. japonicus by Prof. Harry Gilbert and co-workers over the years, which has described CAZymes active on a range of substrates, such as xylan (Beylot et al. 2001, Gilbert et al. 1988), mannan (Hogg et al. 2003, Hogg et al. 2001), arabinan (Beylot, et al. 2001, McKie et al. 1997) and cellulose (Gilbert et al. 1987, Hall et al. 1988, Hazlewood et al. 1992).

The enzymes produced by C. japonicus are, in contrast to the membrane-anchored cellulosomes of Clostridia, in many cases secreted into the cell’s surroundings to degrade plant matter (Hazlewood, et al. 1992). Recently, it was demonstrated that the bacterium requires the type II secretion system to be able to secrete endo-glucanases, which are enzymes crucial for the deconstruction of biomass (Gardner, et al. 2010). Furthermore, techniques enabling the genetic manipulation of C. japonicus have been established, and an example of the industrial potential of C. japonicus was demonstrated by introducing the capacity for ethanol production into the bacterium (Gardner, et al. 2010, Gardner et al. 2012).


The human gut Bacteroidetes symbionts

The studied human gut symbionts, Bacteroides thetaiotaomicron and B. ovatus, are, as mentioned, dominant species in our intestinal microbiota. In order to appreciate their success in such a highly competitive habitat, it is necessary to first give an overview of the forces that shape our internal ecosystem, as well as point out the potential benefits and negative consequences that a healthy and a disturbed microbiota can create, respectively.

Human-microbe gut symbiosis

Although not apparent to the naked eye, each human individual is a veritable microbe farm. The density of prokaryotic cells found on human skin is typically around 10⁴/cm², but the majority of unicellular organisms are found in the digestive tract (Whitman, et al. 1998). In the small intestine, the number of microbial cells is comparatively low (~10⁴/ml), largely due to the low pH and high flow-rate (Berg 1996). In the distal human gut (colon), however, the number of microorganisms may reach densities exceeding 10¹¹ bacteria per gram of intestinal content, and the average human harbours roughly 1.5 kg of intestinal biomass (Figure 7) (Kaper et al. 2005). These trillions of microbes residing in our gastrointestinal tract actually outnumber somatic human cells by at least a factor of 10, and evidently contain a plethora of genes that enable the scavenging of nutrients from the host’s diet (Cantarel et al. 2012, El Kaoutari, et al. 2013, Martens et al. 2011). The number of genes encoded by the gut microbiota has in fact been estimated to be 150 times larger than that encoded by the human genome (Qin et al. 2010), and the multitude of interactions between the gut microbiota and the human host may warrant the microbiota being seen as a separate, ‘virtual’, organ in the body (Evans et al. 2013).


Despite the rapidly increasing volume of (meta)genomic data on the microbiota (Nelson et al. 2010, Qin, et al. 2010, Xu et al. 2003a), the detailed molecular processes responsible for the degradation of dietary polysaccharides are still not well understood. A deeper understanding of the carbohydrate utilisation of beneficial gut bacteria might very well not only lead to novel therapeutic strategies in treating dietary/intestinal disorders, but also shed light on how these complex intestinal ecosystems are created and maintained throughout an individual’s life.

The dominant phyla of the gut microbiota are the Bacteroidetes and Firmicutes and, to a lesser degree, the Actinobacteria (Turnbaugh et al. 2009), although at the species level, the relative proportions of approximately 70 % of the gut microbiota species in any given individual are specific to the host and are dependent on factors such as diet (Ley et al. 2006b, Tap et al. 2009, Zoetendal et al. 1998). The colonisation of the human gut begins at birth, and the composition of species changes at key events such as the transition from milk or baby formula to solid foods (Koropatkin, et al. 2012). The composition of the adult gut microbiota can be surprisingly stable. As an example, the microbiota of twins, and of mothers and their offspring, are more similar than those of unrelated individuals, even if the related individuals live at locations very distant from each other (Turnbaugh, et al. 2009). In cases where the composition of gut symbionts is abnormal, this stability can lead to persistent digestive problems, such as the severe malnutrition disorder kwashiorkor (Smith et al. 2013), or other health hazards such as irritable bowel syndrome, obesity, and colon cancer (Koropatkin, et al. 2012).


Regarding obesity, studies on twins have revealed that in obese individuals, the phylogenetic diversity of the gut microbiota is significantly lower than in lean individuals, which is likely caused by a less diverse diet (Turnbaugh, et al. 2009). There is also a lower proportion of Bacteroidetes species and a higher proportion of Actinobacteria, an observation which shows the same pattern in similar studies on unrelated individuals as well as in mice (Ley et al. 2005, Ley, et al. 2006b). The gut microbial community furthermore appears to affect human health in such profound ways as in helping in the development of the immune system, inhibition of pathogen colonisation, synthesis of certain vitamins etc. (Xu et al. 2003b).

The administration of broad-spectrum antibiotics sometimes leads to severe health problems, caused by pathogenic organisms which can establish themselves when the normal flora has been knocked out. A solution to this, recently reported in several newspapers, is to introduce a ‘normal’ gut flora by faecal transplants from healthy individuals (Macconnachie et al. 2009, Persky et al. 2000, Schwan et al. 1984, van Nood et al. 2013).

The studied Bacteroides species

The gut bacteria studied in this thesis are the closely related Bacteroides thetaiotaomicron and Bacteroides ovatus. In the distal human gut, species of the Bacteroidetes phylum are particularly dominant, comprising around 30 % of the total population of gut bacteria, and reaching densities of 10¹⁰–10¹¹ cells per gram of human faecal matter (Ley et al. 2006a). Bacteroides thetaiotaomicron has served as a culturable and genetically tractable model species in studies of human-gut/microbial symbiosis. The whole-genome sequencing of this organism in 2003 was a watershed in the field (Xu, et al. 2003a); currently there is a growing number of genome sequences for related Bacteroidetes, and transcriptomics data are now available for selected model species (Kim et al. 2011, Martens, et al. 2011, Nelson, et al. 2010, Xu, et al. 2003a). Particularly striking is the vast number of predicted glycoside hydrolase and polysaccharide lyase family members in individual Bacteroidetes genomes, which typically exceeds that of other human gut bacteria, e.g. representative Firmicutes (Cantarel, et al. 2009, Mahowald et al. 2009, Xu, et al. 2003a).

B. ovatus is closely related to B. thetaiotaomicron, and the two species share 96.5 % nucleotide sequence identity in their respective 16S rRNA genes (Martens, et al. 2011). With the exception of cellulose, these two species are together able to degrade all major mucosal and plant-derived polysaccharides that may enter the human gut. While the two bacteria share some common glycan preferences, a comparison of their carbohydrate-degrading abilities reveals that B. thetaiotaomicron is adapted to forage on host glycans and pectins, whereas B. ovatus is a proficient degrader of all common hemicelluloses. These data show how different bacteria have adapted not only to spatial niches, but also to nutritional niches as specific as distinct polysaccharides, in order to thrive in the gut ecosystem.

Polysaccharide Utilisation Loci and the Sus paradigm

Species of Bacteroidetes utilise complex genetic systems, called Polysaccharide Utilisation Loci (PULs), to degrade certain polysaccharides (Martens et al. 2009). These PULs are composed of multiple co-regulated genes, and the transcription of an individual PUL is regulated by the sensing of a corresponding carbohydrate. The existence of PULs was discovered through studies on the archetypal Starch Utilisation System (Sus) of B. thetaiotaomicron, which has been described in detail over the past two decades, and was founded on ground-breaking work by Prof. Abigail Salyers and co-workers (Cho et al. 2001, D'Elia et al. 1996a, D'Elia et al. 1996b, Reeves et al. 1997, Shipman et al. 1999). The Sus consists of eight contiguous genes, susRABCDEFG. Starch degradation by the Sus commences with the starch polysaccharide being bound by surface-attached proteins (SusD, SusE and SusF) and hydrolysed by an endo-amylase (SusG) into oligosaccharides, which are imported into the periplasm by a TonB-ExbBD complex, using the proton-motive force (Martens, et al. 2009). The imported oligosaccharides are subsequently hydrolysed into glucose by periplasmic glycoside hydrolases: the neopullulanase (SusA) and the α-glucosidase (SusB). Imported oligosaccharides also bind to the N-terminal, periplasmic, segment of a hybrid two-component system (HTCS; SusR) located in the inner membrane, and productive binding triggers a signal across the membrane into the cytoplasm, which leads to transcriptional activation of the susABCDEFG genes. Sensing of the degradation products from the early actions of the Sus hydrolases thus informs the bacterium that starch is present in the cell’s surroundings, allowing a highly specific response in a rapidly shifting environment. Neither of the binding proteins SusE and SusF, which have been shown through in vitro experiments to contain multiple carbohydrate binding modules (CBMs), is required for the bacterium to utilise starch, nor are they highly homologous to gene products found in other genomes thus far (Cameron et al. 2012). Similarly, either one of the hydrolases SusA and SusB can be removed as long as the other remains, as both can hydrolyse maltooligosaccharides, albeit with different specificities. The backbone-cleaving SusG is, however, indispensable for growth.
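The dispensability relationships described for the Sus proteins can be summarised as a small piece of Boolean logic. The Python sketch below encodes only the statements made in the text (SusG is indispensable, SusA and SusB are mutually redundant, SusE and SusF are dispensable); the function name and the choice to reject proteins whose deletion phenotype is not described here are assumptions made for illustration, and the sketch says nothing about growth rates or about other Sus components.

```python
# Illustrative sketch only; names are hypothetical and the logic encodes
# nothing beyond the deletion phenotypes stated in the text.

MODELLED = {"SusA", "SusB", "SusE", "SusF", "SusG"}


def can_utilise_starch(deleted):
    """Return True if starch utilisation is expected with these deletions."""
    deleted = set(deleted)
    unknown = deleted - MODELLED
    if unknown:
        raise ValueError(
            "phenotype not described in the text for: " + ", ".join(sorted(unknown))
        )
    # The backbone-cleaving SusG is indispensable for growth on starch.
    if "SusG" in deleted:
        return False
    # SusA and SusB are redundant: at least one of them must remain.
    if {"SusA", "SusB"} <= deleted:
        return False
    # SusE and SusF are not required.
    return True


if __name__ == "__main__":
    for knockouts in [set(), {"SusE", "SusF"}, {"SusA"}, {"SusA", "SusB"}, {"SusG"}]:
        print(sorted(knockouts), can_utilise_starch(knockouts))
```

Raising an error for unmodelled proteins is a deliberate choice: it avoids silently predicting a phenotype for components, such as SusC, SusD or SusR, whose deletion is not discussed in this section.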

The presence of a pair of SusC/D homologous proteins is the signature of all PULs, and is used to identify novel PULs in the genomes of Bacteroidetes species (Martens, et al. 2009, Martens, et al. 2011). In functional PULs, the SusC/D homologous pair is coupled to a variable number of glycoside hydrolases, reflecting the monosaccharide and linkage complexity of the targeted polysaccharide. Analogous to the Sus, in other PULs, the SusC-like proteins import oligosaccharides, aided by SusD-like proteins which capture specific glycans at the cell surface, while the hydrolases both cleave the backbone of polysaccharides at the cell surface and degrade imported oligosaccharides down to the constituent monosaccharides in the periplasm. Similarly, SusR-like proteins can be expected to act as sensors and regulators of their respective PULs, specifically binding to early break-down products of the targeted polysaccharide, thus enabling the cell to sense and rapidly respond to the different types of carbohydrates present in the surroundings. An interesting observation is that in many cases, these bacteria grow faster on poly- and oligosaccharides than on the monosaccharide constituents, which is likely because the SusR-like proteins are adapted to bind oligosaccharides rather than monosaccharides, which are rapidly consumed in the gut (Martens, et al. 2011).
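As a minimal illustration of the SusC/D signature described above, the sketch below scans an ordered list of annotated genes for adjacent SusC-like/SusD-like pairs and collects the glycoside hydrolases encoded nearby. The gene records, the annotation labels ("SusC-like", "SusD-like", "GH…") and the window size are hypothetical and chosen only for demonstration; actual PUL prediction relies on curated homology searches rather than on simple label matching.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    locus_tag: str    # hypothetical identifier
    annotation: str   # hypothetical label, e.g. "SusC-like", "SusD-like", "GH31"


def find_candidate_puls(genes: List[Gene], window: int = 5) -> List[dict]:
    """Return one record per adjacent SusC-like/SusD-like pair.

    `genes` must be ordered as on the chromosome. `window` is the number of
    genes inspected on each side of the pair for glycoside hydrolases (GH).
    """
    candidates = []
    for i in range(len(genes) - 1):
        pair = {genes[i].annotation, genes[i + 1].annotation}
        if pair != {"SusC-like", "SusD-like"}:
            continue
        start = max(0, i - window)
        end = min(len(genes), i + 2 + window)
        hydrolases = [g.locus_tag for g in genes[start:end]
                      if g.annotation.startswith("GH")]
        candidates.append({
            "pair": (genes[i].locus_tag, genes[i + 1].locus_tag),
            "hydrolases": hydrolases,
        })
    return candidates


# Toy, invented gene order for demonstration only.
toy_genome = [
    Gene("g01", "hypothetical"),
    Gene("g02", "HTCS"),
    Gene("g03", "SusC-like"),
    Gene("g04", "SusD-like"),
    Gene("g05", "GH5"),
    Gene("g06", "GH31"),
    Gene("g07", "transporter"),
]
puls = find_candidate_puls(toy_genome)
```

In this toy example the single SusC/D pair is reported together with the two neighbouring hydrolase genes, mirroring how the number and families of hydrolases in a locus hint at the complexity of the targeted polysaccharide.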

In B. thetaiotaomicron, 88 putative PULs have been identified, and including accessory proteins to the SusC/D pair, these PULs make up almost 20 % of the bacterium’s genome (Martens et al. 2008, Martens, et al. 2011). As a close relative of B. thetaiotaomicron, B. ovatus has also devoted a substantial part of its genome to PUL structures, and 112 putative PULs have been identified (Martens, et al. 2011). The large investment in PULs in these two species, coupled to the dominant status of Bacteroides species in the human gut, shows how this strategy of carbohydrate sensing and breakdown has given these species an advantage over other microorganisms in a highly competitive environment.

AIM OF INVESTIGATION

The overall scope of this thesis concerns the microbial degradation of carbohydrates, with a special focus on the hemicellulose xyloglucan (XyG). The initial goals were to discover novel α-xylosidases able to hydrolyse the xyloside substitutions of this polysaccharide, which could potentially open up routes to new innovative applications where the XyG chains could be specifically modified to generate value-added materials with new properties, as well as novel enzyme substrates. The approach was to screen the members of Glycoside Hydrolase Family 31 (GH31) from the soil bacterium Cellvibrio japonicus and the human gut bacterium Bacteroides thetaiotaomicron, as GH31 is the only family which includes α-xylosidases. Following this initial scheme, a highly xylogluco-oligosaccharide (XyGO)-specific α-xylosidase from C. japonicus was discovered and characterised biochemically and structurally. In addition, the screening uncovered from the same bacterium an enzyme with a novel activity for the family, the CjAgd31B α(1→4)-glucan transglucosidase. Further, from B. thetaiotaomicron, several new activities for the family were discovered, related to in the human gut.

This initial broad investigation expanded into the detailed study of XyG utilisation in both the aforementioned C. japonicus and a human gut bacterium closely related to B. thetaiotaomicron, namely Bacteroides ovatus. Both C. japonicus and B. ovatus possess the ability to utilise XyG as sole carbon source, and thus harbour the enzymes needed to deconstruct this complex polysaccharide. The identification of a discrete locus in B. ovatus, predicted to be involved in XyG degradation, led to a project with the aims of biochemically characterising all eight predicted hydrolases of the locus, as well as performing knock-out and structural studies to fully appreciate the molecular processes providing this gut symbiont with the ability to utilise XyG. A similar XyG-related locus was identified in C. japonicus, as a development of the C. japonicus α-xylosidase project, albeit lacking a number of the enzymatic activities required for full XyG hydrolysis. The locus instead appears to be geared towards the degradation and import of XyGOs, which demonstrates how different bacteria have evolved separate systems to exploit XyG, which is bound to be abundant both in the human gut and on the forest floor.

RESULTS AND DISCUSSION

Enzyme discovery by Glycoside Hydrolase family activity screening

In the discovery of new carbohydrate-active enzymes (CAZymes), several approaches may be used. For the work presented in Publications I, II, III (as a follow-up project), IV and VII, whole Glycoside Hydrolase (GH) family screening was used, comprising cloning, expression and biochemical characterisation of all members of a particular GH family from selected bacteria in order to potentially discover novel enzymatic activities or features. The enzymes studied in these publications come from the soil bacterium C. japonicus and the human gut symbiont B. thetaiotaomicron, and the GH families studied were GH31 for Publications I, III, IV and VII, and GH43 for Publication II.

GH31

The initial intention of the GH31 project was to discover novel α-xylosidases active on xyloglucan (XyG). GH31 is the only family containing α-xylosidases (Cantarel, et al. 2009), and as such the natural target to find new enzymes with this activity. Previous studies on XyG-specific α-xylosidases had shown that GH31 α-xylosidases are able to selectively remove the terminal xyloside moiety from the extreme non-reducing end of xylogluco-oligosaccharides (XyGOs) (Fanutti et al. 1991, Moracci et al. 2000, Sampedro et al. 2001). The main hope of the study was to discover enzymes able to access ‘internal’ xyloside moieties on XyGOs, which could lead to novel value-added materials as well as new enzyme substrates.

CjXyl31A

From the three GH31 genes in C. japonicus, an α-xylosidase was indeed identified, and in Publication I, the characterisation of this highly specific XyGO-degrading α-xylosidase, CjXyl31A, was presented, using both detailed biochemical and structural analyses, in collaboration with Prof. Gideon Davies and co-workers. In keeping with previously characterised α-xylosidases, the enzyme exclusively hydrolysed the xylose moiety on the extreme non-reducing end of XyGOs, but was also shown to have a preference for XyGOs with a backbone length equal to or greater than three glucose residues (Table 1).

Table 1: Activity of CjXyl31A on various substrates.

Substrate            kcat (s-1)           Km (mM)         kcat/Km (s-1 mM-1)   ΔΔG‡ (kJ mol-1)
pNP-α-Xyl            0.58 ± 0.026         0.39 ± 0.06     1.48
pNP-α-Glc            [illegible] ×10-4    5.31 ± 1.0      [illegible] ×10-3
pNP-α-Gal            N/D                  N/D             N/D
pNP-β-L-Arap         N/D                  N/D             N/D
XXXG                 26.7 ± 1.5           0.54 ± 0.073    49.5
XLLG                 33.6 ± 3.2           2.6 ± 0.42      13.1                 3.30
XX                   31.2 ± 1.5           0.72 ± 0.076    44.7                 0.25
XGol                 9.39 ± 0.61          0.70 ± 0.013    13.4                 3.23
X (Isoprimeverose)   2.94 ± 0.36          55.00 ± 7.4     0.05                 17.1

ΔΔG‡ was calculated using the formula ΔΔG‡ = RT ln((kcat/Km)[XXXG] / (kcat/Km)[XGO])
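The ΔΔG‡ column of Table 1 can be reproduced directly from the kcat/Km values with the formula given in the table footnote. The short sketch below does this with XXXG as the reference substrate. The temperature of 298.15 K (25 °C) is an assumption, as the assay temperature is not stated alongside the table, and the function and variable names are illustrative only.

```python
import math

R = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
T = 298.15           # assumed temperature in K (25 °C); not stated with Table 1


def ddg_kj_per_mol(kcat_km_ref: float, kcat_km_sub: float, temp_k: float = T) -> float:
    """ΔΔG‡ = RT ln(ref / substrate), with XXXG as the reference substrate."""
    return R * temp_k * math.log(kcat_km_ref / kcat_km_sub)


# kcat/Km values (s^-1 mM^-1) taken from Table 1.
reference = 49.5  # XXXG
table_1 = {"XLLG": 13.1, "XX": 44.7, "XGol": 13.4, "X (isoprimeverose)": 0.05}

ddg = {name: round(ddg_kj_per_mol(reference, value), 2)
       for name, value in table_1.items()}
```

Under the 25 °C assumption, the recomputed values agree with the tabulated ΔΔG‡ values to within about 0.01 kJ mol-1, the small residual differences being attributable to rounding of the reported kcat/Km values.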

This observation was attributed to a novel PA14 (named from the protective antigen of anthrax toxin (Petosa et al. 1997)) domain insert in the N-terminal domain of the enzyme (Figure 8). The N-terminal domains of GH31 enzymes have previously been presumed to be involved in substrate recognition (Lovering et al. 2005), but this was the first presented case of a

and the binding of anti-CjXyl31A antibodies to the fractions was assayed by the western blot technique. CjXyl31A was identified as a lipoprotein, enriched in the membrane fractions. The localisation studies, together with the detailed biochemical data, allowed for a tentative model of the pathway of XyG degradation by this soil bacterium. In connection to the localisation studies, it was also shown that C. japonicus secretes significant endo-xyloglucanase activity into the growth medium when the carbon source is XyGOs.

In a follow-up study, Publication III, the interaction between CjXyl31A and XyGOs was further dissected using advanced NMR techniques, in collaboration with Prof. Antonio Molinaro and co-workers. The focus of the study was, specifically, the reaction of XXXG→GXXG, performed by the enzyme. In order to be able to follow the reaction, both the wild-type enzyme and a catalytically crippled mutant, CjXyl31A_D659A, were produced, as well as purified GXXG (the reaction product). Thus, the reaction between wild-type enzyme and XXXG, the interaction between wild-type enzyme and GXXG, and the interaction between CjXyl31A_D659A and XXXG could be monitored. In saturation transfer difference (STD) NMR, the interaction between a small molecule and a macromolecule can be studied, in this case an oligosaccharide and an enzyme (Haselhorst et al. 2009). NMR is performed on the free oligosaccharide, or ligand, as well as on the ligand together with the macromolecule. As macromolecules in general are too bulky to give NMR signals, only the spectra of the ligands are observed, and by subtracting the spectra obtained with or without added macromolecule, the parts of the ligand that interact with the macromolecule can be identified (Haselhorst, et al. 2009).

The study revealed how the individual chemical groups of the assayed oligosaccharides interact with different intensity with the enzyme, and as expected, the whole length of the oligosaccharides tested did interact with CjXyl31A and the mutant (Figure 9). The data obtained, together with molecular modelling, led to a refined model of the CjXyl31A-XXXG enzyme-substrate complex. Furthermore, for the first time in GH31, the stereospecificity of a family member could be monitored directly.

substrate (Figure 10B), and exclusively utilises maltooligosaccharides and/or starch as donor substrates, with the highest activity observed on maltotriose. Also, hydrolysis was only observed for the disaccharide substrate, although very weak even in this case. As acceptor substrates, CjAgd31B was able to use both α(1→4)- and α(1→6)-linked isomaltooligosaccharides.

The nearly complete lack of hydrolysis observed for the enzyme was investigated through detailed study of the three-dimensional structure of CjAgd31B, again in collaboration with Prof. Gideon Davies and co-workers. The enzyme apparently circumvents hydrolysis, even in ~55 M water, by preventing water molecules from adopting an appropriate geometry relative to, or interacting with, the catalytic acid/base residue. Instead, only a sugar moiety can interact favourably with the enzyme’s active site residues, as seen for the trapped glycosyl-enzyme intermediates. These observations may help explain how transglycosylases from other families avoid hydrolysis by similar means.
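The figure of ~55 M quoted above is simply the molar concentration of water in a dilute aqueous solution. A back-of-the-envelope check is sketched below, assuming a density of about 997 g per litre (pure water near 25 °C); both the density and the variable names are assumptions made for illustration.

```python
DENSITY_G_PER_L = 997.0         # approximate density of water near 25 °C (assumed)
MOLAR_MASS_G_PER_MOL = 18.015   # molar mass of H2O

# Concentration of water in (nearly) pure water, in mol per litre.
water_molarity = DENSITY_G_PER_L / MOLAR_MASS_G_PER_MOL
```

The result lies between 55 and 56 M, which illustrates how vastly water outnumbers any sugar acceptor at typical millimolar substrate concentrations.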

The action of CjAgd31B was inhibited by the pseudo-saccharide acarbose, which was also successfully co-crystallised in the active site of the enzyme in the structural studies. The crystal structure revealed how the enzyme adopts a classical GH31 fold, with an N-terminal domain, the central (α/β)8-barrel and two C-terminal domains (Figure 11A). The enzyme was also shown to possess at least four subsites for oligosaccharide recognition (-1 to +3), comprising a tyrosine clamp at the +2 and +3 subsites for oligosaccharide binding and recognition, which is a novel observation for the enzyme family (Figure 11B).

generate substrates for glycogen phosphorylases. The novel 4-α-glucosyltransferase activity of CjAgd31B has not previously been described in the literature for any other enzyme.

Bacteroides thetaiotaomicron GH31 targets

The other aspect of the GH31 screening project concerned the six family members encoded by the gut bacterium Bacteroides thetaiotaomicron, and the results from this study are presented in Publication VII. B. thetaiotaomicron is, as has been described in previous sections, a dominant member of the human gut microbiota, and has been shown to be an especially proficient degrader of host glycans and pectins (Martens, et al. 2011). However, B. thetaiotaomicron has, in the time since the GH31 project commenced, been shown to be unable to grow on hemicelluloses, and as such, the chance of finding α-xylosidases active on XyG was slim.

Novel activities for the family were nonetheless discovered in the screening of the enzymes encoded by B. thetaiotaomicron, with α-galactosidase, α-N-acetylgalactosaminidase and strict α(1→6)-glucosidase activities observed.

The enzyme BT3086 was shown to be an exo-acting, α(1→6)-linkage specific glucosidase, active on dextran, which could be expected, as BT3086 is part of a PUL which is upregulated when B. thetaiotaomicron is grown on dextran. Enzymes able to hydrolyse α(1→6)-linkages are known in GH31, such as those from humans (Heymann et al. 1995, Sim et al. 2010). However, unlike BT3086, these enzymes are promiscuous and able to address multiple glucosidic linkages.

BT3085 was active on Gal-α-PNP, though curiously, hydrolysis was not observed on the trisaccharide raffinose (Galp-α(1→6)-Glcp-α(1→2)-Fru), the disaccharide melibiose (Galp-α(1→6)-Glcp) or conjac galactomannan polysaccharide. Likely, the enzyme is not active on α(1→6)-linkages, but instead a substrate might be the α(1→3)-linked galactosyl units found in blood group B epitopes (Brockhausen, et al. 2009).

BT3169 hydrolysed the substrate N-acetylgalactosamine-α-PNP (GalNAc-α-PNP) with high efficiency. A paucity of available natural substrates, however, hinders a definitive determination of its substrate(s) in vivo. As B. thetaiotaomicron degrades many host glycans, likely targets are either the α-linked galactosyls found in the cores of mucins or the α(1→3)-linked GalNAc units found in blood group A epitopes (Brockhausen, et al. 2009).

Of the remaining enzymes, BT3299 did not hydrolyse any of the substrates assayed, while BT0339 and BT3659 only showed activity on Xyl-α-PNP. Neither of these α-xylosidases could address the α(1→6)-linked xylosyl decorations of XyGOs, even though XyG is the predominant source of α-linked xylose in nature. As B. thetaiotaomicron does not grow on XyG, this observation is however not illogical, and the enzymes may very well be active on α-linked xylose from a yet unknown host glycan or plant-derived source.

GH43

In an enzyme discovery study (Publication II) related to the work on GH31, all GH43 enzymes from both C. japonicus and B. thetaiotaomicron were characterised biochemically, by the group of Prof. Harry Gilbert. Although the study comprised 47 unique putative enzymes, the one major finding of the project was the arabinan-specific α(1→2)-arabinofuranosidase, CjAb43A. Biochemical and structural analysis of this enzyme, and biochemical analysis of the homologous Bt0369, are reported in the paper. The remaining enzymes were typical xylan-acting α-arabinofuranosidases, weak endo-xylanases, and endo-arabinanases, with differing specificities for branched or linear arabinan. A phylogenetic tree was constructed for all the studied genes, and different clades corresponding to the different activities could be identified, which allows for future predictions of GH43 enzymes from other organisms.

Enzyme discovery from genetic loci

During the course of the thesis work, the field has shifted rapidly from a focus on individual enzymes or GH families to a more system-oriented approach, especially following studies of the PULs of Bacteroidetes species. Screening the activities of GH members from certain bacteria can, as described above, reveal activities novel to the family, but there is also a risk, as in the case of Publication II, for the yield of novel activities to be quite meagre. When searching for a particular enzymatic activity, another possible route is to look at genetic loci of bacteria, as many bacterial genes are co-expressed in operons or gene clusters (Osbourn et al. 2009). The PULs of Bacteroidetes are especially sophisticated examples of carbohydrate catabolism-related loci, as they contain not only enzymes for carbohydrate degradation, but also proteins for the binding, sensing and transport of carbohydrates as well as PUL regulation proteins (Martens, et al. 2009, Martens, et al. 2011). Thus, when looking for an enzyme able to act on a certain linkage in an oligo- or polysaccharide, there is a high chance of finding it in a PUL predicted to act on said substrate.

In order to assay the activity of CjAfc95A, fucogalactoxyloglucan was extracted from iceberg lettuce. Both CjBgl35A and CjAfc95A were active on and highly specific for XyGOs, and to a lower degree on XyG polysaccharides, and their modes of action were studied in detail. Both enzymes were able to fully strip off all of their respective target monosaccharide substitutions from the substrates, in contrast to the α-xylosidase CjXyl31A.

In order to determine whether the genes of the locus were in fact transcribed as C. japonicus senses and breaks down XyG, a qPCR study was undertaken. C. japonicus cultures were grown in minimal media supplemented with either glucose or XyGOs as carbon source, and total RNA was extracted and converted into cDNA to be used as qPCR templates. In the genome of C. japonicus there are three additional predicted β-galactosidases and one additional α-L-fucosidase; these genes were included in the study as reference genes. The four locus genes were all found to be highly upregulated in the cultures containing XyGOs compared to the glucose cultures. The reference genes were either not upregulated at all or upregulated to a much lower level than the locus genes, indicating that the locus is the actual site for transcription of these necessary activities in XyG metabolism (Figure 13).
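The text does not state how relative transcript levels were calculated. One widely used scheme for this kind of comparison is the comparative Ct (2^-ΔΔCt) method, sketched below purely as an illustration. All Ct values are invented, and the use of a single normaliser gene is an assumption that should not be read as the procedure used in the study.

```python
def fold_change(ct_target_treated: float, ct_norm_treated: float,
                ct_target_control: float, ct_norm_control: float) -> float:
    """Relative expression by the comparative Ct method (2^-ΔΔCt).

    'treated' = culture grown on XyGOs, 'control' = culture grown on glucose.
    Each target Ct is first normalised to a normaliser gene (ΔCt), and the
    two conditions are then compared (ΔΔCt).
    """
    delta_treated = ct_target_treated - ct_norm_treated
    delta_control = ct_target_control - ct_norm_control
    return 2.0 ** -(delta_treated - delta_control)


# Invented Ct values for a strongly induced gene and a barely induced gene.
locus_gene = fold_change(20.0, 18.0, 26.0, 18.0)
other_gene = fold_change(25.0, 18.0, 25.5, 18.0)
```

With these invented numbers the first gene is induced 64-fold and the second by less than twofold, which is the kind of contrast described above between the locus genes and the additional predicted β-galactosidase and α-L-fucosidase genes.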

By following the fractionation procedure established in Publication I, proteins were obtained from secreted, periplasmic and intracellular fractions of C. japonicus grown in minimal media containing either glucose or XyGOs as carbon source. Substrates for β-galactosidases and α-L-fucosidases were used to assay the activities in each fraction, and a clear enrichment of both β-galactosidase and α-L-fucosidase activities could be observed in the periplasmic fractions.

A discrete locus in Bacteroides ovatus, endowing the bacterium with the ability to utilise xyloglucan

As described in previous sections, many strains from the Bacteroidetes phylum possess PULs, endowing them with the capability to degrade both simple and complex carbohydrate structures (Martens, et al. 2009, Martens, et al. 2011). Bacteroides thetaiotaomicron has been a key source of CAZymes and a useful model for the study of human-gut symbiosis. However, B. thetaiotaomicron is unable to utilise the various complex hemicelluloses found in plant cell walls, and instead prefers host glycans and pectins as nutrient sources (Martens, et al. 2011). Another member of Bacteroidetes with the ability to grow on all major hemicelluloses, B. ovatus, has therefore emerged as a potential treasure trove for CAZyme discovery. In fact, B. ovatus contains even more predicted PULs than the archetypal B. thetaiotaomicron, and new interesting enzymes and carbohydrate degradation pathways are bound to be discovered in the near future.

The ability to utilise xyloglucan (XyG) as a sole carbon source was demonstrated in B. ovatus through growth studies (Martens, et al. 2011), which initiated the project presented in Publication V. Through transcriptomics studies on B. ovatus grown in minimal media supplemented with XyG, a discrete PUL, endowing B. ovatus with the capability to utilise XyG (the XyloGlucan Utilisation Locus; XyGUL), was identified. The scope of the project was to characterise all encoded GH enzymes by biochemical, genetic and structural studies. Interestingly, the study also revealed that only a small number of human gut Bacteroides species have been shown to be able to utilise XyG, despite it being ubiquitous in land plants and as such plentiful as dietary fibre in a healthy human diet. Thus, B. ovatus seems to have adopted a niche in the human gut by targeting XyG.

Table 2: Summary of the kinetic parameters of the XyGUL hydrolases.

Enzyme    Substrate        kcat (s-1)              Km (mM)            kcat/Km (s-1 mM-1)
BoGH2A    Gal-β-PNP        4.0 ± 0.092             0.087 ± 0.0074     46.0
BoGH3A    Glc-β-PNP        Trace activity
          cellobiose       Trace activity
          cellotetraose    65.0 ± 3.81             2.49 ± 0.42        21.3
          cellohexaose     0.62 ± 0.15             0.698 ± 0.52       0.90
          GLLG             n.d.                    n.d.               0.15
          GXXG             n.d.                    n.d.               3.41
BoGH3B    Glc-β-PNP        1.74 ± 0.21             0.155 ± 0.039      11.2
          Gal-β-PNP        Trace activity
          cellobiose       1.57 ± 0.11             0.68 ± 0.16        2.3
          cellotetraose    3.99 ± 0.31             0.23 ± 0.060       17.3
          cellohexaose     4.91 ± 0.71             0.47 ± 0.21        10.4
          GLLG             n.d.                    n.d.               0.18
          GXXG             n.d.                    n.d.               3.34
BoGH5A    XXXG-β-CNP       10.5 ± 0.14             0.036 ± 0.0027     291.7
          XLLG-β-CNP       11.1 ± 1.13             0.145 ± 0.024      76.5
          GGGG-β-CNP       0.12 ± 0.025            3.59 ± 1.27        0.034
          Tamarind XG      435.3 ± 25.6            0.82 ± 0.17*       534.0*
          Lettuce XG       n.d.                    n.d.               543.1*
          Tomato XG        n.d.                    n.d.               501.5*
BoGH9A    Tomato XG        Active
BoGH31A   Xyl-α-PNP        1.60 ± 0.031            7.7 ± 0.273        0.21 0.0018
          Glc-α-PNP        (0.071 ± 0.015)         (31.8 ± 5.1)       (0.002165)
          XXXG             32.6 ± 2.1              0.223 ± 0.046      146.2
          XLLG             31.0 ± 1.9              0.378 ± 0.075      82.0
          Isoprimeverose   1.78 ± 0.21             38.1 ± 7.1         0.047
BoGH43A   L-Araf-α-PNP     0.057 ± 0.001           0.71 ± 0.03        0.081
          Xyl-β-PNP        0.26 ± 0.005            6.58 ± 0.21        0.039
          Tomato XGOs      n.d.                    n.d.               0.013*
          Tomato XG        Active
BoGH43B   L-Araf-α-PNP     5.0×10-4 ± 8.1×10-5     6.6 ± 2.3          7.6×10-5
          Tomato XGOs      n.d.                    n.d.               0.0024*
          Tomato XG        Active

* Km given in mg ml-1 and kcat/Km in s-1 mg-1 ml
n.d.: not determined
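The catalytic efficiencies in Table 2 are the ratio of the turnover number to the Michaelis constant. As a minimal sketch, the snippet below recomputes kcat/Km for three rows of the table from the reported kcat and Km values; the function name and the selection of rows are illustrative choices. Not every row of the table is reproduced exactly by this simple division, presumably because the reported efficiencies derive from fits to unrounded data.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat/Km; units follow from the inputs (here s^-1 mM^-1)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


# (kcat in s^-1, Km in mM, reported kcat/Km) copied from Table 2.
rows = {
    "BoGH2A / Gal-β-PNP": (4.0, 0.087, 46.0),
    "BoGH5A / XXXG-β-CNP": (10.5, 0.036, 291.7),
    "BoGH31A / XXXG": (32.6, 0.223, 146.2),
}

recomputed = {name: round(catalytic_efficiency(kcat, km), 1)
              for name, (kcat, km, _reported) in rows.items()}
```

For these three rows the recomputed values match the reported ones after rounding to one decimal place.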

To further assess the biological function of the XyGUL, knock-out studies were performed on both individual genes and the entire XyGUL, by Prof. Eric Martens and co-workers. When the entire XyGUL was knocked out, the ability to grow on XyG was completely abolished, clearly correlating the existence of the XyGUL with XyG utilisation. Additionally, knocking out the endo-xyloglucanase BoGH5A resulted in the same phenotype, showing it to be crucial for the utilisation of XyG polysaccharides. However, the phenotype could be rescued by feeding the bacteria with XyGOs generated by recombinantly produced BoGH5A. Also, knocking out the only GH31 enzyme of the XyGUL, BoGH31A, resulted in a phenotype unable to grow on either polysaccharide or oligosaccharide XyG, showing that the removal of pendant xyloside substitutions is key in depolymerising XyGOs into monosaccharides. The rationale behind the severe ΔBoGH31A phenotype is illuminated by the flowchart of XyG depolymerisation shown in Figure 16. An interesting observation regarding the GH31 enzyme, in particular for this thesis, is that the enzyme has a predicted PA14 domain insert in the N-terminal domain, analogous to that found in the CjXyl31A α-xylosidase described above. The high specificity for XyGOs can thus tentatively be ascribed to the PA14 domain also in the B. ovatus enzyme.

Carbohydrate-binding Often N-terminal) domain (Mello et al. 2010), yielding the first presentation of such a structure in the literature. The structure of BoGH5A was solved in two conformations, in which the BACON domain adopted two entirely different orientations. This suggests that the BACON domain can move rather freely relative to the catalytic domain. The “Carbohydrate-binding” implied by the name of the BACON domain warranted analysis, and the domain was shown, by Dr. Alexandra Tauzin, to not bind any of the tested polysaccharides or proteins to a significant extent. Furthermore, using anti-BoGH5A antibodies, BoGH5A was shown to be a surface-exposed lipoprotein attached to the outer membrane at its N-terminus, analogous to the SusG amylase of the B. thetaiotaomicron Sus (Martens, et al. 2009, Shipman, et al. 1999). Taken together, these data indicate that the BACON domain acts as a linker domain, allowing the catalytic domain of BoGH5A freedom of movement at the cell surface, which is likely a benefit in hydrolysis of bulky polysaccharide substrates.

BoGH9A. The oligosaccharides bind to the SusD-like membrane-bound lipoprotein and are imported into the periplasm by the SusC-like TonB-dependent transporter. In the periplasm, the oligosaccharides are degraded into their monosaccharide constituents by the exo-acting hydrolases to be further metabolised. The HTCS-like protein, spanning the inner membrane, binds early degradation products of the endo-xyloglucanases and thereby triggers the transcription of the entire locus, sans itself.

This work provides detailed insight into the utilisation of such a complex hemicellulose as XyG by B. ovatus, and how this bacterium has carved out a nutritional niche in such a highly competitive environment as the human gut. The dominance of Bacteroides species in the gut indicates that such sophisticated nutrition acquisition systems as PULs can provide a significant advantage in a milieu with rapidly changing nutritional conditions. The insights provided by the study have not only revealed new CAZymes and shed light on the biology of B. ovatus, but may also be relevant for future studies regarding prebiotics and human health, as XyG is such a ubiquitous polysaccharide in human diets.

CONCLUSIONS

The work presented in this thesis concerns the discovery and characterisation of Carbohydrate-Active enZymes (CAZymes) from bacteria living in such varied environments as forests and the human distal gut.

Two main approaches have been employed in the discovery of new enzymes, with the first being family and species based screening of Glycoside Hydrolases (GHs). The members of a selected GH family and bacterial species are in this approach cloned, expressed and screened biochemically in order to identify enzymes with predicted activities or, in some cases, activities novel to the family. The strength of this approach is the chance of finding unexpected activities, as the CAZyme classification is based solely on amino acid sequence similarity. Though many GH families are monospecific, containing a single activity, many are not, and contain a wide range of activities, where individual enzymes (in general) share basic mechanisms and folds, but can address various sugars and linkages. A downside of the approach is, however, the large amount of activity screening that may be required, as the discovery of novel activities can in no way be guaranteed.

The other approach is to look at genetic loci predicted to encode a cluster of CAZymes acting on a given polysaccharide. This is especially promising in Bacteroidetes species, as they have evolved Polysaccharide Utilisation Loci, containing whole sets of enzymes and accessory proteins required for the degradation of specific polysaccharides. The SusC/D homologous pairs that are the signature of a PUL are easily identified by bioinformatic means, and by looking for predicted activities related to a given polysaccharide, relevant PULs may be identified. Similar loci or operons are also known in other bacteria, and can be a route for CAZyme discovery even in non-Bacteroidetes species. However, not all CAZymes are parts of PULs or operons, which is a natural weakness of the approach in the discovery of novel CAZymes.

In this work, the GH family screening approach has yielded several novel activities in GH31: a predominant α-galactosidase, an α(1→6)-linkage specific glucosidase and an α-N-acetylgalactosaminidase from Bacteroides thetaiotaomicron, and the α(1→4)-glucan transglucosidase CjAgd31B from Cellvibrio japonicus, as well as an arabinan-specific α(1→2)-arabinofuranosidase in GH43.

The other approach, looking at genetic loci, has led to the description of the xylogluco-oligosaccharide utilisation locus in C. japonicus, which contains three GHs and a TonB-dependent transporter. Localisation studies of the enzymes led to a model of the utilisation of xyloglucan polysaccharides employed by this efficient plant degrader. A more complete system was described for the gut symbiont B. ovatus, for which the whole Xyloglucan Utilisation Locus (XyGUL) was described. In-depth biochemical characterisation of the eight encoded GHs, coupled to genetic knock-outs and structural studies of the main endo-xyloglucanase, has revealed, in detail, how this bacterium can degrade such a complex polysaccharide as xyloglucan.

The publications collected in this thesis give insight into the discoveries made using the two enzyme screening approaches discussed. The thesis adds to the knowledge about CAZyme families, bacterial metabolism of carbohydrates and human nutrition. We feel that this is a considerable contribution to the field.

ACKNOWLEDGEMENTS

Of course, this thesis is not the result of the work done solely by me, but has been influenced by a number of people in an even greater number of ways, and these people deserve more credit than there is room for here.

First of all, I would like to thank Harry Brumer for welcoming me into the international community of CAZymologists, through my time in his group. Even after having moved to Vancouver, advice has never been more than an e-mail away. I really appreciate the way you’ve trusted me in managing all my various projects in a largely independent way, as well as the opportunities to travel all over the world – especially the 6 months in Canada.

The Brumer group has been quite diverse over the years, and has made the journey an enjoyable one. First, I’d like to acknowledge the last member of the Sweden-based team, Lauren; we started at the same time on the same EU project, and your coming here was a perfect transition in so many ways. Jens, great travel companion and lab know-it-all. Oliver, fellow initiator of the Bo project and sorely missed since the return to Austria. Nomchit, you are missed, as are fellow office-mates Gea (though not part of the Brumer lab) and Fredrika. Magnus, fantastic diploma student and now a firm friend. Farid and Sassa, having made great substrates over the years. Gustav, for all the MS work. The great Vancouver people: Tyler, Alex and Fatemeh. And all the others, Chunlin, Ana Catarina and the Stefans.

I would like to thank all the collaborators from other labs that I’ve worked with through the years: Gideon Davies and his structural biology crew, especially Glyn and Atsushi; Harry Gilbert et al.; the NMR specialists Antonio Molinaro and Alba Silipo; and Eric Martens and Nicole Koropatkin. I really appreciated my week at your place!

The glycoscience team and the people on floor 2: Vincent, for keeping an eye on my development by heading the Glycoscience division; Ela and Nina, for help with all those practicalities; Cortwa, for comic relief at work; Francisco and Sara, for taking my gentle slurs in good humour. And of course all the people on floor two, past and present. Finally I would like to thank Karl Hult, who encouraged me to go to Japan for my diploma project, which segued into several enjoyable years of biotech research.

I would also like to thank Clara and Paula, who have been there from the beginning of my studies at KTH to this day, and Mattias for all those (work?) lunches, and of course other friends, family and loved ones.

REFERENCES

Albers, S.-V. and Meyer, B.H. (2011) The archaeal cell envelope. Nat Rev Micro, 9, 414-426. Anzai, Y., Kim, H., Park, J.Y., Wakabayashi, H. and Oyaizu, H. (2000) Phylogenetic affiliation of the pseudomonads based on 16S rRNA sequence. Int J Syst Evol Microbiol, 50 Pt 4, 1563-1589. Arkema, K.K., Guannel, G., Verutes, G., Wood, S.A., Guerry, A., Ruckelshaus, M., Kareiva, P., Lacayo, M. and Silver, J.M. (2013) Coastal habitats shield people and property from sea-level rise and storms. Nature Clim Change, advance online publication. Aspeborg, H., Coutinho, P.M., Wang, Y., Brumer, H., 3rd and Henrissat, B. (2012) Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5). BMC Evol Biol, 12, 186. Bartnicki-Garcia, S. (1968) Cell wall chemistry, morphogenesis, and taxonomy of fungi. Annu Rev Microbiol, 22, 87-108. Baumann, M.G.D. and Conner, A.H. (2003) Carbohydrate polymers as adhesives. In Handbook of Adhesive Technology (Pizzi, A. and Mittal, K.L. eds): Dekker, pp. 495-510. Berg, R.D. (1996) The indigenous gastrointestinal microflora. Trends Microbiol, 4, 430-435. Bertozzi, C.R. and Rabuka, D. (2009) Structural basis of glycan diversity. In Essentials of Glycobiology, 2nd edition (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press, pp. 23- 36. Beylot, M.H., McKie, V.A., Voragen, A.G., Doeswijk-Voragen, C.H. and Gilbert, H.J. (2001) The Pseudomonas cellulosa glycoside hydrolase family 51 arabinofuranosidase exhibits wide substrate specificity. Biochem J, 358, 607- 614. Boraston, A.B., Bolam, D.N., Gilbert, H.J. and Davies, G.J. (2004) Carbohydrate- binding modules: fine-tuning polysaccharide recognition. Biochem J, 382, 769-781. Brockhausen, I., Schachter, H. and Stanley, P. (2009) O-GalNAc glycans. 
In Essentials of Glycobiology, 2nd edition (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press, pp. 115-128. Buckeridge, M.S. (2010) Seed cell wall storage polysaccharides: models to understand cell wall biosynthesis and degradation. Plant Physiol, 154, 1017-1023. Buckeridge, M.S., Pessoa dos Santos, H. and Tiné, M.A.S. (2000) Mobilisation of storage cell wall polysaccharides in seeds. Plant Physiol Bioch, 38, 141-156. Cameron, E.A., Maynard, M.A., Smith, C.J., Smith, T.J., Koropatkin, N.M. and Martens, E.C. (2012) Multidomain carbohydrate-binding proteins involved in Bacteroides thetaiotaomicron starch metabolism. J Biol Chem, 287, 34614-34625.

Campbell, J.A., Davies, G.J., Bulone, V.V. and Henrissat, B. (1998) A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities. Biochem J, 329 (Pt 3), 719. Cantarel, B.L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V. and Henrissat, B. (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res, 37, D233-D238. Cantarel, B.L., Lombard, V. and Henrissat, B. (2012) Complex carbohydrate utilization by the healthy human microbiome. PloS one, 7, e28742. Carpita, N. and McCann, M. (2000) The Cell Wall. In Biochemistry and Molecular Biology of Plants: Somerset, NJ: John Wiley & Sons, pp. 55-108. Carpita, N.C. and Gibeaut, D.M. (1993) Structural models of primary cell walls in flowering plants: consistency of molecular structure with the physical properties of the walls during growth. Plant J, 3, 1-30. Cho, K.H. and Salyers, A.A. (2001) Biochemical analysis of interactions between outer membrane proteins that contribute to starch utilization by Bacteroides thetaiotaomicron. J Bacteriol, 183, 7224-7230. Cobb, R.E., Si, T. and Zhao, H. (2012) Directed evolution: an evolving and enabling synthetic biology tool. Curr Opin Chem Biol, 16, 285-291. Cummings, J.H. and Macfarlane, G.T. (1997) Role of intestinal bacteria in nutrient metabolism. Clin Nutr, 16, 3-11. Cummings, R.D. and Doering, T.L. (2009) Fungi. In Essentials of Glycobiology (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor NY: The Consortium of Glycobiology Editors, La Jolla, California. D'Elia, J.N. and Salyers, A.A. (1996a) Contribution of a neopullulanase, a pullulanase, and an alpha-glucosidase to growth of Bacteroides thetaiotaomicron on starch. J Bacteriol, 178, 7173-7179. D'Elia, J.N. and Salyers, A.A. (1996b) Effect of regulatory protein levels on utilization of starch by Bacteroides thetaiotaomicron. J Bacteriol, 178, 7180-7186.
Dashtban, M., Schraft, H. and Qin, W. (2009) Fungal bioconversion of lignocellulosic residues; opportunities & perspectives. Int J Biol Sci, 5, 578- 595. Davies, G. and Henrissat, B. (1995) Structures and mechanisms of glycosyl hydrolases. Structure, 3, 853-859. Davies, G.J., Wilson, K.S. and Henrissat, B. (1997) Nomenclature for sugar-binding subsites in glycosyl hydrolases. Biochem J, 321 ( Pt 2), 557-559. Deboy, R.T., Mongodin, E.F., Fouts, D.E., Tailford, L.E., Khouri, H., Emerson, J.B., Mohamoud, Y., Watkins, K., Henrissat, B., Gilbert, H.J. and Nelson, K.E. (2008) Insights into plant cell wall degradation from the genome sequence of the soil bacterium Cellvibrio japonicus. J Bacteriol, 190, 5455- 5463. Deinema, M.H. and Zevenhuizen, L.P. (1971) Formation of cellulose fibrils by gram- negative bacteria and their role in bacterial flocculation. Arch Mikrobiol, 78, 42-51. Divne, C., Ståhlberg, J., Teeri, T.T. and Jones, T.A. (1998) High-resolution crystal structures reveal how a cellulose chain is bound in the 50 Å long tunnel of cellobiohydrolase I from Trichoderma reesei. J Mol Biol, 275, 309-325. Dixon, R.A. (2013) Microbiology: Break down the walls. Nature, 493, 36-37.

Doblin, M.S., Pettolino, F. and Bacic, A. (2010) Plant cell walls: the skeleton of the plant world. Functional Plant Biology, 37, 357-381. Dodd, D. and Cann, I.K. (2009) Enzymatic deconstruction of xylan for biofuel production. Global Change Biol, 1, 2-17. Ebringerová, A., Hromádková, Z. and Heinze, T. (2005) Hemicellulose. Adv Polym Sci, 186, 1-67. El Kaoutari, A., Armougom, F., Gordon, J.I., Raoult, D. and Henrissat, B. (2013) The abundance and variety of carbohydrate-active enzymes in the human gut microbiota. Nat Rev Microbiol, 11, 497-504. Elbein, A.D. (2009) Cytoplasmic carbohydrate molecules: trehalose and glycogen. In Microbial Glycobiology - Structures, Relevance and Applications: Academic Press, pp. 185-201. Ellinger, D., Naumann, M., Falter, C., Zwikowics, C., Jamrow, T., Manisseri, C., Somerville, S.C. and Voigt, C.A. (2013) Elevated early callose deposition results in complete penetration resistance to powdery mildew in Arabidopsis. Plant Physiol, 161, 1433-1444. Emami, K., Topakas, E., Nagy, T., Henshaw, J., Jackson, K.A., Nelson, K.E., Mongodin, E.F., Murray, J.W., Lewis, R.J. and Gilbert, H.J. (2009) Regulation of the xylan-degrading apparatus of Cellvibrio japonicus by a novel two-component system. J Biol Chem, 284, 1086-1096. Endler, A. and Persson, S. (2011) Cellulose synthases and synthesis in Arabidopsis. Mol Plant, 4, 199-211. Esko, J.D., Doering, T.L. and Raetz, C.R.H. (2009a) Eubacteria and Archaea. In Essentials of Glycobiology (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor NY: The Consortium of Glycobiology Editors, La Jolla, California. Esko, J.D., Kimata, K. and Lindahl, U. (2009b) Proteoglycans and sulfated glycosaminoglycans. In Essentials of Glycobiology (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). 
Cold Spring Harbor NY: The Consortium of Glycobiology Editors, La Jolla, California. Evans, J.M., Morris, L.S. and Marchesi, J. (2013) The gut microbiome: The role of a virtual organ in the endocrinology of the host. J Endocrinol, advance online publication. Falch, E.A. (1991) Industrial enzymes — Developments in production and application. Biotechnol Adv, 9, 643-658. Fangel, J.U., Ulvskov, P., Knox, J.P., Mikkelsen, M.D., Harholt, J., Popper, Z.A. and Willats, W.G. (2012) Cell wall evolution and diversity. Front Plant Sci, 3, 152. Fanutti, C., Gidley, M.J. and Reid, J.S.G. (1991) A xyloglucan-oligosaccharide-specific alpha-D-xylosidase or exo-oligoxyloglucan-alpha-xylohydrolase from germinated nasturtium (Tropaeolum majus L) seeds - Purification, properties and its interaction with a xyloglucan-specific endo-(1→4)-beta-D-glucanase and other hydrolases during storage-xyloglucan mobilization. Planta, 184, 137-147. Fincher, G.B. (2009) Revolutionary times in our understanding of cell wall biosynthesis and remodeling in the grasses. Plant Physiol, 149, 27-37.

Flint, H., Scott, K., Duncan, S., Louis, P. and Forano, E. (2012a) Microbial degradation of complex carbohydrates in the gut. Gut Microbes, 3. Flint, H.J. (2012) The impact of nutrition on the human microbiome. Nutr Rev, 70, S10-S13. Flint, H.J., Scott, K.P., Louis, P. and Duncan, S.H. (2012b) The role of the gut microbiota in nutrition and health. Nat Rev Gastroentero, 9, 577-589. Fry, S.C., York, W.S., Albersheim, P., Darvill, A., Hayashi, T., Joseleau, J.P., Kato, Y., Lorences, E.P., Maclachlan, G.A., Mcneil, M., Mort, A.J., Reid, J.S.G., Seitz, H.U., Selvendran, R.R., Voragen, A.G.J. and White, A.R. (1993) An unambiguous nomenclature for xyloglucan-derived oligosaccharides. Physiol Plantarum, 89, 1-3. Fushinobu, S., Alves, V.D. and Coutinho, P.M. (2013) Multiple rewards from a treasure trove of novel glycoside hydrolase and polysaccharide lyase structures: new folds, mechanistic details, and evolutionary relationships. Curr Opin Struc Biol. Fushinobu, S., Hidaka, M., Honda, Y., Wakagi, T., Shoun, H. and Kitaoka, M. (2005) Structural basis for the specificity of the reducing end xylose-releasing exo-oligoxylanase from Bacillus halodurans C-125. J Biol Chem, 280, 17180-17186. Gardner, J.G. and Keating, D.H. (2010) Requirement of the type II secretion system for utilization of cellulosic substrates by Cellvibrio japonicus. Appl Environ Microb, 76, 5079-5087. Gardner, J.G. and Keating, D.H. (2012) Genetic and functional genomic approaches for the study of plant cell wall degradation in Cellvibrio japonicus. In Methods in Enzymology: Cellulases (Gilbert, H.J. ed). Amsterdam: Elsevier, pp. 331-347. Garron, M.L. and Cygler, M. (2010) Structural and mechanistic classification of uronic acid-containing polysaccharide lyases. Glycobiology, 20, 1547-1573. Gidley, M.J., Lillford, P.J., Rowlands, D.W., Lang, P., Dentini, M., Crescenzi, V., Edwards, M., Fanutti, C. and Reid, J.S. (1991) Structure and solution properties of tamarind-seed polysaccharide. 
Carbohydr Res, 214, 299-314. Gilbert, H.J., Jenkins, G., Sullivan, D.A. and Hall, J. (1987) Evidence for multiple carboxymethylcellulase genes in Pseudomonas fluorescens subsp. cellulosa. Mol Gen Genet, 210, 551-556. Gilbert, H.J., Sullivan, D.A., Jenkins, G., Kellett, L.E., Minton, N.P. and Hall, J. (1988) Molecular cloning of multiple xylanase genes from Pseudomonas fluorescens subsp. cellulosa. J Gen Microbiol, 134, 3239-3247. Gilbert, R.G., Wu, A.C., Sullivan, M.A., Sumarriva, G.E., Ersch, N. and Hasjim, J. (2013) Improving human health through understanding the complex structure of glucose polymers. Anal Bioanal Chem. Hall, J. and Gilbert, H.J. (1988) The nucleotide sequence of a carboxymethylcellulase gene from Pseudomonas fluorescens subsp. cellulosa. Mol Gen Genet, 213, 112-117. Haselhorst, T., Lamerz, A.C. and Itzstein, M. (2009) Saturation transfer difference NMR spectroscopy as a technique to investigate protein-carbohydrate interactions in solution. Methods Mol Biol, 534, 375-386. Hazlewood, G.P. and Gilbert, H.J. (1998) Structure and function analysis of Pseudomonas plant cell wall hydrolases. Prog Nucleic Acid Re, Vol 61, 61, 211-241.

Hazlewood, G.P., Laurie, J.I., Ferreira, L.M. and Gilbert, H.J. (1992) Pseudomonas fluorescens subsp. cellulosa: an alternative model for bacterial cellulase. J Appl Bacteriol, 72, 244-251. Heymann, H., Breitmeier, D. and Gunther, S. (1995) Human small intestinal sucrase-isomaltase: different binding patterns for malto- and isomaltooligosaccharides. Biol Chem H-S, 376, 249-253. Himmel, M.E., Ding, S.Y., Johnson, D.K., Adney, W.S., Nimlos, M.R., Brady, J.W. and Foust, T.D. (2007) Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science, 315, 804-807. Hoffman, M., Jia, Z.H., Pena, M.J., Cash, M., Harper, A., Blackburn, A.R., Darvill, A. and York, W.S. (2005) Structural analysis of xyloglucans in the primary cell walls of plants in the subclass Asteridae. Carbohyd Res, 340, 1826-1840. Hogg, D., Pell, G., Dupree, P., Goubet, F., Martin-Orue, S.M., Armand, S. and Gilbert, H.J. (2003) The modular architecture of Cellvibrio japonicus mannanases in glycoside hydrolase families 5 and 26 points to differences in their role in mannan degradation. Biochem J, 371, 1027-1043. Hogg, D., Woo, E.J., Bolam, D.N., McKie, V.A., Gilbert, H.J. and Pickersgill, R.W. (2001) Crystal structure of mannanase 26A from Pseudomonas cellulosa and analysis of residues involved in substrate binding. J Biol Chem, 276, 31186-31192. Honda, Y. and Kitaoka, M. (2004) A family 8 glycoside hydrolase from Bacillus halodurans C-125 (BH2105) is a reducing end xylose-releasing exo-oligoxylanase. J Biol Chem, 279, 55097-55103. Horn, S.J., Vaaje-Kolstad, G., Westereng, B. and Eijsink, V.G. (2012) Novel enzymes for the degradation of cellulose. Biotechnol biofuels, 5, 45. Hsieh, Y.S.Y. and Harris, P.J. (2009) Xyloglucans of monocotyledons have diverse structures. Mol Plant, 2, 943-965. Humphry, D.R., Black, G.W. and Cummings, S.P. (2003) Reclassification of 'Pseudomonas fluorescens subsp cellulosa' NCIMB 10462 (Ueda et al. 
1952) as Cellvibrio japonicus sp nov and revival of Cellvibrio vulgaris sp nov., nom. rev. and Cellvibrio fulvus sp nov., nom. rev. Int J Syst Evol Micr, 53, 393-400. Jacobs, D.R., Jr. and Gallaher, D.D. (2004) Whole grain intake and cardiovascular disease: a review. Current Atheroscler Rep, 6, 415-423. Kaper, J.B. and Sperandio, V. (2005) Bacterial cell-to-cell signaling in the gastrointestinal tract. Infect Immun, 73, 3197-3209. Kim, M.S., Whon, T.W., Roh, S.W., Shin, N.R. and Bae, J.W. (2011) Draft genome sequence of Bacteroides faecis MAJ27(T), a strain isolated from human feces. J Bacteriol, 193, 6801-6802. Koropatkin, N.M., Cameron, E.A. and Martens, E.C. (2012) How glycan metabolism shapes the human gut microbiota. Nat Rev Microbiol, 10, 323-335. Koshland, J.D.E. (1953) Stereochemistry and the mechanism of enzymatic reactions. Biol Rev Camb Philos Soc, 28, 416-436. Lairson, L.L., Henrissat, B., Davies, G.J. and Withers, S.G. (2008) Glycosyltransferases: structures, functions, and mechanisms. Annu Rev Biochem, 77, 521-555. Lamport, D.T., Kieliszewski, M.J., Chen, Y. and Cannon, M.C. (2011) Role of the extensin superfamily in primary cell wall architecture. Plant Physiol, 156, 11- 19.

Larsbrink, J., Izumi, A., Hemsworth, G.R., Davies, G.J. and Brumer, H. (2012) Structural enzymology of Cellvibrio japonicus Agd31B protein reveals alpha- transglucosylase activity in glycoside hydrolase family 31. J Biol Chem, 287, 43288-43299. Levasseur, A., Drula, E., Lombard, V., Coutinho, P.M. and Henrissat, B. (2013) Expansion of the enzymatic repertoire of the CAZy database to integrate auxiliary redox enzymes. Biotechnol biofuels, 6, 41. Ley, R.E., Backhed, F., Turnbaugh, P., Lozupone, C.A., Knight, R.D. and Gordon, J.I. (2005) Obesity alters gut microbial ecology. P Natl Acad Sci USA, 102, 11070-11075. Ley, R.E., Peterson, D.A. and Gordon, J.I. (2006a) Ecological and evolutionary forces shaping microbial diversity in the human intestine. Cell, 124, 837-848. Ley, R.E., Turnbaugh, P.J., Klein, S. and Gordon, J.I. (2006b) Microbial ecology - Human gut microbes associated with obesity. Nature, 444, 1022-1023. Li, L.L., McCorkle, S.R., Monchy, S., Taghavi, S. and van der Lelie, D. (2009) Bioprospecting metagenomes: glycosyl hydrolases for converting biomass. Biotechnol Biofuels, 2, 10. Lombard, V., Bernard, T., Rancurel, C., Brumer, H., Coutinho, P.M. and Henrissat, B. (2010) A hierarchical classification of polysaccharide lyases for glycogenomics. Biochem J, 432, 437-444. Lovering, A.L., Lee, S.S., Kim, Y.W., Withers, S.G. and Strynadka, N.C.J. (2005) Mechanistic and structural analysis of a family 31 alpha-glycosidase and its glycosyl-enzyme intermediate. J Biol Chem, 280, 2105-2115. Lutz, S. (2010) Beyond directed evolution--semi-rational protein engineering and design. Curr Opin Biotechnol, 21, 734-743. Macconnachie, A.A., Fox, R., Kennedy, D.R. and Seaton, R.A. (2009) Faecal transplant for recurrent Clostridium difficile-associated diarrhoea: a UK case series. Qjm-Int J Med, 102, 781-784. MacDonald, J., Suzuki, H. and Master, E.R. (2012) Expression and regulation of genes encoding lignocellulose-degrading activity in the genus Phanerochaete. 
Appl Microbiol Biot, 94, 339-351. Mahowald, M.A., Rey, F.E., Seedorf, H., Turnbaugh, P.J., Fulton, R.S., Wollam, A., Shah, N., Wang, C.Y., Magrini, V., Wilson, R.K., Cantarel, B.L., Coutinho, P.M., Henrissat, B., Crock, L.W., Russell, A., Verberkmoes, N.C., Hettich, R.L. and Gordon, J.I. (2009) Characterizing a model human gut microbiota composed of members of its two dominant bacterial phyla. P Natl Acad Sci USA, 106, 5859-5864. Mandels, M. and Reese, E.T. (1957) Induction of cellulase in Trichoderma viride as influenced by carbon sources and metals. J Bacteriol, 73, 269-278. Mark, B.L., Vocadlo, D.J., Knapp, S., Triggs-Raine, B.L., Withers, S.G. and James, M.N. (2001) Crystallographic evidence for substrate-assisted catalysis in a bacterial beta-hexosaminidase. J Biol Chem, 276, 10330-10337. Martens, E.C., Chiang, H.C. and Gordon, J.I. (2008) Mucosal glycan foraging enhances fitness and transmission of a saccharolytic human gut bacterial symbiont. Cell Host Microbe, 4, 447-457. Martens, E.C., Koropatkin, N.M., Smith, T.J. and Gordon, J.I. (2009) Complex glycan catabolism by the human gut microbiota: The Bacteroidetes Sus-like paradigm. J Biol Chem, 284, 24673-24677.

Martens, E.C., Lowe, E.C., Chiang, H., Pudlo, N.A., Wu, M., McNulty, N.P., Abbott, D.W., Henrissat, B., Gilbert, H.J., Bolam, D.N. and Gordon, J.I. (2011) Recognition and degradation of plant cell wall polysaccharides by two human gut symbionts. Plos Biol, 9. McCarter, J.D. and Withers, S.G. (1994) Mechanisms of enzymatic glycoside hydrolysis. Curr Opin Struc Biol, 4, 885-892. McKie, V.A., Black, G.W., Millward-Sadler, S.J., Hazlewood, G.P., Laurie, J.I. and Gilbert, H.J. (1997) Arabinanase A from Pseudomonas fluorescens subsp. cellulosa exhibits both an endo- and an exo- mode of action. Biochemical J, 323 ( Pt 2), 547-555. McNeil, N.I. (1984) The contribution of the large intestine to energy supplies in man. Am. J. Clin. Nutr., 39, 338-342. Mello, L.V., Chen, X. and Rigden, D.J. (2010) Mining metagenomic data for novel domains: BACON, a new carbohydrate-binding module. FEBS Lett, 584, 2421-2426. Mishra, A. and Malhotra, A.V. (2009) Tamarind xyloglucan: a polysaccharide with versatile application potential. J Mater Chem, 19, 8528-8536. Mohnen, D. (2008) Pectin structure and biosynthesis. Curr Opin Plant Biol, 11, 266- 277. Moracci, M., Cobucci Ponzano, B., Trincone, A., Fusco, S., De Rosa, M., van Der Oost, J., Sensen, C.W., Charlebois, R.L. and Rossi, M. (2000) Identification and molecular characterization of the first alpha -xylosidase from an archaeon. J Biol Chem, 275, 22082-22089. 
Nelson, K.E., Weinstock, G.M., Highlander, S.K., Worley, K.C., Creasy, H.H., Wortman, J.R., Rusch, D.B., Mitreva, M., Sodergren, E., Chinwalla, A.T., Feldgarden, M., Gevers, D., Haas, B.J., Madupu, R., Ward, D.V., Birren, B., Gibbs, R.A., Methe, B., Petrosino, J.F., Strausberg, R.L., Sutton, G.G., White, O.R., Wilson, R.K., Durkin, S., Gujja, S., Howarth, C., Kodira, C.D., Kyrpides, N., Mehta, T., Muzny, D.M., Pearson, M., Pepin, K., Pati, A., Qin, X., Yandava, C., Zeng, Q.D., Zhang, L., Berlin, A.M., Chen, L., Hepburn, T.A., Johnson, J., McCorrison, J., Miller, J., Minx, P., Nusbaum, C., Russ, C., Sykes, S.M., Tomlinson, C.M., Young, S., Warren, W.C., Badger, J., Crabtree, J., Markowitz, V.M., Orvis, J., Cree, A., Ferriera, S., Gillis, M., Hemphill, L.D., Joshi, V., Kovar, C., Wetterstrand, K.A., Abouellleil, A., Wollam, A.M., Buhay, C.J., Ding, Y., Dugan, S., Fulton, L.L., Fulton, R.S., Holder, M., Hostetler, J., Allen-Vercoe, E., Clifton, S.W., Earl, A.M., Farmer, C.N., Giglio, M.G., Liolios, K., Surette, M.G., Torralba, M., Xu, Q., Pohl, C., Wilczek-Boney, K., Zhu, D.H. and Human Microbiome, J. (2010) A catalog of reference genomes from the human microbiome. Science, 328, 994-999. O'Dwyer, M.H. (1923) The Hemicelluloses. III. The hemicellulose of American white oak. Biochem J, 17, 501-509. Osbourn, A.E. and Field, B. (2009) Operons. Cell Mol Life Sci, 66, 3755-3775. Park, Y.B. and Cosgrove, D.J. (2012) A revised architecture of primary cell walls based on biomechanical changes induced by substrate-specific endoglucanases. Plant Physiol, 158, 1933-1943. Pauly, M., Albersheim, P., Darvill, A. and York, W.S. (1999) Molecular domains of the cellulose/xyloglucan network in the cell walls of higher plants. Plant J, 20, 629-639.

Peña, M.J., Darvill, A.G., Eberhard, S., York, W.S. and O'Neill, M.A. (2008) Moss and liverwort xyloglucans contain galacturonic acid and are structurally distinct from the xyloglucans synthesized by hornworts and vascular plants. Glycobiology, 18, 891-904. Pérez, S. and Bertoft, E. (2010) The molecular structures of starch components and their contribution to the architecture of starch granules: A comprehensive review. Starch-Starke, 62, 389-420. Persky, S.E. and Brandt, L.J. (2000) Treatment of recurrent Clostridium difficile-associated diarrhea by administration of donated stool directly through a colonoscope. Am J Gastroenterol, 95, 3283-3285. Petosa, C., Collier, R.J., Klimpel, K.R., Leppla, S.H. and Liddington, R.C. (1997) Crystal structure of the anthrax toxin protective antigen. Nature, 385, 833-838. Popper, Z.A. (2008) Evolution and diversity of green plant cell walls. Curr Opin Plant Biol, 11, 286-292. Popper, Z.A., Michel, G., Herve, C., Domozych, D.S., Willats, W.G., Tuohy, M.G., Kloareg, B. and Stengel, D.B. (2011) Evolution and diversity of plant cell walls: from algae to flowering plants. Annu Rev Plant Biol, 62, 567-590. Qin, J.J., Li, R.Q., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., Mende, D.R., Li, J.H., Xu, J.M., Li, S.C., Li, D.F., Cao, J.J., Wang, B., Liang, H.Q., Zheng, H.S., Xie, Y.L., Tap, J., Lepage, P., Bertalan, M., Batto, J.M., Hansen, T., Le Paslier, D., Linneberg, A., Nielsen, H.B., Pelletier, E., Renault, P., Sicheritz-Ponten, T., Turner, K., Zhu, H.M., Yu, C., Li, S.T., Jian, M., Zhou, Y., Li, Y.R., Zhang, X.Q., Li, S.G., Qin, N., Yang, H.M., Wang, J., Brunak, S., Dore, J., Guarner, F., Kristiansen, K., Pedersen, O., Parkhill, J., Weissenbach, J., Bork, P., Ehrlich, S.D., Wang, J. and Consortium, M. (2010) A human gut microbial gene catalogue established by metagenomic sequencing. Nature, 464, 59-U70. 
Rajan, S.S., Yang, X., Collart, F., Yip, V.L., Withers, S.G., Varrot, A., Thompson, J., Davies, G.J. and Anderson, W.F. (2004) Novel catalytic mechanism of glycoside hydrolysis based on the structure of an NAD+/Mn2+ -dependent phospho-alpha-glucosidase from Bacillus subtilis. Structure, 12, 1619-1629. Rayon, C., Lerouge, P. and Faye, L. (1998) The protein N-glycosylation in plants. J Exp Bot, 49, 1463-1472. Reeves, A.R., Wang, G.R. and Salyers, A.A. (1997) Characterization of four outer membrane proteins that play a role in utilization of starch by Bacteroides thetaiotaomicron. J Bacteriol, 179, 643-649. Sampedro, J., Sieiro, C., Revilla, G., Gonzalez-Villa, T. and Zarra, I. (2001) Cloning and expression pattern of a gene encoding an alpha-xylosidase active against xyloglucan oligosaccharides from Arabidopsis. Plant Physiol, 126, 910-920. Savur, G.R. and Sreenivasan, A. (1948) Isolation and characterization of tamarind seed (Tamarindus indica L.) polysaccharide. J Biol Chem, 172, 501-509. Schnepp, Z. (2013) Biopolymers as a flexible resource for nanochemistry. Angew Chem Int Ed, 52, 1096-1108. Schubert, C. (2006) Can biofuels finally take center stage? Nat Biotech, 24, 777-784. Schuster, A. and Schmoll, M. (2010) Biology and biotechnology of Trichoderma. Appl Microbiol Biot, 87, 787-799.

Schwan, A., Sjolin, S., Trottestam, U. and Aronsson, B. (1984) Relapsing Clostridium difficile enterocolitis cured by rectal infusion of normal feces. Scand J Infect Dis 16, 211-215. Scott, K.P., Duncan, S.H. and Flint, H.J. (2008) Dietary fibre and the gut microbiota. Nutr Bull, 33, 201-211. Senoura, T., Ito, S., Taguchi, H., Higa, M., Hamada, S., Matsui, H., Ozawa, T., Jin, S., Watanabe, J., Wasaki, J. and Ito, S. (2011) New microbial mannan catabolic pathway that involves a novel mannosylglucose phosphorylase. Biochem Biop Res Co, 408, 701-706. Shipman, J.A., Cho, K.H., Siegel, H.A. and Salyers, A.A. (1999) Physiological characterization of SusG, an outer membrane protein essential for starch utilization by Bacteroides thetaiotaomicron. J Bacteriol, 181, 7206-7211. Shtarkman, Y.M., Kocer, Z.A., Edgar, R., Veerapaneni, R.S., D'Elia, T., Morris, P.F. and Rogers, S.O. (2013) Subglacial lake vostok (antarctica) accretion ice contains a diverse set of sequences from aquatic, marine and sediment-inhabiting bacteria and eukarya. PloS one, 8, e67221. Silva, G.B., Ionashiro, M., Carrara, T.B., Crivellari, A.C., Tine, M.A.S., Prado, J., Carpita, N.C. and Buckeridge, M.S. (2011) Cell wall polysaccharides from fern leaves: Evidence for a mannan-rich Type III cell wall in Adiantum raddianum. Phytochemistry, 72, 2352-2360. Sim, L., Willemsma, C., Mohan, S., Naim, H.Y., Pinto, B.M. and Rose, D.R. (2010) Structural basis for substrate selectivity in human maltase-glucoamylase and sucrase-isomaltase N-terminal domains. J Biol Chem, 285, 17763-17770. Smith, M.I., Yatsunenko, T., Manary, M.J., Trehan, I., Mkakosya, R., Cheng, J., Kau, A.L., Rich, S.S., Concannon, P., Mychaleckyj, J.C., Liu, J., Houpt, E., Li, J.V., Holmes, E., Nicholson, J., Knights, D., Ursell, L.K., Knight, R. and Gordon, J.I. (2013) Gut microbiomes of Malawian twin pairs discordant for kwashiorkor. Science, 339, 548-554. Stanley, P., Schachter, H. and Taniguchi, N. (2009) N-Glycans. 
In Essentials of Glycobiology (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor NY: The Consortium of Glycobiology Editors, La Jolla, California. Taiz, L. and Zeiger, E. (2010) Plant Physiology: Sinauer Associates, Incorporated. Takaha, T. and Smith, S.M. (1999) The functions of 4-alpha-glucanotransferases and their use for the production of cyclic glucans. Biotechnol Genet Eng, 16, 257- 280. Tap, J., Mondot, S., Levenez, F., Pelletier, E., Caron, C., Furet, J.P., Ugarte, E., Munoz-Tamayo, R., Paslier, D.L.E., Nalin, R., Dore, J. and Leclerc, M. (2009) Towards the human intestinal microbiota phylogenetic core. Environ Microbiol, 11, 2574-2584. Taylor, E.J., Gloster, T.M., Turkenburg, J.P., Vincent, F., Brzozowski, A.M., Dupont, C., Shareck, F., Centeno, M.S., Prates, J.A., Puchart, V., Ferreira, L.M., Fontes, C.M., Biely, P. and Davies, G.J. (2006) Structure and activity of two metal ion-dependent acetylxylan esterases involved in plant cell wall degradation reveals a close similarity to peptidoglycan deacetylases. J Biol Chem, 281, 10968-10975. Teeri, T.T., Brumer Iii, H., Daniel, G. and Gatenholm, P. (2007) Biomimetic engineering of cellulose-based materials. Trends Biotechnol, 25, 299-306.

Terwisscha van Scheltinga, A.C., Armand, S., Kalk, K.H., Isogai, A., Henrissat, B. and Dijkstra, B.W. (1995) Stereochemistry of chitin hydrolysis by a plant chitinase/lysozyme and X-ray structure of a complex with allosamidin: evidence for substrate assisted catalysis. Biochemistry, 34, 15619-15623. Tiemeyer, M., Selleck, S.B. and Esko, J.D. (2009) Arthropoda. In Essentials of Glycobiology (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor NY: The Consortium of Glycobiology Editors, La Jolla, California. Topping, D. (2007) Cereal complex carbohydrates and their contribution to human health. J Cereal Sci, 46, 220-229. Turnbaugh, P.J., Hamady, M., Yatsunenko, T., Cantarel, B.L., Duncan, A., Ley, R.E., Sogin, M.L., Jones, W.J., Roe, B.A., Affourtit, J.P., Egholm, M., Henrissat, B., Heath, A.C., Knight, R. and Gordon, J.I. (2009) A core gut microbiome in obese and lean twins. Nature, 457, 480-U487. Ueda, K., Ishikawa, S., Itami, T. and Asai, T. (1952) Studies on the aerobic mesophilic cellulose-decomposing bacteria. Part 5-2. Taxonomical study on the genus Pseudomonas. J Agric Chem Soc Jpn, 26, 35-41. Wainwright, M., Wickramasinghe, N.C., Narlikar, J.V. and Rajaratnam, P. (2003) Microorganisms cultured from stratospheric air samples obtained at 41 km. FEMS Microbiol Lett, 218, 161-165. van Nood, E., Vrieze, A., Nieuwdorp, M., Fuentes, S., Zoetendal, E.G., de Vos, W.M., Visser, C.E., Kuijper, E.J., Bartelsman, J.F.W.M., Tijssen, J.G.P., Speelman, P., Dijkgraaf, M.G.W. and Keller, J.J. (2013) Duodenal infusion of donor feces for recurrent Clostridium difficile. New Engl J Med, 368, 407-415. Wang, H.L., Yeh, K.W., Chen, P.R., Chang, C.H., Chen, J.M. and Khoo, K.H. (2006) Isolation and characterization of a pure mannan from Oncidium (cv. Gower Ramsey) current pseudobulb during initial inflorescence development. Biosci Biotechnol Biochem, 70, 551-553. 
Vanholme, R., Morreel, K., Darrah, C., Oyarce, P., Grabber, J.H., Ralph, J. and Boerjan, W. (2012) Metabolic engineering of novel lignin in biomass crops. New Phytol, 196, 978-1000. Varki, A. and Sharon, N. (2009) Historical Background and Overview. In Essentials of Glycobiology, 2nd edition (Varki, A., Cummings, R.D., Esko, J.D., Freeze, H.H., Stanley, P., Bertozzi, C.R., Hart, G.W. and Etzler, M.E. eds). Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press, pp. 1-22. Watts, A.G., Damager, I., Amaya, M.L., Buschiazzo, A., Alzari, P., Frasch, A.C. and Withers, S.G. (2003) Trypanosoma cruzi trans-sialidase operates through a covalent sialyl-enzyme intermediate: tyrosine is the catalytic nucleophile. J Am Chem Soc, 125, 7532-7533. Whitman, W.B., Coleman, D.C. and Wiebe, W.J. (1998) Prokaryotes: The unseen majority. P Natl Acad Sci USA, 95, 6578-6583. Willför, S., Sundberg, K., Tenkanen, M. and Holmbom, B. (2008) Spruce-derived mannans - A potential raw material for hydrocolloids and novel advanced natural materials. Carbohyd Polym, 72, 197-210. Wilson, W.A., Roach, P.J., Montero, M., Baroja-Fernandez, E., Munoz, F.J., Eydallin, G., Viale, A.M. and Pozueta-Romero, J. (2010) Regulation of glycogen metabolism in yeast and bacteria. FEMS Microbiol Rev, 34, 952-985.

Vincken, J.P., York, W.S., Beldman, G. and Voragen, A.G. (1997) Two general branching patterns of xyloglucan, XXXG and XXGG. Plant Physiol, 114, 9-13.
Withers, S.G., Warren, R.A.J., Street, I.P., Rupitz, K., Kempton, J.B. and Aebersold, R. (1990) Unequivocal demonstration of the involvement of a glutamate residue as a nucleophile in the mechanism of a retaining glycosidase. J Am Chem Soc, 112, 5887-5889.
Wolfenden, R., Lu, X.D. and Young, G. (1998) Spontaneous hydrolysis of glycosides. J Am Chem Soc, 120, 6814-6815.
Xu, J., Bjursell, M.K., Himrod, J., Deng, S., Carmichael, L.K., Chiang, H.C., Hooper, L.V. and Gordon, J.I. (2003a) A genomic view of the human-Bacteroides thetaiotaomicron symbiosis. Science, 299, 2074-2076.
Xu, J. and Gordon, J.I. (2003b) Honor thy symbionts. P Natl Acad Sci USA, 100, 10452-10459.
Yamatoya, K. and Shirakawa, M. (2003) Xyloglucan: structure, rheological properties, biological functions and enzymatic modification. Curr Trends Polymer Sci, 8, 27-72.
Yip, V.L., Varrot, A., Davies, G.J., Rajan, S.S., Yang, X., Thompson, J., Anderson, W.F. and Withers, S.G. (2004) An unusual mechanism of glycoside hydrolysis involving redox and elimination steps by a family 4 beta-glycosidase from Thermotoga maritima. J Am Chem Soc, 126, 8354-8355.
Zechel, D.L. and Withers, S.G. (2000) Glycosidase mechanisms: anatomy of a finely tuned catalyst. Acc Chem Res, 33, 11-18.
Zeeman, S.C., Kossmann, J. and Smith, A.M. (2010) Starch: Its metabolism, evolution, and biotechnological modification in plants. Annu Rev Plant Biol, 61, 209-234.
Zhou, Q., Greffe, L., Baumann, M.J., Malmström, E., Teeri, T.T. and Brumer, H. (2005) Use of xyloglucan as a molecular anchor for the elaboration of polymers from cellulose surfaces: A general route for the design of biocomposites. Macromolecules, 38, 3547-3549.
Zoetendal, E.G., Akkermans, A.D.L. and De Vos, W.M. (1998) Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Appl Environ Microb, 64, 3854-3859.
Zou, J., Kleywegt, G.J., Stahlberg, J., Driguez, H., Nerinckx, W., Claeyssens, M., Koivula, A., Teeri, T.T. and Jones, T.A. (1999) Crystallographic evidence for substrate ring distortion and protein conformational changes during catalysis in cellobiohydrolase Cel6A from Trichoderma reesei. Structure, 7, 1035-1045.
